[2022-01-17 11:28:51 swin_tiny_patch4_window7_224] (main.py 309): INFO Full config saved to output/swin_tiny_patch4_window7_224/fix_ddp/config.json [2022-01-17 11:28:51 swin_tiny_patch4_window7_224] (main.py 312): INFO AMP_OPT_LEVEL: '' AUG: AUTO_AUGMENT: rand-m9-mstd0.5-inc1 COLOR_JITTER: 0.4 CUTMIX: 1.0 CUTMIX_MINMAX: null MIXUP: 0.8 MIXUP_MODE: batch MIXUP_PROB: 1.0 MIXUP_SWITCH_PROB: 0.5 RECOUNT: 1 REMODE: pixel REPROB: 0.25 BASE: - '' DATA: BATCH_SIZE: 128 CACHE_MODE: part DATASET: imagenet DATA_PATH: /dataset/7202515b/ IMG_SIZE: 224 INTERPOLATION: bicubic NUM_WORKERS: 8 PIN_MEMORY: true ZIP_MODE: false EVAL_MODE: false LOCAL_RANK: 0 MODEL: DROP_PATH_RATE: 0.2 DROP_RATE: 0.0 LABEL_SMOOTHING: 0.1 NAME: swin_tiny_patch4_window7_224 NUM_CLASSES: 1000 PRETRAINED: '' RESUME: '' SWIN: APE: false DEPTHS: - 2 - 2 - 6 - 2 EMBED_DIM: 96 IN_CHANS: 3 MLP_RATIO: 4.0 NUM_HEADS: - 3 - 6 - 12 - 24 PATCH_NORM: true PATCH_SIZE: 4 QKV_BIAS: true QK_SCALE: null WINDOW_SIZE: 7 SWIN_MLP: APE: false DEPTHS: - 2 - 2 - 6 - 2 EMBED_DIM: 96 IN_CHANS: 3 MLP_RATIO: 4.0 NUM_HEADS: - 3 - 6 - 12 - 24 PATCH_NORM: true PATCH_SIZE: 4 WINDOW_SIZE: 7 TYPE: swin OUTPUT: output/swin_tiny_patch4_window7_224/fix_ddp PRINT_FREQ: 10 SAVE_FREQ: 10 SEED: 0 TAG: fix_ddp TEST: CROP: true SEQUENTIAL: false THROUGHPUT_MODE: false TRAIN: ACCUMULATION_STEPS: 0 AUTO_RESUME: false BASE_LR: 0.001 CLIP_GRAD: 5.0 EPOCHS: 300 LR_SCHEDULER: DECAY_EPOCHS: 30 DECAY_RATE: 0.1 NAME: cosine MIN_LR: 1.0e-05 OPTIMIZER: BETAS: - 0.9 - 0.999 EPS: 1.0e-08 MOMENTUM: 0.9 NAME: adamw START_EPOCH: 0 USE_CHECKPOINT: false WARMUP_EPOCHS: 20 WARMUP_LR: 1.0e-06 WEIGHT_DECAY: 0.05 [2022-01-17 11:28:57 swin_tiny_patch4_window7_224] (main.py 69): INFO Creating model:swin/swin_tiny_patch4_window7_224 [2022-01-17 11:29:05 swin_tiny_patch4_window7_224] (main.py 73): INFO SwinTransformer( (patch_embed): PatchEmbed( (proj): Conv2d(3, 96, kernel_size=(4, 4), stride=(4, 4)) (norm): LayerNorm((96,), eps=1e-05, elementwise_affine=True) ) 
(pos_drop): Dropout(p=0.0, inplace=False) (layers): ModuleList( (0): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=96, out_features=288, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=96, out_features=96, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): Identity() (norm2): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=96, out_features=384, bias=True) (act): GELU() (fc2): Linear(in_features=384, out_features=96, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=96, out_features=288, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=96, out_features=96, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((96,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=96, out_features=384, bias=True) (act): GELU() (fc2): Linear(in_features=384, out_features=96, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=384, out_features=192, bias=False) (norm): LayerNorm((384,), eps=1e-05, elementwise_affine=True) ) ) (1): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=192, out_features=576, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=192, out_features=192, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): 
Linear(in_features=192, out_features=768, bias=True) (act): GELU() (fc2): Linear(in_features=768, out_features=192, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=192, out_features=576, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=192, out_features=192, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((192,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=192, out_features=768, bias=True) (act): GELU() (fc2): Linear(in_features=768, out_features=192, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=768, out_features=384, bias=False) (norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) ) ) (2): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): 
DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (4): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): 
Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (5): SwinTransformerBlock( (norm1): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=384, out_features=1152, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=384, out_features=384, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((384,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=384, out_features=1536, bias=True) (act): GELU() (fc2): Linear(in_features=1536, out_features=384, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (downsample): PatchMerging( (reduction): Linear(in_features=1536, out_features=768, bias=False) (norm): LayerNorm((1536,), eps=1e-05, elementwise_affine=True) ) ) (3): BasicLayer( (blocks): ModuleList( (0): SwinTransformerBlock( (norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): DropPath() (norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): SwinTransformerBlock( (norm1): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (attn): WindowAttention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) (softmax): Softmax(dim=-1) ) (drop_path): 
DropPath() (norm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) ) ) (norm): LayerNorm((768,), eps=1e-05, elementwise_affine=True) (avgpool): AdaptiveAvgPool1d() (head): Linear(in_features=768, out_features=1000, bias=True) ) [2022-01-17 11:29:05 swin_tiny_patch4_window7_224] (main.py 80): INFO number of params: 28288354 [2022-01-17 11:29:05 swin_tiny_patch4_window7_224] (main.py 120): INFO Start training [2022-01-17 11:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][0/1251] eta 8:14:19 lr 0.000001 time 23.7090 (23.7090) loss 6.9591 (6.9591) grad_norm 1.5338 (1.5338) [2022-01-17 11:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][10/1251] eta 1:25:47 lr 0.000001 time 1.6167 (4.1481) loss 6.9664 (6.9739) grad_norm 1.3354 (1.3887) [2022-01-17 11:30:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][20/1251] eta 1:05:33 lr 0.000002 time 2.0573 (3.1950) loss 6.9705 (6.9737) grad_norm 1.3590 (1.3789) [2022-01-17 11:30:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][30/1251] eta 0:58:22 lr 0.000002 time 1.7046 (2.8682) loss 6.9316 (6.9683) grad_norm 1.4504 (1.3845) [2022-01-17 11:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][40/1251] eta 0:54:53 lr 0.000003 time 3.6386 (2.7201) loss 6.9641 (6.9653) grad_norm 1.4252 (1.3853) [2022-01-17 11:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][50/1251] eta 0:52:46 lr 0.000003 time 2.3246 (2.6364) loss 6.9468 (6.9637) grad_norm 1.3219 (1.3807) [2022-01-17 11:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][60/1251] eta 0:51:28 lr 0.000003 time 2.0952 (2.5936) loss 6.8956 (6.9613) grad_norm 1.3488 (1.3724) [2022-01-17 11:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[0/300][70/1251] eta 0:49:28 lr 0.000004 time 1.7146 (2.5134) loss 6.9474 (6.9601) grad_norm 1.4443 (1.3736) [2022-01-17 11:32:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][80/1251] eta 0:48:05 lr 0.000004 time 2.0982 (2.4644) loss 6.9309 (6.9565) grad_norm 1.4097 (1.3698) [2022-01-17 11:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][90/1251] eta 0:47:13 lr 0.000005 time 2.2248 (2.4406) loss 6.9520 (6.9536) grad_norm 1.2602 (1.3640) [2022-01-17 11:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][100/1251] eta 0:46:18 lr 0.000005 time 2.1861 (2.4136) loss 6.9148 (6.9518) grad_norm 1.2911 (1.3604) [2022-01-17 11:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][110/1251] eta 0:45:16 lr 0.000005 time 1.5310 (2.3812) loss 6.9427 (6.9492) grad_norm 1.2743 (1.3545) [2022-01-17 11:33:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][120/1251] eta 0:44:38 lr 0.000006 time 2.7888 (2.3687) loss 6.9050 (6.9471) grad_norm 1.3613 (1.3491) [2022-01-17 11:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][130/1251] eta 0:44:11 lr 0.000006 time 2.2436 (2.3654) loss 6.9370 (6.9453) grad_norm 1.3509 (1.3429) [2022-01-17 11:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][140/1251] eta 0:43:50 lr 0.000007 time 3.2885 (2.3680) loss 6.9527 (6.9443) grad_norm 1.3150 (1.3372) [2022-01-17 11:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][150/1251] eta 0:43:11 lr 0.000007 time 1.6337 (2.3542) loss 6.9435 (6.9426) grad_norm 1.2202 (1.3329) [2022-01-17 11:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][160/1251] eta 0:42:45 lr 0.000007 time 2.9157 (2.3515) loss 6.9406 (6.9414) grad_norm 1.1487 (1.3264) [2022-01-17 11:35:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][170/1251] eta 0:42:12 lr 0.000008 time 1.7008 (2.3426) loss 6.9476 (6.9405) grad_norm 1.2412 (1.3198) [2022-01-17 
11:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][180/1251] eta 0:41:40 lr 0.000008 time 2.9246 (2.3347) loss 6.9286 (6.9388) grad_norm 1.1802 (1.3135) [2022-01-17 11:36:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][190/1251] eta 0:40:57 lr 0.000009 time 1.7108 (2.3164) loss 6.8814 (6.9368) grad_norm 1.1924 (1.3073) [2022-01-17 11:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][200/1251] eta 0:40:19 lr 0.000009 time 2.2805 (2.3018) loss 6.9090 (6.9358) grad_norm 1.2007 (1.3024) [2022-01-17 11:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][210/1251] eta 0:39:50 lr 0.000009 time 2.7545 (2.2962) loss 6.9161 (6.9350) grad_norm 1.1284 (1.2966) [2022-01-17 11:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][220/1251] eta 0:39:26 lr 0.000010 time 3.2258 (2.2958) loss 6.9212 (6.9339) grad_norm 1.1965 (1.2901) [2022-01-17 11:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][230/1251] eta 0:38:58 lr 0.000010 time 1.8291 (2.2906) loss 6.9255 (6.9332) grad_norm 1.1355 (1.2852) [2022-01-17 11:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][240/1251] eta 0:38:38 lr 0.000011 time 2.1930 (2.2932) loss 6.8518 (6.9321) grad_norm 1.1651 (1.2790) [2022-01-17 11:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][250/1251] eta 0:38:10 lr 0.000011 time 1.5601 (2.2877) loss 6.9303 (6.9309) grad_norm 1.0359 (1.2719) [2022-01-17 11:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][260/1251] eta 0:37:51 lr 0.000011 time 3.0432 (2.2923) loss 6.8686 (6.9296) grad_norm 1.1278 (1.2656) [2022-01-17 11:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][270/1251] eta 0:37:25 lr 0.000012 time 1.9524 (2.2894) loss 6.8634 (6.9283) grad_norm 1.1287 (1.2608) [2022-01-17 11:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][280/1251] eta 0:36:54 lr 0.000012 time 2.0113 
(2.2811) loss 6.8901 (6.9276) grad_norm 1.1212 (1.2539) [2022-01-17 11:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][290/1251] eta 0:36:24 lr 0.000013 time 1.8850 (2.2735) loss 6.9057 (6.9266) grad_norm 1.0626 (1.2484) [2022-01-17 11:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][300/1251] eta 0:35:57 lr 0.000013 time 2.7601 (2.2691) loss 6.8604 (6.9256) grad_norm 1.0971 (1.2418) [2022-01-17 11:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][310/1251] eta 0:35:28 lr 0.000013 time 2.0511 (2.2618) loss 6.9194 (6.9243) grad_norm 1.0529 (1.2360) [2022-01-17 11:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][320/1251] eta 0:34:56 lr 0.000014 time 1.5205 (2.2517) loss 6.8925 (6.9232) grad_norm 1.0277 (1.2298) [2022-01-17 11:41:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][330/1251] eta 0:34:29 lr 0.000014 time 1.8905 (2.2471) loss 6.8497 (6.9224) grad_norm 1.0256 (1.2237) [2022-01-17 11:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][340/1251] eta 0:34:08 lr 0.000015 time 3.4195 (2.2486) loss 6.8789 (6.9215) grad_norm 1.0006 (1.2172) [2022-01-17 11:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][350/1251] eta 0:33:45 lr 0.000015 time 2.3929 (2.2477) loss 6.8900 (6.9207) grad_norm 0.9560 (1.2111) [2022-01-17 11:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][360/1251] eta 0:33:25 lr 0.000015 time 2.5607 (2.2510) loss 6.8438 (6.9196) grad_norm 1.0424 (1.2058) [2022-01-17 11:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][370/1251] eta 0:33:01 lr 0.000016 time 1.5288 (2.2487) loss 6.8505 (6.9187) grad_norm 1.0000 (1.2007) [2022-01-17 11:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][380/1251] eta 0:32:39 lr 0.000016 time 3.3679 (2.2493) loss 6.8811 (6.9178) grad_norm 1.0357 (1.1956) [2022-01-17 11:43:43 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [0/300][390/1251] eta 0:32:13 lr 0.000017 time 2.1786 (2.2453) loss 6.8647 (6.9165) grad_norm 1.0950 (1.1905) [2022-01-17 11:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][400/1251] eta 0:31:48 lr 0.000017 time 1.9593 (2.2431) loss 6.8844 (6.9153) grad_norm 0.9684 (1.1851) [2022-01-17 11:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][410/1251] eta 0:31:25 lr 0.000017 time 1.9704 (2.2415) loss 6.9091 (6.9146) grad_norm 0.9565 (1.1805) [2022-01-17 11:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][420/1251] eta 0:31:01 lr 0.000018 time 2.9430 (2.2405) loss 6.8457 (6.9136) grad_norm 0.9210 (1.1752) [2022-01-17 11:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][430/1251] eta 0:30:37 lr 0.000018 time 2.2175 (2.2386) loss 6.8935 (6.9126) grad_norm 0.9162 (1.1702) [2022-01-17 11:45:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][440/1251] eta 0:30:16 lr 0.000019 time 2.7443 (2.2400) loss 6.8535 (6.9116) grad_norm 1.0036 (1.1653) [2022-01-17 11:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][450/1251] eta 0:29:54 lr 0.000019 time 2.1767 (2.2408) loss 6.8543 (6.9109) grad_norm 0.9462 (1.1608) [2022-01-17 11:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][460/1251] eta 0:29:32 lr 0.000019 time 2.6667 (2.2412) loss 6.9135 (6.9102) grad_norm 0.9527 (1.1564) [2022-01-17 11:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][470/1251] eta 0:29:09 lr 0.000020 time 2.6146 (2.2397) loss 6.8671 (6.9092) grad_norm 0.9325 (1.1519) [2022-01-17 11:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][480/1251] eta 0:28:47 lr 0.000020 time 2.1589 (2.2405) loss 6.8627 (6.9082) grad_norm 0.9458 (1.1482) [2022-01-17 11:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][490/1251] eta 0:28:23 lr 0.000021 time 1.8796 (2.2383) loss 6.8901 (6.9073) grad_norm 0.9124 
(1.1438) [2022-01-17 11:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][500/1251] eta 0:28:01 lr 0.000021 time 3.1054 (2.2391) loss 6.9039 (6.9061) grad_norm 0.8742 (1.1396) [2022-01-17 11:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][510/1251] eta 0:27:39 lr 0.000021 time 2.4146 (2.2394) loss 6.9068 (6.9054) grad_norm 0.9010 (1.1356) [2022-01-17 11:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][520/1251] eta 0:27:14 lr 0.000022 time 1.7055 (2.2366) loss 6.9178 (6.9048) grad_norm 0.9040 (1.1317) [2022-01-17 11:48:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][530/1251] eta 0:26:50 lr 0.000022 time 1.9295 (2.2341) loss 6.8410 (6.9036) grad_norm 0.8882 (1.1278) [2022-01-17 11:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][540/1251] eta 0:26:26 lr 0.000023 time 2.9057 (2.2314) loss 6.8498 (6.9028) grad_norm 0.9363 (1.1243) [2022-01-17 11:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][550/1251] eta 0:26:02 lr 0.000023 time 1.8359 (2.2292) loss 6.8663 (6.9022) grad_norm 0.9156 (1.1207) [2022-01-17 11:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][560/1251] eta 0:25:40 lr 0.000023 time 2.1405 (2.2294) loss 6.8778 (6.9016) grad_norm 0.9706 (1.1171) [2022-01-17 11:50:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][570/1251] eta 0:25:17 lr 0.000024 time 1.9113 (2.2284) loss 6.8546 (6.9008) grad_norm 0.9450 (1.1134) [2022-01-17 11:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][580/1251] eta 0:24:55 lr 0.000024 time 2.9954 (2.2287) loss 6.8734 (6.9002) grad_norm 0.8815 (1.1097) [2022-01-17 11:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][590/1251] eta 0:24:32 lr 0.000025 time 2.0931 (2.2270) loss 6.8545 (6.8994) grad_norm 0.8544 (1.1065) [2022-01-17 11:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][600/1251] eta 0:24:09 lr 
0.000025 time 1.8691 (2.2258) loss 6.7979 (6.8986) grad_norm 1.0288 (1.1038) [2022-01-17 11:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][610/1251] eta 0:23:45 lr 0.000025 time 1.7421 (2.2238) loss 6.8439 (6.8977) grad_norm 0.8410 (1.1004) [2022-01-17 11:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][620/1251] eta 0:23:23 lr 0.000026 time 2.5057 (2.2247) loss 6.8291 (6.8971) grad_norm 0.8921 (1.0972) [2022-01-17 11:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][630/1251] eta 0:23:00 lr 0.000026 time 2.1208 (2.2228) loss 6.8670 (6.8963) grad_norm 0.8436 (1.0945) [2022-01-17 11:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][640/1251] eta 0:22:36 lr 0.000027 time 1.8528 (2.2206) loss 6.8374 (6.8952) grad_norm 0.9028 (1.0917) [2022-01-17 11:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][650/1251] eta 0:22:13 lr 0.000027 time 2.1536 (2.2190) loss 6.8006 (6.8942) grad_norm 0.9889 (1.0895) [2022-01-17 11:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][660/1251] eta 0:21:51 lr 0.000027 time 2.5022 (2.2189) loss 6.8358 (6.8934) grad_norm 0.8757 (1.0872) [2022-01-17 11:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][670/1251] eta 0:21:29 lr 0.000028 time 2.2938 (2.2187) loss 6.8631 (6.8925) grad_norm 0.8697 (1.0849) [2022-01-17 11:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][680/1251] eta 0:21:07 lr 0.000028 time 2.0778 (2.2192) loss 6.9164 (6.8919) grad_norm 1.0107 (1.0825) [2022-01-17 11:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][690/1251] eta 0:20:44 lr 0.000029 time 1.9756 (2.2181) loss 6.8481 (6.8911) grad_norm 0.8466 (1.0810) [2022-01-17 11:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][700/1251] eta 0:20:22 lr 0.000029 time 2.6288 (2.2183) loss 6.8645 (6.8903) grad_norm 0.8750 (1.0794) [2022-01-17 11:55:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][710/1251] eta 0:19:59 lr 0.000029 time 1.6494 (2.2178) loss 6.7844 (6.8895) grad_norm 1.1315 (1.0774) [2022-01-17 11:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][720/1251] eta 0:19:37 lr 0.000030 time 1.5991 (2.2180) loss 6.8541 (6.8885) grad_norm 1.0213 (1.0763) [2022-01-17 11:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][730/1251] eta 0:19:15 lr 0.000030 time 1.5563 (2.2182) loss 6.8319 (6.8877) grad_norm 0.9008 (1.0749) [2022-01-17 11:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][740/1251] eta 0:18:53 lr 0.000031 time 2.2076 (2.2190) loss 6.8469 (6.8872) grad_norm 0.9865 (1.0743) [2022-01-17 11:56:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][750/1251] eta 0:18:32 lr 0.000031 time 2.0824 (2.2201) loss 6.8553 (6.8865) grad_norm 1.0810 (1.0729) [2022-01-17 11:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][760/1251] eta 0:18:10 lr 0.000031 time 2.4041 (2.2209) loss 6.8075 (6.8858) grad_norm 0.9251 (1.0724) [2022-01-17 11:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][770/1251] eta 0:17:48 lr 0.000032 time 2.1594 (2.2213) loss 6.8210 (6.8851) grad_norm 1.1485 (1.0717) [2022-01-17 11:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][780/1251] eta 0:17:26 lr 0.000032 time 2.1848 (2.2215) loss 6.8383 (6.8843) grad_norm 1.0887 (1.0732) [2022-01-17 11:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][790/1251] eta 0:17:04 lr 0.000033 time 2.8497 (2.2225) loss 6.8223 (6.8835) grad_norm 1.0493 (1.0737) [2022-01-17 11:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][800/1251] eta 0:16:42 lr 0.000033 time 1.9332 (2.2221) loss 6.8146 (6.8827) grad_norm 1.1546 (1.0744) [2022-01-17 11:59:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][810/1251] eta 0:16:20 lr 0.000033 time 2.1941 (2.2226) 
loss 6.7751 (6.8817) grad_norm 1.0375 (1.0757) [2022-01-17 11:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][820/1251] eta 0:15:58 lr 0.000034 time 2.4293 (2.2236) loss 6.8330 (6.8812) grad_norm 1.2627 (1.0772) [2022-01-17 11:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][830/1251] eta 0:15:36 lr 0.000034 time 2.2033 (2.2240) loss 6.7534 (6.8802) grad_norm 1.0934 (1.0776) [2022-01-17 12:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][840/1251] eta 0:15:13 lr 0.000035 time 1.8039 (2.2231) loss 6.8131 (6.8792) grad_norm 2.0714 (1.0807) [2022-01-17 12:00:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][850/1251] eta 0:14:51 lr 0.000035 time 2.8826 (2.2233) loss 6.7812 (6.8783) grad_norm 1.5309 (1.0844) [2022-01-17 12:00:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][860/1251] eta 0:14:29 lr 0.000035 time 2.7543 (2.2226) loss 6.8058 (6.8775) grad_norm 1.0255 (1.0867) [2022-01-17 12:01:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][870/1251] eta 0:14:05 lr 0.000036 time 1.6550 (2.2203) loss 6.8421 (6.8768) grad_norm 1.1984 (1.0879) [2022-01-17 12:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][880/1251] eta 0:13:43 lr 0.000036 time 2.1377 (2.2192) loss 6.8419 (6.8759) grad_norm 1.1787 (1.0892) [2022-01-17 12:02:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][890/1251] eta 0:13:20 lr 0.000037 time 2.9004 (2.2183) loss 6.7417 (6.8751) grad_norm 1.2225 (1.0924) [2022-01-17 12:02:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][900/1251] eta 0:12:58 lr 0.000037 time 3.4258 (2.2194) loss 6.8294 (6.8744) grad_norm 1.4333 (1.0965) [2022-01-17 12:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][910/1251] eta 0:12:36 lr 0.000037 time 2.1932 (2.2188) loss 6.7953 (6.8735) grad_norm 1.0544 (1.0977) [2022-01-17 12:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [0/300][920/1251] eta 0:12:14 lr 0.000038 time 2.0588 (2.2204) loss 6.8895 (6.8727) grad_norm 1.1856 (1.0998) [2022-01-17 12:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][930/1251] eta 0:11:53 lr 0.000038 time 2.8616 (2.2215) loss 6.8233 (6.8719) grad_norm 1.7534 (1.1034) [2022-01-17 12:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][940/1251] eta 0:11:31 lr 0.000039 time 3.1263 (2.2238) loss 6.8382 (6.8710) grad_norm 1.4340 (1.1072) [2022-01-17 12:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][950/1251] eta 0:11:09 lr 0.000039 time 2.7877 (2.2236) loss 6.7169 (6.8700) grad_norm 1.3574 (1.1119) [2022-01-17 12:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][960/1251] eta 0:10:46 lr 0.000039 time 1.7530 (2.2224) loss 6.7775 (6.8691) grad_norm 1.1108 (1.1138) [2022-01-17 12:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][970/1251] eta 0:10:23 lr 0.000040 time 2.1626 (2.2203) loss 6.7836 (6.8681) grad_norm 1.5968 (1.1173) [2022-01-17 12:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][980/1251] eta 0:10:01 lr 0.000040 time 2.7274 (2.2196) loss 6.7743 (6.8671) grad_norm 1.2598 (1.1233) [2022-01-17 12:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][990/1251] eta 0:09:38 lr 0.000041 time 1.7763 (2.2171) loss 6.7943 (6.8666) grad_norm 1.6440 (1.1299) [2022-01-17 12:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1000/1251] eta 0:09:16 lr 0.000041 time 2.5136 (2.2177) loss 6.7863 (6.8656) grad_norm 0.9712 (1.1307) [2022-01-17 12:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1010/1251] eta 0:08:54 lr 0.000041 time 1.8259 (2.2175) loss 6.8560 (6.8648) grad_norm 1.7870 (1.1363) [2022-01-17 12:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1020/1251] eta 0:08:32 lr 0.000042 time 3.9841 (2.2196) loss 6.8207 (6.8639) grad_norm 1.7343 (1.1416) 
[2022-01-17 12:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1030/1251] eta 0:08:10 lr 0.000042 time 2.0335 (2.2202) loss 6.7343 (6.8628) grad_norm 1.1745 (1.1443) [2022-01-17 12:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1040/1251] eta 0:07:48 lr 0.000043 time 2.1322 (2.2202) loss 6.7476 (6.8621) grad_norm 1.8067 (1.1486) [2022-01-17 12:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1050/1251] eta 0:07:26 lr 0.000043 time 2.1899 (2.2204) loss 6.6715 (6.8610) grad_norm 1.4136 (1.1523) [2022-01-17 12:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1060/1251] eta 0:07:04 lr 0.000043 time 3.0617 (2.2209) loss 6.6952 (6.8605) grad_norm 1.2754 (1.1560) [2022-01-17 12:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1070/1251] eta 0:06:42 lr 0.000044 time 1.9870 (2.2217) loss 6.7939 (6.8595) grad_norm 1.4323 (1.1569) [2022-01-17 12:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1080/1251] eta 0:06:19 lr 0.000044 time 1.6826 (2.2207) loss 6.7150 (6.8585) grad_norm 1.7152 (1.1621) [2022-01-17 12:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1090/1251] eta 0:05:57 lr 0.000045 time 1.6357 (2.2199) loss 6.7551 (6.8575) grad_norm 1.4268 (1.1661) [2022-01-17 12:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1100/1251] eta 0:05:35 lr 0.000045 time 3.0517 (2.2195) loss 6.8059 (6.8566) grad_norm 1.3021 (1.1688) [2022-01-17 12:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1110/1251] eta 0:05:13 lr 0.000045 time 2.5030 (2.2202) loss 6.7829 (6.8556) grad_norm 1.4590 (1.1706) [2022-01-17 12:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1120/1251] eta 0:04:50 lr 0.000046 time 1.4934 (2.2202) loss 6.6909 (6.8546) grad_norm 1.2447 (1.1725) [2022-01-17 12:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1130/1251] eta 0:04:28 
lr 0.000046 time 1.7937 (2.2188) loss 6.8565 (6.8541) grad_norm 1.3894 (1.1771) [2022-01-17 12:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1140/1251] eta 0:04:06 lr 0.000047 time 2.8733 (2.2189) loss 6.5588 (6.8527) grad_norm 1.4423 (1.1802) [2022-01-17 12:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1150/1251] eta 0:03:44 lr 0.000047 time 2.3480 (2.2180) loss 6.8164 (6.8516) grad_norm 1.4281 (1.1817) [2022-01-17 12:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1160/1251] eta 0:03:21 lr 0.000047 time 1.8742 (2.2165) loss 6.8289 (6.8507) grad_norm 1.6903 (1.1854) [2022-01-17 12:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1170/1251] eta 0:02:59 lr 0.000048 time 1.9249 (2.2169) loss 6.6973 (6.8496) grad_norm 1.4087 (1.1899) [2022-01-17 12:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1180/1251] eta 0:02:37 lr 0.000048 time 2.6965 (2.2178) loss 6.7267 (6.8489) grad_norm 2.9372 (1.1944) [2022-01-17 12:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1190/1251] eta 0:02:15 lr 0.000049 time 1.9806 (2.2168) loss 6.5701 (6.8482) grad_norm 1.3548 (1.1982) [2022-01-17 12:13:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1200/1251] eta 0:01:53 lr 0.000049 time 2.5255 (2.2169) loss 6.7145 (6.8473) grad_norm 1.6568 (1.1998) [2022-01-17 12:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1210/1251] eta 0:01:30 lr 0.000049 time 1.5416 (2.2168) loss 6.6036 (6.8461) grad_norm 1.2999 (1.2015) [2022-01-17 12:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1220/1251] eta 0:01:08 lr 0.000050 time 2.2592 (2.2163) loss 6.6313 (6.8450) grad_norm 1.6643 (1.2042) [2022-01-17 12:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1230/1251] eta 0:00:46 lr 0.000050 time 2.5745 (2.2159) loss 6.7628 (6.8441) grad_norm 1.8887 (1.2106) [2022-01-17 12:14:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1240/1251] eta 0:00:24 lr 0.000051 time 1.2218 (2.2147) loss 6.7208 (6.8431) grad_norm 2.7623 (1.2150) [2022-01-17 12:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [0/300][1250/1251] eta 0:00:02 lr 0.000051 time 1.1838 (2.2095) loss 6.7677 (6.8421) grad_norm 1.4611 (1.2196) [2022-01-17 12:15:09 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 0 training takes 0:46:04 [2022-01-17 12:15:09 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_0 saving...... [2022-01-17 12:15:21 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_0 saved !!! [2022-01-17 12:15:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.328 (16.328) Loss 6.3492 (6.3492) Acc@1 1.953 (1.953) Acc@5 6.250 (6.250) [2022-01-17 12:15:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.586 (3.007) Loss 6.3661 (6.3587) Acc@1 2.441 (2.051) Acc@5 6.934 (6.507) [2022-01-17 12:16:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.978 (2.279) Loss 6.3665 (6.3547) Acc@1 1.367 (1.990) Acc@5 5.664 (6.599) [2022-01-17 12:16:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.899 (2.097) Loss 6.3838 (6.3562) Acc@1 1.270 (1.985) Acc@5 5.469 (6.483) [2022-01-17 12:16:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.184 (2.070) Loss 6.3439 (6.3546) Acc@1 1.953 (1.927) Acc@5 6.348 (6.388) [2022-01-17 12:16:53 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 1.906 Acc@5 6.346 [2022-01-17 12:16:53 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 1.9% [2022-01-17 12:16:53 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 1.91% [2022-01-17 12:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][0/1251] eta 7:24:35 lr 0.000051 time 21.3235 
(21.3235) loss 6.7374 (6.7374) grad_norm 1.8688 (1.8688) [2022-01-17 12:17:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][10/1251] eta 1:24:17 lr 0.000051 time 2.2308 (4.0753) loss 6.6237 (6.7088) grad_norm 1.8322 (1.5825) [2022-01-17 12:17:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][20/1251] eta 1:04:33 lr 0.000052 time 1.5905 (3.1466) loss 6.5549 (6.7139) grad_norm 1.3604 (1.6713) [2022-01-17 12:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][30/1251] eta 1:00:38 lr 0.000052 time 1.5332 (2.9795) loss 6.7097 (6.7279) grad_norm 2.0439 (1.6501) [2022-01-17 12:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][40/1251] eta 0:56:47 lr 0.000053 time 3.5093 (2.8139) loss 6.6690 (6.7261) grad_norm 2.2199 (1.6924) [2022-01-17 12:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][50/1251] eta 0:54:40 lr 0.000053 time 2.5981 (2.7311) loss 6.6186 (6.7203) grad_norm 1.3876 (1.7200) [2022-01-17 12:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][60/1251] eta 0:52:05 lr 0.000053 time 2.5021 (2.6239) loss 6.7613 (6.7142) grad_norm 1.7797 (1.6893) [2022-01-17 12:19:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][70/1251] eta 0:49:38 lr 0.000054 time 1.6947 (2.5224) loss 6.6262 (6.7067) grad_norm 2.1279 (1.7867) [2022-01-17 12:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][80/1251] eta 0:48:02 lr 0.000054 time 2.2482 (2.4613) loss 6.5988 (6.7066) grad_norm 1.5068 (1.7919) [2022-01-17 12:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][90/1251] eta 0:46:54 lr 0.000055 time 1.8603 (2.4239) loss 6.5621 (6.7053) grad_norm 1.8589 (1.7821) [2022-01-17 12:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][100/1251] eta 0:45:55 lr 0.000055 time 2.2801 (2.3940) loss 6.7421 (6.7057) grad_norm 2.9994 (1.8267) [2022-01-17 12:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [1/300][110/1251] eta 0:45:09 lr 0.000055 time 2.0827 (2.3750) loss 6.8022 (6.7063) grad_norm 1.8858 (1.8650) [2022-01-17 12:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][120/1251] eta 0:44:34 lr 0.000056 time 2.4648 (2.3644) loss 6.7584 (6.7082) grad_norm 1.8662 (1.8515) [2022-01-17 12:22:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][130/1251] eta 0:44:13 lr 0.000056 time 2.5075 (2.3670) loss 6.8087 (6.7089) grad_norm 2.0508 (1.8567) [2022-01-17 12:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][140/1251] eta 0:43:39 lr 0.000057 time 2.4060 (2.3575) loss 6.5757 (6.7047) grad_norm 1.6304 (1.8451) [2022-01-17 12:22:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][150/1251] eta 0:43:07 lr 0.000057 time 2.0634 (2.3504) loss 6.7041 (6.7041) grad_norm 2.7975 (1.8623) [2022-01-17 12:23:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][160/1251] eta 0:42:31 lr 0.000057 time 2.2138 (2.3388) loss 6.6063 (6.7034) grad_norm 1.4713 (1.8532) [2022-01-17 12:23:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][170/1251] eta 0:41:56 lr 0.000058 time 1.7341 (2.3280) loss 6.7059 (6.7023) grad_norm 2.2371 (1.8604) [2022-01-17 12:23:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][180/1251] eta 0:41:20 lr 0.000058 time 2.4556 (2.3159) loss 6.7055 (6.6991) grad_norm 1.6281 (1.8659) [2022-01-17 12:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][190/1251] eta 0:40:47 lr 0.000059 time 1.8922 (2.3065) loss 6.7315 (6.6973) grad_norm 1.9626 (1.8647) [2022-01-17 12:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][200/1251] eta 0:40:21 lr 0.000059 time 2.5287 (2.3036) loss 6.7818 (6.6966) grad_norm 2.1222 (1.8711) [2022-01-17 12:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][210/1251] eta 0:39:56 lr 0.000059 time 1.8036 (2.3021) loss 6.7210 (6.6945) grad_norm 2.3918 (1.8912) 
[2022-01-17 12:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][220/1251] eta 0:39:22 lr 0.000060 time 2.2240 (2.2914) loss 6.7320 (6.6931) grad_norm 1.5225 (1.8999) [2022-01-17 12:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][230/1251] eta 0:38:51 lr 0.000060 time 2.4306 (2.2839) loss 6.5655 (6.6894) grad_norm 1.5700 (1.8942) [2022-01-17 12:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][240/1251] eta 0:38:24 lr 0.000061 time 2.5068 (2.2794) loss 6.6697 (6.6860) grad_norm 2.5611 (1.8929) [2022-01-17 12:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][250/1251] eta 0:38:00 lr 0.000061 time 1.9089 (2.2782) loss 6.6188 (6.6826) grad_norm 1.7194 (1.8900) [2022-01-17 12:26:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][260/1251] eta 0:37:33 lr 0.000061 time 1.8311 (2.2738) loss 6.7247 (6.6830) grad_norm 1.8150 (1.8830) [2022-01-17 12:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][270/1251] eta 0:37:02 lr 0.000062 time 2.4796 (2.2656) loss 6.6909 (6.6821) grad_norm 1.7594 (1.8845) [2022-01-17 12:27:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][280/1251] eta 0:36:31 lr 0.000062 time 1.6710 (2.2569) loss 6.7474 (6.6804) grad_norm 1.4361 (1.8844) [2022-01-17 12:27:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][290/1251] eta 0:36:05 lr 0.000063 time 1.5355 (2.2537) loss 6.7259 (6.6794) grad_norm 2.1460 (1.8892) [2022-01-17 12:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][300/1251] eta 0:35:44 lr 0.000063 time 2.1769 (2.2551) loss 6.4646 (6.6766) grad_norm 2.0584 (1.8957) [2022-01-17 12:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][310/1251] eta 0:35:22 lr 0.000063 time 1.7907 (2.2552) loss 6.4370 (6.6743) grad_norm 2.3378 (1.9043) [2022-01-17 12:28:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][320/1251] eta 0:35:02 lr 0.000064 
time 2.9106 (2.2578) loss 6.6244 (6.6720) grad_norm 2.0840 (1.9184) [2022-01-17 12:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][330/1251] eta 0:34:40 lr 0.000064 time 1.8833 (2.2586) loss 6.6459 (6.6697) grad_norm 1.8554 (1.9210) [2022-01-17 12:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][340/1251] eta 0:34:12 lr 0.000065 time 1.5949 (2.2536) loss 6.7895 (6.6690) grad_norm 1.6985 (1.9167) [2022-01-17 12:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][350/1251] eta 0:33:51 lr 0.000065 time 3.2500 (2.2548) loss 6.6213 (6.6682) grad_norm 1.8489 (1.9177) [2022-01-17 12:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][360/1251] eta 0:33:21 lr 0.000065 time 1.6727 (2.2464) loss 6.6045 (6.6671) grad_norm 2.0142 (1.9231) [2022-01-17 12:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][370/1251] eta 0:32:54 lr 0.000066 time 1.7744 (2.2413) loss 6.6344 (6.6654) grad_norm 1.9103 (1.9260) [2022-01-17 12:31:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][380/1251] eta 0:32:27 lr 0.000066 time 2.2166 (2.2364) loss 6.5872 (6.6637) grad_norm 2.6303 (1.9408) [2022-01-17 12:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][390/1251] eta 0:32:06 lr 0.000067 time 2.4127 (2.2370) loss 6.5122 (6.6625) grad_norm 3.0101 (1.9537) [2022-01-17 12:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][400/1251] eta 0:31:45 lr 0.000067 time 2.1698 (2.2387) loss 6.6114 (6.6588) grad_norm 1.9767 (1.9637) [2022-01-17 12:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][410/1251] eta 0:31:25 lr 0.000067 time 2.2635 (2.2417) loss 6.5666 (6.6564) grad_norm 3.0086 (1.9742) [2022-01-17 12:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][420/1251] eta 0:31:02 lr 0.000068 time 2.2301 (2.2407) loss 6.3926 (6.6546) grad_norm 2.1585 (1.9852) [2022-01-17 12:32:56 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [1/300][430/1251] eta 0:30:35 lr 0.000068 time 1.8102 (2.2360) loss 6.4865 (6.6539) grad_norm 1.9955 (1.9891) [2022-01-17 12:33:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][440/1251] eta 0:30:08 lr 0.000069 time 1.9393 (2.2302) loss 6.6906 (6.6530) grad_norm 1.7463 (2.0005) [2022-01-17 12:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][450/1251] eta 0:29:45 lr 0.000069 time 2.0240 (2.2294) loss 6.7230 (6.6530) grad_norm 1.4486 (1.9985) [2022-01-17 12:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][460/1251] eta 0:29:22 lr 0.000069 time 2.6309 (2.2276) loss 6.6551 (6.6513) grad_norm 1.8890 (1.9912) [2022-01-17 12:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][470/1251] eta 0:29:01 lr 0.000070 time 2.4721 (2.2293) loss 6.6411 (6.6495) grad_norm 1.3069 (1.9868) [2022-01-17 12:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][480/1251] eta 0:28:40 lr 0.000070 time 2.1661 (2.2313) loss 6.7133 (6.6494) grad_norm 2.6059 (1.9889) [2022-01-17 12:35:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][490/1251] eta 0:28:19 lr 0.000071 time 2.3997 (2.2339) loss 6.6574 (6.6474) grad_norm 2.1737 (1.9911) [2022-01-17 12:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][500/1251] eta 0:27:56 lr 0.000071 time 1.8543 (2.2323) loss 6.6171 (6.6451) grad_norm 1.8693 (1.9953) [2022-01-17 12:35:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][510/1251] eta 0:27:32 lr 0.000071 time 1.5683 (2.2303) loss 6.8204 (6.6439) grad_norm 2.4490 (1.9997) [2022-01-17 12:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][520/1251] eta 0:27:06 lr 0.000072 time 1.5920 (2.2246) loss 6.4726 (6.6431) grad_norm 2.3710 (2.0035) [2022-01-17 12:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][530/1251] eta 0:26:41 lr 0.000072 time 1.8932 (2.2205) loss 6.5907 (6.6424) 
grad_norm 3.7026 (2.0070) [2022-01-17 12:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][540/1251] eta 0:26:15 lr 0.000073 time 1.8931 (2.2158) loss 6.6089 (6.6406) grad_norm 1.7950 (2.0086) [2022-01-17 12:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][550/1251] eta 0:25:50 lr 0.000073 time 1.9453 (2.2113) loss 6.5888 (6.6403) grad_norm 1.5923 (2.0065) [2022-01-17 12:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][560/1251] eta 0:25:25 lr 0.000073 time 1.9154 (2.2076) loss 6.5787 (6.6387) grad_norm 2.6024 (2.0070) [2022-01-17 12:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][570/1251] eta 0:25:03 lr 0.000074 time 2.1835 (2.2083) loss 6.3538 (6.6381) grad_norm 2.0001 (2.0036) [2022-01-17 12:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][580/1251] eta 0:24:41 lr 0.000074 time 2.3688 (2.2076) loss 6.6621 (6.6374) grad_norm 1.9791 (2.0054) [2022-01-17 12:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][590/1251] eta 0:24:20 lr 0.000075 time 2.4740 (2.2093) loss 6.4101 (6.6358) grad_norm 2.5187 (2.0093) [2022-01-17 12:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][600/1251] eta 0:23:58 lr 0.000075 time 2.2339 (2.2094) loss 6.4303 (6.6332) grad_norm 2.0469 (2.0135) [2022-01-17 12:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][610/1251] eta 0:23:38 lr 0.000075 time 2.1297 (2.2134) loss 6.6873 (6.6324) grad_norm 1.7390 (2.0114) [2022-01-17 12:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][620/1251] eta 0:23:17 lr 0.000076 time 2.4600 (2.2148) loss 6.4341 (6.6299) grad_norm 2.3526 (2.0118) [2022-01-17 12:40:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][630/1251] eta 0:22:56 lr 0.000076 time 2.3920 (2.2160) loss 6.7262 (6.6305) grad_norm 1.8241 (2.0137) [2022-01-17 12:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[1/300][640/1251] eta 0:22:34 lr 0.000077 time 2.6184 (2.2173) loss 6.3369 (6.6280) grad_norm 2.0823 (2.0144) [2022-01-17 12:40:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][650/1251] eta 0:22:12 lr 0.000077 time 1.9021 (2.2175) loss 6.6244 (6.6268) grad_norm 2.5329 (2.0135) [2022-01-17 12:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][660/1251] eta 0:21:49 lr 0.000077 time 2.5215 (2.2160) loss 6.6058 (6.6261) grad_norm 1.8910 (2.0136) [2022-01-17 12:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][670/1251] eta 0:21:26 lr 0.000078 time 1.8871 (2.2143) loss 6.5769 (6.6230) grad_norm 2.3330 (2.0152) [2022-01-17 12:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][680/1251] eta 0:21:04 lr 0.000078 time 2.6492 (2.2150) loss 6.4053 (6.6217) grad_norm 2.5586 (2.0176) [2022-01-17 12:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][690/1251] eta 0:20:42 lr 0.000079 time 1.8468 (2.2154) loss 6.6489 (6.6200) grad_norm 2.6546 (2.0230) [2022-01-17 12:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][700/1251] eta 0:20:20 lr 0.000079 time 2.5738 (2.2153) loss 6.5779 (6.6178) grad_norm 2.0137 (2.0237) [2022-01-17 12:43:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][710/1251] eta 0:19:57 lr 0.000079 time 1.8670 (2.2128) loss 6.7420 (6.6171) grad_norm 1.7625 (2.0233) [2022-01-17 12:43:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][720/1251] eta 0:19:34 lr 0.000080 time 2.1124 (2.2119) loss 6.6187 (6.6150) grad_norm 1.6568 (2.0231) [2022-01-17 12:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][730/1251] eta 0:19:11 lr 0.000080 time 1.8959 (2.2111) loss 6.4005 (6.6135) grad_norm 2.3622 (2.0240) [2022-01-17 12:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][740/1251] eta 0:18:49 lr 0.000080 time 2.5087 (2.2107) loss 6.4866 (6.6121) grad_norm 1.9049 (2.0283) 
[2022-01-17 12:44:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][750/1251] eta 0:18:26 lr 0.000081 time 1.6922 (2.2093) loss 6.3392 (6.6103) grad_norm 2.8887 (2.0349) [2022-01-17 12:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][760/1251] eta 0:18:03 lr 0.000081 time 1.7936 (2.2074) loss 6.5721 (6.6102) grad_norm 2.2989 (2.0378) [2022-01-17 12:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][770/1251] eta 0:17:40 lr 0.000082 time 2.4050 (2.2052) loss 6.6123 (6.6086) grad_norm 1.8330 (2.0408) [2022-01-17 12:45:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][780/1251] eta 0:17:18 lr 0.000082 time 2.2357 (2.2044) loss 6.6570 (6.6080) grad_norm 1.7335 (2.0389) [2022-01-17 12:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][790/1251] eta 0:16:56 lr 0.000082 time 1.9873 (2.2040) loss 6.5070 (6.6071) grad_norm 2.1053 (2.0381) [2022-01-17 12:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][800/1251] eta 0:16:33 lr 0.000083 time 1.9167 (2.2038) loss 6.6985 (6.6057) grad_norm 1.9108 (2.0388) [2022-01-17 12:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][810/1251] eta 0:16:12 lr 0.000083 time 2.5270 (2.2053) loss 6.2856 (6.6035) grad_norm 1.4168 (2.0391) [2022-01-17 12:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][820/1251] eta 0:15:50 lr 0.000084 time 2.7353 (2.2064) loss 6.5686 (6.6031) grad_norm 1.6949 (2.0404) [2022-01-17 12:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][830/1251] eta 0:15:28 lr 0.000084 time 1.7085 (2.2062) loss 6.2973 (6.6006) grad_norm 3.0217 (2.0417) [2022-01-17 12:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][840/1251] eta 0:15:07 lr 0.000084 time 2.4792 (2.2074) loss 6.1724 (6.6000) grad_norm 1.9837 (2.0427) [2022-01-17 12:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][850/1251] eta 0:14:44 lr 0.000085 
time 2.0323 (2.2049) loss 6.5539 (6.5990) grad_norm 2.4230 (2.0434) [2022-01-17 12:48:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][860/1251] eta 0:14:21 lr 0.000085 time 3.0305 (2.2043) loss 6.3338 (6.5982) grad_norm 2.4156 (2.0439) [2022-01-17 12:48:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][870/1251] eta 0:13:59 lr 0.000086 time 2.2192 (2.2034) loss 6.3225 (6.5969) grad_norm 2.1724 (2.0452) [2022-01-17 12:49:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][880/1251] eta 0:13:37 lr 0.000086 time 2.2545 (2.2024) loss 6.5542 (6.5961) grad_norm 2.8476 (2.0486) [2022-01-17 12:49:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][890/1251] eta 0:13:15 lr 0.000086 time 2.0136 (2.2031) loss 6.6771 (6.5957) grad_norm 1.9260 (2.0499) [2022-01-17 12:50:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][900/1251] eta 0:12:54 lr 0.000087 time 3.1947 (2.2073) loss 6.5893 (6.5945) grad_norm 1.7046 (2.0510) [2022-01-17 12:50:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][910/1251] eta 0:12:32 lr 0.000087 time 2.4176 (2.2075) loss 6.5157 (6.5942) grad_norm 2.9465 (2.0550) [2022-01-17 12:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][920/1251] eta 0:12:10 lr 0.000088 time 2.8599 (2.2069) loss 6.3053 (6.5931) grad_norm 1.8315 (2.0564) [2022-01-17 12:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][930/1251] eta 0:11:47 lr 0.000088 time 1.5964 (2.2043) loss 6.5693 (6.5921) grad_norm 1.6841 (2.0575) [2022-01-17 12:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][940/1251] eta 0:11:25 lr 0.000088 time 2.1523 (2.2030) loss 6.3091 (6.5918) grad_norm 2.8240 (2.0623) [2022-01-17 12:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][950/1251] eta 0:11:02 lr 0.000089 time 1.5519 (2.2022) loss 6.5532 (6.5906) grad_norm 1.6411 (2.0635) [2022-01-17 12:52:08 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [1/300][960/1251] eta 0:10:40 lr 0.000089 time 2.3329 (2.2014) loss 6.6207 (6.5887) grad_norm 1.6457 (2.0634) [2022-01-17 12:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][970/1251] eta 0:10:18 lr 0.000090 time 1.8332 (2.2021) loss 6.4155 (6.5873) grad_norm 2.2048 (2.0675) [2022-01-17 12:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][980/1251] eta 0:09:57 lr 0.000090 time 2.9776 (2.2040) loss 6.1376 (6.5857) grad_norm 2.1471 (2.0687) [2022-01-17 12:53:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][990/1251] eta 0:09:35 lr 0.000090 time 2.0184 (2.2053) loss 6.5403 (6.5846) grad_norm 2.3022 (2.0727) [2022-01-17 12:53:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1000/1251] eta 0:09:13 lr 0.000091 time 2.8156 (2.2064) loss 6.5444 (6.5828) grad_norm 2.0627 (2.0720) [2022-01-17 12:54:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1010/1251] eta 0:08:51 lr 0.000091 time 1.7147 (2.2060) loss 6.4939 (6.5810) grad_norm 2.0402 (2.0741) [2022-01-17 12:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1020/1251] eta 0:08:29 lr 0.000092 time 1.9744 (2.2041) loss 6.5488 (6.5799) grad_norm 2.1957 (2.0745) [2022-01-17 12:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1030/1251] eta 0:08:06 lr 0.000092 time 2.0804 (2.2017) loss 6.6196 (6.5791) grad_norm 1.8802 (2.0754) [2022-01-17 12:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1040/1251] eta 0:07:44 lr 0.000092 time 2.0361 (2.2001) loss 6.4635 (6.5775) grad_norm 1.8814 (2.0742) [2022-01-17 12:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1050/1251] eta 0:07:22 lr 0.000093 time 2.8057 (2.2013) loss 6.2532 (6.5758) grad_norm 1.7462 (2.0731) [2022-01-17 12:55:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1060/1251] eta 0:07:00 lr 0.000093 time 1.9750 (2.2007) loss 6.5543 (6.5745) 
grad_norm 2.1956 (2.0737) [2022-01-17 12:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1070/1251] eta 0:06:38 lr 0.000094 time 2.5649 (2.1998) loss 6.5112 (6.5738) grad_norm 2.4456 (2.0731) [2022-01-17 12:56:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1080/1251] eta 0:06:16 lr 0.000094 time 1.9868 (2.2003) loss 6.6772 (6.5732) grad_norm 2.6420 (2.0734) [2022-01-17 12:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1090/1251] eta 0:05:54 lr 0.000094 time 3.0991 (2.2028) loss 6.5221 (6.5720) grad_norm 2.9456 (2.0768) [2022-01-17 12:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1100/1251] eta 0:05:32 lr 0.000095 time 2.1886 (2.2032) loss 6.4451 (6.5708) grad_norm 1.6719 (2.0781) [2022-01-17 12:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1110/1251] eta 0:05:10 lr 0.000095 time 2.1863 (2.2027) loss 6.4392 (6.5691) grad_norm 1.5909 (2.0775) [2022-01-17 12:58:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1120/1251] eta 0:04:48 lr 0.000096 time 1.7037 (2.2016) loss 6.5028 (6.5678) grad_norm 1.9155 (2.0775) [2022-01-17 12:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1130/1251] eta 0:04:26 lr 0.000096 time 3.3649 (2.2009) loss 6.5155 (6.5670) grad_norm 1.8500 (2.0773) [2022-01-17 12:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1140/1251] eta 0:04:04 lr 0.000096 time 2.0959 (2.2004) loss 6.2647 (6.5650) grad_norm 1.9159 (2.0795) [2022-01-17 12:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1150/1251] eta 0:03:42 lr 0.000097 time 2.7886 (2.2005) loss 6.7838 (6.5640) grad_norm 2.2370 (2.0804) [2022-01-17 12:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1160/1251] eta 0:03:20 lr 0.000097 time 2.2083 (2.2016) loss 6.3367 (6.5634) grad_norm 3.1751 (2.0843) [2022-01-17 12:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[1/300][1170/1251] eta 0:02:58 lr 0.000098 time 2.4838 (2.2015) loss 6.1234 (6.5620) grad_norm 2.1873 (2.0880) [2022-01-17 13:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1180/1251] eta 0:02:36 lr 0.000098 time 1.9059 (2.2020) loss 6.4147 (6.5615) grad_norm 2.0800 (2.0885) [2022-01-17 13:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1190/1251] eta 0:02:14 lr 0.000098 time 2.2381 (2.2004) loss 6.3409 (6.5606) grad_norm 1.6070 (2.0894) [2022-01-17 13:00:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1200/1251] eta 0:01:52 lr 0.000099 time 1.5490 (2.1989) loss 6.5680 (6.5597) grad_norm 1.8068 (2.0874) [2022-01-17 13:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1210/1251] eta 0:01:30 lr 0.000099 time 2.3943 (2.1980) loss 5.9899 (6.5581) grad_norm 2.0899 (2.0855) [2022-01-17 13:01:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1220/1251] eta 0:01:08 lr 0.000100 time 2.2145 (2.1965) loss 6.1918 (6.5568) grad_norm 1.9121 (2.0855) [2022-01-17 13:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1230/1251] eta 0:00:46 lr 0.000100 time 1.8032 (2.1962) loss 6.2067 (6.5554) grad_norm 2.5528 (2.0875) [2022-01-17 13:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1240/1251] eta 0:00:24 lr 0.000100 time 1.8636 (2.1956) loss 6.5622 (6.5552) grad_norm 2.6270 (2.0896) [2022-01-17 13:02:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [1/300][1250/1251] eta 0:00:02 lr 0.000101 time 1.1541 (2.1904) loss 6.1182 (6.5538) grad_norm 2.4095 (2.0899) [2022-01-17 13:02:33 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 1 training takes 0:45:40 [2022-01-17 13:02:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.066 (19.066) Loss 5.5691 (5.5691) Acc@1 6.055 (6.055) Acc@5 18.457 (18.457) [2022-01-17 13:03:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.906 (3.481) 
Loss 5.5324 (5.5855) Acc@1 6.250 (5.975) Acc@5 18.652 (17.241) [2022-01-17 13:03:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.622 (2.577) Loss 5.5407 (5.5771) Acc@1 5.469 (6.055) Acc@5 16.699 (17.411) [2022-01-17 13:03:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.311 (2.239) Loss 5.5345 (5.5758) Acc@1 6.055 (6.127) Acc@5 19.238 (17.566) [2022-01-17 13:04:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.247 (2.195) Loss 5.5630 (5.5774) Acc@1 6.934 (6.086) Acc@5 19.043 (17.554) [2022-01-17 13:04:10 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 6.130 Acc@5 17.586 [2022-01-17 13:04:10 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 6.1% [2022-01-17 13:04:10 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 6.13% [2022-01-17 13:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][0/1251] eta 7:26:28 lr 0.000101 time 21.4133 (21.4133) loss 6.3458 (6.3458) grad_norm 1.9607 (1.9607) [2022-01-17 13:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][10/1251] eta 1:24:14 lr 0.000101 time 2.7715 (4.0728) loss 6.0914 (6.3848) grad_norm 1.8630 (1.9969) [2022-01-17 13:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][20/1251] eta 1:03:06 lr 0.000102 time 1.6489 (3.0763) loss 6.0360 (6.4134) grad_norm 1.7637 (2.1185) [2022-01-17 13:05:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][30/1251] eta 0:57:25 lr 0.000102 time 1.9747 (2.8215) loss 6.2893 (6.4262) grad_norm 2.2892 (2.2225) [2022-01-17 13:06:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][40/1251] eta 0:54:56 lr 0.000102 time 3.9113 (2.7225) loss 6.7766 (6.4287) grad_norm 2.3902 (2.2425) [2022-01-17 13:06:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][50/1251] eta 0:52:43 lr 0.000103 time 2.5142 (2.6343) loss 6.5887 (6.4381) grad_norm 2.1967 
(2.2568) [2022-01-17 13:06:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][60/1251] eta 0:50:36 lr 0.000103 time 1.8507 (2.5498) loss 6.5163 (6.4596) grad_norm 1.5956 (2.2305) [2022-01-17 13:07:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][70/1251] eta 0:48:49 lr 0.000104 time 1.8646 (2.4801) loss 6.3794 (6.4591) grad_norm 2.3347 (2.2684) [2022-01-17 13:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][80/1251] eta 0:47:47 lr 0.000104 time 3.1579 (2.4491) loss 6.5605 (6.4605) grad_norm 2.5240 (2.3062) [2022-01-17 13:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][90/1251] eta 0:47:10 lr 0.000104 time 2.8214 (2.4383) loss 6.5088 (6.4535) grad_norm 1.7022 (2.2865) [2022-01-17 13:08:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][100/1251] eta 0:46:39 lr 0.000105 time 1.8039 (2.4321) loss 6.2612 (6.4522) grad_norm 1.6106 (2.2680) [2022-01-17 13:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][110/1251] eta 0:45:55 lr 0.000105 time 2.1984 (2.4148) loss 6.2502 (6.4377) grad_norm 2.7149 (2.2513) [2022-01-17 13:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][120/1251] eta 0:45:15 lr 0.000106 time 2.8102 (2.4010) loss 6.5874 (6.4430) grad_norm 2.7745 (2.2589) [2022-01-17 13:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][130/1251] eta 0:44:24 lr 0.000106 time 1.5179 (2.3771) loss 6.6544 (6.4412) grad_norm 2.7064 (2.2551) [2022-01-17 13:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][140/1251] eta 0:43:53 lr 0.000106 time 2.2130 (2.3702) loss 6.4211 (6.4444) grad_norm 2.4028 (2.2518) [2022-01-17 13:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][150/1251] eta 0:43:19 lr 0.000107 time 2.3115 (2.3610) loss 6.0093 (6.4366) grad_norm 2.1871 (2.2641) [2022-01-17 13:10:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][160/1251] eta 0:42:50 lr 
0.000107 time 3.1828 (2.3561) loss 6.3227 (6.4368) grad_norm 1.8329 (2.2388) [2022-01-17 13:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][170/1251] eta 0:42:02 lr 0.000108 time 1.5574 (2.3338) loss 6.4917 (6.4352) grad_norm 3.8427 (2.2430) [2022-01-17 13:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][180/1251] eta 0:41:15 lr 0.000108 time 1.9258 (2.3118) loss 6.4134 (6.4353) grad_norm 2.4982 (2.2411) [2022-01-17 13:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][190/1251] eta 0:40:36 lr 0.000108 time 2.1747 (2.2963) loss 6.2148 (6.4352) grad_norm 1.7787 (2.2322) [2022-01-17 13:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][200/1251] eta 0:40:03 lr 0.000109 time 2.6352 (2.2872) loss 6.4968 (6.4371) grad_norm 2.4927 (2.2210) [2022-01-17 13:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][210/1251] eta 0:39:36 lr 0.000109 time 1.9597 (2.2827) loss 6.2971 (6.4383) grad_norm 2.0584 (2.2118) [2022-01-17 13:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][220/1251] eta 0:39:08 lr 0.000110 time 1.8669 (2.2782) loss 6.4006 (6.4332) grad_norm 1.8240 (2.2049) [2022-01-17 13:12:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][230/1251] eta 0:38:52 lr 0.000110 time 2.5718 (2.2844) loss 6.6034 (6.4339) grad_norm 1.8213 (2.2059) [2022-01-17 13:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][240/1251] eta 0:38:25 lr 0.000110 time 2.9376 (2.2803) loss 6.6177 (6.4270) grad_norm 2.5815 (2.2068) [2022-01-17 13:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][250/1251] eta 0:38:02 lr 0.000111 time 2.1807 (2.2804) loss 6.0645 (6.4229) grad_norm 2.9309 (2.1968) [2022-01-17 13:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][260/1251] eta 0:37:37 lr 0.000111 time 1.8559 (2.2775) loss 6.6067 (6.4202) grad_norm 2.3140 (2.2012) [2022-01-17 13:14:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][270/1251] eta 0:37:10 lr 0.000112 time 2.8528 (2.2734) loss 6.5715 (6.4218) grad_norm 2.0624 (2.2061) [2022-01-17 13:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][280/1251] eta 0:36:41 lr 0.000112 time 2.2453 (2.2670) loss 6.5644 (6.4186) grad_norm 2.1359 (2.2039) [2022-01-17 13:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][290/1251] eta 0:36:14 lr 0.000112 time 2.0607 (2.2627) loss 6.2211 (6.4130) grad_norm 2.1577 (2.2096) [2022-01-17 13:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][300/1251] eta 0:35:46 lr 0.000113 time 1.7128 (2.2567) loss 6.5551 (6.4130) grad_norm 2.3243 (2.2131) [2022-01-17 13:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][310/1251] eta 0:35:27 lr 0.000113 time 3.8468 (2.2610) loss 6.4100 (6.4111) grad_norm 2.1234 (2.2140) [2022-01-17 13:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][320/1251] eta 0:34:59 lr 0.000114 time 2.2824 (2.2549) loss 6.3793 (6.4104) grad_norm 1.9561 (2.2127) [2022-01-17 13:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][330/1251] eta 0:34:33 lr 0.000114 time 2.4237 (2.2508) loss 6.4117 (6.4073) grad_norm 2.1385 (2.2123) [2022-01-17 13:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][340/1251] eta 0:34:08 lr 0.000114 time 1.8622 (2.2486) loss 6.1039 (6.4036) grad_norm 2.2297 (2.2085) [2022-01-17 13:17:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][350/1251] eta 0:33:47 lr 0.000115 time 3.4931 (2.2507) loss 6.3121 (6.4015) grad_norm 1.8886 (2.2008) [2022-01-17 13:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][360/1251] eta 0:33:24 lr 0.000115 time 2.5658 (2.2500) loss 6.0480 (6.3926) grad_norm 2.7371 (2.1970) [2022-01-17 13:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][370/1251] eta 0:32:57 lr 0.000116 time 1.7652 (2.2443) 
loss 6.4611 (6.3913) grad_norm 2.6611 (2.2046) [2022-01-17 13:18:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][380/1251] eta 0:32:32 lr 0.000116 time 1.8275 (2.2414) loss 6.2759 (6.3905) grad_norm 2.2976 (2.2074) [2022-01-17 13:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][390/1251] eta 0:32:09 lr 0.000116 time 2.9100 (2.2405) loss 6.5498 (6.3882) grad_norm 2.2403 (2.2058) [2022-01-17 13:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][400/1251] eta 0:31:40 lr 0.000117 time 1.6855 (2.2335) loss 5.9851 (6.3878) grad_norm 2.1233 (2.2033) [2022-01-17 13:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][410/1251] eta 0:31:18 lr 0.000117 time 2.2001 (2.2340) loss 6.6950 (6.3864) grad_norm 2.0945 (2.2003) [2022-01-17 13:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][420/1251] eta 0:30:56 lr 0.000118 time 2.6886 (2.2341) loss 6.4724 (6.3859) grad_norm 2.3738 (2.2057) [2022-01-17 13:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][430/1251] eta 0:30:35 lr 0.000118 time 3.5313 (2.2354) loss 6.1456 (6.3852) grad_norm 2.0220 (2.2059) [2022-01-17 13:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][440/1251] eta 0:30:10 lr 0.000118 time 1.6427 (2.2319) loss 6.0655 (6.3860) grad_norm 2.5804 (2.2002) [2022-01-17 13:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][450/1251] eta 0:29:46 lr 0.000119 time 1.5559 (2.2305) loss 6.4771 (6.3837) grad_norm 1.9426 (2.1957) [2022-01-17 13:21:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][460/1251] eta 0:29:22 lr 0.000119 time 2.2595 (2.2285) loss 6.2103 (6.3814) grad_norm 1.8235 (2.1940) [2022-01-17 13:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][470/1251] eta 0:29:01 lr 0.000120 time 3.6258 (2.2303) loss 6.0681 (6.3811) grad_norm 2.7602 (2.2010) [2022-01-17 13:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [2/300][480/1251] eta 0:28:38 lr 0.000120 time 1.7461 (2.2284) loss 6.0721 (6.3791) grad_norm 2.6739 (2.2097) [2022-01-17 13:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][490/1251] eta 0:28:16 lr 0.000120 time 1.5978 (2.2291) loss 6.4452 (6.3799) grad_norm 1.9162 (2.2111) [2022-01-17 13:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][500/1251] eta 0:27:53 lr 0.000121 time 1.8838 (2.2278) loss 6.5470 (6.3783) grad_norm 2.3179 (2.2157) [2022-01-17 13:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][510/1251] eta 0:27:29 lr 0.000121 time 3.7673 (2.2266) loss 6.5172 (6.3775) grad_norm 2.5552 (2.2168) [2022-01-17 13:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][520/1251] eta 0:27:04 lr 0.000122 time 1.5808 (2.2230) loss 6.3721 (6.3760) grad_norm 2.1845 (2.2142) [2022-01-17 13:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][530/1251] eta 0:26:42 lr 0.000122 time 1.8921 (2.2220) loss 6.3890 (6.3736) grad_norm 2.1136 (2.2081) [2022-01-17 13:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][540/1251] eta 0:26:19 lr 0.000122 time 1.7346 (2.2215) loss 6.3629 (6.3729) grad_norm 1.7970 (2.2045) [2022-01-17 13:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][550/1251] eta 0:25:57 lr 0.000123 time 3.1474 (2.2218) loss 6.2083 (6.3732) grad_norm 2.2937 (2.2029) [2022-01-17 13:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][560/1251] eta 0:25:34 lr 0.000123 time 2.3206 (2.2207) loss 5.9915 (6.3709) grad_norm 1.8549 (2.2024) [2022-01-17 13:25:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][570/1251] eta 0:25:11 lr 0.000124 time 1.8540 (2.2197) loss 6.4360 (6.3703) grad_norm 2.0172 (2.2033) [2022-01-17 13:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][580/1251] eta 0:24:48 lr 0.000124 time 1.8476 (2.2187) loss 5.9684 (6.3684) grad_norm 2.4215 (2.2082) 
[2022-01-17 13:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][590/1251] eta 0:24:26 lr 0.000124 time 2.8373 (2.2182) loss 6.3663 (6.3684) grad_norm 2.4359 (2.2099) [2022-01-17 13:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][600/1251] eta 0:24:03 lr 0.000125 time 1.8865 (2.2173) loss 6.1042 (6.3674) grad_norm 1.6850 (2.2052) [2022-01-17 13:26:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][610/1251] eta 0:23:41 lr 0.000125 time 2.1892 (2.2183) loss 6.0415 (6.3669) grad_norm 1.9600 (2.2019) [2022-01-17 13:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][620/1251] eta 0:23:19 lr 0.000126 time 2.4914 (2.2183) loss 6.3459 (6.3667) grad_norm 2.1736 (2.2014) [2022-01-17 13:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][630/1251] eta 0:22:57 lr 0.000126 time 3.0355 (2.2175) loss 6.5097 (6.3659) grad_norm 2.4284 (2.2027) [2022-01-17 13:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][640/1251] eta 0:22:34 lr 0.000126 time 1.8610 (2.2168) loss 6.3689 (6.3622) grad_norm 2.6940 (2.2025) [2022-01-17 13:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][650/1251] eta 0:22:12 lr 0.000127 time 1.8851 (2.2163) loss 6.6025 (6.3622) grad_norm 3.4166 (2.2067) [2022-01-17 13:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][660/1251] eta 0:21:48 lr 0.000127 time 1.8848 (2.2147) loss 6.4497 (6.3604) grad_norm 1.8841 (2.2092) [2022-01-17 13:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][670/1251] eta 0:21:26 lr 0.000128 time 3.0473 (2.2140) loss 6.4633 (6.3601) grad_norm 2.7840 (2.2100) [2022-01-17 13:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][680/1251] eta 0:21:03 lr 0.000128 time 1.7688 (2.2127) loss 5.9140 (6.3591) grad_norm 2.2367 (2.2128) [2022-01-17 13:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][690/1251] eta 0:20:41 lr 0.000128 
time 1.5538 (2.2130) loss 6.3809 (6.3573) grad_norm 1.9575 (2.2110) [2022-01-17 13:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][700/1251] eta 0:20:18 lr 0.000129 time 1.9188 (2.2117) loss 6.5380 (6.3575) grad_norm 2.3468 (2.2124) [2022-01-17 13:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][710/1251] eta 0:19:56 lr 0.000129 time 2.7906 (2.2108) loss 6.2922 (6.3569) grad_norm 1.8198 (2.2125) [2022-01-17 13:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][720/1251] eta 0:19:33 lr 0.000130 time 1.6657 (2.2107) loss 6.1224 (6.3551) grad_norm 1.9828 (2.2106) [2022-01-17 13:31:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][730/1251] eta 0:19:12 lr 0.000130 time 1.8487 (2.2112) loss 6.3707 (6.3559) grad_norm 2.3448 (2.2144) [2022-01-17 13:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][740/1251] eta 0:18:49 lr 0.000130 time 2.8402 (2.2113) loss 5.8608 (6.3563) grad_norm 2.5428 (2.2183) [2022-01-17 13:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][750/1251] eta 0:18:25 lr 0.000131 time 1.6320 (2.2073) loss 5.8724 (6.3555) grad_norm 2.6067 (2.2195) [2022-01-17 13:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][760/1251] eta 0:18:03 lr 0.000131 time 2.1155 (2.2060) loss 6.6096 (6.3554) grad_norm 1.8600 (2.2182) [2022-01-17 13:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][770/1251] eta 0:17:40 lr 0.000132 time 2.2403 (2.2052) loss 6.2900 (6.3532) grad_norm 1.8054 (2.2150) [2022-01-17 13:32:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][780/1251] eta 0:17:19 lr 0.000132 time 2.9238 (2.2060) loss 6.4764 (6.3521) grad_norm 2.0401 (2.2138) [2022-01-17 13:33:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][790/1251] eta 0:16:57 lr 0.000132 time 2.4944 (2.2068) loss 6.3842 (6.3513) grad_norm 2.0233 (2.2123) [2022-01-17 13:33:36 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [2/300][800/1251] eta 0:16:34 lr 0.000133 time 1.8512 (2.2049) loss 6.3549 (6.3506) grad_norm 1.9245 (2.2093) [2022-01-17 13:33:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][810/1251] eta 0:16:11 lr 0.000133 time 2.1493 (2.2040) loss 6.2937 (6.3500) grad_norm 2.0616 (2.2096) [2022-01-17 13:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][820/1251] eta 0:15:50 lr 0.000134 time 2.6125 (2.2044) loss 5.7162 (6.3486) grad_norm 2.8874 (2.2130) [2022-01-17 13:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][830/1251] eta 0:15:28 lr 0.000134 time 2.7984 (2.2056) loss 6.2469 (6.3463) grad_norm 2.6614 (2.2141) [2022-01-17 13:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][840/1251] eta 0:15:06 lr 0.000134 time 1.8532 (2.2061) loss 6.3413 (6.3440) grad_norm 1.7807 (2.2135) [2022-01-17 13:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][850/1251] eta 0:14:45 lr 0.000135 time 3.3415 (2.2086) loss 6.5954 (6.3441) grad_norm 2.6934 (2.2143) [2022-01-17 13:35:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][860/1251] eta 0:14:23 lr 0.000135 time 1.8660 (2.2082) loss 6.4170 (6.3422) grad_norm 1.9757 (2.2146) [2022-01-17 13:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][870/1251] eta 0:14:00 lr 0.000136 time 1.7709 (2.2056) loss 6.0991 (6.3404) grad_norm 2.5849 (2.2154) [2022-01-17 13:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][880/1251] eta 0:13:37 lr 0.000136 time 1.9240 (2.2037) loss 6.7088 (6.3407) grad_norm 3.0104 (2.2175) [2022-01-17 13:36:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][890/1251] eta 0:13:15 lr 0.000136 time 2.3135 (2.2036) loss 6.1014 (6.3407) grad_norm 2.1602 (2.2187) [2022-01-17 13:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][900/1251] eta 0:12:53 lr 0.000137 time 1.8367 (2.2031) loss 6.3978 (6.3384) 
grad_norm 2.6030 (2.2186) [2022-01-17 13:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][910/1251] eta 0:12:30 lr 0.000137 time 2.2345 (2.2020) loss 6.1815 (6.3373) grad_norm 2.3201 (2.2177) [2022-01-17 13:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][920/1251] eta 0:12:08 lr 0.000138 time 1.8172 (2.2016) loss 6.3407 (6.3372) grad_norm 2.5882 (2.2212) [2022-01-17 13:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][930/1251] eta 0:11:46 lr 0.000138 time 2.8589 (2.2013) loss 6.5493 (6.3355) grad_norm 2.0006 (2.2218) [2022-01-17 13:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][940/1251] eta 0:11:25 lr 0.000138 time 2.0245 (2.2032) loss 6.5600 (6.3335) grad_norm 2.3538 (2.2224) [2022-01-17 13:39:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][950/1251] eta 0:11:03 lr 0.000139 time 2.2558 (2.2034) loss 6.4073 (6.3326) grad_norm 2.2497 (2.2226) [2022-01-17 13:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][960/1251] eta 0:10:40 lr 0.000139 time 1.5570 (2.2025) loss 6.3890 (6.3316) grad_norm 2.1832 (2.2246) [2022-01-17 13:39:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][970/1251] eta 0:10:19 lr 0.000140 time 1.9293 (2.2038) loss 6.2375 (6.3293) grad_norm 1.9417 (2.2246) [2022-01-17 13:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][980/1251] eta 0:09:57 lr 0.000140 time 1.8256 (2.2034) loss 5.9037 (6.3277) grad_norm 2.0373 (2.2245) [2022-01-17 13:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][990/1251] eta 0:09:34 lr 0.000140 time 2.3321 (2.2030) loss 6.3579 (6.3268) grad_norm 2.3274 (2.2267) [2022-01-17 13:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1000/1251] eta 0:09:12 lr 0.000141 time 1.8829 (2.2015) loss 6.4169 (6.3255) grad_norm 2.2872 (2.2257) [2022-01-17 13:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[2/300][1010/1251] eta 0:08:50 lr 0.000141 time 1.5974 (2.2004) loss 6.3747 (6.3251) grad_norm 2.2236 (2.2281) [2022-01-17 13:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1020/1251] eta 0:08:28 lr 0.000142 time 1.6860 (2.2000) loss 6.4066 (6.3235) grad_norm 2.1206 (2.2289) [2022-01-17 13:41:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1030/1251] eta 0:08:06 lr 0.000142 time 2.1875 (2.2008) loss 6.6504 (6.3225) grad_norm 2.7362 (2.2299) [2022-01-17 13:42:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1040/1251] eta 0:07:44 lr 0.000142 time 2.0106 (2.2012) loss 5.7401 (6.3208) grad_norm 3.7649 (2.2338) [2022-01-17 13:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1050/1251] eta 0:07:22 lr 0.000143 time 2.1712 (2.2018) loss 6.4825 (6.3199) grad_norm 2.0558 (2.2357) [2022-01-17 13:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1060/1251] eta 0:07:00 lr 0.000143 time 2.1498 (2.2003) loss 6.3656 (6.3202) grad_norm 1.9924 (2.2353) [2022-01-17 13:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1070/1251] eta 0:06:37 lr 0.000144 time 1.8045 (2.1985) loss 6.3282 (6.3194) grad_norm 2.2180 (2.2348) [2022-01-17 13:43:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1080/1251] eta 0:06:15 lr 0.000144 time 2.6913 (2.1986) loss 6.2320 (6.3191) grad_norm 2.5113 (2.2325) [2022-01-17 13:44:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1090/1251] eta 0:05:53 lr 0.000144 time 1.8828 (2.1985) loss 5.9799 (6.3188) grad_norm 2.4080 (2.2329) [2022-01-17 13:44:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1100/1251] eta 0:05:32 lr 0.000145 time 2.3894 (2.1992) loss 5.6663 (6.3169) grad_norm 2.0688 (2.2326) [2022-01-17 13:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1110/1251] eta 0:05:10 lr 0.000145 time 1.9477 (2.1987) loss 6.3627 (6.3171) grad_norm 2.2656 
(2.2321) [2022-01-17 13:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1120/1251] eta 0:04:47 lr 0.000146 time 1.8763 (2.1981) loss 5.9737 (6.3165) grad_norm 2.4729 (2.2341) [2022-01-17 13:45:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1130/1251] eta 0:04:26 lr 0.000146 time 2.1201 (2.1994) loss 6.2594 (6.3165) grad_norm 2.1191 (2.2378) [2022-01-17 13:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1140/1251] eta 0:04:04 lr 0.000146 time 1.8630 (2.1995) loss 6.5172 (6.3162) grad_norm 2.5112 (2.2429) [2022-01-17 13:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1150/1251] eta 0:03:42 lr 0.000147 time 1.6268 (2.1997) loss 6.5650 (6.3165) grad_norm 2.0549 (2.2426) [2022-01-17 13:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1160/1251] eta 0:03:20 lr 0.000147 time 1.9948 (2.1985) loss 6.3185 (6.3153) grad_norm 2.7013 (2.2420) [2022-01-17 13:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1170/1251] eta 0:02:57 lr 0.000148 time 1.7581 (2.1971) loss 6.1189 (6.3149) grad_norm 1.9368 (2.2417) [2022-01-17 13:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1180/1251] eta 0:02:35 lr 0.000148 time 2.0711 (2.1966) loss 6.4042 (6.3145) grad_norm 2.6818 (2.2437) [2022-01-17 13:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1190/1251] eta 0:02:13 lr 0.000148 time 2.3210 (2.1962) loss 5.9404 (6.3138) grad_norm 2.9961 (2.2452) [2022-01-17 13:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1200/1251] eta 0:01:52 lr 0.000149 time 2.1535 (2.1963) loss 6.3976 (6.3128) grad_norm 2.3617 (2.2461) [2022-01-17 13:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1210/1251] eta 0:01:30 lr 0.000149 time 1.9704 (2.1957) loss 6.5333 (6.3117) grad_norm 2.2770 (2.2461) [2022-01-17 13:48:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1220/1251] eta 
0:01:08 lr 0.000150 time 2.1806 (2.1959) loss 6.5249 (6.3114) grad_norm 2.2905 (2.2471)
[2022-01-17 13:49:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1230/1251] eta 0:00:46 lr 0.000150 time 2.1570 (2.1963) loss 6.1329 (6.3109) grad_norm 1.8796 (2.2479)
[2022-01-17 13:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1240/1251] eta 0:00:24 lr 0.000150 time 2.1527 (2.1956) loss 6.3660 (6.3107) grad_norm 2.2804 (2.2481)
[2022-01-17 13:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [2/300][1250/1251] eta 0:00:02 lr 0.000151 time 1.1745 (2.1905) loss 6.4508 (6.3094) grad_norm 2.7020 (2.2492)
[2022-01-17 13:49:51 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 2 training takes 0:45:40
[2022-01-17 13:50:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.020 (18.020) Loss 4.9885 (4.9885) Acc@1 10.449 (10.449) Acc@5 28.809 (28.809)
[2022-01-17 13:50:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.280 (3.439) Loss 5.0477 (4.9506) Acc@1 10.840 (11.737) Acc@5 26.367 (28.240)
[2022-01-17 13:50:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.959 (2.653) Loss 4.8589 (4.9439) Acc@1 11.719 (11.509) Acc@5 29.883 (28.246)
[2022-01-17 13:51:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.283 (2.412) Loss 4.9251 (4.9462) Acc@1 11.621 (11.505) Acc@5 30.566 (28.393)
[2022-01-17 13:51:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.070 (2.225) Loss 4.9223 (4.9478) Acc@1 12.012 (11.552) Acc@5 28.027 (28.392)
[2022-01-17 13:51:30 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 11.530 Acc@5 28.352
[2022-01-17 13:51:30 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 11.5%
[2022-01-17 13:51:30 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 11.53%
[2022-01-17 13:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[3/300][0/1251] eta 7:23:50 lr 0.000151 time 21.2871 (21.2871) loss 6.0097 (6.0097) grad_norm 2.0064 (2.0064) [2022-01-17 13:52:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][10/1251] eta 1:25:43 lr 0.000151 time 1.8978 (4.1448) loss 6.3747 (6.2698) grad_norm 1.9571 (2.1589) [2022-01-17 13:52:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][20/1251] eta 1:04:47 lr 0.000152 time 2.1580 (3.1577) loss 6.0820 (6.1987) grad_norm 1.9198 (2.2122) [2022-01-17 13:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][30/1251] eta 0:58:34 lr 0.000152 time 1.3574 (2.8787) loss 6.2142 (6.2033) grad_norm 1.6565 (2.1822) [2022-01-17 13:53:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][40/1251] eta 0:55:21 lr 0.000152 time 3.5964 (2.7428) loss 6.1281 (6.2112) grad_norm 2.3326 (2.2308) [2022-01-17 13:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][50/1251] eta 0:53:44 lr 0.000153 time 2.4623 (2.6848) loss 6.1371 (6.2104) grad_norm 2.3041 (2.2179) [2022-01-17 13:54:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][60/1251] eta 0:51:34 lr 0.000153 time 2.2107 (2.5981) loss 5.8695 (6.1886) grad_norm 2.0006 (2.2369) [2022-01-17 13:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][70/1251] eta 0:49:50 lr 0.000154 time 1.5964 (2.5320) loss 6.4215 (6.1937) grad_norm 2.0847 (2.2401) [2022-01-17 13:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][80/1251] eta 0:48:34 lr 0.000154 time 2.5906 (2.4888) loss 6.1552 (6.1971) grad_norm 2.4675 (2.2321) [2022-01-17 13:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][90/1251] eta 0:47:26 lr 0.000154 time 2.4537 (2.4515) loss 5.9937 (6.1814) grad_norm 2.7335 (2.2525) [2022-01-17 13:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][100/1251] eta 0:46:09 lr 0.000155 time 2.1381 (2.4061) loss 6.0341 (6.1922) grad_norm 1.8578 (2.2329) [2022-01-17 
13:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][110/1251] eta 0:45:22 lr 0.000155 time 1.7436 (2.3862) loss 6.1234 (6.1785) grad_norm 2.2132 (2.2247) [2022-01-17 13:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][120/1251] eta 0:44:51 lr 0.000156 time 3.1472 (2.3802) loss 6.4798 (6.1783) grad_norm 1.6719 (2.2259) [2022-01-17 13:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][130/1251] eta 0:44:19 lr 0.000156 time 2.8288 (2.3724) loss 6.3972 (6.1716) grad_norm 2.4419 (2.2325) [2022-01-17 13:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][140/1251] eta 0:43:46 lr 0.000156 time 2.2886 (2.3639) loss 6.4262 (6.1688) grad_norm 1.8211 (2.2361) [2022-01-17 13:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][150/1251] eta 0:43:05 lr 0.000157 time 1.9281 (2.3484) loss 6.1013 (6.1635) grad_norm 2.0101 (2.2480) [2022-01-17 13:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][160/1251] eta 0:42:29 lr 0.000157 time 2.1517 (2.3372) loss 6.1051 (6.1621) grad_norm 2.3908 (2.2569) [2022-01-17 13:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][170/1251] eta 0:41:49 lr 0.000158 time 2.3543 (2.3215) loss 6.3498 (6.1641) grad_norm 2.5559 (2.2646) [2022-01-17 13:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][180/1251] eta 0:41:13 lr 0.000158 time 2.9038 (2.3095) loss 6.5323 (6.1614) grad_norm 2.2003 (2.2604) [2022-01-17 13:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][190/1251] eta 0:40:42 lr 0.000158 time 1.6886 (2.3020) loss 6.2849 (6.1547) grad_norm 1.9478 (2.2612) [2022-01-17 13:59:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][200/1251] eta 0:40:05 lr 0.000159 time 1.7785 (2.2886) loss 6.2449 (6.1476) grad_norm 1.8590 (2.2731) [2022-01-17 13:59:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][210/1251] eta 0:39:37 lr 0.000159 time 1.9478 
(2.2837) loss 6.3704 (6.1470) grad_norm 2.1434 (2.2720) [2022-01-17 13:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][220/1251] eta 0:39:19 lr 0.000160 time 3.6374 (2.2882) loss 5.6570 (6.1479) grad_norm 2.4561 (2.2820) [2022-01-17 14:00:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][230/1251] eta 0:38:53 lr 0.000160 time 1.5384 (2.2854) loss 6.1376 (6.1506) grad_norm 1.9963 (2.2746) [2022-01-17 14:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][240/1251] eta 0:38:25 lr 0.000160 time 1.7915 (2.2800) loss 6.4952 (6.1514) grad_norm 1.8643 (2.2691) [2022-01-17 14:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][250/1251] eta 0:37:55 lr 0.000161 time 1.9041 (2.2733) loss 6.4663 (6.1476) grad_norm 2.0774 (2.2623) [2022-01-17 14:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][260/1251] eta 0:37:30 lr 0.000161 time 2.5673 (2.2706) loss 6.3394 (6.1432) grad_norm 1.9328 (2.2648) [2022-01-17 14:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][270/1251] eta 0:37:04 lr 0.000162 time 2.5464 (2.2674) loss 5.9964 (6.1468) grad_norm 1.9540 (2.2687) [2022-01-17 14:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][280/1251] eta 0:36:37 lr 0.000162 time 2.0249 (2.2635) loss 5.7188 (6.1414) grad_norm 2.0964 (2.2646) [2022-01-17 14:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][290/1251] eta 0:36:18 lr 0.000162 time 3.1784 (2.2673) loss 6.3894 (6.1460) grad_norm 2.1192 (2.2602) [2022-01-17 14:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][300/1251] eta 0:35:55 lr 0.000163 time 2.7889 (2.2662) loss 6.1261 (6.1499) grad_norm 2.0289 (2.2643) [2022-01-17 14:03:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][310/1251] eta 0:35:33 lr 0.000163 time 2.5092 (2.2672) loss 5.6290 (6.1494) grad_norm 1.9493 (2.2617) [2022-01-17 14:03:36 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [3/300][320/1251] eta 0:35:05 lr 0.000164 time 1.5902 (2.2617) loss 5.9704 (6.1426) grad_norm 2.5550 (2.2606) [2022-01-17 14:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][330/1251] eta 0:34:39 lr 0.000164 time 2.8421 (2.2579) loss 5.6940 (6.1371) grad_norm 1.7686 (2.2619) [2022-01-17 14:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][340/1251] eta 0:34:08 lr 0.000164 time 1.7826 (2.2490) loss 6.1059 (6.1366) grad_norm 2.2012 (2.2646) [2022-01-17 14:04:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][350/1251] eta 0:33:43 lr 0.000165 time 2.2565 (2.2460) loss 6.2339 (6.1333) grad_norm 2.2509 (2.2637) [2022-01-17 14:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][360/1251] eta 0:33:19 lr 0.000165 time 2.2703 (2.2444) loss 6.0390 (6.1320) grad_norm 2.6841 (2.2686) [2022-01-17 14:05:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][370/1251] eta 0:32:59 lr 0.000166 time 2.1678 (2.2474) loss 6.3510 (6.1316) grad_norm 3.3692 (2.2844) [2022-01-17 14:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][380/1251] eta 0:32:36 lr 0.000166 time 1.8337 (2.2461) loss 5.8134 (6.1302) grad_norm 2.2934 (2.2848) [2022-01-17 14:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][390/1251] eta 0:32:14 lr 0.000166 time 2.4361 (2.2464) loss 5.5368 (6.1317) grad_norm 3.5449 (2.2882) [2022-01-17 14:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][400/1251] eta 0:31:50 lr 0.000167 time 2.2648 (2.2447) loss 5.8317 (6.1314) grad_norm 1.8783 (2.2859) [2022-01-17 14:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][410/1251] eta 0:31:24 lr 0.000167 time 2.1019 (2.2410) loss 6.3186 (6.1310) grad_norm 2.3152 (2.2788) [2022-01-17 14:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][420/1251] eta 0:30:58 lr 0.000168 time 1.5919 (2.2367) loss 6.4693 (6.1331) grad_norm 1.8544 
(2.2748) [2022-01-17 14:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][430/1251] eta 0:30:34 lr 0.000168 time 1.8894 (2.2339) loss 6.3358 (6.1354) grad_norm 2.7681 (2.2786) [2022-01-17 14:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][440/1251] eta 0:30:08 lr 0.000168 time 1.9014 (2.2306) loss 5.8665 (6.1321) grad_norm 2.2588 (2.2833) [2022-01-17 14:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][450/1251] eta 0:29:44 lr 0.000169 time 2.4998 (2.2282) loss 5.9158 (6.1332) grad_norm 2.1114 (2.2825) [2022-01-17 14:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][460/1251] eta 0:29:20 lr 0.000169 time 1.9440 (2.2251) loss 6.3590 (6.1337) grad_norm 2.3514 (2.2798) [2022-01-17 14:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][470/1251] eta 0:28:57 lr 0.000170 time 2.6695 (2.2244) loss 5.5332 (6.1333) grad_norm 2.2932 (2.2806) [2022-01-17 14:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][480/1251] eta 0:28:37 lr 0.000170 time 2.0760 (2.2277) loss 6.0885 (6.1300) grad_norm 3.8088 (2.2904) [2022-01-17 14:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][490/1251] eta 0:28:18 lr 0.000170 time 2.8315 (2.2315) loss 6.2774 (6.1304) grad_norm 2.3661 (2.2915) [2022-01-17 14:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][500/1251] eta 0:27:54 lr 0.000171 time 1.4811 (2.2300) loss 6.3756 (6.1297) grad_norm 2.2356 (2.2909) [2022-01-17 14:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][510/1251] eta 0:27:35 lr 0.000171 time 2.7174 (2.2341) loss 5.5483 (6.1273) grad_norm 1.7974 (2.2875) [2022-01-17 14:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][520/1251] eta 0:27:15 lr 0.000172 time 1.9103 (2.2369) loss 6.3786 (6.1265) grad_norm 1.9507 (2.2863) [2022-01-17 14:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][530/1251] eta 0:26:52 lr 
0.000172 time 2.5225 (2.2358) loss 5.4971 (6.1233) grad_norm 2.0310 (2.2858) [2022-01-17 14:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][540/1251] eta 0:26:26 lr 0.000172 time 1.7675 (2.2318) loss 6.3464 (6.1242) grad_norm 2.2552 (2.2862) [2022-01-17 14:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][550/1251] eta 0:26:00 lr 0.000173 time 1.8439 (2.2260) loss 6.0960 (6.1218) grad_norm 2.0461 (2.2957) [2022-01-17 14:12:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][560/1251] eta 0:25:36 lr 0.000173 time 1.8545 (2.2230) loss 5.5725 (6.1231) grad_norm 2.2646 (2.2978) [2022-01-17 14:12:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][570/1251] eta 0:25:12 lr 0.000174 time 2.4635 (2.2209) loss 6.1211 (6.1238) grad_norm 2.4149 (2.2953) [2022-01-17 14:13:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][580/1251] eta 0:24:50 lr 0.000174 time 2.8303 (2.2215) loss 5.4782 (6.1218) grad_norm 1.7298 (2.2930) [2022-01-17 14:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][590/1251] eta 0:24:27 lr 0.000174 time 2.1312 (2.2197) loss 5.6024 (6.1195) grad_norm 2.5717 (2.2951) [2022-01-17 14:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][600/1251] eta 0:24:03 lr 0.000175 time 1.9683 (2.2170) loss 6.0519 (6.1187) grad_norm 2.5573 (2.2947) [2022-01-17 14:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][610/1251] eta 0:23:41 lr 0.000175 time 2.4964 (2.2174) loss 5.6514 (6.1172) grad_norm 2.1471 (2.2928) [2022-01-17 14:14:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][620/1251] eta 0:23:20 lr 0.000176 time 3.1892 (2.2192) loss 6.1238 (6.1176) grad_norm 2.8079 (2.2909) [2022-01-17 14:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][630/1251] eta 0:23:00 lr 0.000176 time 2.4368 (2.2225) loss 5.9620 (6.1156) grad_norm 1.9523 (2.2920) [2022-01-17 14:15:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][640/1251] eta 0:22:37 lr 0.000176 time 1.6680 (2.2218) loss 5.7415 (6.1152) grad_norm 2.0432 (2.2882) [2022-01-17 14:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][650/1251] eta 0:22:15 lr 0.000177 time 2.7810 (2.2217) loss 6.2435 (6.1173) grad_norm 2.0482 (2.2889) [2022-01-17 14:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][660/1251] eta 0:21:51 lr 0.000177 time 2.6057 (2.2196) loss 5.8187 (6.1159) grad_norm 2.5451 (2.2945) [2022-01-17 14:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][670/1251] eta 0:21:29 lr 0.000178 time 1.6502 (2.2194) loss 5.8367 (6.1146) grad_norm 2.6210 (2.2960) [2022-01-17 14:16:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][680/1251] eta 0:21:06 lr 0.000178 time 2.2163 (2.2175) loss 6.3137 (6.1132) grad_norm 3.4712 (2.2996) [2022-01-17 14:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][690/1251] eta 0:20:44 lr 0.000178 time 2.9723 (2.2189) loss 6.1419 (6.1126) grad_norm 2.1340 (2.2996) [2022-01-17 14:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][700/1251] eta 0:20:21 lr 0.000179 time 1.9369 (2.2171) loss 5.5182 (6.1120) grad_norm 2.2031 (2.3010) [2022-01-17 14:17:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][710/1251] eta 0:19:58 lr 0.000179 time 2.2403 (2.2160) loss 6.3323 (6.1114) grad_norm 2.3944 (2.3025) [2022-01-17 14:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][720/1251] eta 0:19:36 lr 0.000180 time 2.5322 (2.2151) loss 6.5265 (6.1134) grad_norm 2.4589 (2.3050) [2022-01-17 14:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][730/1251] eta 0:19:12 lr 0.000180 time 1.9336 (2.2128) loss 6.3373 (6.1128) grad_norm 2.7132 (2.3045) [2022-01-17 14:18:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][740/1251] eta 0:18:50 lr 0.000180 time 1.5690 (2.2122) 
loss 6.4019 (6.1094) grad_norm 1.9602 (2.3052) [2022-01-17 14:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][750/1251] eta 0:18:28 lr 0.000181 time 2.1005 (2.2116) loss 5.7229 (6.1079) grad_norm 2.2263 (2.3048) [2022-01-17 14:19:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][760/1251] eta 0:18:05 lr 0.000181 time 1.8591 (2.2103) loss 5.8076 (6.1086) grad_norm 2.3324 (2.3073) [2022-01-17 14:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][770/1251] eta 0:17:42 lr 0.000182 time 2.1360 (2.2089) loss 6.2333 (6.1078) grad_norm 2.1873 (2.3098) [2022-01-17 14:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][780/1251] eta 0:17:19 lr 0.000182 time 1.8612 (2.2074) loss 6.0347 (6.1075) grad_norm 1.7231 (2.3095) [2022-01-17 14:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][790/1251] eta 0:16:57 lr 0.000182 time 2.1504 (2.2068) loss 6.2696 (6.1066) grad_norm 2.2276 (2.3091) [2022-01-17 14:20:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][800/1251] eta 0:16:34 lr 0.000183 time 2.0903 (2.2061) loss 5.9116 (6.1062) grad_norm 2.3976 (2.3120) [2022-01-17 14:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][810/1251] eta 0:16:12 lr 0.000183 time 2.1244 (2.2059) loss 6.3723 (6.1072) grad_norm 3.2174 (2.3164) [2022-01-17 14:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][820/1251] eta 0:15:50 lr 0.000184 time 1.8992 (2.2060) loss 6.2117 (6.1070) grad_norm 2.5056 (2.3174) [2022-01-17 14:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][830/1251] eta 0:15:28 lr 0.000184 time 1.9459 (2.2054) loss 5.7572 (6.1065) grad_norm 2.2409 (2.3170) [2022-01-17 14:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][840/1251] eta 0:15:06 lr 0.000184 time 2.3765 (2.2053) loss 5.9276 (6.1068) grad_norm 2.4654 (2.3162) [2022-01-17 14:22:47 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [3/300][850/1251] eta 0:14:44 lr 0.000185 time 1.5583 (2.2056) loss 6.1329 (6.1057) grad_norm 2.9423 (2.3172) [2022-01-17 14:23:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][860/1251] eta 0:14:22 lr 0.000185 time 1.8934 (2.2056) loss 6.3728 (6.1040) grad_norm 2.7588 (2.3193) [2022-01-17 14:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][870/1251] eta 0:14:01 lr 0.000186 time 1.9604 (2.2077) loss 6.2342 (6.1039) grad_norm 2.0258 (2.3194) [2022-01-17 14:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][880/1251] eta 0:13:39 lr 0.000186 time 2.0147 (2.2095) loss 6.2682 (6.1028) grad_norm 3.2340 (2.3219) [2022-01-17 14:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][890/1251] eta 0:13:17 lr 0.000186 time 1.9060 (2.2086) loss 5.9883 (6.1035) grad_norm 1.7542 (2.3209) [2022-01-17 14:24:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][900/1251] eta 0:12:54 lr 0.000187 time 2.2336 (2.2064) loss 6.3075 (6.1030) grad_norm 1.8386 (2.3199) [2022-01-17 14:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][910/1251] eta 0:12:31 lr 0.000187 time 1.7788 (2.2051) loss 5.6348 (6.1027) grad_norm 2.1499 (2.3189) [2022-01-17 14:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][920/1251] eta 0:12:09 lr 0.000188 time 2.1654 (2.2041) loss 6.3529 (6.1028) grad_norm 1.7184 (2.3188) [2022-01-17 14:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][930/1251] eta 0:11:47 lr 0.000188 time 2.2141 (2.2035) loss 6.5549 (6.1020) grad_norm 2.5645 (2.3224) [2022-01-17 14:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][940/1251] eta 0:11:25 lr 0.000188 time 2.5438 (2.2035) loss 5.7912 (6.1024) grad_norm 2.6526 (2.3247) [2022-01-17 14:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][950/1251] eta 0:11:03 lr 0.000189 time 1.7824 (2.2029) loss 6.3201 (6.1032) grad_norm 2.2478 (2.3248) 
[2022-01-17 14:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][960/1251] eta 0:10:41 lr 0.000189 time 2.4907 (2.2038) loss 6.1153 (6.1008) grad_norm 2.2569 (2.3250) [2022-01-17 14:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][970/1251] eta 0:10:19 lr 0.000190 time 1.7545 (2.2049) loss 5.7392 (6.1009) grad_norm 2.4824 (2.3249) [2022-01-17 14:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][980/1251] eta 0:09:58 lr 0.000190 time 2.9254 (2.2077) loss 6.0865 (6.0993) grad_norm 2.7844 (2.3282) [2022-01-17 14:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][990/1251] eta 0:09:36 lr 0.000190 time 2.2198 (2.2071) loss 5.4376 (6.0990) grad_norm 2.7277 (2.3279) [2022-01-17 14:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1000/1251] eta 0:09:13 lr 0.000191 time 1.8434 (2.2053) loss 6.2447 (6.0978) grad_norm 2.1804 (2.3273) [2022-01-17 14:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1010/1251] eta 0:08:51 lr 0.000191 time 1.8778 (2.2037) loss 6.1357 (6.0971) grad_norm 1.9786 (2.3271) [2022-01-17 14:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1020/1251] eta 0:08:29 lr 0.000192 time 1.8627 (2.2059) loss 6.0870 (6.0952) grad_norm 2.2274 (2.3283) [2022-01-17 14:29:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1030/1251] eta 0:08:07 lr 0.000192 time 2.2647 (2.2047) loss 6.1726 (6.0942) grad_norm 2.1875 (2.3269) [2022-01-17 14:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1040/1251] eta 0:07:45 lr 0.000192 time 1.7025 (2.2039) loss 6.1862 (6.0938) grad_norm 2.2143 (2.3274) [2022-01-17 14:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1050/1251] eta 0:07:22 lr 0.000193 time 2.2509 (2.2030) loss 6.2115 (6.0929) grad_norm 2.9127 (2.3270) [2022-01-17 14:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1060/1251] eta 0:07:01 lr 
0.000193 time 2.8163 (2.2045) loss 6.0770 (6.0915) grad_norm 2.0729 (2.3298) [2022-01-17 14:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1070/1251] eta 0:06:38 lr 0.000194 time 1.8787 (2.2036) loss 5.7392 (6.0894) grad_norm 2.1209 (2.3306) [2022-01-17 14:31:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1080/1251] eta 0:06:16 lr 0.000194 time 1.8884 (2.2019) loss 6.1325 (6.0895) grad_norm 2.2392 (2.3318) [2022-01-17 14:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1090/1251] eta 0:05:54 lr 0.000194 time 3.1259 (2.2020) loss 6.2832 (6.0896) grad_norm 2.3450 (2.3320) [2022-01-17 14:31:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1100/1251] eta 0:05:32 lr 0.000195 time 2.4167 (2.2027) loss 6.3593 (6.0879) grad_norm 1.9940 (2.3310) [2022-01-17 14:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1110/1251] eta 0:05:10 lr 0.000195 time 2.1240 (2.2019) loss 5.6253 (6.0840) grad_norm 2.4178 (2.3312) [2022-01-17 14:32:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1120/1251] eta 0:04:48 lr 0.000196 time 1.9123 (2.2009) loss 6.1448 (6.0819) grad_norm 2.2747 (2.3355) [2022-01-17 14:32:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1130/1251] eta 0:04:26 lr 0.000196 time 2.5019 (2.2010) loss 5.3277 (6.0805) grad_norm 2.8238 (2.3352) [2022-01-17 14:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1140/1251] eta 0:04:04 lr 0.000196 time 2.9498 (2.2007) loss 5.7339 (6.0793) grad_norm 2.1329 (2.3353) [2022-01-17 14:33:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1150/1251] eta 0:03:42 lr 0.000197 time 2.2906 (2.1994) loss 5.7440 (6.0774) grad_norm 3.3644 (2.3368) [2022-01-17 14:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1160/1251] eta 0:03:20 lr 0.000197 time 1.8505 (2.1985) loss 5.6558 (6.0764) grad_norm 2.1883 (2.3374) [2022-01-17 14:34:25 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1170/1251] eta 0:02:58 lr 0.000198 time 2.2977 (2.1991) loss 5.6215 (6.0758) grad_norm 2.2879 (2.3367) [2022-01-17 14:34:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1180/1251] eta 0:02:36 lr 0.000198 time 3.2546 (2.2004) loss 6.3528 (6.0760) grad_norm 2.0585 (2.3389) [2022-01-17 14:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1190/1251] eta 0:02:14 lr 0.000198 time 1.8794 (2.2013) loss 6.0217 (6.0762) grad_norm 2.5992 (2.3380) [2022-01-17 14:35:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1200/1251] eta 0:01:52 lr 0.000199 time 1.7777 (2.2005) loss 6.1013 (6.0758) grad_norm 2.0397 (2.3404) [2022-01-17 14:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1210/1251] eta 0:01:30 lr 0.000199 time 2.1528 (2.2000) loss 5.8911 (6.0736) grad_norm 2.0372 (2.3406) [2022-01-17 14:36:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1220/1251] eta 0:01:08 lr 0.000200 time 2.1087 (2.1998) loss 6.2322 (6.0734) grad_norm 2.0959 (2.3405) [2022-01-17 14:36:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1230/1251] eta 0:00:46 lr 0.000200 time 1.8939 (2.1982) loss 6.1788 (6.0741) grad_norm 2.3304 (2.3403) [2022-01-17 14:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1240/1251] eta 0:00:24 lr 0.000200 time 1.5737 (2.1966) loss 5.5231 (6.0725) grad_norm 3.2249 (2.3408) [2022-01-17 14:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [3/300][1250/1251] eta 0:00:02 lr 0.000201 time 1.1859 (2.1913) loss 6.3238 (6.0733) grad_norm 2.4387 (2.3411) [2022-01-17 14:37:11 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 3 training takes 0:45:41 [2022-01-17 14:37:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.026 (19.026) Loss 4.4408 (4.4408) Acc@1 18.164 (18.164) Acc@5 38.379 (38.379) [2022-01-17 14:37:49 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.586 (3.436) Loss 4.3675 (4.3836) Acc@1 18.945 (18.475) Acc@5 38.867 (39.249) [2022-01-17 14:38:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.576 (2.634) Loss 4.4629 (4.3735) Acc@1 18.848 (18.662) Acc@5 38.086 (39.551) [2022-01-17 14:38:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.282 (2.212) Loss 4.3712 (4.3682) Acc@1 18.066 (18.662) Acc@5 40.137 (39.664) [2022-01-17 14:38:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.033 (2.172) Loss 4.4245 (4.3687) Acc@1 17.969 (18.614) Acc@5 38.574 (39.622) [2022-01-17 14:38:47 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 18.568 Acc@5 39.650 [2022-01-17 14:38:47 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 18.6% [2022-01-17 14:38:47 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 18.57% [2022-01-17 14:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][0/1251] eta 7:21:32 lr 0.000201 time 21.1774 (21.1774) loss 6.2718 (6.2718) grad_norm 2.6312 (2.6312) [2022-01-17 14:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][10/1251] eta 1:28:17 lr 0.000201 time 2.7629 (4.2687) loss 6.3029 (5.9453) grad_norm 2.2116 (2.2972) [2022-01-17 14:39:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][20/1251] eta 1:06:41 lr 0.000202 time 1.2866 (3.2510) loss 6.2943 (5.9107) grad_norm 1.9509 (2.3350) [2022-01-17 14:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][30/1251] eta 0:59:50 lr 0.000202 time 1.9214 (2.9408) loss 5.4328 (5.8910) grad_norm 2.3625 (2.3022) [2022-01-17 14:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][40/1251] eta 0:57:30 lr 0.000202 time 5.4919 (2.8496) loss 6.0907 (5.9167) grad_norm 2.6740 (2.3353) [2022-01-17 14:41:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[4/300][50/1251] eta 0:54:24 lr 0.000203 time 1.9371 (2.7184) loss 5.9009 (5.9079) grad_norm 2.1685 (2.2976) [2022-01-17 14:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][60/1251] eta 0:51:44 lr 0.000203 time 1.8328 (2.6069) loss 5.8822 (5.9294) grad_norm 2.2910 (2.2920) [2022-01-17 14:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][70/1251] eta 0:49:54 lr 0.000204 time 2.2987 (2.5353) loss 6.0038 (5.9171) grad_norm 1.9322 (2.3592) [2022-01-17 14:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][80/1251] eta 0:49:16 lr 0.000204 time 3.8487 (2.5246) loss 5.3857 (5.9102) grad_norm 2.0881 (2.3784) [2022-01-17 14:42:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][90/1251] eta 0:48:00 lr 0.000204 time 1.9648 (2.4810) loss 5.8251 (5.9025) grad_norm 3.0442 (2.4435) [2022-01-17 14:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][100/1251] eta 0:47:08 lr 0.000205 time 1.6431 (2.4573) loss 6.2658 (5.9167) grad_norm 2.3421 (2.4525) [2022-01-17 14:43:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][110/1251] eta 0:45:54 lr 0.000205 time 1.9094 (2.4140) loss 6.0678 (5.9139) grad_norm 2.0332 (2.4342) [2022-01-17 14:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][120/1251] eta 0:45:46 lr 0.000206 time 3.6921 (2.4285) loss 6.2449 (5.9110) grad_norm 2.1988 (2.4244) [2022-01-17 14:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][130/1251] eta 0:44:49 lr 0.000206 time 1.7676 (2.3994) loss 6.1372 (5.9197) grad_norm 2.7954 (2.4194) [2022-01-17 14:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][140/1251] eta 0:43:53 lr 0.000206 time 1.8899 (2.3708) loss 6.2534 (5.9306) grad_norm 2.9169 (2.4326) [2022-01-17 14:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][150/1251] eta 0:43:05 lr 0.000207 time 2.1115 (2.3480) loss 5.8253 (5.9310) grad_norm 1.9699 (2.4291) [2022-01-17 
14:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][160/1251] eta 0:42:31 lr 0.000207 time 2.9851 (2.3388) loss 5.6339 (5.9387) grad_norm 1.9392 (2.4360) [2022-01-17 14:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][170/1251] eta 0:41:44 lr 0.000208 time 2.2435 (2.3167) loss 6.0328 (5.9364) grad_norm 2.3802 (2.4352) [2022-01-17 14:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][180/1251] eta 0:41:05 lr 0.000208 time 2.1015 (2.3018) loss 6.2266 (5.9414) grad_norm 2.2204 (2.4300) [2022-01-17 14:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][190/1251] eta 0:40:33 lr 0.000208 time 2.2124 (2.2936) loss 5.3194 (5.9320) grad_norm 2.3521 (2.4268) [2022-01-17 14:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][200/1251] eta 0:40:03 lr 0.000209 time 2.1596 (2.2867) loss 6.2802 (5.9370) grad_norm 2.0728 (2.4311) [2022-01-17 14:46:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][210/1251] eta 0:39:37 lr 0.000209 time 2.1387 (2.2840) loss 6.2755 (5.9421) grad_norm 1.9463 (2.4411) [2022-01-17 14:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][220/1251] eta 0:39:13 lr 0.000210 time 2.5221 (2.2825) loss 5.6586 (5.9430) grad_norm 2.0784 (2.4298) [2022-01-17 14:47:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][230/1251] eta 0:38:48 lr 0.000210 time 1.7283 (2.2807) loss 6.1383 (5.9393) grad_norm 3.3325 (2.4283) [2022-01-17 14:47:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][240/1251] eta 0:38:32 lr 0.000210 time 2.2436 (2.2869) loss 6.0617 (5.9472) grad_norm 2.5919 (2.4233) [2022-01-17 14:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][250/1251] eta 0:38:04 lr 0.000211 time 1.5095 (2.2820) loss 6.4178 (5.9444) grad_norm 2.3232 (2.4287) [2022-01-17 14:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][260/1251] eta 0:37:44 lr 0.000211 time 3.4772 
(2.2855) loss 6.3593 (5.9433) grad_norm 2.4988 (2.4246) [2022-01-17 14:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][270/1251] eta 0:37:15 lr 0.000212 time 1.9085 (2.2785) loss 5.9358 (5.9446) grad_norm 2.6374 (2.4316) [2022-01-17 14:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][280/1251] eta 0:36:46 lr 0.000212 time 1.6336 (2.2719) loss 6.3087 (5.9472) grad_norm 2.3847 (2.4384) [2022-01-17 14:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][290/1251] eta 0:36:16 lr 0.000212 time 1.8737 (2.2647) loss 5.1361 (5.9442) grad_norm 2.3757 (2.4444) [2022-01-17 14:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][300/1251] eta 0:35:46 lr 0.000213 time 2.3625 (2.2572) loss 5.7528 (5.9469) grad_norm 2.3074 (2.4403) [2022-01-17 14:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][310/1251] eta 0:35:20 lr 0.000213 time 1.5552 (2.2537) loss 6.0685 (5.9424) grad_norm 2.2685 (2.4382) [2022-01-17 14:50:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][320/1251] eta 0:34:59 lr 0.000214 time 2.2898 (2.2553) loss 6.1636 (5.9416) grad_norm 4.6780 (2.4547) [2022-01-17 14:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][330/1251] eta 0:34:35 lr 0.000214 time 1.9583 (2.2534) loss 5.5669 (5.9385) grad_norm 2.5193 (2.4496) [2022-01-17 14:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][340/1251] eta 0:34:12 lr 0.000214 time 2.7596 (2.2528) loss 6.3442 (5.9406) grad_norm 2.4231 (2.4413) [2022-01-17 14:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][350/1251] eta 0:33:50 lr 0.000215 time 1.9375 (2.2531) loss 6.1406 (5.9414) grad_norm 2.4541 (2.4444) [2022-01-17 14:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][360/1251] eta 0:33:27 lr 0.000215 time 2.4635 (2.2527) loss 5.7336 (5.9377) grad_norm 2.9118 (2.4539) [2022-01-17 14:52:42 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [4/300][370/1251] eta 0:33:02 lr 0.000216 time 2.1318 (2.2500) loss 6.0426 (5.9348) grad_norm 2.5795 (2.4621) [2022-01-17 14:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][380/1251] eta 0:32:38 lr 0.000216 time 2.1548 (2.2487) loss 6.0633 (5.9327) grad_norm 2.6426 (2.4757) [2022-01-17 14:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][390/1251] eta 0:32:17 lr 0.000216 time 2.3929 (2.2502) loss 6.1184 (5.9316) grad_norm 2.4071 (2.4746) [2022-01-17 14:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][400/1251] eta 0:31:53 lr 0.000217 time 2.1886 (2.2487) loss 6.2946 (5.9327) grad_norm 1.9945 (2.4670) [2022-01-17 14:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][410/1251] eta 0:31:29 lr 0.000217 time 2.1242 (2.2467) loss 5.9758 (5.9334) grad_norm 2.1238 (2.4608) [2022-01-17 14:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][420/1251] eta 0:31:04 lr 0.000218 time 1.6745 (2.2442) loss 5.5585 (5.9296) grad_norm 1.9542 (2.4543) [2022-01-17 14:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][430/1251] eta 0:30:39 lr 0.000218 time 2.8505 (2.2406) loss 6.0782 (5.9283) grad_norm 2.1691 (2.4507) [2022-01-17 14:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][440/1251] eta 0:30:12 lr 0.000218 time 1.8568 (2.2352) loss 6.4025 (5.9290) grad_norm 3.0672 (2.4558) [2022-01-17 14:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][450/1251] eta 0:29:45 lr 0.000219 time 1.8565 (2.2296) loss 5.6199 (5.9290) grad_norm 2.7875 (2.4507) [2022-01-17 14:55:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][460/1251] eta 0:29:22 lr 0.000219 time 2.1458 (2.2283) loss 6.1315 (5.9294) grad_norm 2.4290 (2.4478) [2022-01-17 14:56:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][470/1251] eta 0:28:57 lr 0.000220 time 2.3146 (2.2241) loss 6.1744 (5.9313) grad_norm 1.9524 
(2.4456) [2022-01-17 14:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][480/1251] eta 0:28:31 lr 0.000220 time 1.8682 (2.2201) loss 5.1625 (5.9265) grad_norm 2.4111 (2.4440) [2022-01-17 14:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][490/1251] eta 0:28:07 lr 0.000220 time 1.5581 (2.2169) loss 6.1974 (5.9284) grad_norm 2.2664 (2.4470) [2022-01-17 14:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][500/1251] eta 0:27:48 lr 0.000221 time 2.6189 (2.2212) loss 6.3074 (5.9273) grad_norm 2.0738 (2.4429) [2022-01-17 14:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][510/1251] eta 0:27:27 lr 0.000221 time 3.3982 (2.2237) loss 5.8536 (5.9240) grad_norm 2.2570 (2.4396) [2022-01-17 14:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][520/1251] eta 0:27:07 lr 0.000222 time 2.5728 (2.2262) loss 6.3870 (5.9251) grad_norm 2.4193 (2.4471) [2022-01-17 14:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][530/1251] eta 0:26:44 lr 0.000222 time 1.8904 (2.2260) loss 5.2529 (5.9204) grad_norm 2.3380 (2.4471) [2022-01-17 14:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][540/1251] eta 0:26:23 lr 0.000222 time 1.8746 (2.2274) loss 5.1564 (5.9191) grad_norm 1.8018 (2.4475) [2022-01-17 14:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][550/1251] eta 0:26:00 lr 0.000223 time 2.9230 (2.2254) loss 5.9030 (5.9170) grad_norm 2.2370 (2.4432) [2022-01-17 14:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][560/1251] eta 0:25:34 lr 0.000223 time 2.0246 (2.2202) loss 5.4848 (5.9164) grad_norm 2.1840 (2.4414) [2022-01-17 14:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][570/1251] eta 0:25:09 lr 0.000224 time 1.8687 (2.2166) loss 5.3520 (5.9143) grad_norm 3.1015 (2.4501) [2022-01-17 15:00:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][580/1251] eta 0:24:46 lr 
0.000224 time 1.6388 (2.2156) loss 6.0989 (5.9143) grad_norm 2.4149 (2.4498) [2022-01-17 15:00:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][590/1251] eta 0:24:25 lr 0.000224 time 3.0031 (2.2167) loss 6.3372 (5.9158) grad_norm 2.0607 (2.4456) [2022-01-17 15:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][600/1251] eta 0:24:04 lr 0.000225 time 2.5250 (2.2189) loss 5.6192 (5.9129) grad_norm 1.9967 (2.4422) [2022-01-17 15:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][610/1251] eta 0:23:42 lr 0.000225 time 2.0396 (2.2196) loss 6.1173 (5.9129) grad_norm 2.6930 (2.4456) [2022-01-17 15:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][620/1251] eta 0:23:21 lr 0.000226 time 2.2370 (2.2218) loss 6.2961 (5.9126) grad_norm 2.1905 (2.4421) [2022-01-17 15:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][630/1251] eta 0:22:57 lr 0.000226 time 2.0107 (2.2188) loss 6.2564 (5.9151) grad_norm 3.1312 (2.4470) [2022-01-17 15:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][640/1251] eta 0:22:34 lr 0.000226 time 2.4538 (2.2161) loss 5.2417 (5.9135) grad_norm 2.1105 (2.4471) [2022-01-17 15:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][650/1251] eta 0:22:10 lr 0.000227 time 1.5732 (2.2136) loss 5.9703 (5.9123) grad_norm 1.6764 (2.4411) [2022-01-17 15:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][660/1251] eta 0:21:48 lr 0.000227 time 1.9054 (2.2149) loss 5.7126 (5.9103) grad_norm 2.0236 (2.4412) [2022-01-17 15:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][670/1251] eta 0:21:27 lr 0.000228 time 2.1723 (2.2153) loss 6.5269 (5.9091) grad_norm 3.4150 (2.4405) [2022-01-17 15:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][680/1251] eta 0:21:09 lr 0.000228 time 7.2967 (2.2229) loss 5.8336 (5.9101) grad_norm 3.2457 (2.4381) [2022-01-17 15:04:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][690/1251] eta 0:20:46 lr 0.000228 time 1.9197 (2.2215) loss 5.0140 (5.9096) grad_norm 2.0437 (2.4353) [2022-01-17 15:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][700/1251] eta 0:20:23 lr 0.000229 time 1.5853 (2.2200) loss 5.4935 (5.9103) grad_norm 2.1996 (2.4297) [2022-01-17 15:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][710/1251] eta 0:19:59 lr 0.000229 time 1.6723 (2.2168) loss 6.1623 (5.9071) grad_norm 2.3205 (2.4276) [2022-01-17 15:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][720/1251] eta 0:19:36 lr 0.000230 time 2.8901 (2.2152) loss 5.5641 (5.9037) grad_norm 3.3519 (2.4253) [2022-01-17 15:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][730/1251] eta 0:19:13 lr 0.000230 time 2.2934 (2.2138) loss 6.0912 (5.9021) grad_norm 3.1206 (2.4267) [2022-01-17 15:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][740/1251] eta 0:18:50 lr 0.000230 time 1.9284 (2.2132) loss 5.6659 (5.9008) grad_norm 1.9270 (2.4254) [2022-01-17 15:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][750/1251] eta 0:18:28 lr 0.000231 time 2.1903 (2.2131) loss 5.8918 (5.9015) grad_norm 2.4001 (2.4243) [2022-01-17 15:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][760/1251] eta 0:18:06 lr 0.000231 time 2.5852 (2.2122) loss 5.4428 (5.9016) grad_norm 2.4777 (2.4234) [2022-01-17 15:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][770/1251] eta 0:17:43 lr 0.000232 time 1.9418 (2.2105) loss 5.1508 (5.8997) grad_norm 3.4081 (2.4251) [2022-01-17 15:07:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][780/1251] eta 0:17:21 lr 0.000232 time 2.2202 (2.2104) loss 6.0610 (5.8993) grad_norm 2.2918 (2.4251) [2022-01-17 15:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][790/1251] eta 0:16:58 lr 0.000232 time 2.1419 (2.2093) 
loss 6.3730 (5.8972) grad_norm 3.1152 (2.4324) [2022-01-17 15:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][800/1251] eta 0:16:37 lr 0.000233 time 2.5511 (2.2107) loss 6.4012 (5.8963) grad_norm 2.2193 (2.4355) [2022-01-17 15:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][810/1251] eta 0:16:15 lr 0.000233 time 2.1979 (2.2120) loss 5.6630 (5.8954) grad_norm 2.0476 (2.4335) [2022-01-17 15:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][820/1251] eta 0:15:54 lr 0.000234 time 2.6810 (2.2141) loss 5.8260 (5.8966) grad_norm 3.0775 (2.4335) [2022-01-17 15:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][830/1251] eta 0:15:32 lr 0.000234 time 2.4367 (2.2139) loss 6.0729 (5.8967) grad_norm 2.1028 (2.4299) [2022-01-17 15:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][840/1251] eta 0:15:10 lr 0.000234 time 3.7667 (2.2160) loss 6.1733 (5.8963) grad_norm 2.8335 (2.4305) [2022-01-17 15:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][850/1251] eta 0:14:47 lr 0.000235 time 1.7529 (2.2141) loss 6.0015 (5.8945) grad_norm 2.1522 (2.4295) [2022-01-17 15:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][860/1251] eta 0:14:25 lr 0.000235 time 2.5239 (2.2124) loss 6.4188 (5.8967) grad_norm 1.9125 (2.4276) [2022-01-17 15:10:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][870/1251] eta 0:14:02 lr 0.000236 time 3.1455 (2.2105) loss 6.3058 (5.8988) grad_norm 1.9068 (2.4270) [2022-01-17 15:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][880/1251] eta 0:13:39 lr 0.000236 time 1.8398 (2.2076) loss 6.2396 (5.8991) grad_norm 2.5846 (2.4271) [2022-01-17 15:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][890/1251] eta 0:13:16 lr 0.000236 time 2.8510 (2.2071) loss 5.6097 (5.8974) grad_norm 2.2885 (2.4260) [2022-01-17 15:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [4/300][900/1251] eta 0:12:54 lr 0.000237 time 2.5425 (2.2065) loss 6.0994 (5.8946) grad_norm 2.2819 (2.4249) [2022-01-17 15:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][910/1251] eta 0:12:33 lr 0.000237 time 2.2271 (2.2101) loss 5.5429 (5.8945) grad_norm 2.5147 (2.4275) [2022-01-17 15:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][920/1251] eta 0:12:12 lr 0.000238 time 1.8429 (2.2117) loss 6.1743 (5.8949) grad_norm 2.4093 (2.4268) [2022-01-17 15:13:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][930/1251] eta 0:11:50 lr 0.000238 time 2.5766 (2.2129) loss 6.1235 (5.8953) grad_norm 2.2578 (2.4249) [2022-01-17 15:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][940/1251] eta 0:11:28 lr 0.000238 time 1.9879 (2.2130) loss 5.2546 (5.8962) grad_norm 2.9661 (2.4238) [2022-01-17 15:13:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][950/1251] eta 0:11:05 lr 0.000239 time 1.6886 (2.2113) loss 5.6105 (5.8954) grad_norm 2.4898 (2.4248) [2022-01-17 15:14:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][960/1251] eta 0:10:42 lr 0.000239 time 2.0192 (2.2081) loss 6.2279 (5.8937) grad_norm 2.3842 (2.4228) [2022-01-17 15:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][970/1251] eta 0:10:19 lr 0.000240 time 1.9246 (2.2060) loss 5.8607 (5.8936) grad_norm 2.5653 (2.4218) [2022-01-17 15:14:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][980/1251] eta 0:09:57 lr 0.000240 time 2.4899 (2.2050) loss 5.7865 (5.8938) grad_norm 3.3007 (2.4214) [2022-01-17 15:15:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][990/1251] eta 0:09:35 lr 0.000240 time 2.2527 (2.2064) loss 6.0770 (5.8933) grad_norm 3.1139 (2.4250) [2022-01-17 15:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1000/1251] eta 0:09:13 lr 0.000241 time 1.3299 (2.2062) loss 6.1032 (5.8927) grad_norm 2.1883 (2.4228) 
[2022-01-17 15:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1010/1251] eta 0:08:52 lr 0.000241 time 1.4935 (2.2102) loss 5.9594 (5.8931) grad_norm 1.9288 (2.4197) [2022-01-17 15:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1020/1251] eta 0:08:30 lr 0.000242 time 3.9225 (2.2109) loss 5.5846 (5.8911) grad_norm 2.5344 (2.4192) [2022-01-17 15:16:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1030/1251] eta 0:08:08 lr 0.000242 time 1.9275 (2.2094) loss 5.8582 (5.8883) grad_norm 2.7365 (2.4201) [2022-01-17 15:17:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1040/1251] eta 0:07:45 lr 0.000242 time 2.0514 (2.2073) loss 6.3006 (5.8870) grad_norm 2.6867 (2.4210) [2022-01-17 15:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1050/1251] eta 0:07:23 lr 0.000243 time 1.9051 (2.2053) loss 5.9417 (5.8859) grad_norm 2.7512 (2.4231) [2022-01-17 15:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1060/1251] eta 0:07:01 lr 0.000243 time 2.1153 (2.2046) loss 6.2573 (5.8861) grad_norm 2.1002 (2.4231) [2022-01-17 15:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1070/1251] eta 0:06:38 lr 0.000244 time 2.1702 (2.2040) loss 6.3194 (5.8864) grad_norm 2.6939 (2.4238) [2022-01-17 15:18:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1080/1251] eta 0:06:16 lr 0.000244 time 2.1409 (2.2041) loss 5.3330 (5.8857) grad_norm 4.8270 (2.4262) [2022-01-17 15:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1090/1251] eta 0:05:54 lr 0.000244 time 1.5340 (2.2048) loss 5.3721 (5.8851) grad_norm 2.4410 (2.4270) [2022-01-17 15:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1100/1251] eta 0:05:32 lr 0.000245 time 2.5338 (2.2047) loss 5.3314 (5.8843) grad_norm 2.2475 (2.4277) [2022-01-17 15:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1110/1251] eta 0:05:10 
lr 0.000245 time 2.2259 (2.2051) loss 6.1384 (5.8814) grad_norm 2.1897 (2.4259) [2022-01-17 15:20:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1120/1251] eta 0:04:49 lr 0.000246 time 2.7630 (2.2072) loss 5.8807 (5.8815) grad_norm 2.4380 (2.4251) [2022-01-17 15:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1130/1251] eta 0:04:27 lr 0.000246 time 1.9315 (2.2066) loss 5.3343 (5.8806) grad_norm 2.4625 (2.4246) [2022-01-17 15:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1140/1251] eta 0:04:04 lr 0.000246 time 1.7125 (2.2059) loss 5.5830 (5.8778) grad_norm 2.2427 (2.4248) [2022-01-17 15:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1150/1251] eta 0:03:42 lr 0.000247 time 1.8880 (2.2037) loss 6.1403 (5.8773) grad_norm 2.0864 (2.4256) [2022-01-17 15:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1160/1251] eta 0:03:20 lr 0.000247 time 1.9778 (2.2026) loss 6.0812 (5.8762) grad_norm 1.9479 (2.4244) [2022-01-17 15:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1170/1251] eta 0:02:58 lr 0.000248 time 1.9441 (2.2016) loss 6.1772 (5.8765) grad_norm 1.9586 (2.4244) [2022-01-17 15:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1180/1251] eta 0:02:36 lr 0.000248 time 2.2664 (2.2025) loss 5.9483 (5.8754) grad_norm 2.1045 (2.4267) [2022-01-17 15:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1190/1251] eta 0:02:14 lr 0.000248 time 2.0938 (2.2030) loss 5.4548 (5.8752) grad_norm 2.1691 (2.4282) [2022-01-17 15:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1200/1251] eta 0:01:52 lr 0.000249 time 2.1881 (2.2027) loss 5.2244 (5.8752) grad_norm 2.0910 (2.4287) [2022-01-17 15:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1210/1251] eta 0:01:30 lr 0.000249 time 2.2517 (2.2021) loss 5.2408 (5.8718) grad_norm 2.3059 (2.4275) [2022-01-17 15:23:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1220/1251] eta 0:01:08 lr 0.000250 time 2.2911 (2.2027) loss 5.6506 (5.8717) grad_norm 2.2740 (2.4249) [2022-01-17 15:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1230/1251] eta 0:00:46 lr 0.000250 time 2.4120 (2.2030) loss 5.9088 (5.8700) grad_norm 2.4765 (2.4248) [2022-01-17 15:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1240/1251] eta 0:00:24 lr 0.000250 time 1.4428 (2.2011) loss 5.3789 (5.8701) grad_norm 3.2412 (2.4271) [2022-01-17 15:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [4/300][1250/1251] eta 0:00:02 lr 0.000251 time 1.1939 (2.1959) loss 5.2699 (5.8709) grad_norm 1.9052 (2.4255) [2022-01-17 15:24:35 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 4 training takes 0:45:47 [2022-01-17 15:24:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 21.206 (21.206) Loss 3.9304 (3.9304) Acc@1 24.121 (24.121) Acc@5 48.438 (48.438) [2022-01-17 15:25:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.993 (3.467) Loss 3.9135 (3.9179) Acc@1 23.145 (24.325) Acc@5 48.438 (47.567) [2022-01-17 15:25:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.971 (2.746) Loss 3.8460 (3.8950) Acc@1 26.172 (24.558) Acc@5 49.512 (47.838) [2022-01-17 15:25:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.983 (2.390) Loss 3.9851 (3.9065) Acc@1 22.656 (24.181) Acc@5 45.215 (47.628) [2022-01-17 15:26:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.732 (2.215) Loss 3.9246 (3.9034) Acc@1 20.996 (24.104) Acc@5 45.117 (47.528) [2022-01-17 15:26:13 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 24.266 Acc@5 47.714 [2022-01-17 15:26:13 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 24.3% [2022-01-17 15:26:13 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 
24.27% [2022-01-17 15:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][0/1251] eta 6:45:29 lr 0.000251 time 19.4479 (19.4479) loss 6.0224 (6.0224) grad_norm 3.0411 (3.0411) [2022-01-17 15:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][10/1251] eta 1:19:26 lr 0.000251 time 1.8640 (3.8408) loss 6.0554 (5.9232) grad_norm 2.2483 (2.4838) [2022-01-17 15:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][20/1251] eta 1:05:12 lr 0.000252 time 2.3759 (3.1779) loss 5.9189 (5.8541) grad_norm 2.8394 (2.4562) [2022-01-17 15:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][30/1251] eta 0:58:00 lr 0.000252 time 1.8528 (2.8506) loss 5.2897 (5.8469) grad_norm 1.9830 (2.5154) [2022-01-17 15:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][40/1251] eta 0:54:59 lr 0.000252 time 3.4702 (2.7249) loss 5.9929 (5.9120) grad_norm 2.1941 (2.4499) [2022-01-17 15:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][50/1251] eta 0:53:06 lr 0.000253 time 2.1640 (2.6532) loss 6.0826 (5.9189) grad_norm 2.5010 (2.4096) [2022-01-17 15:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][60/1251] eta 0:51:18 lr 0.000253 time 1.7446 (2.5850) loss 6.0264 (5.9088) grad_norm 2.4593 (2.4747) [2022-01-17 15:29:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][70/1251] eta 0:49:28 lr 0.000254 time 1.6932 (2.5137) loss 5.2890 (5.9082) grad_norm 3.2781 (2.4413) [2022-01-17 15:29:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][80/1251] eta 0:48:17 lr 0.000254 time 2.7904 (2.4745) loss 6.0208 (5.8909) grad_norm 2.1545 (2.4736) [2022-01-17 15:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][90/1251] eta 0:47:24 lr 0.000254 time 1.9036 (2.4499) loss 6.0683 (5.8807) grad_norm 2.0695 (2.4577) [2022-01-17 15:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][100/1251] eta 0:46:07 lr 0.000255 
time 1.8553 (2.4044) loss 5.5676 (5.8601) grad_norm 2.6133 (2.4390) [2022-01-17 15:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][110/1251] eta 0:45:06 lr 0.000255 time 1.5265 (2.3724) loss 6.3720 (5.8663) grad_norm 1.7386 (2.4027) [2022-01-17 15:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][120/1251] eta 0:44:18 lr 0.000256 time 2.1377 (2.3507) loss 6.0705 (5.8712) grad_norm 3.0779 (2.4107) [2022-01-17 15:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][130/1251] eta 0:43:44 lr 0.000256 time 2.1727 (2.3412) loss 6.1521 (5.8630) grad_norm 2.6765 (2.4182) [2022-01-17 15:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][140/1251] eta 0:43:06 lr 0.000256 time 2.2292 (2.3284) loss 5.5152 (5.8570) grad_norm 2.2857 (2.4099) [2022-01-17 15:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][150/1251] eta 0:42:39 lr 0.000257 time 1.5238 (2.3245) loss 6.2391 (5.8526) grad_norm 2.9551 (2.4173) [2022-01-17 15:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][160/1251] eta 0:42:08 lr 0.000257 time 2.1291 (2.3176) loss 5.9970 (5.8484) grad_norm 2.1334 (2.4154) [2022-01-17 15:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][170/1251] eta 0:41:45 lr 0.000258 time 2.2014 (2.3180) loss 5.6915 (5.8420) grad_norm 2.7365 (2.4045) [2022-01-17 15:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][180/1251] eta 0:41:15 lr 0.000258 time 2.1671 (2.3117) loss 6.1744 (5.8257) grad_norm 2.5360 (2.4087) [2022-01-17 15:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][190/1251] eta 0:40:49 lr 0.000258 time 1.5817 (2.3087) loss 5.1984 (5.8189) grad_norm 2.2544 (2.4008) [2022-01-17 15:33:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][200/1251] eta 0:40:18 lr 0.000259 time 1.9447 (2.3010) loss 5.1591 (5.8142) grad_norm 2.8550 (2.4174) [2022-01-17 15:34:16 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [5/300][210/1251] eta 0:39:47 lr 0.000259 time 1.8658 (2.2932) loss 5.0327 (5.8048) grad_norm 2.4178 (2.4151) [2022-01-17 15:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][220/1251] eta 0:39:10 lr 0.000260 time 1.7428 (2.2803) loss 5.7383 (5.7927) grad_norm 2.7204 (2.4185) [2022-01-17 15:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][230/1251] eta 0:38:47 lr 0.000260 time 1.9148 (2.2796) loss 6.1559 (5.7880) grad_norm 2.9054 (2.4181) [2022-01-17 15:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][240/1251] eta 0:38:25 lr 0.000260 time 1.5690 (2.2801) loss 5.8985 (5.7875) grad_norm 2.0965 (2.4165) [2022-01-17 15:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][250/1251] eta 0:37:57 lr 0.000261 time 1.9055 (2.2753) loss 5.9877 (5.7800) grad_norm 1.9867 (2.4122) [2022-01-17 15:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][260/1251] eta 0:37:30 lr 0.000261 time 1.9748 (2.2713) loss 5.9258 (5.7820) grad_norm 1.9658 (2.3989) [2022-01-17 15:36:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][270/1251] eta 0:37:01 lr 0.000262 time 1.5707 (2.2650) loss 6.0891 (5.7777) grad_norm 2.5004 (2.4028) [2022-01-17 15:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][280/1251] eta 0:36:37 lr 0.000262 time 1.8449 (2.2636) loss 5.9378 (5.7770) grad_norm 1.7696 (2.3944) [2022-01-17 15:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][290/1251] eta 0:36:12 lr 0.000262 time 1.7473 (2.2610) loss 6.0561 (5.7807) grad_norm 2.0492 (2.3980) [2022-01-17 15:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][300/1251] eta 0:35:52 lr 0.000263 time 1.8908 (2.2636) loss 5.5026 (5.7789) grad_norm 1.8008 (2.3978) [2022-01-17 15:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][310/1251] eta 0:35:26 lr 0.000263 time 1.5581 (2.2598) loss 5.6397 (5.7764) 
grad_norm 2.4720 (2.3899) [2022-01-17 15:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][320/1251] eta 0:34:58 lr 0.000264 time 1.5714 (2.2541) loss 5.0320 (5.7764) grad_norm 2.7693 (2.3907) [2022-01-17 15:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][330/1251] eta 0:34:30 lr 0.000264 time 1.8878 (2.2481) loss 5.8213 (5.7763) grad_norm 2.0742 (2.3878) [2022-01-17 15:38:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][340/1251] eta 0:34:05 lr 0.000264 time 1.9524 (2.2453) loss 6.1008 (5.7766) grad_norm 2.1977 (2.3818) [2022-01-17 15:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][350/1251] eta 0:33:39 lr 0.000265 time 1.9584 (2.2411) loss 4.8694 (5.7680) grad_norm 3.0418 (2.3874) [2022-01-17 15:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][360/1251] eta 0:33:17 lr 0.000265 time 1.8693 (2.2417) loss 5.9239 (5.7668) grad_norm 2.3950 (2.3972) [2022-01-17 15:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][370/1251] eta 0:32:57 lr 0.000266 time 2.1389 (2.2447) loss 5.8426 (5.7726) grad_norm 2.0381 (2.3919) [2022-01-17 15:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][380/1251] eta 0:32:40 lr 0.000266 time 2.6746 (2.2508) loss 5.9706 (5.7693) grad_norm 1.8355 (2.3904) [2022-01-17 15:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][390/1251] eta 0:32:14 lr 0.000266 time 1.9014 (2.2467) loss 6.0116 (5.7687) grad_norm 2.1981 (2.3915) [2022-01-17 15:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][400/1251] eta 0:31:46 lr 0.000267 time 1.5595 (2.2407) loss 5.7714 (5.7637) grad_norm 1.8630 (2.3950) [2022-01-17 15:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][410/1251] eta 0:31:18 lr 0.000267 time 1.9280 (2.2333) loss 6.0095 (5.7635) grad_norm 2.2121 (2.3871) [2022-01-17 15:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[5/300][420/1251] eta 0:30:53 lr 0.000268 time 2.2221 (2.2307) loss 5.8148 (5.7593) grad_norm 2.0596 (2.3867) [2022-01-17 15:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][430/1251] eta 0:30:32 lr 0.000268 time 2.5326 (2.2325) loss 5.5185 (5.7534) grad_norm 1.9204 (2.3833) [2022-01-17 15:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][440/1251] eta 0:30:11 lr 0.000268 time 2.4971 (2.2341) loss 5.8988 (5.7503) grad_norm 2.0517 (2.3817) [2022-01-17 15:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][450/1251] eta 0:29:49 lr 0.000269 time 1.8632 (2.2336) loss 6.1774 (5.7543) grad_norm 2.3768 (2.3775) [2022-01-17 15:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][460/1251] eta 0:29:29 lr 0.000269 time 1.6673 (2.2367) loss 6.2480 (5.7553) grad_norm 3.0520 (2.3734) [2022-01-17 15:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][470/1251] eta 0:29:05 lr 0.000270 time 1.7472 (2.2353) loss 5.0118 (5.7572) grad_norm 2.0004 (2.3763) [2022-01-17 15:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][480/1251] eta 0:28:41 lr 0.000270 time 1.8540 (2.2328) loss 5.1098 (5.7553) grad_norm 1.8082 (2.3733) [2022-01-17 15:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][490/1251] eta 0:28:15 lr 0.000270 time 1.9383 (2.2277) loss 5.2094 (5.7507) grad_norm 2.3289 (2.3717) [2022-01-17 15:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][500/1251] eta 0:27:51 lr 0.000271 time 2.0971 (2.2258) loss 6.1950 (5.7499) grad_norm 2.0303 (2.3669) [2022-01-17 15:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][510/1251] eta 0:27:28 lr 0.000271 time 1.7754 (2.2245) loss 5.5067 (5.7438) grad_norm 2.3337 (2.3638) [2022-01-17 15:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][520/1251] eta 0:27:10 lr 0.000272 time 2.6817 (2.2310) loss 6.0399 (5.7448) grad_norm 2.2912 (2.3654) 
[2022-01-17 15:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][530/1251] eta 0:26:48 lr 0.000272 time 1.9308 (2.2303) loss 5.5314 (5.7458) grad_norm 2.2417 (2.3623) [2022-01-17 15:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][540/1251] eta 0:26:25 lr 0.000272 time 2.2142 (2.2295) loss 5.9567 (5.7492) grad_norm 2.2516 (2.3667) [2022-01-17 15:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][550/1251] eta 0:26:00 lr 0.000273 time 1.6941 (2.2254) loss 4.9269 (5.7447) grad_norm 2.7392 (2.3705) [2022-01-17 15:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][560/1251] eta 0:25:38 lr 0.000273 time 2.7850 (2.2266) loss 5.5517 (5.7453) grad_norm 2.6486 (2.3806) [2022-01-17 15:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][570/1251] eta 0:25:15 lr 0.000274 time 1.5864 (2.2248) loss 5.7880 (5.7452) grad_norm 1.8271 (2.3804) [2022-01-17 15:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][580/1251] eta 0:24:52 lr 0.000274 time 1.8930 (2.2241) loss 4.8881 (5.7435) grad_norm 2.5434 (2.3762) [2022-01-17 15:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][590/1251] eta 0:24:30 lr 0.000274 time 1.8140 (2.2242) loss 6.0311 (5.7448) grad_norm 1.9593 (2.3707) [2022-01-17 15:48:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][600/1251] eta 0:24:10 lr 0.000275 time 2.8235 (2.2280) loss 5.6644 (5.7388) grad_norm 2.4366 (2.3706) [2022-01-17 15:48:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][610/1251] eta 0:23:46 lr 0.000275 time 1.8642 (2.2248) loss 5.4510 (5.7369) grad_norm 2.6114 (2.3811) [2022-01-17 15:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][620/1251] eta 0:23:21 lr 0.000276 time 1.8123 (2.2210) loss 5.2509 (5.7378) grad_norm 2.5723 (2.3777) [2022-01-17 15:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][630/1251] eta 0:22:58 lr 0.000276 
time 2.0698 (2.2201) loss 6.2147 (5.7360) grad_norm 1.7257 (2.3739) [2022-01-17 15:49:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][640/1251] eta 0:22:36 lr 0.000276 time 3.1570 (2.2205) loss 6.0428 (5.7351) grad_norm 2.1645 (2.3710) [2022-01-17 15:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][650/1251] eta 0:22:12 lr 0.000277 time 1.5646 (2.2178) loss 4.9065 (5.7316) grad_norm 2.2124 (2.3704) [2022-01-17 15:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][660/1251] eta 0:21:50 lr 0.000277 time 1.9221 (2.2175) loss 5.3014 (5.7312) grad_norm 1.8974 (2.3649) [2022-01-17 15:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][670/1251] eta 0:21:27 lr 0.000278 time 1.5428 (2.2160) loss 5.1792 (5.7284) grad_norm 2.0701 (2.3634) [2022-01-17 15:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][680/1251] eta 0:21:05 lr 0.000278 time 2.0893 (2.2159) loss 6.1828 (5.7299) grad_norm 2.5025 (2.3629) [2022-01-17 15:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][690/1251] eta 0:20:42 lr 0.000278 time 2.5343 (2.2140) loss 5.9909 (5.7273) grad_norm 3.3758 (2.3642) [2022-01-17 15:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][700/1251] eta 0:20:20 lr 0.000279 time 2.8731 (2.2149) loss 5.9146 (5.7243) grad_norm 2.0846 (2.3604) [2022-01-17 15:52:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][710/1251] eta 0:19:57 lr 0.000279 time 2.2004 (2.2135) loss 5.4383 (5.7256) grad_norm 2.4067 (2.3596) [2022-01-17 15:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][720/1251] eta 0:19:36 lr 0.000279 time 1.9725 (2.2149) loss 5.9692 (5.7236) grad_norm 2.5024 (2.3623) [2022-01-17 15:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][730/1251] eta 0:19:13 lr 0.000280 time 2.0194 (2.2146) loss 5.3596 (5.7233) grad_norm 2.1903 (2.3621) [2022-01-17 15:53:35 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [5/300][740/1251] eta 0:18:52 lr 0.000280 time 3.0708 (2.2160) loss 5.2196 (5.7217) grad_norm 2.8078 (2.3668) [2022-01-17 15:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][750/1251] eta 0:18:29 lr 0.000281 time 1.5723 (2.2141) loss 5.3194 (5.7163) grad_norm 1.9526 (2.3722) [2022-01-17 15:54:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][760/1251] eta 0:18:07 lr 0.000281 time 1.9787 (2.2140) loss 6.0995 (5.7174) grad_norm 2.8381 (2.3717) [2022-01-17 15:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][770/1251] eta 0:17:43 lr 0.000281 time 1.5758 (2.2120) loss 6.0405 (5.7166) grad_norm 1.9460 (2.3720) [2022-01-17 15:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][780/1251] eta 0:17:22 lr 0.000282 time 3.1220 (2.2136) loss 6.1353 (5.7162) grad_norm 2.3367 (2.3712) [2022-01-17 15:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][790/1251] eta 0:16:59 lr 0.000282 time 1.9054 (2.2125) loss 5.9794 (5.7139) grad_norm 2.2983 (2.3737) [2022-01-17 15:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][800/1251] eta 0:16:37 lr 0.000283 time 1.9524 (2.2107) loss 5.7590 (5.7121) grad_norm 1.9144 (2.3746) [2022-01-17 15:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][810/1251] eta 0:16:14 lr 0.000283 time 2.5951 (2.2088) loss 5.5897 (5.7130) grad_norm 2.1308 (2.3718) [2022-01-17 15:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][820/1251] eta 0:15:51 lr 0.000283 time 2.2165 (2.2070) loss 4.9747 (5.7107) grad_norm 2.0915 (2.3713) [2022-01-17 15:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][830/1251] eta 0:15:28 lr 0.000284 time 1.8477 (2.2064) loss 5.3229 (5.7080) grad_norm 2.0281 (2.3706) [2022-01-17 15:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][840/1251] eta 0:15:07 lr 0.000284 time 3.3687 (2.2087) loss 5.3743 (5.7087) 
grad_norm 2.2207 (2.3681) [2022-01-17 15:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][850/1251] eta 0:14:46 lr 0.000285 time 3.1788 (2.2107) loss 5.3598 (5.7082) grad_norm 2.2739 (2.3662) [2022-01-17 15:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][860/1251] eta 0:14:24 lr 0.000285 time 2.1021 (2.2118) loss 5.6329 (5.7069) grad_norm 2.0069 (2.3630) [2022-01-17 15:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][870/1251] eta 0:14:02 lr 0.000285 time 1.7527 (2.2117) loss 6.0318 (5.7064) grad_norm 1.9401 (2.3613) [2022-01-17 15:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][880/1251] eta 0:13:40 lr 0.000286 time 1.8873 (2.2107) loss 5.8030 (5.7061) grad_norm 3.6187 (2.3685) [2022-01-17 15:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][890/1251] eta 0:13:17 lr 0.000286 time 1.8911 (2.2084) loss 6.1093 (5.7062) grad_norm 1.8183 (2.3669) [2022-01-17 15:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][900/1251] eta 0:12:54 lr 0.000287 time 2.5963 (2.2077) loss 6.3258 (5.7097) grad_norm 2.4584 (2.3654) [2022-01-17 15:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][910/1251] eta 0:12:32 lr 0.000287 time 1.9705 (2.2074) loss 5.3098 (5.7110) grad_norm 1.8600 (2.3623) [2022-01-17 16:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][920/1251] eta 0:12:10 lr 0.000287 time 1.8833 (2.2068) loss 5.8743 (5.7117) grad_norm 2.4657 (2.3621) [2022-01-17 16:00:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][930/1251] eta 0:11:48 lr 0.000288 time 1.9065 (2.2083) loss 5.4290 (5.7082) grad_norm 2.1204 (2.3623) [2022-01-17 16:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][940/1251] eta 0:11:27 lr 0.000288 time 2.5113 (2.2101) loss 6.1273 (5.7071) grad_norm 6.7107 (2.3695) [2022-01-17 16:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[5/300][950/1251] eta 0:11:04 lr 0.000289 time 1.9093 (2.2090) loss 5.1146 (5.7066) grad_norm 1.7706 (2.3721) [2022-01-17 16:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][960/1251] eta 0:10:41 lr 0.000289 time 1.7968 (2.2060) loss 6.0876 (5.7054) grad_norm 2.0927 (2.3718) [2022-01-17 16:01:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][970/1251] eta 0:10:19 lr 0.000289 time 1.9206 (2.2048) loss 5.7150 (5.7062) grad_norm 2.3896 (2.3697) [2022-01-17 16:02:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][980/1251] eta 0:09:57 lr 0.000290 time 2.1791 (2.2037) loss 5.4255 (5.7055) grad_norm 2.3115 (2.3677) [2022-01-17 16:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][990/1251] eta 0:09:35 lr 0.000290 time 2.4919 (2.2046) loss 5.2484 (5.7057) grad_norm 3.4442 (2.3689) [2022-01-17 16:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1000/1251] eta 0:09:13 lr 0.000291 time 1.7719 (2.2047) loss 5.7221 (5.7039) grad_norm 1.8781 (2.3693) [2022-01-17 16:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1010/1251] eta 0:08:51 lr 0.000291 time 1.8682 (2.2056) loss 5.9105 (5.7053) grad_norm 1.7813 (2.3716) [2022-01-17 16:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1020/1251] eta 0:08:29 lr 0.000291 time 2.6176 (2.2056) loss 4.8789 (5.7053) grad_norm 2.2658 (2.3740) [2022-01-17 16:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1030/1251] eta 0:08:07 lr 0.000292 time 2.8278 (2.2054) loss 5.8056 (5.7029) grad_norm 2.6151 (2.3746) [2022-01-17 16:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1040/1251] eta 0:07:44 lr 0.000292 time 1.6693 (2.2032) loss 5.8491 (5.7029) grad_norm 4.4444 (2.3797) [2022-01-17 16:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1050/1251] eta 0:07:22 lr 0.000293 time 1.8803 (2.2026) loss 5.2366 (5.7029) grad_norm 2.3132 (2.3777) 
[2022-01-17 16:05:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1060/1251] eta 0:07:00 lr 0.000293 time 1.6776 (2.2014) loss 5.3220 (5.7029) grad_norm 4.5584 (2.3779) [2022-01-17 16:05:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1070/1251] eta 0:06:38 lr 0.000293 time 1.8525 (2.2009) loss 5.0732 (5.7020) grad_norm 2.4308 (2.3792) [2022-01-17 16:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1080/1251] eta 0:06:16 lr 0.000294 time 2.5331 (2.1999) loss 5.7745 (5.7008) grad_norm 1.9473 (2.3760) [2022-01-17 16:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1090/1251] eta 0:05:54 lr 0.000294 time 1.8966 (2.1992) loss 6.4988 (5.7023) grad_norm 2.4525 (2.3737) [2022-01-17 16:06:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1100/1251] eta 0:05:32 lr 0.000295 time 1.7255 (2.1993) loss 5.9702 (5.7019) grad_norm 2.2560 (2.3711) [2022-01-17 16:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1110/1251] eta 0:05:10 lr 0.000295 time 2.7734 (2.1993) loss 5.6251 (5.7014) grad_norm 1.8743 (2.3691) [2022-01-17 16:07:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1120/1251] eta 0:04:48 lr 0.000295 time 2.7754 (2.1992) loss 6.1316 (5.7002) grad_norm 2.3936 (2.3674) [2022-01-17 16:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1130/1251] eta 0:04:26 lr 0.000296 time 1.6826 (2.1995) loss 5.6814 (5.6995) grad_norm 1.8506 (2.3659) [2022-01-17 16:08:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1140/1251] eta 0:04:04 lr 0.000296 time 1.7609 (2.2001) loss 5.0043 (5.6991) grad_norm 2.1873 (2.3657) [2022-01-17 16:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1150/1251] eta 0:03:42 lr 0.000297 time 2.1154 (2.2006) loss 5.5886 (5.6963) grad_norm 2.5315 (2.3651) [2022-01-17 16:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1160/1251] eta 0:03:20 
lr 0.000297 time 2.6255 (2.2016) loss 5.9483 (5.6962) grad_norm 1.9622 (2.3651) [2022-01-17 16:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1170/1251] eta 0:02:58 lr 0.000297 time 1.7864 (2.2019) loss 5.7192 (5.6967) grad_norm 1.7860 (2.3639) [2022-01-17 16:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1180/1251] eta 0:02:36 lr 0.000298 time 1.9989 (2.2022) loss 5.0546 (5.6951) grad_norm 2.7388 (2.3630) [2022-01-17 16:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1190/1251] eta 0:02:14 lr 0.000298 time 2.0429 (2.2022) loss 5.5774 (5.6925) grad_norm 2.2932 (2.3607) [2022-01-17 16:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1200/1251] eta 0:01:52 lr 0.000299 time 2.2350 (2.2004) loss 5.8520 (5.6919) grad_norm 2.6439 (2.3605) [2022-01-17 16:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1210/1251] eta 0:01:30 lr 0.000299 time 2.2368 (2.1990) loss 5.4971 (5.6914) grad_norm 2.2443 (2.3598) [2022-01-17 16:10:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1220/1251] eta 0:01:08 lr 0.000299 time 1.8885 (2.1980) loss 5.1881 (5.6912) grad_norm 2.7189 (2.3608) [2022-01-17 16:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1230/1251] eta 0:00:46 lr 0.000300 time 1.5721 (2.1976) loss 5.3203 (5.6906) grad_norm 2.1940 (2.3641) [2022-01-17 16:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1240/1251] eta 0:00:24 lr 0.000300 time 1.2788 (2.1968) loss 4.6206 (5.6892) grad_norm 1.8406 (2.3618) [2022-01-17 16:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [5/300][1250/1251] eta 0:00:02 lr 0.000301 time 1.1898 (2.1914) loss 5.7076 (5.6891) grad_norm 1.7871 (2.3591) [2022-01-17 16:11:54 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 5 training takes 0:45:41 [2022-01-17 16:12:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.980 (16.980) Loss 3.5713 
(3.5713) Acc@1 27.637 (27.637) Acc@5 54.297 (54.297) [2022-01-17 16:12:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.683 (3.358) Loss 3.6069 (3.5118) Acc@1 27.637 (29.395) Acc@5 52.637 (54.714) [2022-01-17 16:12:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.623 (2.754) Loss 3.5022 (3.5147) Acc@1 30.566 (29.311) Acc@5 54.395 (54.511) [2022-01-17 16:13:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.932 (2.313) Loss 3.4529 (3.5254) Acc@1 30.957 (29.338) Acc@5 54.980 (54.256) [2022-01-17 16:13:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.976 (2.134) Loss 3.3649 (3.5216) Acc@1 30.957 (29.421) Acc@5 57.031 (54.294) [2022-01-17 16:13:31 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 29.462 Acc@5 54.256 [2022-01-17 16:13:31 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 29.5% [2022-01-17 16:13:31 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 29.46% [2022-01-17 16:13:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][0/1251] eta 7:12:24 lr 0.000301 time 20.7389 (20.7389) loss 4.8401 (4.8401) grad_norm 1.9478 (1.9478) [2022-01-17 16:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][10/1251] eta 1:24:01 lr 0.000301 time 2.0276 (4.0626) loss 5.4890 (5.4848) grad_norm 1.9951 (1.9868) [2022-01-17 16:14:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][20/1251] eta 1:05:09 lr 0.000301 time 2.2447 (3.1762) loss 5.3681 (5.4613) grad_norm 2.0714 (2.0977) [2022-01-17 16:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][30/1251] eta 0:57:10 lr 0.000302 time 1.5881 (2.8098) loss 5.5823 (5.4788) grad_norm 2.2066 (2.3284) [2022-01-17 16:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][40/1251] eta 0:53:48 lr 0.000302 time 3.2591 (2.6663) loss 5.6912 (5.5324) grad_norm 2.6967 (2.3506) [2022-01-17 
16:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][50/1251] eta 0:53:04 lr 0.000303 time 2.4018 (2.6515) loss 4.8912 (5.5166) grad_norm 1.9645 (2.3385) [2022-01-17 16:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][60/1251] eta 0:51:06 lr 0.000303 time 1.7745 (2.5750) loss 5.6205 (5.4876) grad_norm 1.6853 (2.3067) [2022-01-17 16:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][70/1251] eta 0:49:13 lr 0.000303 time 1.9153 (2.5005) loss 5.5042 (5.5209) grad_norm 2.1946 (2.3361) [2022-01-17 16:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][80/1251] eta 0:48:19 lr 0.000304 time 2.2310 (2.4761) loss 5.8043 (5.5481) grad_norm 1.9639 (2.3269) [2022-01-17 16:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][90/1251] eta 0:47:31 lr 0.000304 time 2.7826 (2.4564) loss 5.9561 (5.5436) grad_norm 1.9492 (2.3310) [2022-01-17 16:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][100/1251] eta 0:46:42 lr 0.000305 time 1.4906 (2.4346) loss 6.0848 (5.5573) grad_norm 1.9930 (2.3056) [2022-01-17 16:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][110/1251] eta 0:45:36 lr 0.000305 time 1.6241 (2.3985) loss 6.1303 (5.5439) grad_norm 1.6738 (2.3094) [2022-01-17 16:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][120/1251] eta 0:45:17 lr 0.000305 time 3.4490 (2.4026) loss 5.9923 (5.5510) grad_norm 2.1739 (2.3001) [2022-01-17 16:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][130/1251] eta 0:44:37 lr 0.000306 time 1.7649 (2.3888) loss 5.2388 (5.5615) grad_norm 3.5102 (2.3060) [2022-01-17 16:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][140/1251] eta 0:43:58 lr 0.000306 time 1.9115 (2.3746) loss 6.0087 (5.5612) grad_norm 1.9206 (2.2893) [2022-01-17 16:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][150/1251] eta 0:42:56 lr 0.000307 time 1.7220 
(2.3403) loss 5.7084 (5.5694) grad_norm 2.6410 (2.2904) [2022-01-17 16:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][160/1251] eta 0:42:08 lr 0.000307 time 1.8837 (2.3176) loss 5.5754 (5.5840) grad_norm 2.0437 (2.2837) [2022-01-17 16:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][170/1251] eta 0:41:35 lr 0.000307 time 1.9241 (2.3082) loss 5.4819 (5.5825) grad_norm 2.1559 (2.2674) [2022-01-17 16:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][180/1251] eta 0:41:10 lr 0.000308 time 2.9732 (2.3071) loss 5.2171 (5.5788) grad_norm 3.0965 (2.2764) [2022-01-17 16:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][190/1251] eta 0:40:41 lr 0.000308 time 1.7154 (2.3011) loss 5.3924 (5.5833) grad_norm 2.0239 (2.2650) [2022-01-17 16:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][200/1251] eta 0:40:18 lr 0.000309 time 2.7298 (2.3014) loss 5.5659 (5.5883) grad_norm 1.8069 (2.2547) [2022-01-17 16:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][210/1251] eta 0:39:57 lr 0.000309 time 1.6082 (2.3027) loss 6.1216 (5.5847) grad_norm 2.8408 (2.2588) [2022-01-17 16:21:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][220/1251] eta 0:39:23 lr 0.000309 time 1.5653 (2.2921) loss 5.7034 (5.5650) grad_norm 2.5757 (2.2613) [2022-01-17 16:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][230/1251] eta 0:38:48 lr 0.000310 time 1.7433 (2.2808) loss 5.9792 (5.5670) grad_norm 3.9804 (2.2762) [2022-01-17 16:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][240/1251] eta 0:38:14 lr 0.000310 time 1.8582 (2.2691) loss 6.1605 (5.5658) grad_norm 2.4373 (2.2894) [2022-01-17 16:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][250/1251] eta 0:37:46 lr 0.000311 time 1.8656 (2.2646) loss 5.6367 (5.5625) grad_norm 2.4801 (2.2846) [2022-01-17 16:23:22 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [6/300][260/1251] eta 0:37:22 lr 0.000311 time 2.2962 (2.2629) loss 5.9826 (5.5624) grad_norm 2.3115 (2.2816) [2022-01-17 16:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][270/1251] eta 0:37:01 lr 0.000311 time 2.1584 (2.2647) loss 6.0896 (5.5623) grad_norm 1.9047 (2.2954) [2022-01-17 16:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][280/1251] eta 0:36:32 lr 0.000312 time 1.5743 (2.2583) loss 6.0004 (5.5569) grad_norm 1.9236 (2.2942) [2022-01-17 16:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][290/1251] eta 0:36:11 lr 0.000312 time 2.4880 (2.2601) loss 5.7982 (5.5592) grad_norm 2.3109 (2.2971) [2022-01-17 16:24:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][300/1251] eta 0:35:42 lr 0.000313 time 1.5784 (2.2528) loss 5.5137 (5.5552) grad_norm 1.9628 (2.2905) [2022-01-17 16:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][310/1251] eta 0:35:19 lr 0.000313 time 2.2019 (2.2522) loss 5.8576 (5.5573) grad_norm 2.1125 (2.2852) [2022-01-17 16:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][320/1251] eta 0:34:53 lr 0.000313 time 1.6982 (2.2488) loss 5.3123 (5.5535) grad_norm 2.8586 (2.2905) [2022-01-17 16:25:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][330/1251] eta 0:34:30 lr 0.000314 time 2.9036 (2.2481) loss 6.1558 (5.5585) grad_norm 2.9665 (2.2908) [2022-01-17 16:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][340/1251] eta 0:34:01 lr 0.000314 time 1.9097 (2.2415) loss 5.7998 (5.5562) grad_norm 2.0023 (2.2881) [2022-01-17 16:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][350/1251] eta 0:33:41 lr 0.000315 time 3.1477 (2.2433) loss 5.8371 (5.5530) grad_norm 1.7544 (2.2816) [2022-01-17 16:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][360/1251] eta 0:33:19 lr 0.000315 time 1.9279 (2.2441) loss 5.8916 (5.5585) grad_norm 1.8633 
(2.2749) [2022-01-17 16:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][370/1251] eta 0:33:02 lr 0.000315 time 2.2874 (2.2504) loss 5.6372 (5.5613) grad_norm 2.1860 (2.2698) [2022-01-17 16:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][380/1251] eta 0:32:39 lr 0.000316 time 2.3358 (2.2500) loss 5.8891 (5.5596) grad_norm 2.8021 (2.2782) [2022-01-17 16:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][390/1251] eta 0:32:10 lr 0.000316 time 1.8126 (2.2419) loss 5.4769 (5.5624) grad_norm 2.2643 (2.2797) [2022-01-17 16:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][400/1251] eta 0:31:38 lr 0.000317 time 2.0097 (2.2308) loss 5.8063 (5.5679) grad_norm 2.1462 (2.2766) [2022-01-17 16:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][410/1251] eta 0:31:10 lr 0.000317 time 1.8987 (2.2238) loss 4.8311 (5.5633) grad_norm 2.2016 (2.2777) [2022-01-17 16:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][420/1251] eta 0:30:45 lr 0.000317 time 2.5222 (2.2204) loss 6.1600 (5.5633) grad_norm 2.6601 (2.2781) [2022-01-17 16:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][430/1251] eta 0:30:28 lr 0.000318 time 5.7390 (2.2267) loss 5.8288 (5.5603) grad_norm 1.8923 (2.2782) [2022-01-17 16:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][440/1251] eta 0:30:07 lr 0.000318 time 1.7046 (2.2283) loss 5.4962 (5.5631) grad_norm 2.2100 (2.2829) [2022-01-17 16:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][450/1251] eta 0:29:47 lr 0.000319 time 2.3662 (2.2316) loss 5.7109 (5.5579) grad_norm 1.7688 (2.2804) [2022-01-17 16:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][460/1251] eta 0:29:24 lr 0.000319 time 1.9124 (2.2307) loss 4.5377 (5.5551) grad_norm 2.5391 (2.2778) [2022-01-17 16:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][470/1251] eta 0:29:01 lr 
0.000319 time 2.5527 (2.2303) loss 5.6360 (5.5551) grad_norm 2.4368 (2.2777) [2022-01-17 16:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][480/1251] eta 0:28:39 lr 0.000320 time 1.9140 (2.2302) loss 5.8408 (5.5531) grad_norm 2.7659 (2.2798) [2022-01-17 16:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][490/1251] eta 0:28:18 lr 0.000320 time 3.0783 (2.2318) loss 4.8119 (5.5506) grad_norm 3.7198 (2.2797) [2022-01-17 16:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][500/1251] eta 0:27:53 lr 0.000321 time 1.6168 (2.2289) loss 5.9015 (5.5497) grad_norm 1.6967 (2.2886) [2022-01-17 16:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][510/1251] eta 0:27:30 lr 0.000321 time 2.2750 (2.2272) loss 5.2145 (5.5478) grad_norm 2.3724 (2.2859) [2022-01-17 16:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][520/1251] eta 0:27:07 lr 0.000321 time 1.8678 (2.2261) loss 5.4198 (5.5434) grad_norm 1.9990 (2.2816) [2022-01-17 16:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][530/1251] eta 0:26:43 lr 0.000322 time 2.2348 (2.2244) loss 5.8779 (5.5457) grad_norm 1.7365 (2.2802) [2022-01-17 16:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][540/1251] eta 0:26:20 lr 0.000322 time 1.8502 (2.2236) loss 5.5697 (5.5427) grad_norm 1.7414 (2.2764) [2022-01-17 16:33:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][550/1251] eta 0:25:57 lr 0.000323 time 1.5814 (2.2223) loss 4.9109 (5.5379) grad_norm 2.1845 (2.2729) [2022-01-17 16:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][560/1251] eta 0:25:36 lr 0.000323 time 1.9924 (2.2236) loss 6.0560 (5.5386) grad_norm 2.1874 (2.2764) [2022-01-17 16:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][570/1251] eta 0:25:13 lr 0.000323 time 2.2410 (2.2225) loss 6.0337 (5.5415) grad_norm 3.0446 (2.2749) [2022-01-17 16:35:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][580/1251] eta 0:24:48 lr 0.000324 time 2.2508 (2.2179) loss 5.6033 (5.5411) grad_norm 1.8585 (2.2758) [2022-01-17 16:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][590/1251] eta 0:24:25 lr 0.000324 time 1.8740 (2.2173) loss 5.7711 (5.5448) grad_norm 1.9674 (2.2723) [2022-01-17 16:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][600/1251] eta 0:24:03 lr 0.000325 time 2.2452 (2.2176) loss 5.9988 (5.5417) grad_norm 2.1219 (2.2730) [2022-01-17 16:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][610/1251] eta 0:23:41 lr 0.000325 time 2.2411 (2.2183) loss 5.1475 (5.5387) grad_norm 2.2555 (2.2777) [2022-01-17 16:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][620/1251] eta 0:23:19 lr 0.000325 time 2.3786 (2.2181) loss 6.0069 (5.5405) grad_norm 2.4435 (2.2769) [2022-01-17 16:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][630/1251] eta 0:22:56 lr 0.000326 time 2.0125 (2.2158) loss 5.3204 (5.5385) grad_norm 2.2127 (2.2768) [2022-01-17 16:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][640/1251] eta 0:22:33 lr 0.000326 time 2.2152 (2.2156) loss 5.0552 (5.5398) grad_norm 2.1375 (2.2780) [2022-01-17 16:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][650/1251] eta 0:22:10 lr 0.000327 time 2.0289 (2.2143) loss 5.5393 (5.5388) grad_norm 1.9092 (2.2753) [2022-01-17 16:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][660/1251] eta 0:21:49 lr 0.000327 time 2.5572 (2.2151) loss 5.8101 (5.5370) grad_norm 2.8929 (2.2758) [2022-01-17 16:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][670/1251] eta 0:21:27 lr 0.000327 time 2.4246 (2.2159) loss 5.7909 (5.5385) grad_norm 2.0941 (2.2744) [2022-01-17 16:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][680/1251] eta 0:21:04 lr 0.000328 time 1.6000 (2.2144) 
loss 5.8570 (5.5406) grad_norm 2.3653 (2.2750) [2022-01-17 16:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][690/1251] eta 0:20:40 lr 0.000328 time 1.6076 (2.2112) loss 5.6102 (5.5390) grad_norm 2.0166 (2.2724) [2022-01-17 16:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][700/1251] eta 0:20:17 lr 0.000329 time 2.5433 (2.2101) loss 5.9363 (5.5399) grad_norm 2.2933 (2.2758) [2022-01-17 16:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][710/1251] eta 0:19:55 lr 0.000329 time 2.2780 (2.2099) loss 5.8168 (5.5385) grad_norm 1.9649 (2.2741) [2022-01-17 16:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][720/1251] eta 0:19:32 lr 0.000329 time 2.2430 (2.2080) loss 4.8964 (5.5355) grad_norm 2.5557 (2.2739) [2022-01-17 16:40:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][730/1251] eta 0:19:09 lr 0.000330 time 1.9061 (2.2067) loss 4.9259 (5.5336) grad_norm 1.9944 (2.2769) [2022-01-17 16:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][740/1251] eta 0:18:47 lr 0.000330 time 2.1284 (2.2064) loss 5.8784 (5.5340) grad_norm 1.8842 (2.2755) [2022-01-17 16:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][750/1251] eta 0:18:25 lr 0.000331 time 1.5774 (2.2070) loss 6.0222 (5.5342) grad_norm 1.8800 (2.2747) [2022-01-17 16:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][760/1251] eta 0:18:04 lr 0.000331 time 2.7262 (2.2083) loss 5.7142 (5.5371) grad_norm 2.7140 (2.2740) [2022-01-17 16:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][770/1251] eta 0:17:42 lr 0.000331 time 1.9203 (2.2085) loss 6.0950 (5.5381) grad_norm 2.4528 (2.2753) [2022-01-17 16:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][780/1251] eta 0:17:20 lr 0.000332 time 1.8858 (2.2091) loss 5.5456 (5.5375) grad_norm 2.4756 (2.2776) [2022-01-17 16:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [6/300][790/1251] eta 0:16:58 lr 0.000332 time 2.0100 (2.2096) loss 5.5669 (5.5378) grad_norm 1.8412 (2.2777) [2022-01-17 16:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][800/1251] eta 0:16:36 lr 0.000333 time 2.5739 (2.2091) loss 4.9090 (5.5350) grad_norm 1.9994 (2.2772) [2022-01-17 16:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][810/1251] eta 0:16:12 lr 0.000333 time 1.9027 (2.2056) loss 5.6617 (5.5342) grad_norm 2.3041 (2.2751) [2022-01-17 16:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][820/1251] eta 0:15:49 lr 0.000333 time 1.9432 (2.2020) loss 4.6357 (5.5344) grad_norm 3.0800 (2.2751) [2022-01-17 16:44:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][830/1251] eta 0:15:26 lr 0.000334 time 1.9398 (2.2010) loss 5.4513 (5.5348) grad_norm 1.9768 (2.2743) [2022-01-17 16:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][840/1251] eta 0:15:04 lr 0.000334 time 2.0606 (2.2007) loss 5.7631 (5.5322) grad_norm 1.9251 (2.2734) [2022-01-17 16:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][850/1251] eta 0:14:42 lr 0.000335 time 2.3305 (2.2000) loss 5.9971 (5.5333) grad_norm 2.8013 (2.2750) [2022-01-17 16:45:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][860/1251] eta 0:14:20 lr 0.000335 time 2.8091 (2.2013) loss 5.0790 (5.5334) grad_norm 2.8112 (2.2743) [2022-01-17 16:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][870/1251] eta 0:13:58 lr 0.000335 time 1.5357 (2.2021) loss 5.5590 (5.5342) grad_norm 2.2007 (2.2745) [2022-01-17 16:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][880/1251] eta 0:13:36 lr 0.000336 time 1.8292 (2.2021) loss 5.7645 (5.5354) grad_norm 1.8195 (2.2715) [2022-01-17 16:46:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][890/1251] eta 0:13:15 lr 0.000336 time 2.1658 (2.2045) loss 5.2864 (5.5367) grad_norm 1.7080 (2.2743) 
[2022-01-17 16:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][900/1251] eta 0:12:54 lr 0.000337 time 2.6850 (2.2053) loss 5.3490 (5.5369) grad_norm 2.0005 (2.2741) [2022-01-17 16:46:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][910/1251] eta 0:12:31 lr 0.000337 time 1.5958 (2.2042) loss 5.3534 (5.5343) grad_norm 2.1514 (2.2728) [2022-01-17 16:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][920/1251] eta 0:12:08 lr 0.000337 time 1.7955 (2.2020) loss 5.6516 (5.5340) grad_norm 1.8866 (2.2738) [2022-01-17 16:47:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][930/1251] eta 0:11:46 lr 0.000338 time 1.8316 (2.2011) loss 6.0549 (5.5334) grad_norm 2.3530 (2.2717) [2022-01-17 16:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][940/1251] eta 0:11:24 lr 0.000338 time 1.7914 (2.2002) loss 5.3071 (5.5351) grad_norm 2.7803 (2.2751) [2022-01-17 16:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][950/1251] eta 0:11:03 lr 0.000339 time 2.6026 (2.2031) loss 5.7710 (5.5362) grad_norm 2.5781 (2.2724) [2022-01-17 16:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][960/1251] eta 0:10:41 lr 0.000339 time 2.5335 (2.2040) loss 5.5571 (5.5360) grad_norm 2.3728 (2.2711) [2022-01-17 16:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][970/1251] eta 0:10:19 lr 0.000339 time 1.8494 (2.2052) loss 5.3675 (5.5342) grad_norm 3.2376 (2.2719) [2022-01-17 16:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][980/1251] eta 0:09:57 lr 0.000340 time 1.8026 (2.2046) loss 5.7792 (5.5314) grad_norm 1.9636 (2.2706) [2022-01-17 16:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][990/1251] eta 0:09:34 lr 0.000340 time 1.6504 (2.2023) loss 5.9542 (5.5341) grad_norm 2.5676 (2.2728) [2022-01-17 16:50:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1000/1251] eta 0:09:12 lr 
0.000341 time 1.6041 (2.1996) loss 5.8801 (5.5342) grad_norm 1.8692 (2.2698) [2022-01-17 16:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1010/1251] eta 0:08:49 lr 0.000341 time 2.3381 (2.1984) loss 4.5887 (5.5328) grad_norm 1.8779 (2.2702) [2022-01-17 16:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1020/1251] eta 0:08:27 lr 0.000341 time 2.9722 (2.1980) loss 4.6826 (5.5330) grad_norm 2.0678 (2.2703) [2022-01-17 16:51:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1030/1251] eta 0:08:05 lr 0.000342 time 2.7292 (2.1972) loss 5.3999 (5.5334) grad_norm 1.8626 (2.2699) [2022-01-17 16:51:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1040/1251] eta 0:07:43 lr 0.000342 time 2.2106 (2.1978) loss 4.9076 (5.5341) grad_norm 2.8893 (2.2719) [2022-01-17 16:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1050/1251] eta 0:07:22 lr 0.000343 time 2.8005 (2.2007) loss 5.6196 (5.5359) grad_norm 2.0281 (2.2735) [2022-01-17 16:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1060/1251] eta 0:07:00 lr 0.000343 time 2.1572 (2.2014) loss 5.6966 (5.5360) grad_norm 2.6518 (2.2710) [2022-01-17 16:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1070/1251] eta 0:06:38 lr 0.000343 time 2.5326 (2.2014) loss 5.0309 (5.5343) grad_norm 2.1880 (2.2723) [2022-01-17 16:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1080/1251] eta 0:06:16 lr 0.000344 time 1.8750 (2.2007) loss 6.0496 (5.5330) grad_norm 2.5213 (2.2700) [2022-01-17 16:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1090/1251] eta 0:05:53 lr 0.000344 time 2.3688 (2.1987) loss 4.9608 (5.5342) grad_norm 2.2303 (2.2682) [2022-01-17 16:53:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1100/1251] eta 0:05:31 lr 0.000345 time 2.1616 (2.1985) loss 5.8436 (5.5345) grad_norm 2.1690 (2.2653) [2022-01-17 16:54:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1110/1251] eta 0:05:09 lr 0.000345 time 2.1008 (2.1975) loss 5.4943 (5.5341) grad_norm 2.1891 (2.2626) [2022-01-17 16:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1120/1251] eta 0:04:47 lr 0.000345 time 2.1688 (2.1965) loss 4.8863 (5.5338) grad_norm 2.8013 (2.2619) [2022-01-17 16:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1130/1251] eta 0:04:25 lr 0.000346 time 1.8803 (2.1961) loss 5.0609 (5.5324) grad_norm 2.9656 (2.2626) [2022-01-17 16:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1140/1251] eta 0:04:03 lr 0.000346 time 2.5020 (2.1957) loss 4.4019 (5.5328) grad_norm 2.0776 (2.2635) [2022-01-17 16:55:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1150/1251] eta 0:03:41 lr 0.000347 time 2.5033 (2.1947) loss 4.8791 (5.5307) grad_norm 1.7723 (2.2611) [2022-01-17 16:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1160/1251] eta 0:03:19 lr 0.000347 time 2.1434 (2.1965) loss 6.0802 (5.5297) grad_norm 2.5291 (2.2603) [2022-01-17 16:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1170/1251] eta 0:02:57 lr 0.000347 time 1.8844 (2.1968) loss 4.9475 (5.5287) grad_norm 1.9919 (2.2608) [2022-01-17 16:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1180/1251] eta 0:02:36 lr 0.000348 time 2.6156 (2.1983) loss 5.6080 (5.5268) grad_norm 2.0328 (2.2582) [2022-01-17 16:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1190/1251] eta 0:02:14 lr 0.000348 time 2.5049 (2.1982) loss 4.9995 (5.5254) grad_norm 1.8657 (2.2563) [2022-01-17 16:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1200/1251] eta 0:01:52 lr 0.000349 time 1.9001 (2.1966) loss 5.2344 (5.5258) grad_norm 1.7630 (2.2572) [2022-01-17 16:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1210/1251] eta 0:01:30 lr 0.000349 time 
2.5508 (2.1954) loss 5.3344 (5.5250) grad_norm 1.9575 (2.2561) [2022-01-17 16:58:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1220/1251] eta 0:01:08 lr 0.000349 time 2.1670 (2.1942) loss 4.7634 (5.5249) grad_norm 1.8831 (2.2557) [2022-01-17 16:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1230/1251] eta 0:00:46 lr 0.000350 time 1.9660 (2.1930) loss 6.1439 (5.5252) grad_norm 2.0656 (2.2541) [2022-01-17 16:58:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1240/1251] eta 0:00:24 lr 0.000350 time 1.7771 (2.1932) loss 5.7630 (5.5259) grad_norm 1.8911 (2.2530) [2022-01-17 16:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [6/300][1250/1251] eta 0:00:02 lr 0.000351 time 1.1650 (2.1881) loss 5.3130 (5.5265) grad_norm 1.9578 (2.2545) [2022-01-17 16:59:09 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 6 training takes 0:45:37 [2022-01-17 16:59:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.667 (18.667) Loss 3.2093 (3.2093) Acc@1 34.180 (34.180) Acc@5 59.863 (59.863) [2022-01-17 16:59:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.562 (3.403) Loss 3.1572 (3.2280) Acc@1 33.691 (33.958) Acc@5 61.035 (59.482) [2022-01-17 17:00:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.611 (2.530) Loss 3.2199 (3.2240) Acc@1 35.645 (34.189) Acc@5 60.449 (59.882) [2022-01-17 17:00:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.919 (2.304) Loss 3.2568 (3.2263) Acc@1 33.301 (34.249) Acc@5 58.203 (59.939) [2022-01-17 17:00:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.898 (2.203) Loss 3.1165 (3.2176) Acc@1 36.328 (34.201) Acc@5 61.426 (60.056) [2022-01-17 17:00:46 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 34.096 Acc@5 59.946 [2022-01-17 17:00:46 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 34.1% 
[2022-01-17 17:00:46 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 34.10% [2022-01-17 17:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][0/1251] eta 7:36:34 lr 0.000351 time 21.8983 (21.8983) loss 5.5266 (5.5266) grad_norm 1.8854 (1.8854) [2022-01-17 17:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][10/1251] eta 1:25:22 lr 0.000351 time 1.7810 (4.1276) loss 6.0473 (5.5233) grad_norm 1.8507 (1.9736) [2022-01-17 17:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][20/1251] eta 1:04:23 lr 0.000351 time 1.4187 (3.1385) loss 5.8783 (5.4609) grad_norm 2.5489 (2.1482) [2022-01-17 17:02:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][30/1251] eta 0:57:13 lr 0.000352 time 1.3888 (2.8118) loss 5.7280 (5.4982) grad_norm 2.1928 (2.1890) [2022-01-17 17:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][40/1251] eta 0:55:11 lr 0.000352 time 6.2666 (2.7343) loss 5.3535 (5.4989) grad_norm 2.2627 (2.2440) [2022-01-17 17:03:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][50/1251] eta 0:52:43 lr 0.000353 time 2.4684 (2.6337) loss 5.2552 (5.4603) grad_norm 2.7368 (2.2066) [2022-01-17 17:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][60/1251] eta 0:51:32 lr 0.000353 time 2.0480 (2.5970) loss 5.2292 (5.4635) grad_norm 2.1253 (2.2074) [2022-01-17 17:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][70/1251] eta 0:49:57 lr 0.000353 time 1.5624 (2.5379) loss 5.6423 (5.4752) grad_norm 2.0436 (2.2781) [2022-01-17 17:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][80/1251] eta 0:49:09 lr 0.000354 time 3.8377 (2.5186) loss 5.6898 (5.4830) grad_norm 3.0381 (2.3027) [2022-01-17 17:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][90/1251] eta 0:47:44 lr 0.000354 time 1.8839 (2.4673) loss 4.6152 (5.4930) grad_norm 1.8214 (2.3349) [2022-01-17 17:04:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][100/1251] eta 0:46:43 lr 0.000355 time 2.4960 (2.4355) loss 5.0862 (5.4941) grad_norm 2.3008 (2.3134) [2022-01-17 17:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][110/1251] eta 0:45:44 lr 0.000355 time 1.8368 (2.4053) loss 5.3697 (5.4868) grad_norm 2.4762 (2.3058) [2022-01-17 17:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][120/1251] eta 0:45:06 lr 0.000355 time 3.1602 (2.3929) loss 5.6393 (5.4906) grad_norm 2.2454 (2.2812) [2022-01-17 17:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][130/1251] eta 0:44:34 lr 0.000356 time 2.7548 (2.3861) loss 5.8725 (5.4939) grad_norm 1.9905 (2.2569) [2022-01-17 17:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][140/1251] eta 0:43:48 lr 0.000356 time 2.1415 (2.3656) loss 6.0148 (5.4797) grad_norm 3.1178 (2.2757) [2022-01-17 17:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][150/1251] eta 0:43:02 lr 0.000357 time 2.2030 (2.3456) loss 5.6257 (5.4675) grad_norm 2.0429 (2.2691) [2022-01-17 17:07:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][160/1251] eta 0:42:32 lr 0.000357 time 3.0979 (2.3394) loss 5.6296 (5.4748) grad_norm 1.8074 (2.2668) [2022-01-17 17:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][170/1251] eta 0:41:56 lr 0.000357 time 2.4079 (2.3277) loss 5.3643 (5.4736) grad_norm 1.9598 (2.2614) [2022-01-17 17:07:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][180/1251] eta 0:41:29 lr 0.000358 time 1.7973 (2.3248) loss 5.7103 (5.4799) grad_norm 3.0605 (2.2530) [2022-01-17 17:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][190/1251] eta 0:40:57 lr 0.000358 time 2.1838 (2.3166) loss 5.5767 (5.4682) grad_norm 2.3078 (2.2484) [2022-01-17 17:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][200/1251] eta 0:40:32 lr 0.000359 time 2.6520 (2.3147) 
loss 5.9169 (5.4633) grad_norm 1.7405 (2.2426) [2022-01-17 17:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][210/1251] eta 0:39:58 lr 0.000359 time 2.2319 (2.3040) loss 5.7684 (5.4806) grad_norm 2.5121 (2.2328) [2022-01-17 17:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][220/1251] eta 0:39:23 lr 0.000359 time 1.9182 (2.2928) loss 5.6110 (5.4723) grad_norm 2.9584 (2.2374) [2022-01-17 17:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][230/1251] eta 0:38:53 lr 0.000360 time 2.2909 (2.2856) loss 6.0593 (5.4742) grad_norm 5.1263 (2.2543) [2022-01-17 17:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][240/1251] eta 0:38:26 lr 0.000360 time 2.6485 (2.2814) loss 4.7800 (5.4647) grad_norm 1.8752 (2.2610) [2022-01-17 17:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][250/1251] eta 0:37:59 lr 0.000361 time 2.2074 (2.2775) loss 4.9385 (5.4474) grad_norm 2.2291 (2.2578) [2022-01-17 17:10:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][260/1251] eta 0:37:38 lr 0.000361 time 1.9800 (2.2788) loss 6.1781 (5.4483) grad_norm 2.2163 (2.2529) [2022-01-17 17:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][270/1251] eta 0:37:18 lr 0.000361 time 3.1063 (2.2820) loss 6.2110 (5.4521) grad_norm 1.7969 (2.2457) [2022-01-17 17:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][280/1251] eta 0:36:48 lr 0.000362 time 1.9862 (2.2741) loss 5.3255 (5.4513) grad_norm 2.5355 (2.2382) [2022-01-17 17:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][290/1251] eta 0:36:15 lr 0.000362 time 1.8410 (2.2641) loss 4.5908 (5.4410) grad_norm 1.6803 (2.2431) [2022-01-17 17:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][300/1251] eta 0:35:50 lr 0.000363 time 1.9666 (2.2618) loss 5.0327 (5.4395) grad_norm 1.9022 (2.2367) [2022-01-17 17:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [7/300][310/1251] eta 0:35:24 lr 0.000363 time 2.2721 (2.2576) loss 5.8206 (5.4356) grad_norm 1.7604 (2.2327) [2022-01-17 17:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][320/1251] eta 0:34:55 lr 0.000363 time 2.4403 (2.2510) loss 5.1082 (5.4375) grad_norm 2.1551 (2.2256) [2022-01-17 17:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][330/1251] eta 0:34:35 lr 0.000364 time 2.6472 (2.2530) loss 4.7288 (5.4367) grad_norm 1.8064 (2.2235) [2022-01-17 17:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][340/1251] eta 0:34:15 lr 0.000364 time 2.0965 (2.2564) loss 5.3749 (5.4349) grad_norm 2.0920 (2.2194) [2022-01-17 17:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][350/1251] eta 0:33:50 lr 0.000365 time 1.6939 (2.2539) loss 5.8399 (5.4403) grad_norm 1.9423 (2.2255) [2022-01-17 17:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][360/1251] eta 0:33:26 lr 0.000365 time 2.4629 (2.2517) loss 4.8237 (5.4353) grad_norm 1.9727 (2.2238) [2022-01-17 17:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][370/1251] eta 0:33:02 lr 0.000365 time 2.8110 (2.2498) loss 5.7757 (5.4346) grad_norm 1.7449 (2.2187) [2022-01-17 17:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][380/1251] eta 0:32:40 lr 0.000366 time 1.8256 (2.2512) loss 4.8397 (5.4338) grad_norm 2.1042 (2.2177) [2022-01-17 17:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][390/1251] eta 0:32:13 lr 0.000366 time 1.6988 (2.2461) loss 5.4526 (5.4340) grad_norm 1.9201 (2.2193) [2022-01-17 17:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][400/1251] eta 0:31:47 lr 0.000367 time 1.9095 (2.2418) loss 5.3999 (5.4300) grad_norm 2.0381 (2.2155) [2022-01-17 17:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][410/1251] eta 0:31:23 lr 0.000367 time 2.2363 (2.2397) loss 5.7792 (5.4319) grad_norm 3.0215 (2.2140) 
[2022-01-17 17:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][420/1251] eta 0:31:01 lr 0.000367 time 2.9491 (2.2396) loss 5.6025 (5.4254) grad_norm 1.8805 (2.2186) [2022-01-17 17:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][430/1251] eta 0:30:37 lr 0.000368 time 1.9373 (2.2385) loss 5.7378 (5.4214) grad_norm 2.3686 (2.2144) [2022-01-17 17:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][440/1251] eta 0:30:13 lr 0.000368 time 2.1251 (2.2356) loss 6.1038 (5.4232) grad_norm 2.2896 (2.2145) [2022-01-17 17:17:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][450/1251] eta 0:29:51 lr 0.000369 time 2.1642 (2.2363) loss 5.5394 (5.4193) grad_norm 2.2412 (2.2120) [2022-01-17 17:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][460/1251] eta 0:29:30 lr 0.000369 time 2.9878 (2.2377) loss 5.4733 (5.4243) grad_norm 2.7141 (2.2121) [2022-01-17 17:18:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][470/1251] eta 0:29:04 lr 0.000369 time 1.8475 (2.2333) loss 5.6245 (5.4259) grad_norm 2.0532 (2.2176) [2022-01-17 17:18:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][480/1251] eta 0:28:40 lr 0.000370 time 2.2632 (2.2311) loss 5.9142 (5.4234) grad_norm 2.2980 (2.2174) [2022-01-17 17:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][490/1251] eta 0:28:15 lr 0.000370 time 2.5109 (2.2280) loss 5.1200 (5.4207) grad_norm 2.5598 (2.2297) [2022-01-17 17:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][500/1251] eta 0:27:50 lr 0.000371 time 2.6457 (2.2241) loss 5.2831 (5.4217) grad_norm 1.6177 (2.2284) [2022-01-17 17:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][510/1251] eta 0:27:30 lr 0.000371 time 2.5199 (2.2279) loss 5.9401 (5.4253) grad_norm 2.2446 (2.2272) [2022-01-17 17:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][520/1251] eta 0:27:10 lr 0.000371 
time 2.7889 (2.2311) loss 6.1537 (5.4266) grad_norm 1.8328 (2.2257) [2022-01-17 17:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][530/1251] eta 0:26:49 lr 0.000372 time 2.1823 (2.2327) loss 5.2901 (5.4263) grad_norm 1.9174 (2.2258) [2022-01-17 17:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][540/1251] eta 0:26:24 lr 0.000372 time 1.7427 (2.2281) loss 5.1853 (5.4227) grad_norm 1.8727 (2.2229) [2022-01-17 17:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][550/1251] eta 0:25:58 lr 0.000373 time 1.9541 (2.2226) loss 5.5299 (5.4248) grad_norm 2.3955 (2.2172) [2022-01-17 17:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][560/1251] eta 0:25:33 lr 0.000373 time 2.7614 (2.2193) loss 5.5926 (5.4219) grad_norm 2.0179 (2.2174) [2022-01-17 17:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][570/1251] eta 0:25:11 lr 0.000373 time 2.2440 (2.2191) loss 4.8544 (5.4202) grad_norm 2.6643 (2.2182) [2022-01-17 17:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][580/1251] eta 0:24:47 lr 0.000374 time 1.8210 (2.2171) loss 5.6430 (5.4174) grad_norm 2.1367 (2.2167) [2022-01-17 17:22:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][590/1251] eta 0:24:25 lr 0.000374 time 2.2017 (2.2172) loss 5.6063 (5.4144) grad_norm 2.1585 (2.2188) [2022-01-17 17:22:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][600/1251] eta 0:24:03 lr 0.000375 time 2.4243 (2.2169) loss 5.3329 (5.4147) grad_norm 2.4486 (2.2232) [2022-01-17 17:23:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][610/1251] eta 0:23:40 lr 0.000375 time 2.2218 (2.2167) loss 5.7395 (5.4200) grad_norm 2.0165 (2.2191) [2022-01-17 17:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][620/1251] eta 0:23:18 lr 0.000375 time 1.7174 (2.2163) loss 5.4722 (5.4157) grad_norm 2.0385 (2.2181) [2022-01-17 17:24:06 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [7/300][630/1251] eta 0:22:56 lr 0.000376 time 2.3469 (2.2173) loss 5.3847 (5.4171) grad_norm 2.3081 (2.2159) [2022-01-17 17:24:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][640/1251] eta 0:22:34 lr 0.000376 time 2.7390 (2.2163) loss 5.2629 (5.4195) grad_norm 1.9600 (2.2120) [2022-01-17 17:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][650/1251] eta 0:22:13 lr 0.000377 time 2.9566 (2.2187) loss 5.9952 (5.4191) grad_norm 1.7828 (2.2163) [2022-01-17 17:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][660/1251] eta 0:21:51 lr 0.000377 time 2.1792 (2.2191) loss 5.6583 (5.4208) grad_norm 1.8115 (2.2127) [2022-01-17 17:25:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][670/1251] eta 0:21:29 lr 0.000377 time 2.2417 (2.2192) loss 5.3284 (5.4240) grad_norm 2.5046 (2.2099) [2022-01-17 17:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][680/1251] eta 0:21:06 lr 0.000378 time 2.5056 (2.2177) loss 5.6837 (5.4237) grad_norm 2.1611 (2.2097) [2022-01-17 17:26:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][690/1251] eta 0:20:41 lr 0.000378 time 2.2766 (2.2132) loss 5.8764 (5.4247) grad_norm 1.8954 (2.2070) [2022-01-17 17:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][700/1251] eta 0:20:18 lr 0.000379 time 2.1128 (2.2109) loss 5.5106 (5.4222) grad_norm 2.5637 (2.2081) [2022-01-17 17:26:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][710/1251] eta 0:19:55 lr 0.000379 time 2.2146 (2.2103) loss 4.6349 (5.4238) grad_norm 1.8007 (2.2089) [2022-01-17 17:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][720/1251] eta 0:19:33 lr 0.000379 time 2.7584 (2.2107) loss 4.5463 (5.4243) grad_norm 1.8357 (2.2081) [2022-01-17 17:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][730/1251] eta 0:19:11 lr 0.000380 time 2.2435 (2.2103) loss 5.7862 (5.4275) 
grad_norm 2.0859 (2.2069) [2022-01-17 17:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][740/1251] eta 0:18:48 lr 0.000380 time 1.7073 (2.2087) loss 4.8839 (5.4275) grad_norm 1.8117 (2.2042) [2022-01-17 17:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][750/1251] eta 0:18:26 lr 0.000381 time 2.1957 (2.2077) loss 5.0405 (5.4209) grad_norm 2.2726 (2.2062) [2022-01-17 17:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][760/1251] eta 0:18:03 lr 0.000381 time 2.2501 (2.2065) loss 5.1383 (5.4162) grad_norm 2.7400 (2.2085) [2022-01-17 17:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][770/1251] eta 0:17:40 lr 0.000381 time 1.8287 (2.2058) loss 5.7198 (5.4173) grad_norm 2.4302 (2.2088) [2022-01-17 17:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][780/1251] eta 0:17:18 lr 0.000382 time 2.0301 (2.2049) loss 5.3555 (5.4196) grad_norm 2.0128 (2.2094) [2022-01-17 17:29:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][790/1251] eta 0:16:56 lr 0.000382 time 2.2443 (2.2054) loss 5.2416 (5.4177) grad_norm 2.2992 (2.2102) [2022-01-17 17:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][800/1251] eta 0:16:35 lr 0.000383 time 2.1649 (2.2063) loss 5.9582 (5.4175) grad_norm 2.1207 (2.2130) [2022-01-17 17:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][810/1251] eta 0:16:14 lr 0.000383 time 2.4659 (2.2088) loss 4.6403 (5.4160) grad_norm 2.2081 (2.2130) [2022-01-17 17:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][820/1251] eta 0:15:51 lr 0.000383 time 2.2829 (2.2083) loss 4.4594 (5.4152) grad_norm 2.1385 (2.2114) [2022-01-17 17:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][830/1251] eta 0:15:29 lr 0.000384 time 2.5321 (2.2090) loss 5.3369 (5.4160) grad_norm 1.8065 (2.2084) [2022-01-17 17:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[7/300][840/1251] eta 0:15:07 lr 0.000384 time 2.7859 (2.2086) loss 4.9165 (5.4120) grad_norm 1.5879 (2.2066) [2022-01-17 17:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][850/1251] eta 0:14:45 lr 0.000385 time 2.6057 (2.2077) loss 5.3791 (5.4120) grad_norm 2.2749 (2.2053) [2022-01-17 17:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][860/1251] eta 0:14:22 lr 0.000385 time 1.6881 (2.2059) loss 5.3986 (5.4093) grad_norm 2.2140 (2.2065) [2022-01-17 17:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][870/1251] eta 0:13:59 lr 0.000385 time 2.1143 (2.2047) loss 4.2866 (5.4104) grad_norm 1.8349 (2.2054) [2022-01-17 17:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][880/1251] eta 0:13:38 lr 0.000386 time 3.5441 (2.2058) loss 5.4977 (5.4097) grad_norm 2.2714 (2.2038) [2022-01-17 17:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][890/1251] eta 0:13:16 lr 0.000386 time 2.4842 (2.2054) loss 5.7823 (5.4070) grad_norm 2.2409 (2.2019) [2022-01-17 17:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][900/1251] eta 0:12:53 lr 0.000387 time 1.8723 (2.2037) loss 4.7079 (5.4074) grad_norm 2.6323 (2.2035) [2022-01-17 17:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][910/1251] eta 0:12:31 lr 0.000387 time 2.6462 (2.2039) loss 5.6302 (5.4091) grad_norm 2.0948 (2.2068) [2022-01-17 17:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][920/1251] eta 0:12:09 lr 0.000387 time 3.1271 (2.2045) loss 5.5020 (5.4106) grad_norm 1.8911 (2.2068) [2022-01-17 17:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][930/1251] eta 0:11:47 lr 0.000388 time 2.5310 (2.2047) loss 5.5033 (5.4117) grad_norm 1.5821 (2.2035) [2022-01-17 17:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][940/1251] eta 0:11:25 lr 0.000388 time 1.7343 (2.2056) loss 5.7339 (5.4118) grad_norm 2.0920 (2.2035) 
[2022-01-17 17:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][950/1251] eta 0:11:03 lr 0.000389 time 2.2310 (2.2053) loss 5.3983 (5.4121) grad_norm 2.0138 (2.2015) [2022-01-17 17:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][960/1251] eta 0:10:41 lr 0.000389 time 3.3088 (2.2052) loss 5.4104 (5.4125) grad_norm 1.6609 (2.2001) [2022-01-17 17:36:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][970/1251] eta 0:10:19 lr 0.000389 time 2.2554 (2.2034) loss 4.8464 (5.4105) grad_norm 3.0579 (2.2006) [2022-01-17 17:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][980/1251] eta 0:09:56 lr 0.000390 time 1.8362 (2.2012) loss 5.5330 (5.4097) grad_norm 1.7557 (2.2005) [2022-01-17 17:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][990/1251] eta 0:09:34 lr 0.000390 time 2.7762 (2.2012) loss 5.5368 (5.4104) grad_norm 2.4962 (2.2016) [2022-01-17 17:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1000/1251] eta 0:09:12 lr 0.000391 time 2.6320 (2.2025) loss 5.7143 (5.4083) grad_norm 2.0663 (2.2020) [2022-01-17 17:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1010/1251] eta 0:08:50 lr 0.000391 time 1.9094 (2.2023) loss 6.0520 (5.4071) grad_norm 2.7697 (2.2008) [2022-01-17 17:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1020/1251] eta 0:08:28 lr 0.000391 time 2.5080 (2.2010) loss 5.3017 (5.4058) grad_norm 1.6845 (2.2009) [2022-01-17 17:38:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1030/1251] eta 0:08:06 lr 0.000392 time 1.9101 (2.2005) loss 5.4486 (5.4055) grad_norm 1.9520 (2.2009) [2022-01-17 17:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1040/1251] eta 0:07:44 lr 0.000392 time 2.2106 (2.1991) loss 5.3071 (5.4047) grad_norm 3.1788 (2.2032) [2022-01-17 17:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1050/1251] eta 0:07:21 lr 
0.000393 time 2.4731 (2.1980) loss 5.1845 (5.4045) grad_norm 2.6777 (2.2025) [2022-01-17 17:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1060/1251] eta 0:06:59 lr 0.000393 time 1.5390 (2.1969) loss 4.7150 (5.4030) grad_norm 2.0363 (2.2028) [2022-01-17 17:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1070/1251] eta 0:06:37 lr 0.000393 time 1.8674 (2.1961) loss 5.5325 (5.4038) grad_norm 1.7864 (2.2002) [2022-01-17 17:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1080/1251] eta 0:06:15 lr 0.000394 time 3.1297 (2.1978) loss 5.0519 (5.4025) grad_norm 2.3255 (2.2019) [2022-01-17 17:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1090/1251] eta 0:05:54 lr 0.000394 time 1.7682 (2.1990) loss 5.6215 (5.4013) grad_norm 2.2135 (2.2026) [2022-01-17 17:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1100/1251] eta 0:05:32 lr 0.000395 time 2.1278 (2.2001) loss 5.4985 (5.3984) grad_norm 1.8017 (2.2028) [2022-01-17 17:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1110/1251] eta 0:05:10 lr 0.000395 time 2.1780 (2.1996) loss 4.9200 (5.3983) grad_norm 1.8018 (2.2000) [2022-01-17 17:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1120/1251] eta 0:04:48 lr 0.000395 time 2.6645 (2.1994) loss 5.6971 (5.3979) grad_norm 1.8259 (2.1994) [2022-01-17 17:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1130/1251] eta 0:04:25 lr 0.000396 time 1.8818 (2.1977) loss 5.5307 (5.3977) grad_norm 1.7707 (2.2009) [2022-01-17 17:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1140/1251] eta 0:04:04 lr 0.000396 time 1.8943 (2.1984) loss 4.9883 (5.3966) grad_norm 2.2883 (2.2013) [2022-01-17 17:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1150/1251] eta 0:03:41 lr 0.000397 time 2.2264 (2.1976) loss 5.0783 (5.3957) grad_norm 1.9531 (2.1983) [2022-01-17 17:43:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1160/1251] eta 0:03:19 lr 0.000397 time 2.3069 (2.1966) loss 5.7720 (5.3970) grad_norm 2.3118 (2.1993) [2022-01-17 17:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1170/1251] eta 0:02:57 lr 0.000397 time 1.5861 (2.1946) loss 5.7504 (5.3960) grad_norm 2.1876 (2.2001) [2022-01-17 17:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1180/1251] eta 0:02:35 lr 0.000398 time 1.8103 (2.1932) loss 4.8857 (5.3955) grad_norm 1.8828 (2.2030) [2022-01-17 17:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1190/1251] eta 0:02:13 lr 0.000398 time 2.4143 (2.1927) loss 4.7656 (5.3964) grad_norm 2.2024 (2.2020) [2022-01-17 17:44:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1200/1251] eta 0:01:51 lr 0.000399 time 2.2285 (2.1928) loss 4.4077 (5.3944) grad_norm 1.7749 (2.2002) [2022-01-17 17:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1210/1251] eta 0:01:29 lr 0.000399 time 1.8403 (2.1921) loss 5.5870 (5.3939) grad_norm 2.3565 (2.1994) [2022-01-17 17:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1220/1251] eta 0:01:07 lr 0.000399 time 2.7211 (2.1928) loss 4.4041 (5.3931) grad_norm 1.8609 (2.1986) [2022-01-17 17:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1230/1251] eta 0:00:46 lr 0.000400 time 1.9464 (2.1933) loss 5.6565 (5.3926) grad_norm 1.7726 (2.1975) [2022-01-17 17:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1240/1251] eta 0:00:24 lr 0.000400 time 1.9531 (2.1942) loss 5.1949 (5.3908) grad_norm 1.8552 (2.1972) [2022-01-17 17:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [7/300][1250/1251] eta 0:00:02 lr 0.000401 time 1.1336 (2.1891) loss 5.0588 (5.3906) grad_norm 2.1447 (2.1954) [2022-01-17 17:46:25 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 7 training takes 0:45:38 [2022-01-17 17:46:44 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.156 (18.156) Loss 2.9763 (2.9763) Acc@1 39.258 (39.258) Acc@5 65.137 (65.137) [2022-01-17 17:47:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.605 (3.471) Loss 2.9139 (2.9876) Acc@1 41.113 (37.855) Acc@5 66.211 (64.142) [2022-01-17 17:47:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.281 (2.616) Loss 2.9825 (2.9907) Acc@1 39.062 (38.058) Acc@5 63.477 (63.709) [2022-01-17 17:47:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.602 (2.266) Loss 3.0276 (2.9984) Acc@1 38.086 (37.954) Acc@5 64.648 (63.697) [2022-01-17 17:47:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.816 (2.198) Loss 3.0226 (3.0038) Acc@1 37.207 (37.776) Acc@5 62.402 (63.603) [2022-01-17 17:48:03 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 37.788 Acc@5 63.680 [2022-01-17 17:48:03 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 37.8% [2022-01-17 17:48:03 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 37.79% [2022-01-17 17:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][0/1251] eta 7:14:04 lr 0.000401 time 20.8192 (20.8192) loss 5.5628 (5.5628) grad_norm 1.9724 (1.9724) [2022-01-17 17:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][10/1251] eta 1:23:00 lr 0.000401 time 2.5947 (4.0133) loss 5.1445 (5.5513) grad_norm 1.7086 (1.9615) [2022-01-17 17:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][20/1251] eta 1:04:09 lr 0.000401 time 1.3398 (3.1273) loss 5.7965 (5.5646) grad_norm 2.2825 (2.0620) [2022-01-17 17:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][30/1251] eta 0:57:49 lr 0.000402 time 1.5572 (2.8412) loss 5.9163 (5.5515) grad_norm 2.0440 (2.2090) [2022-01-17 17:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][40/1251] eta 0:55:11 
lr 0.000402 time 3.6250 (2.7349) loss 5.3108 (5.4327) grad_norm 2.2362 (2.2047) [2022-01-17 17:50:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][50/1251] eta 0:53:31 lr 0.000403 time 2.5030 (2.6743) loss 5.9845 (5.4797) grad_norm 2.2483 (2.1946) [2022-01-17 17:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][60/1251] eta 0:51:39 lr 0.000403 time 1.5630 (2.6027) loss 5.2644 (5.4578) grad_norm 1.6895 (2.1881) [2022-01-17 17:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][70/1251] eta 0:49:48 lr 0.000403 time 1.5917 (2.5303) loss 5.3442 (5.4600) grad_norm 1.9103 (2.1829) [2022-01-17 17:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][80/1251] eta 0:48:27 lr 0.000404 time 2.2633 (2.4826) loss 4.4144 (5.4417) grad_norm 2.2170 (2.2037) [2022-01-17 17:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][90/1251] eta 0:47:22 lr 0.000404 time 1.8965 (2.4485) loss 5.7345 (5.4414) grad_norm 1.9091 (2.2135) [2022-01-17 17:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][100/1251] eta 0:46:08 lr 0.000405 time 2.2627 (2.4052) loss 4.7701 (5.4099) grad_norm 1.8791 (2.1901) [2022-01-17 17:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][110/1251] eta 0:45:24 lr 0.000405 time 2.5159 (2.3880) loss 5.5065 (5.4152) grad_norm 1.5490 (2.1627) [2022-01-17 17:52:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][120/1251] eta 0:44:43 lr 0.000405 time 1.7605 (2.3731) loss 5.0336 (5.4030) grad_norm 2.0659 (2.1475) [2022-01-17 17:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][130/1251] eta 0:44:09 lr 0.000406 time 1.8680 (2.3634) loss 5.7386 (5.4062) grad_norm 1.9514 (2.1307) [2022-01-17 17:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][140/1251] eta 0:43:28 lr 0.000406 time 3.0256 (2.3479) loss 4.7255 (5.3821) grad_norm 2.3540 (2.1229) [2022-01-17 17:53:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][150/1251] eta 0:42:44 lr 0.000407 time 1.6419 (2.3293) loss 4.5670 (5.3712) grad_norm 2.2679 (2.1228) [2022-01-17 17:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][160/1251] eta 0:42:03 lr 0.000407 time 1.8824 (2.3128) loss 5.8444 (5.3780) grad_norm 2.1255 (2.1277) [2022-01-17 17:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][170/1251] eta 0:41:33 lr 0.000407 time 2.3409 (2.3063) loss 5.5403 (5.3838) grad_norm 1.9666 (2.1182) [2022-01-17 17:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][180/1251] eta 0:41:11 lr 0.000408 time 2.5145 (2.3075) loss 5.4374 (5.3734) grad_norm 2.9272 (2.1262) [2022-01-17 17:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][190/1251] eta 0:40:45 lr 0.000408 time 2.1140 (2.3049) loss 5.9266 (5.3772) grad_norm 2.1786 (2.1347) [2022-01-17 17:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][200/1251] eta 0:40:16 lr 0.000409 time 1.5871 (2.2991) loss 4.7279 (5.3818) grad_norm 1.9954 (2.1286) [2022-01-17 17:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][210/1251] eta 0:39:51 lr 0.000409 time 2.5316 (2.2976) loss 4.3471 (5.3751) grad_norm 1.9803 (2.1114) [2022-01-17 17:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][220/1251] eta 0:39:17 lr 0.000409 time 1.8815 (2.2865) loss 4.4958 (5.3613) grad_norm 2.3220 (2.1125) [2022-01-17 17:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][230/1251] eta 0:38:38 lr 0.000410 time 1.9559 (2.2708) loss 4.3203 (5.3595) grad_norm 2.7223 (2.1196) [2022-01-17 17:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][240/1251] eta 0:38:10 lr 0.000410 time 1.9744 (2.2653) loss 5.9232 (5.3665) grad_norm 1.8491 (2.1274) [2022-01-17 17:57:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][250/1251] eta 0:37:46 lr 0.000411 time 2.9762 (2.2640) 
loss 5.7086 (5.3752) grad_norm 2.0795 (2.1209) [2022-01-17 17:57:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][260/1251] eta 0:37:22 lr 0.000411 time 1.5816 (2.2626) loss 5.6786 (5.3762) grad_norm 1.9752 (2.1168) [2022-01-17 17:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][270/1251] eta 0:36:56 lr 0.000411 time 1.8856 (2.2595) loss 5.8884 (5.3702) grad_norm 1.6652 (2.1169) [2022-01-17 17:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][280/1251] eta 0:36:36 lr 0.000412 time 2.6254 (2.2617) loss 5.9944 (5.3697) grad_norm 2.2313 (2.1267) [2022-01-17 17:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][290/1251] eta 0:36:12 lr 0.000412 time 2.7339 (2.2602) loss 5.6648 (5.3675) grad_norm 1.8735 (2.1242) [2022-01-17 17:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][300/1251] eta 0:35:49 lr 0.000413 time 1.6855 (2.2599) loss 4.5615 (5.3709) grad_norm 3.2572 (2.1287) [2022-01-17 17:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][310/1251] eta 0:35:18 lr 0.000413 time 1.8657 (2.2512) loss 5.1692 (5.3591) grad_norm 1.8607 (2.1234) [2022-01-17 18:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][320/1251] eta 0:34:56 lr 0.000413 time 3.2109 (2.2519) loss 5.6570 (5.3565) grad_norm 1.8569 (2.1221) [2022-01-17 18:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][330/1251] eta 0:34:27 lr 0.000414 time 1.6038 (2.2446) loss 5.8406 (5.3603) grad_norm 1.7663 (2.1220) [2022-01-17 18:00:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][340/1251] eta 0:34:02 lr 0.000414 time 2.2403 (2.2418) loss 4.5779 (5.3615) grad_norm 1.7823 (2.1189) [2022-01-17 18:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][350/1251] eta 0:33:38 lr 0.000415 time 1.8128 (2.2407) loss 5.6965 (5.3626) grad_norm 2.3784 (2.1155) [2022-01-17 18:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [8/300][360/1251] eta 0:33:19 lr 0.000415 time 3.8875 (2.2439) loss 5.2899 (5.3661) grad_norm 1.8885 (2.1108) [2022-01-17 18:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][370/1251] eta 0:32:54 lr 0.000415 time 1.6417 (2.2409) loss 4.6492 (5.3558) grad_norm 4.6584 (2.1181) [2022-01-17 18:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][380/1251] eta 0:32:32 lr 0.000416 time 1.8810 (2.2416) loss 5.3441 (5.3513) grad_norm 1.9520 (2.1179) [2022-01-17 18:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][390/1251] eta 0:32:04 lr 0.000416 time 1.8427 (2.2347) loss 4.7989 (5.3503) grad_norm 2.5181 (2.1182) [2022-01-17 18:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][400/1251] eta 0:31:45 lr 0.000417 time 3.7039 (2.2386) loss 5.4667 (5.3431) grad_norm 2.1536 (2.1156) [2022-01-17 18:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][410/1251] eta 0:31:20 lr 0.000417 time 1.9870 (2.2363) loss 5.1106 (5.3428) grad_norm 3.7348 (2.1193) [2022-01-17 18:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][420/1251] eta 0:30:57 lr 0.000417 time 1.8721 (2.2351) loss 4.4868 (5.3354) grad_norm 2.1869 (2.1319) [2022-01-17 18:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][430/1251] eta 0:30:31 lr 0.000418 time 2.0046 (2.2309) loss 4.5511 (5.3312) grad_norm 1.8495 (2.1304) [2022-01-17 18:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][440/1251] eta 0:30:10 lr 0.000418 time 3.6056 (2.2324) loss 5.1911 (5.3331) grad_norm 2.4764 (2.1311) [2022-01-17 18:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][450/1251] eta 0:29:48 lr 0.000419 time 1.9911 (2.2331) loss 5.0845 (5.3303) grad_norm 1.7415 (2.1309) [2022-01-17 18:05:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][460/1251] eta 0:29:25 lr 0.000419 time 2.2889 (2.2318) loss 5.6094 (5.3295) grad_norm 1.9346 (2.1309) 
[2022-01-17 18:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][470/1251] eta 0:28:58 lr 0.000419 time 1.8560 (2.2259) loss 5.1411 (5.3321) grad_norm 2.8211 (2.1333) [2022-01-17 18:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][480/1251] eta 0:28:37 lr 0.000420 time 3.0937 (2.2274) loss 5.2125 (5.3305) grad_norm 2.4568 (2.1325) [2022-01-17 18:06:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][490/1251] eta 0:28:12 lr 0.000420 time 2.1813 (2.2246) loss 5.8790 (5.3288) grad_norm 1.9851 (2.1331) [2022-01-17 18:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][500/1251] eta 0:27:48 lr 0.000421 time 2.5208 (2.2216) loss 5.2296 (5.3293) grad_norm 1.5706 (2.1305) [2022-01-17 18:06:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][510/1251] eta 0:27:24 lr 0.000421 time 2.2864 (2.2193) loss 5.3954 (5.3278) grad_norm 1.5751 (2.1258) [2022-01-17 18:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][520/1251] eta 0:27:03 lr 0.000421 time 3.0427 (2.2213) loss 5.9323 (5.3296) grad_norm 2.3820 (2.1283) [2022-01-17 18:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][530/1251] eta 0:26:39 lr 0.000422 time 2.5278 (2.2191) loss 4.0431 (5.3235) grad_norm 2.1963 (2.1232) [2022-01-17 18:08:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][540/1251] eta 0:26:18 lr 0.000422 time 2.4930 (2.2197) loss 5.6995 (5.3243) grad_norm 2.2151 (2.1247) [2022-01-17 18:08:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][550/1251] eta 0:25:56 lr 0.000423 time 2.3922 (2.2204) loss 4.2914 (5.3217) grad_norm 1.7225 (2.1261) [2022-01-17 18:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][560/1251] eta 0:25:35 lr 0.000423 time 2.6720 (2.2222) loss 5.5879 (5.3144) grad_norm 1.8014 (2.1253) [2022-01-17 18:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][570/1251] eta 0:25:11 lr 0.000423 
time 1.9145 (2.2193) loss 5.8359 (5.3149) grad_norm 5.6378 (2.1318) [2022-01-17 18:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][580/1251] eta 0:24:48 lr 0.000424 time 2.3341 (2.2189) loss 4.5287 (5.3152) grad_norm 1.7880 (2.1300) [2022-01-17 18:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][590/1251] eta 0:24:22 lr 0.000424 time 1.6025 (2.2131) loss 5.2700 (5.3105) grad_norm 1.7212 (2.1277) [2022-01-17 18:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][600/1251] eta 0:23:59 lr 0.000425 time 1.8229 (2.2107) loss 5.6889 (5.3092) grad_norm 1.7780 (2.1227) [2022-01-17 18:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][610/1251] eta 0:23:36 lr 0.000425 time 1.8071 (2.2097) loss 5.5375 (5.3081) grad_norm 2.5783 (2.1231) [2022-01-17 18:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][620/1251] eta 0:23:13 lr 0.000425 time 1.9393 (2.2084) loss 5.8313 (5.3064) grad_norm 1.8766 (2.1178) [2022-01-17 18:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][630/1251] eta 0:22:49 lr 0.000426 time 1.8800 (2.2055) loss 4.9506 (5.3114) grad_norm 1.9300 (2.1185) [2022-01-17 18:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][640/1251] eta 0:22:27 lr 0.000426 time 1.6482 (2.2051) loss 5.7765 (5.3088) grad_norm 1.8936 (2.1141) [2022-01-17 18:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][650/1251] eta 0:22:05 lr 0.000427 time 2.0978 (2.2063) loss 5.1699 (5.3049) grad_norm 1.7917 (2.1125) [2022-01-17 18:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][660/1251] eta 0:21:44 lr 0.000427 time 1.8457 (2.2068) loss 5.3753 (5.3050) grad_norm 2.0543 (2.1143) [2022-01-17 18:12:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][670/1251] eta 0:21:23 lr 0.000427 time 2.7241 (2.2096) loss 5.5472 (5.3044) grad_norm 1.9031 (2.1146) [2022-01-17 18:13:10 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [8/300][680/1251] eta 0:21:03 lr 0.000428 time 1.8574 (2.2131) loss 5.7091 (5.3073) grad_norm 2.8383 (2.1129) [2022-01-17 18:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][690/1251] eta 0:20:41 lr 0.000428 time 2.2474 (2.2134) loss 5.3786 (5.3068) grad_norm 2.7717 (2.1143) [2022-01-17 18:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][700/1251] eta 0:20:18 lr 0.000429 time 2.5591 (2.2112) loss 4.6759 (5.3066) grad_norm 3.9178 (2.1200) [2022-01-17 18:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][710/1251] eta 0:19:53 lr 0.000429 time 1.9549 (2.2057) loss 4.1876 (5.3038) grad_norm 1.8410 (2.1203) [2022-01-17 18:14:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][720/1251] eta 0:19:29 lr 0.000429 time 2.2359 (2.2029) loss 5.5216 (5.3029) grad_norm 1.5908 (2.1161) [2022-01-17 18:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][730/1251] eta 0:19:06 lr 0.000430 time 2.5212 (2.2015) loss 5.8505 (5.3043) grad_norm 1.8724 (2.1158) [2022-01-17 18:15:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][740/1251] eta 0:18:44 lr 0.000430 time 2.1831 (2.2015) loss 5.1429 (5.3051) grad_norm 2.3534 (2.1150) [2022-01-17 18:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][750/1251] eta 0:18:22 lr 0.000431 time 1.8513 (2.2013) loss 4.8005 (5.3058) grad_norm 1.9454 (2.1167) [2022-01-17 18:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][760/1251] eta 0:18:01 lr 0.000431 time 2.9642 (2.2025) loss 5.4066 (5.3052) grad_norm 2.7779 (2.1158) [2022-01-17 18:16:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][770/1251] eta 0:17:40 lr 0.000431 time 2.8331 (2.2055) loss 5.2188 (5.3035) grad_norm 2.5477 (2.1176) [2022-01-17 18:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][780/1251] eta 0:17:19 lr 0.000432 time 2.1799 (2.2076) loss 5.3899 (5.3042) 
grad_norm 3.1013 (2.1175) [2022-01-17 18:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][790/1251] eta 0:16:58 lr 0.000432 time 2.8891 (2.2097) loss 4.5066 (5.3008) grad_norm 1.8502 (2.1150) [2022-01-17 18:17:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][800/1251] eta 0:16:35 lr 0.000433 time 2.0829 (2.2081) loss 5.4354 (5.3000) grad_norm 1.7208 (2.1110) [2022-01-17 18:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][810/1251] eta 0:16:13 lr 0.000433 time 2.8471 (2.2070) loss 5.7042 (5.3003) grad_norm 1.5362 (2.1082) [2022-01-17 18:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][820/1251] eta 0:15:50 lr 0.000433 time 1.7302 (2.2045) loss 4.7655 (5.2976) grad_norm 2.5030 (2.1120) [2022-01-17 18:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][830/1251] eta 0:15:27 lr 0.000434 time 1.9038 (2.2037) loss 5.5013 (5.2990) grad_norm 2.2089 (2.1119) [2022-01-17 18:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][840/1251] eta 0:15:05 lr 0.000434 time 2.0759 (2.2032) loss 5.7253 (5.2996) grad_norm 1.9560 (2.1100) [2022-01-17 18:19:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][850/1251] eta 0:14:43 lr 0.000435 time 2.1979 (2.2034) loss 5.3067 (5.2988) grad_norm 1.9406 (2.1090) [2022-01-17 18:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][860/1251] eta 0:14:22 lr 0.000435 time 2.5140 (2.2048) loss 4.8393 (5.3001) grad_norm 1.8430 (2.1079) [2022-01-17 18:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][870/1251] eta 0:14:00 lr 0.000435 time 2.1546 (2.2064) loss 6.0997 (5.2998) grad_norm 2.2752 (2.1062) [2022-01-17 18:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][880/1251] eta 0:13:38 lr 0.000436 time 1.5505 (2.2070) loss 5.3549 (5.2995) grad_norm 2.0163 (2.1057) [2022-01-17 18:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[8/300][890/1251] eta 0:13:16 lr 0.000436 time 1.8643 (2.2057) loss 5.1678 (5.2979) grad_norm 1.9373 (2.1043) [2022-01-17 18:21:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][900/1251] eta 0:12:52 lr 0.000437 time 2.0458 (2.2022) loss 5.2826 (5.2976) grad_norm 1.8937 (2.1025) [2022-01-17 18:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][910/1251] eta 0:12:30 lr 0.000437 time 1.8283 (2.2006) loss 5.6625 (5.2945) grad_norm 1.9450 (2.1017) [2022-01-17 18:21:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][920/1251] eta 0:12:08 lr 0.000437 time 2.5262 (2.2018) loss 4.3446 (5.2924) grad_norm 1.7055 (2.1046) [2022-01-17 18:22:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][930/1251] eta 0:11:46 lr 0.000438 time 2.2285 (2.2008) loss 4.4361 (5.2904) grad_norm 2.5496 (2.1035) [2022-01-17 18:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][940/1251] eta 0:11:24 lr 0.000438 time 2.1628 (2.2020) loss 5.0219 (5.2915) grad_norm 2.1964 (2.1016) [2022-01-17 18:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][950/1251] eta 0:11:02 lr 0.000439 time 2.1631 (2.2027) loss 4.0556 (5.2889) grad_norm 1.8598 (2.0998) [2022-01-17 18:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][960/1251] eta 0:10:41 lr 0.000439 time 2.2438 (2.2050) loss 5.6669 (5.2913) grad_norm 1.7127 (2.1034) [2022-01-17 18:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][970/1251] eta 0:10:19 lr 0.000439 time 2.2359 (2.2040) loss 5.6818 (5.2912) grad_norm 1.6070 (2.1014) [2022-01-17 18:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][980/1251] eta 0:09:56 lr 0.000440 time 1.8365 (2.2016) loss 5.2792 (5.2898) grad_norm 1.8699 (2.0991) [2022-01-17 18:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][990/1251] eta 0:09:34 lr 0.000440 time 1.9512 (2.1995) loss 4.3583 (5.2895) grad_norm 1.6098 (2.0966) 
[2022-01-17 18:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1000/1251] eta 0:09:11 lr 0.000441 time 2.5430 (2.1987) loss 5.6016 (5.2911) grad_norm 2.2795 (2.0948) [2022-01-17 18:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1010/1251] eta 0:08:49 lr 0.000441 time 2.1772 (2.1979) loss 5.2023 (5.2922) grad_norm 1.9284 (2.0955) [2022-01-17 18:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1020/1251] eta 0:08:27 lr 0.000441 time 2.1974 (2.1977) loss 4.7625 (5.2921) grad_norm 1.8518 (2.0935) [2022-01-17 18:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1030/1251] eta 0:08:05 lr 0.000442 time 2.4873 (2.1982) loss 5.1522 (5.2911) grad_norm 2.2304 (2.0938) [2022-01-17 18:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1040/1251] eta 0:07:43 lr 0.000442 time 2.0418 (2.1980) loss 4.4449 (5.2921) grad_norm 2.5548 (2.0927) [2022-01-17 18:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1050/1251] eta 0:07:21 lr 0.000443 time 1.8229 (2.1983) loss 5.6556 (5.2903) grad_norm 2.4894 (2.0923) [2022-01-17 18:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1060/1251] eta 0:06:59 lr 0.000443 time 2.3182 (2.1978) loss 5.6164 (5.2928) grad_norm 2.1165 (2.0929) [2022-01-17 18:27:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1070/1251] eta 0:06:37 lr 0.000443 time 2.2156 (2.1982) loss 5.1399 (5.2912) grad_norm 2.5016 (2.0935) [2022-01-17 18:27:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1080/1251] eta 0:06:15 lr 0.000444 time 1.9441 (2.1980) loss 4.6964 (5.2900) grad_norm 3.9567 (2.0974) [2022-01-17 18:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1090/1251] eta 0:05:54 lr 0.000444 time 1.7011 (2.1996) loss 5.5655 (5.2918) grad_norm 1.5523 (2.0963) [2022-01-17 18:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1100/1251] eta 0:05:32 
lr 0.000445 time 1.5867 (2.1995) loss 5.4535 (5.2885) grad_norm 2.5345 (2.0963) [2022-01-17 18:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1110/1251] eta 0:05:10 lr 0.000445 time 3.0076 (2.1998) loss 5.3859 (5.2886) grad_norm 1.7441 (2.0974) [2022-01-17 18:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1120/1251] eta 0:04:48 lr 0.000445 time 1.9979 (2.1986) loss 5.5300 (5.2889) grad_norm 1.8961 (2.0984) [2022-01-17 18:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1130/1251] eta 0:04:25 lr 0.000446 time 1.9794 (2.1977) loss 5.5388 (5.2880) grad_norm 2.0669 (2.0989) [2022-01-17 18:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1140/1251] eta 0:04:03 lr 0.000446 time 1.9856 (2.1970) loss 5.3701 (5.2895) grad_norm 2.0799 (2.0983) [2022-01-17 18:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1150/1251] eta 0:03:41 lr 0.000447 time 3.3147 (2.1967) loss 5.2542 (5.2889) grad_norm 2.4172 (2.0971) [2022-01-17 18:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1160/1251] eta 0:03:19 lr 0.000447 time 1.9552 (2.1968) loss 4.6776 (5.2880) grad_norm 1.6691 (2.0959) [2022-01-17 18:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1170/1251] eta 0:02:58 lr 0.000447 time 2.2093 (2.1978) loss 4.9163 (5.2864) grad_norm 1.6578 (2.0930) [2022-01-17 18:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1180/1251] eta 0:02:36 lr 0.000448 time 2.1555 (2.1978) loss 5.1922 (5.2841) grad_norm 2.0454 (2.0933) [2022-01-17 18:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1190/1251] eta 0:02:14 lr 0.000448 time 2.2614 (2.1974) loss 5.8375 (5.2842) grad_norm 1.6347 (2.0952) [2022-01-17 18:32:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1200/1251] eta 0:01:52 lr 0.000449 time 1.9448 (2.1961) loss 4.5235 (5.2832) grad_norm 1.9785 (2.0959) [2022-01-17 18:32:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1210/1251] eta 0:01:30 lr 0.000449 time 2.5782 (2.1964) loss 5.4438 (5.2836) grad_norm 1.7061 (2.0935) [2022-01-17 18:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1220/1251] eta 0:01:08 lr 0.000449 time 2.6636 (2.1969) loss 4.8801 (5.2820) grad_norm 1.7133 (2.0911) [2022-01-17 18:33:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1230/1251] eta 0:00:46 lr 0.000450 time 2.1913 (2.1969) loss 5.3154 (5.2819) grad_norm 1.8243 (2.0912) [2022-01-17 18:33:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1240/1251] eta 0:00:24 lr 0.000450 time 1.2578 (2.1947) loss 4.3879 (5.2796) grad_norm 1.8748 (2.0903) [2022-01-17 18:33:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [8/300][1250/1251] eta 0:00:02 lr 0.000451 time 1.1799 (2.1891) loss 5.6961 (5.2805) grad_norm 2.8228 (2.0894) [2022-01-17 18:33:42 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 8 training takes 0:45:38 [2022-01-17 18:34:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.493 (18.493) Loss 2.8609 (2.8609) Acc@1 40.039 (40.039) Acc@5 67.969 (67.969) [2022-01-17 18:34:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.921 (3.318) Loss 2.8328 (2.8226) Acc@1 41.113 (40.536) Acc@5 67.090 (66.859) [2022-01-17 18:34:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.918 (2.531) Loss 2.8115 (2.8099) Acc@1 41.992 (41.034) Acc@5 66.992 (66.904) [2022-01-17 18:34:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.302 (2.221) Loss 2.7857 (2.8144) Acc@1 41.504 (41.022) Acc@5 67.480 (66.611) [2022-01-17 18:35:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.622 (2.156) Loss 2.7766 (2.8214) Acc@1 43.066 (40.839) Acc@5 67.969 (66.528) [2022-01-17 18:35:17 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 41.014 Acc@5 66.658 [2022-01-17 18:35:17 
swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 41.0% [2022-01-17 18:35:17 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 41.01% [2022-01-17 18:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][0/1251] eta 7:37:56 lr 0.000451 time 21.9637 (21.9637) loss 4.5415 (4.5415) grad_norm 2.5014 (2.5014) [2022-01-17 18:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][10/1251] eta 1:21:52 lr 0.000451 time 2.1494 (3.9584) loss 4.8814 (5.0831) grad_norm 1.8747 (2.1868) [2022-01-17 18:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][20/1251] eta 1:03:43 lr 0.000451 time 1.2494 (3.1058) loss 4.3637 (5.0250) grad_norm 2.0127 (2.0643) [2022-01-17 18:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][30/1251] eta 0:57:50 lr 0.000452 time 1.5462 (2.8422) loss 4.8269 (5.1150) grad_norm 2.4726 (2.0748) [2022-01-17 18:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][40/1251] eta 0:54:30 lr 0.000452 time 3.1439 (2.7008) loss 4.3419 (5.1378) grad_norm 1.9904 (2.0902) [2022-01-17 18:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][50/1251] eta 0:52:50 lr 0.000453 time 2.3982 (2.6399) loss 4.5043 (5.1849) grad_norm 2.1500 (2.0345) [2022-01-17 18:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][60/1251] eta 0:50:52 lr 0.000453 time 2.2206 (2.5633) loss 4.7024 (5.1752) grad_norm 1.6325 (1.9923) [2022-01-17 18:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][70/1251] eta 0:49:06 lr 0.000453 time 1.9041 (2.4951) loss 5.2295 (5.2043) grad_norm 1.7697 (1.9829) [2022-01-17 18:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][80/1251] eta 0:47:58 lr 0.000454 time 2.8095 (2.4578) loss 5.9089 (5.2119) grad_norm 1.7773 (2.0295) [2022-01-17 18:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][90/1251] eta 0:47:12 lr 
0.000454 time 2.2479 (2.4401) loss 5.7573 (5.2328) grad_norm 2.0687 (2.0400) [2022-01-17 18:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][100/1251] eta 0:46:10 lr 0.000455 time 2.0641 (2.4073) loss 5.5723 (5.2451) grad_norm 2.0016 (2.0499) [2022-01-17 18:39:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][110/1251] eta 0:45:02 lr 0.000455 time 1.9125 (2.3686) loss 5.1082 (5.2433) grad_norm 2.2159 (2.0436) [2022-01-17 18:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][120/1251] eta 0:44:18 lr 0.000455 time 2.2296 (2.3509) loss 5.0525 (5.2493) grad_norm 3.5442 (2.0532) [2022-01-17 18:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][130/1251] eta 0:43:55 lr 0.000456 time 2.5817 (2.3508) loss 5.5920 (5.2397) grad_norm 2.4883 (2.0657) [2022-01-17 18:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][140/1251] eta 0:43:10 lr 0.000456 time 1.5409 (2.3317) loss 5.7184 (5.2338) grad_norm 1.5897 (2.0451) [2022-01-17 18:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][150/1251] eta 0:42:35 lr 0.000457 time 1.8693 (2.3212) loss 4.2676 (5.2113) grad_norm 1.8391 (2.0420) [2022-01-17 18:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][160/1251] eta 0:42:06 lr 0.000457 time 1.8956 (2.3156) loss 5.1754 (5.2355) grad_norm 1.4126 (2.0166) [2022-01-17 18:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][170/1251] eta 0:41:35 lr 0.000457 time 1.5545 (2.3089) loss 5.4401 (5.2300) grad_norm 3.1894 (2.0233) [2022-01-17 18:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][180/1251] eta 0:41:00 lr 0.000458 time 1.6709 (2.2970) loss 5.2653 (5.2313) grad_norm 2.4206 (2.0509) [2022-01-17 18:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][190/1251] eta 0:40:29 lr 0.000458 time 1.8026 (2.2900) loss 5.5852 (5.2301) grad_norm 1.9843 (2.0480) [2022-01-17 18:42:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][200/1251] eta 0:40:07 lr 0.000459 time 2.7751 (2.2907) loss 5.1448 (5.2352) grad_norm 1.8443 (2.0358) [2022-01-17 18:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][210/1251] eta 0:39:40 lr 0.000459 time 1.9530 (2.2864) loss 4.3849 (5.2370) grad_norm 1.8546 (2.0290) [2022-01-17 18:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][220/1251] eta 0:39:04 lr 0.000459 time 1.7851 (2.2741) loss 5.6867 (5.2350) grad_norm 2.1588 (2.0179) [2022-01-17 18:44:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][230/1251] eta 0:38:31 lr 0.000460 time 1.9260 (2.2639) loss 5.4197 (5.2444) grad_norm 2.1531 (2.0081) [2022-01-17 18:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][240/1251] eta 0:38:02 lr 0.000460 time 1.8908 (2.2576) loss 5.4142 (5.2559) grad_norm 1.9440 (2.0059) [2022-01-17 18:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][250/1251] eta 0:37:43 lr 0.000461 time 1.8434 (2.2613) loss 5.3067 (5.2646) grad_norm 1.9977 (2.0051) [2022-01-17 18:45:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][260/1251] eta 0:37:15 lr 0.000461 time 2.1739 (2.2555) loss 4.1743 (5.2612) grad_norm 2.8144 (2.0109) [2022-01-17 18:45:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][270/1251] eta 0:36:52 lr 0.000461 time 2.1534 (2.2550) loss 4.1820 (5.2538) grad_norm 1.7603 (2.0120) [2022-01-17 18:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][280/1251] eta 0:36:27 lr 0.000462 time 2.1333 (2.2525) loss 5.6931 (5.2547) grad_norm 1.4853 (2.0081) [2022-01-17 18:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][290/1251] eta 0:36:01 lr 0.000462 time 2.4573 (2.2488) loss 5.3936 (5.2591) grad_norm 1.8483 (2.0011) [2022-01-17 18:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][300/1251] eta 0:35:33 lr 0.000463 time 2.0605 (2.2434) 
loss 4.6754 (5.2510) grad_norm 1.6205 (2.0008) [2022-01-17 18:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][310/1251] eta 0:35:08 lr 0.000463 time 2.2070 (2.2402) loss 4.7316 (5.2461) grad_norm 1.9207 (2.0016) [2022-01-17 18:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][320/1251] eta 0:34:43 lr 0.000463 time 2.3355 (2.2377) loss 5.5101 (5.2403) grad_norm 2.4554 (1.9997) [2022-01-17 18:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][330/1251] eta 0:34:19 lr 0.000464 time 1.5667 (2.2362) loss 4.5936 (5.2346) grad_norm 1.8717 (1.9979) [2022-01-17 18:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][340/1251] eta 0:33:57 lr 0.000464 time 2.1499 (2.2367) loss 5.4448 (5.2348) grad_norm 1.7902 (2.0035) [2022-01-17 18:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][350/1251] eta 0:33:32 lr 0.000465 time 2.0741 (2.2331) loss 5.3991 (5.2339) grad_norm 1.8873 (2.0045) [2022-01-17 18:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][360/1251] eta 0:33:11 lr 0.000465 time 2.4698 (2.2353) loss 5.5373 (5.2322) grad_norm 2.1506 (2.0042) [2022-01-17 18:49:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][370/1251] eta 0:32:49 lr 0.000465 time 1.7894 (2.2350) loss 5.1615 (5.2243) grad_norm 2.1282 (2.0085) [2022-01-17 18:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][380/1251] eta 0:32:23 lr 0.000466 time 1.6496 (2.2311) loss 5.9180 (5.2255) grad_norm 1.6671 (2.0047) [2022-01-17 18:49:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][390/1251] eta 0:31:57 lr 0.000466 time 1.8887 (2.2268) loss 5.5775 (5.2229) grad_norm 3.0157 (2.0160) [2022-01-17 18:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][400/1251] eta 0:31:32 lr 0.000467 time 2.3616 (2.2241) loss 4.6722 (5.2237) grad_norm 2.0513 (2.0136) [2022-01-17 18:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [9/300][410/1251] eta 0:31:07 lr 0.000467 time 2.0026 (2.2208) loss 4.4352 (5.2201) grad_norm 2.2872 (2.0103) [2022-01-17 18:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][420/1251] eta 0:30:51 lr 0.000467 time 1.9867 (2.2277) loss 4.2777 (5.2229) grad_norm 1.9043 (2.0071) [2022-01-17 18:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][430/1251] eta 0:30:29 lr 0.000468 time 2.5033 (2.2286) loss 4.8280 (5.2195) grad_norm 1.7578 (2.0080) [2022-01-17 18:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][440/1251] eta 0:30:03 lr 0.000468 time 1.8466 (2.2238) loss 5.1887 (5.2198) grad_norm 2.0887 (2.0082) [2022-01-17 18:51:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][450/1251] eta 0:29:39 lr 0.000469 time 2.6170 (2.2212) loss 5.3344 (5.2210) grad_norm 1.8277 (2.0040) [2022-01-17 18:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][460/1251] eta 0:29:14 lr 0.000469 time 1.8246 (2.2176) loss 4.3736 (5.2131) grad_norm 2.2417 (2.0039) [2022-01-17 18:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][470/1251] eta 0:28:51 lr 0.000469 time 2.8548 (2.2174) loss 5.4019 (5.2084) grad_norm 3.2262 (2.0106) [2022-01-17 18:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][480/1251] eta 0:28:27 lr 0.000470 time 1.9638 (2.2140) loss 6.1071 (5.2076) grad_norm 1.9513 (2.0078) [2022-01-17 18:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][490/1251] eta 0:28:04 lr 0.000470 time 2.1386 (2.2129) loss 4.1250 (5.2025) grad_norm 1.9680 (2.0031) [2022-01-17 18:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][500/1251] eta 0:27:43 lr 0.000471 time 2.1317 (2.2149) loss 5.4528 (5.2006) grad_norm 1.6571 (2.0049) [2022-01-17 18:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][510/1251] eta 0:27:22 lr 0.000471 time 2.7806 (2.2163) loss 5.3278 (5.2019) grad_norm 1.7088 (2.0040) 
[2022-01-17 18:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][520/1251] eta 0:27:02 lr 0.000471 time 2.9078 (2.2192) loss 5.5019 (5.2037) grad_norm 1.9208 (2.0016) [2022-01-17 18:54:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][530/1251] eta 0:26:40 lr 0.000472 time 1.8735 (2.2202) loss 4.6598 (5.2025) grad_norm 2.5772 (1.9993) [2022-01-17 18:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][540/1251] eta 0:26:15 lr 0.000472 time 1.6913 (2.2164) loss 5.2775 (5.1984) grad_norm 2.1922 (2.0028) [2022-01-17 18:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][550/1251] eta 0:25:50 lr 0.000473 time 1.9701 (2.2121) loss 5.7256 (5.2023) grad_norm 1.8360 (2.0005) [2022-01-17 18:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][560/1251] eta 0:25:25 lr 0.000473 time 1.9805 (2.2077) loss 4.2623 (5.2030) grad_norm 1.9614 (1.9984) [2022-01-17 18:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][570/1251] eta 0:25:01 lr 0.000473 time 2.1496 (2.2050) loss 4.7678 (5.2021) grad_norm 3.2944 (1.9980) [2022-01-17 18:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][580/1251] eta 0:24:38 lr 0.000474 time 2.5057 (2.2041) loss 5.0907 (5.2024) grad_norm 1.7745 (1.9988) [2022-01-17 18:56:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][590/1251] eta 0:24:15 lr 0.000474 time 1.9520 (2.2014) loss 5.8780 (5.2040) grad_norm 1.6602 (1.9979) [2022-01-17 18:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][600/1251] eta 0:23:53 lr 0.000475 time 2.0860 (2.2014) loss 4.8665 (5.2015) grad_norm 1.5087 (1.9997) [2022-01-17 18:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][610/1251] eta 0:23:31 lr 0.000475 time 2.8740 (2.2017) loss 5.6810 (5.2019) grad_norm 1.5307 (2.0016) [2022-01-17 18:58:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][620/1251] eta 0:23:09 lr 0.000475 
time 1.8835 (2.2025) loss 5.5812 (5.1998) grad_norm 1.8688 (1.9991) [2022-01-17 18:58:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][630/1251] eta 0:22:47 lr 0.000476 time 2.1450 (2.2016) loss 5.7638 (5.1956) grad_norm 1.6451 (1.9966) [2022-01-17 18:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][640/1251] eta 0:22:26 lr 0.000476 time 2.4599 (2.2036) loss 5.8646 (5.2005) grad_norm 1.4986 (1.9905) [2022-01-17 18:59:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][650/1251] eta 0:22:04 lr 0.000477 time 2.5678 (2.2039) loss 4.9775 (5.2004) grad_norm 2.3776 (1.9913) [2022-01-17 18:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][660/1251] eta 0:21:43 lr 0.000477 time 2.5496 (2.2052) loss 4.9538 (5.1994) grad_norm 1.6333 (1.9892) [2022-01-17 18:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][670/1251] eta 0:21:21 lr 0.000477 time 2.5118 (2.2055) loss 4.8380 (5.2007) grad_norm 3.1847 (1.9921) [2022-01-17 19:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][680/1251] eta 0:20:59 lr 0.000478 time 2.0156 (2.2060) loss 4.6794 (5.1983) grad_norm 1.7014 (1.9870) [2022-01-17 19:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][690/1251] eta 0:20:37 lr 0.000478 time 1.9368 (2.2058) loss 5.3749 (5.1993) grad_norm 1.6344 (1.9847) [2022-01-17 19:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][700/1251] eta 0:20:14 lr 0.000478 time 2.2143 (2.2048) loss 5.7138 (5.1971) grad_norm 2.0673 (1.9848) [2022-01-17 19:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][710/1251] eta 0:19:51 lr 0.000479 time 1.8303 (2.2029) loss 4.7301 (5.1962) grad_norm 2.1038 (1.9863) [2022-01-17 19:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][720/1251] eta 0:19:29 lr 0.000479 time 1.9941 (2.2015) loss 5.6660 (5.1965) grad_norm 2.1348 (1.9876) [2022-01-17 19:02:06 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [9/300][730/1251] eta 0:19:06 lr 0.000480 time 2.1882 (2.2009) loss 4.1450 (5.1930) grad_norm 1.8875 (1.9848) [2022-01-17 19:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][740/1251] eta 0:18:45 lr 0.000480 time 2.6260 (2.2018) loss 5.4553 (5.1923) grad_norm 3.4694 (1.9914) [2022-01-17 19:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][750/1251] eta 0:18:22 lr 0.000480 time 2.5720 (2.2005) loss 5.0753 (5.1942) grad_norm 2.2795 (1.9959) [2022-01-17 19:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][760/1251] eta 0:18:00 lr 0.000481 time 1.9138 (2.2000) loss 4.8786 (5.1938) grad_norm 1.8405 (1.9964) [2022-01-17 19:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][770/1251] eta 0:17:37 lr 0.000481 time 2.3687 (2.1990) loss 4.6848 (5.1923) grad_norm 2.9259 (1.9983) [2022-01-17 19:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][780/1251] eta 0:17:14 lr 0.000482 time 2.1994 (2.1969) loss 4.1642 (5.1925) grad_norm 2.3306 (1.9978) [2022-01-17 19:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][790/1251] eta 0:16:52 lr 0.000482 time 2.8285 (2.1965) loss 5.2801 (5.1922) grad_norm 1.8102 (1.9981) [2022-01-17 19:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][800/1251] eta 0:16:30 lr 0.000482 time 1.9347 (2.1962) loss 4.1299 (5.1919) grad_norm 1.4778 (1.9954) [2022-01-17 19:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][810/1251] eta 0:16:09 lr 0.000483 time 3.4233 (2.1974) loss 5.8092 (5.1938) grad_norm 1.9078 (1.9977) [2022-01-17 19:05:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][820/1251] eta 0:15:46 lr 0.000483 time 1.8673 (2.1966) loss 4.7965 (5.1939) grad_norm 1.8365 (1.9956) [2022-01-17 19:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][830/1251] eta 0:15:25 lr 0.000484 time 2.9309 (2.1982) loss 4.4083 (5.1925) 
grad_norm 1.8125 (1.9944) [2022-01-17 19:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][840/1251] eta 0:15:03 lr 0.000484 time 1.7982 (2.1983) loss 4.9991 (5.1882) grad_norm 2.5841 (1.9924) [2022-01-17 19:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][850/1251] eta 0:14:41 lr 0.000484 time 2.6371 (2.1991) loss 5.1269 (5.1872) grad_norm 1.9994 (1.9932) [2022-01-17 19:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][860/1251] eta 0:14:19 lr 0.000485 time 1.9660 (2.1990) loss 4.9221 (5.1847) grad_norm 1.7953 (1.9923) [2022-01-17 19:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][870/1251] eta 0:13:58 lr 0.000485 time 2.7085 (2.2019) loss 5.3705 (5.1823) grad_norm 1.4319 (1.9904) [2022-01-17 19:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][880/1251] eta 0:13:36 lr 0.000486 time 1.9170 (2.2021) loss 5.0000 (5.1801) grad_norm 2.2004 (1.9935) [2022-01-17 19:08:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][890/1251] eta 0:13:15 lr 0.000486 time 2.8882 (2.2032) loss 5.4842 (5.1802) grad_norm 2.7869 (1.9959) [2022-01-17 19:08:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][900/1251] eta 0:12:52 lr 0.000486 time 1.6897 (2.2004) loss 5.7445 (5.1787) grad_norm 2.1347 (1.9967) [2022-01-17 19:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][910/1251] eta 0:12:29 lr 0.000487 time 1.9309 (2.1977) loss 5.4725 (5.1791) grad_norm 1.7296 (1.9958) [2022-01-17 19:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][920/1251] eta 0:12:06 lr 0.000487 time 2.3853 (2.1956) loss 5.1966 (5.1765) grad_norm 1.7511 (1.9971) [2022-01-17 19:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][930/1251] eta 0:11:44 lr 0.000488 time 2.4835 (2.1950) loss 5.2242 (5.1731) grad_norm 1.8075 (1.9945) [2022-01-17 19:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[9/300][940/1251] eta 0:11:23 lr 0.000488 time 2.1628 (2.1962) loss 4.0961 (5.1697) grad_norm 2.6773 (1.9952) [2022-01-17 19:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][950/1251] eta 0:11:01 lr 0.000488 time 1.9688 (2.1973) loss 5.5854 (5.1689) grad_norm 1.9586 (1.9950) [2022-01-17 19:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][960/1251] eta 0:10:40 lr 0.000489 time 3.2957 (2.2008) loss 4.7857 (5.1675) grad_norm 1.8403 (1.9929) [2022-01-17 19:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][970/1251] eta 0:10:18 lr 0.000489 time 2.4582 (2.2008) loss 5.4494 (5.1669) grad_norm 2.5086 (1.9946) [2022-01-17 19:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][980/1251] eta 0:09:55 lr 0.000490 time 1.6495 (2.1984) loss 5.5378 (5.1669) grad_norm 1.7303 (1.9960) [2022-01-17 19:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][990/1251] eta 0:09:33 lr 0.000490 time 2.3736 (2.1977) loss 5.5886 (5.1661) grad_norm 1.4856 (1.9953) [2022-01-17 19:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1000/1251] eta 0:09:11 lr 0.000490 time 3.4135 (2.1983) loss 5.8134 (5.1668) grad_norm 2.2392 (1.9938) [2022-01-17 19:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1010/1251] eta 0:08:49 lr 0.000491 time 2.1668 (2.1981) loss 3.6887 (5.1656) grad_norm 1.7201 (1.9926) [2022-01-17 19:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1020/1251] eta 0:08:27 lr 0.000491 time 2.5641 (2.1986) loss 5.7009 (5.1637) grad_norm 1.8996 (1.9913) [2022-01-17 19:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1030/1251] eta 0:08:05 lr 0.000492 time 2.2476 (2.1980) loss 5.3391 (5.1648) grad_norm 2.1671 (1.9900) [2022-01-17 19:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1040/1251] eta 0:07:43 lr 0.000492 time 2.4752 (2.1972) loss 6.1494 (5.1664) grad_norm 2.2567 (1.9901) 
[2022-01-17 19:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1050/1251] eta 0:07:21 lr 0.000492 time 1.6290 (2.1953) loss 5.1354 (5.1653) grad_norm 1.5269 (1.9880) [2022-01-17 19:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1060/1251] eta 0:06:59 lr 0.000493 time 2.3062 (2.1961) loss 5.6298 (5.1658) grad_norm 2.1726 (1.9862) [2022-01-17 19:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1070/1251] eta 0:06:37 lr 0.000493 time 3.3641 (2.1986) loss 5.2242 (5.1664) grad_norm 1.8549 (1.9849) [2022-01-17 19:14:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1080/1251] eta 0:06:16 lr 0.000494 time 2.3991 (2.1997) loss 5.6443 (5.1650) grad_norm 1.9919 (1.9847) [2022-01-17 19:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1090/1251] eta 0:05:54 lr 0.000494 time 1.6193 (2.1991) loss 5.0961 (5.1658) grad_norm 1.9027 (1.9851) [2022-01-17 19:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1100/1251] eta 0:05:31 lr 0.000494 time 1.9558 (2.1977) loss 4.9320 (5.1646) grad_norm 1.8241 (1.9840) [2022-01-17 19:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1110/1251] eta 0:05:09 lr 0.000495 time 2.4897 (2.1976) loss 4.1306 (5.1600) grad_norm 2.1897 (1.9839) [2022-01-17 19:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1120/1251] eta 0:04:47 lr 0.000495 time 2.8712 (2.1976) loss 5.5497 (5.1596) grad_norm 1.7212 (1.9838) [2022-01-17 19:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1130/1251] eta 0:04:25 lr 0.000496 time 2.7270 (2.1970) loss 5.0047 (5.1590) grad_norm 1.9053 (1.9852) [2022-01-17 19:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1140/1251] eta 0:04:03 lr 0.000496 time 2.4587 (2.1961) loss 4.7167 (5.1565) grad_norm 1.6284 (1.9841) [2022-01-17 19:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1150/1251] eta 0:03:41 
lr 0.000496 time 2.0736 (2.1952) loss 5.7132 (5.1579) grad_norm 1.6047 (1.9819) [2022-01-17 19:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1160/1251] eta 0:03:19 lr 0.000497 time 2.1898 (2.1951) loss 5.7970 (5.1555) grad_norm 2.2070 (1.9807) [2022-01-17 19:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1170/1251] eta 0:02:57 lr 0.000497 time 1.5050 (2.1935) loss 4.4161 (5.1546) grad_norm 2.1643 (1.9789) [2022-01-17 19:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1180/1251] eta 0:02:35 lr 0.000498 time 2.3443 (2.1928) loss 4.7013 (5.1524) grad_norm 1.8213 (1.9791) [2022-01-17 19:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1190/1251] eta 0:02:13 lr 0.000498 time 1.6924 (2.1923) loss 4.1842 (5.1523) grad_norm 1.8700 (1.9808) [2022-01-17 19:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1200/1251] eta 0:01:51 lr 0.000498 time 1.5990 (2.1919) loss 4.2942 (5.1505) grad_norm 1.8842 (1.9806) [2022-01-17 19:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1210/1251] eta 0:01:29 lr 0.000499 time 2.5512 (2.1933) loss 5.0737 (5.1518) grad_norm 1.6297 (1.9794) [2022-01-17 19:19:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1220/1251] eta 0:01:07 lr 0.000499 time 1.8279 (2.1932) loss 5.1243 (5.1511) grad_norm 1.7060 (1.9788) [2022-01-17 19:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1230/1251] eta 0:00:46 lr 0.000500 time 1.5222 (2.1949) loss 5.3118 (5.1503) grad_norm 2.0585 (1.9780) [2022-01-17 19:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1240/1251] eta 0:00:24 lr 0.000500 time 2.1775 (2.1944) loss 4.0731 (5.1502) grad_norm 1.9927 (1.9777) [2022-01-17 19:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [9/300][1250/1251] eta 0:00:02 lr 0.000500 time 1.1631 (2.1890) loss 5.3733 (5.1501) grad_norm 1.9684 (1.9762) [2022-01-17 19:20:56 
swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 9 training takes 0:45:38 [2022-01-17 19:21:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.001 (19.001) Loss 2.6528 (2.6528) Acc@1 42.871 (42.871) Acc@5 69.336 (69.336) [2022-01-17 19:21:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.950 (3.328) Loss 2.6160 (2.5977) Acc@1 44.727 (44.656) Acc@5 71.387 (70.162) [2022-01-17 19:21:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.277 (2.604) Loss 2.6056 (2.5947) Acc@1 42.285 (44.513) Acc@5 70.996 (70.392) [2022-01-17 19:22:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.923 (2.279) Loss 2.6127 (2.5932) Acc@1 43.555 (44.383) Acc@5 68.652 (70.284) [2022-01-17 19:22:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.829 (2.145) Loss 2.5416 (2.5975) Acc@1 44.629 (44.388) Acc@5 72.656 (70.203) [2022-01-17 19:22:32 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 44.454 Acc@5 70.008 [2022-01-17 19:22:32 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 44.5% [2022-01-17 19:22:32 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 44.45% [2022-01-17 19:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][0/1251] eta 8:20:28 lr 0.000501 time 24.0034 (24.0034) loss 5.4168 (5.4168) grad_norm 2.0246 (2.0246) [2022-01-17 19:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][10/1251] eta 1:27:54 lr 0.000501 time 1.7833 (4.2504) loss 5.4682 (5.3458) grad_norm 1.6959 (1.6999) [2022-01-17 19:23:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][20/1251] eta 1:05:40 lr 0.000501 time 2.3475 (3.2008) loss 4.3084 (5.2579) grad_norm 2.0126 (1.9757) [2022-01-17 19:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][30/1251] eta 0:57:25 lr 0.000502 time 1.6101 (2.8222) loss 5.6825 (5.1792) grad_norm 1.7280 (2.0071) 
[2022-01-17 19:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][40/1251] eta 0:55:23 lr 0.000502 time 5.1558 (2.7448) loss 5.2530 (5.1667) grad_norm 1.4089 (1.9336) [2022-01-17 19:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][50/1251] eta 0:52:38 lr 0.000502 time 1.5867 (2.6299) loss 4.4181 (5.1272) grad_norm 1.6151 (1.8903) [2022-01-17 19:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][60/1251] eta 0:50:58 lr 0.000503 time 2.5842 (2.5680) loss 5.4288 (5.1066) grad_norm 1.4568 (1.9061) [2022-01-17 19:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][70/1251] eta 0:49:23 lr 0.000503 time 1.8648 (2.5094) loss 4.1796 (5.1255) grad_norm 1.9496 (1.8830) [2022-01-17 19:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][80/1251] eta 0:48:07 lr 0.000504 time 2.4645 (2.4661) loss 5.3466 (5.1446) grad_norm 1.6945 (1.8653) [2022-01-17 19:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][90/1251] eta 0:47:00 lr 0.000504 time 1.8047 (2.4290) loss 5.6257 (5.1343) grad_norm 1.6891 (1.8598) [2022-01-17 19:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][100/1251] eta 0:45:52 lr 0.000504 time 1.8890 (2.3913) loss 4.9382 (5.1198) grad_norm 2.5551 (1.8652) [2022-01-17 19:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][110/1251] eta 0:45:06 lr 0.000505 time 2.1980 (2.3718) loss 4.6675 (5.1255) grad_norm 1.4110 (1.8780) [2022-01-17 19:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][120/1251] eta 0:44:36 lr 0.000505 time 3.0320 (2.3664) loss 5.2596 (5.1237) grad_norm 2.1104 (1.8629) [2022-01-17 19:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][130/1251] eta 0:44:23 lr 0.000506 time 2.7501 (2.3758) loss 5.3856 (5.1158) grad_norm 2.3337 (1.8639) [2022-01-17 19:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][140/1251] eta 0:43:39 lr 
0.000506 time 1.6289 (2.3578) loss 5.1034 (5.1177) grad_norm 2.0685 (1.8642) [2022-01-17 19:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][150/1251] eta 0:43:02 lr 0.000506 time 2.2520 (2.3453) loss 5.5077 (5.1239) grad_norm 2.0923 (1.8659) [2022-01-17 19:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][160/1251] eta 0:42:14 lr 0.000507 time 1.9148 (2.3229) loss 5.4836 (5.1322) grad_norm 1.5840 (1.8553) [2022-01-17 19:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][170/1251] eta 0:41:30 lr 0.000507 time 2.2439 (2.3040) loss 5.2089 (5.1412) grad_norm 1.8646 (1.8671) [2022-01-17 19:29:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][180/1251] eta 0:40:55 lr 0.000508 time 2.0061 (2.2924) loss 5.5924 (5.1409) grad_norm 3.6918 (1.8773) [2022-01-17 19:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][190/1251] eta 0:40:25 lr 0.000508 time 2.2140 (2.2863) loss 5.5742 (5.1456) grad_norm 2.0162 (1.8865) [2022-01-17 19:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][200/1251] eta 0:39:57 lr 0.000508 time 1.7814 (2.2811) loss 5.4420 (5.1400) grad_norm 1.5619 (1.8832) [2022-01-17 19:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][210/1251] eta 0:39:32 lr 0.000509 time 2.2964 (2.2789) loss 5.7731 (5.1482) grad_norm 1.4316 (1.8811) [2022-01-17 19:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][220/1251] eta 0:39:02 lr 0.000509 time 1.7095 (2.2721) loss 4.3089 (5.1494) grad_norm 1.7147 (1.8860) [2022-01-17 19:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][230/1251] eta 0:38:30 lr 0.000510 time 2.5912 (2.2633) loss 5.8170 (5.1474) grad_norm 2.0540 (1.8924) [2022-01-17 19:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][240/1251] eta 0:37:57 lr 0.000510 time 1.7705 (2.2530) loss 4.4293 (5.1290) grad_norm 1.6257 (1.9060) [2022-01-17 19:31:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][250/1251] eta 0:37:34 lr 0.000510 time 2.5494 (2.2522) loss 4.4057 (5.1311) grad_norm 1.8246 (1.9107) [2022-01-17 19:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][260/1251] eta 0:37:10 lr 0.000511 time 2.2989 (2.2503) loss 5.8675 (5.1406) grad_norm 1.6160 (1.9027) [2022-01-17 19:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][270/1251] eta 0:36:50 lr 0.000511 time 4.2768 (2.2535) loss 4.9454 (5.1289) grad_norm 1.8344 (1.9016) [2022-01-17 19:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][280/1251] eta 0:36:29 lr 0.000512 time 2.2431 (2.2553) loss 5.4647 (5.1215) grad_norm 1.5893 (1.8939) [2022-01-17 19:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][290/1251] eta 0:36:07 lr 0.000512 time 1.8018 (2.2556) loss 5.1832 (5.1168) grad_norm 1.8544 (1.8939) [2022-01-17 19:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][300/1251] eta 0:35:41 lr 0.000512 time 1.7916 (2.2520) loss 5.5586 (5.1120) grad_norm 1.6710 (1.9030) [2022-01-17 19:34:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][310/1251] eta 0:35:18 lr 0.000513 time 3.1925 (2.2514) loss 5.4507 (5.1165) grad_norm 1.8924 (1.9038) [2022-01-17 19:34:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][320/1251] eta 0:34:49 lr 0.000513 time 2.3288 (2.2447) loss 4.7863 (5.1151) grad_norm 1.6263 (1.9036) [2022-01-17 19:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][330/1251] eta 0:34:25 lr 0.000514 time 1.7350 (2.2422) loss 4.5480 (5.1118) grad_norm 1.4829 (1.8989) [2022-01-17 19:35:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][340/1251] eta 0:34:00 lr 0.000514 time 1.5388 (2.2395) loss 5.0582 (5.1182) grad_norm 1.6102 (1.8936) [2022-01-17 19:35:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][350/1251] eta 0:33:35 lr 0.000514 time 
2.2512 (2.2373) loss 4.6530 (5.1133) grad_norm 1.6267 (1.8928) [2022-01-17 19:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][360/1251] eta 0:33:08 lr 0.000515 time 2.2572 (2.2314) loss 5.3797 (5.1185) grad_norm 1.9705 (1.8995) [2022-01-17 19:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][370/1251] eta 0:32:42 lr 0.000515 time 1.8927 (2.2277) loss 5.2782 (5.1182) grad_norm 2.1021 (1.8969) [2022-01-17 19:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][380/1251] eta 0:32:15 lr 0.000516 time 1.9582 (2.2227) loss 5.4509 (5.1161) grad_norm 1.7405 (1.8943) [2022-01-17 19:37:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][390/1251] eta 0:31:53 lr 0.000516 time 2.5365 (2.2226) loss 5.6366 (5.1185) grad_norm 3.0777 (1.8984) [2022-01-17 19:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][400/1251] eta 0:31:35 lr 0.000516 time 3.4580 (2.2279) loss 3.8405 (5.1208) grad_norm 1.3089 (1.8993) [2022-01-17 19:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][410/1251] eta 0:31:14 lr 0.000517 time 1.9338 (2.2292) loss 5.3973 (5.1278) grad_norm 1.9283 (1.8983) [2022-01-17 19:38:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][420/1251] eta 0:30:53 lr 0.000517 time 1.9099 (2.2302) loss 4.7672 (5.1225) grad_norm 1.7013 (1.8939) [2022-01-17 19:38:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][430/1251] eta 0:30:30 lr 0.000518 time 2.2393 (2.2297) loss 5.3344 (5.1197) grad_norm 1.5830 (1.8925) [2022-01-17 19:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][440/1251] eta 0:30:08 lr 0.000518 time 2.2528 (2.2296) loss 5.4369 (5.1192) grad_norm 1.9153 (1.8954) [2022-01-17 19:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][450/1251] eta 0:29:46 lr 0.000518 time 1.6830 (2.2307) loss 5.3707 (5.1211) grad_norm 1.5280 (1.8960) [2022-01-17 19:39:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][460/1251] eta 0:29:22 lr 0.000519 time 1.9250 (2.2276) loss 4.1474 (5.1219) grad_norm 2.1646 (1.8920) [2022-01-17 19:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][470/1251] eta 0:28:55 lr 0.000519 time 1.6274 (2.2216) loss 4.1201 (5.1195) grad_norm 1.9755 (1.8928) [2022-01-17 19:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][480/1251] eta 0:28:29 lr 0.000520 time 1.8477 (2.2179) loss 5.3826 (5.1129) grad_norm 1.8127 (1.8916) [2022-01-17 19:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][490/1251] eta 0:28:08 lr 0.000520 time 2.5322 (2.2183) loss 4.2630 (5.1090) grad_norm 1.8006 (1.8915) [2022-01-17 19:41:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][500/1251] eta 0:27:47 lr 0.000520 time 2.1301 (2.2202) loss 5.3499 (5.1120) grad_norm 1.5408 (1.8892) [2022-01-17 19:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][510/1251] eta 0:27:25 lr 0.000521 time 1.6698 (2.2203) loss 5.4858 (5.1164) grad_norm 1.9398 (1.8883) [2022-01-17 19:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][520/1251] eta 0:27:05 lr 0.000521 time 1.8514 (2.2235) loss 4.3055 (5.1178) grad_norm 2.1184 (1.8881) [2022-01-17 19:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][530/1251] eta 0:26:43 lr 0.000522 time 2.4696 (2.2247) loss 5.0327 (5.1207) grad_norm 1.8775 (1.8904) [2022-01-17 19:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][540/1251] eta 0:26:19 lr 0.000522 time 1.9587 (2.2220) loss 5.0070 (5.1223) grad_norm 1.9260 (1.8920) [2022-01-17 19:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][550/1251] eta 0:25:55 lr 0.000522 time 1.9278 (2.2186) loss 4.0917 (5.1209) grad_norm 3.9105 (1.8979) [2022-01-17 19:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][560/1251] eta 0:25:31 lr 0.000523 time 
1.8414 (2.2167) loss 4.8504 (5.1181) grad_norm 2.3968 (1.9031) [2022-01-17 19:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][570/1251] eta 0:25:08 lr 0.000523 time 1.9151 (2.2146) loss 4.9168 (5.1133) grad_norm 2.2516 (1.9021) [2022-01-17 19:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][580/1251] eta 0:24:46 lr 0.000524 time 1.8312 (2.2152) loss 4.0989 (5.1155) grad_norm 1.7527 (1.9011) [2022-01-17 19:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][590/1251] eta 0:24:25 lr 0.000524 time 2.4190 (2.2176) loss 5.5379 (5.1182) grad_norm 1.5982 (1.8983) [2022-01-17 19:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][600/1251] eta 0:24:04 lr 0.000524 time 1.8330 (2.2187) loss 4.9632 (5.1194) grad_norm 1.8680 (1.8951) [2022-01-17 19:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][610/1251] eta 0:23:41 lr 0.000525 time 1.6280 (2.2174) loss 5.6356 (5.1164) grad_norm 2.2935 (1.8959) [2022-01-17 19:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][620/1251] eta 0:23:16 lr 0.000525 time 1.8206 (2.2133) loss 4.1243 (5.1159) grad_norm 2.1450 (1.8982) [2022-01-17 19:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][630/1251] eta 0:22:51 lr 0.000526 time 1.7771 (2.2081) loss 4.1240 (5.1153) grad_norm 1.5578 (1.8942) [2022-01-17 19:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][640/1251] eta 0:22:27 lr 0.000526 time 1.6317 (2.2054) loss 5.4125 (5.1199) grad_norm 1.5849 (1.8895) [2022-01-17 19:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][650/1251] eta 0:22:06 lr 0.000526 time 2.2865 (2.2068) loss 5.0700 (5.1181) grad_norm 1.6617 (1.8888) [2022-01-17 19:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][660/1251] eta 0:21:43 lr 0.000527 time 1.8559 (2.2059) loss 4.4762 (5.1142) grad_norm 1.7317 (1.8872) [2022-01-17 19:47:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][670/1251] eta 0:21:24 lr 0.000527 time 3.2970 (2.2105) loss 5.6637 (5.1171) grad_norm 1.5081 (1.8849) [2022-01-17 19:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][680/1251] eta 0:21:04 lr 0.000528 time 3.4465 (2.2153) loss 4.6139 (5.1144) grad_norm 2.0975 (1.8826) [2022-01-17 19:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][690/1251] eta 0:20:42 lr 0.000528 time 1.7659 (2.2145) loss 5.1968 (5.1138) grad_norm 2.7694 (1.8828) [2022-01-17 19:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][700/1251] eta 0:20:19 lr 0.000528 time 2.0463 (2.2125) loss 5.8393 (5.1154) grad_norm 2.8350 (1.8854) [2022-01-17 19:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][710/1251] eta 0:19:55 lr 0.000529 time 1.9861 (2.2101) loss 4.6742 (5.1174) grad_norm 2.2997 (1.8862) [2022-01-17 19:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][720/1251] eta 0:19:33 lr 0.000529 time 2.8032 (2.2105) loss 5.5761 (5.1174) grad_norm 1.8707 (1.8869) [2022-01-17 19:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][730/1251] eta 0:19:10 lr 0.000530 time 1.8699 (2.2092) loss 4.2267 (5.1145) grad_norm 1.8015 (1.8852) [2022-01-17 19:49:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][740/1251] eta 0:18:47 lr 0.000530 time 2.5074 (2.2072) loss 5.1355 (5.1109) grad_norm 2.3661 (1.8869) [2022-01-17 19:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][750/1251] eta 0:18:25 lr 0.000530 time 2.2449 (2.2057) loss 5.0286 (5.1100) grad_norm 1.9301 (1.8875) [2022-01-17 19:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][760/1251] eta 0:18:02 lr 0.000531 time 2.5408 (2.2043) loss 4.1015 (5.1076) grad_norm 1.9946 (1.8877) [2022-01-17 19:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][770/1251] eta 0:17:40 lr 0.000531 time 
2.5404 (2.2048) loss 5.7528 (5.1060) grad_norm 3.5680 (1.8902) [2022-01-17 19:51:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][780/1251] eta 0:17:18 lr 0.000532 time 2.5814 (2.2049) loss 5.2800 (5.1076) grad_norm 1.9871 (1.8906) [2022-01-17 19:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][790/1251] eta 0:16:56 lr 0.000532 time 2.1288 (2.2048) loss 4.9708 (5.1062) grad_norm 1.6049 (1.8905) [2022-01-17 19:51:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][800/1251] eta 0:16:34 lr 0.000532 time 2.7264 (2.2062) loss 4.0541 (5.0998) grad_norm 1.7549 (1.8871) [2022-01-17 19:52:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][810/1251] eta 0:16:13 lr 0.000533 time 2.8522 (2.2077) loss 5.3143 (5.0980) grad_norm 1.4661 (1.8853) [2022-01-17 19:52:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][820/1251] eta 0:15:51 lr 0.000533 time 2.1573 (2.2076) loss 5.9012 (5.0989) grad_norm 1.7365 (1.8846) [2022-01-17 19:53:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][830/1251] eta 0:15:29 lr 0.000534 time 2.2651 (2.2077) loss 5.6748 (5.0976) grad_norm 1.3424 (1.8839) [2022-01-17 19:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][840/1251] eta 0:15:07 lr 0.000534 time 2.7787 (2.2080) loss 4.9245 (5.0980) grad_norm 1.5892 (1.8830) [2022-01-17 19:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][850/1251] eta 0:14:44 lr 0.000534 time 2.2037 (2.2064) loss 4.0791 (5.0973) grad_norm 1.8603 (1.8805) [2022-01-17 19:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][860/1251] eta 0:14:21 lr 0.000535 time 1.9085 (2.2045) loss 5.4625 (5.0996) grad_norm 1.5780 (1.8779) [2022-01-17 19:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][870/1251] eta 0:13:59 lr 0.000535 time 2.5874 (2.2045) loss 5.5752 (5.1031) grad_norm 1.7732 (1.8779) [2022-01-17 19:54:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][880/1251] eta 0:13:38 lr 0.000536 time 3.6669 (2.2065) loss 5.1711 (5.1027) grad_norm 1.4089 (1.8759) [2022-01-17 19:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][890/1251] eta 0:13:16 lr 0.000536 time 1.7280 (2.2076) loss 4.2832 (5.0976) grad_norm 3.1899 (1.8772) [2022-01-17 19:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][900/1251] eta 0:12:54 lr 0.000536 time 1.5966 (2.2071) loss 4.1316 (5.0910) grad_norm 1.6868 (1.8797) [2022-01-17 19:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][910/1251] eta 0:12:32 lr 0.000537 time 1.6009 (2.2055) loss 5.1179 (5.0906) grad_norm 1.5535 (1.8782) [2022-01-17 19:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][920/1251] eta 0:12:09 lr 0.000537 time 2.6202 (2.2047) loss 5.1654 (5.0913) grad_norm 1.8714 (1.8791) [2022-01-17 19:56:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][930/1251] eta 0:11:47 lr 0.000538 time 2.2496 (2.2031) loss 4.6360 (5.0948) grad_norm 1.9532 (1.8788) [2022-01-17 19:57:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][940/1251] eta 0:11:24 lr 0.000538 time 1.9103 (2.2022) loss 4.4339 (5.0955) grad_norm 2.0513 (1.8787) [2022-01-17 19:57:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][950/1251] eta 0:11:02 lr 0.000538 time 2.2156 (2.2013) loss 4.4525 (5.0930) grad_norm 1.6939 (1.8765) [2022-01-17 19:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][960/1251] eta 0:10:40 lr 0.000539 time 2.3903 (2.2017) loss 5.2421 (5.0938) grad_norm 1.4171 (1.8762) [2022-01-17 19:58:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][970/1251] eta 0:10:18 lr 0.000539 time 2.2478 (2.2027) loss 4.5668 (5.0922) grad_norm 1.8630 (1.8753) [2022-01-17 19:58:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][980/1251] eta 0:09:57 lr 0.000540 time 
1.5148 (2.2031) loss 5.3223 (5.0910) grad_norm 2.4777 (1.8752) [2022-01-17 19:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][990/1251] eta 0:09:35 lr 0.000540 time 2.5043 (2.2040) loss 5.4697 (5.0911) grad_norm 1.8991 (1.8754) [2022-01-17 19:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1000/1251] eta 0:09:12 lr 0.000540 time 1.7164 (2.2028) loss 5.6417 (5.0920) grad_norm 2.3582 (1.8747) [2022-01-17 19:59:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1010/1251] eta 0:08:50 lr 0.000541 time 1.8846 (2.2013) loss 3.8931 (5.0875) grad_norm 1.5904 (1.8740) [2022-01-17 19:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1020/1251] eta 0:08:28 lr 0.000541 time 1.8811 (2.2008) loss 5.1364 (5.0858) grad_norm 1.7411 (1.8738) [2022-01-17 20:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1030/1251] eta 0:08:06 lr 0.000542 time 2.5705 (2.2002) loss 4.8850 (5.0852) grad_norm 1.7266 (1.8728) [2022-01-17 20:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1040/1251] eta 0:07:44 lr 0.000542 time 2.1233 (2.2020) loss 4.3191 (5.0843) grad_norm 1.9336 (1.8723) [2022-01-17 20:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1050/1251] eta 0:07:23 lr 0.000542 time 1.8119 (2.2042) loss 5.1386 (5.0841) grad_norm 1.7876 (1.8701) [2022-01-17 20:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1060/1251] eta 0:07:00 lr 0.000543 time 1.8852 (2.2031) loss 4.8949 (5.0819) grad_norm 2.0027 (1.8683) [2022-01-17 20:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1070/1251] eta 0:06:38 lr 0.000543 time 2.2541 (2.2017) loss 5.2355 (5.0812) grad_norm 1.5705 (1.8685) [2022-01-17 20:02:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1080/1251] eta 0:06:16 lr 0.000544 time 1.5431 (2.1993) loss 4.7441 (5.0808) grad_norm 1.8832 (1.8729) [2022-01-17 20:02:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1090/1251] eta 0:05:54 lr 0.000544 time 2.7532 (2.1992) loss 4.0283 (5.0783) grad_norm 2.4471 (1.8732) [2022-01-17 20:02:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1100/1251] eta 0:05:32 lr 0.000544 time 2.1274 (2.1991) loss 5.2798 (5.0775) grad_norm 1.5092 (1.8727) [2022-01-17 20:03:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1110/1251] eta 0:05:10 lr 0.000545 time 2.4946 (2.1991) loss 5.3757 (5.0770) grad_norm 1.5714 (1.8711) [2022-01-17 20:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1120/1251] eta 0:04:47 lr 0.000545 time 1.9142 (2.1980) loss 5.1009 (5.0787) grad_norm 1.8975 (1.8699) [2022-01-17 20:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1130/1251] eta 0:04:26 lr 0.000546 time 2.4556 (2.1988) loss 4.7028 (5.0765) grad_norm 1.5485 (1.8708) [2022-01-17 20:04:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1140/1251] eta 0:04:04 lr 0.000546 time 2.2820 (2.1985) loss 5.0517 (5.0729) grad_norm 1.9271 (1.8697) [2022-01-17 20:04:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1150/1251] eta 0:03:42 lr 0.000546 time 2.0153 (2.1983) loss 4.4267 (5.0709) grad_norm 1.8336 (1.8689) [2022-01-17 20:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1160/1251] eta 0:03:19 lr 0.000547 time 1.9275 (2.1975) loss 4.7978 (5.0717) grad_norm 2.3725 (1.8720) [2022-01-17 20:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1170/1251] eta 0:02:58 lr 0.000547 time 1.7876 (2.1977) loss 5.2331 (5.0729) grad_norm 1.4987 (1.8700) [2022-01-17 20:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1180/1251] eta 0:02:35 lr 0.000548 time 1.8430 (2.1970) loss 4.2583 (5.0709) grad_norm 1.7620 (1.8699) [2022-01-17 20:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1190/1251] eta 0:02:14 lr 
0.000548 time 2.0002 (2.1972) loss 4.5608 (5.0705) grad_norm 1.6634 (1.8679) [2022-01-17 20:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1200/1251] eta 0:01:52 lr 0.000548 time 2.7119 (2.1972) loss 5.6972 (5.0708) grad_norm 2.0948 (1.8664) [2022-01-17 20:06:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1210/1251] eta 0:01:30 lr 0.000549 time 3.1715 (2.2006) loss 5.0411 (5.0705) grad_norm 1.7640 (1.8673) [2022-01-17 20:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1220/1251] eta 0:01:08 lr 0.000549 time 2.0530 (2.2011) loss 5.3663 (5.0696) grad_norm 1.3978 (1.8678) [2022-01-17 20:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1230/1251] eta 0:00:46 lr 0.000550 time 1.5586 (2.2007) loss 4.5191 (5.0686) grad_norm 1.6626 (1.8668) [2022-01-17 20:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1240/1251] eta 0:00:24 lr 0.000550 time 1.8662 (2.1999) loss 5.3928 (5.0699) grad_norm 1.9416 (1.8653) [2022-01-17 20:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [10/300][1250/1251] eta 0:00:02 lr 0.000550 time 1.3919 (2.1942) loss 5.2028 (5.0704) grad_norm 1.8790 (1.8645) [2022-01-17 20:08:18 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 10 training takes 0:45:45 [2022-01-17 20:08:18 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_10 saving...... [2022-01-17 20:08:29 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_10 saved !!! 
[2022-01-17 20:08:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.974 (15.974) Loss 2.4638 (2.4638) Acc@1 45.801 (45.801) Acc@5 72.559 (72.559) [2022-01-17 20:09:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.351 (2.846) Loss 2.4573 (2.4743) Acc@1 47.656 (46.786) Acc@5 72.266 (72.656) [2022-01-17 20:09:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.999 (2.263) Loss 2.4580 (2.4721) Acc@1 47.070 (47.080) Acc@5 73.535 (72.889) [2022-01-17 20:09:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.879 (2.000) Loss 2.5693 (2.4769) Acc@1 45.605 (46.947) Acc@5 71.582 (72.792) [2022-01-17 20:09:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.082 (1.967) Loss 2.5405 (2.4839) Acc@1 44.531 (46.768) Acc@5 71.582 (72.699) [2022-01-17 20:09:57 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 46.794 Acc@5 72.584 [2022-01-17 20:09:57 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 46.8% [2022-01-17 20:09:57 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 46.79% [2022-01-17 20:10:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][0/1251] eta 7:34:18 lr 0.000550 time 21.7890 (21.7890) loss 5.6197 (5.6197) grad_norm 1.6785 (1.6785) [2022-01-17 20:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][10/1251] eta 1:24:09 lr 0.000551 time 1.8479 (4.0688) loss 5.6782 (5.1328) grad_norm 1.8198 (1.6923) [2022-01-17 20:11:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][20/1251] eta 1:04:14 lr 0.000551 time 1.8353 (3.1314) loss 5.3752 (5.2600) grad_norm 1.7087 (1.8273) [2022-01-17 20:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][30/1251] eta 0:56:24 lr 0.000552 time 1.5571 (2.7721) loss 4.8581 (5.1608) grad_norm 2.0707 (1.8981) [2022-01-17 20:11:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[11/300][40/1251] eta 0:53:53 lr 0.000552 time 4.0250 (2.6701) loss 4.6736 (5.1217) grad_norm 1.9527 (1.8499) [2022-01-17 20:12:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][50/1251] eta 0:52:25 lr 0.000552 time 3.4631 (2.6188) loss 4.9191 (5.0624) grad_norm 1.3823 (1.8221) [2022-01-17 20:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][60/1251] eta 0:50:38 lr 0.000553 time 1.8660 (2.5513) loss 5.7285 (5.0393) grad_norm 1.6403 (1.8183) [2022-01-17 20:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][70/1251] eta 0:49:00 lr 0.000553 time 1.5426 (2.4899) loss 4.0142 (5.0144) grad_norm 1.5978 (1.8032) [2022-01-17 20:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][80/1251] eta 0:48:19 lr 0.000554 time 3.8539 (2.4761) loss 5.6469 (5.0410) grad_norm 1.9900 (1.8174) [2022-01-17 20:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][90/1251] eta 0:47:30 lr 0.000554 time 3.2698 (2.4551) loss 5.1108 (5.0359) grad_norm 1.7845 (1.8224) [2022-01-17 20:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][100/1251] eta 0:46:16 lr 0.000554 time 1.8479 (2.4123) loss 4.7927 (5.0432) grad_norm 2.0515 (1.8177) [2022-01-17 20:14:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][110/1251] eta 0:45:15 lr 0.000555 time 1.9378 (2.3798) loss 5.1845 (5.0492) grad_norm 2.2115 (1.8141) [2022-01-17 20:14:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][120/1251] eta 0:44:19 lr 0.000555 time 1.9494 (2.3515) loss 5.4678 (5.0298) grad_norm 1.8207 (1.8505) [2022-01-17 20:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][130/1251] eta 0:43:44 lr 0.000556 time 2.7794 (2.3409) loss 5.6689 (5.0385) grad_norm 1.4216 (1.8378) [2022-01-17 20:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][140/1251] eta 0:43:08 lr 0.000556 time 1.9767 (2.3297) loss 4.7702 (5.0285) grad_norm 1.4190 (1.8330) 
[2022-01-17 20:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][150/1251] eta 0:42:42 lr 0.000556 time 2.1784 (2.3270) loss 5.2073 (5.0336) grad_norm 1.5646 (1.8401) [2022-01-17 20:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][160/1251] eta 0:42:20 lr 0.000557 time 2.8266 (2.3287) loss 5.5114 (5.0269) grad_norm 1.6413 (1.8360) [2022-01-17 20:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][170/1251] eta 0:41:43 lr 0.000557 time 2.2172 (2.3158) loss 5.0396 (5.0218) grad_norm 2.3529 (1.8334) [2022-01-17 20:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][180/1251] eta 0:41:10 lr 0.000558 time 2.5600 (2.3065) loss 4.8058 (5.0152) grad_norm 1.9846 (1.8287) [2022-01-17 20:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][190/1251] eta 0:40:34 lr 0.000558 time 1.6594 (2.2948) loss 5.3128 (5.0000) grad_norm 1.4004 (1.8345) [2022-01-17 20:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][200/1251] eta 0:40:03 lr 0.000558 time 2.2019 (2.2865) loss 4.5258 (4.9975) grad_norm 2.0706 (1.8336) [2022-01-17 20:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][210/1251] eta 0:39:30 lr 0.000559 time 1.8580 (2.2767) loss 4.7700 (4.9988) grad_norm 1.5142 (1.8310) [2022-01-17 20:18:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][220/1251] eta 0:38:58 lr 0.000559 time 2.8378 (2.2681) loss 5.2763 (4.9932) grad_norm 1.9429 (1.8407) [2022-01-17 20:18:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][230/1251] eta 0:38:32 lr 0.000560 time 2.1569 (2.2646) loss 5.8048 (5.0069) grad_norm 1.8010 (1.8367) [2022-01-17 20:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][240/1251] eta 0:38:08 lr 0.000560 time 2.5422 (2.2640) loss 5.8228 (5.0146) grad_norm 2.1778 (1.8440) [2022-01-17 20:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][250/1251] eta 0:37:44 
lr 0.000560 time 1.6923 (2.2625) loss 5.0355 (5.0188) grad_norm 2.0616 (1.8465) [2022-01-17 20:19:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][260/1251] eta 0:37:27 lr 0.000561 time 3.3480 (2.2679) loss 4.8162 (5.0161) grad_norm 1.5701 (1.8503) [2022-01-17 20:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][270/1251] eta 0:37:06 lr 0.000561 time 2.4305 (2.2694) loss 4.2028 (5.0179) grad_norm 1.4303 (1.8490) [2022-01-17 20:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][280/1251] eta 0:36:34 lr 0.000562 time 1.8850 (2.2603) loss 4.7620 (5.0159) grad_norm 1.7654 (1.8447) [2022-01-17 20:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][290/1251] eta 0:35:58 lr 0.000562 time 1.9151 (2.2465) loss 5.1370 (5.0201) grad_norm 1.7916 (1.8477) [2022-01-17 20:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][300/1251] eta 0:35:26 lr 0.000562 time 1.8389 (2.2362) loss 5.4785 (5.0135) grad_norm 1.6077 (1.8458) [2022-01-17 20:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][310/1251] eta 0:34:57 lr 0.000563 time 2.1531 (2.2288) loss 5.5142 (5.0244) grad_norm 1.8264 (1.8384) [2022-01-17 20:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][320/1251] eta 0:34:33 lr 0.000563 time 1.8714 (2.2272) loss 5.7032 (5.0277) grad_norm 1.5527 (1.8299) [2022-01-17 20:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][330/1251] eta 0:34:09 lr 0.000564 time 2.2845 (2.2248) loss 4.6305 (5.0343) grad_norm 1.7156 (1.8276) [2022-01-17 20:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][340/1251] eta 0:33:44 lr 0.000564 time 1.8818 (2.2224) loss 5.5230 (5.0330) grad_norm 1.8057 (1.8283) [2022-01-17 20:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][350/1251] eta 0:33:22 lr 0.000564 time 2.3071 (2.2221) loss 4.1759 (5.0236) grad_norm 2.0857 (1.8284) [2022-01-17 20:23:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][360/1251] eta 0:33:04 lr 0.000565 time 2.3006 (2.2270) loss 5.1659 (5.0219) grad_norm 1.6729 (1.8279) [2022-01-17 20:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][370/1251] eta 0:32:48 lr 0.000565 time 2.4337 (2.2339) loss 4.6026 (5.0213) grad_norm 1.4557 (1.8253) [2022-01-17 20:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][380/1251] eta 0:32:25 lr 0.000566 time 1.8807 (2.2332) loss 5.4785 (5.0227) grad_norm 2.0475 (1.8267) [2022-01-17 20:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][390/1251] eta 0:32:01 lr 0.000566 time 2.3371 (2.2316) loss 5.4974 (5.0289) grad_norm 1.6208 (1.8297) [2022-01-17 20:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][400/1251] eta 0:31:38 lr 0.000566 time 1.6197 (2.2304) loss 5.3745 (5.0291) grad_norm 2.1453 (1.8317) [2022-01-17 20:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][410/1251] eta 0:31:14 lr 0.000567 time 3.5283 (2.2289) loss 5.3928 (5.0260) grad_norm 2.3811 (1.8310) [2022-01-17 20:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][420/1251] eta 0:30:48 lr 0.000567 time 2.0952 (2.2248) loss 5.3464 (5.0228) grad_norm 1.4246 (1.8269) [2022-01-17 20:25:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][430/1251] eta 0:30:24 lr 0.000568 time 2.4701 (2.2227) loss 5.5545 (5.0269) grad_norm 2.0601 (1.8251) [2022-01-17 20:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][440/1251] eta 0:30:02 lr 0.000568 time 2.2202 (2.2227) loss 4.2339 (5.0288) grad_norm 2.1499 (1.8264) [2022-01-17 20:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][450/1251] eta 0:29:38 lr 0.000568 time 2.8024 (2.2209) loss 4.7265 (5.0192) grad_norm 1.3523 (1.8237) [2022-01-17 20:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][460/1251] eta 0:29:17 lr 0.000569 time 
2.5730 (2.2215) loss 5.4001 (5.0169) grad_norm 1.8415 (1.8250) [2022-01-17 20:27:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][470/1251] eta 0:28:55 lr 0.000569 time 3.1052 (2.2216) loss 5.6689 (5.0151) grad_norm 1.6269 (1.8233) [2022-01-17 20:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][480/1251] eta 0:28:32 lr 0.000570 time 2.1168 (2.2206) loss 5.4935 (5.0160) grad_norm 1.6274 (1.8232) [2022-01-17 20:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][490/1251] eta 0:28:08 lr 0.000570 time 2.1976 (2.2194) loss 5.5379 (5.0187) grad_norm 1.8161 (1.8241) [2022-01-17 20:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][500/1251] eta 0:27:44 lr 0.000570 time 1.7085 (2.2166) loss 5.0248 (5.0161) grad_norm 1.5370 (1.8226) [2022-01-17 20:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][510/1251] eta 0:27:24 lr 0.000571 time 2.9855 (2.2195) loss 5.3986 (5.0127) grad_norm 1.6615 (1.8221) [2022-01-17 20:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][520/1251] eta 0:27:03 lr 0.000571 time 2.8757 (2.2212) loss 5.5003 (5.0106) grad_norm 2.0240 (1.8220) [2022-01-17 20:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][530/1251] eta 0:26:40 lr 0.000572 time 2.6168 (2.2197) loss 4.8692 (5.0120) grad_norm 1.7423 (1.8164) [2022-01-17 20:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][540/1251] eta 0:26:15 lr 0.000572 time 1.6320 (2.2157) loss 4.7437 (5.0118) grad_norm 1.7441 (1.8115) [2022-01-17 20:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][550/1251] eta 0:25:54 lr 0.000572 time 2.9119 (2.2178) loss 5.2592 (5.0077) grad_norm 1.7275 (1.8106) [2022-01-17 20:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][560/1251] eta 0:25:32 lr 0.000573 time 2.1970 (2.2183) loss 5.1533 (5.0095) grad_norm 1.8183 (1.8113) [2022-01-17 20:31:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][570/1251] eta 0:25:10 lr 0.000573 time 2.2370 (2.2174) loss 5.1805 (5.0101) grad_norm 2.0784 (1.8117) [2022-01-17 20:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][580/1251] eta 0:24:46 lr 0.000574 time 1.4887 (2.2151) loss 5.5223 (5.0097) grad_norm 1.5315 (1.8119) [2022-01-17 20:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][590/1251] eta 0:24:21 lr 0.000574 time 2.2332 (2.2111) loss 5.3949 (5.0106) grad_norm 1.7552 (1.8126) [2022-01-17 20:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][600/1251] eta 0:23:58 lr 0.000574 time 1.5701 (2.2089) loss 5.6219 (5.0091) grad_norm 1.5864 (1.8138) [2022-01-17 20:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][610/1251] eta 0:23:35 lr 0.000575 time 2.7680 (2.2081) loss 4.7451 (5.0108) grad_norm 1.9006 (1.8164) [2022-01-17 20:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][620/1251] eta 0:23:12 lr 0.000575 time 2.3959 (2.2064) loss 4.4924 (5.0110) grad_norm 2.8447 (1.8165) [2022-01-17 20:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][630/1251] eta 0:22:49 lr 0.000576 time 2.1245 (2.2047) loss 5.4418 (5.0117) grad_norm 1.8929 (1.8178) [2022-01-17 20:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][640/1251] eta 0:22:25 lr 0.000576 time 1.7216 (2.2023) loss 4.4732 (5.0142) grad_norm 1.5509 (1.8149) [2022-01-17 20:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][650/1251] eta 0:22:05 lr 0.000576 time 2.6323 (2.2060) loss 5.4734 (5.0153) grad_norm 1.5895 (1.8105) [2022-01-17 20:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][660/1251] eta 0:21:42 lr 0.000577 time 2.1239 (2.2046) loss 5.2880 (5.0127) grad_norm 1.6238 (1.8111) [2022-01-17 20:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][670/1251] eta 0:21:22 lr 0.000577 time 
1.9680 (2.2076) loss 4.6505 (5.0161) grad_norm 1.5893 (1.8108) [2022-01-17 20:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][680/1251] eta 0:21:01 lr 0.000578 time 1.6201 (2.2091) loss 5.0954 (5.0150) grad_norm 1.7321 (1.8116) [2022-01-17 20:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][690/1251] eta 0:20:40 lr 0.000578 time 2.7899 (2.2116) loss 5.2752 (5.0155) grad_norm 1.2743 (1.8089) [2022-01-17 20:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][700/1251] eta 0:20:17 lr 0.000578 time 1.9018 (2.2102) loss 5.5323 (5.0172) grad_norm 1.4157 (1.8082) [2022-01-17 20:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][710/1251] eta 0:19:54 lr 0.000579 time 1.8698 (2.2089) loss 5.5114 (5.0225) grad_norm 1.9404 (1.8083) [2022-01-17 20:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][720/1251] eta 0:19:31 lr 0.000579 time 1.5963 (2.2062) loss 5.1895 (5.0224) grad_norm 1.5454 (1.8061) [2022-01-17 20:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][730/1251] eta 0:19:08 lr 0.000580 time 2.4369 (2.2041) loss 5.9034 (5.0266) grad_norm 1.4518 (1.8041) [2022-01-17 20:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][740/1251] eta 0:18:46 lr 0.000580 time 2.1658 (2.2040) loss 5.4352 (5.0261) grad_norm 1.5677 (1.8019) [2022-01-17 20:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][750/1251] eta 0:18:23 lr 0.000580 time 2.3433 (2.2033) loss 5.1653 (5.0241) grad_norm 1.3459 (1.8023) [2022-01-17 20:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][760/1251] eta 0:18:02 lr 0.000581 time 2.0460 (2.2045) loss 4.7687 (5.0264) grad_norm 1.3987 (1.8022) [2022-01-17 20:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][770/1251] eta 0:17:40 lr 0.000581 time 1.6436 (2.2041) loss 5.4975 (5.0264) grad_norm 1.4913 (1.8019) [2022-01-17 20:38:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][780/1251] eta 0:17:18 lr 0.000582 time 2.4410 (2.2041) loss 4.0251 (5.0250) grad_norm 2.2780 (1.8017) [2022-01-17 20:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][790/1251] eta 0:16:56 lr 0.000582 time 1.9388 (2.2044) loss 5.4271 (5.0247) grad_norm 2.6367 (1.8034) [2022-01-17 20:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][800/1251] eta 0:16:34 lr 0.000582 time 1.9190 (2.2041) loss 5.2430 (5.0260) grad_norm 1.5488 (1.8023) [2022-01-17 20:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][810/1251] eta 0:16:12 lr 0.000583 time 2.2158 (2.2044) loss 4.6898 (5.0231) grad_norm 2.0971 (1.8007) [2022-01-17 20:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][820/1251] eta 0:15:49 lr 0.000583 time 1.8584 (2.2022) loss 5.5972 (5.0237) grad_norm 1.7556 (1.8004) [2022-01-17 20:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][830/1251] eta 0:15:26 lr 0.000584 time 1.8480 (2.2000) loss 5.8227 (5.0238) grad_norm 1.9272 (1.8056) [2022-01-17 20:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][840/1251] eta 0:15:03 lr 0.000584 time 1.9540 (2.1990) loss 5.6593 (5.0246) grad_norm 1.5297 (1.8041) [2022-01-17 20:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][850/1251] eta 0:14:41 lr 0.000584 time 2.0784 (2.1987) loss 5.2510 (5.0280) grad_norm 1.3162 (1.8020) [2022-01-17 20:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][860/1251] eta 0:14:19 lr 0.000585 time 2.4085 (2.1993) loss 4.0801 (5.0286) grad_norm 1.9889 (1.8000) [2022-01-17 20:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][870/1251] eta 0:13:58 lr 0.000585 time 2.1278 (2.1997) loss 4.6103 (5.0283) grad_norm 1.9190 (1.7997) [2022-01-17 20:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][880/1251] eta 0:13:36 lr 0.000586 time 
2.3104 (2.1997) loss 5.1475 (5.0239) grad_norm 1.8159 (1.7991) [2022-01-17 20:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][890/1251] eta 0:13:14 lr 0.000586 time 2.5302 (2.2006) loss 4.3461 (5.0216) grad_norm 1.6073 (1.7994) [2022-01-17 20:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][900/1251] eta 0:12:52 lr 0.000586 time 2.8521 (2.2018) loss 4.0830 (5.0226) grad_norm 1.7150 (1.8000) [2022-01-17 20:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][910/1251] eta 0:12:30 lr 0.000587 time 1.4777 (2.2001) loss 5.5411 (5.0209) grad_norm 1.4570 (1.7992) [2022-01-17 20:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][920/1251] eta 0:12:07 lr 0.000587 time 2.1079 (2.1985) loss 5.1542 (5.0174) grad_norm 2.2001 (1.8027) [2022-01-17 20:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][930/1251] eta 0:11:45 lr 0.000588 time 1.6087 (2.1972) loss 4.2056 (5.0148) grad_norm 1.7200 (1.8015) [2022-01-17 20:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][940/1251] eta 0:11:22 lr 0.000588 time 1.8492 (2.1952) loss 5.4781 (5.0102) grad_norm 1.5861 (1.8018) [2022-01-17 20:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][950/1251] eta 0:11:00 lr 0.000588 time 1.5430 (2.1929) loss 4.9880 (5.0088) grad_norm 1.6852 (1.8021) [2022-01-17 20:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][960/1251] eta 0:10:37 lr 0.000589 time 1.8329 (2.1923) loss 4.0395 (5.0054) grad_norm 1.5104 (1.8020) [2022-01-17 20:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][970/1251] eta 0:10:16 lr 0.000589 time 2.1120 (2.1924) loss 5.2278 (5.0058) grad_norm 2.3357 (1.8036) [2022-01-17 20:45:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][980/1251] eta 0:09:54 lr 0.000590 time 2.2951 (2.1938) loss 4.3957 (5.0054) grad_norm 1.7219 (1.8022) [2022-01-17 20:46:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][990/1251] eta 0:09:33 lr 0.000590 time 2.8285 (2.1964) loss 4.9045 (5.0050) grad_norm 2.0119 (1.8013) [2022-01-17 20:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1000/1251] eta 0:09:11 lr 0.000590 time 1.8387 (2.1964) loss 5.1034 (5.0027) grad_norm 2.1508 (1.8015) [2022-01-17 20:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1010/1251] eta 0:08:49 lr 0.000591 time 2.4234 (2.1962) loss 3.7496 (4.9996) grad_norm 1.9764 (1.8018) [2022-01-17 20:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1020/1251] eta 0:08:26 lr 0.000591 time 1.9126 (2.1943) loss 5.3450 (4.9977) grad_norm 1.5745 (1.8020) [2022-01-17 20:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1030/1251] eta 0:08:04 lr 0.000592 time 2.4028 (2.1943) loss 5.4795 (4.9996) grad_norm 1.6980 (1.8000) [2022-01-17 20:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1040/1251] eta 0:07:43 lr 0.000592 time 1.4975 (2.1956) loss 5.0285 (5.0002) grad_norm 1.9895 (1.7991) [2022-01-17 20:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1050/1251] eta 0:07:21 lr 0.000592 time 2.4216 (2.1972) loss 4.6927 (4.9994) grad_norm 3.3980 (1.7996) [2022-01-17 20:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1060/1251] eta 0:06:59 lr 0.000593 time 2.2324 (2.1977) loss 3.9505 (4.9968) grad_norm 1.6302 (1.8000) [2022-01-17 20:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1070/1251] eta 0:06:37 lr 0.000593 time 3.1191 (2.1988) loss 3.9214 (4.9956) grad_norm 1.9748 (1.7999) [2022-01-17 20:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1080/1251] eta 0:06:15 lr 0.000594 time 1.9080 (2.1967) loss 4.7031 (4.9970) grad_norm 2.2418 (1.7996) [2022-01-17 20:49:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1090/1251] eta 0:05:53 lr 0.000594 
time 1.7278 (2.1937) loss 4.5325 (4.9976) grad_norm 2.4121 (1.7986) [2022-01-17 20:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1100/1251] eta 0:05:31 lr 0.000594 time 1.5899 (2.1923) loss 5.6618 (4.9979) grad_norm 1.9672 (1.7984) [2022-01-17 20:50:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1110/1251] eta 0:05:09 lr 0.000595 time 2.8111 (2.1917) loss 4.7507 (4.9975) grad_norm 1.4382 (1.7977) [2022-01-17 20:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1120/1251] eta 0:04:47 lr 0.000595 time 1.9559 (2.1909) loss 5.6679 (4.9980) grad_norm 1.3732 (1.7964) [2022-01-17 20:51:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1130/1251] eta 0:04:25 lr 0.000596 time 2.4983 (2.1917) loss 5.5068 (4.9986) grad_norm 1.8462 (1.7957) [2022-01-17 20:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1140/1251] eta 0:04:03 lr 0.000596 time 2.6725 (2.1918) loss 4.0395 (4.9969) grad_norm 1.2241 (1.7939) [2022-01-17 20:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1150/1251] eta 0:03:41 lr 0.000596 time 2.1074 (2.1924) loss 5.2818 (4.9978) grad_norm 1.5191 (1.7921) [2022-01-17 20:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1160/1251] eta 0:03:19 lr 0.000597 time 2.7791 (2.1945) loss 5.1727 (4.9988) grad_norm 1.8068 (1.7922) [2022-01-17 20:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1170/1251] eta 0:02:57 lr 0.000597 time 2.0200 (2.1947) loss 4.7047 (4.9960) grad_norm 1.4664 (1.7923) [2022-01-17 20:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1180/1251] eta 0:02:35 lr 0.000598 time 1.5149 (2.1952) loss 4.3357 (4.9969) grad_norm 1.3188 (1.7913) [2022-01-17 20:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1190/1251] eta 0:02:13 lr 0.000598 time 1.9718 (2.1952) loss 5.8207 (4.9978) grad_norm 1.4482 (1.7921) [2022-01-17 20:53:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1200/1251] eta 0:01:51 lr 0.000598 time 2.8999 (2.1942) loss 4.7735 (4.9973) grad_norm 1.4800 (1.7914) [2022-01-17 20:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1210/1251] eta 0:01:29 lr 0.000599 time 1.9520 (2.1923) loss 3.9405 (4.9959) grad_norm 1.6771 (1.7914) [2022-01-17 20:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1220/1251] eta 0:01:07 lr 0.000599 time 1.8342 (2.1925) loss 5.4783 (4.9957) grad_norm 1.5555 (1.7921) [2022-01-17 20:54:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1230/1251] eta 0:00:46 lr 0.000600 time 2.2494 (2.1921) loss 5.3947 (4.9959) grad_norm 1.6010 (1.7921) [2022-01-17 20:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1240/1251] eta 0:00:24 lr 0.000600 time 2.3976 (2.1925) loss 5.1842 (4.9969) grad_norm 1.4775 (1.7922) [2022-01-17 20:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [11/300][1250/1251] eta 0:00:02 lr 0.000600 time 1.1377 (2.1874) loss 5.1650 (4.9971) grad_norm 1.5394 (1.7905) [2022-01-17 20:55:34 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 11 training takes 0:45:36 [2022-01-17 20:55:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.925 (18.925) Loss 2.3702 (2.3702) Acc@1 47.754 (47.754) Acc@5 73.145 (73.145) [2022-01-17 20:56:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.143 (3.374) Loss 2.3520 (2.3711) Acc@1 49.609 (49.237) Acc@5 74.219 (74.299) [2022-01-17 20:56:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.599 (2.556) Loss 2.3591 (2.3635) Acc@1 50.000 (49.042) Acc@5 73.828 (74.484) [2022-01-17 20:56:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.237 (2.313) Loss 2.5057 (2.3594) Acc@1 46.582 (49.017) Acc@5 70.605 (74.414) [2022-01-17 20:57:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.152 
(2.204) Loss 2.3418 (2.3656) Acc@1 47.656 (48.778) Acc@5 75.195 (74.271) [2022-01-17 20:57:12 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 48.850 Acc@5 74.352 [2022-01-17 20:57:12 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 48.8% [2022-01-17 20:57:12 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 48.85% [2022-01-17 20:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][0/1251] eta 7:37:58 lr 0.000600 time 21.9653 (21.9653) loss 5.8233 (5.8233) grad_norm 1.7651 (1.7651) [2022-01-17 20:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][10/1251] eta 1:23:46 lr 0.000601 time 2.1429 (4.0507) loss 4.5824 (4.7987) grad_norm 1.6522 (1.7366) [2022-01-17 20:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][20/1251] eta 1:04:26 lr 0.000601 time 1.5378 (3.1406) loss 4.9947 (4.8290) grad_norm 1.8291 (1.7127) [2022-01-17 20:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][30/1251] eta 0:57:40 lr 0.000602 time 1.7514 (2.8337) loss 5.5359 (4.8975) grad_norm 2.3178 (1.7544) [2022-01-17 20:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][40/1251] eta 0:55:19 lr 0.000602 time 3.5920 (2.7414) loss 5.6649 (4.8667) grad_norm 1.7018 (1.7699) [2022-01-17 20:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][50/1251] eta 0:54:57 lr 0.000602 time 2.8166 (2.7460) loss 5.0611 (4.8461) grad_norm 2.4435 (1.7907) [2022-01-17 20:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][60/1251] eta 0:52:33 lr 0.000603 time 1.9155 (2.6476) loss 5.4128 (4.8199) grad_norm 1.8152 (1.7928) [2022-01-17 21:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][70/1251] eta 0:51:01 lr 0.000603 time 1.7214 (2.5922) loss 5.1393 (4.8603) grad_norm 1.9557 (1.7805) [2022-01-17 21:00:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][80/1251] eta 
0:49:16 lr 0.000604 time 1.9929 (2.5251) loss 5.4213 (4.8721) grad_norm 1.7167 (1.7826) [2022-01-17 21:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][90/1251] eta 0:48:30 lr 0.000604 time 1.9234 (2.5069) loss 5.4164 (4.8732) grad_norm 1.8729 (1.7980) [2022-01-17 21:01:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][100/1251] eta 0:47:08 lr 0.000604 time 1.6166 (2.4578) loss 4.7028 (4.8851) grad_norm 1.4103 (1.7845) [2022-01-17 21:01:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][110/1251] eta 0:46:11 lr 0.000605 time 1.5528 (2.4293) loss 5.2979 (4.9026) grad_norm 1.3293 (1.7729) [2022-01-17 21:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][120/1251] eta 0:45:19 lr 0.000605 time 1.9571 (2.4042) loss 4.9536 (4.8900) grad_norm 1.8003 (1.7704) [2022-01-17 21:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][130/1251] eta 0:44:45 lr 0.000606 time 2.2245 (2.3959) loss 4.8863 (4.9027) grad_norm 2.3675 (1.7622) [2022-01-17 21:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][140/1251] eta 0:43:53 lr 0.000606 time 2.0325 (2.3701) loss 5.2227 (4.9313) grad_norm 1.6021 (1.7584) [2022-01-17 21:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][150/1251] eta 0:43:15 lr 0.000606 time 2.1559 (2.3576) loss 4.8228 (4.9251) grad_norm 1.4314 (1.7564) [2022-01-17 21:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][160/1251] eta 0:42:38 lr 0.000607 time 1.9733 (2.3448) loss 5.1732 (4.9237) grad_norm 1.7767 (1.7557) [2022-01-17 21:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][170/1251] eta 0:42:17 lr 0.000607 time 2.5189 (2.3475) loss 4.8124 (4.9344) grad_norm 1.5441 (1.7541) [2022-01-17 21:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][180/1251] eta 0:41:47 lr 0.000608 time 2.2142 (2.3414) loss 4.1026 (4.9257) grad_norm 2.0489 (1.7525) [2022-01-17 21:04:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][190/1251] eta 0:41:12 lr 0.000608 time 1.9005 (2.3306) loss 4.5916 (4.9212) grad_norm 1.7231 (1.7497) [2022-01-17 21:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][200/1251] eta 0:40:32 lr 0.000608 time 1.6747 (2.3149) loss 5.1727 (4.9241) grad_norm 1.3700 (1.7583) [2022-01-17 21:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][210/1251] eta 0:40:19 lr 0.000609 time 1.9165 (2.3244) loss 5.0642 (4.9245) grad_norm 1.6396 (1.7534) [2022-01-17 21:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][220/1251] eta 0:39:43 lr 0.000609 time 1.6095 (2.3119) loss 5.2940 (4.9268) grad_norm 1.5740 (1.7586) [2022-01-17 21:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][230/1251] eta 0:39:10 lr 0.000610 time 2.3192 (2.3017) loss 3.9966 (4.9222) grad_norm 1.9316 (1.7523) [2022-01-17 21:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][240/1251] eta 0:38:36 lr 0.000610 time 1.5969 (2.2915) loss 4.0155 (4.9296) grad_norm 1.4019 (1.7438) [2022-01-17 21:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][250/1251] eta 0:38:20 lr 0.000610 time 2.0236 (2.2979) loss 5.1954 (4.9273) grad_norm 1.6053 (1.7381) [2022-01-17 21:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][260/1251] eta 0:37:49 lr 0.000611 time 2.2463 (2.2906) loss 5.3214 (4.9272) grad_norm 3.0237 (1.7422) [2022-01-17 21:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][270/1251] eta 0:37:25 lr 0.000611 time 1.9008 (2.2887) loss 5.3473 (4.9347) grad_norm 1.8986 (1.7445) [2022-01-17 21:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][280/1251] eta 0:36:58 lr 0.000612 time 1.8317 (2.2843) loss 5.7517 (4.9366) grad_norm 1.4124 (1.7451) [2022-01-17 21:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][290/1251] eta 0:36:37 lr 0.000612 time 
2.8802 (2.2862) loss 4.3106 (4.9336) grad_norm 1.5729 (1.7495) [2022-01-17 21:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][300/1251] eta 0:36:09 lr 0.000612 time 2.3643 (2.2817) loss 5.2522 (4.9315) grad_norm 1.7712 (1.7503) [2022-01-17 21:08:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][310/1251] eta 0:35:37 lr 0.000613 time 2.2060 (2.2717) loss 5.1338 (4.9248) grad_norm 1.6232 (1.7578) [2022-01-17 21:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][320/1251] eta 0:35:07 lr 0.000613 time 1.7615 (2.2636) loss 5.2130 (4.9159) grad_norm 1.9197 (1.7515) [2022-01-17 21:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][330/1251] eta 0:34:37 lr 0.000614 time 1.4609 (2.2556) loss 5.3101 (4.9156) grad_norm 1.5664 (1.7525) [2022-01-17 21:10:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][340/1251] eta 0:34:12 lr 0.000614 time 2.2226 (2.2525) loss 5.4586 (4.9188) grad_norm 1.8929 (1.7486) [2022-01-17 21:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][350/1251] eta 0:33:44 lr 0.000614 time 2.0985 (2.2469) loss 4.8596 (4.9214) grad_norm 1.4411 (1.7440) [2022-01-17 21:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][360/1251] eta 0:33:18 lr 0.000615 time 2.0296 (2.2427) loss 4.7910 (4.9283) grad_norm 1.6994 (1.7466) [2022-01-17 21:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][370/1251] eta 0:32:54 lr 0.000615 time 1.5883 (2.2417) loss 4.9925 (4.9307) grad_norm 2.2944 (1.7484) [2022-01-17 21:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][380/1251] eta 0:32:36 lr 0.000616 time 3.3814 (2.2463) loss 3.9776 (4.9181) grad_norm 1.6251 (1.7487) [2022-01-17 21:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][390/1251] eta 0:32:14 lr 0.000616 time 1.7361 (2.2464) loss 5.7065 (4.9184) grad_norm 1.7988 (1.7469) [2022-01-17 21:12:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][400/1251] eta 0:31:47 lr 0.000616 time 1.8352 (2.2419) loss 5.1652 (4.9210) grad_norm 1.6089 (1.7458) [2022-01-17 21:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][410/1251] eta 0:31:26 lr 0.000617 time 1.6256 (2.2433) loss 4.6263 (4.9210) grad_norm 1.8823 (1.7449) [2022-01-17 21:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][420/1251] eta 0:31:03 lr 0.000617 time 2.6950 (2.2427) loss 3.8730 (4.9133) grad_norm 2.8152 (1.7489) [2022-01-17 21:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][430/1251] eta 0:30:43 lr 0.000618 time 2.6112 (2.2457) loss 4.7135 (4.9139) grad_norm 1.5768 (1.7522) [2022-01-17 21:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][440/1251] eta 0:30:19 lr 0.000618 time 1.5684 (2.2438) loss 4.0728 (4.9142) grad_norm 2.2395 (1.7532) [2022-01-17 21:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][450/1251] eta 0:29:56 lr 0.000618 time 2.5033 (2.2428) loss 4.6419 (4.9125) grad_norm 1.4324 (1.7494) [2022-01-17 21:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][460/1251] eta 0:29:31 lr 0.000619 time 1.8899 (2.2394) loss 5.5697 (4.9059) grad_norm 1.8197 (1.7502) [2022-01-17 21:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][470/1251] eta 0:29:09 lr 0.000619 time 2.2362 (2.2398) loss 4.7907 (4.9062) grad_norm 1.5703 (1.7451) [2022-01-17 21:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][480/1251] eta 0:28:44 lr 0.000620 time 1.9927 (2.2366) loss 5.5174 (4.9133) grad_norm 2.1136 (1.7443) [2022-01-17 21:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][490/1251] eta 0:28:21 lr 0.000620 time 2.4110 (2.2360) loss 5.3828 (4.9122) grad_norm 1.4541 (1.7458) [2022-01-17 21:15:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][500/1251] eta 0:27:57 lr 0.000620 time 
1.8497 (2.2340) loss 5.7959 (4.9104) grad_norm 1.8546 (1.7456) [2022-01-17 21:16:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][510/1251] eta 0:27:34 lr 0.000621 time 1.9130 (2.2325) loss 4.8052 (4.9162) grad_norm 2.4167 (1.7480) [2022-01-17 21:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][520/1251] eta 0:27:10 lr 0.000621 time 1.9096 (2.2302) loss 4.5266 (4.9144) grad_norm 1.5110 (1.7454) [2022-01-17 21:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][530/1251] eta 0:26:47 lr 0.000622 time 2.3059 (2.2294) loss 5.2598 (4.9162) grad_norm 1.6878 (1.7461) [2022-01-17 21:17:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][540/1251] eta 0:26:24 lr 0.000622 time 2.5509 (2.2281) loss 4.9531 (4.9165) grad_norm 1.8849 (1.7433) [2022-01-17 21:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][550/1251] eta 0:26:02 lr 0.000622 time 1.6017 (2.2287) loss 5.1919 (4.9185) grad_norm 1.3409 (1.7415) [2022-01-17 21:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][560/1251] eta 0:25:38 lr 0.000623 time 1.9606 (2.2263) loss 5.5335 (4.9222) grad_norm 1.4764 (1.7382) [2022-01-17 21:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][570/1251] eta 0:25:15 lr 0.000623 time 2.2154 (2.2258) loss 4.3901 (4.9208) grad_norm 1.6120 (1.7351) [2022-01-17 21:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][580/1251] eta 0:24:53 lr 0.000624 time 2.0349 (2.2262) loss 4.3468 (4.9185) grad_norm 1.7892 (1.7344) [2022-01-17 21:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][590/1251] eta 0:24:32 lr 0.000624 time 2.1139 (2.2279) loss 5.1411 (4.9163) grad_norm 1.5064 (1.7343) [2022-01-17 21:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][600/1251] eta 0:24:09 lr 0.000624 time 1.5878 (2.2258) loss 5.2778 (4.9187) grad_norm 1.5809 (1.7326) [2022-01-17 21:19:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][610/1251] eta 0:23:45 lr 0.000625 time 1.9995 (2.2244) loss 5.0939 (4.9212) grad_norm 1.9628 (1.7325) [2022-01-17 21:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][620/1251] eta 0:23:22 lr 0.000625 time 2.0030 (2.2227) loss 4.9208 (4.9232) grad_norm 1.7788 (1.7307) [2022-01-17 21:20:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][630/1251] eta 0:22:59 lr 0.000626 time 2.1231 (2.2212) loss 4.4169 (4.9209) grad_norm 1.3021 (1.7276) [2022-01-17 21:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][640/1251] eta 0:22:36 lr 0.000626 time 2.2802 (2.2202) loss 4.6614 (4.9173) grad_norm 1.8733 (1.7275) [2022-01-17 21:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][650/1251] eta 0:22:13 lr 0.000626 time 1.9295 (2.2192) loss 3.8921 (4.9190) grad_norm 1.3228 (1.7261) [2022-01-17 21:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][660/1251] eta 0:21:53 lr 0.000627 time 2.1894 (2.2220) loss 5.3372 (4.9191) grad_norm 1.6791 (1.7290) [2022-01-17 21:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][670/1251] eta 0:21:34 lr 0.000627 time 3.2095 (2.2275) loss 5.2492 (4.9176) grad_norm 1.3035 (1.7276) [2022-01-17 21:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][680/1251] eta 0:21:11 lr 0.000628 time 1.6622 (2.2262) loss 5.5922 (4.9155) grad_norm 1.5821 (1.7274) [2022-01-17 21:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][690/1251] eta 0:20:47 lr 0.000628 time 1.6179 (2.2242) loss 5.4937 (4.9166) grad_norm 1.3383 (1.7240) [2022-01-17 21:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][700/1251] eta 0:20:24 lr 0.000628 time 2.0262 (2.2221) loss 4.9034 (4.9167) grad_norm 1.5119 (1.7204) [2022-01-17 21:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][710/1251] eta 0:20:00 lr 0.000629 time 
2.1868 (2.2194) loss 4.9214 (4.9171) grad_norm 1.3999 (1.7198) [2022-01-17 21:23:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][720/1251] eta 0:19:38 lr 0.000629 time 2.7430 (2.2194) loss 5.5206 (4.9177) grad_norm 1.4290 (1.7187) [2022-01-17 21:24:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][730/1251] eta 0:19:16 lr 0.000630 time 2.5729 (2.2201) loss 5.2697 (4.9170) grad_norm 1.5183 (1.7180) [2022-01-17 21:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][740/1251] eta 0:18:54 lr 0.000630 time 2.4849 (2.2199) loss 5.3562 (4.9176) grad_norm 1.6115 (1.7156) [2022-01-17 21:24:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][750/1251] eta 0:18:32 lr 0.000630 time 1.9493 (2.2197) loss 4.9075 (4.9177) grad_norm 2.4522 (1.7182) [2022-01-17 21:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][760/1251] eta 0:18:09 lr 0.000631 time 2.4187 (2.2192) loss 4.8775 (4.9166) grad_norm 1.5779 (1.7199) [2022-01-17 21:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][770/1251] eta 0:17:46 lr 0.000631 time 1.9531 (2.2175) loss 4.3040 (4.9143) grad_norm 2.2366 (1.7234) [2022-01-17 21:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][780/1251] eta 0:17:23 lr 0.000632 time 2.2826 (2.2152) loss 5.5227 (4.9160) grad_norm 1.8026 (1.7226) [2022-01-17 21:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][790/1251] eta 0:17:00 lr 0.000632 time 2.1487 (2.2136) loss 4.8808 (4.9170) grad_norm 1.4905 (1.7215) [2022-01-17 21:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][800/1251] eta 0:16:38 lr 0.000632 time 2.2511 (2.2130) loss 4.1215 (4.9137) grad_norm 1.4322 (1.7227) [2022-01-17 21:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][810/1251] eta 0:16:16 lr 0.000633 time 2.1422 (2.2143) loss 5.6413 (4.9171) grad_norm 1.6577 (1.7228) [2022-01-17 21:27:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][820/1251] eta 0:15:55 lr 0.000633 time 3.2076 (2.2170) loss 5.7096 (4.9198) grad_norm 1.6298 (1.7258) [2022-01-17 21:27:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][830/1251] eta 0:15:34 lr 0.000634 time 3.6064 (2.2193) loss 5.3044 (4.9214) grad_norm 1.5534 (1.7270) [2022-01-17 21:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][840/1251] eta 0:15:12 lr 0.000634 time 1.8339 (2.2194) loss 4.0042 (4.9186) grad_norm 1.4509 (1.7264) [2022-01-17 21:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][850/1251] eta 0:14:48 lr 0.000634 time 1.8176 (2.2169) loss 4.9133 (4.9206) grad_norm 1.7707 (1.7275) [2022-01-17 21:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][860/1251] eta 0:14:25 lr 0.000635 time 2.2763 (2.2138) loss 4.2922 (4.9210) grad_norm 1.4336 (1.7261) [2022-01-17 21:29:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][870/1251] eta 0:14:02 lr 0.000635 time 2.0106 (2.2113) loss 4.5260 (4.9239) grad_norm 1.7380 (1.7255) [2022-01-17 21:29:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][880/1251] eta 0:13:39 lr 0.000636 time 2.5609 (2.2097) loss 5.3798 (4.9256) grad_norm 1.4193 (1.7243) [2022-01-17 21:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][890/1251] eta 0:13:18 lr 0.000636 time 3.2482 (2.2112) loss 3.8230 (4.9234) grad_norm 1.5262 (1.7217) [2022-01-17 21:30:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][900/1251] eta 0:12:56 lr 0.000636 time 2.1948 (2.2120) loss 5.2135 (4.9221) grad_norm 1.5703 (1.7221) [2022-01-17 21:30:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][910/1251] eta 0:12:34 lr 0.000637 time 1.2604 (2.2118) loss 5.5421 (4.9199) grad_norm 1.4571 (1.7193) [2022-01-17 21:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][920/1251] eta 0:12:11 lr 0.000637 time 
1.4571 (2.2112) loss 4.4169 (4.9136) grad_norm 1.5971 (1.7180) [2022-01-17 21:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][930/1251] eta 0:11:50 lr 0.000638 time 2.7601 (2.2131) loss 5.3899 (4.9143) grad_norm 1.4459 (1.7174) [2022-01-17 21:31:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][940/1251] eta 0:11:28 lr 0.000638 time 1.9316 (2.2131) loss 4.9425 (4.9125) grad_norm 1.9588 (1.7191) [2022-01-17 21:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][950/1251] eta 0:11:06 lr 0.000638 time 1.8384 (2.2149) loss 5.6849 (4.9131) grad_norm 2.1994 (1.7194) [2022-01-17 21:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][960/1251] eta 0:10:44 lr 0.000639 time 1.9716 (2.2133) loss 5.3461 (4.9118) grad_norm 1.7945 (1.7204) [2022-01-17 21:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][970/1251] eta 0:10:21 lr 0.000639 time 2.6273 (2.2120) loss 4.3009 (4.9105) grad_norm 2.1487 (1.7230) [2022-01-17 21:33:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][980/1251] eta 0:09:58 lr 0.000640 time 1.9271 (2.2096) loss 5.0925 (4.9085) grad_norm 1.8430 (1.7230) [2022-01-17 21:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][990/1251] eta 0:09:36 lr 0.000640 time 2.3715 (2.2076) loss 5.3193 (4.9108) grad_norm 1.6918 (1.7225) [2022-01-17 21:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1000/1251] eta 0:09:14 lr 0.000640 time 2.5313 (2.2073) loss 4.8947 (4.9085) grad_norm 1.5705 (1.7226) [2022-01-17 21:34:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1010/1251] eta 0:08:52 lr 0.000641 time 2.0978 (2.2082) loss 3.9359 (4.9084) grad_norm 1.5675 (1.7202) [2022-01-17 21:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1020/1251] eta 0:08:30 lr 0.000641 time 2.5865 (2.2081) loss 5.4479 (4.9076) grad_norm 1.5607 (1.7188) [2022-01-17 21:35:09 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1030/1251] eta 0:08:08 lr 0.000642 time 2.8984 (2.2088) loss 4.6147 (4.9080) grad_norm 1.6403 (1.7173) [2022-01-17 21:35:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1040/1251] eta 0:07:46 lr 0.000642 time 1.9266 (2.2100) loss 4.2632 (4.9073) grad_norm 1.5816 (1.7164) [2022-01-17 21:35:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1050/1251] eta 0:07:24 lr 0.000642 time 1.7939 (2.2100) loss 5.5411 (4.9065) grad_norm 1.6602 (1.7162) [2022-01-17 21:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1060/1251] eta 0:07:02 lr 0.000643 time 2.7769 (2.2105) loss 5.3449 (4.9073) grad_norm 2.1830 (1.7165) [2022-01-17 21:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1070/1251] eta 0:06:39 lr 0.000643 time 1.7683 (2.2087) loss 5.6387 (4.9091) grad_norm 1.3045 (1.7164) [2022-01-17 21:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1080/1251] eta 0:06:17 lr 0.000644 time 2.0350 (2.2073) loss 4.0079 (4.9079) grad_norm 1.8019 (1.7154) [2022-01-17 21:37:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1090/1251] eta 0:05:55 lr 0.000644 time 2.1865 (2.2079) loss 5.2117 (4.9054) grad_norm 1.5691 (1.7159) [2022-01-17 21:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1100/1251] eta 0:05:33 lr 0.000644 time 3.0334 (2.2096) loss 4.6569 (4.9048) grad_norm 1.4170 (1.7141) [2022-01-17 21:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1110/1251] eta 0:05:11 lr 0.000645 time 2.6733 (2.2104) loss 3.7992 (4.9045) grad_norm 2.6436 (1.7132) [2022-01-17 21:38:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1120/1251] eta 0:04:49 lr 0.000645 time 1.9029 (2.2092) loss 4.4586 (4.9033) grad_norm 1.5772 (1.7148) [2022-01-17 21:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1130/1251] eta 0:04:26 lr 
0.000646 time 1.7868 (2.2066) loss 4.3464 (4.9055) grad_norm 2.0362 (1.7159) [2022-01-17 21:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1140/1251] eta 0:04:04 lr 0.000646 time 1.8264 (2.2056) loss 4.0554 (4.9065) grad_norm 1.7191 (1.7165) [2022-01-17 21:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1150/1251] eta 0:03:42 lr 0.000646 time 2.5726 (2.2046) loss 4.1600 (4.9062) grad_norm 1.5205 (1.7138) [2022-01-17 21:39:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1160/1251] eta 0:03:20 lr 0.000647 time 1.6749 (2.2047) loss 4.9247 (4.9049) grad_norm 1.8223 (1.7134) [2022-01-17 21:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1170/1251] eta 0:02:58 lr 0.000647 time 2.5809 (2.2057) loss 4.6940 (4.9016) grad_norm 2.1207 (1.7141) [2022-01-17 21:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1180/1251] eta 0:02:36 lr 0.000648 time 1.5445 (2.2060) loss 5.1544 (4.9022) grad_norm 1.2579 (1.7131) [2022-01-17 21:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1190/1251] eta 0:02:14 lr 0.000648 time 2.7396 (2.2071) loss 5.5959 (4.9042) grad_norm 1.4800 (1.7117) [2022-01-17 21:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1200/1251] eta 0:01:52 lr 0.000648 time 2.2512 (2.2083) loss 4.0195 (4.9047) grad_norm 1.5651 (1.7126) [2022-01-17 21:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1210/1251] eta 0:01:30 lr 0.000649 time 2.1543 (2.2090) loss 5.2395 (4.9038) grad_norm 1.6019 (1.7113) [2022-01-17 21:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1220/1251] eta 0:01:08 lr 0.000649 time 1.9094 (2.2082) loss 5.1103 (4.9043) grad_norm 1.3456 (1.7100) [2022-01-17 21:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1230/1251] eta 0:00:46 lr 0.000650 time 1.9346 (2.2066) loss 4.5565 (4.9016) grad_norm 1.7696 (1.7083) [2022-01-17 21:42:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1240/1251] eta 0:00:24 lr 0.000650 time 2.1056 (2.2053) loss 4.2257 (4.9016) grad_norm 1.5638 (1.7080) [2022-01-17 21:43:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [12/300][1250/1251] eta 0:00:02 lr 0.000650 time 1.3623 (2.2000) loss 5.0979 (4.9017) grad_norm 1.6644 (1.7082) [2022-01-17 21:43:05 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 12 training takes 0:45:52 [2022-01-17 21:43:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.738 (18.738) Loss 2.3526 (2.3526) Acc@1 48.926 (48.926) Acc@5 75.195 (75.195) [2022-01-17 21:43:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.256 (3.291) Loss 2.2856 (2.2948) Acc@1 50.098 (50.550) Acc@5 76.660 (75.755) [2022-01-17 21:44:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.299 (2.643) Loss 2.3534 (2.2768) Acc@1 50.488 (50.781) Acc@5 73.535 (75.730) [2022-01-17 21:44:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.312 (2.206) Loss 2.2246 (2.2762) Acc@1 51.953 (50.643) Acc@5 77.051 (75.892) [2022-01-17 21:44:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.864 (2.148) Loss 2.2886 (2.2795) Acc@1 50.879 (50.638) Acc@5 75.977 (75.793) [2022-01-17 21:44:40 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 50.770 Acc@5 75.810 [2022-01-17 21:44:40 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 50.8% [2022-01-17 21:44:40 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 50.77% [2022-01-17 21:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][0/1251] eta 7:30:31 lr 0.000650 time 21.6079 (21.6079) loss 3.4350 (3.4350) grad_norm 1.7505 (1.7505) [2022-01-17 21:45:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][10/1251] eta 1:20:14 lr 0.000651 time 1.4500 (3.8795) loss 3.8366 (4.5579) grad_norm 1.5426 
(1.6614) [2022-01-17 21:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][20/1251] eta 1:03:51 lr 0.000651 time 2.4017 (3.1128) loss 5.4153 (4.7367) grad_norm 1.6530 (1.7188) [2022-01-17 21:46:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][30/1251] eta 0:56:51 lr 0.000652 time 1.4726 (2.7936) loss 4.8120 (4.7389) grad_norm 1.7297 (1.7071) [2022-01-17 21:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][40/1251] eta 0:54:39 lr 0.000652 time 4.6527 (2.7085) loss 5.3073 (4.7320) grad_norm 1.4660 (1.7092) [2022-01-17 21:46:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][50/1251] eta 0:52:07 lr 0.000652 time 1.4738 (2.6039) loss 5.1074 (4.7656) grad_norm 1.6826 (1.7064) [2022-01-17 21:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][60/1251] eta 0:50:26 lr 0.000653 time 1.4867 (2.5411) loss 4.7317 (4.8017) grad_norm 1.8034 (1.7092) [2022-01-17 21:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][70/1251] eta 0:49:09 lr 0.000653 time 1.5666 (2.4971) loss 4.3774 (4.8054) grad_norm 1.5704 (1.7038) [2022-01-17 21:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][80/1251] eta 0:48:38 lr 0.000654 time 3.6400 (2.4921) loss 3.6792 (4.8107) grad_norm 1.6293 (1.6979) [2022-01-17 21:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][90/1251] eta 0:47:49 lr 0.000654 time 1.7868 (2.4719) loss 3.8104 (4.7893) grad_norm 2.2803 (1.7039) [2022-01-17 21:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][100/1251] eta 0:47:14 lr 0.000654 time 1.9311 (2.4626) loss 3.9397 (4.7838) grad_norm 1.7651 (1.6961) [2022-01-17 21:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][110/1251] eta 0:45:56 lr 0.000655 time 1.8683 (2.4162) loss 5.1173 (4.7867) grad_norm 1.7489 (1.7037) [2022-01-17 21:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][120/1251] eta 0:44:43 
lr 0.000655 time 2.1744 (2.3727) loss 4.8906 (4.7846) grad_norm 1.8253 (1.6890) [2022-01-17 21:49:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][130/1251] eta 0:43:53 lr 0.000656 time 2.2581 (2.3488) loss 5.4756 (4.7962) grad_norm 2.4642 (1.7036) [2022-01-17 21:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][140/1251] eta 0:43:06 lr 0.000656 time 1.8593 (2.3276) loss 5.4880 (4.7952) grad_norm 1.7579 (1.7128) [2022-01-17 21:50:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][150/1251] eta 0:42:43 lr 0.000656 time 2.8921 (2.3283) loss 5.3085 (4.8077) grad_norm 1.4313 (1.7155) [2022-01-17 21:50:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][160/1251] eta 0:42:14 lr 0.000657 time 2.2207 (2.3231) loss 4.4689 (4.7977) grad_norm 1.5189 (1.7105) [2022-01-17 21:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][170/1251] eta 0:41:36 lr 0.000657 time 1.8591 (2.3093) loss 4.1498 (4.7969) grad_norm 1.5166 (1.7117) [2022-01-17 21:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][180/1251] eta 0:41:04 lr 0.000658 time 1.9192 (2.3015) loss 4.8572 (4.8217) grad_norm 1.6963 (1.7103) [2022-01-17 21:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][190/1251] eta 0:40:39 lr 0.000658 time 2.3036 (2.2991) loss 3.8395 (4.8091) grad_norm 1.5121 (1.7148) [2022-01-17 21:52:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][200/1251] eta 0:40:18 lr 0.000658 time 2.4440 (2.3010) loss 4.4462 (4.8049) grad_norm 1.5516 (1.7135) [2022-01-17 21:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][210/1251] eta 0:40:01 lr 0.000659 time 2.2805 (2.3070) loss 5.2753 (4.8143) grad_norm 1.7276 (1.7126) [2022-01-17 21:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][220/1251] eta 0:39:33 lr 0.000659 time 2.5459 (2.3025) loss 5.0960 (4.8104) grad_norm 1.5920 (1.7075) [2022-01-17 21:53:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][230/1251] eta 0:38:57 lr 0.000660 time 2.0310 (2.2895) loss 5.3223 (4.8078) grad_norm 1.3499 (1.7023) [2022-01-17 21:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][240/1251] eta 0:38:23 lr 0.000660 time 1.8685 (2.2787) loss 4.8004 (4.8193) grad_norm 1.8051 (1.6993) [2022-01-17 21:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][250/1251] eta 0:37:51 lr 0.000660 time 2.0855 (2.2688) loss 4.9211 (4.8273) grad_norm 1.4615 (1.6901) [2022-01-17 21:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][260/1251] eta 0:37:25 lr 0.000661 time 2.8923 (2.2658) loss 5.3431 (4.8232) grad_norm 1.3999 (1.6895) [2022-01-17 21:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][270/1251] eta 0:37:04 lr 0.000661 time 1.7395 (2.2679) loss 4.3896 (4.8175) grad_norm 1.2454 (1.6911) [2022-01-17 21:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][280/1251] eta 0:36:42 lr 0.000662 time 2.7734 (2.2681) loss 4.8875 (4.8133) grad_norm 2.1266 (1.6966) [2022-01-17 21:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][290/1251] eta 0:36:16 lr 0.000662 time 2.1947 (2.2651) loss 5.1804 (4.8072) grad_norm 1.2688 (1.6921) [2022-01-17 21:56:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][300/1251] eta 0:35:47 lr 0.000662 time 2.5663 (2.2581) loss 5.0649 (4.7996) grad_norm 1.5966 (1.6885) [2022-01-17 21:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][310/1251] eta 0:35:21 lr 0.000663 time 1.8187 (2.2549) loss 4.9685 (4.7998) grad_norm 1.1798 (1.6812) [2022-01-17 21:56:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][320/1251] eta 0:34:59 lr 0.000663 time 2.0003 (2.2551) loss 5.7142 (4.8055) grad_norm 1.6186 (1.6775) [2022-01-17 21:57:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][330/1251] eta 0:34:34 lr 0.000664 time 
2.0254 (2.2522) loss 5.6907 (4.8107) grad_norm 1.2445 (1.6783) [2022-01-17 21:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][340/1251] eta 0:34:08 lr 0.000664 time 2.3360 (2.2491) loss 4.7336 (4.8054) grad_norm 1.4078 (1.6705) [2022-01-17 21:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][350/1251] eta 0:33:40 lr 0.000664 time 1.7314 (2.2430) loss 4.4999 (4.8037) grad_norm 2.0129 (1.6719) [2022-01-17 21:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][360/1251] eta 0:33:16 lr 0.000665 time 2.4672 (2.2403) loss 4.6119 (4.8023) grad_norm 1.8771 (1.6747) [2022-01-17 21:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][370/1251] eta 0:32:48 lr 0.000665 time 1.6911 (2.2338) loss 4.5806 (4.8105) grad_norm 2.1465 (1.6756) [2022-01-17 21:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][380/1251] eta 0:32:25 lr 0.000666 time 2.8540 (2.2334) loss 5.3273 (4.8109) grad_norm 1.7415 (1.6794) [2022-01-17 21:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][390/1251] eta 0:32:03 lr 0.000666 time 2.5005 (2.2339) loss 5.0078 (4.8080) grad_norm 1.8462 (1.6748) [2022-01-17 21:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][400/1251] eta 0:31:39 lr 0.000666 time 2.2160 (2.2326) loss 5.3794 (4.8100) grad_norm 1.8158 (1.6780) [2022-01-17 21:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][410/1251] eta 0:31:18 lr 0.000667 time 1.9436 (2.2333) loss 3.6613 (4.8060) grad_norm 1.4996 (1.6787) [2022-01-17 22:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][420/1251] eta 0:30:58 lr 0.000667 time 3.5709 (2.2365) loss 4.1495 (4.8059) grad_norm 1.6661 (1.6756) [2022-01-17 22:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][430/1251] eta 0:30:33 lr 0.000668 time 1.9329 (2.2331) loss 4.5889 (4.8086) grad_norm 1.5264 (1.6752) [2022-01-17 22:01:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][440/1251] eta 0:30:09 lr 0.000668 time 1.9024 (2.2312) loss 4.7891 (4.8153) grad_norm 1.5849 (1.6709) [2022-01-17 22:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][450/1251] eta 0:29:46 lr 0.000668 time 1.9147 (2.2304) loss 5.2800 (4.8200) grad_norm 1.3186 (1.6675) [2022-01-17 22:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][460/1251] eta 0:29:29 lr 0.000669 time 3.6334 (2.2372) loss 5.0489 (4.8152) grad_norm 1.9154 (1.6660) [2022-01-17 22:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][470/1251] eta 0:29:06 lr 0.000669 time 1.8234 (2.2358) loss 5.0809 (4.8119) grad_norm 1.6030 (1.6633) [2022-01-17 22:02:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][480/1251] eta 0:28:40 lr 0.000670 time 1.6199 (2.2311) loss 5.3942 (4.8131) grad_norm 1.6755 (1.6624) [2022-01-17 22:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][490/1251] eta 0:28:11 lr 0.000670 time 1.6374 (2.2234) loss 5.7192 (4.8203) grad_norm 1.7653 (1.6682) [2022-01-17 22:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][500/1251] eta 0:27:46 lr 0.000670 time 1.8636 (2.2194) loss 5.7028 (4.8252) grad_norm 1.4947 (1.6675) [2022-01-17 22:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][510/1251] eta 0:27:24 lr 0.000671 time 2.4627 (2.2193) loss 5.0563 (4.8332) grad_norm 1.3515 (1.6649) [2022-01-17 22:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][520/1251] eta 0:27:02 lr 0.000671 time 2.0932 (2.2191) loss 5.0853 (4.8341) grad_norm 1.8420 (1.6643) [2022-01-17 22:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][530/1251] eta 0:26:39 lr 0.000672 time 2.3347 (2.2180) loss 5.2244 (4.8263) grad_norm 1.4654 (1.6630) [2022-01-17 22:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][540/1251] eta 0:26:17 lr 0.000672 time 
2.4182 (2.2194) loss 5.1257 (4.8251) grad_norm 1.5325 (1.6631) [2022-01-17 22:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][550/1251] eta 0:25:56 lr 0.000672 time 2.1652 (2.2203) loss 5.2465 (4.8210) grad_norm 1.5942 (1.6616) [2022-01-17 22:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][560/1251] eta 0:25:34 lr 0.000673 time 2.3783 (2.2201) loss 4.5546 (4.8192) grad_norm 1.4443 (1.6620) [2022-01-17 22:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][570/1251] eta 0:25:11 lr 0.000673 time 2.2196 (2.2197) loss 4.3481 (4.8221) grad_norm 2.0199 (1.6617) [2022-01-17 22:06:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][580/1251] eta 0:24:50 lr 0.000674 time 2.4753 (2.2213) loss 4.7156 (4.8259) grad_norm 1.7162 (1.6588) [2022-01-17 22:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][590/1251] eta 0:24:26 lr 0.000674 time 1.8479 (2.2181) loss 5.0829 (4.8296) grad_norm 1.6781 (1.6573) [2022-01-17 22:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][600/1251] eta 0:24:03 lr 0.000674 time 2.2820 (2.2168) loss 5.4801 (4.8276) grad_norm 1.9964 (1.6601) [2022-01-17 22:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][610/1251] eta 0:23:40 lr 0.000675 time 2.5439 (2.2164) loss 5.2891 (4.8290) grad_norm 1.8006 (1.6596) [2022-01-17 22:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][620/1251] eta 0:23:18 lr 0.000675 time 1.6591 (2.2171) loss 5.4869 (4.8298) grad_norm 1.3032 (1.6569) [2022-01-17 22:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][630/1251] eta 0:22:56 lr 0.000676 time 1.9186 (2.2163) loss 4.5137 (4.8278) grad_norm 1.5927 (1.6563) [2022-01-17 22:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][640/1251] eta 0:22:31 lr 0.000676 time 1.5358 (2.2124) loss 5.5386 (4.8248) grad_norm 1.5535 (1.6563) [2022-01-17 22:08:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][650/1251] eta 0:22:08 lr 0.000676 time 1.6382 (2.2100) loss 4.9693 (4.8206) grad_norm 1.2865 (1.6549) [2022-01-17 22:09:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][660/1251] eta 0:21:48 lr 0.000677 time 2.2944 (2.2133) loss 4.6345 (4.8223) grad_norm 1.6356 (1.6569) [2022-01-17 22:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][670/1251] eta 0:21:27 lr 0.000677 time 2.1736 (2.2153) loss 3.7426 (4.8194) grad_norm 1.7017 (1.6581) [2022-01-17 22:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][680/1251] eta 0:21:05 lr 0.000678 time 1.7238 (2.2160) loss 5.2655 (4.8196) grad_norm 1.6759 (1.6591) [2022-01-17 22:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][690/1251] eta 0:20:42 lr 0.000678 time 1.7864 (2.2150) loss 3.7108 (4.8221) grad_norm 1.8983 (1.6585) [2022-01-17 22:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][700/1251] eta 0:20:19 lr 0.000678 time 1.9267 (2.2132) loss 5.7713 (4.8232) grad_norm 1.5947 (1.6565) [2022-01-17 22:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][710/1251] eta 0:19:55 lr 0.000679 time 1.9057 (2.2100) loss 5.2175 (4.8212) grad_norm 1.3025 (1.6556) [2022-01-17 22:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][720/1251] eta 0:19:32 lr 0.000679 time 2.1871 (2.2081) loss 4.6399 (4.8201) grad_norm 1.6977 (1.6540) [2022-01-17 22:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][730/1251] eta 0:19:10 lr 0.000679 time 1.9787 (2.2077) loss 4.5567 (4.8195) grad_norm 1.6709 (1.6540) [2022-01-17 22:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][740/1251] eta 0:18:48 lr 0.000680 time 1.8560 (2.2092) loss 4.6860 (4.8222) grad_norm 1.3177 (1.6524) [2022-01-17 22:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][750/1251] eta 0:18:27 lr 0.000680 time 
2.6523 (2.2107) loss 4.2373 (4.8209) grad_norm 1.2905 (1.6549) [2022-01-17 22:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][760/1251] eta 0:18:06 lr 0.000681 time 2.2175 (2.2126) loss 5.3730 (4.8170) grad_norm 1.6346 (1.6554) [2022-01-17 22:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][770/1251] eta 0:17:43 lr 0.000681 time 1.5915 (2.2111) loss 4.2387 (4.8147) grad_norm 1.4011 (1.6525) [2022-01-17 22:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][780/1251] eta 0:17:19 lr 0.000681 time 1.8535 (2.2081) loss 5.1045 (4.8147) grad_norm 1.7626 (1.6558) [2022-01-17 22:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][790/1251] eta 0:16:56 lr 0.000682 time 1.7330 (2.2044) loss 4.0964 (4.8126) grad_norm 1.4607 (1.6565) [2022-01-17 22:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][800/1251] eta 0:16:34 lr 0.000682 time 2.0169 (2.2046) loss 4.6454 (4.8152) grad_norm 1.6108 (1.6549) [2022-01-17 22:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][810/1251] eta 0:16:12 lr 0.000683 time 2.2473 (2.2048) loss 4.9768 (4.8139) grad_norm 1.5663 (1.6568) [2022-01-17 22:14:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][820/1251] eta 0:15:50 lr 0.000683 time 2.2738 (2.2049) loss 5.0937 (4.8144) grad_norm 1.6385 (1.6567) [2022-01-17 22:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][830/1251] eta 0:15:28 lr 0.000683 time 2.6998 (2.2054) loss 5.0943 (4.8149) grad_norm 1.5801 (1.6573) [2022-01-17 22:15:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][840/1251] eta 0:15:06 lr 0.000684 time 2.1909 (2.2049) loss 5.2250 (4.8177) grad_norm 1.6167 (1.6579) [2022-01-17 22:15:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][850/1251] eta 0:14:44 lr 0.000684 time 2.4277 (2.2046) loss 5.1935 (4.8165) grad_norm 1.4363 (1.6567) [2022-01-17 22:16:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][860/1251] eta 0:14:21 lr 0.000685 time 2.1961 (2.2042) loss 5.1070 (4.8200) grad_norm 1.5491 (1.6559) [2022-01-17 22:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][870/1251] eta 0:14:00 lr 0.000685 time 3.0659 (2.2051) loss 5.5415 (4.8191) grad_norm 1.2975 (1.6563) [2022-01-17 22:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][880/1251] eta 0:13:37 lr 0.000685 time 1.5235 (2.2039) loss 3.8661 (4.8184) grad_norm 2.0039 (1.6552) [2022-01-17 22:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][890/1251] eta 0:13:16 lr 0.000686 time 3.1563 (2.2050) loss 4.4634 (4.8199) grad_norm 1.7531 (1.6573) [2022-01-17 22:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][900/1251] eta 0:12:54 lr 0.000686 time 1.5794 (2.2052) loss 4.0845 (4.8221) grad_norm 1.5316 (1.6584) [2022-01-17 22:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][910/1251] eta 0:12:32 lr 0.000687 time 2.8258 (2.2068) loss 5.4422 (4.8234) grad_norm 1.6239 (1.6574) [2022-01-17 22:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][920/1251] eta 0:12:09 lr 0.000687 time 1.6141 (2.2051) loss 5.3347 (4.8202) grad_norm 1.5194 (1.6556) [2022-01-17 22:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][930/1251] eta 0:11:47 lr 0.000687 time 2.7162 (2.2050) loss 4.7140 (4.8164) grad_norm 1.7668 (1.6539) [2022-01-17 22:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][940/1251] eta 0:11:25 lr 0.000688 time 2.3163 (2.2048) loss 4.6990 (4.8183) grad_norm 1.6731 (1.6539) [2022-01-17 22:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][950/1251] eta 0:11:03 lr 0.000688 time 2.3935 (2.2053) loss 4.7417 (4.8178) grad_norm 1.5821 (1.6533) [2022-01-17 22:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][960/1251] eta 0:10:41 lr 0.000689 time 
1.8444 (2.2060) loss 5.2823 (4.8160) grad_norm 1.2379 (1.6529) [2022-01-17 22:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][970/1251] eta 0:10:19 lr 0.000689 time 2.3187 (2.2052) loss 3.9770 (4.8142) grad_norm 1.6574 (1.6516) [2022-01-17 22:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][980/1251] eta 0:09:56 lr 0.000689 time 1.9615 (2.2028) loss 5.2252 (4.8165) grad_norm 1.5389 (1.6500) [2022-01-17 22:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][990/1251] eta 0:09:34 lr 0.000690 time 2.4783 (2.2017) loss 4.3912 (4.8164) grad_norm 1.7936 (1.6486) [2022-01-17 22:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1000/1251] eta 0:09:12 lr 0.000690 time 2.2790 (2.2026) loss 4.7791 (4.8178) grad_norm 1.5560 (1.6504) [2022-01-17 22:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1010/1251] eta 0:08:50 lr 0.000691 time 1.5615 (2.2012) loss 5.4681 (4.8167) grad_norm 1.3099 (1.6502) [2022-01-17 22:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1020/1251] eta 0:08:28 lr 0.000691 time 1.6843 (2.2016) loss 5.6487 (4.8167) grad_norm 1.7940 (1.6499) [2022-01-17 22:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1030/1251] eta 0:08:06 lr 0.000691 time 2.7286 (2.2027) loss 5.1430 (4.8174) grad_norm 1.9672 (1.6493) [2022-01-17 22:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1040/1251] eta 0:07:45 lr 0.000692 time 3.1567 (2.2044) loss 5.2627 (4.8180) grad_norm 1.3989 (1.6488) [2022-01-17 22:23:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1050/1251] eta 0:07:22 lr 0.000692 time 1.5896 (2.2037) loss 4.6716 (4.8190) grad_norm 1.5238 (1.6479) [2022-01-17 22:23:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1060/1251] eta 0:07:00 lr 0.000693 time 1.8980 (2.2020) loss 5.5106 (4.8189) grad_norm 1.3243 (1.6471) [2022-01-17 22:23:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1070/1251] eta 0:06:38 lr 0.000693 time 1.8974 (2.2017) loss 4.9087 (4.8202) grad_norm 1.4961 (1.6465) [2022-01-17 22:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1080/1251] eta 0:06:16 lr 0.000693 time 3.1623 (2.2018) loss 5.0398 (4.8175) grad_norm 1.5248 (1.6457) [2022-01-17 22:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1090/1251] eta 0:05:54 lr 0.000694 time 2.4552 (2.2023) loss 5.1680 (4.8186) grad_norm 1.2864 (1.6449) [2022-01-17 22:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1100/1251] eta 0:05:32 lr 0.000694 time 1.8956 (2.2025) loss 5.3682 (4.8174) grad_norm 1.9819 (1.6458) [2022-01-17 22:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1110/1251] eta 0:05:10 lr 0.000695 time 1.9002 (2.2024) loss 5.5448 (4.8168) grad_norm 1.6558 (1.6452) [2022-01-17 22:25:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1120/1251] eta 0:04:48 lr 0.000695 time 2.8430 (2.2031) loss 4.8878 (4.8142) grad_norm 2.7003 (1.6488) [2022-01-17 22:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1130/1251] eta 0:04:26 lr 0.000695 time 1.8757 (2.2019) loss 5.0023 (4.8159) grad_norm 1.4904 (1.6502) [2022-01-17 22:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1140/1251] eta 0:04:04 lr 0.000696 time 1.9994 (2.2004) loss 4.5248 (4.8167) grad_norm 1.4410 (1.6488) [2022-01-17 22:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1150/1251] eta 0:03:42 lr 0.000696 time 2.5718 (2.1994) loss 5.1475 (4.8175) grad_norm 1.8741 (1.6496) [2022-01-17 22:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1160/1251] eta 0:03:20 lr 0.000697 time 2.2252 (2.1999) loss 4.7691 (4.8197) grad_norm 1.4205 (1.6489) [2022-01-17 22:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1170/1251] eta 0:02:58 lr 
0.000697 time 2.5997 (2.2008) loss 4.6358 (4.8196) grad_norm 1.3901 (1.6482) [2022-01-17 22:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1180/1251] eta 0:02:36 lr 0.000697 time 1.7577 (2.2001) loss 5.0899 (4.8188) grad_norm 1.7647 (1.6490) [2022-01-17 22:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1190/1251] eta 0:02:14 lr 0.000698 time 1.7698 (2.1987) loss 5.0711 (4.8157) grad_norm 2.1116 (1.6490) [2022-01-17 22:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1200/1251] eta 0:01:52 lr 0.000698 time 2.6320 (2.1988) loss 4.6894 (4.8142) grad_norm 1.4872 (1.6484) [2022-01-17 22:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1210/1251] eta 0:01:30 lr 0.000699 time 2.6371 (2.1988) loss 4.8952 (4.8144) grad_norm 1.2335 (1.6476) [2022-01-17 22:29:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1220/1251] eta 0:01:08 lr 0.000699 time 2.1667 (2.1983) loss 5.3942 (4.8143) grad_norm 1.3810 (1.6466) [2022-01-17 22:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1230/1251] eta 0:00:46 lr 0.000699 time 1.6306 (2.1976) loss 4.7717 (4.8141) grad_norm 1.3877 (1.6474) [2022-01-17 22:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1240/1251] eta 0:00:24 lr 0.000700 time 1.5125 (2.1963) loss 4.5147 (4.8110) grad_norm 1.6771 (1.6486) [2022-01-17 22:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [13/300][1250/1251] eta 0:00:02 lr 0.000700 time 1.1912 (2.1909) loss 4.8077 (4.8121) grad_norm 1.4767 (1.6494) [2022-01-17 22:30:22 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 13 training takes 0:45:41 [2022-01-17 22:30:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.447 (18.447) Loss 2.0841 (2.0841) Acc@1 53.613 (53.613) Acc@5 77.246 (77.246) [2022-01-17 22:31:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.666 (3.569) Loss 2.1463 (2.1542) 
Acc@1 51.074 (52.459) Acc@5 78.516 (77.290) [2022-01-17 22:31:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.591 (2.609) Loss 2.1923 (2.1489) Acc@1 52.930 (52.618) Acc@5 76.465 (77.646) [2022-01-17 22:31:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.993 (2.332) Loss 2.2026 (2.1541) Acc@1 53.027 (52.593) Acc@5 77.441 (77.577) [2022-01-17 22:31:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.272 (2.178) Loss 2.1868 (2.1523) Acc@1 53.809 (52.780) Acc@5 77.051 (77.703) [2022-01-17 22:31:58 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 52.758 Acc@5 77.754 [2022-01-17 22:31:58 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 52.8% [2022-01-17 22:31:58 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 52.76% [2022-01-17 22:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][0/1251] eta 7:30:41 lr 0.000700 time 21.6159 (21.6159) loss 5.1464 (5.1464) grad_norm 1.5118 (1.5118) [2022-01-17 22:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][10/1251] eta 1:25:02 lr 0.000701 time 2.8760 (4.1115) loss 5.2567 (4.9261) grad_norm 1.5558 (1.7024) [2022-01-17 22:33:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][20/1251] eta 1:03:55 lr 0.000701 time 1.5870 (3.1159) loss 4.2377 (4.7741) grad_norm 1.4374 (1.6654) [2022-01-17 22:33:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][30/1251] eta 0:57:11 lr 0.000701 time 1.5379 (2.8106) loss 4.3499 (4.8341) grad_norm 1.3574 (1.6229) [2022-01-17 22:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][40/1251] eta 0:54:12 lr 0.000702 time 3.6387 (2.6858) loss 4.5426 (4.7675) grad_norm 1.3057 (1.6662) [2022-01-17 22:34:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][50/1251] eta 0:52:35 lr 0.000702 time 2.1645 (2.6277) loss 5.1063 (4.7489) grad_norm 1.7105 (1.6720) 
[2022-01-17 22:34:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][60/1251] eta 0:50:54 lr 0.000703 time 1.6752 (2.5649) loss 4.6324 (4.7819) grad_norm 1.7996 (1.6709) [2022-01-17 22:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][70/1251] eta 0:48:46 lr 0.000703 time 1.9141 (2.4777) loss 3.8566 (4.7794) grad_norm 1.1329 (1.6961) [2022-01-17 22:35:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][80/1251] eta 0:47:50 lr 0.000703 time 3.6362 (2.4509) loss 4.2360 (4.7951) grad_norm 1.6293 (1.6824) [2022-01-17 22:35:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][90/1251] eta 0:47:31 lr 0.000704 time 1.7107 (2.4565) loss 5.0819 (4.8090) grad_norm 1.5050 (1.7034) [2022-01-17 22:36:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][100/1251] eta 0:46:33 lr 0.000704 time 1.5945 (2.4271) loss 4.3935 (4.8128) grad_norm 1.3119 (1.7065) [2022-01-17 22:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][110/1251] eta 0:45:49 lr 0.000705 time 1.9039 (2.4095) loss 4.8158 (4.8084) grad_norm 1.4470 (1.6933) [2022-01-17 22:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][120/1251] eta 0:45:08 lr 0.000705 time 2.8606 (2.3949) loss 4.7730 (4.7982) grad_norm 1.7588 (1.6841) [2022-01-17 22:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][130/1251] eta 0:44:41 lr 0.000705 time 1.9109 (2.3923) loss 4.0197 (4.8046) grad_norm 1.9259 (1.6686) [2022-01-17 22:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][140/1251] eta 0:43:44 lr 0.000706 time 1.8247 (2.3626) loss 5.1219 (4.7994) grad_norm 1.4958 (1.6686) [2022-01-17 22:37:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][150/1251] eta 0:42:54 lr 0.000706 time 1.7260 (2.3381) loss 4.6580 (4.8011) grad_norm 1.7176 (1.6629) [2022-01-17 22:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][160/1251] eta 0:42:22 lr 
0.000707 time 2.5093 (2.3309) loss 5.4090 (4.8190) grad_norm 1.3723 (1.6625) [2022-01-17 22:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][170/1251] eta 0:42:04 lr 0.000707 time 2.1976 (2.3357) loss 3.8572 (4.7984) grad_norm 1.4275 (1.6586) [2022-01-17 22:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][180/1251] eta 0:41:37 lr 0.000707 time 2.4284 (2.3320) loss 4.4711 (4.7678) grad_norm 2.6555 (1.6581) [2022-01-17 22:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][190/1251] eta 0:41:02 lr 0.000708 time 1.7635 (2.3212) loss 5.0304 (4.7872) grad_norm 1.3476 (1.6533) [2022-01-17 22:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][200/1251] eta 0:40:20 lr 0.000708 time 1.9340 (2.3027) loss 5.0752 (4.7822) grad_norm 1.6596 (1.6520) [2022-01-17 22:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][210/1251] eta 0:39:35 lr 0.000709 time 1.9495 (2.2821) loss 4.8158 (4.7755) grad_norm 1.3858 (1.6506) [2022-01-17 22:40:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][220/1251] eta 0:39:07 lr 0.000709 time 1.8191 (2.2767) loss 4.0399 (4.7710) grad_norm 2.0063 (1.6510) [2022-01-17 22:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][230/1251] eta 0:38:37 lr 0.000709 time 2.0243 (2.2703) loss 4.8088 (4.7763) grad_norm 1.6186 (1.6488) [2022-01-17 22:41:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][240/1251] eta 0:38:19 lr 0.000710 time 2.2051 (2.2748) loss 5.2476 (4.7887) grad_norm 1.6324 (1.6458) [2022-01-17 22:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][250/1251] eta 0:37:53 lr 0.000710 time 2.0004 (2.2712) loss 5.7430 (4.7874) grad_norm 1.3228 (1.6439) [2022-01-17 22:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][260/1251] eta 0:37:42 lr 0.000711 time 3.0762 (2.2828) loss 4.2583 (4.7866) grad_norm 1.7619 (1.6470) [2022-01-17 22:42:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][270/1251] eta 0:37:23 lr 0.000711 time 1.8180 (2.2868) loss 4.1498 (4.7812) grad_norm 1.4478 (1.6440) [2022-01-17 22:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][280/1251] eta 0:36:51 lr 0.000711 time 1.8821 (2.2776) loss 4.9653 (4.7824) grad_norm 1.4600 (1.6448) [2022-01-17 22:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][290/1251] eta 0:36:17 lr 0.000712 time 1.8937 (2.2656) loss 5.1286 (4.7805) grad_norm 1.6320 (1.6426) [2022-01-17 22:43:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][300/1251] eta 0:35:44 lr 0.000712 time 2.0202 (2.2552) loss 4.6579 (4.7803) grad_norm 1.4879 (1.6380) [2022-01-17 22:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][310/1251] eta 0:35:13 lr 0.000713 time 1.9390 (2.2459) loss 4.5156 (4.7734) grad_norm 1.2618 (1.6355) [2022-01-17 22:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][320/1251] eta 0:34:44 lr 0.000713 time 1.9196 (2.2390) loss 5.0526 (4.7722) grad_norm 1.5533 (1.6377) [2022-01-17 22:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][330/1251] eta 0:34:19 lr 0.000713 time 1.8473 (2.2361) loss 4.8763 (4.7750) grad_norm 1.5037 (1.6375) [2022-01-17 22:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][340/1251] eta 0:34:01 lr 0.000714 time 2.0815 (2.2406) loss 4.7721 (4.7699) grad_norm 1.3386 (1.6334) [2022-01-17 22:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][350/1251] eta 0:33:50 lr 0.000714 time 2.5732 (2.2538) loss 4.0680 (4.7699) grad_norm 1.5934 (1.6283) [2022-01-17 22:45:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][360/1251] eta 0:33:31 lr 0.000715 time 2.1061 (2.2578) loss 5.4786 (4.7729) grad_norm 1.5349 (1.6273) [2022-01-17 22:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][370/1251] eta 0:33:07 lr 0.000715 time 
1.9289 (2.2565) loss 5.1421 (4.7668) grad_norm 2.0405 (1.6259) [2022-01-17 22:46:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][380/1251] eta 0:32:39 lr 0.000715 time 1.7691 (2.2497) loss 4.3143 (4.7687) grad_norm 1.6982 (1.6269) [2022-01-17 22:46:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][390/1251] eta 0:32:10 lr 0.000716 time 1.8108 (2.2427) loss 5.3962 (4.7681) grad_norm 1.5413 (1.6313) [2022-01-17 22:46:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][400/1251] eta 0:31:45 lr 0.000716 time 2.1839 (2.2397) loss 5.0374 (4.7721) grad_norm 1.4674 (1.6285) [2022-01-17 22:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][410/1251] eta 0:31:19 lr 0.000717 time 2.0381 (2.2348) loss 5.5047 (4.7752) grad_norm 1.5635 (1.6253) [2022-01-17 22:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][420/1251] eta 0:30:53 lr 0.000717 time 1.8583 (2.2308) loss 4.0484 (4.7695) grad_norm 1.4700 (1.6244) [2022-01-17 22:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][430/1251] eta 0:30:31 lr 0.000717 time 2.1655 (2.2309) loss 4.8412 (4.7723) grad_norm 1.7352 (1.6267) [2022-01-17 22:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][440/1251] eta 0:30:12 lr 0.000718 time 3.4656 (2.2349) loss 5.3247 (4.7659) grad_norm 1.4932 (1.6278) [2022-01-17 22:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][450/1251] eta 0:29:51 lr 0.000718 time 2.0387 (2.2366) loss 4.7148 (4.7654) grad_norm 1.1768 (1.6232) [2022-01-17 22:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][460/1251] eta 0:29:32 lr 0.000719 time 1.8957 (2.2404) loss 5.1366 (4.7688) grad_norm 1.7944 (1.6249) [2022-01-17 22:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][470/1251] eta 0:29:08 lr 0.000719 time 2.1979 (2.2389) loss 5.6355 (4.7619) grad_norm 1.4284 (1.6205) [2022-01-17 22:49:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][480/1251] eta 0:28:44 lr 0.000719 time 2.4467 (2.2372) loss 4.2773 (4.7588) grad_norm 1.4964 (1.6192) [2022-01-17 22:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][490/1251] eta 0:28:19 lr 0.000720 time 1.9005 (2.2336) loss 4.9601 (4.7609) grad_norm 1.6389 (1.6180) [2022-01-17 22:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][500/1251] eta 0:27:56 lr 0.000720 time 2.5513 (2.2319) loss 5.5764 (4.7634) grad_norm 1.5124 (1.6218) [2022-01-17 22:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][510/1251] eta 0:27:33 lr 0.000721 time 2.3734 (2.2308) loss 3.8566 (4.7648) grad_norm 1.7414 (1.6255) [2022-01-17 22:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][520/1251] eta 0:27:12 lr 0.000721 time 3.3329 (2.2330) loss 5.0749 (4.7635) grad_norm 1.4870 (1.6239) [2022-01-17 22:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][530/1251] eta 0:26:48 lr 0.000721 time 2.4078 (2.2313) loss 4.9822 (4.7635) grad_norm 1.8157 (1.6218) [2022-01-17 22:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][540/1251] eta 0:26:24 lr 0.000722 time 2.5775 (2.2288) loss 3.8987 (4.7605) grad_norm 1.4422 (1.6205) [2022-01-17 22:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][550/1251] eta 0:26:00 lr 0.000722 time 1.9956 (2.2261) loss 3.5749 (4.7608) grad_norm 1.3264 (1.6156) [2022-01-17 22:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][560/1251] eta 0:25:37 lr 0.000723 time 2.1978 (2.2252) loss 4.4724 (4.7645) grad_norm 1.5563 (1.6152) [2022-01-17 22:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][570/1251] eta 0:25:12 lr 0.000723 time 1.8791 (2.2216) loss 4.6629 (4.7658) grad_norm 1.5218 (1.6141) [2022-01-17 22:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][580/1251] eta 0:24:53 lr 0.000723 time 
2.7346 (2.2255) loss 5.8026 (4.7720) grad_norm 1.3012 (1.6127) [2022-01-17 22:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][590/1251] eta 0:24:30 lr 0.000724 time 2.4068 (2.2244) loss 4.1132 (4.7775) grad_norm 1.4460 (1.6110) [2022-01-17 22:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][600/1251] eta 0:24:07 lr 0.000724 time 2.1961 (2.2234) loss 4.8387 (4.7782) grad_norm 1.4325 (1.6096) [2022-01-17 22:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][610/1251] eta 0:23:45 lr 0.000725 time 2.4568 (2.2233) loss 5.1196 (4.7766) grad_norm 1.5625 (1.6099) [2022-01-17 22:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][620/1251] eta 0:23:23 lr 0.000725 time 2.8382 (2.2235) loss 4.4660 (4.7737) grad_norm 1.6390 (1.6089) [2022-01-17 22:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][630/1251] eta 0:22:58 lr 0.000725 time 1.8418 (2.2200) loss 5.5457 (4.7759) grad_norm 1.7205 (1.6099) [2022-01-17 22:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][640/1251] eta 0:22:37 lr 0.000726 time 1.8858 (2.2214) loss 4.8427 (4.7742) grad_norm 1.5663 (1.6081) [2022-01-17 22:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][650/1251] eta 0:22:15 lr 0.000726 time 1.7937 (2.2221) loss 4.0033 (4.7720) grad_norm 1.4720 (1.6088) [2022-01-17 22:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][660/1251] eta 0:21:53 lr 0.000727 time 3.4132 (2.2228) loss 4.2616 (4.7755) grad_norm 1.4664 (1.6051) [2022-01-17 22:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][670/1251] eta 0:21:29 lr 0.000727 time 1.6119 (2.2193) loss 4.2368 (4.7744) grad_norm 2.1559 (1.6054) [2022-01-17 22:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][680/1251] eta 0:21:06 lr 0.000727 time 2.2728 (2.2181) loss 3.9989 (4.7743) grad_norm 1.5583 (1.6068) [2022-01-17 22:57:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][690/1251] eta 0:20:44 lr 0.000728 time 1.9287 (2.2182) loss 4.5546 (4.7749) grad_norm 1.5561 (1.6057) [2022-01-17 22:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][700/1251] eta 0:20:23 lr 0.000728 time 2.8628 (2.2207) loss 5.0375 (4.7772) grad_norm 2.0321 (1.6043) [2022-01-17 22:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][710/1251] eta 0:20:00 lr 0.000729 time 1.7764 (2.2197) loss 3.7258 (4.7773) grad_norm 1.9131 (1.6033) [2022-01-17 22:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][720/1251] eta 0:19:37 lr 0.000729 time 1.9811 (2.2176) loss 5.2149 (4.7793) grad_norm 1.4511 (1.6053) [2022-01-17 22:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][730/1251] eta 0:19:13 lr 0.000729 time 2.2703 (2.2141) loss 4.9451 (4.7810) grad_norm 1.6248 (1.6065) [2022-01-17 22:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][740/1251] eta 0:18:51 lr 0.000730 time 2.1292 (2.2144) loss 5.3266 (4.7832) grad_norm 1.4692 (1.6074) [2022-01-17 22:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][750/1251] eta 0:18:29 lr 0.000730 time 1.9217 (2.2147) loss 4.8365 (4.7858) grad_norm 1.4626 (1.6076) [2022-01-17 23:00:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][760/1251] eta 0:18:07 lr 0.000731 time 2.8319 (2.2140) loss 4.0831 (4.7876) grad_norm 1.5993 (1.6075) [2022-01-17 23:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][770/1251] eta 0:17:44 lr 0.000731 time 1.9401 (2.2126) loss 5.2240 (4.7903) grad_norm 1.6000 (1.6068) [2022-01-17 23:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][780/1251] eta 0:17:21 lr 0.000731 time 1.8863 (2.2110) loss 4.6509 (4.7879) grad_norm 1.8965 (1.6111) [2022-01-17 23:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][790/1251] eta 0:16:59 lr 0.000732 time 
2.5460 (2.2114) loss 5.4776 (4.7905) grad_norm 1.2082 (1.6113) [2022-01-17 23:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][800/1251] eta 0:16:37 lr 0.000732 time 2.4982 (2.2117) loss 3.6896 (4.7893) grad_norm 1.7249 (1.6122) [2022-01-17 23:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][810/1251] eta 0:16:15 lr 0.000733 time 2.5076 (2.2109) loss 5.1397 (4.7924) grad_norm 1.4085 (1.6116) [2022-01-17 23:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][820/1251] eta 0:15:52 lr 0.000733 time 2.6633 (2.2102) loss 5.3630 (4.7929) grad_norm 1.4418 (1.6123) [2022-01-17 23:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][830/1251] eta 0:15:30 lr 0.000733 time 2.5467 (2.2103) loss 4.9133 (4.7908) grad_norm 1.7111 (1.6111) [2022-01-17 23:02:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][840/1251] eta 0:15:07 lr 0.000734 time 1.9685 (2.2088) loss 3.5048 (4.7876) grad_norm 1.4060 (1.6100) [2022-01-17 23:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][850/1251] eta 0:14:46 lr 0.000734 time 2.5613 (2.2100) loss 4.7432 (4.7866) grad_norm 1.4767 (1.6081) [2022-01-17 23:03:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][860/1251] eta 0:14:24 lr 0.000735 time 1.5635 (2.2103) loss 5.3529 (4.7865) grad_norm 1.3529 (1.6068) [2022-01-17 23:04:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][870/1251] eta 0:14:02 lr 0.000735 time 1.9520 (2.2102) loss 5.4854 (4.7889) grad_norm 1.5494 (1.6051) [2022-01-17 23:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][880/1251] eta 0:13:39 lr 0.000735 time 2.1792 (2.2102) loss 3.6118 (4.7882) grad_norm 1.8476 (1.6068) [2022-01-17 23:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][890/1251] eta 0:13:17 lr 0.000736 time 2.1601 (2.2101) loss 4.9819 (4.7905) grad_norm 1.9590 (1.6094) [2022-01-17 23:05:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][900/1251] eta 0:12:55 lr 0.000736 time 1.6147 (2.2089) loss 4.3175 (4.7921) grad_norm 1.6117 (1.6096) [2022-01-17 23:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][910/1251] eta 0:12:32 lr 0.000737 time 2.1563 (2.2078) loss 4.0680 (4.7912) grad_norm 1.6676 (1.6080) [2022-01-17 23:05:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][920/1251] eta 0:12:10 lr 0.000737 time 1.7221 (2.2068) loss 3.8459 (4.7890) grad_norm 1.6591 (1.6076) [2022-01-17 23:06:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][930/1251] eta 0:11:47 lr 0.000737 time 2.1538 (2.2055) loss 5.1069 (4.7858) grad_norm 1.4568 (1.6068) [2022-01-17 23:06:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][940/1251] eta 0:11:26 lr 0.000738 time 2.1565 (2.2063) loss 4.6717 (4.7846) grad_norm 1.8427 (1.6060) [2022-01-17 23:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][950/1251] eta 0:11:04 lr 0.000738 time 2.2220 (2.2065) loss 4.2720 (4.7834) grad_norm 1.2712 (1.6053) [2022-01-17 23:07:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][960/1251] eta 0:10:42 lr 0.000739 time 1.9293 (2.2063) loss 5.2352 (4.7851) grad_norm 1.3900 (1.6035) [2022-01-17 23:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][970/1251] eta 0:10:19 lr 0.000739 time 1.7685 (2.2059) loss 3.7884 (4.7845) grad_norm 1.2033 (1.6021) [2022-01-17 23:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][980/1251] eta 0:09:57 lr 0.000739 time 2.0147 (2.2053) loss 5.6192 (4.7856) grad_norm 2.1002 (1.6025) [2022-01-17 23:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][990/1251] eta 0:09:35 lr 0.000740 time 2.1286 (2.2055) loss 3.9180 (4.7865) grad_norm 1.4700 (1.6008) [2022-01-17 23:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1000/1251] eta 0:09:13 lr 0.000740 time 
1.8257 (2.2046) loss 4.3475 (4.7854) grad_norm 1.7035 (1.6011) [2022-01-17 23:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1010/1251] eta 0:08:50 lr 0.000741 time 2.4012 (2.2032) loss 4.0255 (4.7864) grad_norm 1.5458 (1.5995) [2022-01-17 23:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1020/1251] eta 0:08:28 lr 0.000741 time 2.0163 (2.2024) loss 3.5106 (4.7857) grad_norm 1.5514 (1.6005) [2022-01-17 23:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1030/1251] eta 0:08:06 lr 0.000741 time 2.2057 (2.2025) loss 4.9130 (4.7804) grad_norm 1.2326 (1.5997) [2022-01-17 23:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1040/1251] eta 0:07:44 lr 0.000742 time 2.5959 (2.2025) loss 5.5391 (4.7823) grad_norm 1.6907 (1.6001) [2022-01-17 23:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1050/1251] eta 0:07:22 lr 0.000742 time 1.7306 (2.2028) loss 3.7635 (4.7807) grad_norm 1.9641 (1.5989) [2022-01-17 23:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1060/1251] eta 0:07:00 lr 0.000743 time 2.0059 (2.2029) loss 3.8004 (4.7804) grad_norm 1.5569 (1.5980) [2022-01-17 23:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1070/1251] eta 0:06:38 lr 0.000743 time 2.2197 (2.2032) loss 4.5503 (4.7797) grad_norm 1.6468 (1.5962) [2022-01-17 23:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1080/1251] eta 0:06:16 lr 0.000743 time 2.0721 (2.2021) loss 3.9162 (4.7788) grad_norm 1.6717 (1.5955) [2022-01-17 23:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1090/1251] eta 0:05:54 lr 0.000744 time 1.8684 (2.2014) loss 4.3687 (4.7791) grad_norm 1.3867 (1.5946) [2022-01-17 23:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1100/1251] eta 0:05:32 lr 0.000744 time 1.5883 (2.1996) loss 5.0859 (4.7788) grad_norm 1.5441 (1.5932) [2022-01-17 23:12:42 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1110/1251] eta 0:05:10 lr 0.000745 time 1.8482 (2.1995) loss 4.3566 (4.7756) grad_norm 1.7642 (1.5924) [2022-01-17 23:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1120/1251] eta 0:04:48 lr 0.000745 time 1.5440 (2.1988) loss 4.8998 (4.7733) grad_norm 1.2117 (1.5919) [2022-01-17 23:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1130/1251] eta 0:04:26 lr 0.000745 time 2.5435 (2.1996) loss 5.1463 (4.7751) grad_norm 1.9655 (1.5915) [2022-01-17 23:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1140/1251] eta 0:04:04 lr 0.000746 time 3.0489 (2.2023) loss 4.8872 (4.7739) grad_norm 1.4621 (1.5913) [2022-01-17 23:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1150/1251] eta 0:03:42 lr 0.000746 time 2.0636 (2.2025) loss 5.0717 (4.7739) grad_norm 1.7637 (1.5908) [2022-01-17 23:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1160/1251] eta 0:03:20 lr 0.000747 time 2.1094 (2.2028) loss 3.6153 (4.7724) grad_norm 1.5093 (1.5903) [2022-01-17 23:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1170/1251] eta 0:02:58 lr 0.000747 time 2.0122 (2.2016) loss 3.5619 (4.7723) grad_norm 1.4999 (1.5894) [2022-01-17 23:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1180/1251] eta 0:02:36 lr 0.000747 time 1.6274 (2.1987) loss 5.2969 (4.7747) grad_norm 1.5610 (1.5892) [2022-01-17 23:15:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1190/1251] eta 0:02:14 lr 0.000748 time 2.9492 (2.1972) loss 4.1710 (4.7738) grad_norm 1.3704 (1.5889) [2022-01-17 23:15:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1200/1251] eta 0:01:52 lr 0.000748 time 1.6448 (2.1986) loss 4.8347 (4.7720) grad_norm 1.3829 (1.5890) [2022-01-17 23:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1210/1251] eta 0:01:30 lr 
0.000749 time 2.5297 (2.1989) loss 4.5808 (4.7710) grad_norm 1.4241 (1.5869) [2022-01-17 23:16:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1220/1251] eta 0:01:08 lr 0.000749 time 1.8682 (2.1999) loss 5.0036 (4.7685) grad_norm 1.2223 (1.5851) [2022-01-17 23:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1230/1251] eta 0:00:46 lr 0.000749 time 3.3796 (2.2012) loss 4.4488 (4.7664) grad_norm 1.4865 (1.5851) [2022-01-17 23:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1240/1251] eta 0:00:24 lr 0.000750 time 1.2257 (2.1990) loss 4.8822 (4.7656) grad_norm 1.8207 (1.5845) [2022-01-17 23:17:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [14/300][1250/1251] eta 0:00:02 lr 0.000750 time 1.2087 (2.1934) loss 5.0946 (4.7639) grad_norm 1.2766 (1.5837) [2022-01-17 23:17:42 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 14 training takes 0:45:44 [2022-01-17 23:18:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.413 (18.413) Loss 2.0620 (2.0620) Acc@1 55.664 (55.664) Acc@5 79.199 (79.199) [2022-01-17 23:18:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.640 (3.290) Loss 2.0729 (2.0976) Acc@1 55.957 (54.181) Acc@5 78.125 (78.995) [2022-01-17 23:18:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.962 (2.538) Loss 2.0862 (2.0991) Acc@1 53.320 (54.367) Acc@5 79.199 (78.664) [2022-01-17 23:18:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.928 (2.206) Loss 2.1783 (2.1031) Acc@1 52.539 (54.177) Acc@5 77.539 (78.456) [2022-01-17 23:19:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 5.162 (2.165) Loss 2.0440 (2.0981) Acc@1 55.078 (54.185) Acc@5 78.809 (78.582) [2022-01-17 23:19:18 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 54.102 Acc@5 78.530 [2022-01-17 23:19:18 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test 
images: 54.1% [2022-01-17 23:19:18 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 54.10% [2022-01-17 23:19:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][0/1251] eta 7:41:14 lr 0.000750 time 22.1219 (22.1219) loss 5.0432 (5.0432) grad_norm 1.4553 (1.4553) [2022-01-17 23:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][10/1251] eta 1:26:27 lr 0.000751 time 2.7898 (4.1799) loss 4.9842 (4.9506) grad_norm 1.2969 (1.5350) [2022-01-17 23:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][20/1251] eta 1:08:59 lr 0.000751 time 1.9815 (3.3629) loss 5.0798 (4.8954) grad_norm 1.5564 (1.4991) [2022-01-17 23:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][30/1251] eta 1:00:51 lr 0.000751 time 1.9919 (2.9903) loss 5.0260 (4.8239) grad_norm 2.1419 (1.5654) [2022-01-17 23:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][40/1251] eta 0:57:17 lr 0.000752 time 2.6041 (2.8389) loss 3.9327 (4.7850) grad_norm 1.2632 (1.5436) [2022-01-17 23:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][50/1251] eta 0:53:46 lr 0.000752 time 2.2613 (2.6869) loss 5.1684 (4.8075) grad_norm 2.1854 (1.5708) [2022-01-17 23:21:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][60/1251] eta 0:51:00 lr 0.000753 time 1.8105 (2.5693) loss 5.1927 (4.7798) grad_norm 1.7572 (1.5873) [2022-01-17 23:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][70/1251] eta 0:49:10 lr 0.000753 time 2.0403 (2.4982) loss 3.4987 (4.7494) grad_norm 1.3505 (1.5820) [2022-01-17 23:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][80/1251] eta 0:48:06 lr 0.000753 time 2.8491 (2.4646) loss 4.8864 (4.7303) grad_norm 1.8720 (1.5860) [2022-01-17 23:22:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][90/1251] eta 0:46:52 lr 0.000754 time 1.7618 (2.4227) loss 5.1777 (4.7070) grad_norm 1.6753 (1.6110) [2022-01-17 
23:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][100/1251] eta 0:45:57 lr 0.000754 time 1.9816 (2.3957) loss 5.1543 (4.7362) grad_norm 1.5577 (1.6023) [2022-01-17 23:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][110/1251] eta 0:45:11 lr 0.000755 time 1.8849 (2.3762) loss 4.9138 (4.7266) grad_norm 1.1512 (1.5930) [2022-01-17 23:24:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][120/1251] eta 0:44:39 lr 0.000755 time 2.5979 (2.3692) loss 4.7689 (4.7341) grad_norm 1.7176 (1.5950) [2022-01-17 23:24:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][130/1251] eta 0:44:14 lr 0.000755 time 3.1629 (2.3680) loss 4.8962 (4.7591) grad_norm 1.5994 (1.5856) [2022-01-17 23:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][140/1251] eta 0:43:36 lr 0.000756 time 2.4263 (2.3550) loss 4.9866 (4.7589) grad_norm 1.7430 (1.5797) [2022-01-17 23:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][150/1251] eta 0:43:03 lr 0.000756 time 2.4780 (2.3463) loss 5.0276 (4.7494) grad_norm 1.7388 (1.5759) [2022-01-17 23:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][160/1251] eta 0:42:23 lr 0.000757 time 2.3268 (2.3310) loss 5.2150 (4.7532) grad_norm 1.4495 (1.5661) [2022-01-17 23:25:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][170/1251] eta 0:41:41 lr 0.000757 time 1.9582 (2.3144) loss 5.1645 (4.7469) grad_norm 1.4514 (1.5693) [2022-01-17 23:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][180/1251] eta 0:41:05 lr 0.000757 time 2.2144 (2.3023) loss 4.7807 (4.7351) grad_norm 1.3979 (1.5648) [2022-01-17 23:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][190/1251] eta 0:40:34 lr 0.000758 time 1.8269 (2.2948) loss 4.3547 (4.7155) grad_norm 1.3629 (1.5738) [2022-01-17 23:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][200/1251] eta 0:40:10 lr 0.000758 
time 2.4519 (2.2935) loss 5.4049 (4.7184) grad_norm 1.7722 (1.5794) [2022-01-17 23:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][210/1251] eta 0:39:39 lr 0.000759 time 2.6056 (2.2857) loss 5.4527 (4.7351) grad_norm 1.7362 (1.5892) [2022-01-17 23:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][220/1251] eta 0:39:08 lr 0.000759 time 1.8654 (2.2778) loss 4.0049 (4.7266) grad_norm 2.1260 (1.5877) [2022-01-17 23:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][230/1251] eta 0:38:41 lr 0.000759 time 2.1959 (2.2735) loss 5.0750 (4.7236) grad_norm 1.7940 (1.5910) [2022-01-17 23:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][240/1251] eta 0:38:15 lr 0.000760 time 2.1715 (2.2704) loss 4.0609 (4.7218) grad_norm 1.1222 (1.5921) [2022-01-17 23:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][250/1251] eta 0:37:49 lr 0.000760 time 2.4978 (2.2672) loss 4.8100 (4.7216) grad_norm 1.4057 (1.5827) [2022-01-17 23:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][260/1251] eta 0:37:19 lr 0.000761 time 2.0249 (2.2598) loss 3.6104 (4.7188) grad_norm 1.7486 (1.5811) [2022-01-17 23:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][270/1251] eta 0:36:57 lr 0.000761 time 1.9671 (2.2604) loss 4.2613 (4.7132) grad_norm 1.4869 (1.5842) [2022-01-17 23:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][280/1251] eta 0:36:29 lr 0.000761 time 1.8912 (2.2552) loss 4.7207 (4.7042) grad_norm 1.7046 (1.5868) [2022-01-17 23:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][290/1251] eta 0:36:03 lr 0.000762 time 2.1562 (2.2517) loss 4.8800 (4.7062) grad_norm 1.7747 (1.5847) [2022-01-17 23:30:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][300/1251] eta 0:35:38 lr 0.000762 time 2.8113 (2.2489) loss 5.2174 (4.7100) grad_norm 1.2533 (1.5868) [2022-01-17 23:30:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][310/1251] eta 0:35:15 lr 0.000763 time 1.9030 (2.2487) loss 5.5372 (4.7122) grad_norm 1.3439 (1.5815) [2022-01-17 23:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][320/1251] eta 0:34:48 lr 0.000763 time 1.5870 (2.2430) loss 4.8194 (4.7189) grad_norm 1.4937 (1.5825) [2022-01-17 23:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][330/1251] eta 0:34:25 lr 0.000763 time 1.6720 (2.2422) loss 5.5118 (4.7152) grad_norm 1.4212 (1.5757) [2022-01-17 23:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][340/1251] eta 0:34:04 lr 0.000764 time 2.7583 (2.2441) loss 5.1542 (4.7229) grad_norm 1.1215 (1.5689) [2022-01-17 23:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][350/1251] eta 0:33:41 lr 0.000764 time 1.5290 (2.2438) loss 3.2718 (4.7252) grad_norm 1.6095 (1.5679) [2022-01-17 23:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][360/1251] eta 0:33:16 lr 0.000765 time 2.2547 (2.2407) loss 5.3766 (4.7243) grad_norm 1.2387 (1.5688) [2022-01-17 23:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][370/1251] eta 0:32:52 lr 0.000765 time 2.1544 (2.2387) loss 4.5587 (4.7135) grad_norm 1.5025 (1.5666) [2022-01-17 23:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][380/1251] eta 0:32:31 lr 0.000765 time 2.8323 (2.2403) loss 5.2501 (4.7155) grad_norm 1.5630 (1.5690) [2022-01-17 23:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][390/1251] eta 0:32:09 lr 0.000766 time 1.8764 (2.2406) loss 4.3045 (4.7147) grad_norm 1.9070 (1.5662) [2022-01-17 23:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][400/1251] eta 0:31:44 lr 0.000766 time 1.8692 (2.2375) loss 4.3728 (4.7139) grad_norm 1.3413 (1.5635) [2022-01-17 23:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][410/1251] eta 0:31:16 lr 0.000767 time 
2.1250 (2.2317) loss 5.1385 (4.7227) grad_norm 1.3706 (1.5591) [2022-01-17 23:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][420/1251] eta 0:30:54 lr 0.000767 time 2.7989 (2.2317) loss 4.7283 (4.7244) grad_norm 1.2130 (1.5562) [2022-01-17 23:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][430/1251] eta 0:30:32 lr 0.000767 time 1.9718 (2.2323) loss 4.9640 (4.7240) grad_norm 1.3001 (1.5544) [2022-01-17 23:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][440/1251] eta 0:30:08 lr 0.000768 time 1.6114 (2.2302) loss 5.2421 (4.7258) grad_norm 1.5714 (1.5532) [2022-01-17 23:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][450/1251] eta 0:29:45 lr 0.000768 time 1.8521 (2.2294) loss 5.1678 (4.7310) grad_norm 1.4733 (1.5491) [2022-01-17 23:36:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][460/1251] eta 0:29:23 lr 0.000769 time 2.2247 (2.2290) loss 5.0124 (4.7357) grad_norm 1.6670 (1.5484) [2022-01-17 23:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][470/1251] eta 0:29:00 lr 0.000769 time 1.7026 (2.2284) loss 5.5415 (4.7403) grad_norm 1.2135 (1.5515) [2022-01-17 23:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][480/1251] eta 0:28:36 lr 0.000769 time 1.5968 (2.2260) loss 3.6695 (4.7368) grad_norm 1.2337 (1.5494) [2022-01-17 23:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][490/1251] eta 0:28:12 lr 0.000770 time 1.7894 (2.2237) loss 3.5709 (4.7332) grad_norm 1.5477 (1.5488) [2022-01-17 23:37:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][500/1251] eta 0:27:47 lr 0.000770 time 1.9493 (2.2207) loss 4.2472 (4.7311) grad_norm 1.3929 (1.5482) [2022-01-17 23:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][510/1251] eta 0:27:23 lr 0.000771 time 1.8839 (2.2183) loss 3.3614 (4.7269) grad_norm 1.1390 (1.5435) [2022-01-17 23:38:33 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][520/1251] eta 0:27:00 lr 0.000771 time 2.2034 (2.2174) loss 5.2450 (4.7255) grad_norm 1.6947 (1.5460) [2022-01-17 23:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][530/1251] eta 0:26:36 lr 0.000771 time 1.9620 (2.2148) loss 4.9563 (4.7283) grad_norm 1.7046 (1.5460) [2022-01-17 23:39:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][540/1251] eta 0:26:13 lr 0.000772 time 1.7960 (2.2126) loss 4.4572 (4.7265) grad_norm 1.2479 (1.5456) [2022-01-17 23:39:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][550/1251] eta 0:25:48 lr 0.000772 time 1.8912 (2.2088) loss 3.9067 (4.7256) grad_norm 1.4712 (1.5462) [2022-01-17 23:39:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][560/1251] eta 0:25:26 lr 0.000773 time 2.1492 (2.2085) loss 3.7760 (4.7212) grad_norm 1.4521 (1.5452) [2022-01-17 23:40:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][570/1251] eta 0:25:02 lr 0.000773 time 1.2497 (2.2068) loss 4.8884 (4.7194) grad_norm 1.5274 (1.5453) [2022-01-17 23:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][580/1251] eta 0:24:41 lr 0.000773 time 2.4971 (2.2076) loss 4.7170 (4.7235) grad_norm 1.4878 (1.5427) [2022-01-17 23:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][590/1251] eta 0:24:19 lr 0.000774 time 2.4102 (2.2077) loss 3.6953 (4.7225) grad_norm 1.3900 (1.5414) [2022-01-17 23:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][600/1251] eta 0:23:59 lr 0.000774 time 2.9265 (2.2116) loss 4.3757 (4.7251) grad_norm 1.2782 (1.5415) [2022-01-17 23:41:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][610/1251] eta 0:23:38 lr 0.000775 time 2.2907 (2.2134) loss 5.2342 (4.7215) grad_norm 1.4936 (1.5410) [2022-01-17 23:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][620/1251] eta 0:23:17 lr 0.000775 time 
3.1273 (2.2154) loss 3.9500 (4.7181) grad_norm 1.4419 (1.5401) [2022-01-17 23:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][630/1251] eta 0:22:56 lr 0.000775 time 2.0408 (2.2167) loss 4.4883 (4.7128) grad_norm 1.3985 (1.5402) [2022-01-17 23:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][640/1251] eta 0:22:35 lr 0.000776 time 2.2051 (2.2184) loss 5.0059 (4.7160) grad_norm 1.3054 (1.5387) [2022-01-17 23:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][650/1251] eta 0:22:11 lr 0.000776 time 1.8470 (2.2153) loss 5.3694 (4.7183) grad_norm 1.1863 (1.5375) [2022-01-17 23:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][660/1251] eta 0:21:48 lr 0.000777 time 2.6406 (2.2135) loss 4.7757 (4.7238) grad_norm 1.4372 (1.5372) [2022-01-17 23:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][670/1251] eta 0:21:24 lr 0.000777 time 2.2238 (2.2104) loss 5.0250 (4.7238) grad_norm 1.4180 (1.5375) [2022-01-17 23:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][680/1251] eta 0:21:01 lr 0.000777 time 2.2486 (2.2085) loss 4.0113 (4.7244) grad_norm 1.9440 (1.5384) [2022-01-17 23:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][690/1251] eta 0:20:39 lr 0.000778 time 2.4079 (2.2099) loss 4.3430 (4.7265) grad_norm 1.2410 (1.5350) [2022-01-17 23:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][700/1251] eta 0:20:18 lr 0.000778 time 1.8146 (2.2114) loss 5.1150 (4.7270) grad_norm 1.2952 (1.5346) [2022-01-17 23:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][710/1251] eta 0:19:57 lr 0.000779 time 3.0411 (2.2126) loss 4.9784 (4.7275) grad_norm 1.7790 (1.5372) [2022-01-17 23:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][720/1251] eta 0:19:35 lr 0.000779 time 3.3505 (2.2140) loss 5.2845 (4.7298) grad_norm 1.9409 (1.5400) [2022-01-17 23:46:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][730/1251] eta 0:19:11 lr 0.000779 time 2.1741 (2.2101) loss 5.0422 (4.7293) grad_norm 1.4690 (1.5394) [2022-01-17 23:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][740/1251] eta 0:18:48 lr 0.000780 time 2.0774 (2.2078) loss 4.0719 (4.7268) grad_norm 1.3436 (1.5376) [2022-01-17 23:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][750/1251] eta 0:18:24 lr 0.000780 time 2.1620 (2.2050) loss 4.8930 (4.7262) grad_norm 1.4095 (1.5405) [2022-01-17 23:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][760/1251] eta 0:18:01 lr 0.000781 time 2.2039 (2.2035) loss 5.0866 (4.7261) grad_norm 1.4487 (1.5408) [2022-01-17 23:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][770/1251] eta 0:17:40 lr 0.000781 time 2.6202 (2.2048) loss 5.0125 (4.7264) grad_norm 1.7388 (1.5404) [2022-01-17 23:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][780/1251] eta 0:17:17 lr 0.000781 time 1.8282 (2.2033) loss 5.4831 (4.7273) grad_norm 1.6174 (1.5410) [2022-01-17 23:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][790/1251] eta 0:16:55 lr 0.000782 time 2.2156 (2.2031) loss 3.8902 (4.7255) grad_norm 1.6103 (1.5431) [2022-01-17 23:48:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][800/1251] eta 0:16:34 lr 0.000782 time 1.9071 (2.2058) loss 4.5689 (4.7270) grad_norm 1.2196 (1.5416) [2022-01-17 23:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][810/1251] eta 0:16:13 lr 0.000783 time 1.8323 (2.2079) loss 3.7356 (4.7281) grad_norm 1.0502 (1.5404) [2022-01-17 23:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][820/1251] eta 0:15:51 lr 0.000783 time 1.8985 (2.2087) loss 4.2185 (4.7285) grad_norm 1.8320 (1.5421) [2022-01-17 23:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][830/1251] eta 0:15:29 lr 0.000783 time 
2.4405 (2.2075) loss 4.9247 (4.7300) grad_norm 1.2941 (1.5417) [2022-01-17 23:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][840/1251] eta 0:15:06 lr 0.000784 time 2.1709 (2.2063) loss 5.1387 (4.7308) grad_norm 1.2492 (1.5410) [2022-01-17 23:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][850/1251] eta 0:14:44 lr 0.000784 time 1.6952 (2.2046) loss 4.0684 (4.7295) grad_norm 1.9011 (1.5395) [2022-01-17 23:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][860/1251] eta 0:14:21 lr 0.000785 time 1.8549 (2.2042) loss 5.3766 (4.7306) grad_norm 1.4766 (1.5411) [2022-01-17 23:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][870/1251] eta 0:13:59 lr 0.000785 time 2.1326 (2.2041) loss 3.7143 (4.7287) grad_norm 1.5871 (1.5409) [2022-01-17 23:51:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][880/1251] eta 0:13:37 lr 0.000785 time 1.8768 (2.2029) loss 4.5595 (4.7282) grad_norm 1.3594 (1.5384) [2022-01-17 23:51:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][890/1251] eta 0:13:14 lr 0.000786 time 2.1389 (2.2012) loss 3.6451 (4.7242) grad_norm 1.6052 (1.5380) [2022-01-17 23:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][900/1251] eta 0:12:52 lr 0.000786 time 1.9006 (2.2011) loss 4.1865 (4.7222) grad_norm 1.3232 (1.5384) [2022-01-17 23:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][910/1251] eta 0:12:30 lr 0.000787 time 2.0903 (2.2022) loss 5.3957 (4.7228) grad_norm 1.4880 (1.5362) [2022-01-17 23:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][920/1251] eta 0:12:08 lr 0.000787 time 2.4954 (2.2017) loss 5.1400 (4.7233) grad_norm 1.5673 (1.5370) [2022-01-17 23:53:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][930/1251] eta 0:11:46 lr 0.000787 time 2.1413 (2.2021) loss 5.0849 (4.7219) grad_norm 1.6249 (1.5364) [2022-01-17 23:53:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][940/1251] eta 0:11:24 lr 0.000788 time 2.2107 (2.2021) loss 5.1830 (4.7247) grad_norm 1.6152 (1.5367) [2022-01-17 23:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][950/1251] eta 0:11:02 lr 0.000788 time 2.2069 (2.2023) loss 3.7856 (4.7208) grad_norm 1.9300 (1.5361) [2022-01-17 23:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][960/1251] eta 0:10:40 lr 0.000789 time 1.7000 (2.2016) loss 4.9942 (4.7229) grad_norm 1.0498 (1.5334) [2022-01-17 23:54:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][970/1251] eta 0:10:18 lr 0.000789 time 1.8333 (2.2000) loss 5.4014 (4.7240) grad_norm 1.4149 (1.5308) [2022-01-17 23:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][980/1251] eta 0:09:55 lr 0.000789 time 1.8262 (2.1990) loss 4.3364 (4.7229) grad_norm 1.4648 (1.5295) [2022-01-17 23:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][990/1251] eta 0:09:33 lr 0.000790 time 2.1182 (2.1981) loss 4.9662 (4.7228) grad_norm 1.5507 (1.5284) [2022-01-17 23:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1000/1251] eta 0:09:11 lr 0.000790 time 2.0085 (2.1976) loss 5.0817 (4.7234) grad_norm 1.3357 (1.5269) [2022-01-17 23:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1010/1251] eta 0:08:49 lr 0.000791 time 1.9225 (2.1967) loss 5.4245 (4.7213) grad_norm 1.3141 (1.5279) [2022-01-17 23:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1020/1251] eta 0:08:27 lr 0.000791 time 2.2466 (2.1960) loss 4.3593 (4.7189) grad_norm 1.5300 (1.5269) [2022-01-17 23:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1030/1251] eta 0:08:05 lr 0.000791 time 2.7264 (2.1963) loss 4.9632 (4.7182) grad_norm 1.5421 (1.5292) [2022-01-17 23:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1040/1251] eta 0:07:43 lr 0.000792 time 
2.6393 (2.1972) loss 4.0793 (4.7173) grad_norm 1.6522 (1.5305) [2022-01-17 23:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1050/1251] eta 0:07:21 lr 0.000792 time 1.7292 (2.1961) loss 3.8503 (4.7145) grad_norm 1.6052 (1.5303) [2022-01-17 23:58:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1060/1251] eta 0:06:59 lr 0.000793 time 2.1376 (2.1974) loss 3.3542 (4.7126) grad_norm 1.2799 (1.5290) [2022-01-17 23:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1070/1251] eta 0:06:37 lr 0.000793 time 2.2145 (2.1986) loss 4.3818 (4.7141) grad_norm 1.3367 (1.5277) [2022-01-17 23:58:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1080/1251] eta 0:06:15 lr 0.000793 time 2.5319 (2.1986) loss 3.7696 (4.7148) grad_norm 1.3766 (1.5266) [2022-01-17 23:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1090/1251] eta 0:05:53 lr 0.000794 time 2.2628 (2.1986) loss 4.5053 (4.7116) grad_norm 1.4705 (1.5262) [2022-01-17 23:59:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1100/1251] eta 0:05:31 lr 0.000794 time 1.8852 (2.1981) loss 5.2643 (4.7136) grad_norm 1.5371 (1.5258) [2022-01-18 00:00:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1110/1251] eta 0:05:09 lr 0.000795 time 2.1851 (2.1976) loss 4.1049 (4.7130) grad_norm 1.3614 (1.5250) [2022-01-18 00:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1120/1251] eta 0:04:47 lr 0.000795 time 2.4103 (2.1973) loss 4.8622 (4.7141) grad_norm 1.7617 (1.5254) [2022-01-18 00:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1130/1251] eta 0:04:25 lr 0.000795 time 2.2782 (2.1971) loss 5.0689 (4.7148) grad_norm 2.0358 (1.5256) [2022-01-18 00:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1140/1251] eta 0:04:03 lr 0.000796 time 1.7267 (2.1969) loss 3.6810 (4.7141) grad_norm 1.3528 (1.5246) [2022-01-18 00:01:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1150/1251] eta 0:03:42 lr 0.000796 time 3.5875 (2.1992) loss 5.6370 (4.7147) grad_norm 1.4846 (1.5241) [2022-01-18 00:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1160/1251] eta 0:03:20 lr 0.000797 time 2.1837 (2.1985) loss 5.6614 (4.7157) grad_norm 1.1985 (1.5229) [2022-01-18 00:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1170/1251] eta 0:02:57 lr 0.000797 time 1.7890 (2.1969) loss 5.3195 (4.7170) grad_norm 1.6370 (1.5228) [2022-01-18 00:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1180/1251] eta 0:02:35 lr 0.000797 time 2.2381 (2.1967) loss 5.0002 (4.7161) grad_norm 1.5145 (1.5241) [2022-01-18 00:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1190/1251] eta 0:02:13 lr 0.000798 time 2.2230 (2.1954) loss 5.3776 (4.7171) grad_norm 1.4851 (1.5235) [2022-01-18 00:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1200/1251] eta 0:01:51 lr 0.000798 time 2.2581 (2.1949) loss 5.6512 (4.7167) grad_norm 1.6110 (1.5238) [2022-01-18 00:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1210/1251] eta 0:01:29 lr 0.000799 time 1.7863 (2.1938) loss 5.3396 (4.7149) grad_norm 1.4388 (1.5230) [2022-01-18 00:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1220/1251] eta 0:01:08 lr 0.000799 time 1.7960 (2.1937) loss 4.9137 (4.7133) grad_norm 1.2842 (1.5219) [2022-01-18 00:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1230/1251] eta 0:00:46 lr 0.000799 time 3.0062 (2.1941) loss 4.5071 (4.7127) grad_norm 1.3732 (1.5223) [2022-01-18 00:04:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1240/1251] eta 0:00:24 lr 0.000800 time 2.4589 (2.1950) loss 4.2409 (4.7133) grad_norm 1.8985 (1.5225) [2022-01-18 00:04:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [15/300][1250/1251] eta 0:00:02 lr 
0.000800 time 1.1569 (2.1899) loss 5.2142 (4.7119) grad_norm 1.5014 (1.5217) [2022-01-18 00:04:58 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 15 training takes 0:45:39 [2022-01-18 00:05:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.023 (18.023) Loss 2.0922 (2.0922) Acc@1 52.832 (52.832) Acc@5 78.613 (78.613) [2022-01-18 00:05:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.580 (3.284) Loss 2.0884 (2.0462) Acc@1 53.125 (55.016) Acc@5 78.027 (79.474) [2022-01-18 00:05:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.211 (2.574) Loss 1.9919 (2.0530) Acc@1 55.566 (54.757) Acc@5 80.273 (79.455) [2022-01-18 00:06:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.910 (2.231) Loss 2.0257 (2.0552) Acc@1 53.516 (54.653) Acc@5 79.883 (79.404) [2022-01-18 00:06:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.263 (2.149) Loss 2.1249 (2.0538) Acc@1 52.637 (54.761) Acc@5 78.223 (79.402) [2022-01-18 00:06:33 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 54.802 Acc@5 79.334 [2022-01-18 00:06:33 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 54.8% [2022-01-18 00:06:33 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 54.80% [2022-01-18 00:06:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][0/1251] eta 7:31:26 lr 0.000800 time 21.6517 (21.6517) loss 4.7752 (4.7752) grad_norm 1.7346 (1.7346) [2022-01-18 00:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][10/1251] eta 1:22:45 lr 0.000801 time 2.2046 (4.0012) loss 4.7257 (4.6383) grad_norm 1.4727 (1.5154) [2022-01-18 00:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][20/1251] eta 1:09:02 lr 0.000801 time 2.4603 (3.3654) loss 4.1866 (4.8034) grad_norm 1.5114 (1.4647) [2022-01-18 00:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[16/300][30/1251] eta 1:00:28 lr 0.000801 time 1.8554 (2.9713) loss 4.7825 (4.8131) grad_norm 2.2937 (1.5042) [2022-01-18 00:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][40/1251] eta 0:57:19 lr 0.000802 time 3.8714 (2.8403) loss 4.3637 (4.7984) grad_norm 1.3984 (1.5362) [2022-01-18 00:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][50/1251] eta 0:54:45 lr 0.000802 time 2.1084 (2.7357) loss 3.6492 (4.7865) grad_norm 1.3966 (1.5081) [2022-01-18 00:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][60/1251] eta 0:52:41 lr 0.000803 time 2.1650 (2.6548) loss 5.3940 (4.8002) grad_norm 1.5809 (1.5035) [2022-01-18 00:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][70/1251] eta 0:50:14 lr 0.000803 time 1.8395 (2.5527) loss 3.8775 (4.7873) grad_norm 1.2616 (1.5085) [2022-01-18 00:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][80/1251] eta 0:48:33 lr 0.000803 time 2.5292 (2.4877) loss 4.6067 (4.8047) grad_norm 1.2733 (1.4895) [2022-01-18 00:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][90/1251] eta 0:47:19 lr 0.000804 time 2.8317 (2.4460) loss 3.9986 (4.7904) grad_norm 1.4473 (1.4940) [2022-01-18 00:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][100/1251] eta 0:46:20 lr 0.000804 time 1.8278 (2.4160) loss 4.5203 (4.7542) grad_norm 1.3711 (1.5004) [2022-01-18 00:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][110/1251] eta 0:45:37 lr 0.000805 time 2.1620 (2.3994) loss 4.8107 (4.7408) grad_norm 1.6758 (1.4943) [2022-01-18 00:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][120/1251] eta 0:44:54 lr 0.000805 time 2.9236 (2.3824) loss 4.6616 (4.7206) grad_norm 1.8821 (1.5026) [2022-01-18 00:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][130/1251] eta 0:44:28 lr 0.000805 time 3.0697 (2.3802) loss 4.6617 (4.7166) grad_norm 1.4861 (1.5020) 
[2022-01-18 00:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][140/1251] eta 0:43:59 lr 0.000806 time 2.5065 (2.3756) loss 4.6478 (4.7238) grad_norm 1.4686 (1.5063) [2022-01-18 00:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][150/1251] eta 0:43:11 lr 0.000806 time 1.9452 (2.3538) loss 5.0647 (4.7109) grad_norm 2.0187 (1.5166) [2022-01-18 00:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][160/1251] eta 0:42:36 lr 0.000807 time 3.4615 (2.3429) loss 4.0214 (4.7045) grad_norm 1.2511 (1.5190) [2022-01-18 00:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][170/1251] eta 0:41:55 lr 0.000807 time 1.8536 (2.3269) loss 5.4215 (4.6811) grad_norm 1.7162 (1.5151) [2022-01-18 00:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][180/1251] eta 0:41:26 lr 0.000807 time 2.0436 (2.3215) loss 4.0523 (4.6766) grad_norm 2.6702 (1.5175) [2022-01-18 00:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][190/1251] eta 0:40:53 lr 0.000808 time 1.4750 (2.3125) loss 5.5216 (4.6739) grad_norm 1.2621 (1.5176) [2022-01-18 00:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][200/1251] eta 0:40:25 lr 0.000808 time 2.5398 (2.3083) loss 5.0398 (4.6702) grad_norm 1.6577 (1.5147) [2022-01-18 00:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][210/1251] eta 0:39:59 lr 0.000809 time 2.2005 (2.3047) loss 5.3272 (4.6764) grad_norm 1.1173 (1.5192) [2022-01-18 00:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][220/1251] eta 0:39:31 lr 0.000809 time 2.0051 (2.3006) loss 4.9266 (4.6767) grad_norm 1.5678 (1.5188) [2022-01-18 00:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][230/1251] eta 0:39:08 lr 0.000809 time 2.4414 (2.2999) loss 4.5721 (4.6831) grad_norm 1.1911 (1.5115) [2022-01-18 00:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][240/1251] eta 0:38:40 
lr 0.000810 time 2.7281 (2.2952) loss 4.9054 (4.6666) grad_norm 1.4637 (1.5121) [2022-01-18 00:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][250/1251] eta 0:38:11 lr 0.000810 time 2.8508 (2.2891) loss 5.0074 (4.6671) grad_norm 1.6528 (1.5174) [2022-01-18 00:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][260/1251] eta 0:37:35 lr 0.000811 time 1.9140 (2.2760) loss 5.0988 (4.6663) grad_norm 1.4828 (1.5147) [2022-01-18 00:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][270/1251] eta 0:37:06 lr 0.000811 time 2.1226 (2.2697) loss 4.8996 (4.6761) grad_norm 1.5292 (1.5098) [2022-01-18 00:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][280/1251] eta 0:36:40 lr 0.000811 time 1.9271 (2.2665) loss 5.5425 (4.6873) grad_norm 1.6128 (1.5102) [2022-01-18 00:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][290/1251] eta 0:36:17 lr 0.000812 time 3.3281 (2.2661) loss 3.8859 (4.6736) grad_norm 1.6924 (1.5074) [2022-01-18 00:17:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][300/1251] eta 0:35:51 lr 0.000812 time 1.9293 (2.2625) loss 4.8007 (4.6763) grad_norm 1.2699 (1.5034) [2022-01-18 00:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][310/1251] eta 0:35:27 lr 0.000813 time 2.1180 (2.2605) loss 4.7074 (4.6732) grad_norm 1.4016 (1.5051) [2022-01-18 00:18:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][320/1251] eta 0:35:06 lr 0.000813 time 3.4591 (2.2624) loss 5.3798 (4.6796) grad_norm 1.4070 (1.5053) [2022-01-18 00:19:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][330/1251] eta 0:34:42 lr 0.000813 time 3.3213 (2.2615) loss 3.5930 (4.6677) grad_norm 1.6103 (1.5038) [2022-01-18 00:19:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][340/1251] eta 0:34:16 lr 0.000814 time 2.3127 (2.2572) loss 3.9799 (4.6622) grad_norm 1.6449 (1.5014) [2022-01-18 00:19:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][350/1251] eta 0:33:50 lr 0.000814 time 2.8697 (2.2539) loss 4.9509 (4.6609) grad_norm 1.3487 (1.4995) [2022-01-18 00:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][360/1251] eta 0:33:26 lr 0.000815 time 2.4403 (2.2525) loss 4.5140 (4.6646) grad_norm 1.6755 (1.5051) [2022-01-18 00:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][370/1251] eta 0:32:58 lr 0.000815 time 1.6073 (2.2463) loss 4.4713 (4.6623) grad_norm 2.5470 (1.5074) [2022-01-18 00:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][380/1251] eta 0:32:37 lr 0.000815 time 2.2724 (2.2471) loss 5.0453 (4.6652) grad_norm 1.3067 (1.5051) [2022-01-18 00:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][390/1251] eta 0:32:12 lr 0.000816 time 2.1911 (2.2451) loss 5.0092 (4.6607) grad_norm 1.3904 (1.5037) [2022-01-18 00:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][400/1251] eta 0:31:49 lr 0.000816 time 2.7942 (2.2437) loss 4.8799 (4.6593) grad_norm 1.7756 (1.5048) [2022-01-18 00:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][410/1251] eta 0:31:23 lr 0.000817 time 1.9181 (2.2400) loss 4.5579 (4.6665) grad_norm 1.7455 (1.5054) [2022-01-18 00:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][420/1251] eta 0:31:00 lr 0.000817 time 2.5482 (2.2386) loss 4.6435 (4.6689) grad_norm 1.1594 (1.5037) [2022-01-18 00:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][430/1251] eta 0:30:37 lr 0.000817 time 2.1094 (2.2377) loss 4.6973 (4.6638) grad_norm 1.8083 (1.5046) [2022-01-18 00:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][440/1251] eta 0:30:13 lr 0.000818 time 2.2868 (2.2365) loss 5.4182 (4.6639) grad_norm 1.3860 (1.5008) [2022-01-18 00:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][450/1251] eta 0:29:50 lr 0.000818 time 
1.6393 (2.2357) loss 4.6662 (4.6647) grad_norm 1.4017 (1.4999) [2022-01-18 00:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][460/1251] eta 0:29:26 lr 0.000819 time 2.2178 (2.2335) loss 5.7059 (4.6678) grad_norm 1.6076 (1.5035) [2022-01-18 00:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][470/1251] eta 0:29:05 lr 0.000819 time 2.4716 (2.2352) loss 4.6578 (4.6721) grad_norm 1.8497 (1.5057) [2022-01-18 00:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][480/1251] eta 0:28:44 lr 0.000819 time 2.5079 (2.2369) loss 5.0259 (4.6698) grad_norm 1.5536 (1.5064) [2022-01-18 00:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][490/1251] eta 0:28:20 lr 0.000820 time 1.7534 (2.2340) loss 5.0782 (4.6709) grad_norm 1.2640 (1.5037) [2022-01-18 00:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][500/1251] eta 0:27:54 lr 0.000820 time 1.8807 (2.2293) loss 4.9907 (4.6714) grad_norm 1.6683 (1.5051) [2022-01-18 00:25:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][510/1251] eta 0:27:30 lr 0.000821 time 1.9184 (2.2274) loss 5.1303 (4.6717) grad_norm 1.2359 (1.5038) [2022-01-18 00:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][520/1251] eta 0:27:07 lr 0.000821 time 2.5479 (2.2264) loss 5.2289 (4.6727) grad_norm 1.4903 (1.5018) [2022-01-18 00:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][530/1251] eta 0:26:46 lr 0.000821 time 2.0866 (2.2282) loss 5.1166 (4.6742) grad_norm 1.3185 (1.5012) [2022-01-18 00:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][540/1251] eta 0:26:24 lr 0.000822 time 2.8099 (2.2279) loss 4.0114 (4.6705) grad_norm 1.4924 (1.5038) [2022-01-18 00:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][550/1251] eta 0:26:00 lr 0.000822 time 1.6026 (2.2254) loss 5.0743 (4.6745) grad_norm 1.3309 (1.5021) [2022-01-18 00:27:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][560/1251] eta 0:25:36 lr 0.000823 time 2.6535 (2.2235) loss 5.5393 (4.6757) grad_norm 1.4501 (1.5012) [2022-01-18 00:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][570/1251] eta 0:25:12 lr 0.000823 time 1.5924 (2.2206) loss 4.8889 (4.6796) grad_norm 1.4793 (1.4990) [2022-01-18 00:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][580/1251] eta 0:24:49 lr 0.000823 time 2.1860 (2.2197) loss 4.8432 (4.6788) grad_norm 1.4489 (1.4990) [2022-01-18 00:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][590/1251] eta 0:24:26 lr 0.000824 time 2.1588 (2.2184) loss 4.3723 (4.6807) grad_norm 1.6948 (1.5000) [2022-01-18 00:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][600/1251] eta 0:24:03 lr 0.000824 time 1.9643 (2.2173) loss 5.0151 (4.6867) grad_norm 1.2146 (1.5010) [2022-01-18 00:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][610/1251] eta 0:23:40 lr 0.000825 time 1.8873 (2.2158) loss 4.0862 (4.6862) grad_norm 1.5426 (1.5012) [2022-01-18 00:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][620/1251] eta 0:23:18 lr 0.000825 time 2.4974 (2.2156) loss 4.1948 (4.6841) grad_norm 1.2701 (1.4988) [2022-01-18 00:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][630/1251] eta 0:22:56 lr 0.000825 time 2.5329 (2.2166) loss 5.2073 (4.6829) grad_norm 1.7093 (1.4993) [2022-01-18 00:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][640/1251] eta 0:22:34 lr 0.000826 time 2.1338 (2.2172) loss 5.2047 (4.6864) grad_norm 1.1537 (1.4969) [2022-01-18 00:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][650/1251] eta 0:22:13 lr 0.000826 time 2.1671 (2.2191) loss 3.6127 (4.6852) grad_norm 1.6961 (1.4950) [2022-01-18 00:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][660/1251] eta 0:21:51 lr 0.000827 time 
1.9673 (2.2195) loss 5.4492 (4.6859) grad_norm 1.2568 (1.4935) [2022-01-18 00:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][670/1251] eta 0:21:28 lr 0.000827 time 1.8882 (2.2176) loss 4.6144 (4.6865) grad_norm 1.5168 (1.4933) [2022-01-18 00:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][680/1251] eta 0:21:04 lr 0.000827 time 2.0153 (2.2150) loss 4.8640 (4.6878) grad_norm 1.2456 (1.4937) [2022-01-18 00:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][690/1251] eta 0:20:42 lr 0.000828 time 2.0741 (2.2153) loss 4.1631 (4.6870) grad_norm 1.5893 (1.4921) [2022-01-18 00:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][700/1251] eta 0:20:20 lr 0.000828 time 2.3303 (2.2155) loss 5.0128 (4.6875) grad_norm 1.5892 (1.4919) [2022-01-18 00:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][710/1251] eta 0:19:58 lr 0.000829 time 1.9223 (2.2148) loss 5.1033 (4.6876) grad_norm 1.5411 (1.4903) [2022-01-18 00:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][720/1251] eta 0:19:36 lr 0.000829 time 2.8373 (2.2151) loss 4.7246 (4.6844) grad_norm 1.2046 (1.4891) [2022-01-18 00:33:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][730/1251] eta 0:19:13 lr 0.000829 time 2.4613 (2.2148) loss 4.0336 (4.6853) grad_norm 1.5675 (1.4903) [2022-01-18 00:33:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][740/1251] eta 0:18:52 lr 0.000830 time 2.6283 (2.2153) loss 5.1337 (4.6858) grad_norm 1.6760 (1.4902) [2022-01-18 00:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][750/1251] eta 0:18:29 lr 0.000830 time 2.2173 (2.2144) loss 4.3715 (4.6853) grad_norm 1.3727 (1.4895) [2022-01-18 00:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][760/1251] eta 0:18:06 lr 0.000831 time 1.8743 (2.2136) loss 4.8570 (4.6854) grad_norm 1.2252 (1.4876) [2022-01-18 00:35:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][770/1251] eta 0:17:45 lr 0.000831 time 2.1418 (2.2146) loss 5.2249 (4.6885) grad_norm 1.7310 (1.4861) [2022-01-18 00:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][780/1251] eta 0:17:21 lr 0.000831 time 1.5660 (2.2114) loss 3.7261 (4.6849) grad_norm 1.2215 (1.4885) [2022-01-18 00:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][790/1251] eta 0:16:58 lr 0.000832 time 1.9342 (2.2085) loss 4.4402 (4.6837) grad_norm 1.6990 (1.4885) [2022-01-18 00:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][800/1251] eta 0:16:35 lr 0.000832 time 2.0372 (2.2069) loss 4.9858 (4.6858) grad_norm 1.1821 (1.4888) [2022-01-18 00:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][810/1251] eta 0:16:13 lr 0.000833 time 2.1840 (2.2068) loss 4.1766 (4.6845) grad_norm 1.3103 (1.4871) [2022-01-18 00:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][820/1251] eta 0:15:51 lr 0.000833 time 2.3520 (2.2072) loss 5.6634 (4.6869) grad_norm 1.8244 (1.4867) [2022-01-18 00:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][830/1251] eta 0:15:29 lr 0.000833 time 1.9131 (2.2082) loss 3.9900 (4.6884) grad_norm 1.0761 (1.4856) [2022-01-18 00:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][840/1251] eta 0:15:08 lr 0.000834 time 2.3910 (2.2102) loss 5.0751 (4.6871) grad_norm 1.2012 (1.4849) [2022-01-18 00:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][850/1251] eta 0:14:46 lr 0.000834 time 2.2166 (2.2095) loss 3.5651 (4.6822) grad_norm 1.6566 (1.4836) [2022-01-18 00:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][860/1251] eta 0:14:22 lr 0.000835 time 1.5783 (2.2070) loss 5.0023 (4.6837) grad_norm 1.2278 (1.4842) [2022-01-18 00:38:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][870/1251] eta 0:14:00 lr 0.000835 time 
1.5898 (2.2056) loss 4.6957 (4.6799) grad_norm 1.1157 (1.4826) [2022-01-18 00:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][880/1251] eta 0:13:38 lr 0.000835 time 2.1894 (2.2053) loss 5.0290 (4.6813) grad_norm 1.1673 (1.4823) [2022-01-18 00:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][890/1251] eta 0:13:16 lr 0.000836 time 1.4632 (2.2050) loss 4.2776 (4.6818) grad_norm 2.0065 (1.4824) [2022-01-18 00:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][900/1251] eta 0:12:54 lr 0.000836 time 1.6698 (2.2054) loss 4.8943 (4.6820) grad_norm 1.0777 (1.4823) [2022-01-18 00:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][910/1251] eta 0:12:32 lr 0.000837 time 1.6347 (2.2071) loss 3.5183 (4.6811) grad_norm 1.3184 (1.4809) [2022-01-18 00:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][920/1251] eta 0:12:11 lr 0.000837 time 2.7742 (2.2094) loss 4.5815 (4.6801) grad_norm 1.6887 (1.4803) [2022-01-18 00:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][930/1251] eta 0:11:48 lr 0.000837 time 1.5084 (2.2077) loss 3.7734 (4.6758) grad_norm 1.7159 (1.4809) [2022-01-18 00:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][940/1251] eta 0:11:25 lr 0.000838 time 1.8912 (2.2058) loss 4.5618 (4.6788) grad_norm 1.7319 (1.4806) [2022-01-18 00:41:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][950/1251] eta 0:11:03 lr 0.000838 time 2.2850 (2.2032) loss 5.0525 (4.6782) grad_norm 1.3955 (1.4803) [2022-01-18 00:41:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][960/1251] eta 0:10:41 lr 0.000839 time 1.7348 (2.2029) loss 3.6373 (4.6757) grad_norm 1.4985 (1.4815) [2022-01-18 00:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][970/1251] eta 0:10:18 lr 0.000839 time 2.0992 (2.2014) loss 4.5816 (4.6743) grad_norm 1.4777 (1.4805) [2022-01-18 00:42:33 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][980/1251] eta 0:09:56 lr 0.000839 time 1.8463 (2.2012) loss 4.8170 (4.6760) grad_norm 1.3426 (1.4795) [2022-01-18 00:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][990/1251] eta 0:09:34 lr 0.000840 time 2.1727 (2.2013) loss 4.6706 (4.6766) grad_norm 2.1142 (1.4794) [2022-01-18 00:43:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1000/1251] eta 0:09:12 lr 0.000840 time 2.2207 (2.2009) loss 4.9863 (4.6746) grad_norm 1.3872 (1.4790) [2022-01-18 00:43:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1010/1251] eta 0:08:50 lr 0.000841 time 1.9585 (2.2008) loss 5.3237 (4.6727) grad_norm 1.4890 (1.4784) [2022-01-18 00:44:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1020/1251] eta 0:08:28 lr 0.000841 time 2.1090 (2.2008) loss 5.0852 (4.6726) grad_norm 1.2842 (1.4793) [2022-01-18 00:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1030/1251] eta 0:08:06 lr 0.000841 time 1.7744 (2.2031) loss 4.2300 (4.6729) grad_norm 1.6222 (1.4780) [2022-01-18 00:44:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1040/1251] eta 0:07:44 lr 0.000842 time 2.8233 (2.2036) loss 4.8987 (4.6710) grad_norm 1.8074 (1.4784) [2022-01-18 00:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1050/1251] eta 0:07:22 lr 0.000842 time 1.9023 (2.2019) loss 5.3604 (4.6722) grad_norm 1.4868 (1.4766) [2022-01-18 00:45:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1060/1251] eta 0:07:00 lr 0.000843 time 2.3479 (2.2005) loss 4.9435 (4.6698) grad_norm 2.0075 (1.4777) [2022-01-18 00:45:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1070/1251] eta 0:06:38 lr 0.000843 time 2.2088 (2.1994) loss 5.4066 (4.6695) grad_norm 1.2370 (1.4772) [2022-01-18 00:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1080/1251] eta 0:06:15 lr 0.000843 
time 2.6147 (2.1987) loss 4.8334 (4.6682) grad_norm 2.0402 (1.4782) [2022-01-18 00:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1090/1251] eta 0:05:53 lr 0.000844 time 2.9079 (2.1983) loss 3.6419 (4.6674) grad_norm 1.6636 (1.4778) [2022-01-18 00:46:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1100/1251] eta 0:05:31 lr 0.000844 time 2.5771 (2.1977) loss 4.8405 (4.6661) grad_norm 1.1488 (1.4767) [2022-01-18 00:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1110/1251] eta 0:05:09 lr 0.000845 time 1.5318 (2.1972) loss 4.5289 (4.6641) grad_norm 1.2577 (1.4764) [2022-01-18 00:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1120/1251] eta 0:04:48 lr 0.000845 time 2.5078 (2.1994) loss 3.9523 (4.6627) grad_norm 1.7222 (1.4764) [2022-01-18 00:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1130/1251] eta 0:04:26 lr 0.000845 time 2.9163 (2.2007) loss 5.0682 (4.6643) grad_norm 1.7995 (1.4761) [2022-01-18 00:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1140/1251] eta 0:04:04 lr 0.000846 time 2.1452 (2.1995) loss 4.5953 (4.6648) grad_norm 1.2955 (1.4758) [2022-01-18 00:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1150/1251] eta 0:03:42 lr 0.000846 time 1.6702 (2.1981) loss 4.4528 (4.6646) grad_norm 1.4162 (1.4762) [2022-01-18 00:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1160/1251] eta 0:03:19 lr 0.000847 time 2.2333 (2.1978) loss 3.8461 (4.6633) grad_norm 1.4840 (1.4765) [2022-01-18 00:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1170/1251] eta 0:02:58 lr 0.000847 time 3.1148 (2.1980) loss 5.2391 (4.6641) grad_norm 1.4962 (1.4771) [2022-01-18 00:49:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1180/1251] eta 0:02:36 lr 0.000847 time 2.2661 (2.1984) loss 4.2617 (4.6667) grad_norm 1.2927 (1.4760) [2022-01-18 00:50:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1190/1251] eta 0:02:14 lr 0.000848 time 1.8371 (2.1995) loss 4.4810 (4.6667) grad_norm 1.2444 (1.4760) [2022-01-18 00:50:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1200/1251] eta 0:01:52 lr 0.000848 time 1.6963 (2.1995) loss 4.7755 (4.6684) grad_norm 1.2436 (1.4753) [2022-01-18 00:50:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1210/1251] eta 0:01:30 lr 0.000849 time 2.8833 (2.1998) loss 4.9988 (4.6685) grad_norm 1.0504 (1.4745) [2022-01-18 00:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1220/1251] eta 0:01:08 lr 0.000849 time 1.5431 (2.1984) loss 4.2520 (4.6680) grad_norm 1.4868 (1.4742) [2022-01-18 00:51:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1230/1251] eta 0:00:46 lr 0.000849 time 1.6049 (2.1984) loss 4.9997 (4.6693) grad_norm 1.8838 (1.4750) [2022-01-18 00:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1240/1251] eta 0:00:24 lr 0.000850 time 1.6958 (2.1967) loss 5.0184 (4.6689) grad_norm 1.4305 (1.4741) [2022-01-18 00:52:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [16/300][1250/1251] eta 0:00:02 lr 0.000850 time 1.1942 (2.1910) loss 4.6567 (4.6678) grad_norm 1.3638 (1.4744) [2022-01-18 00:52:15 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 16 training takes 0:45:41 [2022-01-18 00:52:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.610 (18.610) Loss 2.0033 (2.0033) Acc@1 58.301 (58.301) Acc@5 80.566 (80.566) [2022-01-18 00:52:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.904 (3.210) Loss 1.9448 (1.9654) Acc@1 58.301 (57.564) Acc@5 80.176 (80.655) [2022-01-18 00:53:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.991 (2.428) Loss 1.8988 (1.9532) Acc@1 58.691 (57.264) Acc@5 81.055 (80.864) [2022-01-18 00:53:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: 
[30/49] Time 0.956 (2.241) Loss 2.0656 (1.9489) Acc@1 53.906 (57.079) Acc@5 78.027 (80.954) [2022-01-18 00:53:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.193 (2.159) Loss 2.0099 (1.9534) Acc@1 57.031 (56.967) Acc@5 79.785 (80.881) [2022-01-18 00:53:51 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 56.938 Acc@5 80.810 [2022-01-18 00:53:51 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 56.9% [2022-01-18 00:53:51 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 56.94% [2022-01-18 00:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][0/1251] eta 7:28:37 lr 0.000850 time 21.5165 (21.5165) loss 4.3923 (4.3923) grad_norm 1.4780 (1.4780) [2022-01-18 00:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][10/1251] eta 1:20:43 lr 0.000851 time 1.5417 (3.9028) loss 5.3574 (4.7830) grad_norm 1.5579 (1.4299) [2022-01-18 00:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][20/1251] eta 1:02:46 lr 0.000851 time 1.2984 (3.0598) loss 4.4356 (4.7393) grad_norm 1.5198 (1.4552) [2022-01-18 00:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][30/1251] eta 0:57:57 lr 0.000851 time 2.0272 (2.8480) loss 4.2812 (4.6217) grad_norm 1.4557 (1.5167) [2022-01-18 00:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][40/1251] eta 0:55:15 lr 0.000852 time 3.8056 (2.7382) loss 4.0507 (4.5320) grad_norm 1.3849 (1.5175) [2022-01-18 00:56:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][50/1251] eta 0:53:39 lr 0.000852 time 1.7697 (2.6808) loss 4.8164 (4.4419) grad_norm 1.5280 (1.5136) [2022-01-18 00:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][60/1251] eta 0:51:46 lr 0.000853 time 1.7245 (2.6082) loss 3.9061 (4.4807) grad_norm 1.2611 (1.4979) [2022-01-18 00:56:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][70/1251] eta 
0:50:03 lr 0.000853 time 1.7953 (2.5434) loss 4.6387 (4.5220) grad_norm 2.0611 (1.4929) [2022-01-18 00:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][80/1251] eta 0:48:54 lr 0.000853 time 3.5798 (2.5058) loss 4.6214 (4.5498) grad_norm 1.2284 (1.5000) [2022-01-18 00:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][90/1251] eta 0:47:20 lr 0.000854 time 2.2081 (2.4462) loss 4.9754 (4.5782) grad_norm 1.1945 (1.4912) [2022-01-18 00:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][100/1251] eta 0:46:22 lr 0.000854 time 2.5806 (2.4178) loss 4.8767 (4.6030) grad_norm 1.4305 (1.4899) [2022-01-18 00:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][110/1251] eta 0:45:38 lr 0.000855 time 1.7995 (2.4005) loss 3.9928 (4.6026) grad_norm 1.1180 (1.4791) [2022-01-18 00:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][120/1251] eta 0:45:17 lr 0.000855 time 2.6750 (2.4025) loss 5.5714 (4.6023) grad_norm 1.5629 (1.4769) [2022-01-18 00:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][130/1251] eta 0:44:46 lr 0.000855 time 1.8950 (2.3965) loss 5.2000 (4.6070) grad_norm 1.5329 (1.4719) [2022-01-18 00:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][140/1251] eta 0:43:55 lr 0.000856 time 1.9076 (2.3725) loss 4.9228 (4.5999) grad_norm 1.8888 (1.4646) [2022-01-18 00:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][150/1251] eta 0:43:00 lr 0.000856 time 1.8891 (2.3442) loss 4.2210 (4.5876) grad_norm 2.3102 (1.4727) [2022-01-18 01:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][160/1251] eta 0:42:11 lr 0.000857 time 1.9431 (2.3203) loss 4.9976 (4.5945) grad_norm 1.4026 (1.4698) [2022-01-18 01:00:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][170/1251] eta 0:41:42 lr 0.000857 time 2.5201 (2.3151) loss 4.1678 (4.5874) grad_norm 1.2186 (1.4607) [2022-01-18 01:00:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][180/1251] eta 0:41:05 lr 0.000857 time 1.8539 (2.3024) loss 5.3204 (4.5856) grad_norm 1.0605 (1.4512) [2022-01-18 01:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][190/1251] eta 0:40:41 lr 0.000858 time 2.2038 (2.3016) loss 4.6078 (4.5788) grad_norm 1.3828 (1.4420) [2022-01-18 01:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][200/1251] eta 0:40:15 lr 0.000858 time 1.4669 (2.2982) loss 5.0195 (4.5724) grad_norm 1.3021 (1.4376) [2022-01-18 01:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][210/1251] eta 0:39:56 lr 0.000859 time 2.2632 (2.3020) loss 5.3962 (4.5796) grad_norm 1.2713 (1.4348) [2022-01-18 01:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][220/1251] eta 0:39:34 lr 0.000859 time 1.9132 (2.3029) loss 3.9543 (4.5776) grad_norm 1.2532 (1.4345) [2022-01-18 01:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][230/1251] eta 0:39:08 lr 0.000859 time 1.8645 (2.3003) loss 4.9792 (4.5693) grad_norm 1.8939 (1.4351) [2022-01-18 01:03:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][240/1251] eta 0:38:39 lr 0.000860 time 2.1592 (2.2948) loss 4.8927 (4.5623) grad_norm 1.2053 (1.4386) [2022-01-18 01:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][250/1251] eta 0:38:04 lr 0.000860 time 1.8536 (2.2820) loss 3.6799 (4.5559) grad_norm 1.3843 (1.4397) [2022-01-18 01:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][260/1251] eta 0:37:31 lr 0.000861 time 2.0464 (2.2717) loss 4.9816 (4.5666) grad_norm 1.9929 (1.4426) [2022-01-18 01:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][270/1251] eta 0:37:01 lr 0.000861 time 2.2404 (2.2649) loss 5.4005 (4.5592) grad_norm 1.4542 (1.4488) [2022-01-18 01:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][280/1251] eta 0:36:36 lr 0.000861 time 
2.8077 (2.2617) loss 4.9545 (4.5537) grad_norm 1.5265 (1.4444) [2022-01-18 01:04:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][290/1251] eta 0:36:12 lr 0.000862 time 2.2824 (2.2606) loss 4.7371 (4.5470) grad_norm 1.5492 (1.4474) [2022-01-18 01:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][300/1251] eta 0:35:59 lr 0.000862 time 2.2696 (2.2704) loss 4.6679 (4.5516) grad_norm 1.3490 (1.4486) [2022-01-18 01:05:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][310/1251] eta 0:35:39 lr 0.000863 time 2.4669 (2.2735) loss 5.3377 (4.5569) grad_norm 1.6038 (1.4461) [2022-01-18 01:06:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][320/1251] eta 0:35:14 lr 0.000863 time 1.9748 (2.2709) loss 5.0964 (4.5592) grad_norm 1.3171 (1.4445) [2022-01-18 01:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][330/1251] eta 0:34:49 lr 0.000863 time 2.0386 (2.2688) loss 3.8124 (4.5470) grad_norm 1.9399 (1.4442) [2022-01-18 01:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][340/1251] eta 0:34:16 lr 0.000864 time 1.6137 (2.2575) loss 5.6731 (4.5482) grad_norm 1.3517 (1.4410) [2022-01-18 01:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][350/1251] eta 0:33:47 lr 0.000864 time 1.8499 (2.2503) loss 4.9590 (4.5373) grad_norm 1.4704 (1.4422) [2022-01-18 01:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][360/1251] eta 0:33:18 lr 0.000865 time 1.6896 (2.2434) loss 3.3364 (4.5424) grad_norm 1.7030 (1.4439) [2022-01-18 01:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][370/1251] eta 0:32:54 lr 0.000865 time 2.2205 (2.2413) loss 4.7126 (4.5493) grad_norm 1.5040 (1.4457) [2022-01-18 01:08:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][380/1251] eta 0:32:27 lr 0.000865 time 2.2255 (2.2359) loss 5.0225 (4.5536) grad_norm 1.2867 (1.4457) [2022-01-18 01:08:25 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][390/1251] eta 0:32:03 lr 0.000866 time 2.2226 (2.2341) loss 4.2441 (4.5529) grad_norm 2.0653 (1.4469) [2022-01-18 01:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][400/1251] eta 0:31:42 lr 0.000866 time 3.2174 (2.2352) loss 5.3062 (4.5531) grad_norm 1.3278 (1.4480) [2022-01-18 01:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][410/1251] eta 0:31:20 lr 0.000867 time 1.9523 (2.2355) loss 4.5027 (4.5576) grad_norm 1.5842 (1.4479) [2022-01-18 01:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][420/1251] eta 0:31:00 lr 0.000867 time 2.1520 (2.2385) loss 5.4453 (4.5629) grad_norm 1.7716 (1.4544) [2022-01-18 01:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][430/1251] eta 0:30:41 lr 0.000867 time 3.2400 (2.2430) loss 4.6510 (4.5674) grad_norm 1.2581 (1.4560) [2022-01-18 01:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][440/1251] eta 0:30:20 lr 0.000868 time 2.6822 (2.2444) loss 4.2726 (4.5658) grad_norm 1.1860 (1.4583) [2022-01-18 01:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][450/1251] eta 0:29:56 lr 0.000868 time 1.8297 (2.2422) loss 4.7524 (4.5701) grad_norm 1.3099 (1.4577) [2022-01-18 01:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][460/1251] eta 0:29:32 lr 0.000869 time 1.6470 (2.2402) loss 4.8602 (4.5738) grad_norm 1.4621 (1.4588) [2022-01-18 01:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][470/1251] eta 0:29:08 lr 0.000869 time 3.0642 (2.2384) loss 3.5679 (4.5735) grad_norm 1.5573 (1.4615) [2022-01-18 01:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][480/1251] eta 0:28:43 lr 0.000869 time 1.9013 (2.2357) loss 4.7593 (4.5736) grad_norm 1.2757 (1.4605) [2022-01-18 01:12:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][490/1251] eta 0:28:22 lr 0.000870 time 
2.6579 (2.2378) loss 5.1105 (4.5757) grad_norm 1.0947 (1.4580) [2022-01-18 01:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][500/1251] eta 0:28:03 lr 0.000870 time 2.2027 (2.2410) loss 4.1422 (4.5818) grad_norm 1.1230 (1.4575) [2022-01-18 01:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][510/1251] eta 0:27:41 lr 0.000871 time 3.3714 (2.2429) loss 3.9055 (4.5779) grad_norm 1.2745 (1.4566) [2022-01-18 01:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][520/1251] eta 0:27:17 lr 0.000871 time 1.7674 (2.2400) loss 4.7706 (4.5820) grad_norm 1.1833 (1.4542) [2022-01-18 01:13:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][530/1251] eta 0:26:52 lr 0.000871 time 2.2744 (2.2369) loss 4.2256 (4.5758) grad_norm 1.6143 (1.4546) [2022-01-18 01:13:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][540/1251] eta 0:26:27 lr 0.000872 time 2.2431 (2.2334) loss 4.8615 (4.5815) grad_norm 1.5049 (1.4580) [2022-01-18 01:14:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][550/1251] eta 0:26:04 lr 0.000872 time 2.4634 (2.2312) loss 4.2177 (4.5868) grad_norm 1.4273 (1.4576) [2022-01-18 01:14:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][560/1251] eta 0:25:40 lr 0.000873 time 2.3390 (2.2292) loss 5.0433 (4.5926) grad_norm 1.3909 (1.4604) [2022-01-18 01:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][570/1251] eta 0:25:18 lr 0.000873 time 2.8712 (2.2300) loss 4.3288 (4.5911) grad_norm 1.3948 (1.4582) [2022-01-18 01:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][580/1251] eta 0:24:54 lr 0.000873 time 1.7558 (2.2277) loss 4.8596 (4.5924) grad_norm 1.4505 (1.4569) [2022-01-18 01:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][590/1251] eta 0:24:33 lr 0.000874 time 2.4618 (2.2286) loss 5.5002 (4.5885) grad_norm 1.1770 (1.4547) [2022-01-18 01:16:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][600/1251] eta 0:24:10 lr 0.000874 time 2.6221 (2.2281) loss 5.3856 (4.5926) grad_norm 1.5005 (1.4552) [2022-01-18 01:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][610/1251] eta 0:23:48 lr 0.000875 time 2.5265 (2.2281) loss 4.3717 (4.5958) grad_norm 1.3719 (1.4556) [2022-01-18 01:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][620/1251] eta 0:23:25 lr 0.000875 time 1.8357 (2.2282) loss 4.6836 (4.5915) grad_norm 1.2961 (1.4569) [2022-01-18 01:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][630/1251] eta 0:23:04 lr 0.000875 time 2.4393 (2.2301) loss 4.9217 (4.5979) grad_norm 2.1908 (1.4573) [2022-01-18 01:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][640/1251] eta 0:22:41 lr 0.000876 time 2.2063 (2.2283) loss 4.4402 (4.5995) grad_norm 1.1965 (1.4537) [2022-01-18 01:17:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][650/1251] eta 0:22:16 lr 0.000876 time 2.1186 (2.2246) loss 4.5126 (4.5983) grad_norm 1.7401 (1.4526) [2022-01-18 01:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][660/1251] eta 0:21:53 lr 0.000877 time 2.0041 (2.2223) loss 5.0585 (4.6023) grad_norm 1.5452 (1.4519) [2022-01-18 01:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][670/1251] eta 0:21:30 lr 0.000877 time 2.3666 (2.2204) loss 4.2523 (4.6017) grad_norm 1.3863 (1.4521) [2022-01-18 01:19:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][680/1251] eta 0:21:07 lr 0.000877 time 1.8130 (2.2196) loss 4.9753 (4.6030) grad_norm 1.6633 (1.4535) [2022-01-18 01:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][690/1251] eta 0:20:45 lr 0.000878 time 2.2693 (2.2196) loss 5.3751 (4.6015) grad_norm 1.2688 (1.4541) [2022-01-18 01:19:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][700/1251] eta 0:20:24 lr 0.000878 time 
2.1429 (2.2215) loss 4.7583 (4.5996) grad_norm 1.2929 (1.4515) [2022-01-18 01:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][710/1251] eta 0:20:03 lr 0.000878 time 2.5998 (2.2241) loss 4.0605 (4.6030) grad_norm 1.3522 (1.4510) [2022-01-18 01:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][720/1251] eta 0:19:41 lr 0.000879 time 2.4563 (2.2247) loss 4.7401 (4.6040) grad_norm 1.4749 (1.4499) [2022-01-18 01:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][730/1251] eta 0:19:18 lr 0.000879 time 2.2368 (2.2232) loss 3.6077 (4.6024) grad_norm 1.3925 (1.4485) [2022-01-18 01:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][740/1251] eta 0:18:54 lr 0.000880 time 2.1999 (2.2196) loss 5.3429 (4.6084) grad_norm 1.4022 (1.4479) [2022-01-18 01:21:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][750/1251] eta 0:18:30 lr 0.000880 time 1.6090 (2.2169) loss 3.8723 (4.6073) grad_norm 1.2974 (1.4474) [2022-01-18 01:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][760/1251] eta 0:18:08 lr 0.000880 time 2.3480 (2.2162) loss 4.8950 (4.6058) grad_norm 1.5867 (1.4498) [2022-01-18 01:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][770/1251] eta 0:17:46 lr 0.000881 time 2.1552 (2.2165) loss 4.3208 (4.6023) grad_norm 1.5496 (1.4517) [2022-01-18 01:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][780/1251] eta 0:17:24 lr 0.000881 time 2.5252 (2.2169) loss 5.1191 (4.6045) grad_norm 1.3585 (1.4522) [2022-01-18 01:23:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][790/1251] eta 0:17:01 lr 0.000882 time 1.9678 (2.2164) loss 4.8112 (4.6062) grad_norm 1.2768 (1.4512) [2022-01-18 01:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][800/1251] eta 0:16:39 lr 0.000882 time 1.9364 (2.2170) loss 4.7767 (4.6075) grad_norm 1.2570 (1.4511) [2022-01-18 01:23:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][810/1251] eta 0:16:18 lr 0.000882 time 2.4617 (2.2194) loss 4.0819 (4.6050) grad_norm 1.1690 (1.4506) [2022-01-18 01:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][820/1251] eta 0:15:56 lr 0.000883 time 2.1247 (2.2195) loss 3.7052 (4.6046) grad_norm 1.4382 (1.4492) [2022-01-18 01:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][830/1251] eta 0:15:33 lr 0.000883 time 1.8392 (2.2171) loss 4.0978 (4.6039) grad_norm 1.4161 (1.4468) [2022-01-18 01:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][840/1251] eta 0:15:10 lr 0.000884 time 1.8458 (2.2143) loss 4.4496 (4.6062) grad_norm 1.4743 (1.4467) [2022-01-18 01:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][850/1251] eta 0:14:46 lr 0.000884 time 1.7656 (2.2114) loss 4.9770 (4.6023) grad_norm 1.3415 (1.4466) [2022-01-18 01:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][860/1251] eta 0:14:24 lr 0.000884 time 2.1828 (2.2100) loss 5.2570 (4.6017) grad_norm 1.2369 (1.4461) [2022-01-18 01:25:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][870/1251] eta 0:14:02 lr 0.000885 time 2.5444 (2.2101) loss 4.2271 (4.6010) grad_norm 1.6748 (1.4482) [2022-01-18 01:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][880/1251] eta 0:13:39 lr 0.000885 time 1.7841 (2.2091) loss 5.4237 (4.6015) grad_norm 1.5480 (1.4498) [2022-01-18 01:26:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][890/1251] eta 0:13:17 lr 0.000886 time 2.1770 (2.2098) loss 5.0782 (4.6023) grad_norm 2.1887 (1.4501) [2022-01-18 01:27:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][900/1251] eta 0:12:56 lr 0.000886 time 2.2361 (2.2111) loss 5.1518 (4.6010) grad_norm 1.3144 (1.4500) [2022-01-18 01:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][910/1251] eta 0:12:35 lr 0.000886 time 
2.8098 (2.2164) loss 5.2331 (4.6030) grad_norm 1.5244 (1.4495) [2022-01-18 01:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][920/1251] eta 0:12:13 lr 0.000887 time 1.8926 (2.2166) loss 5.0756 (4.6072) grad_norm 1.2361 (1.4495) [2022-01-18 01:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][930/1251] eta 0:11:50 lr 0.000887 time 1.8941 (2.2144) loss 3.7388 (4.6028) grad_norm 1.8762 (1.4500) [2022-01-18 01:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][940/1251] eta 0:11:28 lr 0.000888 time 1.8514 (2.2129) loss 3.7545 (4.6031) grad_norm 1.3356 (1.4494) [2022-01-18 01:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][950/1251] eta 0:11:06 lr 0.000888 time 1.9676 (2.2131) loss 4.7546 (4.6024) grad_norm 1.4298 (1.4493) [2022-01-18 01:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][960/1251] eta 0:10:43 lr 0.000888 time 2.5026 (2.2127) loss 5.3963 (4.6043) grad_norm 1.2103 (1.4504) [2022-01-18 01:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][970/1251] eta 0:10:21 lr 0.000889 time 2.1402 (2.2113) loss 5.2471 (4.6032) grad_norm 1.4434 (1.4498) [2022-01-18 01:29:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][980/1251] eta 0:09:58 lr 0.000889 time 2.0508 (2.2088) loss 5.1606 (4.6034) grad_norm 1.3730 (1.4486) [2022-01-18 01:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][990/1251] eta 0:09:36 lr 0.000890 time 1.8095 (2.2081) loss 5.2475 (4.6051) grad_norm 1.7307 (1.4487) [2022-01-18 01:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1000/1251] eta 0:09:14 lr 0.000890 time 2.1947 (2.2078) loss 4.9272 (4.6086) grad_norm 1.0594 (1.4471) [2022-01-18 01:31:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1010/1251] eta 0:08:52 lr 0.000890 time 2.8381 (2.2097) loss 4.4294 (4.6110) grad_norm 1.2758 (1.4465) [2022-01-18 01:31:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1020/1251] eta 0:08:30 lr 0.000891 time 1.9748 (2.2107) loss 4.1258 (4.6106) grad_norm 1.3642 (1.4458) [2022-01-18 01:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1030/1251] eta 0:08:08 lr 0.000891 time 2.2296 (2.2121) loss 4.5290 (4.6101) grad_norm 1.3121 (1.4448) [2022-01-18 01:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1040/1251] eta 0:07:46 lr 0.000892 time 1.8499 (2.2120) loss 4.8640 (4.6097) grad_norm 1.4701 (1.4439) [2022-01-18 01:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1050/1251] eta 0:07:24 lr 0.000892 time 3.2909 (2.2114) loss 4.5377 (4.6113) grad_norm 1.4170 (1.4434) [2022-01-18 01:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1060/1251] eta 0:07:01 lr 0.000892 time 1.6448 (2.2091) loss 5.5715 (4.6130) grad_norm 1.2516 (1.4418) [2022-01-18 01:33:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1070/1251] eta 0:06:39 lr 0.000893 time 1.7042 (2.2081) loss 4.1835 (4.6160) grad_norm 1.2750 (1.4415) [2022-01-18 01:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1080/1251] eta 0:06:17 lr 0.000893 time 1.9014 (2.2066) loss 4.0452 (4.6174) grad_norm 1.5520 (1.4407) [2022-01-18 01:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1090/1251] eta 0:05:55 lr 0.000894 time 3.7935 (2.2075) loss 4.7342 (4.6165) grad_norm 1.2793 (1.4401) [2022-01-18 01:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1100/1251] eta 0:05:33 lr 0.000894 time 2.4600 (2.2081) loss 4.7788 (4.6184) grad_norm 1.4922 (1.4390) [2022-01-18 01:34:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1110/1251] eta 0:05:11 lr 0.000894 time 1.9309 (2.2078) loss 5.3074 (4.6181) grad_norm 1.4324 (1.4391) [2022-01-18 01:35:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1120/1251] eta 0:04:49 lr 
0.000895 time 2.1506 (2.2093) loss 5.0278 (4.6197) grad_norm 1.1936 (1.4392) [2022-01-18 01:35:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1130/1251] eta 0:04:27 lr 0.000895 time 2.7373 (2.2083) loss 4.3462 (4.6206) grad_norm 1.4043 (1.4380) [2022-01-18 01:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1140/1251] eta 0:04:05 lr 0.000896 time 2.6787 (2.2080) loss 3.7523 (4.6201) grad_norm 1.0767 (1.4375) [2022-01-18 01:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1150/1251] eta 0:03:42 lr 0.000896 time 2.2037 (2.2075) loss 4.8741 (4.6215) grad_norm 1.4549 (1.4379) [2022-01-18 01:36:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1160/1251] eta 0:03:20 lr 0.000896 time 1.8275 (2.2076) loss 4.6442 (4.6196) grad_norm 1.0514 (1.4375) [2022-01-18 01:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1170/1251] eta 0:02:58 lr 0.000897 time 2.8955 (2.2072) loss 4.8572 (4.6211) grad_norm 1.4508 (1.4362) [2022-01-18 01:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1180/1251] eta 0:02:36 lr 0.000897 time 2.8151 (2.2069) loss 4.9835 (4.6242) grad_norm 1.1948 (1.4362) [2022-01-18 01:37:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1190/1251] eta 0:02:14 lr 0.000898 time 1.8840 (2.2057) loss 4.7725 (4.6235) grad_norm 1.4795 (1.4364) [2022-01-18 01:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1200/1251] eta 0:01:52 lr 0.000898 time 1.9447 (2.2061) loss 5.1260 (4.6255) grad_norm 1.3507 (1.4364) [2022-01-18 01:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1210/1251] eta 0:01:30 lr 0.000898 time 1.8896 (2.2043) loss 4.5982 (4.6237) grad_norm 1.2900 (1.4357) [2022-01-18 01:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1220/1251] eta 0:01:08 lr 0.000899 time 2.5276 (2.2035) loss 4.4403 (4.6233) grad_norm 1.6785 (1.4354) [2022-01-18 01:39:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1230/1251] eta 0:00:46 lr 0.000899 time 2.2059 (2.2024) loss 4.7932 (4.6223) grad_norm 1.5036 (1.4351) [2022-01-18 01:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1240/1251] eta 0:00:24 lr 0.000900 time 2.1973 (2.2021) loss 5.3336 (4.6238) grad_norm 1.2859 (1.4348) [2022-01-18 01:39:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [17/300][1250/1251] eta 0:00:02 lr 0.000900 time 1.2894 (2.1973) loss 5.0061 (4.6272) grad_norm 1.4052 (1.4348) [2022-01-18 01:39:40 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 17 training takes 0:45:49 [2022-01-18 01:39:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.160 (18.160) Loss 1.9148 (1.9148) Acc@1 58.008 (58.008) Acc@5 82.031 (82.031) [2022-01-18 01:40:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.044 (3.531) Loss 1.9640 (1.9142) Acc@1 56.250 (57.013) Acc@5 81.445 (81.934) [2022-01-18 01:40:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.612 (2.769) Loss 1.9785 (1.9177) Acc@1 57.617 (57.487) Acc@5 81.348 (81.901) [2022-01-18 01:40:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.613 (2.341) Loss 1.9125 (1.9252) Acc@1 58.301 (57.245) Acc@5 82.910 (81.823) [2022-01-18 01:41:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.984 (2.223) Loss 1.9220 (1.9308) Acc@1 57.129 (57.191) Acc@5 82.520 (81.781) [2022-01-18 01:41:19 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 57.186 Acc@5 81.596 [2022-01-18 01:41:19 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 57.2% [2022-01-18 01:41:19 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 57.19% [2022-01-18 01:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][0/1251] eta 7:31:28 lr 0.000900 time 21.6534 (21.6534) loss 4.4726 (4.4726) grad_norm 1.4513 
(1.4513) [2022-01-18 01:42:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][10/1251] eta 1:23:10 lr 0.000900 time 2.3545 (4.0217) loss 5.3842 (4.6343) grad_norm 1.1580 (1.4104) [2022-01-18 01:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][20/1251] eta 1:02:55 lr 0.000901 time 1.5345 (3.0672) loss 4.1452 (4.5158) grad_norm 1.6024 (1.4023) [2022-01-18 01:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][30/1251] eta 0:56:15 lr 0.000901 time 1.9305 (2.7641) loss 4.6268 (4.4644) grad_norm 1.1835 (1.3905) [2022-01-18 01:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][40/1251] eta 0:53:59 lr 0.000902 time 5.9651 (2.6754) loss 4.6232 (4.5176) grad_norm 1.3071 (1.4069) [2022-01-18 01:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][50/1251] eta 0:51:46 lr 0.000902 time 2.8509 (2.5866) loss 3.9597 (4.4418) grad_norm 1.2660 (1.3914) [2022-01-18 01:43:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][60/1251] eta 0:49:24 lr 0.000902 time 1.5175 (2.4891) loss 4.9737 (4.5093) grad_norm 1.1787 (1.3898) [2022-01-18 01:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][70/1251] eta 0:48:12 lr 0.000903 time 1.9097 (2.4490) loss 3.4668 (4.4972) grad_norm 1.4009 (1.3872) [2022-01-18 01:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][80/1251] eta 0:47:30 lr 0.000903 time 3.5517 (2.4343) loss 4.8355 (4.4936) grad_norm 1.4476 (1.3925) [2022-01-18 01:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][90/1251] eta 0:47:08 lr 0.000904 time 4.5112 (2.4359) loss 4.8611 (4.5121) grad_norm 1.1833 (1.4073) [2022-01-18 01:45:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][100/1251] eta 0:46:14 lr 0.000904 time 1.9299 (2.4104) loss 4.3033 (4.5132) grad_norm 1.5060 (1.4073) [2022-01-18 01:45:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][110/1251] eta 0:45:26 
lr 0.000904 time 1.9339 (2.3899) loss 4.3448 (4.5483) grad_norm 1.3068 (1.4037) [2022-01-18 01:46:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][120/1251] eta 0:44:49 lr 0.000905 time 3.4720 (2.3777) loss 5.0873 (4.5584) grad_norm 1.6857 (1.4012) [2022-01-18 01:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][130/1251] eta 0:44:19 lr 0.000905 time 3.3658 (2.3720) loss 4.5456 (4.5559) grad_norm 1.3797 (1.3976) [2022-01-18 01:46:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][140/1251] eta 0:43:44 lr 0.000906 time 1.4905 (2.3618) loss 4.9642 (4.5753) grad_norm 1.3751 (1.3846) [2022-01-18 01:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][150/1251] eta 0:43:10 lr 0.000906 time 1.7915 (2.3530) loss 4.8370 (4.5778) grad_norm 1.2736 (1.3896) [2022-01-18 01:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][160/1251] eta 0:42:49 lr 0.000906 time 3.0351 (2.3551) loss 5.2354 (4.5606) grad_norm 1.1262 (1.3985) [2022-01-18 01:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][170/1251] eta 0:42:15 lr 0.000907 time 3.3645 (2.3455) loss 4.7525 (4.5340) grad_norm 1.3996 (1.3991) [2022-01-18 01:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][180/1251] eta 0:41:30 lr 0.000907 time 1.6908 (2.3259) loss 4.1261 (4.5207) grad_norm 1.4367 (1.3984) [2022-01-18 01:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][190/1251] eta 0:40:45 lr 0.000908 time 1.5595 (2.3045) loss 4.8226 (4.5142) grad_norm 1.2555 (1.4027) [2022-01-18 01:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][200/1251] eta 0:40:15 lr 0.000908 time 2.0367 (2.2981) loss 5.2278 (4.5267) grad_norm 1.2482 (1.3979) [2022-01-18 01:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][210/1251] eta 0:39:46 lr 0.000908 time 2.5761 (2.2926) loss 4.4490 (4.5262) grad_norm 1.6077 (1.3976) [2022-01-18 01:49:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][220/1251] eta 0:39:20 lr 0.000909 time 1.7908 (2.2893) loss 4.5269 (4.5343) grad_norm 1.3832 (1.3957) [2022-01-18 01:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][230/1251] eta 0:39:00 lr 0.000909 time 2.5590 (2.2922) loss 5.0845 (4.5364) grad_norm 1.5723 (1.3987) [2022-01-18 01:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][240/1251] eta 0:38:35 lr 0.000910 time 1.5116 (2.2900) loss 3.3504 (4.5452) grad_norm 1.3398 (1.3960) [2022-01-18 01:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][250/1251] eta 0:38:07 lr 0.000910 time 2.2647 (2.2853) loss 4.0634 (4.5557) grad_norm 1.6531 (1.3936) [2022-01-18 01:51:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][260/1251] eta 0:37:38 lr 0.000910 time 2.4690 (2.2793) loss 4.9298 (4.5566) grad_norm 1.0912 (1.3910) [2022-01-18 01:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][270/1251] eta 0:37:08 lr 0.000911 time 2.4476 (2.2718) loss 3.7011 (4.5639) grad_norm 1.5926 (1.3952) [2022-01-18 01:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][280/1251] eta 0:36:37 lr 0.000911 time 1.9526 (2.2635) loss 5.5385 (4.5709) grad_norm 1.0535 (1.3979) [2022-01-18 01:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][290/1251] eta 0:36:08 lr 0.000912 time 2.4232 (2.2564) loss 5.4007 (4.5813) grad_norm 1.2770 (1.3972) [2022-01-18 01:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][300/1251] eta 0:35:46 lr 0.000912 time 2.2098 (2.2566) loss 4.7092 (4.5771) grad_norm 1.2501 (1.3938) [2022-01-18 01:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][310/1251] eta 0:35:25 lr 0.000912 time 1.9035 (2.2590) loss 4.3543 (4.5664) grad_norm 1.6675 (1.3941) [2022-01-18 01:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][320/1251] eta 0:35:04 lr 0.000913 time 
2.0098 (2.2608) loss 3.8536 (4.5661) grad_norm 1.4908 (1.3906) [2022-01-18 01:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][330/1251] eta 0:34:43 lr 0.000913 time 2.2195 (2.2627) loss 4.9048 (4.5733) grad_norm 1.2230 (1.3915) [2022-01-18 01:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][340/1251] eta 0:34:21 lr 0.000914 time 2.4684 (2.2628) loss 4.1884 (4.5701) grad_norm 1.2968 (1.3903) [2022-01-18 01:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][350/1251] eta 0:33:54 lr 0.000914 time 2.5807 (2.2576) loss 5.6138 (4.5712) grad_norm 1.2092 (1.3911) [2022-01-18 01:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][360/1251] eta 0:33:23 lr 0.000914 time 1.5585 (2.2484) loss 3.8435 (4.5559) grad_norm 1.1767 (1.3917) [2022-01-18 01:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][370/1251] eta 0:32:55 lr 0.000915 time 2.1064 (2.2427) loss 4.9667 (4.5587) grad_norm 1.2729 (1.3954) [2022-01-18 01:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][380/1251] eta 0:32:32 lr 0.000915 time 2.1842 (2.2419) loss 3.9642 (4.5583) grad_norm 1.2960 (1.3975) [2022-01-18 01:55:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][390/1251] eta 0:32:07 lr 0.000916 time 1.5380 (2.2385) loss 5.2215 (4.5609) grad_norm 1.5988 (1.3997) [2022-01-18 01:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][400/1251] eta 0:31:44 lr 0.000916 time 2.1870 (2.2381) loss 4.9617 (4.5665) grad_norm 1.5910 (1.4022) [2022-01-18 01:56:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][410/1251] eta 0:31:21 lr 0.000916 time 2.2672 (2.2367) loss 3.8030 (4.5641) grad_norm 1.1357 (1.4049) [2022-01-18 01:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][420/1251] eta 0:31:00 lr 0.000917 time 1.8373 (2.2385) loss 4.8071 (4.5664) grad_norm 1.4415 (1.4066) [2022-01-18 01:57:24 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][430/1251] eta 0:30:37 lr 0.000917 time 1.9289 (2.2383) loss 4.0852 (4.5641) grad_norm 1.5138 (1.4054) [2022-01-18 01:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][440/1251] eta 0:30:13 lr 0.000918 time 1.7845 (2.2365) loss 5.3689 (4.5664) grad_norm 1.8616 (1.4054) [2022-01-18 01:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][450/1251] eta 0:29:49 lr 0.000918 time 2.0653 (2.2343) loss 4.7372 (4.5666) grad_norm 1.1850 (1.4051) [2022-01-18 01:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][460/1251] eta 0:29:29 lr 0.000918 time 2.1657 (2.2368) loss 5.0479 (4.5651) grad_norm 1.7847 (1.4066) [2022-01-18 01:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][470/1251] eta 0:29:05 lr 0.000919 time 2.6003 (2.2351) loss 4.6185 (4.5671) grad_norm 1.3978 (1.4056) [2022-01-18 01:59:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][480/1251] eta 0:28:39 lr 0.000919 time 1.5633 (2.2297) loss 3.5258 (4.5653) grad_norm 1.1950 (1.4034) [2022-01-18 01:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][490/1251] eta 0:28:14 lr 0.000920 time 2.3797 (2.2272) loss 4.8901 (4.5689) grad_norm 1.2406 (1.4024) [2022-01-18 01:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][500/1251] eta 0:27:50 lr 0.000920 time 1.6607 (2.2239) loss 4.7861 (4.5722) grad_norm 1.3727 (1.4071) [2022-01-18 02:00:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][510/1251] eta 0:27:25 lr 0.000920 time 1.8758 (2.2213) loss 4.7682 (4.5699) grad_norm 1.0682 (1.4039) [2022-01-18 02:00:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][520/1251] eta 0:27:03 lr 0.000921 time 2.4712 (2.2208) loss 4.6511 (4.5682) grad_norm 1.2351 (1.4036) [2022-01-18 02:00:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][530/1251] eta 0:26:42 lr 0.000921 time 
2.5491 (2.2221) loss 3.5018 (4.5623) grad_norm 1.4161 (1.4036) [2022-01-18 02:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][540/1251] eta 0:26:22 lr 0.000922 time 1.8896 (2.2259) loss 5.4103 (4.5591) grad_norm 1.1647 (1.4021) [2022-01-18 02:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][550/1251] eta 0:26:01 lr 0.000922 time 2.1672 (2.2273) loss 4.9945 (4.5587) grad_norm 1.5094 (1.4026) [2022-01-18 02:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][560/1251] eta 0:25:38 lr 0.000922 time 2.0834 (2.2258) loss 5.5315 (4.5584) grad_norm 1.5419 (1.4033) [2022-01-18 02:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][570/1251] eta 0:25:13 lr 0.000923 time 1.8208 (2.2225) loss 4.3659 (4.5600) grad_norm 1.2981 (1.4047) [2022-01-18 02:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][580/1251] eta 0:24:47 lr 0.000923 time 1.8919 (2.2166) loss 5.1021 (4.5613) grad_norm 1.2379 (1.4026) [2022-01-18 02:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][590/1251] eta 0:24:23 lr 0.000924 time 2.2126 (2.2137) loss 4.7674 (4.5649) grad_norm 1.5599 (1.4017) [2022-01-18 02:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][600/1251] eta 0:23:59 lr 0.000924 time 1.9444 (2.2106) loss 5.2398 (4.5602) grad_norm 1.3243 (1.4030) [2022-01-18 02:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][610/1251] eta 0:23:36 lr 0.000924 time 2.4019 (2.2099) loss 4.4038 (4.5614) grad_norm 1.3471 (1.4053) [2022-01-18 02:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][620/1251] eta 0:23:14 lr 0.000925 time 1.8497 (2.2104) loss 5.1704 (4.5598) grad_norm 1.2968 (1.4039) [2022-01-18 02:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][630/1251] eta 0:22:53 lr 0.000925 time 2.3066 (2.2120) loss 5.0111 (4.5614) grad_norm 1.3306 (1.4055) [2022-01-18 02:04:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][640/1251] eta 0:22:32 lr 0.000926 time 2.8193 (2.2138) loss 4.8165 (4.5625) grad_norm 1.1911 (1.4038) [2022-01-18 02:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][650/1251] eta 0:22:09 lr 0.000926 time 1.5613 (2.2120) loss 3.3659 (4.5587) grad_norm 1.2553 (1.4037) [2022-01-18 02:05:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][660/1251] eta 0:21:46 lr 0.000926 time 1.5628 (2.2115) loss 5.2917 (4.5595) grad_norm 1.0458 (1.4022) [2022-01-18 02:06:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][670/1251] eta 0:21:24 lr 0.000927 time 2.5142 (2.2101) loss 4.6670 (4.5571) grad_norm 1.4490 (1.4029) [2022-01-18 02:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][680/1251] eta 0:21:03 lr 0.000927 time 2.2018 (2.2131) loss 5.2518 (4.5606) grad_norm 1.4995 (1.4023) [2022-01-18 02:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][690/1251] eta 0:20:41 lr 0.000928 time 1.8672 (2.2127) loss 5.4491 (4.5598) grad_norm 1.5489 (1.4027) [2022-01-18 02:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][700/1251] eta 0:20:19 lr 0.000928 time 1.8747 (2.2134) loss 4.9080 (4.5594) grad_norm 1.1559 (1.4009) [2022-01-18 02:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][710/1251] eta 0:19:57 lr 0.000928 time 2.2604 (2.2136) loss 4.9915 (4.5612) grad_norm 1.4315 (1.4016) [2022-01-18 02:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][720/1251] eta 0:19:34 lr 0.000929 time 1.5492 (2.2112) loss 4.5562 (4.5621) grad_norm 1.3346 (1.3999) [2022-01-18 02:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][730/1251] eta 0:19:10 lr 0.000929 time 2.3374 (2.2088) loss 4.4643 (4.5608) grad_norm 1.6053 (1.4010) [2022-01-18 02:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][740/1251] eta 0:18:48 lr 0.000930 time 
2.2431 (2.2080) loss 4.8011 (4.5609) grad_norm 1.4192 (1.4020) [2022-01-18 02:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][750/1251] eta 0:18:26 lr 0.000930 time 2.4029 (2.2077) loss 4.1211 (4.5648) grad_norm 1.4656 (1.4018) [2022-01-18 02:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][760/1251] eta 0:18:03 lr 0.000930 time 1.9124 (2.2068) loss 5.4521 (4.5655) grad_norm 1.4044 (1.4010) [2022-01-18 02:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][770/1251] eta 0:17:40 lr 0.000931 time 2.8080 (2.2053) loss 3.6310 (4.5599) grad_norm 1.4225 (1.4009) [2022-01-18 02:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][780/1251] eta 0:17:19 lr 0.000931 time 2.1942 (2.2060) loss 3.4926 (4.5598) grad_norm 1.3058 (1.4019) [2022-01-18 02:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][790/1251] eta 0:16:56 lr 0.000932 time 1.5197 (2.2048) loss 5.0333 (4.5623) grad_norm 1.3011 (1.4019) [2022-01-18 02:10:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][800/1251] eta 0:16:34 lr 0.000932 time 2.1762 (2.2044) loss 4.3621 (4.5651) grad_norm 1.2694 (1.4010) [2022-01-18 02:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][810/1251] eta 0:16:11 lr 0.000932 time 2.2283 (2.2025) loss 5.1283 (4.5655) grad_norm 1.2761 (1.4019) [2022-01-18 02:11:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][820/1251] eta 0:15:49 lr 0.000933 time 2.1984 (2.2020) loss 3.8229 (4.5629) grad_norm 1.4280 (1.4017) [2022-01-18 02:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][830/1251] eta 0:15:27 lr 0.000933 time 2.1748 (2.2036) loss 5.1179 (4.5657) grad_norm 1.7988 (1.4020) [2022-01-18 02:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][840/1251] eta 0:15:06 lr 0.000934 time 1.7995 (2.2050) loss 4.3811 (4.5648) grad_norm 1.3504 (1.4027) [2022-01-18 02:12:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][850/1251] eta 0:14:44 lr 0.000934 time 1.9411 (2.2059) loss 4.9364 (4.5656) grad_norm 1.6834 (1.4026) [2022-01-18 02:13:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][860/1251] eta 0:14:23 lr 0.000934 time 2.3982 (2.2082) loss 4.8706 (4.5641) grad_norm 1.1319 (1.4021) [2022-01-18 02:13:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][870/1251] eta 0:14:01 lr 0.000935 time 1.5396 (2.2099) loss 3.9529 (4.5627) grad_norm 1.3388 (1.4005) [2022-01-18 02:13:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][880/1251] eta 0:13:39 lr 0.000935 time 1.5933 (2.2096) loss 4.6518 (4.5612) grad_norm 1.5303 (1.4000) [2022-01-18 02:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][890/1251] eta 0:13:16 lr 0.000936 time 1.5471 (2.2071) loss 4.3087 (4.5588) grad_norm 1.6183 (1.4016) [2022-01-18 02:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][900/1251] eta 0:12:53 lr 0.000936 time 2.3694 (2.2039) loss 4.7861 (4.5566) grad_norm 1.2069 (1.4032) [2022-01-18 02:14:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][910/1251] eta 0:12:30 lr 0.000936 time 2.1214 (2.2023) loss 4.9402 (4.5550) grad_norm 1.3477 (1.4018) [2022-01-18 02:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][920/1251] eta 0:12:08 lr 0.000937 time 2.2568 (2.2018) loss 3.1572 (4.5543) grad_norm 1.2052 (1.4000) [2022-01-18 02:15:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][930/1251] eta 0:11:46 lr 0.000937 time 2.4573 (2.2012) loss 5.4623 (4.5568) grad_norm 1.1527 (1.3990) [2022-01-18 02:15:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][940/1251] eta 0:11:24 lr 0.000938 time 1.8948 (2.2014) loss 4.0712 (4.5580) grad_norm 1.5295 (1.3978) [2022-01-18 02:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][950/1251] eta 0:11:03 lr 0.000938 time 
1.5345 (2.2032) loss 4.9455 (4.5588) grad_norm 1.7289 (1.3965) [2022-01-18 02:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][960/1251] eta 0:10:41 lr 0.000938 time 2.0193 (2.2045) loss 5.1197 (4.5601) grad_norm 1.2519 (1.3966) [2022-01-18 02:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][970/1251] eta 0:10:19 lr 0.000939 time 1.7320 (2.2036) loss 4.6853 (4.5595) grad_norm 1.5269 (1.3955) [2022-01-18 02:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][980/1251] eta 0:09:57 lr 0.000939 time 2.7793 (2.2044) loss 4.7073 (4.5562) grad_norm 1.2593 (1.3949) [2022-01-18 02:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][990/1251] eta 0:09:35 lr 0.000940 time 2.4913 (2.2064) loss 4.9235 (4.5565) grad_norm 1.6855 (1.3948) [2022-01-18 02:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1000/1251] eta 0:09:13 lr 0.000940 time 1.5784 (2.2064) loss 4.8472 (4.5574) grad_norm 1.2088 (1.3942) [2022-01-18 02:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1010/1251] eta 0:08:51 lr 0.000940 time 1.6590 (2.2044) loss 3.7502 (4.5554) grad_norm 1.5008 (1.3935) [2022-01-18 02:18:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1020/1251] eta 0:08:28 lr 0.000941 time 1.9078 (2.2030) loss 3.1952 (4.5566) grad_norm 1.4248 (1.3928) [2022-01-18 02:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1030/1251] eta 0:08:06 lr 0.000941 time 2.2196 (2.2021) loss 4.8658 (4.5574) grad_norm 1.4531 (1.3920) [2022-01-18 02:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1040/1251] eta 0:07:44 lr 0.000942 time 2.2724 (2.2029) loss 4.5531 (4.5586) grad_norm 1.9704 (1.3920) [2022-01-18 02:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1050/1251] eta 0:07:22 lr 0.000942 time 2.4056 (2.2022) loss 4.3851 (4.5603) grad_norm 1.0555 (1.3918) [2022-01-18 02:20:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1060/1251] eta 0:07:00 lr 0.000942 time 2.2647 (2.2020) loss 4.1207 (4.5623) grad_norm 1.8290 (1.3922) [2022-01-18 02:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1070/1251] eta 0:06:38 lr 0.000943 time 2.5203 (2.2001) loss 4.7787 (4.5618) grad_norm 1.0586 (1.3916) [2022-01-18 02:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1080/1251] eta 0:06:16 lr 0.000943 time 2.4979 (2.2001) loss 4.7860 (4.5635) grad_norm 1.4185 (1.3904) [2022-01-18 02:21:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1090/1251] eta 0:05:54 lr 0.000944 time 1.9093 (2.2000) loss 5.2697 (4.5663) grad_norm 1.2736 (1.3892) [2022-01-18 02:21:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1100/1251] eta 0:05:32 lr 0.000944 time 2.1277 (2.2001) loss 4.9341 (4.5680) grad_norm 1.3129 (1.3886) [2022-01-18 02:22:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1110/1251] eta 0:05:10 lr 0.000944 time 2.0062 (2.1994) loss 5.1294 (4.5693) grad_norm 1.3202 (1.3873) [2022-01-18 02:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1120/1251] eta 0:04:48 lr 0.000945 time 2.8628 (2.1997) loss 3.9459 (4.5686) grad_norm 1.1649 (1.3878) [2022-01-18 02:22:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1130/1251] eta 0:04:26 lr 0.000945 time 2.0196 (2.1992) loss 4.0853 (4.5668) grad_norm 1.6689 (1.3882) [2022-01-18 02:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1140/1251] eta 0:04:03 lr 0.000946 time 2.0078 (2.1977) loss 5.0415 (4.5667) grad_norm 1.5058 (1.3883) [2022-01-18 02:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1150/1251] eta 0:03:41 lr 0.000946 time 1.8651 (2.1970) loss 4.4020 (4.5672) grad_norm 1.3880 (1.3882) [2022-01-18 02:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1160/1251] eta 0:03:19 lr 
0.000946 time 2.5373 (2.1968) loss 4.4579 (4.5698) grad_norm 1.4673 (1.3895) [2022-01-18 02:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1170/1251] eta 0:02:57 lr 0.000947 time 1.5473 (2.1955) loss 5.3604 (4.5687) grad_norm 1.3080 (1.3904) [2022-01-18 02:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1180/1251] eta 0:02:35 lr 0.000947 time 1.9002 (2.1955) loss 4.2464 (4.5693) grad_norm 1.2730 (1.3898) [2022-01-18 02:24:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1190/1251] eta 0:02:13 lr 0.000948 time 2.1250 (2.1952) loss 4.9906 (4.5690) grad_norm 1.3829 (1.3885) [2022-01-18 02:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1200/1251] eta 0:01:51 lr 0.000948 time 2.4539 (2.1951) loss 5.4183 (4.5702) grad_norm 1.2548 (1.3875) [2022-01-18 02:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1210/1251] eta 0:01:29 lr 0.000948 time 1.7287 (2.1946) loss 3.2489 (4.5686) grad_norm 1.2098 (1.3870) [2022-01-18 02:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1220/1251] eta 0:01:08 lr 0.000949 time 1.8893 (2.1947) loss 4.3378 (4.5694) grad_norm 1.1960 (1.3860) [2022-01-18 02:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1230/1251] eta 0:00:46 lr 0.000949 time 2.1772 (2.1960) loss 4.1521 (4.5684) grad_norm 1.1451 (1.3853) [2022-01-18 02:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1240/1251] eta 0:00:24 lr 0.000950 time 1.9381 (2.1965) loss 4.4860 (4.5668) grad_norm 1.5847 (1.3851) [2022-01-18 02:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [18/300][1250/1251] eta 0:00:02 lr 0.000950 time 1.1841 (2.1916) loss 4.1286 (4.5682) grad_norm 1.3897 (1.3851) [2022-01-18 02:27:02 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 18 training takes 0:45:42 [2022-01-18 02:27:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.446 (18.446) Loss 
1.9751 (1.9751) Acc@1 56.738 (56.738) Acc@5 81.250 (81.250) [2022-01-18 02:27:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.868 (3.199) Loss 2.0019 (1.9462) Acc@1 56.836 (57.218) Acc@5 79.102 (81.232) [2022-01-18 02:27:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.272 (2.626) Loss 1.9708 (1.9320) Acc@1 55.762 (57.082) Acc@5 81.543 (81.464) [2022-01-18 02:28:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.939 (2.293) Loss 1.9886 (1.9170) Acc@1 56.836 (57.277) Acc@5 79.785 (81.726) [2022-01-18 02:28:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 5.128 (2.159) Loss 1.8659 (1.9163) Acc@1 59.180 (57.460) Acc@5 82.227 (81.650) [2022-01-18 02:28:37 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 57.618 Acc@5 81.662 [2022-01-18 02:28:37 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 57.6% [2022-01-18 02:28:37 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 57.62% [2022-01-18 02:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][0/1251] eta 7:33:02 lr 0.000950 time 21.7285 (21.7285) loss 4.4236 (4.4236) grad_norm 1.2324 (1.2324) [2022-01-18 02:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][10/1251] eta 1:23:59 lr 0.000950 time 1.3619 (4.0612) loss 4.7862 (4.5174) grad_norm 1.2908 (1.3194) [2022-01-18 02:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][20/1251] eta 1:06:13 lr 0.000951 time 1.9029 (3.2282) loss 5.0726 (4.3819) grad_norm 1.3189 (1.3349) [2022-01-18 02:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][30/1251] eta 0:58:31 lr 0.000951 time 1.8957 (2.8755) loss 5.1269 (4.4198) grad_norm 1.7272 (1.3495) [2022-01-18 02:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][40/1251] eta 0:54:47 lr 0.000952 time 2.8330 (2.7149) loss 5.0703 (4.4118) grad_norm 1.4602 (1.3790) 
[2022-01-18 02:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][50/1251] eta 0:52:46 lr 0.000952 time 2.1488 (2.6366) loss 4.4655 (4.4021) grad_norm 1.5010 (1.4002) [2022-01-18 02:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][60/1251] eta 0:51:20 lr 0.000952 time 2.5368 (2.5864) loss 4.9597 (4.4350) grad_norm 1.3659 (1.3838) [2022-01-18 02:31:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][70/1251] eta 0:49:53 lr 0.000953 time 1.9339 (2.5343) loss 3.9276 (4.4586) grad_norm 1.1932 (1.3932) [2022-01-18 02:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][80/1251] eta 0:48:24 lr 0.000953 time 2.1940 (2.4808) loss 5.0989 (4.4292) grad_norm 1.4012 (1.4120) [2022-01-18 02:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][90/1251] eta 0:47:38 lr 0.000954 time 2.1903 (2.4617) loss 5.3143 (4.4495) grad_norm 1.2940 (1.4045) [2022-01-18 02:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][100/1251] eta 0:46:25 lr 0.000954 time 1.8864 (2.4201) loss 4.9881 (4.4874) grad_norm 1.5778 (1.4163) [2022-01-18 02:33:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][110/1251] eta 0:45:33 lr 0.000954 time 2.3866 (2.3957) loss 5.0078 (4.4757) grad_norm 1.2575 (1.4198) [2022-01-18 02:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][120/1251] eta 0:44:38 lr 0.000955 time 1.7096 (2.3685) loss 5.1116 (4.4906) grad_norm 1.1148 (1.4045) [2022-01-18 02:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][130/1251] eta 0:44:02 lr 0.000955 time 1.8281 (2.3569) loss 3.9760 (4.4896) grad_norm 1.4717 (1.4033) [2022-01-18 02:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][140/1251] eta 0:43:31 lr 0.000956 time 2.8624 (2.3508) loss 4.9071 (4.4981) grad_norm 1.5209 (1.3918) [2022-01-18 02:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][150/1251] eta 0:42:39 lr 
0.000956 time 1.7594 (2.3243) loss 4.8679 (4.5047) grad_norm 1.7304 (1.3966) [2022-01-18 02:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][160/1251] eta 0:42:07 lr 0.000956 time 2.5041 (2.3169) loss 4.8433 (4.5079) grad_norm 1.1510 (1.3888) [2022-01-18 02:35:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][170/1251] eta 0:41:45 lr 0.000957 time 1.5470 (2.3174) loss 4.3758 (4.5147) grad_norm 1.3058 (1.3796) [2022-01-18 02:35:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][180/1251] eta 0:41:25 lr 0.000957 time 2.8367 (2.3208) loss 4.7628 (4.5113) grad_norm 1.3938 (1.3735) [2022-01-18 02:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][190/1251] eta 0:40:59 lr 0.000958 time 1.7922 (2.3183) loss 5.2266 (4.5186) grad_norm 1.4039 (1.3740) [2022-01-18 02:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][200/1251] eta 0:40:31 lr 0.000958 time 2.5717 (2.3134) loss 3.5529 (4.5210) grad_norm 1.4706 (1.3749) [2022-01-18 02:36:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][210/1251] eta 0:39:55 lr 0.000958 time 1.7846 (2.3009) loss 3.7267 (4.5205) grad_norm 1.3371 (1.3759) [2022-01-18 02:37:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][220/1251] eta 0:39:21 lr 0.000959 time 2.2089 (2.2901) loss 4.4845 (4.5241) grad_norm 1.3765 (1.3746) [2022-01-18 02:37:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][230/1251] eta 0:38:47 lr 0.000959 time 2.5861 (2.2793) loss 4.7741 (4.5214) grad_norm 1.6721 (1.3853) [2022-01-18 02:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][240/1251] eta 0:38:15 lr 0.000960 time 2.1926 (2.2703) loss 4.6820 (4.5371) grad_norm 1.4128 (1.3879) [2022-01-18 02:38:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][250/1251] eta 0:37:43 lr 0.000960 time 1.8620 (2.2616) loss 3.9658 (4.5331) grad_norm 1.6478 (1.3929) [2022-01-18 02:38:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][260/1251] eta 0:37:18 lr 0.000960 time 2.5411 (2.2586) loss 4.9592 (4.5303) grad_norm 1.2044 (1.3913) [2022-01-18 02:38:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][270/1251] eta 0:36:57 lr 0.000961 time 1.7542 (2.2601) loss 3.8747 (4.5214) grad_norm 1.1952 (1.3956) [2022-01-18 02:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][280/1251] eta 0:36:31 lr 0.000961 time 2.5824 (2.2570) loss 5.3297 (4.5286) grad_norm 1.1263 (1.3911) [2022-01-18 02:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][290/1251] eta 0:36:06 lr 0.000962 time 1.8975 (2.2549) loss 5.0470 (4.5292) grad_norm 1.1397 (1.3888) [2022-01-18 02:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][300/1251] eta 0:35:42 lr 0.000962 time 1.8898 (2.2528) loss 3.6027 (4.5200) grad_norm 1.0376 (1.3860) [2022-01-18 02:40:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][310/1251] eta 0:35:22 lr 0.000962 time 2.1900 (2.2553) loss 3.5806 (4.5210) grad_norm 1.3879 (1.3831) [2022-01-18 02:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][320/1251] eta 0:34:57 lr 0.000963 time 2.6301 (2.2532) loss 4.4691 (4.5224) grad_norm 1.2869 (1.3843) [2022-01-18 02:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][330/1251] eta 0:34:39 lr 0.000963 time 3.1374 (2.2579) loss 4.3931 (4.5298) grad_norm 1.0516 (1.3808) [2022-01-18 02:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][340/1251] eta 0:34:13 lr 0.000964 time 1.8878 (2.2546) loss 4.9525 (4.5303) grad_norm 1.1198 (1.3828) [2022-01-18 02:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][350/1251] eta 0:33:48 lr 0.000964 time 2.5461 (2.2509) loss 4.6934 (4.5283) grad_norm 1.1276 (1.3792) [2022-01-18 02:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][360/1251] eta 0:33:19 lr 0.000964 time 
2.7742 (2.2442) loss 4.1152 (4.5203) grad_norm 1.3621 (1.3782) [2022-01-18 02:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][370/1251] eta 0:32:52 lr 0.000965 time 1.7399 (2.2390) loss 5.0426 (4.5287) grad_norm 1.7103 (1.3815) [2022-01-18 02:42:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][380/1251] eta 0:32:25 lr 0.000965 time 1.7481 (2.2339) loss 4.2939 (4.5336) grad_norm 1.4788 (1.3839) [2022-01-18 02:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][390/1251] eta 0:32:02 lr 0.000966 time 2.1528 (2.2329) loss 5.3112 (4.5409) grad_norm 1.3790 (1.3820) [2022-01-18 02:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][400/1251] eta 0:31:42 lr 0.000966 time 3.0463 (2.2351) loss 3.6083 (4.5446) grad_norm 1.1975 (1.3824) [2022-01-18 02:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][410/1251] eta 0:31:21 lr 0.000966 time 2.7804 (2.2368) loss 4.7628 (4.5414) grad_norm 1.4894 (1.3799) [2022-01-18 02:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][420/1251] eta 0:30:56 lr 0.000967 time 1.6033 (2.2341) loss 4.7843 (4.5441) grad_norm 1.6577 (1.3849) [2022-01-18 02:44:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][430/1251] eta 0:30:34 lr 0.000967 time 2.5837 (2.2343) loss 4.7523 (4.5450) grad_norm 1.2517 (1.3825) [2022-01-18 02:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][440/1251] eta 0:30:10 lr 0.000968 time 1.4777 (2.2320) loss 4.9605 (4.5479) grad_norm 1.3188 (1.3828) [2022-01-18 02:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][450/1251] eta 0:29:48 lr 0.000968 time 2.4936 (2.2324) loss 4.9478 (4.5467) grad_norm 1.0541 (1.3845) [2022-01-18 02:45:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][460/1251] eta 0:29:22 lr 0.000968 time 1.4462 (2.2287) loss 3.8964 (4.5438) grad_norm 1.2874 (1.3848) [2022-01-18 02:46:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][470/1251] eta 0:29:00 lr 0.000969 time 2.3876 (2.2285) loss 4.0160 (4.5414) grad_norm 1.7449 (1.3834) [2022-01-18 02:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][480/1251] eta 0:28:35 lr 0.000969 time 1.5046 (2.2248) loss 4.2973 (4.5450) grad_norm 1.6822 (1.3838) [2022-01-18 02:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][490/1251] eta 0:28:14 lr 0.000970 time 2.2002 (2.2263) loss 5.4072 (4.5452) grad_norm 1.1694 (1.3822) [2022-01-18 02:47:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][500/1251] eta 0:27:54 lr 0.000970 time 1.5786 (2.2295) loss 4.5174 (4.5470) grad_norm 1.4631 (1.3811) [2022-01-18 02:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][510/1251] eta 0:27:33 lr 0.000970 time 2.0877 (2.2315) loss 5.0789 (4.5498) grad_norm 1.2118 (1.3804) [2022-01-18 02:47:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][520/1251] eta 0:27:08 lr 0.000971 time 1.7750 (2.2283) loss 4.2929 (4.5451) grad_norm 1.5401 (1.3786) [2022-01-18 02:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][530/1251] eta 0:26:44 lr 0.000971 time 1.8720 (2.2249) loss 4.9364 (4.5464) grad_norm 1.5778 (1.3775) [2022-01-18 02:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][540/1251] eta 0:26:21 lr 0.000972 time 1.9846 (2.2240) loss 3.8333 (4.5513) grad_norm 1.4727 (1.3779) [2022-01-18 02:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][550/1251] eta 0:25:56 lr 0.000972 time 1.5806 (2.2209) loss 4.4913 (4.5545) grad_norm 1.7969 (1.3779) [2022-01-18 02:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][560/1251] eta 0:25:33 lr 0.000972 time 2.0933 (2.2186) loss 4.8440 (4.5530) grad_norm 1.8361 (1.3793) [2022-01-18 02:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][570/1251] eta 0:25:09 lr 0.000973 time 
1.9269 (2.2165) loss 4.9361 (4.5541) grad_norm 1.8751 (1.3797) [2022-01-18 02:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][580/1251] eta 0:24:46 lr 0.000973 time 1.5070 (2.2157) loss 5.0075 (4.5537) grad_norm 1.3094 (1.3778) [2022-01-18 02:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][590/1251] eta 0:24:25 lr 0.000974 time 1.9763 (2.2165) loss 3.3715 (4.5485) grad_norm 1.3354 (1.3747) [2022-01-18 02:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][600/1251] eta 0:24:03 lr 0.000974 time 1.5956 (2.2180) loss 4.4340 (4.5476) grad_norm 1.3445 (1.3737) [2022-01-18 02:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][610/1251] eta 0:23:44 lr 0.000974 time 2.6993 (2.2221) loss 3.5839 (4.5444) grad_norm 1.3442 (1.3725) [2022-01-18 02:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][620/1251] eta 0:23:21 lr 0.000975 time 1.5301 (2.2212) loss 5.3671 (4.5433) grad_norm 1.3381 (1.3715) [2022-01-18 02:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][630/1251] eta 0:22:57 lr 0.000975 time 1.9301 (2.2184) loss 5.0148 (4.5446) grad_norm 1.1290 (1.3694) [2022-01-18 02:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][640/1251] eta 0:22:34 lr 0.000976 time 1.9169 (2.2162) loss 4.9426 (4.5472) grad_norm 1.0251 (1.3673) [2022-01-18 02:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][650/1251] eta 0:22:11 lr 0.000976 time 2.8514 (2.2152) loss 5.1503 (4.5470) grad_norm 1.4524 (1.3665) [2022-01-18 02:53:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][660/1251] eta 0:21:48 lr 0.000976 time 1.7209 (2.2140) loss 4.4468 (4.5412) grad_norm 1.1791 (1.3657) [2022-01-18 02:53:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][670/1251] eta 0:21:26 lr 0.000977 time 1.9107 (2.2143) loss 3.9993 (4.5413) grad_norm 1.3407 (1.3643) [2022-01-18 02:53:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][680/1251] eta 0:21:04 lr 0.000977 time 2.2544 (2.2147) loss 3.8541 (4.5346) grad_norm 1.2753 (1.3632) [2022-01-18 02:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][690/1251] eta 0:20:42 lr 0.000978 time 2.2058 (2.2139) loss 4.7765 (4.5349) grad_norm 1.5110 (1.3629) [2022-01-18 02:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][700/1251] eta 0:20:19 lr 0.000978 time 2.2656 (2.2136) loss 4.5949 (4.5350) grad_norm 1.4847 (1.3620) [2022-01-18 02:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][710/1251] eta 0:19:56 lr 0.000978 time 1.8343 (2.2114) loss 4.9014 (4.5370) grad_norm 0.9549 (1.3603) [2022-01-18 02:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][720/1251] eta 0:19:34 lr 0.000979 time 2.7203 (2.2110) loss 4.9681 (4.5366) grad_norm 1.5581 (1.3632) [2022-01-18 02:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][730/1251] eta 0:19:11 lr 0.000979 time 2.1814 (2.2109) loss 4.9703 (4.5355) grad_norm 1.4375 (1.3642) [2022-01-18 02:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][740/1251] eta 0:18:48 lr 0.000980 time 1.6416 (2.2090) loss 4.5747 (4.5355) grad_norm 1.2119 (1.3627) [2022-01-18 02:56:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][750/1251] eta 0:18:26 lr 0.000980 time 1.5331 (2.2079) loss 4.5510 (4.5356) grad_norm 1.4080 (1.3651) [2022-01-18 02:56:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][760/1251] eta 0:18:04 lr 0.000980 time 2.7650 (2.2095) loss 4.3782 (4.5345) grad_norm 1.2916 (1.3648) [2022-01-18 02:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][770/1251] eta 0:17:42 lr 0.000981 time 1.8773 (2.2099) loss 4.7131 (4.5388) grad_norm 1.6331 (1.3640) [2022-01-18 02:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][780/1251] eta 0:17:21 lr 0.000981 time 
1.6109 (2.2104) loss 3.6773 (4.5416) grad_norm 1.4984 (1.3662) [2022-01-18 02:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][790/1251] eta 0:16:59 lr 0.000982 time 1.5924 (2.2113) loss 3.2550 (4.5398) grad_norm 1.3710 (1.3663) [2022-01-18 02:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][800/1251] eta 0:16:37 lr 0.000982 time 2.7444 (2.2116) loss 4.5750 (4.5407) grad_norm 1.0743 (1.3646) [2022-01-18 02:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][810/1251] eta 0:16:14 lr 0.000982 time 1.8379 (2.2102) loss 4.5708 (4.5426) grad_norm 1.1409 (1.3632) [2022-01-18 02:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][820/1251] eta 0:15:51 lr 0.000983 time 1.5306 (2.2075) loss 4.5281 (4.5421) grad_norm 1.4621 (1.3648) [2022-01-18 02:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][830/1251] eta 0:15:29 lr 0.000983 time 2.1171 (2.2068) loss 4.3598 (4.5408) grad_norm 1.1201 (1.3642) [2022-01-18 02:59:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][840/1251] eta 0:15:06 lr 0.000984 time 2.3423 (2.2049) loss 4.5791 (4.5394) grad_norm 1.1347 (1.3637) [2022-01-18 02:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][850/1251] eta 0:14:44 lr 0.000984 time 2.0058 (2.2048) loss 5.2749 (4.5386) grad_norm 1.1613 (1.3624) [2022-01-18 03:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][860/1251] eta 0:14:21 lr 0.000984 time 1.4624 (2.2032) loss 3.4951 (4.5407) grad_norm 1.1876 (1.3623) [2022-01-18 03:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][870/1251] eta 0:14:00 lr 0.000985 time 2.4656 (2.2058) loss 4.2036 (4.5402) grad_norm 1.1096 (1.3613) [2022-01-18 03:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][880/1251] eta 0:13:38 lr 0.000985 time 1.9526 (2.2066) loss 4.5010 (4.5419) grad_norm 1.3257 (1.3616) [2022-01-18 03:01:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][890/1251] eta 0:13:16 lr 0.000986 time 2.1152 (2.2068) loss 3.3720 (4.5437) grad_norm 1.9192 (1.3609) [2022-01-18 03:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][900/1251] eta 0:12:54 lr 0.000986 time 1.8476 (2.2059) loss 4.6294 (4.5453) grad_norm 1.3367 (1.3617) [2022-01-18 03:02:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][910/1251] eta 0:12:31 lr 0.000986 time 1.5507 (2.2033) loss 5.0396 (4.5448) grad_norm 1.3806 (1.3628) [2022-01-18 03:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][920/1251] eta 0:12:08 lr 0.000987 time 2.2050 (2.2015) loss 4.8385 (4.5454) grad_norm 1.1520 (1.3626) [2022-01-18 03:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][930/1251] eta 0:11:46 lr 0.000987 time 1.7012 (2.2016) loss 4.2210 (4.5467) grad_norm 1.3168 (1.3618) [2022-01-18 03:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][940/1251] eta 0:11:24 lr 0.000988 time 2.0471 (2.2025) loss 4.7784 (4.5470) grad_norm 1.6914 (1.3624) [2022-01-18 03:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][950/1251] eta 0:11:04 lr 0.000988 time 3.5665 (2.2062) loss 5.3265 (4.5481) grad_norm 1.3518 (1.3618) [2022-01-18 03:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][960/1251] eta 0:10:42 lr 0.000988 time 2.2414 (2.2081) loss 5.0539 (4.5485) grad_norm 1.0354 (1.3613) [2022-01-18 03:04:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][970/1251] eta 0:10:20 lr 0.000989 time 1.5512 (2.2067) loss 5.4154 (4.5535) grad_norm 1.4794 (1.3604) [2022-01-18 03:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][980/1251] eta 0:09:57 lr 0.000989 time 2.0136 (2.2057) loss 4.7126 (4.5565) grad_norm 1.0759 (1.3606) [2022-01-18 03:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][990/1251] eta 0:09:36 lr 0.000990 time 
2.5338 (2.2069) loss 4.2097 (4.5567) grad_norm 1.9835 (1.3616) [2022-01-18 03:05:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1000/1251] eta 0:09:14 lr 0.000990 time 1.8589 (2.2081) loss 4.7286 (4.5574) grad_norm 1.2052 (1.3620) [2022-01-18 03:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1010/1251] eta 0:08:51 lr 0.000990 time 1.5921 (2.2066) loss 3.2403 (4.5567) grad_norm 1.3119 (1.3614) [2022-01-18 03:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1020/1251] eta 0:08:29 lr 0.000991 time 1.5795 (2.2049) loss 5.1598 (4.5512) grad_norm 1.3092 (1.3603) [2022-01-18 03:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1030/1251] eta 0:08:06 lr 0.000991 time 1.8921 (2.2035) loss 4.7072 (4.5505) grad_norm 1.2244 (1.3602) [2022-01-18 03:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1040/1251] eta 0:07:44 lr 0.000992 time 2.6326 (2.2035) loss 5.2713 (4.5492) grad_norm 1.2332 (1.3593) [2022-01-18 03:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1050/1251] eta 0:07:22 lr 0.000992 time 1.9523 (2.2025) loss 5.0670 (4.5505) grad_norm 1.3526 (1.3595) [2022-01-18 03:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1060/1251] eta 0:07:00 lr 0.000992 time 2.8756 (2.2015) loss 4.1627 (4.5482) grad_norm 1.1274 (1.3597) [2022-01-18 03:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1070/1251] eta 0:06:38 lr 0.000993 time 2.2790 (2.2021) loss 3.3762 (4.5475) grad_norm 1.3243 (1.3595) [2022-01-18 03:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1080/1251] eta 0:06:16 lr 0.000993 time 2.5288 (2.2024) loss 5.0719 (4.5494) grad_norm 1.5190 (1.3594) [2022-01-18 03:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1090/1251] eta 0:05:54 lr 0.000994 time 1.6171 (2.2030) loss 4.8391 (4.5508) grad_norm 1.6585 (1.3606) [2022-01-18 03:09:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1100/1251] eta 0:05:32 lr 0.000994 time 2.3013 (2.2029) loss 5.4834 (4.5528) grad_norm 1.6167 (1.3602) [2022-01-18 03:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1110/1251] eta 0:05:10 lr 0.000994 time 1.9300 (2.2025) loss 3.9820 (4.5519) grad_norm 1.1595 (1.3596) [2022-01-18 03:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1120/1251] eta 0:04:48 lr 0.000995 time 2.7010 (2.2022) loss 3.9145 (4.5539) grad_norm 1.2574 (1.3583) [2022-01-18 03:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1130/1251] eta 0:04:26 lr 0.000995 time 1.7483 (2.2018) loss 3.7903 (4.5512) grad_norm 1.4225 (1.3585) [2022-01-18 03:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1140/1251] eta 0:04:04 lr 0.000996 time 1.9112 (2.2016) loss 3.7771 (4.5505) grad_norm 1.5646 (1.3586) [2022-01-18 03:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1150/1251] eta 0:03:42 lr 0.000996 time 2.2347 (2.2016) loss 4.4218 (4.5489) grad_norm 1.2272 (1.3580) [2022-01-18 03:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1160/1251] eta 0:03:20 lr 0.000996 time 3.0277 (2.2024) loss 3.2858 (4.5488) grad_norm 1.1135 (1.3578) [2022-01-18 03:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1170/1251] eta 0:02:58 lr 0.000997 time 1.9512 (2.2038) loss 4.9121 (4.5496) grad_norm 1.2215 (1.3571) [2022-01-18 03:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1180/1251] eta 0:02:36 lr 0.000997 time 1.6640 (2.2036) loss 4.5060 (4.5505) grad_norm 1.0829 (1.3558) [2022-01-18 03:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1190/1251] eta 0:02:14 lr 0.000998 time 2.2796 (2.2028) loss 4.6174 (4.5535) grad_norm 1.2879 (1.3554) [2022-01-18 03:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1200/1251] eta 0:01:52 lr 
0.000998 time 2.3062 (2.2007) loss 4.8450 (4.5525) grad_norm 1.5468 (1.3557) [2022-01-18 03:13:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1210/1251] eta 0:01:30 lr 0.000998 time 1.9730 (2.1989) loss 3.6910 (4.5557) grad_norm 1.2215 (1.3558) [2022-01-18 03:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1220/1251] eta 0:01:08 lr 0.000999 time 2.5118 (2.1989) loss 3.3549 (4.5532) grad_norm 1.2125 (1.3559) [2022-01-18 03:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1230/1251] eta 0:00:46 lr 0.000999 time 2.4732 (2.1995) loss 4.7267 (4.5542) grad_norm 1.2418 (1.3559) [2022-01-18 03:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1240/1251] eta 0:00:24 lr 0.001000 time 1.8339 (2.1988) loss 4.0747 (4.5547) grad_norm 1.2792 (1.3555) [2022-01-18 03:14:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [19/300][1250/1251] eta 0:00:02 lr 0.001000 time 1.1993 (2.1932) loss 5.0036 (4.5538) grad_norm 1.1278 (1.3548) [2022-01-18 03:14:21 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 19 training takes 0:45:44 [2022-01-18 03:14:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.958 (17.958) Loss 1.8532 (1.8532) Acc@1 58.105 (58.105) Acc@5 82.227 (82.227) [2022-01-18 03:14:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.613 (3.327) Loss 1.8179 (1.8100) Acc@1 58.887 (59.553) Acc@5 82.812 (82.812) [2022-01-18 03:15:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.922 (2.586) Loss 1.9512 (1.8360) Acc@1 55.469 (59.003) Acc@5 80.469 (82.375) [2022-01-18 03:15:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.626 (2.287) Loss 1.7876 (1.8384) Acc@1 59.766 (58.959) Acc@5 84.082 (82.353) [2022-01-18 03:15:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.588 (2.160) Loss 1.8069 (1.8382) Acc@1 60.742 (59.013) Acc@5 83.008 (82.389) [2022-01-18 03:15:58 
swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 58.988 Acc@5 82.560 [2022-01-18 03:15:58 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 59.0% [2022-01-18 03:15:58 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 58.99% [2022-01-18 03:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][0/1251] eta 7:20:01 lr 0.000989 time 21.1041 (21.1041) loss 4.3490 (4.3490) grad_norm 1.2131 (1.2131) [2022-01-18 03:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][10/1251] eta 1:22:12 lr 0.000989 time 1.8092 (3.9747) loss 3.9874 (4.5802) grad_norm 1.3414 (1.3378) [2022-01-18 03:17:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][20/1251] eta 1:05:12 lr 0.000989 time 1.4633 (3.1785) loss 3.3249 (4.5802) grad_norm 1.1736 (1.3360) [2022-01-18 03:17:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][30/1251] eta 0:57:34 lr 0.000989 time 1.5673 (2.8290) loss 5.4867 (4.5550) grad_norm 1.2265 (1.2967) [2022-01-18 03:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][40/1251] eta 0:54:47 lr 0.000989 time 3.6873 (2.7150) loss 3.7357 (4.4543) grad_norm 1.2428 (1.2850) [2022-01-18 03:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][50/1251] eta 0:53:24 lr 0.000989 time 2.4485 (2.6679) loss 3.5934 (4.4377) grad_norm 1.6444 (1.3186) [2022-01-18 03:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][60/1251] eta 0:51:40 lr 0.000989 time 1.4800 (2.6033) loss 4.2683 (4.4451) grad_norm 1.0312 (1.3207) [2022-01-18 03:18:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][70/1251] eta 0:49:41 lr 0.000989 time 1.8537 (2.5242) loss 4.5832 (4.4764) grad_norm 1.3020 (1.3318) [2022-01-18 03:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][80/1251] eta 0:48:17 lr 0.000989 time 3.2768 (2.4742) loss 5.0064 (4.4791) grad_norm 1.4359 (1.3524) 
[2022-01-18 03:19:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][90/1251] eta 0:47:09 lr 0.000989 time 1.9281 (2.4371) loss 4.0656 (4.4594) grad_norm 2.1249 (1.3702) [2022-01-18 03:20:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][100/1251] eta 0:46:26 lr 0.000989 time 2.2045 (2.4209) loss 4.2083 (4.4198) grad_norm 1.5033 (1.3822) [2022-01-18 03:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][110/1251] eta 0:45:45 lr 0.000989 time 2.2410 (2.4065) loss 5.4515 (4.4293) grad_norm 1.1034 (1.3705) [2022-01-18 03:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][120/1251] eta 0:45:12 lr 0.000989 time 2.8725 (2.3984) loss 5.1375 (4.4313) grad_norm 1.3459 (1.3626) [2022-01-18 03:21:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][130/1251] eta 0:44:42 lr 0.000989 time 2.1535 (2.3932) loss 3.9341 (4.4251) grad_norm 1.2217 (1.3604) [2022-01-18 03:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][140/1251] eta 0:43:55 lr 0.000989 time 1.8992 (2.3722) loss 4.4444 (4.4254) grad_norm 1.2591 (1.3601) [2022-01-18 03:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][150/1251] eta 0:43:09 lr 0.000989 time 2.0714 (2.3521) loss 4.2721 (4.4430) grad_norm 1.1744 (1.3580) [2022-01-18 03:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][160/1251] eta 0:42:17 lr 0.000989 time 2.3259 (2.3263) loss 5.0394 (4.4544) grad_norm 1.4334 (1.3539) [2022-01-18 03:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][170/1251] eta 0:41:38 lr 0.000989 time 2.1498 (2.3114) loss 4.0710 (4.4377) grad_norm 1.3168 (1.3511) [2022-01-18 03:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][180/1251] eta 0:41:08 lr 0.000989 time 2.2126 (2.3050) loss 5.0287 (4.4534) grad_norm 1.1780 (1.3401) [2022-01-18 03:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][190/1251] eta 0:40:44 
lr 0.000989 time 2.4124 (2.3038) loss 4.4503 (4.4683) grad_norm 1.3012 (1.3363) [2022-01-18 03:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][200/1251] eta 0:40:17 lr 0.000989 time 2.2001 (2.3005) loss 4.3132 (4.4836) grad_norm 1.3157 (1.3313) [2022-01-18 03:24:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][210/1251] eta 0:40:00 lr 0.000989 time 2.3853 (2.3058) loss 4.4379 (4.5005) grad_norm 1.2863 (1.3234) [2022-01-18 03:24:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][220/1251] eta 0:39:33 lr 0.000989 time 2.2332 (2.3023) loss 3.2421 (4.5031) grad_norm 1.4003 (1.3235) [2022-01-18 03:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][230/1251] eta 0:38:58 lr 0.000989 time 1.7994 (2.2908) loss 4.4038 (4.5088) grad_norm 1.0251 (1.3216) [2022-01-18 03:25:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][240/1251] eta 0:38:23 lr 0.000989 time 2.1224 (2.2785) loss 3.6048 (4.5051) grad_norm 1.3013 (1.3272) [2022-01-18 03:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][250/1251] eta 0:37:48 lr 0.000989 time 1.8705 (2.2661) loss 4.4091 (4.5016) grad_norm 1.2742 (1.3242) [2022-01-18 03:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][260/1251] eta 0:37:20 lr 0.000989 time 2.1331 (2.2608) loss 4.8725 (4.5013) grad_norm 2.3956 (1.3299) [2022-01-18 03:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][270/1251] eta 0:36:56 lr 0.000989 time 2.2441 (2.2595) loss 3.9529 (4.5038) grad_norm 1.1306 (1.3378) [2022-01-18 03:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][280/1251] eta 0:36:30 lr 0.000989 time 2.6714 (2.2557) loss 3.8989 (4.5099) grad_norm 1.1384 (1.3337) [2022-01-18 03:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][290/1251] eta 0:36:05 lr 0.000989 time 2.4979 (2.2530) loss 4.0866 (4.5064) grad_norm 1.9094 (1.3386) [2022-01-18 03:27:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][300/1251] eta 0:35:42 lr 0.000989 time 2.1690 (2.2532) loss 3.4925 (4.5046) grad_norm 1.0898 (1.3369) [2022-01-18 03:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][310/1251] eta 0:35:18 lr 0.000989 time 2.1966 (2.2509) loss 5.0977 (4.5025) grad_norm 1.7375 (1.3380) [2022-01-18 03:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][320/1251] eta 0:34:51 lr 0.000989 time 1.8183 (2.2469) loss 4.9338 (4.5099) grad_norm 1.8075 (1.3373) [2022-01-18 03:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][330/1251] eta 0:34:34 lr 0.000989 time 3.1404 (2.2520) loss 4.8558 (4.5108) grad_norm 1.2601 (1.3365) [2022-01-18 03:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][340/1251] eta 0:34:08 lr 0.000989 time 1.6185 (2.2491) loss 4.5871 (4.5166) grad_norm 1.2651 (1.3337) [2022-01-18 03:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][350/1251] eta 0:33:45 lr 0.000989 time 2.3461 (2.2477) loss 5.1920 (4.5137) grad_norm 1.3818 (1.3317) [2022-01-18 03:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][360/1251] eta 0:33:19 lr 0.000989 time 2.5882 (2.2436) loss 4.6357 (4.5063) grad_norm 1.1682 (1.3282) [2022-01-18 03:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][370/1251] eta 0:32:54 lr 0.000989 time 2.3744 (2.2409) loss 4.2295 (4.5077) grad_norm 1.2928 (1.3266) [2022-01-18 03:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][380/1251] eta 0:32:28 lr 0.000989 time 1.8500 (2.2366) loss 4.8767 (4.5055) grad_norm 1.3539 (1.3292) [2022-01-18 03:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][390/1251] eta 0:32:04 lr 0.000989 time 2.3103 (2.2351) loss 3.8622 (4.5022) grad_norm 1.6188 (1.3296) [2022-01-18 03:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][400/1251] eta 0:31:41 lr 0.000989 time 
2.4833 (2.2342) loss 5.0967 (4.5004) grad_norm 1.2141 (1.3285) [2022-01-18 03:31:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][410/1251] eta 0:31:20 lr 0.000989 time 2.7180 (2.2365) loss 5.2363 (4.4976) grad_norm 1.5551 (1.3274) [2022-01-18 03:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][420/1251] eta 0:30:56 lr 0.000989 time 1.6221 (2.2339) loss 4.7529 (4.5096) grad_norm 1.3681 (1.3273) [2022-01-18 03:32:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][430/1251] eta 0:30:31 lr 0.000989 time 2.1253 (2.2309) loss 5.1136 (4.5172) grad_norm 1.4016 (1.3291) [2022-01-18 03:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][440/1251] eta 0:30:05 lr 0.000989 time 2.2286 (2.2265) loss 4.5585 (4.5095) grad_norm 1.4347 (1.3319) [2022-01-18 03:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][450/1251] eta 0:29:40 lr 0.000989 time 1.9975 (2.2228) loss 5.1827 (4.5021) grad_norm 1.0367 (1.3299) [2022-01-18 03:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][460/1251] eta 0:29:15 lr 0.000989 time 1.8686 (2.2189) loss 4.7894 (4.5050) grad_norm 1.2798 (1.3284) [2022-01-18 03:33:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][470/1251] eta 0:28:55 lr 0.000989 time 2.8833 (2.2216) loss 4.0162 (4.5040) grad_norm 1.0960 (1.3254) [2022-01-18 03:33:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][480/1251] eta 0:28:33 lr 0.000989 time 2.7864 (2.2229) loss 3.4572 (4.4976) grad_norm 1.4374 (1.3226) [2022-01-18 03:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][490/1251] eta 0:28:10 lr 0.000989 time 2.0188 (2.2219) loss 4.7026 (4.4947) grad_norm 1.4035 (1.3208) [2022-01-18 03:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][500/1251] eta 0:27:49 lr 0.000989 time 2.2017 (2.2226) loss 4.6487 (4.4896) grad_norm 1.2575 (1.3242) [2022-01-18 03:34:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][510/1251] eta 0:27:27 lr 0.000989 time 2.7671 (2.2230) loss 4.2354 (4.4807) grad_norm 1.2240 (1.3252) [2022-01-18 03:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][520/1251] eta 0:27:03 lr 0.000989 time 1.6218 (2.2208) loss 5.0205 (4.4783) grad_norm 1.3738 (1.3257) [2022-01-18 03:35:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][530/1251] eta 0:26:39 lr 0.000989 time 2.1973 (2.2191) loss 4.6266 (4.4801) grad_norm 1.2180 (1.3239) [2022-01-18 03:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][540/1251] eta 0:26:16 lr 0.000989 time 2.5365 (2.2179) loss 5.1054 (4.4772) grad_norm 1.1233 (1.3247) [2022-01-18 03:36:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][550/1251] eta 0:25:56 lr 0.000989 time 3.6420 (2.2201) loss 4.2501 (4.4780) grad_norm 1.9424 (1.3262) [2022-01-18 03:36:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][560/1251] eta 0:25:32 lr 0.000989 time 2.0986 (2.2182) loss 5.2891 (4.4776) grad_norm 1.2806 (1.3256) [2022-01-18 03:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][570/1251] eta 0:25:10 lr 0.000989 time 1.8749 (2.2182) loss 4.4989 (4.4802) grad_norm 1.6563 (1.3245) [2022-01-18 03:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][580/1251] eta 0:24:47 lr 0.000989 time 2.1799 (2.2163) loss 5.0469 (4.4788) grad_norm 1.4202 (1.3244) [2022-01-18 03:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][590/1251] eta 0:24:24 lr 0.000989 time 3.5694 (2.2160) loss 4.6890 (4.4793) grad_norm 1.7677 (1.3246) [2022-01-18 03:38:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][600/1251] eta 0:24:02 lr 0.000989 time 2.1236 (2.2156) loss 4.9381 (4.4847) grad_norm 1.1396 (1.3249) [2022-01-18 03:38:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][610/1251] eta 0:23:41 lr 0.000989 time 
2.1486 (2.2172) loss 4.7375 (4.4852) grad_norm 1.2909 (1.3269) [2022-01-18 03:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][620/1251] eta 0:23:18 lr 0.000989 time 2.0734 (2.2159) loss 4.6225 (4.4820) grad_norm 1.2159 (1.3291) [2022-01-18 03:39:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][630/1251] eta 0:22:54 lr 0.000989 time 2.4977 (2.2140) loss 5.0704 (4.4806) grad_norm 1.3897 (1.3300) [2022-01-18 03:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][640/1251] eta 0:22:31 lr 0.000989 time 1.5835 (2.2112) loss 4.6610 (4.4817) grad_norm 1.1364 (1.3265) [2022-01-18 03:39:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][650/1251] eta 0:22:07 lr 0.000989 time 1.9871 (2.2090) loss 5.1494 (4.4806) grad_norm 1.5620 (1.3267) [2022-01-18 03:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][660/1251] eta 0:21:45 lr 0.000989 time 2.1187 (2.2094) loss 4.9935 (4.4795) grad_norm 1.1650 (1.3255) [2022-01-18 03:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][670/1251] eta 0:21:23 lr 0.000989 time 2.2059 (2.2097) loss 4.9913 (4.4763) grad_norm 1.2410 (1.3252) [2022-01-18 03:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][680/1251] eta 0:21:02 lr 0.000989 time 1.5449 (2.2115) loss 4.5210 (4.4735) grad_norm 1.4732 (1.3253) [2022-01-18 03:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][690/1251] eta 0:20:41 lr 0.000989 time 2.3433 (2.2133) loss 4.6196 (4.4768) grad_norm 1.1649 (1.3250) [2022-01-18 03:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][700/1251] eta 0:20:18 lr 0.000989 time 2.1592 (2.2119) loss 5.2878 (4.4817) grad_norm 1.1577 (1.3231) [2022-01-18 03:42:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][710/1251] eta 0:19:56 lr 0.000989 time 2.1602 (2.2114) loss 5.5002 (4.4820) grad_norm 1.1134 (1.3218) [2022-01-18 03:42:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][720/1251] eta 0:19:33 lr 0.000989 time 1.8746 (2.2108) loss 5.0898 (4.4848) grad_norm 1.5811 (1.3226) [2022-01-18 03:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][730/1251] eta 0:19:11 lr 0.000989 time 1.8654 (2.2100) loss 4.9984 (4.4816) grad_norm 1.1787 (1.3220) [2022-01-18 03:43:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][740/1251] eta 0:18:48 lr 0.000989 time 2.2645 (2.2091) loss 5.1410 (4.4840) grad_norm 1.2986 (1.3209) [2022-01-18 03:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][750/1251] eta 0:18:26 lr 0.000989 time 1.9474 (2.2083) loss 5.1106 (4.4787) grad_norm 1.2537 (1.3203) [2022-01-18 03:44:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][760/1251] eta 0:18:05 lr 0.000989 time 1.9593 (2.2101) loss 4.1957 (4.4772) grad_norm 1.5034 (1.3217) [2022-01-18 03:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][770/1251] eta 0:17:43 lr 0.000989 time 1.6515 (2.2104) loss 4.9948 (4.4777) grad_norm 1.3672 (1.3225) [2022-01-18 03:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][780/1251] eta 0:17:20 lr 0.000989 time 2.2565 (2.2099) loss 4.6649 (4.4790) grad_norm 0.9632 (1.3215) [2022-01-18 03:45:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][790/1251] eta 0:16:58 lr 0.000988 time 2.2485 (2.2095) loss 4.7298 (4.4805) grad_norm 1.1549 (1.3198) [2022-01-18 03:45:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][800/1251] eta 0:16:36 lr 0.000988 time 2.1199 (2.2098) loss 4.1477 (4.4801) grad_norm 1.2752 (1.3177) [2022-01-18 03:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][810/1251] eta 0:16:14 lr 0.000988 time 2.4890 (2.2108) loss 3.5534 (4.4791) grad_norm 1.5820 (1.3168) [2022-01-18 03:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][820/1251] eta 0:15:52 lr 0.000988 time 
2.1319 (2.2096) loss 5.1120 (4.4793) grad_norm 1.1506 (1.3183) [2022-01-18 03:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][830/1251] eta 0:15:28 lr 0.000988 time 1.6741 (2.2059) loss 4.4780 (4.4826) grad_norm 1.3212 (1.3183) [2022-01-18 03:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][840/1251] eta 0:15:05 lr 0.000988 time 1.5990 (2.2036) loss 4.7765 (4.4846) grad_norm 1.0951 (1.3179) [2022-01-18 03:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][850/1251] eta 0:14:43 lr 0.000988 time 2.3883 (2.2022) loss 4.5611 (4.4868) grad_norm 1.3733 (1.3178) [2022-01-18 03:47:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][860/1251] eta 0:14:21 lr 0.000988 time 2.5383 (2.2023) loss 3.5089 (4.4827) grad_norm 1.5054 (1.3173) [2022-01-18 03:47:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][870/1251] eta 0:13:58 lr 0.000988 time 1.9000 (2.2017) loss 3.7102 (4.4781) grad_norm 1.2702 (1.3183) [2022-01-18 03:48:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][880/1251] eta 0:13:36 lr 0.000988 time 2.3459 (2.2018) loss 3.7457 (4.4774) grad_norm 1.0788 (1.3181) [2022-01-18 03:48:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][890/1251] eta 0:13:15 lr 0.000988 time 2.0083 (2.2026) loss 4.2739 (4.4728) grad_norm 1.7223 (1.3178) [2022-01-18 03:49:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][900/1251] eta 0:12:53 lr 0.000988 time 2.6198 (2.2032) loss 4.6892 (4.4735) grad_norm 1.1601 (1.3165) [2022-01-18 03:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][910/1251] eta 0:12:31 lr 0.000988 time 1.5188 (2.2027) loss 5.0197 (4.4749) grad_norm 1.1636 (1.3158) [2022-01-18 03:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][920/1251] eta 0:12:09 lr 0.000988 time 3.4890 (2.2050) loss 4.6245 (4.4749) grad_norm 1.2951 (1.3145) [2022-01-18 03:50:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][930/1251] eta 0:11:47 lr 0.000988 time 1.6212 (2.2055) loss 3.8762 (4.4722) grad_norm 1.3452 (1.3135) [2022-01-18 03:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][940/1251] eta 0:11:26 lr 0.000988 time 2.1028 (2.2060) loss 4.6980 (4.4716) grad_norm 1.7122 (1.3132) [2022-01-18 03:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][950/1251] eta 0:11:03 lr 0.000988 time 1.5426 (2.2052) loss 4.3501 (4.4702) grad_norm 1.3802 (1.3119) [2022-01-18 03:51:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][960/1251] eta 0:10:41 lr 0.000988 time 2.4691 (2.2046) loss 4.4542 (4.4692) grad_norm 1.0279 (1.3108) [2022-01-18 03:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][970/1251] eta 0:10:18 lr 0.000988 time 2.0044 (2.2023) loss 3.5715 (4.4670) grad_norm 1.1433 (1.3101) [2022-01-18 03:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][980/1251] eta 0:09:56 lr 0.000988 time 1.8762 (2.2006) loss 3.9944 (4.4657) grad_norm 1.3993 (1.3106) [2022-01-18 03:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][990/1251] eta 0:09:34 lr 0.000988 time 2.8294 (2.2009) loss 4.6688 (4.4662) grad_norm 1.4125 (1.3113) [2022-01-18 03:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1000/1251] eta 0:09:12 lr 0.000988 time 2.5820 (2.2015) loss 4.9571 (4.4655) grad_norm 1.1060 (1.3100) [2022-01-18 03:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1010/1251] eta 0:08:51 lr 0.000988 time 2.5104 (2.2041) loss 4.6479 (4.4657) grad_norm 1.4106 (1.3090) [2022-01-18 03:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1020/1251] eta 0:08:29 lr 0.000988 time 1.5695 (2.2051) loss 3.9225 (4.4656) grad_norm 1.5325 (1.3091) [2022-01-18 03:53:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1030/1251] eta 0:08:07 lr 0.000988 time 
2.2149 (2.2051) loss 4.3192 (4.4646) grad_norm 1.5144 (1.3099) [2022-01-18 03:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1040/1251] eta 0:07:45 lr 0.000988 time 1.9750 (2.2038) loss 3.5483 (4.4645) grad_norm 1.1696 (1.3089) [2022-01-18 03:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1050/1251] eta 0:07:23 lr 0.000988 time 2.2144 (2.2051) loss 4.8807 (4.4630) grad_norm 1.6789 (1.3092) [2022-01-18 03:54:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1060/1251] eta 0:07:01 lr 0.000988 time 1.9165 (2.2051) loss 4.0076 (4.4624) grad_norm 1.1636 (1.3084) [2022-01-18 03:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1070/1251] eta 0:06:38 lr 0.000988 time 2.1621 (2.2042) loss 4.2222 (4.4645) grad_norm 1.4802 (1.3074) [2022-01-18 03:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1080/1251] eta 0:06:16 lr 0.000988 time 1.5837 (2.2021) loss 4.9408 (4.4636) grad_norm 1.6003 (1.3082) [2022-01-18 03:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1090/1251] eta 0:05:54 lr 0.000988 time 2.2874 (2.2029) loss 4.8116 (4.4643) grad_norm 1.6730 (1.3085) [2022-01-18 03:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1100/1251] eta 0:05:32 lr 0.000988 time 2.2216 (2.2020) loss 4.5081 (4.4615) grad_norm 1.1693 (1.3080) [2022-01-18 03:56:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1110/1251] eta 0:05:10 lr 0.000988 time 1.6615 (2.2007) loss 5.4046 (4.4615) grad_norm 1.1072 (1.3069) [2022-01-18 03:57:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1120/1251] eta 0:04:48 lr 0.000988 time 2.4589 (2.2004) loss 5.2943 (4.4625) grad_norm 1.3047 (1.3070) [2022-01-18 03:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1130/1251] eta 0:04:26 lr 0.000988 time 1.8633 (2.2004) loss 4.2047 (4.4606) grad_norm 1.2849 (1.3061) [2022-01-18 03:57:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1140/1251] eta 0:04:04 lr 0.000988 time 3.1151 (2.2010) loss 4.8547 (4.4589) grad_norm 1.3358 (1.3055) [2022-01-18 03:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1150/1251] eta 0:03:42 lr 0.000988 time 1.9211 (2.2011) loss 5.3541 (4.4614) grad_norm 1.3197 (1.3065) [2022-01-18 03:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1160/1251] eta 0:03:20 lr 0.000988 time 2.3314 (2.2009) loss 3.8021 (4.4612) grad_norm 1.3579 (1.3056) [2022-01-18 03:58:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1170/1251] eta 0:02:58 lr 0.000988 time 2.2572 (2.2007) loss 4.8398 (4.4605) grad_norm 1.2373 (1.3047) [2022-01-18 03:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1180/1251] eta 0:02:36 lr 0.000988 time 2.2436 (2.1988) loss 4.6629 (4.4614) grad_norm 1.1640 (1.3054) [2022-01-18 03:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1190/1251] eta 0:02:14 lr 0.000988 time 2.0468 (2.1972) loss 5.1922 (4.4598) grad_norm 1.5116 (1.3059) [2022-01-18 03:59:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1200/1251] eta 0:01:52 lr 0.000988 time 2.3560 (2.1961) loss 4.3503 (4.4617) grad_norm 1.2208 (1.3058) [2022-01-18 04:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1210/1251] eta 0:01:30 lr 0.000988 time 2.1598 (2.1962) loss 5.0886 (4.4609) grad_norm 1.0622 (1.3054) [2022-01-18 04:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1220/1251] eta 0:01:08 lr 0.000988 time 2.1033 (2.1966) loss 4.5227 (4.4605) grad_norm 1.3586 (1.3046) [2022-01-18 04:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1230/1251] eta 0:00:46 lr 0.000988 time 1.9162 (2.1964) loss 5.3529 (4.4617) grad_norm 1.0493 (1.3044) [2022-01-18 04:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1240/1251] eta 0:00:24 lr 
0.000988 time 2.2179 (2.1957) loss 5.1310 (4.4642) grad_norm 1.1290 (1.3043) [2022-01-18 04:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [20/300][1250/1251] eta 0:00:02 lr 0.000988 time 1.1428 (2.1906) loss 4.5435 (4.4641) grad_norm 1.0950 (1.3035) [2022-01-18 04:01:39 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 20 training takes 0:45:40 [2022-01-18 04:01:39 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_20 saving...... [2022-01-18 04:01:50 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_20 saved !!! [2022-01-18 04:02:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.915 (16.915) Loss 1.7822 (1.7822) Acc@1 61.035 (61.035) Acc@5 83.594 (83.594) [2022-01-18 04:02:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.969 (2.983) Loss 1.7964 (1.7983) Acc@1 61.621 (60.396) Acc@5 84.570 (83.620) [2022-01-18 04:02:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.678 (2.383) Loss 1.8233 (1.8042) Acc@1 58.398 (60.231) Acc@5 83.398 (83.570) [2022-01-18 04:02:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.348 (2.079) Loss 1.8022 (1.7989) Acc@1 60.840 (60.396) Acc@5 83.594 (83.783) [2022-01-18 04:03:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.718 (1.998) Loss 1.8684 (1.8089) Acc@1 60.059 (60.101) Acc@5 82.031 (83.572) [2022-01-18 04:03:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 60.100 Acc@5 83.542 [2022-01-18 04:03:20 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 60.1% [2022-01-18 04:03:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 60.10% [2022-01-18 04:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][0/1251] eta 7:33:47 lr 0.000988 time 21.7649 (21.7649) loss 3.9121 (3.9121) grad_norm 1.1757 (1.1757) 
[2022-01-18 04:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][10/1251] eta 1:25:09 lr 0.000988 time 2.2577 (4.1174) loss 4.6986 (4.5951) grad_norm 1.2648 (1.2655) [2022-01-18 04:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][20/1251] eta 1:04:23 lr 0.000988 time 1.6599 (3.1382) loss 4.4730 (4.5979) grad_norm 1.5209 (1.3271) [2022-01-18 04:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][30/1251] eta 0:56:26 lr 0.000988 time 1.4936 (2.7738) loss 4.9773 (4.5854) grad_norm 1.3746 (1.3283) [2022-01-18 04:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][40/1251] eta 0:54:17 lr 0.000988 time 3.4829 (2.6899) loss 4.3814 (4.5054) grad_norm 1.2213 (1.3400) [2022-01-18 04:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][50/1251] eta 0:53:03 lr 0.000988 time 4.5933 (2.6511) loss 4.7779 (4.5134) grad_norm 1.2858 (1.3270) [2022-01-18 04:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][60/1251] eta 0:51:44 lr 0.000988 time 2.4669 (2.6065) loss 3.6242 (4.4869) grad_norm 1.3259 (1.3218) [2022-01-18 04:06:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][70/1251] eta 0:50:12 lr 0.000988 time 1.4638 (2.5509) loss 4.3267 (4.5144) grad_norm 1.2549 (1.3225) [2022-01-18 04:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][80/1251] eta 0:48:41 lr 0.000988 time 2.4911 (2.4951) loss 3.3922 (4.4874) grad_norm 1.2143 (1.3141) [2022-01-18 04:07:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][90/1251] eta 0:47:31 lr 0.000988 time 1.9344 (2.4564) loss 4.9467 (4.5029) grad_norm 1.1208 (1.3041) [2022-01-18 04:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][100/1251] eta 0:46:18 lr 0.000988 time 1.9194 (2.4138) loss 4.7786 (4.5020) grad_norm 1.2175 (1.3039) [2022-01-18 04:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][110/1251] eta 0:45:20 lr 
0.000988 time 1.6439 (2.3840) loss 4.2817 (4.5111) grad_norm 1.5165 (1.2986) [2022-01-18 04:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][120/1251] eta 0:44:37 lr 0.000988 time 2.0218 (2.3669) loss 4.1884 (4.5117) grad_norm 1.2767 (1.2942) [2022-01-18 04:08:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][130/1251] eta 0:44:00 lr 0.000988 time 2.2205 (2.3559) loss 3.6665 (4.5074) grad_norm 1.7155 (1.3046) [2022-01-18 04:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][140/1251] eta 0:43:21 lr 0.000988 time 1.9065 (2.3415) loss 3.7943 (4.4977) grad_norm 1.1384 (1.3135) [2022-01-18 04:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][150/1251] eta 0:42:41 lr 0.000988 time 1.8396 (2.3265) loss 5.0808 (4.5223) grad_norm 1.1033 (1.3105) [2022-01-18 04:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][160/1251] eta 0:42:03 lr 0.000988 time 2.2754 (2.3130) loss 5.3250 (4.5176) grad_norm 1.1316 (1.3078) [2022-01-18 04:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][170/1251] eta 0:41:27 lr 0.000988 time 2.3227 (2.3007) loss 4.6552 (4.5189) grad_norm 1.3025 (1.3100) [2022-01-18 04:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][180/1251] eta 0:41:01 lr 0.000988 time 2.4011 (2.2985) loss 5.0519 (4.5197) grad_norm 1.3053 (1.3017) [2022-01-18 04:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][190/1251] eta 0:40:34 lr 0.000988 time 1.6900 (2.2942) loss 5.2263 (4.5230) grad_norm 1.4326 (1.3007) [2022-01-18 04:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][200/1251] eta 0:40:12 lr 0.000988 time 2.1797 (2.2950) loss 4.3859 (4.5145) grad_norm 1.3025 (1.3012) [2022-01-18 04:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][210/1251] eta 0:39:50 lr 0.000988 time 2.2846 (2.2966) loss 3.9491 (4.4992) grad_norm 1.4764 (1.3038) [2022-01-18 04:11:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][220/1251] eta 0:39:26 lr 0.000988 time 2.5375 (2.2958) loss 4.6897 (4.4946) grad_norm 1.5002 (1.3066) [2022-01-18 04:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][230/1251] eta 0:38:47 lr 0.000988 time 1.5780 (2.2797) loss 4.4418 (4.4941) grad_norm 1.2323 (1.3087) [2022-01-18 04:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][240/1251] eta 0:38:13 lr 0.000988 time 1.9331 (2.2683) loss 4.1509 (4.4897) grad_norm 1.1370 (1.3079) [2022-01-18 04:12:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][250/1251] eta 0:37:39 lr 0.000988 time 1.9267 (2.2573) loss 4.6214 (4.4861) grad_norm 1.6499 (1.3078) [2022-01-18 04:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][260/1251] eta 0:37:20 lr 0.000988 time 2.7591 (2.2607) loss 4.2586 (4.4840) grad_norm 1.4432 (1.3069) [2022-01-18 04:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][270/1251] eta 0:36:53 lr 0.000988 time 1.9237 (2.2568) loss 4.8625 (4.4956) grad_norm 1.1655 (1.3068) [2022-01-18 04:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][280/1251] eta 0:36:34 lr 0.000988 time 2.8657 (2.2602) loss 4.9704 (4.4981) grad_norm 1.3477 (1.3089) [2022-01-18 04:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][290/1251] eta 0:36:10 lr 0.000988 time 2.1030 (2.2581) loss 4.0119 (4.4946) grad_norm 1.1874 (1.3086) [2022-01-18 04:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][300/1251] eta 0:35:49 lr 0.000988 time 2.8300 (2.2606) loss 4.4207 (4.5011) grad_norm 1.0867 (1.3080) [2022-01-18 04:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][310/1251] eta 0:35:24 lr 0.000988 time 1.8089 (2.2578) loss 4.9079 (4.5005) grad_norm 1.4041 (1.3105) [2022-01-18 04:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][320/1251] eta 0:34:57 lr 0.000988 time 
1.9799 (2.2532) loss 4.6522 (4.5010) grad_norm 1.3379 (1.3102) [2022-01-18 04:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][330/1251] eta 0:34:30 lr 0.000988 time 2.2549 (2.2486) loss 4.0366 (4.4961) grad_norm 1.6832 (1.3110) [2022-01-18 04:16:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][340/1251] eta 0:34:07 lr 0.000988 time 2.4841 (2.2475) loss 4.7135 (4.4906) grad_norm 1.2097 (1.3111) [2022-01-18 04:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][350/1251] eta 0:33:45 lr 0.000988 time 2.1750 (2.2484) loss 3.8968 (4.4826) grad_norm 1.3226 (1.3099) [2022-01-18 04:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][360/1251] eta 0:33:23 lr 0.000988 time 1.9852 (2.2482) loss 4.8193 (4.4806) grad_norm 1.4340 (1.3081) [2022-01-18 04:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][370/1251] eta 0:32:56 lr 0.000988 time 1.7856 (2.2438) loss 4.0276 (4.4693) grad_norm 1.4233 (1.3111) [2022-01-18 04:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][380/1251] eta 0:32:31 lr 0.000988 time 2.2068 (2.2401) loss 4.8239 (4.4745) grad_norm 1.4615 (1.3123) [2022-01-18 04:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][390/1251] eta 0:32:05 lr 0.000988 time 1.9710 (2.2364) loss 5.0612 (4.4818) grad_norm 1.4498 (1.3165) [2022-01-18 04:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][400/1251] eta 0:31:42 lr 0.000988 time 2.2248 (2.2354) loss 3.9327 (4.4835) grad_norm 1.1497 (1.3123) [2022-01-18 04:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][410/1251] eta 0:31:19 lr 0.000988 time 1.8794 (2.2354) loss 4.3502 (4.4875) grad_norm 1.1637 (1.3088) [2022-01-18 04:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][420/1251] eta 0:30:55 lr 0.000988 time 1.9434 (2.2329) loss 4.6232 (4.4868) grad_norm 1.1021 (1.3053) [2022-01-18 04:19:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][430/1251] eta 0:30:32 lr 0.000988 time 2.3210 (2.2325) loss 3.5573 (4.4867) grad_norm 1.3503 (1.3027) [2022-01-18 04:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][440/1251] eta 0:30:07 lr 0.000988 time 1.6501 (2.2288) loss 4.3561 (4.4889) grad_norm 1.3500 (1.3029) [2022-01-18 04:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][450/1251] eta 0:29:43 lr 0.000988 time 1.5796 (2.2262) loss 4.9330 (4.4910) grad_norm 1.2532 (1.3063) [2022-01-18 04:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][460/1251] eta 0:29:21 lr 0.000988 time 1.4821 (2.2267) loss 3.7146 (4.4900) grad_norm 1.3190 (1.3073) [2022-01-18 04:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][470/1251] eta 0:29:01 lr 0.000988 time 2.9177 (2.2303) loss 3.9689 (4.4879) grad_norm 1.4011 (1.3060) [2022-01-18 04:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][480/1251] eta 0:28:40 lr 0.000988 time 2.3841 (2.2319) loss 4.8577 (4.4883) grad_norm 1.0403 (1.3030) [2022-01-18 04:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][490/1251] eta 0:28:15 lr 0.000988 time 1.6033 (2.2274) loss 4.1097 (4.4814) grad_norm 1.2260 (1.3039) [2022-01-18 04:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][500/1251] eta 0:27:48 lr 0.000988 time 1.9224 (2.2219) loss 4.7993 (4.4826) grad_norm 1.0538 (1.3050) [2022-01-18 04:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][510/1251] eta 0:27:23 lr 0.000988 time 2.4177 (2.2183) loss 4.4238 (4.4826) grad_norm 1.1980 (1.3037) [2022-01-18 04:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][520/1251] eta 0:27:00 lr 0.000988 time 2.4541 (2.2171) loss 3.3701 (4.4839) grad_norm 0.9681 (1.3026) [2022-01-18 04:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][530/1251] eta 0:26:38 lr 0.000988 time 
1.9601 (2.2168) loss 4.9293 (4.4865) grad_norm 1.1417 (1.3000) [2022-01-18 04:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][540/1251] eta 0:26:17 lr 0.000988 time 3.6883 (2.2182) loss 5.2219 (4.4906) grad_norm 0.9968 (1.2993) [2022-01-18 04:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][550/1251] eta 0:25:58 lr 0.000988 time 1.8649 (2.2239) loss 4.4828 (4.4888) grad_norm 1.1462 (1.2975) [2022-01-18 04:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][560/1251] eta 0:25:37 lr 0.000988 time 1.5690 (2.2253) loss 3.3040 (4.4892) grad_norm 1.7551 (1.2965) [2022-01-18 04:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][570/1251] eta 0:25:13 lr 0.000988 time 1.9707 (2.2229) loss 3.6167 (4.4832) grad_norm 1.1410 (1.2965) [2022-01-18 04:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][580/1251] eta 0:24:49 lr 0.000988 time 2.5037 (2.2205) loss 4.4422 (4.4841) grad_norm 1.1181 (1.2956) [2022-01-18 04:25:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][590/1251] eta 0:24:26 lr 0.000988 time 2.3538 (2.2181) loss 4.7014 (4.4882) grad_norm 1.5234 (1.2978) [2022-01-18 04:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][600/1251] eta 0:24:02 lr 0.000988 time 1.9617 (2.2154) loss 5.2589 (4.4901) grad_norm 1.5719 (1.2993) [2022-01-18 04:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][610/1251] eta 0:23:39 lr 0.000988 time 2.1751 (2.2143) loss 4.8471 (4.4914) grad_norm 1.4733 (1.2989) [2022-01-18 04:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][620/1251] eta 0:23:16 lr 0.000988 time 1.9687 (2.2133) loss 5.2132 (4.4891) grad_norm 1.1672 (1.2971) [2022-01-18 04:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][630/1251] eta 0:22:55 lr 0.000988 time 2.4064 (2.2149) loss 4.5984 (4.4916) grad_norm 1.6418 (1.2966) [2022-01-18 04:27:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][640/1251] eta 0:22:33 lr 0.000987 time 2.7510 (2.2150) loss 4.5590 (4.4911) grad_norm 1.4610 (1.2963) [2022-01-18 04:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][650/1251] eta 0:22:10 lr 0.000987 time 2.0266 (2.2133) loss 4.9269 (4.4922) grad_norm 1.1996 (1.2965) [2022-01-18 04:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][660/1251] eta 0:21:47 lr 0.000987 time 1.8042 (2.2120) loss 4.6548 (4.4885) grad_norm 1.2486 (1.2955) [2022-01-18 04:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][670/1251] eta 0:21:24 lr 0.000987 time 2.1573 (2.2112) loss 4.1353 (4.4884) grad_norm 1.2282 (1.2951) [2022-01-18 04:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][680/1251] eta 0:21:03 lr 0.000987 time 2.8904 (2.2120) loss 4.8949 (4.4910) grad_norm 1.5944 (1.2972) [2022-01-18 04:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][690/1251] eta 0:20:41 lr 0.000987 time 1.8290 (2.2123) loss 4.7396 (4.4920) grad_norm 1.1946 (1.2954) [2022-01-18 04:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][700/1251] eta 0:20:18 lr 0.000987 time 2.3111 (2.2109) loss 3.8297 (4.4906) grad_norm 1.0804 (1.2934) [2022-01-18 04:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][710/1251] eta 0:19:57 lr 0.000987 time 3.3762 (2.2133) loss 3.5686 (4.4877) grad_norm 1.4577 (1.2935) [2022-01-18 04:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][720/1251] eta 0:19:34 lr 0.000987 time 2.0943 (2.2123) loss 5.3587 (4.4865) grad_norm 1.4198 (1.2924) [2022-01-18 04:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][730/1251] eta 0:19:11 lr 0.000987 time 1.5419 (2.2111) loss 3.8117 (4.4866) grad_norm 1.1256 (1.2910) [2022-01-18 04:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][740/1251] eta 0:18:48 lr 0.000987 time 
2.2850 (2.2091) loss 4.5917 (4.4834) grad_norm 1.1216 (1.2910) [2022-01-18 04:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][750/1251] eta 0:18:27 lr 0.000987 time 2.7783 (2.2114) loss 4.8485 (4.4831) grad_norm 1.0803 (1.2906) [2022-01-18 04:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][760/1251] eta 0:18:05 lr 0.000987 time 1.6353 (2.2118) loss 3.6786 (4.4830) grad_norm 1.3289 (1.2911) [2022-01-18 04:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][770/1251] eta 0:17:43 lr 0.000987 time 1.9547 (2.2110) loss 4.8519 (4.4820) grad_norm 1.1408 (1.2916) [2022-01-18 04:32:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][780/1251] eta 0:17:21 lr 0.000987 time 2.5764 (2.2106) loss 4.2997 (4.4842) grad_norm 1.3456 (1.2928) [2022-01-18 04:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][790/1251] eta 0:16:58 lr 0.000987 time 1.6225 (2.2087) loss 4.2548 (4.4825) grad_norm 1.1969 (1.2923) [2022-01-18 04:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][800/1251] eta 0:16:35 lr 0.000987 time 1.8983 (2.2081) loss 3.5947 (4.4823) grad_norm 1.0977 (1.2909) [2022-01-18 04:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][810/1251] eta 0:16:13 lr 0.000987 time 2.1508 (2.2084) loss 4.8760 (4.4818) grad_norm 1.3232 (1.2908) [2022-01-18 04:33:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][820/1251] eta 0:15:51 lr 0.000987 time 2.7236 (2.2087) loss 4.7591 (4.4868) grad_norm 1.6113 (1.2930) [2022-01-18 04:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][830/1251] eta 0:15:29 lr 0.000987 time 2.1769 (2.2075) loss 4.3561 (4.4862) grad_norm 1.1279 (1.2927) [2022-01-18 04:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][840/1251] eta 0:15:07 lr 0.000987 time 1.8539 (2.2072) loss 4.6414 (4.4884) grad_norm 1.1075 (1.2911) [2022-01-18 04:34:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][850/1251] eta 0:14:44 lr 0.000987 time 1.5913 (2.2065) loss 4.2941 (4.4890) grad_norm 1.1569 (1.2920) [2022-01-18 04:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][860/1251] eta 0:14:22 lr 0.000987 time 2.2628 (2.2059) loss 3.2098 (4.4881) grad_norm 1.2608 (1.2925) [2022-01-18 04:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][870/1251] eta 0:14:00 lr 0.000987 time 2.1129 (2.2053) loss 4.7014 (4.4873) grad_norm 1.3643 (1.2912) [2022-01-18 04:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][880/1251] eta 0:13:38 lr 0.000987 time 2.2481 (2.2055) loss 5.1343 (4.4889) grad_norm 1.1378 (1.2914) [2022-01-18 04:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][890/1251] eta 0:13:15 lr 0.000987 time 1.9170 (2.2040) loss 5.1373 (4.4921) grad_norm 1.2423 (1.2919) [2022-01-18 04:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][900/1251] eta 0:12:53 lr 0.000987 time 1.8478 (2.2035) loss 4.1525 (4.4897) grad_norm 1.2837 (1.2928) [2022-01-18 04:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][910/1251] eta 0:12:30 lr 0.000987 time 1.8494 (2.2020) loss 4.8674 (4.4894) grad_norm 1.3524 (1.2917) [2022-01-18 04:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][920/1251] eta 0:12:09 lr 0.000987 time 3.3236 (2.2025) loss 4.9890 (4.4874) grad_norm 1.1408 (1.2909) [2022-01-18 04:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][930/1251] eta 0:11:46 lr 0.000987 time 1.9110 (2.2021) loss 4.5595 (4.4893) grad_norm 1.6099 (1.2911) [2022-01-18 04:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][940/1251] eta 0:11:24 lr 0.000987 time 2.4714 (2.2022) loss 3.6752 (4.4849) grad_norm 1.1638 (1.2917) [2022-01-18 04:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][950/1251] eta 0:11:02 lr 0.000987 time 
1.6595 (2.2001) loss 2.9371 (4.4831) grad_norm 1.1853 (1.2916) [2022-01-18 04:38:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][960/1251] eta 0:10:40 lr 0.000987 time 1.9578 (2.1999) loss 4.9770 (4.4778) grad_norm 0.9573 (1.2913) [2022-01-18 04:38:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][970/1251] eta 0:10:17 lr 0.000987 time 1.9247 (2.1988) loss 3.6298 (4.4768) grad_norm 1.4168 (1.2910) [2022-01-18 04:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][980/1251] eta 0:09:56 lr 0.000987 time 2.1490 (2.1994) loss 4.1241 (4.4744) grad_norm 1.2374 (1.2897) [2022-01-18 04:39:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][990/1251] eta 0:09:34 lr 0.000987 time 1.7543 (2.2001) loss 4.3154 (4.4773) grad_norm 0.9243 (1.2889) [2022-01-18 04:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1000/1251] eta 0:09:12 lr 0.000987 time 2.9255 (2.2016) loss 4.6864 (4.4785) grad_norm 0.8895 (1.2877) [2022-01-18 04:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1010/1251] eta 0:08:50 lr 0.000987 time 2.1066 (2.2015) loss 4.8539 (4.4778) grad_norm 1.0685 (1.2870) [2022-01-18 04:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1020/1251] eta 0:08:28 lr 0.000987 time 2.6273 (2.2014) loss 4.6928 (4.4785) grad_norm 1.3491 (1.2873) [2022-01-18 04:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1030/1251] eta 0:08:06 lr 0.000987 time 1.8503 (2.2009) loss 4.5302 (4.4767) grad_norm 1.0725 (1.2868) [2022-01-18 04:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1040/1251] eta 0:07:44 lr 0.000987 time 2.5891 (2.2007) loss 4.1870 (4.4744) grad_norm 1.1525 (1.2866) [2022-01-18 04:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1050/1251] eta 0:07:22 lr 0.000987 time 2.5001 (2.2006) loss 4.9806 (4.4760) grad_norm 1.2333 (1.2849) [2022-01-18 04:42:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1060/1251] eta 0:07:00 lr 0.000987 time 2.1837 (2.2013) loss 4.6994 (4.4772) grad_norm 1.2859 (1.2839) [2022-01-18 04:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1070/1251] eta 0:06:38 lr 0.000987 time 2.7555 (2.2014) loss 3.4890 (4.4739) grad_norm 1.1429 (1.2835) [2022-01-18 04:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1080/1251] eta 0:06:16 lr 0.000987 time 2.5708 (2.2009) loss 3.4402 (4.4740) grad_norm 1.1896 (1.2830) [2022-01-18 04:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1090/1251] eta 0:05:54 lr 0.000987 time 1.9047 (2.2009) loss 4.4782 (4.4744) grad_norm 1.2363 (1.2831) [2022-01-18 04:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1100/1251] eta 0:05:32 lr 0.000987 time 2.4665 (2.2017) loss 5.5757 (4.4734) grad_norm 1.1056 (1.2826) [2022-01-18 04:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1110/1251] eta 0:05:10 lr 0.000987 time 1.8595 (2.2001) loss 4.0800 (4.4733) grad_norm 1.1179 (1.2814) [2022-01-18 04:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1120/1251] eta 0:04:48 lr 0.000987 time 2.5082 (2.1994) loss 4.8438 (4.4754) grad_norm 1.3805 (1.2803) [2022-01-18 04:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1130/1251] eta 0:04:25 lr 0.000987 time 2.1475 (2.1978) loss 4.3235 (4.4756) grad_norm 0.9371 (1.2800) [2022-01-18 04:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1140/1251] eta 0:04:03 lr 0.000987 time 2.1364 (2.1978) loss 5.1464 (4.4751) grad_norm 1.0958 (1.2795) [2022-01-18 04:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1150/1251] eta 0:03:41 lr 0.000987 time 2.5549 (2.1975) loss 4.3314 (4.4736) grad_norm 1.0390 (1.2785) [2022-01-18 04:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1160/1251] eta 0:03:19 lr 
0.000987 time 2.6429 (2.1977) loss 4.7501 (4.4725) grad_norm 1.2637 (1.2784) [2022-01-18 04:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1170/1251] eta 0:02:58 lr 0.000987 time 2.4097 (2.1980) loss 4.3921 (4.4732) grad_norm 0.9862 (1.2778) [2022-01-18 04:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1180/1251] eta 0:02:36 lr 0.000987 time 2.2473 (2.1982) loss 4.4353 (4.4721) grad_norm 1.4315 (1.2785) [2022-01-18 04:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1190/1251] eta 0:02:14 lr 0.000987 time 1.9760 (2.2006) loss 4.2086 (4.4707) grad_norm 1.3798 (1.2792) [2022-01-18 04:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1200/1251] eta 0:01:52 lr 0.000987 time 2.8490 (2.2007) loss 4.5021 (4.4696) grad_norm 1.0568 (1.2783) [2022-01-18 04:47:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1210/1251] eta 0:01:30 lr 0.000987 time 2.2655 (2.1985) loss 4.9236 (4.4710) grad_norm 1.3367 (1.2782) [2022-01-18 04:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1220/1251] eta 0:01:08 lr 0.000987 time 2.1548 (2.1964) loss 4.9145 (4.4707) grad_norm 0.9935 (1.2772) [2022-01-18 04:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1230/1251] eta 0:00:46 lr 0.000987 time 1.9214 (2.1960) loss 3.6770 (4.4706) grad_norm 1.1450 (1.2769) [2022-01-18 04:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1240/1251] eta 0:00:24 lr 0.000987 time 1.7154 (2.1964) loss 4.7247 (4.4691) grad_norm 0.9685 (1.2765) [2022-01-18 04:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [21/300][1250/1251] eta 0:00:02 lr 0.000987 time 1.1904 (2.1910) loss 3.9526 (4.4672) grad_norm 1.3159 (1.2763) [2022-01-18 04:49:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 21 training takes 0:45:41 [2022-01-18 04:49:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.684 (20.684) Loss 
1.7309 (1.7309) Acc@1 61.133 (61.133) Acc@5 84.570 (84.570) [2022-01-18 04:49:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.023 (3.589) Loss 1.7952 (1.7598) Acc@1 58.984 (60.369) Acc@5 81.836 (83.700) [2022-01-18 04:50:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.893 (2.801) Loss 1.7437 (1.7515) Acc@1 61.328 (60.654) Acc@5 83.984 (83.626) [2022-01-18 04:50:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.618 (2.295) Loss 1.7930 (1.7604) Acc@1 59.961 (60.745) Acc@5 83.887 (83.616) [2022-01-18 04:50:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.250 (2.237) Loss 1.7647 (1.7569) Acc@1 61.816 (60.799) Acc@5 84.668 (83.791) [2022-01-18 04:50:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 60.790 Acc@5 83.754 [2022-01-18 04:50:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 60.8% [2022-01-18 04:50:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 60.79% [2022-01-18 04:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][0/1251] eta 7:21:04 lr 0.000987 time 21.1546 (21.1546) loss 5.1383 (5.1383) grad_norm 1.1799 (1.1799) [2022-01-18 04:51:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][10/1251] eta 1:22:35 lr 0.000987 time 2.1403 (3.9931) loss 4.8175 (4.5241) grad_norm 1.0273 (1.1279) [2022-01-18 04:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][20/1251] eta 1:01:56 lr 0.000987 time 1.7540 (3.0194) loss 3.1742 (4.6194) grad_norm 1.1275 (1.1538) [2022-01-18 04:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][30/1251] eta 0:55:08 lr 0.000987 time 1.3265 (2.7094) loss 3.3158 (4.5746) grad_norm 1.6251 (1.1986) [2022-01-18 04:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][40/1251] eta 0:53:56 lr 0.000987 time 6.4931 (2.6727) loss 4.6342 (4.5554) grad_norm 0.9999 (1.2155) 
[2022-01-18 04:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][50/1251] eta 0:52:28 lr 0.000987 time 2.5116 (2.6218) loss 4.4553 (4.4828) grad_norm 1.3383 (1.2226) [2022-01-18 04:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][60/1251] eta 0:50:41 lr 0.000987 time 2.4218 (2.5538) loss 4.3613 (4.4605) grad_norm 1.4345 (1.2205) [2022-01-18 04:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][70/1251] eta 0:49:15 lr 0.000987 time 2.0826 (2.5023) loss 3.5482 (4.4498) grad_norm 1.0844 (1.2149) [2022-01-18 04:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][80/1251] eta 0:48:31 lr 0.000987 time 3.7964 (2.4864) loss 5.1141 (4.4464) grad_norm 1.2989 (1.2359) [2022-01-18 04:54:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][90/1251] eta 0:47:57 lr 0.000987 time 2.4821 (2.4782) loss 4.6642 (4.4430) grad_norm 1.7603 (1.2539) [2022-01-18 04:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][100/1251] eta 0:46:49 lr 0.000987 time 1.6853 (2.4406) loss 4.6628 (4.4068) grad_norm 1.2636 (1.2707) [2022-01-18 04:55:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][110/1251] eta 0:45:49 lr 0.000987 time 1.8905 (2.4094) loss 3.8955 (4.4211) grad_norm 1.3501 (1.2678) [2022-01-18 04:55:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][120/1251] eta 0:45:05 lr 0.000987 time 2.4969 (2.3925) loss 4.0750 (4.4061) grad_norm 1.2077 (1.2673) [2022-01-18 04:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][130/1251] eta 0:44:16 lr 0.000987 time 1.7286 (2.3699) loss 5.2854 (4.4057) grad_norm 1.1950 (1.2600) [2022-01-18 04:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][140/1251] eta 0:43:27 lr 0.000987 time 2.1792 (2.3472) loss 4.3545 (4.4240) grad_norm 1.4458 (1.2654) [2022-01-18 04:56:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][150/1251] eta 0:42:50 lr 
0.000987 time 2.0496 (2.3351) loss 3.6511 (4.4268) grad_norm 1.1444 (1.2696) [2022-01-18 04:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][160/1251] eta 0:42:20 lr 0.000987 time 2.4867 (2.3290) loss 3.4111 (4.4114) grad_norm 1.1622 (1.2716) [2022-01-18 04:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][170/1251] eta 0:41:47 lr 0.000987 time 1.7486 (2.3195) loss 3.3963 (4.4116) grad_norm 1.8257 (1.2754) [2022-01-18 04:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][180/1251] eta 0:41:20 lr 0.000987 time 2.2772 (2.3162) loss 4.5033 (4.4185) grad_norm 1.2914 (1.2778) [2022-01-18 04:58:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][190/1251] eta 0:41:01 lr 0.000987 time 1.9027 (2.3195) loss 3.0192 (4.4088) grad_norm 1.4616 (1.2798) [2022-01-18 04:58:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][200/1251] eta 0:40:32 lr 0.000987 time 2.8230 (2.3145) loss 4.3313 (4.4099) grad_norm 1.1198 (1.2853) [2022-01-18 04:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][210/1251] eta 0:39:55 lr 0.000987 time 1.9430 (2.3008) loss 4.0318 (4.4128) grad_norm 1.0391 (1.2876) [2022-01-18 04:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][220/1251] eta 0:39:18 lr 0.000987 time 2.2433 (2.2878) loss 5.1985 (4.3991) grad_norm 1.2949 (1.2911) [2022-01-18 04:59:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][230/1251] eta 0:38:47 lr 0.000987 time 2.0394 (2.2796) loss 3.4283 (4.3982) grad_norm 1.2804 (1.2892) [2022-01-18 04:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][240/1251] eta 0:38:16 lr 0.000987 time 2.1826 (2.2720) loss 4.6744 (4.4000) grad_norm 1.0739 (1.2894) [2022-01-18 05:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][250/1251] eta 0:37:52 lr 0.000987 time 1.8998 (2.2706) loss 4.2841 (4.3989) grad_norm 1.2071 (1.2884) [2022-01-18 05:00:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][260/1251] eta 0:37:36 lr 0.000987 time 2.4729 (2.2772) loss 4.4191 (4.3955) grad_norm 1.1545 (1.2847) [2022-01-18 05:00:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][270/1251] eta 0:37:15 lr 0.000987 time 1.8576 (2.2785) loss 4.0217 (4.3916) grad_norm 1.1628 (1.2823) [2022-01-18 05:01:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][280/1251] eta 0:36:49 lr 0.000987 time 2.1419 (2.2757) loss 5.2274 (4.3989) grad_norm 1.0983 (1.2811) [2022-01-18 05:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][290/1251] eta 0:36:13 lr 0.000987 time 1.6672 (2.2617) loss 4.7026 (4.4001) grad_norm 1.0850 (1.2803) [2022-01-18 05:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][300/1251] eta 0:35:46 lr 0.000987 time 1.8324 (2.2567) loss 4.4534 (4.4040) grad_norm 0.9545 (1.2765) [2022-01-18 05:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][310/1251] eta 0:35:22 lr 0.000987 time 1.8822 (2.2554) loss 3.2475 (4.4010) grad_norm 1.1684 (1.2748) [2022-01-18 05:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][320/1251] eta 0:34:59 lr 0.000987 time 2.6629 (2.2552) loss 5.0199 (4.4077) grad_norm 1.2186 (1.2722) [2022-01-18 05:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][330/1251] eta 0:34:36 lr 0.000987 time 1.7863 (2.2543) loss 5.1343 (4.4147) grad_norm 1.2850 (1.2705) [2022-01-18 05:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][340/1251] eta 0:34:16 lr 0.000987 time 2.0259 (2.2570) loss 4.0139 (4.4147) grad_norm 1.0755 (1.2697) [2022-01-18 05:03:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][350/1251] eta 0:33:49 lr 0.000987 time 1.9885 (2.2528) loss 3.7848 (4.4209) grad_norm 1.2060 (1.2693) [2022-01-18 05:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][360/1251] eta 0:33:28 lr 0.000987 time 
2.0643 (2.2545) loss 3.4713 (4.4227) grad_norm 1.0656 (1.2689) [2022-01-18 05:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][370/1251] eta 0:33:03 lr 0.000987 time 1.9176 (2.2515) loss 3.5444 (4.4207) grad_norm 1.8621 (1.2715) [2022-01-18 05:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][380/1251] eta 0:32:41 lr 0.000987 time 1.7316 (2.2523) loss 4.3163 (4.4138) grad_norm 1.1499 (1.2719) [2022-01-18 05:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][390/1251] eta 0:32:16 lr 0.000987 time 2.0252 (2.2491) loss 4.5479 (4.4089) grad_norm 1.4427 (1.2730) [2022-01-18 05:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][400/1251] eta 0:31:52 lr 0.000987 time 1.8803 (2.2477) loss 4.6505 (4.4104) grad_norm 1.3252 (1.2723) [2022-01-18 05:06:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][410/1251] eta 0:31:27 lr 0.000987 time 1.9425 (2.2445) loss 5.0818 (4.4131) grad_norm 1.0706 (1.2681) [2022-01-18 05:06:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][420/1251] eta 0:31:03 lr 0.000987 time 1.5998 (2.2424) loss 4.8881 (4.4060) grad_norm 1.1515 (1.2681) [2022-01-18 05:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][430/1251] eta 0:30:37 lr 0.000987 time 1.9059 (2.2383) loss 3.8662 (4.4087) grad_norm 1.4506 (1.2672) [2022-01-18 05:07:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][440/1251] eta 0:30:12 lr 0.000987 time 1.5556 (2.2346) loss 4.4437 (4.4085) grad_norm 1.7023 (1.2673) [2022-01-18 05:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][450/1251] eta 0:29:47 lr 0.000986 time 2.5527 (2.2320) loss 4.2785 (4.4141) grad_norm 1.3167 (1.2666) [2022-01-18 05:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][460/1251] eta 0:29:22 lr 0.000986 time 2.1265 (2.2283) loss 3.6047 (4.4104) grad_norm 1.3451 (1.2658) [2022-01-18 05:08:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][470/1251] eta 0:29:00 lr 0.000986 time 2.8402 (2.2281) loss 4.6308 (4.4106) grad_norm 1.2094 (1.2656) [2022-01-18 05:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][480/1251] eta 0:28:37 lr 0.000986 time 1.7408 (2.2278) loss 4.8178 (4.4137) grad_norm 1.4973 (1.2632) [2022-01-18 05:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][490/1251] eta 0:28:15 lr 0.000986 time 2.8368 (2.2284) loss 4.6228 (4.4094) grad_norm 1.3078 (1.2603) [2022-01-18 05:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][500/1251] eta 0:27:54 lr 0.000986 time 2.4092 (2.2302) loss 4.6627 (4.4102) grad_norm 1.0210 (1.2601) [2022-01-18 05:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][510/1251] eta 0:27:31 lr 0.000986 time 1.8660 (2.2285) loss 3.7609 (4.4068) grad_norm 1.1840 (1.2595) [2022-01-18 05:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][520/1251] eta 0:27:08 lr 0.000986 time 2.2763 (2.2271) loss 4.4053 (4.4085) grad_norm 1.2811 (1.2602) [2022-01-18 05:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][530/1251] eta 0:26:44 lr 0.000986 time 2.7298 (2.2254) loss 5.2593 (4.4079) grad_norm 1.2588 (1.2581) [2022-01-18 05:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][540/1251] eta 0:26:20 lr 0.000986 time 2.0552 (2.2236) loss 4.6227 (4.4060) grad_norm 1.1443 (1.2586) [2022-01-18 05:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][550/1251] eta 0:25:56 lr 0.000986 time 2.2052 (2.2209) loss 3.7135 (4.4030) grad_norm 1.1507 (1.2584) [2022-01-18 05:11:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][560/1251] eta 0:25:35 lr 0.000986 time 2.1836 (2.2217) loss 4.6815 (4.4086) grad_norm 1.2196 (1.2582) [2022-01-18 05:11:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][570/1251] eta 0:25:15 lr 0.000986 time 
3.4602 (2.2257) loss 4.1735 (4.4140) grad_norm 1.3061 (1.2578) [2022-01-18 05:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][580/1251] eta 0:24:54 lr 0.000986 time 1.7634 (2.2278) loss 3.7948 (4.4054) grad_norm 1.1064 (1.2586) [2022-01-18 05:12:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][590/1251] eta 0:24:33 lr 0.000986 time 2.1522 (2.2292) loss 4.3245 (4.4096) grad_norm 1.3287 (1.2583) [2022-01-18 05:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][600/1251] eta 0:24:09 lr 0.000986 time 1.8243 (2.2263) loss 4.6130 (4.4080) grad_norm 1.1304 (1.2578) [2022-01-18 05:13:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][610/1251] eta 0:23:45 lr 0.000986 time 1.9045 (2.2233) loss 3.2512 (4.4024) grad_norm 1.2190 (1.2570) [2022-01-18 05:13:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][620/1251] eta 0:23:19 lr 0.000986 time 1.8846 (2.2178) loss 4.5500 (4.4008) grad_norm 1.2120 (1.2573) [2022-01-18 05:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][630/1251] eta 0:22:54 lr 0.000986 time 2.2124 (2.2141) loss 4.6146 (4.4061) grad_norm 1.2004 (1.2580) [2022-01-18 05:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][640/1251] eta 0:22:31 lr 0.000986 time 2.0244 (2.2115) loss 3.5688 (4.4021) grad_norm 1.1301 (1.2570) [2022-01-18 05:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][650/1251] eta 0:22:09 lr 0.000986 time 2.4087 (2.2127) loss 5.0353 (4.4043) grad_norm 1.1403 (1.2553) [2022-01-18 05:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][660/1251] eta 0:21:46 lr 0.000986 time 1.8058 (2.2102) loss 4.7027 (4.4050) grad_norm 1.1943 (1.2563) [2022-01-18 05:15:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][670/1251] eta 0:21:24 lr 0.000986 time 2.1697 (2.2107) loss 3.3953 (4.4004) grad_norm 1.2947 (1.2558) [2022-01-18 05:15:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][680/1251] eta 0:21:03 lr 0.000986 time 2.1538 (2.2120) loss 5.2544 (4.4023) grad_norm 1.1632 (1.2565) [2022-01-18 05:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][690/1251] eta 0:20:43 lr 0.000986 time 2.5348 (2.2158) loss 4.0565 (4.4003) grad_norm 1.0440 (1.2562) [2022-01-18 05:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][700/1251] eta 0:20:22 lr 0.000986 time 1.9977 (2.2184) loss 4.9262 (4.4008) grad_norm 1.2273 (1.2554) [2022-01-18 05:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][710/1251] eta 0:20:00 lr 0.000986 time 2.2517 (2.2197) loss 4.4210 (4.4041) grad_norm 1.2868 (1.2572) [2022-01-18 05:17:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][720/1251] eta 0:19:38 lr 0.000986 time 2.2330 (2.2195) loss 3.6370 (4.4000) grad_norm 1.4130 (1.2565) [2022-01-18 05:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][730/1251] eta 0:19:14 lr 0.000986 time 1.6155 (2.2162) loss 4.6125 (4.4015) grad_norm 1.4913 (1.2559) [2022-01-18 05:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][740/1251] eta 0:18:50 lr 0.000986 time 1.7889 (2.2131) loss 4.6004 (4.4053) grad_norm 1.4274 (1.2574) [2022-01-18 05:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][750/1251] eta 0:18:28 lr 0.000986 time 1.8938 (2.2124) loss 3.3855 (4.4045) grad_norm 1.3446 (1.2579) [2022-01-18 05:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][760/1251] eta 0:18:06 lr 0.000986 time 2.2185 (2.2131) loss 3.4448 (4.4041) grad_norm 1.0671 (1.2580) [2022-01-18 05:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][770/1251] eta 0:17:44 lr 0.000986 time 1.5513 (2.2125) loss 4.3806 (4.4048) grad_norm 1.4216 (1.2584) [2022-01-18 05:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][780/1251] eta 0:17:21 lr 0.000986 time 
1.9357 (2.2123) loss 4.4420 (4.4043) grad_norm 0.9154 (1.2568) [2022-01-18 05:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][790/1251] eta 0:16:59 lr 0.000986 time 1.7691 (2.2116) loss 4.6879 (4.4079) grad_norm 1.3199 (1.2565) [2022-01-18 05:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][800/1251] eta 0:16:36 lr 0.000986 time 2.0165 (2.2098) loss 4.4258 (4.4060) grad_norm 0.9970 (1.2562) [2022-01-18 05:20:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][810/1251] eta 0:16:14 lr 0.000986 time 1.9367 (2.2108) loss 4.2283 (4.4042) grad_norm 1.3304 (1.2548) [2022-01-18 05:20:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][820/1251] eta 0:15:53 lr 0.000986 time 2.0943 (2.2124) loss 3.8165 (4.4032) grad_norm 1.1334 (1.2550) [2022-01-18 05:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][830/1251] eta 0:15:31 lr 0.000986 time 1.6346 (2.2120) loss 4.6836 (4.3995) grad_norm 0.9799 (1.2562) [2022-01-18 05:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][840/1251] eta 0:15:08 lr 0.000986 time 1.8855 (2.2099) loss 4.6621 (4.4025) grad_norm 0.9254 (1.2545) [2022-01-18 05:22:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][850/1251] eta 0:14:45 lr 0.000986 time 2.0349 (2.2087) loss 4.0180 (4.4024) grad_norm 1.0373 (1.2525) [2022-01-18 05:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][860/1251] eta 0:14:23 lr 0.000986 time 1.8076 (2.2078) loss 4.5867 (4.4023) grad_norm 1.2670 (1.2523) [2022-01-18 05:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][870/1251] eta 0:14:01 lr 0.000986 time 1.6031 (2.2077) loss 4.3023 (4.3993) grad_norm 1.2349 (1.2521) [2022-01-18 05:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][880/1251] eta 0:13:39 lr 0.000986 time 2.0677 (2.2080) loss 5.0550 (4.3999) grad_norm 1.3233 (1.2536) [2022-01-18 05:23:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][890/1251] eta 0:13:17 lr 0.000986 time 1.9590 (2.2095) loss 4.8526 (4.3995) grad_norm 1.3995 (1.2535) [2022-01-18 05:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][900/1251] eta 0:12:55 lr 0.000986 time 2.2079 (2.2090) loss 4.6579 (4.3984) grad_norm 1.1165 (1.2526) [2022-01-18 05:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][910/1251] eta 0:12:33 lr 0.000986 time 1.5409 (2.2091) loss 3.6519 (4.3980) grad_norm 1.3549 (1.2519) [2022-01-18 05:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][920/1251] eta 0:12:11 lr 0.000986 time 1.9545 (2.2090) loss 4.9921 (4.3963) grad_norm 1.3994 (1.2529) [2022-01-18 05:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][930/1251] eta 0:11:49 lr 0.000986 time 1.4924 (2.2098) loss 3.9411 (4.3954) grad_norm 1.2768 (1.2526) [2022-01-18 05:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][940/1251] eta 0:11:26 lr 0.000986 time 2.3197 (2.2088) loss 4.3893 (4.3966) grad_norm 2.0169 (1.2537) [2022-01-18 05:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][950/1251] eta 0:11:04 lr 0.000986 time 2.0061 (2.2066) loss 4.5563 (4.3959) grad_norm 1.2782 (1.2536) [2022-01-18 05:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][960/1251] eta 0:10:42 lr 0.000986 time 1.9217 (2.2070) loss 4.4731 (4.3946) grad_norm 1.0066 (1.2530) [2022-01-18 05:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][970/1251] eta 0:10:20 lr 0.000986 time 2.2064 (2.2081) loss 4.0708 (4.3956) grad_norm 1.4055 (1.2526) [2022-01-18 05:26:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][980/1251] eta 0:09:58 lr 0.000986 time 1.8525 (2.2076) loss 4.9452 (4.3962) grad_norm 1.0879 (1.2517) [2022-01-18 05:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][990/1251] eta 0:09:36 lr 0.000986 time 
1.6887 (2.2069) loss 4.4728 (4.3984) grad_norm 1.2195 (1.2511) [2022-01-18 05:27:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1000/1251] eta 0:09:13 lr 0.000986 time 2.1991 (2.2052) loss 4.6658 (4.3978) grad_norm 1.3335 (1.2509) [2022-01-18 05:27:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1010/1251] eta 0:08:51 lr 0.000986 time 2.3308 (2.2042) loss 4.6845 (4.3969) grad_norm 1.2051 (1.2503) [2022-01-18 05:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1020/1251] eta 0:08:28 lr 0.000986 time 1.8284 (2.2027) loss 5.2659 (4.3976) grad_norm 1.2353 (1.2507) [2022-01-18 05:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1030/1251] eta 0:08:06 lr 0.000986 time 1.9167 (2.2024) loss 5.1308 (4.3972) grad_norm 1.1830 (1.2512) [2022-01-18 05:28:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1040/1251] eta 0:07:45 lr 0.000986 time 4.2502 (2.2059) loss 4.4360 (4.3997) grad_norm 1.9304 (1.2511) [2022-01-18 05:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1050/1251] eta 0:07:23 lr 0.000986 time 2.2470 (2.2075) loss 4.8196 (4.3975) grad_norm 1.2291 (1.2519) [2022-01-18 05:29:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1060/1251] eta 0:07:01 lr 0.000986 time 1.5106 (2.2077) loss 4.5384 (4.3989) grad_norm 1.1018 (1.2509) [2022-01-18 05:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1070/1251] eta 0:06:39 lr 0.000986 time 1.7458 (2.2062) loss 4.7402 (4.3985) grad_norm 1.1574 (1.2504) [2022-01-18 05:30:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1080/1251] eta 0:06:17 lr 0.000986 time 3.1442 (2.2058) loss 4.9797 (4.3990) grad_norm 1.0856 (1.2495) [2022-01-18 05:30:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1090/1251] eta 0:05:55 lr 0.000986 time 2.2535 (2.2052) loss 5.1031 (4.4007) grad_norm 1.3501 (1.2494) [2022-01-18 05:31:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1100/1251] eta 0:05:32 lr 0.000986 time 2.1512 (2.2040) loss 5.2031 (4.4045) grad_norm 1.2993 (1.2491) [2022-01-18 05:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1110/1251] eta 0:05:10 lr 0.000986 time 1.9607 (2.2032) loss 4.7534 (4.4042) grad_norm 1.0620 (1.2494) [2022-01-18 05:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1120/1251] eta 0:04:48 lr 0.000986 time 2.8698 (2.2031) loss 3.3813 (4.4046) grad_norm 1.2208 (1.2496) [2022-01-18 05:32:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1130/1251] eta 0:04:26 lr 0.000986 time 2.5026 (2.2022) loss 3.4959 (4.4050) grad_norm 1.0450 (1.2498) [2022-01-18 05:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1140/1251] eta 0:04:04 lr 0.000986 time 2.2455 (2.2011) loss 4.0146 (4.4067) grad_norm 1.1213 (1.2494) [2022-01-18 05:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1150/1251] eta 0:03:42 lr 0.000986 time 1.8943 (2.2018) loss 4.0113 (4.4049) grad_norm 1.2068 (1.2494) [2022-01-18 05:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1160/1251] eta 0:03:20 lr 0.000986 time 1.5937 (2.2020) loss 4.4810 (4.4057) grad_norm 1.4026 (1.2501) [2022-01-18 05:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1170/1251] eta 0:02:58 lr 0.000986 time 2.7654 (2.2027) loss 4.8038 (4.4062) grad_norm 1.5190 (1.2509) [2022-01-18 05:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1180/1251] eta 0:02:36 lr 0.000986 time 2.6144 (2.2036) loss 5.0737 (4.4077) grad_norm 1.1270 (1.2514) [2022-01-18 05:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1190/1251] eta 0:02:14 lr 0.000986 time 1.5282 (2.2042) loss 3.8382 (4.4080) grad_norm 1.2106 (1.2510) [2022-01-18 05:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1200/1251] eta 0:01:52 lr 
0.000986 time 1.8636 (2.2033) loss 4.4966 (4.4094) grad_norm 1.0448 (1.2503) [2022-01-18 05:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1210/1251] eta 0:01:30 lr 0.000986 time 2.3156 (2.2012) loss 5.4461 (4.4108) grad_norm 1.0249 (1.2496) [2022-01-18 05:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1220/1251] eta 0:01:08 lr 0.000986 time 2.1510 (2.1984) loss 3.5225 (4.4093) grad_norm 1.1103 (1.2493) [2022-01-18 05:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1230/1251] eta 0:00:46 lr 0.000986 time 2.2746 (2.1981) loss 4.4516 (4.4108) grad_norm 1.0862 (1.2485) [2022-01-18 05:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1240/1251] eta 0:00:24 lr 0.000986 time 1.9158 (2.1979) loss 4.6680 (4.4096) grad_norm 1.3739 (1.2492) [2022-01-18 05:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [22/300][1250/1251] eta 0:00:02 lr 0.000986 time 1.1634 (2.1932) loss 4.2929 (4.4069) grad_norm 1.2637 (1.2503) [2022-01-18 05:36:25 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 22 training takes 0:45:44 [2022-01-18 05:36:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.795 (18.795) Loss 1.7729 (1.7729) Acc@1 59.668 (59.668) Acc@5 83.496 (83.496) [2022-01-18 05:37:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.305 (3.280) Loss 1.7216 (1.7032) Acc@1 61.719 (61.905) Acc@5 84.570 (85.032) [2022-01-18 05:37:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.607 (2.591) Loss 1.6803 (1.7164) Acc@1 61.816 (61.444) Acc@5 84.082 (84.594) [2022-01-18 05:37:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.630 (2.314) Loss 1.7332 (1.7129) Acc@1 61.133 (61.555) Acc@5 85.742 (84.699) [2022-01-18 05:37:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.507 (2.187) Loss 1.6486 (1.7110) Acc@1 64.648 (61.662) Acc@5 85.352 (84.668) [2022-01-18 05:38:01 
swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 61.604 Acc@5 84.678 [2022-01-18 05:38:01 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 61.6% [2022-01-18 05:38:01 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 61.60% [2022-01-18 05:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][0/1251] eta 7:33:49 lr 0.000986 time 21.7662 (21.7662) loss 3.7346 (3.7346) grad_norm 1.2457 (1.2457) [2022-01-18 05:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][10/1251] eta 1:24:55 lr 0.000986 time 1.8248 (4.1063) loss 3.6961 (4.1799) grad_norm 1.0516 (1.2308) [2022-01-18 05:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][20/1251] eta 1:05:40 lr 0.000986 time 1.5266 (3.2012) loss 4.1131 (4.3493) grad_norm 1.0952 (1.2330) [2022-01-18 05:39:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][30/1251] eta 0:58:07 lr 0.000986 time 1.9821 (2.8562) loss 4.4830 (4.3615) grad_norm 1.1087 (1.2147) [2022-01-18 05:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][40/1251] eta 0:55:25 lr 0.000986 time 3.9186 (2.7460) loss 4.3896 (4.3487) grad_norm 1.0565 (1.2235) [2022-01-18 05:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][50/1251] eta 0:53:08 lr 0.000986 time 1.9433 (2.6547) loss 4.6203 (4.3327) grad_norm 1.4816 (1.2255) [2022-01-18 05:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][60/1251] eta 0:51:21 lr 0.000986 time 1.6872 (2.5874) loss 2.9745 (4.3293) grad_norm 1.3036 (1.2211) [2022-01-18 05:41:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][70/1251] eta 0:50:02 lr 0.000986 time 1.7087 (2.5422) loss 4.5548 (4.3248) grad_norm 1.1816 (1.2680) [2022-01-18 05:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][80/1251] eta 0:49:20 lr 0.000986 time 3.1932 (2.5280) loss 5.0753 (4.3375) grad_norm 1.1194 (1.2681) 
[2022-01-18 05:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][90/1251] eta 0:48:14 lr 0.000986 time 1.8479 (2.4929) loss 4.9180 (4.3691) grad_norm 1.0528 (1.2554) [2022-01-18 05:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][100/1251] eta 0:47:00 lr 0.000986 time 1.6103 (2.4502) loss 4.2712 (4.3988) grad_norm 1.0547 (1.2486) [2022-01-18 05:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][110/1251] eta 0:45:36 lr 0.000986 time 1.9329 (2.3981) loss 4.8706 (4.3920) grad_norm 1.2752 (1.2561) [2022-01-18 05:42:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][120/1251] eta 0:44:45 lr 0.000986 time 3.3229 (2.3743) loss 4.9620 (4.3657) grad_norm 1.2041 (1.2566) [2022-01-18 05:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][130/1251] eta 0:44:02 lr 0.000986 time 1.9528 (2.3569) loss 4.7716 (4.3542) grad_norm 1.2729 (1.2521) [2022-01-18 05:43:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][140/1251] eta 0:43:27 lr 0.000986 time 2.5649 (2.3467) loss 5.1953 (4.3459) grad_norm 1.0905 (1.2451) [2022-01-18 05:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][150/1251] eta 0:43:02 lr 0.000986 time 2.3754 (2.3456) loss 3.6762 (4.3427) grad_norm 1.1118 (1.2472) [2022-01-18 05:44:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][160/1251] eta 0:42:45 lr 0.000986 time 2.9041 (2.3512) loss 3.9064 (4.3422) grad_norm 1.1796 (1.2477) [2022-01-18 05:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][170/1251] eta 0:42:18 lr 0.000986 time 1.7905 (2.3481) loss 3.9143 (4.3351) grad_norm 1.3342 (1.2441) [2022-01-18 05:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][180/1251] eta 0:41:44 lr 0.000986 time 2.1234 (2.3385) loss 4.4472 (4.3217) grad_norm 1.1871 (1.2406) [2022-01-18 05:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][190/1251] eta 0:40:59 
lr 0.000986 time 1.5294 (2.3178) loss 4.6140 (4.3352) grad_norm 1.2764 (1.2401) [2022-01-18 05:45:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][200/1251] eta 0:40:21 lr 0.000986 time 2.4723 (2.3040) loss 4.3081 (4.3425) grad_norm 1.1581 (1.2339) [2022-01-18 05:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][210/1251] eta 0:39:42 lr 0.000986 time 1.9793 (2.2891) loss 4.4074 (4.3464) grad_norm 1.4624 (1.2342) [2022-01-18 05:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][220/1251] eta 0:39:16 lr 0.000985 time 1.8481 (2.2854) loss 4.5429 (4.3489) grad_norm 1.1029 (1.2373) [2022-01-18 05:46:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][230/1251] eta 0:38:52 lr 0.000985 time 2.1532 (2.2844) loss 4.9066 (4.3475) grad_norm 1.2149 (1.2392) [2022-01-18 05:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][240/1251] eta 0:38:22 lr 0.000985 time 1.9529 (2.2774) loss 4.0514 (4.3466) grad_norm 0.9950 (1.2399) [2022-01-18 05:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][250/1251] eta 0:38:08 lr 0.000985 time 3.2903 (2.2866) loss 4.3731 (4.3455) grad_norm 1.1290 (1.2345) [2022-01-18 05:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][260/1251] eta 0:37:48 lr 0.000985 time 2.5572 (2.2889) loss 3.7641 (4.3488) grad_norm 1.1743 (1.2377) [2022-01-18 05:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][270/1251] eta 0:37:21 lr 0.000985 time 1.5031 (2.2847) loss 4.5497 (4.3479) grad_norm 1.0119 (1.2386) [2022-01-18 05:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][280/1251] eta 0:36:55 lr 0.000985 time 1.9241 (2.2816) loss 4.5204 (4.3483) grad_norm 1.0252 (1.2383) [2022-01-18 05:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][290/1251] eta 0:36:21 lr 0.000985 time 1.6572 (2.2703) loss 4.8583 (4.3542) grad_norm 1.0118 (1.2353) [2022-01-18 05:49:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][300/1251] eta 0:35:50 lr 0.000985 time 2.4709 (2.2616) loss 4.3358 (4.3476) grad_norm 1.2111 (1.2355) [2022-01-18 05:49:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][310/1251] eta 0:35:25 lr 0.000985 time 2.1313 (2.2585) loss 4.8668 (4.3575) grad_norm 0.8883 (1.2302) [2022-01-18 05:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][320/1251] eta 0:34:59 lr 0.000985 time 1.8016 (2.2546) loss 4.9174 (4.3581) grad_norm 1.5540 (1.2309) [2022-01-18 05:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][330/1251] eta 0:34:35 lr 0.000985 time 1.9882 (2.2533) loss 4.6966 (4.3574) grad_norm 1.3364 (1.2312) [2022-01-18 05:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][340/1251] eta 0:34:09 lr 0.000985 time 2.2848 (2.2502) loss 4.0397 (4.3559) grad_norm 1.3344 (1.2331) [2022-01-18 05:51:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][350/1251] eta 0:33:43 lr 0.000985 time 2.2495 (2.2463) loss 2.9354 (4.3493) grad_norm 1.0722 (1.2347) [2022-01-18 05:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][360/1251] eta 0:33:23 lr 0.000985 time 2.6720 (2.2481) loss 4.9573 (4.3465) grad_norm 1.3305 (1.2360) [2022-01-18 05:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][370/1251] eta 0:33:00 lr 0.000985 time 2.6332 (2.2485) loss 4.5785 (4.3520) grad_norm 1.2106 (1.2394) [2022-01-18 05:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][380/1251] eta 0:32:36 lr 0.000985 time 2.5808 (2.2464) loss 4.5114 (4.3573) grad_norm 1.1171 (1.2390) [2022-01-18 05:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][390/1251] eta 0:32:10 lr 0.000985 time 1.8892 (2.2426) loss 4.3044 (4.3587) grad_norm 1.9294 (1.2420) [2022-01-18 05:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][400/1251] eta 0:31:44 lr 0.000985 time 
2.0469 (2.2381) loss 5.0138 (4.3584) grad_norm 1.1881 (1.2410) [2022-01-18 05:53:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][410/1251] eta 0:31:25 lr 0.000985 time 2.5193 (2.2418) loss 5.1397 (4.3587) grad_norm 1.1662 (1.2423) [2022-01-18 05:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][420/1251] eta 0:31:04 lr 0.000985 time 2.1322 (2.2436) loss 3.1408 (4.3508) grad_norm 1.2262 (1.2460) [2022-01-18 05:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][430/1251] eta 0:30:42 lr 0.000985 time 1.6866 (2.2439) loss 4.6906 (4.3511) grad_norm 1.1425 (1.2461) [2022-01-18 05:54:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][440/1251] eta 0:30:17 lr 0.000985 time 2.1162 (2.2413) loss 3.8503 (4.3499) grad_norm 1.0941 (1.2452) [2022-01-18 05:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][450/1251] eta 0:29:56 lr 0.000985 time 2.2854 (2.2429) loss 3.4595 (4.3509) grad_norm 0.9871 (1.2430) [2022-01-18 05:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][460/1251] eta 0:29:29 lr 0.000985 time 2.0394 (2.2372) loss 4.7916 (4.3525) grad_norm 1.4786 (1.2430) [2022-01-18 05:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][470/1251] eta 0:29:02 lr 0.000985 time 1.5637 (2.2311) loss 3.3875 (4.3505) grad_norm 0.9739 (1.2421) [2022-01-18 05:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][480/1251] eta 0:28:38 lr 0.000985 time 1.8370 (2.2290) loss 4.9431 (4.3558) grad_norm 1.5368 (1.2416) [2022-01-18 05:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][490/1251] eta 0:28:16 lr 0.000985 time 2.6143 (2.2289) loss 4.1188 (4.3522) grad_norm 1.4276 (1.2408) [2022-01-18 05:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][500/1251] eta 0:27:54 lr 0.000985 time 2.0038 (2.2293) loss 4.5381 (4.3532) grad_norm 1.0929 (1.2388) [2022-01-18 05:57:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][510/1251] eta 0:27:33 lr 0.000985 time 2.3881 (2.2311) loss 4.4434 (4.3614) grad_norm 1.2488 (1.2399) [2022-01-18 05:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][520/1251] eta 0:27:11 lr 0.000985 time 1.8497 (2.2319) loss 4.6147 (4.3697) grad_norm 1.2538 (1.2401) [2022-01-18 05:57:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][530/1251] eta 0:26:47 lr 0.000985 time 1.6652 (2.2297) loss 4.9333 (4.3704) grad_norm 1.1750 (1.2386) [2022-01-18 05:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][540/1251] eta 0:26:24 lr 0.000985 time 1.9097 (2.2290) loss 4.1021 (4.3710) grad_norm 1.0752 (1.2380) [2022-01-18 05:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][550/1251] eta 0:26:01 lr 0.000985 time 2.0855 (2.2276) loss 4.0846 (4.3709) grad_norm 1.1486 (1.2376) [2022-01-18 05:58:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][560/1251] eta 0:25:43 lr 0.000985 time 1.8429 (2.2332) loss 4.4413 (4.3738) grad_norm 1.2227 (1.2375) [2022-01-18 05:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][570/1251] eta 0:25:18 lr 0.000985 time 1.6573 (2.2300) loss 4.2684 (4.3797) grad_norm 2.0495 (1.2394) [2022-01-18 05:59:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][580/1251] eta 0:24:52 lr 0.000985 time 1.7237 (2.2250) loss 3.0775 (4.3821) grad_norm 1.2814 (1.2396) [2022-01-18 05:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][590/1251] eta 0:24:32 lr 0.000985 time 1.9205 (2.2280) loss 4.7678 (4.3788) grad_norm 1.1442 (1.2380) [2022-01-18 06:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][600/1251] eta 0:24:09 lr 0.000985 time 1.7486 (2.2272) loss 5.1163 (4.3797) grad_norm 1.3916 (1.2384) [2022-01-18 06:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][610/1251] eta 0:23:46 lr 0.000985 time 
1.5654 (2.2249) loss 3.5936 (4.3755) grad_norm 1.4775 (1.2388) [2022-01-18 06:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][620/1251] eta 0:23:23 lr 0.000985 time 2.4128 (2.2243) loss 5.0051 (4.3789) grad_norm 1.6178 (1.2381) [2022-01-18 06:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][630/1251] eta 0:22:59 lr 0.000985 time 1.5797 (2.2209) loss 4.8693 (4.3803) grad_norm 1.1882 (1.2379) [2022-01-18 06:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][640/1251] eta 0:22:36 lr 0.000985 time 1.9585 (2.2208) loss 4.9907 (4.3792) grad_norm 1.3093 (1.2367) [2022-01-18 06:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][650/1251] eta 0:22:13 lr 0.000985 time 1.8509 (2.2192) loss 4.6178 (4.3790) grad_norm 1.0930 (1.2372) [2022-01-18 06:02:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][660/1251] eta 0:21:53 lr 0.000985 time 2.8794 (2.2228) loss 5.1969 (4.3773) grad_norm 1.2600 (1.2365) [2022-01-18 06:02:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][670/1251] eta 0:21:32 lr 0.000985 time 2.8184 (2.2239) loss 3.2138 (4.3756) grad_norm 1.2325 (1.2367) [2022-01-18 06:03:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][680/1251] eta 0:21:09 lr 0.000985 time 1.7508 (2.2237) loss 4.5651 (4.3721) grad_norm 1.1653 (1.2374) [2022-01-18 06:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][690/1251] eta 0:20:45 lr 0.000985 time 1.6338 (2.2203) loss 4.7542 (4.3729) grad_norm 1.0101 (1.2366) [2022-01-18 06:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][700/1251] eta 0:20:21 lr 0.000985 time 2.2324 (2.2164) loss 4.7584 (4.3730) grad_norm 1.4820 (1.2371) [2022-01-18 06:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][710/1251] eta 0:19:57 lr 0.000985 time 1.9963 (2.2141) loss 4.9147 (4.3774) grad_norm 1.0250 (1.2365) [2022-01-18 06:04:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][720/1251] eta 0:19:35 lr 0.000985 time 1.8032 (2.2129) loss 5.0612 (4.3776) grad_norm 1.0996 (1.2371) [2022-01-18 06:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][730/1251] eta 0:19:13 lr 0.000985 time 2.2554 (2.2137) loss 3.3350 (4.3784) grad_norm 1.3043 (1.2366) [2022-01-18 06:05:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][740/1251] eta 0:18:51 lr 0.000985 time 1.9832 (2.2152) loss 3.8050 (4.3816) grad_norm 1.0327 (1.2360) [2022-01-18 06:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][750/1251] eta 0:18:30 lr 0.000985 time 3.0557 (2.2159) loss 4.4977 (4.3778) grad_norm 1.0669 (1.2357) [2022-01-18 06:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][760/1251] eta 0:18:06 lr 0.000985 time 1.7786 (2.2122) loss 4.4800 (4.3734) grad_norm 1.2985 (1.2363) [2022-01-18 06:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][770/1251] eta 0:17:43 lr 0.000985 time 1.7205 (2.2120) loss 4.4509 (4.3764) grad_norm 1.1643 (1.2364) [2022-01-18 06:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][780/1251] eta 0:17:22 lr 0.000985 time 3.1657 (2.2137) loss 3.3605 (4.3740) grad_norm 1.0789 (1.2363) [2022-01-18 06:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][790/1251] eta 0:17:00 lr 0.000985 time 2.2306 (2.2133) loss 4.7959 (4.3770) grad_norm 1.3212 (1.2382) [2022-01-18 06:07:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][800/1251] eta 0:16:38 lr 0.000985 time 1.8667 (2.2136) loss 4.6509 (4.3782) grad_norm 1.0905 (1.2379) [2022-01-18 06:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][810/1251] eta 0:16:15 lr 0.000985 time 2.0194 (2.2126) loss 3.0207 (4.3772) grad_norm 1.1303 (1.2374) [2022-01-18 06:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][820/1251] eta 0:15:53 lr 0.000985 time 
2.4946 (2.2113) loss 3.3404 (4.3746) grad_norm 1.4165 (1.2378) [2022-01-18 06:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][830/1251] eta 0:15:30 lr 0.000985 time 1.8160 (2.2095) loss 4.4163 (4.3736) grad_norm 1.2608 (1.2374) [2022-01-18 06:08:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][840/1251] eta 0:15:07 lr 0.000985 time 1.8727 (2.2091) loss 3.5461 (4.3711) grad_norm 1.1274 (1.2373) [2022-01-18 06:09:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][850/1251] eta 0:14:46 lr 0.000985 time 2.2653 (2.2104) loss 3.5264 (4.3677) grad_norm 1.1959 (1.2371) [2022-01-18 06:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][860/1251] eta 0:14:24 lr 0.000985 time 2.4769 (2.2107) loss 5.0979 (4.3673) grad_norm 1.3599 (1.2369) [2022-01-18 06:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][870/1251] eta 0:14:01 lr 0.000985 time 1.9329 (2.2093) loss 4.5201 (4.3686) grad_norm 1.2564 (1.2368) [2022-01-18 06:10:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][880/1251] eta 0:13:39 lr 0.000985 time 1.8938 (2.2081) loss 4.6219 (4.3710) grad_norm 1.1945 (1.2370) [2022-01-18 06:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][890/1251] eta 0:13:17 lr 0.000985 time 2.1001 (2.2086) loss 3.4272 (4.3696) grad_norm 0.9915 (1.2364) [2022-01-18 06:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][900/1251] eta 0:12:55 lr 0.000985 time 1.6821 (2.2088) loss 4.5852 (4.3719) grad_norm 0.9373 (1.2365) [2022-01-18 06:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][910/1251] eta 0:12:33 lr 0.000985 time 2.5887 (2.2091) loss 5.1803 (4.3741) grad_norm 1.1741 (1.2357) [2022-01-18 06:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][920/1251] eta 0:12:10 lr 0.000985 time 2.4612 (2.2083) loss 5.0072 (4.3749) grad_norm 0.9817 (1.2353) [2022-01-18 06:12:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][930/1251] eta 0:11:48 lr 0.000985 time 2.1986 (2.2068) loss 4.5334 (4.3754) grad_norm 1.2240 (1.2348) [2022-01-18 06:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][940/1251] eta 0:11:25 lr 0.000985 time 1.7111 (2.2054) loss 4.5302 (4.3742) grad_norm 1.1833 (1.2354) [2022-01-18 06:13:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][950/1251] eta 0:11:04 lr 0.000985 time 2.8654 (2.2067) loss 4.9662 (4.3767) grad_norm 1.1422 (1.2352) [2022-01-18 06:13:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][960/1251] eta 0:10:42 lr 0.000985 time 1.8879 (2.2071) loss 3.0880 (4.3735) grad_norm 1.0739 (1.2344) [2022-01-18 06:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][970/1251] eta 0:10:20 lr 0.000985 time 1.8544 (2.2064) loss 4.3139 (4.3733) grad_norm 1.0743 (1.2339) [2022-01-18 06:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][980/1251] eta 0:09:57 lr 0.000985 time 1.8001 (2.2061) loss 4.5103 (4.3734) grad_norm 1.2583 (1.2336) [2022-01-18 06:14:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][990/1251] eta 0:09:35 lr 0.000985 time 2.8303 (2.2049) loss 4.6320 (4.3711) grad_norm 1.6232 (1.2342) [2022-01-18 06:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1000/1251] eta 0:09:13 lr 0.000985 time 2.2192 (2.2044) loss 3.5464 (4.3707) grad_norm 0.9677 (1.2329) [2022-01-18 06:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1010/1251] eta 0:08:51 lr 0.000985 time 1.8571 (2.2050) loss 4.5987 (4.3688) grad_norm 1.1967 (1.2327) [2022-01-18 06:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1020/1251] eta 0:08:29 lr 0.000985 time 2.0415 (2.2037) loss 4.6276 (4.3692) grad_norm 1.2444 (1.2327) [2022-01-18 06:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1030/1251] eta 0:08:07 lr 0.000985 time 
1.8976 (2.2037) loss 4.8222 (4.3706) grad_norm 1.3379 (1.2323) [2022-01-18 06:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1040/1251] eta 0:07:44 lr 0.000985 time 1.9103 (2.2027) loss 5.0405 (4.3716) grad_norm 1.3682 (1.2329) [2022-01-18 06:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1050/1251] eta 0:07:22 lr 0.000985 time 2.0576 (2.2016) loss 5.2704 (4.3726) grad_norm 1.5203 (1.2327) [2022-01-18 06:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1060/1251] eta 0:07:00 lr 0.000985 time 2.4716 (2.2007) loss 4.7342 (4.3729) grad_norm 1.3823 (1.2334) [2022-01-18 06:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1070/1251] eta 0:06:38 lr 0.000985 time 1.9182 (2.2001) loss 3.6909 (4.3723) grad_norm 1.1512 (1.2335) [2022-01-18 06:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1080/1251] eta 0:06:16 lr 0.000985 time 1.8966 (2.1997) loss 3.2792 (4.3721) grad_norm 1.2678 (1.2332) [2022-01-18 06:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1090/1251] eta 0:05:54 lr 0.000985 time 2.8158 (2.2010) loss 3.6753 (4.3721) grad_norm 1.1320 (1.2317) [2022-01-18 06:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1100/1251] eta 0:05:32 lr 0.000985 time 2.1334 (2.2010) loss 5.0020 (4.3728) grad_norm 1.4425 (1.2304) [2022-01-18 06:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1110/1251] eta 0:05:10 lr 0.000985 time 2.1332 (2.2008) loss 3.3427 (4.3714) grad_norm 1.1481 (1.2307) [2022-01-18 06:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1120/1251] eta 0:04:48 lr 0.000985 time 1.9773 (2.2002) loss 3.4840 (4.3696) grad_norm 1.5343 (1.2318) [2022-01-18 06:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1130/1251] eta 0:04:26 lr 0.000985 time 2.8031 (2.2002) loss 4.6545 (4.3697) grad_norm 1.0789 (1.2311) [2022-01-18 06:19:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1140/1251] eta 0:04:04 lr 0.000985 time 2.8999 (2.2009) loss 3.1982 (4.3686) grad_norm 1.2281 (1.2308) [2022-01-18 06:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1150/1251] eta 0:03:42 lr 0.000985 time 2.3019 (2.2001) loss 4.4923 (4.3681) grad_norm 1.1892 (1.2303) [2022-01-18 06:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1160/1251] eta 0:03:20 lr 0.000985 time 1.8595 (2.1997) loss 4.6972 (4.3673) grad_norm 1.2919 (1.2298) [2022-01-18 06:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1170/1251] eta 0:02:58 lr 0.000985 time 2.6397 (2.1988) loss 3.8694 (4.3675) grad_norm 1.2256 (1.2301) [2022-01-18 06:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1180/1251] eta 0:02:36 lr 0.000985 time 2.1447 (2.1990) loss 4.7758 (4.3697) grad_norm 1.0969 (1.2303) [2022-01-18 06:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1190/1251] eta 0:02:14 lr 0.000985 time 1.5186 (2.1989) loss 5.0667 (4.3710) grad_norm 1.2274 (1.2303) [2022-01-18 06:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1200/1251] eta 0:01:52 lr 0.000985 time 2.1376 (2.2005) loss 2.9760 (4.3717) grad_norm 1.1426 (1.2300) [2022-01-18 06:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1210/1251] eta 0:01:30 lr 0.000984 time 2.7148 (2.2007) loss 2.8298 (4.3693) grad_norm 1.0432 (1.2298) [2022-01-18 06:22:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1220/1251] eta 0:01:08 lr 0.000984 time 2.3528 (2.1999) loss 5.3470 (4.3715) grad_norm 1.1603 (1.2286) [2022-01-18 06:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1230/1251] eta 0:00:46 lr 0.000984 time 1.9066 (2.1983) loss 4.5015 (4.3716) grad_norm 1.2627 (1.2289) [2022-01-18 06:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1240/1251] eta 0:00:24 lr 
0.000984 time 1.7642 (2.1969) loss 4.3883 (4.3721) grad_norm 1.2203 (1.2286) [2022-01-18 06:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [23/300][1250/1251] eta 0:00:02 lr 0.000984 time 1.1874 (2.1907) loss 3.5428 (4.3711) grad_norm 1.2174 (1.2283) [2022-01-18 06:23:43 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 23 training takes 0:45:41 [2022-01-18 06:24:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.690 (18.690) Loss 1.7633 (1.7633) Acc@1 59.766 (59.766) Acc@5 84.180 (84.180) [2022-01-18 06:24:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.632 (3.399) Loss 1.6820 (1.7353) Acc@1 63.281 (61.887) Acc@5 85.449 (84.695) [2022-01-18 06:24:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.895 (2.633) Loss 1.7488 (1.7296) Acc@1 62.695 (62.267) Acc@5 83.691 (84.868) [2022-01-18 06:24:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.619 (2.340) Loss 1.7028 (1.7321) Acc@1 63.770 (62.333) Acc@5 84.863 (84.851) [2022-01-18 06:25:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.174 (2.125) Loss 1.7477 (1.7279) Acc@1 60.742 (62.438) Acc@5 84.082 (84.892) [2022-01-18 06:25:19 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 62.412 Acc@5 84.894 [2022-01-18 06:25:19 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 62.4% [2022-01-18 06:25:19 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 62.41% [2022-01-18 06:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][0/1251] eta 7:32:23 lr 0.000984 time 21.6972 (21.6972) loss 4.8168 (4.8168) grad_norm 1.0208 (1.0208) [2022-01-18 06:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][10/1251] eta 1:21:33 lr 0.000984 time 1.9110 (3.9431) loss 3.1807 (4.3055) grad_norm 1.0321 (1.1431) [2022-01-18 06:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[24/300][20/1251] eta 1:03:34 lr 0.000984 time 1.3330 (3.0987) loss 4.5295 (4.3405) grad_norm 0.9593 (1.1824) [2022-01-18 06:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][30/1251] eta 0:55:47 lr 0.000984 time 1.8699 (2.7415) loss 4.8739 (4.3751) grad_norm 1.2021 (1.2103) [2022-01-18 06:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][40/1251] eta 0:54:38 lr 0.000984 time 4.7116 (2.7074) loss 4.4455 (4.3699) grad_norm 1.4976 (1.2156) [2022-01-18 06:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][50/1251] eta 0:52:45 lr 0.000984 time 2.4973 (2.6353) loss 4.9316 (4.3795) grad_norm 1.1078 (1.1997) [2022-01-18 06:27:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][60/1251] eta 0:50:37 lr 0.000984 time 1.3812 (2.5505) loss 4.5965 (4.3324) grad_norm 1.0674 (1.1933) [2022-01-18 06:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][70/1251] eta 0:49:20 lr 0.000984 time 2.2302 (2.5064) loss 4.3403 (4.3255) grad_norm 1.1145 (1.1851) [2022-01-18 06:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][80/1251] eta 0:48:28 lr 0.000984 time 3.3781 (2.4836) loss 4.0957 (4.3646) grad_norm 1.3822 (1.1960) [2022-01-18 06:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][90/1251] eta 0:47:28 lr 0.000984 time 2.2241 (2.4531) loss 3.5079 (4.3927) grad_norm 0.9911 (1.1985) [2022-01-18 06:29:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][100/1251] eta 0:46:20 lr 0.000984 time 1.9228 (2.4159) loss 4.9660 (4.3973) grad_norm 1.2457 (1.2007) [2022-01-18 06:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][110/1251] eta 0:45:24 lr 0.000984 time 2.1665 (2.3880) loss 4.5318 (4.4192) grad_norm 1.0005 (1.1930) [2022-01-18 06:30:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][120/1251] eta 0:44:54 lr 0.000984 time 3.5905 (2.3826) loss 4.0151 (4.4085) grad_norm 1.1244 (1.1932) 
[2022-01-18 06:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][130/1251] eta 0:44:13 lr 0.000984 time 2.3688 (2.3667) loss 4.7654 (4.4175) grad_norm 1.5091 (1.1938) [2022-01-18 06:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][140/1251] eta 0:43:38 lr 0.000984 time 1.5292 (2.3568) loss 3.3849 (4.4007) grad_norm 1.1786 (1.1883) [2022-01-18 06:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][150/1251] eta 0:43:02 lr 0.000984 time 1.8437 (2.3456) loss 4.9320 (4.4122) grad_norm 1.1673 (1.1830) [2022-01-18 06:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][160/1251] eta 0:42:31 lr 0.000984 time 2.5135 (2.3388) loss 4.7870 (4.3980) grad_norm 1.1605 (1.1804) [2022-01-18 06:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][170/1251] eta 0:41:48 lr 0.000984 time 1.9452 (2.3210) loss 4.0520 (4.4005) grad_norm 1.6178 (1.1818) [2022-01-18 06:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][180/1251] eta 0:41:10 lr 0.000984 time 2.0776 (2.3071) loss 4.5262 (4.4052) grad_norm 1.6420 (1.1802) [2022-01-18 06:32:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][190/1251] eta 0:40:29 lr 0.000984 time 2.0171 (2.2900) loss 4.0041 (4.4018) grad_norm 1.2869 (1.1806) [2022-01-18 06:32:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][200/1251] eta 0:39:56 lr 0.000984 time 2.4262 (2.2801) loss 3.5614 (4.3814) grad_norm 1.4754 (1.1832) [2022-01-18 06:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][210/1251] eta 0:39:21 lr 0.000984 time 1.9167 (2.2689) loss 4.6140 (4.3690) grad_norm 1.1107 (1.1920) [2022-01-18 06:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][220/1251] eta 0:38:55 lr 0.000984 time 1.8708 (2.2654) loss 4.1533 (4.3719) grad_norm 1.1971 (1.1957) [2022-01-18 06:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][230/1251] eta 0:38:34 
lr 0.000984 time 2.1377 (2.2673) loss 4.7471 (4.3871) grad_norm 1.3145 (1.1994) [2022-01-18 06:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][240/1251] eta 0:38:16 lr 0.000984 time 2.4717 (2.2711) loss 4.5089 (4.3883) grad_norm 1.1628 (1.2088) [2022-01-18 06:34:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][250/1251] eta 0:37:51 lr 0.000984 time 1.9833 (2.2697) loss 3.7669 (4.3813) grad_norm 0.9960 (1.2070) [2022-01-18 06:35:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][260/1251] eta 0:37:37 lr 0.000984 time 2.3811 (2.2776) loss 3.5833 (4.3768) grad_norm 1.3803 (1.2099) [2022-01-18 06:35:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][270/1251] eta 0:37:11 lr 0.000984 time 2.1797 (2.2747) loss 4.3669 (4.3720) grad_norm 1.0512 (1.2137) [2022-01-18 06:35:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][280/1251] eta 0:36:44 lr 0.000984 time 1.9078 (2.2702) loss 3.2605 (4.3723) grad_norm 0.9407 (1.2112) [2022-01-18 06:36:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][290/1251] eta 0:36:16 lr 0.000984 time 1.8726 (2.2647) loss 4.0265 (4.3704) grad_norm 1.3465 (1.2121) [2022-01-18 06:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][300/1251] eta 0:35:47 lr 0.000984 time 1.8402 (2.2584) loss 4.2939 (4.3756) grad_norm 1.1008 (1.2129) [2022-01-18 06:37:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][310/1251] eta 0:35:23 lr 0.000984 time 2.1518 (2.2570) loss 3.8170 (4.3748) grad_norm 1.1208 (1.2149) [2022-01-18 06:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][320/1251] eta 0:34:58 lr 0.000984 time 1.8629 (2.2544) loss 4.3763 (4.3753) grad_norm 1.1589 (1.2151) [2022-01-18 06:37:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][330/1251] eta 0:34:37 lr 0.000984 time 1.9707 (2.2560) loss 3.7114 (4.3705) grad_norm 1.4043 (1.2190) [2022-01-18 06:38:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][340/1251] eta 0:34:09 lr 0.000984 time 1.9286 (2.2499) loss 3.8219 (4.3764) grad_norm 1.0497 (1.2187) [2022-01-18 06:38:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][350/1251] eta 0:33:43 lr 0.000984 time 2.3079 (2.2456) loss 3.6494 (4.3738) grad_norm 1.1925 (1.2223) [2022-01-18 06:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][360/1251] eta 0:33:15 lr 0.000984 time 1.9347 (2.2396) loss 3.6354 (4.3730) grad_norm 1.7462 (1.2232) [2022-01-18 06:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][370/1251] eta 0:32:52 lr 0.000984 time 2.1773 (2.2388) loss 4.9781 (4.3729) grad_norm 1.3369 (1.2251) [2022-01-18 06:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][380/1251] eta 0:32:29 lr 0.000984 time 2.3564 (2.2384) loss 4.7450 (4.3657) grad_norm 0.9942 (1.2245) [2022-01-18 06:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][390/1251] eta 0:32:04 lr 0.000984 time 2.1936 (2.2349) loss 4.1116 (4.3589) grad_norm 1.2666 (1.2267) [2022-01-18 06:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][400/1251] eta 0:31:38 lr 0.000984 time 1.9874 (2.2309) loss 4.4386 (4.3644) grad_norm 1.0232 (1.2281) [2022-01-18 06:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][410/1251] eta 0:31:14 lr 0.000984 time 1.8338 (2.2288) loss 4.7152 (4.3671) grad_norm 1.2235 (1.2274) [2022-01-18 06:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][420/1251] eta 0:30:53 lr 0.000984 time 2.8713 (2.2301) loss 4.3017 (4.3694) grad_norm 1.0938 (1.2249) [2022-01-18 06:41:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][430/1251] eta 0:30:31 lr 0.000984 time 2.1520 (2.2307) loss 4.5135 (4.3616) grad_norm 0.9225 (1.2243) [2022-01-18 06:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][440/1251] eta 0:30:11 lr 0.000984 time 
2.2419 (2.2335) loss 4.6645 (4.3622) grad_norm 1.3602 (1.2250) [2022-01-18 06:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][450/1251] eta 0:29:49 lr 0.000984 time 1.8449 (2.2339) loss 4.5438 (4.3614) grad_norm 1.3970 (1.2236) [2022-01-18 06:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][460/1251] eta 0:29:24 lr 0.000984 time 2.4330 (2.2308) loss 5.3681 (4.3654) grad_norm 1.1584 (1.2238) [2022-01-18 06:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][470/1251] eta 0:28:57 lr 0.000984 time 1.8570 (2.2241) loss 3.7502 (4.3644) grad_norm 1.4626 (1.2252) [2022-01-18 06:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][480/1251] eta 0:28:31 lr 0.000984 time 2.0494 (2.2205) loss 5.1333 (4.3638) grad_norm 1.0857 (1.2236) [2022-01-18 06:43:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][490/1251] eta 0:28:09 lr 0.000984 time 2.3184 (2.2206) loss 4.7876 (4.3565) grad_norm 1.2852 (1.2228) [2022-01-18 06:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][500/1251] eta 0:27:49 lr 0.000984 time 3.0656 (2.2232) loss 3.2616 (4.3505) grad_norm 1.1166 (1.2256) [2022-01-18 06:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][510/1251] eta 0:27:27 lr 0.000984 time 1.5217 (2.2232) loss 4.8198 (4.3448) grad_norm 1.4135 (1.2243) [2022-01-18 06:44:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][520/1251] eta 0:27:03 lr 0.000984 time 1.6430 (2.2211) loss 4.4144 (4.3455) grad_norm 1.1844 (1.2223) [2022-01-18 06:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][530/1251] eta 0:26:40 lr 0.000984 time 2.4568 (2.2192) loss 4.6103 (4.3466) grad_norm 1.2109 (1.2216) [2022-01-18 06:45:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][540/1251] eta 0:26:17 lr 0.000984 time 3.1643 (2.2190) loss 3.7825 (4.3412) grad_norm 1.1120 (1.2229) [2022-01-18 06:45:42 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][550/1251] eta 0:25:56 lr 0.000984 time 1.8980 (2.2201) loss 4.8046 (4.3359) grad_norm 1.1221 (1.2223) [2022-01-18 06:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][560/1251] eta 0:25:33 lr 0.000984 time 1.9195 (2.2189) loss 4.5653 (4.3375) grad_norm 1.1869 (1.2213) [2022-01-18 06:46:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][570/1251] eta 0:25:10 lr 0.000984 time 2.3247 (2.2183) loss 4.6288 (4.3394) grad_norm 1.6368 (1.2228) [2022-01-18 06:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][580/1251] eta 0:24:48 lr 0.000984 time 2.4277 (2.2177) loss 3.2906 (4.3396) grad_norm 1.1824 (1.2227) [2022-01-18 06:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][590/1251] eta 0:24:25 lr 0.000984 time 1.7863 (2.2164) loss 5.1554 (4.3402) grad_norm 1.3316 (1.2217) [2022-01-18 06:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][600/1251] eta 0:24:02 lr 0.000984 time 2.1223 (2.2155) loss 4.0157 (4.3414) grad_norm 1.0926 (1.2210) [2022-01-18 06:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][610/1251] eta 0:23:41 lr 0.000984 time 2.1793 (2.2174) loss 3.2805 (4.3401) grad_norm 1.0052 (1.2204) [2022-01-18 06:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][620/1251] eta 0:23:17 lr 0.000984 time 1.8816 (2.2145) loss 3.5209 (4.3391) grad_norm 1.6115 (1.2198) [2022-01-18 06:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][630/1251] eta 0:22:54 lr 0.000984 time 2.4574 (2.2127) loss 4.5190 (4.3426) grad_norm 1.6144 (1.2209) [2022-01-18 06:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][640/1251] eta 0:22:30 lr 0.000984 time 2.1917 (2.2105) loss 4.7494 (4.3454) grad_norm 1.2655 (1.2204) [2022-01-18 06:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][650/1251] eta 0:22:07 lr 0.000984 time 
1.8880 (2.2093) loss 5.0939 (4.3431) grad_norm 0.9280 (1.2199) [2022-01-18 06:49:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][660/1251] eta 0:21:44 lr 0.000984 time 1.8640 (2.2077) loss 3.8618 (4.3375) grad_norm 1.2223 (1.2190) [2022-01-18 06:50:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][670/1251] eta 0:21:23 lr 0.000984 time 2.8018 (2.2097) loss 3.7140 (4.3347) grad_norm 1.3852 (1.2206) [2022-01-18 06:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][680/1251] eta 0:21:00 lr 0.000984 time 1.9200 (2.2071) loss 3.6672 (4.3357) grad_norm 1.3507 (1.2211) [2022-01-18 06:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][690/1251] eta 0:20:39 lr 0.000984 time 2.5135 (2.2086) loss 4.2403 (4.3353) grad_norm 1.1753 (1.2206) [2022-01-18 06:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][700/1251] eta 0:20:17 lr 0.000984 time 1.8520 (2.2096) loss 4.7406 (4.3360) grad_norm 1.4528 (1.2193) [2022-01-18 06:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][710/1251] eta 0:19:57 lr 0.000984 time 3.3550 (2.2127) loss 3.6565 (4.3374) grad_norm 1.0734 (1.2195) [2022-01-18 06:51:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][720/1251] eta 0:19:34 lr 0.000984 time 1.8549 (2.2124) loss 4.6986 (4.3358) grad_norm 1.0894 (1.2188) [2022-01-18 06:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][730/1251] eta 0:19:13 lr 0.000984 time 2.9343 (2.2132) loss 3.9840 (4.3388) grad_norm 1.2066 (1.2199) [2022-01-18 06:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][740/1251] eta 0:18:49 lr 0.000984 time 1.6147 (2.2101) loss 3.4773 (4.3404) grad_norm 1.2678 (1.2204) [2022-01-18 06:52:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][750/1251] eta 0:18:26 lr 0.000984 time 2.7547 (2.2085) loss 4.6747 (4.3453) grad_norm 1.0860 (1.2220) [2022-01-18 06:53:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][760/1251] eta 0:18:03 lr 0.000984 time 2.0555 (2.2069) loss 4.6285 (4.3450) grad_norm 1.0802 (1.2214) [2022-01-18 06:53:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][770/1251] eta 0:17:42 lr 0.000984 time 3.7202 (2.2097) loss 4.5816 (4.3477) grad_norm 1.0053 (1.2197) [2022-01-18 06:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][780/1251] eta 0:17:21 lr 0.000984 time 2.0620 (2.2103) loss 3.7468 (4.3481) grad_norm 1.2572 (1.2190) [2022-01-18 06:54:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][790/1251] eta 0:16:59 lr 0.000984 time 2.7805 (2.2107) loss 4.8624 (4.3540) grad_norm 1.3125 (1.2192) [2022-01-18 06:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][800/1251] eta 0:16:36 lr 0.000984 time 2.0142 (2.2093) loss 4.4311 (4.3571) grad_norm 1.1168 (1.2177) [2022-01-18 06:55:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][810/1251] eta 0:16:13 lr 0.000984 time 2.6007 (2.2072) loss 3.3247 (4.3585) grad_norm 1.6412 (1.2169) [2022-01-18 06:55:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][820/1251] eta 0:15:50 lr 0.000984 time 1.8652 (2.2050) loss 3.8868 (4.3583) grad_norm 1.0535 (1.2171) [2022-01-18 06:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][830/1251] eta 0:15:28 lr 0.000984 time 2.2324 (2.2046) loss 3.5296 (4.3580) grad_norm 1.1562 (1.2170) [2022-01-18 06:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][840/1251] eta 0:15:05 lr 0.000984 time 2.2979 (2.2029) loss 4.7135 (4.3610) grad_norm 1.4054 (1.2179) [2022-01-18 06:56:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][850/1251] eta 0:14:43 lr 0.000984 time 2.3724 (2.2027) loss 3.9634 (4.3632) grad_norm 1.2234 (1.2183) [2022-01-18 06:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][860/1251] eta 0:14:21 lr 0.000984 time 
1.8745 (2.2025) loss 4.1970 (4.3622) grad_norm 1.0414 (1.2184) [2022-01-18 06:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][870/1251] eta 0:13:58 lr 0.000984 time 1.7475 (2.2007) loss 3.2846 (4.3609) grad_norm 1.0376 (1.2169) [2022-01-18 06:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][880/1251] eta 0:13:36 lr 0.000984 time 2.0595 (2.2002) loss 4.5402 (4.3610) grad_norm 1.3581 (1.2172) [2022-01-18 06:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][890/1251] eta 0:13:14 lr 0.000984 time 2.4761 (2.1997) loss 3.8562 (4.3625) grad_norm 1.2310 (1.2170) [2022-01-18 06:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][900/1251] eta 0:12:52 lr 0.000984 time 2.1851 (2.2005) loss 3.5424 (4.3598) grad_norm 1.3377 (1.2167) [2022-01-18 06:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][910/1251] eta 0:12:30 lr 0.000983 time 1.8255 (2.2014) loss 4.2056 (4.3633) grad_norm 0.9400 (1.2162) [2022-01-18 06:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][920/1251] eta 0:12:08 lr 0.000983 time 1.8548 (2.2007) loss 3.4788 (4.3587) grad_norm 1.1661 (1.2174) [2022-01-18 06:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][930/1251] eta 0:11:46 lr 0.000983 time 2.8236 (2.2015) loss 3.2979 (4.3553) grad_norm 1.1401 (1.2173) [2022-01-18 06:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][940/1251] eta 0:11:25 lr 0.000983 time 1.8087 (2.2035) loss 5.1566 (4.3567) grad_norm 1.3726 (1.2175) [2022-01-18 07:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][950/1251] eta 0:11:03 lr 0.000983 time 1.8613 (2.2052) loss 4.3259 (4.3552) grad_norm 1.1019 (1.2173) [2022-01-18 07:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][960/1251] eta 0:10:41 lr 0.000983 time 1.8862 (2.2055) loss 4.1856 (4.3547) grad_norm 1.0952 (1.2164) [2022-01-18 07:00:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][970/1251] eta 0:10:19 lr 0.000983 time 2.2244 (2.2042) loss 4.0825 (4.3533) grad_norm 1.3350 (1.2169) [2022-01-18 07:01:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][980/1251] eta 0:09:57 lr 0.000983 time 1.7453 (2.2031) loss 3.2440 (4.3530) grad_norm 1.2644 (1.2167) [2022-01-18 07:01:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][990/1251] eta 0:09:34 lr 0.000983 time 1.9343 (2.2015) loss 4.4060 (4.3534) grad_norm 1.1325 (1.2163) [2022-01-18 07:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1000/1251] eta 0:09:12 lr 0.000983 time 2.1637 (2.2009) loss 5.1819 (4.3522) grad_norm 1.0884 (1.2157) [2022-01-18 07:02:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1010/1251] eta 0:08:50 lr 0.000983 time 2.5819 (2.2004) loss 4.6862 (4.3522) grad_norm 1.0743 (1.2152) [2022-01-18 07:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1020/1251] eta 0:08:28 lr 0.000983 time 1.8545 (2.1996) loss 4.6688 (4.3548) grad_norm 1.2900 (1.2159) [2022-01-18 07:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1030/1251] eta 0:08:06 lr 0.000983 time 1.9196 (2.1997) loss 3.5327 (4.3562) grad_norm 1.2911 (1.2153) [2022-01-18 07:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1040/1251] eta 0:07:44 lr 0.000983 time 1.5788 (2.1993) loss 4.1005 (4.3586) grad_norm 1.2571 (1.2154) [2022-01-18 07:03:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1050/1251] eta 0:07:22 lr 0.000983 time 1.9253 (2.1996) loss 3.3615 (4.3583) grad_norm 1.2755 (1.2154) [2022-01-18 07:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1060/1251] eta 0:07:00 lr 0.000983 time 2.7768 (2.2003) loss 3.8216 (4.3595) grad_norm 1.2560 (1.2160) [2022-01-18 07:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1070/1251] eta 0:06:38 lr 0.000983 
time 1.7010 (2.2015) loss 4.9271 (4.3609) grad_norm 1.1388 (1.2157) [2022-01-18 07:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1080/1251] eta 0:06:16 lr 0.000983 time 2.0945 (2.2013) loss 4.7596 (4.3620) grad_norm 1.3412 (1.2157) [2022-01-18 07:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1090/1251] eta 0:05:54 lr 0.000983 time 1.9245 (2.2003) loss 3.7006 (4.3615) grad_norm 1.4504 (1.2169) [2022-01-18 07:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1100/1251] eta 0:05:31 lr 0.000983 time 2.5156 (2.1982) loss 4.6220 (4.3617) grad_norm 1.3654 (1.2171) [2022-01-18 07:06:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1110/1251] eta 0:05:09 lr 0.000983 time 2.5729 (2.1984) loss 4.5375 (4.3636) grad_norm 1.0544 (1.2169) [2022-01-18 07:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1120/1251] eta 0:04:47 lr 0.000983 time 2.1210 (2.1984) loss 4.6092 (4.3645) grad_norm 1.1796 (1.2167) [2022-01-18 07:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1130/1251] eta 0:04:26 lr 0.000983 time 1.8857 (2.1999) loss 4.9133 (4.3652) grad_norm 1.1724 (1.2165) [2022-01-18 07:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1140/1251] eta 0:04:04 lr 0.000983 time 2.2104 (2.2010) loss 4.5321 (4.3641) grad_norm 1.2098 (1.2164) [2022-01-18 07:07:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1150/1251] eta 0:03:42 lr 0.000983 time 2.7081 (2.2023) loss 5.2532 (4.3645) grad_norm 1.2952 (1.2163) [2022-01-18 07:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1160/1251] eta 0:03:20 lr 0.000983 time 1.7807 (2.2011) loss 4.1148 (4.3631) grad_norm 1.1777 (1.2167) [2022-01-18 07:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1170/1251] eta 0:02:58 lr 0.000983 time 1.6061 (2.1984) loss 3.4640 (4.3607) grad_norm 1.1697 (1.2162) [2022-01-18 07:08:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1180/1251] eta 0:02:35 lr 0.000983 time 1.9566 (2.1970) loss 4.4032 (4.3592) grad_norm 1.2795 (1.2165) [2022-01-18 07:08:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1190/1251] eta 0:02:14 lr 0.000983 time 1.8669 (2.1968) loss 4.4654 (4.3587) grad_norm 1.2632 (1.2158) [2022-01-18 07:09:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1200/1251] eta 0:01:52 lr 0.000983 time 2.2019 (2.1965) loss 4.2604 (4.3570) grad_norm 1.3439 (1.2152) [2022-01-18 07:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1210/1251] eta 0:01:30 lr 0.000983 time 1.8741 (2.1968) loss 4.6995 (4.3577) grad_norm 1.1112 (1.2148) [2022-01-18 07:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1220/1251] eta 0:01:08 lr 0.000983 time 2.1592 (2.1983) loss 4.2699 (4.3579) grad_norm 1.3781 (1.2154) [2022-01-18 07:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1230/1251] eta 0:00:46 lr 0.000983 time 2.1116 (2.2009) loss 4.0255 (4.3584) grad_norm 1.0322 (1.2146) [2022-01-18 07:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1240/1251] eta 0:00:24 lr 0.000983 time 1.3782 (2.1997) loss 4.6780 (4.3595) grad_norm 1.3362 (1.2150) [2022-01-18 07:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [24/300][1250/1251] eta 0:00:02 lr 0.000983 time 1.1901 (2.1943) loss 3.6236 (4.3575) grad_norm 1.6117 (1.2148) [2022-01-18 07:11:05 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 24 training takes 0:45:45 [2022-01-18 07:11:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.643 (18.643) Loss 1.7029 (1.7029) Acc@1 62.793 (62.793) Acc@5 84.570 (84.570) [2022-01-18 07:11:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.214 (3.574) Loss 1.7417 (1.6621) Acc@1 60.938 (62.695) Acc@5 83.398 (85.369) [2022-01-18 07:12:02 swin_tiny_patch4_window7_224] 
(main.py 239): INFO Test: [20/49] Time 1.492 (2.714) Loss 1.7036 (1.6555) Acc@1 62.598 (62.821) Acc@5 86.523 (85.575) [2022-01-18 07:12:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.269 (2.397) Loss 1.7253 (1.6560) Acc@1 60.547 (63.073) Acc@5 85.352 (85.654) [2022-01-18 07:12:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.099 (2.211) Loss 1.6768 (1.6557) Acc@1 62.305 (63.048) Acc@5 86.914 (85.706) [2022-01-18 07:12:43 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 63.136 Acc@5 85.740 [2022-01-18 07:12:43 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 63.1% [2022-01-18 07:12:43 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 63.14% [2022-01-18 07:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][0/1251] eta 7:27:00 lr 0.000983 time 21.4389 (21.4389) loss 5.1893 (5.1893) grad_norm 1.3376 (1.3376) [2022-01-18 07:13:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][10/1251] eta 1:24:42 lr 0.000983 time 1.5256 (4.0959) loss 4.4762 (4.4706) grad_norm 0.9195 (1.1187) [2022-01-18 07:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][20/1251] eta 1:06:26 lr 0.000983 time 2.8257 (3.2381) loss 4.5761 (4.3138) grad_norm 1.0572 (1.1112) [2022-01-18 07:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][30/1251] eta 0:58:35 lr 0.000983 time 1.3484 (2.8790) loss 4.6676 (4.2203) grad_norm 1.3222 (1.1895) [2022-01-18 07:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][40/1251] eta 0:55:08 lr 0.000983 time 3.9126 (2.7324) loss 4.2088 (4.2567) grad_norm 1.1693 (1.2286) [2022-01-18 07:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][50/1251] eta 0:52:42 lr 0.000983 time 1.7186 (2.6334) loss 3.6456 (4.2161) grad_norm 1.3486 (1.2305) [2022-01-18 07:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][60/1251] 
eta 0:51:04 lr 0.000983 time 2.1476 (2.5729) loss 5.0354 (4.3051) grad_norm 1.2978 (1.2275) [2022-01-18 07:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][70/1251] eta 0:49:25 lr 0.000983 time 1.6226 (2.5107) loss 3.0978 (4.3038) grad_norm 1.1531 (1.2245) [2022-01-18 07:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][80/1251] eta 0:48:23 lr 0.000983 time 3.0147 (2.4796) loss 3.2796 (4.2899) grad_norm 1.3841 (1.2266) [2022-01-18 07:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][90/1251] eta 0:47:15 lr 0.000983 time 1.7147 (2.4424) loss 5.1058 (4.3219) grad_norm 1.2957 (1.2396) [2022-01-18 07:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][100/1251] eta 0:46:25 lr 0.000983 time 1.6602 (2.4199) loss 4.9859 (4.3198) grad_norm 1.0576 (1.2308) [2022-01-18 07:17:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][110/1251] eta 0:45:35 lr 0.000983 time 2.0892 (2.3972) loss 5.0741 (4.3175) grad_norm 1.2915 (1.2271) [2022-01-18 07:17:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][120/1251] eta 0:44:54 lr 0.000983 time 2.7646 (2.3826) loss 3.7661 (4.3170) grad_norm 1.6730 (1.2354) [2022-01-18 07:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][130/1251] eta 0:44:13 lr 0.000983 time 1.8888 (2.3674) loss 4.6486 (4.3126) grad_norm 1.2146 (1.2343) [2022-01-18 07:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][140/1251] eta 0:43:31 lr 0.000983 time 1.5775 (2.3504) loss 4.5650 (4.3052) grad_norm 1.1181 (1.2359) [2022-01-18 07:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][150/1251] eta 0:42:54 lr 0.000983 time 2.5029 (2.3379) loss 4.4783 (4.3275) grad_norm 1.4455 (1.2439) [2022-01-18 07:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][160/1251] eta 0:42:32 lr 0.000983 time 3.7303 (2.3391) loss 4.8094 (4.3392) grad_norm 0.9452 (1.2446) [2022-01-18 07:19:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][170/1251] eta 0:41:56 lr 0.000983 time 1.6226 (2.3283) loss 3.9695 (4.3315) grad_norm 1.3054 (1.2397) [2022-01-18 07:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][180/1251] eta 0:41:19 lr 0.000983 time 2.0567 (2.3155) loss 4.4556 (4.3268) grad_norm 1.1317 (1.2373) [2022-01-18 07:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][190/1251] eta 0:40:33 lr 0.000983 time 1.6192 (2.2938) loss 3.9344 (4.3290) grad_norm 1.2046 (1.2306) [2022-01-18 07:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][200/1251] eta 0:40:03 lr 0.000983 time 2.9208 (2.2867) loss 4.7422 (4.3363) grad_norm 1.0929 (1.2269) [2022-01-18 07:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][210/1251] eta 0:39:31 lr 0.000983 time 1.5010 (2.2783) loss 4.8084 (4.3361) grad_norm 1.2482 (1.2230) [2022-01-18 07:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][220/1251] eta 0:39:03 lr 0.000983 time 2.1829 (2.2729) loss 4.0921 (4.3325) grad_norm 1.3518 (1.2229) [2022-01-18 07:21:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][230/1251] eta 0:38:40 lr 0.000983 time 2.0063 (2.2730) loss 4.2161 (4.3327) grad_norm 1.2186 (1.2245) [2022-01-18 07:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][240/1251] eta 0:38:23 lr 0.000983 time 3.1810 (2.2780) loss 4.6384 (4.3332) grad_norm 1.0054 (1.2232) [2022-01-18 07:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][250/1251] eta 0:38:00 lr 0.000983 time 2.2169 (2.2785) loss 4.2753 (4.3350) grad_norm 1.1519 (1.2235) [2022-01-18 07:22:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][260/1251] eta 0:37:36 lr 0.000983 time 2.6469 (2.2765) loss 4.6128 (4.3338) grad_norm 0.9808 (1.2209) [2022-01-18 07:22:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][270/1251] eta 0:37:08 lr 0.000983 time 
1.6048 (2.2714) loss 3.3320 (4.3170) grad_norm 1.0851 (1.2209) [2022-01-18 07:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][280/1251] eta 0:36:39 lr 0.000983 time 2.4691 (2.2652) loss 4.6585 (4.3134) grad_norm 1.1530 (1.2174) [2022-01-18 07:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][290/1251] eta 0:36:06 lr 0.000983 time 1.8693 (2.2542) loss 4.4573 (4.3100) grad_norm 1.4298 (1.2199) [2022-01-18 07:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][300/1251] eta 0:35:38 lr 0.000983 time 1.8735 (2.2490) loss 4.8764 (4.3181) grad_norm 0.9437 (1.2161) [2022-01-18 07:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][310/1251] eta 0:35:14 lr 0.000983 time 2.0357 (2.2473) loss 4.2086 (4.3229) grad_norm 1.1149 (1.2170) [2022-01-18 07:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][320/1251] eta 0:34:51 lr 0.000983 time 2.6041 (2.2464) loss 3.5282 (4.3294) grad_norm 1.1913 (1.2169) [2022-01-18 07:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][330/1251] eta 0:34:27 lr 0.000983 time 1.8381 (2.2447) loss 3.6851 (4.3249) grad_norm 1.1628 (1.2177) [2022-01-18 07:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][340/1251] eta 0:34:09 lr 0.000983 time 2.5502 (2.2496) loss 4.7024 (4.3252) grad_norm 1.2872 (1.2170) [2022-01-18 07:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][350/1251] eta 0:33:47 lr 0.000983 time 1.6564 (2.2498) loss 5.0788 (4.3224) grad_norm 1.1522 (1.2141) [2022-01-18 07:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][360/1251] eta 0:33:22 lr 0.000983 time 2.2685 (2.2471) loss 4.3928 (4.3170) grad_norm 1.3082 (1.2129) [2022-01-18 07:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][370/1251] eta 0:32:53 lr 0.000983 time 2.1965 (2.2398) loss 4.4453 (4.3172) grad_norm 1.5854 (1.2152) [2022-01-18 07:26:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][380/1251] eta 0:32:25 lr 0.000983 time 1.9681 (2.2332) loss 4.7852 (4.3210) grad_norm 1.4042 (1.2169) [2022-01-18 07:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][390/1251] eta 0:32:00 lr 0.000983 time 2.2082 (2.2307) loss 3.7817 (4.3188) grad_norm 1.1635 (1.2163) [2022-01-18 07:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][400/1251] eta 0:31:38 lr 0.000983 time 3.0427 (2.2314) loss 5.0537 (4.3233) grad_norm 1.1820 (1.2172) [2022-01-18 07:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][410/1251] eta 0:31:16 lr 0.000983 time 2.4710 (2.2311) loss 3.6097 (4.3250) grad_norm 1.4294 (1.2182) [2022-01-18 07:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][420/1251] eta 0:30:55 lr 0.000983 time 1.8799 (2.2325) loss 4.2950 (4.3216) grad_norm 1.2311 (1.2202) [2022-01-18 07:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][430/1251] eta 0:30:32 lr 0.000983 time 2.6516 (2.2326) loss 4.7279 (4.3266) grad_norm 1.0680 (1.2193) [2022-01-18 07:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][440/1251] eta 0:30:09 lr 0.000983 time 3.0613 (2.2316) loss 4.9805 (4.3279) grad_norm 1.3859 (1.2205) [2022-01-18 07:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][450/1251] eta 0:29:45 lr 0.000983 time 1.5670 (2.2297) loss 4.8159 (4.3297) grad_norm 1.1517 (1.2231) [2022-01-18 07:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][460/1251] eta 0:29:22 lr 0.000983 time 1.8764 (2.2279) loss 4.1266 (4.3216) grad_norm 1.2797 (1.2244) [2022-01-18 07:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][470/1251] eta 0:29:00 lr 0.000983 time 2.4136 (2.2280) loss 5.1203 (4.3224) grad_norm 1.3921 (1.2253) [2022-01-18 07:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][480/1251] eta 0:28:39 lr 0.000983 time 
3.9299 (2.2308) loss 3.8144 (4.3221) grad_norm 1.1999 (1.2246) [2022-01-18 07:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][490/1251] eta 0:28:18 lr 0.000983 time 1.6070 (2.2313) loss 4.5787 (4.3264) grad_norm 1.1027 (1.2234) [2022-01-18 07:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][500/1251] eta 0:27:53 lr 0.000983 time 2.1220 (2.2280) loss 4.6132 (4.3315) grad_norm 1.0780 (1.2228) [2022-01-18 07:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][510/1251] eta 0:27:27 lr 0.000983 time 1.9783 (2.2238) loss 3.8734 (4.3327) grad_norm 1.2007 (1.2212) [2022-01-18 07:32:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][520/1251] eta 0:27:03 lr 0.000983 time 2.8069 (2.2212) loss 4.4047 (4.3379) grad_norm 1.1900 (1.2198) [2022-01-18 07:32:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][530/1251] eta 0:26:40 lr 0.000983 time 1.8536 (2.2194) loss 4.3020 (4.3367) grad_norm 1.3017 (1.2193) [2022-01-18 07:32:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][540/1251] eta 0:26:18 lr 0.000983 time 3.1035 (2.2203) loss 4.5163 (4.3360) grad_norm 0.9303 (1.2212) [2022-01-18 07:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][550/1251] eta 0:25:56 lr 0.000983 time 2.0916 (2.2198) loss 3.7105 (4.3323) grad_norm 1.0744 (1.2200) [2022-01-18 07:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][560/1251] eta 0:25:34 lr 0.000983 time 2.4618 (2.2205) loss 3.3951 (4.3334) grad_norm 1.1958 (1.2187) [2022-01-18 07:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][570/1251] eta 0:25:14 lr 0.000983 time 1.8109 (2.2241) loss 3.9361 (4.3326) grad_norm 1.7807 (1.2190) [2022-01-18 07:34:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][580/1251] eta 0:24:55 lr 0.000983 time 3.5879 (2.2282) loss 3.2290 (4.3263) grad_norm 1.4370 (1.2178) [2022-01-18 07:34:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][590/1251] eta 0:24:32 lr 0.000982 time 2.1517 (2.2284) loss 4.7340 (4.3288) grad_norm 1.0334 (1.2186) [2022-01-18 07:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][600/1251] eta 0:24:09 lr 0.000982 time 1.8429 (2.2264) loss 4.5340 (4.3287) grad_norm 1.1273 (1.2182) [2022-01-18 07:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][610/1251] eta 0:23:44 lr 0.000982 time 2.0363 (2.2225) loss 4.0042 (4.3245) grad_norm 1.1384 (1.2180) [2022-01-18 07:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][620/1251] eta 0:23:19 lr 0.000982 time 2.2682 (2.2175) loss 3.3477 (4.3209) grad_norm 1.1177 (1.2182) [2022-01-18 07:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][630/1251] eta 0:22:55 lr 0.000982 time 2.2464 (2.2149) loss 4.3598 (4.3211) grad_norm 1.2764 (1.2168) [2022-01-18 07:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][640/1251] eta 0:22:32 lr 0.000982 time 2.1061 (2.2130) loss 4.4260 (4.3225) grad_norm 1.2091 (1.2174) [2022-01-18 07:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][650/1251] eta 0:22:10 lr 0.000982 time 2.0020 (2.2141) loss 4.4489 (4.3258) grad_norm 1.1745 (1.2165) [2022-01-18 07:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][660/1251] eta 0:21:49 lr 0.000982 time 2.1677 (2.2153) loss 3.9457 (4.3263) grad_norm 1.1564 (1.2157) [2022-01-18 07:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][670/1251] eta 0:21:27 lr 0.000982 time 2.1161 (2.2156) loss 3.9026 (4.3284) grad_norm 0.9327 (1.2149) [2022-01-18 07:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][680/1251] eta 0:21:06 lr 0.000982 time 2.5228 (2.2183) loss 3.7408 (4.3291) grad_norm 1.3051 (1.2155) [2022-01-18 07:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][690/1251] eta 0:20:45 lr 0.000982 time 
2.3287 (2.2202) loss 3.4659 (4.3265) grad_norm 1.2292 (1.2153) [2022-01-18 07:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][700/1251] eta 0:20:24 lr 0.000982 time 2.4522 (2.2223) loss 3.3169 (4.3247) grad_norm 1.5643 (1.2149) [2022-01-18 07:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][710/1251] eta 0:20:01 lr 0.000982 time 1.8844 (2.2216) loss 4.7840 (4.3198) grad_norm 1.1017 (1.2144) [2022-01-18 07:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][720/1251] eta 0:19:37 lr 0.000982 time 1.6103 (2.2174) loss 4.4155 (4.3196) grad_norm 1.2669 (1.2139) [2022-01-18 07:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][730/1251] eta 0:19:12 lr 0.000982 time 1.5633 (2.2127) loss 3.7162 (4.3196) grad_norm 1.2821 (1.2151) [2022-01-18 07:40:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][740/1251] eta 0:18:49 lr 0.000982 time 2.4990 (2.2104) loss 4.5712 (4.3172) grad_norm 0.8729 (1.2136) [2022-01-18 07:40:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][750/1251] eta 0:18:26 lr 0.000982 time 1.6815 (2.2081) loss 3.5129 (4.3172) grad_norm 1.1695 (1.2124) [2022-01-18 07:40:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][760/1251] eta 0:18:03 lr 0.000982 time 1.9090 (2.2073) loss 3.4701 (4.3160) grad_norm 1.5955 (1.2117) [2022-01-18 07:41:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][770/1251] eta 0:17:42 lr 0.000982 time 2.1557 (2.2081) loss 4.8406 (4.3156) grad_norm 1.3011 (1.2125) [2022-01-18 07:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][780/1251] eta 0:17:20 lr 0.000982 time 2.0058 (2.2084) loss 3.9581 (4.3136) grad_norm 1.3134 (1.2137) [2022-01-18 07:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][790/1251] eta 0:16:58 lr 0.000982 time 2.4945 (2.2096) loss 4.7141 (4.3148) grad_norm 1.3187 (1.2147) [2022-01-18 07:42:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][800/1251] eta 0:16:36 lr 0.000982 time 2.5077 (2.2090) loss 4.5759 (4.3156) grad_norm 1.0240 (1.2147) [2022-01-18 07:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][810/1251] eta 0:16:15 lr 0.000982 time 2.9746 (2.2111) loss 4.9115 (4.3168) grad_norm 1.3809 (1.2148) [2022-01-18 07:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][820/1251] eta 0:15:52 lr 0.000982 time 1.9597 (2.2095) loss 4.1146 (4.3163) grad_norm 1.3485 (1.2150) [2022-01-18 07:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][830/1251] eta 0:15:30 lr 0.000982 time 1.8986 (2.2100) loss 4.9987 (4.3185) grad_norm 1.1946 (1.2146) [2022-01-18 07:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][840/1251] eta 0:15:08 lr 0.000982 time 2.9438 (2.2104) loss 4.9643 (4.3185) grad_norm 0.9803 (1.2144) [2022-01-18 07:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][850/1251] eta 0:14:46 lr 0.000982 time 1.7904 (2.2096) loss 3.5810 (4.3209) grad_norm 1.5144 (1.2143) [2022-01-18 07:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][860/1251] eta 0:14:23 lr 0.000982 time 1.8517 (2.2083) loss 3.9755 (4.3200) grad_norm 1.2748 (1.2138) [2022-01-18 07:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][870/1251] eta 0:14:02 lr 0.000982 time 2.7715 (2.2115) loss 4.4016 (4.3212) grad_norm 1.4447 (1.2145) [2022-01-18 07:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][880/1251] eta 0:13:41 lr 0.000982 time 3.3109 (2.2136) loss 4.7279 (4.3243) grad_norm 1.2551 (1.2146) [2022-01-18 07:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][890/1251] eta 0:13:19 lr 0.000982 time 1.8176 (2.2140) loss 4.4922 (4.3256) grad_norm 1.0743 (1.2151) [2022-01-18 07:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][900/1251] eta 0:12:56 lr 0.000982 time 
1.6960 (2.2116) loss 4.3924 (4.3271) grad_norm 1.0070 (1.2146) [2022-01-18 07:46:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][910/1251] eta 0:12:33 lr 0.000982 time 2.1886 (2.2089) loss 5.0666 (4.3281) grad_norm 1.4035 (1.2145) [2022-01-18 07:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][920/1251] eta 0:12:10 lr 0.000982 time 1.9930 (2.2072) loss 4.2785 (4.3297) grad_norm 1.3629 (1.2148) [2022-01-18 07:46:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][930/1251] eta 0:11:48 lr 0.000982 time 1.9485 (2.2082) loss 4.4278 (4.3296) grad_norm 1.1176 (1.2148) [2022-01-18 07:47:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][940/1251] eta 0:11:27 lr 0.000982 time 2.1608 (2.2095) loss 4.2661 (4.3313) grad_norm 1.1564 (1.2152) [2022-01-18 07:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][950/1251] eta 0:11:05 lr 0.000982 time 2.8280 (2.2106) loss 3.9905 (4.3306) grad_norm 1.0082 (1.2140) [2022-01-18 07:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][960/1251] eta 0:10:43 lr 0.000982 time 1.9995 (2.2099) loss 5.0988 (4.3274) grad_norm 1.1021 (1.2128) [2022-01-18 07:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][970/1251] eta 0:10:20 lr 0.000982 time 2.0160 (2.2086) loss 4.4019 (4.3294) grad_norm 1.2908 (1.2120) [2022-01-18 07:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][980/1251] eta 0:09:58 lr 0.000982 time 2.2975 (2.2082) loss 4.4625 (4.3288) grad_norm 1.0342 (1.2107) [2022-01-18 07:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][990/1251] eta 0:09:36 lr 0.000982 time 2.5096 (2.2083) loss 5.0939 (4.3294) grad_norm 1.7452 (1.2112) [2022-01-18 07:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1000/1251] eta 0:09:13 lr 0.000982 time 1.6251 (2.2068) loss 3.7742 (4.3285) grad_norm 0.9561 (1.2103) [2022-01-18 07:49:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1010/1251] eta 0:08:51 lr 0.000982 time 2.2498 (2.2069) loss 3.0827 (4.3288) grad_norm 1.2834 (1.2097) [2022-01-18 07:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1020/1251] eta 0:08:29 lr 0.000982 time 2.2103 (2.2066) loss 4.5122 (4.3276) grad_norm 1.2746 (1.2096) [2022-01-18 07:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1030/1251] eta 0:08:07 lr 0.000982 time 1.6637 (2.2070) loss 4.9039 (4.3291) grad_norm 1.2969 (1.2084) [2022-01-18 07:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1040/1251] eta 0:07:45 lr 0.000982 time 2.1763 (2.2059) loss 4.8855 (4.3310) grad_norm 1.1373 (1.2083) [2022-01-18 07:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1050/1251] eta 0:07:23 lr 0.000982 time 2.1199 (2.2063) loss 3.4750 (4.3323) grad_norm 1.2945 (1.2074) [2022-01-18 07:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1060/1251] eta 0:07:01 lr 0.000982 time 2.4902 (2.2071) loss 4.7607 (4.3354) grad_norm 1.0207 (1.2067) [2022-01-18 07:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1070/1251] eta 0:06:39 lr 0.000982 time 1.9066 (2.2084) loss 3.4468 (4.3368) grad_norm 1.0065 (1.2078) [2022-01-18 07:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1080/1251] eta 0:06:17 lr 0.000982 time 1.6902 (2.2067) loss 3.4207 (4.3369) grad_norm 1.7157 (1.2084) [2022-01-18 07:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1090/1251] eta 0:05:55 lr 0.000982 time 2.2324 (2.2055) loss 3.7104 (4.3356) grad_norm 1.8304 (1.2087) [2022-01-18 07:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1100/1251] eta 0:05:32 lr 0.000982 time 1.8138 (2.2028) loss 3.6840 (4.3343) grad_norm 1.0785 (1.2088) [2022-01-18 07:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1110/1251] eta 0:05:10 lr 
0.000982 time 2.0857 (2.2012) loss 4.1529 (4.3345) grad_norm 1.2506 (1.2080) [2022-01-18 07:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1120/1251] eta 0:04:48 lr 0.000982 time 1.9103 (2.2003) loss 4.0962 (4.3339) grad_norm 1.0257 (1.2073) [2022-01-18 07:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1130/1251] eta 0:04:26 lr 0.000982 time 2.2426 (2.1997) loss 5.0033 (4.3353) grad_norm 1.2452 (1.2077) [2022-01-18 07:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1140/1251] eta 0:04:04 lr 0.000982 time 2.2512 (2.1994) loss 4.3035 (4.3336) grad_norm 1.6186 (1.2070) [2022-01-18 07:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1150/1251] eta 0:03:42 lr 0.000982 time 2.0235 (2.1997) loss 3.7999 (4.3345) grad_norm 1.0080 (1.2064) [2022-01-18 07:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1160/1251] eta 0:03:20 lr 0.000982 time 1.6102 (2.1989) loss 4.1263 (4.3333) grad_norm 1.2252 (1.2064) [2022-01-18 07:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1170/1251] eta 0:02:58 lr 0.000982 time 3.2356 (2.2007) loss 4.6921 (4.3365) grad_norm 1.0005 (1.2068) [2022-01-18 07:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1180/1251] eta 0:02:36 lr 0.000982 time 2.1065 (2.2033) loss 3.4305 (4.3330) grad_norm 1.3440 (1.2075) [2022-01-18 07:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1190/1251] eta 0:02:14 lr 0.000982 time 2.1445 (2.2047) loss 3.2020 (4.3333) grad_norm 1.1898 (1.2078) [2022-01-18 07:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1200/1251] eta 0:01:52 lr 0.000982 time 2.2002 (2.2045) loss 4.9723 (4.3337) grad_norm 0.9407 (1.2069) [2022-01-18 07:57:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1210/1251] eta 0:01:30 lr 0.000982 time 2.4427 (2.2047) loss 4.6741 (4.3313) grad_norm 1.1462 (1.2067) [2022-01-18 07:57:33 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1220/1251] eta 0:01:08 lr 0.000982 time 1.6864 (2.2032) loss 4.2395 (4.3304) grad_norm 1.3165 (1.2065) [2022-01-18 07:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1230/1251] eta 0:00:46 lr 0.000982 time 1.8435 (2.2017) loss 4.3490 (4.3302) grad_norm 0.9379 (1.2065) [2022-01-18 07:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1240/1251] eta 0:00:24 lr 0.000982 time 2.1039 (2.2021) loss 4.0753 (4.3282) grad_norm 1.0527 (1.2056) [2022-01-18 07:58:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [25/300][1250/1251] eta 0:00:02 lr 0.000982 time 1.1915 (2.1969) loss 4.6941 (4.3285) grad_norm 1.1391 (1.2047) [2022-01-18 07:58:32 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 25 training takes 0:45:48 [2022-01-18 07:58:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.118 (18.118) Loss 1.6671 (1.6671) Acc@1 62.207 (62.207) Acc@5 84.473 (84.473) [2022-01-18 07:59:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.226 (3.322) Loss 1.5306 (1.6370) Acc@1 64.062 (63.175) Acc@5 87.598 (85.609) [2022-01-18 07:59:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 2.025 (2.584) Loss 1.5226 (1.6191) Acc@1 65.723 (63.709) Acc@5 86.230 (85.826) [2022-01-18 07:59:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.898 (2.314) Loss 1.6365 (1.6240) Acc@1 62.793 (63.477) Acc@5 85.156 (85.774) [2022-01-18 08:00:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.613 (2.149) Loss 1.6634 (1.6244) Acc@1 62.402 (63.353) Acc@5 85.840 (85.783) [2022-01-18 08:00:07 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 63.484 Acc@5 85.894 [2022-01-18 08:00:07 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 63.5% [2022-01-18 08:00:07 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 
63.48% [2022-01-18 08:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][0/1251] eta 7:34:01 lr 0.000982 time 21.7760 (21.7760) loss 4.2873 (4.2873) grad_norm 1.1431 (1.1431) [2022-01-18 08:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][10/1251] eta 1:22:04 lr 0.000982 time 1.4653 (3.9678) loss 3.7084 (4.0993) grad_norm 0.9691 (1.1234) [2022-01-18 08:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][20/1251] eta 1:02:53 lr 0.000982 time 1.4300 (3.0654) loss 4.0452 (4.1590) grad_norm 1.2984 (1.1622) [2022-01-18 08:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][30/1251] eta 0:56:40 lr 0.000982 time 1.2438 (2.7854) loss 4.6503 (4.1340) grad_norm 1.1563 (1.1799) [2022-01-18 08:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][40/1251] eta 0:54:51 lr 0.000982 time 5.3755 (2.7181) loss 4.9975 (4.2497) grad_norm 1.1433 (1.1928) [2022-01-18 08:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][50/1251] eta 0:52:00 lr 0.000982 time 1.3280 (2.5987) loss 4.9397 (4.2705) grad_norm 1.0978 (1.1737) [2022-01-18 08:02:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][60/1251] eta 0:49:49 lr 0.000982 time 1.8967 (2.5098) loss 4.4375 (4.2807) grad_norm 1.1875 (1.1756) [2022-01-18 08:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][70/1251] eta 0:48:48 lr 0.000982 time 1.4328 (2.4800) loss 5.2077 (4.2793) grad_norm 1.2218 (1.2160) [2022-01-18 08:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][80/1251] eta 0:48:22 lr 0.000982 time 4.2698 (2.4786) loss 4.3524 (4.2749) grad_norm 1.4066 (1.2116) [2022-01-18 08:03:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][90/1251] eta 0:47:17 lr 0.000982 time 1.4590 (2.4436) loss 3.7651 (4.2693) grad_norm 1.3781 (1.2110) [2022-01-18 08:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][100/1251] eta 0:46:14 lr 
0.000982 time 1.7240 (2.4106) loss 4.9652 (4.2569) grad_norm 1.2440 (1.2166) [2022-01-18 08:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][110/1251] eta 0:45:20 lr 0.000982 time 1.9744 (2.3847) loss 4.6212 (4.2601) grad_norm 1.0621 (1.2024) [2022-01-18 08:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][120/1251] eta 0:44:52 lr 0.000982 time 3.5148 (2.3803) loss 3.6042 (4.2743) grad_norm 1.4700 (1.2043) [2022-01-18 08:05:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][130/1251] eta 0:44:14 lr 0.000982 time 1.5415 (2.3682) loss 4.4810 (4.2705) grad_norm 1.2696 (1.2047) [2022-01-18 08:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][140/1251] eta 0:43:23 lr 0.000982 time 2.2083 (2.3432) loss 3.8490 (4.2698) grad_norm 1.2862 (1.1985) [2022-01-18 08:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][150/1251] eta 0:42:45 lr 0.000982 time 1.9189 (2.3301) loss 3.1682 (4.2770) grad_norm 0.9745 (1.1973) [2022-01-18 08:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][160/1251] eta 0:42:03 lr 0.000982 time 2.0329 (2.3129) loss 3.5584 (4.2838) grad_norm 1.1248 (1.1968) [2022-01-18 08:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][170/1251] eta 0:41:40 lr 0.000982 time 2.5393 (2.3131) loss 4.5141 (4.2762) grad_norm 1.1450 (1.1939) [2022-01-18 08:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][180/1251] eta 0:41:10 lr 0.000982 time 2.1037 (2.3067) loss 4.7122 (4.2593) grad_norm 1.7814 (1.1998) [2022-01-18 08:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][190/1251] eta 0:40:43 lr 0.000982 time 1.7961 (2.3030) loss 4.5382 (4.2737) grad_norm 1.1516 (1.2019) [2022-01-18 08:07:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][200/1251] eta 0:40:14 lr 0.000982 time 2.4925 (2.2971) loss 4.6933 (4.2812) grad_norm 1.0633 (1.1967) [2022-01-18 08:08:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][210/1251] eta 0:39:47 lr 0.000982 time 2.4979 (2.2931) loss 4.2965 (4.2927) grad_norm 1.0751 (1.1941) [2022-01-18 08:08:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][220/1251] eta 0:39:11 lr 0.000982 time 2.1983 (2.2803) loss 4.2025 (4.2861) grad_norm 1.1974 (1.1987) [2022-01-18 08:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][230/1251] eta 0:38:43 lr 0.000982 time 2.0379 (2.2757) loss 4.5284 (4.2808) grad_norm 1.5863 (1.2033) [2022-01-18 08:09:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][240/1251] eta 0:38:15 lr 0.000981 time 2.6381 (2.2707) loss 4.5869 (4.2819) grad_norm 1.0937 (1.2080) [2022-01-18 08:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][250/1251] eta 0:37:47 lr 0.000981 time 1.9033 (2.2649) loss 4.4634 (4.2902) grad_norm 0.9539 (1.2084) [2022-01-18 08:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][260/1251] eta 0:37:14 lr 0.000981 time 2.1059 (2.2548) loss 4.7405 (4.2928) grad_norm 1.0636 (1.2109) [2022-01-18 08:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][270/1251] eta 0:36:47 lr 0.000981 time 1.9304 (2.2505) loss 4.4715 (4.2954) grad_norm 1.0704 (1.2075) [2022-01-18 08:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][280/1251] eta 0:36:23 lr 0.000981 time 2.7695 (2.2483) loss 4.4222 (4.2867) grad_norm 1.2074 (1.2064) [2022-01-18 08:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][290/1251] eta 0:35:59 lr 0.000981 time 2.2199 (2.2472) loss 4.5428 (4.2953) grad_norm 1.1419 (1.2064) [2022-01-18 08:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][300/1251] eta 0:35:34 lr 0.000981 time 2.1467 (2.2448) loss 5.1282 (4.3031) grad_norm 1.1435 (1.2062) [2022-01-18 08:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][310/1251] eta 0:35:11 lr 0.000981 time 
1.6768 (2.2439) loss 4.8884 (4.3063) grad_norm 1.1026 (1.2057) [2022-01-18 08:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][320/1251] eta 0:34:49 lr 0.000981 time 2.4693 (2.2447) loss 4.7441 (4.3062) grad_norm 1.1420 (1.2038) [2022-01-18 08:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][330/1251] eta 0:34:27 lr 0.000981 time 2.2793 (2.2443) loss 5.0782 (4.3198) grad_norm 1.1254 (1.2051) [2022-01-18 08:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][340/1251] eta 0:34:03 lr 0.000981 time 2.2188 (2.2433) loss 3.6487 (4.3231) grad_norm 1.4043 (1.2061) [2022-01-18 08:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][350/1251] eta 0:33:41 lr 0.000981 time 2.1418 (2.2436) loss 3.3212 (4.3149) grad_norm 1.1902 (1.2033) [2022-01-18 08:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][360/1251] eta 0:33:17 lr 0.000981 time 2.8607 (2.2413) loss 4.2399 (4.3092) grad_norm 1.4591 (1.2040) [2022-01-18 08:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][370/1251] eta 0:32:53 lr 0.000981 time 2.1665 (2.2405) loss 4.2119 (4.3018) grad_norm 1.3265 (1.2037) [2022-01-18 08:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][380/1251] eta 0:32:26 lr 0.000981 time 2.2641 (2.2347) loss 3.4913 (4.2987) grad_norm 1.1949 (1.2061) [2022-01-18 08:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][390/1251] eta 0:32:01 lr 0.000981 time 1.8964 (2.2314) loss 3.5610 (4.3006) grad_norm 1.4113 (1.2095) [2022-01-18 08:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][400/1251] eta 0:31:40 lr 0.000981 time 2.4355 (2.2327) loss 4.4033 (4.2964) grad_norm 1.2146 (1.2096) [2022-01-18 08:15:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][410/1251] eta 0:31:17 lr 0.000981 time 1.7445 (2.2320) loss 5.0514 (4.2984) grad_norm 1.0457 (1.2091) [2022-01-18 08:15:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][420/1251] eta 0:30:55 lr 0.000981 time 2.3253 (2.2332) loss 4.7951 (4.3004) grad_norm 1.2579 (1.2067) [2022-01-18 08:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][430/1251] eta 0:30:30 lr 0.000981 time 2.1433 (2.2290) loss 4.6821 (4.2980) grad_norm 1.4630 (1.2091) [2022-01-18 08:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][440/1251] eta 0:30:08 lr 0.000981 time 2.7249 (2.2302) loss 4.7284 (4.2948) grad_norm 1.2697 (1.2087) [2022-01-18 08:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][450/1251] eta 0:29:44 lr 0.000981 time 1.7671 (2.2284) loss 4.1935 (4.2948) grad_norm 1.0366 (1.2095) [2022-01-18 08:17:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][460/1251] eta 0:29:20 lr 0.000981 time 1.6080 (2.2254) loss 4.5418 (4.2846) grad_norm 1.0386 (1.2082) [2022-01-18 08:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][470/1251] eta 0:28:55 lr 0.000981 time 1.8375 (2.2225) loss 4.4733 (4.2841) grad_norm 1.1262 (1.2063) [2022-01-18 08:17:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][480/1251] eta 0:28:31 lr 0.000981 time 2.3371 (2.2200) loss 4.5807 (4.2847) grad_norm 1.2901 (1.2033) [2022-01-18 08:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][490/1251] eta 0:28:08 lr 0.000981 time 1.9408 (2.2190) loss 4.5557 (4.2835) grad_norm 1.1164 (1.2023) [2022-01-18 08:18:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][500/1251] eta 0:27:44 lr 0.000981 time 1.9478 (2.2167) loss 4.5796 (4.2909) grad_norm 1.3129 (1.2044) [2022-01-18 08:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][510/1251] eta 0:27:24 lr 0.000981 time 3.3198 (2.2193) loss 4.6274 (4.2903) grad_norm 1.3642 (1.2036) [2022-01-18 08:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][520/1251] eta 0:27:01 lr 0.000981 time 
2.1722 (2.2175) loss 4.5087 (4.2942) grad_norm 1.1002 (1.2017) [2022-01-18 08:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][530/1251] eta 0:26:39 lr 0.000981 time 1.9054 (2.2187) loss 4.9992 (4.2977) grad_norm 1.3262 (1.2007) [2022-01-18 08:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][540/1251] eta 0:26:16 lr 0.000981 time 1.7731 (2.2168) loss 4.2752 (4.2988) grad_norm 1.0957 (1.2010) [2022-01-18 08:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][550/1251] eta 0:25:55 lr 0.000981 time 2.1906 (2.2195) loss 4.5387 (4.2961) grad_norm 1.0688 (1.2018) [2022-01-18 08:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][560/1251] eta 0:25:33 lr 0.000981 time 2.1299 (2.2198) loss 4.4799 (4.2982) grad_norm 1.1799 (1.2008) [2022-01-18 08:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][570/1251] eta 0:25:12 lr 0.000981 time 2.1912 (2.2206) loss 3.5646 (4.2967) grad_norm 1.1447 (1.2000) [2022-01-18 08:21:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][580/1251] eta 0:24:48 lr 0.000981 time 2.0333 (2.2179) loss 4.8164 (4.2964) grad_norm 1.1369 (1.1996) [2022-01-18 08:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][590/1251] eta 0:24:25 lr 0.000981 time 2.0192 (2.2170) loss 4.1084 (4.3010) grad_norm 1.2847 (1.1984) [2022-01-18 08:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][600/1251] eta 0:23:59 lr 0.000981 time 1.9316 (2.2108) loss 4.7263 (4.3030) grad_norm 1.0485 (1.1969) [2022-01-18 08:22:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][610/1251] eta 0:23:35 lr 0.000981 time 1.9603 (2.2077) loss 3.8644 (4.3032) grad_norm 1.1680 (1.1975) [2022-01-18 08:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][620/1251] eta 0:23:11 lr 0.000981 time 1.9194 (2.2058) loss 3.1862 (4.2953) grad_norm 1.1979 (1.1971) [2022-01-18 08:23:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][630/1251] eta 0:22:49 lr 0.000981 time 2.1021 (2.2059) loss 4.4459 (4.2941) grad_norm 1.2150 (1.1959) [2022-01-18 08:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][640/1251] eta 0:22:29 lr 0.000981 time 2.6538 (2.2093) loss 3.9118 (4.2950) grad_norm 1.3945 (1.1947) [2022-01-18 08:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][650/1251] eta 0:22:08 lr 0.000981 time 1.7906 (2.2111) loss 4.6072 (4.2928) grad_norm 1.0045 (1.1932) [2022-01-18 08:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][660/1251] eta 0:21:48 lr 0.000981 time 2.2028 (2.2134) loss 4.7598 (4.2928) grad_norm 1.3429 (1.1931) [2022-01-18 08:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][670/1251] eta 0:21:26 lr 0.000981 time 2.1471 (2.2141) loss 4.5060 (4.2937) grad_norm 0.9801 (1.1928) [2022-01-18 08:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][680/1251] eta 0:21:04 lr 0.000981 time 1.8437 (2.2144) loss 4.1272 (4.2953) grad_norm 1.1245 (1.1924) [2022-01-18 08:25:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][690/1251] eta 0:20:41 lr 0.000981 time 1.8868 (2.2130) loss 4.7376 (4.2923) grad_norm 1.1948 (1.1917) [2022-01-18 08:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][700/1251] eta 0:20:18 lr 0.000981 time 2.4029 (2.2109) loss 4.9872 (4.2902) grad_norm 1.3074 (1.1922) [2022-01-18 08:26:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][710/1251] eta 0:19:55 lr 0.000981 time 2.3177 (2.2090) loss 4.6451 (4.2908) grad_norm 1.1681 (1.1918) [2022-01-18 08:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][720/1251] eta 0:19:32 lr 0.000981 time 1.8029 (2.2085) loss 4.8492 (4.2948) grad_norm 1.4260 (1.1909) [2022-01-18 08:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][730/1251] eta 0:19:10 lr 0.000981 time 
1.8353 (2.2087) loss 4.2466 (4.2907) grad_norm 1.1759 (1.1910) [2022-01-18 08:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][740/1251] eta 0:18:49 lr 0.000981 time 2.7141 (2.2099) loss 4.8576 (4.2938) grad_norm 1.0445 (1.1929) [2022-01-18 08:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][750/1251] eta 0:18:27 lr 0.000981 time 1.9007 (2.2115) loss 3.4561 (4.2947) grad_norm 1.1383 (1.1936) [2022-01-18 08:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][760/1251] eta 0:18:05 lr 0.000981 time 1.5668 (2.2114) loss 4.0556 (4.2970) grad_norm 1.0232 (1.1928) [2022-01-18 08:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][770/1251] eta 0:17:42 lr 0.000981 time 1.8902 (2.2091) loss 3.4937 (4.2975) grad_norm 1.4062 (1.1928) [2022-01-18 08:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][780/1251] eta 0:17:19 lr 0.000981 time 2.5312 (2.2074) loss 4.9375 (4.2987) grad_norm 1.1870 (1.1948) [2022-01-18 08:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][790/1251] eta 0:16:56 lr 0.000981 time 2.1454 (2.2061) loss 4.8567 (4.2996) grad_norm 1.3138 (1.1956) [2022-01-18 08:29:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][800/1251] eta 0:16:34 lr 0.000981 time 2.1873 (2.2040) loss 4.8801 (4.3011) grad_norm 1.1335 (1.1957) [2022-01-18 08:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][810/1251] eta 0:16:11 lr 0.000981 time 1.8617 (2.2032) loss 4.6526 (4.3027) grad_norm 1.2661 (1.1961) [2022-01-18 08:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][820/1251] eta 0:15:49 lr 0.000981 time 2.7666 (2.2037) loss 4.8031 (4.3032) grad_norm 1.5269 (1.1970) [2022-01-18 08:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][830/1251] eta 0:15:27 lr 0.000981 time 2.1935 (2.2028) loss 3.9560 (4.3052) grad_norm 1.0956 (1.1962) [2022-01-18 08:31:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][840/1251] eta 0:15:05 lr 0.000981 time 2.1373 (2.2030) loss 3.2375 (4.2999) grad_norm 1.1679 (1.1966) [2022-01-18 08:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][850/1251] eta 0:14:43 lr 0.000981 time 1.8514 (2.2034) loss 4.4852 (4.3012) grad_norm 0.9960 (1.1967) [2022-01-18 08:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][860/1251] eta 0:14:20 lr 0.000981 time 2.5298 (2.2020) loss 4.3981 (4.3022) grad_norm 1.0513 (1.1966) [2022-01-18 08:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][870/1251] eta 0:13:58 lr 0.000981 time 2.5054 (2.2018) loss 4.0787 (4.3009) grad_norm 1.1902 (1.1961) [2022-01-18 08:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][880/1251] eta 0:13:36 lr 0.000981 time 1.8258 (2.2017) loss 4.5760 (4.2987) grad_norm 1.3192 (1.1981) [2022-01-18 08:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][890/1251] eta 0:13:15 lr 0.000981 time 2.2566 (2.2022) loss 4.4965 (4.2955) grad_norm 1.5770 (1.1990) [2022-01-18 08:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][900/1251] eta 0:12:53 lr 0.000981 time 2.6634 (2.2034) loss 4.7861 (4.2946) grad_norm 1.0540 (1.1997) [2022-01-18 08:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][910/1251] eta 0:12:31 lr 0.000981 time 2.4121 (2.2047) loss 3.8644 (4.2902) grad_norm 1.2988 (1.1992) [2022-01-18 08:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][920/1251] eta 0:12:10 lr 0.000981 time 1.9589 (2.2066) loss 4.4539 (4.2898) grad_norm 1.2173 (1.1990) [2022-01-18 08:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][930/1251] eta 0:11:48 lr 0.000981 time 1.8400 (2.2062) loss 3.6748 (4.2893) grad_norm 1.4039 (1.1991) [2022-01-18 08:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][940/1251] eta 0:11:25 lr 0.000981 time 
1.8796 (2.2045) loss 4.3113 (4.2900) grad_norm 1.3810 (1.1981) [2022-01-18 08:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][950/1251] eta 0:11:02 lr 0.000981 time 1.6181 (2.2006) loss 4.7708 (4.2939) grad_norm 1.2422 (1.1983) [2022-01-18 08:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][960/1251] eta 0:10:39 lr 0.000981 time 1.8457 (2.1988) loss 3.6778 (4.2956) grad_norm 1.1953 (1.1976) [2022-01-18 08:35:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][970/1251] eta 0:10:17 lr 0.000981 time 2.2535 (2.1975) loss 4.4597 (4.2945) grad_norm 1.4447 (1.1975) [2022-01-18 08:36:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][980/1251] eta 0:09:55 lr 0.000981 time 1.8354 (2.1974) loss 3.8233 (4.2970) grad_norm 1.1042 (1.1966) [2022-01-18 08:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][990/1251] eta 0:09:33 lr 0.000981 time 2.0526 (2.1965) loss 4.5120 (4.2994) grad_norm 1.2176 (1.1964) [2022-01-18 08:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1000/1251] eta 0:09:11 lr 0.000981 time 1.9300 (2.1970) loss 4.7303 (4.2996) grad_norm 1.2543 (1.1963) [2022-01-18 08:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1010/1251] eta 0:08:49 lr 0.000981 time 3.0864 (2.1979) loss 4.4869 (4.3006) grad_norm 0.9711 (1.1961) [2022-01-18 08:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1020/1251] eta 0:08:27 lr 0.000981 time 1.8956 (2.1966) loss 3.9539 (4.3014) grad_norm 1.4404 (1.1963) [2022-01-18 08:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1030/1251] eta 0:08:05 lr 0.000981 time 2.4960 (2.1984) loss 4.4852 (4.3004) grad_norm 1.2149 (1.1959) [2022-01-18 08:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1040/1251] eta 0:07:43 lr 0.000981 time 1.7906 (2.1989) loss 3.3911 (4.3004) grad_norm 1.8635 (1.1964) [2022-01-18 08:38:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1050/1251] eta 0:07:22 lr 0.000981 time 2.8334 (2.1995) loss 3.5322 (4.3018) grad_norm 1.3027 (1.1969) [2022-01-18 08:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1060/1251] eta 0:07:00 lr 0.000981 time 2.2908 (2.1993) loss 4.8459 (4.3022) grad_norm 1.2164 (1.1970) [2022-01-18 08:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1070/1251] eta 0:06:38 lr 0.000981 time 2.1192 (2.2004) loss 4.8585 (4.3036) grad_norm 1.1123 (1.1964) [2022-01-18 08:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1080/1251] eta 0:06:16 lr 0.000981 time 2.2556 (2.1996) loss 5.0925 (4.3040) grad_norm 1.0887 (1.1961) [2022-01-18 08:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1090/1251] eta 0:05:53 lr 0.000981 time 2.4423 (2.1987) loss 4.6760 (4.3063) grad_norm 1.1683 (1.1958) [2022-01-18 08:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1100/1251] eta 0:05:31 lr 0.000981 time 1.9776 (2.1977) loss 4.5161 (4.3063) grad_norm 1.5463 (1.1959) [2022-01-18 08:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1110/1251] eta 0:05:09 lr 0.000981 time 2.2319 (2.1971) loss 3.8371 (4.3047) grad_norm 1.1556 (1.1964) [2022-01-18 08:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1120/1251] eta 0:04:47 lr 0.000980 time 1.9254 (2.1953) loss 4.7653 (4.3023) grad_norm 1.4850 (1.1965) [2022-01-18 08:41:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1130/1251] eta 0:04:25 lr 0.000980 time 1.8561 (2.1947) loss 4.7450 (4.3013) grad_norm 1.0477 (1.1965) [2022-01-18 08:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1140/1251] eta 0:04:03 lr 0.000980 time 1.9971 (2.1945) loss 3.5500 (4.2988) grad_norm 1.3345 (1.1961) [2022-01-18 08:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1150/1251] eta 0:03:41 lr 
0.000980 time 1.5888 (2.1951) loss 3.2382 (4.2982) grad_norm 1.4572 (1.1961) [2022-01-18 08:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1160/1251] eta 0:03:19 lr 0.000980 time 1.8982 (2.1952) loss 4.2434 (4.2977) grad_norm 1.1576 (1.1963) [2022-01-18 08:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1170/1251] eta 0:02:57 lr 0.000980 time 1.9042 (2.1950) loss 4.5845 (4.2969) grad_norm 1.2363 (1.1961) [2022-01-18 08:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1180/1251] eta 0:02:35 lr 0.000980 time 1.8980 (2.1960) loss 4.6436 (4.2954) grad_norm 1.1013 (1.1958) [2022-01-18 08:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1190/1251] eta 0:02:14 lr 0.000980 time 1.9520 (2.1970) loss 4.4341 (4.2949) grad_norm 0.8864 (1.1958) [2022-01-18 08:44:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1200/1251] eta 0:01:52 lr 0.000980 time 2.7362 (2.1966) loss 4.3601 (4.2910) grad_norm 1.3692 (1.1962) [2022-01-18 08:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1210/1251] eta 0:01:29 lr 0.000980 time 1.8693 (2.1947) loss 4.1900 (4.2904) grad_norm 1.1435 (1.1959) [2022-01-18 08:44:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1220/1251] eta 0:01:08 lr 0.000980 time 2.3953 (2.1949) loss 4.1553 (4.2926) grad_norm 1.3729 (1.1958) [2022-01-18 08:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1230/1251] eta 0:00:46 lr 0.000980 time 1.8222 (2.1939) loss 3.8689 (4.2906) grad_norm 0.8681 (1.1956) [2022-01-18 08:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1240/1251] eta 0:00:24 lr 0.000980 time 2.3990 (2.1936) loss 3.5126 (4.2895) grad_norm 1.2166 (1.1955) [2022-01-18 08:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [26/300][1250/1251] eta 0:00:02 lr 0.000980 time 1.1496 (2.1878) loss 4.6415 (4.2899) grad_norm 1.1331 (1.1947) [2022-01-18 08:45:44 
swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 26 training takes 0:45:37 [2022-01-18 08:46:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.634 (17.634) Loss 1.5907 (1.5907) Acc@1 63.770 (63.770) Acc@5 86.426 (86.426) [2022-01-18 08:46:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.579 (3.220) Loss 1.5321 (1.5704) Acc@1 64.258 (63.947) Acc@5 87.012 (86.009) [2022-01-18 08:46:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.631 (2.561) Loss 1.5260 (1.5702) Acc@1 65.430 (63.881) Acc@5 86.914 (85.989) [2022-01-18 08:46:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.316 (2.360) Loss 1.5820 (1.5602) Acc@1 65.137 (64.056) Acc@5 85.156 (86.230) [2022-01-18 08:47:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.505 (2.192) Loss 1.5684 (1.5577) Acc@1 65.723 (64.074) Acc@5 84.863 (86.290) [2022-01-18 08:47:21 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 64.182 Acc@5 86.372 [2022-01-18 08:47:21 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 64.2% [2022-01-18 08:47:21 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 64.18% [2022-01-18 08:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][0/1251] eta 7:55:20 lr 0.000980 time 22.7982 (22.7982) loss 4.7274 (4.7274) grad_norm 1.1391 (1.1391) [2022-01-18 08:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][10/1251] eta 1:25:19 lr 0.000980 time 1.6967 (4.1257) loss 3.0297 (3.8580) grad_norm 0.9486 (1.1762) [2022-01-18 08:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][20/1251] eta 1:06:43 lr 0.000980 time 2.1631 (3.2526) loss 4.6893 (4.0580) grad_norm 1.0082 (1.1824) [2022-01-18 08:48:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][30/1251] eta 0:58:58 lr 0.000980 time 1.8809 (2.8980) loss 4.4438 (4.2453) grad_norm 1.0201 (1.1702) 
[2022-01-18 08:49:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][40/1251] eta 0:55:23 lr 0.000980 time 2.9497 (2.7442) loss 4.6005 (4.2692) grad_norm 1.0320 (1.1558) [2022-01-18 08:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][50/1251] eta 0:53:17 lr 0.000980 time 2.4072 (2.6621) loss 4.0470 (4.2507) grad_norm 1.3074 (1.1609) [2022-01-18 08:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][60/1251] eta 0:50:56 lr 0.000980 time 1.7123 (2.5665) loss 3.4263 (4.2592) grad_norm 1.0515 (1.1478) [2022-01-18 08:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][70/1251] eta 0:49:22 lr 0.000980 time 2.1922 (2.5088) loss 4.1506 (4.2715) grad_norm 1.1455 (1.1626) [2022-01-18 08:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][80/1251] eta 0:48:36 lr 0.000980 time 3.6005 (2.4910) loss 4.7628 (4.2954) grad_norm 1.1908 (1.1721) [2022-01-18 08:51:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][90/1251] eta 0:47:47 lr 0.000980 time 3.2116 (2.4702) loss 4.5016 (4.3272) grad_norm 1.1012 (1.1718) [2022-01-18 08:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][100/1251] eta 0:46:42 lr 0.000980 time 1.7169 (2.4344) loss 4.4281 (4.3441) grad_norm 1.2470 (1.1675) [2022-01-18 08:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][110/1251] eta 0:45:39 lr 0.000980 time 1.4866 (2.4007) loss 5.1733 (4.3357) grad_norm 1.3363 (1.1693) [2022-01-18 08:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][120/1251] eta 0:45:01 lr 0.000980 time 3.2110 (2.3887) loss 4.9490 (4.3498) grad_norm 1.2487 (1.1775) [2022-01-18 08:52:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][130/1251] eta 0:44:26 lr 0.000980 time 2.1882 (2.3789) loss 3.5169 (4.3138) grad_norm 1.1711 (1.1718) [2022-01-18 08:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][140/1251] eta 0:43:33 lr 
0.000980 time 1.8890 (2.3522) loss 4.4056 (4.2909) grad_norm 1.0332 (1.1649) [2022-01-18 08:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][150/1251] eta 0:42:57 lr 0.000980 time 2.1587 (2.3409) loss 4.5516 (4.2876) grad_norm 1.3088 (1.1639) [2022-01-18 08:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][160/1251] eta 0:42:32 lr 0.000980 time 3.1647 (2.3399) loss 3.5028 (4.2871) grad_norm 1.1204 (1.1614) [2022-01-18 08:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][170/1251] eta 0:41:56 lr 0.000980 time 1.9158 (2.3284) loss 4.8722 (4.2793) grad_norm 1.1749 (1.1662) [2022-01-18 08:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][180/1251] eta 0:41:19 lr 0.000980 time 2.2154 (2.3156) loss 4.9204 (4.2493) grad_norm 0.9728 (1.1658) [2022-01-18 08:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][190/1251] eta 0:40:50 lr 0.000980 time 2.8330 (2.3092) loss 4.4132 (4.2528) grad_norm 1.1341 (1.1649) [2022-01-18 08:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][200/1251] eta 0:40:24 lr 0.000980 time 2.6411 (2.3067) loss 4.6431 (4.2599) grad_norm 1.1907 (1.1633) [2022-01-18 08:55:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][210/1251] eta 0:39:59 lr 0.000980 time 2.1789 (2.3054) loss 4.8087 (4.2516) grad_norm 1.0156 (1.1690) [2022-01-18 08:55:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][220/1251] eta 0:39:27 lr 0.000980 time 2.0606 (2.2963) loss 3.7167 (4.2403) grad_norm 1.1367 (1.1692) [2022-01-18 08:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][230/1251] eta 0:38:56 lr 0.000980 time 2.5918 (2.2882) loss 4.0029 (4.2438) grad_norm 1.7379 (1.1752) [2022-01-18 08:56:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][240/1251] eta 0:38:26 lr 0.000980 time 2.3775 (2.2812) loss 3.3643 (4.2430) grad_norm 1.1421 (1.1763) [2022-01-18 08:56:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][250/1251] eta 0:37:51 lr 0.000980 time 2.0711 (2.2688) loss 3.7373 (4.2429) grad_norm 1.0443 (1.1781) [2022-01-18 08:57:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][260/1251] eta 0:37:25 lr 0.000980 time 2.0733 (2.2658) loss 4.5202 (4.2459) grad_norm 1.2515 (1.1792) [2022-01-18 08:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][270/1251] eta 0:36:58 lr 0.000980 time 2.1443 (2.2616) loss 4.0815 (4.2436) grad_norm 1.1862 (1.1820) [2022-01-18 08:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][280/1251] eta 0:36:37 lr 0.000980 time 3.3844 (2.2630) loss 4.1318 (4.2376) grad_norm 1.0915 (1.1841) [2022-01-18 08:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][290/1251] eta 0:36:11 lr 0.000980 time 2.2506 (2.2596) loss 4.4370 (4.2479) grad_norm 1.0547 (1.1794) [2022-01-18 08:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][300/1251] eta 0:35:41 lr 0.000980 time 2.1522 (2.2516) loss 4.0806 (4.2508) grad_norm 1.1248 (1.1778) [2022-01-18 08:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][310/1251] eta 0:35:15 lr 0.000980 time 2.3268 (2.2482) loss 4.3351 (4.2412) grad_norm 1.0355 (1.1763) [2022-01-18 08:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][320/1251] eta 0:34:54 lr 0.000980 time 2.4794 (2.2500) loss 4.8631 (4.2475) grad_norm 1.1144 (1.1749) [2022-01-18 08:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][330/1251] eta 0:34:34 lr 0.000980 time 2.5948 (2.2524) loss 3.6521 (4.2431) grad_norm 1.2725 (1.1730) [2022-01-18 09:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][340/1251] eta 0:34:09 lr 0.000980 time 1.8025 (2.2497) loss 4.7691 (4.2422) grad_norm 0.9676 (1.1723) [2022-01-18 09:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][350/1251] eta 0:33:45 lr 0.000980 time 
2.5414 (2.2483) loss 3.3911 (4.2429) grad_norm 1.4347 (1.1718) [2022-01-18 09:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][360/1251] eta 0:33:21 lr 0.000980 time 1.7821 (2.2466) loss 4.7184 (4.2483) grad_norm 1.2460 (1.1764) [2022-01-18 09:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][370/1251] eta 0:32:57 lr 0.000980 time 2.0172 (2.2445) loss 3.0067 (4.2426) grad_norm 1.0393 (1.1778) [2022-01-18 09:01:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][380/1251] eta 0:32:34 lr 0.000980 time 2.2010 (2.2444) loss 4.4659 (4.2451) grad_norm 1.2675 (1.1805) [2022-01-18 09:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][390/1251] eta 0:32:13 lr 0.000980 time 2.8224 (2.2457) loss 3.0539 (4.2479) grad_norm 1.2962 (1.1813) [2022-01-18 09:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][400/1251] eta 0:31:51 lr 0.000980 time 2.2510 (2.2465) loss 3.8242 (4.2369) grad_norm 1.0143 (1.1823) [2022-01-18 09:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][410/1251] eta 0:31:25 lr 0.000980 time 1.7740 (2.2418) loss 4.2875 (4.2443) grad_norm 0.9612 (1.1828) [2022-01-18 09:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][420/1251] eta 0:30:59 lr 0.000980 time 1.6474 (2.2376) loss 3.8551 (4.2336) grad_norm 1.2995 (1.1839) [2022-01-18 09:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][430/1251] eta 0:30:35 lr 0.000980 time 2.7879 (2.2355) loss 4.3143 (4.2285) grad_norm 0.9675 (1.1840) [2022-01-18 09:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][440/1251] eta 0:30:12 lr 0.000980 time 1.8806 (2.2353) loss 4.8603 (4.2259) grad_norm 1.1537 (1.1839) [2022-01-18 09:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][450/1251] eta 0:29:49 lr 0.000980 time 2.2676 (2.2335) loss 4.0763 (4.2148) grad_norm 1.1930 (1.1837) [2022-01-18 09:04:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][460/1251] eta 0:29:25 lr 0.000980 time 2.2007 (2.2319) loss 4.8343 (4.2192) grad_norm 1.2083 (1.1840) [2022-01-18 09:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][470/1251] eta 0:29:02 lr 0.000980 time 2.2826 (2.2309) loss 3.4337 (4.2180) grad_norm 1.2676 (1.1836) [2022-01-18 09:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][480/1251] eta 0:28:38 lr 0.000980 time 2.1699 (2.2288) loss 3.3101 (4.2165) grad_norm 1.4466 (1.1829) [2022-01-18 09:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][490/1251] eta 0:28:12 lr 0.000980 time 2.5111 (2.2243) loss 5.3108 (4.2143) grad_norm 1.1157 (1.1834) [2022-01-18 09:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][500/1251] eta 0:27:48 lr 0.000980 time 1.9498 (2.2216) loss 4.2004 (4.2196) grad_norm 1.1417 (1.1831) [2022-01-18 09:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][510/1251] eta 0:27:26 lr 0.000980 time 2.6489 (2.2214) loss 5.1816 (4.2250) grad_norm 1.0988 (1.1824) [2022-01-18 09:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][520/1251] eta 0:27:04 lr 0.000980 time 2.5668 (2.2223) loss 4.9212 (4.2250) grad_norm 1.0739 (1.1829) [2022-01-18 09:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][530/1251] eta 0:26:43 lr 0.000980 time 2.5269 (2.2239) loss 4.8125 (4.2314) grad_norm 1.1942 (1.1823) [2022-01-18 09:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][540/1251] eta 0:26:20 lr 0.000980 time 1.9173 (2.2229) loss 4.9836 (4.2310) grad_norm 1.1452 (1.1846) [2022-01-18 09:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][550/1251] eta 0:25:57 lr 0.000980 time 2.1964 (2.2223) loss 4.2761 (4.2351) grad_norm 1.0417 (1.1836) [2022-01-18 09:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][560/1251] eta 0:25:36 lr 0.000980 time 
2.9641 (2.2238) loss 3.5291 (4.2343) grad_norm 1.4050 (1.1855) [2022-01-18 09:08:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][570/1251] eta 0:25:13 lr 0.000980 time 2.5052 (2.2230) loss 3.9512 (4.2267) grad_norm 1.3505 (1.1886) [2022-01-18 09:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][580/1251] eta 0:24:48 lr 0.000980 time 1.6903 (2.2183) loss 4.4397 (4.2296) grad_norm 1.0641 (1.1882) [2022-01-18 09:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][590/1251] eta 0:24:24 lr 0.000980 time 2.4393 (2.2152) loss 4.6090 (4.2333) grad_norm 1.1086 (1.1884) [2022-01-18 09:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][600/1251] eta 0:24:01 lr 0.000980 time 2.5865 (2.2143) loss 4.5313 (4.2396) grad_norm 1.1116 (1.1895) [2022-01-18 09:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][610/1251] eta 0:23:39 lr 0.000980 time 2.1340 (2.2137) loss 5.0956 (4.2410) grad_norm 1.0827 (1.1894) [2022-01-18 09:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][620/1251] eta 0:23:18 lr 0.000980 time 2.0128 (2.2156) loss 4.9765 (4.2404) grad_norm 1.2878 (1.1897) [2022-01-18 09:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][630/1251] eta 0:22:56 lr 0.000980 time 1.5688 (2.2163) loss 4.7969 (4.2425) grad_norm 1.1526 (1.1883) [2022-01-18 09:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][640/1251] eta 0:22:35 lr 0.000980 time 2.6360 (2.2190) loss 3.4754 (4.2416) grad_norm 1.2014 (1.1880) [2022-01-18 09:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][650/1251] eta 0:22:13 lr 0.000980 time 2.0114 (2.2190) loss 3.8181 (4.2377) grad_norm 1.2633 (1.1880) [2022-01-18 09:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][660/1251] eta 0:21:51 lr 0.000980 time 3.5598 (2.2199) loss 4.2597 (4.2368) grad_norm 1.0352 (1.1873) [2022-01-18 09:12:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][670/1251] eta 0:21:29 lr 0.000980 time 1.9733 (2.2191) loss 4.7170 (4.2398) grad_norm 1.1321 (1.1851) [2022-01-18 09:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][680/1251] eta 0:21:04 lr 0.000980 time 1.8797 (2.2153) loss 4.1948 (4.2399) grad_norm 1.0838 (1.1843) [2022-01-18 09:12:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][690/1251] eta 0:20:40 lr 0.000980 time 1.6967 (2.2118) loss 4.2319 (4.2364) grad_norm 1.1253 (1.1843) [2022-01-18 09:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][700/1251] eta 0:20:17 lr 0.000980 time 1.9429 (2.2102) loss 4.0132 (4.2360) grad_norm 1.0558 (1.1841) [2022-01-18 09:13:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][710/1251] eta 0:19:54 lr 0.000980 time 2.0715 (2.2079) loss 4.4659 (4.2412) grad_norm 0.9116 (1.1843) [2022-01-18 09:13:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][720/1251] eta 0:19:31 lr 0.000980 time 2.5290 (2.2058) loss 4.0278 (4.2419) grad_norm 0.9947 (1.1828) [2022-01-18 09:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][730/1251] eta 0:19:08 lr 0.000979 time 2.2601 (2.2047) loss 3.8522 (4.2418) grad_norm 1.1339 (1.1821) [2022-01-18 09:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][740/1251] eta 0:18:47 lr 0.000979 time 2.3782 (2.2057) loss 4.2676 (4.2412) grad_norm 1.5532 (1.1818) [2022-01-18 09:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][750/1251] eta 0:18:25 lr 0.000979 time 1.6617 (2.2072) loss 3.8842 (4.2403) grad_norm 0.9227 (1.1816) [2022-01-18 09:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][760/1251] eta 0:18:04 lr 0.000979 time 2.7726 (2.2082) loss 3.8946 (4.2400) grad_norm 1.1050 (1.1811) [2022-01-18 09:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][770/1251] eta 0:17:42 lr 0.000979 time 
2.2678 (2.2097) loss 5.0822 (4.2388) grad_norm 1.6124 (1.1810) [2022-01-18 09:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][780/1251] eta 0:17:20 lr 0.000979 time 2.2096 (2.2092) loss 4.7312 (4.2399) grad_norm 1.2042 (1.1814) [2022-01-18 09:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][790/1251] eta 0:16:58 lr 0.000979 time 2.5799 (2.2100) loss 3.9616 (4.2398) grad_norm 1.3048 (1.1820) [2022-01-18 09:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][800/1251] eta 0:16:36 lr 0.000979 time 2.7771 (2.2092) loss 4.2784 (4.2412) grad_norm 1.1611 (1.1829) [2022-01-18 09:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][810/1251] eta 0:16:13 lr 0.000979 time 1.8551 (2.2082) loss 4.0055 (4.2420) grad_norm 1.1712 (1.1831) [2022-01-18 09:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][820/1251] eta 0:15:51 lr 0.000979 time 2.1616 (2.2083) loss 4.1254 (4.2394) grad_norm 0.9692 (1.1823) [2022-01-18 09:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][830/1251] eta 0:15:29 lr 0.000979 time 2.1512 (2.2090) loss 3.2094 (4.2386) grad_norm 1.1986 (1.1814) [2022-01-18 09:18:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][840/1251] eta 0:15:07 lr 0.000979 time 2.4594 (2.2087) loss 5.0707 (4.2364) grad_norm 1.2172 (1.1802) [2022-01-18 09:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][850/1251] eta 0:14:45 lr 0.000979 time 1.6616 (2.2081) loss 5.1059 (4.2346) grad_norm 1.1704 (1.1795) [2022-01-18 09:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][860/1251] eta 0:14:22 lr 0.000979 time 2.2595 (2.2058) loss 3.7537 (4.2329) grad_norm 1.2148 (1.1801) [2022-01-18 09:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][870/1251] eta 0:14:01 lr 0.000979 time 1.8578 (2.2083) loss 4.7367 (4.2306) grad_norm 1.0172 (1.1802) [2022-01-18 09:19:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][880/1251] eta 0:13:38 lr 0.000979 time 1.9653 (2.2070) loss 3.3161 (4.2285) grad_norm 1.1841 (1.1803) [2022-01-18 09:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][890/1251] eta 0:13:16 lr 0.000979 time 1.6087 (2.2058) loss 4.7852 (4.2306) grad_norm 1.0694 (1.1806) [2022-01-18 09:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][900/1251] eta 0:12:53 lr 0.000979 time 1.8988 (2.2033) loss 3.3242 (4.2284) grad_norm 1.0544 (1.1823) [2022-01-18 09:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][910/1251] eta 0:12:31 lr 0.000979 time 2.1256 (2.2028) loss 4.1497 (4.2299) grad_norm 1.1779 (1.1821) [2022-01-18 09:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][920/1251] eta 0:12:09 lr 0.000979 time 2.2040 (2.2029) loss 4.1074 (4.2340) grad_norm 1.2217 (1.1816) [2022-01-18 09:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][930/1251] eta 0:11:47 lr 0.000979 time 1.8612 (2.2042) loss 4.4881 (4.2390) grad_norm 0.9862 (1.1797) [2022-01-18 09:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][940/1251] eta 0:11:25 lr 0.000979 time 2.1769 (2.2046) loss 4.2849 (4.2412) grad_norm 1.3091 (1.1796) [2022-01-18 09:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][950/1251] eta 0:11:03 lr 0.000979 time 2.3159 (2.2052) loss 4.0202 (4.2399) grad_norm 1.0591 (1.1796) [2022-01-18 09:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][960/1251] eta 0:10:41 lr 0.000979 time 2.7339 (2.2052) loss 4.1439 (4.2387) grad_norm 1.1068 (1.1788) [2022-01-18 09:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][970/1251] eta 0:10:19 lr 0.000979 time 1.8551 (2.2044) loss 4.1336 (4.2380) grad_norm 1.1882 (1.1780) [2022-01-18 09:23:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][980/1251] eta 0:09:56 lr 0.000979 time 
1.9463 (2.2018) loss 2.9708 (4.2389) grad_norm 0.9536 (1.1777) [2022-01-18 09:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][990/1251] eta 0:09:34 lr 0.000979 time 2.0355 (2.2011) loss 4.8828 (4.2413) grad_norm 1.4336 (1.1774) [2022-01-18 09:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1000/1251] eta 0:09:12 lr 0.000979 time 2.3329 (2.2007) loss 4.8896 (4.2433) grad_norm 1.2333 (1.1773) [2022-01-18 09:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1010/1251] eta 0:08:51 lr 0.000979 time 2.5773 (2.2039) loss 3.7716 (4.2417) grad_norm 0.9421 (1.1778) [2022-01-18 09:24:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1020/1251] eta 0:08:29 lr 0.000979 time 1.5251 (2.2042) loss 3.6548 (4.2415) grad_norm 1.3255 (1.1769) [2022-01-18 09:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1030/1251] eta 0:08:07 lr 0.000979 time 1.9315 (2.2052) loss 4.5335 (4.2401) grad_norm 1.2932 (1.1772) [2022-01-18 09:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1040/1251] eta 0:07:45 lr 0.000979 time 2.1133 (2.2052) loss 4.1666 (4.2408) grad_norm 1.1201 (1.1771) [2022-01-18 09:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1050/1251] eta 0:07:23 lr 0.000979 time 1.5759 (2.2042) loss 4.2235 (4.2399) grad_norm 0.9734 (1.1767) [2022-01-18 09:26:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1060/1251] eta 0:07:00 lr 0.000979 time 2.1766 (2.2023) loss 4.7646 (4.2433) grad_norm 1.2694 (1.1773) [2022-01-18 09:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1070/1251] eta 0:06:38 lr 0.000979 time 1.8849 (2.2012) loss 3.0818 (4.2414) grad_norm 1.3405 (1.1780) [2022-01-18 09:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1080/1251] eta 0:06:16 lr 0.000979 time 2.0713 (2.2006) loss 3.8888 (4.2435) grad_norm 1.3035 (1.1788) [2022-01-18 09:27:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1090/1251] eta 0:05:54 lr 0.000979 time 1.7434 (2.2005) loss 4.4474 (4.2442) grad_norm 1.2727 (1.1791) [2022-01-18 09:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1100/1251] eta 0:05:32 lr 0.000979 time 1.8633 (2.1999) loss 3.6983 (4.2458) grad_norm 1.0688 (1.1790) [2022-01-18 09:28:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1110/1251] eta 0:05:10 lr 0.000979 time 2.2398 (2.2004) loss 4.5389 (4.2434) grad_norm 1.2259 (1.1789) [2022-01-18 09:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1120/1251] eta 0:04:48 lr 0.000979 time 2.2801 (2.2016) loss 4.4610 (4.2418) grad_norm 1.0887 (1.1787) [2022-01-18 09:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1130/1251] eta 0:04:26 lr 0.000979 time 2.1582 (2.2044) loss 4.0540 (4.2427) grad_norm 1.3267 (1.1790) [2022-01-18 09:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1140/1251] eta 0:04:04 lr 0.000979 time 1.6315 (2.2036) loss 3.4716 (4.2432) grad_norm 1.1895 (1.1782) [2022-01-18 09:29:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1150/1251] eta 0:03:42 lr 0.000979 time 1.8691 (2.2033) loss 3.9483 (4.2439) grad_norm 0.9579 (1.1778) [2022-01-18 09:29:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1160/1251] eta 0:03:20 lr 0.000979 time 1.8397 (2.2021) loss 4.1279 (4.2438) grad_norm 1.2009 (1.1775) [2022-01-18 09:30:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1170/1251] eta 0:02:58 lr 0.000979 time 1.8421 (2.2020) loss 5.1490 (4.2455) grad_norm 1.0502 (1.1768) [2022-01-18 09:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1180/1251] eta 0:02:36 lr 0.000979 time 2.6381 (2.2018) loss 3.4539 (4.2471) grad_norm 1.0549 (1.1761) [2022-01-18 09:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1190/1251] eta 0:02:14 lr 
0.000979 time 1.8818 (2.2010) loss 4.4144 (4.2484) grad_norm 1.3213 (1.1770) [2022-01-18 09:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1200/1251] eta 0:01:52 lr 0.000979 time 1.9239 (2.1992) loss 3.5648 (4.2479) grad_norm 1.4216 (1.1773) [2022-01-18 09:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1210/1251] eta 0:01:30 lr 0.000979 time 1.8579 (2.1975) loss 3.5436 (4.2446) grad_norm 1.2433 (1.1779) [2022-01-18 09:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1220/1251] eta 0:01:08 lr 0.000979 time 1.8747 (2.1973) loss 4.2708 (4.2447) grad_norm 1.2669 (1.1778) [2022-01-18 09:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1230/1251] eta 0:00:46 lr 0.000979 time 2.2200 (2.1962) loss 4.5321 (4.2451) grad_norm 1.2252 (1.1775) [2022-01-18 09:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1240/1251] eta 0:00:24 lr 0.000979 time 2.4908 (2.1975) loss 5.2396 (4.2434) grad_norm 1.3920 (1.1779) [2022-01-18 09:33:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [27/300][1250/1251] eta 0:00:02 lr 0.000979 time 1.2101 (2.1917) loss 3.1719 (4.2418) grad_norm 1.3639 (1.1776) [2022-01-18 09:33:04 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 27 training takes 0:45:42 [2022-01-18 09:33:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.857 (17.857) Loss 1.5607 (1.5607) Acc@1 63.672 (63.672) Acc@5 87.402 (87.402) [2022-01-18 09:33:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.307 (3.525) Loss 1.5588 (1.5707) Acc@1 64.551 (64.187) Acc@5 87.598 (86.923) [2022-01-18 09:33:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.601 (2.572) Loss 1.5738 (1.5524) Acc@1 63.379 (64.541) Acc@5 86.035 (86.886) [2022-01-18 09:34:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.992 (2.334) Loss 1.4961 (1.5448) Acc@1 66.016 (64.604) Acc@5 88.379 (87.018) 
[2022-01-18 09:34:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.732 (2.188) Loss 1.5000 (1.5489) Acc@1 65.527 (64.615) Acc@5 87.695 (86.819) [2022-01-18 09:34:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 64.610 Acc@5 86.710 [2022-01-18 09:34:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 64.6% [2022-01-18 09:34:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 64.61% [2022-01-18 09:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][0/1251] eta 7:22:05 lr 0.000979 time 21.2031 (21.2031) loss 4.4217 (4.4217) grad_norm 1.1129 (1.1129) [2022-01-18 09:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][10/1251] eta 1:24:05 lr 0.000979 time 2.5926 (4.0654) loss 4.6177 (4.3869) grad_norm 0.9870 (1.2083) [2022-01-18 09:35:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][20/1251] eta 1:03:13 lr 0.000979 time 1.5040 (3.0819) loss 3.0984 (4.1792) grad_norm 1.2394 (1.2407) [2022-01-18 09:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][30/1251] eta 0:56:04 lr 0.000979 time 1.8197 (2.7557) loss 3.9136 (4.1980) grad_norm 1.3477 (1.2211) [2022-01-18 09:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][40/1251] eta 0:53:07 lr 0.000979 time 3.6153 (2.6325) loss 3.5941 (4.2246) grad_norm 1.1753 (1.2020) [2022-01-18 09:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][50/1251] eta 0:51:22 lr 0.000979 time 2.2718 (2.5667) loss 3.0348 (4.1663) grad_norm 1.4151 (1.1912) [2022-01-18 09:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][60/1251] eta 0:50:24 lr 0.000979 time 1.9524 (2.5394) loss 5.0420 (4.2031) grad_norm 1.2448 (1.2030) [2022-01-18 09:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][70/1251] eta 0:49:23 lr 0.000979 time 2.2052 (2.5090) loss 4.8280 (4.2068) grad_norm 1.1572 (1.2096) 
[2022-01-18 09:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][80/1251] eta 0:48:22 lr 0.000979 time 2.8423 (2.4786) loss 3.2893 (4.1806) grad_norm 1.2143 (1.2009) [2022-01-18 09:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][90/1251] eta 0:47:29 lr 0.000979 time 2.6581 (2.4543) loss 3.9671 (4.2192) grad_norm 1.3724 (1.2108) [2022-01-18 09:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][100/1251] eta 0:46:26 lr 0.000979 time 1.8574 (2.4208) loss 4.9421 (4.2432) grad_norm 1.1386 (1.2098) [2022-01-18 09:39:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][110/1251] eta 0:45:30 lr 0.000979 time 1.8812 (2.3935) loss 3.6096 (4.2734) grad_norm 1.2642 (1.2090) [2022-01-18 09:39:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][120/1251] eta 0:44:43 lr 0.000979 time 2.3835 (2.3729) loss 4.3000 (4.2645) grad_norm 1.6162 (1.2051) [2022-01-18 09:39:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][130/1251] eta 0:44:22 lr 0.000979 time 2.7994 (2.3753) loss 3.5117 (4.2667) grad_norm 1.3056 (1.1974) [2022-01-18 09:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][140/1251] eta 0:43:43 lr 0.000979 time 1.7881 (2.3616) loss 4.5185 (4.2763) grad_norm 1.0609 (1.1961) [2022-01-18 09:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][150/1251] eta 0:43:09 lr 0.000979 time 1.8888 (2.3520) loss 4.4095 (4.2805) grad_norm 1.1979 (1.1928) [2022-01-18 09:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][160/1251] eta 0:42:17 lr 0.000979 time 2.3075 (2.3254) loss 4.4273 (4.3006) grad_norm 1.1155 (1.1899) [2022-01-18 09:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][170/1251] eta 0:41:35 lr 0.000979 time 2.5331 (2.3087) loss 4.1821 (4.2996) grad_norm 1.1713 (1.1853) [2022-01-18 09:41:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][180/1251] eta 0:40:55 lr 
0.000979 time 2.0129 (2.2930) loss 4.9994 (4.3112) grad_norm 1.5182 (1.1839) [2022-01-18 09:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][190/1251] eta 0:40:20 lr 0.000979 time 2.2671 (2.2811) loss 3.7443 (4.3082) grad_norm 1.0625 (1.1846) [2022-01-18 09:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][200/1251] eta 0:39:52 lr 0.000979 time 1.8926 (2.2767) loss 4.0403 (4.3049) grad_norm 1.2970 (1.1877) [2022-01-18 09:42:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][210/1251] eta 0:39:24 lr 0.000979 time 1.6469 (2.2711) loss 3.9567 (4.2975) grad_norm 1.1061 (1.1821) [2022-01-18 09:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][220/1251] eta 0:39:10 lr 0.000979 time 1.7907 (2.2796) loss 5.1697 (4.3146) grad_norm 0.9835 (1.1788) [2022-01-18 09:43:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][230/1251] eta 0:38:45 lr 0.000979 time 2.2213 (2.2777) loss 4.7342 (4.3213) grad_norm 1.2855 (1.1803) [2022-01-18 09:43:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][240/1251] eta 0:38:24 lr 0.000979 time 2.2776 (2.2793) loss 4.5277 (4.3216) grad_norm 1.1647 (1.1839) [2022-01-18 09:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][250/1251] eta 0:37:53 lr 0.000979 time 2.2795 (2.2714) loss 4.0967 (4.3078) grad_norm 1.3202 (1.1838) [2022-01-18 09:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][260/1251] eta 0:37:22 lr 0.000979 time 1.5446 (2.2632) loss 3.2986 (4.2935) grad_norm 1.0583 (1.1814) [2022-01-18 09:44:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][270/1251] eta 0:36:53 lr 0.000979 time 2.3462 (2.2568) loss 4.5452 (4.2904) grad_norm 1.2691 (1.1838) [2022-01-18 09:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][280/1251] eta 0:36:29 lr 0.000979 time 1.9179 (2.2553) loss 4.6948 (4.2900) grad_norm 1.1991 (1.1826) [2022-01-18 09:45:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][290/1251] eta 0:36:12 lr 0.000979 time 2.4899 (2.2610) loss 3.3673 (4.2840) grad_norm 1.1840 (1.1882) [2022-01-18 09:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][300/1251] eta 0:35:46 lr 0.000979 time 1.6647 (2.2575) loss 3.5125 (4.2810) grad_norm 1.1869 (1.1877) [2022-01-18 09:46:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][310/1251] eta 0:35:19 lr 0.000979 time 2.2058 (2.2527) loss 3.7843 (4.2775) grad_norm 1.2893 (1.1898) [2022-01-18 09:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][320/1251] eta 0:34:59 lr 0.000978 time 2.5945 (2.2548) loss 4.4333 (4.2704) grad_norm 1.2059 (1.1886) [2022-01-18 09:47:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][330/1251] eta 0:34:31 lr 0.000978 time 1.6434 (2.2494) loss 3.8777 (4.2591) grad_norm 1.3392 (1.1902) [2022-01-18 09:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][340/1251] eta 0:34:03 lr 0.000978 time 1.8575 (2.2434) loss 4.7138 (4.2600) grad_norm 1.1525 (1.1918) [2022-01-18 09:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][350/1251] eta 0:33:39 lr 0.000978 time 2.5978 (2.2416) loss 2.8776 (4.2503) grad_norm 1.1968 (1.1893) [2022-01-18 09:48:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][360/1251] eta 0:33:18 lr 0.000978 time 1.8712 (2.2430) loss 4.3096 (4.2505) grad_norm 1.0749 (1.1855) [2022-01-18 09:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][370/1251] eta 0:32:56 lr 0.000978 time 2.5331 (2.2432) loss 4.7179 (4.2381) grad_norm 1.2328 (1.1833) [2022-01-18 09:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][380/1251] eta 0:32:27 lr 0.000978 time 1.9167 (2.2364) loss 4.3598 (4.2336) grad_norm 1.1297 (1.1841) [2022-01-18 09:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][390/1251] eta 0:32:05 lr 0.000978 time 
2.8822 (2.2358) loss 2.9906 (4.2358) grad_norm 1.8402 (1.1873) [2022-01-18 09:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][400/1251] eta 0:31:46 lr 0.000978 time 2.1349 (2.2399) loss 3.8816 (4.2376) grad_norm 1.0775 (1.1859) [2022-01-18 09:50:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][410/1251] eta 0:31:27 lr 0.000978 time 2.5054 (2.2445) loss 4.3855 (4.2398) grad_norm 1.4584 (1.1872) [2022-01-18 09:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][420/1251] eta 0:31:03 lr 0.000978 time 1.8947 (2.2420) loss 4.2389 (4.2361) grad_norm 1.0775 (1.1860) [2022-01-18 09:50:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][430/1251] eta 0:30:34 lr 0.000978 time 1.9726 (2.2341) loss 4.1726 (4.2330) grad_norm 1.3621 (1.1868) [2022-01-18 09:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][440/1251] eta 0:30:08 lr 0.000978 time 1.7945 (2.2296) loss 5.0439 (4.2387) grad_norm 1.0850 (1.1878) [2022-01-18 09:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][450/1251] eta 0:29:56 lr 0.000978 time 1.8720 (2.2425) loss 4.5846 (4.2408) grad_norm 1.2251 (1.1877) [2022-01-18 09:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][460/1251] eta 0:29:33 lr 0.000978 time 1.9362 (2.2427) loss 4.3211 (4.2418) grad_norm 1.0929 (1.1887) [2022-01-18 09:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][470/1251] eta 0:29:12 lr 0.000978 time 2.0854 (2.2434) loss 4.5216 (4.2426) grad_norm 0.9795 (1.1874) [2022-01-18 09:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][480/1251] eta 0:28:46 lr 0.000978 time 1.7752 (2.2393) loss 4.6179 (4.2388) grad_norm 1.2405 (1.1859) [2022-01-18 09:53:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][490/1251] eta 0:28:23 lr 0.000978 time 1.9033 (2.2383) loss 5.1198 (4.2443) grad_norm 0.9630 (1.1835) [2022-01-18 09:53:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][500/1251] eta 0:28:01 lr 0.000978 time 1.9386 (2.2393) loss 3.7447 (4.2403) grad_norm 0.9973 (1.1852) [2022-01-18 09:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][510/1251] eta 0:27:38 lr 0.000978 time 2.2368 (2.2378) loss 4.8846 (4.2347) grad_norm 1.2582 (1.1845) [2022-01-18 09:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][520/1251] eta 0:27:12 lr 0.000978 time 2.0095 (2.2337) loss 3.8374 (4.2372) grad_norm 1.2180 (1.1857) [2022-01-18 09:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][530/1251] eta 0:26:53 lr 0.000978 time 1.8890 (2.2378) loss 3.8935 (4.2372) grad_norm 1.2725 (1.1862) [2022-01-18 09:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][540/1251] eta 0:26:28 lr 0.000978 time 1.5685 (2.2338) loss 3.8886 (4.2385) grad_norm 1.1293 (1.1870) [2022-01-18 09:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][550/1251] eta 0:26:04 lr 0.000978 time 1.6399 (2.2324) loss 4.3996 (4.2411) grad_norm 0.9624 (1.1870) [2022-01-18 09:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][560/1251] eta 0:25:39 lr 0.000978 time 1.9944 (2.2284) loss 3.9243 (4.2409) grad_norm 1.0996 (1.1877) [2022-01-18 09:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][570/1251] eta 0:25:17 lr 0.000978 time 1.9611 (2.2281) loss 4.7400 (4.2447) grad_norm 1.4052 (1.1890) [2022-01-18 09:56:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][580/1251] eta 0:24:55 lr 0.000978 time 1.6740 (2.2283) loss 4.5776 (4.2511) grad_norm 1.0353 (1.1880) [2022-01-18 09:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][590/1251] eta 0:24:33 lr 0.000978 time 2.3105 (2.2285) loss 4.7492 (4.2476) grad_norm 1.1715 (1.1874) [2022-01-18 09:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][600/1251] eta 0:24:09 lr 0.000978 time 
1.7180 (2.2272) loss 4.4967 (4.2533) grad_norm 0.9738 (1.1855) [2022-01-18 09:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][610/1251] eta 0:23:50 lr 0.000978 time 1.9071 (2.2314) loss 4.6536 (4.2514) grad_norm 1.3131 (1.1864) [2022-01-18 09:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][620/1251] eta 0:23:25 lr 0.000978 time 1.6026 (2.2275) loss 4.1449 (4.2438) grad_norm 1.0725 (1.1852) [2022-01-18 09:58:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][630/1251] eta 0:23:01 lr 0.000978 time 1.7956 (2.2244) loss 4.5988 (4.2463) grad_norm 1.0582 (1.1857) [2022-01-18 09:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][640/1251] eta 0:22:37 lr 0.000978 time 1.8783 (2.2225) loss 3.2354 (4.2413) grad_norm 0.9864 (1.1856) [2022-01-18 09:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][650/1251] eta 0:22:18 lr 0.000978 time 2.2875 (2.2271) loss 3.5647 (4.2400) grad_norm 1.1101 (1.1849) [2022-01-18 09:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][660/1251] eta 0:21:54 lr 0.000978 time 1.7710 (2.2244) loss 3.1050 (4.2379) grad_norm 1.2181 (1.1840) [2022-01-18 09:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][670/1251] eta 0:21:32 lr 0.000978 time 2.1594 (2.2244) loss 4.2892 (4.2403) grad_norm 1.1479 (1.1829) [2022-01-18 09:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][680/1251] eta 0:21:08 lr 0.000978 time 1.5703 (2.2217) loss 4.5487 (4.2391) grad_norm 1.3038 (1.1828) [2022-01-18 10:00:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][690/1251] eta 0:20:47 lr 0.000978 time 1.9270 (2.2235) loss 3.2625 (4.2323) grad_norm 1.2385 (1.1819) [2022-01-18 10:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][700/1251] eta 0:20:24 lr 0.000978 time 1.8316 (2.2222) loss 4.1939 (4.2349) grad_norm 1.2359 (1.1817) [2022-01-18 10:01:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][710/1251] eta 0:20:01 lr 0.000978 time 2.2441 (2.2209) loss 3.9045 (4.2299) grad_norm 1.2899 (1.1824) [2022-01-18 10:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][720/1251] eta 0:19:38 lr 0.000978 time 1.7835 (2.2189) loss 4.4749 (4.2318) grad_norm 1.1338 (1.1808) [2022-01-18 10:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][730/1251] eta 0:19:16 lr 0.000978 time 2.2546 (2.2205) loss 3.4569 (4.2363) grad_norm 1.1181 (1.1813) [2022-01-18 10:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][740/1251] eta 0:18:54 lr 0.000978 time 2.1168 (2.2209) loss 3.5814 (4.2291) grad_norm 1.1249 (1.1798) [2022-01-18 10:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][750/1251] eta 0:18:32 lr 0.000978 time 2.3097 (2.2201) loss 4.0419 (4.2290) grad_norm 1.1069 (1.1800) [2022-01-18 10:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][760/1251] eta 0:18:08 lr 0.000978 time 2.1914 (2.2179) loss 4.4189 (4.2303) grad_norm 1.1346 (1.1805) [2022-01-18 10:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][770/1251] eta 0:17:46 lr 0.000978 time 1.6631 (2.2175) loss 4.6835 (4.2342) grad_norm 1.0907 (1.1803) [2022-01-18 10:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][780/1251] eta 0:17:24 lr 0.000978 time 2.0225 (2.2168) loss 4.1244 (4.2329) grad_norm 1.1262 (1.1794) [2022-01-18 10:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][790/1251] eta 0:17:01 lr 0.000978 time 1.9381 (2.2158) loss 3.8201 (4.2319) grad_norm 1.1653 (1.1791) [2022-01-18 10:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][800/1251] eta 0:16:39 lr 0.000978 time 2.4918 (2.2153) loss 4.3189 (4.2328) grad_norm 0.9728 (1.1777) [2022-01-18 10:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][810/1251] eta 0:16:15 lr 0.000978 time 
1.6913 (2.2130) loss 4.4205 (4.2338) grad_norm 1.0627 (1.1775) [2022-01-18 10:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][820/1251] eta 0:15:53 lr 0.000978 time 2.4101 (2.2116) loss 3.6382 (4.2347) grad_norm 1.2906 (1.1778) [2022-01-18 10:05:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][830/1251] eta 0:15:30 lr 0.000978 time 2.7024 (2.2106) loss 4.3578 (4.2317) grad_norm 1.2891 (1.1792) [2022-01-18 10:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][840/1251] eta 0:15:08 lr 0.000978 time 2.5699 (2.2099) loss 3.5214 (4.2292) grad_norm 1.0407 (1.1797) [2022-01-18 10:06:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][850/1251] eta 0:14:45 lr 0.000978 time 1.9415 (2.2082) loss 3.1831 (4.2297) grad_norm 1.2553 (1.1783) [2022-01-18 10:06:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][860/1251] eta 0:14:23 lr 0.000978 time 2.6257 (2.2089) loss 4.5015 (4.2260) grad_norm 1.0711 (1.1784) [2022-01-18 10:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][870/1251] eta 0:14:01 lr 0.000978 time 2.4707 (2.2088) loss 4.6510 (4.2218) grad_norm 1.1168 (1.1768) [2022-01-18 10:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][880/1251] eta 0:13:39 lr 0.000978 time 2.8140 (2.2091) loss 4.4958 (4.2239) grad_norm 1.1274 (1.1773) [2022-01-18 10:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][890/1251] eta 0:13:17 lr 0.000978 time 1.5704 (2.2096) loss 5.1542 (4.2223) grad_norm 1.2450 (1.1772) [2022-01-18 10:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][900/1251] eta 0:12:55 lr 0.000978 time 2.2282 (2.2085) loss 4.6355 (4.2225) grad_norm 1.1065 (1.1771) [2022-01-18 10:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][910/1251] eta 0:12:32 lr 0.000978 time 2.7933 (2.2082) loss 4.7753 (4.2228) grad_norm 1.0816 (1.1764) [2022-01-18 10:08:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][920/1251] eta 0:12:10 lr 0.000978 time 2.5236 (2.2073) loss 4.5122 (4.2230) grad_norm 1.0707 (1.1769) [2022-01-18 10:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][930/1251] eta 0:11:48 lr 0.000978 time 1.6495 (2.2070) loss 4.2783 (4.2254) grad_norm 1.1936 (1.1758) [2022-01-18 10:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][940/1251] eta 0:11:27 lr 0.000978 time 3.0159 (2.2102) loss 3.7289 (4.2233) grad_norm 1.3952 (1.1768) [2022-01-18 10:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][950/1251] eta 0:11:05 lr 0.000978 time 1.8829 (2.2107) loss 3.4842 (4.2191) grad_norm 1.0744 (1.1768) [2022-01-18 10:10:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][960/1251] eta 0:10:44 lr 0.000978 time 2.7537 (2.2132) loss 4.3876 (4.2170) grad_norm 1.1816 (1.1761) [2022-01-18 10:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][970/1251] eta 0:10:21 lr 0.000978 time 1.9145 (2.2113) loss 4.9696 (4.2166) grad_norm 1.4305 (1.1767) [2022-01-18 10:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][980/1251] eta 0:09:58 lr 0.000978 time 2.1272 (2.2095) loss 3.4182 (4.2157) grad_norm 1.0941 (1.1781) [2022-01-18 10:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][990/1251] eta 0:09:36 lr 0.000978 time 2.2813 (2.2072) loss 4.0961 (4.2165) grad_norm 1.4437 (1.1785) [2022-01-18 10:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1000/1251] eta 0:09:13 lr 0.000978 time 2.1752 (2.2059) loss 3.7932 (4.2160) grad_norm 1.1003 (1.1790) [2022-01-18 10:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1010/1251] eta 0:08:51 lr 0.000978 time 2.0129 (2.2047) loss 4.1949 (4.2162) grad_norm 1.1288 (1.1783) [2022-01-18 10:12:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1020/1251] eta 0:08:29 lr 0.000978 time 
2.2816 (2.2040) loss 4.8793 (4.2198) grad_norm 1.5462 (1.1792) [2022-01-18 10:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1030/1251] eta 0:08:07 lr 0.000978 time 2.8195 (2.2051) loss 4.5593 (4.2182) grad_norm 1.2977 (1.1788) [2022-01-18 10:12:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1040/1251] eta 0:07:45 lr 0.000978 time 2.8976 (2.2068) loss 4.8488 (4.2198) grad_norm 1.9145 (1.1791) [2022-01-18 10:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1050/1251] eta 0:07:23 lr 0.000978 time 3.0403 (2.2082) loss 4.4028 (4.2180) grad_norm 1.1631 (1.1790) [2022-01-18 10:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1060/1251] eta 0:07:01 lr 0.000978 time 2.1748 (2.2076) loss 4.8586 (4.2170) grad_norm 1.1805 (1.1784) [2022-01-18 10:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1070/1251] eta 0:06:39 lr 0.000978 time 2.4077 (2.2074) loss 3.3842 (4.2156) grad_norm 1.7196 (1.1790) [2022-01-18 10:14:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1080/1251] eta 0:06:17 lr 0.000978 time 2.1579 (2.2073) loss 4.9934 (4.2145) grad_norm 1.2880 (1.1792) [2022-01-18 10:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1090/1251] eta 0:05:55 lr 0.000978 time 2.5338 (2.2063) loss 4.3721 (4.2131) grad_norm 0.9919 (1.1785) [2022-01-18 10:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1100/1251] eta 0:05:33 lr 0.000978 time 2.2044 (2.2058) loss 4.2934 (4.2130) grad_norm 0.9796 (1.1775) [2022-01-18 10:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1110/1251] eta 0:05:11 lr 0.000978 time 2.2949 (2.2059) loss 4.7854 (4.2130) grad_norm 1.1604 (1.1768) [2022-01-18 10:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1120/1251] eta 0:04:49 lr 0.000978 time 3.0551 (2.2064) loss 4.2458 (4.2117) grad_norm 1.0578 (1.1760) [2022-01-18 10:16:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1130/1251] eta 0:04:27 lr 0.000977 time 2.7987 (2.2069) loss 3.8182 (4.2134) grad_norm 1.3727 (1.1754) [2022-01-18 10:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1140/1251] eta 0:04:04 lr 0.000977 time 2.3408 (2.2066) loss 4.4714 (4.2133) grad_norm 1.0634 (1.1748) [2022-01-18 10:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1150/1251] eta 0:03:42 lr 0.000977 time 1.7876 (2.2052) loss 3.8517 (4.2140) grad_norm 1.0766 (1.1745) [2022-01-18 10:17:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1160/1251] eta 0:03:20 lr 0.000977 time 3.1667 (2.2055) loss 4.0258 (4.2135) grad_norm 1.0843 (1.1742) [2022-01-18 10:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1170/1251] eta 0:02:58 lr 0.000977 time 2.2065 (2.2049) loss 4.9484 (4.2135) grad_norm 1.2201 (1.1738) [2022-01-18 10:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1180/1251] eta 0:02:36 lr 0.000977 time 1.7578 (2.2044) loss 4.4281 (4.2135) grad_norm 1.1455 (1.1740) [2022-01-18 10:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1190/1251] eta 0:02:14 lr 0.000977 time 1.9084 (2.2031) loss 3.3641 (4.2124) grad_norm 1.0742 (1.1748) [2022-01-18 10:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1200/1251] eta 0:01:52 lr 0.000977 time 2.8495 (2.2037) loss 4.3732 (4.2111) grad_norm 1.2260 (1.1744) [2022-01-18 10:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1210/1251] eta 0:01:30 lr 0.000977 time 2.5024 (2.2037) loss 3.4537 (4.2102) grad_norm 1.2485 (1.1746) [2022-01-18 10:19:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1220/1251] eta 0:01:08 lr 0.000977 time 1.7912 (2.2037) loss 3.9366 (4.2113) grad_norm 1.1742 (1.1751) [2022-01-18 10:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1230/1251] eta 0:00:46 lr 
0.000977 time 2.4760 (2.2039) loss 4.6416 (4.2118) grad_norm 1.0417 (1.1742) [2022-01-18 10:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1240/1251] eta 0:00:24 lr 0.000977 time 1.5133 (2.2021) loss 4.6937 (4.2135) grad_norm 0.8788 (1.1734) [2022-01-18 10:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [28/300][1250/1251] eta 0:00:02 lr 0.000977 time 1.2143 (2.1969) loss 3.9600 (4.2157) grad_norm 1.3942 (1.1733) [2022-01-18 10:20:30 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 28 training takes 0:45:48 [2022-01-18 10:20:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.364 (18.364) Loss 1.6176 (1.6176) Acc@1 63.672 (63.672) Acc@5 86.523 (86.523) [2022-01-18 10:21:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.470 (3.363) Loss 1.5846 (1.5558) Acc@1 64.453 (65.057) Acc@5 87.109 (87.012) [2022-01-18 10:21:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.967 (2.555) Loss 1.4927 (1.5454) Acc@1 65.820 (65.007) Acc@5 86.523 (87.058) [2022-01-18 10:21:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.320 (2.357) Loss 1.5134 (1.5348) Acc@1 65.137 (65.354) Acc@5 87.207 (87.172) [2022-01-18 10:21:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.618 (2.184) Loss 1.5809 (1.5432) Acc@1 65.137 (65.046) Acc@5 86.816 (87.036) [2022-01-18 10:22:07 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 65.086 Acc@5 87.026 [2022-01-18 10:22:07 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 65.1% [2022-01-18 10:22:07 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 65.09% [2022-01-18 10:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][0/1251] eta 7:34:07 lr 0.000977 time 21.7808 (21.7808) loss 4.5887 (4.5887) grad_norm 1.0561 (1.0561) [2022-01-18 10:22:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[29/300][10/1251] eta 1:22:54 lr 0.000977 time 2.2470 (4.0088) loss 4.2140 (4.3482) grad_norm 1.1759 (1.0693) [2022-01-18 10:23:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][20/1251] eta 1:05:05 lr 0.000977 time 1.9305 (3.1724) loss 4.4178 (4.1151) grad_norm 1.3344 (1.1484) [2022-01-18 10:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][30/1251] eta 0:57:54 lr 0.000977 time 1.4104 (2.8459) loss 3.9203 (4.1397) grad_norm 0.9752 (1.1295) [2022-01-18 10:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][40/1251] eta 0:55:08 lr 0.000977 time 3.7262 (2.7323) loss 4.9926 (4.1256) grad_norm 1.0834 (1.1539) [2022-01-18 10:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][50/1251] eta 0:52:40 lr 0.000977 time 1.7242 (2.6312) loss 3.0520 (4.1108) grad_norm 1.1770 (1.1469) [2022-01-18 10:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][60/1251] eta 0:50:43 lr 0.000977 time 2.3386 (2.5551) loss 4.5249 (4.0751) grad_norm 0.9661 (1.1458) [2022-01-18 10:25:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][70/1251] eta 0:49:00 lr 0.000977 time 1.7202 (2.4902) loss 3.5798 (4.0751) grad_norm 1.1521 (1.1482) [2022-01-18 10:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][80/1251] eta 0:47:57 lr 0.000977 time 3.0468 (2.4572) loss 4.8652 (4.1023) grad_norm 1.1379 (1.1603) [2022-01-18 10:25:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][90/1251] eta 0:46:45 lr 0.000977 time 1.5745 (2.4165) loss 4.3879 (4.1036) grad_norm 1.1266 (1.1675) [2022-01-18 10:26:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][100/1251] eta 0:45:35 lr 0.000977 time 2.2530 (2.3766) loss 4.7167 (4.1107) grad_norm 1.1888 (1.1703) [2022-01-18 10:26:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][110/1251] eta 0:44:51 lr 0.000977 time 2.5103 (2.3593) loss 3.6944 (4.0960) grad_norm 1.2332 (1.1731) 
[2022-01-18 10:26:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][120/1251] eta 0:44:14 lr 0.000977 time 3.1068 (2.3469) loss 4.7103 (4.1283) grad_norm 1.2585 (1.1709) [2022-01-18 10:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][130/1251] eta 0:43:36 lr 0.000977 time 1.7881 (2.3343) loss 4.8951 (4.1416) grad_norm 1.1381 (1.1680) [2022-01-18 10:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][140/1251] eta 0:43:03 lr 0.000977 time 1.6664 (2.3255) loss 5.0359 (4.1249) grad_norm 1.4293 (1.1633) [2022-01-18 10:27:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][150/1251] eta 0:42:29 lr 0.000977 time 1.9096 (2.3158) loss 4.2355 (4.1293) grad_norm 1.1855 (1.1633) [2022-01-18 10:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][160/1251] eta 0:42:04 lr 0.000977 time 2.9632 (2.3135) loss 4.4774 (4.1558) grad_norm 1.1201 (1.1597) [2022-01-18 10:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][170/1251] eta 0:41:28 lr 0.000977 time 1.6809 (2.3022) loss 3.1318 (4.1590) grad_norm 1.5816 (1.1618) [2022-01-18 10:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][180/1251] eta 0:40:56 lr 0.000977 time 1.8933 (2.2937) loss 3.4121 (4.1583) grad_norm 1.2676 (1.1670) [2022-01-18 10:29:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][190/1251] eta 0:40:30 lr 0.000977 time 2.4295 (2.2906) loss 4.6886 (4.1502) grad_norm 1.1018 (1.1661) [2022-01-18 10:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][200/1251] eta 0:40:06 lr 0.000977 time 2.0780 (2.2896) loss 3.3021 (4.1353) grad_norm 0.9962 (1.1613) [2022-01-18 10:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][210/1251] eta 0:39:44 lr 0.000977 time 1.8020 (2.2902) loss 3.5424 (4.1378) grad_norm 1.0140 (1.1585) [2022-01-18 10:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][220/1251] eta 0:39:19 
lr 0.000977 time 2.5946 (2.2884) loss 3.2195 (4.1482) grad_norm 0.9759 (1.1570) [2022-01-18 10:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][230/1251] eta 0:38:46 lr 0.000977 time 1.5800 (2.2783) loss 4.1973 (4.1562) grad_norm 1.5202 (1.1588) [2022-01-18 10:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][240/1251] eta 0:38:12 lr 0.000977 time 1.8762 (2.2675) loss 4.3171 (4.1667) grad_norm 1.1485 (1.1585) [2022-01-18 10:31:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][250/1251] eta 0:37:38 lr 0.000977 time 2.2662 (2.2563) loss 4.3101 (4.1676) grad_norm 1.1357 (1.1590) [2022-01-18 10:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][260/1251] eta 0:37:08 lr 0.000977 time 2.3169 (2.2490) loss 4.4325 (4.1631) grad_norm 0.8490 (1.1579) [2022-01-18 10:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][270/1251] eta 0:36:40 lr 0.000977 time 2.7812 (2.2427) loss 3.9574 (4.1683) grad_norm 0.9961 (1.1582) [2022-01-18 10:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][280/1251] eta 0:36:21 lr 0.000977 time 3.1337 (2.2470) loss 3.0934 (4.1605) grad_norm 1.2235 (1.1612) [2022-01-18 10:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][290/1251] eta 0:35:59 lr 0.000977 time 2.9059 (2.2476) loss 4.8080 (4.1622) grad_norm 1.0743 (1.1637) [2022-01-18 10:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][300/1251] eta 0:35:39 lr 0.000977 time 2.5564 (2.2502) loss 4.2447 (4.1613) grad_norm 0.8855 (1.1637) [2022-01-18 10:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][310/1251] eta 0:35:15 lr 0.000977 time 2.0434 (2.2486) loss 4.8455 (4.1713) grad_norm 1.1026 (1.1627) [2022-01-18 10:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][320/1251] eta 0:34:53 lr 0.000977 time 2.4365 (2.2488) loss 4.9958 (4.1744) grad_norm 1.3304 (1.1627) [2022-01-18 10:34:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][330/1251] eta 0:34:33 lr 0.000977 time 2.6359 (2.2517) loss 4.6648 (4.1784) grad_norm 1.5318 (1.1634) [2022-01-18 10:34:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][340/1251] eta 0:34:08 lr 0.000977 time 2.2946 (2.2482) loss 3.9036 (4.1780) grad_norm 1.4305 (1.1644) [2022-01-18 10:35:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][350/1251] eta 0:33:41 lr 0.000977 time 2.1587 (2.2434) loss 4.4470 (4.1742) grad_norm 1.0771 (1.1634) [2022-01-18 10:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][360/1251] eta 0:33:12 lr 0.000977 time 2.7259 (2.2359) loss 5.0809 (4.1779) grad_norm 1.3145 (1.1638) [2022-01-18 10:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][370/1251] eta 0:32:43 lr 0.000977 time 1.9463 (2.2290) loss 5.0513 (4.1761) grad_norm 1.2422 (1.1617) [2022-01-18 10:36:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][380/1251] eta 0:32:18 lr 0.000977 time 2.1708 (2.2254) loss 4.2362 (4.1837) grad_norm 1.1074 (1.1615) [2022-01-18 10:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][390/1251] eta 0:31:52 lr 0.000977 time 1.9952 (2.2214) loss 4.4979 (4.1881) grad_norm 1.1098 (1.1622) [2022-01-18 10:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][400/1251] eta 0:31:30 lr 0.000977 time 1.8999 (2.2210) loss 4.0129 (4.1829) grad_norm 1.2205 (1.1629) [2022-01-18 10:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][410/1251] eta 0:31:03 lr 0.000977 time 1.9010 (2.2164) loss 4.8145 (4.1836) grad_norm 1.0154 (1.1623) [2022-01-18 10:37:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][420/1251] eta 0:30:43 lr 0.000977 time 2.5646 (2.2184) loss 4.6867 (4.1799) grad_norm 1.0576 (1.1600) [2022-01-18 10:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][430/1251] eta 0:30:24 lr 0.000977 time 
2.2370 (2.2219) loss 4.5373 (4.1795) grad_norm 1.0441 (1.1594) [2022-01-18 10:38:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][440/1251] eta 0:30:08 lr 0.000977 time 2.0786 (2.2301) loss 3.4277 (4.1764) grad_norm 1.1325 (1.1579) [2022-01-18 10:38:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][450/1251] eta 0:29:47 lr 0.000977 time 2.2702 (2.2320) loss 3.1500 (4.1771) grad_norm 0.9470 (1.1558) [2022-01-18 10:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][460/1251] eta 0:29:26 lr 0.000977 time 2.4149 (2.2333) loss 4.2625 (4.1741) grad_norm 1.1121 (1.1565) [2022-01-18 10:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][470/1251] eta 0:29:03 lr 0.000977 time 1.7596 (2.2319) loss 5.1004 (4.1755) grad_norm 1.1842 (1.1579) [2022-01-18 10:39:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][480/1251] eta 0:28:36 lr 0.000977 time 1.8610 (2.2258) loss 4.6357 (4.1843) grad_norm 1.7451 (1.1617) [2022-01-18 10:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][490/1251] eta 0:28:08 lr 0.000977 time 1.8762 (2.2192) loss 3.9913 (4.1866) grad_norm 1.1573 (1.1631) [2022-01-18 10:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][500/1251] eta 0:27:43 lr 0.000977 time 1.5474 (2.2150) loss 3.4490 (4.1848) grad_norm 0.9732 (1.1642) [2022-01-18 10:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][510/1251] eta 0:27:20 lr 0.000977 time 2.2825 (2.2144) loss 4.7759 (4.1840) grad_norm 1.1363 (1.1627) [2022-01-18 10:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][520/1251] eta 0:26:58 lr 0.000977 time 2.1809 (2.2140) loss 4.7679 (4.1803) grad_norm 1.0425 (1.1611) [2022-01-18 10:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][530/1251] eta 0:26:36 lr 0.000977 time 2.1618 (2.2139) loss 4.3910 (4.1791) grad_norm 0.9405 (1.1604) [2022-01-18 10:42:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][540/1251] eta 0:26:15 lr 0.000977 time 2.9169 (2.2161) loss 4.3146 (4.1750) grad_norm 1.0887 (1.1616) [2022-01-18 10:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][550/1251] eta 0:25:54 lr 0.000977 time 2.5347 (2.2176) loss 3.3959 (4.1787) grad_norm 1.2009 (1.1603) [2022-01-18 10:42:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][560/1251] eta 0:25:31 lr 0.000977 time 1.8556 (2.2164) loss 4.5366 (4.1781) grad_norm 1.2837 (1.1631) [2022-01-18 10:43:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][570/1251] eta 0:25:11 lr 0.000977 time 2.6274 (2.2198) loss 4.5452 (4.1799) grad_norm 1.7134 (1.1643) [2022-01-18 10:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][580/1251] eta 0:24:49 lr 0.000977 time 2.1702 (2.2193) loss 3.9166 (4.1776) grad_norm 1.3577 (1.1644) [2022-01-18 10:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][590/1251] eta 0:24:26 lr 0.000977 time 2.2611 (2.2190) loss 3.6528 (4.1812) grad_norm 1.3343 (1.1641) [2022-01-18 10:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][600/1251] eta 0:24:04 lr 0.000977 time 2.1884 (2.2195) loss 4.5066 (4.1820) grad_norm 0.9909 (1.1644) [2022-01-18 10:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][610/1251] eta 0:23:41 lr 0.000977 time 2.3275 (2.2183) loss 5.0940 (4.1863) grad_norm 1.2648 (1.1664) [2022-01-18 10:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][620/1251] eta 0:23:17 lr 0.000977 time 1.9588 (2.2144) loss 4.4217 (4.1839) grad_norm 1.0958 (1.1646) [2022-01-18 10:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][630/1251] eta 0:22:54 lr 0.000977 time 2.2363 (2.2140) loss 4.0554 (4.1821) grad_norm 1.3536 (1.1652) [2022-01-18 10:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][640/1251] eta 0:22:32 lr 0.000977 time 
1.8985 (2.2138) loss 3.4594 (4.1830) grad_norm 1.2318 (1.1654) [2022-01-18 10:46:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][650/1251] eta 0:22:10 lr 0.000977 time 2.3531 (2.2141) loss 5.2508 (4.1836) grad_norm 1.1226 (1.1664) [2022-01-18 10:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][660/1251] eta 0:21:49 lr 0.000977 time 2.2095 (2.2152) loss 3.6283 (4.1860) grad_norm 1.2329 (1.1679) [2022-01-18 10:46:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][670/1251] eta 0:21:27 lr 0.000977 time 1.8863 (2.2152) loss 3.9771 (4.1828) grad_norm 1.2046 (1.1685) [2022-01-18 10:47:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][680/1251] eta 0:21:04 lr 0.000976 time 1.9866 (2.2141) loss 5.0303 (4.1868) grad_norm 1.2487 (1.1680) [2022-01-18 10:47:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][690/1251] eta 0:20:41 lr 0.000976 time 2.2667 (2.2126) loss 3.7860 (4.1889) grad_norm 1.2281 (1.1681) [2022-01-18 10:47:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][700/1251] eta 0:20:17 lr 0.000976 time 1.9437 (2.2103) loss 4.7352 (4.1900) grad_norm 1.0480 (1.1664) [2022-01-18 10:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][710/1251] eta 0:19:54 lr 0.000976 time 1.5448 (2.2083) loss 4.3589 (4.1928) grad_norm 1.0073 (1.1649) [2022-01-18 10:48:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][720/1251] eta 0:19:32 lr 0.000976 time 2.2049 (2.2089) loss 4.2560 (4.1934) grad_norm 1.0369 (1.1649) [2022-01-18 10:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][730/1251] eta 0:19:11 lr 0.000976 time 2.1398 (2.2094) loss 3.1366 (4.1933) grad_norm 1.4303 (1.1658) [2022-01-18 10:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][740/1251] eta 0:18:48 lr 0.000976 time 1.8133 (2.2084) loss 4.2351 (4.1921) grad_norm 1.1188 (1.1667) [2022-01-18 10:49:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][750/1251] eta 0:18:27 lr 0.000976 time 2.5104 (2.2111) loss 4.8904 (4.1905) grad_norm 1.2725 (1.1668) [2022-01-18 10:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][760/1251] eta 0:18:05 lr 0.000976 time 1.5974 (2.2107) loss 3.7168 (4.1920) grad_norm 1.4125 (1.1668) [2022-01-18 10:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][770/1251] eta 0:17:43 lr 0.000976 time 2.0981 (2.2110) loss 4.4941 (4.1907) grad_norm 1.4506 (1.1665) [2022-01-18 10:50:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][780/1251] eta 0:17:19 lr 0.000976 time 1.6478 (2.2079) loss 4.9591 (4.1924) grad_norm 1.1982 (1.1666) [2022-01-18 10:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][790/1251] eta 0:16:56 lr 0.000976 time 1.8468 (2.2053) loss 4.1635 (4.1906) grad_norm 1.1649 (1.1672) [2022-01-18 10:51:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][800/1251] eta 0:16:32 lr 0.000976 time 1.5562 (2.2013) loss 4.5412 (4.1936) grad_norm 1.0727 (1.1664) [2022-01-18 10:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][810/1251] eta 0:16:10 lr 0.000976 time 1.8623 (2.2004) loss 4.7834 (4.1926) grad_norm 1.0569 (1.1650) [2022-01-18 10:52:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][820/1251] eta 0:15:48 lr 0.000976 time 1.7698 (2.2002) loss 4.8152 (4.1919) grad_norm 1.0613 (1.1662) [2022-01-18 10:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][830/1251] eta 0:15:27 lr 0.000976 time 2.4833 (2.2031) loss 3.6378 (4.1917) grad_norm 1.0000 (1.1644) [2022-01-18 10:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][840/1251] eta 0:15:06 lr 0.000976 time 1.8535 (2.2060) loss 4.4310 (4.1922) grad_norm 0.8716 (1.1636) [2022-01-18 10:53:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][850/1251] eta 0:14:44 lr 0.000976 time 
1.7743 (2.2053) loss 4.4426 (4.1904) grad_norm 1.3152 (1.1624) [2022-01-18 10:53:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][860/1251] eta 0:14:21 lr 0.000976 time 1.7443 (2.2044) loss 4.0417 (4.1931) grad_norm 1.1528 (1.1621) [2022-01-18 10:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][870/1251] eta 0:13:59 lr 0.000976 time 1.8790 (2.2040) loss 4.5442 (4.1914) grad_norm 1.3588 (1.1615) [2022-01-18 10:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][880/1251] eta 0:13:37 lr 0.000976 time 1.9759 (2.2042) loss 3.4069 (4.1924) grad_norm 1.0560 (1.1617) [2022-01-18 10:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][890/1251] eta 0:13:15 lr 0.000976 time 1.8241 (2.2036) loss 4.6851 (4.1925) grad_norm 0.9996 (1.1616) [2022-01-18 10:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][900/1251] eta 0:12:53 lr 0.000976 time 1.9355 (2.2030) loss 4.8213 (4.1962) grad_norm 1.0486 (1.1609) [2022-01-18 10:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][910/1251] eta 0:12:30 lr 0.000976 time 2.1561 (2.2011) loss 3.8583 (4.1945) grad_norm 1.2058 (1.1610) [2022-01-18 10:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][920/1251] eta 0:12:08 lr 0.000976 time 2.6833 (2.2015) loss 4.3259 (4.1917) grad_norm 1.0692 (1.1611) [2022-01-18 10:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][930/1251] eta 0:11:47 lr 0.000976 time 2.1441 (2.2032) loss 4.1353 (4.1879) grad_norm 1.3846 (1.1610) [2022-01-18 10:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][940/1251] eta 0:11:25 lr 0.000976 time 1.8439 (2.2045) loss 3.6429 (4.1892) grad_norm 1.4698 (1.1621) [2022-01-18 10:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][950/1251] eta 0:11:03 lr 0.000976 time 1.9115 (2.2040) loss 3.7231 (4.1891) grad_norm 1.3234 (1.1619) [2022-01-18 10:57:24 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][960/1251] eta 0:10:41 lr 0.000976 time 2.5611 (2.2030) loss 4.6683 (4.1899) grad_norm 1.1605 (1.1622) [2022-01-18 10:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][970/1251] eta 0:10:18 lr 0.000976 time 1.6629 (2.2000) loss 5.0217 (4.1894) grad_norm 0.9570 (1.1622) [2022-01-18 10:58:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][980/1251] eta 0:09:55 lr 0.000976 time 1.7281 (2.1982) loss 5.0180 (4.1895) grad_norm 1.1121 (1.1613) [2022-01-18 10:58:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][990/1251] eta 0:09:33 lr 0.000976 time 2.0271 (2.1969) loss 4.8587 (4.1896) grad_norm 1.4481 (1.1626) [2022-01-18 10:58:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1000/1251] eta 0:09:11 lr 0.000976 time 2.1979 (2.1982) loss 5.2169 (4.1912) grad_norm 1.3638 (1.1636) [2022-01-18 10:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1010/1251] eta 0:08:49 lr 0.000976 time 1.5901 (2.1975) loss 4.3641 (4.1923) grad_norm 0.9760 (1.1637) [2022-01-18 10:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1020/1251] eta 0:08:27 lr 0.000976 time 1.8209 (2.1976) loss 3.7568 (4.1912) grad_norm 1.2257 (1.1637) [2022-01-18 10:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1030/1251] eta 0:08:06 lr 0.000976 time 2.3527 (2.1995) loss 4.9606 (4.1932) grad_norm 1.3054 (1.1650) [2022-01-18 11:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1040/1251] eta 0:07:44 lr 0.000976 time 1.8403 (2.2008) loss 3.5810 (4.1943) grad_norm 1.1952 (1.1656) [2022-01-18 11:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1050/1251] eta 0:07:22 lr 0.000976 time 1.5729 (2.2007) loss 4.9873 (4.1975) grad_norm 1.0091 (1.1658) [2022-01-18 11:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1060/1251] eta 0:07:00 lr 0.000976 
time 1.8977 (2.1994) loss 4.5602 (4.1948) grad_norm 1.0865 (1.1657) [2022-01-18 11:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1070/1251] eta 0:06:38 lr 0.000976 time 3.0418 (2.1993) loss 4.2021 (4.1969) grad_norm 1.1166 (1.1647) [2022-01-18 11:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1080/1251] eta 0:06:16 lr 0.000976 time 1.8367 (2.1998) loss 4.2337 (4.1975) grad_norm 1.2781 (1.1645) [2022-01-18 11:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1090/1251] eta 0:05:54 lr 0.000976 time 1.7933 (2.2005) loss 3.7646 (4.1960) grad_norm 1.4554 (1.1651) [2022-01-18 11:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1100/1251] eta 0:05:32 lr 0.000976 time 1.9816 (2.1998) loss 4.1625 (4.1944) grad_norm 1.1457 (1.1641) [2022-01-18 11:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1110/1251] eta 0:05:10 lr 0.000976 time 2.4997 (2.1999) loss 4.7579 (4.1965) grad_norm 1.1263 (1.1640) [2022-01-18 11:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1120/1251] eta 0:04:47 lr 0.000976 time 1.8108 (2.1981) loss 3.3650 (4.1972) grad_norm 1.3705 (1.1640) [2022-01-18 11:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1130/1251] eta 0:04:25 lr 0.000976 time 1.7224 (2.1975) loss 4.4851 (4.1973) grad_norm 1.0045 (1.1636) [2022-01-18 11:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1140/1251] eta 0:04:04 lr 0.000976 time 2.7079 (2.1986) loss 5.1332 (4.1984) grad_norm 1.1630 (1.1634) [2022-01-18 11:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1150/1251] eta 0:03:41 lr 0.000976 time 1.7518 (2.1975) loss 3.6012 (4.1976) grad_norm 1.2782 (1.1635) [2022-01-18 11:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1160/1251] eta 0:03:19 lr 0.000976 time 1.8894 (2.1969) loss 4.9452 (4.1969) grad_norm 1.4336 (1.1637) [2022-01-18 11:04:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1170/1251] eta 0:02:57 lr 0.000976 time 2.1001 (2.1967) loss 2.9813 (4.1963) grad_norm 1.1794 (1.1632) [2022-01-18 11:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1180/1251] eta 0:02:36 lr 0.000976 time 2.9688 (2.1977) loss 4.4067 (4.1958) grad_norm 1.0676 (1.1633) [2022-01-18 11:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1190/1251] eta 0:02:14 lr 0.000976 time 1.9159 (2.1983) loss 3.4832 (4.1956) grad_norm 1.2882 (1.1624) [2022-01-18 11:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1200/1251] eta 0:01:52 lr 0.000976 time 2.2187 (2.1972) loss 4.5748 (4.1949) grad_norm 1.1143 (1.1621) [2022-01-18 11:06:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1210/1251] eta 0:01:30 lr 0.000976 time 2.3329 (2.1960) loss 3.0770 (4.1948) grad_norm 1.0895 (1.1623) [2022-01-18 11:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1220/1251] eta 0:01:08 lr 0.000976 time 2.2055 (2.1973) loss 3.8765 (4.1931) grad_norm 1.4732 (1.1630) [2022-01-18 11:07:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1230/1251] eta 0:00:46 lr 0.000976 time 1.7161 (2.1981) loss 5.2459 (4.1937) grad_norm 1.2970 (1.1634) [2022-01-18 11:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1240/1251] eta 0:00:24 lr 0.000976 time 1.9346 (2.1964) loss 4.7413 (4.1934) grad_norm 0.9264 (1.1630) [2022-01-18 11:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [29/300][1250/1251] eta 0:00:02 lr 0.000976 time 1.1735 (2.1901) loss 4.6124 (4.1932) grad_norm 1.1593 (1.1621) [2022-01-18 11:07:47 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 29 training takes 0:45:40 [2022-01-18 11:08:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.620 (18.620) Loss 1.5340 (1.5340) Acc@1 66.895 (66.895) Acc@5 86.816 (86.816) [2022-01-18 11:08:23 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.615 (3.299) Loss 1.5557 (1.5529) Acc@1 65.918 (65.243) Acc@5 87.402 (87.118) [2022-01-18 11:08:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.618 (2.549) Loss 1.4893 (1.5447) Acc@1 67.676 (65.485) Acc@5 88.086 (87.384) [2022-01-18 11:08:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.933 (2.263) Loss 1.5947 (1.5496) Acc@1 65.332 (65.427) Acc@5 86.426 (87.210) [2022-01-18 11:09:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.437 (2.149) Loss 1.5835 (1.5456) Acc@1 63.477 (65.518) Acc@5 86.621 (87.314) [2022-01-18 11:09:22 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 65.656 Acc@5 87.426 [2022-01-18 11:09:22 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 65.7% [2022-01-18 11:09:22 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 65.66% [2022-01-18 11:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][0/1251] eta 7:36:16 lr 0.000976 time 21.8835 (21.8835) loss 4.9265 (4.9265) grad_norm 1.1341 (1.1341) [2022-01-18 11:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][10/1251] eta 1:23:43 lr 0.000976 time 1.8670 (4.0483) loss 4.7184 (4.2486) grad_norm 1.2265 (1.1089) [2022-01-18 11:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][20/1251] eta 1:05:03 lr 0.000976 time 1.3747 (3.1707) loss 4.5807 (4.2173) grad_norm 1.0478 (1.1170) [2022-01-18 11:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][30/1251] eta 0:58:24 lr 0.000976 time 1.9647 (2.8704) loss 4.1840 (4.2126) grad_norm 1.1258 (1.0985) [2022-01-18 11:11:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][40/1251] eta 0:55:23 lr 0.000976 time 3.8804 (2.7442) loss 2.9362 (4.1613) grad_norm 1.0078 (1.1068) [2022-01-18 11:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[30/300][50/1251] eta 0:52:32 lr 0.000976 time 2.0481 (2.6249) loss 4.3181 (4.2029) grad_norm 1.4279 (1.1332) [2022-01-18 11:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][60/1251] eta 0:50:21 lr 0.000976 time 1.9289 (2.5371) loss 4.9457 (4.2255) grad_norm 0.9137 (1.1304) [2022-01-18 11:12:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][70/1251] eta 0:48:51 lr 0.000976 time 2.2633 (2.4825) loss 4.0085 (4.1708) grad_norm 1.1714 (1.1521) [2022-01-18 11:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][80/1251] eta 0:47:57 lr 0.000976 time 4.9827 (2.4571) loss 4.4216 (4.1933) grad_norm 1.1439 (1.1545) [2022-01-18 11:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][90/1251] eta 0:47:02 lr 0.000976 time 1.4986 (2.4311) loss 4.1200 (4.1712) grad_norm 1.1623 (1.1600) [2022-01-18 11:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][100/1251] eta 0:46:08 lr 0.000976 time 2.1176 (2.4055) loss 3.3358 (4.1667) grad_norm 1.0304 (1.1570) [2022-01-18 11:13:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][110/1251] eta 0:45:18 lr 0.000976 time 2.1948 (2.3824) loss 4.4873 (4.1952) grad_norm 1.3383 (1.1613) [2022-01-18 11:14:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][120/1251] eta 0:44:45 lr 0.000976 time 3.8981 (2.3741) loss 3.3892 (4.2120) grad_norm 2.2792 (1.1758) [2022-01-18 11:14:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][130/1251] eta 0:44:00 lr 0.000976 time 1.6069 (2.3555) loss 4.2494 (4.2131) grad_norm 1.1542 (1.1734) [2022-01-18 11:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][140/1251] eta 0:43:18 lr 0.000976 time 1.9097 (2.3389) loss 4.5511 (4.2110) grad_norm 1.1989 (1.1783) [2022-01-18 11:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][150/1251] eta 0:42:54 lr 0.000976 time 2.7767 (2.3380) loss 3.6520 (4.1934) grad_norm 1.2512 (1.1796) 
[2022-01-18 11:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][160/1251] eta 0:42:38 lr 0.000976 time 5.4882 (2.3448) loss 4.2644 (4.2052) grad_norm 0.9330 (1.1816) [2022-01-18 11:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][170/1251] eta 0:42:09 lr 0.000976 time 1.8464 (2.3397) loss 4.3741 (4.2085) grad_norm 1.1399 (1.1801) [2022-01-18 11:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][180/1251] eta 0:41:23 lr 0.000976 time 1.8987 (2.3186) loss 3.3702 (4.2057) grad_norm 1.1045 (1.1710) [2022-01-18 11:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][190/1251] eta 0:40:41 lr 0.000976 time 1.5951 (2.3013) loss 3.8300 (4.2174) grad_norm 1.1279 (1.1725) [2022-01-18 11:17:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][200/1251] eta 0:40:24 lr 0.000976 time 2.4060 (2.3069) loss 5.1335 (4.2146) grad_norm 1.3925 (1.1682) [2022-01-18 11:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][210/1251] eta 0:39:51 lr 0.000976 time 2.2883 (2.2972) loss 3.2791 (4.2062) grad_norm 0.9007 (1.1674) [2022-01-18 11:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][220/1251] eta 0:39:13 lr 0.000975 time 1.6770 (2.2823) loss 3.0061 (4.1826) grad_norm 1.0529 (1.1673) [2022-01-18 11:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][230/1251] eta 0:38:42 lr 0.000975 time 2.2880 (2.2744) loss 3.0912 (4.1806) grad_norm 1.1453 (1.1647) [2022-01-18 11:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][240/1251] eta 0:38:27 lr 0.000975 time 2.7217 (2.2819) loss 3.6836 (4.1844) grad_norm 1.1242 (1.1626) [2022-01-18 11:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][250/1251] eta 0:38:05 lr 0.000975 time 2.2314 (2.2831) loss 3.6480 (4.1838) grad_norm 1.0481 (1.1622) [2022-01-18 11:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][260/1251] eta 0:37:40 
lr 0.000975 time 1.6515 (2.2809) loss 4.4477 (4.1790) grad_norm 1.4850 (1.1636) [2022-01-18 11:19:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][270/1251] eta 0:37:14 lr 0.000975 time 2.2102 (2.2776) loss 4.4795 (4.1931) grad_norm 0.9825 (1.1680) [2022-01-18 11:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][280/1251] eta 0:36:44 lr 0.000975 time 1.9027 (2.2704) loss 4.6842 (4.1832) grad_norm 1.1505 (1.1671) [2022-01-18 11:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][290/1251] eta 0:36:12 lr 0.000975 time 1.9183 (2.2609) loss 4.0060 (4.1808) grad_norm 0.9714 (1.1655) [2022-01-18 11:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][300/1251] eta 0:35:45 lr 0.000975 time 1.6779 (2.2562) loss 4.0669 (4.1762) grad_norm 1.1436 (1.1653) [2022-01-18 11:21:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][310/1251] eta 0:35:19 lr 0.000975 time 1.9458 (2.2525) loss 3.2848 (4.1722) grad_norm 1.1105 (1.1662) [2022-01-18 11:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][320/1251] eta 0:34:54 lr 0.000975 time 1.8927 (2.2498) loss 3.7943 (4.1751) grad_norm 0.9847 (1.1644) [2022-01-18 11:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][330/1251] eta 0:34:27 lr 0.000975 time 2.1427 (2.2451) loss 3.2482 (4.1835) grad_norm 0.9708 (1.1611) [2022-01-18 11:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][340/1251] eta 0:34:01 lr 0.000975 time 1.8898 (2.2411) loss 4.5786 (4.1791) grad_norm 0.9349 (1.1580) [2022-01-18 11:22:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][350/1251] eta 0:33:42 lr 0.000975 time 2.5378 (2.2446) loss 3.8910 (4.1844) grad_norm 1.4271 (1.1610) [2022-01-18 11:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][360/1251] eta 0:33:19 lr 0.000975 time 2.2759 (2.2442) loss 4.8908 (4.1882) grad_norm 1.0381 (1.1594) [2022-01-18 11:23:15 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][370/1251] eta 0:32:57 lr 0.000975 time 1.7853 (2.2444) loss 3.0574 (4.1822) grad_norm 1.2201 (1.1622) [2022-01-18 11:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][380/1251] eta 0:32:32 lr 0.000975 time 1.9576 (2.2421) loss 3.3765 (4.1848) grad_norm 1.1633 (1.1643) [2022-01-18 11:23:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][390/1251] eta 0:32:08 lr 0.000975 time 2.2469 (2.2400) loss 4.6697 (4.1860) grad_norm 1.6022 (1.1675) [2022-01-18 11:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][400/1251] eta 0:31:48 lr 0.000975 time 2.2297 (2.2423) loss 4.5608 (4.1884) grad_norm 1.0485 (1.1708) [2022-01-18 11:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][410/1251] eta 0:31:24 lr 0.000975 time 1.8552 (2.2411) loss 3.5872 (4.1812) grad_norm 1.5330 (1.1688) [2022-01-18 11:25:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][420/1251] eta 0:30:57 lr 0.000975 time 1.9201 (2.2357) loss 4.5039 (4.1756) grad_norm 0.9853 (1.1694) [2022-01-18 11:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][430/1251] eta 0:30:31 lr 0.000975 time 2.2430 (2.2304) loss 4.3293 (4.1742) grad_norm 1.0136 (1.1681) [2022-01-18 11:25:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][440/1251] eta 0:30:05 lr 0.000975 time 1.9670 (2.2268) loss 4.1873 (4.1750) grad_norm 1.2441 (1.1679) [2022-01-18 11:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][450/1251] eta 0:29:52 lr 0.000975 time 2.4619 (2.2381) loss 3.1809 (4.1732) grad_norm 1.2172 (1.1668) [2022-01-18 11:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][460/1251] eta 0:29:27 lr 0.000975 time 1.9329 (2.2347) loss 3.9035 (4.1719) grad_norm 1.3684 (1.1678) [2022-01-18 11:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][470/1251] eta 0:29:04 lr 0.000975 time 
1.8479 (2.2334) loss 3.0703 (4.1737) grad_norm 1.0645 (1.1674) [2022-01-18 11:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][480/1251] eta 0:28:42 lr 0.000975 time 1.8892 (2.2336) loss 4.3002 (4.1719) grad_norm 0.9480 (1.1671) [2022-01-18 11:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][490/1251] eta 0:28:23 lr 0.000975 time 1.7631 (2.2385) loss 4.6979 (4.1710) grad_norm 1.0275 (1.1648) [2022-01-18 11:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][500/1251] eta 0:27:55 lr 0.000975 time 1.6078 (2.2312) loss 3.0303 (4.1675) grad_norm 1.0522 (1.1649) [2022-01-18 11:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][510/1251] eta 0:27:32 lr 0.000975 time 1.9668 (2.2295) loss 3.3485 (4.1659) grad_norm 1.1537 (1.1658) [2022-01-18 11:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][520/1251] eta 0:27:10 lr 0.000975 time 1.5973 (2.2307) loss 3.3212 (4.1666) grad_norm 1.2776 (1.1651) [2022-01-18 11:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][530/1251] eta 0:26:49 lr 0.000975 time 1.9339 (2.2327) loss 4.2908 (4.1672) grad_norm 1.0525 (1.1627) [2022-01-18 11:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][540/1251] eta 0:26:25 lr 0.000975 time 2.2092 (2.2304) loss 4.7984 (4.1628) grad_norm 1.0321 (1.1629) [2022-01-18 11:29:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][550/1251] eta 0:25:59 lr 0.000975 time 2.0386 (2.2250) loss 2.9739 (4.1618) grad_norm 1.1003 (1.1628) [2022-01-18 11:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][560/1251] eta 0:25:35 lr 0.000975 time 1.8425 (2.2218) loss 5.1582 (4.1642) grad_norm 1.0097 (1.1619) [2022-01-18 11:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][570/1251] eta 0:25:14 lr 0.000975 time 1.5860 (2.2240) loss 4.3913 (4.1696) grad_norm 1.2015 (1.1611) [2022-01-18 11:30:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][580/1251] eta 0:24:51 lr 0.000975 time 1.5861 (2.2235) loss 4.7438 (4.1722) grad_norm 1.1836 (1.1606) [2022-01-18 11:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][590/1251] eta 0:24:29 lr 0.000975 time 1.8067 (2.2232) loss 5.0148 (4.1741) grad_norm 1.4856 (1.1615) [2022-01-18 11:31:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][600/1251] eta 0:24:06 lr 0.000975 time 1.5482 (2.2224) loss 2.9948 (4.1750) grad_norm 1.1668 (1.1610) [2022-01-18 11:32:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][610/1251] eta 0:23:45 lr 0.000975 time 1.6858 (2.2233) loss 4.2886 (4.1760) grad_norm 1.2784 (1.1619) [2022-01-18 11:32:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][620/1251] eta 0:23:22 lr 0.000975 time 1.9254 (2.2222) loss 4.7079 (4.1784) grad_norm 1.3819 (1.1629) [2022-01-18 11:32:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][630/1251] eta 0:22:59 lr 0.000975 time 1.7857 (2.2212) loss 3.5216 (4.1801) grad_norm 1.6295 (1.1638) [2022-01-18 11:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][640/1251] eta 0:22:37 lr 0.000975 time 2.8474 (2.2215) loss 4.0008 (4.1802) grad_norm 1.1394 (1.1628) [2022-01-18 11:33:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][650/1251] eta 0:22:14 lr 0.000975 time 2.5287 (2.2213) loss 2.7636 (4.1769) grad_norm 1.1864 (1.1626) [2022-01-18 11:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][660/1251] eta 0:21:51 lr 0.000975 time 1.6402 (2.2192) loss 4.5214 (4.1811) grad_norm 0.9034 (1.1632) [2022-01-18 11:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][670/1251] eta 0:21:28 lr 0.000975 time 1.8150 (2.2171) loss 4.1884 (4.1803) grad_norm 1.1200 (1.1623) [2022-01-18 11:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][680/1251] eta 0:21:06 lr 0.000975 time 
2.8054 (2.2172) loss 4.0248 (4.1848) grad_norm 1.1220 (1.1622) [2022-01-18 11:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][690/1251] eta 0:20:43 lr 0.000975 time 1.8533 (2.2172) loss 4.7020 (4.1831) grad_norm 1.0684 (1.1632) [2022-01-18 11:35:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][700/1251] eta 0:20:22 lr 0.000975 time 2.5399 (2.2179) loss 3.9769 (4.1817) grad_norm 1.4236 (1.1624) [2022-01-18 11:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][710/1251] eta 0:19:59 lr 0.000975 time 2.2362 (2.2176) loss 4.3461 (4.1845) grad_norm 1.2069 (1.1617) [2022-01-18 11:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][720/1251] eta 0:19:38 lr 0.000975 time 3.7462 (2.2193) loss 3.9460 (4.1811) grad_norm 1.1704 (1.1623) [2022-01-18 11:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][730/1251] eta 0:19:15 lr 0.000975 time 1.8306 (2.2171) loss 3.8419 (4.1854) grad_norm 1.3540 (1.1627) [2022-01-18 11:36:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][740/1251] eta 0:18:51 lr 0.000975 time 1.6673 (2.2141) loss 4.9836 (4.1868) grad_norm 0.9981 (1.1610) [2022-01-18 11:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][750/1251] eta 0:18:27 lr 0.000975 time 1.9928 (2.2108) loss 3.7963 (4.1875) grad_norm 1.4879 (1.1613) [2022-01-18 11:37:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][760/1251] eta 0:18:05 lr 0.000975 time 2.7755 (2.2113) loss 3.5974 (4.1878) grad_norm 1.5536 (1.1612) [2022-01-18 11:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][770/1251] eta 0:17:43 lr 0.000975 time 2.5167 (2.2120) loss 4.6320 (4.1855) grad_norm 1.1821 (1.1618) [2022-01-18 11:38:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][780/1251] eta 0:17:22 lr 0.000975 time 2.3585 (2.2139) loss 4.0694 (4.1792) grad_norm 0.9923 (1.1618) [2022-01-18 11:38:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][790/1251] eta 0:17:00 lr 0.000975 time 1.5649 (2.2145) loss 3.9312 (4.1780) grad_norm 0.9722 (1.1615) [2022-01-18 11:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][800/1251] eta 0:16:39 lr 0.000975 time 2.5787 (2.2158) loss 4.2431 (4.1772) grad_norm 0.9531 (1.1616) [2022-01-18 11:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][810/1251] eta 0:16:16 lr 0.000975 time 2.5571 (2.2143) loss 3.3730 (4.1771) grad_norm 1.4161 (1.1624) [2022-01-18 11:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][820/1251] eta 0:15:53 lr 0.000975 time 2.3548 (2.2120) loss 4.2087 (4.1757) grad_norm 1.2767 (1.1634) [2022-01-18 11:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][830/1251] eta 0:15:30 lr 0.000975 time 2.2724 (2.2102) loss 3.8527 (4.1762) grad_norm 0.9529 (1.1617) [2022-01-18 11:40:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][840/1251] eta 0:15:08 lr 0.000975 time 1.9559 (2.2095) loss 4.0408 (4.1763) grad_norm 1.2436 (1.1619) [2022-01-18 11:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][850/1251] eta 0:14:45 lr 0.000975 time 2.4656 (2.2094) loss 3.1348 (4.1754) grad_norm 1.0609 (1.1620) [2022-01-18 11:41:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][860/1251] eta 0:14:23 lr 0.000975 time 1.8641 (2.2096) loss 4.7813 (4.1790) grad_norm 1.0327 (1.1618) [2022-01-18 11:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][870/1251] eta 0:14:01 lr 0.000975 time 2.0708 (2.2088) loss 3.2447 (4.1763) grad_norm 1.1923 (1.1614) [2022-01-18 11:41:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][880/1251] eta 0:13:38 lr 0.000975 time 1.6868 (2.2069) loss 4.2978 (4.1772) grad_norm 1.1874 (1.1619) [2022-01-18 11:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][890/1251] eta 0:13:16 lr 0.000975 time 
2.1250 (2.2074) loss 4.5123 (4.1754) grad_norm 0.9917 (1.1625) [2022-01-18 11:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][900/1251] eta 0:12:54 lr 0.000975 time 1.9442 (2.2071) loss 3.7908 (4.1797) grad_norm 1.0984 (1.1635) [2022-01-18 11:42:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][910/1251] eta 0:12:32 lr 0.000975 time 2.1477 (2.2072) loss 4.4631 (4.1805) grad_norm 1.3143 (1.1636) [2022-01-18 11:43:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][920/1251] eta 0:12:10 lr 0.000975 time 1.8847 (2.2068) loss 3.8935 (4.1777) grad_norm 1.0156 (1.1629) [2022-01-18 11:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][930/1251] eta 0:11:48 lr 0.000975 time 1.7528 (2.2064) loss 3.7750 (4.1763) grad_norm 1.0938 (1.1628) [2022-01-18 11:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][940/1251] eta 0:11:26 lr 0.000975 time 1.5576 (2.2065) loss 4.1843 (4.1769) grad_norm 1.1617 (1.1628) [2022-01-18 11:44:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][950/1251] eta 0:11:05 lr 0.000975 time 2.6077 (2.2099) loss 3.9103 (4.1782) grad_norm 1.0025 (1.1621) [2022-01-18 11:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][960/1251] eta 0:10:42 lr 0.000975 time 1.9249 (2.2094) loss 4.6243 (4.1775) grad_norm 1.5991 (1.1628) [2022-01-18 11:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][970/1251] eta 0:10:20 lr 0.000975 time 1.7381 (2.2073) loss 4.3238 (4.1768) grad_norm 1.3013 (1.1630) [2022-01-18 11:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][980/1251] eta 0:09:57 lr 0.000975 time 1.5099 (2.2042) loss 4.7675 (4.1774) grad_norm 1.1486 (1.1629) [2022-01-18 11:45:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][990/1251] eta 0:09:35 lr 0.000974 time 2.1636 (2.2032) loss 4.5426 (4.1808) grad_norm 1.4047 (1.1631) [2022-01-18 11:46:09 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1000/1251] eta 0:09:13 lr 0.000974 time 2.3717 (2.2048) loss 4.2496 (4.1815) grad_norm 0.9331 (1.1623) [2022-01-18 11:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1010/1251] eta 0:08:51 lr 0.000974 time 1.8791 (2.2054) loss 4.4932 (4.1794) grad_norm 1.0102 (1.1620) [2022-01-18 11:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1020/1251] eta 0:08:29 lr 0.000974 time 2.1138 (2.2065) loss 4.0411 (4.1770) grad_norm 1.1541 (1.1628) [2022-01-18 11:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1030/1251] eta 0:08:08 lr 0.000974 time 2.6535 (2.2089) loss 4.5940 (4.1771) grad_norm 1.1656 (1.1628) [2022-01-18 11:47:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1040/1251] eta 0:07:46 lr 0.000974 time 1.6030 (2.2096) loss 4.1900 (4.1770) grad_norm 1.4506 (1.1636) [2022-01-18 11:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1050/1251] eta 0:07:23 lr 0.000974 time 1.8815 (2.2074) loss 4.8732 (4.1776) grad_norm 1.1442 (1.1633) [2022-01-18 11:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1060/1251] eta 0:07:00 lr 0.000974 time 1.8980 (2.2039) loss 4.5525 (4.1760) grad_norm 1.1947 (1.1635) [2022-01-18 11:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1070/1251] eta 0:06:38 lr 0.000974 time 1.8420 (2.2017) loss 4.3824 (4.1743) grad_norm 1.0437 (1.1638) [2022-01-18 11:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1080/1251] eta 0:06:16 lr 0.000974 time 2.0394 (2.2020) loss 3.9883 (4.1743) grad_norm 1.0339 (1.1641) [2022-01-18 11:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1090/1251] eta 0:05:54 lr 0.000974 time 2.1849 (2.2024) loss 4.3888 (4.1759) grad_norm 1.2625 (1.1644) [2022-01-18 11:49:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1100/1251] eta 0:05:32 lr 
0.000974 time 1.9527 (2.2035) loss 4.1423 (4.1732) grad_norm 1.3584 (1.1641) [2022-01-18 11:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1110/1251] eta 0:05:10 lr 0.000974 time 2.8869 (2.2048) loss 3.9612 (4.1733) grad_norm 1.0627 (1.1641) [2022-01-18 11:50:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1120/1251] eta 0:04:48 lr 0.000974 time 1.8838 (2.2059) loss 3.6001 (4.1720) grad_norm 0.9261 (1.1636) [2022-01-18 11:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1130/1251] eta 0:04:26 lr 0.000974 time 1.5519 (2.2055) loss 3.5840 (4.1716) grad_norm 1.7391 (1.1649) [2022-01-18 11:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1140/1251] eta 0:04:04 lr 0.000974 time 1.9560 (2.2047) loss 4.1742 (4.1703) grad_norm 1.1376 (1.1648) [2022-01-18 11:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1150/1251] eta 0:03:42 lr 0.000974 time 1.8866 (2.2033) loss 3.9899 (4.1702) grad_norm 1.1359 (1.1642) [2022-01-18 11:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1160/1251] eta 0:03:20 lr 0.000974 time 1.8965 (2.2030) loss 4.7973 (4.1703) grad_norm 1.1270 (1.1638) [2022-01-18 11:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1170/1251] eta 0:02:58 lr 0.000974 time 2.0598 (2.2024) loss 3.5107 (4.1700) grad_norm 1.1473 (1.1641) [2022-01-18 11:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1180/1251] eta 0:02:36 lr 0.000974 time 2.5148 (2.2024) loss 4.3818 (4.1715) grad_norm 1.4698 (1.1648) [2022-01-18 11:53:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1190/1251] eta 0:02:14 lr 0.000974 time 2.1904 (2.2023) loss 3.4576 (4.1709) grad_norm 1.1825 (1.1646) [2022-01-18 11:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1200/1251] eta 0:01:52 lr 0.000974 time 1.8080 (2.2023) loss 4.4011 (4.1716) grad_norm 1.1112 (1.1641) [2022-01-18 11:53:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1210/1251] eta 0:01:30 lr 0.000974 time 1.6237 (2.2012) loss 4.2193 (4.1730) grad_norm 1.0748 (1.1637) [2022-01-18 11:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1220/1251] eta 0:01:08 lr 0.000974 time 2.5577 (2.2005) loss 3.4690 (4.1734) grad_norm 1.1490 (1.1640) [2022-01-18 11:54:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1230/1251] eta 0:00:46 lr 0.000974 time 1.9455 (2.2000) loss 3.2299 (4.1716) grad_norm 1.3148 (1.1641) [2022-01-18 11:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1240/1251] eta 0:00:24 lr 0.000974 time 1.9918 (2.2003) loss 3.7161 (4.1706) grad_norm 1.0441 (1.1639) [2022-01-18 11:55:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [30/300][1250/1251] eta 0:00:02 lr 0.000974 time 1.1552 (2.1945) loss 4.1716 (4.1707) grad_norm 1.2085 (1.1633) [2022-01-18 11:55:08 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 30 training takes 0:45:45 [2022-01-18 11:55:08 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_30 saving...... [2022-01-18 11:55:19 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_30 saved !!! 
[2022-01-18 11:55:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 11.748 (11.748) Loss 1.4883 (1.4883) Acc@1 67.773 (67.773) Acc@5 87.695 (87.695) [2022-01-18 11:55:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 5.076 (2.826) Loss 1.5598 (1.4982) Acc@1 63.477 (66.042) Acc@5 86.426 (87.500) [2022-01-18 11:56:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.611 (2.289) Loss 1.5181 (1.5039) Acc@1 65.234 (65.718) Acc@5 88.281 (87.505) [2022-01-18 11:56:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.654 (2.153) Loss 1.5033 (1.4994) Acc@1 66.406 (65.902) Acc@5 87.207 (87.437) [2022-01-18 11:56:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.951 (2.020) Loss 1.5325 (1.5001) Acc@1 65.137 (65.873) Acc@5 88.379 (87.524) [2022-01-18 11:56:49 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 66.042 Acc@5 87.628 [2022-01-18 11:56:49 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 66.0% [2022-01-18 11:56:49 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 66.04% [2022-01-18 11:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][0/1251] eta 7:35:23 lr 0.000974 time 21.8417 (21.8417) loss 3.2145 (3.2145) grad_norm 1.1015 (1.1015) [2022-01-18 11:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][10/1251] eta 1:23:41 lr 0.000974 time 2.7429 (4.0461) loss 4.5842 (4.2757) grad_norm 1.0823 (1.1538) [2022-01-18 11:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][20/1251] eta 1:05:26 lr 0.000974 time 1.4791 (3.1900) loss 3.6072 (4.1796) grad_norm 1.2405 (1.1479) [2022-01-18 11:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][30/1251] eta 0:58:19 lr 0.000974 time 1.9123 (2.8661) loss 4.6701 (4.1197) grad_norm 1.1008 (1.1511) [2022-01-18 11:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[31/300][40/1251] eta 0:54:47 lr 0.000974 time 3.5122 (2.7150) loss 3.0476 (4.0252) grad_norm 1.0328 (1.1714) [2022-01-18 11:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][50/1251] eta 0:53:31 lr 0.000974 time 2.7602 (2.6743) loss 4.5075 (4.0034) grad_norm 0.9638 (1.1641) [2022-01-18 11:59:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][60/1251] eta 0:51:18 lr 0.000974 time 1.6023 (2.5848) loss 4.4377 (3.9838) grad_norm 1.7713 (1.1742) [2022-01-18 11:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][70/1251] eta 0:49:40 lr 0.000974 time 1.6028 (2.5237) loss 4.6450 (3.9753) grad_norm 1.1835 (1.1762) [2022-01-18 12:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][80/1251] eta 0:48:29 lr 0.000974 time 3.3786 (2.4847) loss 5.0926 (3.9736) grad_norm 1.2785 (1.1701) [2022-01-18 12:00:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][90/1251] eta 0:47:28 lr 0.000974 time 2.0118 (2.4533) loss 4.9282 (4.0077) grad_norm 1.2824 (1.1663) [2022-01-18 12:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][100/1251] eta 0:46:05 lr 0.000974 time 2.0985 (2.4029) loss 4.3926 (4.0525) grad_norm 1.2248 (1.1610) [2022-01-18 12:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][110/1251] eta 0:45:04 lr 0.000974 time 2.1024 (2.3700) loss 4.5117 (4.0833) grad_norm 1.1827 (1.1530) [2022-01-18 12:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][120/1251] eta 0:44:17 lr 0.000974 time 2.7404 (2.3496) loss 5.0228 (4.1086) grad_norm 1.3691 (1.1520) [2022-01-18 12:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][130/1251] eta 0:43:45 lr 0.000974 time 1.8809 (2.3422) loss 4.7303 (4.1212) grad_norm 1.4010 (1.1604) [2022-01-18 12:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][140/1251] eta 0:43:28 lr 0.000974 time 2.1243 (2.3482) loss 3.0251 (4.1063) grad_norm 1.4545 (1.1669) 
[2022-01-18 12:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][150/1251] eta 0:43:02 lr 0.000974 time 2.7401 (2.3457) loss 3.7051 (4.1064) grad_norm 1.0226 (1.1670) [2022-01-18 12:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][160/1251] eta 0:42:32 lr 0.000974 time 3.0518 (2.3393) loss 3.2754 (4.0928) grad_norm 1.0228 (1.1653) [2022-01-18 12:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][170/1251] eta 0:41:48 lr 0.000974 time 1.6126 (2.3209) loss 5.0405 (4.1103) grad_norm 0.8383 (1.1630) [2022-01-18 12:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][180/1251] eta 0:41:17 lr 0.000974 time 1.9113 (2.3135) loss 2.9595 (4.1297) grad_norm 1.0903 (1.1579) [2022-01-18 12:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][190/1251] eta 0:40:41 lr 0.000974 time 1.8908 (2.3010) loss 3.6304 (4.1183) grad_norm 0.9601 (1.1566) [2022-01-18 12:04:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][200/1251] eta 0:40:05 lr 0.000974 time 1.8186 (2.2886) loss 4.4079 (4.1284) grad_norm 1.3541 (1.1526) [2022-01-18 12:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][210/1251] eta 0:39:36 lr 0.000974 time 1.8618 (2.2831) loss 4.6838 (4.1323) grad_norm 1.1855 (1.1523) [2022-01-18 12:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][220/1251] eta 0:39:12 lr 0.000974 time 2.2912 (2.2817) loss 3.7584 (4.1272) grad_norm 1.1589 (1.1561) [2022-01-18 12:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][230/1251] eta 0:38:53 lr 0.000974 time 2.4588 (2.2855) loss 2.8542 (4.1171) grad_norm 1.3056 (1.1582) [2022-01-18 12:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][240/1251] eta 0:38:27 lr 0.000974 time 1.8801 (2.2826) loss 4.6080 (4.1139) grad_norm 0.9168 (1.1585) [2022-01-18 12:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][250/1251] eta 0:37:53 
lr 0.000974 time 1.8856 (2.2714) loss 4.4690 (4.1150) grad_norm 0.9945 (1.1585) [2022-01-18 12:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][260/1251] eta 0:37:19 lr 0.000974 time 1.8895 (2.2599) loss 3.6409 (4.1138) grad_norm 1.1267 (1.1650) [2022-01-18 12:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][270/1251] eta 0:36:56 lr 0.000974 time 2.5623 (2.2595) loss 3.1437 (4.1275) grad_norm 1.0823 (1.1686) [2022-01-18 12:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][280/1251] eta 0:36:28 lr 0.000974 time 1.5103 (2.2538) loss 4.6091 (4.1299) grad_norm 0.9579 (1.1643) [2022-01-18 12:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][290/1251] eta 0:35:59 lr 0.000974 time 1.4917 (2.2469) loss 3.2217 (4.1342) grad_norm 1.1000 (1.1601) [2022-01-18 12:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][300/1251] eta 0:35:31 lr 0.000974 time 1.8542 (2.2415) loss 3.8807 (4.1430) grad_norm 1.1209 (1.1598) [2022-01-18 12:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][310/1251] eta 0:35:11 lr 0.000974 time 2.7761 (2.2434) loss 5.3311 (4.1513) grad_norm 1.1215 (1.1585) [2022-01-18 12:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][320/1251] eta 0:34:41 lr 0.000974 time 1.8122 (2.2358) loss 4.5917 (4.1430) grad_norm 1.0623 (1.1566) [2022-01-18 12:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][330/1251] eta 0:34:17 lr 0.000974 time 2.2208 (2.2341) loss 4.5657 (4.1505) grad_norm 1.0474 (1.1570) [2022-01-18 12:09:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][340/1251] eta 0:33:53 lr 0.000974 time 2.0456 (2.2316) loss 4.4036 (4.1505) grad_norm 1.0218 (1.1554) [2022-01-18 12:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][350/1251] eta 0:33:37 lr 0.000974 time 2.8796 (2.2394) loss 3.8010 (4.1486) grad_norm 1.2334 (1.1559) [2022-01-18 12:10:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][360/1251] eta 0:33:18 lr 0.000974 time 2.2161 (2.2425) loss 3.9001 (4.1449) grad_norm 1.0808 (1.1578) [2022-01-18 12:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][370/1251] eta 0:33:00 lr 0.000974 time 2.1976 (2.2483) loss 3.2452 (4.1383) grad_norm 1.2545 (1.1595) [2022-01-18 12:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][380/1251] eta 0:32:41 lr 0.000974 time 1.9103 (2.2520) loss 3.5691 (4.1375) grad_norm 1.2700 (1.1627) [2022-01-18 12:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][390/1251] eta 0:32:18 lr 0.000974 time 1.9037 (2.2513) loss 3.4383 (4.1378) grad_norm 1.0740 (1.1626) [2022-01-18 12:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][400/1251] eta 0:31:49 lr 0.000974 time 1.8533 (2.2444) loss 3.9059 (4.1506) grad_norm 1.0717 (1.1625) [2022-01-18 12:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][410/1251] eta 0:31:21 lr 0.000974 time 2.0269 (2.2368) loss 4.5557 (4.1528) grad_norm 1.5605 (1.1614) [2022-01-18 12:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][420/1251] eta 0:30:55 lr 0.000974 time 1.8355 (2.2331) loss 4.6447 (4.1572) grad_norm 1.0986 (1.1605) [2022-01-18 12:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][430/1251] eta 0:30:33 lr 0.000974 time 2.4292 (2.2334) loss 5.0625 (4.1582) grad_norm 1.0139 (1.1610) [2022-01-18 12:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][440/1251] eta 0:30:10 lr 0.000974 time 2.4100 (2.2320) loss 3.7357 (4.1586) grad_norm 1.2122 (1.1620) [2022-01-18 12:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][450/1251] eta 0:29:46 lr 0.000974 time 1.6211 (2.2305) loss 4.3756 (4.1596) grad_norm 0.9050 (1.1643) [2022-01-18 12:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][460/1251] eta 0:29:25 lr 0.000974 time 
1.5521 (2.2316) loss 5.0241 (4.1586) grad_norm 1.2649 (1.1630) [2022-01-18 12:14:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][470/1251] eta 0:29:03 lr 0.000974 time 2.6754 (2.2322) loss 4.7306 (4.1554) grad_norm 1.6349 (1.1636) [2022-01-18 12:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][480/1251] eta 0:28:37 lr 0.000974 time 1.9646 (2.2282) loss 3.6846 (4.1582) grad_norm 1.1116 (1.1629) [2022-01-18 12:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][490/1251] eta 0:28:14 lr 0.000973 time 1.6962 (2.2268) loss 3.5658 (4.1542) grad_norm 1.4154 (1.1614) [2022-01-18 12:15:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][500/1251] eta 0:27:51 lr 0.000973 time 2.1744 (2.2257) loss 3.7322 (4.1552) grad_norm 0.8881 (1.1613) [2022-01-18 12:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][510/1251] eta 0:27:28 lr 0.000973 time 2.4088 (2.2249) loss 3.8828 (4.1521) grad_norm 0.9339 (1.1599) [2022-01-18 12:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][520/1251] eta 0:27:04 lr 0.000973 time 2.0956 (2.2223) loss 5.1347 (4.1522) grad_norm 1.0632 (1.1598) [2022-01-18 12:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][530/1251] eta 0:26:41 lr 0.000973 time 1.4610 (2.2213) loss 3.1596 (4.1523) grad_norm 1.2208 (1.1582) [2022-01-18 12:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][540/1251] eta 0:26:19 lr 0.000973 time 1.8738 (2.2210) loss 4.4667 (4.1528) grad_norm 1.1594 (1.1595) [2022-01-18 12:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][550/1251] eta 0:25:56 lr 0.000973 time 1.5350 (2.2201) loss 3.4298 (4.1457) grad_norm 1.1888 (1.1600) [2022-01-18 12:17:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][560/1251] eta 0:25:35 lr 0.000973 time 2.7552 (2.2215) loss 3.0005 (4.1448) grad_norm 1.1310 (1.1589) [2022-01-18 12:17:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][570/1251] eta 0:25:13 lr 0.000973 time 1.5030 (2.2225) loss 3.3840 (4.1467) grad_norm 1.3441 (1.1592) [2022-01-18 12:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][580/1251] eta 0:24:51 lr 0.000973 time 1.9053 (2.2227) loss 4.7357 (4.1493) grad_norm 0.9277 (1.1577) [2022-01-18 12:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][590/1251] eta 0:24:27 lr 0.000973 time 1.5912 (2.2195) loss 2.9852 (4.1494) grad_norm 1.2967 (1.1575) [2022-01-18 12:19:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][600/1251] eta 0:24:04 lr 0.000973 time 2.8510 (2.2186) loss 4.8249 (4.1491) grad_norm 1.3688 (1.1593) [2022-01-18 12:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][610/1251] eta 0:23:41 lr 0.000973 time 2.4049 (2.2178) loss 3.5748 (4.1496) grad_norm 1.0464 (1.1602) [2022-01-18 12:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][620/1251] eta 0:23:20 lr 0.000973 time 2.0257 (2.2195) loss 4.8289 (4.1569) grad_norm 0.9563 (1.1600) [2022-01-18 12:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][630/1251] eta 0:22:59 lr 0.000973 time 2.5433 (2.2210) loss 4.5951 (4.1571) grad_norm 1.1278 (1.1596) [2022-01-18 12:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][640/1251] eta 0:22:36 lr 0.000973 time 2.2175 (2.2197) loss 4.7395 (4.1576) grad_norm 1.1898 (1.1577) [2022-01-18 12:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][650/1251] eta 0:22:11 lr 0.000973 time 1.9979 (2.2158) loss 2.9974 (4.1580) grad_norm 0.9871 (1.1574) [2022-01-18 12:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][660/1251] eta 0:21:48 lr 0.000973 time 2.2625 (2.2143) loss 4.6905 (4.1566) grad_norm 1.3698 (1.1582) [2022-01-18 12:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][670/1251] eta 0:21:26 lr 0.000973 time 
2.8446 (2.2139) loss 3.1244 (4.1580) grad_norm 1.1260 (1.1583) [2022-01-18 12:21:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][680/1251] eta 0:21:04 lr 0.000973 time 2.5835 (2.2154) loss 4.5284 (4.1586) grad_norm 1.2151 (1.1590) [2022-01-18 12:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][690/1251] eta 0:20:41 lr 0.000973 time 2.1739 (2.2138) loss 3.5777 (4.1585) grad_norm 1.0064 (1.1597) [2022-01-18 12:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][700/1251] eta 0:20:19 lr 0.000973 time 2.2743 (2.2141) loss 4.3858 (4.1613) grad_norm 1.2879 (1.1590) [2022-01-18 12:23:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][710/1251] eta 0:19:58 lr 0.000973 time 3.4184 (2.2155) loss 3.7677 (4.1584) grad_norm 1.0578 (1.1589) [2022-01-18 12:23:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][720/1251] eta 0:19:36 lr 0.000973 time 2.8162 (2.2149) loss 4.4884 (4.1602) grad_norm 1.1282 (1.1580) [2022-01-18 12:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][730/1251] eta 0:19:12 lr 0.000973 time 1.7679 (2.2129) loss 3.4107 (4.1588) grad_norm 1.2437 (1.1586) [2022-01-18 12:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][740/1251] eta 0:18:49 lr 0.000973 time 1.9373 (2.2113) loss 4.5488 (4.1636) grad_norm 1.3097 (1.1612) [2022-01-18 12:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][750/1251] eta 0:18:28 lr 0.000973 time 2.9332 (2.2121) loss 4.9054 (4.1643) grad_norm 1.2464 (1.1614) [2022-01-18 12:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][760/1251] eta 0:18:06 lr 0.000973 time 3.3323 (2.2132) loss 4.2609 (4.1607) grad_norm 1.0389 (1.1622) [2022-01-18 12:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][770/1251] eta 0:17:45 lr 0.000973 time 1.9927 (2.2142) loss 2.9610 (4.1578) grad_norm 1.0177 (1.1610) [2022-01-18 12:25:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][780/1251] eta 0:17:22 lr 0.000973 time 1.7682 (2.2139) loss 3.2790 (4.1559) grad_norm 0.9769 (1.1614) [2022-01-18 12:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][790/1251] eta 0:17:01 lr 0.000973 time 3.6948 (2.2157) loss 4.1878 (4.1582) grad_norm 1.4328 (1.1628) [2022-01-18 12:26:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][800/1251] eta 0:16:38 lr 0.000973 time 1.8632 (2.2133) loss 4.1220 (4.1628) grad_norm 1.0693 (1.1621) [2022-01-18 12:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][810/1251] eta 0:16:14 lr 0.000973 time 1.9059 (2.2102) loss 4.4737 (4.1645) grad_norm 1.3198 (1.1633) [2022-01-18 12:27:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][820/1251] eta 0:15:52 lr 0.000973 time 1.5302 (2.2095) loss 4.2286 (4.1657) grad_norm 1.3924 (1.1637) [2022-01-18 12:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][830/1251] eta 0:15:30 lr 0.000973 time 2.7381 (2.2098) loss 4.3440 (4.1635) grad_norm 1.1589 (1.1643) [2022-01-18 12:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][840/1251] eta 0:15:08 lr 0.000973 time 2.0460 (2.2104) loss 3.7516 (4.1648) grad_norm 1.0430 (1.1631) [2022-01-18 12:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][850/1251] eta 0:14:46 lr 0.000973 time 1.8437 (2.2100) loss 3.2456 (4.1662) grad_norm 1.1909 (1.1620) [2022-01-18 12:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][860/1251] eta 0:14:23 lr 0.000973 time 2.4839 (2.2095) loss 3.1577 (4.1656) grad_norm 0.9483 (1.1605) [2022-01-18 12:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][870/1251] eta 0:14:01 lr 0.000973 time 2.8367 (2.2090) loss 4.6755 (4.1666) grad_norm 1.1516 (1.1591) [2022-01-18 12:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][880/1251] eta 0:13:39 lr 0.000973 time 
2.1767 (2.2078) loss 4.5690 (4.1674) grad_norm 1.1073 (1.1589) [2022-01-18 12:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][890/1251] eta 0:13:16 lr 0.000973 time 1.8151 (2.2054) loss 4.6810 (4.1672) grad_norm 0.9117 (1.1580) [2022-01-18 12:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][900/1251] eta 0:12:53 lr 0.000973 time 2.2901 (2.2046) loss 4.6172 (4.1679) grad_norm 0.9214 (1.1568) [2022-01-18 12:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][910/1251] eta 0:12:31 lr 0.000973 time 2.5051 (2.2040) loss 5.1101 (4.1700) grad_norm 1.3874 (1.1574) [2022-01-18 12:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][920/1251] eta 0:12:09 lr 0.000973 time 2.5700 (2.2041) loss 3.1005 (4.1701) grad_norm 1.1769 (1.1582) [2022-01-18 12:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][930/1251] eta 0:11:47 lr 0.000973 time 1.5662 (2.2025) loss 4.9347 (4.1698) grad_norm 1.0419 (1.1577) [2022-01-18 12:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][940/1251] eta 0:11:25 lr 0.000973 time 2.8785 (2.2042) loss 4.2497 (4.1696) grad_norm 1.4335 (1.1584) [2022-01-18 12:31:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][950/1251] eta 0:11:03 lr 0.000973 time 1.9763 (2.2049) loss 3.8109 (4.1710) grad_norm 0.9994 (1.1570) [2022-01-18 12:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][960/1251] eta 0:10:41 lr 0.000973 time 2.5407 (2.2058) loss 4.6851 (4.1710) grad_norm 1.1640 (1.1567) [2022-01-18 12:32:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][970/1251] eta 0:10:19 lr 0.000973 time 1.7067 (2.2051) loss 3.0192 (4.1717) grad_norm 1.1322 (1.1565) [2022-01-18 12:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][980/1251] eta 0:09:57 lr 0.000973 time 3.0740 (2.2049) loss 4.4993 (4.1736) grad_norm 1.5254 (1.1565) [2022-01-18 12:33:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][990/1251] eta 0:09:35 lr 0.000973 time 2.0835 (2.2044) loss 4.1657 (4.1724) grad_norm 1.3716 (1.1566) [2022-01-18 12:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1000/1251] eta 0:09:13 lr 0.000973 time 2.5593 (2.2040) loss 4.4384 (4.1699) grad_norm 1.1037 (1.1561) [2022-01-18 12:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1010/1251] eta 0:08:51 lr 0.000973 time 1.9193 (2.2037) loss 2.6556 (4.1680) grad_norm 1.1303 (1.1560) [2022-01-18 12:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1020/1251] eta 0:08:29 lr 0.000973 time 2.2945 (2.2039) loss 4.0488 (4.1673) grad_norm 1.2197 (1.1550) [2022-01-18 12:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1030/1251] eta 0:08:06 lr 0.000973 time 1.7392 (2.2032) loss 4.6666 (4.1676) grad_norm 1.6274 (1.1551) [2022-01-18 12:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1040/1251] eta 0:07:44 lr 0.000973 time 2.5138 (2.2011) loss 3.9136 (4.1691) grad_norm 1.3614 (1.1549) [2022-01-18 12:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1050/1251] eta 0:07:22 lr 0.000973 time 2.2117 (2.2009) loss 4.2807 (4.1699) grad_norm 1.0722 (1.1539) [2022-01-18 12:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1060/1251] eta 0:07:00 lr 0.000973 time 2.1042 (2.2011) loss 3.7971 (4.1684) grad_norm 1.1876 (1.1544) [2022-01-18 12:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1070/1251] eta 0:06:38 lr 0.000973 time 1.8764 (2.2019) loss 3.6352 (4.1671) grad_norm 1.0029 (1.1537) [2022-01-18 12:36:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1080/1251] eta 0:06:16 lr 0.000973 time 2.9151 (2.2022) loss 4.2821 (4.1685) grad_norm 0.9313 (1.1530) [2022-01-18 12:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1090/1251] eta 0:05:54 lr 0.000973 
time 1.9090 (2.2019) loss 3.8750 (4.1693) grad_norm 1.2842 (1.1539) [2022-01-18 12:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1100/1251] eta 0:05:32 lr 0.000973 time 1.8335 (2.1998) loss 3.3087 (4.1678) grad_norm 0.9785 (1.1531) [2022-01-18 12:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1110/1251] eta 0:05:10 lr 0.000973 time 2.2393 (2.1999) loss 4.9111 (4.1657) grad_norm 1.1122 (1.1523) [2022-01-18 12:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1120/1251] eta 0:04:48 lr 0.000973 time 2.9928 (2.1995) loss 2.9519 (4.1658) grad_norm 1.0614 (1.1518) [2022-01-18 12:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1130/1251] eta 0:04:25 lr 0.000973 time 1.7908 (2.1977) loss 4.3474 (4.1657) grad_norm 1.0764 (1.1519) [2022-01-18 12:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1140/1251] eta 0:04:03 lr 0.000973 time 1.8933 (2.1969) loss 4.0289 (4.1668) grad_norm 1.0402 (1.1514) [2022-01-18 12:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1150/1251] eta 0:03:41 lr 0.000973 time 2.4375 (2.1964) loss 4.4869 (4.1650) grad_norm 1.1772 (1.1517) [2022-01-18 12:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1160/1251] eta 0:03:20 lr 0.000973 time 3.2252 (2.1979) loss 4.3803 (4.1670) grad_norm 1.0970 (1.1510) [2022-01-18 12:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1170/1251] eta 0:02:58 lr 0.000973 time 1.5678 (2.1979) loss 3.1342 (4.1674) grad_norm 1.3912 (1.1509) [2022-01-18 12:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1180/1251] eta 0:02:36 lr 0.000973 time 1.9648 (2.1979) loss 2.9402 (4.1650) grad_norm 1.0703 (1.1511) [2022-01-18 12:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1190/1251] eta 0:02:14 lr 0.000973 time 1.8999 (2.1984) loss 3.2877 (4.1654) grad_norm 0.9328 (1.1504) [2022-01-18 12:40:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1200/1251] eta 0:01:52 lr 0.000973 time 2.6090 (2.1986) loss 4.4217 (4.1666) grad_norm 0.9436 (1.1492) [2022-01-18 12:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1210/1251] eta 0:01:30 lr 0.000973 time 1.9175 (2.1967) loss 4.4429 (4.1695) grad_norm 1.1645 (1.1489) [2022-01-18 12:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1220/1251] eta 0:01:08 lr 0.000973 time 2.2199 (2.1956) loss 4.4069 (4.1681) grad_norm 0.9563 (1.1484) [2022-01-18 12:41:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1230/1251] eta 0:00:46 lr 0.000972 time 1.9165 (2.1944) loss 4.2396 (4.1683) grad_norm 1.0610 (1.1482) [2022-01-18 12:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1240/1251] eta 0:00:24 lr 0.000972 time 2.2070 (2.1941) loss 4.4443 (4.1665) grad_norm 1.4533 (1.1482) [2022-01-18 12:42:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [31/300][1250/1251] eta 0:00:02 lr 0.000972 time 1.1203 (2.1888) loss 3.7037 (4.1644) grad_norm 1.0449 (1.1479) [2022-01-18 12:42:28 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 31 training takes 0:45:38 [2022-01-18 12:42:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.572 (18.572) Loss 1.4121 (1.4121) Acc@1 67.871 (67.871) Acc@5 89.062 (89.062) [2022-01-18 12:43:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.900 (3.484) Loss 1.4523 (1.4712) Acc@1 66.895 (66.921) Acc@5 87.402 (87.518) [2022-01-18 12:43:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.973 (2.639) Loss 1.5286 (1.4782) Acc@1 65.137 (66.574) Acc@5 85.449 (87.463) [2022-01-18 12:43:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.551 (2.273) Loss 1.4891 (1.4822) Acc@1 66.992 (66.372) Acc@5 87.695 (87.513) [2022-01-18 12:43:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.871 
(2.150) Loss 1.4832 (1.4841) Acc@1 63.965 (66.232) Acc@5 88.281 (87.583) [2022-01-18 12:44:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 66.212 Acc@5 87.622 [2022-01-18 12:44:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 66.2% [2022-01-18 12:44:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 66.21% [2022-01-18 12:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][0/1251] eta 7:23:08 lr 0.000972 time 21.2534 (21.2534) loss 3.9801 (3.9801) grad_norm 1.3080 (1.3080) [2022-01-18 12:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][10/1251] eta 1:25:02 lr 0.000972 time 2.1642 (4.1119) loss 4.1678 (3.9777) grad_norm 1.0175 (1.1090) [2022-01-18 12:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][20/1251] eta 1:05:57 lr 0.000972 time 1.6262 (3.2146) loss 4.6152 (4.0334) grad_norm 1.0023 (1.0919) [2022-01-18 12:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][30/1251] eta 0:59:54 lr 0.000972 time 1.4142 (2.9436) loss 4.3490 (4.0348) grad_norm 1.3395 (1.1273) [2022-01-18 12:45:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][40/1251] eta 0:56:32 lr 0.000972 time 3.2924 (2.8011) loss 4.8563 (4.0885) grad_norm 0.9666 (1.1802) [2022-01-18 12:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][50/1251] eta 0:55:04 lr 0.000972 time 2.7508 (2.7517) loss 4.4675 (4.1021) grad_norm 1.2187 (1.1664) [2022-01-18 12:46:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][60/1251] eta 0:52:06 lr 0.000972 time 1.6135 (2.6247) loss 4.9610 (4.1570) grad_norm 1.1226 (1.1717) [2022-01-18 12:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][70/1251] eta 0:50:46 lr 0.000972 time 1.9639 (2.5797) loss 3.9556 (4.1105) grad_norm 1.1256 (1.1777) [2022-01-18 12:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][80/1251] eta 
0:49:00 lr 0.000972 time 1.6206 (2.5113) loss 4.6327 (4.1184) grad_norm 1.1900 (1.1742) [2022-01-18 12:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][90/1251] eta 0:47:49 lr 0.000972 time 2.2842 (2.4719) loss 3.9885 (4.0982) grad_norm 1.0524 (1.1720) [2022-01-18 12:48:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][100/1251] eta 0:46:52 lr 0.000972 time 1.7295 (2.4435) loss 3.7553 (4.1011) grad_norm 1.4610 (1.1729) [2022-01-18 12:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][110/1251] eta 0:46:14 lr 0.000972 time 2.5577 (2.4314) loss 4.1119 (4.1192) grad_norm 1.1416 (1.1662) [2022-01-18 12:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][120/1251] eta 0:45:10 lr 0.000972 time 1.6413 (2.3965) loss 3.1244 (4.1256) grad_norm 1.4825 (1.1632) [2022-01-18 12:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][130/1251] eta 0:44:29 lr 0.000972 time 2.0084 (2.3813) loss 3.1610 (4.1202) grad_norm 1.5634 (1.1623) [2022-01-18 12:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][140/1251] eta 0:43:45 lr 0.000972 time 2.0210 (2.3633) loss 4.3503 (4.1244) grad_norm 1.3728 (1.1609) [2022-01-18 12:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][150/1251] eta 0:43:04 lr 0.000972 time 1.7722 (2.3474) loss 4.6159 (4.1246) grad_norm 0.9693 (1.1614) [2022-01-18 12:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][160/1251] eta 0:42:31 lr 0.000972 time 2.5299 (2.3386) loss 3.7161 (4.1084) grad_norm 0.9533 (1.1626) [2022-01-18 12:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][170/1251] eta 0:41:50 lr 0.000972 time 1.9356 (2.3227) loss 4.8420 (4.1170) grad_norm 1.0748 (1.1654) [2022-01-18 12:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][180/1251] eta 0:41:21 lr 0.000972 time 2.8072 (2.3166) loss 4.8878 (4.1100) grad_norm 1.3136 (1.1602) [2022-01-18 12:51:24 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][190/1251] eta 0:40:46 lr 0.000972 time 1.4536 (2.3054) loss 4.6445 (4.1151) grad_norm 1.1788 (1.1564) [2022-01-18 12:51:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][200/1251] eta 0:40:18 lr 0.000972 time 2.3300 (2.3011) loss 4.0031 (4.1238) grad_norm 0.9941 (1.1564) [2022-01-18 12:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][210/1251] eta 0:39:41 lr 0.000972 time 2.1495 (2.2877) loss 3.7303 (4.1076) grad_norm 1.1128 (1.1543) [2022-01-18 12:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][220/1251] eta 0:39:22 lr 0.000972 time 2.2051 (2.2919) loss 4.0873 (4.1056) grad_norm 1.0730 (1.1546) [2022-01-18 12:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][230/1251] eta 0:38:59 lr 0.000972 time 1.5416 (2.2911) loss 5.0264 (4.1164) grad_norm 1.3788 (1.1527) [2022-01-18 12:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][240/1251] eta 0:38:42 lr 0.000972 time 1.6466 (2.2970) loss 4.3915 (4.1266) grad_norm 1.0686 (1.1538) [2022-01-18 12:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][250/1251] eta 0:38:14 lr 0.000972 time 1.9355 (2.2924) loss 4.2109 (4.1355) grad_norm 1.3246 (1.1546) [2022-01-18 12:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][260/1251] eta 0:37:51 lr 0.000972 time 1.8863 (2.2917) loss 4.1077 (4.1513) grad_norm 1.1028 (1.1561) [2022-01-18 12:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][270/1251] eta 0:37:20 lr 0.000972 time 1.5996 (2.2837) loss 4.5555 (4.1578) grad_norm 1.1976 (1.1556) [2022-01-18 12:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][280/1251] eta 0:36:49 lr 0.000972 time 2.0385 (2.2755) loss 4.2326 (4.1623) grad_norm 1.1114 (1.1564) [2022-01-18 12:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][290/1251] eta 0:36:23 lr 0.000972 time 
1.8798 (2.2726) loss 4.2359 (4.1567) grad_norm 1.2964 (1.1588) [2022-01-18 12:55:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][300/1251] eta 0:36:02 lr 0.000972 time 2.1850 (2.2734) loss 4.4526 (4.1548) grad_norm 1.3012 (1.1599) [2022-01-18 12:55:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][310/1251] eta 0:35:31 lr 0.000972 time 1.6551 (2.2646) loss 4.3889 (4.1533) grad_norm 1.3761 (1.1603) [2022-01-18 12:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][320/1251] eta 0:35:10 lr 0.000972 time 1.9244 (2.2672) loss 4.2597 (4.1457) grad_norm 1.1093 (1.1577) [2022-01-18 12:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][330/1251] eta 0:34:43 lr 0.000972 time 1.6061 (2.2625) loss 4.7276 (4.1409) grad_norm 1.1957 (1.1576) [2022-01-18 12:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][340/1251] eta 0:34:16 lr 0.000972 time 1.6379 (2.2573) loss 4.9647 (4.1449) grad_norm 0.9943 (1.1555) [2022-01-18 12:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][350/1251] eta 0:33:47 lr 0.000972 time 1.6775 (2.2508) loss 4.9394 (4.1480) grad_norm 0.9882 (1.1536) [2022-01-18 12:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][360/1251] eta 0:33:22 lr 0.000972 time 2.0192 (2.2473) loss 3.7980 (4.1475) grad_norm 0.9928 (1.1540) [2022-01-18 12:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][370/1251] eta 0:32:57 lr 0.000972 time 1.6343 (2.2443) loss 3.2639 (4.1443) grad_norm 1.3697 (1.1545) [2022-01-18 12:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][380/1251] eta 0:32:31 lr 0.000972 time 1.8803 (2.2411) loss 3.9771 (4.1435) grad_norm 1.5497 (1.1566) [2022-01-18 12:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][390/1251] eta 0:32:08 lr 0.000972 time 2.1058 (2.2403) loss 4.1921 (4.1447) grad_norm 1.1718 (1.1591) [2022-01-18 12:59:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][400/1251] eta 0:31:48 lr 0.000972 time 2.7539 (2.2425) loss 3.3803 (4.1421) grad_norm 1.1313 (1.1574) [2022-01-18 12:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][410/1251] eta 0:31:23 lr 0.000972 time 1.9161 (2.2392) loss 4.0699 (4.1407) grad_norm 1.3132 (1.1571) [2022-01-18 12:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][420/1251] eta 0:30:58 lr 0.000972 time 1.8208 (2.2359) loss 4.4110 (4.1485) grad_norm 1.1489 (1.1579) [2022-01-18 13:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][430/1251] eta 0:30:38 lr 0.000972 time 3.5049 (2.2396) loss 4.7060 (4.1533) grad_norm 1.0092 (1.1563) [2022-01-18 13:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][440/1251] eta 0:30:15 lr 0.000972 time 1.9773 (2.2385) loss 4.9571 (4.1580) grad_norm 1.1445 (1.1553) [2022-01-18 13:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][450/1251] eta 0:29:53 lr 0.000972 time 1.9578 (2.2392) loss 5.0006 (4.1632) grad_norm 1.0303 (1.1541) [2022-01-18 13:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][460/1251] eta 0:29:30 lr 0.000972 time 1.7684 (2.2379) loss 4.7895 (4.1619) grad_norm 1.2667 (1.1538) [2022-01-18 13:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][470/1251] eta 0:29:08 lr 0.000972 time 3.1414 (2.2390) loss 4.1161 (4.1640) grad_norm 1.2645 (1.1536) [2022-01-18 13:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][480/1251] eta 0:28:44 lr 0.000972 time 2.5108 (2.2363) loss 4.5680 (4.1634) grad_norm 1.4713 (1.1548) [2022-01-18 13:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][490/1251] eta 0:28:18 lr 0.000972 time 2.2254 (2.2314) loss 4.5919 (4.1689) grad_norm 1.2699 (1.1537) [2022-01-18 13:02:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][500/1251] eta 0:27:53 lr 0.000972 time 
1.8040 (2.2280) loss 3.9249 (4.1671) grad_norm 1.3794 (1.1568) [2022-01-18 13:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][510/1251] eta 0:27:33 lr 0.000972 time 3.8387 (2.2308) loss 3.7010 (4.1600) grad_norm 1.0892 (1.1561) [2022-01-18 13:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][520/1251] eta 0:27:12 lr 0.000972 time 2.7593 (2.2331) loss 4.7481 (4.1582) grad_norm 1.2953 (1.1549) [2022-01-18 13:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][530/1251] eta 0:26:51 lr 0.000972 time 2.3153 (2.2350) loss 5.0444 (4.1539) grad_norm 1.2555 (1.1533) [2022-01-18 13:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][540/1251] eta 0:26:26 lr 0.000972 time 1.7393 (2.2314) loss 4.1459 (4.1578) grad_norm 1.1249 (1.1550) [2022-01-18 13:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][550/1251] eta 0:26:01 lr 0.000972 time 2.7114 (2.2281) loss 3.2198 (4.1570) grad_norm 1.0218 (1.1550) [2022-01-18 13:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][560/1251] eta 0:25:37 lr 0.000972 time 2.5991 (2.2252) loss 4.9157 (4.1567) grad_norm 1.1475 (1.1544) [2022-01-18 13:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][570/1251] eta 0:25:15 lr 0.000972 time 2.2220 (2.2247) loss 4.0716 (4.1562) grad_norm 1.0757 (1.1538) [2022-01-18 13:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][580/1251] eta 0:24:51 lr 0.000972 time 2.5244 (2.2225) loss 2.7852 (4.1521) grad_norm 0.9909 (1.1535) [2022-01-18 13:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][590/1251] eta 0:24:28 lr 0.000972 time 2.9291 (2.2220) loss 3.3468 (4.1547) grad_norm 1.0122 (1.1541) [2022-01-18 13:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][600/1251] eta 0:24:07 lr 0.000972 time 2.2675 (2.2233) loss 3.1815 (4.1485) grad_norm 1.2000 (1.1558) [2022-01-18 13:06:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][610/1251] eta 0:23:46 lr 0.000972 time 1.4858 (2.2254) loss 4.8696 (4.1520) grad_norm 1.1070 (1.1568) [2022-01-18 13:07:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][620/1251] eta 0:23:24 lr 0.000972 time 1.9353 (2.2255) loss 4.2151 (4.1504) grad_norm 1.0502 (1.1560) [2022-01-18 13:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][630/1251] eta 0:23:00 lr 0.000972 time 1.6737 (2.2234) loss 4.8799 (4.1498) grad_norm 1.3917 (1.1564) [2022-01-18 13:07:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][640/1251] eta 0:22:38 lr 0.000972 time 2.8692 (2.2233) loss 4.8722 (4.1529) grad_norm 1.0393 (1.1557) [2022-01-18 13:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][650/1251] eta 0:22:15 lr 0.000972 time 1.5495 (2.2225) loss 3.3132 (4.1494) grad_norm 0.9899 (1.1542) [2022-01-18 13:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][660/1251] eta 0:21:53 lr 0.000972 time 1.5641 (2.2220) loss 3.8943 (4.1450) grad_norm 1.1650 (1.1537) [2022-01-18 13:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][670/1251] eta 0:21:31 lr 0.000972 time 2.1270 (2.2229) loss 4.6965 (4.1439) grad_norm 1.3281 (1.1529) [2022-01-18 13:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][680/1251] eta 0:21:09 lr 0.000972 time 3.0885 (2.2241) loss 4.1417 (4.1420) grad_norm 1.1259 (1.1539) [2022-01-18 13:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][690/1251] eta 0:20:46 lr 0.000972 time 1.9181 (2.2216) loss 4.6470 (4.1456) grad_norm 0.9893 (1.1524) [2022-01-18 13:09:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][700/1251] eta 0:20:22 lr 0.000972 time 2.5643 (2.2194) loss 3.9715 (4.1453) grad_norm 1.2100 (1.1521) [2022-01-18 13:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][710/1251] eta 0:20:00 lr 0.000971 time 
1.8709 (2.2188) loss 3.5836 (4.1446) grad_norm 1.4584 (1.1533) [2022-01-18 13:10:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][720/1251] eta 0:19:39 lr 0.000971 time 3.1722 (2.2219) loss 3.8485 (4.1443) grad_norm 1.1513 (1.1527) [2022-01-18 13:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][730/1251] eta 0:19:17 lr 0.000971 time 2.2167 (2.2220) loss 3.9306 (4.1425) grad_norm 1.6917 (1.1548) [2022-01-18 13:11:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][740/1251] eta 0:18:55 lr 0.000971 time 1.9447 (2.2230) loss 4.0761 (4.1403) grad_norm 1.3498 (1.1570) [2022-01-18 13:11:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][750/1251] eta 0:18:32 lr 0.000971 time 1.7017 (2.2214) loss 4.6073 (4.1402) grad_norm 1.2169 (1.1576) [2022-01-18 13:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][760/1251] eta 0:18:09 lr 0.000971 time 2.0975 (2.2187) loss 3.7906 (4.1380) grad_norm 1.1623 (1.1584) [2022-01-18 13:12:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][770/1251] eta 0:17:45 lr 0.000971 time 1.9048 (2.2161) loss 4.0411 (4.1362) grad_norm 1.3308 (1.1600) [2022-01-18 13:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][780/1251] eta 0:17:22 lr 0.000971 time 2.1576 (2.2131) loss 4.3563 (4.1368) grad_norm 0.9595 (1.1596) [2022-01-18 13:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][790/1251] eta 0:16:59 lr 0.000971 time 1.9279 (2.2109) loss 3.7548 (4.1365) grad_norm 1.1126 (1.1591) [2022-01-18 13:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][800/1251] eta 0:16:36 lr 0.000971 time 2.1721 (2.2106) loss 3.9514 (4.1330) grad_norm 0.9956 (1.1588) [2022-01-18 13:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][810/1251] eta 0:16:15 lr 0.000971 time 2.4812 (2.2118) loss 3.7814 (4.1303) grad_norm 1.2197 (1.1588) [2022-01-18 13:14:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][820/1251] eta 0:15:52 lr 0.000971 time 2.4848 (2.2111) loss 4.3385 (4.1286) grad_norm 1.2286 (1.1587) [2022-01-18 13:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][830/1251] eta 0:15:31 lr 0.000971 time 2.2917 (2.2116) loss 4.6254 (4.1322) grad_norm 1.0240 (1.1590) [2022-01-18 13:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][840/1251] eta 0:15:08 lr 0.000971 time 1.8860 (2.2109) loss 3.4788 (4.1336) grad_norm 1.0259 (1.1595) [2022-01-18 13:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][850/1251] eta 0:14:47 lr 0.000971 time 2.7048 (2.2126) loss 4.4891 (4.1335) grad_norm 1.3962 (1.1598) [2022-01-18 13:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][860/1251] eta 0:14:25 lr 0.000971 time 2.4986 (2.2128) loss 4.0345 (4.1309) grad_norm 1.1620 (1.1595) [2022-01-18 13:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][870/1251] eta 0:14:03 lr 0.000971 time 1.6783 (2.2139) loss 4.7981 (4.1342) grad_norm 1.2355 (1.1599) [2022-01-18 13:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][880/1251] eta 0:13:40 lr 0.000971 time 1.9883 (2.2127) loss 4.7193 (4.1318) grad_norm 1.2231 (1.1624) [2022-01-18 13:16:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][890/1251] eta 0:13:18 lr 0.000971 time 1.8317 (2.2107) loss 4.0314 (4.1327) grad_norm 1.1974 (1.1623) [2022-01-18 13:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][900/1251] eta 0:12:55 lr 0.000971 time 2.4822 (2.2095) loss 4.5033 (4.1334) grad_norm 1.0363 (1.1626) [2022-01-18 13:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][910/1251] eta 0:12:33 lr 0.000971 time 1.8275 (2.2101) loss 3.4804 (4.1305) grad_norm 1.3183 (1.1629) [2022-01-18 13:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][920/1251] eta 0:12:11 lr 0.000971 time 
1.9140 (2.2093) loss 4.2352 (4.1297) grad_norm 1.2390 (1.1632) [2022-01-18 13:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][930/1251] eta 0:11:49 lr 0.000971 time 2.1216 (2.2092) loss 4.6929 (4.1291) grad_norm 1.3338 (1.1638) [2022-01-18 13:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][940/1251] eta 0:11:27 lr 0.000971 time 2.1569 (2.2100) loss 4.6989 (4.1294) grad_norm 1.1665 (1.1647) [2022-01-18 13:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][950/1251] eta 0:11:05 lr 0.000971 time 2.4552 (2.2122) loss 4.8455 (4.1299) grad_norm 1.1259 (1.1656) [2022-01-18 13:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][960/1251] eta 0:10:43 lr 0.000971 time 2.2049 (2.2117) loss 4.7278 (4.1300) grad_norm 1.0819 (1.1645) [2022-01-18 13:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][970/1251] eta 0:10:21 lr 0.000971 time 2.0147 (2.2112) loss 4.3549 (4.1296) grad_norm 1.2945 (1.1647) [2022-01-18 13:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][980/1251] eta 0:09:58 lr 0.000971 time 1.7805 (2.2090) loss 4.3255 (4.1300) grad_norm 1.0692 (1.1643) [2022-01-18 13:20:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][990/1251] eta 0:09:36 lr 0.000971 time 1.8572 (2.2077) loss 4.2611 (4.1304) grad_norm 1.1775 (1.1646) [2022-01-18 13:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1000/1251] eta 0:09:13 lr 0.000971 time 1.9620 (2.2062) loss 4.2525 (4.1327) grad_norm 1.0241 (1.1645) [2022-01-18 13:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1010/1251] eta 0:08:51 lr 0.000971 time 2.1523 (2.2065) loss 4.4115 (4.1322) grad_norm 1.1047 (1.1645) [2022-01-18 13:21:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1020/1251] eta 0:08:29 lr 0.000971 time 2.0198 (2.2063) loss 3.8318 (4.1334) grad_norm 1.1576 (1.1651) [2022-01-18 13:21:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1030/1251] eta 0:08:07 lr 0.000971 time 2.3301 (2.2055) loss 4.0072 (4.1331) grad_norm 1.0366 (1.1641) [2022-01-18 13:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1040/1251] eta 0:07:45 lr 0.000971 time 1.9607 (2.2048) loss 4.0674 (4.1345) grad_norm 1.4962 (1.1634) [2022-01-18 13:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1050/1251] eta 0:07:23 lr 0.000971 time 2.4148 (2.2043) loss 3.3478 (4.1354) grad_norm 1.4461 (1.1633) [2022-01-18 13:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1060/1251] eta 0:07:01 lr 0.000971 time 1.7992 (2.2048) loss 4.4500 (4.1369) grad_norm 1.2548 (1.1634) [2022-01-18 13:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1070/1251] eta 0:06:39 lr 0.000971 time 1.8768 (2.2047) loss 3.5773 (4.1384) grad_norm 1.0753 (1.1636) [2022-01-18 13:23:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1080/1251] eta 0:06:17 lr 0.000971 time 2.2619 (2.2057) loss 4.4517 (4.1397) grad_norm 1.2110 (1.1636) [2022-01-18 13:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1090/1251] eta 0:05:55 lr 0.000971 time 3.0966 (2.2065) loss 3.2429 (4.1415) grad_norm 1.2057 (1.1631) [2022-01-18 13:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1100/1251] eta 0:05:33 lr 0.000971 time 3.2869 (2.2075) loss 4.3896 (4.1434) grad_norm 1.1582 (1.1624) [2022-01-18 13:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1110/1251] eta 0:05:11 lr 0.000971 time 1.9753 (2.2071) loss 3.2952 (4.1466) grad_norm 1.2118 (1.1619) [2022-01-18 13:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1120/1251] eta 0:04:48 lr 0.000971 time 1.6058 (2.2057) loss 4.4021 (4.1482) grad_norm 1.4022 (1.1621) [2022-01-18 13:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1130/1251] eta 0:04:26 lr 
0.000971 time 2.3580 (2.2047) loss 4.5202 (4.1482) grad_norm 1.1972 (1.1619) [2022-01-18 13:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1140/1251] eta 0:04:04 lr 0.000971 time 2.1497 (2.2039) loss 3.1266 (4.1478) grad_norm 1.2835 (1.1620) [2022-01-18 13:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1150/1251] eta 0:03:42 lr 0.000971 time 1.8367 (2.2038) loss 3.9491 (4.1474) grad_norm 1.0178 (1.1610) [2022-01-18 13:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1160/1251] eta 0:03:20 lr 0.000971 time 2.4861 (2.2052) loss 4.4783 (4.1474) grad_norm 0.9711 (1.1606) [2022-01-18 13:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1170/1251] eta 0:02:58 lr 0.000971 time 2.4966 (2.2066) loss 4.2156 (4.1473) grad_norm 1.1064 (1.1607) [2022-01-18 13:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1180/1251] eta 0:02:36 lr 0.000971 time 2.4130 (2.2063) loss 3.6044 (4.1478) grad_norm 1.0297 (1.1603) [2022-01-18 13:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1190/1251] eta 0:02:14 lr 0.000971 time 1.8300 (2.2049) loss 4.1193 (4.1474) grad_norm 1.1176 (1.1603) [2022-01-18 13:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1200/1251] eta 0:01:52 lr 0.000971 time 1.9548 (2.2036) loss 4.8475 (4.1455) grad_norm 0.8977 (1.1602) [2022-01-18 13:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1210/1251] eta 0:01:30 lr 0.000971 time 2.1594 (2.2034) loss 4.2870 (4.1426) grad_norm 0.9516 (1.1600) [2022-01-18 13:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1220/1251] eta 0:01:08 lr 0.000971 time 1.8884 (2.2028) loss 4.3906 (4.1426) grad_norm 1.3184 (1.1600) [2022-01-18 13:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1230/1251] eta 0:00:46 lr 0.000971 time 2.2323 (2.2025) loss 4.3719 (4.1419) grad_norm 1.1184 (1.1595) [2022-01-18 13:29:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1240/1251] eta 0:00:24 lr 0.000971 time 1.2978 (2.2011) loss 4.0606 (4.1425) grad_norm 1.1366 (1.1592) [2022-01-18 13:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [32/300][1250/1251] eta 0:00:02 lr 0.000971 time 1.1676 (2.1957) loss 4.6542 (4.1441) grad_norm 1.0625 (1.1584) [2022-01-18 13:29:51 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 32 training takes 0:45:47 [2022-01-18 13:30:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.902 (18.902) Loss 1.4771 (1.4771) Acc@1 66.602 (66.602) Acc@5 88.184 (88.184) [2022-01-18 13:30:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.306 (3.369) Loss 1.4585 (1.4957) Acc@1 66.504 (66.460) Acc@5 88.965 (87.695) [2022-01-18 13:30:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.279 (2.573) Loss 1.4792 (1.4966) Acc@1 68.359 (66.592) Acc@5 86.816 (87.649) [2022-01-18 13:31:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.881 (2.300) Loss 1.5213 (1.4861) Acc@1 66.699 (66.885) Acc@5 87.402 (87.828) [2022-01-18 13:31:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.088 (2.148) Loss 1.5194 (1.4788) Acc@1 65.430 (66.904) Acc@5 87.695 (88.048) [2022-01-18 13:31:27 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 66.884 Acc@5 88.148 [2022-01-18 13:31:27 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 66.9% [2022-01-18 13:31:27 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 66.88% [2022-01-18 13:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][0/1251] eta 7:32:22 lr 0.000971 time 21.6969 (21.6969) loss 4.5727 (4.5727) grad_norm 0.9827 (0.9827) [2022-01-18 13:32:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][10/1251] eta 1:23:15 lr 0.000971 time 1.7952 (4.0252) loss 3.8949 (4.4103) grad_norm 0.8347 
(1.0204) [2022-01-18 13:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][20/1251] eta 1:03:50 lr 0.000971 time 1.8638 (3.1118) loss 4.6029 (4.3118) grad_norm 1.1368 (1.0692) [2022-01-18 13:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][30/1251] eta 0:57:55 lr 0.000971 time 1.8957 (2.8465) loss 4.7611 (4.2233) grad_norm 1.1219 (1.0933) [2022-01-18 13:33:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][40/1251] eta 0:55:35 lr 0.000971 time 3.7235 (2.7544) loss 4.3249 (4.1670) grad_norm 1.7582 (1.1457) [2022-01-18 13:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][50/1251] eta 0:53:09 lr 0.000971 time 2.4667 (2.6559) loss 3.2188 (4.1324) grad_norm 1.2123 (1.1481) [2022-01-18 13:34:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][60/1251] eta 0:51:12 lr 0.000971 time 1.8658 (2.5802) loss 4.6325 (4.0848) grad_norm 1.1577 (1.1531) [2022-01-18 13:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][70/1251] eta 0:49:46 lr 0.000971 time 1.8436 (2.5284) loss 4.3714 (4.0887) grad_norm 1.1229 (1.1634) [2022-01-18 13:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][80/1251] eta 0:48:54 lr 0.000971 time 3.6534 (2.5056) loss 4.3455 (4.1084) grad_norm 1.3272 (1.1765) [2022-01-18 13:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][90/1251] eta 0:47:39 lr 0.000971 time 1.8353 (2.4631) loss 4.4249 (4.1192) grad_norm 1.0847 (1.1769) [2022-01-18 13:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][100/1251] eta 0:46:23 lr 0.000971 time 1.7979 (2.4179) loss 4.3121 (4.1249) grad_norm 1.1847 (1.1885) [2022-01-18 13:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][110/1251] eta 0:45:40 lr 0.000971 time 1.7304 (2.4017) loss 4.3934 (4.1285) grad_norm 1.1484 (1.1840) [2022-01-18 13:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][120/1251] eta 0:45:11 
lr 0.000971 time 3.4511 (2.3976) loss 4.6340 (4.1302) grad_norm 1.2628 (1.1810) [2022-01-18 13:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][130/1251] eta 0:44:27 lr 0.000971 time 2.2257 (2.3794) loss 4.6907 (4.1256) grad_norm 1.2973 (1.1777) [2022-01-18 13:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][140/1251] eta 0:43:34 lr 0.000971 time 1.8372 (2.3532) loss 3.7658 (4.1050) grad_norm 1.1938 (1.1698) [2022-01-18 13:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][150/1251] eta 0:42:50 lr 0.000971 time 1.9223 (2.3349) loss 3.8280 (4.1073) grad_norm 1.2993 (1.1733) [2022-01-18 13:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][160/1251] eta 0:42:20 lr 0.000971 time 2.7502 (2.3285) loss 4.4621 (4.1098) grad_norm 1.1125 (1.1723) [2022-01-18 13:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][170/1251] eta 0:41:50 lr 0.000970 time 2.2517 (2.3226) loss 4.5271 (4.1124) grad_norm 1.0197 (1.1683) [2022-01-18 13:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][180/1251] eta 0:41:16 lr 0.000970 time 1.6759 (2.3127) loss 4.4537 (4.1029) grad_norm 1.4221 (1.1667) [2022-01-18 13:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][190/1251] eta 0:40:36 lr 0.000970 time 1.8613 (2.2965) loss 4.1691 (4.0971) grad_norm 1.1944 (1.1625) [2022-01-18 13:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][200/1251] eta 0:40:15 lr 0.000970 time 3.0357 (2.2983) loss 3.2804 (4.0945) grad_norm 1.0872 (1.1591) [2022-01-18 13:39:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][210/1251] eta 0:39:46 lr 0.000970 time 2.1153 (2.2923) loss 4.5674 (4.0954) grad_norm 0.9602 (1.1584) [2022-01-18 13:39:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][220/1251] eta 0:39:13 lr 0.000970 time 2.0092 (2.2826) loss 4.3137 (4.0821) grad_norm 1.2530 (1.1620) [2022-01-18 13:40:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][230/1251] eta 0:38:42 lr 0.000970 time 1.5550 (2.2752) loss 4.9311 (4.0875) grad_norm 1.2002 (1.1614) [2022-01-18 13:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][240/1251] eta 0:38:26 lr 0.000970 time 2.8123 (2.2818) loss 3.6086 (4.0886) grad_norm 1.0631 (1.1599) [2022-01-18 13:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][250/1251] eta 0:38:00 lr 0.000970 time 2.4478 (2.2780) loss 4.2889 (4.0849) grad_norm 0.9904 (1.1605) [2022-01-18 13:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][260/1251] eta 0:37:33 lr 0.000970 time 3.0946 (2.2737) loss 3.1760 (4.0797) grad_norm 1.0986 (1.1595) [2022-01-18 13:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][270/1251] eta 0:37:05 lr 0.000970 time 1.8892 (2.2689) loss 4.1212 (4.0808) grad_norm 1.0888 (1.1570) [2022-01-18 13:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][280/1251] eta 0:36:44 lr 0.000970 time 2.6462 (2.2701) loss 4.4690 (4.0901) grad_norm 1.2416 (1.1561) [2022-01-18 13:42:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][290/1251] eta 0:36:20 lr 0.000970 time 1.9316 (2.2690) loss 3.7856 (4.0962) grad_norm 1.1462 (1.1537) [2022-01-18 13:42:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][300/1251] eta 0:35:59 lr 0.000970 time 2.4329 (2.2709) loss 4.4059 (4.1030) grad_norm 1.0284 (1.1550) [2022-01-18 13:43:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][310/1251] eta 0:35:29 lr 0.000970 time 1.8750 (2.2634) loss 4.3814 (4.1029) grad_norm 1.2108 (1.1537) [2022-01-18 13:43:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][320/1251] eta 0:35:02 lr 0.000970 time 2.2288 (2.2579) loss 3.1605 (4.1020) grad_norm 1.1074 (1.1543) [2022-01-18 13:43:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][330/1251] eta 0:34:32 lr 0.000970 time 
1.5477 (2.2497) loss 3.0555 (4.0967) grad_norm 1.1271 (1.1585) [2022-01-18 13:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][340/1251] eta 0:34:06 lr 0.000970 time 2.6301 (2.2468) loss 4.4189 (4.1000) grad_norm 1.1108 (1.1576) [2022-01-18 13:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][350/1251] eta 0:33:41 lr 0.000970 time 2.3102 (2.2437) loss 3.2989 (4.0984) grad_norm 1.2919 (1.1590) [2022-01-18 13:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][360/1251] eta 0:33:18 lr 0.000970 time 2.2751 (2.2432) loss 3.5127 (4.0916) grad_norm 1.1242 (1.1575) [2022-01-18 13:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][370/1251] eta 0:32:57 lr 0.000970 time 2.1098 (2.2445) loss 3.3939 (4.0907) grad_norm 1.1998 (1.1571) [2022-01-18 13:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][380/1251] eta 0:32:40 lr 0.000970 time 2.7193 (2.2510) loss 4.7007 (4.0928) grad_norm 1.3384 (1.1585) [2022-01-18 13:46:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][390/1251] eta 0:32:17 lr 0.000970 time 1.8940 (2.2508) loss 4.5990 (4.0930) grad_norm 1.2609 (1.1602) [2022-01-18 13:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][400/1251] eta 0:32:00 lr 0.000970 time 2.5682 (2.2564) loss 3.3483 (4.0870) grad_norm 1.4366 (1.1619) [2022-01-18 13:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][410/1251] eta 0:31:34 lr 0.000970 time 2.2427 (2.2523) loss 3.8763 (4.0893) grad_norm 1.3909 (1.1598) [2022-01-18 13:47:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][420/1251] eta 0:31:07 lr 0.000970 time 1.6726 (2.2474) loss 3.8275 (4.0850) grad_norm 0.9178 (1.1583) [2022-01-18 13:47:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][430/1251] eta 0:30:40 lr 0.000970 time 1.9459 (2.2417) loss 3.3159 (4.0858) grad_norm 1.0387 (1.1599) [2022-01-18 13:47:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][440/1251] eta 0:30:14 lr 0.000970 time 1.9225 (2.2380) loss 3.4814 (4.0799) grad_norm 1.2860 (1.1604) [2022-01-18 13:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][450/1251] eta 0:29:50 lr 0.000970 time 1.6051 (2.2353) loss 4.5767 (4.0815) grad_norm 1.2276 (1.1642) [2022-01-18 13:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][460/1251] eta 0:29:28 lr 0.000970 time 1.9062 (2.2360) loss 4.0359 (4.0809) grad_norm 1.1834 (1.1647) [2022-01-18 13:49:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][470/1251] eta 0:29:05 lr 0.000970 time 1.7880 (2.2353) loss 3.2873 (4.0767) grad_norm 0.9829 (1.1645) [2022-01-18 13:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][480/1251] eta 0:28:44 lr 0.000970 time 2.5710 (2.2368) loss 3.6051 (4.0811) grad_norm 1.1785 (1.1626) [2022-01-18 13:49:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][490/1251] eta 0:28:21 lr 0.000970 time 2.0539 (2.2354) loss 3.1633 (4.0755) grad_norm 1.0987 (1.1614) [2022-01-18 13:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][500/1251] eta 0:27:59 lr 0.000970 time 2.1681 (2.2361) loss 4.3056 (4.0787) grad_norm 1.0508 (1.1607) [2022-01-18 13:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][510/1251] eta 0:27:37 lr 0.000970 time 2.0098 (2.2369) loss 4.3954 (4.0816) grad_norm 1.0853 (1.1600) [2022-01-18 13:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][520/1251] eta 0:27:15 lr 0.000970 time 2.3263 (2.2369) loss 3.2841 (4.0814) grad_norm 1.0969 (1.1589) [2022-01-18 13:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][530/1251] eta 0:26:53 lr 0.000970 time 2.6324 (2.2372) loss 3.4031 (4.0827) grad_norm 0.9662 (1.1592) [2022-01-18 13:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][540/1251] eta 0:26:27 lr 0.000970 time 
2.2189 (2.2334) loss 4.3479 (4.0797) grad_norm 1.0386 (1.1591) [2022-01-18 13:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][550/1251] eta 0:26:02 lr 0.000970 time 1.7450 (2.2285) loss 4.8644 (4.0781) grad_norm 1.0067 (1.1587) [2022-01-18 13:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][560/1251] eta 0:25:38 lr 0.000970 time 1.8565 (2.2263) loss 4.3371 (4.0728) grad_norm 1.2208 (1.1603) [2022-01-18 13:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][570/1251] eta 0:25:14 lr 0.000970 time 2.5265 (2.2242) loss 3.9498 (4.0741) grad_norm 1.3432 (1.1605) [2022-01-18 13:52:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][580/1251] eta 0:24:51 lr 0.000970 time 2.1701 (2.2228) loss 3.0819 (4.0731) grad_norm 1.1805 (1.1615) [2022-01-18 13:53:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][590/1251] eta 0:24:30 lr 0.000970 time 2.9594 (2.2248) loss 3.9302 (4.0752) grad_norm 1.4267 (1.1618) [2022-01-18 13:53:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][600/1251] eta 0:24:07 lr 0.000970 time 1.9422 (2.2231) loss 3.5404 (4.0765) grad_norm 1.1131 (1.1622) [2022-01-18 13:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][610/1251] eta 0:23:44 lr 0.000970 time 1.9168 (2.2228) loss 4.3154 (4.0776) grad_norm 1.0139 (1.1617) [2022-01-18 13:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][620/1251] eta 0:23:24 lr 0.000970 time 2.8452 (2.2258) loss 4.4913 (4.0802) grad_norm 1.1391 (1.1617) [2022-01-18 13:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][630/1251] eta 0:23:02 lr 0.000970 time 3.6218 (2.2267) loss 3.3628 (4.0871) grad_norm 1.4060 (1.1626) [2022-01-18 13:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][640/1251] eta 0:22:40 lr 0.000970 time 2.3277 (2.2272) loss 4.0860 (4.0821) grad_norm 1.2047 (1.1631) [2022-01-18 13:55:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][650/1251] eta 0:22:16 lr 0.000970 time 1.9489 (2.2241) loss 4.0886 (4.0802) grad_norm 1.0915 (1.1630) [2022-01-18 13:55:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][660/1251] eta 0:21:52 lr 0.000970 time 1.5025 (2.2216) loss 3.6932 (4.0819) grad_norm 0.8815 (1.1642) [2022-01-18 13:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][670/1251] eta 0:21:31 lr 0.000970 time 2.8902 (2.2223) loss 4.8238 (4.0873) grad_norm 1.4051 (1.1647) [2022-01-18 13:56:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][680/1251] eta 0:21:08 lr 0.000970 time 1.8972 (2.2213) loss 3.5383 (4.0906) grad_norm 1.2522 (1.1647) [2022-01-18 13:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][690/1251] eta 0:20:46 lr 0.000970 time 2.4979 (2.2226) loss 4.2976 (4.0891) grad_norm 1.1364 (1.1633) [2022-01-18 13:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][700/1251] eta 0:20:23 lr 0.000970 time 1.6301 (2.2207) loss 4.7513 (4.0924) grad_norm 1.4391 (1.1627) [2022-01-18 13:57:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][710/1251] eta 0:20:02 lr 0.000970 time 3.1799 (2.2222) loss 3.5800 (4.0939) grad_norm 1.1902 (1.1618) [2022-01-18 13:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][720/1251] eta 0:19:39 lr 0.000970 time 1.7981 (2.2217) loss 4.2547 (4.0950) grad_norm 1.0091 (1.1625) [2022-01-18 13:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][730/1251] eta 0:19:17 lr 0.000970 time 2.5847 (2.2226) loss 4.4509 (4.0949) grad_norm 1.3620 (1.1626) [2022-01-18 13:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][740/1251] eta 0:18:54 lr 0.000970 time 1.6804 (2.2203) loss 3.2620 (4.0927) grad_norm 1.0326 (1.1611) [2022-01-18 13:59:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][750/1251] eta 0:18:31 lr 0.000970 time 
2.8679 (2.2179) loss 4.6534 (4.0941) grad_norm 1.1087 (1.1605) [2022-01-18 13:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][760/1251] eta 0:18:08 lr 0.000970 time 1.9227 (2.2159) loss 4.3914 (4.0971) grad_norm 1.2474 (1.1625) [2022-01-18 13:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][770/1251] eta 0:17:45 lr 0.000970 time 2.6852 (2.2151) loss 4.7164 (4.0987) grad_norm 1.0193 (1.1612) [2022-01-18 14:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][780/1251] eta 0:17:22 lr 0.000970 time 1.8514 (2.2142) loss 3.4912 (4.0987) grad_norm 1.0866 (1.1605) [2022-01-18 14:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][790/1251] eta 0:17:01 lr 0.000970 time 3.0733 (2.2157) loss 3.2864 (4.1008) grad_norm 1.0619 (1.1605) [2022-01-18 14:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][800/1251] eta 0:16:40 lr 0.000970 time 1.9323 (2.2174) loss 3.2144 (4.0982) grad_norm 0.9280 (1.1604) [2022-01-18 14:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][810/1251] eta 0:16:17 lr 0.000970 time 2.2414 (2.2170) loss 4.9665 (4.0968) grad_norm 1.1388 (1.1615) [2022-01-18 14:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][820/1251] eta 0:15:55 lr 0.000970 time 1.5793 (2.2163) loss 4.4410 (4.0974) grad_norm 1.0337 (1.1615) [2022-01-18 14:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][830/1251] eta 0:15:32 lr 0.000970 time 2.2740 (2.2152) loss 3.8140 (4.0965) grad_norm 1.3575 (1.1613) [2022-01-18 14:02:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][840/1251] eta 0:15:10 lr 0.000970 time 1.7390 (2.2165) loss 3.9848 (4.0994) grad_norm 1.0361 (1.1601) [2022-01-18 14:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][850/1251] eta 0:14:48 lr 0.000970 time 1.8265 (2.2157) loss 4.0380 (4.0988) grad_norm 1.0741 (1.1594) [2022-01-18 14:03:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][860/1251] eta 0:14:26 lr 0.000970 time 1.9748 (2.2157) loss 4.5871 (4.1006) grad_norm 1.3238 (1.1595) [2022-01-18 14:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][870/1251] eta 0:14:03 lr 0.000970 time 1.6050 (2.2137) loss 3.7943 (4.0980) grad_norm 1.4368 (1.1590) [2022-01-18 14:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][880/1251] eta 0:13:42 lr 0.000969 time 1.8864 (2.2161) loss 3.7587 (4.0969) grad_norm 1.2949 (1.1601) [2022-01-18 14:04:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][890/1251] eta 0:13:20 lr 0.000969 time 2.4217 (2.2162) loss 4.7363 (4.0987) grad_norm 1.2452 (1.1611) [2022-01-18 14:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][900/1251] eta 0:12:58 lr 0.000969 time 1.9099 (2.2166) loss 3.3638 (4.1026) grad_norm 0.9355 (1.1620) [2022-01-18 14:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][910/1251] eta 0:12:35 lr 0.000969 time 1.9476 (2.2161) loss 4.5823 (4.1042) grad_norm 0.9284 (1.1614) [2022-01-18 14:05:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][920/1251] eta 0:12:13 lr 0.000969 time 1.5926 (2.2155) loss 4.0003 (4.1062) grad_norm 1.2941 (1.1607) [2022-01-18 14:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][930/1251] eta 0:11:50 lr 0.000969 time 1.8660 (2.2133) loss 4.5756 (4.1055) grad_norm 1.2381 (1.1605) [2022-01-18 14:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][940/1251] eta 0:11:28 lr 0.000969 time 2.4222 (2.2124) loss 4.4024 (4.1057) grad_norm 1.0286 (1.1602) [2022-01-18 14:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][950/1251] eta 0:11:05 lr 0.000969 time 2.1900 (2.2121) loss 4.8895 (4.1065) grad_norm 0.9733 (1.1599) [2022-01-18 14:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][960/1251] eta 0:10:43 lr 0.000969 time 
1.7009 (2.2114) loss 3.4928 (4.1054) grad_norm 0.9265 (1.1590) [2022-01-18 14:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][970/1251] eta 0:10:21 lr 0.000969 time 1.8729 (2.2111) loss 4.7499 (4.1069) grad_norm 1.0222 (1.1590) [2022-01-18 14:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][980/1251] eta 0:09:59 lr 0.000969 time 2.2406 (2.2109) loss 3.3798 (4.1083) grad_norm 0.9584 (1.1572) [2022-01-18 14:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][990/1251] eta 0:09:36 lr 0.000969 time 2.2428 (2.2106) loss 4.0324 (4.1049) grad_norm 1.2809 (1.1560) [2022-01-18 14:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1000/1251] eta 0:09:15 lr 0.000969 time 2.5376 (2.2122) loss 4.2953 (4.1058) grad_norm 1.1634 (1.1559) [2022-01-18 14:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1010/1251] eta 0:08:53 lr 0.000969 time 2.5142 (2.2132) loss 3.4361 (4.1030) grad_norm 0.9559 (1.1566) [2022-01-18 14:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1020/1251] eta 0:08:30 lr 0.000969 time 1.7111 (2.2116) loss 3.7812 (4.1050) grad_norm 1.1330 (1.1571) [2022-01-18 14:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1030/1251] eta 0:08:08 lr 0.000969 time 2.4875 (2.2110) loss 4.0483 (4.1062) grad_norm 1.1285 (1.1569) [2022-01-18 14:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1040/1251] eta 0:07:46 lr 0.000969 time 1.5986 (2.2101) loss 4.4711 (4.1083) grad_norm 1.6228 (1.1572) [2022-01-18 14:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1050/1251] eta 0:07:24 lr 0.000969 time 2.2131 (2.2101) loss 4.3498 (4.1102) grad_norm 1.0878 (1.1572) [2022-01-18 14:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1060/1251] eta 0:07:01 lr 0.000969 time 2.2102 (2.2092) loss 4.2809 (4.1115) grad_norm 1.1050 (1.1576) [2022-01-18 14:10:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1070/1251] eta 0:06:39 lr 0.000969 time 2.5061 (2.2095) loss 3.6791 (4.1090) grad_norm 1.0265 (1.1574) [2022-01-18 14:11:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1080/1251] eta 0:06:17 lr 0.000969 time 1.7009 (2.2078) loss 4.0615 (4.1091) grad_norm 1.0496 (1.1577) [2022-01-18 14:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1090/1251] eta 0:05:55 lr 0.000969 time 1.9621 (2.2083) loss 3.8030 (4.1087) grad_norm 0.9470 (1.1571) [2022-01-18 14:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1100/1251] eta 0:05:33 lr 0.000969 time 2.0661 (2.2100) loss 4.0583 (4.1066) grad_norm 0.9978 (1.1566) [2022-01-18 14:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1110/1251] eta 0:05:11 lr 0.000969 time 2.1014 (2.2100) loss 4.2636 (4.1073) grad_norm 0.9090 (1.1556) [2022-01-18 14:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1120/1251] eta 0:04:49 lr 0.000969 time 1.7677 (2.2089) loss 4.8289 (4.1081) grad_norm 0.9811 (1.1545) [2022-01-18 14:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1130/1251] eta 0:04:27 lr 0.000969 time 2.5323 (2.2087) loss 2.9514 (4.1086) grad_norm 0.9944 (1.1543) [2022-01-18 14:13:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1140/1251] eta 0:04:05 lr 0.000969 time 2.2655 (2.2094) loss 3.8009 (4.1073) grad_norm 0.8813 (1.1534) [2022-01-18 14:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1150/1251] eta 0:03:43 lr 0.000969 time 1.9258 (2.2091) loss 2.9052 (4.1081) grad_norm 1.0580 (1.1536) [2022-01-18 14:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1160/1251] eta 0:03:20 lr 0.000969 time 2.0508 (2.2081) loss 4.4285 (4.1077) grad_norm 1.1769 (1.1534) [2022-01-18 14:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1170/1251] eta 0:02:58 lr 
0.000969 time 2.4190 (2.2080) loss 3.6812 (4.1082) grad_norm 1.0747 (1.1529) [2022-01-18 14:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1180/1251] eta 0:02:36 lr 0.000969 time 2.4292 (2.2079) loss 4.1742 (4.1087) grad_norm 1.2176 (1.1526) [2022-01-18 14:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1190/1251] eta 0:02:14 lr 0.000969 time 1.5578 (2.2073) loss 4.8978 (4.1088) grad_norm 1.0381 (1.1524) [2022-01-18 14:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1200/1251] eta 0:01:52 lr 0.000969 time 2.0104 (2.2065) loss 4.6711 (4.1094) grad_norm 1.0204 (1.1526) [2022-01-18 14:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1210/1251] eta 0:01:30 lr 0.000969 time 2.1710 (2.2051) loss 3.2414 (4.1082) grad_norm 1.1962 (1.1523) [2022-01-18 14:16:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1220/1251] eta 0:01:08 lr 0.000969 time 1.5436 (2.2038) loss 5.0274 (4.1121) grad_norm 1.1956 (1.1516) [2022-01-18 14:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1230/1251] eta 0:00:46 lr 0.000969 time 2.3315 (2.2034) loss 3.6423 (4.1124) grad_norm 0.9143 (1.1506) [2022-01-18 14:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1240/1251] eta 0:00:24 lr 0.000969 time 2.2928 (2.2025) loss 3.6472 (4.1128) grad_norm 0.9138 (1.1499) [2022-01-18 14:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [33/300][1250/1251] eta 0:00:02 lr 0.000969 time 1.1667 (2.1973) loss 4.1765 (4.1119) grad_norm 1.0041 (1.1497) [2022-01-18 14:17:16 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 33 training takes 0:45:49 [2022-01-18 14:17:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.018 (20.018) Loss 1.4483 (1.4483) Acc@1 66.504 (66.504) Acc@5 87.402 (87.402) [2022-01-18 14:17:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.314 (3.353) Loss 1.3978 (1.4313) 
Acc@1 69.629 (67.525) Acc@5 89.258 (88.299) [2022-01-18 14:18:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.335 (2.666) Loss 1.4040 (1.4347) Acc@1 66.992 (67.192) Acc@5 88.574 (88.188) [2022-01-18 14:18:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.279 (2.291) Loss 1.4250 (1.4318) Acc@1 66.211 (67.080) Acc@5 88.867 (88.259) [2022-01-18 14:18:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.418 (2.209) Loss 1.4182 (1.4330) Acc@1 67.188 (67.023) Acc@5 88.477 (88.269) [2022-01-18 14:18:54 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 67.000 Acc@5 88.266 [2022-01-18 14:18:54 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 67.0% [2022-01-18 14:18:54 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 67.00% [2022-01-18 14:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][0/1251] eta 7:29:11 lr 0.000969 time 21.5442 (21.5442) loss 4.4437 (4.4437) grad_norm 1.1077 (1.1077) [2022-01-18 14:19:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][10/1251] eta 1:25:10 lr 0.000969 time 1.9356 (4.1182) loss 4.5749 (4.1090) grad_norm 1.1293 (1.1381) [2022-01-18 14:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][20/1251] eta 1:05:55 lr 0.000969 time 2.1413 (3.2133) loss 4.3376 (4.0567) grad_norm 1.0253 (1.1281) [2022-01-18 14:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][30/1251] eta 0:57:37 lr 0.000969 time 1.4439 (2.8319) loss 4.1563 (3.9053) grad_norm 1.2623 (1.1712) [2022-01-18 14:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][40/1251] eta 0:54:44 lr 0.000969 time 4.0611 (2.7126) loss 4.3681 (4.0036) grad_norm 1.0477 (1.1723) [2022-01-18 14:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][50/1251] eta 0:53:32 lr 0.000969 time 3.4268 (2.6752) loss 4.3959 (4.0212) grad_norm 1.0847 (1.1590) 
[2022-01-18 14:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][60/1251] eta 0:51:35 lr 0.000969 time 1.6941 (2.5993) loss 4.7158 (4.0356) grad_norm 0.9910 (1.1670) [2022-01-18 14:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][70/1251] eta 0:49:41 lr 0.000969 time 1.9471 (2.5245) loss 3.4961 (3.9943) grad_norm 0.9992 (1.1709) [2022-01-18 14:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][80/1251] eta 0:48:47 lr 0.000969 time 3.3426 (2.5002) loss 3.9488 (3.9841) grad_norm 1.0910 (1.1831) [2022-01-18 14:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][90/1251] eta 0:47:54 lr 0.000969 time 1.5594 (2.4763) loss 3.1670 (3.9878) grad_norm 1.6252 (1.1756) [2022-01-18 14:22:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][100/1251] eta 0:46:36 lr 0.000969 time 1.8175 (2.4300) loss 4.4482 (4.0181) grad_norm 1.4283 (1.1722) [2022-01-18 14:23:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][110/1251] eta 0:45:43 lr 0.000969 time 2.0548 (2.4045) loss 4.7434 (4.0394) grad_norm 0.9534 (1.1783) [2022-01-18 14:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][120/1251] eta 0:45:14 lr 0.000969 time 3.1906 (2.3998) loss 2.8424 (4.0509) grad_norm 1.3040 (1.1751) [2022-01-18 14:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][130/1251] eta 0:44:39 lr 0.000969 time 1.8965 (2.3907) loss 2.9006 (4.0389) grad_norm 1.1054 (1.1691) [2022-01-18 14:24:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][140/1251] eta 0:43:50 lr 0.000969 time 2.0499 (2.3677) loss 4.5838 (4.0406) grad_norm 1.3099 (1.1665) [2022-01-18 14:24:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][150/1251] eta 0:43:02 lr 0.000969 time 1.9692 (2.3460) loss 4.7617 (4.0351) grad_norm 1.0166 (1.1646) [2022-01-18 14:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][160/1251] eta 0:42:21 lr 
0.000969 time 2.3581 (2.3298) loss 2.8184 (4.0182) grad_norm 1.1384 (1.1627) [2022-01-18 14:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][170/1251] eta 0:41:47 lr 0.000969 time 2.4168 (2.3194) loss 3.7054 (4.0232) grad_norm 1.1949 (1.1599) [2022-01-18 14:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][180/1251] eta 0:41:14 lr 0.000969 time 2.1589 (2.3100) loss 4.8543 (4.0299) grad_norm 1.1450 (1.1601) [2022-01-18 14:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][190/1251] eta 0:40:44 lr 0.000969 time 2.1769 (2.3036) loss 3.9575 (4.0243) grad_norm 0.9899 (1.1561) [2022-01-18 14:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][200/1251] eta 0:40:13 lr 0.000969 time 1.9368 (2.2967) loss 4.4732 (4.0459) grad_norm 1.1604 (1.1530) [2022-01-18 14:26:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][210/1251] eta 0:39:51 lr 0.000969 time 1.9006 (2.2973) loss 4.7443 (4.0579) grad_norm 1.1751 (1.1506) [2022-01-18 14:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][220/1251] eta 0:39:27 lr 0.000969 time 1.6397 (2.2961) loss 2.7103 (4.0520) grad_norm 1.1290 (1.1494) [2022-01-18 14:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][230/1251] eta 0:38:56 lr 0.000969 time 2.0442 (2.2882) loss 4.4008 (4.0552) grad_norm 1.3001 (1.1551) [2022-01-18 14:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][240/1251] eta 0:38:32 lr 0.000969 time 1.6148 (2.2869) loss 4.3033 (4.0546) grad_norm 0.8886 (1.1545) [2022-01-18 14:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][250/1251] eta 0:38:03 lr 0.000969 time 1.8834 (2.2815) loss 4.6199 (4.0443) grad_norm 1.3534 (1.1555) [2022-01-18 14:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][260/1251] eta 0:37:32 lr 0.000969 time 1.6107 (2.2733) loss 4.0451 (4.0484) grad_norm 1.2382 (1.1580) [2022-01-18 14:29:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][270/1251] eta 0:37:02 lr 0.000969 time 1.8651 (2.2651) loss 4.4684 (4.0590) grad_norm 1.0129 (1.1589) [2022-01-18 14:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][280/1251] eta 0:36:36 lr 0.000969 time 1.9038 (2.2622) loss 3.3006 (4.0548) grad_norm 1.4391 (1.1594) [2022-01-18 14:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][290/1251] eta 0:36:20 lr 0.000969 time 2.0595 (2.2686) loss 4.3184 (4.0642) grad_norm 1.2540 (1.1588) [2022-01-18 14:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][300/1251] eta 0:35:54 lr 0.000969 time 2.4423 (2.2658) loss 4.1584 (4.0606) grad_norm 1.2517 (1.1577) [2022-01-18 14:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][310/1251] eta 0:35:30 lr 0.000969 time 2.2765 (2.2641) loss 4.3172 (4.0624) grad_norm 0.8933 (1.1550) [2022-01-18 14:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][320/1251] eta 0:35:04 lr 0.000968 time 2.2139 (2.2606) loss 4.6616 (4.0619) grad_norm 1.2637 (1.1551) [2022-01-18 14:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][330/1251] eta 0:34:39 lr 0.000968 time 1.5945 (2.2580) loss 3.8271 (4.0693) grad_norm 0.9611 (1.1585) [2022-01-18 14:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][340/1251] eta 0:34:10 lr 0.000968 time 1.8112 (2.2505) loss 4.7886 (4.0746) grad_norm 1.2635 (1.1571) [2022-01-18 14:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][350/1251] eta 0:33:45 lr 0.000968 time 2.5519 (2.2482) loss 4.3516 (4.0731) grad_norm 1.0096 (1.1590) [2022-01-18 14:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][360/1251] eta 0:33:21 lr 0.000968 time 1.8268 (2.2460) loss 3.1544 (4.0647) grad_norm 1.1363 (1.1592) [2022-01-18 14:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][370/1251] eta 0:33:00 lr 0.000968 time 
1.9357 (2.2484) loss 5.1382 (4.0709) grad_norm 1.1290 (1.1581) [2022-01-18 14:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][380/1251] eta 0:32:37 lr 0.000968 time 2.2713 (2.2473) loss 4.8226 (4.0751) grad_norm 0.9985 (1.1560) [2022-01-18 14:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][390/1251] eta 0:32:12 lr 0.000968 time 2.5484 (2.2444) loss 4.9485 (4.0692) grad_norm 1.1135 (1.1608) [2022-01-18 14:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][400/1251] eta 0:31:47 lr 0.000968 time 2.1672 (2.2419) loss 3.4273 (4.0656) grad_norm 1.0210 (1.1613) [2022-01-18 14:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][410/1251] eta 0:31:27 lr 0.000968 time 2.4699 (2.2444) loss 3.1738 (4.0590) grad_norm 1.2323 (1.1616) [2022-01-18 14:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][420/1251] eta 0:31:00 lr 0.000968 time 1.9229 (2.2394) loss 4.3837 (4.0593) grad_norm 0.9252 (1.1606) [2022-01-18 14:34:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][430/1251] eta 0:30:35 lr 0.000968 time 1.9186 (2.2351) loss 3.8884 (4.0565) grad_norm 0.9767 (1.1590) [2022-01-18 14:35:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][440/1251] eta 0:30:08 lr 0.000968 time 2.1087 (2.2304) loss 2.7442 (4.0577) grad_norm 1.3938 (1.1576) [2022-01-18 14:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][450/1251] eta 0:29:45 lr 0.000968 time 2.4114 (2.2290) loss 3.6458 (4.0605) grad_norm 1.1021 (1.1579) [2022-01-18 14:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][460/1251] eta 0:29:22 lr 0.000968 time 2.1203 (2.2285) loss 2.9224 (4.0542) grad_norm 1.1211 (1.1587) [2022-01-18 14:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][470/1251] eta 0:28:59 lr 0.000968 time 1.9183 (2.2268) loss 3.6056 (4.0588) grad_norm 1.1031 (1.1582) [2022-01-18 14:36:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][480/1251] eta 0:28:36 lr 0.000968 time 1.8587 (2.2258) loss 4.2389 (4.0612) grad_norm 1.0530 (1.1582) [2022-01-18 14:37:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][490/1251] eta 0:28:13 lr 0.000968 time 2.5731 (2.2248) loss 4.4845 (4.0587) grad_norm 1.1399 (1.1572) [2022-01-18 14:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][500/1251] eta 0:27:48 lr 0.000968 time 2.2594 (2.2220) loss 4.3884 (4.0640) grad_norm 1.0300 (1.1555) [2022-01-18 14:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][510/1251] eta 0:27:24 lr 0.000968 time 1.9224 (2.2191) loss 4.4524 (4.0659) grad_norm 1.3363 (1.1554) [2022-01-18 14:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][520/1251] eta 0:27:00 lr 0.000968 time 1.6325 (2.2163) loss 4.9637 (4.0661) grad_norm 1.1862 (1.1536) [2022-01-18 14:38:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][530/1251] eta 0:26:40 lr 0.000968 time 3.4910 (2.2200) loss 4.6855 (4.0690) grad_norm 1.0390 (1.1533) [2022-01-18 14:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][540/1251] eta 0:26:21 lr 0.000968 time 2.5276 (2.2236) loss 3.4419 (4.0701) grad_norm 1.1024 (1.1542) [2022-01-18 14:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][550/1251] eta 0:26:00 lr 0.000968 time 2.7119 (2.2255) loss 4.9899 (4.0770) grad_norm 1.1407 (1.1543) [2022-01-18 14:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][560/1251] eta 0:25:37 lr 0.000968 time 1.5950 (2.2251) loss 3.9006 (4.0809) grad_norm 0.9670 (1.1545) [2022-01-18 14:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][570/1251] eta 0:25:12 lr 0.000968 time 1.8854 (2.2206) loss 4.5114 (4.0812) grad_norm 1.0494 (1.1537) [2022-01-18 14:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][580/1251] eta 0:24:47 lr 0.000968 time 
2.0699 (2.2168) loss 4.1378 (4.0838) grad_norm 1.3970 (1.1517) [2022-01-18 14:40:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][590/1251] eta 0:24:24 lr 0.000968 time 1.7024 (2.2160) loss 3.0635 (4.0841) grad_norm 1.2862 (1.1514) [2022-01-18 14:41:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][600/1251] eta 0:24:02 lr 0.000968 time 2.2945 (2.2151) loss 3.7276 (4.0777) grad_norm 1.0267 (1.1513) [2022-01-18 14:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][610/1251] eta 0:23:39 lr 0.000968 time 1.8425 (2.2145) loss 4.1989 (4.0754) grad_norm 1.0640 (1.1516) [2022-01-18 14:41:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][620/1251] eta 0:23:17 lr 0.000968 time 2.0348 (2.2154) loss 4.6902 (4.0739) grad_norm 1.0766 (1.1526) [2022-01-18 14:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][630/1251] eta 0:22:57 lr 0.000968 time 1.9013 (2.2175) loss 2.9035 (4.0730) grad_norm 1.2393 (1.1522) [2022-01-18 14:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][640/1251] eta 0:22:36 lr 0.000968 time 2.1757 (2.2200) loss 4.2185 (4.0762) grad_norm 0.9970 (1.1513) [2022-01-18 14:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][650/1251] eta 0:22:15 lr 0.000968 time 2.0881 (2.2221) loss 4.3385 (4.0813) grad_norm 0.9573 (1.1499) [2022-01-18 14:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][660/1251] eta 0:21:51 lr 0.000968 time 1.6365 (2.2185) loss 4.0673 (4.0795) grad_norm 1.0273 (1.1494) [2022-01-18 14:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][670/1251] eta 0:21:27 lr 0.000968 time 1.9590 (2.2157) loss 3.8514 (4.0811) grad_norm 1.1501 (1.1475) [2022-01-18 14:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][680/1251] eta 0:21:04 lr 0.000968 time 1.9381 (2.2144) loss 3.8198 (4.0846) grad_norm 1.3625 (1.1469) [2022-01-18 14:44:25 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][690/1251] eta 0:20:42 lr 0.000968 time 2.7618 (2.2156) loss 4.5189 (4.0854) grad_norm 1.1353 (1.1473) [2022-01-18 14:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][700/1251] eta 0:20:22 lr 0.000968 time 1.7003 (2.2178) loss 4.4748 (4.0880) grad_norm 1.1873 (1.1470) [2022-01-18 14:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][710/1251] eta 0:19:58 lr 0.000968 time 2.2435 (2.2162) loss 4.9108 (4.0923) grad_norm 0.9534 (1.1475) [2022-01-18 14:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][720/1251] eta 0:19:35 lr 0.000968 time 1.7269 (2.2134) loss 4.3292 (4.0959) grad_norm 0.9641 (1.1463) [2022-01-18 14:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][730/1251] eta 0:19:11 lr 0.000968 time 1.4887 (2.2104) loss 3.6471 (4.0956) grad_norm 1.3665 (1.1466) [2022-01-18 14:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][740/1251] eta 0:18:48 lr 0.000968 time 1.8830 (2.2083) loss 4.4310 (4.0980) grad_norm 1.2487 (1.1468) [2022-01-18 14:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][750/1251] eta 0:18:26 lr 0.000968 time 2.2026 (2.2088) loss 3.9405 (4.1004) grad_norm 1.2649 (1.1466) [2022-01-18 14:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][760/1251] eta 0:18:04 lr 0.000968 time 1.8478 (2.2096) loss 2.9472 (4.1031) grad_norm 1.1556 (1.1455) [2022-01-18 14:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][770/1251] eta 0:17:42 lr 0.000968 time 2.1163 (2.2097) loss 3.4323 (4.0976) grad_norm 1.1918 (1.1455) [2022-01-18 14:47:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][780/1251] eta 0:17:21 lr 0.000968 time 2.2941 (2.2105) loss 4.3683 (4.1005) grad_norm 0.9128 (1.1444) [2022-01-18 14:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][790/1251] eta 0:16:58 lr 0.000968 time 
1.8847 (2.2098) loss 4.4222 (4.0992) grad_norm 1.1098 (1.1442) [2022-01-18 14:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][800/1251] eta 0:16:36 lr 0.000968 time 2.3694 (2.2087) loss 4.0404 (4.1003) grad_norm 0.9144 (1.1450) [2022-01-18 14:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][810/1251] eta 0:16:13 lr 0.000968 time 1.9263 (2.2076) loss 4.8337 (4.1026) grad_norm 1.3547 (1.1451) [2022-01-18 14:49:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][820/1251] eta 0:15:51 lr 0.000968 time 2.7189 (2.2088) loss 3.2240 (4.1012) grad_norm 1.1837 (1.1454) [2022-01-18 14:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][830/1251] eta 0:15:28 lr 0.000968 time 1.8720 (2.2065) loss 4.8201 (4.1002) grad_norm 1.0236 (1.1435) [2022-01-18 14:49:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][840/1251] eta 0:15:06 lr 0.000968 time 2.6189 (2.2049) loss 3.5286 (4.1022) grad_norm 0.9679 (1.1424) [2022-01-18 14:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][850/1251] eta 0:14:43 lr 0.000968 time 1.8781 (2.2037) loss 5.0520 (4.1046) grad_norm 1.2624 (1.1420) [2022-01-18 14:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][860/1251] eta 0:14:21 lr 0.000968 time 3.0227 (2.2037) loss 4.1260 (4.1040) grad_norm 1.0294 (1.1413) [2022-01-18 14:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][870/1251] eta 0:13:59 lr 0.000968 time 2.2276 (2.2032) loss 4.4691 (4.1057) grad_norm 0.9981 (1.1411) [2022-01-18 14:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][880/1251] eta 0:13:37 lr 0.000968 time 2.4164 (2.2035) loss 4.2943 (4.1061) grad_norm 1.1536 (1.1423) [2022-01-18 14:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][890/1251] eta 0:13:15 lr 0.000968 time 1.8452 (2.2039) loss 4.9021 (4.1081) grad_norm 0.9081 (1.1426) [2022-01-18 14:52:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][900/1251] eta 0:12:53 lr 0.000968 time 3.4449 (2.2045) loss 4.2754 (4.1094) grad_norm 1.2020 (1.1440) [2022-01-18 14:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][910/1251] eta 0:12:31 lr 0.000968 time 1.5747 (2.2035) loss 4.0420 (4.1113) grad_norm 1.0633 (1.1446) [2022-01-18 14:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][920/1251] eta 0:12:09 lr 0.000968 time 2.3348 (2.2039) loss 3.5213 (4.1127) grad_norm 0.9275 (1.1440) [2022-01-18 14:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][930/1251] eta 0:11:46 lr 0.000968 time 1.9587 (2.2021) loss 3.0242 (4.1126) grad_norm 1.2654 (1.1434) [2022-01-18 14:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][940/1251] eta 0:11:24 lr 0.000968 time 2.7956 (2.2022) loss 4.9151 (4.1114) grad_norm 1.2615 (1.1443) [2022-01-18 14:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][950/1251] eta 0:11:02 lr 0.000968 time 1.8237 (2.2009) loss 4.2304 (4.1121) grad_norm 1.2008 (1.1449) [2022-01-18 14:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][960/1251] eta 0:10:40 lr 0.000968 time 2.7793 (2.2011) loss 3.2440 (4.1083) grad_norm 1.0273 (1.1440) [2022-01-18 14:54:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][970/1251] eta 0:10:18 lr 0.000968 time 1.8971 (2.2004) loss 3.4794 (4.1079) grad_norm 1.1603 (1.1437) [2022-01-18 14:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][980/1251] eta 0:09:56 lr 0.000968 time 3.1933 (2.2005) loss 3.4750 (4.1087) grad_norm 0.9646 (1.1430) [2022-01-18 14:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][990/1251] eta 0:09:34 lr 0.000968 time 2.6564 (2.2011) loss 4.5625 (4.1093) grad_norm 1.2816 (1.1424) [2022-01-18 14:55:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1000/1251] eta 0:09:12 lr 0.000967 time 
2.1824 (2.2007) loss 4.4134 (4.1067) grad_norm 1.1742 (1.1430) [2022-01-18 14:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1010/1251] eta 0:08:50 lr 0.000967 time 2.1238 (2.2006) loss 4.1426 (4.1069) grad_norm 0.9987 (1.1429) [2022-01-18 14:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1020/1251] eta 0:08:28 lr 0.000967 time 3.9936 (2.2019) loss 4.2788 (4.1085) grad_norm 0.9401 (1.1420) [2022-01-18 14:56:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1030/1251] eta 0:08:06 lr 0.000967 time 1.8705 (2.2014) loss 4.3441 (4.1075) grad_norm 1.8210 (1.1425) [2022-01-18 14:57:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1040/1251] eta 0:07:44 lr 0.000967 time 1.9314 (2.2011) loss 3.3892 (4.1074) grad_norm 1.8632 (1.1433) [2022-01-18 14:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1050/1251] eta 0:07:22 lr 0.000967 time 1.8933 (2.2008) loss 3.1783 (4.1054) grad_norm 1.1687 (1.1429) [2022-01-18 14:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1060/1251] eta 0:07:00 lr 0.000967 time 2.6334 (2.1999) loss 4.8973 (4.1062) grad_norm 1.0926 (1.1425) [2022-01-18 14:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1070/1251] eta 0:06:37 lr 0.000967 time 2.1828 (2.1981) loss 3.3673 (4.1087) grad_norm 1.1987 (1.1422) [2022-01-18 14:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1080/1251] eta 0:06:15 lr 0.000967 time 1.9049 (2.1971) loss 3.7000 (4.1078) grad_norm 1.3752 (1.1421) [2022-01-18 14:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1090/1251] eta 0:05:53 lr 0.000967 time 2.0279 (2.1962) loss 3.5391 (4.1052) grad_norm 1.1832 (1.1418) [2022-01-18 14:59:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1100/1251] eta 0:05:32 lr 0.000967 time 4.2534 (2.2000) loss 3.7334 (4.1067) grad_norm 1.1159 (1.1414) [2022-01-18 14:59:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1110/1251] eta 0:05:10 lr 0.000967 time 2.3108 (2.2006) loss 4.3517 (4.1053) grad_norm 1.0261 (1.1409) [2022-01-18 15:00:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1120/1251] eta 0:04:48 lr 0.000967 time 2.4877 (2.1999) loss 5.0666 (4.1071) grad_norm 1.1192 (1.1402) [2022-01-18 15:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1130/1251] eta 0:04:26 lr 0.000967 time 2.2705 (2.1994) loss 3.8788 (4.1074) grad_norm 1.0936 (1.1400) [2022-01-18 15:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1140/1251] eta 0:04:04 lr 0.000967 time 3.2230 (2.1989) loss 3.6315 (4.1079) grad_norm 1.3475 (1.1405) [2022-01-18 15:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1150/1251] eta 0:03:41 lr 0.000967 time 1.8671 (2.1978) loss 4.3595 (4.1096) grad_norm 1.3613 (1.1404) [2022-01-18 15:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1160/1251] eta 0:03:19 lr 0.000967 time 2.0693 (2.1966) loss 4.1061 (4.1110) grad_norm 1.2956 (1.1405) [2022-01-18 15:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1170/1251] eta 0:02:57 lr 0.000967 time 1.9334 (2.1968) loss 3.4038 (4.1092) grad_norm 1.1911 (1.1405) [2022-01-18 15:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1180/1251] eta 0:02:36 lr 0.000967 time 3.3812 (2.1973) loss 4.6191 (4.1100) grad_norm 1.3053 (1.1410) [2022-01-18 15:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1190/1251] eta 0:02:13 lr 0.000967 time 1.7102 (2.1964) loss 2.9634 (4.1111) grad_norm 1.0924 (1.1402) [2022-01-18 15:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1200/1251] eta 0:01:52 lr 0.000967 time 2.3524 (2.1967) loss 4.7926 (4.1078) grad_norm 1.0112 (1.1407) [2022-01-18 15:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1210/1251] eta 0:01:30 lr 
0.000967 time 1.7511 (2.1970) loss 3.6982 (4.1101) grad_norm 1.1649 (1.1407) [2022-01-18 15:03:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1220/1251] eta 0:01:08 lr 0.000967 time 3.4057 (2.1976) loss 3.8588 (4.1107) grad_norm 1.0784 (1.1401) [2022-01-18 15:03:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1230/1251] eta 0:00:46 lr 0.000967 time 2.1890 (2.1965) loss 4.2984 (4.1098) grad_norm 1.0634 (1.1399) [2022-01-18 15:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1240/1251] eta 0:00:24 lr 0.000967 time 1.2258 (2.1949) loss 3.3808 (4.1069) grad_norm 1.1026 (1.1404) [2022-01-18 15:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [34/300][1250/1251] eta 0:00:02 lr 0.000967 time 1.1720 (2.1895) loss 4.3056 (4.1073) grad_norm 1.3234 (1.1405) [2022-01-18 15:04:33 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 34 training takes 0:45:39 [2022-01-18 15:04:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.712 (20.712) Loss 1.4278 (1.4278) Acc@1 66.309 (66.309) Acc@5 88.379 (88.379) [2022-01-18 15:05:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.240 (3.345) Loss 1.4910 (1.4172) Acc@1 66.211 (66.886) Acc@5 87.305 (88.015) [2022-01-18 15:05:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.919 (2.480) Loss 1.3642 (1.4152) Acc@1 69.141 (67.025) Acc@5 89.258 (88.188) [2022-01-18 15:05:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.202 (2.279) Loss 1.2798 (1.4250) Acc@1 69.043 (66.954) Acc@5 91.504 (88.158) [2022-01-18 15:06:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.331 (2.141) Loss 1.3480 (1.4242) Acc@1 68.652 (67.056) Acc@5 89.648 (88.088) [2022-01-18 15:06:08 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 67.076 Acc@5 88.094 [2022-01-18 15:06:08 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test 
images: 67.1% [2022-01-18 15:06:08 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 67.08% [2022-01-18 15:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][0/1251] eta 7:20:14 lr 0.000967 time 21.1150 (21.1150) loss 4.3892 (4.3892) grad_norm 1.1641 (1.1641) [2022-01-18 15:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][10/1251] eta 1:21:54 lr 0.000967 time 1.3101 (3.9599) loss 3.5356 (4.0279) grad_norm 1.0431 (1.1050) [2022-01-18 15:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][20/1251] eta 1:01:35 lr 0.000967 time 1.5089 (3.0023) loss 4.3870 (3.8962) grad_norm 0.9263 (1.0726) [2022-01-18 15:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][30/1251] eta 0:55:51 lr 0.000967 time 1.8393 (2.7450) loss 3.8670 (3.9441) grad_norm 0.9591 (1.0790) [2022-01-18 15:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][40/1251] eta 0:53:36 lr 0.000967 time 6.8634 (2.6559) loss 4.2217 (4.0295) grad_norm 0.9718 (1.1049) [2022-01-18 15:08:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][50/1251] eta 0:51:44 lr 0.000967 time 1.5126 (2.5851) loss 3.1882 (4.0174) grad_norm 1.3960 (1.1200) [2022-01-18 15:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][60/1251] eta 0:49:54 lr 0.000967 time 1.8612 (2.5141) loss 4.5507 (4.0430) grad_norm 0.9353 (1.1151) [2022-01-18 15:09:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][70/1251] eta 0:48:26 lr 0.000967 time 2.1720 (2.4612) loss 4.3594 (4.0648) grad_norm 1.1343 (1.1200) [2022-01-18 15:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][80/1251] eta 0:47:47 lr 0.000967 time 4.2877 (2.4485) loss 4.1532 (4.0489) grad_norm 1.0027 (1.1315) [2022-01-18 15:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][90/1251] eta 0:47:01 lr 0.000967 time 2.4036 (2.4299) loss 3.7355 (4.0374) grad_norm 1.4468 (1.1400) [2022-01-18 
15:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][100/1251] eta 0:46:09 lr 0.000967 time 1.8435 (2.4061) loss 3.0104 (4.0414) grad_norm 1.1518 (1.1393) [2022-01-18 15:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][110/1251] eta 0:45:06 lr 0.000967 time 1.9113 (2.3720) loss 3.6987 (4.0195) grad_norm 1.1082 (1.1333) [2022-01-18 15:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][120/1251] eta 0:44:33 lr 0.000967 time 4.2158 (2.3634) loss 4.6762 (4.0398) grad_norm 1.6000 (1.1423) [2022-01-18 15:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][130/1251] eta 0:43:54 lr 0.000967 time 2.4688 (2.3499) loss 3.8977 (4.0440) grad_norm 1.0432 (1.1366) [2022-01-18 15:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][140/1251] eta 0:43:05 lr 0.000967 time 1.7229 (2.3273) loss 3.0906 (4.0552) grad_norm 1.1373 (1.1353) [2022-01-18 15:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][150/1251] eta 0:42:29 lr 0.000967 time 1.8251 (2.3154) loss 4.3244 (4.0719) grad_norm 1.2650 (1.1461) [2022-01-18 15:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][160/1251] eta 0:42:15 lr 0.000967 time 3.5404 (2.3244) loss 3.5158 (4.0624) grad_norm 0.9222 (1.1463) [2022-01-18 15:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][170/1251] eta 0:41:36 lr 0.000967 time 1.7831 (2.3096) loss 4.4607 (4.0654) grad_norm 1.2701 (1.1469) [2022-01-18 15:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][180/1251] eta 0:41:11 lr 0.000967 time 2.0034 (2.3076) loss 4.3373 (4.0633) grad_norm 1.0881 (1.1485) [2022-01-18 15:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][190/1251] eta 0:40:32 lr 0.000967 time 2.1881 (2.2925) loss 4.4519 (4.0699) grad_norm 1.0918 (1.1469) [2022-01-18 15:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][200/1251] eta 0:40:02 lr 0.000967 
time 3.0096 (2.2859) loss 3.0967 (4.0672) grad_norm 0.9812 (1.1459) [2022-01-18 15:14:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][210/1251] eta 0:39:27 lr 0.000967 time 2.4859 (2.2745) loss 4.0983 (4.0758) grad_norm 0.9968 (1.1473) [2022-01-18 15:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][220/1251] eta 0:38:59 lr 0.000967 time 1.9112 (2.2694) loss 4.2279 (4.0835) grad_norm 1.0072 (1.1461) [2022-01-18 15:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][230/1251] eta 0:38:35 lr 0.000967 time 3.0547 (2.2682) loss 4.0865 (4.0741) grad_norm 0.9882 (1.1401) [2022-01-18 15:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][240/1251] eta 0:38:08 lr 0.000967 time 1.8768 (2.2635) loss 4.6357 (4.0706) grad_norm 1.4260 (1.1406) [2022-01-18 15:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][250/1251] eta 0:37:44 lr 0.000967 time 2.7861 (2.2618) loss 4.0244 (4.0806) grad_norm 1.0525 (1.1399) [2022-01-18 15:15:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][260/1251] eta 0:37:14 lr 0.000967 time 2.1836 (2.2548) loss 2.9930 (4.0808) grad_norm 1.0237 (1.1368) [2022-01-18 15:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][270/1251] eta 0:36:50 lr 0.000967 time 2.7510 (2.2532) loss 4.1578 (4.0920) grad_norm 1.0910 (1.1388) [2022-01-18 15:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][280/1251] eta 0:36:31 lr 0.000967 time 2.0148 (2.2565) loss 4.5647 (4.0858) grad_norm 0.9748 (1.1379) [2022-01-18 15:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][290/1251] eta 0:36:05 lr 0.000967 time 1.5673 (2.2531) loss 4.4592 (4.0897) grad_norm 1.2392 (1.1382) [2022-01-18 15:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][300/1251] eta 0:35:37 lr 0.000967 time 1.7903 (2.2477) loss 4.7112 (4.0899) grad_norm 1.2411 (1.1405) [2022-01-18 15:17:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][310/1251] eta 0:35:12 lr 0.000967 time 2.8764 (2.2451) loss 4.1750 (4.0845) grad_norm 1.1880 (1.1417) [2022-01-18 15:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][320/1251] eta 0:34:46 lr 0.000967 time 2.0569 (2.2406) loss 3.2766 (4.0854) grad_norm 1.0282 (1.1390) [2022-01-18 15:18:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][330/1251] eta 0:34:21 lr 0.000967 time 2.0935 (2.2383) loss 3.4067 (4.0988) grad_norm 0.9810 (1.1377) [2022-01-18 15:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][340/1251] eta 0:33:55 lr 0.000967 time 1.9677 (2.2339) loss 4.4201 (4.1073) grad_norm 1.0202 (1.1386) [2022-01-18 15:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][350/1251] eta 0:33:31 lr 0.000967 time 2.6088 (2.2323) loss 3.3323 (4.1043) grad_norm 0.9675 (1.1425) [2022-01-18 15:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][360/1251] eta 0:33:06 lr 0.000967 time 1.8873 (2.2300) loss 4.4272 (4.1037) grad_norm 0.9417 (1.1392) [2022-01-18 15:19:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][370/1251] eta 0:32:44 lr 0.000967 time 2.1495 (2.2302) loss 4.6664 (4.1076) grad_norm 0.9851 (1.1388) [2022-01-18 15:20:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][380/1251] eta 0:32:20 lr 0.000967 time 1.8900 (2.2275) loss 4.6992 (4.1113) grad_norm 1.2523 (1.1407) [2022-01-18 15:20:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][390/1251] eta 0:32:00 lr 0.000967 time 2.0786 (2.2300) loss 4.4990 (4.1132) grad_norm 1.3493 (1.1406) [2022-01-18 15:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][400/1251] eta 0:31:35 lr 0.000967 time 1.8914 (2.2272) loss 4.6950 (4.1154) grad_norm 1.1277 (1.1416) [2022-01-18 15:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][410/1251] eta 0:31:13 lr 0.000967 time 
2.1552 (2.2283) loss 4.9256 (4.1215) grad_norm 0.9835 (1.1420) [2022-01-18 15:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][420/1251] eta 0:30:50 lr 0.000966 time 2.3677 (2.2268) loss 3.9094 (4.1242) grad_norm 1.0046 (1.1386) [2022-01-18 15:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][430/1251] eta 0:30:26 lr 0.000966 time 1.5908 (2.2249) loss 4.1200 (4.1203) grad_norm 1.0020 (1.1353) [2022-01-18 15:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][440/1251] eta 0:30:03 lr 0.000966 time 2.2979 (2.2244) loss 4.6629 (4.1214) grad_norm 1.3384 (1.1354) [2022-01-18 15:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][450/1251] eta 0:29:40 lr 0.000966 time 2.1059 (2.2229) loss 3.8863 (4.1208) grad_norm 0.9542 (1.1326) [2022-01-18 15:23:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][460/1251] eta 0:29:18 lr 0.000966 time 2.1130 (2.2234) loss 4.7117 (4.1219) grad_norm 1.1356 (1.1324) [2022-01-18 15:23:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][470/1251] eta 0:28:55 lr 0.000966 time 2.5666 (2.2216) loss 4.3175 (4.1253) grad_norm 0.9521 (1.1335) [2022-01-18 15:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][480/1251] eta 0:28:32 lr 0.000966 time 2.3680 (2.2206) loss 4.7970 (4.1232) grad_norm 1.0928 (1.1326) [2022-01-18 15:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][490/1251] eta 0:28:07 lr 0.000966 time 1.8487 (2.2176) loss 3.8365 (4.1251) grad_norm 1.0224 (1.1334) [2022-01-18 15:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][500/1251] eta 0:27:45 lr 0.000966 time 2.2613 (2.2182) loss 4.1183 (4.1201) grad_norm 1.1138 (1.1339) [2022-01-18 15:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][510/1251] eta 0:27:22 lr 0.000966 time 2.5605 (2.2168) loss 4.4304 (4.1200) grad_norm 1.3192 (1.1348) [2022-01-18 15:25:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][520/1251] eta 0:26:59 lr 0.000966 time 2.8173 (2.2155) loss 4.4724 (4.1216) grad_norm 1.1729 (1.1363) [2022-01-18 15:25:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][530/1251] eta 0:26:36 lr 0.000966 time 1.8622 (2.2140) loss 3.8627 (4.1215) grad_norm 1.0569 (1.1345) [2022-01-18 15:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][540/1251] eta 0:26:14 lr 0.000966 time 1.8599 (2.2145) loss 4.9327 (4.1198) grad_norm 0.9115 (1.1338) [2022-01-18 15:26:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][550/1251] eta 0:25:52 lr 0.000966 time 2.6729 (2.2149) loss 3.4605 (4.1211) grad_norm 0.9588 (1.1318) [2022-01-18 15:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][560/1251] eta 0:25:28 lr 0.000966 time 2.2386 (2.2118) loss 4.0772 (4.1216) grad_norm 1.0929 (1.1315) [2022-01-18 15:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][570/1251] eta 0:25:05 lr 0.000966 time 1.5657 (2.2112) loss 3.3109 (4.1256) grad_norm 1.2423 (1.1313) [2022-01-18 15:27:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][580/1251] eta 0:24:43 lr 0.000966 time 1.9177 (2.2103) loss 3.6416 (4.1285) grad_norm 0.9855 (1.1314) [2022-01-18 15:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][590/1251] eta 0:24:21 lr 0.000966 time 3.1876 (2.2104) loss 4.7027 (4.1278) grad_norm 1.2485 (1.1325) [2022-01-18 15:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][600/1251] eta 0:23:57 lr 0.000966 time 1.8548 (2.2082) loss 3.8637 (4.1289) grad_norm 1.0935 (1.1332) [2022-01-18 15:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][610/1251] eta 0:23:33 lr 0.000966 time 1.5795 (2.2055) loss 4.2395 (4.1297) grad_norm 0.9945 (1.1332) [2022-01-18 15:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][620/1251] eta 0:23:12 lr 0.000966 time 
2.5729 (2.2074) loss 3.5649 (4.1283) grad_norm 1.4261 (1.1333) [2022-01-18 15:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][630/1251] eta 0:22:51 lr 0.000966 time 2.7684 (2.2090) loss 3.0960 (4.1217) grad_norm 1.2737 (1.1331) [2022-01-18 15:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][640/1251] eta 0:22:28 lr 0.000966 time 1.8915 (2.2067) loss 3.6376 (4.1191) grad_norm 0.9933 (1.1321) [2022-01-18 15:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][650/1251] eta 0:22:06 lr 0.000966 time 1.8061 (2.2068) loss 4.4178 (4.1202) grad_norm 0.9516 (1.1307) [2022-01-18 15:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][660/1251] eta 0:21:45 lr 0.000966 time 3.1135 (2.2091) loss 4.1577 (4.1181) grad_norm 0.9528 (1.1303) [2022-01-18 15:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][670/1251] eta 0:21:24 lr 0.000966 time 2.2052 (2.2115) loss 4.1866 (4.1148) grad_norm 1.0728 (1.1298) [2022-01-18 15:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][680/1251] eta 0:21:01 lr 0.000966 time 1.7268 (2.2096) loss 4.2992 (4.1162) grad_norm 1.1006 (1.1296) [2022-01-18 15:31:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][690/1251] eta 0:20:38 lr 0.000966 time 1.6965 (2.2075) loss 4.3161 (4.1131) grad_norm 1.2606 (1.1289) [2022-01-18 15:31:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][700/1251] eta 0:20:15 lr 0.000966 time 3.0182 (2.2065) loss 4.1986 (4.1109) grad_norm 1.1326 (1.1304) [2022-01-18 15:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][710/1251] eta 0:19:53 lr 0.000966 time 2.5400 (2.2054) loss 4.1629 (4.1128) grad_norm 1.2096 (1.1312) [2022-01-18 15:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][720/1251] eta 0:19:31 lr 0.000966 time 2.3627 (2.2063) loss 4.4619 (4.1116) grad_norm 1.2223 (1.1311) [2022-01-18 15:33:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][730/1251] eta 0:19:09 lr 0.000966 time 1.9053 (2.2068) loss 4.9450 (4.1104) grad_norm 1.0330 (1.1305) [2022-01-18 15:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][740/1251] eta 0:18:48 lr 0.000966 time 2.8620 (2.2079) loss 4.2314 (4.1058) grad_norm 1.1442 (1.1321) [2022-01-18 15:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][750/1251] eta 0:18:25 lr 0.000966 time 1.8684 (2.2073) loss 3.4130 (4.1015) grad_norm 1.0408 (1.1328) [2022-01-18 15:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][760/1251] eta 0:18:03 lr 0.000966 time 1.8923 (2.2065) loss 4.4317 (4.1067) grad_norm 1.0552 (1.1343) [2022-01-18 15:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][770/1251] eta 0:17:39 lr 0.000966 time 1.7500 (2.2020) loss 3.0683 (4.1036) grad_norm 1.1675 (1.1349) [2022-01-18 15:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][780/1251] eta 0:17:16 lr 0.000966 time 1.8824 (2.2008) loss 4.7361 (4.1040) grad_norm 1.1129 (1.1366) [2022-01-18 15:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][790/1251] eta 0:16:53 lr 0.000966 time 2.1149 (2.1989) loss 3.1863 (4.1023) grad_norm 0.8906 (1.1368) [2022-01-18 15:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][800/1251] eta 0:16:32 lr 0.000966 time 2.8865 (2.2006) loss 4.4958 (4.1043) grad_norm 1.0646 (1.1361) [2022-01-18 15:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][810/1251] eta 0:16:10 lr 0.000966 time 2.2423 (2.2016) loss 4.1341 (4.1055) grad_norm 1.0055 (1.1369) [2022-01-18 15:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][820/1251] eta 0:15:49 lr 0.000966 time 2.3554 (2.2033) loss 4.4566 (4.1066) grad_norm 1.2100 (1.1385) [2022-01-18 15:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][830/1251] eta 0:15:27 lr 0.000966 time 
1.9487 (2.2036) loss 3.4031 (4.1068) grad_norm 0.9072 (1.1373) [2022-01-18 15:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][840/1251] eta 0:15:05 lr 0.000966 time 3.1502 (2.2041) loss 4.6651 (4.1075) grad_norm 1.2063 (1.1370) [2022-01-18 15:37:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][850/1251] eta 0:14:43 lr 0.000966 time 1.5595 (2.2020) loss 4.1995 (4.1067) grad_norm 1.0863 (1.1372) [2022-01-18 15:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][860/1251] eta 0:14:21 lr 0.000966 time 2.5765 (2.2021) loss 4.7284 (4.1088) grad_norm 1.0593 (1.1374) [2022-01-18 15:38:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][870/1251] eta 0:13:58 lr 0.000966 time 1.8820 (2.2006) loss 4.6547 (4.1100) grad_norm 1.1037 (1.1369) [2022-01-18 15:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][880/1251] eta 0:13:36 lr 0.000966 time 2.3916 (2.2003) loss 4.9890 (4.1097) grad_norm 0.9923 (1.1380) [2022-01-18 15:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][890/1251] eta 0:13:13 lr 0.000966 time 1.9407 (2.1993) loss 4.0299 (4.1075) grad_norm 0.8748 (1.1380) [2022-01-18 15:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][900/1251] eta 0:12:52 lr 0.000966 time 3.2326 (2.2009) loss 3.3666 (4.1062) grad_norm 1.0522 (1.1375) [2022-01-18 15:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][910/1251] eta 0:12:30 lr 0.000966 time 1.8611 (2.1996) loss 3.8108 (4.1053) grad_norm 0.8443 (1.1369) [2022-01-18 15:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][920/1251] eta 0:12:08 lr 0.000966 time 2.5849 (2.2009) loss 4.6088 (4.1050) grad_norm 1.0482 (1.1361) [2022-01-18 15:40:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][930/1251] eta 0:11:46 lr 0.000966 time 2.3281 (2.2021) loss 4.4185 (4.1066) grad_norm 1.4379 (1.1368) [2022-01-18 15:40:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][940/1251] eta 0:11:24 lr 0.000966 time 3.1128 (2.2025) loss 3.9392 (4.1091) grad_norm 1.0028 (1.1366) [2022-01-18 15:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][950/1251] eta 0:11:02 lr 0.000966 time 1.9804 (2.2005) loss 4.4603 (4.1112) grad_norm 1.1758 (1.1375) [2022-01-18 15:41:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][960/1251] eta 0:10:39 lr 0.000966 time 1.7664 (2.1986) loss 4.2620 (4.1105) grad_norm 0.8925 (1.1374) [2022-01-18 15:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][970/1251] eta 0:10:17 lr 0.000966 time 1.8365 (2.1979) loss 4.2885 (4.1101) grad_norm 1.3288 (1.1372) [2022-01-18 15:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][980/1251] eta 0:09:55 lr 0.000966 time 3.1327 (2.1968) loss 3.9089 (4.1091) grad_norm 0.9960 (1.1375) [2022-01-18 15:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][990/1251] eta 0:09:33 lr 0.000966 time 2.3556 (2.1967) loss 4.8262 (4.1097) grad_norm 1.0123 (1.1376) [2022-01-18 15:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1000/1251] eta 0:09:11 lr 0.000966 time 1.7347 (2.1957) loss 3.9697 (4.1097) grad_norm 1.0087 (1.1372) [2022-01-18 15:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1010/1251] eta 0:08:49 lr 0.000966 time 2.0631 (2.1965) loss 3.9586 (4.1099) grad_norm 0.9964 (1.1375) [2022-01-18 15:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1020/1251] eta 0:08:27 lr 0.000966 time 2.5580 (2.1965) loss 4.3894 (4.1126) grad_norm 1.1595 (1.1378) [2022-01-18 15:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1030/1251] eta 0:08:05 lr 0.000966 time 2.1266 (2.1957) loss 3.7629 (4.1099) grad_norm 1.0595 (1.1375) [2022-01-18 15:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1040/1251] eta 0:07:43 lr 0.000966 time 
1.8069 (2.1949) loss 3.7124 (4.1091) grad_norm 0.9923 (1.1371) [2022-01-18 15:44:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1050/1251] eta 0:07:21 lr 0.000966 time 1.7482 (2.1947) loss 4.0980 (4.1109) grad_norm 1.0975 (1.1368) [2022-01-18 15:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1060/1251] eta 0:06:59 lr 0.000966 time 2.3087 (2.1954) loss 4.0414 (4.1082) grad_norm 1.3374 (1.1372) [2022-01-18 15:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1070/1251] eta 0:06:37 lr 0.000966 time 1.5420 (2.1948) loss 3.8332 (4.1074) grad_norm 1.2570 (1.1366) [2022-01-18 15:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1080/1251] eta 0:06:15 lr 0.000965 time 2.1615 (2.1945) loss 3.2196 (4.1047) grad_norm 1.1364 (1.1354) [2022-01-18 15:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1090/1251] eta 0:05:53 lr 0.000965 time 1.5479 (2.1964) loss 3.9068 (4.1032) grad_norm 1.1628 (1.1353) [2022-01-18 15:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1100/1251] eta 0:05:31 lr 0.000965 time 2.1514 (2.1976) loss 4.7375 (4.1042) grad_norm 1.0969 (1.1356) [2022-01-18 15:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1110/1251] eta 0:05:10 lr 0.000965 time 2.7570 (2.1988) loss 4.4972 (4.1055) grad_norm 0.9398 (1.1361) [2022-01-18 15:47:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1120/1251] eta 0:04:47 lr 0.000965 time 1.6381 (2.1974) loss 3.2344 (4.1037) grad_norm 1.2703 (1.1369) [2022-01-18 15:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1130/1251] eta 0:04:25 lr 0.000965 time 1.5956 (2.1945) loss 3.7103 (4.1039) grad_norm 1.3465 (1.1368) [2022-01-18 15:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1140/1251] eta 0:04:03 lr 0.000965 time 1.9496 (2.1937) loss 3.0554 (4.1027) grad_norm 1.1442 (1.1366) [2022-01-18 15:48:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1150/1251] eta 0:03:41 lr 0.000965 time 2.1132 (2.1935) loss 4.4105 (4.1026) grad_norm 0.9409 (1.1361) [2022-01-18 15:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1160/1251] eta 0:03:19 lr 0.000965 time 2.8342 (2.1926) loss 4.5408 (4.1034) grad_norm 1.1322 (1.1362) [2022-01-18 15:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1170/1251] eta 0:02:57 lr 0.000965 time 1.4932 (2.1915) loss 5.0177 (4.1053) grad_norm 1.2684 (1.1370) [2022-01-18 15:49:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1180/1251] eta 0:02:35 lr 0.000965 time 3.0254 (2.1944) loss 3.3748 (4.1047) grad_norm 0.9921 (1.1370) [2022-01-18 15:49:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1190/1251] eta 0:02:13 lr 0.000965 time 3.0621 (2.1964) loss 3.9663 (4.1059) grad_norm 1.2940 (1.1372) [2022-01-18 15:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1200/1251] eta 0:01:52 lr 0.000965 time 2.2738 (2.1972) loss 3.0481 (4.1038) grad_norm 1.0825 (1.1368) [2022-01-18 15:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1210/1251] eta 0:01:30 lr 0.000965 time 1.6749 (2.1962) loss 4.1596 (4.1039) grad_norm 1.0617 (1.1369) [2022-01-18 15:50:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1220/1251] eta 0:01:08 lr 0.000965 time 1.8222 (2.1948) loss 3.8814 (4.1057) grad_norm 0.9966 (1.1367) [2022-01-18 15:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1230/1251] eta 0:00:46 lr 0.000965 time 1.6555 (2.1931) loss 3.0691 (4.1060) grad_norm 1.0804 (1.1364) [2022-01-18 15:51:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1240/1251] eta 0:00:24 lr 0.000965 time 2.1309 (2.1923) loss 3.5335 (4.1049) grad_norm 1.3182 (1.1364) [2022-01-18 15:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [35/300][1250/1251] eta 0:00:02 lr 
0.000965 time 1.1586 (2.1868) loss 3.9041 (4.1057) grad_norm 1.2291 (1.1365)
[2022-01-18 15:51:44 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 35 training takes 0:45:36
[2022-01-18 15:52:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.578 (18.578) Loss 1.3750 (1.3750) Acc@1 68.848 (68.848) Acc@5 88.379 (88.379)
[2022-01-18 15:52:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.311 (3.397) Loss 1.3830 (1.4063) Acc@1 67.969 (67.685) Acc@5 89.258 (88.521)
[2022-01-18 15:52:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.225 (2.708) Loss 1.3843 (1.4066) Acc@1 68.164 (67.680) Acc@5 88.477 (88.472)
[2022-01-18 15:52:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.622 (2.315) Loss 1.4857 (1.4108) Acc@1 66.016 (67.597) Acc@5 87.402 (88.423)
[2022-01-18 15:53:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.756 (2.216) Loss 1.4371 (1.4111) Acc@1 66.211 (67.492) Acc@5 87.598 (88.415)
[2022-01-18 15:53:22 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 67.500 Acc@5 88.364
[2022-01-18 15:53:22 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 67.5%
[2022-01-18 15:53:22 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 67.50%
[2022-01-18 15:53:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][0/1251] eta 7:26:07 lr 0.000965 time 21.3969 (21.3969) loss 4.4021 (4.4021) grad_norm 1.4995 (1.4995)
[2022-01-18 15:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][10/1251] eta 1:20:59 lr 0.000965 time 1.3741 (3.9155) loss 3.9507 (4.0690) grad_norm 1.2544 (1.2662)
[2022-01-18 15:54:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][20/1251] eta 1:04:09 lr 0.000965 time 1.9923 (3.1272) loss 4.5592 (4.0632) grad_norm 1.1105 (1.2375)
[2022-01-18 15:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[36/300][30/1251] eta 0:56:31 lr 0.000965 time 1.5730 (2.7780) loss 3.9032 (4.0787) grad_norm 1.1690 (1.2233) [2022-01-18 15:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][40/1251] eta 0:54:09 lr 0.000965 time 4.1850 (2.6833) loss 3.5035 (4.1033) grad_norm 1.1114 (1.2197) [2022-01-18 15:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][50/1251] eta 0:52:29 lr 0.000965 time 1.7967 (2.6225) loss 4.5936 (4.1938) grad_norm 1.3549 (1.1944) [2022-01-18 15:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][60/1251] eta 0:51:06 lr 0.000965 time 2.4702 (2.5746) loss 4.1728 (4.1905) grad_norm 1.2683 (1.1827) [2022-01-18 15:56:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][70/1251] eta 0:49:39 lr 0.000965 time 1.7287 (2.5232) loss 4.7628 (4.1804) grad_norm 1.0882 (1.1846) [2022-01-18 15:56:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][80/1251] eta 0:48:39 lr 0.000965 time 2.8547 (2.4930) loss 3.2420 (4.1595) grad_norm 1.0398 (1.1761) [2022-01-18 15:57:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][90/1251] eta 0:47:19 lr 0.000965 time 1.6710 (2.4455) loss 4.2406 (4.1479) grad_norm 1.1289 (1.1786) [2022-01-18 15:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][100/1251] eta 0:46:13 lr 0.000965 time 1.6651 (2.4092) loss 4.3403 (4.1512) grad_norm 1.1577 (1.1699) [2022-01-18 15:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][110/1251] eta 0:45:14 lr 0.000965 time 1.9026 (2.3787) loss 4.1728 (4.1395) grad_norm 1.0440 (1.1623) [2022-01-18 15:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][120/1251] eta 0:44:27 lr 0.000965 time 2.5538 (2.3586) loss 3.2161 (4.1268) grad_norm 1.3115 (1.1713) [2022-01-18 15:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][130/1251] eta 0:43:46 lr 0.000965 time 2.1453 (2.3427) loss 4.4211 (4.1214) grad_norm 1.4009 (1.1747) 
[2022-01-18 15:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][140/1251] eta 0:43:15 lr 0.000965 time 2.2115 (2.3366) loss 4.2015 (4.1182) grad_norm 1.0594 (1.1684) [2022-01-18 15:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][150/1251] eta 0:42:41 lr 0.000965 time 1.5606 (2.3264) loss 3.7498 (4.1288) grad_norm 1.1909 (1.1653) [2022-01-18 15:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][160/1251] eta 0:42:11 lr 0.000965 time 2.7963 (2.3203) loss 3.4729 (4.1083) grad_norm 1.2256 (1.1688) [2022-01-18 15:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][170/1251] eta 0:41:42 lr 0.000965 time 1.8332 (2.3154) loss 4.4586 (4.1101) grad_norm 1.2659 (1.1705) [2022-01-18 16:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][180/1251] eta 0:41:20 lr 0.000965 time 2.6856 (2.3160) loss 3.8657 (4.1070) grad_norm 1.0404 (1.1624) [2022-01-18 16:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][190/1251] eta 0:40:51 lr 0.000965 time 1.8298 (2.3108) loss 4.7790 (4.1073) grad_norm 1.1097 (1.1638) [2022-01-18 16:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][200/1251] eta 0:40:20 lr 0.000965 time 2.2425 (2.3033) loss 5.0115 (4.1119) grad_norm 1.1851 (1.1676) [2022-01-18 16:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][210/1251] eta 0:39:45 lr 0.000965 time 1.9031 (2.2914) loss 4.0612 (4.1043) grad_norm 1.1710 (1.1697) [2022-01-18 16:01:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][220/1251] eta 0:39:21 lr 0.000965 time 2.3760 (2.2904) loss 3.9402 (4.1004) grad_norm 1.0642 (1.1649) [2022-01-18 16:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][230/1251] eta 0:38:44 lr 0.000965 time 1.9159 (2.2771) loss 4.2560 (4.1037) grad_norm 1.2427 (1.1626) [2022-01-18 16:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][240/1251] eta 0:38:09 
lr 0.000965 time 2.0816 (2.2649) loss 3.7199 (4.1005) grad_norm 0.9702 (1.1613) [2022-01-18 16:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][250/1251] eta 0:37:38 lr 0.000965 time 2.3590 (2.2562) loss 4.0984 (4.1064) grad_norm 1.1103 (1.1592) [2022-01-18 16:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][260/1251] eta 0:37:13 lr 0.000965 time 1.8847 (2.2542) loss 4.6426 (4.1107) grad_norm 1.0832 (1.1604) [2022-01-18 16:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][270/1251] eta 0:36:59 lr 0.000965 time 2.8305 (2.2621) loss 3.8935 (4.1228) grad_norm 1.0548 (1.1584) [2022-01-18 16:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][280/1251] eta 0:36:40 lr 0.000965 time 2.1290 (2.2658) loss 4.7928 (4.1281) grad_norm 1.2093 (1.1576) [2022-01-18 16:04:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][290/1251] eta 0:36:23 lr 0.000965 time 2.1191 (2.2722) loss 4.3542 (4.1203) grad_norm 1.2242 (1.1562) [2022-01-18 16:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][300/1251] eta 0:35:59 lr 0.000965 time 1.9888 (2.2711) loss 4.4865 (4.1117) grad_norm 1.3028 (1.1550) [2022-01-18 16:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][310/1251] eta 0:35:33 lr 0.000965 time 2.2836 (2.2672) loss 3.5216 (4.1133) grad_norm 1.1891 (1.1554) [2022-01-18 16:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][320/1251] eta 0:34:58 lr 0.000965 time 1.9702 (2.2539) loss 3.4126 (4.0952) grad_norm 1.1028 (1.1542) [2022-01-18 16:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][330/1251] eta 0:34:27 lr 0.000965 time 2.0066 (2.2451) loss 3.7936 (4.0956) grad_norm 0.9981 (1.1507) [2022-01-18 16:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][340/1251] eta 0:34:02 lr 0.000965 time 3.1897 (2.2417) loss 2.9508 (4.0899) grad_norm 1.0330 (1.1497) [2022-01-18 16:06:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][350/1251] eta 0:33:35 lr 0.000965 time 1.8506 (2.2371) loss 4.1344 (4.0886) grad_norm 1.0213 (1.1503) [2022-01-18 16:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][360/1251] eta 0:33:14 lr 0.000965 time 2.2613 (2.2381) loss 3.7648 (4.0936) grad_norm 1.0267 (1.1517) [2022-01-18 16:07:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][370/1251] eta 0:32:52 lr 0.000965 time 1.5038 (2.2384) loss 4.4075 (4.0946) grad_norm 1.2860 (1.1544) [2022-01-18 16:07:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][380/1251] eta 0:32:28 lr 0.000965 time 2.1772 (2.2366) loss 4.4668 (4.0945) grad_norm 1.1597 (1.1545) [2022-01-18 16:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][390/1251] eta 0:32:05 lr 0.000965 time 2.1633 (2.2365) loss 4.7036 (4.0975) grad_norm 1.1486 (1.1546) [2022-01-18 16:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][400/1251] eta 0:31:43 lr 0.000965 time 2.4412 (2.2363) loss 4.6992 (4.0971) grad_norm 1.1216 (1.1558) [2022-01-18 16:08:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][410/1251] eta 0:31:23 lr 0.000965 time 1.8453 (2.2392) loss 4.0871 (4.0946) grad_norm 1.1673 (1.1540) [2022-01-18 16:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][420/1251] eta 0:30:59 lr 0.000965 time 2.5228 (2.2383) loss 4.2881 (4.0924) grad_norm 1.0636 (1.1527) [2022-01-18 16:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][430/1251] eta 0:30:38 lr 0.000965 time 1.9906 (2.2389) loss 4.9625 (4.0920) grad_norm 1.1768 (1.1521) [2022-01-18 16:09:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][440/1251] eta 0:30:15 lr 0.000965 time 3.1036 (2.2391) loss 4.3446 (4.0893) grad_norm 1.2147 (1.1517) [2022-01-18 16:10:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][450/1251] eta 0:29:57 lr 0.000965 time 
1.8436 (2.2439) loss 4.8522 (4.0975) grad_norm 0.9835 (1.1497) [2022-01-18 16:10:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][460/1251] eta 0:29:32 lr 0.000965 time 2.0068 (2.2413) loss 4.7224 (4.0926) grad_norm 1.2518 (1.1485) [2022-01-18 16:10:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][470/1251] eta 0:29:08 lr 0.000965 time 1.9605 (2.2390) loss 4.4997 (4.0942) grad_norm 0.9725 (1.1474) [2022-01-18 16:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][480/1251] eta 0:28:45 lr 0.000965 time 3.3869 (2.2380) loss 4.5163 (4.0874) grad_norm 0.9572 (1.1469) [2022-01-18 16:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][490/1251] eta 0:28:20 lr 0.000964 time 1.9140 (2.2349) loss 4.1514 (4.0852) grad_norm 1.3126 (1.1496) [2022-01-18 16:12:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][500/1251] eta 0:27:57 lr 0.000964 time 1.9032 (2.2332) loss 3.7343 (4.0747) grad_norm 0.9315 (1.1498) [2022-01-18 16:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][510/1251] eta 0:27:32 lr 0.000964 time 1.8237 (2.2296) loss 4.3974 (4.0775) grad_norm 1.0779 (1.1483) [2022-01-18 16:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][520/1251] eta 0:27:06 lr 0.000964 time 2.2946 (2.2251) loss 4.3860 (4.0787) grad_norm 1.4031 (1.1481) [2022-01-18 16:13:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][530/1251] eta 0:26:42 lr 0.000964 time 2.1999 (2.2228) loss 4.4382 (4.0777) grad_norm 1.0828 (1.1462) [2022-01-18 16:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][540/1251] eta 0:26:20 lr 0.000964 time 2.9410 (2.2227) loss 4.5629 (4.0746) grad_norm 0.9769 (1.1456) [2022-01-18 16:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][550/1251] eta 0:25:58 lr 0.000964 time 2.1381 (2.2233) loss 4.7140 (4.0743) grad_norm 0.9397 (1.1458) [2022-01-18 16:14:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][560/1251] eta 0:25:36 lr 0.000964 time 2.5292 (2.2238) loss 3.9368 (4.0766) grad_norm 1.2172 (1.1447) [2022-01-18 16:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][570/1251] eta 0:25:14 lr 0.000964 time 1.9386 (2.2234) loss 3.7440 (4.0763) grad_norm 1.2585 (1.1440) [2022-01-18 16:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][580/1251] eta 0:24:52 lr 0.000964 time 2.2489 (2.2236) loss 4.1211 (4.0759) grad_norm 1.0596 (1.1444) [2022-01-18 16:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][590/1251] eta 0:24:29 lr 0.000964 time 1.7827 (2.2228) loss 4.4393 (4.0737) grad_norm 1.0296 (1.1443) [2022-01-18 16:15:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][600/1251] eta 0:24:07 lr 0.000964 time 2.1025 (2.2236) loss 4.4812 (4.0726) grad_norm 0.8997 (1.1440) [2022-01-18 16:16:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][610/1251] eta 0:23:45 lr 0.000964 time 2.1710 (2.2235) loss 3.9554 (4.0673) grad_norm 1.2203 (1.1458) [2022-01-18 16:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][620/1251] eta 0:23:22 lr 0.000964 time 1.5649 (2.2222) loss 4.9953 (4.0672) grad_norm 1.5841 (1.1459) [2022-01-18 16:16:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][630/1251] eta 0:22:59 lr 0.000964 time 2.2962 (2.2211) loss 4.6556 (4.0689) grad_norm 1.7529 (1.1482) [2022-01-18 16:17:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][640/1251] eta 0:22:35 lr 0.000964 time 1.9330 (2.2193) loss 4.3287 (4.0742) grad_norm 1.1726 (1.1483) [2022-01-18 16:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][650/1251] eta 0:22:13 lr 0.000964 time 2.2034 (2.2195) loss 4.1212 (4.0762) grad_norm 1.0195 (1.1472) [2022-01-18 16:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][660/1251] eta 0:21:51 lr 0.000964 time 
1.9578 (2.2185) loss 4.5726 (4.0775) grad_norm 0.9881 (1.1467) [2022-01-18 16:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][670/1251] eta 0:21:29 lr 0.000964 time 2.0889 (2.2188) loss 3.0835 (4.0765) grad_norm 1.0293 (1.1459) [2022-01-18 16:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][680/1251] eta 0:21:06 lr 0.000964 time 1.6419 (2.2184) loss 4.4322 (4.0749) grad_norm 1.0977 (1.1452) [2022-01-18 16:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][690/1251] eta 0:20:45 lr 0.000964 time 2.8832 (2.2194) loss 4.4824 (4.0790) grad_norm 0.9437 (1.1432) [2022-01-18 16:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][700/1251] eta 0:20:24 lr 0.000964 time 2.2214 (2.2217) loss 4.4038 (4.0761) grad_norm 1.2439 (1.1426) [2022-01-18 16:19:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][710/1251] eta 0:20:00 lr 0.000964 time 1.9051 (2.2199) loss 3.8859 (4.0761) grad_norm 1.2022 (1.1437) [2022-01-18 16:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][720/1251] eta 0:19:37 lr 0.000964 time 1.8172 (2.2171) loss 3.5073 (4.0723) grad_norm 1.0204 (1.1438) [2022-01-18 16:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][730/1251] eta 0:19:13 lr 0.000964 time 2.0880 (2.2132) loss 2.8762 (4.0733) grad_norm 0.9904 (1.1425) [2022-01-18 16:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][740/1251] eta 0:18:50 lr 0.000964 time 2.2241 (2.2117) loss 4.4048 (4.0765) grad_norm 1.2362 (1.1416) [2022-01-18 16:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][750/1251] eta 0:18:28 lr 0.000964 time 2.8631 (2.2126) loss 4.2110 (4.0781) grad_norm 1.0582 (1.1414) [2022-01-18 16:21:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][760/1251] eta 0:18:07 lr 0.000964 time 1.8555 (2.2153) loss 4.3736 (4.0762) grad_norm 1.0710 (1.1409) [2022-01-18 16:21:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][770/1251] eta 0:17:45 lr 0.000964 time 2.4949 (2.2161) loss 3.9749 (4.0709) grad_norm 1.2791 (1.1427) [2022-01-18 16:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][780/1251] eta 0:17:23 lr 0.000964 time 2.4535 (2.2162) loss 4.2280 (4.0704) grad_norm 1.1395 (1.1423) [2022-01-18 16:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][790/1251] eta 0:17:00 lr 0.000964 time 1.9150 (2.2138) loss 4.4896 (4.0730) grad_norm 1.0777 (1.1419) [2022-01-18 16:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][800/1251] eta 0:16:37 lr 0.000964 time 2.1124 (2.2126) loss 3.6238 (4.0733) grad_norm 1.2348 (1.1421) [2022-01-18 16:23:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][810/1251] eta 0:16:15 lr 0.000964 time 2.0178 (2.2123) loss 4.1687 (4.0711) grad_norm 1.0851 (1.1416) [2022-01-18 16:23:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][820/1251] eta 0:15:53 lr 0.000964 time 2.6726 (2.2118) loss 2.9999 (4.0682) grad_norm 1.0906 (1.1411) [2022-01-18 16:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][830/1251] eta 0:15:31 lr 0.000964 time 2.7092 (2.2136) loss 3.0458 (4.0647) grad_norm 0.9480 (1.1413) [2022-01-18 16:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][840/1251] eta 0:15:09 lr 0.000964 time 1.9274 (2.2129) loss 4.3069 (4.0639) grad_norm 1.1421 (1.1423) [2022-01-18 16:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][850/1251] eta 0:14:46 lr 0.000964 time 1.6081 (2.2103) loss 4.4534 (4.0634) grad_norm 1.0834 (1.1420) [2022-01-18 16:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][860/1251] eta 0:14:23 lr 0.000964 time 1.9322 (2.2097) loss 4.8689 (4.0639) grad_norm 1.0715 (1.1425) [2022-01-18 16:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][870/1251] eta 0:14:01 lr 0.000964 time 
2.2157 (2.2086) loss 3.7546 (4.0627) grad_norm 1.2301 (1.1419) [2022-01-18 16:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][880/1251] eta 0:13:39 lr 0.000964 time 3.2544 (2.2101) loss 4.6054 (4.0633) grad_norm 1.2944 (1.1424) [2022-01-18 16:26:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][890/1251] eta 0:13:17 lr 0.000964 time 1.8925 (2.2087) loss 4.3798 (4.0638) grad_norm 1.1014 (1.1431) [2022-01-18 16:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][900/1251] eta 0:12:55 lr 0.000964 time 2.4272 (2.2095) loss 3.5824 (4.0641) grad_norm 0.9689 (1.1421) [2022-01-18 16:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][910/1251] eta 0:12:33 lr 0.000964 time 1.6616 (2.2097) loss 4.1744 (4.0696) grad_norm 0.9001 (1.1419) [2022-01-18 16:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][920/1251] eta 0:12:11 lr 0.000964 time 3.0552 (2.2112) loss 4.3724 (4.0702) grad_norm 0.9044 (1.1414) [2022-01-18 16:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][930/1251] eta 0:11:49 lr 0.000964 time 1.8499 (2.2102) loss 4.8872 (4.0706) grad_norm 1.0494 (1.1410) [2022-01-18 16:28:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][940/1251] eta 0:11:27 lr 0.000964 time 1.9320 (2.2090) loss 3.5448 (4.0714) grad_norm 1.1142 (1.1414) [2022-01-18 16:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][950/1251] eta 0:11:04 lr 0.000964 time 1.9420 (2.2067) loss 4.2253 (4.0706) grad_norm 0.9445 (1.1406) [2022-01-18 16:28:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][960/1251] eta 0:10:42 lr 0.000964 time 3.2413 (2.2072) loss 4.1635 (4.0710) grad_norm 0.9540 (1.1397) [2022-01-18 16:29:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][970/1251] eta 0:10:20 lr 0.000964 time 1.7549 (2.2070) loss 4.2485 (4.0729) grad_norm 1.2004 (1.1402) [2022-01-18 16:29:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][980/1251] eta 0:09:58 lr 0.000964 time 1.9042 (2.2078) loss 4.0808 (4.0716) grad_norm 0.9021 (1.1405) [2022-01-18 16:29:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][990/1251] eta 0:09:35 lr 0.000964 time 1.9692 (2.2060) loss 4.0973 (4.0726) grad_norm 1.4860 (1.1412) [2022-01-18 16:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1000/1251] eta 0:09:13 lr 0.000964 time 2.8247 (2.2060) loss 3.1707 (4.0729) grad_norm 1.1343 (1.1414) [2022-01-18 16:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1010/1251] eta 0:08:51 lr 0.000964 time 2.1397 (2.2048) loss 4.1957 (4.0721) grad_norm 0.9986 (1.1414) [2022-01-18 16:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1020/1251] eta 0:08:29 lr 0.000964 time 2.8982 (2.2044) loss 3.0143 (4.0721) grad_norm 1.1455 (1.1414) [2022-01-18 16:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1030/1251] eta 0:08:07 lr 0.000964 time 2.2305 (2.2044) loss 4.9030 (4.0737) grad_norm 1.2795 (1.1404) [2022-01-18 16:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1040/1251] eta 0:07:44 lr 0.000964 time 2.9745 (2.2033) loss 3.9884 (4.0716) grad_norm 1.2915 (1.1392) [2022-01-18 16:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1050/1251] eta 0:07:22 lr 0.000964 time 2.0717 (2.2031) loss 3.6550 (4.0724) grad_norm 1.1355 (1.1389) [2022-01-18 16:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1060/1251] eta 0:07:00 lr 0.000964 time 2.8207 (2.2026) loss 3.8079 (4.0733) grad_norm 1.3303 (1.1398) [2022-01-18 16:32:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1070/1251] eta 0:06:38 lr 0.000964 time 2.1209 (2.2015) loss 3.9510 (4.0729) grad_norm 1.0427 (1.1392) [2022-01-18 16:33:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1080/1251] eta 0:06:16 lr 0.000964 
time 2.9247 (2.2019) loss 4.9308 (4.0745) grad_norm 1.0819 (1.1394) [2022-01-18 16:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1090/1251] eta 0:05:54 lr 0.000964 time 1.6902 (2.2013) loss 4.3633 (4.0751) grad_norm 1.1760 (1.1394) [2022-01-18 16:33:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1100/1251] eta 0:05:32 lr 0.000964 time 2.4371 (2.2005) loss 3.9857 (4.0743) grad_norm 0.9883 (1.1392) [2022-01-18 16:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1110/1251] eta 0:05:10 lr 0.000964 time 1.9535 (2.2007) loss 4.8011 (4.0761) grad_norm 0.8948 (1.1396) [2022-01-18 16:34:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1120/1251] eta 0:04:48 lr 0.000964 time 3.1771 (2.2027) loss 3.5086 (4.0759) grad_norm 1.2570 (1.1402) [2022-01-18 16:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1130/1251] eta 0:04:26 lr 0.000963 time 2.2052 (2.2033) loss 2.9076 (4.0757) grad_norm 0.9977 (1.1398) [2022-01-18 16:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1140/1251] eta 0:04:04 lr 0.000963 time 3.3447 (2.2054) loss 3.9916 (4.0784) grad_norm 1.0237 (1.1390) [2022-01-18 16:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1150/1251] eta 0:03:42 lr 0.000963 time 2.0091 (2.2047) loss 4.7073 (4.0783) grad_norm 1.0748 (1.1397) [2022-01-18 16:35:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1160/1251] eta 0:03:20 lr 0.000963 time 1.8623 (2.2022) loss 4.7075 (4.0814) grad_norm 1.0142 (1.1395) [2022-01-18 16:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1170/1251] eta 0:02:58 lr 0.000963 time 2.2701 (2.2013) loss 4.7424 (4.0804) grad_norm 1.0471 (1.1394) [2022-01-18 16:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1180/1251] eta 0:02:36 lr 0.000963 time 2.4874 (2.2017) loss 3.6642 (4.0807) grad_norm 1.0706 (1.1390) [2022-01-18 16:37:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1190/1251] eta 0:02:14 lr 0.000963 time 1.8999 (2.2013) loss 4.3279 (4.0809) grad_norm 1.1691 (1.1388)
[2022-01-18 16:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1200/1251] eta 0:01:52 lr 0.000963 time 2.2690 (2.2012) loss 4.1879 (4.0806) grad_norm 1.2599 (1.1385)
[2022-01-18 16:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1210/1251] eta 0:01:30 lr 0.000963 time 1.9142 (2.2014) loss 3.9403 (4.0791) grad_norm 1.1499 (1.1385)
[2022-01-18 16:38:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1220/1251] eta 0:01:08 lr 0.000963 time 1.9275 (2.2017) loss 4.6655 (4.0797) grad_norm 0.9038 (1.1379)
[2022-01-18 16:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1230/1251] eta 0:00:46 lr 0.000963 time 1.5760 (2.2008) loss 2.9148 (4.0799) grad_norm 1.0586 (1.1377)
[2022-01-18 16:38:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1240/1251] eta 0:00:24 lr 0.000963 time 1.4651 (2.1996) loss 4.7877 (4.0795) grad_norm 1.2558 (1.1382)
[2022-01-18 16:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [36/300][1250/1251] eta 0:00:02 lr 0.000963 time 1.1854 (2.1945) loss 4.1804 (4.0811) grad_norm 1.1952 (1.1385)
[2022-01-18 16:39:08 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 36 training takes 0:45:45
[2022-01-18 16:39:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.334 (18.334) Loss 1.4202 (1.4202) Acc@1 69.824 (69.824) Acc@5 88.477 (88.477)
[2022-01-18 16:39:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.731 (3.547) Loss 1.4392 (1.4475) Acc@1 67.969 (67.773) Acc@5 88.184 (87.962)
[2022-01-18 16:40:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.306 (2.578) Loss 1.3583 (1.4190) Acc@1 68.848 (67.983) Acc@5 89.062 (88.458)
[2022-01-18 16:40:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.309 (2.331) Loss 1.4253 (1.4228) Acc@1 66.504 (67.799) Acc@5 88.770 (88.357)
[2022-01-18 16:40:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.556 (2.194) Loss 1.4606 (1.4186) Acc@1 67.773 (67.833) Acc@5 88.184 (88.500)
[2022-01-18 16:40:45 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 67.700 Acc@5 88.470
[2022-01-18 16:40:45 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 67.7%
[2022-01-18 16:40:45 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 67.70%
[2022-01-18 16:41:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][0/1251] eta 7:21:52 lr 0.000963 time 21.1929 (21.1929) loss 4.2367 (4.2367) grad_norm 1.0246 (1.0246)
[2022-01-18 16:41:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][10/1251] eta 1:21:25 lr 0.000963 time 1.7749 (3.9369) loss 4.8905 (4.3225) grad_norm 1.0096 (1.1483)
[2022-01-18 16:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][20/1251] eta 1:04:06 lr 0.000963 time 1.9088 (3.1243) loss 3.7545 (4.0294) grad_norm 1.1079 (1.1430)
[2022-01-18 16:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][30/1251] eta 0:57:19 lr 0.000963 time 1.5324 (2.8169) loss 4.5786 (4.1578) grad_norm 1.1956 (1.1568)
[2022-01-18 16:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][40/1251] eta 0:54:24 lr 0.000963 time 3.5200 (2.6957) loss 3.3387 (4.1151) grad_norm 1.0167 (1.1574)
[2022-01-18 16:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][50/1251] eta 0:53:29 lr 0.000963 time 2.7889 (2.6722) loss 4.2135 (4.1425) grad_norm 1.0517 (1.1365)
[2022-01-18 16:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][60/1251] eta 0:51:11 lr 0.000963 time 1.4930 (2.5792) loss 3.9225 (4.1246) grad_norm 1.0518 (1.1280)
[2022-01-18 16:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][70/1251] eta 
0:49:06 lr 0.000963 time 1.7582 (2.4947) loss 2.8350 (4.0874) grad_norm 1.1207 (1.1451) [2022-01-18 16:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][80/1251] eta 0:47:42 lr 0.000963 time 2.6159 (2.4443) loss 4.5251 (4.1287) grad_norm 1.0756 (1.1348) [2022-01-18 16:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][90/1251] eta 0:46:42 lr 0.000963 time 2.1166 (2.4138) loss 4.2621 (4.1197) grad_norm 1.0371 (1.1411) [2022-01-18 16:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][100/1251] eta 0:45:46 lr 0.000963 time 2.6283 (2.3862) loss 4.7783 (4.1136) grad_norm 1.2543 (1.1496) [2022-01-18 16:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][110/1251] eta 0:45:05 lr 0.000963 time 2.4819 (2.3709) loss 4.4479 (4.1140) grad_norm 1.4709 (1.1466) [2022-01-18 16:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][120/1251] eta 0:44:25 lr 0.000963 time 2.4480 (2.3570) loss 4.4940 (4.1224) grad_norm 1.1573 (1.1528) [2022-01-18 16:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][130/1251] eta 0:44:19 lr 0.000963 time 2.4948 (2.3725) loss 3.2913 (4.1217) grad_norm 1.1935 (1.1522) [2022-01-18 16:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][140/1251] eta 0:43:49 lr 0.000963 time 2.9748 (2.3671) loss 4.2332 (4.1182) grad_norm 1.2081 (1.1543) [2022-01-18 16:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][150/1251] eta 0:43:11 lr 0.000963 time 1.9000 (2.3538) loss 4.1674 (4.0980) grad_norm 1.1820 (1.1612) [2022-01-18 16:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][160/1251] eta 0:42:33 lr 0.000963 time 2.2357 (2.3404) loss 4.4992 (4.0898) grad_norm 1.5190 (1.1644) [2022-01-18 16:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][170/1251] eta 0:41:57 lr 0.000963 time 2.1691 (2.3284) loss 4.7292 (4.1011) grad_norm 1.1226 (1.1614) [2022-01-18 16:47:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][180/1251] eta 0:41:12 lr 0.000963 time 1.9211 (2.3086) loss 3.7784 (4.0962) grad_norm 1.1490 (1.1570) [2022-01-18 16:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][190/1251] eta 0:40:35 lr 0.000963 time 2.1524 (2.2956) loss 3.4129 (4.0860) grad_norm 1.0100 (1.1576) [2022-01-18 16:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][200/1251] eta 0:39:59 lr 0.000963 time 2.1124 (2.2829) loss 4.7329 (4.0825) grad_norm 1.1537 (1.1600) [2022-01-18 16:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][210/1251] eta 0:39:30 lr 0.000963 time 2.2203 (2.2773) loss 4.4328 (4.0773) grad_norm 1.1611 (1.1615) [2022-01-18 16:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][220/1251] eta 0:39:07 lr 0.000963 time 2.1446 (2.2768) loss 4.5553 (4.0683) grad_norm 1.1156 (1.1673) [2022-01-18 16:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][230/1251] eta 0:38:41 lr 0.000963 time 2.2815 (2.2735) loss 3.2175 (4.0620) grad_norm 1.0940 (1.1676) [2022-01-18 16:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][240/1251] eta 0:38:19 lr 0.000963 time 2.8251 (2.2748) loss 4.5308 (4.0602) grad_norm 1.4058 (1.1718) [2022-01-18 16:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][250/1251] eta 0:37:56 lr 0.000963 time 1.8706 (2.2747) loss 2.9643 (4.0578) grad_norm 1.2552 (1.1694) [2022-01-18 16:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][260/1251] eta 0:37:36 lr 0.000963 time 1.8019 (2.2766) loss 4.5807 (4.0584) grad_norm 1.0115 (1.1697) [2022-01-18 16:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][270/1251] eta 0:37:07 lr 0.000963 time 2.3183 (2.2704) loss 2.7476 (4.0556) grad_norm 0.8271 (1.1695) [2022-01-18 16:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][280/1251] eta 0:36:36 lr 0.000963 time 
1.9039 (2.2621) loss 4.9824 (4.0630) grad_norm 1.0876 (1.1690) [2022-01-18 16:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][290/1251] eta 0:36:08 lr 0.000963 time 2.7223 (2.2562) loss 4.4037 (4.0655) grad_norm 1.1513 (1.1663) [2022-01-18 16:52:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][300/1251] eta 0:35:39 lr 0.000963 time 1.9019 (2.2498) loss 3.2871 (4.0615) grad_norm 1.1824 (1.1629) [2022-01-18 16:52:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][310/1251] eta 0:35:10 lr 0.000963 time 1.8576 (2.2431) loss 4.4005 (4.0643) grad_norm 1.1647 (1.1633) [2022-01-18 16:52:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][320/1251] eta 0:34:46 lr 0.000963 time 2.5387 (2.2414) loss 4.4135 (4.0653) grad_norm 1.0402 (1.1620) [2022-01-18 16:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][330/1251] eta 0:34:25 lr 0.000963 time 2.8897 (2.2428) loss 3.2078 (4.0594) grad_norm 1.1252 (1.1605) [2022-01-18 16:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][340/1251] eta 0:34:05 lr 0.000963 time 2.4710 (2.2457) loss 4.3250 (4.0562) grad_norm 1.0913 (1.1564) [2022-01-18 16:53:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][350/1251] eta 0:33:51 lr 0.000963 time 3.5666 (2.2548) loss 4.9064 (4.0639) grad_norm 1.0441 (1.1563) [2022-01-18 16:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][360/1251] eta 0:33:33 lr 0.000963 time 2.4809 (2.2594) loss 4.1659 (4.0696) grad_norm 0.9335 (1.1548) [2022-01-18 16:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][370/1251] eta 0:33:09 lr 0.000963 time 2.1557 (2.2581) loss 4.6680 (4.0685) grad_norm 1.5270 (1.1539) [2022-01-18 16:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][380/1251] eta 0:32:42 lr 0.000963 time 1.9473 (2.2529) loss 4.2967 (4.0666) grad_norm 1.2014 (1.1545) [2022-01-18 16:55:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][390/1251] eta 0:32:12 lr 0.000963 time 1.9819 (2.2446) loss 4.1953 (4.0706) grad_norm 1.2649 (1.1540) [2022-01-18 16:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][400/1251] eta 0:31:43 lr 0.000963 time 2.2637 (2.2372) loss 4.1004 (4.0643) grad_norm 1.1553 (1.1532) [2022-01-18 16:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][410/1251] eta 0:31:15 lr 0.000963 time 1.9093 (2.2306) loss 3.9733 (4.0663) grad_norm 0.9893 (1.1512) [2022-01-18 16:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][420/1251] eta 0:30:52 lr 0.000963 time 2.4375 (2.2297) loss 3.1982 (4.0591) grad_norm 0.9839 (1.1501) [2022-01-18 16:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][430/1251] eta 0:30:31 lr 0.000963 time 2.7815 (2.2309) loss 4.3288 (4.0611) grad_norm 0.9834 (1.1481) [2022-01-18 16:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][440/1251] eta 0:30:05 lr 0.000963 time 1.5674 (2.2265) loss 3.3728 (4.0629) grad_norm 1.2093 (1.1470) [2022-01-18 16:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][450/1251] eta 0:29:42 lr 0.000963 time 2.0978 (2.2248) loss 3.8916 (4.0588) grad_norm 1.1738 (1.1460) [2022-01-18 16:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][460/1251] eta 0:29:22 lr 0.000963 time 3.2064 (2.2279) loss 3.7326 (4.0502) grad_norm 1.3398 (1.1471) [2022-01-18 16:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][470/1251] eta 0:29:04 lr 0.000963 time 2.8235 (2.2332) loss 2.7854 (4.0477) grad_norm 1.1667 (1.1464) [2022-01-18 16:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][480/1251] eta 0:28:42 lr 0.000963 time 1.8241 (2.2341) loss 4.3632 (4.0540) grad_norm 1.0516 (1.1430) [2022-01-18 16:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][490/1251] eta 0:28:18 lr 0.000963 time 
1.9646 (2.2317) loss 3.7200 (4.0583) grad_norm 1.2576 (1.1431) [2022-01-18 16:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][500/1251] eta 0:27:54 lr 0.000963 time 2.7918 (2.2294) loss 3.3870 (4.0633) grad_norm 0.9246 (1.1438) [2022-01-18 16:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][510/1251] eta 0:27:30 lr 0.000963 time 2.8115 (2.2269) loss 2.9266 (4.0696) grad_norm 1.2949 (1.1446) [2022-01-18 17:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][520/1251] eta 0:27:06 lr 0.000962 time 1.5234 (2.2256) loss 4.5091 (4.0675) grad_norm 0.9502 (1.1442) [2022-01-18 17:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][530/1251] eta 0:26:43 lr 0.000962 time 1.9293 (2.2235) loss 4.4276 (4.0686) grad_norm 0.9808 (1.1441) [2022-01-18 17:00:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][540/1251] eta 0:26:21 lr 0.000962 time 2.7865 (2.2241) loss 3.5508 (4.0722) grad_norm 0.9681 (1.1444) [2022-01-18 17:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][550/1251] eta 0:25:59 lr 0.000962 time 2.2148 (2.2242) loss 4.2648 (4.0741) grad_norm 1.3158 (1.1451) [2022-01-18 17:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][560/1251] eta 0:25:35 lr 0.000962 time 1.8970 (2.2223) loss 4.3643 (4.0761) grad_norm 0.9682 (1.1432) [2022-01-18 17:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][570/1251] eta 0:25:12 lr 0.000962 time 1.8643 (2.2211) loss 4.5131 (4.0760) grad_norm 1.0900 (1.1464) [2022-01-18 17:02:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][580/1251] eta 0:24:49 lr 0.000962 time 2.5457 (2.2204) loss 4.3790 (4.0748) grad_norm 1.1945 (1.1477) [2022-01-18 17:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][590/1251] eta 0:24:26 lr 0.000962 time 1.8611 (2.2191) loss 4.0723 (4.0720) grad_norm 1.0168 (1.1472) [2022-01-18 17:03:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][600/1251] eta 0:24:05 lr 0.000962 time 2.5850 (2.2201) loss 4.7707 (4.0761) grad_norm 1.0676 (1.1450) [2022-01-18 17:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][610/1251] eta 0:23:41 lr 0.000962 time 1.9710 (2.2174) loss 3.3476 (4.0738) grad_norm 1.2684 (1.1455) [2022-01-18 17:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][620/1251] eta 0:23:19 lr 0.000962 time 2.7175 (2.2183) loss 3.5502 (4.0715) grad_norm 1.0234 (1.1463) [2022-01-18 17:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][630/1251] eta 0:22:58 lr 0.000962 time 1.5309 (2.2194) loss 4.5406 (4.0668) grad_norm 1.4931 (1.1470) [2022-01-18 17:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][640/1251] eta 0:22:34 lr 0.000962 time 2.4461 (2.2176) loss 4.5662 (4.0697) grad_norm 0.9427 (1.1464) [2022-01-18 17:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][650/1251] eta 0:22:11 lr 0.000962 time 1.8995 (2.2149) loss 4.3655 (4.0724) grad_norm 0.9243 (1.1454) [2022-01-18 17:05:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][660/1251] eta 0:21:48 lr 0.000962 time 2.4690 (2.2132) loss 3.6922 (4.0768) grad_norm 0.9874 (1.1442) [2022-01-18 17:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][670/1251] eta 0:21:25 lr 0.000962 time 1.6165 (2.2118) loss 4.6768 (4.0789) grad_norm 1.1922 (1.1440) [2022-01-18 17:05:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][680/1251] eta 0:21:01 lr 0.000962 time 2.0998 (2.2100) loss 4.3865 (4.0823) grad_norm 1.1635 (1.1448) [2022-01-18 17:06:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][690/1251] eta 0:20:40 lr 0.000962 time 2.2331 (2.2113) loss 4.7614 (4.0797) grad_norm 1.0806 (1.1447) [2022-01-18 17:06:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][700/1251] eta 0:20:18 lr 0.000962 time 
2.2580 (2.2112) loss 3.1353 (4.0765) grad_norm 1.3968 (1.1450) [2022-01-18 17:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][710/1251] eta 0:19:56 lr 0.000962 time 1.9959 (2.2125) loss 4.9554 (4.0771) grad_norm 1.0585 (1.1453) [2022-01-18 17:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][720/1251] eta 0:19:34 lr 0.000962 time 2.5306 (2.2113) loss 4.8942 (4.0823) grad_norm 1.0357 (1.1446) [2022-01-18 17:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][730/1251] eta 0:19:11 lr 0.000962 time 2.2732 (2.2097) loss 4.3742 (4.0828) grad_norm 1.0102 (1.1445) [2022-01-18 17:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][740/1251] eta 0:18:48 lr 0.000962 time 1.6769 (2.2079) loss 4.6312 (4.0870) grad_norm 1.0149 (1.1443) [2022-01-18 17:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][750/1251] eta 0:18:25 lr 0.000962 time 2.6265 (2.2072) loss 3.7299 (4.0880) grad_norm 1.2686 (1.1446) [2022-01-18 17:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][760/1251] eta 0:18:02 lr 0.000962 time 1.9080 (2.2053) loss 4.5104 (4.0882) grad_norm 1.2014 (1.1452) [2022-01-18 17:09:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][770/1251] eta 0:17:41 lr 0.000962 time 3.1587 (2.2070) loss 4.3513 (4.0866) grad_norm 1.1318 (1.1449) [2022-01-18 17:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][780/1251] eta 0:17:20 lr 0.000962 time 2.7614 (2.2091) loss 3.9446 (4.0830) grad_norm 1.1492 (1.1443) [2022-01-18 17:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][790/1251] eta 0:16:59 lr 0.000962 time 3.7671 (2.2108) loss 4.3708 (4.0832) grad_norm 1.0395 (1.1441) [2022-01-18 17:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][800/1251] eta 0:16:37 lr 0.000962 time 2.2092 (2.2113) loss 3.8749 (4.0853) grad_norm 0.8827 (1.1437) [2022-01-18 17:10:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][810/1251] eta 0:16:15 lr 0.000962 time 2.7492 (2.2115) loss 4.5057 (4.0832) grad_norm 1.1163 (1.1436) [2022-01-18 17:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][820/1251] eta 0:15:52 lr 0.000962 time 1.8414 (2.2104) loss 4.7374 (4.0843) grad_norm 1.0326 (1.1434) [2022-01-18 17:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][830/1251] eta 0:15:30 lr 0.000962 time 2.8290 (2.2104) loss 2.9533 (4.0842) grad_norm 1.1102 (1.1437) [2022-01-18 17:11:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][840/1251] eta 0:15:07 lr 0.000962 time 1.9415 (2.2071) loss 4.3217 (4.0785) grad_norm 0.9704 (1.1428) [2022-01-18 17:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][850/1251] eta 0:14:45 lr 0.000962 time 2.8501 (2.2071) loss 4.1940 (4.0802) grad_norm 1.1387 (1.1421) [2022-01-18 17:12:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][860/1251] eta 0:14:22 lr 0.000962 time 1.5138 (2.2068) loss 3.8345 (4.0798) grad_norm 1.1373 (1.1421) [2022-01-18 17:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][870/1251] eta 0:14:00 lr 0.000962 time 2.8421 (2.2073) loss 3.1791 (4.0798) grad_norm 1.0950 (1.1416) [2022-01-18 17:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][880/1251] eta 0:13:39 lr 0.000962 time 2.1285 (2.2079) loss 3.9246 (4.0810) grad_norm 1.5489 (1.1430) [2022-01-18 17:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][890/1251] eta 0:13:16 lr 0.000962 time 2.4671 (2.2074) loss 3.6067 (4.0809) grad_norm 1.0656 (1.1426) [2022-01-18 17:13:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][900/1251] eta 0:12:54 lr 0.000962 time 1.8966 (2.2054) loss 4.1698 (4.0827) grad_norm 1.1025 (1.1433) [2022-01-18 17:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][910/1251] eta 0:12:31 lr 0.000962 time 
2.1843 (2.2036) loss 4.3761 (4.0781) grad_norm 1.1013 (1.1426) [2022-01-18 17:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][920/1251] eta 0:12:09 lr 0.000962 time 2.1825 (2.2041) loss 3.3508 (4.0760) grad_norm 1.0668 (1.1414) [2022-01-18 17:14:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][930/1251] eta 0:11:47 lr 0.000962 time 2.2065 (2.2037) loss 4.1765 (4.0772) grad_norm 0.9563 (1.1405) [2022-01-18 17:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][940/1251] eta 0:11:24 lr 0.000962 time 1.9533 (2.2020) loss 4.7895 (4.0801) grad_norm 1.1188 (1.1400) [2022-01-18 17:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][950/1251] eta 0:11:02 lr 0.000962 time 2.7843 (2.2023) loss 4.2830 (4.0794) grad_norm 1.0308 (1.1397) [2022-01-18 17:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][960/1251] eta 0:10:41 lr 0.000962 time 2.1746 (2.2038) loss 4.0240 (4.0778) grad_norm 0.9573 (1.1402) [2022-01-18 17:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][970/1251] eta 0:10:19 lr 0.000962 time 2.1837 (2.2053) loss 5.1337 (4.0794) grad_norm 1.1530 (1.1402) [2022-01-18 17:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][980/1251] eta 0:09:57 lr 0.000962 time 2.1461 (2.2058) loss 4.2775 (4.0802) grad_norm 1.0557 (1.1406) [2022-01-18 17:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][990/1251] eta 0:09:35 lr 0.000962 time 1.5586 (2.2041) loss 3.6870 (4.0824) grad_norm 1.5165 (1.1423) [2022-01-18 17:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1000/1251] eta 0:09:12 lr 0.000962 time 1.9524 (2.2015) loss 4.2974 (4.0831) grad_norm 1.1358 (1.1416) [2022-01-18 17:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1010/1251] eta 0:08:50 lr 0.000962 time 2.1322 (2.1992) loss 3.5334 (4.0806) grad_norm 1.0208 (1.1414) [2022-01-18 17:18:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1020/1251] eta 0:08:27 lr 0.000962 time 2.9215 (2.1989) loss 4.6529 (4.0821) grad_norm 1.0645 (1.1411) [2022-01-18 17:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1030/1251] eta 0:08:06 lr 0.000962 time 3.0061 (2.2000) loss 3.6337 (4.0812) grad_norm 1.1255 (1.1410) [2022-01-18 17:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1040/1251] eta 0:07:44 lr 0.000962 time 1.5369 (2.2003) loss 4.4744 (4.0816) grad_norm 1.5923 (1.1409) [2022-01-18 17:19:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1050/1251] eta 0:07:22 lr 0.000962 time 1.9556 (2.2005) loss 4.6876 (4.0807) grad_norm 1.3454 (1.1411) [2022-01-18 17:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1060/1251] eta 0:07:00 lr 0.000962 time 2.7114 (2.2027) loss 3.0575 (4.0800) grad_norm 0.9782 (1.1412) [2022-01-18 17:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1070/1251] eta 0:06:38 lr 0.000962 time 2.1969 (2.2040) loss 3.7405 (4.0831) grad_norm 1.1495 (1.1407) [2022-01-18 17:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1080/1251] eta 0:06:17 lr 0.000962 time 3.2631 (2.2057) loss 5.0429 (4.0845) grad_norm 1.0231 (1.1412) [2022-01-18 17:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1090/1251] eta 0:05:54 lr 0.000962 time 1.8918 (2.2044) loss 3.8548 (4.0828) grad_norm 1.3721 (1.1413) [2022-01-18 17:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1100/1251] eta 0:05:32 lr 0.000962 time 1.9438 (2.2033) loss 4.6970 (4.0832) grad_norm 1.0292 (1.1414) [2022-01-18 17:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1110/1251] eta 0:05:10 lr 0.000962 time 1.9494 (2.2028) loss 4.6255 (4.0836) grad_norm 1.2017 (1.1410) [2022-01-18 17:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1120/1251] eta 0:04:48 lr 
0.000962 time 3.2788 (2.2047) loss 3.8853 (4.0825) grad_norm 1.1783 (1.1410) [2022-01-18 17:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1130/1251] eta 0:04:26 lr 0.000962 time 1.9387 (2.2041) loss 4.3000 (4.0847) grad_norm 1.2405 (1.1410) [2022-01-18 17:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1140/1251] eta 0:04:04 lr 0.000962 time 1.7730 (2.2032) loss 3.5173 (4.0850) grad_norm 1.3060 (1.1418) [2022-01-18 17:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1150/1251] eta 0:03:42 lr 0.000961 time 1.7354 (2.2024) loss 4.8519 (4.0872) grad_norm 1.2273 (1.1418) [2022-01-18 17:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1160/1251] eta 0:03:20 lr 0.000961 time 3.6077 (2.2047) loss 4.8512 (4.0869) grad_norm 1.2340 (1.1416) [2022-01-18 17:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1170/1251] eta 0:02:58 lr 0.000961 time 1.5829 (2.2031) loss 4.1839 (4.0848) grad_norm 1.1250 (1.1412) [2022-01-18 17:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1180/1251] eta 0:02:36 lr 0.000961 time 2.2168 (2.2022) loss 4.9338 (4.0869) grad_norm 1.1698 (1.1404) [2022-01-18 17:24:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1190/1251] eta 0:02:14 lr 0.000961 time 2.2275 (2.2012) loss 3.9699 (4.0893) grad_norm 0.9767 (1.1409) [2022-01-18 17:24:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1200/1251] eta 0:01:52 lr 0.000961 time 2.4730 (2.2013) loss 4.8041 (4.0897) grad_norm 1.2460 (1.1419) [2022-01-18 17:25:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1210/1251] eta 0:01:30 lr 0.000961 time 1.9245 (2.2016) loss 3.1533 (4.0876) grad_norm 1.0677 (1.1412) [2022-01-18 17:25:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1220/1251] eta 0:01:08 lr 0.000961 time 2.5625 (2.2031) loss 3.9905 (4.0868) grad_norm 1.0295 (1.1407) [2022-01-18 17:25:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1230/1251] eta 0:00:46 lr 0.000961 time 1.8966 (2.2035) loss 4.6076 (4.0849) grad_norm 0.9940 (1.1403) [2022-01-18 17:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1240/1251] eta 0:00:24 lr 0.000961 time 2.3233 (2.2031) loss 3.8441 (4.0860) grad_norm 0.8642 (1.1399) [2022-01-18 17:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [37/300][1250/1251] eta 0:00:02 lr 0.000961 time 1.1722 (2.1970) loss 4.3390 (4.0866) grad_norm 1.1599 (1.1395) [2022-01-18 17:26:34 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 37 training takes 0:45:48 [2022-01-18 17:26:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.793 (17.793) Loss 1.4490 (1.4490) Acc@1 66.504 (66.504) Acc@5 87.402 (87.402) [2022-01-18 17:27:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.579 (3.389) Loss 1.3957 (1.4034) Acc@1 65.918 (67.649) Acc@5 89.160 (88.725) [2022-01-18 17:27:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.960 (2.461) Loss 1.4577 (1.4094) Acc@1 66.895 (67.653) Acc@5 87.012 (88.509) [2022-01-18 17:27:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.605 (2.228) Loss 1.4764 (1.4155) Acc@1 66.992 (67.566) Acc@5 87.988 (88.486) [2022-01-18 17:28:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.338 (2.135) Loss 1.4593 (1.4195) Acc@1 67.578 (67.488) Acc@5 88.867 (88.393) [2022-01-18 17:28:09 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 67.544 Acc@5 88.420 [2022-01-18 17:28:09 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 67.5% [2022-01-18 17:28:09 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 67.70% [2022-01-18 17:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][0/1251] eta 7:41:59 lr 0.000961 time 22.1575 (22.1575) loss 4.2806 (4.2806) grad_norm 1.5902 
(1.5902) [2022-01-18 17:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][10/1251] eta 1:21:34 lr 0.000961 time 2.1934 (3.9437) loss 2.7417 (3.9995) grad_norm 1.0200 (1.1806) [2022-01-18 17:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][20/1251] eta 1:04:04 lr 0.000961 time 1.9296 (3.1228) loss 3.8300 (4.1087) grad_norm 1.3256 (1.1789) [2022-01-18 17:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][30/1251] eta 0:56:39 lr 0.000961 time 1.2010 (2.7843) loss 3.9972 (4.0643) grad_norm 1.5999 (1.2204) [2022-01-18 17:30:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][40/1251] eta 0:54:42 lr 0.000961 time 5.9068 (2.7104) loss 4.2067 (4.0368) grad_norm 1.1690 (1.2052) [2022-01-18 17:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][50/1251] eta 0:52:09 lr 0.000961 time 2.8459 (2.6058) loss 4.3119 (4.0275) grad_norm 1.1282 (1.1856) [2022-01-18 17:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][60/1251] eta 0:50:32 lr 0.000961 time 1.8575 (2.5460) loss 4.6789 (4.0550) grad_norm 1.0563 (1.1663) [2022-01-18 17:31:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][70/1251] eta 0:48:52 lr 0.000961 time 1.9195 (2.4834) loss 4.7629 (4.0809) grad_norm 1.1498 (1.1626) [2022-01-18 17:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][80/1251] eta 0:48:08 lr 0.000961 time 4.0167 (2.4669) loss 4.6042 (4.0566) grad_norm 1.2358 (1.1541) [2022-01-18 17:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][90/1251] eta 0:47:22 lr 0.000961 time 3.5950 (2.4483) loss 3.1198 (4.0373) grad_norm 1.0434 (1.1473) [2022-01-18 17:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][100/1251] eta 0:46:13 lr 0.000961 time 2.2176 (2.4092) loss 3.1019 (4.0465) grad_norm 1.1106 (1.1422) [2022-01-18 17:32:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][110/1251] eta 0:45:14 
lr 0.000961 time 1.6055 (2.3791) loss 4.6003 (4.0638) grad_norm 1.0893 (1.1442) [2022-01-18 17:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][120/1251] eta 0:44:27 lr 0.000961 time 3.0241 (2.3583) loss 4.2331 (4.0623) grad_norm 1.4899 (1.1482) [2022-01-18 17:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][130/1251] eta 0:43:53 lr 0.000961 time 3.4779 (2.3489) loss 3.8444 (4.0675) grad_norm 1.2167 (1.1468) [2022-01-18 17:33:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][140/1251] eta 0:43:23 lr 0.000961 time 2.5015 (2.3433) loss 4.2363 (4.0641) grad_norm 1.3116 (1.1378) [2022-01-18 17:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][150/1251] eta 0:42:52 lr 0.000961 time 2.1599 (2.3368) loss 4.2505 (4.0673) grad_norm 1.0568 (1.1319) [2022-01-18 17:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][160/1251] eta 0:42:16 lr 0.000961 time 2.6635 (2.3254) loss 3.7511 (4.0623) grad_norm 0.9643 (1.1275) [2022-01-18 17:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][170/1251] eta 0:41:31 lr 0.000961 time 1.8825 (2.3050) loss 4.9262 (4.0701) grad_norm 1.2693 (1.1289) [2022-01-18 17:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][180/1251] eta 0:40:56 lr 0.000961 time 1.8657 (2.2933) loss 4.5874 (4.0811) grad_norm 1.2989 (1.1270) [2022-01-18 17:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][190/1251] eta 0:40:17 lr 0.000961 time 1.8572 (2.2782) loss 3.7289 (4.0773) grad_norm 1.3458 (1.1271) [2022-01-18 17:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][200/1251] eta 0:39:48 lr 0.000961 time 1.8419 (2.2723) loss 4.3688 (4.0759) grad_norm 1.1951 (1.1277) [2022-01-18 17:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][210/1251] eta 0:39:28 lr 0.000961 time 2.0983 (2.2751) loss 4.1454 (4.0812) grad_norm 0.9219 (1.1243) [2022-01-18 17:36:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][220/1251] eta 0:39:06 lr 0.000961 time 2.5061 (2.2762) loss 4.5936 (4.0908) grad_norm 0.9677 (1.1207) [2022-01-18 17:36:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][230/1251] eta 0:38:43 lr 0.000961 time 2.2338 (2.2753) loss 4.4991 (4.0889) grad_norm 1.3237 (1.1263) [2022-01-18 17:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][240/1251] eta 0:38:14 lr 0.000961 time 1.6406 (2.2699) loss 3.9930 (4.0962) grad_norm 1.0491 (1.1286) [2022-01-18 17:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][250/1251] eta 0:37:42 lr 0.000961 time 1.8476 (2.2598) loss 4.7767 (4.1023) grad_norm 0.9194 (1.1281) [2022-01-18 17:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][260/1251] eta 0:37:16 lr 0.000961 time 2.8683 (2.2569) loss 4.2957 (4.1103) grad_norm 1.0478 (1.1275) [2022-01-18 17:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][270/1251] eta 0:36:49 lr 0.000961 time 2.0038 (2.2524) loss 3.9939 (4.1014) grad_norm 0.9081 (1.1253) [2022-01-18 17:38:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][280/1251] eta 0:36:26 lr 0.000961 time 1.9865 (2.2516) loss 4.4530 (4.0910) grad_norm 1.1297 (1.1276) [2022-01-18 17:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][290/1251] eta 0:36:03 lr 0.000961 time 2.2024 (2.2512) loss 3.9464 (4.0931) grad_norm 1.0808 (1.1307) [2022-01-18 17:39:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][300/1251] eta 0:35:46 lr 0.000961 time 2.6227 (2.2569) loss 3.8933 (4.0878) grad_norm 0.9830 (1.1277) [2022-01-18 17:39:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][310/1251] eta 0:35:27 lr 0.000961 time 2.4414 (2.2605) loss 4.8632 (4.0868) grad_norm 1.0769 (1.1281) [2022-01-18 17:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][320/1251] eta 0:35:05 lr 0.000961 time 
3.0782 (2.2616) loss 3.0116 (4.0817) grad_norm 1.0386 (1.1278) [2022-01-18 17:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][330/1251] eta 0:34:37 lr 0.000961 time 1.9148 (2.2560) loss 4.7738 (4.0903) grad_norm 1.1244 (1.1261) [2022-01-18 17:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][340/1251] eta 0:34:07 lr 0.000961 time 2.6635 (2.2471) loss 4.8250 (4.0899) grad_norm 0.9752 (1.1244) [2022-01-18 17:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][350/1251] eta 0:33:37 lr 0.000961 time 2.1788 (2.2394) loss 3.2096 (4.0872) grad_norm 1.7472 (1.1257) [2022-01-18 17:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][360/1251] eta 0:33:15 lr 0.000961 time 2.8191 (2.2396) loss 4.2277 (4.0896) grad_norm 1.0835 (1.1272) [2022-01-18 17:41:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][370/1251] eta 0:32:51 lr 0.000961 time 2.2199 (2.2374) loss 3.3180 (4.0921) grad_norm 1.4738 (1.1286) [2022-01-18 17:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][380/1251] eta 0:32:29 lr 0.000961 time 2.2266 (2.2379) loss 4.1426 (4.0925) grad_norm 1.1746 (1.1308) [2022-01-18 17:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][390/1251] eta 0:32:06 lr 0.000961 time 2.4987 (2.2371) loss 4.8120 (4.0883) grad_norm 1.1624 (1.1319) [2022-01-18 17:43:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][400/1251] eta 0:31:43 lr 0.000961 time 1.8668 (2.2365) loss 4.5177 (4.0899) grad_norm 1.1566 (1.1318) [2022-01-18 17:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][410/1251] eta 0:31:20 lr 0.000961 time 2.1104 (2.2363) loss 4.2469 (4.0872) grad_norm 1.2722 (1.1323) [2022-01-18 17:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][420/1251] eta 0:30:55 lr 0.000961 time 1.5289 (2.2325) loss 4.5856 (4.0856) grad_norm 1.3093 (1.1354) [2022-01-18 17:44:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][430/1251] eta 0:30:33 lr 0.000961 time 2.1670 (2.2333) loss 3.3319 (4.0808) grad_norm 1.1651 (1.1351) [2022-01-18 17:44:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][440/1251] eta 0:30:13 lr 0.000961 time 2.8313 (2.2361) loss 4.1565 (4.0747) grad_norm 1.2567 (1.1373) [2022-01-18 17:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][450/1251] eta 0:29:52 lr 0.000961 time 2.4187 (2.2383) loss 4.4631 (4.0789) grad_norm 1.1730 (1.1370) [2022-01-18 17:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][460/1251] eta 0:29:27 lr 0.000961 time 2.1983 (2.2351) loss 3.0743 (4.0747) grad_norm 1.1676 (1.1366) [2022-01-18 17:45:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][470/1251] eta 0:29:00 lr 0.000961 time 1.9436 (2.2285) loss 3.4458 (4.0779) grad_norm 1.0672 (1.1364) [2022-01-18 17:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][480/1251] eta 0:28:35 lr 0.000961 time 2.2144 (2.2254) loss 4.2972 (4.0852) grad_norm 1.0628 (1.1361) [2022-01-18 17:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][490/1251] eta 0:28:11 lr 0.000961 time 1.5213 (2.2229) loss 4.3271 (4.0836) grad_norm 1.0808 (1.1367) [2022-01-18 17:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][500/1251] eta 0:27:47 lr 0.000961 time 1.8991 (2.2200) loss 4.2382 (4.0878) grad_norm 1.0650 (1.1373) [2022-01-18 17:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][510/1251] eta 0:27:25 lr 0.000960 time 1.7764 (2.2203) loss 4.3347 (4.0886) grad_norm 1.1507 (1.1375) [2022-01-18 17:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][520/1251] eta 0:27:04 lr 0.000960 time 2.6835 (2.2218) loss 4.9231 (4.0883) grad_norm 1.1262 (1.1379) [2022-01-18 17:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][530/1251] eta 0:26:42 lr 0.000960 time 
1.6019 (2.2221) loss 4.8553 (4.0887) grad_norm 1.3569 (1.1381) [2022-01-18 17:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][540/1251] eta 0:26:20 lr 0.000960 time 1.7852 (2.2233) loss 3.1717 (4.0856) grad_norm 0.8713 (1.1392) [2022-01-18 17:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][550/1251] eta 0:25:59 lr 0.000960 time 1.8387 (2.2245) loss 3.5404 (4.0843) grad_norm 1.1727 (1.1398) [2022-01-18 17:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][560/1251] eta 0:25:37 lr 0.000960 time 2.5689 (2.2244) loss 3.0058 (4.0756) grad_norm 1.4336 (1.1411) [2022-01-18 17:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][570/1251] eta 0:25:12 lr 0.000960 time 1.5879 (2.2205) loss 4.5228 (4.0771) grad_norm 1.1025 (1.1410) [2022-01-18 17:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][580/1251] eta 0:24:50 lr 0.000960 time 2.3242 (2.2206) loss 4.7452 (4.0743) grad_norm 1.3591 (1.1412) [2022-01-18 17:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][590/1251] eta 0:24:26 lr 0.000960 time 1.8629 (2.2181) loss 4.1531 (4.0785) grad_norm 1.0769 (1.1399) [2022-01-18 17:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][600/1251] eta 0:24:03 lr 0.000960 time 1.9315 (2.2171) loss 4.7539 (4.0763) grad_norm 1.2354 (1.1410) [2022-01-18 17:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][610/1251] eta 0:23:39 lr 0.000960 time 2.4311 (2.2148) loss 4.1523 (4.0741) grad_norm 1.0828 (1.1405) [2022-01-18 17:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][620/1251] eta 0:23:15 lr 0.000960 time 1.8089 (2.2109) loss 5.0095 (4.0782) grad_norm 1.1626 (1.1407) [2022-01-18 17:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][630/1251] eta 0:22:51 lr 0.000960 time 2.2152 (2.2088) loss 4.1710 (4.0788) grad_norm 1.4069 (1.1416) [2022-01-18 17:51:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][640/1251] eta 0:22:29 lr 0.000960 time 2.7892 (2.2082) loss 3.3922 (4.0748) grad_norm 1.2415 (1.1413) [2022-01-18 17:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][650/1251] eta 0:22:08 lr 0.000960 time 5.1872 (2.2111) loss 4.7880 (4.0782) grad_norm 1.1033 (1.1410) [2022-01-18 17:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][660/1251] eta 0:21:46 lr 0.000960 time 2.1844 (2.2115) loss 4.0551 (4.0731) grad_norm 1.1823 (1.1428) [2022-01-18 17:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][670/1251] eta 0:21:25 lr 0.000960 time 2.3437 (2.2132) loss 3.9493 (4.0741) grad_norm 0.8892 (1.1418) [2022-01-18 17:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][680/1251] eta 0:21:04 lr 0.000960 time 2.2628 (2.2145) loss 3.5633 (4.0776) grad_norm 1.2461 (1.1416) [2022-01-18 17:53:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][690/1251] eta 0:20:42 lr 0.000960 time 2.9706 (2.2155) loss 3.6941 (4.0789) grad_norm 1.3066 (1.1415) [2022-01-18 17:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][700/1251] eta 0:20:19 lr 0.000960 time 2.2336 (2.2126) loss 4.9297 (4.0835) grad_norm 1.3096 (1.1416) [2022-01-18 17:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][710/1251] eta 0:19:55 lr 0.000960 time 1.7302 (2.2097) loss 4.3185 (4.0861) grad_norm 1.1373 (1.1402) [2022-01-18 17:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][720/1251] eta 0:19:32 lr 0.000960 time 1.9359 (2.2074) loss 4.2763 (4.0842) grad_norm 1.1387 (1.1401) [2022-01-18 17:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][730/1251] eta 0:19:10 lr 0.000960 time 2.1912 (2.2074) loss 4.9540 (4.0803) grad_norm 1.0364 (1.1405) [2022-01-18 17:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][740/1251] eta 0:18:46 lr 0.000960 time 
2.1962 (2.2051) loss 3.5246 (4.0805) grad_norm 1.1005 (1.1412) [2022-01-18 17:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][750/1251] eta 0:18:23 lr 0.000960 time 1.8586 (2.2033) loss 4.7629 (4.0862) grad_norm 1.2828 (1.1407) [2022-01-18 17:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][760/1251] eta 0:18:01 lr 0.000960 time 1.9094 (2.2022) loss 3.4045 (4.0836) grad_norm 1.2783 (1.1415) [2022-01-18 17:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][770/1251] eta 0:17:39 lr 0.000960 time 1.9178 (2.2021) loss 4.3013 (4.0804) grad_norm 0.9648 (1.1405) [2022-01-18 17:56:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][780/1251] eta 0:17:17 lr 0.000960 time 2.5616 (2.2021) loss 4.1034 (4.0773) grad_norm 1.1080 (1.1405) [2022-01-18 17:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][790/1251] eta 0:16:55 lr 0.000960 time 2.1242 (2.2030) loss 3.2339 (4.0761) grad_norm 0.9929 (1.1402) [2022-01-18 17:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][800/1251] eta 0:16:34 lr 0.000960 time 2.8473 (2.2044) loss 4.0079 (4.0797) grad_norm 1.0906 (1.1408) [2022-01-18 17:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][810/1251] eta 0:16:12 lr 0.000960 time 2.4519 (2.2063) loss 3.0361 (4.0785) grad_norm 1.1898 (1.1429) [2022-01-18 17:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][820/1251] eta 0:15:50 lr 0.000960 time 1.9488 (2.2058) loss 3.9721 (4.0776) grad_norm 1.1191 (1.1443) [2022-01-18 17:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][830/1251] eta 0:15:28 lr 0.000960 time 2.2489 (2.2056) loss 4.5573 (4.0795) grad_norm 1.2102 (1.1448) [2022-01-18 17:59:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][840/1251] eta 0:15:06 lr 0.000960 time 2.2269 (2.2052) loss 3.3723 (4.0810) grad_norm 1.2490 (1.1448) [2022-01-18 17:59:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][850/1251] eta 0:14:44 lr 0.000960 time 2.7491 (2.2058) loss 4.2912 (4.0809) grad_norm 1.2232 (1.1444) [2022-01-18 17:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][860/1251] eta 0:14:21 lr 0.000960 time 1.9461 (2.2044) loss 4.3405 (4.0842) grad_norm 1.1857 (1.1447) [2022-01-18 18:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][870/1251] eta 0:13:58 lr 0.000960 time 1.9581 (2.2013) loss 2.8597 (4.0838) grad_norm 0.8910 (1.1439) [2022-01-18 18:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][880/1251] eta 0:13:35 lr 0.000960 time 2.0190 (2.1994) loss 4.8646 (4.0832) grad_norm 0.9413 (1.1436) [2022-01-18 18:00:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][890/1251] eta 0:13:14 lr 0.000960 time 2.2971 (2.2002) loss 4.2518 (4.0809) grad_norm 1.5510 (1.1445) [2022-01-18 18:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][900/1251] eta 0:12:51 lr 0.000960 time 1.8838 (2.1988) loss 3.9783 (4.0866) grad_norm 1.0219 (1.1455) [2022-01-18 18:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][910/1251] eta 0:12:29 lr 0.000960 time 2.4444 (2.1978) loss 4.2365 (4.0893) grad_norm 1.0629 (1.1452) [2022-01-18 18:01:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][920/1251] eta 0:12:07 lr 0.000960 time 2.1239 (2.1983) loss 2.8163 (4.0882) grad_norm 0.9106 (1.1448) [2022-01-18 18:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][930/1251] eta 0:11:46 lr 0.000960 time 2.6837 (2.2001) loss 4.3115 (4.0878) grad_norm 0.9464 (1.1436) [2022-01-18 18:02:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][940/1251] eta 0:11:24 lr 0.000960 time 1.8106 (2.2020) loss 4.1651 (4.0868) grad_norm 1.0573 (1.1440) [2022-01-18 18:03:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][950/1251] eta 0:11:02 lr 0.000960 time 
2.4651 (2.2026) loss 4.8513 (4.0875) grad_norm 0.9727 (1.1441) [2022-01-18 18:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][960/1251] eta 0:10:40 lr 0.000960 time 1.6137 (2.2007) loss 3.6942 (4.0860) grad_norm 0.9702 (1.1438) [2022-01-18 18:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][970/1251] eta 0:10:17 lr 0.000960 time 2.3359 (2.1982) loss 3.0498 (4.0876) grad_norm 0.9436 (1.1442) [2022-01-18 18:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][980/1251] eta 0:09:55 lr 0.000960 time 1.9238 (2.1973) loss 3.2497 (4.0872) grad_norm 0.9781 (1.1432) [2022-01-18 18:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][990/1251] eta 0:09:33 lr 0.000960 time 2.8237 (2.1985) loss 3.2404 (4.0854) grad_norm 1.5387 (1.1435) [2022-01-18 18:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1000/1251] eta 0:09:12 lr 0.000960 time 1.5118 (2.1999) loss 4.7784 (4.0852) grad_norm 1.1906 (1.1432) [2022-01-18 18:05:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1010/1251] eta 0:08:51 lr 0.000960 time 3.9849 (2.2034) loss 3.2201 (4.0809) grad_norm 1.0723 (1.1435) [2022-01-18 18:05:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1020/1251] eta 0:08:28 lr 0.000960 time 1.9051 (2.2034) loss 4.5333 (4.0829) grad_norm 1.0654 (1.1433) [2022-01-18 18:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1030/1251] eta 0:08:06 lr 0.000960 time 1.9494 (2.2016) loss 3.4294 (4.0821) grad_norm 1.1843 (1.1433) [2022-01-18 18:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1040/1251] eta 0:07:43 lr 0.000960 time 1.6209 (2.1983) loss 4.0042 (4.0815) grad_norm 1.1700 (1.1437) [2022-01-18 18:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1050/1251] eta 0:07:21 lr 0.000960 time 2.1661 (2.1968) loss 4.6151 (4.0827) grad_norm 1.1531 (1.1431) [2022-01-18 18:06:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1060/1251] eta 0:06:59 lr 0.000960 time 1.8282 (2.1951) loss 3.5548 (4.0825) grad_norm 1.1684 (1.1438) [2022-01-18 18:07:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1070/1251] eta 0:06:37 lr 0.000960 time 2.2088 (2.1941) loss 3.6549 (4.0822) grad_norm 1.1767 (1.1431) [2022-01-18 18:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1080/1251] eta 0:06:15 lr 0.000960 time 2.3659 (2.1945) loss 3.8005 (4.0821) grad_norm 1.0509 (1.1428) [2022-01-18 18:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1090/1251] eta 0:05:53 lr 0.000960 time 2.2453 (2.1953) loss 4.7621 (4.0834) grad_norm 1.2368 (1.1424) [2022-01-18 18:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1100/1251] eta 0:05:31 lr 0.000960 time 2.1209 (2.1947) loss 2.8044 (4.0809) grad_norm 1.1129 (1.1419) [2022-01-18 18:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1110/1251] eta 0:05:09 lr 0.000960 time 1.5349 (2.1942) loss 4.0125 (4.0825) grad_norm 1.1483 (1.1414) [2022-01-18 18:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1120/1251] eta 0:04:47 lr 0.000960 time 2.1496 (2.1950) loss 4.0718 (4.0802) grad_norm 1.4415 (1.1414) [2022-01-18 18:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1130/1251] eta 0:04:25 lr 0.000959 time 2.8513 (2.1967) loss 4.2353 (4.0781) grad_norm 1.0400 (1.1409) [2022-01-18 18:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1140/1251] eta 0:04:04 lr 0.000959 time 2.7492 (2.1988) loss 4.2904 (4.0783) grad_norm 1.2523 (1.1407) [2022-01-18 18:10:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1150/1251] eta 0:03:41 lr 0.000959 time 1.6617 (2.1979) loss 4.8394 (4.0803) grad_norm 1.0393 (1.1409) [2022-01-18 18:10:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1160/1251] eta 0:03:20 lr 
0.000959 time 1.6329 (2.1981) loss 3.4448 (4.0779) grad_norm 1.0473 (1.1405) [2022-01-18 18:11:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1170/1251] eta 0:02:58 lr 0.000959 time 2.1253 (2.1986) loss 4.4677 (4.0772) grad_norm 1.0077 (1.1399) [2022-01-18 18:11:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1180/1251] eta 0:02:36 lr 0.000959 time 3.2578 (2.1999) loss 4.3198 (4.0742) grad_norm 0.9817 (1.1397) [2022-01-18 18:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1190/1251] eta 0:02:14 lr 0.000959 time 1.7779 (2.1991) loss 4.3783 (4.0743) grad_norm 1.1448 (1.1398) [2022-01-18 18:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1200/1251] eta 0:01:52 lr 0.000959 time 2.0384 (2.1973) loss 3.6128 (4.0736) grad_norm 1.3932 (1.1396) [2022-01-18 18:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1210/1251] eta 0:01:30 lr 0.000959 time 1.9507 (2.1951) loss 4.3861 (4.0737) grad_norm 1.0869 (1.1396) [2022-01-18 18:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1220/1251] eta 0:01:08 lr 0.000959 time 2.2145 (2.1939) loss 3.2165 (4.0748) grad_norm 0.9288 (1.1387) [2022-01-18 18:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1230/1251] eta 0:00:46 lr 0.000959 time 3.5830 (2.1951) loss 3.9938 (4.0760) grad_norm 1.0808 (1.1383) [2022-01-18 18:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1240/1251] eta 0:00:24 lr 0.000959 time 2.0107 (2.1949) loss 3.9547 (4.0760) grad_norm 1.0462 (1.1382) [2022-01-18 18:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [38/300][1250/1251] eta 0:00:02 lr 0.000959 time 1.1693 (2.1891) loss 4.8486 (4.0745) grad_norm 1.2070 (1.1380) [2022-01-18 18:13:48 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 38 training takes 0:45:39 [2022-01-18 18:14:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.957 (18.957) Loss 
1.3691 (1.3691) Acc@1 68.555 (68.555) Acc@5 89.258 (89.258) [2022-01-18 18:14:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.320 (3.204) Loss 1.3896 (1.3800) Acc@1 67.090 (68.040) Acc@5 88.281 (88.627) [2022-01-18 18:14:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.903 (2.543) Loss 1.2441 (1.3525) Acc@1 69.922 (68.345) Acc@5 90.527 (89.160) [2022-01-18 18:14:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.964 (2.252) Loss 1.3803 (1.3506) Acc@1 66.406 (68.422) Acc@5 88.965 (89.132) [2022-01-18 18:15:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.549 (2.187) Loss 1.3653 (1.3591) Acc@1 67.773 (68.236) Acc@5 88.770 (88.982) [2022-01-18 18:15:25 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 68.270 Acc@5 88.998 [2022-01-18 18:15:25 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 68.3% [2022-01-18 18:15:25 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 68.27% [2022-01-18 18:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][0/1251] eta 7:37:03 lr 0.000959 time 21.9212 (21.9212) loss 3.8893 (3.8893) grad_norm 1.0841 (1.0841) [2022-01-18 18:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][10/1251] eta 1:23:09 lr 0.000959 time 2.2200 (4.0203) loss 4.7712 (4.1353) grad_norm 1.1400 (1.0739) [2022-01-18 18:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][20/1251] eta 1:03:00 lr 0.000959 time 2.1653 (3.0714) loss 4.3982 (4.1674) grad_norm 1.0655 (1.1074) [2022-01-18 18:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][30/1251] eta 0:55:43 lr 0.000959 time 1.4050 (2.7387) loss 4.0332 (4.1514) grad_norm 1.3284 (1.1147) [2022-01-18 18:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][40/1251] eta 0:53:53 lr 0.000959 time 3.8621 (2.6699) loss 3.4564 (4.0763) grad_norm 1.1787 (1.1622) 
[2022-01-18 18:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][50/1251] eta 0:51:35 lr 0.000959 time 2.2512 (2.5771) loss 3.9982 (4.0580) grad_norm 1.3717 (1.1637) [2022-01-18 18:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][60/1251] eta 0:50:32 lr 0.000959 time 2.4515 (2.5462) loss 4.4186 (4.0458) grad_norm 1.0119 (1.1552) [2022-01-18 18:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][70/1251] eta 0:48:58 lr 0.000959 time 2.8049 (2.4883) loss 4.1374 (4.0243) grad_norm 1.0635 (1.1650) [2022-01-18 18:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][80/1251] eta 0:48:18 lr 0.000959 time 3.1320 (2.4752) loss 3.4766 (4.0056) grad_norm 1.1057 (1.1520) [2022-01-18 18:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][90/1251] eta 0:47:20 lr 0.000959 time 2.5607 (2.4466) loss 4.2812 (4.0239) grad_norm 1.1456 (1.1524) [2022-01-18 18:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][100/1251] eta 0:46:18 lr 0.000959 time 2.1818 (2.4142) loss 3.5649 (4.0505) grad_norm 1.2057 (1.1607) [2022-01-18 18:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][110/1251] eta 0:45:36 lr 0.000959 time 2.9725 (2.3986) loss 4.0898 (4.0700) grad_norm 0.9117 (1.1593) [2022-01-18 18:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][120/1251] eta 0:44:55 lr 0.000959 time 2.9642 (2.3834) loss 4.4745 (4.0722) grad_norm 1.3223 (1.1602) [2022-01-18 18:20:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][130/1251] eta 0:44:06 lr 0.000959 time 2.3143 (2.3609) loss 4.2329 (4.1019) grad_norm 1.1783 (1.1628) [2022-01-18 18:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][140/1251] eta 0:43:27 lr 0.000959 time 2.2373 (2.3474) loss 3.9421 (4.0951) grad_norm 1.2603 (1.1568) [2022-01-18 18:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][150/1251] eta 0:42:50 lr 
0.000959 time 2.3367 (2.3347) loss 4.3533 (4.0953) grad_norm 0.9203 (1.1569) [2022-01-18 18:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][160/1251] eta 0:42:16 lr 0.000959 time 2.6101 (2.3245) loss 4.7298 (4.0931) grad_norm 0.9587 (1.1549) [2022-01-18 18:22:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][170/1251] eta 0:41:41 lr 0.000959 time 2.9188 (2.3143) loss 4.5078 (4.0848) grad_norm 1.0276 (1.1490) [2022-01-18 18:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][180/1251] eta 0:41:03 lr 0.000959 time 1.6552 (2.3000) loss 4.1402 (4.0854) grad_norm 1.3934 (1.1539) [2022-01-18 18:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][190/1251] eta 0:40:24 lr 0.000959 time 1.9237 (2.2851) loss 4.4510 (4.0965) grad_norm 1.0107 (1.1595) [2022-01-18 18:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][200/1251] eta 0:39:50 lr 0.000959 time 1.8229 (2.2740) loss 3.2060 (4.0959) grad_norm 0.9821 (1.1539) [2022-01-18 18:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][210/1251] eta 0:39:31 lr 0.000959 time 3.0976 (2.2780) loss 3.6648 (4.0870) grad_norm 1.0749 (1.1520) [2022-01-18 18:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][220/1251] eta 0:39:04 lr 0.000959 time 2.2227 (2.2741) loss 4.1961 (4.0951) grad_norm 1.0029 (1.1485) [2022-01-18 18:24:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][230/1251] eta 0:38:37 lr 0.000959 time 1.8343 (2.2700) loss 4.5053 (4.0936) grad_norm 1.4684 (1.1507) [2022-01-18 18:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][240/1251] eta 0:38:12 lr 0.000959 time 2.1644 (2.2679) loss 4.0214 (4.0958) grad_norm 1.1056 (1.1513) [2022-01-18 18:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][250/1251] eta 0:37:46 lr 0.000959 time 2.3749 (2.2638) loss 3.5241 (4.0814) grad_norm 1.0860 (1.1535) [2022-01-18 18:25:15 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][260/1251] eta 0:37:20 lr 0.000959 time 2.4302 (2.2610) loss 4.6459 (4.0862) grad_norm 1.0786 (1.1478) [2022-01-18 18:25:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][270/1251] eta 0:36:51 lr 0.000959 time 1.7215 (2.2539) loss 4.2915 (4.0875) grad_norm 1.0455 (1.1481) [2022-01-18 18:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][280/1251] eta 0:36:25 lr 0.000959 time 1.8755 (2.2506) loss 3.5316 (4.0834) grad_norm 0.9474 (1.1456) [2022-01-18 18:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][290/1251] eta 0:36:01 lr 0.000959 time 2.4813 (2.2490) loss 3.9581 (4.0732) grad_norm 0.9611 (1.1429) [2022-01-18 18:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][300/1251] eta 0:35:42 lr 0.000959 time 2.3620 (2.2528) loss 4.8316 (4.0732) grad_norm 1.0390 (1.1426) [2022-01-18 18:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][310/1251] eta 0:35:22 lr 0.000959 time 1.9583 (2.2557) loss 4.8591 (4.0739) grad_norm 1.0456 (1.1388) [2022-01-18 18:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][320/1251] eta 0:35:01 lr 0.000959 time 1.8983 (2.2569) loss 4.5801 (4.0787) grad_norm 0.9225 (1.1372) [2022-01-18 18:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][330/1251] eta 0:34:33 lr 0.000959 time 1.9211 (2.2518) loss 2.8390 (4.0758) grad_norm 1.1359 (1.1377) [2022-01-18 18:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][340/1251] eta 0:34:05 lr 0.000959 time 1.9126 (2.2448) loss 4.4613 (4.0725) grad_norm 1.0770 (1.1393) [2022-01-18 18:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][350/1251] eta 0:33:36 lr 0.000959 time 1.8983 (2.2381) loss 4.3013 (4.0740) grad_norm 1.0503 (1.1412) [2022-01-18 18:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][360/1251] eta 0:33:10 lr 0.000959 time 
2.4640 (2.2337) loss 3.4803 (4.0652) grad_norm 1.4147 (1.1420) [2022-01-18 18:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][370/1251] eta 0:32:45 lr 0.000959 time 2.1547 (2.2310) loss 3.0975 (4.0678) grad_norm 1.2656 (1.1433) [2022-01-18 18:29:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][380/1251] eta 0:32:19 lr 0.000959 time 1.6546 (2.2269) loss 3.7900 (4.0716) grad_norm 1.1779 (1.1471) [2022-01-18 18:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][390/1251] eta 0:31:57 lr 0.000959 time 1.9826 (2.2269) loss 3.0132 (4.0705) grad_norm 1.3449 (1.1470) [2022-01-18 18:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][400/1251] eta 0:31:34 lr 0.000959 time 2.6369 (2.2259) loss 4.4540 (4.0750) grad_norm 1.0112 (1.1463) [2022-01-18 18:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][410/1251] eta 0:31:16 lr 0.000959 time 2.1480 (2.2310) loss 3.6751 (4.0760) grad_norm 1.1367 (1.1452) [2022-01-18 18:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][420/1251] eta 0:30:52 lr 0.000959 time 2.4432 (2.2294) loss 3.3715 (4.0778) grad_norm 1.1653 (1.1479) [2022-01-18 18:31:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][430/1251] eta 0:30:29 lr 0.000959 time 1.6057 (2.2284) loss 4.3734 (4.0826) grad_norm 1.1358 (1.1478) [2022-01-18 18:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][440/1251] eta 0:30:05 lr 0.000959 time 2.0038 (2.2266) loss 3.2692 (4.0834) grad_norm 1.5829 (1.1496) [2022-01-18 18:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][450/1251] eta 0:29:44 lr 0.000959 time 2.5307 (2.2280) loss 3.8490 (4.0775) grad_norm 0.9140 (1.1494) [2022-01-18 18:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][460/1251] eta 0:29:20 lr 0.000959 time 1.6852 (2.2258) loss 2.8408 (4.0686) grad_norm 1.6563 (1.1512) [2022-01-18 18:32:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][470/1251] eta 0:28:59 lr 0.000959 time 1.5747 (2.2272) loss 4.0097 (4.0717) grad_norm 1.0135 (1.1506) [2022-01-18 18:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][480/1251] eta 0:28:35 lr 0.000958 time 2.2478 (2.2257) loss 3.9598 (4.0753) grad_norm 0.9696 (1.1510) [2022-01-18 18:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][490/1251] eta 0:28:12 lr 0.000958 time 2.1299 (2.2236) loss 4.3136 (4.0784) grad_norm 1.0881 (1.1534) [2022-01-18 18:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][500/1251] eta 0:27:46 lr 0.000958 time 1.8801 (2.2195) loss 4.3422 (4.0718) grad_norm 0.8909 (1.1558) [2022-01-18 18:34:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][510/1251] eta 0:27:23 lr 0.000958 time 2.3315 (2.2180) loss 3.7610 (4.0730) grad_norm 0.9455 (1.1547) [2022-01-18 18:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][520/1251] eta 0:26:59 lr 0.000958 time 1.6233 (2.2151) loss 4.8884 (4.0788) grad_norm 1.1144 (1.1540) [2022-01-18 18:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][530/1251] eta 0:26:38 lr 0.000958 time 2.3063 (2.2171) loss 3.6721 (4.0749) grad_norm 0.9570 (1.1533) [2022-01-18 18:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][540/1251] eta 0:26:16 lr 0.000958 time 2.1102 (2.2170) loss 4.3199 (4.0730) grad_norm 1.0997 (1.1526) [2022-01-18 18:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][550/1251] eta 0:25:56 lr 0.000958 time 2.2752 (2.2200) loss 3.9576 (4.0739) grad_norm 0.9100 (1.1507) [2022-01-18 18:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][560/1251] eta 0:25:36 lr 0.000958 time 2.3871 (2.2234) loss 2.7375 (4.0745) grad_norm 1.0801 (1.1504) [2022-01-18 18:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][570/1251] eta 0:25:13 lr 0.000958 time 
1.9394 (2.2221) loss 4.3871 (4.0782) grad_norm 1.4162 (1.1504) [2022-01-18 18:36:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][580/1251] eta 0:24:48 lr 0.000958 time 2.0602 (2.2187) loss 4.1040 (4.0805) grad_norm 1.1593 (1.1498) [2022-01-18 18:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][590/1251] eta 0:24:24 lr 0.000958 time 2.5830 (2.2152) loss 4.5630 (4.0824) grad_norm 1.1198 (1.1493) [2022-01-18 18:37:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][600/1251] eta 0:24:01 lr 0.000958 time 2.6831 (2.2142) loss 4.6484 (4.0863) grad_norm 1.0349 (1.1504) [2022-01-18 18:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][610/1251] eta 0:23:38 lr 0.000958 time 1.9901 (2.2127) loss 4.5786 (4.0928) grad_norm 1.3965 (1.1508) [2022-01-18 18:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][620/1251] eta 0:23:15 lr 0.000958 time 2.1589 (2.2118) loss 4.1688 (4.0905) grad_norm 1.4859 (1.1508) [2022-01-18 18:38:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][630/1251] eta 0:22:52 lr 0.000958 time 2.1347 (2.2106) loss 4.6968 (4.0907) grad_norm 1.4435 (1.1525) [2022-01-18 18:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][640/1251] eta 0:22:30 lr 0.000958 time 2.6365 (2.2103) loss 3.6228 (4.0869) grad_norm 1.0427 (1.1506) [2022-01-18 18:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][650/1251] eta 0:22:09 lr 0.000958 time 2.5271 (2.2120) loss 4.1562 (4.0863) grad_norm 1.1377 (1.1498) [2022-01-18 18:39:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][660/1251] eta 0:21:49 lr 0.000958 time 2.9938 (2.2152) loss 3.5212 (4.0848) grad_norm 0.9307 (1.1478) [2022-01-18 18:40:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][670/1251] eta 0:21:28 lr 0.000958 time 2.8971 (2.2175) loss 4.0427 (4.0895) grad_norm 1.2021 (1.1462) [2022-01-18 18:40:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][680/1251] eta 0:21:06 lr 0.000958 time 2.3056 (2.2183) loss 2.6675 (4.0864) grad_norm 1.0092 (1.1474) [2022-01-18 18:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][690/1251] eta 0:20:43 lr 0.000958 time 1.7455 (2.2171) loss 2.9885 (4.0834) grad_norm 0.9088 (1.1467) [2022-01-18 18:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][700/1251] eta 0:20:20 lr 0.000958 time 2.1721 (2.2147) loss 3.6435 (4.0810) grad_norm 1.5199 (1.1474) [2022-01-18 18:41:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][710/1251] eta 0:19:55 lr 0.000958 time 2.1789 (2.2102) loss 4.7146 (4.0822) grad_norm 1.3464 (1.1478) [2022-01-18 18:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][720/1251] eta 0:19:31 lr 0.000958 time 1.9640 (2.2064) loss 4.1234 (4.0815) grad_norm 1.0580 (1.1485) [2022-01-18 18:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][730/1251] eta 0:19:08 lr 0.000958 time 1.8639 (2.2039) loss 3.4999 (4.0833) grad_norm 1.0371 (1.1475) [2022-01-18 18:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][740/1251] eta 0:18:46 lr 0.000958 time 2.1101 (2.2039) loss 3.8887 (4.0796) grad_norm 0.9620 (1.1486) [2022-01-18 18:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][750/1251] eta 0:18:24 lr 0.000958 time 2.5295 (2.2056) loss 4.2909 (4.0791) grad_norm 1.1689 (1.1485) [2022-01-18 18:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][760/1251] eta 0:18:04 lr 0.000958 time 2.4596 (2.2079) loss 3.0443 (4.0781) grad_norm 1.2847 (1.1491) [2022-01-18 18:43:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][770/1251] eta 0:17:42 lr 0.000958 time 1.8247 (2.2085) loss 3.4654 (4.0774) grad_norm 1.1153 (1.1487) [2022-01-18 18:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][780/1251] eta 0:17:20 lr 0.000958 time 
2.4949 (2.2100) loss 4.7231 (4.0760) grad_norm 1.1020 (1.1495) [2022-01-18 18:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][790/1251] eta 0:16:59 lr 0.000958 time 2.4983 (2.2117) loss 3.4433 (4.0736) grad_norm 1.0227 (1.1484) [2022-01-18 18:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][800/1251] eta 0:16:37 lr 0.000958 time 2.9709 (2.2127) loss 4.5960 (4.0757) grad_norm 0.8966 (1.1471) [2022-01-18 18:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][810/1251] eta 0:16:14 lr 0.000958 time 1.8275 (2.2105) loss 3.7162 (4.0749) grad_norm 0.9304 (1.1470) [2022-01-18 18:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][820/1251] eta 0:15:51 lr 0.000958 time 2.2926 (2.2075) loss 4.5216 (4.0752) grad_norm 1.2613 (1.1490) [2022-01-18 18:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][830/1251] eta 0:15:28 lr 0.000958 time 1.8660 (2.2055) loss 3.4442 (4.0730) grad_norm 1.0397 (1.1486) [2022-01-18 18:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][840/1251] eta 0:15:06 lr 0.000958 time 2.2392 (2.2044) loss 3.0904 (4.0727) grad_norm 1.1730 (1.1494) [2022-01-18 18:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][850/1251] eta 0:14:44 lr 0.000958 time 2.6752 (2.2051) loss 2.8051 (4.0704) grad_norm 1.1140 (1.1491) [2022-01-18 18:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][860/1251] eta 0:14:22 lr 0.000958 time 2.5688 (2.2057) loss 4.6915 (4.0739) grad_norm 0.9574 (1.1499) [2022-01-18 18:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][870/1251] eta 0:14:00 lr 0.000958 time 2.1746 (2.2060) loss 4.3201 (4.0746) grad_norm 1.1512 (1.1503) [2022-01-18 18:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][880/1251] eta 0:13:39 lr 0.000958 time 2.2056 (2.2085) loss 4.5355 (4.0726) grad_norm 1.1351 (1.1502) [2022-01-18 18:48:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][890/1251] eta 0:13:17 lr 0.000958 time 1.9515 (2.2089) loss 3.7819 (4.0725) grad_norm 1.2068 (1.1503) [2022-01-18 18:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][900/1251] eta 0:12:55 lr 0.000958 time 2.9994 (2.2088) loss 3.9427 (4.0752) grad_norm 0.9886 (1.1503) [2022-01-18 18:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][910/1251] eta 0:12:32 lr 0.000958 time 1.9418 (2.2057) loss 4.1269 (4.0746) grad_norm 0.9187 (1.1493) [2022-01-18 18:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][920/1251] eta 0:12:09 lr 0.000958 time 2.2983 (2.2046) loss 3.4806 (4.0702) grad_norm 1.0048 (1.1487) [2022-01-18 18:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][930/1251] eta 0:11:46 lr 0.000958 time 1.9176 (2.2018) loss 4.3262 (4.0687) grad_norm 1.6248 (1.1489) [2022-01-18 18:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][940/1251] eta 0:11:24 lr 0.000958 time 2.1897 (2.2005) loss 4.3638 (4.0683) grad_norm 1.4259 (1.1502) [2022-01-18 18:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][950/1251] eta 0:11:02 lr 0.000958 time 2.9007 (2.2011) loss 4.1049 (4.0705) grad_norm 1.1557 (1.1513) [2022-01-18 18:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][960/1251] eta 0:10:40 lr 0.000958 time 1.6165 (2.2007) loss 3.7642 (4.0693) grad_norm 0.8665 (1.1507) [2022-01-18 18:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][970/1251] eta 0:10:18 lr 0.000958 time 1.9182 (2.2011) loss 4.9558 (4.0683) grad_norm 1.1867 (1.1507) [2022-01-18 18:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][980/1251] eta 0:09:57 lr 0.000958 time 2.4402 (2.2031) loss 3.2495 (4.0693) grad_norm 1.1240 (1.1496) [2022-01-18 18:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][990/1251] eta 0:09:35 lr 0.000958 time 
2.3207 (2.2032) loss 4.7823 (4.0684) grad_norm 1.1643 (1.1489) [2022-01-18 18:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1000/1251] eta 0:09:12 lr 0.000958 time 1.8731 (2.2029) loss 3.8458 (4.0682) grad_norm 1.2532 (1.1485) [2022-01-18 18:52:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1010/1251] eta 0:08:51 lr 0.000958 time 2.1226 (2.2041) loss 3.3757 (4.0663) grad_norm 1.1045 (1.1491) [2022-01-18 18:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1020/1251] eta 0:08:28 lr 0.000958 time 2.3253 (2.2027) loss 3.4771 (4.0665) grad_norm 1.4757 (1.1491) [2022-01-18 18:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1030/1251] eta 0:08:06 lr 0.000958 time 2.2680 (2.2022) loss 4.0585 (4.0689) grad_norm 1.0028 (1.1491) [2022-01-18 18:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1040/1251] eta 0:07:44 lr 0.000958 time 1.7436 (2.2015) loss 3.7905 (4.0690) grad_norm 0.9792 (1.1485) [2022-01-18 18:53:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1050/1251] eta 0:07:22 lr 0.000958 time 2.3309 (2.2006) loss 3.4942 (4.0685) grad_norm 1.1417 (1.1481) [2022-01-18 18:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1060/1251] eta 0:07:00 lr 0.000958 time 2.2562 (2.2001) loss 3.0320 (4.0679) grad_norm 0.9925 (1.1481) [2022-01-18 18:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1070/1251] eta 0:06:37 lr 0.000958 time 1.8363 (2.1985) loss 4.2658 (4.0653) grad_norm 0.9736 (1.1476) [2022-01-18 18:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1080/1251] eta 0:06:15 lr 0.000957 time 1.5845 (2.1973) loss 4.7271 (4.0674) grad_norm 1.0785 (1.1475) [2022-01-18 18:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1090/1251] eta 0:05:53 lr 0.000957 time 2.8846 (2.1978) loss 3.9986 (4.0686) grad_norm 1.1860 (1.1472) [2022-01-18 18:55:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1100/1251] eta 0:05:31 lr 0.000957 time 2.8024 (2.1983) loss 4.0356 (4.0694) grad_norm 1.0567 (1.1462) [2022-01-18 18:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1110/1251] eta 0:05:09 lr 0.000957 time 1.5304 (2.1983) loss 4.3610 (4.0715) grad_norm 1.2161 (1.1463) [2022-01-18 18:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1120/1251] eta 0:04:48 lr 0.000957 time 1.8976 (2.1988) loss 4.1175 (4.0727) grad_norm 1.0808 (1.1459) [2022-01-18 18:56:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1130/1251] eta 0:04:26 lr 0.000957 time 3.6404 (2.1996) loss 3.0747 (4.0728) grad_norm 1.1263 (1.1466) [2022-01-18 18:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1140/1251] eta 0:04:04 lr 0.000957 time 2.8976 (2.2000) loss 4.7843 (4.0735) grad_norm 1.0248 (1.1458) [2022-01-18 18:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1150/1251] eta 0:03:42 lr 0.000957 time 1.5036 (2.1990) loss 4.3774 (4.0761) grad_norm 1.0606 (1.1452) [2022-01-18 18:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1160/1251] eta 0:03:20 lr 0.000957 time 1.6238 (2.1983) loss 3.9646 (4.0781) grad_norm 1.0668 (1.1445) [2022-01-18 18:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1170/1251] eta 0:02:58 lr 0.000957 time 2.2908 (2.1979) loss 3.9728 (4.0760) grad_norm 1.1524 (1.1444) [2022-01-18 18:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1180/1251] eta 0:02:36 lr 0.000957 time 3.5977 (2.1992) loss 3.0475 (4.0744) grad_norm 1.1613 (1.1449) [2022-01-18 18:59:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1190/1251] eta 0:02:14 lr 0.000957 time 1.9562 (2.1984) loss 4.3455 (4.0769) grad_norm 1.2690 (1.1447) [2022-01-18 18:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1200/1251] eta 0:01:52 lr 
0.000957 time 2.0704 (2.1970) loss 4.2696 (4.0780) grad_norm 0.9993 (1.1449) [2022-01-18 18:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1210/1251] eta 0:01:30 lr 0.000957 time 2.2415 (2.1970) loss 4.0963 (4.0756) grad_norm 1.3387 (1.1447) [2022-01-18 19:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1220/1251] eta 0:01:08 lr 0.000957 time 4.1003 (2.1991) loss 4.0147 (4.0734) grad_norm 0.9045 (1.1443) [2022-01-18 19:00:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1230/1251] eta 0:00:46 lr 0.000957 time 1.5420 (2.1996) loss 4.1268 (4.0726) grad_norm 1.1326 (1.1450) [2022-01-18 19:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1240/1251] eta 0:00:24 lr 0.000957 time 1.3593 (2.1983) loss 3.4425 (4.0723) grad_norm 1.4103 (1.1456) [2022-01-18 19:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [39/300][1250/1251] eta 0:00:02 lr 0.000957 time 1.1751 (2.1929) loss 3.0795 (4.0689) grad_norm 1.0714 (1.1452) [2022-01-18 19:01:08 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 39 training takes 0:45:43 [2022-01-18 19:01:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.610 (18.610) Loss 1.3957 (1.3957) Acc@1 67.871 (67.871) Acc@5 88.281 (88.281) [2022-01-18 19:01:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.653 (3.495) Loss 1.3314 (1.3678) Acc@1 69.531 (68.430) Acc@5 89.355 (89.311) [2022-01-18 19:02:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 2.035 (2.669) Loss 1.4798 (1.3671) Acc@1 66.602 (68.694) Acc@5 87.891 (89.235) [2022-01-18 19:02:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.585 (2.282) Loss 1.2882 (1.3697) Acc@1 70.801 (68.564) Acc@5 89.453 (89.157) [2022-01-18 19:02:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.243 (2.214) Loss 1.4574 (1.3776) Acc@1 65.234 (68.467) Acc@5 88.379 (88.998) [2022-01-18 19:02:46 
swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 68.434 Acc@5 89.086 [2022-01-18 19:02:46 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 68.4% [2022-01-18 19:02:46 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 68.43% [2022-01-18 19:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][0/1251] eta 7:13:58 lr 0.000957 time 20.8138 (20.8138) loss 2.9756 (2.9756) grad_norm 1.2812 (1.2812) [2022-01-18 19:03:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][10/1251] eta 1:24:34 lr 0.000957 time 2.1778 (4.0890) loss 4.0547 (3.9154) grad_norm 0.9999 (1.1261) [2022-01-18 19:03:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][20/1251] eta 1:03:46 lr 0.000957 time 2.0111 (3.1084) loss 4.0878 (4.0191) grad_norm 1.0377 (1.1275) [2022-01-18 19:04:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][30/1251] eta 0:56:52 lr 0.000957 time 1.6620 (2.7944) loss 3.6880 (4.1120) grad_norm 1.3633 (1.1456) [2022-01-18 19:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][40/1251] eta 0:54:17 lr 0.000957 time 3.7353 (2.6897) loss 4.8543 (4.0952) grad_norm 1.0872 (1.1328) [2022-01-18 19:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][50/1251] eta 0:52:36 lr 0.000957 time 2.1395 (2.6281) loss 3.3893 (4.0649) grad_norm 1.1522 (1.1184) [2022-01-18 19:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][60/1251] eta 0:50:36 lr 0.000957 time 2.3417 (2.5498) loss 2.7827 (4.0465) grad_norm 1.0075 (1.1036) [2022-01-18 19:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][70/1251] eta 0:48:57 lr 0.000957 time 1.5377 (2.4873) loss 4.0968 (3.9826) grad_norm 0.9660 (1.0924) [2022-01-18 19:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][80/1251] eta 0:48:03 lr 0.000957 time 3.7900 (2.4623) loss 4.2443 (4.0110) grad_norm 1.0835 (1.0898) 
[2022-01-18 19:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][90/1251] eta 0:47:35 lr 0.000957 time 2.3932 (2.4594) loss 4.2998 (3.9643) grad_norm 1.0144 (1.0887) [2022-01-18 19:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][100/1251] eta 0:46:38 lr 0.000957 time 1.9247 (2.4314) loss 4.6283 (3.9710) grad_norm 1.1115 (1.0924) [2022-01-18 19:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][110/1251] eta 0:45:33 lr 0.000957 time 1.7264 (2.3956) loss 2.9613 (3.9626) grad_norm 1.1302 (1.0978) [2022-01-18 19:07:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][120/1251] eta 0:44:53 lr 0.000957 time 3.4836 (2.3818) loss 5.0576 (3.9797) grad_norm 1.5222 (1.1078) [2022-01-18 19:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][130/1251] eta 0:44:12 lr 0.000957 time 2.7208 (2.3662) loss 3.6867 (3.9755) grad_norm 1.2269 (1.1108) [2022-01-18 19:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][140/1251] eta 0:43:26 lr 0.000957 time 2.0212 (2.3457) loss 4.1822 (3.9711) grad_norm 1.3263 (1.1094) [2022-01-18 19:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][150/1251] eta 0:42:45 lr 0.000957 time 1.9216 (2.3298) loss 4.3493 (3.9886) grad_norm 1.2195 (1.1129) [2022-01-18 19:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][160/1251] eta 0:42:13 lr 0.000957 time 2.9568 (2.3224) loss 4.2855 (3.9833) grad_norm 1.1077 (1.1152) [2022-01-18 19:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][170/1251] eta 0:41:44 lr 0.000957 time 2.6468 (2.3165) loss 4.3797 (3.9909) grad_norm 1.3584 (1.1194) [2022-01-18 19:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][180/1251] eta 0:41:08 lr 0.000957 time 1.7820 (2.3051) loss 3.8403 (3.9893) grad_norm 1.0918 (1.1193) [2022-01-18 19:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][190/1251] eta 0:40:45 
lr 0.000957 time 2.2151 (2.3053) loss 3.9192 (3.9924) grad_norm 1.1022 (1.1195) [2022-01-18 19:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][200/1251] eta 0:40:14 lr 0.000957 time 2.1406 (2.2978) loss 4.0534 (3.9911) grad_norm 1.0933 (1.1198) [2022-01-18 19:10:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][210/1251] eta 0:39:48 lr 0.000957 time 2.5349 (2.2942) loss 3.3651 (3.9969) grad_norm 1.1236 (1.1174) [2022-01-18 19:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][220/1251] eta 0:39:19 lr 0.000957 time 2.4611 (2.2888) loss 4.3835 (3.9962) grad_norm 1.1557 (1.1187) [2022-01-18 19:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][230/1251] eta 0:38:49 lr 0.000957 time 2.0678 (2.2820) loss 4.3370 (4.0012) grad_norm 1.1302 (1.1217) [2022-01-18 19:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][240/1251] eta 0:38:22 lr 0.000957 time 1.6872 (2.2777) loss 4.7539 (4.0080) grad_norm 1.0046 (1.1242) [2022-01-18 19:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][250/1251] eta 0:37:48 lr 0.000957 time 2.2525 (2.2662) loss 4.2879 (4.0160) grad_norm 0.9831 (1.1220) [2022-01-18 19:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][260/1251] eta 0:37:14 lr 0.000957 time 1.8734 (2.2552) loss 3.9777 (4.0129) grad_norm 1.1315 (1.1246) [2022-01-18 19:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][270/1251] eta 0:36:52 lr 0.000957 time 1.8417 (2.2550) loss 3.5807 (4.0066) grad_norm 1.0391 (1.1240) [2022-01-18 19:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][280/1251] eta 0:36:30 lr 0.000957 time 1.9028 (2.2562) loss 3.5729 (4.0010) grad_norm 1.0121 (1.1247) [2022-01-18 19:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][290/1251] eta 0:36:07 lr 0.000957 time 2.1193 (2.2554) loss 4.7233 (4.0077) grad_norm 1.1595 (1.1254) [2022-01-18 19:14:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][300/1251] eta 0:35:42 lr 0.000957 time 1.5870 (2.2524) loss 4.5707 (4.0115) grad_norm 1.1750 (1.1260) [2022-01-18 19:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][310/1251] eta 0:35:17 lr 0.000957 time 2.0422 (2.2499) loss 4.9125 (4.0196) grad_norm 1.0278 (1.1248) [2022-01-18 19:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][320/1251] eta 0:34:50 lr 0.000957 time 2.5684 (2.2458) loss 4.2417 (4.0052) grad_norm 1.2266 (1.1247) [2022-01-18 19:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][330/1251] eta 0:34:24 lr 0.000957 time 1.7199 (2.2416) loss 4.4236 (4.0048) grad_norm 1.0385 (1.1255) [2022-01-18 19:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][340/1251] eta 0:33:59 lr 0.000957 time 2.0311 (2.2387) loss 4.5278 (4.0161) grad_norm 1.0189 (1.1251) [2022-01-18 19:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][350/1251] eta 0:33:37 lr 0.000957 time 2.6085 (2.2388) loss 3.5665 (4.0134) grad_norm 1.0015 (1.1257) [2022-01-18 19:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][360/1251] eta 0:33:15 lr 0.000957 time 2.8309 (2.2398) loss 4.8602 (4.0270) grad_norm 0.9966 (1.1275) [2022-01-18 19:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][370/1251] eta 0:32:50 lr 0.000957 time 1.5762 (2.2370) loss 3.8793 (4.0343) grad_norm 1.4003 (1.1306) [2022-01-18 19:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][380/1251] eta 0:32:29 lr 0.000957 time 2.0874 (2.2387) loss 4.4737 (4.0403) grad_norm 1.0662 (1.1300) [2022-01-18 19:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][390/1251] eta 0:32:08 lr 0.000957 time 2.4611 (2.2399) loss 4.4105 (4.0476) grad_norm 1.2509 (1.1303) [2022-01-18 19:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][400/1251] eta 0:31:44 lr 0.000957 time 
2.0219 (2.2381) loss 4.9109 (4.0550) grad_norm 0.9060 (1.1312) [2022-01-18 19:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][410/1251] eta 0:31:19 lr 0.000957 time 1.6435 (2.2353) loss 4.6350 (4.0551) grad_norm 1.4133 (1.1314) [2022-01-18 19:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][420/1251] eta 0:30:53 lr 0.000956 time 1.6150 (2.2302) loss 4.2911 (4.0597) grad_norm 1.0213 (1.1292) [2022-01-18 19:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][430/1251] eta 0:30:26 lr 0.000956 time 2.3447 (2.2253) loss 4.8279 (4.0647) grad_norm 1.1765 (1.1275) [2022-01-18 19:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][440/1251] eta 0:30:03 lr 0.000956 time 2.0494 (2.2243) loss 2.6688 (4.0671) grad_norm 1.4210 (1.1301) [2022-01-18 19:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][450/1251] eta 0:29:39 lr 0.000956 time 1.9502 (2.2222) loss 3.3255 (4.0573) grad_norm 1.1260 (1.1303) [2022-01-18 19:19:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][460/1251] eta 0:29:15 lr 0.000956 time 1.8661 (2.2189) loss 3.6870 (4.0590) grad_norm 1.2813 (1.1323) [2022-01-18 19:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][470/1251] eta 0:28:51 lr 0.000956 time 2.0867 (2.2165) loss 3.1070 (4.0563) grad_norm 1.3678 (1.1350) [2022-01-18 19:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][480/1251] eta 0:28:28 lr 0.000956 time 2.1762 (2.2160) loss 3.7809 (4.0587) grad_norm 1.3826 (1.1348) [2022-01-18 19:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][490/1251] eta 0:28:06 lr 0.000956 time 2.1317 (2.2158) loss 4.1601 (4.0568) grad_norm 1.0111 (1.1340) [2022-01-18 19:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][500/1251] eta 0:27:42 lr 0.000956 time 2.2932 (2.2134) loss 4.7024 (4.0616) grad_norm 1.2905 (1.1339) [2022-01-18 19:21:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][510/1251] eta 0:27:21 lr 0.000956 time 1.7789 (2.2156) loss 4.3027 (4.0631) grad_norm 1.1330 (1.1348) [2022-01-18 19:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][520/1251] eta 0:27:00 lr 0.000956 time 1.6165 (2.2173) loss 4.5948 (4.0644) grad_norm 1.1100 (1.1334) [2022-01-18 19:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][530/1251] eta 0:26:38 lr 0.000956 time 2.1452 (2.2176) loss 4.3481 (4.0683) grad_norm 1.0794 (1.1332) [2022-01-18 19:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][540/1251] eta 0:26:16 lr 0.000956 time 2.2222 (2.2171) loss 2.9920 (4.0681) grad_norm 0.9479 (1.1334) [2022-01-18 19:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][550/1251] eta 0:25:53 lr 0.000956 time 1.7213 (2.2157) loss 4.3668 (4.0693) grad_norm 1.0504 (1.1326) [2022-01-18 19:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][560/1251] eta 0:25:29 lr 0.000956 time 1.5352 (2.2136) loss 4.6700 (4.0644) grad_norm 1.3929 (1.1345) [2022-01-18 19:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][570/1251] eta 0:25:07 lr 0.000956 time 2.6189 (2.2139) loss 4.5532 (4.0636) grad_norm 1.0332 (1.1341) [2022-01-18 19:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][580/1251] eta 0:24:44 lr 0.000956 time 2.1534 (2.2128) loss 4.1547 (4.0576) grad_norm 1.0243 (1.1337) [2022-01-18 19:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][590/1251] eta 0:24:21 lr 0.000956 time 1.9872 (2.2118) loss 3.6465 (4.0507) grad_norm 1.5966 (1.1357) [2022-01-18 19:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][600/1251] eta 0:23:59 lr 0.000956 time 1.7179 (2.2105) loss 4.1957 (4.0531) grad_norm 1.4858 (1.1386) [2022-01-18 19:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][610/1251] eta 0:23:36 lr 0.000956 time 
2.2374 (2.2093) loss 3.8942 (4.0514) grad_norm 1.2084 (1.1378) [2022-01-18 19:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][620/1251] eta 0:23:14 lr 0.000956 time 2.5322 (2.2099) loss 4.4113 (4.0493) grad_norm 1.1399 (1.1383) [2022-01-18 19:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][630/1251] eta 0:22:53 lr 0.000956 time 2.4714 (2.2117) loss 4.9008 (4.0535) grad_norm 1.3899 (1.1392) [2022-01-18 19:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][640/1251] eta 0:22:30 lr 0.000956 time 1.5133 (2.2111) loss 4.5572 (4.0521) grad_norm 1.1324 (1.1387) [2022-01-18 19:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][650/1251] eta 0:22:08 lr 0.000956 time 1.8333 (2.2098) loss 4.3280 (4.0569) grad_norm 0.9782 (1.1373) [2022-01-18 19:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][660/1251] eta 0:21:44 lr 0.000956 time 2.1140 (2.2080) loss 4.0100 (4.0586) grad_norm 1.0057 (1.1366) [2022-01-18 19:27:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][670/1251] eta 0:21:22 lr 0.000956 time 1.5461 (2.2073) loss 4.7501 (4.0580) grad_norm 1.1833 (1.1360) [2022-01-18 19:27:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][680/1251] eta 0:21:00 lr 0.000956 time 1.7123 (2.2070) loss 3.3813 (4.0609) grad_norm 1.2140 (1.1361) [2022-01-18 19:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][690/1251] eta 0:20:38 lr 0.000956 time 2.0028 (2.2072) loss 4.2789 (4.0639) grad_norm 0.9698 (1.1360) [2022-01-18 19:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][700/1251] eta 0:20:16 lr 0.000956 time 2.2012 (2.2083) loss 4.1439 (4.0663) grad_norm 1.6992 (1.1354) [2022-01-18 19:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][710/1251] eta 0:19:54 lr 0.000956 time 2.1420 (2.2075) loss 3.5000 (4.0618) grad_norm 1.0523 (1.1361) [2022-01-18 19:29:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][720/1251] eta 0:19:30 lr 0.000956 time 2.1080 (2.2053) loss 4.4287 (4.0600) grad_norm 1.1995 (1.1347) [2022-01-18 19:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][730/1251] eta 0:19:08 lr 0.000956 time 2.1843 (2.2050) loss 3.8549 (4.0595) grad_norm 1.2904 (1.1339) [2022-01-18 19:30:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][740/1251] eta 0:18:46 lr 0.000956 time 1.9028 (2.2051) loss 4.4805 (4.0545) grad_norm 0.8786 (1.1352) [2022-01-18 19:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][750/1251] eta 0:18:24 lr 0.000956 time 2.2263 (2.2041) loss 4.3169 (4.0517) grad_norm 1.0965 (1.1354) [2022-01-18 19:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][760/1251] eta 0:18:02 lr 0.000956 time 2.1552 (2.2044) loss 2.7166 (4.0489) grad_norm 1.0493 (1.1368) [2022-01-18 19:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][770/1251] eta 0:17:40 lr 0.000956 time 1.9274 (2.2048) loss 4.1063 (4.0505) grad_norm 1.0814 (1.1374) [2022-01-18 19:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][780/1251] eta 0:17:18 lr 0.000956 time 1.9054 (2.2042) loss 4.6829 (4.0481) grad_norm 1.0705 (1.1392) [2022-01-18 19:31:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][790/1251] eta 0:16:55 lr 0.000956 time 2.2458 (2.2030) loss 4.1720 (4.0504) grad_norm 0.9927 (1.1385) [2022-01-18 19:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][800/1251] eta 0:16:32 lr 0.000956 time 1.7976 (2.2015) loss 3.4491 (4.0508) grad_norm 1.0588 (1.1380) [2022-01-18 19:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][810/1251] eta 0:16:11 lr 0.000956 time 2.2119 (2.2019) loss 4.6593 (4.0530) grad_norm 1.0651 (1.1384) [2022-01-18 19:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][820/1251] eta 0:15:49 lr 0.000956 time 
2.2100 (2.2039) loss 4.5045 (4.0544) grad_norm 1.0921 (1.1383) [2022-01-18 19:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][830/1251] eta 0:15:28 lr 0.000956 time 2.1723 (2.2043) loss 4.4266 (4.0574) grad_norm 1.2014 (1.1376) [2022-01-18 19:33:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][840/1251] eta 0:15:05 lr 0.000956 time 1.9666 (2.2031) loss 4.5524 (4.0603) grad_norm 1.4089 (1.1374) [2022-01-18 19:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][850/1251] eta 0:14:42 lr 0.000956 time 1.8775 (2.2017) loss 2.9771 (4.0597) grad_norm 1.3931 (1.1382) [2022-01-18 19:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][860/1251] eta 0:14:20 lr 0.000956 time 2.5425 (2.2012) loss 3.4471 (4.0629) grad_norm 1.1915 (1.1380) [2022-01-18 19:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][870/1251] eta 0:13:57 lr 0.000956 time 1.9036 (2.1992) loss 5.0066 (4.0632) grad_norm 1.1167 (1.1382) [2022-01-18 19:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][880/1251] eta 0:13:35 lr 0.000956 time 2.1963 (2.1969) loss 4.2155 (4.0661) grad_norm 1.0103 (1.1399) [2022-01-18 19:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][890/1251] eta 0:13:12 lr 0.000956 time 1.7518 (2.1961) loss 3.7176 (4.0649) grad_norm 1.1793 (1.1394) [2022-01-18 19:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][900/1251] eta 0:12:51 lr 0.000956 time 2.4188 (2.1967) loss 4.2238 (4.0645) grad_norm 1.0185 (1.1392) [2022-01-18 19:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][910/1251] eta 0:12:29 lr 0.000956 time 1.8729 (2.1970) loss 3.6309 (4.0653) grad_norm 1.1574 (1.1389) [2022-01-18 19:36:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][920/1251] eta 0:12:07 lr 0.000956 time 1.8512 (2.1983) loss 3.9614 (4.0655) grad_norm 1.1148 (1.1379) [2022-01-18 19:36:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][930/1251] eta 0:11:46 lr 0.000956 time 1.5135 (2.2007) loss 4.8723 (4.0653) grad_norm 1.2291 (1.1373) [2022-01-18 19:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][940/1251] eta 0:11:24 lr 0.000956 time 2.1176 (2.2012) loss 4.2013 (4.0653) grad_norm 0.9810 (1.1372) [2022-01-18 19:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][950/1251] eta 0:11:02 lr 0.000956 time 2.8185 (2.2018) loss 4.3602 (4.0609) grad_norm 1.0255 (1.1370) [2022-01-18 19:38:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][960/1251] eta 0:10:40 lr 0.000956 time 2.0704 (2.2004) loss 4.4512 (4.0588) grad_norm 1.0084 (1.1364) [2022-01-18 19:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][970/1251] eta 0:10:18 lr 0.000956 time 1.7806 (2.2000) loss 4.1858 (4.0579) grad_norm 1.0716 (1.1364) [2022-01-18 19:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][980/1251] eta 0:09:56 lr 0.000956 time 2.4442 (2.1996) loss 3.4618 (4.0585) grad_norm 1.0543 (1.1359) [2022-01-18 19:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][990/1251] eta 0:09:33 lr 0.000956 time 2.2022 (2.1977) loss 3.9595 (4.0578) grad_norm 1.1921 (1.1357) [2022-01-18 19:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1000/1251] eta 0:09:11 lr 0.000956 time 2.3219 (2.1971) loss 4.1639 (4.0607) grad_norm 1.0842 (1.1352) [2022-01-18 19:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1010/1251] eta 0:08:49 lr 0.000955 time 2.2022 (2.1964) loss 3.0805 (4.0599) grad_norm 1.0689 (1.1355) [2022-01-18 19:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1020/1251] eta 0:08:27 lr 0.000955 time 2.2119 (2.1965) loss 3.9080 (4.0610) grad_norm 1.0504 (1.1350) [2022-01-18 19:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1030/1251] eta 0:08:05 lr 0.000955 time 
2.2190 (2.1966) loss 4.7036 (4.0589) grad_norm 1.0695 (1.1355) [2022-01-18 19:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1040/1251] eta 0:07:43 lr 0.000955 time 1.4732 (2.1974) loss 4.5768 (4.0589) grad_norm 1.6299 (1.1355) [2022-01-18 19:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1050/1251] eta 0:07:21 lr 0.000955 time 3.3850 (2.1984) loss 4.3291 (4.0600) grad_norm 1.3828 (1.1357) [2022-01-18 19:41:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1060/1251] eta 0:06:59 lr 0.000955 time 2.0560 (2.1975) loss 4.2447 (4.0602) grad_norm 1.3513 (1.1355) [2022-01-18 19:41:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1070/1251] eta 0:06:37 lr 0.000955 time 2.2145 (2.1958) loss 4.8853 (4.0600) grad_norm 1.3964 (1.1354) [2022-01-18 19:42:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1080/1251] eta 0:06:15 lr 0.000955 time 2.2741 (2.1945) loss 4.4903 (4.0606) grad_norm 1.4412 (1.1358) [2022-01-18 19:42:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1090/1251] eta 0:05:53 lr 0.000955 time 2.7418 (2.1939) loss 2.9222 (4.0560) grad_norm 1.1770 (1.1356) [2022-01-18 19:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1100/1251] eta 0:05:31 lr 0.000955 time 1.9090 (2.1931) loss 4.1412 (4.0537) grad_norm 1.1172 (1.1344) [2022-01-18 19:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1110/1251] eta 0:05:09 lr 0.000955 time 2.4711 (2.1940) loss 3.7796 (4.0528) grad_norm 1.2792 (1.1337) [2022-01-18 19:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1120/1251] eta 0:04:47 lr 0.000955 time 2.5366 (2.1968) loss 4.4294 (4.0534) grad_norm 1.1493 (1.1347) [2022-01-18 19:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1130/1251] eta 0:04:25 lr 0.000955 time 2.5300 (2.1977) loss 4.4201 (4.0552) grad_norm 0.9382 (1.1348) [2022-01-18 19:44:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1140/1251] eta 0:04:03 lr 0.000955 time 1.8737 (2.1965) loss 4.1331 (4.0531) grad_norm 0.9530 (1.1356) [2022-01-18 19:44:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1150/1251] eta 0:03:41 lr 0.000955 time 2.1192 (2.1941) loss 3.9897 (4.0538) grad_norm 0.9969 (1.1350) [2022-01-18 19:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1160/1251] eta 0:03:19 lr 0.000955 time 2.3065 (2.1932) loss 4.4229 (4.0539) grad_norm 1.2584 (1.1348) [2022-01-18 19:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1170/1251] eta 0:02:57 lr 0.000955 time 3.5287 (2.1941) loss 3.0547 (4.0511) grad_norm 1.0309 (1.1346) [2022-01-18 19:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1180/1251] eta 0:02:35 lr 0.000955 time 2.2698 (2.1951) loss 2.7718 (4.0483) grad_norm 1.2008 (1.1343) [2022-01-18 19:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1190/1251] eta 0:02:13 lr 0.000955 time 1.6107 (2.1942) loss 3.2876 (4.0452) grad_norm 1.0300 (1.1348) [2022-01-18 19:46:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1200/1251] eta 0:01:51 lr 0.000955 time 2.2269 (2.1944) loss 4.7762 (4.0437) grad_norm 0.9584 (1.1344) [2022-01-18 19:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1210/1251] eta 0:01:29 lr 0.000955 time 3.0726 (2.1946) loss 4.5655 (4.0401) grad_norm 1.0527 (1.1344) [2022-01-18 19:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1220/1251] eta 0:01:08 lr 0.000955 time 2.2883 (2.1946) loss 4.1034 (4.0401) grad_norm 1.0838 (1.1337) [2022-01-18 19:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1230/1251] eta 0:00:46 lr 0.000955 time 1.5581 (2.1942) loss 2.9391 (4.0400) grad_norm 0.9306 (1.1330) [2022-01-18 19:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1240/1251] eta 0:00:24 lr 
0.000955 time 1.7376 (2.1931) loss 4.3911 (4.0403) grad_norm 1.1419 (1.1326) [2022-01-18 19:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [40/300][1250/1251] eta 0:00:02 lr 0.000955 time 1.3489 (2.1877) loss 4.0228 (4.0405) grad_norm 1.1404 (1.1327) [2022-01-18 19:48:24 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 40 training takes 0:45:37 [2022-01-18 19:48:24 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_40 saving...... [2022-01-18 19:48:35 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_40 saved !!! [2022-01-18 19:48:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 11.586 (11.586) Loss 1.4743 (1.4743) Acc@1 66.211 (66.211) Acc@5 86.816 (86.816) [2022-01-18 19:49:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.837 (2.527) Loss 1.4014 (1.3427) Acc@1 66.895 (68.608) Acc@5 88.477 (89.551) [2022-01-18 19:49:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.578 (2.093) Loss 1.3565 (1.3531) Acc@1 68.457 (68.694) Acc@5 88.574 (89.160) [2022-01-18 19:49:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.013 (2.018) Loss 1.3100 (1.3547) Acc@1 68.164 (68.621) Acc@5 89.746 (89.176) [2022-01-18 19:49:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.386 (1.936) Loss 1.3980 (1.3474) Acc@1 67.773 (68.683) Acc@5 88.672 (89.194) [2022-01-18 19:50:01 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 68.608 Acc@5 89.218 [2022-01-18 19:50:01 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 68.6% [2022-01-18 19:50:01 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 68.61% [2022-01-18 19:50:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][0/1251] eta 7:24:13 lr 0.000955 time 21.3059 (21.3059) loss 4.4778 (4.4778) grad_norm 0.9830 (0.9830) 
[2022-01-18 19:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][10/1251] eta 1:29:10 lr 0.000955 time 2.2488 (4.3118) loss 2.7976 (4.1066) grad_norm 1.0932 (1.1470) [2022-01-18 19:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][20/1251] eta 1:05:06 lr 0.000955 time 1.3200 (3.1732) loss 4.2062 (4.0310) grad_norm 1.1458 (1.1445) [2022-01-18 19:51:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][30/1251] eta 0:58:22 lr 0.000955 time 1.8174 (2.8683) loss 4.4972 (3.8570) grad_norm 1.0995 (1.1982) [2022-01-18 19:51:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][40/1251] eta 0:55:15 lr 0.000955 time 3.7129 (2.7381) loss 4.5911 (3.9259) grad_norm 1.0147 (1.2291) [2022-01-18 19:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][50/1251] eta 0:52:51 lr 0.000955 time 2.8407 (2.6403) loss 4.5473 (3.9824) grad_norm 1.4211 (1.2158) [2022-01-18 19:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][60/1251] eta 0:50:47 lr 0.000955 time 1.3243 (2.5592) loss 3.5145 (3.9973) grad_norm 1.2383 (1.1930) [2022-01-18 19:52:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][70/1251] eta 0:48:59 lr 0.000955 time 2.3167 (2.4894) loss 3.9032 (4.0406) grad_norm 1.1830 (1.1862) [2022-01-18 19:53:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][80/1251] eta 0:47:56 lr 0.000955 time 2.9817 (2.4566) loss 4.1988 (4.0341) grad_norm 1.2621 (1.1795) [2022-01-18 19:53:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][90/1251] eta 0:47:06 lr 0.000955 time 3.4145 (2.4343) loss 4.7869 (4.0100) grad_norm 1.2756 (1.1824) [2022-01-18 19:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][100/1251] eta 0:46:18 lr 0.000955 time 2.2291 (2.4144) loss 4.0315 (4.0230) grad_norm 1.1773 (1.1925) [2022-01-18 19:54:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][110/1251] eta 0:45:41 lr 
0.000955 time 2.7570 (2.4024) loss 4.2012 (4.0210) grad_norm 1.0755 (1.1806) [2022-01-18 19:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][120/1251] eta 0:45:02 lr 0.000955 time 2.5252 (2.3898) loss 4.4276 (4.0223) grad_norm 1.1749 (1.1686) [2022-01-18 19:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][130/1251] eta 0:44:15 lr 0.000955 time 2.4334 (2.3685) loss 4.4255 (4.0372) grad_norm 0.9487 (1.1678) [2022-01-18 19:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][140/1251] eta 0:43:26 lr 0.000955 time 2.1844 (2.3464) loss 3.4549 (4.0446) grad_norm 1.3565 (1.1650) [2022-01-18 19:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][150/1251] eta 0:42:37 lr 0.000955 time 1.9400 (2.3227) loss 4.0114 (4.0511) grad_norm 1.1215 (1.1677) [2022-01-18 19:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][160/1251] eta 0:42:00 lr 0.000955 time 1.7897 (2.3101) loss 3.2947 (4.0534) grad_norm 1.0925 (1.1630) [2022-01-18 19:56:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][170/1251] eta 0:41:33 lr 0.000955 time 2.4231 (2.3063) loss 4.1184 (4.0483) grad_norm 1.3006 (1.1616) [2022-01-18 19:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][180/1251] eta 0:41:09 lr 0.000955 time 2.8071 (2.3054) loss 3.4801 (4.0463) grad_norm 1.3662 (1.1556) [2022-01-18 19:57:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][190/1251] eta 0:40:41 lr 0.000955 time 1.5757 (2.3012) loss 3.2393 (4.0262) grad_norm 1.0638 (1.1586) [2022-01-18 19:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][200/1251] eta 0:40:20 lr 0.000955 time 2.0996 (2.3029) loss 4.3743 (4.0216) grad_norm 1.0660 (1.1598) [2022-01-18 19:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][210/1251] eta 0:39:51 lr 0.000955 time 1.8466 (2.2973) loss 3.9069 (4.0303) grad_norm 1.0765 (1.1586) [2022-01-18 19:58:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][220/1251] eta 0:39:31 lr 0.000955 time 2.9388 (2.2998) loss 4.7106 (4.0350) grad_norm 1.4221 (1.1602) [2022-01-18 19:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][230/1251] eta 0:38:51 lr 0.000955 time 1.9469 (2.2832) loss 4.4265 (4.0366) grad_norm 1.1154 (1.1572) [2022-01-18 19:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][240/1251] eta 0:38:12 lr 0.000955 time 1.7441 (2.2678) loss 3.5482 (4.0470) grad_norm 1.1389 (1.1563) [2022-01-18 19:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][250/1251] eta 0:37:41 lr 0.000955 time 2.0929 (2.2591) loss 4.9391 (4.0584) grad_norm 1.1456 (1.1563) [2022-01-18 19:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][260/1251] eta 0:37:16 lr 0.000955 time 1.8709 (2.2564) loss 4.7392 (4.0475) grad_norm 1.2969 (1.1629) [2022-01-18 20:00:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][270/1251] eta 0:36:49 lr 0.000955 time 1.7411 (2.2522) loss 4.0386 (4.0524) grad_norm 0.9710 (1.1589) [2022-01-18 20:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][280/1251] eta 0:36:29 lr 0.000955 time 2.1828 (2.2550) loss 3.9716 (4.0377) grad_norm 1.2231 (1.1566) [2022-01-18 20:00:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][290/1251] eta 0:36:11 lr 0.000955 time 2.6206 (2.2597) loss 4.4051 (4.0405) grad_norm 1.5344 (1.1567) [2022-01-18 20:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][300/1251] eta 0:35:54 lr 0.000955 time 2.9187 (2.2651) loss 4.7723 (4.0414) grad_norm 1.2019 (1.1551) [2022-01-18 20:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][310/1251] eta 0:35:31 lr 0.000955 time 1.5531 (2.2654) loss 3.0203 (4.0410) grad_norm 0.9784 (1.1524) [2022-01-18 20:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][320/1251] eta 0:35:05 lr 0.000955 time 
1.7493 (2.2619) loss 4.4574 (4.0505) grad_norm 1.0727 (1.1502) [2022-01-18 20:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][330/1251] eta 0:34:38 lr 0.000955 time 1.9015 (2.2566) loss 3.0467 (4.0511) grad_norm 1.1503 (1.1483) [2022-01-18 20:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][340/1251] eta 0:34:08 lr 0.000954 time 1.6153 (2.2490) loss 3.9890 (4.0462) grad_norm 0.9614 (1.1490) [2022-01-18 20:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][350/1251] eta 0:33:40 lr 0.000954 time 2.2077 (2.2421) loss 4.7211 (4.0582) grad_norm 1.0540 (1.1458) [2022-01-18 20:03:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][360/1251] eta 0:33:19 lr 0.000954 time 5.0560 (2.2438) loss 4.0229 (4.0528) grad_norm 0.9141 (1.1441) [2022-01-18 20:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][370/1251] eta 0:33:02 lr 0.000954 time 2.8374 (2.2499) loss 4.3010 (4.0475) grad_norm 1.2733 (1.1437) [2022-01-18 20:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][380/1251] eta 0:32:35 lr 0.000954 time 1.9445 (2.2449) loss 4.2437 (4.0518) grad_norm 1.0343 (1.1421) [2022-01-18 20:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][390/1251] eta 0:32:08 lr 0.000954 time 1.9145 (2.2396) loss 4.9319 (4.0553) grad_norm 1.3394 (1.1446) [2022-01-18 20:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][400/1251] eta 0:31:44 lr 0.000954 time 2.6474 (2.2375) loss 4.7487 (4.0541) grad_norm 1.0309 (1.1456) [2022-01-18 20:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][410/1251] eta 0:31:18 lr 0.000954 time 1.9013 (2.2340) loss 4.6238 (4.0477) grad_norm 1.4185 (1.1473) [2022-01-18 20:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][420/1251] eta 0:30:56 lr 0.000954 time 1.9982 (2.2339) loss 3.1831 (4.0460) grad_norm 1.0989 (1.1483) [2022-01-18 20:06:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][430/1251] eta 0:30:33 lr 0.000954 time 2.1420 (2.2337) loss 3.1233 (4.0413) grad_norm 1.0869 (1.1463) [2022-01-18 20:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][440/1251] eta 0:30:14 lr 0.000954 time 3.7285 (2.2368) loss 4.1846 (4.0375) grad_norm 1.7414 (1.1477) [2022-01-18 20:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][450/1251] eta 0:29:49 lr 0.000954 time 2.0036 (2.2338) loss 3.7957 (4.0402) grad_norm 1.1035 (1.1485) [2022-01-18 20:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][460/1251] eta 0:29:25 lr 0.000954 time 2.1908 (2.2325) loss 4.3777 (4.0437) grad_norm 1.3130 (1.1471) [2022-01-18 20:07:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][470/1251] eta 0:29:00 lr 0.000954 time 1.8401 (2.2288) loss 2.6308 (4.0377) grad_norm 1.1768 (1.1473) [2022-01-18 20:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][480/1251] eta 0:28:36 lr 0.000954 time 2.5192 (2.2264) loss 4.3030 (4.0313) grad_norm 1.0860 (1.1457) [2022-01-18 20:08:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][490/1251] eta 0:28:12 lr 0.000954 time 1.8919 (2.2236) loss 4.1461 (4.0311) grad_norm 1.1331 (1.1452) [2022-01-18 20:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][500/1251] eta 0:27:50 lr 0.000954 time 2.0989 (2.2249) loss 2.8934 (4.0308) grad_norm 0.9434 (1.1455) [2022-01-18 20:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][510/1251] eta 0:27:30 lr 0.000954 time 2.4408 (2.2273) loss 4.2058 (4.0297) grad_norm 1.1421 (1.1451) [2022-01-18 20:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][520/1251] eta 0:27:11 lr 0.000954 time 3.2511 (2.2312) loss 3.9993 (4.0315) grad_norm 1.1203 (1.1459) [2022-01-18 20:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][530/1251] eta 0:26:48 lr 0.000954 time 
2.2250 (2.2314) loss 4.1682 (4.0283) grad_norm 0.9553 (1.1449) [2022-01-18 20:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][540/1251] eta 0:26:24 lr 0.000954 time 1.9266 (2.2285) loss 3.2412 (4.0278) grad_norm 1.0689 (1.1450) [2022-01-18 20:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][550/1251] eta 0:26:00 lr 0.000954 time 1.9406 (2.2258) loss 3.2885 (4.0273) grad_norm 1.1172 (1.1434) [2022-01-18 20:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][560/1251] eta 0:25:36 lr 0.000954 time 2.3130 (2.2229) loss 4.6188 (4.0275) grad_norm 1.1194 (1.1432) [2022-01-18 20:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][570/1251] eta 0:25:10 lr 0.000954 time 1.9623 (2.2181) loss 4.1022 (4.0273) grad_norm 1.0873 (1.1428) [2022-01-18 20:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][580/1251] eta 0:24:47 lr 0.000954 time 1.9358 (2.2169) loss 3.5370 (4.0254) grad_norm 1.0214 (1.1431) [2022-01-18 20:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][590/1251] eta 0:24:24 lr 0.000954 time 1.9565 (2.2163) loss 3.6050 (4.0233) grad_norm 1.2001 (1.1438) [2022-01-18 20:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][600/1251] eta 0:24:03 lr 0.000954 time 2.5471 (2.2172) loss 4.5503 (4.0279) grad_norm 0.9903 (1.1445) [2022-01-18 20:12:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][610/1251] eta 0:23:41 lr 0.000954 time 2.3930 (2.2171) loss 3.0754 (4.0294) grad_norm 1.1902 (1.1436) [2022-01-18 20:12:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][620/1251] eta 0:23:18 lr 0.000954 time 2.2185 (2.2171) loss 4.2573 (4.0313) grad_norm 0.9360 (1.1430) [2022-01-18 20:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][630/1251] eta 0:22:56 lr 0.000954 time 1.8889 (2.2168) loss 4.6234 (4.0339) grad_norm 1.5197 (1.1420) [2022-01-18 20:13:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][640/1251] eta 0:22:35 lr 0.000954 time 3.1029 (2.2177) loss 3.5547 (4.0362) grad_norm 1.2311 (1.1412) [2022-01-18 20:14:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][650/1251] eta 0:22:11 lr 0.000954 time 1.9099 (2.2151) loss 4.1003 (4.0381) grad_norm 1.0210 (1.1407) [2022-01-18 20:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][660/1251] eta 0:21:48 lr 0.000954 time 2.1485 (2.2143) loss 4.8488 (4.0365) grad_norm 0.9907 (1.1397) [2022-01-18 20:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][670/1251] eta 0:21:27 lr 0.000954 time 2.7840 (2.2151) loss 4.3288 (4.0375) grad_norm 1.3101 (1.1392) [2022-01-18 20:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][680/1251] eta 0:21:04 lr 0.000954 time 2.5541 (2.2152) loss 4.4727 (4.0375) grad_norm 1.1553 (1.1399) [2022-01-18 20:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][690/1251] eta 0:20:41 lr 0.000954 time 1.9534 (2.2135) loss 4.3314 (4.0378) grad_norm 1.1107 (1.1398) [2022-01-18 20:15:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][700/1251] eta 0:20:19 lr 0.000954 time 2.1821 (2.2139) loss 3.9619 (4.0349) grad_norm 1.3243 (1.1389) [2022-01-18 20:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][710/1251] eta 0:19:57 lr 0.000954 time 2.1925 (2.2128) loss 3.5215 (4.0352) grad_norm 1.0379 (1.1389) [2022-01-18 20:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][720/1251] eta 0:19:33 lr 0.000954 time 1.8345 (2.2105) loss 4.8151 (4.0380) grad_norm 1.1860 (1.1392) [2022-01-18 20:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][730/1251] eta 0:19:11 lr 0.000954 time 2.3601 (2.2106) loss 3.6182 (4.0369) grad_norm 1.4389 (1.1400) [2022-01-18 20:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][740/1251] eta 0:18:49 lr 0.000954 time 
2.0928 (2.2101) loss 2.9127 (4.0358) grad_norm 1.1616 (1.1400) [2022-01-18 20:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][750/1251] eta 0:18:26 lr 0.000954 time 2.2913 (2.2093) loss 4.6244 (4.0389) grad_norm 1.3053 (1.1406) [2022-01-18 20:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][760/1251] eta 0:18:04 lr 0.000954 time 2.0544 (2.2080) loss 4.3959 (4.0351) grad_norm 1.1155 (1.1415) [2022-01-18 20:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][770/1251] eta 0:17:42 lr 0.000954 time 3.4476 (2.2090) loss 4.3083 (4.0352) grad_norm 1.1133 (1.1419) [2022-01-18 20:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][780/1251] eta 0:17:21 lr 0.000954 time 2.8299 (2.2103) loss 4.3500 (4.0337) grad_norm 1.0633 (1.1431) [2022-01-18 20:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][790/1251] eta 0:16:57 lr 0.000954 time 1.7053 (2.2081) loss 3.4038 (4.0362) grad_norm 0.9756 (1.1429) [2022-01-18 20:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][800/1251] eta 0:16:34 lr 0.000954 time 1.6792 (2.2059) loss 3.3016 (4.0341) grad_norm 0.9879 (1.1425) [2022-01-18 20:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][810/1251] eta 0:16:12 lr 0.000954 time 2.2317 (2.2050) loss 4.0857 (4.0352) grad_norm 1.2047 (1.1423) [2022-01-18 20:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][820/1251] eta 0:15:50 lr 0.000954 time 2.8494 (2.2048) loss 4.0002 (4.0357) grad_norm 1.1798 (1.1419) [2022-01-18 20:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][830/1251] eta 0:15:27 lr 0.000954 time 2.1744 (2.2037) loss 4.2220 (4.0370) grad_norm 1.0793 (1.1408) [2022-01-18 20:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][840/1251] eta 0:15:05 lr 0.000954 time 1.7738 (2.2033) loss 3.8901 (4.0411) grad_norm 1.0660 (1.1406) [2022-01-18 20:21:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][850/1251] eta 0:14:43 lr 0.000954 time 2.7660 (2.2042) loss 4.0758 (4.0428) grad_norm 1.2396 (1.1412) [2022-01-18 20:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][860/1251] eta 0:14:21 lr 0.000954 time 2.1832 (2.2045) loss 4.1017 (4.0436) grad_norm 1.2105 (1.1419) [2022-01-18 20:22:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][870/1251] eta 0:13:59 lr 0.000954 time 2.1818 (2.2025) loss 4.3687 (4.0458) grad_norm 1.0601 (1.1399) [2022-01-18 20:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][880/1251] eta 0:13:36 lr 0.000954 time 1.9099 (2.2012) loss 4.3803 (4.0457) grad_norm 1.1568 (1.1389) [2022-01-18 20:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][890/1251] eta 0:13:15 lr 0.000954 time 1.9727 (2.2025) loss 4.2563 (4.0460) grad_norm 1.2426 (1.1392) [2022-01-18 20:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][900/1251] eta 0:12:53 lr 0.000954 time 1.8866 (2.2034) loss 3.8111 (4.0446) grad_norm 1.1124 (1.1394) [2022-01-18 20:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][910/1251] eta 0:12:31 lr 0.000953 time 1.6684 (2.2024) loss 3.7494 (4.0437) grad_norm 1.2355 (1.1392) [2022-01-18 20:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][920/1251] eta 0:12:08 lr 0.000953 time 1.9196 (2.2014) loss 4.3054 (4.0419) grad_norm 1.2812 (1.1384) [2022-01-18 20:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][930/1251] eta 0:11:46 lr 0.000953 time 2.1199 (2.2018) loss 4.4085 (4.0432) grad_norm 1.2208 (1.1394) [2022-01-18 20:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][940/1251] eta 0:11:25 lr 0.000953 time 2.1970 (2.2033) loss 4.7167 (4.0465) grad_norm 1.4574 (1.1403) [2022-01-18 20:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][950/1251] eta 0:11:02 lr 0.000953 time 
1.9815 (2.2026) loss 4.3685 (4.0456) grad_norm 0.9653 (1.1400) [2022-01-18 20:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][960/1251] eta 0:10:40 lr 0.000953 time 2.1764 (2.2016) loss 4.2201 (4.0489) grad_norm 0.9266 (1.1389) [2022-01-18 20:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][970/1251] eta 0:10:18 lr 0.000953 time 1.8823 (2.2001) loss 4.1081 (4.0475) grad_norm 0.9111 (1.1382) [2022-01-18 20:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][980/1251] eta 0:09:55 lr 0.000953 time 2.2517 (2.1986) loss 2.8663 (4.0461) grad_norm 1.3413 (1.1379) [2022-01-18 20:26:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][990/1251] eta 0:09:33 lr 0.000953 time 2.2286 (2.1966) loss 3.6288 (4.0445) grad_norm 1.3726 (1.1382) [2022-01-18 20:26:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1000/1251] eta 0:09:11 lr 0.000953 time 2.7212 (2.1968) loss 3.5919 (4.0433) grad_norm 1.0312 (1.1384) [2022-01-18 20:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1010/1251] eta 0:08:49 lr 0.000953 time 1.8153 (2.1965) loss 4.3528 (4.0435) grad_norm 1.0370 (1.1378) [2022-01-18 20:27:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1020/1251] eta 0:08:27 lr 0.000953 time 1.9403 (2.1966) loss 4.6489 (4.0461) grad_norm 1.2347 (1.1385) [2022-01-18 20:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1030/1251] eta 0:08:05 lr 0.000953 time 2.3406 (2.1975) loss 4.3450 (4.0461) grad_norm 1.2402 (1.1385) [2022-01-18 20:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1040/1251] eta 0:07:43 lr 0.000953 time 3.0710 (2.1984) loss 4.5995 (4.0468) grad_norm 1.1934 (1.1384) [2022-01-18 20:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1050/1251] eta 0:07:21 lr 0.000953 time 2.1369 (2.1989) loss 2.8100 (4.0446) grad_norm 1.1906 (1.1381) [2022-01-18 20:28:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1060/1251] eta 0:07:00 lr 0.000953 time 1.7019 (2.1995) loss 3.3531 (4.0414) grad_norm 1.1727 (1.1377) [2022-01-18 20:29:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1070/1251] eta 0:06:38 lr 0.000953 time 3.2125 (2.2000) loss 4.1597 (4.0411) grad_norm 1.2142 (1.1371) [2022-01-18 20:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1080/1251] eta 0:06:16 lr 0.000953 time 2.8682 (2.2003) loss 4.3421 (4.0432) grad_norm 1.0233 (1.1364) [2022-01-18 20:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1090/1251] eta 0:05:54 lr 0.000953 time 1.6008 (2.2000) loss 4.0228 (4.0436) grad_norm 1.3955 (1.1367) [2022-01-18 20:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1100/1251] eta 0:05:32 lr 0.000953 time 1.8751 (2.1997) loss 3.4590 (4.0454) grad_norm 0.9902 (1.1360) [2022-01-18 20:30:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1110/1251] eta 0:05:10 lr 0.000953 time 3.5950 (2.2004) loss 3.2038 (4.0444) grad_norm 1.1934 (1.1353) [2022-01-18 20:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1120/1251] eta 0:04:48 lr 0.000953 time 2.3005 (2.1989) loss 3.6389 (4.0467) grad_norm 1.1697 (1.1352) [2022-01-18 20:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1130/1251] eta 0:04:25 lr 0.000953 time 1.6581 (2.1977) loss 4.2279 (4.0478) grad_norm 1.4070 (1.1352) [2022-01-18 20:31:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1140/1251] eta 0:04:03 lr 0.000953 time 2.4121 (2.1972) loss 3.9206 (4.0466) grad_norm 1.0448 (1.1345) [2022-01-18 20:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1150/1251] eta 0:03:41 lr 0.000953 time 2.1834 (2.1963) loss 4.2257 (4.0480) grad_norm 0.9333 (1.1339) [2022-01-18 20:32:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1160/1251] eta 0:03:19 lr 
0.000953 time 2.2821 (2.1948) loss 4.5088 (4.0478) grad_norm 1.2763 (1.1340) [2022-01-18 20:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1170/1251] eta 0:02:57 lr 0.000953 time 2.1164 (2.1942) loss 4.5202 (4.0487) grad_norm 1.0297 (1.1337) [2022-01-18 20:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1180/1251] eta 0:02:35 lr 0.000953 time 2.5125 (2.1949) loss 4.5043 (4.0506) grad_norm 1.4158 (1.1337) [2022-01-18 20:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1190/1251] eta 0:02:13 lr 0.000953 time 2.4045 (2.1966) loss 4.3276 (4.0490) grad_norm 1.0941 (1.1330) [2022-01-18 20:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1200/1251] eta 0:01:52 lr 0.000953 time 2.2341 (2.1966) loss 3.2251 (4.0468) grad_norm 1.0578 (1.1330) [2022-01-18 20:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1210/1251] eta 0:01:30 lr 0.000953 time 1.9554 (2.1962) loss 2.9413 (4.0455) grad_norm 1.1541 (1.1330) [2022-01-18 20:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1220/1251] eta 0:01:08 lr 0.000953 time 2.7546 (2.1964) loss 4.8030 (4.0452) grad_norm 1.1359 (1.1325) [2022-01-18 20:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1230/1251] eta 0:00:46 lr 0.000953 time 2.7651 (2.1965) loss 3.9898 (4.0454) grad_norm 1.1377 (1.1330) [2022-01-18 20:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1240/1251] eta 0:00:24 lr 0.000953 time 1.6804 (2.1952) loss 3.4989 (4.0464) grad_norm 1.2497 (1.1329) [2022-01-18 20:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [41/300][1250/1251] eta 0:00:02 lr 0.000953 time 1.1560 (2.1893) loss 4.2292 (4.0470) grad_norm 1.1844 (1.1326) [2022-01-18 20:35:41 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 41 training takes 0:45:39 [2022-01-18 20:35:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.155 (18.155) Loss 
1.3978 (1.3978) Acc@1 68.457 (68.457) Acc@5 88.574 (88.574) [2022-01-18 20:36:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.205 (3.313) Loss 1.3412 (1.3635) Acc@1 67.969 (68.297) Acc@5 90.723 (89.178) [2022-01-18 20:36:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.587 (2.586) Loss 1.3467 (1.3675) Acc@1 70.508 (68.606) Acc@5 88.770 (89.100) [2022-01-18 20:36:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.644 (2.350) Loss 1.2839 (1.3579) Acc@1 70.312 (68.668) Acc@5 90.723 (89.327) [2022-01-18 20:37:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.885 (2.187) Loss 1.3678 (1.3555) Acc@1 67.871 (68.667) Acc@5 89.062 (89.332) [2022-01-18 20:37:18 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 68.746 Acc@5 89.286 [2022-01-18 20:37:18 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 68.7% [2022-01-18 20:37:18 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 68.75% [2022-01-18 20:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][0/1251] eta 7:18:05 lr 0.000953 time 21.0112 (21.0112) loss 3.9848 (3.9848) grad_norm 1.1907 (1.1907) [2022-01-18 20:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][10/1251] eta 1:25:11 lr 0.000953 time 3.1036 (4.1188) loss 4.4352 (4.2233) grad_norm 1.1753 (1.0659) [2022-01-18 20:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][20/1251] eta 1:05:49 lr 0.000953 time 2.0597 (3.2087) loss 4.3842 (4.1843) grad_norm 1.2625 (1.1705) [2022-01-18 20:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][30/1251] eta 0:58:19 lr 0.000953 time 1.5076 (2.8658) loss 4.7462 (4.1665) grad_norm 1.1216 (1.1582) [2022-01-18 20:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][40/1251] eta 0:55:42 lr 0.000953 time 4.0170 (2.7600) loss 4.1882 (4.1917) grad_norm 1.1263 (1.1659) 
[2022-01-18 20:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][50/1251] eta 0:53:34 lr 0.000953 time 2.4285 (2.6767) loss 3.4097 (4.1715) grad_norm 1.1829 (1.1598) [2022-01-18 20:39:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][60/1251] eta 0:51:28 lr 0.000953 time 1.9527 (2.5933) loss 4.2013 (4.1220) grad_norm 1.0353 (1.1407) [2022-01-18 20:40:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][70/1251] eta 0:49:54 lr 0.000953 time 1.5612 (2.5353) loss 4.3546 (4.1271) grad_norm 1.0948 (1.1607) [2022-01-18 20:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][80/1251] eta 0:48:46 lr 0.000953 time 2.8785 (2.4990) loss 3.3890 (4.1036) grad_norm 1.1499 (1.1536) [2022-01-18 20:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][90/1251] eta 0:47:51 lr 0.000953 time 2.1898 (2.4737) loss 3.7543 (4.0651) grad_norm 1.1634 (1.1499) [2022-01-18 20:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][100/1251] eta 0:46:56 lr 0.000953 time 2.6165 (2.4468) loss 4.7661 (4.0829) grad_norm 0.9908 (1.1488) [2022-01-18 20:41:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][110/1251] eta 0:45:47 lr 0.000953 time 1.7244 (2.4080) loss 4.4471 (4.0560) grad_norm 1.0515 (1.1398) [2022-01-18 20:42:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][120/1251] eta 0:44:57 lr 0.000953 time 2.3029 (2.3846) loss 2.9922 (4.0150) grad_norm 1.2997 (1.1446) [2022-01-18 20:42:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][130/1251] eta 0:44:10 lr 0.000953 time 1.5978 (2.3640) loss 2.8773 (4.0017) grad_norm 1.4634 (1.1514) [2022-01-18 20:42:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][140/1251] eta 0:43:34 lr 0.000953 time 3.3547 (2.3532) loss 2.9521 (3.9992) grad_norm 1.0274 (1.1454) [2022-01-18 20:43:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][150/1251] eta 0:43:00 lr 
0.000953 time 1.9457 (2.3438) loss 4.1046 (4.0004) grad_norm 1.0078 (1.1438) [2022-01-18 20:43:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][160/1251] eta 0:42:31 lr 0.000953 time 2.5830 (2.3390) loss 4.5017 (4.0100) grad_norm 0.9724 (1.1438) [2022-01-18 20:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][170/1251] eta 0:41:56 lr 0.000953 time 1.6661 (2.3277) loss 3.6826 (4.0037) grad_norm 1.3237 (1.1465) [2022-01-18 20:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][180/1251] eta 0:41:24 lr 0.000953 time 2.6205 (2.3200) loss 4.1759 (4.0198) grad_norm 0.9809 (1.1454) [2022-01-18 20:44:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][190/1251] eta 0:40:48 lr 0.000953 time 1.9296 (2.3077) loss 4.3729 (4.0168) grad_norm 1.0662 (1.1426) [2022-01-18 20:45:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][200/1251] eta 0:40:16 lr 0.000953 time 2.3604 (2.2988) loss 3.3216 (4.0075) grad_norm 0.9520 (1.1438) [2022-01-18 20:45:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][210/1251] eta 0:39:47 lr 0.000953 time 2.3068 (2.2934) loss 3.8711 (4.0147) grad_norm 1.0578 (1.1396) [2022-01-18 20:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][220/1251] eta 0:39:23 lr 0.000953 time 2.1735 (2.2920) loss 3.3638 (4.0247) grad_norm 1.2097 (1.1437) [2022-01-18 20:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][230/1251] eta 0:38:52 lr 0.000952 time 1.9391 (2.2847) loss 3.5966 (4.0151) grad_norm 1.3224 (1.1446) [2022-01-18 20:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][240/1251] eta 0:38:30 lr 0.000952 time 2.9531 (2.2852) loss 3.7552 (4.0014) grad_norm 1.1342 (1.1464) [2022-01-18 20:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][250/1251] eta 0:38:02 lr 0.000952 time 2.2192 (2.2802) loss 3.8652 (3.9992) grad_norm 1.1378 (1.1492) [2022-01-18 20:47:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][260/1251] eta 0:37:33 lr 0.000952 time 2.1656 (2.2738) loss 3.2712 (3.9952) grad_norm 0.9260 (1.1470) [2022-01-18 20:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][270/1251] eta 0:37:02 lr 0.000952 time 2.0385 (2.2651) loss 4.1571 (4.0021) grad_norm 0.9952 (1.1485) [2022-01-18 20:47:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][280/1251] eta 0:36:36 lr 0.000952 time 2.8902 (2.2624) loss 4.2509 (3.9961) grad_norm 1.0138 (1.1499) [2022-01-18 20:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][290/1251] eta 0:36:07 lr 0.000952 time 2.1595 (2.2551) loss 4.5467 (3.9981) grad_norm 1.1801 (1.1472) [2022-01-18 20:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][300/1251] eta 0:35:44 lr 0.000952 time 2.4018 (2.2553) loss 4.2237 (3.9902) grad_norm 0.9827 (1.1471) [2022-01-18 20:48:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][310/1251] eta 0:35:18 lr 0.000952 time 2.0088 (2.2515) loss 5.0068 (3.9906) grad_norm 1.1963 (1.1464) [2022-01-18 20:49:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][320/1251] eta 0:34:55 lr 0.000952 time 3.1781 (2.2512) loss 4.3060 (3.9896) grad_norm 1.1421 (1.1440) [2022-01-18 20:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][330/1251] eta 0:34:32 lr 0.000952 time 2.4523 (2.2508) loss 3.4251 (3.9903) grad_norm 0.9167 (1.1422) [2022-01-18 20:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][340/1251] eta 0:34:09 lr 0.000952 time 1.8704 (2.2492) loss 4.2439 (3.9945) grad_norm 1.0974 (1.1408) [2022-01-18 20:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][350/1251] eta 0:33:42 lr 0.000952 time 1.5944 (2.2450) loss 4.0471 (3.9957) grad_norm 1.0989 (1.1398) [2022-01-18 20:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][360/1251] eta 0:33:23 lr 0.000952 time 
3.5943 (2.2488) loss 4.1755 (3.9879) grad_norm 0.8976 (1.1401) [2022-01-18 20:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][370/1251] eta 0:33:03 lr 0.000952 time 3.0382 (2.2520) loss 4.2798 (3.9797) grad_norm 1.2000 (1.1431) [2022-01-18 20:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][380/1251] eta 0:32:40 lr 0.000952 time 1.5648 (2.2506) loss 4.1682 (3.9834) grad_norm 0.9947 (1.1442) [2022-01-18 20:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][390/1251] eta 0:32:13 lr 0.000952 time 1.7732 (2.2451) loss 3.7686 (3.9843) grad_norm 1.1258 (1.1417) [2022-01-18 20:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][400/1251] eta 0:31:48 lr 0.000952 time 3.1265 (2.2426) loss 3.7361 (3.9881) grad_norm 1.2640 (1.1400) [2022-01-18 20:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][410/1251] eta 0:31:21 lr 0.000952 time 2.3124 (2.2369) loss 4.4593 (3.9868) grad_norm 1.3128 (1.1413) [2022-01-18 20:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][420/1251] eta 0:30:52 lr 0.000952 time 1.9302 (2.2298) loss 2.6994 (3.9826) grad_norm 0.8919 (1.1414) [2022-01-18 20:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][430/1251] eta 0:30:27 lr 0.000952 time 1.8974 (2.2258) loss 4.4679 (3.9807) grad_norm 1.0016 (1.1418) [2022-01-18 20:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][440/1251] eta 0:30:03 lr 0.000952 time 2.2545 (2.2233) loss 4.1713 (3.9871) grad_norm 1.1103 (1.1413) [2022-01-18 20:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][450/1251] eta 0:29:38 lr 0.000952 time 2.0841 (2.2209) loss 4.4778 (3.9878) grad_norm 1.0583 (1.1409) [2022-01-18 20:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][460/1251] eta 0:29:16 lr 0.000952 time 2.0061 (2.2200) loss 4.6970 (3.9901) grad_norm 1.0781 (1.1408) [2022-01-18 20:54:42 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][470/1251] eta 0:28:51 lr 0.000952 time 2.1833 (2.2175) loss 3.6586 (3.9896) grad_norm 1.0044 (1.1394) [2022-01-18 20:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][480/1251] eta 0:28:28 lr 0.000952 time 1.8961 (2.2159) loss 4.1345 (3.9912) grad_norm 1.0447 (1.1366) [2022-01-18 20:55:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][490/1251] eta 0:28:10 lr 0.000952 time 2.0111 (2.2215) loss 4.5498 (3.9948) grad_norm 1.2798 (1.1378) [2022-01-18 20:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][500/1251] eta 0:27:51 lr 0.000952 time 2.7580 (2.2258) loss 3.8017 (3.9957) grad_norm 1.1629 (1.1391) [2022-01-18 20:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][510/1251] eta 0:27:32 lr 0.000952 time 2.3109 (2.2302) loss 2.9702 (3.9929) grad_norm 1.0847 (1.1383) [2022-01-18 20:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][520/1251] eta 0:27:10 lr 0.000952 time 2.0977 (2.2309) loss 4.3911 (3.9984) grad_norm 1.2045 (1.1386) [2022-01-18 20:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][530/1251] eta 0:26:48 lr 0.000952 time 1.8592 (2.2310) loss 4.6270 (3.9941) grad_norm 1.1797 (1.1382) [2022-01-18 20:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][540/1251] eta 0:26:24 lr 0.000952 time 1.6039 (2.2280) loss 3.7258 (3.9942) grad_norm 1.0065 (1.1409) [2022-01-18 20:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][550/1251] eta 0:25:58 lr 0.000952 time 1.6740 (2.2237) loss 3.0815 (3.9902) grad_norm 1.2162 (1.1419) [2022-01-18 20:58:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][560/1251] eta 0:25:33 lr 0.000952 time 1.9552 (2.2187) loss 4.4033 (3.9934) grad_norm 1.0425 (1.1415) [2022-01-18 20:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][570/1251] eta 0:25:07 lr 0.000952 time 
2.1649 (2.2138) loss 4.7299 (3.9933) grad_norm 1.0648 (1.1418) [2022-01-18 20:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][580/1251] eta 0:24:43 lr 0.000952 time 2.1654 (2.2106) loss 4.5217 (3.9995) grad_norm 0.9928 (1.1418) [2022-01-18 20:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][590/1251] eta 0:24:21 lr 0.000952 time 2.4415 (2.2115) loss 4.1270 (3.9974) grad_norm 1.1576 (1.1415) [2022-01-18 20:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][600/1251] eta 0:24:01 lr 0.000952 time 2.0283 (2.2141) loss 3.5588 (3.9964) grad_norm 1.3389 (1.1415) [2022-01-18 20:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][610/1251] eta 0:23:40 lr 0.000952 time 2.4433 (2.2156) loss 4.4342 (3.9987) grad_norm 1.0496 (1.1423) [2022-01-18 21:00:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][620/1251] eta 0:23:19 lr 0.000952 time 3.0907 (2.2182) loss 4.6162 (4.0041) grad_norm 1.2193 (1.1423) [2022-01-18 21:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][630/1251] eta 0:22:58 lr 0.000952 time 1.5022 (2.2194) loss 4.4552 (3.9983) grad_norm 1.2925 (1.1423) [2022-01-18 21:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][640/1251] eta 0:22:36 lr 0.000952 time 2.3936 (2.2205) loss 4.2166 (3.9960) grad_norm 1.1307 (1.1420) [2022-01-18 21:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][650/1251] eta 0:22:13 lr 0.000952 time 1.9262 (2.2181) loss 4.5781 (3.9981) grad_norm 0.9545 (1.1419) [2022-01-18 21:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][660/1251] eta 0:21:50 lr 0.000952 time 2.4636 (2.2176) loss 4.4039 (3.9941) grad_norm 1.1023 (1.1422) [2022-01-18 21:02:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][670/1251] eta 0:21:27 lr 0.000952 time 1.8423 (2.2160) loss 2.9505 (3.9908) grad_norm 1.2154 (1.1425) [2022-01-18 21:02:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][680/1251] eta 0:21:04 lr 0.000952 time 1.6952 (2.2144) loss 4.5077 (3.9920) grad_norm 1.4640 (1.1418) [2022-01-18 21:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][690/1251] eta 0:20:42 lr 0.000952 time 2.2782 (2.2140) loss 4.2869 (3.9918) grad_norm 1.1228 (1.1412) [2022-01-18 21:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][700/1251] eta 0:20:19 lr 0.000952 time 2.3769 (2.2128) loss 3.5250 (3.9904) grad_norm 0.9864 (1.1405) [2022-01-18 21:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][710/1251] eta 0:19:56 lr 0.000952 time 2.4632 (2.2118) loss 3.1237 (3.9895) grad_norm 1.1493 (1.1392) [2022-01-18 21:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][720/1251] eta 0:19:35 lr 0.000952 time 2.0409 (2.2133) loss 3.1678 (3.9879) grad_norm 1.3014 (1.1399) [2022-01-18 21:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][730/1251] eta 0:19:13 lr 0.000952 time 2.3441 (2.2133) loss 3.6243 (3.9871) grad_norm 1.0046 (1.1397) [2022-01-18 21:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][740/1251] eta 0:18:50 lr 0.000952 time 2.5561 (2.2116) loss 3.3357 (3.9911) grad_norm 0.9950 (1.1397) [2022-01-18 21:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][750/1251] eta 0:18:26 lr 0.000952 time 1.9172 (2.2089) loss 3.3604 (3.9896) grad_norm 1.3873 (1.1397) [2022-01-18 21:05:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][760/1251] eta 0:18:03 lr 0.000952 time 1.8427 (2.2073) loss 3.9671 (3.9904) grad_norm 1.1178 (1.1408) [2022-01-18 21:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][770/1251] eta 0:17:41 lr 0.000952 time 2.1205 (2.2063) loss 4.4333 (3.9909) grad_norm 1.0105 (1.1402) [2022-01-18 21:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][780/1251] eta 0:17:19 lr 0.000952 time 
3.0878 (2.2071) loss 3.4081 (3.9871) grad_norm 0.9549 (1.1408) [2022-01-18 21:06:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][790/1251] eta 0:16:57 lr 0.000951 time 1.6755 (2.2071) loss 3.1366 (3.9855) grad_norm 0.9448 (1.1404) [2022-01-18 21:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][800/1251] eta 0:16:34 lr 0.000951 time 1.5277 (2.2060) loss 3.3462 (3.9862) grad_norm 1.0461 (1.1397) [2022-01-18 21:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][810/1251] eta 0:16:13 lr 0.000951 time 2.8974 (2.2066) loss 4.5068 (3.9866) grad_norm 1.1948 (1.1387) [2022-01-18 21:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][820/1251] eta 0:15:50 lr 0.000951 time 2.3319 (2.2061) loss 4.4258 (3.9887) grad_norm 1.4000 (1.1397) [2022-01-18 21:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][830/1251] eta 0:15:28 lr 0.000951 time 1.8637 (2.2056) loss 4.2380 (3.9891) grad_norm 1.1430 (1.1393) [2022-01-18 21:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][840/1251] eta 0:15:06 lr 0.000951 time 1.5492 (2.2051) loss 3.0085 (3.9853) grad_norm 1.0749 (1.1396) [2022-01-18 21:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][850/1251] eta 0:14:44 lr 0.000951 time 2.9835 (2.2060) loss 3.9059 (3.9839) grad_norm 0.9944 (1.1395) [2022-01-18 21:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][860/1251] eta 0:14:22 lr 0.000951 time 2.9900 (2.2064) loss 3.4657 (3.9842) grad_norm 1.1207 (1.1397) [2022-01-18 21:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][870/1251] eta 0:14:00 lr 0.000951 time 1.8365 (2.2061) loss 4.2477 (3.9856) grad_norm 0.9645 (1.1392) [2022-01-18 21:09:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][880/1251] eta 0:13:38 lr 0.000951 time 1.8076 (2.2059) loss 3.8287 (3.9858) grad_norm 1.4547 (1.1402) [2022-01-18 21:10:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][890/1251] eta 0:13:16 lr 0.000951 time 2.9594 (2.2055) loss 3.7314 (3.9849) grad_norm 1.3012 (1.1410) [2022-01-18 21:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][900/1251] eta 0:12:54 lr 0.000951 time 2.8200 (2.2058) loss 4.7315 (3.9845) grad_norm 1.0627 (1.1419) [2022-01-18 21:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][910/1251] eta 0:12:32 lr 0.000951 time 2.1803 (2.2058) loss 4.1178 (3.9852) grad_norm 1.6361 (1.1429) [2022-01-18 21:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][920/1251] eta 0:12:10 lr 0.000951 time 1.5349 (2.2063) loss 4.7483 (3.9821) grad_norm 1.3120 (1.1443) [2022-01-18 21:11:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][930/1251] eta 0:11:47 lr 0.000951 time 1.9073 (2.2055) loss 4.0271 (3.9835) grad_norm 1.2849 (1.1438) [2022-01-18 21:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][940/1251] eta 0:11:26 lr 0.000951 time 3.4977 (2.2059) loss 4.0523 (3.9814) grad_norm 1.3824 (1.1440) [2022-01-18 21:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][950/1251] eta 0:11:03 lr 0.000951 time 1.9247 (2.2048) loss 4.8932 (3.9850) grad_norm 1.1029 (1.1450) [2022-01-18 21:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][960/1251] eta 0:10:41 lr 0.000951 time 1.8966 (2.2054) loss 4.5304 (3.9866) grad_norm 1.0580 (1.1448) [2022-01-18 21:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][970/1251] eta 0:10:19 lr 0.000951 time 1.6164 (2.2058) loss 4.2111 (3.9867) grad_norm 1.0973 (1.1441) [2022-01-18 21:13:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][980/1251] eta 0:09:58 lr 0.000951 time 3.1927 (2.2075) loss 4.4518 (3.9890) grad_norm 1.2036 (1.1437) [2022-01-18 21:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][990/1251] eta 0:09:35 lr 0.000951 time 
1.6897 (2.2059) loss 4.9745 (3.9922) grad_norm 1.1653 (1.1434) [2022-01-18 21:14:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1000/1251] eta 0:09:12 lr 0.000951 time 1.5559 (2.2031) loss 3.3057 (3.9914) grad_norm 0.9719 (1.1432) [2022-01-18 21:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1010/1251] eta 0:08:50 lr 0.000951 time 1.9271 (2.2007) loss 4.6971 (3.9934) grad_norm 1.2020 (1.1425) [2022-01-18 21:14:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1020/1251] eta 0:08:28 lr 0.000951 time 2.3382 (2.1994) loss 4.2904 (3.9922) grad_norm 1.0203 (1.1435) [2022-01-18 21:15:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1030/1251] eta 0:08:06 lr 0.000951 time 2.3078 (2.1998) loss 3.5583 (3.9930) grad_norm 1.2590 (1.1433) [2022-01-18 21:15:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1040/1251] eta 0:07:44 lr 0.000951 time 2.5541 (2.1999) loss 3.2627 (3.9931) grad_norm 1.3988 (1.1434) [2022-01-18 21:15:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1050/1251] eta 0:07:22 lr 0.000951 time 1.9526 (2.2006) loss 4.4710 (3.9960) grad_norm 1.1536 (1.1428) [2022-01-18 21:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1060/1251] eta 0:07:00 lr 0.000951 time 3.0254 (2.2019) loss 4.8872 (3.9975) grad_norm 1.1514 (1.1419) [2022-01-18 21:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1070/1251] eta 0:06:38 lr 0.000951 time 1.9881 (2.2028) loss 3.8984 (3.9975) grad_norm 0.9712 (1.1422) [2022-01-18 21:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1080/1251] eta 0:06:16 lr 0.000951 time 1.8538 (2.2035) loss 4.2087 (3.9958) grad_norm 1.3094 (1.1421) [2022-01-18 21:17:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1090/1251] eta 0:05:54 lr 0.000951 time 2.1059 (2.2044) loss 4.0983 (3.9952) grad_norm 1.2129 (1.1420) [2022-01-18 21:17:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1100/1251] eta 0:05:32 lr 0.000951 time 1.6025 (2.2036) loss 4.0304 (3.9923) grad_norm 1.0689 (1.1425) [2022-01-18 21:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1110/1251] eta 0:05:10 lr 0.000951 time 1.6764 (2.2009) loss 4.0905 (3.9934) grad_norm 1.0610 (1.1426) [2022-01-18 21:18:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1120/1251] eta 0:04:48 lr 0.000951 time 1.8510 (2.2005) loss 3.2383 (3.9904) grad_norm 1.1580 (1.1424) [2022-01-18 21:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1130/1251] eta 0:04:26 lr 0.000951 time 2.2003 (2.2001) loss 3.5756 (3.9888) grad_norm 1.0938 (1.1416) [2022-01-18 21:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1140/1251] eta 0:04:04 lr 0.000951 time 2.2144 (2.2023) loss 4.1241 (3.9891) grad_norm 0.9714 (1.1412) [2022-01-18 21:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1150/1251] eta 0:03:42 lr 0.000951 time 2.1265 (2.2024) loss 4.2532 (3.9908) grad_norm 1.1008 (1.1406) [2022-01-18 21:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1160/1251] eta 0:03:20 lr 0.000951 time 2.2235 (2.2015) loss 4.7039 (3.9920) grad_norm 1.0533 (1.1407) [2022-01-18 21:20:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1170/1251] eta 0:02:58 lr 0.000951 time 2.1919 (2.2008) loss 3.2061 (3.9920) grad_norm 1.0476 (1.1410) [2022-01-18 21:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1180/1251] eta 0:02:36 lr 0.000951 time 2.0979 (2.2016) loss 3.8345 (3.9921) grad_norm 1.1081 (1.1414) [2022-01-18 21:21:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1190/1251] eta 0:02:14 lr 0.000951 time 2.1931 (2.2020) loss 4.2418 (3.9926) grad_norm 1.0850 (1.1411) [2022-01-18 21:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1200/1251] eta 0:01:52 lr 
0.000951 time 2.0681 (2.2016) loss 4.4923 (3.9929) grad_norm 1.2767 (1.1410) [2022-01-18 21:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1210/1251] eta 0:01:30 lr 0.000951 time 1.6540 (2.2010) loss 3.7638 (3.9925) grad_norm 1.0841 (1.1403) [2022-01-18 21:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1220/1251] eta 0:01:08 lr 0.000951 time 1.8641 (2.2001) loss 4.5253 (3.9931) grad_norm 1.0610 (1.1400) [2022-01-18 21:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1230/1251] eta 0:00:46 lr 0.000951 time 1.9759 (2.1985) loss 3.7971 (3.9931) grad_norm 1.1965 (1.1401) [2022-01-18 21:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1240/1251] eta 0:00:24 lr 0.000951 time 1.8868 (2.1983) loss 2.9228 (3.9929) grad_norm 1.1006 (1.1399) [2022-01-18 21:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [42/300][1250/1251] eta 0:00:02 lr 0.000951 time 1.1454 (2.1931) loss 2.9952 (3.9914) grad_norm 1.1605 (1.1394) [2022-01-18 21:23:02 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 42 training takes 0:45:43 [2022-01-18 21:23:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.790 (17.790) Loss 1.3903 (1.3903) Acc@1 69.434 (69.434) Acc@5 88.477 (88.477) [2022-01-18 21:23:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.990 (3.382) Loss 1.2411 (1.3468) Acc@1 73.340 (69.309) Acc@5 90.430 (89.293) [2022-01-18 21:23:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.281 (2.597) Loss 1.4850 (1.3587) Acc@1 66.406 (68.936) Acc@5 86.719 (89.128) [2022-01-18 21:24:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.636 (2.253) Loss 1.4400 (1.3634) Acc@1 66.406 (68.870) Acc@5 88.965 (89.097) [2022-01-18 21:24:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.547 (2.155) Loss 1.3164 (1.3601) Acc@1 71.094 (69.112) Acc@5 89.551 (89.191) [2022-01-18 21:24:37 
swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 69.014 Acc@5 89.202 [2022-01-18 21:24:37 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 69.0% [2022-01-18 21:24:37 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 69.01% [2022-01-18 21:24:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][0/1251] eta 7:32:04 lr 0.000951 time 21.6821 (21.6821) loss 4.2512 (4.2512) grad_norm 1.1508 (1.1508) [2022-01-18 21:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][10/1251] eta 1:22:25 lr 0.000951 time 1.6661 (3.9850) loss 4.5987 (4.0199) grad_norm 0.9506 (1.1872) [2022-01-18 21:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][20/1251] eta 1:06:05 lr 0.000951 time 1.9381 (3.2210) loss 3.3275 (4.0076) grad_norm 0.9893 (1.1751) [2022-01-18 21:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][30/1251] eta 0:58:04 lr 0.000951 time 1.4975 (2.8542) loss 4.4458 (4.1382) grad_norm 1.1440 (1.1563) [2022-01-18 21:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][40/1251] eta 0:55:58 lr 0.000951 time 3.9621 (2.7737) loss 4.2028 (4.1099) grad_norm 1.1777 (1.1500) [2022-01-18 21:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][50/1251] eta 0:53:51 lr 0.000951 time 2.3107 (2.6909) loss 3.5246 (4.1095) grad_norm 1.2804 (1.1352) [2022-01-18 21:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][60/1251] eta 0:51:36 lr 0.000951 time 1.5597 (2.6001) loss 3.4545 (4.0839) grad_norm 0.9896 (1.1322) [2022-01-18 21:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][70/1251] eta 0:49:38 lr 0.000951 time 1.6891 (2.5222) loss 3.8173 (4.0495) grad_norm 1.3959 (1.1465) [2022-01-18 21:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][80/1251] eta 0:48:42 lr 0.000951 time 4.7821 (2.4958) loss 3.4987 (4.0490) grad_norm 1.1853 (1.1422) 
[2022-01-18 21:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][90/1251] eta 0:47:05 lr 0.000950 time 2.1221 (2.4338) loss 4.3513 (4.0718) grad_norm 0.8887 (1.1265) [2022-01-18 21:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][100/1251] eta 0:45:57 lr 0.000950 time 1.5219 (2.3956) loss 4.1221 (4.0667) grad_norm 1.1920 (1.1232) [2022-01-18 21:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][110/1251] eta 0:45:18 lr 0.000950 time 2.3173 (2.3824) loss 4.1131 (4.0700) grad_norm 1.1496 (1.1270) [2022-01-18 21:29:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][120/1251] eta 0:45:10 lr 0.000950 time 3.6389 (2.3965) loss 4.4575 (4.0574) grad_norm 1.1152 (1.1239) [2022-01-18 21:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][130/1251] eta 0:44:30 lr 0.000950 time 1.9501 (2.3822) loss 4.4553 (4.0351) grad_norm 1.3800 (1.1252) [2022-01-18 21:30:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][140/1251] eta 0:43:57 lr 0.000950 time 1.8451 (2.3740) loss 4.2583 (4.0294) grad_norm 1.1907 (1.1223) [2022-01-18 21:30:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][150/1251] eta 0:43:16 lr 0.000950 time 1.7998 (2.3582) loss 4.0949 (4.0452) grad_norm 1.1657 (1.1190) [2022-01-18 21:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][160/1251] eta 0:42:44 lr 0.000950 time 3.3845 (2.3507) loss 4.4149 (4.0460) grad_norm 1.3003 (1.1207) [2022-01-18 21:31:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][170/1251] eta 0:42:03 lr 0.000950 time 2.0689 (2.3348) loss 3.7855 (4.0376) grad_norm 1.4061 (1.1183) [2022-01-18 21:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][180/1251] eta 0:41:35 lr 0.000950 time 1.8474 (2.3297) loss 4.2238 (4.0211) grad_norm 1.3231 (1.1180) [2022-01-18 21:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][190/1251] eta 0:41:06 
lr 0.000950 time 2.2563 (2.3244) loss 4.4535 (4.0107) grad_norm 1.3126 (1.1179) [2022-01-18 21:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][200/1251] eta 0:40:32 lr 0.000950 time 2.5780 (2.3146) loss 4.4398 (4.0149) grad_norm 1.2201 (1.1258) [2022-01-18 21:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][210/1251] eta 0:39:54 lr 0.000950 time 1.6409 (2.3001) loss 4.5637 (4.0079) grad_norm 1.1425 (1.1269) [2022-01-18 21:33:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][220/1251] eta 0:39:22 lr 0.000950 time 1.8754 (2.2919) loss 4.4431 (4.0114) grad_norm 1.0276 (1.1240) [2022-01-18 21:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][230/1251] eta 0:38:47 lr 0.000950 time 2.0899 (2.2797) loss 4.4096 (4.0080) grad_norm 1.1189 (1.1225) [2022-01-18 21:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][240/1251] eta 0:38:19 lr 0.000950 time 1.8927 (2.2745) loss 3.6614 (4.0221) grad_norm 0.9297 (1.1197) [2022-01-18 21:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][250/1251] eta 0:37:52 lr 0.000950 time 2.4782 (2.2705) loss 3.2580 (4.0168) grad_norm 1.2115 (1.1198) [2022-01-18 21:34:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][260/1251] eta 0:37:30 lr 0.000950 time 2.8150 (2.2711) loss 3.2351 (4.0167) grad_norm 1.0964 (1.1194) [2022-01-18 21:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][270/1251] eta 0:37:11 lr 0.000950 time 1.8551 (2.2745) loss 3.3174 (4.0219) grad_norm 0.8577 (1.1158) [2022-01-18 21:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][280/1251] eta 0:36:44 lr 0.000950 time 1.9529 (2.2704) loss 4.5447 (4.0203) grad_norm 1.1554 (1.1136) [2022-01-18 21:35:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][290/1251] eta 0:36:16 lr 0.000950 time 1.9306 (2.2651) loss 3.7518 (4.0245) grad_norm 0.9261 (1.1111) [2022-01-18 21:35:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][300/1251] eta 0:35:51 lr 0.000950 time 2.2605 (2.2625) loss 4.0792 (4.0349) grad_norm 0.9558 (1.1087) [2022-01-18 21:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][310/1251] eta 0:35:22 lr 0.000950 time 1.9792 (2.2560) loss 3.7794 (4.0200) grad_norm 1.1270 (1.1090) [2022-01-18 21:36:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][320/1251] eta 0:34:57 lr 0.000950 time 2.0944 (2.2531) loss 3.6594 (4.0189) grad_norm 1.1807 (1.1065) [2022-01-18 21:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][330/1251] eta 0:34:32 lr 0.000950 time 1.9214 (2.2507) loss 4.2750 (4.0073) grad_norm 1.1620 (1.1051) [2022-01-18 21:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][340/1251] eta 0:34:12 lr 0.000950 time 2.3977 (2.2534) loss 4.5826 (4.0045) grad_norm 1.1881 (1.1054) [2022-01-18 21:37:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][350/1251] eta 0:33:46 lr 0.000950 time 1.7645 (2.2495) loss 3.6291 (4.0073) grad_norm 1.0299 (1.1051) [2022-01-18 21:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][360/1251] eta 0:33:21 lr 0.000950 time 2.5443 (2.2469) loss 3.3204 (4.0053) grad_norm 0.8749 (1.1060) [2022-01-18 21:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][370/1251] eta 0:32:59 lr 0.000950 time 1.5954 (2.2464) loss 4.0847 (4.0007) grad_norm 1.0886 (1.1085) [2022-01-18 21:38:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][380/1251] eta 0:32:36 lr 0.000950 time 1.7945 (2.2466) loss 4.0166 (4.0003) grad_norm 1.1323 (1.1106) [2022-01-18 21:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][390/1251] eta 0:32:13 lr 0.000950 time 1.9334 (2.2461) loss 4.3358 (4.0012) grad_norm 1.2656 (1.1116) [2022-01-18 21:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][400/1251] eta 0:31:49 lr 0.000950 time 
2.1687 (2.2442) loss 3.0634 (4.0060) grad_norm 1.0529 (1.1112) [2022-01-18 21:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][410/1251] eta 0:31:22 lr 0.000950 time 1.8723 (2.2385) loss 3.7715 (4.0118) grad_norm 1.1427 (1.1103) [2022-01-18 21:40:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][420/1251] eta 0:30:56 lr 0.000950 time 2.2223 (2.2346) loss 4.2094 (4.0082) grad_norm 0.9231 (1.1104) [2022-01-18 21:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][430/1251] eta 0:30:35 lr 0.000950 time 1.9509 (2.2355) loss 4.0717 (4.0094) grad_norm 1.0269 (1.1116) [2022-01-18 21:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][440/1251] eta 0:30:13 lr 0.000950 time 1.7630 (2.2365) loss 4.3617 (4.0110) grad_norm 1.3286 (1.1115) [2022-01-18 21:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][450/1251] eta 0:29:49 lr 0.000950 time 1.7531 (2.2345) loss 4.6912 (4.0147) grad_norm 1.2222 (1.1140) [2022-01-18 21:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][460/1251] eta 0:29:26 lr 0.000950 time 1.5783 (2.2336) loss 3.8062 (4.0131) grad_norm 1.4778 (1.1166) [2022-01-18 21:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][470/1251] eta 0:29:03 lr 0.000950 time 1.5556 (2.2324) loss 4.1298 (4.0184) grad_norm 1.0953 (1.1170) [2022-01-18 21:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][480/1251] eta 0:28:40 lr 0.000950 time 2.1531 (2.2321) loss 4.8052 (4.0197) grad_norm 1.2563 (1.1165) [2022-01-18 21:42:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][490/1251] eta 0:28:16 lr 0.000950 time 1.8498 (2.2290) loss 4.9035 (4.0225) grad_norm 1.0552 (1.1192) [2022-01-18 21:43:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][500/1251] eta 0:27:52 lr 0.000950 time 2.1897 (2.2273) loss 3.7947 (4.0186) grad_norm 1.0419 (1.1214) [2022-01-18 21:43:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][510/1251] eta 0:27:28 lr 0.000950 time 1.7659 (2.2241) loss 4.5966 (4.0192) grad_norm 1.0388 (1.1197) [2022-01-18 21:43:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][520/1251] eta 0:27:04 lr 0.000950 time 2.4671 (2.2224) loss 3.8057 (4.0214) grad_norm 1.2071 (1.1199) [2022-01-18 21:44:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][530/1251] eta 0:26:41 lr 0.000950 time 2.1882 (2.2214) loss 4.1727 (4.0211) grad_norm 1.2208 (1.1194) [2022-01-18 21:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][540/1251] eta 0:26:19 lr 0.000950 time 2.1102 (2.2212) loss 3.6810 (4.0202) grad_norm 0.9848 (1.1206) [2022-01-18 21:45:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][550/1251] eta 0:25:55 lr 0.000950 time 2.2055 (2.2189) loss 4.3235 (4.0191) grad_norm 1.1309 (1.1207) [2022-01-18 21:45:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][560/1251] eta 0:25:34 lr 0.000950 time 2.5480 (2.2200) loss 3.8906 (4.0102) grad_norm 0.9571 (1.1204) [2022-01-18 21:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][570/1251] eta 0:25:14 lr 0.000950 time 3.0431 (2.2242) loss 4.2288 (4.0104) grad_norm 1.2931 (1.1221) [2022-01-18 21:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][580/1251] eta 0:24:51 lr 0.000950 time 2.2024 (2.2233) loss 4.8949 (4.0044) grad_norm 1.1397 (1.1212) [2022-01-18 21:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][590/1251] eta 0:24:26 lr 0.000950 time 1.6223 (2.2186) loss 4.4898 (4.0085) grad_norm 1.0127 (1.1204) [2022-01-18 21:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][600/1251] eta 0:24:02 lr 0.000950 time 2.4807 (2.2163) loss 3.8209 (4.0117) grad_norm 1.0430 (1.1210) [2022-01-18 21:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][610/1251] eta 0:23:38 lr 0.000950 time 
2.1437 (2.2131) loss 4.3558 (4.0057) grad_norm 1.1383 (1.1220) [2022-01-18 21:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][620/1251] eta 0:23:16 lr 0.000950 time 2.7700 (2.2128) loss 4.7692 (4.0008) grad_norm 1.3364 (1.1229) [2022-01-18 21:47:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][630/1251] eta 0:22:53 lr 0.000950 time 2.2179 (2.2112) loss 4.1431 (4.0008) grad_norm 1.1286 (1.1227) [2022-01-18 21:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][640/1251] eta 0:22:31 lr 0.000949 time 2.4379 (2.2118) loss 4.6749 (4.0026) grad_norm 1.1574 (1.1220) [2022-01-18 21:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][650/1251] eta 0:22:09 lr 0.000949 time 2.3083 (2.2116) loss 4.0844 (4.0015) grad_norm 1.0117 (1.1224) [2022-01-18 21:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][660/1251] eta 0:21:49 lr 0.000949 time 2.7898 (2.2161) loss 4.2505 (3.9973) grad_norm 0.9360 (1.1215) [2022-01-18 21:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][670/1251] eta 0:21:27 lr 0.000949 time 2.5500 (2.2163) loss 2.9278 (3.9980) grad_norm 1.2896 (1.1224) [2022-01-18 21:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][680/1251] eta 0:21:04 lr 0.000949 time 2.1549 (2.2153) loss 4.4218 (4.0010) grad_norm 1.0092 (1.1220) [2022-01-18 21:50:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][690/1251] eta 0:20:40 lr 0.000949 time 1.8988 (2.2120) loss 2.6660 (4.0000) grad_norm 1.1855 (1.1217) [2022-01-18 21:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][700/1251] eta 0:20:19 lr 0.000949 time 2.2174 (2.2132) loss 4.7709 (3.9986) grad_norm 1.0640 (1.1217) [2022-01-18 21:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][710/1251] eta 0:19:56 lr 0.000949 time 1.9643 (2.2116) loss 4.2807 (4.0013) grad_norm 1.0415 (1.1225) [2022-01-18 21:51:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][720/1251] eta 0:19:33 lr 0.000949 time 2.5262 (2.2106) loss 4.0411 (3.9990) grad_norm 1.1280 (1.1223) [2022-01-18 21:51:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][730/1251] eta 0:19:11 lr 0.000949 time 2.2315 (2.2107) loss 3.7231 (3.9986) grad_norm 1.0769 (1.1222) [2022-01-18 21:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][740/1251] eta 0:18:50 lr 0.000949 time 2.0493 (2.2116) loss 3.3311 (3.9965) grad_norm 1.1657 (1.1221) [2022-01-18 21:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][750/1251] eta 0:18:27 lr 0.000949 time 2.2218 (2.2110) loss 4.3958 (3.9961) grad_norm 1.1301 (1.1226) [2022-01-18 21:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][760/1251] eta 0:18:05 lr 0.000949 time 2.5379 (2.2116) loss 4.5393 (3.9967) grad_norm 1.0549 (1.1232) [2022-01-18 21:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][770/1251] eta 0:17:43 lr 0.000949 time 1.5887 (2.2103) loss 4.3794 (3.9992) grad_norm 1.2441 (1.1237) [2022-01-18 21:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][780/1251] eta 0:17:21 lr 0.000949 time 1.9299 (2.2112) loss 4.7182 (4.0030) grad_norm 1.3672 (1.1247) [2022-01-18 21:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][790/1251] eta 0:16:58 lr 0.000949 time 1.8627 (2.2083) loss 4.7388 (4.0052) grad_norm 0.9583 (1.1252) [2022-01-18 21:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][800/1251] eta 0:16:35 lr 0.000949 time 2.0844 (2.2080) loss 4.8815 (4.0074) grad_norm 1.2529 (1.1262) [2022-01-18 21:54:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][810/1251] eta 0:16:13 lr 0.000949 time 2.2370 (2.2072) loss 3.5832 (4.0090) grad_norm 1.0227 (1.1256) [2022-01-18 21:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][820/1251] eta 0:15:51 lr 0.000949 time 
1.9313 (2.2066) loss 3.5818 (4.0056) grad_norm 1.1200 (1.1254) [2022-01-18 21:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][830/1251] eta 0:15:28 lr 0.000949 time 2.1482 (2.2060) loss 3.9779 (4.0062) grad_norm 0.9362 (1.1243) [2022-01-18 21:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][840/1251] eta 0:15:06 lr 0.000949 time 1.8852 (2.2053) loss 4.4788 (4.0080) grad_norm 0.9787 (1.1235) [2022-01-18 21:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][850/1251] eta 0:14:44 lr 0.000949 time 2.0074 (2.2055) loss 4.2011 (4.0059) grad_norm 1.4146 (1.1234) [2022-01-18 21:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][860/1251] eta 0:14:22 lr 0.000949 time 2.2604 (2.2053) loss 4.2334 (4.0045) grad_norm 0.9611 (1.1249) [2022-01-18 21:56:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][870/1251] eta 0:13:59 lr 0.000949 time 1.9339 (2.2043) loss 4.1765 (4.0041) grad_norm 1.0510 (1.1243) [2022-01-18 21:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][880/1251] eta 0:13:37 lr 0.000949 time 1.8548 (2.2041) loss 5.0104 (4.0039) grad_norm 1.0274 (1.1246) [2022-01-18 21:57:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][890/1251] eta 0:13:16 lr 0.000949 time 2.2190 (2.2051) loss 4.4080 (4.0049) grad_norm 1.0704 (1.1256) [2022-01-18 21:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][900/1251] eta 0:12:54 lr 0.000949 time 2.8051 (2.2069) loss 4.1908 (4.0042) grad_norm 0.9098 (1.1261) [2022-01-18 21:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][910/1251] eta 0:12:31 lr 0.000949 time 1.7340 (2.2048) loss 3.3520 (4.0003) grad_norm 1.1055 (1.1262) [2022-01-18 21:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][920/1251] eta 0:12:08 lr 0.000949 time 2.0444 (2.2018) loss 4.6500 (3.9982) grad_norm 0.9966 (1.1249) [2022-01-18 21:58:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][930/1251] eta 0:11:45 lr 0.000949 time 2.3735 (2.1993) loss 3.7687 (3.9982) grad_norm 1.0186 (1.1239) [2022-01-18 21:59:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][940/1251] eta 0:11:23 lr 0.000949 time 2.4265 (2.1988) loss 3.5010 (3.9946) grad_norm 1.0738 (1.1238) [2022-01-18 21:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][950/1251] eta 0:11:02 lr 0.000949 time 3.2599 (2.2002) loss 4.3049 (3.9924) grad_norm 0.9877 (1.1237) [2022-01-18 21:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][960/1251] eta 0:10:40 lr 0.000949 time 1.6581 (2.2008) loss 4.6019 (3.9919) grad_norm 1.1199 (1.1236) [2022-01-18 22:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][970/1251] eta 0:10:19 lr 0.000949 time 2.6675 (2.2040) loss 4.0481 (3.9901) grad_norm 1.2026 (1.1234) [2022-01-18 22:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][980/1251] eta 0:09:57 lr 0.000949 time 2.2189 (2.2053) loss 4.4926 (3.9912) grad_norm 0.9437 (1.1231) [2022-01-18 22:01:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][990/1251] eta 0:09:35 lr 0.000949 time 3.0836 (2.2059) loss 4.7935 (3.9917) grad_norm 1.2930 (1.1231) [2022-01-18 22:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1000/1251] eta 0:09:13 lr 0.000949 time 1.7178 (2.2042) loss 3.2822 (3.9909) grad_norm 1.0306 (1.1227) [2022-01-18 22:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1010/1251] eta 0:08:50 lr 0.000949 time 2.0728 (2.2017) loss 4.2032 (3.9911) grad_norm 0.9641 (1.1233) [2022-01-18 22:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1020/1251] eta 0:08:28 lr 0.000949 time 1.9814 (2.1993) loss 4.4116 (3.9910) grad_norm 0.8825 (1.1225) [2022-01-18 22:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1030/1251] eta 0:08:06 lr 0.000949 time 
2.6862 (2.1999) loss 3.5473 (3.9900) grad_norm 0.9181 (1.1212) [2022-01-18 22:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1040/1251] eta 0:07:44 lr 0.000949 time 1.8065 (2.2005) loss 4.4775 (3.9917) grad_norm 1.1202 (1.1210) [2022-01-18 22:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1050/1251] eta 0:07:22 lr 0.000949 time 2.4725 (2.2020) loss 4.1397 (3.9894) grad_norm 1.2897 (1.1207) [2022-01-18 22:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1060/1251] eta 0:07:00 lr 0.000949 time 1.7227 (2.2029) loss 4.2049 (3.9897) grad_norm 1.0468 (1.1209) [2022-01-18 22:03:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1070/1251] eta 0:06:38 lr 0.000949 time 3.2520 (2.2041) loss 3.4821 (3.9901) grad_norm 0.9584 (1.1202) [2022-01-18 22:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1080/1251] eta 0:06:16 lr 0.000949 time 1.8632 (2.2031) loss 4.0358 (3.9923) grad_norm 1.3090 (1.1199) [2022-01-18 22:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1090/1251] eta 0:05:54 lr 0.000949 time 1.8753 (2.2026) loss 4.0175 (3.9943) grad_norm 1.1661 (1.1196) [2022-01-18 22:05:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1100/1251] eta 0:05:32 lr 0.000949 time 1.9272 (2.2017) loss 4.4744 (3.9949) grad_norm 1.1372 (1.1193) [2022-01-18 22:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1110/1251] eta 0:05:10 lr 0.000949 time 5.0127 (2.2041) loss 4.0842 (3.9978) grad_norm 1.3386 (1.1199) [2022-01-18 22:05:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1120/1251] eta 0:04:48 lr 0.000949 time 2.4933 (2.2050) loss 4.1299 (3.9981) grad_norm 1.2724 (1.1206) [2022-01-18 22:06:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1130/1251] eta 0:04:26 lr 0.000949 time 1.9213 (2.2035) loss 3.6781 (3.9971) grad_norm 1.0234 (1.1203) [2022-01-18 22:06:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1140/1251] eta 0:04:04 lr 0.000949 time 1.7177 (2.2026) loss 4.0431 (3.9968) grad_norm 1.1136 (1.1192) [2022-01-18 22:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1150/1251] eta 0:03:42 lr 0.000949 time 3.6751 (2.2026) loss 4.6109 (3.9980) grad_norm 1.0543 (1.1194) [2022-01-18 22:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1160/1251] eta 0:03:20 lr 0.000949 time 2.0894 (2.2028) loss 4.1935 (3.9956) grad_norm 1.3678 (1.1195) [2022-01-18 22:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1170/1251] eta 0:02:58 lr 0.000949 time 1.7029 (2.2032) loss 4.1385 (3.9952) grad_norm 1.1596 (1.1190) [2022-01-18 22:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1180/1251] eta 0:02:36 lr 0.000949 time 1.9037 (2.2020) loss 3.0104 (3.9926) grad_norm 1.3661 (1.1199) [2022-01-18 22:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1190/1251] eta 0:02:14 lr 0.000948 time 3.7035 (2.2023) loss 4.1154 (3.9938) grad_norm 1.0254 (1.1202) [2022-01-18 22:08:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1200/1251] eta 0:01:52 lr 0.000948 time 2.1487 (2.2018) loss 3.0212 (3.9915) grad_norm 1.1145 (1.1196) [2022-01-18 22:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1210/1251] eta 0:01:30 lr 0.000948 time 2.1463 (2.2025) loss 4.4490 (3.9918) grad_norm 1.0369 (1.1197) [2022-01-18 22:09:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1220/1251] eta 0:01:08 lr 0.000948 time 1.8618 (2.2012) loss 3.2231 (3.9903) grad_norm 0.9572 (1.1193) [2022-01-18 22:09:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1230/1251] eta 0:00:46 lr 0.000948 time 3.5976 (2.2031) loss 2.9599 (3.9919) grad_norm 0.9949 (1.1188) [2022-01-18 22:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1240/1251] eta 0:00:24 lr 
0.000948 time 1.6486 (2.2013) loss 4.0315 (3.9933) grad_norm 0.9477 (1.1186) [2022-01-18 22:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [43/300][1250/1251] eta 0:00:02 lr 0.000948 time 1.1957 (2.1951) loss 4.1922 (3.9914) grad_norm 1.2955 (1.1187) [2022-01-18 22:10:24 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 43 training takes 0:45:46 [2022-01-18 22:10:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.325 (18.325) Loss 1.5358 (1.5358) Acc@1 65.332 (65.332) Acc@5 87.500 (87.500) [2022-01-18 22:11:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.620 (3.436) Loss 1.2704 (1.3753) Acc@1 71.973 (69.416) Acc@5 91.016 (89.560) [2022-01-18 22:11:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.602 (2.640) Loss 1.4000 (1.3920) Acc@1 68.164 (69.220) Acc@5 89.355 (89.365) [2022-01-18 22:11:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.579 (2.267) Loss 1.4032 (1.3965) Acc@1 69.336 (69.024) Acc@5 88.867 (89.321) [2022-01-18 22:11:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.300 (2.202) Loss 1.4003 (1.3938) Acc@1 68.848 (69.100) Acc@5 88.965 (89.332) [2022-01-18 22:12:02 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 69.124 Acc@5 89.418 [2022-01-18 22:12:02 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 69.1% [2022-01-18 22:12:02 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 69.12% [2022-01-18 22:12:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][0/1251] eta 7:32:34 lr 0.000948 time 21.7059 (21.7059) loss 3.6570 (3.6570) grad_norm 1.1890 (1.1890) [2022-01-18 22:12:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][10/1251] eta 1:24:13 lr 0.000948 time 2.2550 (4.0724) loss 3.9789 (3.9210) grad_norm 0.9953 (1.1331) [2022-01-18 22:13:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[44/300][20/1251] eta 1:04:30 lr 0.000948 time 1.6593 (3.1444) loss 4.1110 (3.9824) grad_norm 1.0939 (1.1073) [2022-01-18 22:13:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][30/1251] eta 0:56:55 lr 0.000948 time 1.8132 (2.7975) loss 4.1240 (4.0871) grad_norm 1.1696 (1.1053) [2022-01-18 22:13:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][40/1251] eta 0:53:51 lr 0.000948 time 3.9648 (2.6684) loss 4.1337 (4.0134) grad_norm 1.1708 (1.1260) [2022-01-18 22:14:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][50/1251] eta 0:52:00 lr 0.000948 time 2.0886 (2.5987) loss 3.0501 (4.0314) grad_norm 1.5813 (1.1205) [2022-01-18 22:14:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][60/1251] eta 0:50:29 lr 0.000948 time 1.4311 (2.5439) loss 4.5432 (4.0321) grad_norm 1.2011 (1.1176) [2022-01-18 22:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][70/1251] eta 0:48:48 lr 0.000948 time 1.7099 (2.4796) loss 4.8395 (4.0271) grad_norm 0.9990 (1.1299) [2022-01-18 22:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][80/1251] eta 0:48:21 lr 0.000948 time 3.7526 (2.4780) loss 4.5964 (4.0395) grad_norm 1.2363 (1.1178) [2022-01-18 22:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][90/1251] eta 0:47:40 lr 0.000948 time 2.6556 (2.4635) loss 2.9310 (4.0032) grad_norm 1.1532 (1.1057) [2022-01-18 22:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][100/1251] eta 0:47:17 lr 0.000948 time 2.5370 (2.4657) loss 4.1640 (4.0309) grad_norm 1.1456 (1.1027) [2022-01-18 22:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][110/1251] eta 0:46:19 lr 0.000948 time 1.6641 (2.4358) loss 4.2187 (4.0066) grad_norm 1.0367 (1.0993) [2022-01-18 22:16:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][120/1251] eta 0:45:14 lr 0.000948 time 2.5233 (2.4002) loss 3.7143 (4.0068) grad_norm 1.1348 (1.1008) 
[2022-01-18 22:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][130/1251] eta 0:44:08 lr 0.000948 time 1.9286 (2.3623) loss 4.7521 (4.0104) grad_norm 1.0819 (1.0994) [2022-01-18 22:17:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][140/1251] eta 0:43:17 lr 0.000948 time 2.5422 (2.3384) loss 3.1221 (3.9950) grad_norm 1.2765 (1.1095) [2022-01-18 22:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][150/1251] eta 0:42:43 lr 0.000948 time 2.6589 (2.3286) loss 4.5092 (4.0122) grad_norm 1.0317 (1.1124) [2022-01-18 22:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][160/1251] eta 0:42:07 lr 0.000948 time 2.2192 (2.3165) loss 4.5666 (4.0198) grad_norm 0.9838 (1.1126) [2022-01-18 22:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][170/1251] eta 0:41:37 lr 0.000948 time 2.5640 (2.3103) loss 4.4151 (4.0106) grad_norm 1.2993 (1.1109) [2022-01-18 22:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][180/1251] eta 0:41:09 lr 0.000948 time 1.8567 (2.3058) loss 4.1658 (4.0065) grad_norm 1.3059 (1.1113) [2022-01-18 22:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][190/1251] eta 0:40:42 lr 0.000948 time 2.3236 (2.3016) loss 4.1722 (4.0075) grad_norm 0.9545 (1.1068) [2022-01-18 22:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][200/1251] eta 0:40:22 lr 0.000948 time 3.1815 (2.3046) loss 3.0562 (3.9921) grad_norm 1.1165 (1.1054) [2022-01-18 22:20:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][210/1251] eta 0:39:57 lr 0.000948 time 2.5433 (2.3030) loss 3.6635 (3.9997) grad_norm 1.0760 (1.1009) [2022-01-18 22:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][220/1251] eta 0:39:24 lr 0.000948 time 2.3081 (2.2939) loss 3.2432 (3.9847) grad_norm 1.1197 (1.1011) [2022-01-18 22:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][230/1251] eta 0:38:55 
lr 0.000948 time 2.4435 (2.2877) loss 4.6236 (3.9834) grad_norm 1.2593 (1.1005) [2022-01-18 22:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][240/1251] eta 0:38:36 lr 0.000948 time 3.0627 (2.2913) loss 4.4979 (3.9835) grad_norm 0.9294 (1.1012) [2022-01-18 22:21:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][250/1251] eta 0:38:06 lr 0.000948 time 1.9071 (2.2838) loss 3.3660 (3.9771) grad_norm 1.1328 (1.1034) [2022-01-18 22:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][260/1251] eta 0:37:36 lr 0.000948 time 1.6425 (2.2768) loss 4.2356 (3.9789) grad_norm 1.0496 (1.1035) [2022-01-18 22:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][270/1251] eta 0:37:09 lr 0.000948 time 2.1245 (2.2725) loss 3.9269 (3.9787) grad_norm 1.1479 (1.1042) [2022-01-18 22:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][280/1251] eta 0:36:42 lr 0.000948 time 3.0817 (2.2681) loss 4.0701 (3.9758) grad_norm 1.1171 (1.1021) [2022-01-18 22:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][290/1251] eta 0:36:18 lr 0.000948 time 2.2301 (2.2670) loss 4.1996 (3.9822) grad_norm 1.1421 (1.1019) [2022-01-18 22:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][300/1251] eta 0:35:51 lr 0.000948 time 2.5108 (2.2623) loss 4.0912 (3.9837) grad_norm 0.9133 (1.1026) [2022-01-18 22:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][310/1251] eta 0:35:26 lr 0.000948 time 2.2246 (2.2598) loss 4.1188 (3.9841) grad_norm 1.2913 (1.1044) [2022-01-18 22:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][320/1251] eta 0:35:00 lr 0.000948 time 2.2520 (2.2566) loss 4.1279 (3.9875) grad_norm 0.9746 (1.1048) [2022-01-18 22:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][330/1251] eta 0:34:36 lr 0.000948 time 2.0993 (2.2547) loss 4.8794 (3.9977) grad_norm 1.0742 (1.1044) [2022-01-18 22:24:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][340/1251] eta 0:34:10 lr 0.000948 time 2.2137 (2.2504) loss 3.7982 (3.9936) grad_norm 1.2601 (1.1037) [2022-01-18 22:25:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][350/1251] eta 0:33:44 lr 0.000948 time 2.1219 (2.2471) loss 3.6339 (3.9924) grad_norm 1.2184 (1.1027) [2022-01-18 22:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][360/1251] eta 0:33:20 lr 0.000948 time 2.5789 (2.2454) loss 3.9712 (3.9848) grad_norm 0.9886 (1.1027) [2022-01-18 22:25:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][370/1251] eta 0:32:56 lr 0.000948 time 1.9804 (2.2438) loss 4.9133 (3.9896) grad_norm 1.4341 (1.1034) [2022-01-18 22:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][380/1251] eta 0:32:33 lr 0.000948 time 2.5303 (2.2423) loss 4.0618 (3.9908) grad_norm 1.2237 (1.1049) [2022-01-18 22:26:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][390/1251] eta 0:32:12 lr 0.000948 time 2.1862 (2.2440) loss 4.4403 (3.9908) grad_norm 1.1838 (1.1071) [2022-01-18 22:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][400/1251] eta 0:31:48 lr 0.000948 time 3.4968 (2.2425) loss 3.9595 (3.9910) grad_norm 0.9351 (1.1090) [2022-01-18 22:27:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][410/1251] eta 0:31:26 lr 0.000948 time 2.1889 (2.2432) loss 4.5072 (3.9910) grad_norm 1.1115 (1.1081) [2022-01-18 22:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][420/1251] eta 0:31:02 lr 0.000948 time 2.1857 (2.2407) loss 4.0915 (3.9893) grad_norm 1.2085 (1.1101) [2022-01-18 22:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][430/1251] eta 0:30:37 lr 0.000948 time 1.7657 (2.2386) loss 4.5189 (3.9898) grad_norm 1.1959 (1.1099) [2022-01-18 22:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][440/1251] eta 0:30:13 lr 0.000948 time 
2.0532 (2.2358) loss 4.5967 (3.9975) grad_norm 1.1094 (1.1102) [2022-01-18 22:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][450/1251] eta 0:29:47 lr 0.000948 time 1.6202 (2.2318) loss 2.9741 (3.9992) grad_norm 0.9179 (1.1101) [2022-01-18 22:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][460/1251] eta 0:29:22 lr 0.000948 time 1.6714 (2.2287) loss 4.3082 (3.9904) grad_norm 1.0177 (1.1101) [2022-01-18 22:29:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][470/1251] eta 0:29:02 lr 0.000948 time 2.2231 (2.2316) loss 4.2503 (3.9949) grad_norm 1.0734 (1.1130) [2022-01-18 22:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][480/1251] eta 0:28:39 lr 0.000947 time 2.4752 (2.2299) loss 3.7988 (3.9972) grad_norm 1.1053 (1.1137) [2022-01-18 22:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][490/1251] eta 0:28:14 lr 0.000947 time 1.5936 (2.2267) loss 3.5481 (3.9981) grad_norm 1.0534 (1.1123) [2022-01-18 22:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][500/1251] eta 0:27:52 lr 0.000947 time 2.6085 (2.2266) loss 2.9932 (3.9942) grad_norm 0.9857 (1.1127) [2022-01-18 22:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][510/1251] eta 0:27:29 lr 0.000947 time 1.8797 (2.2263) loss 3.5173 (3.9936) grad_norm 1.0960 (1.1112) [2022-01-18 22:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][520/1251] eta 0:27:06 lr 0.000947 time 1.9793 (2.2244) loss 4.3064 (3.9996) grad_norm 1.1862 (1.1111) [2022-01-18 22:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][530/1251] eta 0:26:41 lr 0.000947 time 2.3185 (2.2217) loss 3.5816 (3.9953) grad_norm 1.1318 (1.1113) [2022-01-18 22:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][540/1251] eta 0:26:20 lr 0.000947 time 2.6179 (2.2224) loss 4.4560 (3.9972) grad_norm 0.9965 (1.1118) [2022-01-18 22:32:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][550/1251] eta 0:25:56 lr 0.000947 time 2.1296 (2.2201) loss 3.6357 (3.9993) grad_norm 0.9809 (1.1105) [2022-01-18 22:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][560/1251] eta 0:25:33 lr 0.000947 time 2.3213 (2.2187) loss 4.1928 (4.0029) grad_norm 1.0291 (1.1107) [2022-01-18 22:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][570/1251] eta 0:25:11 lr 0.000947 time 2.2715 (2.2202) loss 4.2048 (4.0035) grad_norm 1.0220 (1.1095) [2022-01-18 22:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][580/1251] eta 0:24:52 lr 0.000947 time 3.7509 (2.2244) loss 4.5929 (4.0044) grad_norm 0.9729 (1.1083) [2022-01-18 22:33:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][590/1251] eta 0:24:31 lr 0.000947 time 2.7700 (2.2265) loss 4.7746 (4.0063) grad_norm 1.3355 (1.1081) [2022-01-18 22:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][600/1251] eta 0:24:07 lr 0.000947 time 2.2298 (2.2239) loss 3.8802 (4.0034) grad_norm 0.9727 (1.1100) [2022-01-18 22:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][610/1251] eta 0:23:41 lr 0.000947 time 1.6074 (2.2179) loss 2.6022 (3.9982) grad_norm 1.0779 (1.1119) [2022-01-18 22:34:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][620/1251] eta 0:23:16 lr 0.000947 time 1.9321 (2.2138) loss 3.7219 (3.9967) grad_norm 1.0640 (1.1127) [2022-01-18 22:35:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][630/1251] eta 0:22:52 lr 0.000947 time 2.0540 (2.2104) loss 4.2882 (3.9940) grad_norm 1.0864 (1.1133) [2022-01-18 22:35:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][640/1251] eta 0:22:28 lr 0.000947 time 2.1985 (2.2077) loss 3.0874 (3.9932) grad_norm 1.1560 (1.1128) [2022-01-18 22:35:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][650/1251] eta 0:22:06 lr 0.000947 time 
2.1094 (2.2064) loss 4.6291 (3.9953) grad_norm 1.0190 (1.1131) [2022-01-18 22:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][660/1251] eta 0:21:43 lr 0.000947 time 2.5708 (2.2053) loss 3.7980 (3.9983) grad_norm 1.0024 (1.1126) [2022-01-18 22:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][670/1251] eta 0:21:21 lr 0.000947 time 2.1460 (2.2050) loss 3.1467 (3.9931) grad_norm 0.9477 (1.1117) [2022-01-18 22:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][680/1251] eta 0:21:06 lr 0.000947 time 7.3580 (2.2173) loss 4.2322 (3.9910) grad_norm 1.0375 (1.1111) [2022-01-18 22:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][690/1251] eta 0:20:43 lr 0.000947 time 2.1906 (2.2169) loss 3.5145 (3.9888) grad_norm 1.2602 (1.1100) [2022-01-18 22:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][700/1251] eta 0:20:21 lr 0.000947 time 2.1207 (2.2173) loss 2.7628 (3.9883) grad_norm 1.3538 (1.1104) [2022-01-18 22:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][710/1251] eta 0:19:59 lr 0.000947 time 1.9255 (2.2167) loss 4.2846 (3.9928) grad_norm 1.0993 (1.1095) [2022-01-18 22:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][720/1251] eta 0:19:37 lr 0.000947 time 3.4028 (2.2168) loss 3.7014 (3.9893) grad_norm 0.9654 (1.1091) [2022-01-18 22:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][730/1251] eta 0:19:12 lr 0.000947 time 1.6145 (2.2130) loss 4.1587 (3.9874) grad_norm 0.9822 (1.1090) [2022-01-18 22:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][740/1251] eta 0:18:49 lr 0.000947 time 1.6259 (2.2099) loss 4.4812 (3.9870) grad_norm 1.2360 (1.1094) [2022-01-18 22:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][750/1251] eta 0:18:26 lr 0.000947 time 1.8145 (2.2090) loss 2.9996 (3.9877) grad_norm 1.3273 (1.1085) [2022-01-18 22:40:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][760/1251] eta 0:18:04 lr 0.000947 time 2.5024 (2.2079) loss 4.3028 (3.9858) grad_norm 1.0126 (1.1088) [2022-01-18 22:40:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][770/1251] eta 0:17:41 lr 0.000947 time 2.0486 (2.2077) loss 3.2734 (3.9858) grad_norm 1.2569 (1.1092) [2022-01-18 22:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][780/1251] eta 0:17:20 lr 0.000947 time 1.9071 (2.2081) loss 2.8424 (3.9829) grad_norm 1.1031 (1.1091) [2022-01-18 22:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][790/1251] eta 0:16:58 lr 0.000947 time 1.8874 (2.2086) loss 4.0875 (3.9819) grad_norm 1.0224 (1.1098) [2022-01-18 22:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][800/1251] eta 0:16:36 lr 0.000947 time 2.5104 (2.2087) loss 3.6789 (3.9821) grad_norm 1.1566 (1.1103) [2022-01-18 22:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][810/1251] eta 0:16:13 lr 0.000947 time 1.9729 (2.2076) loss 4.2894 (3.9790) grad_norm 1.3587 (1.1110) [2022-01-18 22:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][820/1251] eta 0:15:51 lr 0.000947 time 1.8098 (2.2066) loss 4.6214 (3.9844) grad_norm 1.2242 (1.1127) [2022-01-18 22:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][830/1251] eta 0:15:28 lr 0.000947 time 2.2640 (2.2059) loss 4.7444 (3.9858) grad_norm 0.9222 (1.1127) [2022-01-18 22:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][840/1251] eta 0:15:06 lr 0.000947 time 2.5591 (2.2051) loss 2.7959 (3.9842) grad_norm 1.1108 (1.1132) [2022-01-18 22:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][850/1251] eta 0:14:43 lr 0.000947 time 1.6071 (2.2040) loss 4.0550 (3.9843) grad_norm 1.3039 (1.1147) [2022-01-18 22:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][860/1251] eta 0:14:21 lr 0.000947 time 
2.0991 (2.2044) loss 3.3732 (3.9805) grad_norm 1.0980 (1.1150) [2022-01-18 22:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][870/1251] eta 0:13:59 lr 0.000947 time 1.9707 (2.2040) loss 4.2337 (3.9826) grad_norm 1.1088 (1.1147) [2022-01-18 22:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][880/1251] eta 0:13:37 lr 0.000947 time 2.1855 (2.2033) loss 4.0862 (3.9829) grad_norm 0.9909 (1.1149) [2022-01-18 22:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][890/1251] eta 0:13:15 lr 0.000947 time 2.6409 (2.2038) loss 4.2430 (3.9871) grad_norm 1.2704 (1.1149) [2022-01-18 22:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][900/1251] eta 0:12:53 lr 0.000947 time 1.9484 (2.2031) loss 3.5468 (3.9845) grad_norm 0.9796 (1.1155) [2022-01-18 22:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][910/1251] eta 0:12:30 lr 0.000947 time 2.2408 (2.2023) loss 3.7964 (3.9871) grad_norm 0.9898 (1.1152) [2022-01-18 22:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][920/1251] eta 0:12:09 lr 0.000947 time 2.8587 (2.2031) loss 4.0004 (3.9883) grad_norm 0.9577 (1.1155) [2022-01-18 22:46:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][930/1251] eta 0:11:47 lr 0.000947 time 2.5825 (2.2045) loss 4.8514 (3.9892) grad_norm 1.1664 (1.1153) [2022-01-18 22:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][940/1251] eta 0:11:25 lr 0.000947 time 2.8220 (2.2041) loss 4.2954 (3.9898) grad_norm 1.3000 (1.1162) [2022-01-18 22:46:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][950/1251] eta 0:11:02 lr 0.000947 time 2.0994 (2.2021) loss 4.5364 (3.9916) grad_norm 1.0106 (1.1164) [2022-01-18 22:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][960/1251] eta 0:10:40 lr 0.000947 time 1.8969 (2.2001) loss 4.0074 (3.9903) grad_norm 1.0658 (1.1158) [2022-01-18 22:47:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][970/1251] eta 0:10:18 lr 0.000947 time 2.4127 (2.2001) loss 4.6946 (3.9917) grad_norm 1.2399 (1.1148) [2022-01-18 22:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][980/1251] eta 0:09:56 lr 0.000947 time 2.8950 (2.2011) loss 4.4619 (3.9916) grad_norm 1.2040 (1.1147) [2022-01-18 22:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][990/1251] eta 0:09:34 lr 0.000947 time 2.7325 (2.2012) loss 4.3655 (3.9910) grad_norm 1.3445 (1.1143) [2022-01-18 22:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1000/1251] eta 0:09:12 lr 0.000947 time 1.8751 (2.2012) loss 4.0296 (3.9904) grad_norm 1.2993 (1.1151) [2022-01-18 22:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1010/1251] eta 0:08:50 lr 0.000947 time 2.6260 (2.2024) loss 4.1774 (3.9906) grad_norm 1.0711 (1.1146) [2022-01-18 22:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1020/1251] eta 0:08:29 lr 0.000946 time 3.3163 (2.2049) loss 4.1523 (3.9927) grad_norm 1.1102 (1.1142) [2022-01-18 22:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1030/1251] eta 0:08:07 lr 0.000946 time 1.7421 (2.2045) loss 3.1866 (3.9913) grad_norm 1.2028 (1.1141) [2022-01-18 22:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1040/1251] eta 0:07:44 lr 0.000946 time 1.8858 (2.2021) loss 4.8169 (3.9900) grad_norm 1.2102 (1.1140) [2022-01-18 22:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1050/1251] eta 0:07:22 lr 0.000946 time 2.4620 (2.1996) loss 3.9112 (3.9896) grad_norm 1.0113 (1.1135) [2022-01-18 22:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1060/1251] eta 0:06:59 lr 0.000946 time 2.7889 (2.1989) loss 4.5493 (3.9906) grad_norm 1.0821 (1.1129) [2022-01-18 22:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1070/1251] eta 0:06:38 lr 0.000946 
time 2.1804 (2.1993) loss 4.3018 (3.9908) grad_norm 1.0758 (1.1122) [2022-01-18 22:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1080/1251] eta 0:06:16 lr 0.000946 time 2.0811 (2.2001) loss 4.4926 (3.9875) grad_norm 1.0731 (1.1121) [2022-01-18 22:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1090/1251] eta 0:05:54 lr 0.000946 time 2.1951 (2.2008) loss 4.4803 (3.9862) grad_norm 1.2510 (1.1126) [2022-01-18 22:52:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1100/1251] eta 0:05:32 lr 0.000946 time 2.4904 (2.2016) loss 4.4301 (3.9866) grad_norm 1.1922 (1.1125) [2022-01-18 22:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1110/1251] eta 0:05:10 lr 0.000946 time 1.8852 (2.2010) loss 3.5814 (3.9844) grad_norm 0.8959 (1.1126) [2022-01-18 22:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1120/1251] eta 0:04:48 lr 0.000946 time 2.1401 (2.1994) loss 4.0834 (3.9830) grad_norm 1.1437 (1.1139) [2022-01-18 22:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1130/1251] eta 0:04:26 lr 0.000946 time 1.8459 (2.1992) loss 3.8210 (3.9834) grad_norm 1.3703 (1.1144) [2022-01-18 22:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1140/1251] eta 0:04:04 lr 0.000946 time 2.5112 (2.1991) loss 3.2795 (3.9826) grad_norm 1.0765 (1.1142) [2022-01-18 22:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1150/1251] eta 0:03:42 lr 0.000946 time 2.2760 (2.1992) loss 3.9762 (3.9845) grad_norm 1.0449 (1.1147) [2022-01-18 22:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1160/1251] eta 0:03:20 lr 0.000946 time 1.8294 (2.1998) loss 3.4885 (3.9826) grad_norm 1.1830 (1.1151) [2022-01-18 22:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1170/1251] eta 0:02:58 lr 0.000946 time 1.8977 (2.2009) loss 4.5988 (3.9839) grad_norm 0.9694 (1.1151) [2022-01-18 22:55:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1180/1251] eta 0:02:36 lr 0.000946 time 2.4246 (2.2017) loss 3.9477 (3.9835) grad_norm 1.2749 (1.1156) [2022-01-18 22:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1190/1251] eta 0:02:14 lr 0.000946 time 2.1445 (2.2014) loss 4.0931 (3.9829) grad_norm 0.9366 (1.1152) [2022-01-18 22:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1200/1251] eta 0:01:52 lr 0.000946 time 1.9300 (2.2010) loss 3.8068 (3.9851) grad_norm 1.0223 (1.1151) [2022-01-18 22:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1210/1251] eta 0:01:30 lr 0.000946 time 1.6647 (2.1983) loss 4.2319 (3.9845) grad_norm 0.9593 (1.1148) [2022-01-18 22:56:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1220/1251] eta 0:01:08 lr 0.000946 time 1.9002 (2.1968) loss 4.6376 (3.9836) grad_norm 1.2740 (1.1143) [2022-01-18 22:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1230/1251] eta 0:00:46 lr 0.000946 time 2.4959 (2.1968) loss 3.9199 (3.9859) grad_norm 1.0832 (1.1144) [2022-01-18 22:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1240/1251] eta 0:00:24 lr 0.000946 time 1.9086 (2.1970) loss 3.2539 (3.9858) grad_norm 1.1777 (1.1144) [2022-01-18 22:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [44/300][1250/1251] eta 0:00:02 lr 0.000946 time 1.1728 (2.1919) loss 3.4838 (3.9845) grad_norm 1.5235 (1.1154) [2022-01-18 22:57:45 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 44 training takes 0:45:42 [2022-01-18 22:58:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.574 (18.574) Loss 1.3134 (1.3134) Acc@1 68.262 (68.262) Acc@5 89.648 (89.648) [2022-01-18 22:58:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.597 (3.481) Loss 1.2822 (1.2918) Acc@1 71.680 (69.798) Acc@5 88.379 (89.648) [2022-01-18 22:58:41 swin_tiny_patch4_window7_224] 
(main.py 239): INFO Test: [20/49] Time 1.295 (2.687) Loss 1.2120 (1.2965) Acc@1 72.559 (69.843) Acc@5 90.430 (89.565) [2022-01-18 22:58:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.301 (2.309) Loss 1.2925 (1.2955) Acc@1 68.555 (69.629) Acc@5 89.551 (89.696) [2022-01-18 22:59:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.465 (2.177) Loss 1.2596 (1.2908) Acc@1 70.215 (69.708) Acc@5 90.137 (89.796) [2022-01-18 22:59:21 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 69.654 Acc@5 89.778 [2022-01-18 22:59:21 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 69.7% [2022-01-18 22:59:21 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 69.65% [2022-01-18 22:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][0/1251] eta 7:31:25 lr 0.000946 time 21.6514 (21.6514) loss 4.5712 (4.5712) grad_norm 1.0859 (1.0859) [2022-01-18 23:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][10/1251] eta 1:22:36 lr 0.000946 time 1.4906 (3.9943) loss 4.4490 (4.0733) grad_norm 1.0185 (1.0292) [2022-01-18 23:00:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][20/1251] eta 1:05:36 lr 0.000946 time 1.5411 (3.1976) loss 4.6759 (4.1934) grad_norm 1.0194 (1.0444) [2022-01-18 23:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][30/1251] eta 0:57:10 lr 0.000946 time 1.5850 (2.8095) loss 4.5644 (4.2240) grad_norm 1.1500 (1.0600) [2022-01-18 23:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][40/1251] eta 0:54:13 lr 0.000946 time 3.8342 (2.6867) loss 4.1579 (4.1829) grad_norm 0.9720 (1.0696) [2022-01-18 23:01:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][50/1251] eta 0:53:16 lr 0.000946 time 2.8137 (2.6619) loss 3.8250 (4.0851) grad_norm 1.1225 (1.0756) [2022-01-18 23:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][60/1251] 
eta 0:51:38 lr 0.000946 time 2.1330 (2.6018) loss 4.6382 (4.0851) grad_norm 1.3832 (1.0899) [2022-01-18 23:02:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][70/1251] eta 0:49:44 lr 0.000946 time 2.2620 (2.5269) loss 3.4435 (4.0803) grad_norm 1.0990 (1.1196) [2022-01-18 23:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][80/1251] eta 0:48:36 lr 0.000946 time 3.2334 (2.4902) loss 4.8840 (4.0805) grad_norm 1.1086 (1.1395) [2022-01-18 23:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][90/1251] eta 0:47:32 lr 0.000946 time 2.2441 (2.4566) loss 4.2020 (4.1162) grad_norm 1.0603 (1.1408) [2022-01-18 23:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][100/1251] eta 0:46:33 lr 0.000946 time 2.4008 (2.4268) loss 4.0926 (4.0916) grad_norm 1.0040 (1.1303) [2022-01-18 23:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][110/1251] eta 0:45:33 lr 0.000946 time 2.4584 (2.3959) loss 4.5064 (4.1017) grad_norm 1.2703 (1.1335) [2022-01-18 23:04:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][120/1251] eta 0:44:54 lr 0.000946 time 2.2616 (2.3824) loss 4.5408 (4.0788) grad_norm 1.2087 (1.1341) [2022-01-18 23:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][130/1251] eta 0:44:10 lr 0.000946 time 2.2371 (2.3648) loss 4.1146 (4.0898) grad_norm 1.1956 (1.1311) [2022-01-18 23:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][140/1251] eta 0:43:23 lr 0.000946 time 1.7733 (2.3434) loss 3.5474 (4.0610) grad_norm 1.2221 (1.1333) [2022-01-18 23:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][150/1251] eta 0:42:58 lr 0.000946 time 2.7046 (2.3422) loss 3.9731 (4.0588) grad_norm 0.9699 (1.1394) [2022-01-18 23:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][160/1251] eta 0:42:26 lr 0.000946 time 1.9778 (2.3345) loss 3.3593 (4.0471) grad_norm 1.0931 (1.1359) [2022-01-18 23:06:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][170/1251] eta 0:42:00 lr 0.000946 time 2.8479 (2.3315) loss 4.2818 (4.0355) grad_norm 1.0494 (1.1354) [2022-01-18 23:06:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][180/1251] eta 0:41:25 lr 0.000946 time 1.8210 (2.3212) loss 4.0854 (4.0260) grad_norm 1.0271 (1.1339) [2022-01-18 23:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][190/1251] eta 0:40:54 lr 0.000946 time 3.1372 (2.3131) loss 4.0354 (4.0364) grad_norm 1.2162 (1.1375) [2022-01-18 23:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][200/1251] eta 0:40:24 lr 0.000946 time 2.2701 (2.3071) loss 4.5417 (4.0261) grad_norm 1.0792 (1.1397) [2022-01-18 23:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][210/1251] eta 0:39:57 lr 0.000946 time 2.4851 (2.3034) loss 3.9303 (4.0359) grad_norm 1.0570 (1.1401) [2022-01-18 23:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][220/1251] eta 0:39:23 lr 0.000946 time 1.8916 (2.2928) loss 4.7339 (4.0406) grad_norm 1.2013 (1.1390) [2022-01-18 23:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][230/1251] eta 0:38:48 lr 0.000946 time 2.0539 (2.2807) loss 4.7717 (4.0491) grad_norm 0.8982 (1.1380) [2022-01-18 23:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][240/1251] eta 0:38:20 lr 0.000946 time 2.3550 (2.2758) loss 4.4848 (4.0346) grad_norm 1.1403 (1.1385) [2022-01-18 23:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][250/1251] eta 0:37:53 lr 0.000946 time 2.2390 (2.2715) loss 4.6824 (4.0331) grad_norm 1.1825 (1.1375) [2022-01-18 23:09:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][260/1251] eta 0:37:30 lr 0.000946 time 1.8889 (2.2706) loss 3.7258 (4.0326) grad_norm 1.0927 (1.1398) [2022-01-18 23:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][270/1251] eta 0:37:04 lr 0.000946 time 
1.9564 (2.2681) loss 3.3101 (4.0269) grad_norm 0.9667 (1.1394) [2022-01-18 23:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][280/1251] eta 0:36:33 lr 0.000946 time 2.2733 (2.2593) loss 4.1949 (4.0181) grad_norm 1.0810 (1.1370) [2022-01-18 23:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][290/1251] eta 0:36:03 lr 0.000946 time 1.9408 (2.2512) loss 3.5962 (4.0157) grad_norm 1.0966 (1.1353) [2022-01-18 23:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][300/1251] eta 0:35:37 lr 0.000945 time 1.9027 (2.2481) loss 3.5849 (4.0176) grad_norm 0.9760 (1.1332) [2022-01-18 23:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][310/1251] eta 0:35:13 lr 0.000945 time 1.8591 (2.2464) loss 4.7277 (4.0184) grad_norm 1.1527 (1.1319) [2022-01-18 23:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][320/1251] eta 0:34:51 lr 0.000945 time 1.8153 (2.2463) loss 4.5235 (4.0226) grad_norm 1.0369 (1.1299) [2022-01-18 23:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][330/1251] eta 0:34:26 lr 0.000945 time 1.8332 (2.2436) loss 3.6238 (4.0187) grad_norm 1.1745 (1.1311) [2022-01-18 23:12:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][340/1251] eta 0:34:02 lr 0.000945 time 1.9728 (2.2416) loss 4.1819 (4.0158) grad_norm 0.9724 (1.1300) [2022-01-18 23:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][350/1251] eta 0:33:37 lr 0.000945 time 2.7881 (2.2397) loss 3.2810 (4.0147) grad_norm 1.3773 (1.1315) [2022-01-18 23:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][360/1251] eta 0:33:18 lr 0.000945 time 2.4714 (2.2425) loss 3.4539 (4.0171) grad_norm 1.1421 (1.1334) [2022-01-18 23:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][370/1251] eta 0:32:55 lr 0.000945 time 2.2161 (2.2424) loss 3.8978 (4.0161) grad_norm 0.9260 (1.1338) [2022-01-18 23:13:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][380/1251] eta 0:32:31 lr 0.000945 time 1.6572 (2.2408) loss 3.7383 (4.0172) grad_norm 1.0475 (1.1351) [2022-01-18 23:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][390/1251] eta 0:32:05 lr 0.000945 time 2.5426 (2.2365) loss 4.3840 (4.0186) grad_norm 1.5712 (1.1377) [2022-01-18 23:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][400/1251] eta 0:31:41 lr 0.000945 time 1.8817 (2.2346) loss 4.2028 (4.0168) grad_norm 1.1921 (1.1369) [2022-01-18 23:14:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][410/1251] eta 0:31:15 lr 0.000945 time 1.6092 (2.2296) loss 4.6143 (4.0154) grad_norm 1.2065 (1.1368) [2022-01-18 23:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][420/1251] eta 0:30:54 lr 0.000945 time 2.8672 (2.2314) loss 4.5422 (4.0136) grad_norm 1.3462 (1.1360) [2022-01-18 23:15:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][430/1251] eta 0:30:34 lr 0.000945 time 3.1224 (2.2340) loss 4.3408 (4.0153) grad_norm 0.9711 (1.1346) [2022-01-18 23:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][440/1251] eta 0:30:11 lr 0.000945 time 2.5114 (2.2338) loss 3.2848 (4.0106) grad_norm 1.0808 (1.1352) [2022-01-18 23:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][450/1251] eta 0:29:45 lr 0.000945 time 1.7007 (2.2296) loss 4.0205 (4.0084) grad_norm 1.0699 (1.1349) [2022-01-18 23:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][460/1251] eta 0:29:23 lr 0.000945 time 2.6163 (2.2292) loss 3.0880 (4.0020) grad_norm 1.1797 (1.1354) [2022-01-18 23:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][470/1251] eta 0:29:01 lr 0.000945 time 3.6485 (2.2293) loss 3.6518 (4.0024) grad_norm 1.1761 (1.1362) [2022-01-18 23:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][480/1251] eta 0:28:36 lr 0.000945 time 
1.5493 (2.2268) loss 4.8148 (4.0046) grad_norm 1.0409 (1.1352) [2022-01-18 23:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][490/1251] eta 0:28:12 lr 0.000945 time 1.9045 (2.2247) loss 3.2362 (4.0051) grad_norm 1.0710 (1.1355) [2022-01-18 23:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][500/1251] eta 0:27:51 lr 0.000945 time 2.2295 (2.2261) loss 2.7447 (4.0059) grad_norm 1.0945 (1.1360) [2022-01-18 23:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][510/1251] eta 0:27:32 lr 0.000945 time 3.6533 (2.2301) loss 4.2579 (4.0055) grad_norm 1.5341 (1.1372) [2022-01-18 23:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][520/1251] eta 0:27:09 lr 0.000945 time 1.8220 (2.2291) loss 4.5638 (4.0038) grad_norm 1.0878 (1.1372) [2022-01-18 23:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][530/1251] eta 0:26:45 lr 0.000945 time 1.6741 (2.2264) loss 4.1834 (4.0068) grad_norm 1.0301 (1.1373) [2022-01-18 23:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][540/1251] eta 0:26:20 lr 0.000945 time 1.9412 (2.2235) loss 4.6658 (4.0080) grad_norm 1.2747 (1.1371) [2022-01-18 23:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][550/1251] eta 0:25:56 lr 0.000945 time 2.8900 (2.2209) loss 4.2656 (4.0126) grad_norm 1.1747 (1.1358) [2022-01-18 23:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][560/1251] eta 0:25:33 lr 0.000945 time 1.8627 (2.2187) loss 3.7000 (4.0123) grad_norm 1.4626 (1.1363) [2022-01-18 23:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][570/1251] eta 0:25:09 lr 0.000945 time 1.5321 (2.2169) loss 3.3732 (4.0113) grad_norm 1.5295 (1.1388) [2022-01-18 23:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][580/1251] eta 0:24:48 lr 0.000945 time 2.7239 (2.2183) loss 4.1325 (4.0087) grad_norm 1.1737 (1.1389) [2022-01-18 23:21:15 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][590/1251] eta 0:24:29 lr 0.000945 time 3.3126 (2.2225) loss 3.3608 (4.0105) grad_norm 1.7741 (1.1411) [2022-01-18 23:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][600/1251] eta 0:24:07 lr 0.000945 time 1.7422 (2.2230) loss 4.1958 (4.0059) grad_norm 1.1601 (1.1431) [2022-01-18 23:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][610/1251] eta 0:23:44 lr 0.000945 time 2.3515 (2.2225) loss 2.8486 (4.0087) grad_norm 1.1331 (1.1447) [2022-01-18 23:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][620/1251] eta 0:23:19 lr 0.000945 time 1.9461 (2.2185) loss 3.6093 (4.0100) grad_norm 0.9863 (1.1429) [2022-01-18 23:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][630/1251] eta 0:22:54 lr 0.000945 time 1.8721 (2.2136) loss 3.1156 (4.0095) grad_norm 1.2922 (1.1424) [2022-01-18 23:22:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][640/1251] eta 0:22:30 lr 0.000945 time 1.8907 (2.2110) loss 4.6016 (4.0066) grad_norm 0.9353 (1.1403) [2022-01-18 23:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][650/1251] eta 0:22:08 lr 0.000945 time 2.1735 (2.2104) loss 4.3981 (4.0062) grad_norm 1.0142 (1.1393) [2022-01-18 23:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][660/1251] eta 0:21:46 lr 0.000945 time 2.2267 (2.2102) loss 4.3200 (4.0037) grad_norm 1.0436 (1.1385) [2022-01-18 23:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][670/1251] eta 0:21:25 lr 0.000945 time 2.3094 (2.2122) loss 4.3183 (4.0040) grad_norm 1.2233 (1.1384) [2022-01-18 23:24:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][680/1251] eta 0:21:03 lr 0.000945 time 1.8024 (2.2121) loss 4.0528 (4.0046) grad_norm 1.1267 (1.1377) [2022-01-18 23:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][690/1251] eta 0:20:41 lr 0.000945 time 
2.5166 (2.2137) loss 4.3128 (4.0067) grad_norm 1.0741 (1.1389) [2022-01-18 23:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][700/1251] eta 0:20:19 lr 0.000945 time 2.3384 (2.2133) loss 4.2557 (4.0051) grad_norm 1.1217 (1.1392) [2022-01-18 23:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][710/1251] eta 0:19:56 lr 0.000945 time 1.8968 (2.2125) loss 4.1588 (4.0010) grad_norm 1.0950 (1.1401) [2022-01-18 23:25:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][720/1251] eta 0:19:34 lr 0.000945 time 1.9755 (2.2110) loss 2.9311 (4.0010) grad_norm 0.9762 (1.1410) [2022-01-18 23:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][730/1251] eta 0:19:11 lr 0.000945 time 1.8709 (2.2107) loss 4.2838 (4.0002) grad_norm 1.1728 (1.1407) [2022-01-18 23:26:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][740/1251] eta 0:18:49 lr 0.000945 time 1.8292 (2.2109) loss 3.4991 (3.9982) grad_norm 0.9861 (1.1393) [2022-01-18 23:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][750/1251] eta 0:18:27 lr 0.000945 time 2.5642 (2.2114) loss 4.5645 (3.9980) grad_norm 1.2196 (1.1395) [2022-01-18 23:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][760/1251] eta 0:18:06 lr 0.000945 time 2.0038 (2.2125) loss 3.3234 (3.9998) grad_norm 1.0570 (1.1401) [2022-01-18 23:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][770/1251] eta 0:17:44 lr 0.000945 time 2.3335 (2.2128) loss 3.9955 (4.0033) grad_norm 1.0194 (1.1393) [2022-01-18 23:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][780/1251] eta 0:17:20 lr 0.000945 time 1.6859 (2.2098) loss 4.1262 (3.9995) grad_norm 1.4907 (1.1398) [2022-01-18 23:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][790/1251] eta 0:16:57 lr 0.000945 time 1.8269 (2.2080) loss 3.9343 (3.9976) grad_norm 1.0625 (1.1398) [2022-01-18 23:28:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][800/1251] eta 0:16:35 lr 0.000945 time 2.5544 (2.2083) loss 4.4477 (3.9986) grad_norm 1.0003 (1.1383) [2022-01-18 23:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][810/1251] eta 0:16:12 lr 0.000945 time 2.0630 (2.2055) loss 3.3473 (3.9997) grad_norm 1.0067 (1.1384) [2022-01-18 23:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][820/1251] eta 0:15:50 lr 0.000944 time 1.8648 (2.2050) loss 3.4282 (3.9987) grad_norm 1.1475 (1.1385) [2022-01-18 23:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][830/1251] eta 0:15:28 lr 0.000944 time 1.7535 (2.2044) loss 3.3807 (4.0006) grad_norm 1.0915 (1.1374) [2022-01-18 23:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][840/1251] eta 0:15:07 lr 0.000944 time 2.6235 (2.2076) loss 2.9609 (3.9969) grad_norm 1.0710 (1.1373) [2022-01-18 23:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][850/1251] eta 0:14:45 lr 0.000944 time 2.2229 (2.2087) loss 4.2072 (3.9962) grad_norm 1.4291 (1.1369) [2022-01-18 23:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][860/1251] eta 0:14:22 lr 0.000944 time 1.9127 (2.2064) loss 4.4088 (3.9964) grad_norm 0.9080 (1.1386) [2022-01-18 23:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][870/1251] eta 0:14:00 lr 0.000944 time 2.3702 (2.2062) loss 4.0644 (3.9959) grad_norm 1.2232 (1.1382) [2022-01-18 23:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][880/1251] eta 0:13:38 lr 0.000944 time 1.5792 (2.2050) loss 2.8060 (3.9950) grad_norm 1.1374 (1.1386) [2022-01-18 23:32:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][890/1251] eta 0:13:15 lr 0.000944 time 2.4733 (2.2046) loss 3.1657 (3.9907) grad_norm 1.1657 (1.1383) [2022-01-18 23:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][900/1251] eta 0:12:53 lr 0.000944 time 
2.2113 (2.2050) loss 3.5491 (3.9897) grad_norm 0.8860 (1.1385) [2022-01-18 23:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][910/1251] eta 0:12:31 lr 0.000944 time 2.8063 (2.2038) loss 3.6293 (3.9920) grad_norm 0.9995 (1.1385) [2022-01-18 23:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][920/1251] eta 0:12:09 lr 0.000944 time 1.8019 (2.2028) loss 3.4886 (3.9889) grad_norm 0.9642 (1.1381) [2022-01-18 23:33:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][930/1251] eta 0:11:47 lr 0.000944 time 2.8698 (2.2041) loss 3.0376 (3.9874) grad_norm 1.2863 (1.1373) [2022-01-18 23:33:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][940/1251] eta 0:11:25 lr 0.000944 time 1.5731 (2.2042) loss 3.3279 (3.9858) grad_norm 1.1499 (1.1391) [2022-01-18 23:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][950/1251] eta 0:11:04 lr 0.000944 time 3.1121 (2.2070) loss 4.4979 (3.9833) grad_norm 0.9415 (1.1396) [2022-01-18 23:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][960/1251] eta 0:10:42 lr 0.000944 time 1.5238 (2.2065) loss 4.5245 (3.9822) grad_norm 1.0255 (1.1393) [2022-01-18 23:35:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][970/1251] eta 0:10:19 lr 0.000944 time 2.0448 (2.2061) loss 3.5922 (3.9837) grad_norm 1.2243 (1.1396) [2022-01-18 23:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][980/1251] eta 0:09:57 lr 0.000944 time 2.1937 (2.2048) loss 3.7051 (3.9833) grad_norm 1.0429 (1.1392) [2022-01-18 23:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][990/1251] eta 0:09:34 lr 0.000944 time 1.6934 (2.2028) loss 3.8363 (3.9833) grad_norm 0.9364 (1.1384) [2022-01-18 23:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1000/1251] eta 0:09:12 lr 0.000944 time 1.9043 (2.2014) loss 4.4766 (3.9821) grad_norm 1.0402 (1.1383) [2022-01-18 23:36:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1010/1251] eta 0:08:50 lr 0.000944 time 2.2363 (2.2006) loss 3.9118 (3.9833) grad_norm 1.1937 (1.1395) [2022-01-18 23:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1020/1251] eta 0:08:28 lr 0.000944 time 2.5491 (2.2011) loss 4.3498 (3.9866) grad_norm 1.0532 (1.1396) [2022-01-18 23:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1030/1251] eta 0:08:06 lr 0.000944 time 2.8243 (2.2010) loss 4.4740 (3.9888) grad_norm 1.0768 (1.1385) [2022-01-18 23:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1040/1251] eta 0:07:44 lr 0.000944 time 2.1912 (2.2011) loss 3.9044 (3.9871) grad_norm 1.2387 (1.1387) [2022-01-18 23:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1050/1251] eta 0:07:22 lr 0.000944 time 1.9804 (2.2015) loss 3.9248 (3.9889) grad_norm 0.9082 (1.1382) [2022-01-18 23:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1060/1251] eta 0:07:00 lr 0.000944 time 2.7862 (2.2024) loss 4.1600 (3.9880) grad_norm 1.2201 (1.1375) [2022-01-18 23:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1070/1251] eta 0:06:38 lr 0.000944 time 2.1335 (2.2017) loss 3.5238 (3.9875) grad_norm 1.1856 (1.1369) [2022-01-18 23:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1080/1251] eta 0:06:16 lr 0.000944 time 2.4162 (2.2025) loss 4.4557 (3.9868) grad_norm 1.0450 (1.1367) [2022-01-18 23:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1090/1251] eta 0:05:54 lr 0.000944 time 2.7095 (2.2033) loss 4.2285 (3.9879) grad_norm 1.5442 (1.1370) [2022-01-18 23:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1100/1251] eta 0:05:32 lr 0.000944 time 2.0570 (2.2027) loss 3.0699 (3.9859) grad_norm 1.0929 (1.1374) [2022-01-18 23:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1110/1251] eta 0:05:10 lr 
0.000944 time 1.9435 (2.2008) loss 3.8723 (3.9846) grad_norm 1.0394 (1.1371) [2022-01-18 23:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1120/1251] eta 0:04:48 lr 0.000944 time 2.4681 (2.2000) loss 4.2870 (3.9878) grad_norm 1.2367 (1.1365) [2022-01-18 23:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1130/1251] eta 0:04:25 lr 0.000944 time 2.1614 (2.1981) loss 2.9498 (3.9858) grad_norm 1.4687 (1.1374) [2022-01-18 23:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1140/1251] eta 0:04:03 lr 0.000944 time 2.1674 (2.1964) loss 3.8691 (3.9859) grad_norm 1.0073 (1.1377) [2022-01-18 23:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1150/1251] eta 0:03:41 lr 0.000944 time 1.9966 (2.1954) loss 4.0144 (3.9855) grad_norm 1.1413 (1.1378) [2022-01-18 23:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1160/1251] eta 0:03:19 lr 0.000944 time 2.4713 (2.1943) loss 2.9609 (3.9857) grad_norm 1.0593 (1.1386) [2022-01-18 23:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1170/1251] eta 0:02:57 lr 0.000944 time 1.8422 (2.1943) loss 2.8216 (3.9825) grad_norm 1.0252 (1.1386) [2022-01-18 23:42:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1180/1251] eta 0:02:35 lr 0.000944 time 1.9287 (2.1949) loss 4.2627 (3.9812) grad_norm 1.1673 (1.1383) [2022-01-18 23:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1190/1251] eta 0:02:13 lr 0.000944 time 2.5426 (2.1954) loss 4.3934 (3.9808) grad_norm 1.2459 (1.1390) [2022-01-18 23:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1200/1251] eta 0:01:51 lr 0.000944 time 2.6562 (2.1960) loss 4.6602 (3.9805) grad_norm 0.9621 (1.1393) [2022-01-18 23:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1210/1251] eta 0:01:30 lr 0.000944 time 2.5503 (2.1973) loss 3.9119 (3.9800) grad_norm 1.0197 (1.1392) [2022-01-18 23:44:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1220/1251] eta 0:01:08 lr 0.000944 time 2.3263 (2.1999) loss 3.8171 (3.9805) grad_norm 1.1905 (1.1389) [2022-01-18 23:44:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1230/1251] eta 0:00:46 lr 0.000944 time 1.8425 (2.2003) loss 3.0688 (3.9791) grad_norm 1.0489 (1.1387) [2022-01-18 23:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1240/1251] eta 0:00:24 lr 0.000944 time 1.8230 (2.1980) loss 4.0322 (3.9808) grad_norm 1.2152 (1.1383) [2022-01-18 23:45:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [45/300][1250/1251] eta 0:00:02 lr 0.000944 time 1.2122 (2.1913) loss 4.2799 (3.9803) grad_norm 1.1087 (1.1384) [2022-01-18 23:45:03 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 45 training takes 0:45:41 [2022-01-18 23:45:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.856 (18.856) Loss 1.2752 (1.2752) Acc@1 71.680 (71.680) Acc@5 90.234 (90.234) [2022-01-18 23:45:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.308 (3.502) Loss 1.2499 (1.3324) Acc@1 72.949 (70.020) Acc@5 90.820 (89.870) [2022-01-18 23:46:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.945 (2.736) Loss 1.3490 (1.3340) Acc@1 70.703 (69.950) Acc@5 89.258 (89.946) [2022-01-18 23:46:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.586 (2.341) Loss 1.3890 (1.3403) Acc@1 70.508 (69.752) Acc@5 87.988 (89.885) [2022-01-18 23:46:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.180 (2.238) Loss 1.3740 (1.3472) Acc@1 68.750 (69.507) Acc@5 89.453 (89.753) [2022-01-18 23:46:42 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 69.424 Acc@5 89.672 [2022-01-18 23:46:42 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 69.4% [2022-01-18 23:46:42 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 
69.65% [2022-01-18 23:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][0/1251] eta 7:31:55 lr 0.000944 time 21.6748 (21.6748) loss 2.8996 (2.8996) grad_norm 0.9639 (0.9639) [2022-01-18 23:47:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][10/1251] eta 1:27:54 lr 0.000944 time 2.2686 (4.2502) loss 4.0115 (3.7016) grad_norm 1.2400 (1.0408) [2022-01-18 23:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][20/1251] eta 1:06:09 lr 0.000944 time 1.2403 (3.2245) loss 4.0374 (3.7584) grad_norm 1.1013 (1.0560) [2022-01-18 23:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][30/1251] eta 1:00:22 lr 0.000944 time 1.5114 (2.9670) loss 3.6137 (3.8627) grad_norm 1.0405 (1.0372) [2022-01-18 23:48:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][40/1251] eta 0:57:10 lr 0.000944 time 3.6366 (2.8331) loss 4.2199 (3.9414) grad_norm 0.9725 (1.0449) [2022-01-18 23:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][50/1251] eta 0:54:27 lr 0.000944 time 3.0718 (2.7207) loss 3.6250 (3.9633) grad_norm 1.1274 (1.0497) [2022-01-18 23:49:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][60/1251] eta 0:51:17 lr 0.000944 time 1.8964 (2.5843) loss 4.8151 (4.0272) grad_norm 1.1865 (1.0493) [2022-01-18 23:49:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][70/1251] eta 0:49:24 lr 0.000944 time 2.1780 (2.5106) loss 4.2247 (4.0522) grad_norm 1.0219 (1.0659) [2022-01-18 23:50:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][80/1251] eta 0:48:18 lr 0.000944 time 2.1538 (2.4751) loss 3.1820 (4.0473) grad_norm 1.0598 (1.0619) [2022-01-18 23:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][90/1251] eta 0:47:28 lr 0.000943 time 2.8175 (2.4534) loss 4.0888 (4.0737) grad_norm 1.0778 (1.0716) [2022-01-18 23:50:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][100/1251] eta 0:46:35 lr 
0.000943 time 2.5380 (2.4285) loss 4.3074 (4.0683) grad_norm 1.0367 (1.0774) [2022-01-18 23:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][110/1251] eta 0:45:30 lr 0.000943 time 1.7514 (2.3930) loss 4.2479 (4.0347) grad_norm 1.1093 (1.0827) [2022-01-18 23:51:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][120/1251] eta 0:44:41 lr 0.000943 time 2.1933 (2.3711) loss 4.0303 (4.0274) grad_norm 1.0270 (1.0817) [2022-01-18 23:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][130/1251] eta 0:44:01 lr 0.000943 time 2.5052 (2.3567) loss 3.7666 (4.0251) grad_norm 1.2352 (1.0820) [2022-01-18 23:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][140/1251] eta 0:43:16 lr 0.000943 time 2.2113 (2.3374) loss 4.6142 (4.0350) grad_norm 1.6106 (1.0916) [2022-01-18 23:52:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][150/1251] eta 0:42:33 lr 0.000943 time 2.1505 (2.3193) loss 4.1169 (4.0386) grad_norm 1.0708 (1.0923) [2022-01-18 23:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][160/1251] eta 0:42:02 lr 0.000943 time 1.8811 (2.3119) loss 4.0385 (4.0301) grad_norm 1.3748 (1.0925) [2022-01-18 23:53:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][170/1251] eta 0:41:43 lr 0.000943 time 3.0981 (2.3156) loss 4.3639 (4.0211) grad_norm 1.2814 (1.0945) [2022-01-18 23:53:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][180/1251] eta 0:41:11 lr 0.000943 time 2.7011 (2.3080) loss 4.3634 (4.0110) grad_norm 1.1683 (1.0939) [2022-01-18 23:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][190/1251] eta 0:40:42 lr 0.000943 time 1.7630 (2.3023) loss 2.9199 (3.9976) grad_norm 1.2025 (1.1004) [2022-01-18 23:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][200/1251] eta 0:40:11 lr 0.000943 time 1.8832 (2.2943) loss 4.3641 (4.0047) grad_norm 1.1980 (1.1055) [2022-01-18 23:54:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][210/1251] eta 0:39:44 lr 0.000943 time 2.4885 (2.2904) loss 3.3582 (3.9977) grad_norm 1.0928 (1.1059) [2022-01-18 23:55:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][220/1251] eta 0:39:19 lr 0.000943 time 2.8239 (2.2886) loss 4.0800 (4.0029) grad_norm 1.2730 (1.1078) [2022-01-18 23:55:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][230/1251] eta 0:38:46 lr 0.000943 time 2.2360 (2.2789) loss 4.0289 (4.0037) grad_norm 1.1025 (1.1089) [2022-01-18 23:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][240/1251] eta 0:38:20 lr 0.000943 time 1.5406 (2.2753) loss 4.3654 (4.0054) grad_norm 1.0047 (1.1094) [2022-01-18 23:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][250/1251] eta 0:37:59 lr 0.000943 time 2.4726 (2.2774) loss 3.5326 (4.0071) grad_norm 1.1356 (1.1092) [2022-01-18 23:56:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][260/1251] eta 0:37:37 lr 0.000943 time 1.8995 (2.2781) loss 3.1207 (4.0091) grad_norm 1.1647 (1.1117) [2022-01-18 23:56:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][270/1251] eta 0:37:07 lr 0.000943 time 1.7911 (2.2707) loss 3.3331 (3.9963) grad_norm 0.8673 (1.1120) [2022-01-18 23:57:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][280/1251] eta 0:36:40 lr 0.000943 time 1.8663 (2.2658) loss 3.1657 (3.9906) grad_norm 1.4930 (1.1178) [2022-01-18 23:57:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][290/1251] eta 0:36:15 lr 0.000943 time 2.3112 (2.2640) loss 4.4735 (3.9932) grad_norm 1.3726 (1.1206) [2022-01-18 23:58:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][300/1251] eta 0:35:49 lr 0.000943 time 2.5525 (2.2601) loss 4.2228 (3.9917) grad_norm 1.0143 (1.1191) [2022-01-18 23:58:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][310/1251] eta 0:35:23 lr 0.000943 time 
1.5304 (2.2567) loss 3.0127 (3.9831) grad_norm 1.0927 (1.1172) [2022-01-18 23:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][320/1251] eta 0:34:59 lr 0.000943 time 2.2074 (2.2552) loss 3.5717 (3.9772) grad_norm 1.1356 (1.1169) [2022-01-18 23:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][330/1251] eta 0:34:37 lr 0.000943 time 2.6331 (2.2554) loss 4.2676 (3.9644) grad_norm 1.0519 (1.1178) [2022-01-18 23:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][340/1251] eta 0:34:08 lr 0.000943 time 1.6020 (2.2482) loss 4.4838 (3.9644) grad_norm 1.1361 (1.1189) [2022-01-18 23:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][350/1251] eta 0:33:42 lr 0.000943 time 1.7326 (2.2448) loss 3.8025 (3.9653) grad_norm 0.9672 (1.1185) [2022-01-19 00:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][360/1251] eta 0:33:14 lr 0.000943 time 1.8993 (2.2389) loss 3.9308 (3.9651) grad_norm 1.0365 (1.1167) [2022-01-19 00:00:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][370/1251] eta 0:32:53 lr 0.000943 time 1.6410 (2.2401) loss 3.3147 (3.9614) grad_norm 1.2946 (1.1169) [2022-01-19 00:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][380/1251] eta 0:32:31 lr 0.000943 time 1.7493 (2.2406) loss 4.1970 (3.9606) grad_norm 1.0816 (1.1199) [2022-01-19 00:01:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][390/1251] eta 0:32:10 lr 0.000943 time 1.5880 (2.2419) loss 4.2142 (3.9576) grad_norm 1.0551 (1.1206) [2022-01-19 00:01:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][400/1251] eta 0:31:47 lr 0.000943 time 1.9140 (2.2409) loss 4.4155 (3.9653) grad_norm 1.2031 (1.1200) [2022-01-19 00:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][410/1251] eta 0:31:23 lr 0.000943 time 1.8511 (2.2397) loss 3.9790 (3.9651) grad_norm 1.1553 (1.1204) [2022-01-19 00:02:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][420/1251] eta 0:30:55 lr 0.000943 time 1.9031 (2.2334) loss 4.7168 (3.9643) grad_norm 1.0668 (1.1201) [2022-01-19 00:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][430/1251] eta 0:30:32 lr 0.000943 time 1.8931 (2.2325) loss 4.9844 (3.9653) grad_norm 1.0742 (1.1216) [2022-01-19 00:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][440/1251] eta 0:30:09 lr 0.000943 time 2.4776 (2.2318) loss 4.0605 (3.9642) grad_norm 1.6094 (1.1256) [2022-01-19 00:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][450/1251] eta 0:29:48 lr 0.000943 time 2.1672 (2.2326) loss 4.2286 (3.9612) grad_norm 1.1716 (1.1261) [2022-01-19 00:03:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][460/1251] eta 0:29:24 lr 0.000943 time 2.0502 (2.2310) loss 3.8652 (3.9597) grad_norm 1.1790 (1.1265) [2022-01-19 00:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][470/1251] eta 0:29:00 lr 0.000943 time 1.7498 (2.2285) loss 4.6421 (3.9623) grad_norm 1.3766 (1.1274) [2022-01-19 00:04:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][480/1251] eta 0:28:36 lr 0.000943 time 2.2979 (2.2267) loss 3.9282 (3.9591) grad_norm 0.9339 (1.1265) [2022-01-19 00:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][490/1251] eta 0:28:13 lr 0.000943 time 2.4722 (2.2248) loss 3.6280 (3.9611) grad_norm 1.1814 (1.1261) [2022-01-19 00:05:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][500/1251] eta 0:27:49 lr 0.000943 time 1.6634 (2.2225) loss 4.8250 (3.9701) grad_norm 0.9063 (1.1260) [2022-01-19 00:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][510/1251] eta 0:27:24 lr 0.000943 time 1.9295 (2.2195) loss 4.2362 (3.9740) grad_norm 1.2659 (1.1262) [2022-01-19 00:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][520/1251] eta 0:27:00 lr 0.000943 time 
1.8089 (2.2169) loss 3.9205 (3.9801) grad_norm 1.0683 (1.1256) [2022-01-19 00:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][530/1251] eta 0:26:38 lr 0.000943 time 1.9271 (2.2175) loss 4.5720 (3.9840) grad_norm 1.0902 (1.1265) [2022-01-19 00:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][540/1251] eta 0:26:16 lr 0.000943 time 1.8618 (2.2168) loss 3.7155 (3.9789) grad_norm 1.0961 (1.1265) [2022-01-19 00:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][550/1251] eta 0:25:55 lr 0.000943 time 2.4847 (2.2189) loss 4.3360 (3.9794) grad_norm 1.2627 (1.1282) [2022-01-19 00:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][560/1251] eta 0:25:33 lr 0.000943 time 1.8155 (2.2192) loss 4.6361 (3.9838) grad_norm 1.0554 (1.1293) [2022-01-19 00:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][570/1251] eta 0:25:09 lr 0.000943 time 1.8025 (2.2171) loss 4.0332 (3.9847) grad_norm 1.9343 (1.1314) [2022-01-19 00:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][580/1251] eta 0:24:46 lr 0.000943 time 1.6959 (2.2150) loss 4.0160 (3.9831) grad_norm 1.1877 (1.1330) [2022-01-19 00:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][590/1251] eta 0:24:24 lr 0.000943 time 2.2178 (2.2159) loss 4.6893 (3.9829) grad_norm 1.3899 (1.1334) [2022-01-19 00:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][600/1251] eta 0:24:01 lr 0.000943 time 2.2269 (2.2144) loss 3.7698 (3.9801) grad_norm 1.0488 (1.1336) [2022-01-19 00:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][610/1251] eta 0:23:39 lr 0.000942 time 2.1167 (2.2144) loss 4.0296 (3.9751) grad_norm 0.9716 (1.1336) [2022-01-19 00:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][620/1251] eta 0:23:16 lr 0.000942 time 2.2095 (2.2136) loss 2.5868 (3.9736) grad_norm 1.1770 (1.1336) [2022-01-19 00:10:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][630/1251] eta 0:22:55 lr 0.000942 time 2.3018 (2.2145) loss 4.5266 (3.9698) grad_norm 1.1547 (1.1329) [2022-01-19 00:10:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][640/1251] eta 0:22:33 lr 0.000942 time 2.7961 (2.2154) loss 4.3913 (3.9738) grad_norm 1.1189 (1.1312) [2022-01-19 00:10:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][650/1251] eta 0:22:11 lr 0.000942 time 1.7190 (2.2151) loss 3.7616 (3.9709) grad_norm 1.2223 (1.1298) [2022-01-19 00:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][660/1251] eta 0:21:47 lr 0.000942 time 2.1866 (2.2130) loss 4.5209 (3.9750) grad_norm 1.1304 (1.1286) [2022-01-19 00:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][670/1251] eta 0:21:23 lr 0.000942 time 2.2067 (2.2096) loss 3.8303 (3.9735) grad_norm 0.8823 (1.1281) [2022-01-19 00:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][680/1251] eta 0:20:59 lr 0.000942 time 1.8738 (2.2062) loss 4.5409 (3.9750) grad_norm 1.1432 (1.1298) [2022-01-19 00:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][690/1251] eta 0:20:38 lr 0.000942 time 2.4908 (2.2070) loss 4.4788 (3.9764) grad_norm 0.9188 (1.1285) [2022-01-19 00:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][700/1251] eta 0:20:15 lr 0.000942 time 2.8071 (2.2053) loss 3.9496 (3.9723) grad_norm 1.3792 (1.1294) [2022-01-19 00:12:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][710/1251] eta 0:19:52 lr 0.000942 time 1.8727 (2.2040) loss 4.2593 (3.9711) grad_norm 0.9785 (1.1295) [2022-01-19 00:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][720/1251] eta 0:19:31 lr 0.000942 time 2.5384 (2.2061) loss 3.7000 (3.9732) grad_norm 1.0701 (1.1297) [2022-01-19 00:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][730/1251] eta 0:19:09 lr 0.000942 time 
2.7635 (2.2073) loss 4.7985 (3.9752) grad_norm 1.1124 (1.1296) [2022-01-19 00:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][740/1251] eta 0:18:49 lr 0.000942 time 3.3523 (2.2111) loss 3.8362 (3.9739) grad_norm 1.0884 (1.1292) [2022-01-19 00:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][750/1251] eta 0:18:28 lr 0.000942 time 2.1576 (2.2123) loss 4.6560 (3.9708) grad_norm 1.2041 (1.1296) [2022-01-19 00:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][760/1251] eta 0:18:06 lr 0.000942 time 2.1197 (2.2136) loss 2.9544 (3.9734) grad_norm 1.4799 (1.1308) [2022-01-19 00:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][770/1251] eta 0:17:43 lr 0.000942 time 2.1335 (2.2119) loss 4.0623 (3.9758) grad_norm 1.0059 (1.1296) [2022-01-19 00:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][780/1251] eta 0:17:19 lr 0.000942 time 1.9505 (2.2073) loss 4.3815 (3.9782) grad_norm 1.0748 (1.1299) [2022-01-19 00:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][790/1251] eta 0:16:56 lr 0.000942 time 2.1211 (2.2049) loss 4.3575 (3.9799) grad_norm 1.0516 (1.1292) [2022-01-19 00:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][800/1251] eta 0:16:33 lr 0.000942 time 1.8836 (2.2030) loss 4.2439 (3.9781) grad_norm 1.1810 (1.1286) [2022-01-19 00:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][810/1251] eta 0:16:11 lr 0.000942 time 2.0094 (2.2025) loss 3.6148 (3.9752) grad_norm 0.8444 (1.1286) [2022-01-19 00:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][820/1251] eta 0:15:49 lr 0.000942 time 2.5671 (2.2027) loss 4.3300 (3.9745) grad_norm 1.4897 (1.1294) [2022-01-19 00:17:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][830/1251] eta 0:15:27 lr 0.000942 time 2.6067 (2.2032) loss 3.5666 (3.9715) grad_norm 1.0656 (1.1287) [2022-01-19 00:17:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][840/1251] eta 0:15:05 lr 0.000942 time 1.9921 (2.2039) loss 4.7251 (3.9735) grad_norm 1.0526 (1.1283) [2022-01-19 00:17:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][850/1251] eta 0:14:44 lr 0.000942 time 2.9033 (2.2048) loss 4.5927 (3.9703) grad_norm 1.1368 (1.1288) [2022-01-19 00:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][860/1251] eta 0:14:22 lr 0.000942 time 2.6307 (2.2060) loss 4.7130 (3.9679) grad_norm 1.2474 (1.1302) [2022-01-19 00:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][870/1251] eta 0:14:00 lr 0.000942 time 2.7047 (2.2072) loss 3.7670 (3.9665) grad_norm 1.0734 (1.1304) [2022-01-19 00:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][880/1251] eta 0:13:37 lr 0.000942 time 1.7981 (2.2045) loss 4.1396 (3.9642) grad_norm 1.0030 (1.1306) [2022-01-19 00:19:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][890/1251] eta 0:13:15 lr 0.000942 time 2.0472 (2.2032) loss 4.7053 (3.9700) grad_norm 1.0103 (1.1302) [2022-01-19 00:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][900/1251] eta 0:12:52 lr 0.000942 time 2.1704 (2.2022) loss 4.1915 (3.9714) grad_norm 0.9626 (1.1301) [2022-01-19 00:20:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][910/1251] eta 0:12:30 lr 0.000942 time 2.2329 (2.2019) loss 4.2886 (3.9700) grad_norm 1.0511 (1.1310) [2022-01-19 00:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][920/1251] eta 0:12:09 lr 0.000942 time 2.4215 (2.2037) loss 4.1702 (3.9718) grad_norm 1.0651 (1.1303) [2022-01-19 00:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][930/1251] eta 0:11:47 lr 0.000942 time 1.9628 (2.2047) loss 3.8689 (3.9729) grad_norm 1.0502 (1.1297) [2022-01-19 00:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][940/1251] eta 0:11:25 lr 0.000942 time 
1.8453 (2.2043) loss 2.8892 (3.9722) grad_norm 1.0042 (1.1288) [2022-01-19 00:21:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][950/1251] eta 0:11:03 lr 0.000942 time 1.9056 (2.2040) loss 3.9266 (3.9740) grad_norm 1.0097 (1.1281) [2022-01-19 00:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][960/1251] eta 0:10:41 lr 0.000942 time 1.9135 (2.2049) loss 4.4388 (3.9767) grad_norm 1.1627 (1.1281) [2022-01-19 00:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][970/1251] eta 0:10:18 lr 0.000942 time 1.7351 (2.2027) loss 3.4892 (3.9755) grad_norm 1.6761 (1.1283) [2022-01-19 00:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][980/1251] eta 0:09:56 lr 0.000942 time 1.9515 (2.2025) loss 4.6373 (3.9741) grad_norm 1.0357 (1.1277) [2022-01-19 00:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][990/1251] eta 0:09:34 lr 0.000942 time 1.9222 (2.2026) loss 2.9427 (3.9733) grad_norm 1.2601 (1.1273) [2022-01-19 00:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1000/1251] eta 0:09:12 lr 0.000942 time 1.5727 (2.2031) loss 3.2000 (3.9720) grad_norm 1.3210 (1.1262) [2022-01-19 00:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1010/1251] eta 0:08:50 lr 0.000942 time 1.7560 (2.2021) loss 4.1887 (3.9700) grad_norm 1.1968 (1.1267) [2022-01-19 00:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1020/1251] eta 0:08:28 lr 0.000942 time 1.8149 (2.2010) loss 3.2798 (3.9679) grad_norm 1.1558 (1.1271) [2022-01-19 00:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1030/1251] eta 0:08:06 lr 0.000942 time 1.8633 (2.2010) loss 3.6465 (3.9672) grad_norm 1.1455 (1.1273) [2022-01-19 00:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1040/1251] eta 0:07:44 lr 0.000942 time 2.4326 (2.2037) loss 4.6129 (3.9684) grad_norm 1.6681 (1.1274) [2022-01-19 00:25:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1050/1251] eta 0:07:22 lr 0.000942 time 1.7949 (2.2029) loss 4.3842 (3.9680) grad_norm 1.6263 (1.1274) [2022-01-19 00:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1060/1251] eta 0:07:00 lr 0.000942 time 1.9293 (2.2015) loss 4.0972 (3.9667) grad_norm 1.0305 (1.1275) [2022-01-19 00:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1070/1251] eta 0:06:38 lr 0.000942 time 2.2664 (2.2018) loss 4.3598 (3.9669) grad_norm 1.0070 (1.1272) [2022-01-19 00:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1080/1251] eta 0:06:16 lr 0.000942 time 2.3506 (2.2020) loss 4.7839 (3.9646) grad_norm 1.0537 (1.1268) [2022-01-19 00:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1090/1251] eta 0:05:54 lr 0.000942 time 2.2416 (2.2016) loss 3.9759 (3.9640) grad_norm 1.1645 (1.1265) [2022-01-19 00:27:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1100/1251] eta 0:05:32 lr 0.000942 time 1.6711 (2.2004) loss 2.7640 (3.9624) grad_norm 1.2407 (1.1269) [2022-01-19 00:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1110/1251] eta 0:05:10 lr 0.000942 time 2.4682 (2.1998) loss 4.2111 (3.9631) grad_norm 1.0847 (1.1269) [2022-01-19 00:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1120/1251] eta 0:04:48 lr 0.000942 time 1.7553 (2.1990) loss 4.1618 (3.9629) grad_norm 1.2196 (1.1280) [2022-01-19 00:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1130/1251] eta 0:04:26 lr 0.000941 time 2.7869 (2.2007) loss 4.1456 (3.9660) grad_norm 1.1597 (1.1283) [2022-01-19 00:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1140/1251] eta 0:04:04 lr 0.000941 time 2.1322 (2.1995) loss 4.3126 (3.9662) grad_norm 0.9364 (1.1277) [2022-01-19 00:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1150/1251] eta 0:03:42 lr 
0.000941 time 2.4956 (2.1984) loss 4.4099 (3.9672) grad_norm 1.0795 (1.1276) [2022-01-19 00:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1160/1251] eta 0:03:20 lr 0.000941 time 2.1180 (2.1988) loss 4.1660 (3.9683) grad_norm 1.0925 (1.1275) [2022-01-19 00:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1170/1251] eta 0:02:58 lr 0.000941 time 2.1558 (2.1997) loss 4.0576 (3.9686) grad_norm 1.1359 (1.1272) [2022-01-19 00:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1180/1251] eta 0:02:36 lr 0.000941 time 1.9133 (2.2010) loss 4.3085 (3.9674) grad_norm 1.0321 (1.1269) [2022-01-19 00:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1190/1251] eta 0:02:14 lr 0.000941 time 2.1821 (2.2011) loss 3.8153 (3.9665) grad_norm 1.3239 (1.1283) [2022-01-19 00:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1200/1251] eta 0:01:52 lr 0.000941 time 2.4277 (2.2002) loss 4.4953 (3.9663) grad_norm 1.0795 (1.1276) [2022-01-19 00:31:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1210/1251] eta 0:01:30 lr 0.000941 time 1.6956 (2.1983) loss 4.5688 (3.9679) grad_norm 1.0143 (1.1271) [2022-01-19 00:31:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1220/1251] eta 0:01:08 lr 0.000941 time 2.4004 (2.1966) loss 3.6716 (3.9680) grad_norm 1.0831 (1.1266) [2022-01-19 00:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1230/1251] eta 0:00:46 lr 0.000941 time 1.9316 (2.1954) loss 4.2710 (3.9689) grad_norm 0.9673 (1.1261) [2022-01-19 00:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1240/1251] eta 0:00:24 lr 0.000941 time 2.2877 (2.1960) loss 3.4138 (3.9697) grad_norm 0.9976 (1.1263) [2022-01-19 00:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [46/300][1250/1251] eta 0:00:02 lr 0.000941 time 1.1669 (2.1908) loss 4.1387 (3.9707) grad_norm 1.2776 (1.1265) [2022-01-19 00:32:24 
swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 46 training takes 0:45:41 [2022-01-19 00:32:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.062 (20.062) Loss 1.3609 (1.3609) Acc@1 69.824 (69.824) Acc@5 89.746 (89.746) [2022-01-19 00:32:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.254 (3.228) Loss 1.3023 (1.3271) Acc@1 70.996 (69.815) Acc@5 89.648 (89.782) [2022-01-19 00:33:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.514 (2.527) Loss 1.3483 (1.3252) Acc@1 68.945 (69.666) Acc@5 89.160 (89.816) [2022-01-19 00:33:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.340 (2.324) Loss 1.3552 (1.3242) Acc@1 69.824 (69.859) Acc@5 89.160 (89.774) [2022-01-19 00:33:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.777 (2.175) Loss 1.3691 (1.3283) Acc@1 68.750 (69.691) Acc@5 89.160 (89.741) [2022-01-19 00:34:00 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 69.576 Acc@5 89.708 [2022-01-19 00:34:00 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 69.6% [2022-01-19 00:34:00 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 69.65% [2022-01-19 00:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][0/1251] eta 7:32:25 lr 0.000941 time 21.6989 (21.6989) loss 4.5953 (4.5953) grad_norm 1.2008 (1.2008) [2022-01-19 00:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][10/1251] eta 1:24:50 lr 0.000941 time 2.1548 (4.1020) loss 4.5597 (4.0885) grad_norm 1.4734 (1.2242) [2022-01-19 00:35:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][20/1251] eta 1:06:30 lr 0.000941 time 1.7382 (3.2416) loss 3.5090 (3.8914) grad_norm 1.1750 (1.1821) [2022-01-19 00:35:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][30/1251] eta 0:58:31 lr 0.000941 time 1.6893 (2.8755) loss 4.0561 (3.9229) grad_norm 1.3077 (1.1658) 
[2022-01-19 00:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][40/1251] eta 0:55:55 lr 0.000941 time 3.8985 (2.7707) loss 3.8889 (3.8600) grad_norm 0.9075 (1.1552) [2022-01-19 00:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][50/1251] eta 0:53:27 lr 0.000941 time 2.5168 (2.6708) loss 3.9306 (3.8670) grad_norm 1.3220 (1.1569) [2022-01-19 00:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][60/1251] eta 0:51:25 lr 0.000941 time 1.8461 (2.5907) loss 4.7211 (3.8909) grad_norm 1.1597 (1.1488) [2022-01-19 00:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][70/1251] eta 0:49:33 lr 0.000941 time 1.5771 (2.5179) loss 4.8595 (3.9210) grad_norm 1.0647 (1.1488) [2022-01-19 00:37:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][80/1251] eta 0:49:15 lr 0.000941 time 6.5625 (2.5239) loss 4.2906 (3.9364) grad_norm 1.0664 (1.1488) [2022-01-19 00:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][90/1251] eta 0:47:52 lr 0.000941 time 1.8362 (2.4744) loss 4.7623 (3.9676) grad_norm 1.1290 (1.1478) [2022-01-19 00:38:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][100/1251] eta 0:46:37 lr 0.000941 time 1.8924 (2.4304) loss 3.4513 (3.9347) grad_norm 1.0758 (1.1444) [2022-01-19 00:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][110/1251] eta 0:45:27 lr 0.000941 time 1.8889 (2.3908) loss 4.4853 (3.9293) grad_norm 1.0093 (1.1431) [2022-01-19 00:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][120/1251] eta 0:44:53 lr 0.000941 time 3.3647 (2.3814) loss 4.4498 (3.9099) grad_norm 1.2001 (1.1394) [2022-01-19 00:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][130/1251] eta 0:44:13 lr 0.000941 time 2.2510 (2.3670) loss 4.0575 (3.9256) grad_norm 1.0541 (1.1381) [2022-01-19 00:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][140/1251] eta 0:43:27 lr 
0.000941 time 1.8508 (2.3472) loss 3.4991 (3.9192) grad_norm 0.9909 (1.1327) [2022-01-19 00:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][150/1251] eta 0:42:53 lr 0.000941 time 1.9431 (2.3378) loss 3.9948 (3.9114) grad_norm 1.0579 (1.1336) [2022-01-19 00:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][160/1251] eta 0:42:23 lr 0.000941 time 3.8023 (2.3317) loss 4.6342 (3.9387) grad_norm 0.9929 (1.1310) [2022-01-19 00:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][170/1251] eta 0:41:46 lr 0.000941 time 1.8890 (2.3187) loss 4.1665 (3.9507) grad_norm 1.2616 (1.1305) [2022-01-19 00:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][180/1251] eta 0:41:11 lr 0.000941 time 2.0320 (2.3075) loss 4.6564 (3.9425) grad_norm 1.3052 (1.1289) [2022-01-19 00:41:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][190/1251] eta 0:40:51 lr 0.000941 time 3.1139 (2.3106) loss 4.1176 (3.9517) grad_norm 1.0984 (1.1242) [2022-01-19 00:41:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][200/1251] eta 0:40:35 lr 0.000941 time 3.1501 (2.3176) loss 4.2579 (3.9496) grad_norm 1.1444 (1.1261) [2022-01-19 00:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][210/1251] eta 0:40:03 lr 0.000941 time 1.8036 (2.3085) loss 3.4618 (3.9399) grad_norm 0.9473 (1.1249) [2022-01-19 00:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][220/1251] eta 0:39:31 lr 0.000941 time 2.2238 (2.2998) loss 3.9257 (3.9467) grad_norm 1.1292 (1.1206) [2022-01-19 00:42:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][230/1251] eta 0:38:59 lr 0.000941 time 2.9274 (2.2915) loss 3.1926 (3.9400) grad_norm 1.5533 (1.1213) [2022-01-19 00:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][240/1251] eta 0:38:21 lr 0.000941 time 1.6110 (2.2761) loss 4.0341 (3.9411) grad_norm 0.9721 (1.1204) [2022-01-19 00:43:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][250/1251] eta 0:37:48 lr 0.000941 time 1.9794 (2.2659) loss 3.7605 (3.9520) grad_norm 1.0491 (1.1229) [2022-01-19 00:43:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][260/1251] eta 0:37:22 lr 0.000941 time 2.2679 (2.2625) loss 4.2245 (3.9401) grad_norm 1.0399 (1.1292) [2022-01-19 00:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][270/1251] eta 0:36:58 lr 0.000941 time 2.5311 (2.2617) loss 4.3200 (3.9375) grad_norm 1.0119 (1.1269) [2022-01-19 00:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][280/1251] eta 0:36:39 lr 0.000941 time 2.3797 (2.2650) loss 4.5281 (3.9471) grad_norm 1.0755 (1.1255) [2022-01-19 00:44:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][290/1251] eta 0:36:16 lr 0.000941 time 2.4206 (2.2646) loss 3.4966 (3.9459) grad_norm 1.0709 (1.1234) [2022-01-19 00:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][300/1251] eta 0:35:52 lr 0.000941 time 2.2425 (2.2632) loss 4.2798 (3.9403) grad_norm 1.0572 (1.1217) [2022-01-19 00:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][310/1251] eta 0:35:23 lr 0.000941 time 2.1064 (2.2572) loss 3.5361 (3.9359) grad_norm 1.0879 (1.1227) [2022-01-19 00:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][320/1251] eta 0:35:00 lr 0.000941 time 1.8729 (2.2558) loss 3.4574 (3.9436) grad_norm 1.2439 (1.1254) [2022-01-19 00:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][330/1251] eta 0:34:37 lr 0.000941 time 1.9059 (2.2557) loss 4.4494 (3.9525) grad_norm 1.2423 (1.1259) [2022-01-19 00:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][340/1251] eta 0:34:15 lr 0.000941 time 2.0610 (2.2564) loss 3.5871 (3.9514) grad_norm 1.0244 (1.1243) [2022-01-19 00:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][350/1251] eta 0:33:47 lr 0.000941 time 
2.0227 (2.2499) loss 3.2407 (3.9473) grad_norm 1.2019 (1.1246) [2022-01-19 00:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][360/1251] eta 0:33:24 lr 0.000941 time 2.2537 (2.2498) loss 3.8498 (3.9537) grad_norm 1.2902 (1.1268) [2022-01-19 00:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][370/1251] eta 0:33:00 lr 0.000941 time 1.9166 (2.2483) loss 4.0755 (3.9533) grad_norm 1.4410 (1.1256) [2022-01-19 00:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][380/1251] eta 0:32:36 lr 0.000940 time 1.6794 (2.2459) loss 3.2652 (3.9551) grad_norm 1.0093 (1.1271) [2022-01-19 00:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][390/1251] eta 0:32:12 lr 0.000940 time 2.8479 (2.2445) loss 3.6367 (3.9459) grad_norm 1.0828 (1.1276) [2022-01-19 00:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][400/1251] eta 0:31:47 lr 0.000940 time 1.8492 (2.2414) loss 4.3695 (3.9512) grad_norm 1.0512 (1.1268) [2022-01-19 00:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][410/1251] eta 0:31:25 lr 0.000940 time 2.0942 (2.2422) loss 4.4329 (3.9540) grad_norm 1.1537 (1.1272) [2022-01-19 00:49:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][420/1251] eta 0:31:02 lr 0.000940 time 1.7591 (2.2410) loss 4.4074 (3.9637) grad_norm 0.9254 (1.1291) [2022-01-19 00:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][430/1251] eta 0:30:41 lr 0.000940 time 3.4106 (2.2427) loss 3.9342 (3.9673) grad_norm 1.4301 (1.1287) [2022-01-19 00:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][440/1251] eta 0:30:18 lr 0.000940 time 2.1093 (2.2421) loss 4.3632 (3.9666) grad_norm 1.0159 (1.1273) [2022-01-19 00:50:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][450/1251] eta 0:29:56 lr 0.000940 time 2.5698 (2.2423) loss 4.1907 (3.9618) grad_norm 1.3109 (1.1278) [2022-01-19 00:51:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][460/1251] eta 0:29:31 lr 0.000940 time 1.9011 (2.2398) loss 3.8649 (3.9585) grad_norm 1.4798 (1.1277) [2022-01-19 00:51:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][470/1251] eta 0:29:07 lr 0.000940 time 2.8645 (2.2381) loss 4.0789 (3.9618) grad_norm 1.0541 (1.1296) [2022-01-19 00:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][480/1251] eta 0:28:42 lr 0.000940 time 2.0040 (2.2344) loss 3.1948 (3.9552) grad_norm 1.0919 (1.1289) [2022-01-19 00:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][490/1251] eta 0:28:19 lr 0.000940 time 2.1777 (2.2339) loss 4.3606 (3.9521) grad_norm 1.1373 (1.1285) [2022-01-19 00:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][500/1251] eta 0:27:55 lr 0.000940 time 2.3172 (2.2314) loss 3.6975 (3.9501) grad_norm 0.9661 (1.1272) [2022-01-19 00:52:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][510/1251] eta 0:27:30 lr 0.000940 time 1.8452 (2.2269) loss 4.2370 (3.9544) grad_norm 1.0858 (1.1284) [2022-01-19 00:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][520/1251] eta 0:27:06 lr 0.000940 time 2.1695 (2.2248) loss 4.7053 (3.9552) grad_norm 1.0459 (1.1275) [2022-01-19 00:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][530/1251] eta 0:26:44 lr 0.000940 time 2.0978 (2.2253) loss 3.1875 (3.9569) grad_norm 1.1042 (1.1270) [2022-01-19 00:54:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][540/1251] eta 0:26:21 lr 0.000940 time 1.9169 (2.2247) loss 3.2917 (3.9557) grad_norm 1.0043 (1.1278) [2022-01-19 00:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][550/1251] eta 0:26:00 lr 0.000940 time 2.8054 (2.2260) loss 4.4343 (3.9600) grad_norm 0.9476 (1.1289) [2022-01-19 00:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][560/1251] eta 0:25:39 lr 0.000940 time 
2.2131 (2.2283) loss 4.3267 (3.9647) grad_norm 1.0641 (1.1302) [2022-01-19 00:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][570/1251] eta 0:25:18 lr 0.000940 time 1.5972 (2.2296) loss 3.6153 (3.9673) grad_norm 1.2396 (1.1324) [2022-01-19 00:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][580/1251] eta 0:24:54 lr 0.000940 time 1.8717 (2.2277) loss 3.5271 (3.9679) grad_norm 1.2923 (1.1318) [2022-01-19 00:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][590/1251] eta 0:24:28 lr 0.000940 time 1.9030 (2.2224) loss 4.3736 (3.9690) grad_norm 1.3988 (1.1319) [2022-01-19 00:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][600/1251] eta 0:24:04 lr 0.000940 time 2.2259 (2.2189) loss 3.8292 (3.9720) grad_norm 1.0554 (1.1323) [2022-01-19 00:56:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][610/1251] eta 0:23:40 lr 0.000940 time 2.1757 (2.2158) loss 4.3193 (3.9726) grad_norm 1.1336 (1.1315) [2022-01-19 00:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][620/1251] eta 0:23:17 lr 0.000940 time 1.9974 (2.2145) loss 3.3891 (3.9745) grad_norm 1.3219 (1.1319) [2022-01-19 00:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][630/1251] eta 0:22:55 lr 0.000940 time 2.1562 (2.2146) loss 3.4479 (3.9759) grad_norm 1.1299 (1.1316) [2022-01-19 00:57:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][640/1251] eta 0:22:34 lr 0.000940 time 2.0520 (2.2161) loss 4.4865 (3.9761) grad_norm 1.4902 (1.1319) [2022-01-19 00:58:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][650/1251] eta 0:22:11 lr 0.000940 time 2.0904 (2.2160) loss 3.4451 (3.9783) grad_norm 0.9258 (1.1311) [2022-01-19 00:58:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][660/1251] eta 0:21:50 lr 0.000940 time 1.8541 (2.2176) loss 4.0337 (3.9739) grad_norm 1.1748 (1.1320) [2022-01-19 00:58:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][670/1251] eta 0:21:28 lr 0.000940 time 1.9404 (2.2171) loss 4.2038 (3.9701) grad_norm 0.9151 (1.1321) [2022-01-19 00:59:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][680/1251] eta 0:21:05 lr 0.000940 time 1.9081 (2.2171) loss 4.1953 (3.9678) grad_norm 1.1259 (1.1313) [2022-01-19 00:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][690/1251] eta 0:20:43 lr 0.000940 time 2.1374 (2.2165) loss 4.1700 (3.9665) grad_norm 1.0135 (1.1300) [2022-01-19 00:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][700/1251] eta 0:20:19 lr 0.000940 time 1.8706 (2.2140) loss 4.2589 (3.9672) grad_norm 1.4041 (1.1310) [2022-01-19 01:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][710/1251] eta 0:20:01 lr 0.000940 time 6.9425 (2.2217) loss 3.6183 (3.9692) grad_norm 1.2481 (1.1329) [2022-01-19 01:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][720/1251] eta 0:19:39 lr 0.000940 time 1.8138 (2.2220) loss 4.4978 (3.9732) grad_norm 1.1513 (1.1336) [2022-01-19 01:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][730/1251] eta 0:19:17 lr 0.000940 time 1.6608 (2.2221) loss 4.4423 (3.9758) grad_norm 0.9888 (1.1350) [2022-01-19 01:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][740/1251] eta 0:18:54 lr 0.000940 time 1.8226 (2.2199) loss 4.1443 (3.9754) grad_norm 1.0787 (1.1356) [2022-01-19 01:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][750/1251] eta 0:18:31 lr 0.000940 time 3.2183 (2.2182) loss 4.2605 (3.9727) grad_norm 1.0448 (1.1380) [2022-01-19 01:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][760/1251] eta 0:18:08 lr 0.000940 time 2.0078 (2.2164) loss 3.8745 (3.9779) grad_norm 1.1020 (1.1391) [2022-01-19 01:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][770/1251] eta 0:17:45 lr 0.000940 time 
2.9353 (2.2157) loss 3.6550 (3.9772) grad_norm 0.9174 (1.1380) [2022-01-19 01:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][780/1251] eta 0:17:22 lr 0.000940 time 2.2818 (2.2130) loss 4.7338 (3.9777) grad_norm 0.9291 (1.1380) [2022-01-19 01:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][790/1251] eta 0:17:00 lr 0.000940 time 2.2734 (2.2139) loss 3.7325 (3.9760) grad_norm 0.9557 (1.1373) [2022-01-19 01:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][800/1251] eta 0:16:37 lr 0.000940 time 1.5102 (2.2122) loss 4.4161 (3.9794) grad_norm 1.2052 (1.1365) [2022-01-19 01:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][810/1251] eta 0:16:15 lr 0.000940 time 2.2505 (2.2116) loss 4.3601 (3.9788) grad_norm 1.2394 (1.1363) [2022-01-19 01:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][820/1251] eta 0:15:53 lr 0.000940 time 1.8909 (2.2113) loss 3.4392 (3.9773) grad_norm 1.0949 (1.1359) [2022-01-19 01:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][830/1251] eta 0:15:31 lr 0.000940 time 3.0760 (2.2132) loss 4.7887 (3.9796) grad_norm 1.2964 (1.1358) [2022-01-19 01:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][840/1251] eta 0:15:10 lr 0.000940 time 2.0768 (2.2144) loss 4.1984 (3.9785) grad_norm 1.1522 (1.1361) [2022-01-19 01:05:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][850/1251] eta 0:14:48 lr 0.000940 time 2.5207 (2.2169) loss 3.2031 (3.9773) grad_norm 1.7542 (1.1374) [2022-01-19 01:05:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][860/1251] eta 0:14:26 lr 0.000940 time 2.5878 (2.2174) loss 3.1524 (3.9733) grad_norm 0.8765 (1.1376) [2022-01-19 01:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][870/1251] eta 0:14:03 lr 0.000940 time 1.7711 (2.2146) loss 3.8876 (3.9714) grad_norm 1.3534 (1.1371) [2022-01-19 01:06:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][880/1251] eta 0:13:40 lr 0.000940 time 1.9755 (2.2117) loss 4.3176 (3.9715) grad_norm 1.0903 (1.1370) [2022-01-19 01:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][890/1251] eta 0:13:17 lr 0.000939 time 2.2165 (2.2099) loss 3.0946 (3.9701) grad_norm 1.0818 (1.1370) [2022-01-19 01:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][900/1251] eta 0:12:55 lr 0.000939 time 2.1705 (2.2103) loss 2.7016 (3.9655) grad_norm 0.9447 (1.1359) [2022-01-19 01:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][910/1251] eta 0:12:33 lr 0.000939 time 1.8094 (2.2089) loss 3.6249 (3.9656) grad_norm 1.2843 (1.1364) [2022-01-19 01:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][920/1251] eta 0:12:11 lr 0.000939 time 1.8933 (2.2088) loss 4.3260 (3.9664) grad_norm 1.2075 (1.1361) [2022-01-19 01:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][930/1251] eta 0:11:49 lr 0.000939 time 3.2105 (2.2111) loss 3.6480 (3.9653) grad_norm 1.0824 (1.1360) [2022-01-19 01:08:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][940/1251] eta 0:11:28 lr 0.000939 time 1.8931 (2.2127) loss 4.0408 (3.9602) grad_norm 1.4104 (1.1364) [2022-01-19 01:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][950/1251] eta 0:11:06 lr 0.000939 time 2.0654 (2.2128) loss 4.3507 (3.9633) grad_norm 1.1726 (1.1363) [2022-01-19 01:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][960/1251] eta 0:10:43 lr 0.000939 time 1.7922 (2.2128) loss 4.6262 (3.9630) grad_norm 1.0937 (1.1360) [2022-01-19 01:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][970/1251] eta 0:10:21 lr 0.000939 time 3.2680 (2.2120) loss 4.1517 (3.9623) grad_norm 1.1722 (1.1359) [2022-01-19 01:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][980/1251] eta 0:09:58 lr 0.000939 time 
1.8207 (2.2092) loss 3.6831 (3.9611) grad_norm 0.9973 (1.1350) [2022-01-19 01:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][990/1251] eta 0:09:36 lr 0.000939 time 2.2642 (2.2091) loss 2.7496 (3.9595) grad_norm 1.2479 (1.1357) [2022-01-19 01:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1000/1251] eta 0:09:14 lr 0.000939 time 1.5947 (2.2086) loss 4.5718 (3.9600) grad_norm 1.0900 (1.1359) [2022-01-19 01:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1010/1251] eta 0:08:52 lr 0.000939 time 3.5665 (2.2093) loss 4.2886 (3.9636) grad_norm 1.0602 (1.1360) [2022-01-19 01:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1020/1251] eta 0:08:30 lr 0.000939 time 2.5701 (2.2093) loss 4.7339 (3.9642) grad_norm 1.1263 (1.1356) [2022-01-19 01:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1030/1251] eta 0:08:07 lr 0.000939 time 2.2707 (2.2076) loss 4.1454 (3.9649) grad_norm 1.0249 (1.1350) [2022-01-19 01:12:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1040/1251] eta 0:07:45 lr 0.000939 time 1.8303 (2.2065) loss 3.7488 (3.9649) grad_norm 1.1949 (1.1345) [2022-01-19 01:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1050/1251] eta 0:07:23 lr 0.000939 time 2.8940 (2.2076) loss 4.4623 (3.9626) grad_norm 1.3957 (1.1352) [2022-01-19 01:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1060/1251] eta 0:07:01 lr 0.000939 time 2.5767 (2.2079) loss 4.3533 (3.9622) grad_norm 1.2548 (1.1363) [2022-01-19 01:13:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1070/1251] eta 0:06:39 lr 0.000939 time 2.1716 (2.2075) loss 3.9416 (3.9643) grad_norm 1.1220 (1.1360) [2022-01-19 01:13:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1080/1251] eta 0:06:17 lr 0.000939 time 1.8728 (2.2067) loss 4.4894 (3.9637) grad_norm 0.9277 (1.1357) [2022-01-19 01:14:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1090/1251] eta 0:05:55 lr 0.000939 time 2.9166 (2.2085) loss 4.4262 (3.9637) grad_norm 1.3101 (1.1356) [2022-01-19 01:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1100/1251] eta 0:05:33 lr 0.000939 time 1.9062 (2.2072) loss 4.4416 (3.9644) grad_norm 1.1022 (1.1349) [2022-01-19 01:14:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1110/1251] eta 0:05:11 lr 0.000939 time 1.6391 (2.2060) loss 3.1017 (3.9645) grad_norm 1.2709 (1.1354) [2022-01-19 01:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1120/1251] eta 0:04:48 lr 0.000939 time 1.8318 (2.2051) loss 3.1992 (3.9622) grad_norm 1.1444 (1.1350) [2022-01-19 01:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1130/1251] eta 0:04:27 lr 0.000939 time 3.1391 (2.2075) loss 4.2082 (3.9620) grad_norm 0.9644 (1.1344) [2022-01-19 01:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1140/1251] eta 0:04:05 lr 0.000939 time 1.9624 (2.2077) loss 3.5413 (3.9606) grad_norm 1.0614 (1.1344) [2022-01-19 01:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1150/1251] eta 0:03:42 lr 0.000939 time 1.9559 (2.2078) loss 4.1279 (3.9632) grad_norm 1.1382 (1.1340) [2022-01-19 01:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1160/1251] eta 0:03:20 lr 0.000939 time 1.8759 (2.2070) loss 4.4160 (3.9643) grad_norm 1.6987 (1.1340) [2022-01-19 01:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1170/1251] eta 0:02:58 lr 0.000939 time 5.5040 (2.2087) loss 4.5947 (3.9655) grad_norm 0.9802 (1.1334) [2022-01-19 01:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1180/1251] eta 0:02:36 lr 0.000939 time 1.5192 (2.2075) loss 4.4391 (3.9671) grad_norm 1.0135 (1.1329) [2022-01-19 01:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1190/1251] eta 0:02:14 lr 
0.000939 time 2.5307 (2.2075) loss 2.6197 (3.9642) grad_norm 1.3451 (1.1328) [2022-01-19 01:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1200/1251] eta 0:01:52 lr 0.000939 time 1.6299 (2.2063) loss 4.1101 (3.9665) grad_norm 1.0462 (1.1331) [2022-01-19 01:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1210/1251] eta 0:01:30 lr 0.000939 time 3.6039 (2.2068) loss 4.0043 (3.9645) grad_norm 1.2624 (1.1330) [2022-01-19 01:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1220/1251] eta 0:01:08 lr 0.000939 time 1.7929 (2.2061) loss 4.4613 (3.9661) grad_norm 1.0708 (1.1337) [2022-01-19 01:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1230/1251] eta 0:00:46 lr 0.000939 time 2.5706 (2.2061) loss 3.8095 (3.9652) grad_norm 1.0801 (1.1341) [2022-01-19 01:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1240/1251] eta 0:00:24 lr 0.000939 time 1.4667 (2.2045) loss 3.5054 (3.9652) grad_norm 1.1569 (1.1340) [2022-01-19 01:19:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [47/300][1250/1251] eta 0:00:02 lr 0.000939 time 1.3019 (2.1995) loss 3.4311 (3.9670) grad_norm 1.0762 (1.1339) [2022-01-19 01:19:52 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 47 training takes 0:45:51 [2022-01-19 01:20:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.710 (18.710) Loss 1.3986 (1.3986) Acc@1 68.359 (68.359) Acc@5 86.914 (86.914) [2022-01-19 01:20:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.946 (3.260) Loss 1.3166 (1.3241) Acc@1 71.094 (70.206) Acc@5 89.941 (89.915) [2022-01-19 01:20:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.533 (2.518) Loss 1.3990 (1.3295) Acc@1 68.945 (69.908) Acc@5 88.281 (89.746) [2022-01-19 01:21:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.650 (2.221) Loss 1.2879 (1.3332) Acc@1 69.238 (69.676) Acc@5 89.746 (89.740) 
[2022-01-19 01:21:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.965 (2.120) Loss 1.2982 (1.3368) Acc@1 70.996 (69.603) Acc@5 89.941 (89.665) [2022-01-19 01:21:27 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 69.582 Acc@5 89.666 [2022-01-19 01:21:27 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 69.6% [2022-01-19 01:21:27 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 69.65% [2022-01-19 01:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][0/1251] eta 7:29:39 lr 0.000939 time 21.5664 (21.5664) loss 4.3881 (4.3881) grad_norm 1.3784 (1.3784) [2022-01-19 01:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][10/1251] eta 1:22:30 lr 0.000939 time 2.2317 (3.9889) loss 3.9406 (4.2316) grad_norm 1.0416 (1.0916) [2022-01-19 01:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][20/1251] eta 1:02:23 lr 0.000939 time 1.4875 (3.0410) loss 3.9718 (4.0346) grad_norm 1.1326 (1.1368) [2022-01-19 01:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][30/1251] eta 0:56:59 lr 0.000939 time 1.6631 (2.8006) loss 3.5443 (3.9830) grad_norm 1.0698 (1.1232) [2022-01-19 01:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][40/1251] eta 0:54:41 lr 0.000939 time 3.5830 (2.7097) loss 4.5012 (4.0048) grad_norm 1.0718 (1.1283) [2022-01-19 01:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][50/1251] eta 0:53:25 lr 0.000939 time 2.8801 (2.6692) loss 4.0451 (4.0023) grad_norm 1.2097 (1.1296) [2022-01-19 01:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][60/1251] eta 0:51:45 lr 0.000939 time 2.1894 (2.6076) loss 3.8961 (3.9864) grad_norm 0.9991 (1.1216) [2022-01-19 01:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][70/1251] eta 0:49:38 lr 0.000939 time 1.6930 (2.5224) loss 4.0126 (3.9996) grad_norm 1.2063 (1.1351) 
[2022-01-19 01:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][80/1251] eta 0:48:08 lr 0.000939 time 2.2887 (2.4670) loss 3.1577 (3.9882) grad_norm 1.5089 (1.1391) [2022-01-19 01:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][90/1251] eta 0:47:16 lr 0.000939 time 2.0728 (2.4434) loss 3.6529 (3.9980) grad_norm 1.4396 (1.1449) [2022-01-19 01:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][100/1251] eta 0:46:24 lr 0.000939 time 1.9701 (2.4189) loss 3.0428 (3.9909) grad_norm 1.0394 (1.1490) [2022-01-19 01:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][110/1251] eta 0:45:32 lr 0.000939 time 2.2529 (2.3945) loss 4.0826 (3.9945) grad_norm 1.0284 (1.1394) [2022-01-19 01:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][120/1251] eta 0:44:48 lr 0.000939 time 2.2581 (2.3773) loss 3.2086 (3.9812) grad_norm 1.3754 (1.1459) [2022-01-19 01:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][130/1251] eta 0:44:05 lr 0.000939 time 2.4753 (2.3599) loss 3.2545 (3.9761) grad_norm 1.3208 (1.1443) [2022-01-19 01:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][140/1251] eta 0:43:22 lr 0.000938 time 2.2988 (2.3428) loss 4.5093 (3.9767) grad_norm 1.0762 (1.1380) [2022-01-19 01:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][150/1251] eta 0:42:41 lr 0.000938 time 1.6099 (2.3267) loss 3.6074 (3.9552) grad_norm 1.0162 (1.1395) [2022-01-19 01:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][160/1251] eta 0:42:08 lr 0.000938 time 2.2850 (2.3175) loss 3.9698 (3.9723) grad_norm 1.0583 (1.1385) [2022-01-19 01:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][170/1251] eta 0:41:44 lr 0.000938 time 2.8986 (2.3172) loss 3.8374 (3.9904) grad_norm 1.0128 (1.1402) [2022-01-19 01:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][180/1251] eta 0:41:15 lr 
0.000938 time 2.1717 (2.3113) loss 3.8748 (3.9752) grad_norm 1.1975 (1.1345) [2022-01-19 01:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][190/1251] eta 0:40:43 lr 0.000938 time 1.8192 (2.3026) loss 3.7503 (3.9814) grad_norm 0.9817 (1.1329) [2022-01-19 01:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][200/1251] eta 0:40:11 lr 0.000938 time 1.9300 (2.2942) loss 4.4101 (3.9797) grad_norm 0.9634 (1.1320) [2022-01-19 01:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][210/1251] eta 0:39:41 lr 0.000938 time 3.0641 (2.2878) loss 4.2041 (3.9754) grad_norm 1.0613 (1.1287) [2022-01-19 01:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][220/1251] eta 0:39:19 lr 0.000938 time 3.0564 (2.2885) loss 2.7038 (3.9689) grad_norm 1.0349 (1.1290) [2022-01-19 01:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][230/1251] eta 0:38:56 lr 0.000938 time 1.9688 (2.2880) loss 3.5923 (3.9651) grad_norm 0.9673 (1.1245) [2022-01-19 01:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][240/1251] eta 0:38:32 lr 0.000938 time 2.2842 (2.2871) loss 4.0753 (3.9573) grad_norm 1.0510 (1.1221) [2022-01-19 01:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][250/1251] eta 0:38:08 lr 0.000938 time 2.7028 (2.2863) loss 3.4948 (3.9642) grad_norm 1.1524 (1.1207) [2022-01-19 01:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][260/1251] eta 0:37:35 lr 0.000938 time 1.8162 (2.2756) loss 3.0198 (3.9561) grad_norm 1.2094 (1.1238) [2022-01-19 01:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][270/1251] eta 0:36:56 lr 0.000938 time 1.8202 (2.2596) loss 4.6095 (3.9516) grad_norm 0.9703 (1.1250) [2022-01-19 01:32:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][280/1251] eta 0:36:28 lr 0.000938 time 2.2422 (2.2534) loss 4.2753 (3.9515) grad_norm 1.0101 (1.1255) [2022-01-19 01:32:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][290/1251] eta 0:35:59 lr 0.000938 time 2.1635 (2.2468) loss 4.3929 (3.9453) grad_norm 1.2175 (1.1286) [2022-01-19 01:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][300/1251] eta 0:35:29 lr 0.000938 time 1.9098 (2.2396) loss 4.3983 (3.9523) grad_norm 1.0934 (1.1285) [2022-01-19 01:33:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][310/1251] eta 0:35:04 lr 0.000938 time 2.4706 (2.2367) loss 2.8706 (3.9501) grad_norm 1.2645 (1.1295) [2022-01-19 01:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][320/1251] eta 0:34:41 lr 0.000938 time 2.2021 (2.2359) loss 3.6931 (3.9535) grad_norm 1.0257 (1.1289) [2022-01-19 01:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][330/1251] eta 0:34:22 lr 0.000938 time 2.5913 (2.2400) loss 3.7785 (3.9452) grad_norm 1.1209 (1.1289) [2022-01-19 01:34:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][340/1251] eta 0:34:03 lr 0.000938 time 2.2386 (2.2430) loss 3.9375 (3.9465) grad_norm 0.9707 (1.1292) [2022-01-19 01:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][350/1251] eta 0:33:42 lr 0.000938 time 2.2458 (2.2450) loss 3.5905 (3.9468) grad_norm 1.0850 (1.1285) [2022-01-19 01:34:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][360/1251] eta 0:33:17 lr 0.000938 time 1.7344 (2.2413) loss 4.0802 (3.9462) grad_norm 1.0640 (1.1294) [2022-01-19 01:35:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][370/1251] eta 0:32:55 lr 0.000938 time 1.9076 (2.2420) loss 4.6246 (3.9447) grad_norm 1.5994 (1.1311) [2022-01-19 01:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][380/1251] eta 0:32:30 lr 0.000938 time 1.8012 (2.2395) loss 4.1771 (3.9393) grad_norm 1.0852 (1.1324) [2022-01-19 01:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][390/1251] eta 0:32:08 lr 0.000938 time 
3.0814 (2.2396) loss 2.7446 (3.9478) grad_norm 1.6771 (1.1341) [2022-01-19 01:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][400/1251] eta 0:31:42 lr 0.000938 time 1.6052 (2.2360) loss 4.1564 (3.9469) grad_norm 1.1721 (1.1344) [2022-01-19 01:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][410/1251] eta 0:31:18 lr 0.000938 time 1.8916 (2.2339) loss 2.7681 (3.9447) grad_norm 1.3307 (1.1349) [2022-01-19 01:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][420/1251] eta 0:30:55 lr 0.000938 time 2.1328 (2.2326) loss 3.4338 (3.9483) grad_norm 1.0020 (1.1350) [2022-01-19 01:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][430/1251] eta 0:30:35 lr 0.000938 time 3.2225 (2.2356) loss 4.4661 (3.9428) grad_norm 1.0102 (1.1321) [2022-01-19 01:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][440/1251] eta 0:30:12 lr 0.000938 time 1.7912 (2.2345) loss 3.8865 (3.9427) grad_norm 1.5096 (1.1323) [2022-01-19 01:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][450/1251] eta 0:29:49 lr 0.000938 time 2.0201 (2.2343) loss 3.4878 (3.9379) grad_norm 0.9789 (1.1316) [2022-01-19 01:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][460/1251] eta 0:29:26 lr 0.000938 time 2.1308 (2.2335) loss 4.3224 (3.9353) grad_norm 1.2229 (1.1312) [2022-01-19 01:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][470/1251] eta 0:29:05 lr 0.000938 time 2.8314 (2.2348) loss 4.0193 (3.9403) grad_norm 1.0592 (1.1292) [2022-01-19 01:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][480/1251] eta 0:28:41 lr 0.000938 time 1.9504 (2.2325) loss 4.3343 (3.9431) grad_norm 1.3466 (1.1285) [2022-01-19 01:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][490/1251] eta 0:28:15 lr 0.000938 time 1.6504 (2.2276) loss 4.1881 (3.9484) grad_norm 1.1520 (1.1286) [2022-01-19 01:40:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][500/1251] eta 0:27:51 lr 0.000938 time 2.0587 (2.2255) loss 4.0236 (3.9559) grad_norm 1.0395 (1.1281) [2022-01-19 01:40:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][510/1251] eta 0:27:27 lr 0.000938 time 2.2988 (2.2229) loss 4.4519 (3.9579) grad_norm 1.1774 (1.1266) [2022-01-19 01:40:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][520/1251] eta 0:27:04 lr 0.000938 time 2.7667 (2.2223) loss 2.8923 (3.9521) grad_norm 1.1301 (1.1249) [2022-01-19 01:41:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][530/1251] eta 0:26:40 lr 0.000938 time 1.5435 (2.2195) loss 4.5183 (3.9511) grad_norm 1.1725 (1.1241) [2022-01-19 01:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][540/1251] eta 0:26:19 lr 0.000938 time 1.9656 (2.2213) loss 4.1766 (3.9571) grad_norm 1.0938 (1.1248) [2022-01-19 01:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][550/1251] eta 0:25:55 lr 0.000938 time 2.1938 (2.2190) loss 2.9503 (3.9584) grad_norm 0.9329 (1.1239) [2022-01-19 01:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][560/1251] eta 0:25:33 lr 0.000938 time 1.7947 (2.2196) loss 4.7786 (3.9584) grad_norm 1.3950 (1.1245) [2022-01-19 01:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][570/1251] eta 0:25:12 lr 0.000938 time 2.4736 (2.2206) loss 4.7355 (3.9620) grad_norm 1.3477 (1.1256) [2022-01-19 01:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][580/1251] eta 0:24:49 lr 0.000938 time 2.8982 (2.2194) loss 3.3889 (3.9589) grad_norm 1.1106 (1.1267) [2022-01-19 01:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][590/1251] eta 0:24:26 lr 0.000938 time 2.4028 (2.2191) loss 3.6171 (3.9630) grad_norm 1.3079 (1.1288) [2022-01-19 01:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][600/1251] eta 0:24:03 lr 0.000938 time 
1.6288 (2.2180) loss 4.0672 (3.9644) grad_norm 1.0718 (1.1294) [2022-01-19 01:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][610/1251] eta 0:23:41 lr 0.000938 time 1.8037 (2.2183) loss 4.3635 (3.9646) grad_norm 1.0987 (1.1292) [2022-01-19 01:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][620/1251] eta 0:23:18 lr 0.000938 time 2.1454 (2.2169) loss 4.4270 (3.9653) grad_norm 0.9184 (1.1292) [2022-01-19 01:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][630/1251] eta 0:22:55 lr 0.000938 time 2.8689 (2.2153) loss 3.7741 (3.9665) grad_norm 1.0292 (1.1312) [2022-01-19 01:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][640/1251] eta 0:22:33 lr 0.000937 time 1.8935 (2.2154) loss 4.6655 (3.9697) grad_norm 1.1791 (1.1291) [2022-01-19 01:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][650/1251] eta 0:22:12 lr 0.000937 time 1.9022 (2.2167) loss 3.3533 (3.9699) grad_norm 1.0140 (1.1287) [2022-01-19 01:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][660/1251] eta 0:21:49 lr 0.000937 time 1.5357 (2.2158) loss 4.6859 (3.9674) grad_norm 1.1262 (1.1295) [2022-01-19 01:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][670/1251] eta 0:21:26 lr 0.000937 time 2.5288 (2.2148) loss 3.6818 (3.9672) grad_norm 1.1737 (1.1308) [2022-01-19 01:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][680/1251] eta 0:21:02 lr 0.000937 time 1.7932 (2.2110) loss 3.9077 (3.9658) grad_norm 0.9050 (1.1317) [2022-01-19 01:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][690/1251] eta 0:20:39 lr 0.000937 time 2.0919 (2.2099) loss 4.6132 (3.9704) grad_norm 1.0950 (1.1309) [2022-01-19 01:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][700/1251] eta 0:20:17 lr 0.000937 time 2.0254 (2.2097) loss 3.9286 (3.9733) grad_norm 1.3148 (1.1318) [2022-01-19 01:47:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][710/1251] eta 0:19:55 lr 0.000937 time 2.2404 (2.2107) loss 4.2180 (3.9766) grad_norm 0.9790 (1.1305) [2022-01-19 01:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][720/1251] eta 0:19:33 lr 0.000937 time 2.4558 (2.2109) loss 3.0101 (3.9714) grad_norm 1.0806 (1.1298) [2022-01-19 01:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][730/1251] eta 0:19:12 lr 0.000937 time 1.9365 (2.2113) loss 4.3800 (3.9705) grad_norm 1.4148 (1.1300) [2022-01-19 01:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][740/1251] eta 0:18:51 lr 0.000937 time 2.7823 (2.2138) loss 4.3806 (3.9707) grad_norm 1.3706 (1.1307) [2022-01-19 01:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][750/1251] eta 0:18:30 lr 0.000937 time 2.2944 (2.2160) loss 4.5368 (3.9717) grad_norm 1.0868 (1.1312) [2022-01-19 01:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][760/1251] eta 0:18:07 lr 0.000937 time 1.9007 (2.2145) loss 4.3001 (3.9725) grad_norm 1.0413 (1.1305) [2022-01-19 01:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][770/1251] eta 0:17:42 lr 0.000937 time 1.8614 (2.2098) loss 3.4516 (3.9717) grad_norm 1.1452 (1.1306) [2022-01-19 01:50:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][780/1251] eta 0:17:18 lr 0.000937 time 1.6855 (2.2058) loss 4.4437 (3.9737) grad_norm 1.1636 (1.1316) [2022-01-19 01:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][790/1251] eta 0:16:56 lr 0.000937 time 2.1713 (2.2052) loss 4.5193 (3.9774) grad_norm 1.3691 (1.1325) [2022-01-19 01:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][800/1251] eta 0:16:34 lr 0.000937 time 1.8145 (2.2053) loss 4.0758 (3.9796) grad_norm 1.0162 (1.1334) [2022-01-19 01:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][810/1251] eta 0:16:12 lr 0.000937 time 
2.1107 (2.2056) loss 2.9256 (3.9755) grad_norm 1.0905 (1.1345) [2022-01-19 01:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][820/1251] eta 0:15:50 lr 0.000937 time 2.1588 (2.2059) loss 4.3070 (3.9742) grad_norm 1.6673 (1.1369) [2022-01-19 01:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][830/1251] eta 0:15:28 lr 0.000937 time 1.9666 (2.2064) loss 4.1510 (3.9747) grad_norm 0.9665 (1.1372) [2022-01-19 01:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][840/1251] eta 0:15:08 lr 0.000937 time 2.8510 (2.2092) loss 4.0113 (3.9742) grad_norm 0.9468 (1.1365) [2022-01-19 01:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][850/1251] eta 0:14:46 lr 0.000937 time 1.8861 (2.2107) loss 3.4577 (3.9710) grad_norm 1.1757 (1.1362) [2022-01-19 01:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][860/1251] eta 0:14:23 lr 0.000937 time 1.9232 (2.2092) loss 4.2744 (3.9736) grad_norm 1.1835 (1.1361) [2022-01-19 01:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][870/1251] eta 0:14:01 lr 0.000937 time 1.9097 (2.2077) loss 3.8059 (3.9734) grad_norm 1.0662 (1.1356) [2022-01-19 01:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][880/1251] eta 0:13:38 lr 0.000937 time 1.9449 (2.2067) loss 3.1781 (3.9762) grad_norm 1.0518 (1.1356) [2022-01-19 01:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][890/1251] eta 0:13:16 lr 0.000937 time 3.0872 (2.2060) loss 4.2747 (3.9732) grad_norm 1.2216 (1.1350) [2022-01-19 01:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][900/1251] eta 0:12:54 lr 0.000937 time 1.7374 (2.2054) loss 4.1917 (3.9742) grad_norm 1.0249 (1.1357) [2022-01-19 01:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][910/1251] eta 0:12:31 lr 0.000937 time 1.5930 (2.2040) loss 3.0647 (3.9743) grad_norm 0.9297 (1.1349) [2022-01-19 01:55:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][920/1251] eta 0:12:09 lr 0.000937 time 1.9731 (2.2037) loss 4.8603 (3.9732) grad_norm 0.9964 (1.1339) [2022-01-19 01:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][930/1251] eta 0:11:47 lr 0.000937 time 2.7842 (2.2042) loss 3.6260 (3.9745) grad_norm 0.9973 (1.1331) [2022-01-19 01:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][940/1251] eta 0:11:25 lr 0.000937 time 2.4658 (2.2044) loss 3.9622 (3.9768) grad_norm 1.1806 (1.1347) [2022-01-19 01:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][950/1251] eta 0:11:03 lr 0.000937 time 1.6433 (2.2040) loss 4.2556 (3.9753) grad_norm 1.0334 (1.1348) [2022-01-19 01:56:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][960/1251] eta 0:10:41 lr 0.000937 time 2.1785 (2.2032) loss 3.6843 (3.9726) grad_norm 0.9633 (1.1350) [2022-01-19 01:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][970/1251] eta 0:10:19 lr 0.000937 time 3.0945 (2.2061) loss 4.5446 (3.9723) grad_norm 1.0118 (1.1351) [2022-01-19 01:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][980/1251] eta 0:09:57 lr 0.000937 time 2.2250 (2.2053) loss 4.0821 (3.9723) grad_norm 1.0223 (1.1350) [2022-01-19 01:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][990/1251] eta 0:09:35 lr 0.000937 time 1.8530 (2.2051) loss 3.9706 (3.9752) grad_norm 1.2123 (1.1339) [2022-01-19 01:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1000/1251] eta 0:09:13 lr 0.000937 time 1.8506 (2.2034) loss 3.5859 (3.9738) grad_norm 1.2456 (1.1344) [2022-01-19 01:58:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1010/1251] eta 0:08:50 lr 0.000937 time 1.5590 (2.2013) loss 4.5054 (3.9738) grad_norm 1.0451 (1.1346) [2022-01-19 01:58:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1020/1251] eta 0:08:28 lr 0.000937 time 
1.8937 (2.1997) loss 4.2472 (3.9727) grad_norm 1.0980 (1.1349) [2022-01-19 01:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1030/1251] eta 0:08:06 lr 0.000937 time 3.2343 (2.1999) loss 4.7933 (3.9714) grad_norm 1.3071 (1.1347) [2022-01-19 01:59:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1040/1251] eta 0:07:44 lr 0.000937 time 2.1838 (2.2003) loss 3.2583 (3.9707) grad_norm 1.4301 (1.1349) [2022-01-19 01:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1050/1251] eta 0:07:22 lr 0.000937 time 1.8372 (2.2000) loss 3.5671 (3.9689) grad_norm 1.4237 (1.1343) [2022-01-19 02:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1060/1251] eta 0:07:00 lr 0.000937 time 1.9167 (2.1998) loss 3.7823 (3.9695) grad_norm 1.1618 (1.1343) [2022-01-19 02:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1070/1251] eta 0:06:38 lr 0.000937 time 2.4892 (2.2004) loss 4.5032 (3.9700) grad_norm 1.0411 (1.1341) [2022-01-19 02:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1080/1251] eta 0:06:16 lr 0.000937 time 2.4236 (2.2025) loss 4.2694 (3.9697) grad_norm 1.0363 (1.1329) [2022-01-19 02:01:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1090/1251] eta 0:05:54 lr 0.000937 time 1.5351 (2.2023) loss 2.8571 (3.9684) grad_norm 1.2746 (1.1323) [2022-01-19 02:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1100/1251] eta 0:05:32 lr 0.000937 time 2.1209 (2.2022) loss 4.2249 (3.9692) grad_norm 1.0218 (1.1322) [2022-01-19 02:02:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1110/1251] eta 0:05:10 lr 0.000937 time 1.8986 (2.2024) loss 3.5220 (3.9683) grad_norm 1.1946 (1.1319) [2022-01-19 02:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1120/1251] eta 0:04:48 lr 0.000937 time 1.6822 (2.2029) loss 3.7890 (3.9646) grad_norm 0.9825 (1.1323) [2022-01-19 02:02:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1130/1251] eta 0:04:26 lr 0.000936 time 1.9577 (2.2015) loss 4.4033 (3.9651) grad_norm 1.0372 (1.1326) [2022-01-19 02:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1140/1251] eta 0:04:04 lr 0.000936 time 1.6893 (2.1998) loss 3.6156 (3.9629) grad_norm 1.2117 (1.1332) [2022-01-19 02:03:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1150/1251] eta 0:03:42 lr 0.000936 time 2.0071 (2.1993) loss 3.8976 (3.9622) grad_norm 1.1228 (1.1331) [2022-01-19 02:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1160/1251] eta 0:03:20 lr 0.000936 time 2.1897 (2.1987) loss 4.3136 (3.9611) grad_norm 0.8815 (1.1326) [2022-01-19 02:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1170/1251] eta 0:02:58 lr 0.000936 time 1.9499 (2.1989) loss 4.6467 (3.9625) grad_norm 1.0754 (1.1321) [2022-01-19 02:04:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1180/1251] eta 0:02:36 lr 0.000936 time 1.9019 (2.1988) loss 4.3332 (3.9637) grad_norm 1.0543 (1.1314) [2022-01-19 02:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1190/1251] eta 0:02:14 lr 0.000936 time 2.1689 (2.1988) loss 4.1698 (3.9643) grad_norm 0.9711 (1.1313) [2022-01-19 02:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1200/1251] eta 0:01:52 lr 0.000936 time 2.7018 (2.2001) loss 3.5674 (3.9642) grad_norm 1.1418 (1.1313) [2022-01-19 02:05:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1210/1251] eta 0:01:30 lr 0.000936 time 1.9192 (2.1993) loss 3.3566 (3.9648) grad_norm 1.2463 (1.1317) [2022-01-19 02:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1220/1251] eta 0:01:08 lr 0.000936 time 1.6255 (2.1971) loss 4.4008 (3.9655) grad_norm 1.0530 (1.1318) [2022-01-19 02:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1230/1251] eta 0:00:46 lr 
0.000936 time 1.8006 (2.1966) loss 4.4096 (3.9680) grad_norm 1.0067 (1.1309) [2022-01-19 02:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1240/1251] eta 0:00:24 lr 0.000936 time 1.5470 (2.1953) loss 4.0418 (3.9647) grad_norm 1.0254 (1.1305) [2022-01-19 02:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [48/300][1250/1251] eta 0:00:02 lr 0.000936 time 1.3500 (2.1903) loss 3.8371 (3.9646) grad_norm 1.2306 (1.1310) [2022-01-19 02:07:07 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 48 training takes 0:45:40 [2022-01-19 02:07:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.108 (18.108) Loss 1.3008 (1.3008) Acc@1 71.289 (71.289) Acc@5 89.941 (89.941) [2022-01-19 02:07:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.597 (3.309) Loss 1.2776 (1.3033) Acc@1 69.727 (70.108) Acc@5 90.137 (89.977) [2022-01-19 02:08:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.230 (2.584) Loss 1.2395 (1.2899) Acc@1 71.777 (70.396) Acc@5 90.430 (90.109) [2022-01-19 02:08:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.528 (2.321) Loss 1.2256 (1.3009) Acc@1 71.094 (69.963) Acc@5 91.309 (89.982) [2022-01-19 02:08:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.528 (2.191) Loss 1.2487 (1.3015) Acc@1 70.410 (69.865) Acc@5 91.406 (90.006) [2022-01-19 02:08:45 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 69.950 Acc@5 90.076 [2022-01-19 02:08:45 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 70.0% [2022-01-19 02:08:45 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 69.95% [2022-01-19 02:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][0/1251] eta 7:28:51 lr 0.000936 time 21.5281 (21.5281) loss 4.1878 (4.1878) grad_norm 1.2212 (1.2212) [2022-01-19 02:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[49/300][10/1251] eta 1:21:34 lr 0.000936 time 2.4148 (3.9439) loss 4.1610 (4.0672) grad_norm 1.3482 (1.1691) [2022-01-19 02:09:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][20/1251] eta 1:04:07 lr 0.000936 time 1.7007 (3.1255) loss 2.7661 (3.9253) grad_norm 1.1340 (1.1437) [2022-01-19 02:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][30/1251] eta 0:57:25 lr 0.000936 time 1.9407 (2.8222) loss 4.2671 (3.9006) grad_norm 1.1873 (1.1294) [2022-01-19 02:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][40/1251] eta 0:54:34 lr 0.000936 time 3.3926 (2.7040) loss 4.2534 (3.9183) grad_norm 0.9813 (1.1418) [2022-01-19 02:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][50/1251] eta 0:53:13 lr 0.000936 time 2.5037 (2.6589) loss 3.0012 (3.9529) grad_norm 1.3201 (1.1342) [2022-01-19 02:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][60/1251] eta 0:51:51 lr 0.000936 time 2.1358 (2.6123) loss 4.1907 (3.8954) grad_norm 1.1853 (1.1340) [2022-01-19 02:11:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][70/1251] eta 0:50:28 lr 0.000936 time 2.0984 (2.5642) loss 4.5962 (3.9290) grad_norm 1.0417 (1.1352) [2022-01-19 02:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][80/1251] eta 0:49:12 lr 0.000936 time 2.8808 (2.5215) loss 3.2851 (3.9145) grad_norm 1.1374 (1.1251) [2022-01-19 02:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][90/1251] eta 0:47:57 lr 0.000936 time 2.2875 (2.4788) loss 4.1474 (3.9216) grad_norm 1.1931 (1.1205) [2022-01-19 02:12:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][100/1251] eta 0:46:34 lr 0.000936 time 1.8221 (2.4280) loss 3.5953 (3.8980) grad_norm 1.2664 (1.1229) [2022-01-19 02:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][110/1251] eta 0:45:26 lr 0.000936 time 1.7892 (2.3899) loss 4.3642 (3.9026) grad_norm 1.2548 (1.1235) 
[2022-01-19 02:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][120/1251] eta 0:44:44 lr 0.000936 time 2.1781 (2.3732) loss 3.7481 (3.9023) grad_norm 1.3709 (1.1351) [2022-01-19 02:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][130/1251] eta 0:43:54 lr 0.000936 time 2.1569 (2.3500) loss 4.6295 (3.9017) grad_norm 1.2235 (1.1317) [2022-01-19 02:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][140/1251] eta 0:43:15 lr 0.000936 time 1.9811 (2.3360) loss 4.4215 (3.8964) grad_norm 1.2446 (1.1312) [2022-01-19 02:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][150/1251] eta 0:42:38 lr 0.000936 time 2.1423 (2.3242) loss 4.2578 (3.8967) grad_norm 1.1923 (1.1296) [2022-01-19 02:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][160/1251] eta 0:42:07 lr 0.000936 time 2.4179 (2.3168) loss 3.9757 (3.9072) grad_norm 0.9164 (1.1298) [2022-01-19 02:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][170/1251] eta 0:41:39 lr 0.000936 time 1.9539 (2.3123) loss 3.9316 (3.8928) grad_norm 1.2943 (1.1317) [2022-01-19 02:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][180/1251] eta 0:41:24 lr 0.000936 time 2.4310 (2.3198) loss 2.7044 (3.8905) grad_norm 1.4130 (1.1346) [2022-01-19 02:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][190/1251] eta 0:40:56 lr 0.000936 time 1.8312 (2.3156) loss 3.8224 (3.8931) grad_norm 1.2125 (1.1339) [2022-01-19 02:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][200/1251] eta 0:40:31 lr 0.000936 time 1.6501 (2.3140) loss 4.0710 (3.8904) grad_norm 1.1199 (1.1356) [2022-01-19 02:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][210/1251] eta 0:40:03 lr 0.000936 time 2.2736 (2.3087) loss 3.0521 (3.8769) grad_norm 1.2400 (1.1365) [2022-01-19 02:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][220/1251] eta 0:39:40 
lr 0.000936 time 3.0903 (2.3091) loss 4.1586 (3.8811) grad_norm 1.1376 (1.1351) [2022-01-19 02:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][230/1251] eta 0:39:06 lr 0.000936 time 1.8092 (2.2981) loss 3.8884 (3.8864) grad_norm 1.2792 (1.1384) [2022-01-19 02:17:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][240/1251] eta 0:38:33 lr 0.000936 time 1.8794 (2.2885) loss 4.2386 (3.8942) grad_norm 0.8880 (1.1393) [2022-01-19 02:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][250/1251] eta 0:38:12 lr 0.000936 time 2.8756 (2.2902) loss 4.1390 (3.8881) grad_norm 1.1997 (1.1408) [2022-01-19 02:18:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][260/1251] eta 0:37:49 lr 0.000936 time 2.0120 (2.2901) loss 4.0137 (3.8901) grad_norm 1.0965 (1.1402) [2022-01-19 02:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][270/1251] eta 0:37:18 lr 0.000936 time 1.9122 (2.2815) loss 4.2196 (3.8873) grad_norm 0.9116 (1.1391) [2022-01-19 02:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][280/1251] eta 0:36:50 lr 0.000936 time 2.2751 (2.2764) loss 4.2858 (3.8911) grad_norm 1.0046 (1.1394) [2022-01-19 02:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][290/1251] eta 0:36:21 lr 0.000936 time 2.2045 (2.2705) loss 4.2009 (3.8914) grad_norm 1.0445 (1.1366) [2022-01-19 02:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][300/1251] eta 0:35:54 lr 0.000936 time 2.2241 (2.2655) loss 3.8393 (3.8866) grad_norm 1.1854 (1.1342) [2022-01-19 02:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][310/1251] eta 0:35:31 lr 0.000936 time 2.2492 (2.2650) loss 3.7005 (3.8864) grad_norm 1.4536 (1.1341) [2022-01-19 02:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][320/1251] eta 0:35:03 lr 0.000936 time 2.0231 (2.2593) loss 2.8237 (3.8797) grad_norm 1.0807 (1.1332) [2022-01-19 02:21:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][330/1251] eta 0:34:34 lr 0.000936 time 2.2298 (2.2523) loss 4.2455 (3.8859) grad_norm 1.2066 (1.1335) [2022-01-19 02:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][340/1251] eta 0:34:11 lr 0.000936 time 2.2659 (2.2515) loss 3.3805 (3.8861) grad_norm 1.1914 (1.1317) [2022-01-19 02:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][350/1251] eta 0:33:45 lr 0.000936 time 1.8327 (2.2485) loss 4.1184 (3.8848) grad_norm 1.2155 (1.1294) [2022-01-19 02:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][360/1251] eta 0:33:22 lr 0.000936 time 2.4746 (2.2480) loss 4.7488 (3.8882) grad_norm 1.1096 (1.1311) [2022-01-19 02:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][370/1251] eta 0:32:59 lr 0.000935 time 2.5636 (2.2467) loss 3.6354 (3.8922) grad_norm 0.9041 (1.1292) [2022-01-19 02:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][380/1251] eta 0:32:42 lr 0.000935 time 3.2547 (2.2536) loss 4.6862 (3.8960) grad_norm 0.9734 (1.1308) [2022-01-19 02:23:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][390/1251] eta 0:32:21 lr 0.000935 time 2.7485 (2.2546) loss 3.7343 (3.8916) grad_norm 1.4429 (1.1321) [2022-01-19 02:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][400/1251] eta 0:31:53 lr 0.000935 time 1.8800 (2.2487) loss 3.8606 (3.9028) grad_norm 1.0422 (1.1317) [2022-01-19 02:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][410/1251] eta 0:31:28 lr 0.000935 time 1.7374 (2.2460) loss 4.2599 (3.9008) grad_norm 1.1093 (1.1295) [2022-01-19 02:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][420/1251] eta 0:31:05 lr 0.000935 time 2.5610 (2.2452) loss 4.2050 (3.9025) grad_norm 0.9450 (1.1281) [2022-01-19 02:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][430/1251] eta 0:30:39 lr 0.000935 time 
1.8759 (2.2408) loss 4.3475 (3.9006) grad_norm 1.1417 (1.1293) [2022-01-19 02:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][440/1251] eta 0:30:15 lr 0.000935 time 2.4460 (2.2390) loss 4.4802 (3.9101) grad_norm 1.1952 (1.1301) [2022-01-19 02:25:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][450/1251] eta 0:29:48 lr 0.000935 time 1.8836 (2.2331) loss 4.2554 (3.9143) grad_norm 1.0657 (1.1284) [2022-01-19 02:25:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][460/1251] eta 0:29:25 lr 0.000935 time 2.1379 (2.2318) loss 3.0567 (3.9128) grad_norm 1.2175 (1.1283) [2022-01-19 02:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][470/1251] eta 0:29:01 lr 0.000935 time 2.1385 (2.2302) loss 3.6294 (3.9088) grad_norm 1.0271 (1.1282) [2022-01-19 02:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][480/1251] eta 0:28:38 lr 0.000935 time 2.5253 (2.2295) loss 4.6754 (3.9092) grad_norm 1.1993 (1.1300) [2022-01-19 02:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][490/1251] eta 0:28:17 lr 0.000935 time 2.2997 (2.2300) loss 4.5230 (3.9141) grad_norm 1.2576 (1.1316) [2022-01-19 02:27:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][500/1251] eta 0:27:57 lr 0.000935 time 3.6978 (2.2333) loss 3.3755 (3.9102) grad_norm 0.9884 (1.1316) [2022-01-19 02:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][510/1251] eta 0:27:36 lr 0.000935 time 2.8501 (2.2357) loss 2.8374 (3.9093) grad_norm 1.0537 (1.1301) [2022-01-19 02:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][520/1251] eta 0:27:15 lr 0.000935 time 2.1642 (2.2368) loss 4.9576 (3.9140) grad_norm 1.1424 (1.1315) [2022-01-19 02:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][530/1251] eta 0:26:52 lr 0.000935 time 1.5832 (2.2371) loss 2.7455 (3.9109) grad_norm 1.0065 (1.1307) [2022-01-19 02:28:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][540/1251] eta 0:26:31 lr 0.000935 time 3.4189 (2.2391) loss 3.8582 (3.9093) grad_norm 1.0397 (1.1310) [2022-01-19 02:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][550/1251] eta 0:26:05 lr 0.000935 time 1.9007 (2.2339) loss 4.3397 (3.9069) grad_norm 1.1939 (1.1315) [2022-01-19 02:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][560/1251] eta 0:25:39 lr 0.000935 time 1.8056 (2.2280) loss 4.2858 (3.9063) grad_norm 1.2151 (1.1310) [2022-01-19 02:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][570/1251] eta 0:25:16 lr 0.000935 time 2.7302 (2.2263) loss 4.4422 (3.9093) grad_norm 1.4134 (1.1318) [2022-01-19 02:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][580/1251] eta 0:24:54 lr 0.000935 time 2.4738 (2.2269) loss 3.5533 (3.9089) grad_norm 1.1201 (1.1338) [2022-01-19 02:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][590/1251] eta 0:24:32 lr 0.000935 time 2.3344 (2.2273) loss 4.0294 (3.9104) grad_norm 1.1601 (1.1341) [2022-01-19 02:31:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][600/1251] eta 0:24:10 lr 0.000935 time 2.4846 (2.2282) loss 4.2700 (3.9115) grad_norm 0.8963 (1.1336) [2022-01-19 02:31:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][610/1251] eta 0:23:46 lr 0.000935 time 1.9251 (2.2262) loss 4.7624 (3.9144) grad_norm 1.2034 (1.1342) [2022-01-19 02:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][620/1251] eta 0:23:24 lr 0.000935 time 2.9699 (2.2263) loss 4.4782 (3.9184) grad_norm 1.0237 (1.1330) [2022-01-19 02:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][630/1251] eta 0:23:00 lr 0.000935 time 1.9578 (2.2236) loss 4.5161 (3.9209) grad_norm 1.2102 (1.1328) [2022-01-19 02:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][640/1251] eta 0:22:37 lr 0.000935 time 
2.5206 (2.2223) loss 3.2635 (3.9232) grad_norm 1.0892 (1.1325) [2022-01-19 02:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][650/1251] eta 0:22:15 lr 0.000935 time 2.2909 (2.2224) loss 4.2846 (3.9264) grad_norm 1.2009 (1.1322) [2022-01-19 02:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][660/1251] eta 0:21:52 lr 0.000935 time 1.8423 (2.2200) loss 3.3339 (3.9232) grad_norm 1.2011 (1.1325) [2022-01-19 02:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][670/1251] eta 0:21:30 lr 0.000935 time 2.3516 (2.2208) loss 4.3134 (3.9248) grad_norm 0.9821 (1.1315) [2022-01-19 02:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][680/1251] eta 0:21:08 lr 0.000935 time 1.9440 (2.2210) loss 4.0850 (3.9241) grad_norm 0.9623 (1.1313) [2022-01-19 02:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][690/1251] eta 0:20:45 lr 0.000935 time 1.9154 (2.2204) loss 4.3867 (3.9228) grad_norm 1.2233 (1.1308) [2022-01-19 02:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][700/1251] eta 0:20:23 lr 0.000935 time 2.1781 (2.2214) loss 4.1513 (3.9262) grad_norm 1.4705 (1.1308) [2022-01-19 02:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][710/1251] eta 0:20:01 lr 0.000935 time 2.4256 (2.2207) loss 3.1564 (3.9268) grad_norm 1.1077 (1.1324) [2022-01-19 02:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][720/1251] eta 0:19:37 lr 0.000935 time 1.9068 (2.2173) loss 3.7932 (3.9282) grad_norm 1.2117 (1.1321) [2022-01-19 02:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][730/1251] eta 0:19:13 lr 0.000935 time 1.8464 (2.2141) loss 4.1902 (3.9302) grad_norm 1.0045 (1.1327) [2022-01-19 02:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][740/1251] eta 0:18:51 lr 0.000935 time 2.1098 (2.2136) loss 3.8748 (3.9300) grad_norm 1.1283 (1.1333) [2022-01-19 02:36:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][750/1251] eta 0:18:29 lr 0.000935 time 2.2020 (2.2143) loss 4.1407 (3.9316) grad_norm 1.1672 (1.1324) [2022-01-19 02:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][760/1251] eta 0:18:07 lr 0.000935 time 2.1129 (2.2157) loss 4.1822 (3.9333) grad_norm 1.0570 (1.1322) [2022-01-19 02:37:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][770/1251] eta 0:17:45 lr 0.000935 time 1.8959 (2.2152) loss 4.0675 (3.9325) grad_norm 1.3665 (1.1326) [2022-01-19 02:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][780/1251] eta 0:17:22 lr 0.000935 time 2.1528 (2.2143) loss 2.9161 (3.9301) grad_norm 0.9237 (1.1327) [2022-01-19 02:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][790/1251] eta 0:17:00 lr 0.000935 time 1.8227 (2.2141) loss 4.0783 (3.9309) grad_norm 1.1676 (1.1332) [2022-01-19 02:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][800/1251] eta 0:16:38 lr 0.000935 time 2.4830 (2.2141) loss 4.2069 (3.9299) grad_norm 0.9423 (1.1336) [2022-01-19 02:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][810/1251] eta 0:16:15 lr 0.000935 time 1.8799 (2.2117) loss 4.4517 (3.9268) grad_norm 1.1523 (1.1340) [2022-01-19 02:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][820/1251] eta 0:15:52 lr 0.000935 time 2.1895 (2.2105) loss 4.5010 (3.9260) grad_norm 1.2301 (1.1352) [2022-01-19 02:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][830/1251] eta 0:15:30 lr 0.000935 time 1.9107 (2.2097) loss 4.1234 (3.9242) grad_norm 0.9730 (1.1358) [2022-01-19 02:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][840/1251] eta 0:15:08 lr 0.000935 time 1.5989 (2.2102) loss 4.2894 (3.9233) grad_norm 1.3717 (1.1351) [2022-01-19 02:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][850/1251] eta 0:14:46 lr 0.000935 time 
2.1095 (2.2116) loss 4.4210 (3.9241) grad_norm 1.1763 (1.1346) [2022-01-19 02:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][860/1251] eta 0:14:24 lr 0.000934 time 1.5941 (2.2112) loss 4.1261 (3.9221) grad_norm 1.0189 (1.1343) [2022-01-19 02:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][870/1251] eta 0:14:02 lr 0.000934 time 1.8714 (2.2110) loss 4.0976 (3.9237) grad_norm 1.1177 (1.1334) [2022-01-19 02:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][880/1251] eta 0:13:40 lr 0.000934 time 1.6057 (2.2117) loss 4.2604 (3.9280) grad_norm 1.1037 (1.1337) [2022-01-19 02:41:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][890/1251] eta 0:13:18 lr 0.000934 time 1.5943 (2.2117) loss 3.8442 (3.9297) grad_norm 1.0278 (1.1332) [2022-01-19 02:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][900/1251] eta 0:12:55 lr 0.000934 time 2.2424 (2.2097) loss 2.8936 (3.9294) grad_norm 1.1635 (1.1338) [2022-01-19 02:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][910/1251] eta 0:12:32 lr 0.000934 time 1.8837 (2.2080) loss 4.2412 (3.9282) grad_norm 0.8922 (1.1336) [2022-01-19 02:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][920/1251] eta 0:12:10 lr 0.000934 time 2.1236 (2.2068) loss 4.3053 (3.9265) grad_norm 0.9963 (1.1339) [2022-01-19 02:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][930/1251] eta 0:11:48 lr 0.000934 time 2.5390 (2.2066) loss 4.3632 (3.9265) grad_norm 1.0545 (1.1338) [2022-01-19 02:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][940/1251] eta 0:11:26 lr 0.000934 time 2.6755 (2.2083) loss 2.6099 (3.9239) grad_norm 1.1143 (1.1349) [2022-01-19 02:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][950/1251] eta 0:11:04 lr 0.000934 time 1.8243 (2.2086) loss 3.8278 (3.9257) grad_norm 1.0477 (1.1342) [2022-01-19 02:44:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][960/1251] eta 0:10:42 lr 0.000934 time 2.0224 (2.2070) loss 4.1138 (3.9233) grad_norm 1.1047 (1.1349) [2022-01-19 02:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][970/1251] eta 0:10:20 lr 0.000934 time 2.1543 (2.2068) loss 4.4118 (3.9244) grad_norm 1.0883 (1.1353) [2022-01-19 02:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][980/1251] eta 0:09:57 lr 0.000934 time 2.4866 (2.2066) loss 4.6527 (3.9222) grad_norm 1.0064 (1.1347) [2022-01-19 02:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][990/1251] eta 0:09:36 lr 0.000934 time 2.1333 (2.2073) loss 4.7623 (3.9230) grad_norm 1.1661 (1.1340) [2022-01-19 02:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1000/1251] eta 0:09:14 lr 0.000934 time 1.8079 (2.2086) loss 4.7330 (3.9240) grad_norm 1.1914 (1.1340) [2022-01-19 02:45:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1010/1251] eta 0:08:52 lr 0.000934 time 1.9762 (2.2091) loss 4.2172 (3.9248) grad_norm 1.1136 (1.1341) [2022-01-19 02:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1020/1251] eta 0:08:30 lr 0.000934 time 2.0632 (2.2080) loss 3.6493 (3.9241) grad_norm 1.2529 (1.1340) [2022-01-19 02:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1030/1251] eta 0:08:07 lr 0.000934 time 2.6681 (2.2066) loss 4.6365 (3.9251) grad_norm 1.5383 (1.1347) [2022-01-19 02:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1040/1251] eta 0:07:45 lr 0.000934 time 1.8627 (2.2054) loss 4.3152 (3.9262) grad_norm 1.0991 (1.1341) [2022-01-19 02:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1050/1251] eta 0:07:23 lr 0.000934 time 1.8231 (2.2056) loss 4.2301 (3.9257) grad_norm 1.3427 (1.1339) [2022-01-19 02:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1060/1251] eta 0:07:01 lr 0.000934 
time 2.1662 (2.2057) loss 4.3358 (3.9256) grad_norm 1.2422 (1.1334) [2022-01-19 02:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1070/1251] eta 0:06:39 lr 0.000934 time 1.9085 (2.2046) loss 4.3220 (3.9282) grad_norm 1.2014 (1.1334) [2022-01-19 02:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1080/1251] eta 0:06:17 lr 0.000934 time 1.7661 (2.2052) loss 3.4898 (3.9233) grad_norm 1.3229 (1.1335) [2022-01-19 02:48:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1090/1251] eta 0:05:55 lr 0.000934 time 2.4299 (2.2052) loss 3.5374 (3.9248) grad_norm 1.2142 (1.1340) [2022-01-19 02:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1100/1251] eta 0:05:32 lr 0.000934 time 1.8679 (2.2042) loss 4.0250 (3.9262) grad_norm 1.0839 (1.1336) [2022-01-19 02:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1110/1251] eta 0:05:10 lr 0.000934 time 2.1368 (2.2041) loss 4.0575 (3.9260) grad_norm 1.2224 (1.1338) [2022-01-19 02:49:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1120/1251] eta 0:04:48 lr 0.000934 time 2.4832 (2.2045) loss 4.2203 (3.9241) grad_norm 1.2717 (1.1342) [2022-01-19 02:50:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1130/1251] eta 0:04:26 lr 0.000934 time 2.0124 (2.2034) loss 3.1950 (3.9235) grad_norm 1.3172 (1.1344) [2022-01-19 02:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1140/1251] eta 0:04:04 lr 0.000934 time 2.2016 (2.2029) loss 4.2666 (3.9249) grad_norm 0.9123 (1.1338) [2022-01-19 02:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1150/1251] eta 0:03:42 lr 0.000934 time 1.8993 (2.2013) loss 4.6637 (3.9269) grad_norm 0.9537 (1.1328) [2022-01-19 02:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1160/1251] eta 0:03:20 lr 0.000934 time 2.0281 (2.2005) loss 2.9742 (3.9292) grad_norm 1.1816 (1.1328) [2022-01-19 02:51:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1170/1251] eta 0:02:58 lr 0.000934 time 2.3240 (2.2002) loss 4.5107 (3.9286) grad_norm 1.1923 (1.1323) [2022-01-19 02:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1180/1251] eta 0:02:36 lr 0.000934 time 2.3310 (2.2009) loss 3.7657 (3.9276) grad_norm 1.1796 (1.1320) [2022-01-19 02:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1190/1251] eta 0:02:14 lr 0.000934 time 2.0163 (2.2005) loss 4.0883 (3.9271) grad_norm 0.8697 (1.1313) [2022-01-19 02:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1200/1251] eta 0:01:52 lr 0.000934 time 2.2204 (2.2011) loss 4.6374 (3.9267) grad_norm 1.0545 (1.1307) [2022-01-19 02:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1210/1251] eta 0:01:30 lr 0.000934 time 2.5911 (2.2011) loss 4.2636 (3.9284) grad_norm 1.3327 (1.1303) [2022-01-19 02:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1220/1251] eta 0:01:08 lr 0.000934 time 2.6754 (2.2011) loss 3.6078 (3.9234) grad_norm 1.0273 (1.1298) [2022-01-19 02:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1230/1251] eta 0:00:46 lr 0.000934 time 2.5009 (2.2006) loss 4.0202 (3.9238) grad_norm 1.1113 (1.1296) [2022-01-19 02:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1240/1251] eta 0:00:24 lr 0.000934 time 1.7324 (2.1993) loss 4.3352 (3.9258) grad_norm 1.3871 (1.1298) [2022-01-19 02:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [49/300][1250/1251] eta 0:00:02 lr 0.000934 time 1.2018 (2.1934) loss 4.3091 (3.9274) grad_norm 1.1927 (1.1296) [2022-01-19 02:54:29 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 49 training takes 0:45:44 [2022-01-19 02:54:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.386 (18.386) Loss 1.3106 (1.3106) Acc@1 68.652 (68.652) Acc@5 90.039 (90.039) [2022-01-19 02:55:06 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.308 (3.393) Loss 1.2559 (1.2943) Acc@1 70.801 (69.256) Acc@5 89.160 (89.906) [2022-01-19 02:55:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.591 (2.542) Loss 1.2265 (1.2917) Acc@1 71.973 (69.550) Acc@5 91.016 (90.030) [2022-01-19 02:55:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.943 (2.222) Loss 1.2579 (1.2848) Acc@1 71.094 (69.796) Acc@5 90.625 (90.017) [2022-01-19 02:55:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.817 (2.159) Loss 1.3471 (1.2848) Acc@1 68.848 (69.958) Acc@5 88.379 (89.972) [2022-01-19 02:56:05 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 70.104 Acc@5 90.006 [2022-01-19 02:56:05 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 70.1% [2022-01-19 02:56:05 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 70.10% [2022-01-19 02:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][0/1251] eta 7:32:25 lr 0.000934 time 21.6989 (21.6989) loss 3.4463 (3.4463) grad_norm 1.2064 (1.2064) [2022-01-19 02:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][10/1251] eta 1:26:02 lr 0.000934 time 2.1697 (4.1596) loss 3.8591 (3.6924) grad_norm 1.0401 (1.1975) [2022-01-19 02:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][20/1251] eta 1:05:26 lr 0.000934 time 1.4007 (3.1897) loss 2.6970 (3.7622) grad_norm 1.0756 (1.1507) [2022-01-19 02:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][30/1251] eta 0:58:20 lr 0.000934 time 1.9129 (2.8670) loss 2.8457 (3.8272) grad_norm 1.1987 (1.1731) [2022-01-19 02:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][40/1251] eta 0:54:09 lr 0.000934 time 2.7774 (2.6831) loss 3.5264 (3.8071) grad_norm 0.9700 (1.1766) [2022-01-19 02:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[50/300][50/1251] eta 0:52:00 lr 0.000934 time 1.5756 (2.5982) loss 4.4329 (3.8097) grad_norm 1.2047 (1.1697) [2022-01-19 02:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][60/1251] eta 0:50:35 lr 0.000934 time 1.8073 (2.5491) loss 4.0244 (3.8728) grad_norm 1.0782 (1.1571) [2022-01-19 02:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][70/1251] eta 0:49:01 lr 0.000934 time 1.5099 (2.4907) loss 2.9709 (3.8604) grad_norm 1.0558 (1.1537) [2022-01-19 02:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][80/1251] eta 0:47:44 lr 0.000934 time 2.1462 (2.4464) loss 4.6744 (3.8535) grad_norm 1.1282 (1.1562) [2022-01-19 02:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][90/1251] eta 0:46:56 lr 0.000933 time 1.9299 (2.4260) loss 4.6312 (3.8811) grad_norm 1.5214 (1.1552) [2022-01-19 03:00:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][100/1251] eta 0:45:55 lr 0.000933 time 1.9168 (2.3944) loss 4.1035 (3.8546) grad_norm 1.1951 (1.1607) [2022-01-19 03:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][110/1251] eta 0:45:13 lr 0.000933 time 1.8157 (2.3780) loss 4.5316 (3.8735) grad_norm 1.2352 (1.1554) [2022-01-19 03:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][120/1251] eta 0:44:22 lr 0.000933 time 1.9354 (2.3539) loss 4.5300 (3.9092) grad_norm 1.0828 (1.1514) [2022-01-19 03:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][130/1251] eta 0:43:23 lr 0.000933 time 1.5475 (2.3224) loss 4.1788 (3.8967) grad_norm 1.1938 (1.1410) [2022-01-19 03:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][140/1251] eta 0:42:38 lr 0.000933 time 2.2546 (2.3026) loss 4.6882 (3.9192) grad_norm 1.2068 (1.1368) [2022-01-19 03:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][150/1251] eta 0:42:01 lr 0.000933 time 2.1814 (2.2906) loss 4.0584 (3.9282) grad_norm 0.9440 (1.1324) 
[2022-01-19 03:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][160/1251] eta 0:41:25 lr 0.000933 time 1.6323 (2.2784) loss 4.7467 (3.9305) grad_norm 1.1859 (1.1339) [2022-01-19 03:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][170/1251] eta 0:41:12 lr 0.000933 time 2.5619 (2.2870) loss 4.0207 (3.9386) grad_norm 1.1766 (1.1282) [2022-01-19 03:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][180/1251] eta 0:40:48 lr 0.000933 time 1.7294 (2.2862) loss 3.8968 (3.9321) grad_norm 1.1184 (1.1224) [2022-01-19 03:03:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][190/1251] eta 0:40:32 lr 0.000933 time 2.7984 (2.2922) loss 4.5355 (3.9428) grad_norm 1.1787 (1.1249) [2022-01-19 03:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][200/1251] eta 0:39:57 lr 0.000933 time 1.8176 (2.2816) loss 3.9919 (3.9252) grad_norm 1.2765 (1.1279) [2022-01-19 03:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][210/1251] eta 0:39:26 lr 0.000933 time 1.5419 (2.2737) loss 4.5789 (3.9236) grad_norm 1.3986 (1.1334) [2022-01-19 03:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][220/1251] eta 0:38:53 lr 0.000933 time 1.8422 (2.2637) loss 3.1067 (3.9244) grad_norm 1.0655 (1.1386) [2022-01-19 03:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][230/1251] eta 0:38:27 lr 0.000933 time 2.2572 (2.2599) loss 4.2774 (3.9256) grad_norm 1.8081 (1.1422) [2022-01-19 03:05:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][240/1251] eta 0:38:02 lr 0.000933 time 2.1596 (2.2575) loss 4.0284 (3.9385) grad_norm 1.2141 (1.1445) [2022-01-19 03:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][250/1251] eta 0:37:47 lr 0.000933 time 2.7815 (2.2655) loss 4.2817 (3.9331) grad_norm 1.0857 (1.1422) [2022-01-19 03:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][260/1251] eta 0:37:23 
lr 0.000933 time 1.8845 (2.2643) loss 4.8764 (3.9372) grad_norm 1.2129 (1.1435) [2022-01-19 03:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][270/1251] eta 0:36:58 lr 0.000933 time 1.9349 (2.2611) loss 4.0875 (3.9389) grad_norm 1.2939 (1.1456) [2022-01-19 03:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][280/1251] eta 0:36:27 lr 0.000933 time 2.1715 (2.2529) loss 4.3496 (3.9395) grad_norm 1.0847 (1.1471) [2022-01-19 03:07:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][290/1251] eta 0:36:03 lr 0.000933 time 1.9545 (2.2509) loss 3.3035 (3.9336) grad_norm 1.1287 (1.1447) [2022-01-19 03:07:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][300/1251] eta 0:35:43 lr 0.000933 time 1.8683 (2.2542) loss 2.9822 (3.9235) grad_norm 1.0685 (1.1420) [2022-01-19 03:07:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][310/1251] eta 0:35:23 lr 0.000933 time 1.9308 (2.2564) loss 3.5646 (3.9361) grad_norm 1.0790 (1.1396) [2022-01-19 03:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][320/1251] eta 0:34:56 lr 0.000933 time 2.0473 (2.2519) loss 4.1634 (3.9420) grad_norm 1.0995 (1.1386) [2022-01-19 03:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][330/1251] eta 0:34:28 lr 0.000933 time 2.1287 (2.2455) loss 3.5681 (3.9502) grad_norm 1.0076 (1.1386) [2022-01-19 03:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][340/1251] eta 0:34:04 lr 0.000933 time 1.9866 (2.2446) loss 4.4338 (3.9460) grad_norm 1.1315 (1.1405) [2022-01-19 03:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][350/1251] eta 0:33:39 lr 0.000933 time 2.2526 (2.2417) loss 3.5812 (3.9589) grad_norm 1.0532 (1.1440) [2022-01-19 03:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][360/1251] eta 0:33:15 lr 0.000933 time 2.2073 (2.2398) loss 2.8673 (3.9598) grad_norm 1.0156 (1.1441) [2022-01-19 03:09:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][370/1251] eta 0:32:52 lr 0.000933 time 2.7355 (2.2390) loss 3.4167 (3.9542) grad_norm 1.3806 (1.1433) [2022-01-19 03:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][380/1251] eta 0:32:28 lr 0.000933 time 1.7886 (2.2373) loss 3.4925 (3.9561) grad_norm 1.0726 (1.1429) [2022-01-19 03:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][390/1251] eta 0:32:04 lr 0.000933 time 1.4949 (2.2355) loss 4.4787 (3.9543) grad_norm 1.2319 (1.1438) [2022-01-19 03:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][400/1251] eta 0:31:40 lr 0.000933 time 1.4263 (2.2334) loss 3.7728 (3.9561) grad_norm 1.3462 (1.1440) [2022-01-19 03:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][410/1251] eta 0:31:16 lr 0.000933 time 2.3845 (2.2313) loss 2.7913 (3.9523) grad_norm 1.5025 (1.1436) [2022-01-19 03:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][420/1251] eta 0:30:53 lr 0.000933 time 1.7068 (2.2306) loss 4.0684 (3.9552) grad_norm 0.9961 (1.1445) [2022-01-19 03:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][430/1251] eta 0:30:28 lr 0.000933 time 2.1661 (2.2269) loss 4.2278 (3.9631) grad_norm 1.4755 (1.1441) [2022-01-19 03:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][440/1251] eta 0:30:03 lr 0.000933 time 1.9387 (2.2237) loss 3.8899 (3.9591) grad_norm 1.0560 (1.1437) [2022-01-19 03:12:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][450/1251] eta 0:29:39 lr 0.000933 time 1.6414 (2.2222) loss 3.5590 (3.9573) grad_norm 1.0525 (1.1425) [2022-01-19 03:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][460/1251] eta 0:29:19 lr 0.000933 time 2.4302 (2.2240) loss 4.7047 (3.9583) grad_norm 1.3669 (1.1420) [2022-01-19 03:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][470/1251] eta 0:28:58 lr 0.000933 time 
1.5411 (2.2259) loss 4.0614 (3.9573) grad_norm 1.3846 (1.1410) [2022-01-19 03:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][480/1251] eta 0:28:39 lr 0.000933 time 1.8652 (2.2302) loss 3.4778 (3.9546) grad_norm 1.3408 (1.1418) [2022-01-19 03:14:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][490/1251] eta 0:28:20 lr 0.000933 time 1.9011 (2.2345) loss 4.2899 (3.9512) grad_norm 1.0893 (1.1400) [2022-01-19 03:14:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][500/1251] eta 0:27:57 lr 0.000933 time 1.7594 (2.2338) loss 4.0553 (3.9522) grad_norm 1.0434 (1.1393) [2022-01-19 03:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][510/1251] eta 0:27:31 lr 0.000933 time 1.8015 (2.2282) loss 4.4161 (3.9525) grad_norm 0.9780 (1.1376) [2022-01-19 03:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][520/1251] eta 0:27:05 lr 0.000933 time 2.3247 (2.2237) loss 3.9832 (3.9487) grad_norm 1.1395 (1.1362) [2022-01-19 03:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][530/1251] eta 0:26:40 lr 0.000933 time 1.9542 (2.2201) loss 4.0260 (3.9518) grad_norm 1.0167 (1.1352) [2022-01-19 03:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][540/1251] eta 0:26:17 lr 0.000933 time 2.4343 (2.2188) loss 3.9734 (3.9545) grad_norm 1.0625 (1.1348) [2022-01-19 03:16:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][550/1251] eta 0:25:53 lr 0.000933 time 1.8822 (2.2164) loss 4.2085 (3.9599) grad_norm 1.1898 (1.1339) [2022-01-19 03:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][560/1251] eta 0:25:31 lr 0.000933 time 1.9471 (2.2171) loss 3.0839 (3.9586) grad_norm 1.3073 (1.1342) [2022-01-19 03:17:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][570/1251] eta 0:25:10 lr 0.000932 time 2.1892 (2.2182) loss 3.1601 (3.9607) grad_norm 1.3342 (1.1342) [2022-01-19 03:17:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][580/1251] eta 0:24:51 lr 0.000932 time 1.5361 (2.2221) loss 4.0632 (3.9611) grad_norm 1.0245 (1.1328) [2022-01-19 03:17:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][590/1251] eta 0:24:29 lr 0.000932 time 2.3227 (2.2232) loss 3.0750 (3.9578) grad_norm 1.2223 (1.1322) [2022-01-19 03:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][600/1251] eta 0:24:07 lr 0.000932 time 2.5624 (2.2228) loss 3.8471 (3.9550) grad_norm 1.0027 (1.1335) [2022-01-19 03:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][610/1251] eta 0:23:44 lr 0.000932 time 1.6777 (2.2224) loss 4.1895 (3.9507) grad_norm 1.1345 (1.1348) [2022-01-19 03:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][620/1251] eta 0:23:20 lr 0.000932 time 1.8967 (2.2196) loss 4.0704 (3.9485) grad_norm 1.3248 (1.1361) [2022-01-19 03:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][630/1251] eta 0:22:57 lr 0.000932 time 1.5341 (2.2186) loss 3.7630 (3.9480) grad_norm 1.4820 (1.1365) [2022-01-19 03:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][640/1251] eta 0:22:35 lr 0.000932 time 2.4468 (2.2186) loss 3.3406 (3.9482) grad_norm 1.0266 (1.1365) [2022-01-19 03:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][650/1251] eta 0:22:15 lr 0.000932 time 2.0015 (2.2214) loss 3.6583 (3.9470) grad_norm 1.0013 (1.1370) [2022-01-19 03:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][660/1251] eta 0:21:52 lr 0.000932 time 1.8330 (2.2205) loss 4.3037 (3.9490) grad_norm 1.2252 (1.1365) [2022-01-19 03:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][670/1251] eta 0:21:28 lr 0.000932 time 1.9217 (2.2182) loss 4.3182 (3.9477) grad_norm 1.0500 (1.1367) [2022-01-19 03:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][680/1251] eta 0:21:05 lr 0.000932 time 
2.7851 (2.2167) loss 3.3021 (3.9479) grad_norm 1.6325 (1.1380) [2022-01-19 03:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][690/1251] eta 0:20:42 lr 0.000932 time 2.2105 (2.2147) loss 3.4860 (3.9494) grad_norm 1.1107 (1.1373) [2022-01-19 03:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][700/1251] eta 0:20:18 lr 0.000932 time 1.8490 (2.2123) loss 3.4691 (3.9480) grad_norm 1.0188 (1.1372) [2022-01-19 03:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][710/1251] eta 0:19:56 lr 0.000932 time 2.1510 (2.2110) loss 4.5731 (3.9499) grad_norm 1.0419 (1.1373) [2022-01-19 03:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][720/1251] eta 0:19:33 lr 0.000932 time 2.1672 (2.2100) loss 4.5988 (3.9530) grad_norm 1.3450 (1.1366) [2022-01-19 03:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][730/1251] eta 0:19:11 lr 0.000932 time 2.6483 (2.2104) loss 3.6804 (3.9503) grad_norm 1.0039 (1.1367) [2022-01-19 03:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][740/1251] eta 0:18:50 lr 0.000932 time 2.2198 (2.2115) loss 3.5801 (3.9486) grad_norm 0.9202 (1.1372) [2022-01-19 03:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][750/1251] eta 0:18:28 lr 0.000932 time 2.2709 (2.2123) loss 2.7312 (3.9460) grad_norm 1.2745 (1.1370) [2022-01-19 03:24:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][760/1251] eta 0:18:06 lr 0.000932 time 2.2287 (2.2136) loss 3.3503 (3.9500) grad_norm 0.9565 (1.1364) [2022-01-19 03:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][770/1251] eta 0:17:45 lr 0.000932 time 2.2564 (2.2145) loss 4.5616 (3.9502) grad_norm 1.4077 (1.1365) [2022-01-19 03:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][780/1251] eta 0:17:23 lr 0.000932 time 1.8741 (2.2148) loss 4.1346 (3.9521) grad_norm 0.9809 (1.1364) [2022-01-19 03:25:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][790/1251] eta 0:17:01 lr 0.000932 time 2.9504 (2.2155) loss 3.9725 (3.9566) grad_norm 0.9368 (1.1368) [2022-01-19 03:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][800/1251] eta 0:16:38 lr 0.000932 time 1.7219 (2.2140) loss 2.9994 (3.9564) grad_norm 0.9343 (1.1362) [2022-01-19 03:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][810/1251] eta 0:16:15 lr 0.000932 time 2.4332 (2.2109) loss 4.0675 (3.9547) grad_norm 1.1096 (1.1363) [2022-01-19 03:26:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][820/1251] eta 0:15:51 lr 0.000932 time 1.9355 (2.2084) loss 4.3276 (3.9579) grad_norm 1.0299 (1.1373) [2022-01-19 03:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][830/1251] eta 0:15:28 lr 0.000932 time 1.8931 (2.2060) loss 3.0961 (3.9578) grad_norm 1.0815 (1.1373) [2022-01-19 03:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][840/1251] eta 0:15:06 lr 0.000932 time 2.8086 (2.2065) loss 3.3664 (3.9576) grad_norm 0.9341 (1.1369) [2022-01-19 03:27:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][850/1251] eta 0:14:44 lr 0.000932 time 1.8769 (2.2056) loss 3.9374 (3.9581) grad_norm 1.1378 (1.1367) [2022-01-19 03:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][860/1251] eta 0:14:22 lr 0.000932 time 1.7917 (2.2061) loss 2.9212 (3.9562) grad_norm 1.1232 (1.1361) [2022-01-19 03:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][870/1251] eta 0:14:00 lr 0.000932 time 2.3065 (2.2064) loss 4.1153 (3.9560) grad_norm 1.0793 (1.1361) [2022-01-19 03:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][880/1251] eta 0:13:38 lr 0.000932 time 2.8213 (2.2069) loss 4.3132 (3.9584) grad_norm 0.8802 (1.1355) [2022-01-19 03:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][890/1251] eta 0:13:16 lr 0.000932 time 
2.3931 (2.2056) loss 4.3920 (3.9587) grad_norm 1.1010 (1.1348) [2022-01-19 03:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][900/1251] eta 0:12:53 lr 0.000932 time 1.2183 (2.2050) loss 4.7863 (3.9597) grad_norm 0.9624 (1.1337) [2022-01-19 03:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][910/1251] eta 0:12:32 lr 0.000932 time 2.4032 (2.2057) loss 4.0621 (3.9582) grad_norm 1.0249 (1.1331) [2022-01-19 03:29:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][920/1251] eta 0:12:10 lr 0.000932 time 2.7844 (2.2080) loss 3.4799 (3.9565) grad_norm 1.1593 (1.1326) [2022-01-19 03:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][930/1251] eta 0:11:49 lr 0.000932 time 1.8532 (2.2093) loss 3.6517 (3.9564) grad_norm 1.1348 (1.1323) [2022-01-19 03:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][940/1251] eta 0:11:27 lr 0.000932 time 2.1592 (2.2107) loss 4.1142 (3.9555) grad_norm 1.1081 (1.1323) [2022-01-19 03:31:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][950/1251] eta 0:11:05 lr 0.000932 time 1.9047 (2.2106) loss 4.3240 (3.9556) grad_norm 1.0830 (1.1325) [2022-01-19 03:31:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][960/1251] eta 0:10:42 lr 0.000932 time 1.5823 (2.2066) loss 4.6999 (3.9575) grad_norm 0.9778 (1.1327) [2022-01-19 03:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][970/1251] eta 0:10:19 lr 0.000932 time 1.8698 (2.2034) loss 4.3154 (3.9550) grad_norm 1.3698 (1.1331) [2022-01-19 03:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][980/1251] eta 0:09:56 lr 0.000932 time 1.6226 (2.2022) loss 3.9153 (3.9549) grad_norm 1.1041 (1.1330) [2022-01-19 03:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][990/1251] eta 0:09:34 lr 0.000932 time 1.6330 (2.2007) loss 4.0300 (3.9559) grad_norm 1.3886 (1.1334) [2022-01-19 03:32:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1000/1251] eta 0:09:12 lr 0.000932 time 2.4745 (2.2004) loss 3.1190 (3.9540) grad_norm 1.5433 (1.1335) [2022-01-19 03:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1010/1251] eta 0:08:50 lr 0.000932 time 2.2162 (2.2012) loss 3.6396 (3.9541) grad_norm 1.1669 (1.1339) [2022-01-19 03:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1020/1251] eta 0:08:29 lr 0.000932 time 2.7065 (2.2037) loss 4.4571 (3.9551) grad_norm 1.0614 (1.1337) [2022-01-19 03:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1030/1251] eta 0:08:07 lr 0.000932 time 2.2767 (2.2037) loss 3.9398 (3.9537) grad_norm 1.0409 (1.1328) [2022-01-19 03:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1040/1251] eta 0:07:45 lr 0.000932 time 2.5348 (2.2052) loss 3.7080 (3.9530) grad_norm 1.2134 (1.1329) [2022-01-19 03:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1050/1251] eta 0:07:23 lr 0.000931 time 2.5099 (2.2078) loss 4.0283 (3.9507) grad_norm 0.9539 (1.1334) [2022-01-19 03:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1060/1251] eta 0:07:01 lr 0.000931 time 2.1999 (2.2074) loss 4.1358 (3.9497) grad_norm 1.0271 (1.1321) [2022-01-19 03:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1070/1251] eta 0:06:39 lr 0.000931 time 1.8902 (2.2067) loss 3.4451 (3.9498) grad_norm 1.1429 (1.1308) [2022-01-19 03:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1080/1251] eta 0:06:16 lr 0.000931 time 2.1949 (2.2044) loss 4.7966 (3.9488) grad_norm 1.2537 (1.1306) [2022-01-19 03:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1090/1251] eta 0:05:54 lr 0.000931 time 1.8734 (2.2021) loss 4.2201 (3.9496) grad_norm 1.2354 (1.1311) [2022-01-19 03:36:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1100/1251] eta 0:05:32 lr 
0.000931 time 1.8302 (2.2003) loss 3.6024 (3.9482) grad_norm 1.0080 (1.1309) [2022-01-19 03:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1110/1251] eta 0:05:10 lr 0.000931 time 2.7513 (2.1996) loss 4.2419 (3.9480) grad_norm 1.3862 (1.1318) [2022-01-19 03:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1120/1251] eta 0:04:48 lr 0.000931 time 1.8964 (2.1993) loss 3.7611 (3.9516) grad_norm 0.9920 (1.1314) [2022-01-19 03:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1130/1251] eta 0:04:26 lr 0.000931 time 3.1361 (2.2010) loss 3.3953 (3.9515) grad_norm 1.1392 (1.1310) [2022-01-19 03:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1140/1251] eta 0:04:04 lr 0.000931 time 1.7432 (2.2018) loss 3.3638 (3.9511) grad_norm 0.9961 (1.1305) [2022-01-19 03:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1150/1251] eta 0:03:42 lr 0.000931 time 2.7486 (2.2039) loss 4.1675 (3.9509) grad_norm 1.1720 (1.1302) [2022-01-19 03:38:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1160/1251] eta 0:03:20 lr 0.000931 time 1.8900 (2.2028) loss 3.7132 (3.9497) grad_norm 1.0477 (1.1301) [2022-01-19 03:39:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1170/1251] eta 0:02:58 lr 0.000931 time 1.7829 (2.2036) loss 4.5630 (3.9500) grad_norm 1.0281 (1.1295) [2022-01-19 03:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1180/1251] eta 0:02:36 lr 0.000931 time 1.7398 (2.2050) loss 2.9543 (3.9500) grad_norm 1.3159 (1.1302) [2022-01-19 03:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1190/1251] eta 0:02:14 lr 0.000931 time 2.1253 (2.2064) loss 3.2482 (3.9502) grad_norm 1.3294 (1.1305) [2022-01-19 03:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1200/1251] eta 0:01:52 lr 0.000931 time 1.7971 (2.2063) loss 3.3037 (3.9501) grad_norm 0.9040 (1.1300) [2022-01-19 03:40:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1210/1251] eta 0:01:30 lr 0.000931 time 1.9315 (2.2055) loss 3.1600 (3.9503) grad_norm 1.1026 (1.1304) [2022-01-19 03:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1220/1251] eta 0:01:08 lr 0.000931 time 1.8643 (2.2036) loss 4.4099 (3.9508) grad_norm 0.9962 (1.1305) [2022-01-19 03:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1230/1251] eta 0:00:46 lr 0.000931 time 1.8884 (2.2025) loss 4.5699 (3.9501) grad_norm 1.0814 (1.1302) [2022-01-19 03:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1240/1251] eta 0:00:24 lr 0.000931 time 1.6157 (2.2013) loss 4.0416 (3.9517) grad_norm 1.2083 (1.1297) [2022-01-19 03:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [50/300][1250/1251] eta 0:00:02 lr 0.000931 time 1.2124 (2.1957) loss 4.3983 (3.9514) grad_norm 1.2764 (1.1293) [2022-01-19 03:41:52 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 50 training takes 0:45:47 [2022-01-19 03:41:52 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_50 saving...... [2022-01-19 03:42:03 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_50 saved !!! 
[2022-01-19 03:42:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.640 (15.640) Loss 1.2268 (1.2268) Acc@1 73.242 (73.242) Acc@5 91.016 (91.016) [2022-01-19 03:42:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.650 (2.825) Loss 1.2897 (1.2850) Acc@1 69.531 (70.366) Acc@5 88.965 (90.004) [2022-01-19 03:42:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.259 (2.403) Loss 1.2706 (1.2982) Acc@1 70.508 (70.117) Acc@5 90.820 (89.900) [2022-01-19 03:43:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.690 (2.144) Loss 1.3158 (1.3013) Acc@1 69.629 (70.001) Acc@5 90.039 (89.938) [2022-01-19 03:43:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 5.124 (2.000) Loss 1.2679 (1.2958) Acc@1 71.094 (70.181) Acc@5 89.258 (90.025) [2022-01-19 03:43:32 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 70.298 Acc@5 90.116 [2022-01-19 03:43:32 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 70.3% [2022-01-19 03:43:32 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 70.30% [2022-01-19 03:43:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][0/1251] eta 7:28:10 lr 0.000931 time 21.4954 (21.4954) loss 2.8308 (2.8308) grad_norm 1.2561 (1.2561) [2022-01-19 03:44:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][10/1251] eta 1:23:46 lr 0.000931 time 2.1425 (4.0506) loss 4.3521 (3.9533) grad_norm 1.0669 (1.0505) [2022-01-19 03:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][20/1251] eta 1:03:09 lr 0.000931 time 1.5297 (3.0787) loss 4.3390 (3.9560) grad_norm 1.3958 (1.1163) [2022-01-19 03:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][30/1251] eta 0:56:00 lr 0.000931 time 1.8840 (2.7519) loss 4.6354 (3.9856) grad_norm 1.2776 (1.1389) [2022-01-19 03:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[51/300][40/1251] eta 0:53:35 lr 0.000931 time 3.6514 (2.6549) loss 3.8377 (3.9405) grad_norm 0.8846 (1.1433) [2022-01-19 03:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][50/1251] eta 0:51:45 lr 0.000931 time 3.1641 (2.5857) loss 4.0360 (3.9826) grad_norm 1.1906 (1.1395) [2022-01-19 03:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][60/1251] eta 0:50:02 lr 0.000931 time 1.7933 (2.5210) loss 4.2507 (3.9366) grad_norm 1.1376 (1.1289) [2022-01-19 03:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][70/1251] eta 0:48:40 lr 0.000931 time 2.1874 (2.4727) loss 4.4099 (3.9850) grad_norm 1.2965 (1.1449) [2022-01-19 03:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][80/1251] eta 0:47:56 lr 0.000931 time 2.8731 (2.4561) loss 4.6149 (3.9981) grad_norm 0.9136 (1.1417) [2022-01-19 03:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][90/1251] eta 0:47:37 lr 0.000931 time 3.6580 (2.4617) loss 4.7129 (3.9931) grad_norm 1.0161 (1.1381) [2022-01-19 03:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][100/1251] eta 0:46:28 lr 0.000931 time 1.3883 (2.4226) loss 3.2624 (4.0037) grad_norm 1.1457 (1.1328) [2022-01-19 03:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][110/1251] eta 0:45:37 lr 0.000931 time 1.6065 (2.3990) loss 4.2684 (4.0042) grad_norm 1.0259 (1.1247) [2022-01-19 03:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][120/1251] eta 0:44:43 lr 0.000931 time 1.9368 (2.3723) loss 4.2546 (3.9990) grad_norm 1.5891 (1.1277) [2022-01-19 03:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][130/1251] eta 0:44:10 lr 0.000931 time 2.2175 (2.3646) loss 4.6297 (4.0006) grad_norm 1.0329 (1.1246) [2022-01-19 03:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][140/1251] eta 0:43:34 lr 0.000931 time 1.9097 (2.3536) loss 3.5359 (3.9732) grad_norm 1.1601 (1.1272) 
[2022-01-19 03:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][150/1251] eta 0:42:52 lr 0.000931 time 2.2186 (2.3367) loss 4.7429 (3.9613) grad_norm 1.1115 (1.1289) [2022-01-19 03:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][160/1251] eta 0:42:10 lr 0.000931 time 2.2345 (2.3191) loss 3.2240 (3.9713) grad_norm 0.9320 (1.1277) [2022-01-19 03:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][170/1251] eta 0:41:46 lr 0.000931 time 2.5304 (2.3183) loss 4.2996 (3.9690) grad_norm 1.2573 (1.1282) [2022-01-19 03:50:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][180/1251] eta 0:41:20 lr 0.000931 time 2.6930 (2.3163) loss 4.5215 (3.9677) grad_norm 1.1139 (1.1274) [2022-01-19 03:50:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][190/1251] eta 0:40:51 lr 0.000931 time 2.4454 (2.3109) loss 3.2784 (3.9655) grad_norm 1.1077 (1.1240) [2022-01-19 03:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][200/1251] eta 0:40:20 lr 0.000931 time 2.1631 (2.3033) loss 4.5365 (3.9653) grad_norm 1.2015 (1.1242) [2022-01-19 03:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][210/1251] eta 0:39:49 lr 0.000931 time 1.9348 (2.2955) loss 3.6929 (3.9696) grad_norm 1.3168 (1.1253) [2022-01-19 03:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][220/1251] eta 0:39:17 lr 0.000931 time 1.8653 (2.2863) loss 4.5422 (3.9716) grad_norm 1.0265 (1.1248) [2022-01-19 03:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][230/1251] eta 0:38:56 lr 0.000931 time 2.0411 (2.2885) loss 2.6745 (3.9535) grad_norm 1.2122 (1.1234) [2022-01-19 03:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][240/1251] eta 0:38:29 lr 0.000931 time 1.8462 (2.2845) loss 4.2681 (3.9440) grad_norm 1.2731 (1.1239) [2022-01-19 03:53:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][250/1251] eta 0:38:01 
lr 0.000931 time 1.7520 (2.2796) loss 2.5234 (3.9373) grad_norm 1.1669 (1.1233) [2022-01-19 03:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][260/1251] eta 0:37:31 lr 0.000931 time 2.2141 (2.2723) loss 4.6309 (3.9455) grad_norm 1.1398 (1.1244) [2022-01-19 03:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][270/1251] eta 0:37:01 lr 0.000930 time 2.3730 (2.2645) loss 4.1354 (3.9481) grad_norm 1.0347 (1.1221) [2022-01-19 03:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][280/1251] eta 0:36:31 lr 0.000930 time 2.2326 (2.2570) loss 4.3274 (3.9449) grad_norm 1.0191 (1.1215) [2022-01-19 03:54:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][290/1251] eta 0:36:10 lr 0.000930 time 2.2744 (2.2586) loss 3.0902 (3.9387) grad_norm 1.1596 (1.1227) [2022-01-19 03:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][300/1251] eta 0:35:50 lr 0.000930 time 2.4153 (2.2614) loss 4.1066 (3.9438) grad_norm 1.0247 (1.1241) [2022-01-19 03:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][310/1251] eta 0:35:26 lr 0.000930 time 2.5886 (2.2603) loss 3.5944 (3.9405) grad_norm 1.1542 (1.1228) [2022-01-19 03:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][320/1251] eta 0:34:57 lr 0.000930 time 1.6266 (2.2527) loss 4.9450 (3.9353) grad_norm 1.1849 (1.1223) [2022-01-19 03:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][330/1251] eta 0:34:28 lr 0.000930 time 2.2079 (2.2455) loss 3.8362 (3.9290) grad_norm 1.1975 (1.1210) [2022-01-19 03:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][340/1251] eta 0:34:01 lr 0.000930 time 1.8162 (2.2412) loss 4.6918 (3.9281) grad_norm 0.9608 (1.1211) [2022-01-19 03:56:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][350/1251] eta 0:33:39 lr 0.000930 time 1.9161 (2.2410) loss 3.4255 (3.9290) grad_norm 1.0959 (1.1203) [2022-01-19 03:57:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][360/1251] eta 0:33:15 lr 0.000930 time 1.8327 (2.2395) loss 3.2173 (3.9194) grad_norm 1.1072 (1.1223) [2022-01-19 03:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][370/1251] eta 0:32:53 lr 0.000930 time 2.4229 (2.2401) loss 4.2456 (3.9230) grad_norm 1.3866 (1.1243) [2022-01-19 03:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][380/1251] eta 0:32:31 lr 0.000930 time 2.1421 (2.2407) loss 4.6145 (3.9316) grad_norm 0.9662 (1.1265) [2022-01-19 03:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][390/1251] eta 0:32:09 lr 0.000930 time 1.8613 (2.2406) loss 4.4618 (3.9244) grad_norm 1.1444 (1.1284) [2022-01-19 03:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][400/1251] eta 0:31:44 lr 0.000930 time 1.6912 (2.2382) loss 3.7520 (3.9200) grad_norm 1.1303 (1.1302) [2022-01-19 03:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][410/1251] eta 0:31:19 lr 0.000930 time 2.1383 (2.2350) loss 3.5363 (3.9177) grad_norm 1.2802 (1.1327) [2022-01-19 03:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][420/1251] eta 0:30:57 lr 0.000930 time 2.1710 (2.2349) loss 3.8589 (3.9200) grad_norm 1.0927 (1.1328) [2022-01-19 03:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][430/1251] eta 0:30:33 lr 0.000930 time 1.8228 (2.2336) loss 4.1653 (3.9202) grad_norm 1.3063 (1.1339) [2022-01-19 03:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][440/1251] eta 0:30:11 lr 0.000930 time 3.0159 (2.2333) loss 4.4324 (3.9188) grad_norm 1.2067 (1.1349) [2022-01-19 04:00:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][450/1251] eta 0:29:48 lr 0.000930 time 1.8616 (2.2327) loss 4.2837 (3.9217) grad_norm 1.1221 (1.1333) [2022-01-19 04:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][460/1251] eta 0:29:23 lr 0.000930 time 
1.5636 (2.2294) loss 2.6335 (3.9170) grad_norm 1.4541 (1.1332) [2022-01-19 04:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][470/1251] eta 0:29:00 lr 0.000930 time 2.8874 (2.2290) loss 3.7538 (3.9155) grad_norm 1.1098 (1.1325) [2022-01-19 04:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][480/1251] eta 0:28:35 lr 0.000930 time 1.7911 (2.2251) loss 2.4978 (3.9151) grad_norm 1.0194 (1.1319) [2022-01-19 04:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][490/1251] eta 0:28:13 lr 0.000930 time 1.9711 (2.2251) loss 3.0558 (3.9153) grad_norm 1.2541 (1.1323) [2022-01-19 04:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][500/1251] eta 0:27:52 lr 0.000930 time 1.8237 (2.2276) loss 4.5790 (3.9194) grad_norm 0.9326 (1.1315) [2022-01-19 04:02:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][510/1251] eta 0:27:30 lr 0.000930 time 2.1468 (2.2275) loss 4.1603 (3.9209) grad_norm 1.1418 (1.1309) [2022-01-19 04:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][520/1251] eta 0:27:08 lr 0.000930 time 2.1927 (2.2274) loss 4.2452 (3.9191) grad_norm 1.0041 (1.1308) [2022-01-19 04:03:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][530/1251] eta 0:26:45 lr 0.000930 time 2.2222 (2.2263) loss 4.7377 (3.9233) grad_norm 1.3241 (1.1322) [2022-01-19 04:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][540/1251] eta 0:26:21 lr 0.000930 time 2.3181 (2.2250) loss 4.2196 (3.9258) grad_norm 0.8933 (1.1319) [2022-01-19 04:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][550/1251] eta 0:25:57 lr 0.000930 time 1.6771 (2.2211) loss 2.9993 (3.9270) grad_norm 1.0181 (1.1308) [2022-01-19 04:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][560/1251] eta 0:25:34 lr 0.000930 time 2.2651 (2.2209) loss 2.9322 (3.9261) grad_norm 1.2031 (1.1316) [2022-01-19 04:04:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][570/1251] eta 0:25:12 lr 0.000930 time 1.8449 (2.2204) loss 3.7327 (3.9251) grad_norm 1.4238 (1.1344) [2022-01-19 04:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][580/1251] eta 0:24:52 lr 0.000930 time 2.8415 (2.2239) loss 3.6414 (3.9257) grad_norm 1.0215 (1.1347) [2022-01-19 04:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][590/1251] eta 0:24:28 lr 0.000930 time 1.8267 (2.2220) loss 3.1409 (3.9266) grad_norm 1.3287 (1.1351) [2022-01-19 04:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][600/1251] eta 0:24:05 lr 0.000930 time 1.6613 (2.2207) loss 3.8170 (3.9248) grad_norm 0.9154 (1.1359) [2022-01-19 04:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][610/1251] eta 0:23:40 lr 0.000930 time 1.8794 (2.2162) loss 3.9047 (3.9280) grad_norm 1.3311 (1.1370) [2022-01-19 04:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][620/1251] eta 0:23:16 lr 0.000930 time 2.1769 (2.2133) loss 3.9994 (3.9309) grad_norm 1.2205 (1.1371) [2022-01-19 04:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][630/1251] eta 0:22:55 lr 0.000930 time 2.6663 (2.2142) loss 3.0805 (3.9255) grad_norm 1.1265 (1.1365) [2022-01-19 04:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][640/1251] eta 0:22:32 lr 0.000930 time 2.2023 (2.2130) loss 3.8114 (3.9248) grad_norm 1.2368 (1.1357) [2022-01-19 04:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][650/1251] eta 0:22:08 lr 0.000930 time 1.8638 (2.2110) loss 3.7148 (3.9260) grad_norm 1.0074 (1.1367) [2022-01-19 04:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][660/1251] eta 0:21:45 lr 0.000930 time 2.0675 (2.2094) loss 4.2133 (3.9272) grad_norm 1.2434 (1.1372) [2022-01-19 04:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][670/1251] eta 0:21:23 lr 0.000930 time 
2.5397 (2.2093) loss 4.0044 (3.9280) grad_norm 0.9476 (1.1369) [2022-01-19 04:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][680/1251] eta 0:21:02 lr 0.000930 time 2.3695 (2.2110) loss 4.8582 (3.9295) grad_norm 1.2607 (1.1383) [2022-01-19 04:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][690/1251] eta 0:20:41 lr 0.000930 time 2.2710 (2.2125) loss 3.3596 (3.9281) grad_norm 1.0962 (1.1380) [2022-01-19 04:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][700/1251] eta 0:20:19 lr 0.000930 time 2.3923 (2.2139) loss 3.2306 (3.9270) grad_norm 1.1097 (1.1388) [2022-01-19 04:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][710/1251] eta 0:19:57 lr 0.000930 time 1.9363 (2.2135) loss 3.4118 (3.9242) grad_norm 1.1788 (1.1392) [2022-01-19 04:10:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][720/1251] eta 0:19:35 lr 0.000930 time 2.9714 (2.2133) loss 3.8980 (3.9252) grad_norm 1.0660 (1.1390) [2022-01-19 04:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][730/1251] eta 0:19:11 lr 0.000930 time 2.0535 (2.2106) loss 4.0817 (3.9212) grad_norm 1.0332 (1.1375) [2022-01-19 04:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][740/1251] eta 0:18:47 lr 0.000929 time 1.8563 (2.2069) loss 4.7474 (3.9210) grad_norm 1.1983 (1.1371) [2022-01-19 04:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][750/1251] eta 0:18:24 lr 0.000929 time 2.1800 (2.2055) loss 4.3072 (3.9208) grad_norm 1.1268 (1.1384) [2022-01-19 04:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][760/1251] eta 0:18:03 lr 0.000929 time 2.9031 (2.2069) loss 2.7976 (3.9209) grad_norm 1.2286 (1.1376) [2022-01-19 04:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][770/1251] eta 0:17:41 lr 0.000929 time 2.2856 (2.2074) loss 4.2771 (3.9195) grad_norm 1.0794 (1.1373) [2022-01-19 04:12:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][780/1251] eta 0:17:19 lr 0.000929 time 1.6749 (2.2080) loss 3.1680 (3.9242) grad_norm 1.0118 (1.1373) [2022-01-19 04:12:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][790/1251] eta 0:16:57 lr 0.000929 time 2.2368 (2.2068) loss 3.9577 (3.9215) grad_norm 1.0064 (1.1380) [2022-01-19 04:13:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][800/1251] eta 0:16:35 lr 0.000929 time 2.1551 (2.2064) loss 3.4778 (3.9246) grad_norm 1.0527 (1.1383) [2022-01-19 04:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][810/1251] eta 0:16:12 lr 0.000929 time 2.2119 (2.2051) loss 4.1061 (3.9293) grad_norm 0.9745 (1.1374) [2022-01-19 04:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][820/1251] eta 0:15:49 lr 0.000929 time 2.1734 (2.2041) loss 3.3080 (3.9298) grad_norm 1.1103 (1.1374) [2022-01-19 04:14:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][830/1251] eta 0:15:27 lr 0.000929 time 2.6489 (2.2027) loss 3.3773 (3.9296) grad_norm 1.0355 (1.1369) [2022-01-19 04:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][840/1251] eta 0:15:05 lr 0.000929 time 1.8061 (2.2021) loss 4.0350 (3.9298) grad_norm 1.2752 (1.1372) [2022-01-19 04:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][850/1251] eta 0:14:42 lr 0.000929 time 1.6390 (2.2002) loss 3.8657 (3.9280) grad_norm 1.3014 (1.1375) [2022-01-19 04:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][860/1251] eta 0:14:20 lr 0.000929 time 1.9708 (2.2010) loss 3.0376 (3.9268) grad_norm 1.1512 (1.1395) [2022-01-19 04:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][870/1251] eta 0:13:59 lr 0.000929 time 2.6922 (2.2029) loss 4.4451 (3.9238) grad_norm 1.1244 (1.1387) [2022-01-19 04:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][880/1251] eta 0:13:38 lr 0.000929 time 
2.0884 (2.2052) loss 4.4117 (3.9244) grad_norm 1.1645 (1.1398) [2022-01-19 04:16:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][890/1251] eta 0:13:16 lr 0.000929 time 1.9047 (2.2057) loss 4.1359 (3.9257) grad_norm 1.0398 (1.1405) [2022-01-19 04:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][900/1251] eta 0:12:54 lr 0.000929 time 3.3305 (2.2078) loss 3.9372 (3.9251) grad_norm 0.9583 (1.1413) [2022-01-19 04:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][910/1251] eta 0:12:32 lr 0.000929 time 1.8244 (2.2079) loss 4.7227 (3.9257) grad_norm 0.9525 (1.1408) [2022-01-19 04:17:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][920/1251] eta 0:12:09 lr 0.000929 time 1.8797 (2.2053) loss 2.8242 (3.9216) grad_norm 0.9841 (1.1406) [2022-01-19 04:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][930/1251] eta 0:11:47 lr 0.000929 time 1.9202 (2.2040) loss 3.8310 (3.9215) grad_norm 0.9594 (1.1402) [2022-01-19 04:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][940/1251] eta 0:11:25 lr 0.000929 time 1.8685 (2.2032) loss 4.5578 (3.9229) grad_norm 1.0035 (1.1396) [2022-01-19 04:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][950/1251] eta 0:11:03 lr 0.000929 time 2.2358 (2.2034) loss 3.0077 (3.9240) grad_norm 1.0968 (1.1388) [2022-01-19 04:18:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][960/1251] eta 0:10:40 lr 0.000929 time 1.8317 (2.2022) loss 3.8151 (3.9221) grad_norm 0.9767 (1.1381) [2022-01-19 04:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][970/1251] eta 0:10:18 lr 0.000929 time 1.8019 (2.2009) loss 4.7824 (3.9218) grad_norm 1.0383 (1.1384) [2022-01-19 04:19:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][980/1251] eta 0:09:56 lr 0.000929 time 1.5003 (2.2008) loss 3.2480 (3.9196) grad_norm 1.3620 (1.1390) [2022-01-19 04:19:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][990/1251] eta 0:09:34 lr 0.000929 time 2.0018 (2.2006) loss 4.4208 (3.9214) grad_norm 1.3171 (1.1394) [2022-01-19 04:20:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1000/1251] eta 0:09:12 lr 0.000929 time 2.5016 (2.2011) loss 3.2986 (3.9175) grad_norm 1.3077 (1.1396) [2022-01-19 04:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1010/1251] eta 0:08:50 lr 0.000929 time 2.3160 (2.2009) loss 2.9040 (3.9166) grad_norm 1.3401 (1.1404) [2022-01-19 04:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1020/1251] eta 0:08:28 lr 0.000929 time 2.2267 (2.2020) loss 4.4253 (3.9167) grad_norm 0.9897 (1.1404) [2022-01-19 04:21:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1030/1251] eta 0:08:06 lr 0.000929 time 1.5227 (2.2007) loss 3.3626 (3.9144) grad_norm 1.1027 (1.1400) [2022-01-19 04:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1040/1251] eta 0:07:44 lr 0.000929 time 1.9500 (2.2002) loss 3.2173 (3.9131) grad_norm 0.9973 (1.1390) [2022-01-19 04:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1050/1251] eta 0:07:22 lr 0.000929 time 2.5902 (2.2005) loss 5.0145 (3.9143) grad_norm 1.4670 (1.1388) [2022-01-19 04:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1060/1251] eta 0:07:00 lr 0.000929 time 1.6648 (2.2009) loss 3.7900 (3.9159) grad_norm 0.9906 (1.1390) [2022-01-19 04:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1070/1251] eta 0:06:38 lr 0.000929 time 1.8583 (2.2003) loss 3.5605 (3.9173) grad_norm 1.0249 (1.1380) [2022-01-19 04:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1080/1251] eta 0:06:16 lr 0.000929 time 2.2382 (2.1990) loss 4.7444 (3.9182) grad_norm 1.0664 (1.1371) [2022-01-19 04:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1090/1251] eta 0:05:53 lr 0.000929 
time 2.5689 (2.1979) loss 3.6182 (3.9197) grad_norm 1.3508 (1.1376) [2022-01-19 04:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1100/1251] eta 0:05:32 lr 0.000929 time 1.8475 (2.2003) loss 4.3688 (3.9212) grad_norm 1.1464 (1.1377) [2022-01-19 04:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1110/1251] eta 0:05:10 lr 0.000929 time 1.8300 (2.2015) loss 4.3912 (3.9210) grad_norm 1.2213 (1.1377) [2022-01-19 04:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1120/1251] eta 0:04:48 lr 0.000929 time 1.7392 (2.2024) loss 3.5360 (3.9207) grad_norm 1.1685 (1.1370) [2022-01-19 04:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1130/1251] eta 0:04:26 lr 0.000929 time 1.9075 (2.2013) loss 4.4209 (3.9237) grad_norm 1.3532 (1.1366) [2022-01-19 04:25:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1140/1251] eta 0:04:04 lr 0.000929 time 1.8777 (2.2019) loss 3.4991 (3.9215) grad_norm 1.2036 (1.1364) [2022-01-19 04:25:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1150/1251] eta 0:03:42 lr 0.000929 time 2.0535 (2.2011) loss 3.9862 (3.9225) grad_norm 1.1318 (1.1364) [2022-01-19 04:26:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1160/1251] eta 0:03:20 lr 0.000929 time 1.9168 (2.2000) loss 3.3055 (3.9209) grad_norm 1.2344 (1.1367) [2022-01-19 04:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1170/1251] eta 0:02:58 lr 0.000929 time 1.5857 (2.1984) loss 2.8221 (3.9200) grad_norm 1.0961 (1.1365) [2022-01-19 04:26:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1180/1251] eta 0:02:36 lr 0.000929 time 2.2006 (2.1980) loss 4.3238 (3.9235) grad_norm 1.0412 (1.1365) [2022-01-19 04:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1190/1251] eta 0:02:13 lr 0.000929 time 1.7984 (2.1965) loss 4.4187 (3.9234) grad_norm 1.7038 (1.1382) [2022-01-19 04:27:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1200/1251] eta 0:01:52 lr 0.000929 time 2.1195 (2.1977) loss 3.4211 (3.9231) grad_norm 1.0901 (1.1383) [2022-01-19 04:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1210/1251] eta 0:01:30 lr 0.000928 time 2.1568 (2.1979) loss 3.3395 (3.9224) grad_norm 1.0582 (1.1383) [2022-01-19 04:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1220/1251] eta 0:01:08 lr 0.000928 time 2.2906 (2.1991) loss 4.0467 (3.9230) grad_norm 0.9077 (1.1377) [2022-01-19 04:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1230/1251] eta 0:00:46 lr 0.000928 time 2.0626 (2.1986) loss 2.8530 (3.9233) grad_norm 1.2764 (1.1369) [2022-01-19 04:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1240/1251] eta 0:00:24 lr 0.000928 time 1.4395 (2.1974) loss 4.2065 (3.9217) grad_norm 0.9936 (1.1371) [2022-01-19 04:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [51/300][1250/1251] eta 0:00:02 lr 0.000928 time 1.3575 (2.1916) loss 4.1691 (3.9224) grad_norm 0.9184 (1.1367) [2022-01-19 04:29:14 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 51 training takes 0:45:42 [2022-01-19 04:29:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.267 (18.267) Loss 1.2371 (1.2371) Acc@1 70.996 (70.996) Acc@5 90.625 (90.625) [2022-01-19 04:29:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.595 (3.262) Loss 1.2639 (1.2594) Acc@1 69.824 (70.490) Acc@5 90.137 (90.510) [2022-01-19 04:30:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.636 (2.506) Loss 1.2844 (1.2689) Acc@1 69.141 (70.373) Acc@5 90.234 (90.369) [2022-01-19 04:30:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.586 (2.316) Loss 1.2942 (1.2806) Acc@1 70.312 (70.190) Acc@5 89.648 (90.086) [2022-01-19 04:30:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.371 
(2.199) Loss 1.2704 (1.2788) Acc@1 71.191 (70.415) Acc@5 90.039 (90.130) [2022-01-19 04:30:52 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 70.382 Acc@5 90.078 [2022-01-19 04:30:52 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 70.4% [2022-01-19 04:30:52 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 70.38% [2022-01-19 04:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][0/1251] eta 7:26:38 lr 0.000928 time 21.4216 (21.4216) loss 3.8493 (3.8493) grad_norm 1.2493 (1.2493) [2022-01-19 04:31:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][10/1251] eta 1:24:28 lr 0.000928 time 2.9692 (4.0845) loss 4.2144 (3.9480) grad_norm 1.1202 (1.1552) [2022-01-19 04:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][20/1251] eta 1:05:11 lr 0.000928 time 1.8698 (3.1773) loss 4.1349 (3.9207) grad_norm 1.1450 (1.1122) [2022-01-19 04:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][30/1251] eta 0:57:04 lr 0.000928 time 1.6379 (2.8044) loss 4.0807 (3.8870) grad_norm 1.0059 (1.1432) [2022-01-19 04:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][40/1251] eta 0:53:34 lr 0.000928 time 3.2609 (2.6546) loss 4.0577 (3.9654) grad_norm 0.9708 (1.1295) [2022-01-19 04:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][50/1251] eta 0:52:05 lr 0.000928 time 2.5246 (2.6026) loss 4.4900 (3.9218) grad_norm 1.1609 (1.1379) [2022-01-19 04:33:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][60/1251] eta 0:50:48 lr 0.000928 time 2.7371 (2.5595) loss 4.1859 (3.9671) grad_norm 1.0545 (1.1219) [2022-01-19 04:33:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][70/1251] eta 0:49:27 lr 0.000928 time 1.8809 (2.5128) loss 3.0062 (3.9524) grad_norm 1.0676 (1.1282) [2022-01-19 04:34:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][80/1251] eta 
0:48:16 lr 0.000928 time 1.9793 (2.4735) loss 3.5318 (3.9237) grad_norm 1.0925 (1.1295) [2022-01-19 04:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][90/1251] eta 0:47:24 lr 0.000928 time 2.3082 (2.4503) loss 3.5785 (3.9173) grad_norm 1.0396 (1.1287) [2022-01-19 04:34:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][100/1251] eta 0:46:19 lr 0.000928 time 1.8957 (2.4145) loss 4.7327 (3.9003) grad_norm 1.0209 (1.1234) [2022-01-19 04:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][110/1251] eta 0:45:27 lr 0.000928 time 2.1243 (2.3903) loss 4.3810 (3.9083) grad_norm 1.2051 (1.1214) [2022-01-19 04:35:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][120/1251] eta 0:44:58 lr 0.000928 time 2.1522 (2.3861) loss 4.4632 (3.9187) grad_norm 1.1938 (1.1229) [2022-01-19 04:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][130/1251] eta 0:44:43 lr 0.000928 time 3.3612 (2.3937) loss 4.7831 (3.9407) grad_norm 1.3963 (1.1278) [2022-01-19 04:36:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][140/1251] eta 0:43:51 lr 0.000928 time 1.9069 (2.3690) loss 4.2129 (3.9646) grad_norm 1.3129 (1.1275) [2022-01-19 04:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][150/1251] eta 0:43:04 lr 0.000928 time 1.5491 (2.3470) loss 4.2937 (3.9770) grad_norm 0.9453 (1.1300) [2022-01-19 04:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][160/1251] eta 0:42:22 lr 0.000928 time 1.8038 (2.3307) loss 4.4391 (3.9899) grad_norm 1.0502 (1.1260) [2022-01-19 04:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][170/1251] eta 0:42:04 lr 0.000928 time 3.6426 (2.3356) loss 4.3036 (3.9703) grad_norm 1.1934 (1.1253) [2022-01-19 04:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][180/1251] eta 0:41:30 lr 0.000928 time 2.1197 (2.3254) loss 4.5298 (3.9712) grad_norm 1.0577 (1.1286) [2022-01-19 04:38:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][190/1251] eta 0:40:53 lr 0.000928 time 1.8990 (2.3125) loss 4.5964 (3.9633) grad_norm 1.0353 (1.1272) [2022-01-19 04:38:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][200/1251] eta 0:40:16 lr 0.000928 time 1.7806 (2.2990) loss 3.7285 (3.9563) grad_norm 1.0830 (1.1293) [2022-01-19 04:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][210/1251] eta 0:39:50 lr 0.000928 time 3.5532 (2.2960) loss 3.3667 (3.9544) grad_norm 1.0142 (1.1289) [2022-01-19 04:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][220/1251] eta 0:39:24 lr 0.000928 time 1.9373 (2.2930) loss 3.6625 (3.9646) grad_norm 1.0922 (1.1299) [2022-01-19 04:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][230/1251] eta 0:38:57 lr 0.000928 time 2.2136 (2.2899) loss 4.7616 (3.9692) grad_norm 1.1453 (1.1269) [2022-01-19 04:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][240/1251] eta 0:38:29 lr 0.000928 time 1.9444 (2.2839) loss 4.3178 (3.9620) grad_norm 0.9925 (1.1269) [2022-01-19 04:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][250/1251] eta 0:38:03 lr 0.000928 time 2.9957 (2.2812) loss 4.1625 (3.9449) grad_norm 1.2591 (1.1284) [2022-01-19 04:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][260/1251] eta 0:37:34 lr 0.000928 time 1.7388 (2.2745) loss 4.1571 (3.9424) grad_norm 1.1851 (1.1279) [2022-01-19 04:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][270/1251] eta 0:37:10 lr 0.000928 time 1.8826 (2.2734) loss 3.0307 (3.9427) grad_norm 1.1588 (1.1291) [2022-01-19 04:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][280/1251] eta 0:36:44 lr 0.000928 time 2.5488 (2.2707) loss 3.9697 (3.9412) grad_norm 1.2011 (1.1354) [2022-01-19 04:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][290/1251] eta 0:36:26 lr 0.000928 time 
3.5184 (2.2752) loss 4.4311 (3.9262) grad_norm 1.3019 (1.1384) [2022-01-19 04:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][300/1251] eta 0:35:55 lr 0.000928 time 1.7956 (2.2665) loss 4.1841 (3.9251) grad_norm 1.1504 (1.1382) [2022-01-19 04:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][310/1251] eta 0:35:24 lr 0.000928 time 1.9439 (2.2579) loss 4.7284 (3.9218) grad_norm 0.9876 (1.1361) [2022-01-19 04:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][320/1251] eta 0:34:57 lr 0.000928 time 1.5485 (2.2530) loss 4.3269 (3.9163) grad_norm 0.9191 (1.1349) [2022-01-19 04:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][330/1251] eta 0:34:41 lr 0.000928 time 5.1576 (2.2605) loss 3.8266 (3.9185) grad_norm 0.8941 (1.1328) [2022-01-19 04:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][340/1251] eta 0:34:16 lr 0.000928 time 2.5149 (2.2579) loss 4.5178 (3.9184) grad_norm 0.9780 (1.1319) [2022-01-19 04:44:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][350/1251] eta 0:33:53 lr 0.000928 time 2.2039 (2.2572) loss 4.5653 (3.9209) grad_norm 1.4347 (1.1373) [2022-01-19 04:44:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][360/1251] eta 0:33:30 lr 0.000928 time 1.8700 (2.2562) loss 3.8792 (3.9199) grad_norm 0.9516 (1.1396) [2022-01-19 04:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][370/1251] eta 0:33:04 lr 0.000928 time 2.8331 (2.2531) loss 3.6914 (3.9278) grad_norm 1.1846 (1.1384) [2022-01-19 04:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][380/1251] eta 0:32:36 lr 0.000928 time 1.8893 (2.2464) loss 3.3531 (3.9341) grad_norm 1.1717 (1.1371) [2022-01-19 04:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][390/1251] eta 0:32:11 lr 0.000928 time 2.1554 (2.2428) loss 3.4325 (3.9405) grad_norm 1.2691 (1.1359) [2022-01-19 04:45:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][400/1251] eta 0:31:46 lr 0.000928 time 2.4582 (2.2400) loss 4.3330 (3.9448) grad_norm 1.3437 (1.1372) [2022-01-19 04:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][410/1251] eta 0:31:22 lr 0.000928 time 2.5042 (2.2380) loss 3.9779 (3.9481) grad_norm 1.2079 (1.1360) [2022-01-19 04:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][420/1251] eta 0:30:59 lr 0.000928 time 2.1890 (2.2376) loss 2.5983 (3.9438) grad_norm 1.1968 (1.1368) [2022-01-19 04:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][430/1251] eta 0:30:38 lr 0.000927 time 2.7820 (2.2388) loss 4.2823 (3.9455) grad_norm 1.1259 (1.1390) [2022-01-19 04:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][440/1251] eta 0:30:14 lr 0.000927 time 1.5329 (2.2374) loss 4.1287 (3.9485) grad_norm 1.1723 (1.1392) [2022-01-19 04:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][450/1251] eta 0:29:51 lr 0.000927 time 2.5515 (2.2366) loss 4.5094 (3.9512) grad_norm 1.1685 (1.1387) [2022-01-19 04:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][460/1251] eta 0:29:27 lr 0.000927 time 2.4985 (2.2346) loss 2.9619 (3.9510) grad_norm 1.1055 (1.1378) [2022-01-19 04:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][470/1251] eta 0:29:04 lr 0.000927 time 3.2489 (2.2343) loss 4.0426 (3.9517) grad_norm 1.1866 (1.1386) [2022-01-19 04:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][480/1251] eta 0:28:41 lr 0.000927 time 1.9357 (2.2322) loss 4.7021 (3.9568) grad_norm 1.4398 (1.1392) [2022-01-19 04:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][490/1251] eta 0:28:17 lr 0.000927 time 2.1123 (2.2311) loss 3.0279 (3.9582) grad_norm 1.0785 (1.1400) [2022-01-19 04:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][500/1251] eta 0:27:56 lr 0.000927 time 
2.4445 (2.2328) loss 4.6122 (3.9598) grad_norm 1.0911 (1.1403) [2022-01-19 04:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][510/1251] eta 0:27:33 lr 0.000927 time 2.1133 (2.2313) loss 3.3457 (3.9591) grad_norm 1.1748 (1.1392) [2022-01-19 04:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][520/1251] eta 0:27:10 lr 0.000927 time 1.9001 (2.2308) loss 4.4254 (3.9637) grad_norm 1.0679 (1.1391) [2022-01-19 04:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][530/1251] eta 0:26:48 lr 0.000927 time 2.4703 (2.2304) loss 3.8365 (3.9608) grad_norm 0.9485 (1.1384) [2022-01-19 04:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][540/1251] eta 0:26:24 lr 0.000927 time 2.1712 (2.2290) loss 3.2274 (3.9572) grad_norm 1.1255 (1.1389) [2022-01-19 04:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][550/1251] eta 0:26:00 lr 0.000927 time 2.1707 (2.2265) loss 4.0709 (3.9530) grad_norm 1.4239 (1.1391) [2022-01-19 04:51:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][560/1251] eta 0:25:36 lr 0.000927 time 2.5659 (2.2241) loss 3.6941 (3.9561) grad_norm 1.2909 (1.1413) [2022-01-19 04:52:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][570/1251] eta 0:25:14 lr 0.000927 time 1.7476 (2.2234) loss 3.5448 (3.9524) grad_norm 1.3563 (1.1416) [2022-01-19 04:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][580/1251] eta 0:24:52 lr 0.000927 time 1.6564 (2.2237) loss 4.1471 (3.9519) grad_norm 1.1090 (1.1405) [2022-01-19 04:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][590/1251] eta 0:24:29 lr 0.000927 time 2.7849 (2.2238) loss 4.3472 (3.9516) grad_norm 0.9968 (1.1391) [2022-01-19 04:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][600/1251] eta 0:24:06 lr 0.000927 time 2.0427 (2.2220) loss 3.6998 (3.9476) grad_norm 0.9864 (1.1403) [2022-01-19 04:53:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][610/1251] eta 0:23:42 lr 0.000927 time 1.8823 (2.2193) loss 4.3128 (3.9448) grad_norm 1.0372 (1.1402) [2022-01-19 04:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][620/1251] eta 0:23:19 lr 0.000927 time 1.8572 (2.2172) loss 4.2541 (3.9480) grad_norm 1.2004 (1.1396) [2022-01-19 04:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][630/1251] eta 0:22:55 lr 0.000927 time 2.2684 (2.2155) loss 3.2676 (3.9470) grad_norm 1.2183 (1.1394) [2022-01-19 04:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][640/1251] eta 0:22:32 lr 0.000927 time 2.2850 (2.2144) loss 2.9614 (3.9432) grad_norm 1.0643 (1.1397) [2022-01-19 04:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][650/1251] eta 0:22:09 lr 0.000927 time 2.2063 (2.2127) loss 4.3984 (3.9435) grad_norm 1.0571 (1.1383) [2022-01-19 04:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][660/1251] eta 0:21:48 lr 0.000927 time 1.8782 (2.2146) loss 3.8246 (3.9413) grad_norm 1.0637 (1.1368) [2022-01-19 04:55:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][670/1251] eta 0:21:25 lr 0.000927 time 1.7783 (2.2119) loss 4.4260 (3.9427) grad_norm 1.1306 (1.1366) [2022-01-19 04:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][680/1251] eta 0:21:03 lr 0.000927 time 2.1086 (2.2130) loss 3.3365 (3.9385) grad_norm 1.1349 (1.1379) [2022-01-19 04:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][690/1251] eta 0:20:42 lr 0.000927 time 1.2270 (2.2143) loss 3.9801 (3.9368) grad_norm 1.1518 (1.1374) [2022-01-19 04:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][700/1251] eta 0:20:21 lr 0.000927 time 2.2009 (2.2177) loss 4.0356 (3.9368) grad_norm 1.2627 (1.1367) [2022-01-19 04:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][710/1251] eta 0:19:59 lr 0.000927 time 
2.4132 (2.2178) loss 2.8621 (3.9392) grad_norm 1.0925 (1.1360) [2022-01-19 04:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][720/1251] eta 0:19:36 lr 0.000927 time 2.1451 (2.2160) loss 4.3389 (3.9416) grad_norm 1.2309 (1.1365) [2022-01-19 04:57:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][730/1251] eta 0:19:13 lr 0.000927 time 2.1827 (2.2140) loss 4.1198 (3.9362) grad_norm 1.1654 (1.1376) [2022-01-19 04:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][740/1251] eta 0:18:51 lr 0.000927 time 2.5372 (2.2142) loss 4.1548 (3.9399) grad_norm 1.2318 (1.1397) [2022-01-19 04:58:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][750/1251] eta 0:18:28 lr 0.000927 time 1.9897 (2.2131) loss 2.8962 (3.9398) grad_norm 1.1181 (1.1394) [2022-01-19 04:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][760/1251] eta 0:18:06 lr 0.000927 time 1.9277 (2.2127) loss 2.7659 (3.9385) grad_norm 1.2811 (1.1410) [2022-01-19 04:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][770/1251] eta 0:17:43 lr 0.000927 time 2.2299 (2.2114) loss 3.5433 (3.9340) grad_norm 1.4028 (1.1427) [2022-01-19 04:59:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][780/1251] eta 0:17:21 lr 0.000927 time 2.1108 (2.2106) loss 3.4205 (3.9293) grad_norm 1.0216 (1.1428) [2022-01-19 05:00:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][790/1251] eta 0:16:58 lr 0.000927 time 1.5683 (2.2092) loss 4.8174 (3.9291) grad_norm 1.0480 (1.1433) [2022-01-19 05:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][800/1251] eta 0:16:36 lr 0.000927 time 2.1830 (2.2105) loss 4.3590 (3.9305) grad_norm 1.0165 (1.1427) [2022-01-19 05:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][810/1251] eta 0:16:13 lr 0.000927 time 1.8936 (2.2084) loss 3.2552 (3.9307) grad_norm 1.1928 (1.1425) [2022-01-19 05:01:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][820/1251] eta 0:15:51 lr 0.000927 time 2.0366 (2.2066) loss 4.0761 (3.9326) grad_norm 1.3133 (1.1423) [2022-01-19 05:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][830/1251] eta 0:15:28 lr 0.000927 time 1.5427 (2.2045) loss 3.9820 (3.9340) grad_norm 0.9431 (1.1412) [2022-01-19 05:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][840/1251] eta 0:15:05 lr 0.000927 time 1.6869 (2.2035) loss 3.2784 (3.9324) grad_norm 1.0734 (1.1419) [2022-01-19 05:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][850/1251] eta 0:14:43 lr 0.000927 time 2.3558 (2.2027) loss 4.4976 (3.9322) grad_norm 1.0962 (1.1426) [2022-01-19 05:02:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][860/1251] eta 0:14:20 lr 0.000927 time 1.8312 (2.2008) loss 4.6463 (3.9356) grad_norm 1.0146 (1.1420) [2022-01-19 05:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][870/1251] eta 0:13:58 lr 0.000927 time 1.8578 (2.2007) loss 4.3218 (3.9355) grad_norm 1.0771 (1.1414) [2022-01-19 05:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][880/1251] eta 0:13:36 lr 0.000927 time 1.8911 (2.2003) loss 2.9865 (3.9353) grad_norm 1.2817 (1.1419) [2022-01-19 05:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][890/1251] eta 0:13:15 lr 0.000926 time 2.6404 (2.2027) loss 4.6856 (3.9383) grad_norm 0.9509 (1.1423) [2022-01-19 05:03:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][900/1251] eta 0:12:53 lr 0.000926 time 2.0785 (2.2043) loss 4.2026 (3.9375) grad_norm 0.9940 (1.1412) [2022-01-19 05:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][910/1251] eta 0:12:32 lr 0.000926 time 2.0489 (2.2061) loss 3.5981 (3.9371) grad_norm 1.0041 (1.1407) [2022-01-19 05:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][920/1251] eta 0:12:11 lr 0.000926 time 
1.6226 (2.2094) loss 4.4935 (3.9352) grad_norm 0.9916 (1.1412) [2022-01-19 05:05:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][930/1251] eta 0:11:48 lr 0.000926 time 1.8962 (2.2086) loss 3.5665 (3.9320) grad_norm 1.1028 (1.1411) [2022-01-19 05:05:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][940/1251] eta 0:11:26 lr 0.000926 time 1.8768 (2.2074) loss 4.2293 (3.9357) grad_norm 0.9717 (1.1415) [2022-01-19 05:05:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][950/1251] eta 0:11:03 lr 0.000926 time 1.9045 (2.2043) loss 4.1946 (3.9352) grad_norm 1.0061 (1.1416) [2022-01-19 05:06:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][960/1251] eta 0:10:41 lr 0.000926 time 1.9275 (2.2032) loss 4.1758 (3.9367) grad_norm 1.0534 (1.1423) [2022-01-19 05:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][970/1251] eta 0:10:18 lr 0.000926 time 1.5596 (2.2023) loss 4.1104 (3.9351) grad_norm 1.0231 (1.1421) [2022-01-19 05:06:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][980/1251] eta 0:09:57 lr 0.000926 time 3.0397 (2.2031) loss 3.0464 (3.9354) grad_norm 0.9196 (1.1415) [2022-01-19 05:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][990/1251] eta 0:09:35 lr 0.000926 time 1.8797 (2.2039) loss 4.6689 (3.9359) grad_norm 1.2320 (1.1415) [2022-01-19 05:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1000/1251] eta 0:09:13 lr 0.000926 time 1.8946 (2.2053) loss 4.1252 (3.9365) grad_norm 1.0190 (1.1407) [2022-01-19 05:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1010/1251] eta 0:08:51 lr 0.000926 time 1.8658 (2.2045) loss 4.4094 (3.9350) grad_norm 0.9141 (1.1406) [2022-01-19 05:08:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1020/1251] eta 0:08:29 lr 0.000926 time 2.5692 (2.2039) loss 4.0648 (3.9353) grad_norm 1.0109 (1.1409) [2022-01-19 05:08:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1030/1251] eta 0:08:06 lr 0.000926 time 1.9889 (2.2028) loss 4.7250 (3.9376) grad_norm 1.1401 (1.1404) [2022-01-19 05:09:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1040/1251] eta 0:07:45 lr 0.000926 time 2.5353 (2.2046) loss 3.6659 (3.9371) grad_norm 1.0757 (1.1403) [2022-01-19 05:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1050/1251] eta 0:07:22 lr 0.000926 time 1.9270 (2.2038) loss 2.7231 (3.9377) grad_norm 1.3471 (1.1404) [2022-01-19 05:09:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1060/1251] eta 0:07:00 lr 0.000926 time 2.2270 (2.2035) loss 3.8752 (3.9407) grad_norm 1.0658 (1.1398) [2022-01-19 05:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1070/1251] eta 0:06:38 lr 0.000926 time 1.9798 (2.2022) loss 3.5637 (3.9388) grad_norm 1.1711 (1.1398) [2022-01-19 05:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1080/1251] eta 0:06:16 lr 0.000926 time 3.3612 (2.2032) loss 3.5925 (3.9399) grad_norm 1.4055 (1.1405) [2022-01-19 05:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1090/1251] eta 0:05:54 lr 0.000926 time 1.4883 (2.2017) loss 3.9218 (3.9405) grad_norm 1.6308 (1.1413) [2022-01-19 05:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1100/1251] eta 0:05:32 lr 0.000926 time 1.6621 (2.2023) loss 3.9509 (3.9409) grad_norm 1.2612 (1.1419) [2022-01-19 05:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1110/1251] eta 0:05:10 lr 0.000926 time 1.7007 (2.2015) loss 3.3485 (3.9431) grad_norm 0.9934 (1.1410) [2022-01-19 05:12:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1120/1251] eta 0:04:48 lr 0.000926 time 2.8150 (2.2023) loss 3.4348 (3.9427) grad_norm 1.1030 (1.1416) [2022-01-19 05:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1130/1251] eta 0:04:26 lr 
0.000926 time 1.7756 (2.2016) loss 3.0864 (3.9424) grad_norm 1.0384 (1.1416) [2022-01-19 05:12:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1140/1251] eta 0:04:04 lr 0.000926 time 1.7554 (2.2017) loss 3.7960 (3.9417) grad_norm 1.0486 (1.1414) [2022-01-19 05:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1150/1251] eta 0:03:42 lr 0.000926 time 1.8854 (2.2011) loss 4.2458 (3.9439) grad_norm 1.0493 (1.1412) [2022-01-19 05:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1160/1251] eta 0:03:20 lr 0.000926 time 3.3090 (2.2032) loss 3.5632 (3.9431) grad_norm 1.0420 (1.1410) [2022-01-19 05:13:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1170/1251] eta 0:02:58 lr 0.000926 time 1.6494 (2.2031) loss 3.0238 (3.9400) grad_norm 1.1049 (1.1407) [2022-01-19 05:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1180/1251] eta 0:02:36 lr 0.000926 time 1.6103 (2.2024) loss 4.6388 (3.9432) grad_norm 1.0492 (1.1404) [2022-01-19 05:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1190/1251] eta 0:02:14 lr 0.000926 time 1.8393 (2.2016) loss 4.6591 (3.9443) grad_norm 1.0793 (1.1398) [2022-01-19 05:14:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1200/1251] eta 0:01:52 lr 0.000926 time 2.1389 (2.2006) loss 3.7409 (3.9428) grad_norm 1.1575 (1.1395) [2022-01-19 05:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1210/1251] eta 0:01:30 lr 0.000926 time 2.0639 (2.1992) loss 4.3514 (3.9422) grad_norm 1.0077 (1.1390) [2022-01-19 05:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1220/1251] eta 0:01:08 lr 0.000926 time 1.9133 (2.1989) loss 3.7270 (3.9406) grad_norm 1.0688 (1.1386) [2022-01-19 05:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1230/1251] eta 0:00:46 lr 0.000926 time 2.0879 (2.1987) loss 3.8129 (3.9397) grad_norm 0.9563 (1.1384) [2022-01-19 05:16:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1240/1251] eta 0:00:24 lr 0.000926 time 1.8065 (2.1988) loss 3.9331 (3.9401) grad_norm 1.0016 (1.1379) [2022-01-19 05:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [52/300][1250/1251] eta 0:00:02 lr 0.000926 time 1.2777 (2.1939) loss 4.2379 (3.9399) grad_norm 1.0725 (1.1372) [2022-01-19 05:16:37 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 52 training takes 0:45:44 [2022-01-19 05:16:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.402 (18.402) Loss 1.2946 (1.2946) Acc@1 70.801 (70.801) Acc@5 90.039 (90.039) [2022-01-19 05:17:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.977 (3.500) Loss 1.2464 (1.2700) Acc@1 71.289 (71.245) Acc@5 90.918 (90.368) [2022-01-19 05:17:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.271 (2.489) Loss 1.3191 (1.2818) Acc@1 69.336 (70.764) Acc@5 89.844 (90.299) [2022-01-19 05:17:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.966 (2.242) Loss 1.3413 (1.2919) Acc@1 69.434 (70.347) Acc@5 89.551 (90.316) [2022-01-19 05:18:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.603 (2.151) Loss 1.2795 (1.2864) Acc@1 71.777 (70.262) Acc@5 90.527 (90.396) [2022-01-19 05:18:12 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 70.296 Acc@5 90.300 [2022-01-19 05:18:12 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 70.3% [2022-01-19 05:18:12 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 70.38% [2022-01-19 05:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][0/1251] eta 7:24:04 lr 0.000926 time 21.2988 (21.2988) loss 4.2881 (4.2881) grad_norm 1.0596 (1.0596) [2022-01-19 05:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][10/1251] eta 1:24:59 lr 0.000926 time 2.0357 (4.1088) loss 3.0355 (3.6219) grad_norm 1.1637 
(1.0738) [2022-01-19 05:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][20/1251] eta 1:06:42 lr 0.000926 time 2.3353 (3.2518) loss 2.6548 (3.6003) grad_norm 1.1889 (1.0963) [2022-01-19 05:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][30/1251] eta 0:59:15 lr 0.000926 time 1.9365 (2.9117) loss 4.4287 (3.7167) grad_norm 1.0598 (1.1051) [2022-01-19 05:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][40/1251] eta 0:55:44 lr 0.000926 time 3.3282 (2.7617) loss 4.8129 (3.7660) grad_norm 1.0335 (1.1190) [2022-01-19 05:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][50/1251] eta 0:52:44 lr 0.000926 time 1.5434 (2.6349) loss 4.6331 (3.8176) grad_norm 1.5880 (1.1383) [2022-01-19 05:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][60/1251] eta 0:50:26 lr 0.000926 time 1.6701 (2.5414) loss 4.4174 (3.8772) grad_norm 1.1020 (1.1345) [2022-01-19 05:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][70/1251] eta 0:48:52 lr 0.000926 time 2.2469 (2.4829) loss 4.7411 (3.9321) grad_norm 1.0308 (1.1292) [2022-01-19 05:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][80/1251] eta 0:47:36 lr 0.000926 time 2.5002 (2.4393) loss 4.8907 (3.9060) grad_norm 1.2894 (1.1214) [2022-01-19 05:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][90/1251] eta 0:46:51 lr 0.000926 time 1.9362 (2.4214) loss 3.7366 (3.9077) grad_norm 1.2454 (1.1147) [2022-01-19 05:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][100/1251] eta 0:46:22 lr 0.000925 time 2.0105 (2.4177) loss 3.6131 (3.9253) grad_norm 1.0899 (1.1173) [2022-01-19 05:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][110/1251] eta 0:45:48 lr 0.000925 time 2.2005 (2.4086) loss 3.5658 (3.9221) grad_norm 1.1574 (1.1162) [2022-01-19 05:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][120/1251] eta 0:45:15 
lr 0.000925 time 3.1092 (2.4012) loss 2.7917 (3.9063) grad_norm 1.1753 (1.1221) [2022-01-19 05:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][130/1251] eta 0:44:21 lr 0.000925 time 1.5965 (2.3745) loss 4.6022 (3.9271) grad_norm 0.9860 (1.1248) [2022-01-19 05:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][140/1251] eta 0:43:23 lr 0.000925 time 2.0140 (2.3432) loss 2.5843 (3.9159) grad_norm 1.3349 (1.1237) [2022-01-19 05:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][150/1251] eta 0:42:43 lr 0.000925 time 2.7989 (2.3282) loss 4.4565 (3.8889) grad_norm 1.0667 (1.1351) [2022-01-19 05:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][160/1251] eta 0:42:06 lr 0.000925 time 2.2665 (2.3155) loss 4.2139 (3.8994) grad_norm 0.9077 (1.1329) [2022-01-19 05:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][170/1251] eta 0:41:32 lr 0.000925 time 1.8826 (2.3062) loss 4.2763 (3.8898) grad_norm 1.0070 (1.1353) [2022-01-19 05:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][180/1251] eta 0:41:07 lr 0.000925 time 2.4770 (2.3039) loss 4.2485 (3.8886) grad_norm 1.2379 (1.1306) [2022-01-19 05:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][190/1251] eta 0:40:47 lr 0.000925 time 2.9912 (2.3065) loss 4.3978 (3.8876) grad_norm 1.0487 (1.1289) [2022-01-19 05:25:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][200/1251] eta 0:40:23 lr 0.000925 time 2.9890 (2.3058) loss 4.9617 (3.8886) grad_norm 1.0000 (1.1292) [2022-01-19 05:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][210/1251] eta 0:39:48 lr 0.000925 time 1.8938 (2.2947) loss 4.3141 (3.8884) grad_norm 1.1204 (1.1278) [2022-01-19 05:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][220/1251] eta 0:39:18 lr 0.000925 time 2.6508 (2.2875) loss 4.1159 (3.9059) grad_norm 1.0403 (1.1264) [2022-01-19 05:26:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][230/1251] eta 0:38:46 lr 0.000925 time 2.4294 (2.2788) loss 3.7169 (3.9012) grad_norm 1.3063 (1.1262) [2022-01-19 05:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][240/1251] eta 0:38:12 lr 0.000925 time 1.9154 (2.2679) loss 3.7070 (3.9051) grad_norm 1.0442 (1.1265) [2022-01-19 05:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][250/1251] eta 0:37:41 lr 0.000925 time 1.5503 (2.2597) loss 4.8466 (3.9011) grad_norm 1.0584 (1.1271) [2022-01-19 05:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][260/1251] eta 0:37:19 lr 0.000925 time 3.4561 (2.2601) loss 3.5226 (3.9059) grad_norm 1.0985 (1.1292) [2022-01-19 05:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][270/1251] eta 0:36:56 lr 0.000925 time 2.7106 (2.2596) loss 4.0834 (3.9044) grad_norm 1.1158 (1.1308) [2022-01-19 05:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][280/1251] eta 0:36:32 lr 0.000925 time 2.2149 (2.2580) loss 4.0473 (3.9086) grad_norm 1.2848 (1.1337) [2022-01-19 05:29:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][290/1251] eta 0:36:13 lr 0.000925 time 2.1352 (2.2621) loss 3.4476 (3.9055) grad_norm 1.6429 (1.1364) [2022-01-19 05:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][300/1251] eta 0:35:56 lr 0.000925 time 3.5484 (2.2678) loss 2.8718 (3.9036) grad_norm 1.0618 (1.1351) [2022-01-19 05:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][310/1251] eta 0:35:31 lr 0.000925 time 2.5042 (2.2653) loss 4.2557 (3.9023) grad_norm 1.1634 (1.1327) [2022-01-19 05:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][320/1251] eta 0:35:00 lr 0.000925 time 2.1959 (2.2565) loss 3.8579 (3.8872) grad_norm 1.0949 (1.1354) [2022-01-19 05:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][330/1251] eta 0:34:29 lr 0.000925 time 
1.9443 (2.2472) loss 4.1269 (3.8945) grad_norm 0.9703 (1.1338) [2022-01-19 05:30:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][340/1251] eta 0:34:04 lr 0.000925 time 2.5099 (2.2444) loss 4.4348 (3.8886) grad_norm 0.9918 (1.1326) [2022-01-19 05:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][350/1251] eta 0:33:39 lr 0.000925 time 1.8844 (2.2413) loss 3.8387 (3.8929) grad_norm 1.1000 (1.1310) [2022-01-19 05:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][360/1251] eta 0:33:14 lr 0.000925 time 1.9702 (2.2382) loss 4.4025 (3.8965) grad_norm 1.1520 (1.1313) [2022-01-19 05:32:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][370/1251] eta 0:32:49 lr 0.000925 time 2.9467 (2.2353) loss 4.6704 (3.8985) grad_norm 1.1400 (1.1310) [2022-01-19 05:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][380/1251] eta 0:32:29 lr 0.000925 time 2.8900 (2.2382) loss 4.1614 (3.9003) grad_norm 1.0613 (1.1323) [2022-01-19 05:32:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][390/1251] eta 0:32:03 lr 0.000925 time 1.5773 (2.2344) loss 4.4525 (3.9012) grad_norm 1.0609 (1.1331) [2022-01-19 05:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][400/1251] eta 0:31:46 lr 0.000925 time 2.2707 (2.2403) loss 4.2574 (3.9063) grad_norm 1.0296 (1.1334) [2022-01-19 05:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][410/1251] eta 0:31:19 lr 0.000925 time 1.9497 (2.2351) loss 4.3547 (3.9086) grad_norm 1.4141 (1.1327) [2022-01-19 05:33:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][420/1251] eta 0:31:02 lr 0.000925 time 5.7344 (2.2412) loss 3.8122 (3.9140) grad_norm 0.8887 (1.1323) [2022-01-19 05:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][430/1251] eta 0:30:36 lr 0.000925 time 1.7195 (2.2366) loss 3.5559 (3.9135) grad_norm 1.1096 (1.1310) [2022-01-19 05:34:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][440/1251] eta 0:30:14 lr 0.000925 time 1.8577 (2.2373) loss 4.1605 (3.9140) grad_norm 1.3528 (1.1314) [2022-01-19 05:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][450/1251] eta 0:29:48 lr 0.000925 time 1.9260 (2.2331) loss 3.6235 (3.9164) grad_norm 1.2206 (1.1306) [2022-01-19 05:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][460/1251] eta 0:29:26 lr 0.000925 time 3.5720 (2.2328) loss 3.9826 (3.9171) grad_norm 1.4702 (1.1298) [2022-01-19 05:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][470/1251] eta 0:29:00 lr 0.000925 time 1.8263 (2.2288) loss 4.3938 (3.9210) grad_norm 1.3000 (1.1302) [2022-01-19 05:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][480/1251] eta 0:28:40 lr 0.000925 time 2.9660 (2.2313) loss 4.3194 (3.9172) grad_norm 1.3378 (1.1291) [2022-01-19 05:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][490/1251] eta 0:28:18 lr 0.000925 time 2.4875 (2.2321) loss 3.1108 (3.9182) grad_norm 1.1608 (1.1281) [2022-01-19 05:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][500/1251] eta 0:27:55 lr 0.000925 time 3.1479 (2.2312) loss 4.2859 (3.9164) grad_norm 1.0524 (1.1300) [2022-01-19 05:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][510/1251] eta 0:27:31 lr 0.000925 time 1.9111 (2.2292) loss 4.4602 (3.9213) grad_norm 1.0865 (1.1307) [2022-01-19 05:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][520/1251] eta 0:27:06 lr 0.000925 time 1.8766 (2.2253) loss 4.3559 (3.9238) grad_norm 1.1502 (1.1325) [2022-01-19 05:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][530/1251] eta 0:26:44 lr 0.000925 time 2.4826 (2.2247) loss 4.6657 (3.9284) grad_norm 1.1662 (1.1331) [2022-01-19 05:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][540/1251] eta 0:26:22 lr 0.000925 time 
2.6654 (2.2264) loss 3.4921 (3.9296) grad_norm 1.0347 (1.1346) [2022-01-19 05:38:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][550/1251] eta 0:26:02 lr 0.000924 time 2.8791 (2.2286) loss 4.8679 (3.9379) grad_norm 1.0660 (1.1334) [2022-01-19 05:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][560/1251] eta 0:25:38 lr 0.000924 time 1.5930 (2.2260) loss 3.4617 (3.9361) grad_norm 1.0584 (1.1329) [2022-01-19 05:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][570/1251] eta 0:25:15 lr 0.000924 time 2.2978 (2.2252) loss 4.2759 (3.9345) grad_norm 1.0799 (1.1324) [2022-01-19 05:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][580/1251] eta 0:24:51 lr 0.000924 time 2.3729 (2.2225) loss 4.5175 (3.9335) grad_norm 1.0035 (1.1312) [2022-01-19 05:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][590/1251] eta 0:24:27 lr 0.000924 time 2.0945 (2.2198) loss 4.2521 (3.9331) grad_norm 1.0651 (1.1308) [2022-01-19 05:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][600/1251] eta 0:24:03 lr 0.000924 time 2.2256 (2.2179) loss 4.4930 (3.9296) grad_norm 1.2822 (1.1313) [2022-01-19 05:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][610/1251] eta 0:23:40 lr 0.000924 time 2.4621 (2.2167) loss 4.0247 (3.9296) grad_norm 1.1898 (1.1320) [2022-01-19 05:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][620/1251] eta 0:23:18 lr 0.000924 time 2.4931 (2.2160) loss 4.4896 (3.9313) grad_norm 1.1777 (1.1303) [2022-01-19 05:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][630/1251] eta 0:22:55 lr 0.000924 time 2.7090 (2.2148) loss 4.5711 (3.9321) grad_norm 1.1396 (1.1305) [2022-01-19 05:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][640/1251] eta 0:22:32 lr 0.000924 time 2.3485 (2.2138) loss 3.0741 (3.9309) grad_norm 1.2915 (1.1295) [2022-01-19 05:42:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][650/1251] eta 0:22:10 lr 0.000924 time 1.5853 (2.2132) loss 4.6929 (3.9348) grad_norm 1.0337 (1.1283) [2022-01-19 05:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][660/1251] eta 0:21:48 lr 0.000924 time 1.8685 (2.2141) loss 3.1170 (3.9306) grad_norm 1.1792 (1.1297) [2022-01-19 05:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][670/1251] eta 0:21:26 lr 0.000924 time 2.6042 (2.2150) loss 4.0697 (3.9330) grad_norm 1.0996 (1.1308) [2022-01-19 05:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][680/1251] eta 0:21:06 lr 0.000924 time 3.1901 (2.2177) loss 3.6801 (3.9307) grad_norm 1.2316 (1.1318) [2022-01-19 05:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][690/1251] eta 0:20:44 lr 0.000924 time 1.5574 (2.2190) loss 3.3162 (3.9305) grad_norm 1.2495 (1.1323) [2022-01-19 05:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][700/1251] eta 0:20:22 lr 0.000924 time 1.6536 (2.2187) loss 3.1202 (3.9305) grad_norm 1.0535 (1.1339) [2022-01-19 05:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][710/1251] eta 0:19:59 lr 0.000924 time 2.3396 (2.2179) loss 4.4486 (3.9292) grad_norm 1.0174 (1.1335) [2022-01-19 05:44:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][720/1251] eta 0:19:37 lr 0.000924 time 2.2127 (2.2166) loss 3.0521 (3.9294) grad_norm 1.1113 (1.1331) [2022-01-19 05:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][730/1251] eta 0:19:12 lr 0.000924 time 1.6409 (2.2125) loss 3.5588 (3.9258) grad_norm 1.1224 (1.1339) [2022-01-19 05:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][740/1251] eta 0:18:49 lr 0.000924 time 1.8665 (2.2095) loss 3.5552 (3.9255) grad_norm 1.0989 (1.1342) [2022-01-19 05:45:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][750/1251] eta 0:18:26 lr 0.000924 time 
2.4827 (2.2095) loss 3.8258 (3.9235) grad_norm 1.0473 (1.1337) [2022-01-19 05:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][760/1251] eta 0:18:05 lr 0.000924 time 2.6581 (2.2098) loss 4.2407 (3.9218) grad_norm 1.1223 (1.1336) [2022-01-19 05:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][770/1251] eta 0:17:42 lr 0.000924 time 2.0216 (2.2094) loss 4.4353 (3.9245) grad_norm 0.9992 (1.1330) [2022-01-19 05:46:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][780/1251] eta 0:17:21 lr 0.000924 time 2.7414 (2.2106) loss 3.8241 (3.9213) grad_norm 1.1825 (1.1340) [2022-01-19 05:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][790/1251] eta 0:16:58 lr 0.000924 time 2.6243 (2.2093) loss 2.8143 (3.9213) grad_norm 1.1110 (1.1332) [2022-01-19 05:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][800/1251] eta 0:16:37 lr 0.000924 time 2.3579 (2.2113) loss 3.3655 (3.9205) grad_norm 1.4385 (1.1338) [2022-01-19 05:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][810/1251] eta 0:16:16 lr 0.000924 time 1.5475 (2.2138) loss 2.6879 (3.9169) grad_norm 1.1164 (1.1342) [2022-01-19 05:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][820/1251] eta 0:15:53 lr 0.000924 time 2.3839 (2.2134) loss 4.6792 (3.9175) grad_norm 1.2107 (1.1355) [2022-01-19 05:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][830/1251] eta 0:15:31 lr 0.000924 time 2.1360 (2.2118) loss 3.3879 (3.9180) grad_norm 1.1136 (1.1354) [2022-01-19 05:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][840/1251] eta 0:15:08 lr 0.000924 time 2.2553 (2.2103) loss 3.3006 (3.9163) grad_norm 1.0223 (1.1351) [2022-01-19 05:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][850/1251] eta 0:14:46 lr 0.000924 time 1.6799 (2.2096) loss 4.4079 (3.9182) grad_norm 1.0463 (1.1352) [2022-01-19 05:49:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][860/1251] eta 0:14:24 lr 0.000924 time 3.1797 (2.2104) loss 3.5848 (3.9176) grad_norm 1.0671 (1.1353) [2022-01-19 05:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][870/1251] eta 0:14:02 lr 0.000924 time 2.3868 (2.2109) loss 4.0749 (3.9201) grad_norm 0.9457 (1.1352) [2022-01-19 05:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][880/1251] eta 0:13:40 lr 0.000924 time 2.2642 (2.2107) loss 4.2936 (3.9228) grad_norm 1.1090 (1.1352) [2022-01-19 05:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][890/1251] eta 0:13:17 lr 0.000924 time 1.6685 (2.2099) loss 3.1129 (3.9225) grad_norm 1.0030 (1.1350) [2022-01-19 05:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][900/1251] eta 0:12:55 lr 0.000924 time 2.7315 (2.2099) loss 3.9050 (3.9201) grad_norm 1.0650 (1.1350) [2022-01-19 05:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][910/1251] eta 0:12:32 lr 0.000924 time 1.5917 (2.2071) loss 3.1171 (3.9194) grad_norm 1.0795 (1.1340) [2022-01-19 05:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][920/1251] eta 0:12:09 lr 0.000924 time 1.6247 (2.2049) loss 3.5059 (3.9208) grad_norm 1.1108 (1.1342) [2022-01-19 05:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][930/1251] eta 0:11:47 lr 0.000924 time 1.9189 (2.2052) loss 4.2502 (3.9205) grad_norm 1.0224 (1.1345) [2022-01-19 05:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][940/1251] eta 0:11:26 lr 0.000924 time 1.6006 (2.2061) loss 3.3458 (3.9186) grad_norm 1.1398 (1.1350) [2022-01-19 05:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][950/1251] eta 0:11:04 lr 0.000924 time 2.3009 (2.2065) loss 3.3104 (3.9154) grad_norm 1.0180 (1.1355) [2022-01-19 05:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][960/1251] eta 0:10:42 lr 0.000924 time 
1.8719 (2.2068) loss 3.1838 (3.9134) grad_norm 1.0386 (1.1344) [2022-01-19 05:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][970/1251] eta 0:10:20 lr 0.000924 time 2.2626 (2.2075) loss 3.1400 (3.9129) grad_norm 1.0419 (1.1344) [2022-01-19 05:54:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][980/1251] eta 0:09:58 lr 0.000924 time 1.5435 (2.2067) loss 4.2153 (3.9140) grad_norm 0.9151 (1.1338) [2022-01-19 05:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][990/1251] eta 0:09:35 lr 0.000924 time 1.9103 (2.2063) loss 4.2909 (3.9171) grad_norm 1.0967 (1.1338) [2022-01-19 05:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1000/1251] eta 0:09:13 lr 0.000923 time 2.0179 (2.2048) loss 2.9395 (3.9163) grad_norm 1.1133 (1.1332) [2022-01-19 05:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1010/1251] eta 0:08:50 lr 0.000923 time 2.2035 (2.2024) loss 3.8890 (3.9148) grad_norm 1.0246 (1.1329) [2022-01-19 05:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1020/1251] eta 0:08:28 lr 0.000923 time 2.2270 (2.2016) loss 4.2495 (3.9113) grad_norm 1.0958 (1.1331) [2022-01-19 05:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1030/1251] eta 0:08:06 lr 0.000923 time 1.8973 (2.2036) loss 3.6781 (3.9102) grad_norm 1.1082 (1.1327) [2022-01-19 05:56:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1040/1251] eta 0:07:44 lr 0.000923 time 1.6049 (2.2031) loss 4.1600 (3.9119) grad_norm 1.5243 (1.1330) [2022-01-19 05:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1050/1251] eta 0:07:22 lr 0.000923 time 1.9867 (2.2034) loss 3.8068 (3.9121) grad_norm 1.0183 (1.1327) [2022-01-19 05:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1060/1251] eta 0:07:00 lr 0.000923 time 2.1669 (2.2033) loss 4.2260 (3.9143) grad_norm 1.3606 (1.1328) [2022-01-19 05:57:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1070/1251] eta 0:06:38 lr 0.000923 time 1.9185 (2.2035) loss 4.1872 (3.9141) grad_norm 1.2670 (1.1322) [2022-01-19 05:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1080/1251] eta 0:06:16 lr 0.000923 time 2.1859 (2.2030) loss 3.8884 (3.9123) grad_norm 1.1343 (1.1317) [2022-01-19 05:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1090/1251] eta 0:05:54 lr 0.000923 time 1.9314 (2.2034) loss 3.8880 (3.9133) grad_norm 1.4073 (1.1316) [2022-01-19 05:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1100/1251] eta 0:05:32 lr 0.000923 time 3.0040 (2.2047) loss 3.8832 (3.9112) grad_norm 0.9978 (1.1323) [2022-01-19 05:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1110/1251] eta 0:05:10 lr 0.000923 time 1.5791 (2.2044) loss 2.9996 (3.9119) grad_norm 1.1674 (1.1332) [2022-01-19 05:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1120/1251] eta 0:04:48 lr 0.000923 time 1.8710 (2.2031) loss 4.1343 (3.9123) grad_norm 1.2523 (1.1331) [2022-01-19 05:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1130/1251] eta 0:04:26 lr 0.000923 time 1.9130 (2.2012) loss 4.2589 (3.9115) grad_norm 1.0167 (1.1328) [2022-01-19 06:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1140/1251] eta 0:04:04 lr 0.000923 time 3.3647 (2.2010) loss 4.1460 (3.9125) grad_norm 1.0415 (1.1325) [2022-01-19 06:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1150/1251] eta 0:03:42 lr 0.000923 time 2.2861 (2.2010) loss 3.5060 (3.9104) grad_norm 1.0616 (1.1317) [2022-01-19 06:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1160/1251] eta 0:03:20 lr 0.000923 time 2.3997 (2.2016) loss 2.6712 (3.9093) grad_norm 1.0375 (1.1322) [2022-01-19 06:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1170/1251] eta 0:02:58 lr 
0.000923 time 1.8475 (2.2024) loss 3.2077 (3.9103) grad_norm 1.0797 (1.1319) [2022-01-19 06:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1180/1251] eta 0:02:36 lr 0.000923 time 2.6029 (2.2025) loss 4.2657 (3.9096) grad_norm 1.1954 (1.1332) [2022-01-19 06:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1190/1251] eta 0:02:14 lr 0.000923 time 2.1532 (2.2015) loss 4.3803 (3.9088) grad_norm 1.4591 (1.1337) [2022-01-19 06:02:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1200/1251] eta 0:01:52 lr 0.000923 time 2.3826 (2.2007) loss 4.2678 (3.9115) grad_norm 1.1559 (1.1338) [2022-01-19 06:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1210/1251] eta 0:01:30 lr 0.000923 time 1.8633 (2.1995) loss 4.1973 (3.9102) grad_norm 0.9364 (1.1334) [2022-01-19 06:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1220/1251] eta 0:01:08 lr 0.000923 time 2.9590 (2.1997) loss 3.8822 (3.9098) grad_norm 1.2198 (1.1332) [2022-01-19 06:03:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1230/1251] eta 0:00:46 lr 0.000923 time 2.2379 (2.2003) loss 4.2828 (3.9063) grad_norm 0.9873 (1.1330) [2022-01-19 06:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1240/1251] eta 0:00:24 lr 0.000923 time 1.7540 (2.1996) loss 4.5811 (3.9072) grad_norm 1.1626 (1.1331) [2022-01-19 06:03:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [53/300][1250/1251] eta 0:00:02 lr 0.000923 time 1.1794 (2.1945) loss 3.6340 (3.9043) grad_norm 1.1324 (1.1336) [2022-01-19 06:03:58 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 53 training takes 0:45:45 [2022-01-19 06:04:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.088 (18.088) Loss 1.2155 (1.2155) Acc@1 71.191 (71.191) Acc@5 89.844 (89.844) [2022-01-19 06:04:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.307 (3.458) Loss 1.2447 (1.2689) 
Acc@1 70.898 (70.526) Acc@5 90.430 (90.199) [2022-01-19 06:04:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.904 (2.679) Loss 1.2549 (1.2661) Acc@1 70.605 (70.596) Acc@5 90.137 (90.197) [2022-01-19 06:05:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.210 (2.383) Loss 1.2845 (1.2674) Acc@1 69.531 (70.435) Acc@5 90.723 (90.234) [2022-01-19 06:05:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.876 (2.260) Loss 1.2648 (1.2615) Acc@1 68.555 (70.541) Acc@5 90.820 (90.261) [2022-01-19 06:05:38 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 70.584 Acc@5 90.224 [2022-01-19 06:05:38 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 70.6% [2022-01-19 06:05:38 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 70.58% [2022-01-19 06:06:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][0/1251] eta 7:26:24 lr 0.000923 time 21.4103 (21.4103) loss 2.7573 (2.7573) grad_norm 1.2328 (1.2328) [2022-01-19 06:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][10/1251] eta 1:25:23 lr 0.000923 time 2.1543 (4.1285) loss 3.3754 (3.9807) grad_norm 1.3200 (1.1591) [2022-01-19 06:06:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][20/1251] eta 1:05:49 lr 0.000923 time 1.9192 (3.2086) loss 3.9807 (4.0343) grad_norm 0.9081 (1.1032) [2022-01-19 06:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][30/1251] eta 0:59:03 lr 0.000923 time 1.8891 (2.9022) loss 4.8062 (4.0049) grad_norm 1.1501 (1.0905) [2022-01-19 06:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][40/1251] eta 0:56:14 lr 0.000923 time 3.7900 (2.7868) loss 3.5861 (3.9811) grad_norm 1.2538 (1.1108) [2022-01-19 06:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][50/1251] eta 0:53:44 lr 0.000923 time 1.6329 (2.6849) loss 4.7626 (3.9626) grad_norm 1.2422 (1.1261) 
[2022-01-19 06:08:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][60/1251] eta 0:51:23 lr 0.000923 time 1.6757 (2.5889) loss 4.3219 (3.9774) grad_norm 1.0190 (1.1201) [2022-01-19 06:08:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][70/1251] eta 0:49:35 lr 0.000923 time 1.9416 (2.5196) loss 4.1758 (3.9789) grad_norm 0.9167 (1.1089) [2022-01-19 06:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][80/1251] eta 0:48:48 lr 0.000923 time 3.7277 (2.5008) loss 3.8973 (3.9429) grad_norm 1.4800 (1.1180) [2022-01-19 06:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][90/1251] eta 0:48:02 lr 0.000923 time 2.5300 (2.4824) loss 4.1987 (3.9545) grad_norm 1.1386 (1.1216) [2022-01-19 06:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][100/1251] eta 0:46:52 lr 0.000923 time 1.9264 (2.4439) loss 4.0455 (3.9392) grad_norm 1.1319 (1.1193) [2022-01-19 06:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][110/1251] eta 0:45:46 lr 0.000923 time 1.8183 (2.4068) loss 4.1812 (3.9475) grad_norm 1.0273 (1.1171) [2022-01-19 06:10:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][120/1251] eta 0:44:52 lr 0.000923 time 2.4472 (2.3805) loss 3.9350 (3.9605) grad_norm 1.2123 (1.1136) [2022-01-19 06:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][130/1251] eta 0:44:03 lr 0.000923 time 1.8900 (2.3580) loss 3.4167 (3.9578) grad_norm 1.1643 (1.1174) [2022-01-19 06:11:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][140/1251] eta 0:43:35 lr 0.000923 time 2.1383 (2.3545) loss 4.8051 (3.9566) grad_norm 1.3581 (1.1196) [2022-01-19 06:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][150/1251] eta 0:42:55 lr 0.000923 time 1.9155 (2.3393) loss 4.4631 (3.9685) grad_norm 1.0179 (1.1201) [2022-01-19 06:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][160/1251] eta 0:42:21 lr 
0.000923 time 2.8435 (2.3291) loss 4.4467 (3.9680) grad_norm 1.0953 (1.1229) [2022-01-19 06:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][170/1251] eta 0:41:35 lr 0.000923 time 1.9817 (2.3089) loss 3.1114 (3.9440) grad_norm 1.0309 (1.1236) [2022-01-19 06:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][180/1251] eta 0:40:58 lr 0.000923 time 2.0557 (2.2956) loss 4.6603 (3.9532) grad_norm 0.9872 (1.1230) [2022-01-19 06:12:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][190/1251] eta 0:40:25 lr 0.000923 time 2.0588 (2.2862) loss 4.7022 (3.9409) grad_norm 1.0846 (1.1238) [2022-01-19 06:13:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][200/1251] eta 0:39:56 lr 0.000922 time 2.3663 (2.2807) loss 4.0862 (3.9465) grad_norm 1.0460 (1.1245) [2022-01-19 06:13:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][210/1251] eta 0:39:21 lr 0.000922 time 2.2268 (2.2686) loss 4.2489 (3.9438) grad_norm 1.0056 (1.1235) [2022-01-19 06:14:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][220/1251] eta 0:39:00 lr 0.000922 time 3.1301 (2.2705) loss 4.5742 (3.9325) grad_norm 1.4198 (1.1234) [2022-01-19 06:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][230/1251] eta 0:38:39 lr 0.000922 time 2.6481 (2.2720) loss 3.2384 (3.9251) grad_norm 1.1669 (1.1309) [2022-01-19 06:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][240/1251] eta 0:38:14 lr 0.000922 time 2.3063 (2.2694) loss 3.1177 (3.9261) grad_norm 1.3548 (1.1316) [2022-01-19 06:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][250/1251] eta 0:37:40 lr 0.000922 time 1.8652 (2.2578) loss 4.4694 (3.9415) grad_norm 0.9798 (1.1314) [2022-01-19 06:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][260/1251] eta 0:37:11 lr 0.000922 time 2.3234 (2.2521) loss 4.6581 (3.9442) grad_norm 1.3351 (1.1317) [2022-01-19 06:15:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][270/1251] eta 0:36:53 lr 0.000922 time 3.3806 (2.2568) loss 4.1196 (3.9409) grad_norm 1.0786 (1.1328) [2022-01-19 06:16:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][280/1251] eta 0:36:31 lr 0.000922 time 2.1849 (2.2565) loss 4.6345 (3.9385) grad_norm 1.0354 (1.1337) [2022-01-19 06:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][290/1251] eta 0:36:07 lr 0.000922 time 1.8397 (2.2556) loss 3.3225 (3.9335) grad_norm 1.1343 (1.1358) [2022-01-19 06:16:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][300/1251] eta 0:35:47 lr 0.000922 time 3.4798 (2.2581) loss 3.4583 (3.9198) grad_norm 1.1104 (1.1350) [2022-01-19 06:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][310/1251] eta 0:35:20 lr 0.000922 time 2.6143 (2.2537) loss 2.8770 (3.9122) grad_norm 1.1068 (1.1323) [2022-01-19 06:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][320/1251] eta 0:34:50 lr 0.000922 time 2.2674 (2.2460) loss 3.9123 (3.9112) grad_norm 1.1013 (1.1307) [2022-01-19 06:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][330/1251] eta 0:34:23 lr 0.000922 time 1.5727 (2.2401) loss 3.4555 (3.9157) grad_norm 1.0154 (1.1295) [2022-01-19 06:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][340/1251] eta 0:33:59 lr 0.000922 time 2.4627 (2.2390) loss 2.7429 (3.9175) grad_norm 1.1236 (1.1285) [2022-01-19 06:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][350/1251] eta 0:33:34 lr 0.000922 time 2.5040 (2.2358) loss 3.8073 (3.9202) grad_norm 0.9853 (1.1261) [2022-01-19 06:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][360/1251] eta 0:33:11 lr 0.000922 time 2.2896 (2.2347) loss 3.9728 (3.9171) grad_norm 1.3012 (1.1272) [2022-01-19 06:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][370/1251] eta 0:32:51 lr 0.000922 time 
2.4007 (2.2376) loss 4.6175 (3.9185) grad_norm 1.1603 (1.1289) [2022-01-19 06:19:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][380/1251] eta 0:32:31 lr 0.000922 time 2.8011 (2.2404) loss 4.5615 (3.9103) grad_norm 1.0769 (1.1333) [2022-01-19 06:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][390/1251] eta 0:32:08 lr 0.000922 time 3.0107 (2.2400) loss 3.9315 (3.9109) grad_norm 1.0706 (1.1343) [2022-01-19 06:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][400/1251] eta 0:31:46 lr 0.000922 time 2.6184 (2.2397) loss 4.3516 (3.9125) grad_norm 1.1107 (1.1377) [2022-01-19 06:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][410/1251] eta 0:31:17 lr 0.000922 time 1.9588 (2.2329) loss 3.7291 (3.9119) grad_norm 1.3434 (1.1380) [2022-01-19 06:21:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][420/1251] eta 0:30:53 lr 0.000922 time 2.8155 (2.2308) loss 2.4113 (3.9071) grad_norm 0.9512 (1.1380) [2022-01-19 06:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][430/1251] eta 0:30:30 lr 0.000922 time 2.4967 (2.2300) loss 4.4518 (3.9124) grad_norm 1.2042 (1.1369) [2022-01-19 06:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][440/1251] eta 0:30:06 lr 0.000922 time 2.3635 (2.2281) loss 3.8628 (3.9051) grad_norm 1.0114 (1.1381) [2022-01-19 06:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][450/1251] eta 0:29:42 lr 0.000922 time 2.1342 (2.2260) loss 4.0431 (3.9082) grad_norm 1.0910 (1.1424) [2022-01-19 06:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][460/1251] eta 0:29:19 lr 0.000922 time 2.8920 (2.2243) loss 4.0887 (3.9075) grad_norm 1.2608 (1.1429) [2022-01-19 06:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][470/1251] eta 0:28:55 lr 0.000922 time 2.2060 (2.2218) loss 4.1575 (3.9047) grad_norm 0.9018 (1.1411) [2022-01-19 06:23:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][480/1251] eta 0:28:32 lr 0.000922 time 2.5977 (2.2211) loss 4.6694 (3.9098) grad_norm 0.9641 (1.1392) [2022-01-19 06:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][490/1251] eta 0:28:09 lr 0.000922 time 1.9650 (2.2206) loss 3.5136 (3.9071) grad_norm 1.1965 (1.1397) [2022-01-19 06:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][500/1251] eta 0:27:50 lr 0.000922 time 2.4459 (2.2243) loss 2.8670 (3.9009) grad_norm 1.1914 (1.1401) [2022-01-19 06:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][510/1251] eta 0:27:26 lr 0.000922 time 2.1441 (2.2224) loss 4.0459 (3.9031) grad_norm 1.0540 (1.1409) [2022-01-19 06:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][520/1251] eta 0:27:03 lr 0.000922 time 1.8968 (2.2210) loss 3.4391 (3.9038) grad_norm 0.9975 (1.1405) [2022-01-19 06:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][530/1251] eta 0:26:38 lr 0.000922 time 1.9260 (2.2171) loss 4.2325 (3.9005) grad_norm 1.0187 (1.1391) [2022-01-19 06:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][540/1251] eta 0:26:15 lr 0.000922 time 2.4692 (2.2163) loss 3.3415 (3.9019) grad_norm 1.2321 (1.1413) [2022-01-19 06:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][550/1251] eta 0:25:52 lr 0.000922 time 2.2037 (2.2153) loss 4.4959 (3.9094) grad_norm 1.0994 (1.1400) [2022-01-19 06:26:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][560/1251] eta 0:25:31 lr 0.000922 time 2.2631 (2.2161) loss 4.3000 (3.9109) grad_norm 0.9919 (1.1398) [2022-01-19 06:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][570/1251] eta 0:25:09 lr 0.000922 time 1.5050 (2.2163) loss 3.9317 (3.9059) grad_norm 1.2515 (1.1400) [2022-01-19 06:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][580/1251] eta 0:24:49 lr 0.000922 time 
2.4723 (2.2199) loss 3.9066 (3.9122) grad_norm 0.9812 (1.1400) [2022-01-19 06:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][590/1251] eta 0:24:28 lr 0.000922 time 2.1946 (2.2211) loss 4.2595 (3.9127) grad_norm 1.6982 (1.1418) [2022-01-19 06:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][600/1251] eta 0:24:04 lr 0.000922 time 1.9134 (2.2182) loss 3.8866 (3.9128) grad_norm 1.1040 (1.1423) [2022-01-19 06:28:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][610/1251] eta 0:23:40 lr 0.000922 time 1.6345 (2.2160) loss 4.1204 (3.9093) grad_norm 1.1321 (1.1430) [2022-01-19 06:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][620/1251] eta 0:23:16 lr 0.000922 time 2.3753 (2.2139) loss 4.3008 (3.9144) grad_norm 1.0726 (1.1424) [2022-01-19 06:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][630/1251] eta 0:22:52 lr 0.000922 time 1.8542 (2.2107) loss 4.1092 (3.9192) grad_norm 1.0301 (1.1424) [2022-01-19 06:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][640/1251] eta 0:22:30 lr 0.000922 time 2.0593 (2.2101) loss 4.5584 (3.9210) grad_norm 1.1784 (1.1421) [2022-01-19 06:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][650/1251] eta 0:22:07 lr 0.000921 time 1.9445 (2.2088) loss 3.0712 (3.9232) grad_norm 1.0125 (1.1420) [2022-01-19 06:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][660/1251] eta 0:21:44 lr 0.000921 time 2.4573 (2.2070) loss 2.7243 (3.9253) grad_norm 1.0772 (1.1410) [2022-01-19 06:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][670/1251] eta 0:21:21 lr 0.000921 time 1.8536 (2.2049) loss 3.4614 (3.9271) grad_norm 1.3371 (1.1416) [2022-01-19 06:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][680/1251] eta 0:20:59 lr 0.000921 time 2.1441 (2.2060) loss 4.1398 (3.9299) grad_norm 1.1015 (1.1422) [2022-01-19 06:31:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][690/1251] eta 0:20:37 lr 0.000921 time 2.2358 (2.2063) loss 4.3311 (3.9301) grad_norm 1.1705 (1.1420) [2022-01-19 06:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][700/1251] eta 0:20:17 lr 0.000921 time 2.7761 (2.2088) loss 4.0537 (3.9312) grad_norm 1.0514 (1.1415) [2022-01-19 06:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][710/1251] eta 0:19:56 lr 0.000921 time 1.4750 (2.2108) loss 3.3234 (3.9327) grad_norm 1.3184 (1.1423) [2022-01-19 06:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][720/1251] eta 0:19:33 lr 0.000921 time 1.4550 (2.2107) loss 3.9478 (3.9366) grad_norm 1.0275 (1.1410) [2022-01-19 06:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][730/1251] eta 0:19:12 lr 0.000921 time 3.0813 (2.2115) loss 3.5399 (3.9331) grad_norm 1.3288 (1.1406) [2022-01-19 06:32:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][740/1251] eta 0:18:49 lr 0.000921 time 2.8728 (2.2113) loss 3.2214 (3.9317) grad_norm 1.0514 (1.1412) [2022-01-19 06:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][750/1251] eta 0:18:26 lr 0.000921 time 1.8697 (2.2094) loss 3.1688 (3.9343) grad_norm 1.2226 (1.1423) [2022-01-19 06:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][760/1251] eta 0:18:03 lr 0.000921 time 1.6912 (2.2071) loss 4.3204 (3.9353) grad_norm 1.3449 (1.1429) [2022-01-19 06:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][770/1251] eta 0:17:41 lr 0.000921 time 2.1188 (2.2060) loss 4.2423 (3.9365) grad_norm 1.1474 (1.1436) [2022-01-19 06:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][780/1251] eta 0:17:18 lr 0.000921 time 2.7367 (2.2050) loss 3.2165 (3.9367) grad_norm 1.0395 (1.1438) [2022-01-19 06:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][790/1251] eta 0:16:56 lr 0.000921 time 
1.6464 (2.2056) loss 4.2998 (3.9395) grad_norm 0.9587 (1.1434) [2022-01-19 06:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][800/1251] eta 0:16:34 lr 0.000921 time 2.1598 (2.2054) loss 4.1816 (3.9409) grad_norm 0.9778 (1.1433) [2022-01-19 06:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][810/1251] eta 0:16:11 lr 0.000921 time 2.4038 (2.2039) loss 4.3749 (3.9430) grad_norm 1.0653 (1.1429) [2022-01-19 06:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][820/1251] eta 0:15:49 lr 0.000921 time 2.1352 (2.2036) loss 4.2727 (3.9472) grad_norm 1.2400 (1.1449) [2022-01-19 06:36:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][830/1251] eta 0:15:27 lr 0.000921 time 2.4963 (2.2038) loss 3.8696 (3.9475) grad_norm 1.0230 (1.1453) [2022-01-19 06:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][840/1251] eta 0:15:06 lr 0.000921 time 1.8131 (2.2046) loss 3.8522 (3.9455) grad_norm 1.0535 (1.1445) [2022-01-19 06:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][850/1251] eta 0:14:44 lr 0.000921 time 2.1426 (2.2063) loss 2.9779 (3.9475) grad_norm 0.9738 (1.1444) [2022-01-19 06:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][860/1251] eta 0:14:22 lr 0.000921 time 1.9152 (2.2065) loss 4.6131 (3.9480) grad_norm 1.1716 (1.1445) [2022-01-19 06:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][870/1251] eta 0:13:59 lr 0.000921 time 1.7578 (2.2043) loss 3.9806 (3.9458) grad_norm 1.1345 (1.1450) [2022-01-19 06:37:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][880/1251] eta 0:13:37 lr 0.000921 time 2.2682 (2.2030) loss 4.5664 (3.9499) grad_norm 1.2430 (1.1454) [2022-01-19 06:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][890/1251] eta 0:13:15 lr 0.000921 time 2.2688 (2.2025) loss 4.3071 (3.9534) grad_norm 1.1930 (1.1456) [2022-01-19 06:38:42 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][900/1251] eta 0:12:52 lr 0.000921 time 1.6324 (2.2012) loss 4.3587 (3.9501) grad_norm 1.0787 (1.1449) [2022-01-19 06:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][910/1251] eta 0:12:30 lr 0.000921 time 2.8399 (2.2016) loss 2.5691 (3.9520) grad_norm 1.2348 (1.1447) [2022-01-19 06:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][920/1251] eta 0:12:09 lr 0.000921 time 2.2074 (2.2041) loss 4.1830 (3.9509) grad_norm 1.1876 (1.1442) [2022-01-19 06:39:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][930/1251] eta 0:11:47 lr 0.000921 time 2.0985 (2.2036) loss 3.3483 (3.9492) grad_norm 1.1487 (1.1442) [2022-01-19 06:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][940/1251] eta 0:11:25 lr 0.000921 time 2.0258 (2.2056) loss 4.2352 (3.9498) grad_norm 1.0680 (1.1439) [2022-01-19 06:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][950/1251] eta 0:11:04 lr 0.000921 time 2.2297 (2.2063) loss 4.1832 (3.9468) grad_norm 0.9952 (1.1438) [2022-01-19 06:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][960/1251] eta 0:10:41 lr 0.000921 time 2.1354 (2.2047) loss 3.1434 (3.9452) grad_norm 1.0256 (1.1439) [2022-01-19 06:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][970/1251] eta 0:10:18 lr 0.000921 time 1.8278 (2.2011) loss 4.4489 (3.9449) grad_norm 1.1998 (1.1441) [2022-01-19 06:41:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][980/1251] eta 0:09:55 lr 0.000921 time 1.9201 (2.1986) loss 4.2566 (3.9461) grad_norm 1.0636 (1.1435) [2022-01-19 06:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][990/1251] eta 0:09:33 lr 0.000921 time 1.8396 (2.1966) loss 4.7645 (3.9485) grad_norm 1.2143 (1.1437) [2022-01-19 06:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1000/1251] eta 0:09:11 lr 0.000921 time 
4.7510 (2.1971) loss 4.0389 (3.9487) grad_norm 1.0878 (1.1429) [2022-01-19 06:42:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1010/1251] eta 0:08:50 lr 0.000921 time 2.8938 (2.1992) loss 4.5449 (3.9513) grad_norm 1.0561 (1.1424) [2022-01-19 06:43:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1020/1251] eta 0:08:27 lr 0.000921 time 2.8399 (2.1986) loss 3.6087 (3.9516) grad_norm 1.1091 (1.1423) [2022-01-19 06:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1030/1251] eta 0:08:06 lr 0.000921 time 2.2452 (2.1992) loss 3.9486 (3.9496) grad_norm 1.3737 (1.1421) [2022-01-19 06:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1040/1251] eta 0:07:44 lr 0.000921 time 2.5093 (2.1997) loss 3.3996 (3.9503) grad_norm 1.4017 (1.1425) [2022-01-19 06:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1050/1251] eta 0:07:22 lr 0.000921 time 2.7284 (2.2013) loss 4.2968 (3.9482) grad_norm 1.1663 (1.1427) [2022-01-19 06:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1060/1251] eta 0:07:00 lr 0.000921 time 2.5052 (2.2010) loss 3.8589 (3.9469) grad_norm 0.9814 (1.1416) [2022-01-19 06:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1070/1251] eta 0:06:38 lr 0.000921 time 1.8435 (2.1998) loss 4.4372 (3.9491) grad_norm 1.1348 (1.1409) [2022-01-19 06:45:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1080/1251] eta 0:06:15 lr 0.000921 time 2.4042 (2.1982) loss 4.0668 (3.9487) grad_norm 1.1276 (1.1407) [2022-01-19 06:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1090/1251] eta 0:05:53 lr 0.000921 time 2.1747 (2.1977) loss 4.7124 (3.9499) grad_norm 1.1124 (1.1408) [2022-01-19 06:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1100/1251] eta 0:05:32 lr 0.000920 time 2.4414 (2.1993) loss 4.6987 (3.9501) grad_norm 1.0053 (1.1400) [2022-01-19 06:46:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1110/1251] eta 0:05:10 lr 0.000920 time 1.8035 (2.1994) loss 3.5808 (3.9502) grad_norm 1.1636 (1.1397) [2022-01-19 06:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1120/1251] eta 0:04:48 lr 0.000920 time 2.2784 (2.2008) loss 3.2746 (3.9501) grad_norm 0.9763 (1.1392) [2022-01-19 06:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1130/1251] eta 0:04:26 lr 0.000920 time 2.1844 (2.2011) loss 3.2898 (3.9473) grad_norm 0.9912 (1.1392) [2022-01-19 06:47:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1140/1251] eta 0:04:04 lr 0.000920 time 1.7951 (2.1991) loss 4.1811 (3.9469) grad_norm 1.2988 (1.1394) [2022-01-19 06:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1150/1251] eta 0:03:41 lr 0.000920 time 1.9435 (2.1976) loss 4.3342 (3.9449) grad_norm 1.3915 (1.1393) [2022-01-19 06:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1160/1251] eta 0:03:19 lr 0.000920 time 2.1519 (2.1964) loss 4.2053 (3.9459) grad_norm 1.0782 (1.1395) [2022-01-19 06:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1170/1251] eta 0:02:57 lr 0.000920 time 1.9029 (2.1957) loss 4.5327 (3.9486) grad_norm 1.1310 (1.1391) [2022-01-19 06:48:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1180/1251] eta 0:02:35 lr 0.000920 time 1.8421 (2.1951) loss 3.0314 (3.9469) grad_norm 1.0622 (1.1386) [2022-01-19 06:49:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1190/1251] eta 0:02:13 lr 0.000920 time 2.8089 (2.1953) loss 3.9032 (3.9475) grad_norm 0.9024 (1.1388) [2022-01-19 06:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1200/1251] eta 0:01:51 lr 0.000920 time 2.3736 (2.1956) loss 4.3650 (3.9477) grad_norm 1.0421 (1.1381) [2022-01-19 06:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1210/1251] eta 0:01:30 lr 
0.000920 time 2.7935 (2.1963) loss 3.7454 (3.9446) grad_norm 1.0718 (1.1380) [2022-01-19 06:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1220/1251] eta 0:01:08 lr 0.000920 time 1.5049 (2.1960) loss 4.2147 (3.9426) grad_norm 1.0169 (1.1374) [2022-01-19 06:50:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1230/1251] eta 0:00:46 lr 0.000920 time 3.0654 (2.1976) loss 3.5492 (3.9410) grad_norm 1.2050 (1.1377) [2022-01-19 06:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1240/1251] eta 0:00:24 lr 0.000920 time 1.6943 (2.1969) loss 4.3122 (3.9399) grad_norm 1.0879 (1.1371) [2022-01-19 06:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [54/300][1250/1251] eta 0:00:02 lr 0.000920 time 1.2057 (2.1914) loss 3.5027 (3.9395) grad_norm 1.1783 (1.1371) [2022-01-19 06:51:20 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 54 training takes 0:45:41 [2022-01-19 06:51:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.692 (16.692) Loss 1.2295 (1.2295) Acc@1 71.777 (71.777) Acc@5 91.406 (91.406) [2022-01-19 06:51:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.925 (3.276) Loss 1.3025 (1.2977) Acc@1 70.117 (70.481) Acc@5 90.137 (89.968) [2022-01-19 06:52:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.904 (2.434) Loss 1.2462 (1.2910) Acc@1 72.363 (70.838) Acc@5 91.113 (90.132) [2022-01-19 06:52:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.934 (2.213) Loss 1.2990 (1.2861) Acc@1 70.996 (70.889) Acc@5 91.602 (90.222) [2022-01-19 06:52:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.114 (2.141) Loss 1.3235 (1.2893) Acc@1 70.410 (70.660) Acc@5 90.527 (90.277) [2022-01-19 06:52:55 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 70.790 Acc@5 90.278 [2022-01-19 06:52:55 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test 
images: 70.8% [2022-01-19 06:52:55 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 70.79% [2022-01-19 06:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][0/1251] eta 7:28:19 lr 0.000920 time 21.5025 (21.5025) loss 3.7080 (3.7080) grad_norm 1.1702 (1.1702) [2022-01-19 06:53:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][10/1251] eta 1:24:58 lr 0.000920 time 2.5684 (4.1087) loss 4.1766 (3.9279) grad_norm 1.2354 (1.1395) [2022-01-19 06:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][20/1251] eta 1:05:28 lr 0.000920 time 1.8408 (3.1913) loss 4.4378 (3.9900) grad_norm 1.1710 (1.1465) [2022-01-19 06:54:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][30/1251] eta 0:59:03 lr 0.000920 time 2.4436 (2.9017) loss 4.3832 (4.0563) grad_norm 1.2684 (1.1472) [2022-01-19 06:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][40/1251] eta 0:56:20 lr 0.000920 time 3.3445 (2.7916) loss 4.2905 (3.9885) grad_norm 1.1094 (1.1479) [2022-01-19 06:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][50/1251] eta 0:54:01 lr 0.000920 time 2.4232 (2.6988) loss 4.1139 (3.9427) grad_norm 0.9901 (1.1361) [2022-01-19 06:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][60/1251] eta 0:51:19 lr 0.000920 time 1.6771 (2.5855) loss 3.8091 (3.8939) grad_norm 1.0021 (1.1311) [2022-01-19 06:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][70/1251] eta 0:49:23 lr 0.000920 time 1.8665 (2.5096) loss 4.1279 (3.8699) grad_norm 1.3930 (1.1384) [2022-01-19 06:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][80/1251] eta 0:48:24 lr 0.000920 time 3.1264 (2.4806) loss 4.8003 (3.8841) grad_norm 1.3235 (1.1398) [2022-01-19 06:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][90/1251] eta 0:47:21 lr 0.000920 time 2.8306 (2.4471) loss 4.6516 (3.8990) grad_norm 1.2359 (1.1437) [2022-01-19 
06:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][100/1251] eta 0:46:48 lr 0.000920 time 1.9053 (2.4401) loss 4.2418 (3.8841) grad_norm 1.1419 (1.1395) [2022-01-19 06:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][110/1251] eta 0:46:08 lr 0.000920 time 2.1394 (2.4268) loss 4.6189 (3.8916) grad_norm 1.1623 (1.1384) [2022-01-19 06:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][120/1251] eta 0:45:35 lr 0.000920 time 3.2442 (2.4184) loss 3.3813 (3.9092) grad_norm 1.2146 (1.1463) [2022-01-19 06:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][130/1251] eta 0:44:39 lr 0.000920 time 1.9314 (2.3902) loss 3.8608 (3.9135) grad_norm 1.4694 (1.1435) [2022-01-19 06:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][140/1251] eta 0:43:45 lr 0.000920 time 1.8555 (2.3636) loss 4.3044 (3.9295) grad_norm 0.9243 (1.1351) [2022-01-19 06:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][150/1251] eta 0:43:06 lr 0.000920 time 2.1767 (2.3488) loss 3.9582 (3.9220) grad_norm 1.0784 (1.1351) [2022-01-19 06:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][160/1251] eta 0:42:12 lr 0.000920 time 1.7404 (2.3209) loss 4.0899 (3.9330) grad_norm 1.0935 (1.1310) [2022-01-19 06:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][170/1251] eta 0:41:32 lr 0.000920 time 1.8057 (2.3055) loss 3.8688 (3.9303) grad_norm 1.0704 (1.1297) [2022-01-19 06:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][180/1251] eta 0:40:55 lr 0.000920 time 2.4697 (2.2926) loss 3.4508 (3.9319) grad_norm 1.3668 (1.1284) [2022-01-19 07:00:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][190/1251] eta 0:40:26 lr 0.000920 time 1.6215 (2.2870) loss 3.3576 (3.9307) grad_norm 1.0975 (1.1295) [2022-01-19 07:00:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][200/1251] eta 0:40:00 lr 0.000920 
time 2.5230 (2.2843) loss 4.2179 (3.9316) grad_norm 1.4703 (1.1293) [2022-01-19 07:00:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][210/1251] eta 0:39:37 lr 0.000920 time 1.7484 (2.2835) loss 3.3861 (3.9291) grad_norm 1.0955 (1.1312) [2022-01-19 07:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][220/1251] eta 0:39:19 lr 0.000920 time 3.3488 (2.2890) loss 3.4665 (3.9320) grad_norm 1.2417 (1.1321) [2022-01-19 07:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][230/1251] eta 0:38:52 lr 0.000920 time 1.8313 (2.2844) loss 3.9018 (3.9281) grad_norm 1.4700 (1.1341) [2022-01-19 07:02:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][240/1251] eta 0:38:25 lr 0.000920 time 2.8547 (2.2804) loss 4.3535 (3.9177) grad_norm 1.3464 (1.1349) [2022-01-19 07:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][250/1251] eta 0:37:52 lr 0.000920 time 1.9621 (2.2703) loss 4.8155 (3.9332) grad_norm 1.0924 (1.1332) [2022-01-19 07:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][260/1251] eta 0:37:31 lr 0.000920 time 3.5436 (2.2722) loss 3.1837 (3.9404) grad_norm 1.2911 (1.1350) [2022-01-19 07:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][270/1251] eta 0:37:04 lr 0.000920 time 1.5687 (2.2675) loss 3.6625 (3.9279) grad_norm 0.9362 (1.1343) [2022-01-19 07:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][280/1251] eta 0:36:40 lr 0.000920 time 2.8732 (2.2664) loss 4.6418 (3.9155) grad_norm 1.0525 (1.1349) [2022-01-19 07:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][290/1251] eta 0:36:16 lr 0.000919 time 1.8497 (2.2649) loss 3.2666 (3.9040) grad_norm 1.0014 (1.1346) [2022-01-19 07:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][300/1251] eta 0:35:53 lr 0.000919 time 2.9800 (2.2645) loss 3.5889 (3.9009) grad_norm 1.4701 (1.1373) [2022-01-19 07:04:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][310/1251] eta 0:35:24 lr 0.000919 time 1.9521 (2.2579) loss 3.8231 (3.9048) grad_norm 1.2206 (1.1371) [2022-01-19 07:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][320/1251] eta 0:34:59 lr 0.000919 time 2.2571 (2.2548) loss 4.4284 (3.8994) grad_norm 1.0294 (1.1344) [2022-01-19 07:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][330/1251] eta 0:34:31 lr 0.000919 time 2.1530 (2.2491) loss 3.7201 (3.9003) grad_norm 0.9440 (1.1329) [2022-01-19 07:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][340/1251] eta 0:34:01 lr 0.000919 time 2.1814 (2.2411) loss 3.6779 (3.8930) grad_norm 1.0803 (1.1314) [2022-01-19 07:06:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][350/1251] eta 0:33:34 lr 0.000919 time 1.9363 (2.2363) loss 3.7542 (3.8954) grad_norm 1.0377 (1.1295) [2022-01-19 07:06:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][360/1251] eta 0:33:09 lr 0.000919 time 2.0335 (2.2332) loss 4.1874 (3.8947) grad_norm 1.1908 (1.1320) [2022-01-19 07:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][370/1251] eta 0:32:49 lr 0.000919 time 3.1063 (2.2358) loss 4.1046 (3.8986) grad_norm 1.0575 (1.1327) [2022-01-19 07:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][380/1251] eta 0:32:29 lr 0.000919 time 2.2690 (2.2377) loss 3.0938 (3.8993) grad_norm 1.1781 (1.1343) [2022-01-19 07:07:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][390/1251] eta 0:32:08 lr 0.000919 time 1.5671 (2.2395) loss 3.0617 (3.8948) grad_norm 1.9976 (1.1402) [2022-01-19 07:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][400/1251] eta 0:31:50 lr 0.000919 time 2.9093 (2.2445) loss 2.7006 (3.8930) grad_norm 1.1224 (1.1418) [2022-01-19 07:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][410/1251] eta 0:31:26 lr 0.000919 time 
2.1897 (2.2432) loss 4.0581 (3.8964) grad_norm 1.1231 (1.1412) [2022-01-19 07:08:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][420/1251] eta 0:30:58 lr 0.000919 time 1.8896 (2.2366) loss 4.1802 (3.8997) grad_norm 1.0024 (1.1418) [2022-01-19 07:08:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][430/1251] eta 0:30:30 lr 0.000919 time 2.2085 (2.2301) loss 3.9365 (3.8986) grad_norm 1.2962 (1.1408) [2022-01-19 07:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][440/1251] eta 0:30:06 lr 0.000919 time 1.9498 (2.2274) loss 4.3457 (3.8942) grad_norm 1.0575 (1.1415) [2022-01-19 07:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][450/1251] eta 0:29:42 lr 0.000919 time 2.5715 (2.2249) loss 4.8010 (3.8924) grad_norm 1.0828 (1.1409) [2022-01-19 07:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][460/1251] eta 0:29:20 lr 0.000919 time 2.8174 (2.2255) loss 3.9659 (3.8937) grad_norm 0.9317 (1.1382) [2022-01-19 07:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][470/1251] eta 0:28:59 lr 0.000919 time 2.8175 (2.2275) loss 3.1317 (3.8923) grad_norm 1.1515 (1.1366) [2022-01-19 07:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][480/1251] eta 0:28:39 lr 0.000919 time 2.5142 (2.2297) loss 4.5311 (3.8942) grad_norm 1.4064 (1.1363) [2022-01-19 07:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][490/1251] eta 0:28:19 lr 0.000919 time 2.9233 (2.2328) loss 2.9625 (3.8952) grad_norm 1.0259 (1.1353) [2022-01-19 07:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][500/1251] eta 0:27:55 lr 0.000919 time 2.8611 (2.2311) loss 2.9276 (3.8937) grad_norm 0.9385 (1.1356) [2022-01-19 07:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][510/1251] eta 0:27:30 lr 0.000919 time 1.6769 (2.2275) loss 4.7547 (3.8951) grad_norm 1.0895 (1.1369) [2022-01-19 07:12:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][520/1251] eta 0:27:05 lr 0.000919 time 1.9075 (2.2239) loss 4.0889 (3.8974) grad_norm 1.0895 (1.1367) [2022-01-19 07:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][530/1251] eta 0:26:41 lr 0.000919 time 2.2258 (2.2205) loss 4.2564 (3.8979) grad_norm 1.0215 (1.1351) [2022-01-19 07:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][540/1251] eta 0:26:19 lr 0.000919 time 2.5002 (2.2210) loss 3.8004 (3.8991) grad_norm 1.0873 (1.1350) [2022-01-19 07:13:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][550/1251] eta 0:25:57 lr 0.000919 time 1.9352 (2.2212) loss 4.3210 (3.9043) grad_norm 1.1605 (1.1340) [2022-01-19 07:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][560/1251] eta 0:25:33 lr 0.000919 time 1.9032 (2.2187) loss 3.6944 (3.9036) grad_norm 0.8897 (1.1342) [2022-01-19 07:13:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][570/1251] eta 0:25:07 lr 0.000919 time 2.3296 (2.2139) loss 4.1425 (3.9029) grad_norm 1.3353 (1.1343) [2022-01-19 07:14:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][580/1251] eta 0:24:44 lr 0.000919 time 2.0969 (2.2125) loss 4.5767 (3.9038) grad_norm 1.3718 (1.1357) [2022-01-19 07:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][590/1251] eta 0:24:20 lr 0.000919 time 2.0805 (2.2088) loss 3.2166 (3.9012) grad_norm 0.9545 (1.1358) [2022-01-19 07:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][600/1251] eta 0:23:59 lr 0.000919 time 1.8246 (2.2109) loss 3.6373 (3.8994) grad_norm 1.1516 (1.1372) [2022-01-19 07:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][610/1251] eta 0:23:37 lr 0.000919 time 2.2764 (2.2114) loss 4.7840 (3.8982) grad_norm 1.2052 (1.1366) [2022-01-19 07:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][620/1251] eta 0:23:18 lr 0.000919 time 
3.1365 (2.2163) loss 4.2614 (3.9022) grad_norm 1.1885 (1.1373) [2022-01-19 07:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][630/1251] eta 0:22:59 lr 0.000919 time 1.8007 (2.2216) loss 4.1715 (3.9064) grad_norm 1.1439 (1.1382) [2022-01-19 07:16:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][640/1251] eta 0:22:37 lr 0.000919 time 1.6289 (2.2221) loss 4.1620 (3.9076) grad_norm 1.0444 (1.1377) [2022-01-19 07:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][650/1251] eta 0:22:14 lr 0.000919 time 2.1671 (2.2203) loss 3.9291 (3.9080) grad_norm 1.1520 (1.1362) [2022-01-19 07:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][660/1251] eta 0:21:48 lr 0.000919 time 1.8647 (2.2147) loss 4.1439 (3.9085) grad_norm 1.2867 (1.1376) [2022-01-19 07:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][670/1251] eta 0:21:24 lr 0.000919 time 1.5800 (2.2106) loss 2.9917 (3.9031) grad_norm 1.1365 (1.1380) [2022-01-19 07:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][680/1251] eta 0:21:01 lr 0.000919 time 2.1595 (2.2096) loss 4.7781 (3.9061) grad_norm 1.0298 (1.1378) [2022-01-19 07:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][690/1251] eta 0:20:39 lr 0.000919 time 2.1398 (2.2097) loss 4.2343 (3.9037) grad_norm 0.9558 (1.1382) [2022-01-19 07:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][700/1251] eta 0:20:16 lr 0.000919 time 2.0972 (2.2080) loss 4.9024 (3.9020) grad_norm 1.5257 (1.1383) [2022-01-19 07:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][710/1251] eta 0:19:55 lr 0.000919 time 2.9420 (2.2105) loss 4.0861 (3.9013) grad_norm 1.2980 (1.1387) [2022-01-19 07:19:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][720/1251] eta 0:19:35 lr 0.000919 time 2.5967 (2.2145) loss 2.9989 (3.9000) grad_norm 1.1277 (1.1394) [2022-01-19 07:19:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][730/1251] eta 0:19:14 lr 0.000918 time 1.8532 (2.2158) loss 4.1565 (3.9021) grad_norm 1.2678 (1.1410) [2022-01-19 07:20:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][740/1251] eta 0:18:52 lr 0.000918 time 2.5169 (2.2172) loss 4.3159 (3.9035) grad_norm 1.0381 (1.1410) [2022-01-19 07:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][750/1251] eta 0:18:31 lr 0.000918 time 2.6008 (2.2189) loss 3.1602 (3.9034) grad_norm 1.0993 (1.1403) [2022-01-19 07:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][760/1251] eta 0:18:08 lr 0.000918 time 1.8673 (2.2162) loss 4.4040 (3.9023) grad_norm 1.1879 (1.1408) [2022-01-19 07:21:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][770/1251] eta 0:17:43 lr 0.000918 time 1.9392 (2.2117) loss 3.7174 (3.8975) grad_norm 1.2756 (1.1407) [2022-01-19 07:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][780/1251] eta 0:17:21 lr 0.000918 time 2.9231 (2.2118) loss 2.9237 (3.8952) grad_norm 1.1440 (1.1414) [2022-01-19 07:22:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][790/1251] eta 0:16:58 lr 0.000918 time 1.9168 (2.2101) loss 3.0289 (3.8948) grad_norm 1.4002 (1.1413) [2022-01-19 07:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][800/1251] eta 0:16:38 lr 0.000918 time 2.2295 (2.2132) loss 3.1788 (3.8939) grad_norm 0.9287 (1.1404) [2022-01-19 07:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][810/1251] eta 0:16:17 lr 0.000918 time 2.2328 (2.2160) loss 3.8251 (3.8955) grad_norm 1.1212 (1.1399) [2022-01-19 07:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][820/1251] eta 0:15:55 lr 0.000918 time 1.9565 (2.2165) loss 4.6275 (3.8941) grad_norm 1.1997 (1.1398) [2022-01-19 07:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][830/1251] eta 0:15:32 lr 0.000918 time 
1.9395 (2.2142) loss 4.1048 (3.8934) grad_norm 1.0686 (1.1389) [2022-01-19 07:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][840/1251] eta 0:15:09 lr 0.000918 time 2.1408 (2.2128) loss 3.8307 (3.8935) grad_norm 1.1624 (1.1387) [2022-01-19 07:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][850/1251] eta 0:14:46 lr 0.000918 time 1.8689 (2.2117) loss 2.7258 (3.8916) grad_norm 1.2753 (1.1389) [2022-01-19 07:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][860/1251] eta 0:14:25 lr 0.000918 time 2.5945 (2.2123) loss 3.7062 (3.8913) grad_norm 1.2371 (1.1403) [2022-01-19 07:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][870/1251] eta 0:14:02 lr 0.000918 time 2.6485 (2.2111) loss 3.2331 (3.8902) grad_norm 0.9098 (1.1391) [2022-01-19 07:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][880/1251] eta 0:13:39 lr 0.000918 time 1.8824 (2.2102) loss 3.9799 (3.8933) grad_norm 1.0157 (1.1387) [2022-01-19 07:25:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][890/1251] eta 0:13:17 lr 0.000918 time 2.6430 (2.2091) loss 3.5232 (3.8934) grad_norm 1.1879 (1.1382) [2022-01-19 07:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][900/1251] eta 0:12:55 lr 0.000918 time 1.8112 (2.2098) loss 4.0731 (3.8939) grad_norm 1.0581 (1.1373) [2022-01-19 07:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][910/1251] eta 0:12:33 lr 0.000918 time 2.1878 (2.2087) loss 3.4768 (3.8954) grad_norm 1.1538 (1.1375) [2022-01-19 07:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][920/1251] eta 0:12:10 lr 0.000918 time 2.1362 (2.2083) loss 3.6354 (3.8932) grad_norm 1.1259 (1.1376) [2022-01-19 07:27:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][930/1251] eta 0:11:49 lr 0.000918 time 2.4923 (2.2101) loss 4.4422 (3.8960) grad_norm 1.4242 (1.1386) [2022-01-19 07:27:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][940/1251] eta 0:11:27 lr 0.000918 time 2.1969 (2.2104) loss 4.1902 (3.8990) grad_norm 0.9833 (1.1389) [2022-01-19 07:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][950/1251] eta 0:11:05 lr 0.000918 time 2.3535 (2.2102) loss 4.4487 (3.9008) grad_norm 0.9900 (1.1380) [2022-01-19 07:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][960/1251] eta 0:10:42 lr 0.000918 time 1.6329 (2.2089) loss 4.6045 (3.9043) grad_norm 0.9344 (1.1370) [2022-01-19 07:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][970/1251] eta 0:10:20 lr 0.000918 time 2.0925 (2.2090) loss 3.1234 (3.9031) grad_norm 1.0770 (1.1363) [2022-01-19 07:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][980/1251] eta 0:09:58 lr 0.000918 time 1.5947 (2.2074) loss 2.8967 (3.9008) grad_norm 0.9549 (1.1361) [2022-01-19 07:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][990/1251] eta 0:09:35 lr 0.000918 time 1.8275 (2.2065) loss 4.1705 (3.9015) grad_norm 1.0365 (1.1355) [2022-01-19 07:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1000/1251] eta 0:09:13 lr 0.000918 time 1.8323 (2.2061) loss 4.2197 (3.9024) grad_norm 1.1956 (1.1350) [2022-01-19 07:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1010/1251] eta 0:08:51 lr 0.000918 time 2.0273 (2.2053) loss 2.9247 (3.8990) grad_norm 1.2548 (1.1351) [2022-01-19 07:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1020/1251] eta 0:08:29 lr 0.000918 time 2.1587 (2.2057) loss 3.1553 (3.9003) grad_norm 1.0454 (1.1351) [2022-01-19 07:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1030/1251] eta 0:08:07 lr 0.000918 time 2.3114 (2.2067) loss 4.0355 (3.9007) grad_norm 1.4046 (1.1350) [2022-01-19 07:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1040/1251] eta 0:07:45 lr 0.000918 time 
2.3265 (2.2068) loss 4.1478 (3.9015) grad_norm 1.2051 (1.1363) [2022-01-19 07:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1050/1251] eta 0:07:23 lr 0.000918 time 1.7516 (2.2065) loss 3.1111 (3.9005) grad_norm 1.2630 (1.1364) [2022-01-19 07:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1060/1251] eta 0:07:01 lr 0.000918 time 1.7893 (2.2064) loss 3.7832 (3.9008) grad_norm 0.9735 (1.1357) [2022-01-19 07:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1070/1251] eta 0:06:39 lr 0.000918 time 1.8353 (2.2062) loss 2.9722 (3.8963) grad_norm 1.1385 (1.1361) [2022-01-19 07:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1080/1251] eta 0:06:17 lr 0.000918 time 2.1545 (2.2051) loss 2.7462 (3.8954) grad_norm 1.2989 (1.1359) [2022-01-19 07:33:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1090/1251] eta 0:05:55 lr 0.000918 time 2.8547 (2.2056) loss 4.5461 (3.8961) grad_norm 1.1761 (1.1357) [2022-01-19 07:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1100/1251] eta 0:05:32 lr 0.000918 time 1.9184 (2.2048) loss 4.0624 (3.8964) grad_norm 1.1603 (1.1354) [2022-01-19 07:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1110/1251] eta 0:05:11 lr 0.000918 time 1.8894 (2.2061) loss 4.4365 (3.8982) grad_norm 0.9873 (1.1352) [2022-01-19 07:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1120/1251] eta 0:04:49 lr 0.000918 time 2.8735 (2.2065) loss 4.2492 (3.8979) grad_norm 1.0185 (1.1352) [2022-01-19 07:34:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1130/1251] eta 0:04:26 lr 0.000918 time 2.1655 (2.2058) loss 4.1985 (3.9000) grad_norm 1.2142 (1.1354) [2022-01-19 07:34:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1140/1251] eta 0:04:04 lr 0.000918 time 1.8226 (2.2032) loss 4.2782 (3.9009) grad_norm 1.0755 (1.1351) [2022-01-19 07:35:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1150/1251] eta 0:03:42 lr 0.000918 time 2.3691 (2.2027) loss 4.6649 (3.9010) grad_norm 1.2905 (1.1353) [2022-01-19 07:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1160/1251] eta 0:03:20 lr 0.000918 time 2.0979 (2.2011) loss 2.8720 (3.8997) grad_norm 0.9983 (1.1343) [2022-01-19 07:35:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1170/1251] eta 0:02:58 lr 0.000917 time 1.9165 (2.2001) loss 4.3936 (3.8996) grad_norm 1.2133 (1.1336) [2022-01-19 07:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1180/1251] eta 0:02:36 lr 0.000917 time 2.6895 (2.1996) loss 2.7360 (3.8980) grad_norm 1.0391 (1.1328) [2022-01-19 07:36:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1190/1251] eta 0:02:14 lr 0.000917 time 2.0345 (2.1986) loss 3.2875 (3.8972) grad_norm 1.2150 (1.1326) [2022-01-19 07:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1200/1251] eta 0:01:52 lr 0.000917 time 1.8316 (2.1986) loss 2.7249 (3.8940) grad_norm 1.0538 (1.1327) [2022-01-19 07:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1210/1251] eta 0:01:30 lr 0.000917 time 2.1979 (2.1991) loss 4.1109 (3.8955) grad_norm 1.0587 (1.1322) [2022-01-19 07:37:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1220/1251] eta 0:01:08 lr 0.000917 time 2.2148 (2.1996) loss 4.3747 (3.8933) grad_norm 1.0391 (1.1320) [2022-01-19 07:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1230/1251] eta 0:00:46 lr 0.000917 time 2.2122 (2.2008) loss 4.3545 (3.8923) grad_norm 1.2998 (1.1317) [2022-01-19 07:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1240/1251] eta 0:00:24 lr 0.000917 time 1.9400 (2.2018) loss 4.2464 (3.8916) grad_norm 1.2522 (1.1323) [2022-01-19 07:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [55/300][1250/1251] eta 0:00:02 lr 
0.000917 time 1.0832 (2.1968) loss 4.3203 (3.8926) grad_norm 1.2927 (1.1333) [2022-01-19 07:38:44 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 55 training takes 0:45:48 [2022-01-19 07:39:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.002 (18.002) Loss 1.2802 (1.2802) Acc@1 68.945 (68.945) Acc@5 89.746 (89.746) [2022-01-19 07:39:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.261 (3.318) Loss 1.1826 (1.2699) Acc@1 72.461 (70.082) Acc@5 92.090 (90.572) [2022-01-19 07:39:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.621 (2.479) Loss 1.2348 (1.2758) Acc@1 71.973 (70.140) Acc@5 90.332 (90.397) [2022-01-19 07:39:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.288 (2.228) Loss 1.2301 (1.2701) Acc@1 71.582 (70.322) Acc@5 92.285 (90.609) [2022-01-19 07:40:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.057 (2.162) Loss 1.2529 (1.2702) Acc@1 70.801 (70.363) Acc@5 91.016 (90.580) [2022-01-19 07:40:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 70.426 Acc@5 90.504 [2022-01-19 07:40:20 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 70.4% [2022-01-19 07:40:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 70.79% [2022-01-19 07:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][0/1251] eta 7:31:30 lr 0.000917 time 21.6550 (21.6550) loss 3.5418 (3.5418) grad_norm 1.1203 (1.1203) [2022-01-19 07:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][10/1251] eta 1:23:57 lr 0.000917 time 2.2375 (4.0591) loss 2.9551 (3.6693) grad_norm 1.3213 (1.2343) [2022-01-19 07:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][20/1251] eta 1:05:11 lr 0.000917 time 1.5574 (3.1775) loss 4.5218 (3.9305) grad_norm 1.1409 (1.1933) [2022-01-19 07:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[56/300][30/1251] eta 0:57:45 lr 0.000917 time 1.6755 (2.8382) loss 3.9828 (3.9326) grad_norm 1.2707 (1.1796) [2022-01-19 07:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][40/1251] eta 0:55:00 lr 0.000917 time 3.6684 (2.7251) loss 3.8969 (3.8805) grad_norm 1.1379 (1.1693) [2022-01-19 07:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][50/1251] eta 0:53:45 lr 0.000917 time 3.3095 (2.6861) loss 3.3430 (3.8384) grad_norm 1.1399 (1.1608) [2022-01-19 07:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][60/1251] eta 0:51:42 lr 0.000917 time 1.7780 (2.6047) loss 3.8706 (3.8616) grad_norm 1.3152 (1.1594) [2022-01-19 07:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][70/1251] eta 0:50:00 lr 0.000917 time 1.6876 (2.5409) loss 4.4755 (3.8695) grad_norm 1.1570 (1.1691) [2022-01-19 07:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][80/1251] eta 0:49:06 lr 0.000917 time 3.5459 (2.5160) loss 4.5156 (3.8500) grad_norm 1.0780 (1.1627) [2022-01-19 07:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][90/1251] eta 0:48:07 lr 0.000917 time 2.2769 (2.4873) loss 4.7846 (3.8510) grad_norm 1.5422 (1.1692) [2022-01-19 07:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][100/1251] eta 0:46:43 lr 0.000917 time 1.6858 (2.4359) loss 3.3101 (3.8782) grad_norm 1.0303 (1.1615) [2022-01-19 07:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][110/1251] eta 0:45:27 lr 0.000917 time 1.8469 (2.3909) loss 3.9647 (3.8916) grad_norm 1.0301 (1.1537) [2022-01-19 07:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][120/1251] eta 0:44:42 lr 0.000917 time 2.5162 (2.3721) loss 2.7386 (3.8617) grad_norm 1.1181 (1.1526) [2022-01-19 07:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][130/1251] eta 0:44:04 lr 0.000917 time 1.8919 (2.3594) loss 3.2909 (3.8704) grad_norm 1.2142 (1.1556) 
[2022-01-19 07:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][140/1251] eta 0:43:27 lr 0.000917 time 2.0635 (2.3473) loss 4.3127 (3.8852) grad_norm 1.3110 (1.1533) [2022-01-19 07:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][150/1251] eta 0:42:55 lr 0.000917 time 2.1121 (2.3394) loss 3.5852 (3.9053) grad_norm 0.9360 (1.1527) [2022-01-19 07:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][160/1251] eta 0:42:28 lr 0.000917 time 3.2770 (2.3355) loss 3.8153 (3.8866) grad_norm 1.0589 (1.1509) [2022-01-19 07:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][170/1251] eta 0:41:53 lr 0.000917 time 2.2443 (2.3256) loss 4.0183 (3.8726) grad_norm 1.1889 (1.1479) [2022-01-19 07:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][180/1251] eta 0:41:21 lr 0.000917 time 2.2042 (2.3166) loss 3.1915 (3.8728) grad_norm 1.2665 (1.1455) [2022-01-19 07:47:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][190/1251] eta 0:40:58 lr 0.000917 time 2.1782 (2.3171) loss 4.0563 (3.8791) grad_norm 1.1276 (1.1450) [2022-01-19 07:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][200/1251] eta 0:40:30 lr 0.000917 time 2.8827 (2.3130) loss 3.3445 (3.8659) grad_norm 1.1325 (1.1435) [2022-01-19 07:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][210/1251] eta 0:39:49 lr 0.000917 time 1.5309 (2.2953) loss 3.2735 (3.8744) grad_norm 1.1950 (1.1409) [2022-01-19 07:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][220/1251] eta 0:39:10 lr 0.000917 time 1.8723 (2.2803) loss 4.9925 (3.8802) grad_norm 1.1822 (1.1384) [2022-01-19 07:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][230/1251] eta 0:38:37 lr 0.000917 time 1.8833 (2.2700) loss 4.2037 (3.8761) grad_norm 1.2185 (1.1380) [2022-01-19 07:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][240/1251] eta 0:38:08 
lr 0.000917 time 1.9973 (2.2633) loss 3.7226 (3.8838) grad_norm 1.1387 (1.1411) [2022-01-19 07:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][250/1251] eta 0:37:39 lr 0.000917 time 2.1443 (2.2577) loss 3.5898 (3.8815) grad_norm 1.1378 (1.1394) [2022-01-19 07:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][260/1251] eta 0:37:17 lr 0.000917 time 2.2028 (2.2574) loss 4.2768 (3.8768) grad_norm 1.4877 (1.1429) [2022-01-19 07:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][270/1251] eta 0:36:54 lr 0.000917 time 2.0904 (2.2570) loss 3.3533 (3.8746) grad_norm 1.1332 (1.1431) [2022-01-19 07:50:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][280/1251] eta 0:36:31 lr 0.000917 time 2.4782 (2.2572) loss 4.2029 (3.8810) grad_norm 1.0845 (1.1405) [2022-01-19 07:51:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][290/1251] eta 0:36:07 lr 0.000917 time 2.0199 (2.2553) loss 4.2193 (3.8891) grad_norm 1.2908 (1.1397) [2022-01-19 07:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][300/1251] eta 0:35:43 lr 0.000917 time 2.5812 (2.2541) loss 4.6754 (3.8886) grad_norm 1.0357 (1.1368) [2022-01-19 07:52:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][310/1251] eta 0:35:24 lr 0.000917 time 3.6624 (2.2579) loss 3.0609 (3.8845) grad_norm 1.3983 (1.1372) [2022-01-19 07:52:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][320/1251] eta 0:34:55 lr 0.000917 time 1.7324 (2.2513) loss 3.8188 (3.8884) grad_norm 1.2364 (1.1365) [2022-01-19 07:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][330/1251] eta 0:34:32 lr 0.000917 time 1.6815 (2.2498) loss 4.6254 (3.8963) grad_norm 0.8836 (1.1367) [2022-01-19 07:53:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][340/1251] eta 0:34:05 lr 0.000917 time 2.6229 (2.2451) loss 4.0378 (3.8981) grad_norm 1.0434 (1.1355) [2022-01-19 07:53:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][350/1251] eta 0:33:42 lr 0.000916 time 3.1893 (2.2453) loss 3.7876 (3.8996) grad_norm 1.0764 (1.1373) [2022-01-19 07:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][360/1251] eta 0:33:17 lr 0.000916 time 2.5752 (2.2415) loss 2.9051 (3.9001) grad_norm 1.4485 (1.1394) [2022-01-19 07:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][370/1251] eta 0:32:50 lr 0.000916 time 2.1565 (2.2367) loss 4.1350 (3.8994) grad_norm 1.1751 (1.1413) [2022-01-19 07:54:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][380/1251] eta 0:32:26 lr 0.000916 time 2.2626 (2.2351) loss 4.2918 (3.9008) grad_norm 1.0274 (1.1403) [2022-01-19 07:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][390/1251] eta 0:32:03 lr 0.000916 time 3.3038 (2.2339) loss 3.9260 (3.9027) grad_norm 1.4265 (1.1416) [2022-01-19 07:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][400/1251] eta 0:31:39 lr 0.000916 time 2.1321 (2.2326) loss 4.4050 (3.9083) grad_norm 1.1366 (1.1411) [2022-01-19 07:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][410/1251] eta 0:31:16 lr 0.000916 time 1.9506 (2.2308) loss 4.7820 (3.9071) grad_norm 1.1517 (1.1405) [2022-01-19 07:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][420/1251] eta 0:30:53 lr 0.000916 time 2.4936 (2.2307) loss 4.5536 (3.9043) grad_norm 0.9404 (1.1414) [2022-01-19 07:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][430/1251] eta 0:30:35 lr 0.000916 time 4.0026 (2.2354) loss 3.9481 (3.9035) grad_norm 0.9950 (1.1410) [2022-01-19 07:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][440/1251] eta 0:30:14 lr 0.000916 time 2.4555 (2.2369) loss 4.3579 (3.8990) grad_norm 1.2081 (1.1412) [2022-01-19 07:57:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][450/1251] eta 0:29:47 lr 0.000916 time 
2.2900 (2.2320) loss 3.8136 (3.9046) grad_norm 1.0122 (1.1417) [2022-01-19 07:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][460/1251] eta 0:29:22 lr 0.000916 time 2.1968 (2.2280) loss 4.5545 (3.9056) grad_norm 1.2420 (1.1435) [2022-01-19 07:57:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][470/1251] eta 0:29:04 lr 0.000916 time 3.6324 (2.2332) loss 4.1055 (3.9048) grad_norm 1.2150 (1.1432) [2022-01-19 07:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][480/1251] eta 0:28:44 lr 0.000916 time 2.1432 (2.2364) loss 3.8398 (3.9105) grad_norm 1.0930 (1.1427) [2022-01-19 07:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][490/1251] eta 0:28:21 lr 0.000916 time 1.8560 (2.2354) loss 4.3253 (3.9075) grad_norm 1.2011 (1.1431) [2022-01-19 07:58:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][500/1251] eta 0:27:57 lr 0.000916 time 2.8444 (2.2334) loss 4.3035 (3.9067) grad_norm 1.1444 (1.1427) [2022-01-19 07:59:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][510/1251] eta 0:27:31 lr 0.000916 time 1.5693 (2.2285) loss 4.1210 (3.9045) grad_norm 1.0514 (1.1418) [2022-01-19 07:59:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][520/1251] eta 0:27:06 lr 0.000916 time 2.2543 (2.2252) loss 3.3698 (3.9052) grad_norm 1.0527 (1.1422) [2022-01-19 08:00:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][530/1251] eta 0:26:42 lr 0.000916 time 2.1654 (2.2225) loss 3.3470 (3.9026) grad_norm 1.2497 (1.1430) [2022-01-19 08:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][540/1251] eta 0:26:17 lr 0.000916 time 1.9491 (2.2186) loss 4.2366 (3.9068) grad_norm 1.0022 (1.1449) [2022-01-19 08:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][550/1251] eta 0:25:54 lr 0.000916 time 1.6056 (2.2178) loss 3.4028 (3.9114) grad_norm 1.1005 (1.1446) [2022-01-19 08:01:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][560/1251] eta 0:25:32 lr 0.000916 time 3.1088 (2.2178) loss 2.9825 (3.9115) grad_norm 0.9624 (1.1442) [2022-01-19 08:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][570/1251] eta 0:25:10 lr 0.000916 time 2.2281 (2.2180) loss 3.6523 (3.9111) grad_norm 1.1515 (1.1452) [2022-01-19 08:01:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][580/1251] eta 0:24:48 lr 0.000916 time 1.9329 (2.2186) loss 4.5646 (3.9143) grad_norm 0.9961 (1.1437) [2022-01-19 08:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][590/1251] eta 0:24:27 lr 0.000916 time 2.1327 (2.2205) loss 4.1316 (3.9120) grad_norm 1.3222 (1.1438) [2022-01-19 08:02:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][600/1251] eta 0:24:05 lr 0.000916 time 2.1375 (2.2205) loss 3.9727 (3.9128) grad_norm 1.0295 (1.1448) [2022-01-19 08:02:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][610/1251] eta 0:23:43 lr 0.000916 time 2.5301 (2.2207) loss 3.4117 (3.9169) grad_norm 1.0686 (1.1443) [2022-01-19 08:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][620/1251] eta 0:23:19 lr 0.000916 time 1.8422 (2.2181) loss 3.9278 (3.9174) grad_norm 1.0205 (1.1433) [2022-01-19 08:03:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][630/1251] eta 0:22:56 lr 0.000916 time 2.1254 (2.2159) loss 4.2794 (3.9182) grad_norm 1.3240 (1.1437) [2022-01-19 08:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][640/1251] eta 0:22:33 lr 0.000916 time 2.2429 (2.2156) loss 4.3179 (3.9172) grad_norm 1.2018 (1.1439) [2022-01-19 08:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][650/1251] eta 0:22:11 lr 0.000916 time 2.6697 (2.2154) loss 4.0706 (3.9203) grad_norm 1.1535 (1.1430) [2022-01-19 08:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][660/1251] eta 0:21:49 lr 0.000916 time 
1.5805 (2.2150) loss 4.0902 (3.9202) grad_norm 1.0010 (1.1431) [2022-01-19 08:05:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][670/1251] eta 0:21:25 lr 0.000916 time 1.6199 (2.2131) loss 4.3727 (3.9219) grad_norm 1.2682 (1.1436) [2022-01-19 08:05:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][680/1251] eta 0:21:03 lr 0.000916 time 2.1341 (2.2131) loss 3.7748 (3.9246) grad_norm 1.1113 (1.1435) [2022-01-19 08:05:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][690/1251] eta 0:20:41 lr 0.000916 time 2.2674 (2.2131) loss 3.9926 (3.9229) grad_norm 1.0756 (1.1441) [2022-01-19 08:06:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][700/1251] eta 0:20:19 lr 0.000916 time 2.1622 (2.2127) loss 3.1460 (3.9182) grad_norm 1.0531 (1.1434) [2022-01-19 08:06:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][710/1251] eta 0:19:57 lr 0.000916 time 1.8609 (2.2140) loss 3.8241 (3.9182) grad_norm 1.0593 (1.1426) [2022-01-19 08:06:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][720/1251] eta 0:19:35 lr 0.000916 time 2.1090 (2.2128) loss 4.5150 (3.9156) grad_norm 1.0893 (1.1425) [2022-01-19 08:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][730/1251] eta 0:19:12 lr 0.000916 time 2.4858 (2.2122) loss 2.8707 (3.9159) grad_norm 1.0815 (1.1424) [2022-01-19 08:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][740/1251] eta 0:18:49 lr 0.000916 time 1.6818 (2.2097) loss 4.1593 (3.9102) grad_norm 0.9279 (1.1423) [2022-01-19 08:08:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][750/1251] eta 0:18:27 lr 0.000916 time 2.3714 (2.2110) loss 4.6671 (3.9100) grad_norm 1.0070 (1.1423) [2022-01-19 08:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][760/1251] eta 0:18:05 lr 0.000916 time 1.5462 (2.2101) loss 4.1696 (3.9088) grad_norm 1.1033 (1.1427) [2022-01-19 08:08:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][770/1251] eta 0:17:43 lr 0.000916 time 2.5965 (2.2102) loss 4.7381 (3.9105) grad_norm 1.5198 (1.1437) [2022-01-19 08:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][780/1251] eta 0:17:20 lr 0.000915 time 1.8336 (2.2083) loss 4.2040 (3.9102) grad_norm 1.0790 (1.1442) [2022-01-19 08:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][790/1251] eta 0:16:57 lr 0.000915 time 1.7699 (2.2073) loss 3.8190 (3.9080) grad_norm 1.0964 (1.1444) [2022-01-19 08:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][800/1251] eta 0:16:34 lr 0.000915 time 1.8946 (2.2050) loss 4.0064 (3.9099) grad_norm 0.9029 (1.1454) [2022-01-19 08:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][810/1251] eta 0:16:11 lr 0.000915 time 2.4641 (2.2037) loss 3.7567 (3.9123) grad_norm 1.1893 (1.1462) [2022-01-19 08:10:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][820/1251] eta 0:15:50 lr 0.000915 time 2.1883 (2.2048) loss 3.8454 (3.9102) grad_norm 1.2280 (1.1474) [2022-01-19 08:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][830/1251] eta 0:15:29 lr 0.000915 time 2.1890 (2.2082) loss 4.2466 (3.9083) grad_norm 1.0589 (1.1465) [2022-01-19 08:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][840/1251] eta 0:15:07 lr 0.000915 time 2.2377 (2.2078) loss 4.4748 (3.9085) grad_norm 1.0112 (1.1466) [2022-01-19 08:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][850/1251] eta 0:14:44 lr 0.000915 time 1.8616 (2.2067) loss 4.5095 (3.9083) grad_norm 1.1743 (1.1466) [2022-01-19 08:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][860/1251] eta 0:14:22 lr 0.000915 time 2.4193 (2.2052) loss 3.6605 (3.9129) grad_norm 0.9437 (1.1460) [2022-01-19 08:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][870/1251] eta 0:14:00 lr 0.000915 time 
1.7726 (2.2055) loss 4.2685 (3.9122) grad_norm 1.0155 (1.1454) [2022-01-19 08:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][880/1251] eta 0:13:38 lr 0.000915 time 2.0041 (2.2056) loss 3.7525 (3.9091) grad_norm 1.0460 (1.1453) [2022-01-19 08:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][890/1251] eta 0:13:16 lr 0.000915 time 2.4926 (2.2059) loss 3.4125 (3.9115) grad_norm 1.5430 (1.1451) [2022-01-19 08:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][900/1251] eta 0:12:53 lr 0.000915 time 1.8584 (2.2042) loss 3.9889 (3.9147) grad_norm 1.2049 (1.1448) [2022-01-19 08:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][910/1251] eta 0:12:31 lr 0.000915 time 1.6538 (2.2032) loss 4.5249 (3.9168) grad_norm 0.9836 (1.1447) [2022-01-19 08:14:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][920/1251] eta 0:12:09 lr 0.000915 time 2.0287 (2.2029) loss 3.6655 (3.9165) grad_norm 1.2049 (1.1450) [2022-01-19 08:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][930/1251] eta 0:11:48 lr 0.000915 time 1.7635 (2.2057) loss 4.1697 (3.9165) grad_norm 1.0865 (1.1448) [2022-01-19 08:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][940/1251] eta 0:11:25 lr 0.000915 time 1.8852 (2.2037) loss 3.5598 (3.9147) grad_norm 1.1763 (1.1450) [2022-01-19 08:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][950/1251] eta 0:11:03 lr 0.000915 time 1.9232 (2.2034) loss 4.3307 (3.9138) grad_norm 0.9380 (1.1440) [2022-01-19 08:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][960/1251] eta 0:10:40 lr 0.000915 time 1.9234 (2.2021) loss 4.4820 (3.9151) grad_norm 1.0583 (1.1438) [2022-01-19 08:15:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][970/1251] eta 0:10:18 lr 0.000915 time 2.1235 (2.2021) loss 3.6071 (3.9155) grad_norm 1.0444 (1.1437) [2022-01-19 08:16:20 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][980/1251] eta 0:09:56 lr 0.000915 time 1.9913 (2.2023) loss 3.4004 (3.9175) grad_norm 0.9658 (1.1427) [2022-01-19 08:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][990/1251] eta 0:09:34 lr 0.000915 time 1.8307 (2.2019) loss 4.4401 (3.9207) grad_norm 1.6090 (1.1427) [2022-01-19 08:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1000/1251] eta 0:09:12 lr 0.000915 time 1.9353 (2.2027) loss 4.3286 (3.9223) grad_norm 1.1879 (1.1425) [2022-01-19 08:17:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1010/1251] eta 0:08:51 lr 0.000915 time 2.8590 (2.2046) loss 3.6336 (3.9189) grad_norm 1.0320 (1.1420) [2022-01-19 08:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1020/1251] eta 0:08:28 lr 0.000915 time 2.0242 (2.2033) loss 3.8823 (3.9166) grad_norm 1.1300 (1.1415) [2022-01-19 08:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1030/1251] eta 0:08:06 lr 0.000915 time 1.9059 (2.2021) loss 4.8598 (3.9162) grad_norm 1.1142 (1.1413) [2022-01-19 08:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1040/1251] eta 0:07:44 lr 0.000915 time 2.4461 (2.2020) loss 3.9586 (3.9155) grad_norm 1.2432 (1.1407) [2022-01-19 08:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1050/1251] eta 0:07:22 lr 0.000915 time 2.3907 (2.2010) loss 3.5226 (3.9129) grad_norm 1.1583 (1.1402) [2022-01-19 08:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1060/1251] eta 0:07:00 lr 0.000915 time 1.5978 (2.2008) loss 4.2022 (3.9127) grad_norm 1.0484 (1.1399) [2022-01-19 08:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1070/1251] eta 0:06:38 lr 0.000915 time 1.8285 (2.2009) loss 2.7346 (3.9137) grad_norm 1.1338 (1.1396) [2022-01-19 08:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1080/1251] eta 0:06:16 lr 0.000915 
time 2.3311 (2.1996) loss 4.3782 (3.9143) grad_norm 1.5516 (1.1396) [2022-01-19 08:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1090/1251] eta 0:05:54 lr 0.000915 time 2.0947 (2.1992) loss 3.4698 (3.9125) grad_norm 1.4433 (1.1400) [2022-01-19 08:20:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1100/1251] eta 0:05:32 lr 0.000915 time 1.6103 (2.1987) loss 3.1033 (3.9130) grad_norm 1.1644 (1.1398) [2022-01-19 08:21:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1110/1251] eta 0:05:10 lr 0.000915 time 2.2486 (2.1991) loss 3.5547 (3.9127) grad_norm 1.2883 (1.1406) [2022-01-19 08:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1120/1251] eta 0:04:47 lr 0.000915 time 2.4865 (2.1984) loss 3.5625 (3.9126) grad_norm 1.3013 (1.1407) [2022-01-19 08:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1130/1251] eta 0:04:25 lr 0.000915 time 1.8028 (2.1981) loss 4.2609 (3.9143) grad_norm 1.0233 (1.1402) [2022-01-19 08:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1140/1251] eta 0:04:04 lr 0.000915 time 2.1203 (2.1992) loss 2.8761 (3.9111) grad_norm 0.9460 (1.1395) [2022-01-19 08:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1150/1251] eta 0:03:42 lr 0.000915 time 1.9270 (2.1989) loss 4.2103 (3.9102) grad_norm 1.1848 (1.1392) [2022-01-19 08:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1160/1251] eta 0:03:20 lr 0.000915 time 2.1491 (2.1988) loss 4.1017 (3.9096) grad_norm 1.1373 (1.1390) [2022-01-19 08:23:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1170/1251] eta 0:02:58 lr 0.000915 time 1.5604 (2.1981) loss 4.0037 (3.9090) grad_norm 1.0027 (1.1388) [2022-01-19 08:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1180/1251] eta 0:02:36 lr 0.000915 time 1.9486 (2.1985) loss 4.0300 (3.9099) grad_norm 1.1126 (1.1388) [2022-01-19 08:23:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1190/1251] eta 0:02:14 lr 0.000915 time 3.1062 (2.1986) loss 4.1227 (3.9136) grad_norm 1.0350 (1.1387) [2022-01-19 08:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1200/1251] eta 0:01:52 lr 0.000915 time 1.7992 (2.1973) loss 3.2813 (3.9128) grad_norm 1.0824 (1.1383) [2022-01-19 08:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1210/1251] eta 0:01:30 lr 0.000915 time 2.1401 (2.1960) loss 3.9901 (3.9125) grad_norm 1.3246 (1.1390) [2022-01-19 08:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1220/1251] eta 0:01:08 lr 0.000914 time 2.2922 (2.1963) loss 4.4970 (3.9148) grad_norm 1.1012 (1.1388) [2022-01-19 08:25:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1230/1251] eta 0:00:46 lr 0.000914 time 3.5677 (2.1972) loss 3.9428 (3.9128) grad_norm 1.0210 (1.1396) [2022-01-19 08:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1240/1251] eta 0:00:24 lr 0.000914 time 1.1834 (2.1964) loss 3.8611 (3.9121) grad_norm 1.1627 (1.1399) [2022-01-19 08:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [56/300][1250/1251] eta 0:00:02 lr 0.000914 time 1.1823 (2.1911) loss 4.2013 (3.9120) grad_norm 1.1328 (1.1401) [2022-01-19 08:26:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 56 training takes 0:45:41 [2022-01-19 08:26:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.136 (18.136) Loss 1.2248 (1.2248) Acc@1 71.582 (71.582) Acc@5 89.941 (89.941) [2022-01-19 08:26:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.332 (3.203) Loss 1.2421 (1.2355) Acc@1 70.410 (70.845) Acc@5 90.527 (90.732) [2022-01-19 08:26:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.949 (2.513) Loss 1.2729 (1.2469) Acc@1 69.141 (70.736) Acc@5 90.137 (90.402) [2022-01-19 08:27:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: 
[30/49] Time 0.933 (2.265) Loss 1.2634 (1.2450) Acc@1 70.117 (70.820) Acc@5 89.941 (90.427) [2022-01-19 08:27:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.844 (2.158) Loss 1.3025 (1.2426) Acc@1 70.410 (70.965) Acc@5 89.062 (90.494) [2022-01-19 08:27:37 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 70.934 Acc@5 90.542 [2022-01-19 08:27:37 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 70.9% [2022-01-19 08:27:37 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 70.93% [2022-01-19 08:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][0/1251] eta 7:25:55 lr 0.000914 time 21.3872 (21.3872) loss 3.0124 (3.0124) grad_norm 1.1637 (1.1637) [2022-01-19 08:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][10/1251] eta 1:22:10 lr 0.000914 time 2.2160 (3.9734) loss 4.1881 (4.1118) grad_norm 1.2275 (1.0946) [2022-01-19 08:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][20/1251] eta 1:02:50 lr 0.000914 time 1.3912 (3.0630) loss 3.2231 (3.7451) grad_norm 1.0245 (1.1253) [2022-01-19 08:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][30/1251] eta 0:56:10 lr 0.000914 time 1.5572 (2.7608) loss 2.8776 (3.7234) grad_norm 1.1887 (1.1425) [2022-01-19 08:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][40/1251] eta 0:53:37 lr 0.000914 time 3.1523 (2.6566) loss 4.2343 (3.7562) grad_norm 0.9523 (1.1432) [2022-01-19 08:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][50/1251] eta 0:52:22 lr 0.000914 time 2.6917 (2.6162) loss 4.0097 (3.7322) grad_norm 1.0564 (1.1327) [2022-01-19 08:30:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][60/1251] eta 0:50:32 lr 0.000914 time 2.1981 (2.5462) loss 3.1455 (3.7456) grad_norm 1.0979 (1.1273) [2022-01-19 08:30:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][70/1251] eta 
0:49:13 lr 0.000914 time 1.6661 (2.5011) loss 4.1976 (3.7539) grad_norm 1.1150 (1.1361) [2022-01-19 08:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][80/1251] eta 0:47:57 lr 0.000914 time 2.2082 (2.4572) loss 4.4042 (3.7839) grad_norm 0.9971 (1.1346) [2022-01-19 08:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][90/1251] eta 0:47:05 lr 0.000914 time 2.3188 (2.4338) loss 4.0657 (3.8106) grad_norm 1.3390 (1.1372) [2022-01-19 08:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][100/1251] eta 0:46:29 lr 0.000914 time 2.4135 (2.4239) loss 4.2011 (3.8204) grad_norm 1.3085 (1.1397) [2022-01-19 08:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][110/1251] eta 0:45:52 lr 0.000914 time 1.9144 (2.4121) loss 2.6304 (3.8333) grad_norm 1.0869 (1.1400) [2022-01-19 08:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][120/1251] eta 0:45:02 lr 0.000914 time 2.4529 (2.3896) loss 4.3859 (3.8147) grad_norm 1.2076 (1.1426) [2022-01-19 08:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][130/1251] eta 0:44:20 lr 0.000914 time 1.9714 (2.3734) loss 4.0377 (3.8287) grad_norm 1.4457 (1.1449) [2022-01-19 08:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][140/1251] eta 0:43:34 lr 0.000914 time 2.6837 (2.3532) loss 4.0803 (3.8597) grad_norm 1.1796 (1.1432) [2022-01-19 08:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][150/1251] eta 0:42:53 lr 0.000914 time 2.3246 (2.3377) loss 3.8829 (3.8668) grad_norm 1.0258 (1.1414) [2022-01-19 08:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][160/1251] eta 0:42:08 lr 0.000914 time 1.6929 (2.3176) loss 4.3107 (3.8700) grad_norm 1.0304 (1.1389) [2022-01-19 08:34:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][170/1251] eta 0:41:37 lr 0.000914 time 1.3753 (2.3102) loss 4.5521 (3.8789) grad_norm 1.6153 (1.1390) [2022-01-19 08:34:33 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][180/1251] eta 0:41:02 lr 0.000914 time 2.4364 (2.2989) loss 3.2546 (3.8842) grad_norm 1.2313 (1.1361) [2022-01-19 08:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][190/1251] eta 0:40:33 lr 0.000914 time 2.0698 (2.2939) loss 3.2702 (3.8856) grad_norm 1.0702 (1.1360) [2022-01-19 08:35:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][200/1251] eta 0:40:01 lr 0.000914 time 1.8914 (2.2847) loss 3.2521 (3.8786) grad_norm 1.0646 (1.1336) [2022-01-19 08:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][210/1251] eta 0:39:40 lr 0.000914 time 1.9045 (2.2866) loss 4.5255 (3.8706) grad_norm 1.0164 (1.1325) [2022-01-19 08:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][220/1251] eta 0:39:09 lr 0.000914 time 1.8907 (2.2790) loss 3.6887 (3.8643) grad_norm 1.1394 (1.1317) [2022-01-19 08:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][230/1251] eta 0:38:44 lr 0.000914 time 2.5405 (2.2768) loss 4.1935 (3.8649) grad_norm 1.5451 (1.1321) [2022-01-19 08:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][240/1251] eta 0:38:18 lr 0.000914 time 2.2339 (2.2735) loss 4.5411 (3.8610) grad_norm 1.1806 (1.1333) [2022-01-19 08:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][250/1251] eta 0:38:00 lr 0.000914 time 2.0852 (2.2785) loss 4.7103 (3.8581) grad_norm 1.1772 (1.1338) [2022-01-19 08:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][260/1251] eta 0:37:35 lr 0.000914 time 2.0601 (2.2757) loss 4.2240 (3.8597) grad_norm 1.1384 (1.1344) [2022-01-19 08:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][270/1251] eta 0:37:06 lr 0.000914 time 2.2681 (2.2699) loss 3.5825 (3.8560) grad_norm 1.0211 (1.1366) [2022-01-19 08:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][280/1251] eta 0:36:39 lr 0.000914 time 
1.6299 (2.2647) loss 4.4303 (3.8558) grad_norm 0.9439 (1.1356) [2022-01-19 08:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][290/1251] eta 0:36:19 lr 0.000914 time 2.1460 (2.2675) loss 3.0305 (3.8547) grad_norm 1.1345 (1.1331) [2022-01-19 08:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][300/1251] eta 0:35:54 lr 0.000914 time 1.8963 (2.2654) loss 3.3584 (3.8566) grad_norm 1.0271 (1.1335) [2022-01-19 08:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][310/1251] eta 0:35:28 lr 0.000914 time 2.2008 (2.2617) loss 4.6728 (3.8632) grad_norm 1.1432 (1.1315) [2022-01-19 08:39:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][320/1251] eta 0:34:58 lr 0.000914 time 1.9140 (2.2543) loss 4.6344 (3.8720) grad_norm 1.3944 (1.1318) [2022-01-19 08:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][330/1251] eta 0:34:36 lr 0.000914 time 2.0991 (2.2544) loss 4.2283 (3.8742) grad_norm 1.4080 (1.1337) [2022-01-19 08:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][340/1251] eta 0:34:11 lr 0.000914 time 1.8687 (2.2519) loss 3.2625 (3.8652) grad_norm 1.2868 (1.1332) [2022-01-19 08:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][350/1251] eta 0:33:47 lr 0.000914 time 1.9014 (2.2500) loss 4.7330 (3.8675) grad_norm 1.1166 (1.1322) [2022-01-19 08:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][360/1251] eta 0:33:24 lr 0.000914 time 2.5402 (2.2493) loss 4.2001 (3.8697) grad_norm 1.3043 (1.1347) [2022-01-19 08:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][370/1251] eta 0:33:02 lr 0.000914 time 1.9958 (2.2504) loss 2.7323 (3.8715) grad_norm 1.4544 (1.1361) [2022-01-19 08:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][380/1251] eta 0:32:35 lr 0.000914 time 1.9141 (2.2446) loss 4.4077 (3.8757) grad_norm 1.0084 (1.1369) [2022-01-19 08:42:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][390/1251] eta 0:32:10 lr 0.000913 time 1.9735 (2.2417) loss 4.5002 (3.8821) grad_norm 1.2546 (1.1383) [2022-01-19 08:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][400/1251] eta 0:31:45 lr 0.000913 time 2.2964 (2.2387) loss 4.1179 (3.8895) grad_norm 1.1884 (1.1401) [2022-01-19 08:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][410/1251] eta 0:31:24 lr 0.000913 time 2.1976 (2.2409) loss 4.2966 (3.8870) grad_norm 0.9608 (1.1403) [2022-01-19 08:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][420/1251] eta 0:30:59 lr 0.000913 time 1.8343 (2.2373) loss 4.4676 (3.8874) grad_norm 0.9837 (1.1407) [2022-01-19 08:43:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][430/1251] eta 0:30:31 lr 0.000913 time 1.5904 (2.2309) loss 2.8860 (3.8859) grad_norm 1.4324 (1.1450) [2022-01-19 08:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][440/1251] eta 0:30:05 lr 0.000913 time 1.8658 (2.2266) loss 4.3595 (3.8881) grad_norm 1.1513 (1.1464) [2022-01-19 08:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][450/1251] eta 0:29:43 lr 0.000913 time 2.4507 (2.2270) loss 4.0025 (3.8966) grad_norm 1.1210 (1.1466) [2022-01-19 08:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][460/1251] eta 0:29:23 lr 0.000913 time 2.0981 (2.2295) loss 4.4819 (3.8961) grad_norm 0.9954 (1.1462) [2022-01-19 08:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][470/1251] eta 0:29:01 lr 0.000913 time 2.1014 (2.2293) loss 4.2617 (3.8990) grad_norm 1.1011 (1.1444) [2022-01-19 08:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][480/1251] eta 0:28:41 lr 0.000913 time 1.9319 (2.2331) loss 3.6485 (3.8912) grad_norm 1.0755 (1.1446) [2022-01-19 08:45:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][490/1251] eta 0:28:18 lr 0.000913 time 
2.5079 (2.2317) loss 4.1202 (3.8873) grad_norm 1.0977 (1.1430) [2022-01-19 08:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][500/1251] eta 0:27:51 lr 0.000913 time 1.9485 (2.2261) loss 4.2504 (3.8851) grad_norm 1.0563 (1.1431) [2022-01-19 08:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][510/1251] eta 0:27:29 lr 0.000913 time 1.7080 (2.2261) loss 2.7610 (3.8848) grad_norm 1.1438 (1.1442) [2022-01-19 08:46:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][520/1251] eta 0:27:07 lr 0.000913 time 1.8777 (2.2258) loss 4.0572 (3.8865) grad_norm 1.1609 (1.1432) [2022-01-19 08:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][530/1251] eta 0:26:45 lr 0.000913 time 2.2899 (2.2268) loss 3.9798 (3.8863) grad_norm 1.2154 (1.1424) [2022-01-19 08:47:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][540/1251] eta 0:26:21 lr 0.000913 time 1.9378 (2.2247) loss 4.4443 (3.8845) grad_norm 1.0274 (1.1435) [2022-01-19 08:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][550/1251] eta 0:25:58 lr 0.000913 time 2.1814 (2.2226) loss 4.1958 (3.8845) grad_norm 1.0739 (1.1455) [2022-01-19 08:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][560/1251] eta 0:25:32 lr 0.000913 time 2.1409 (2.2183) loss 2.7995 (3.8852) grad_norm 1.3447 (1.1482) [2022-01-19 08:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][570/1251] eta 0:25:09 lr 0.000913 time 2.2037 (2.2172) loss 4.7685 (3.8856) grad_norm 1.0635 (1.1491) [2022-01-19 08:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][580/1251] eta 0:24:47 lr 0.000913 time 2.1902 (2.2175) loss 3.7954 (3.8784) grad_norm 1.0015 (1.1489) [2022-01-19 08:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][590/1251] eta 0:24:26 lr 0.000913 time 2.1235 (2.2188) loss 4.1938 (3.8789) grad_norm 1.0925 (1.1476) [2022-01-19 08:49:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][600/1251] eta 0:24:03 lr 0.000913 time 1.8480 (2.2180) loss 3.0316 (3.8762) grad_norm 1.1660 (1.1489) [2022-01-19 08:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][610/1251] eta 0:23:42 lr 0.000913 time 2.1479 (2.2185) loss 3.3061 (3.8768) grad_norm 1.1316 (1.1488) [2022-01-19 08:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][620/1251] eta 0:23:19 lr 0.000913 time 1.6928 (2.2173) loss 3.1850 (3.8711) grad_norm 1.2087 (1.1480) [2022-01-19 08:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][630/1251] eta 0:22:57 lr 0.000913 time 2.8605 (2.2174) loss 3.1390 (3.8734) grad_norm 1.1397 (1.1482) [2022-01-19 08:51:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][640/1251] eta 0:22:34 lr 0.000913 time 2.5222 (2.2165) loss 4.1088 (3.8723) grad_norm 0.9992 (1.1480) [2022-01-19 08:51:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][650/1251] eta 0:22:11 lr 0.000913 time 1.8994 (2.2154) loss 2.9176 (3.8741) grad_norm 1.2063 (1.1480) [2022-01-19 08:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][660/1251] eta 0:21:49 lr 0.000913 time 1.8768 (2.2156) loss 4.7746 (3.8759) grad_norm 0.9412 (1.1470) [2022-01-19 08:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][670/1251] eta 0:21:28 lr 0.000913 time 2.5387 (2.2171) loss 4.2858 (3.8778) grad_norm 1.0547 (1.1473) [2022-01-19 08:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][680/1251] eta 0:21:05 lr 0.000913 time 1.8255 (2.2164) loss 3.4570 (3.8797) grad_norm 1.3647 (1.1484) [2022-01-19 08:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][690/1251] eta 0:20:41 lr 0.000913 time 1.8759 (2.2136) loss 4.0765 (3.8796) grad_norm 1.0590 (1.1485) [2022-01-19 08:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][700/1251] eta 0:20:18 lr 0.000913 time 
1.5488 (2.2109) loss 4.8053 (3.8822) grad_norm 1.0658 (1.1486) [2022-01-19 08:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][710/1251] eta 0:19:55 lr 0.000913 time 1.9598 (2.2103) loss 3.0889 (3.8853) grad_norm 1.4611 (1.1502) [2022-01-19 08:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][720/1251] eta 0:19:34 lr 0.000913 time 2.2359 (2.2119) loss 3.6710 (3.8852) grad_norm 1.0208 (1.1507) [2022-01-19 08:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][730/1251] eta 0:19:13 lr 0.000913 time 2.6911 (2.2131) loss 3.3090 (3.8857) grad_norm 1.0398 (1.1502) [2022-01-19 08:54:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][740/1251] eta 0:18:50 lr 0.000913 time 1.8880 (2.2124) loss 3.4286 (3.8846) grad_norm 1.2192 (1.1506) [2022-01-19 08:55:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][750/1251] eta 0:18:27 lr 0.000913 time 1.9154 (2.2108) loss 3.1388 (3.8814) grad_norm 0.9550 (1.1505) [2022-01-19 08:55:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][760/1251] eta 0:18:04 lr 0.000913 time 2.0385 (2.2080) loss 3.3999 (3.8808) grad_norm 1.6831 (1.1513) [2022-01-19 08:55:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][770/1251] eta 0:17:40 lr 0.000913 time 1.9718 (2.2057) loss 4.9127 (3.8835) grad_norm 1.3901 (1.1519) [2022-01-19 08:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][780/1251] eta 0:17:18 lr 0.000913 time 2.0890 (2.2050) loss 3.7907 (3.8861) grad_norm 1.2777 (1.1532) [2022-01-19 08:56:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][790/1251] eta 0:16:57 lr 0.000913 time 2.8399 (2.2065) loss 4.7957 (3.8878) grad_norm 0.9538 (1.1521) [2022-01-19 08:57:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][800/1251] eta 0:16:35 lr 0.000913 time 2.2254 (2.2075) loss 3.5210 (3.8874) grad_norm 1.1266 (1.1513) [2022-01-19 08:57:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][810/1251] eta 0:16:14 lr 0.000913 time 1.9679 (2.2086) loss 3.9313 (3.8921) grad_norm 1.1186 (1.1523) [2022-01-19 08:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][820/1251] eta 0:15:51 lr 0.000912 time 1.9925 (2.2082) loss 3.7523 (3.8923) grad_norm 1.1836 (1.1528) [2022-01-19 08:58:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][830/1251] eta 0:15:29 lr 0.000912 time 2.4323 (2.2073) loss 3.6992 (3.8891) grad_norm 1.2450 (1.1528) [2022-01-19 08:58:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][840/1251] eta 0:15:06 lr 0.000912 time 2.0377 (2.2060) loss 4.5824 (3.8882) grad_norm 1.0922 (1.1514) [2022-01-19 08:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][850/1251] eta 0:14:43 lr 0.000912 time 2.4303 (2.2041) loss 3.4625 (3.8856) grad_norm 1.0497 (1.1509) [2022-01-19 08:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][860/1251] eta 0:14:21 lr 0.000912 time 1.9411 (2.2031) loss 4.1093 (3.8854) grad_norm 0.9992 (1.1505) [2022-01-19 08:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][870/1251] eta 0:13:59 lr 0.000912 time 2.4907 (2.2037) loss 2.7692 (3.8832) grad_norm 0.9962 (1.1495) [2022-01-19 08:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][880/1251] eta 0:13:37 lr 0.000912 time 1.5431 (2.2026) loss 3.1092 (3.8817) grad_norm 1.2397 (1.1497) [2022-01-19 09:00:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][890/1251] eta 0:13:15 lr 0.000912 time 2.1791 (2.2028) loss 3.0091 (3.8788) grad_norm 1.0473 (1.1488) [2022-01-19 09:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][900/1251] eta 0:12:53 lr 0.000912 time 1.9967 (2.2027) loss 4.1271 (3.8798) grad_norm 1.1129 (1.1481) [2022-01-19 09:01:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][910/1251] eta 0:12:31 lr 0.000912 time 
2.3859 (2.2031) loss 4.0984 (3.8830) grad_norm 1.0589 (1.1476) [2022-01-19 09:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][920/1251] eta 0:12:09 lr 0.000912 time 2.0613 (2.2035) loss 3.3988 (3.8833) grad_norm 0.9988 (1.1475) [2022-01-19 09:01:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][930/1251] eta 0:11:47 lr 0.000912 time 1.5777 (2.2042) loss 3.3424 (3.8851) grad_norm 1.2130 (1.1469) [2022-01-19 09:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][940/1251] eta 0:11:24 lr 0.000912 time 2.0433 (2.2017) loss 4.4717 (3.8858) grad_norm 1.2428 (1.1469) [2022-01-19 09:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][950/1251] eta 0:11:02 lr 0.000912 time 2.2198 (2.1999) loss 4.6941 (3.8861) grad_norm 1.1921 (1.1467) [2022-01-19 09:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][960/1251] eta 0:10:39 lr 0.000912 time 1.7514 (2.1978) loss 2.6753 (3.8864) grad_norm 1.0142 (1.1459) [2022-01-19 09:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][970/1251] eta 0:10:17 lr 0.000912 time 2.6717 (2.1973) loss 3.8962 (3.8886) grad_norm 1.1822 (1.1455) [2022-01-19 09:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][980/1251] eta 0:09:55 lr 0.000912 time 2.2445 (2.1981) loss 4.2372 (3.8915) grad_norm 1.2418 (1.1451) [2022-01-19 09:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][990/1251] eta 0:09:34 lr 0.000912 time 2.5713 (2.2003) loss 3.5681 (3.8914) grad_norm 1.0076 (1.1448) [2022-01-19 09:04:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1000/1251] eta 0:09:12 lr 0.000912 time 2.5183 (2.2012) loss 3.6413 (3.8918) grad_norm 1.2595 (1.1447) [2022-01-19 09:04:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1010/1251] eta 0:08:50 lr 0.000912 time 1.5866 (2.2007) loss 3.6851 (3.8926) grad_norm 1.0236 (1.1449) [2022-01-19 09:05:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1020/1251] eta 0:08:28 lr 0.000912 time 1.8833 (2.1999) loss 3.2356 (3.8910) grad_norm 1.0700 (1.1444) [2022-01-19 09:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1030/1251] eta 0:08:06 lr 0.000912 time 3.0782 (2.2006) loss 2.7865 (3.8906) grad_norm 1.1869 (1.1441) [2022-01-19 09:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1040/1251] eta 0:07:44 lr 0.000912 time 2.4144 (2.2006) loss 4.0227 (3.8910) grad_norm 1.1219 (1.1444) [2022-01-19 09:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1050/1251] eta 0:07:22 lr 0.000912 time 1.6812 (2.1995) loss 3.8016 (3.8885) grad_norm 1.1790 (1.1442) [2022-01-19 09:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1060/1251] eta 0:06:59 lr 0.000912 time 1.9668 (2.1977) loss 2.7565 (3.8860) grad_norm 1.1012 (1.1439) [2022-01-19 09:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1070/1251] eta 0:06:37 lr 0.000912 time 2.2027 (2.1962) loss 3.4887 (3.8863) grad_norm 1.2185 (1.1433) [2022-01-19 09:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1080/1251] eta 0:06:15 lr 0.000912 time 2.4995 (2.1952) loss 4.0278 (3.8881) grad_norm 1.1771 (1.1442) [2022-01-19 09:07:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1090/1251] eta 0:05:53 lr 0.000912 time 1.7034 (2.1937) loss 3.6230 (3.8884) grad_norm 1.0583 (1.1438) [2022-01-19 09:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1100/1251] eta 0:05:31 lr 0.000912 time 2.1732 (2.1930) loss 4.2724 (3.8881) grad_norm 0.9678 (1.1446) [2022-01-19 09:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1110/1251] eta 0:05:09 lr 0.000912 time 3.0363 (2.1938) loss 4.3931 (3.8888) grad_norm 0.9743 (1.1450) [2022-01-19 09:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1120/1251] eta 0:04:47 lr 
0.000912 time 2.2151 (2.1938) loss 3.0459 (3.8862) grad_norm 1.1389 (1.1444) [2022-01-19 09:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1130/1251] eta 0:04:25 lr 0.000912 time 2.0268 (2.1934) loss 4.3847 (3.8851) grad_norm 1.0775 (1.1444) [2022-01-19 09:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1140/1251] eta 0:04:03 lr 0.000912 time 1.8640 (2.1931) loss 4.0545 (3.8864) grad_norm 0.9529 (1.1444) [2022-01-19 09:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1150/1251] eta 0:03:41 lr 0.000912 time 2.9262 (2.1958) loss 4.5575 (3.8880) grad_norm 1.1133 (1.1442) [2022-01-19 09:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1160/1251] eta 0:03:19 lr 0.000912 time 2.2842 (2.1959) loss 4.6670 (3.8883) grad_norm 1.1750 (1.1438) [2022-01-19 09:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1170/1251] eta 0:02:57 lr 0.000912 time 1.9381 (2.1961) loss 4.1882 (3.8877) grad_norm 1.0218 (1.1439) [2022-01-19 09:10:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1180/1251] eta 0:02:35 lr 0.000912 time 1.7671 (2.1955) loss 3.2965 (3.8857) grad_norm 1.1314 (1.1442) [2022-01-19 09:11:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1190/1251] eta 0:02:13 lr 0.000912 time 1.9131 (2.1963) loss 4.0132 (3.8851) grad_norm 1.1199 (1.1447) [2022-01-19 09:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1200/1251] eta 0:01:51 lr 0.000912 time 1.8715 (2.1950) loss 4.3878 (3.8851) grad_norm 1.2186 (1.1449) [2022-01-19 09:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1210/1251] eta 0:01:29 lr 0.000912 time 2.4967 (2.1939) loss 3.6943 (3.8819) grad_norm 1.0697 (1.1446) [2022-01-19 09:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1220/1251] eta 0:01:07 lr 0.000912 time 1.8952 (2.1925) loss 3.8531 (3.8799) grad_norm 0.9368 (1.1438) [2022-01-19 09:12:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1230/1251] eta 0:00:46 lr 0.000912 time 1.5835 (2.1908) loss 4.2239 (3.8790) grad_norm 1.1537 (1.1433) [2022-01-19 09:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1240/1251] eta 0:00:24 lr 0.000911 time 2.1770 (2.1898) loss 4.4656 (3.8797) grad_norm 1.3801 (1.1430) [2022-01-19 09:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [57/300][1250/1251] eta 0:00:02 lr 0.000911 time 1.1801 (2.1854) loss 3.7317 (3.8792) grad_norm 1.1252 (1.1427) [2022-01-19 09:13:11 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 57 training takes 0:45:34 [2022-01-19 09:13:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.925 (17.925) Loss 1.3115 (1.3115) Acc@1 70.996 (70.996) Acc@5 89.551 (89.551) [2022-01-19 09:13:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.714 (3.616) Loss 1.2814 (1.2449) Acc@1 70.996 (71.040) Acc@5 89.941 (90.749) [2022-01-19 09:14:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.303 (2.651) Loss 1.2653 (1.2399) Acc@1 70.117 (71.084) Acc@5 90.723 (90.802) [2022-01-19 09:14:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.648 (2.453) Loss 1.2061 (1.2470) Acc@1 70.703 (70.930) Acc@5 91.602 (90.723) [2022-01-19 09:14:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.935 (2.217) Loss 1.2525 (1.2440) Acc@1 71.387 (71.025) Acc@5 91.211 (90.727) [2022-01-19 09:14:49 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 70.946 Acc@5 90.730 [2022-01-19 09:14:49 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 70.9% [2022-01-19 09:14:49 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 70.95% [2022-01-19 09:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][0/1251] eta 7:33:44 lr 0.000911 time 21.7619 (21.7619) loss 4.1608 (4.1608) grad_norm 1.0070 
(1.0070) [2022-01-19 09:15:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][10/1251] eta 1:25:40 lr 0.000911 time 2.8271 (4.1421) loss 3.8331 (3.5847) grad_norm 1.0821 (1.1501) [2022-01-19 09:15:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][20/1251] eta 1:06:52 lr 0.000911 time 1.5076 (3.2593) loss 3.2998 (3.6878) grad_norm 1.1279 (1.1162) [2022-01-19 09:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][30/1251] eta 0:59:10 lr 0.000911 time 1.8348 (2.9075) loss 2.5559 (3.6745) grad_norm 1.2693 (1.1133) [2022-01-19 09:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][40/1251] eta 0:55:26 lr 0.000911 time 3.3606 (2.7465) loss 3.0240 (3.7046) grad_norm 1.2405 (1.1188) [2022-01-19 09:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][50/1251] eta 0:53:05 lr 0.000911 time 2.9861 (2.6524) loss 4.0849 (3.7079) grad_norm 1.0510 (1.1100) [2022-01-19 09:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][60/1251] eta 0:51:26 lr 0.000911 time 1.6814 (2.5916) loss 2.8578 (3.6958) grad_norm 1.1979 (1.1087) [2022-01-19 09:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][70/1251] eta 0:49:50 lr 0.000911 time 1.8938 (2.5326) loss 3.9902 (3.7183) grad_norm 1.0034 (1.1184) [2022-01-19 09:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][80/1251] eta 0:48:26 lr 0.000911 time 2.9086 (2.4818) loss 4.4549 (3.7801) grad_norm 1.1397 (1.1162) [2022-01-19 09:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][90/1251] eta 0:47:14 lr 0.000911 time 2.0847 (2.4418) loss 4.7638 (3.8268) grad_norm 1.2557 (1.1266) [2022-01-19 09:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][100/1251] eta 0:45:55 lr 0.000911 time 2.1917 (2.3943) loss 4.5794 (3.8307) grad_norm 1.0924 (1.1250) [2022-01-19 09:19:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][110/1251] eta 0:45:19 
lr 0.000911 time 2.1628 (2.3832) loss 4.1607 (3.8240) grad_norm 0.9492 (1.1195) [2022-01-19 09:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][120/1251] eta 0:44:41 lr 0.000911 time 2.6953 (2.3713) loss 2.9100 (3.8089) grad_norm 1.4864 (1.1205) [2022-01-19 09:19:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][130/1251] eta 0:44:08 lr 0.000911 time 2.1623 (2.3623) loss 3.4834 (3.8173) grad_norm 1.2845 (1.1244) [2022-01-19 09:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][140/1251] eta 0:43:24 lr 0.000911 time 1.9183 (2.3440) loss 4.5267 (3.8294) grad_norm 1.2019 (1.1288) [2022-01-19 09:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][150/1251] eta 0:43:04 lr 0.000911 time 1.9348 (2.3471) loss 3.8864 (3.8449) grad_norm 0.9875 (1.1301) [2022-01-19 09:21:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][160/1251] eta 0:42:38 lr 0.000911 time 2.8251 (2.3449) loss 4.6012 (3.8533) grad_norm 1.2967 (1.1338) [2022-01-19 09:21:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][170/1251] eta 0:41:59 lr 0.000911 time 1.8882 (2.3303) loss 4.2527 (3.8654) grad_norm 1.3315 (1.1351) [2022-01-19 09:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][180/1251] eta 0:41:11 lr 0.000911 time 1.6862 (2.3081) loss 3.3387 (3.8664) grad_norm 1.1284 (1.1362) [2022-01-19 09:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][190/1251] eta 0:40:39 lr 0.000911 time 2.7456 (2.2995) loss 3.3810 (3.8658) grad_norm 1.1628 (1.1379) [2022-01-19 09:22:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][200/1251] eta 0:40:10 lr 0.000911 time 2.4836 (2.2936) loss 3.5792 (3.8634) grad_norm 1.1121 (1.1379) [2022-01-19 09:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][210/1251] eta 0:39:34 lr 0.000911 time 1.8711 (2.2814) loss 4.3634 (3.8649) grad_norm 1.0544 (1.1364) [2022-01-19 09:23:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][220/1251] eta 0:39:04 lr 0.000911 time 1.8819 (2.2742) loss 3.6481 (3.8570) grad_norm 1.0251 (1.1390) [2022-01-19 09:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][230/1251] eta 0:38:42 lr 0.000911 time 3.4813 (2.2750) loss 4.6168 (3.8455) grad_norm 1.1352 (1.1402) [2022-01-19 09:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][240/1251] eta 0:38:14 lr 0.000911 time 2.2267 (2.2699) loss 2.9632 (3.8457) grad_norm 1.2632 (1.1430) [2022-01-19 09:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][250/1251] eta 0:37:49 lr 0.000911 time 1.8462 (2.2671) loss 4.5754 (3.8359) grad_norm 1.2591 (1.1456) [2022-01-19 09:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][260/1251] eta 0:37:28 lr 0.000911 time 1.6842 (2.2692) loss 3.9323 (3.8427) grad_norm 1.3171 (1.1467) [2022-01-19 09:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][270/1251] eta 0:37:08 lr 0.000911 time 2.7567 (2.2712) loss 3.5522 (3.8423) grad_norm 1.1251 (1.1461) [2022-01-19 09:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][280/1251] eta 0:36:41 lr 0.000911 time 2.4227 (2.2670) loss 4.5831 (3.8450) grad_norm 1.2767 (1.1486) [2022-01-19 09:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][290/1251] eta 0:36:16 lr 0.000911 time 1.9191 (2.2649) loss 2.7270 (3.8458) grad_norm 1.2554 (1.1511) [2022-01-19 09:26:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][300/1251] eta 0:35:49 lr 0.000911 time 1.9266 (2.2607) loss 4.3372 (3.8490) grad_norm 0.9955 (1.1498) [2022-01-19 09:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][310/1251] eta 0:35:22 lr 0.000911 time 2.5604 (2.2555) loss 3.5684 (3.8488) grad_norm 1.3076 (1.1491) [2022-01-19 09:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][320/1251] eta 0:34:58 lr 0.000911 time 
1.7962 (2.2535) loss 3.7919 (3.8494) grad_norm 1.3945 (1.1512) [2022-01-19 09:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][330/1251] eta 0:34:34 lr 0.000911 time 1.8763 (2.2524) loss 3.9771 (3.8579) grad_norm 1.2005 (1.1509) [2022-01-19 09:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][340/1251] eta 0:34:11 lr 0.000911 time 2.4957 (2.2521) loss 3.4050 (3.8648) grad_norm 0.9264 (1.1506) [2022-01-19 09:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][350/1251] eta 0:33:46 lr 0.000911 time 2.4457 (2.2493) loss 4.3320 (3.8677) grad_norm 1.2870 (1.1485) [2022-01-19 09:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][360/1251] eta 0:33:20 lr 0.000911 time 1.6160 (2.2456) loss 2.9545 (3.8724) grad_norm 1.0030 (1.1507) [2022-01-19 09:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][370/1251] eta 0:32:56 lr 0.000911 time 1.9343 (2.2437) loss 4.2218 (3.8696) grad_norm 1.0867 (1.1506) [2022-01-19 09:29:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][380/1251] eta 0:32:37 lr 0.000911 time 3.1061 (2.2470) loss 3.7273 (3.8698) grad_norm 0.9985 (1.1498) [2022-01-19 09:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][390/1251] eta 0:32:14 lr 0.000911 time 2.2467 (2.2472) loss 4.3182 (3.8715) grad_norm 1.1692 (1.1499) [2022-01-19 09:29:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][400/1251] eta 0:31:47 lr 0.000911 time 1.9953 (2.2420) loss 3.7716 (3.8730) grad_norm 1.1295 (1.1501) [2022-01-19 09:30:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][410/1251] eta 0:31:22 lr 0.000910 time 1.9381 (2.2384) loss 4.4588 (3.8741) grad_norm 1.1585 (1.1493) [2022-01-19 09:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][420/1251] eta 0:30:56 lr 0.000910 time 2.4430 (2.2346) loss 3.0227 (3.8749) grad_norm 1.1502 (1.1506) [2022-01-19 09:30:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][430/1251] eta 0:30:33 lr 0.000910 time 1.9002 (2.2327) loss 4.5817 (3.8850) grad_norm 1.0876 (1.1500) [2022-01-19 09:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][440/1251] eta 0:30:09 lr 0.000910 time 1.9564 (2.2314) loss 4.0560 (3.8890) grad_norm 1.1414 (1.1494) [2022-01-19 09:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][450/1251] eta 0:29:47 lr 0.000910 time 2.7346 (2.2313) loss 3.3170 (3.8884) grad_norm 1.0136 (1.1471) [2022-01-19 09:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][460/1251] eta 0:29:25 lr 0.000910 time 2.5834 (2.2321) loss 4.7906 (3.8845) grad_norm 1.2659 (1.1458) [2022-01-19 09:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][470/1251] eta 0:29:04 lr 0.000910 time 2.1336 (2.2334) loss 3.7943 (3.8846) grad_norm 1.1112 (1.1439) [2022-01-19 09:32:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][480/1251] eta 0:28:42 lr 0.000910 time 2.3954 (2.2345) loss 4.6954 (3.8903) grad_norm 1.1788 (1.1439) [2022-01-19 09:33:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][490/1251] eta 0:28:21 lr 0.000910 time 2.2275 (2.2353) loss 3.7326 (3.8885) grad_norm 1.1083 (1.1441) [2022-01-19 09:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][500/1251] eta 0:27:56 lr 0.000910 time 2.3837 (2.2323) loss 2.8388 (3.8829) grad_norm 1.0401 (1.1452) [2022-01-19 09:33:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][510/1251] eta 0:27:29 lr 0.000910 time 1.9159 (2.2259) loss 4.6927 (3.8894) grad_norm 1.1720 (1.1444) [2022-01-19 09:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][520/1251] eta 0:27:03 lr 0.000910 time 1.5831 (2.2207) loss 4.5337 (3.8900) grad_norm 1.1870 (1.1443) [2022-01-19 09:34:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][530/1251] eta 0:26:39 lr 0.000910 time 
2.1795 (2.2188) loss 3.0449 (3.8906) grad_norm 1.1955 (1.1441) [2022-01-19 09:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][540/1251] eta 0:26:18 lr 0.000910 time 2.6100 (2.2195) loss 4.6222 (3.8958) grad_norm 1.0445 (1.1430) [2022-01-19 09:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][550/1251] eta 0:25:54 lr 0.000910 time 1.8905 (2.2178) loss 3.8920 (3.8918) grad_norm 1.1200 (1.1427) [2022-01-19 09:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][560/1251] eta 0:25:33 lr 0.000910 time 2.5877 (2.2199) loss 4.4710 (3.8931) grad_norm 1.2183 (1.1430) [2022-01-19 09:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][570/1251] eta 0:25:13 lr 0.000910 time 2.4234 (2.2222) loss 4.3963 (3.8967) grad_norm 2.4072 (1.1465) [2022-01-19 09:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][580/1251] eta 0:24:52 lr 0.000910 time 2.4824 (2.2248) loss 4.2555 (3.9029) grad_norm 1.1080 (1.1497) [2022-01-19 09:36:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][590/1251] eta 0:24:30 lr 0.000910 time 1.9145 (2.2249) loss 3.4615 (3.8987) grad_norm 1.2468 (1.1501) [2022-01-19 09:37:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][600/1251] eta 0:24:07 lr 0.000910 time 2.4867 (2.2242) loss 4.4761 (3.9017) grad_norm 1.0345 (1.1517) [2022-01-19 09:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][610/1251] eta 0:23:44 lr 0.000910 time 2.0080 (2.2227) loss 4.4886 (3.9016) grad_norm 0.9501 (1.1516) [2022-01-19 09:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][620/1251] eta 0:23:21 lr 0.000910 time 2.0951 (2.2204) loss 4.0809 (3.9023) grad_norm 1.1385 (1.1506) [2022-01-19 09:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][630/1251] eta 0:22:57 lr 0.000910 time 1.8971 (2.2178) loss 4.5085 (3.9007) grad_norm 1.2363 (1.1498) [2022-01-19 09:38:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][640/1251] eta 0:22:34 lr 0.000910 time 2.2007 (2.2174) loss 3.8609 (3.9044) grad_norm 1.0865 (1.1477) [2022-01-19 09:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][650/1251] eta 0:22:13 lr 0.000910 time 2.4779 (2.2193) loss 2.7922 (3.9069) grad_norm 1.1583 (1.1472) [2022-01-19 09:39:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][660/1251] eta 0:21:50 lr 0.000910 time 1.6763 (2.2175) loss 4.3840 (3.9083) grad_norm 0.9704 (1.1467) [2022-01-19 09:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][670/1251] eta 0:21:28 lr 0.000910 time 1.9041 (2.2176) loss 3.9926 (3.9099) grad_norm 1.0560 (1.1471) [2022-01-19 09:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][680/1251] eta 0:21:05 lr 0.000910 time 1.8849 (2.2163) loss 4.4770 (3.9092) grad_norm 1.0685 (1.1467) [2022-01-19 09:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][690/1251] eta 0:20:44 lr 0.000910 time 2.6466 (2.2181) loss 3.2830 (3.9062) grad_norm 1.0220 (1.1465) [2022-01-19 09:40:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][700/1251] eta 0:20:21 lr 0.000910 time 2.1094 (2.2164) loss 4.0909 (3.9071) grad_norm 1.0880 (1.1453) [2022-01-19 09:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][710/1251] eta 0:19:58 lr 0.000910 time 1.8894 (2.2147) loss 4.3474 (3.9089) grad_norm 0.9173 (1.1454) [2022-01-19 09:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][720/1251] eta 0:19:35 lr 0.000910 time 1.5299 (2.2140) loss 2.8687 (3.9100) grad_norm 1.3181 (1.1453) [2022-01-19 09:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][730/1251] eta 0:19:13 lr 0.000910 time 1.7997 (2.2147) loss 4.4932 (3.9111) grad_norm 0.9229 (1.1453) [2022-01-19 09:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][740/1251] eta 0:18:51 lr 0.000910 time 
1.5935 (2.2137) loss 3.3485 (3.9099) grad_norm 0.9435 (1.1438) [2022-01-19 09:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][750/1251] eta 0:18:29 lr 0.000910 time 2.0501 (2.2136) loss 3.8123 (3.9111) grad_norm 1.0869 (1.1437) [2022-01-19 09:42:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][760/1251] eta 0:18:06 lr 0.000910 time 1.8525 (2.2124) loss 4.3182 (3.9090) grad_norm 1.3743 (1.1439) [2022-01-19 09:43:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][770/1251] eta 0:17:43 lr 0.000910 time 1.8661 (2.2117) loss 4.0142 (3.9083) grad_norm 1.4116 (1.1440) [2022-01-19 09:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][780/1251] eta 0:17:22 lr 0.000910 time 1.8704 (2.2127) loss 4.3221 (3.9067) grad_norm 1.0643 (1.1454) [2022-01-19 09:44:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][790/1251] eta 0:17:00 lr 0.000910 time 2.5166 (2.2136) loss 4.3413 (3.9075) grad_norm 1.0085 (1.1452) [2022-01-19 09:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][800/1251] eta 0:16:37 lr 0.000910 time 1.5719 (2.2128) loss 2.7926 (3.9083) grad_norm 0.9481 (1.1451) [2022-01-19 09:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][810/1251] eta 0:16:15 lr 0.000910 time 1.8462 (2.2123) loss 2.9986 (3.9079) grad_norm 1.1911 (1.1454) [2022-01-19 09:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][820/1251] eta 0:15:53 lr 0.000910 time 2.2400 (2.2121) loss 3.8776 (3.9088) grad_norm 1.0796 (1.1454) [2022-01-19 09:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][830/1251] eta 0:15:31 lr 0.000909 time 1.5819 (2.2121) loss 4.4066 (3.9093) grad_norm 1.1002 (1.1458) [2022-01-19 09:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][840/1251] eta 0:15:08 lr 0.000909 time 2.1735 (2.2099) loss 3.7435 (3.9108) grad_norm 1.1456 (1.1451) [2022-01-19 09:46:09 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][850/1251] eta 0:14:45 lr 0.000909 time 1.9053 (2.2087) loss 4.4838 (3.9081) grad_norm 1.2635 (1.1445) [2022-01-19 09:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][860/1251] eta 0:14:23 lr 0.000909 time 2.1016 (2.2080) loss 3.8647 (3.9086) grad_norm 1.0706 (1.1445) [2022-01-19 09:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][870/1251] eta 0:14:00 lr 0.000909 time 2.0520 (2.2073) loss 4.0757 (3.9072) grad_norm 1.1621 (1.1447) [2022-01-19 09:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][880/1251] eta 0:13:38 lr 0.000909 time 2.1278 (2.2058) loss 4.0877 (3.9082) grad_norm 0.9845 (1.1442) [2022-01-19 09:47:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][890/1251] eta 0:13:17 lr 0.000909 time 2.4199 (2.2080) loss 4.5580 (3.9067) grad_norm 0.9741 (1.1431) [2022-01-19 09:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][900/1251] eta 0:12:55 lr 0.000909 time 1.7763 (2.2100) loss 3.8314 (3.9061) grad_norm 0.9545 (1.1431) [2022-01-19 09:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][910/1251] eta 0:12:33 lr 0.000909 time 1.9165 (2.2104) loss 3.5847 (3.9034) grad_norm 1.0175 (1.1432) [2022-01-19 09:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][920/1251] eta 0:12:10 lr 0.000909 time 2.4200 (2.2082) loss 4.0645 (3.9053) grad_norm 1.1408 (1.1433) [2022-01-19 09:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][930/1251] eta 0:11:48 lr 0.000909 time 2.0302 (2.2085) loss 4.1129 (3.9070) grad_norm 1.1615 (1.1440) [2022-01-19 09:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][940/1251] eta 0:11:27 lr 0.000909 time 1.7881 (2.2092) loss 4.2123 (3.9079) grad_norm 1.1946 (1.1456) [2022-01-19 09:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][950/1251] eta 0:11:04 lr 0.000909 time 
2.2170 (2.2077) loss 4.0400 (3.9060) grad_norm 1.1610 (1.1461) [2022-01-19 09:50:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][960/1251] eta 0:10:42 lr 0.000909 time 1.7398 (2.2071) loss 4.4687 (3.9047) grad_norm 1.1794 (1.1462) [2022-01-19 09:50:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][970/1251] eta 0:10:20 lr 0.000909 time 1.7710 (2.2065) loss 3.6866 (3.9055) grad_norm 1.2485 (1.1465) [2022-01-19 09:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][980/1251] eta 0:09:58 lr 0.000909 time 1.8252 (2.2082) loss 4.3485 (3.9058) grad_norm 1.1384 (1.1456) [2022-01-19 09:51:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][990/1251] eta 0:09:36 lr 0.000909 time 1.8343 (2.2078) loss 4.0869 (3.9046) grad_norm 1.1304 (1.1456) [2022-01-19 09:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1000/1251] eta 0:09:13 lr 0.000909 time 1.9435 (2.2063) loss 3.7142 (3.9024) grad_norm 1.0321 (1.1468) [2022-01-19 09:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1010/1251] eta 0:08:51 lr 0.000909 time 2.7469 (2.2063) loss 4.6196 (3.9028) grad_norm 1.6214 (1.1478) [2022-01-19 09:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1020/1251] eta 0:08:29 lr 0.000909 time 2.1428 (2.2051) loss 2.7517 (3.9035) grad_norm 1.2062 (1.1479) [2022-01-19 09:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1030/1251] eta 0:08:07 lr 0.000909 time 2.1058 (2.2048) loss 4.3532 (3.9026) grad_norm 1.2992 (1.1473) [2022-01-19 09:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1040/1251] eta 0:07:45 lr 0.000909 time 2.0682 (2.2043) loss 2.8954 (3.9010) grad_norm 1.2656 (1.1467) [2022-01-19 09:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1050/1251] eta 0:07:23 lr 0.000909 time 2.2040 (2.2045) loss 4.6533 (3.9033) grad_norm 1.1564 (1.1465) [2022-01-19 09:53:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1060/1251] eta 0:07:00 lr 0.000909 time 2.2133 (2.2039) loss 4.1713 (3.9026) grad_norm 1.2898 (1.1460) [2022-01-19 09:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1070/1251] eta 0:06:39 lr 0.000909 time 2.2771 (2.2091) loss 3.7812 (3.9037) grad_norm 1.0842 (1.1452) [2022-01-19 09:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1080/1251] eta 0:06:17 lr 0.000909 time 2.5511 (2.2101) loss 3.9327 (3.9045) grad_norm 1.0401 (1.1444) [2022-01-19 09:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1090/1251] eta 0:05:55 lr 0.000909 time 1.8941 (2.2093) loss 3.3340 (3.9030) grad_norm 1.0596 (1.1446) [2022-01-19 09:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1100/1251] eta 0:05:33 lr 0.000909 time 1.5354 (2.2078) loss 3.9748 (3.9015) grad_norm 1.0045 (1.1439) [2022-01-19 09:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1110/1251] eta 0:05:11 lr 0.000909 time 1.6462 (2.2080) loss 4.1328 (3.9010) grad_norm 1.0132 (1.1432) [2022-01-19 09:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1120/1251] eta 0:04:49 lr 0.000909 time 1.8535 (2.2067) loss 4.6515 (3.9029) grad_norm 1.1814 (1.1431) [2022-01-19 09:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1130/1251] eta 0:04:26 lr 0.000909 time 2.3273 (2.2059) loss 4.1817 (3.9014) grad_norm 1.1794 (1.1431) [2022-01-19 09:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1140/1251] eta 0:04:04 lr 0.000909 time 1.6804 (2.2065) loss 4.4012 (3.8996) grad_norm 1.0852 (1.1429) [2022-01-19 09:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1150/1251] eta 0:03:42 lr 0.000909 time 1.8636 (2.2076) loss 4.1391 (3.8976) grad_norm 1.0853 (1.1430) [2022-01-19 09:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1160/1251] eta 0:03:20 lr 
0.000909 time 1.7808 (2.2076) loss 4.5226 (3.8953) grad_norm 1.1448 (1.1429) [2022-01-19 09:57:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1170/1251] eta 0:02:58 lr 0.000909 time 1.8018 (2.2067) loss 4.2666 (3.8954) grad_norm 1.3067 (1.1430) [2022-01-19 09:58:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1180/1251] eta 0:02:36 lr 0.000909 time 1.8977 (2.2058) loss 4.4978 (3.8976) grad_norm 1.1634 (1.1429) [2022-01-19 09:58:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1190/1251] eta 0:02:14 lr 0.000909 time 1.8416 (2.2049) loss 3.9766 (3.8974) grad_norm 1.1904 (1.1432) [2022-01-19 09:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1200/1251] eta 0:01:52 lr 0.000909 time 1.9258 (2.2041) loss 3.9927 (3.8962) grad_norm 1.0334 (1.1426) [2022-01-19 09:59:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1210/1251] eta 0:01:30 lr 0.000909 time 1.8295 (2.2038) loss 2.8226 (3.8953) grad_norm 1.0751 (1.1429) [2022-01-19 09:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1220/1251] eta 0:01:08 lr 0.000909 time 2.5091 (2.2046) loss 3.1691 (3.8947) grad_norm 1.1238 (1.1422) [2022-01-19 10:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1230/1251] eta 0:00:46 lr 0.000909 time 1.9251 (2.2058) loss 4.0502 (3.8944) grad_norm 1.1548 (1.1423) [2022-01-19 10:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1240/1251] eta 0:00:24 lr 0.000909 time 1.3948 (2.2051) loss 4.1855 (3.8957) grad_norm 1.2328 (1.1432) [2022-01-19 10:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [58/300][1250/1251] eta 0:00:02 lr 0.000908 time 1.1875 (2.2002) loss 2.7813 (3.8949) grad_norm 1.0685 (1.1432) [2022-01-19 10:00:42 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 58 training takes 0:45:52 [2022-01-19 10:01:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.693 (18.693) Loss 
1.1635 (1.1635) Acc@1 74.316 (74.316) Acc@5 91.699 (91.699) [2022-01-19 10:01:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.637 (3.524) Loss 1.1708 (1.2402) Acc@1 73.145 (71.467) Acc@5 91.309 (90.439) [2022-01-19 10:01:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.627 (2.661) Loss 1.1963 (1.2358) Acc@1 72.949 (71.475) Acc@5 90.527 (90.630) [2022-01-19 10:01:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.631 (2.281) Loss 1.2064 (1.2494) Acc@1 71.680 (71.160) Acc@5 91.406 (90.420) [2022-01-19 10:02:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.149 (2.229) Loss 1.2420 (1.2445) Acc@1 71.582 (71.313) Acc@5 90.625 (90.573) [2022-01-19 10:02:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.278 Acc@5 90.550 [2022-01-19 10:02:20 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.3% [2022-01-19 10:02:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 71.28% [2022-01-19 10:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][0/1251] eta 7:28:19 lr 0.000908 time 21.5021 (21.5021) loss 4.3581 (4.3581) grad_norm 1.0111 (1.0111) [2022-01-19 10:03:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][10/1251] eta 1:22:01 lr 0.000908 time 1.8745 (3.9658) loss 3.9637 (3.9840) grad_norm 1.2828 (1.1227) [2022-01-19 10:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][20/1251] eta 1:05:01 lr 0.000908 time 2.1711 (3.1694) loss 4.1705 (3.8294) grad_norm 1.0677 (1.1175) [2022-01-19 10:03:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][30/1251] eta 0:58:15 lr 0.000908 time 1.8724 (2.8626) loss 4.5448 (3.8917) grad_norm 1.1575 (1.1119) [2022-01-19 10:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][40/1251] eta 0:54:48 lr 0.000908 time 3.8798 (2.7156) loss 4.0070 (3.9105) grad_norm 0.8956 (1.1054) 
[2022-01-19 10:04:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][50/1251] eta 0:52:39 lr 0.000908 time 2.2625 (2.6310) loss 3.9609 (3.8804) grad_norm 1.0194 (1.1064) [2022-01-19 10:04:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][60/1251] eta 0:50:42 lr 0.000908 time 1.7197 (2.5548) loss 2.6673 (3.8801) grad_norm 1.0401 (1.1107) [2022-01-19 10:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][70/1251] eta 0:49:37 lr 0.000908 time 2.0510 (2.5209) loss 4.2199 (3.8491) grad_norm 1.0751 (1.1120) [2022-01-19 10:05:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][80/1251] eta 0:48:26 lr 0.000908 time 3.3393 (2.4824) loss 4.8561 (3.8275) grad_norm 1.0906 (1.1185) [2022-01-19 10:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][90/1251] eta 0:47:39 lr 0.000908 time 2.1125 (2.4630) loss 3.4299 (3.8235) grad_norm 1.0406 (1.1173) [2022-01-19 10:06:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][100/1251] eta 0:46:28 lr 0.000908 time 1.8915 (2.4228) loss 3.3973 (3.8367) grad_norm 0.9469 (1.1162) [2022-01-19 10:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][110/1251] eta 0:45:23 lr 0.000908 time 1.7084 (2.3866) loss 4.3779 (3.8688) grad_norm 0.9534 (1.1143) [2022-01-19 10:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][120/1251] eta 0:44:39 lr 0.000908 time 2.2128 (2.3688) loss 4.5863 (3.8737) grad_norm 1.1677 (1.1089) [2022-01-19 10:07:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][130/1251] eta 0:44:09 lr 0.000908 time 2.1041 (2.3634) loss 4.0917 (3.8415) grad_norm 1.2964 (1.1088) [2022-01-19 10:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][140/1251] eta 0:43:28 lr 0.000908 time 2.2349 (2.3481) loss 2.6270 (3.8295) grad_norm 1.3741 (1.1126) [2022-01-19 10:08:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][150/1251] eta 0:42:48 lr 
0.000908 time 1.9724 (2.3326) loss 4.2674 (3.8198) grad_norm 1.1835 (1.1218) [2022-01-19 10:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][160/1251] eta 0:42:21 lr 0.000908 time 2.6207 (2.3297) loss 3.9763 (3.8372) grad_norm 1.0802 (1.1247) [2022-01-19 10:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][170/1251] eta 0:41:50 lr 0.000908 time 1.9925 (2.3223) loss 4.3007 (3.8418) grad_norm 1.2680 (1.1278) [2022-01-19 10:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][180/1251] eta 0:41:21 lr 0.000908 time 3.0512 (2.3167) loss 3.8684 (3.8449) grad_norm 1.4570 (1.1311) [2022-01-19 10:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][190/1251] eta 0:40:55 lr 0.000908 time 1.8582 (2.3143) loss 4.2048 (3.8519) grad_norm 1.0360 (1.1299) [2022-01-19 10:10:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][200/1251] eta 0:40:29 lr 0.000908 time 2.6637 (2.3114) loss 4.3265 (3.8512) grad_norm 1.1371 (1.1312) [2022-01-19 10:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][210/1251] eta 0:39:58 lr 0.000908 time 1.8653 (2.3038) loss 3.9449 (3.8554) grad_norm 0.9825 (1.1319) [2022-01-19 10:10:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][220/1251] eta 0:39:18 lr 0.000908 time 1.8809 (2.2873) loss 3.8494 (3.8653) grad_norm 1.2639 (1.1336) [2022-01-19 10:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][230/1251] eta 0:38:42 lr 0.000908 time 1.6635 (2.2744) loss 3.2472 (3.8619) grad_norm 1.6423 (1.1359) [2022-01-19 10:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][240/1251] eta 0:38:06 lr 0.000908 time 2.2020 (2.2620) loss 3.7510 (3.8643) grad_norm 1.0650 (1.1374) [2022-01-19 10:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][250/1251] eta 0:37:37 lr 0.000908 time 2.5326 (2.2548) loss 4.4885 (3.8611) grad_norm 1.2210 (1.1382) [2022-01-19 10:12:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][260/1251] eta 0:37:10 lr 0.000908 time 1.6538 (2.2507) loss 3.6523 (3.8569) grad_norm 1.0944 (1.1395) [2022-01-19 10:12:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][270/1251] eta 0:36:54 lr 0.000908 time 2.1785 (2.2577) loss 4.1330 (3.8632) grad_norm 1.1805 (1.1405) [2022-01-19 10:12:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][280/1251] eta 0:36:42 lr 0.000908 time 2.5136 (2.2686) loss 3.0457 (3.8551) grad_norm 1.1523 (1.1410) [2022-01-19 10:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][290/1251] eta 0:36:20 lr 0.000908 time 1.7813 (2.2692) loss 3.0244 (3.8532) grad_norm 1.2175 (1.1419) [2022-01-19 10:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][300/1251] eta 0:35:56 lr 0.000908 time 2.1358 (2.2674) loss 4.2737 (3.8599) grad_norm 1.1871 (1.1426) [2022-01-19 10:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][310/1251] eta 0:35:36 lr 0.000908 time 2.5296 (2.2705) loss 4.2334 (3.8732) grad_norm 1.0131 (1.1407) [2022-01-19 10:14:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][320/1251] eta 0:35:08 lr 0.000908 time 1.5763 (2.2649) loss 4.1191 (3.8787) grad_norm 1.1227 (1.1389) [2022-01-19 10:14:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][330/1251] eta 0:34:36 lr 0.000908 time 1.5603 (2.2542) loss 3.9430 (3.8739) grad_norm 0.9699 (1.1393) [2022-01-19 10:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][340/1251] eta 0:34:09 lr 0.000908 time 2.2218 (2.2500) loss 2.9606 (3.8705) grad_norm 1.2415 (1.1392) [2022-01-19 10:15:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][350/1251] eta 0:33:39 lr 0.000908 time 1.5741 (2.2413) loss 3.4927 (3.8733) grad_norm 1.2938 (1.1394) [2022-01-19 10:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][360/1251] eta 0:33:16 lr 0.000908 time 
2.8558 (2.2412) loss 3.9506 (3.8757) grad_norm 1.3929 (1.1417) [2022-01-19 10:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][370/1251] eta 0:32:55 lr 0.000908 time 2.2342 (2.2421) loss 3.6995 (3.8681) grad_norm 1.1643 (1.1448) [2022-01-19 10:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][380/1251] eta 0:32:35 lr 0.000908 time 2.6063 (2.2455) loss 4.2376 (3.8671) grad_norm 1.2829 (1.1451) [2022-01-19 10:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][390/1251] eta 0:32:18 lr 0.000908 time 2.0694 (2.2515) loss 3.6674 (3.8708) grad_norm 1.2989 (1.1440) [2022-01-19 10:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][400/1251] eta 0:31:59 lr 0.000908 time 2.4116 (2.2561) loss 3.2461 (3.8610) grad_norm 1.1247 (1.1459) [2022-01-19 10:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][410/1251] eta 0:31:36 lr 0.000908 time 2.1478 (2.2555) loss 4.6246 (3.8551) grad_norm 1.2531 (1.1445) [2022-01-19 10:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][420/1251] eta 0:31:10 lr 0.000907 time 1.8568 (2.2515) loss 3.0462 (3.8559) grad_norm 0.9521 (1.1448) [2022-01-19 10:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][430/1251] eta 0:30:43 lr 0.000907 time 1.8082 (2.2456) loss 3.1684 (3.8526) grad_norm 1.1331 (1.1462) [2022-01-19 10:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][440/1251] eta 0:30:14 lr 0.000907 time 1.9165 (2.2370) loss 3.9464 (3.8569) grad_norm 1.3421 (1.1474) [2022-01-19 10:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][450/1251] eta 0:29:47 lr 0.000907 time 1.9061 (2.2312) loss 3.9160 (3.8512) grad_norm 1.1413 (1.1474) [2022-01-19 10:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][460/1251] eta 0:29:22 lr 0.000907 time 2.1699 (2.2277) loss 4.1150 (3.8579) grad_norm 1.1066 (1.1470) [2022-01-19 10:19:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][470/1251] eta 0:29:00 lr 0.000907 time 2.2097 (2.2290) loss 3.9016 (3.8594) grad_norm 1.2756 (1.1470) [2022-01-19 10:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][480/1251] eta 0:28:40 lr 0.000907 time 2.7325 (2.2313) loss 4.2880 (3.8587) grad_norm 1.2592 (1.1458) [2022-01-19 10:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][490/1251] eta 0:28:19 lr 0.000907 time 2.5740 (2.2335) loss 4.0485 (3.8610) grad_norm 1.1941 (1.1449) [2022-01-19 10:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][500/1251] eta 0:27:56 lr 0.000907 time 2.5329 (2.2326) loss 3.4417 (3.8618) grad_norm 0.9611 (1.1448) [2022-01-19 10:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][510/1251] eta 0:27:37 lr 0.000907 time 2.1750 (2.2368) loss 4.5951 (3.8629) grad_norm 1.1485 (1.1448) [2022-01-19 10:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][520/1251] eta 0:27:16 lr 0.000907 time 2.9079 (2.2384) loss 4.1252 (3.8677) grad_norm 1.1700 (1.1432) [2022-01-19 10:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][530/1251] eta 0:26:51 lr 0.000907 time 1.9446 (2.2353) loss 4.1076 (3.8705) grad_norm 0.9997 (1.1418) [2022-01-19 10:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][540/1251] eta 0:26:26 lr 0.000907 time 2.4850 (2.2313) loss 4.0737 (3.8705) grad_norm 1.2846 (1.1450) [2022-01-19 10:22:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][550/1251] eta 0:26:00 lr 0.000907 time 2.0858 (2.2259) loss 4.2838 (3.8777) grad_norm 1.0535 (1.1444) [2022-01-19 10:23:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][560/1251] eta 0:25:37 lr 0.000907 time 2.1546 (2.2251) loss 3.8474 (3.8781) grad_norm 1.0872 (1.1432) [2022-01-19 10:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][570/1251] eta 0:25:16 lr 0.000907 time 
2.6053 (2.2263) loss 4.7178 (3.8770) grad_norm 1.1575 (1.1437) [2022-01-19 10:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][580/1251] eta 0:24:57 lr 0.000907 time 2.4258 (2.2318) loss 3.6201 (3.8786) grad_norm 1.0994 (1.1431) [2022-01-19 10:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][590/1251] eta 0:24:37 lr 0.000907 time 2.1389 (2.2356) loss 3.3073 (3.8817) grad_norm 1.1231 (1.1432) [2022-01-19 10:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][600/1251] eta 0:24:14 lr 0.000907 time 2.1712 (2.2347) loss 4.1964 (3.8795) grad_norm 1.0281 (1.1439) [2022-01-19 10:25:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][610/1251] eta 0:23:49 lr 0.000907 time 1.5835 (2.2308) loss 4.1614 (3.8771) grad_norm 1.1134 (1.1444) [2022-01-19 10:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][620/1251] eta 0:23:25 lr 0.000907 time 1.8699 (2.2271) loss 3.2683 (3.8773) grad_norm 1.1114 (1.1430) [2022-01-19 10:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][630/1251] eta 0:23:02 lr 0.000907 time 1.8473 (2.2263) loss 3.8686 (3.8754) grad_norm 1.0880 (1.1426) [2022-01-19 10:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][640/1251] eta 0:22:39 lr 0.000907 time 1.5971 (2.2249) loss 2.9641 (3.8724) grad_norm 0.9157 (1.1417) [2022-01-19 10:26:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][650/1251] eta 0:22:17 lr 0.000907 time 2.1487 (2.2252) loss 3.2716 (3.8723) grad_norm 1.0854 (1.1406) [2022-01-19 10:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][660/1251] eta 0:21:55 lr 0.000907 time 2.0906 (2.2264) loss 4.6076 (3.8779) grad_norm 1.1635 (1.1415) [2022-01-19 10:27:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][670/1251] eta 0:21:35 lr 0.000907 time 2.8392 (2.2299) loss 3.6626 (3.8785) grad_norm 0.9168 (1.1414) [2022-01-19 10:27:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][680/1251] eta 0:21:14 lr 0.000907 time 2.8336 (2.2325) loss 4.2930 (3.8775) grad_norm 1.2770 (1.1416) [2022-01-19 10:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][690/1251] eta 0:20:51 lr 0.000907 time 2.2722 (2.2315) loss 3.2373 (3.8780) grad_norm 0.9728 (1.1405) [2022-01-19 10:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][700/1251] eta 0:20:26 lr 0.000907 time 1.8591 (2.2266) loss 4.3265 (3.8788) grad_norm 1.1739 (1.1404) [2022-01-19 10:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][710/1251] eta 0:20:02 lr 0.000907 time 1.9973 (2.2236) loss 4.3630 (3.8789) grad_norm 1.2096 (1.1408) [2022-01-19 10:29:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][720/1251] eta 0:19:41 lr 0.000907 time 2.2450 (2.2241) loss 3.4887 (3.8760) grad_norm 1.3075 (1.1404) [2022-01-19 10:29:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][730/1251] eta 0:19:19 lr 0.000907 time 2.8973 (2.2250) loss 2.3685 (3.8772) grad_norm 1.2026 (1.1409) [2022-01-19 10:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][740/1251] eta 0:18:57 lr 0.000907 time 2.5568 (2.2262) loss 4.0961 (3.8717) grad_norm 1.0527 (1.1421) [2022-01-19 10:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][750/1251] eta 0:18:35 lr 0.000907 time 2.8189 (2.2270) loss 4.1397 (3.8740) grad_norm 1.1478 (1.1431) [2022-01-19 10:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][760/1251] eta 0:18:12 lr 0.000907 time 1.8698 (2.2243) loss 4.7924 (3.8778) grad_norm 1.1975 (1.1455) [2022-01-19 10:30:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][770/1251] eta 0:17:49 lr 0.000907 time 3.0466 (2.2241) loss 4.0887 (3.8781) grad_norm 1.0532 (1.1451) [2022-01-19 10:31:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][780/1251] eta 0:17:27 lr 0.000907 time 
2.1188 (2.2231) loss 4.3954 (3.8772) grad_norm 1.0518 (1.1447) [2022-01-19 10:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][790/1251] eta 0:17:04 lr 0.000907 time 2.3873 (2.2231) loss 4.4837 (3.8801) grad_norm 0.9708 (1.1436) [2022-01-19 10:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][800/1251] eta 0:16:42 lr 0.000907 time 2.1414 (2.2226) loss 3.0435 (3.8832) grad_norm 0.9733 (1.1431) [2022-01-19 10:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][810/1251] eta 0:16:20 lr 0.000907 time 3.3540 (2.2230) loss 3.3171 (3.8801) grad_norm 1.1137 (1.1424) [2022-01-19 10:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][820/1251] eta 0:15:57 lr 0.000907 time 1.5510 (2.2222) loss 3.8062 (3.8782) grad_norm 1.2258 (1.1420) [2022-01-19 10:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][830/1251] eta 0:15:35 lr 0.000906 time 2.2110 (2.2209) loss 4.1732 (3.8835) grad_norm 0.9531 (1.1414) [2022-01-19 10:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][840/1251] eta 0:15:12 lr 0.000906 time 1.6826 (2.2196) loss 4.3142 (3.8847) grad_norm 1.2740 (1.1425) [2022-01-19 10:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][850/1251] eta 0:14:49 lr 0.000906 time 2.7814 (2.2192) loss 4.0477 (3.8865) grad_norm 1.1530 (1.1427) [2022-01-19 10:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][860/1251] eta 0:14:27 lr 0.000906 time 1.7491 (2.2178) loss 4.0637 (3.8862) grad_norm 1.2910 (1.1438) [2022-01-19 10:34:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][870/1251] eta 0:14:04 lr 0.000906 time 2.1870 (2.2164) loss 3.1540 (3.8870) grad_norm 0.9065 (1.1436) [2022-01-19 10:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][880/1251] eta 0:13:41 lr 0.000906 time 1.6121 (2.2154) loss 4.1945 (3.8902) grad_norm 1.2680 (1.1434) [2022-01-19 10:35:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][890/1251] eta 0:13:19 lr 0.000906 time 2.5309 (2.2147) loss 3.7227 (3.8907) grad_norm 1.1626 (1.1435) [2022-01-19 10:35:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][900/1251] eta 0:12:57 lr 0.000906 time 2.8437 (2.2162) loss 3.6794 (3.8893) grad_norm 1.1821 (1.1439) [2022-01-19 10:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][910/1251] eta 0:12:36 lr 0.000906 time 2.1698 (2.2173) loss 3.9985 (3.8910) grad_norm 1.1022 (1.1438) [2022-01-19 10:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][920/1251] eta 0:12:14 lr 0.000906 time 1.9791 (2.2190) loss 4.2092 (3.8936) grad_norm 1.1259 (1.1434) [2022-01-19 10:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][930/1251] eta 0:11:52 lr 0.000906 time 2.1827 (2.2187) loss 4.3773 (3.8976) grad_norm 1.1601 (1.1429) [2022-01-19 10:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][940/1251] eta 0:11:29 lr 0.000906 time 2.5409 (2.2174) loss 3.8664 (3.8983) grad_norm 1.3926 (1.1436) [2022-01-19 10:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][950/1251] eta 0:11:06 lr 0.000906 time 2.0412 (2.2146) loss 3.8084 (3.8974) grad_norm 1.0682 (1.1439) [2022-01-19 10:37:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][960/1251] eta 0:10:43 lr 0.000906 time 1.8749 (2.2117) loss 4.1551 (3.8957) grad_norm 1.1614 (1.1441) [2022-01-19 10:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][970/1251] eta 0:10:21 lr 0.000906 time 2.5533 (2.2105) loss 3.6993 (3.8956) grad_norm 1.1764 (1.1436) [2022-01-19 10:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][980/1251] eta 0:09:58 lr 0.000906 time 2.4365 (2.2102) loss 3.9206 (3.8967) grad_norm 1.0695 (1.1437) [2022-01-19 10:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][990/1251] eta 0:09:36 lr 0.000906 time 
2.3837 (2.2104) loss 4.3532 (3.8989) grad_norm 1.3142 (1.1441) [2022-01-19 10:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1000/1251] eta 0:09:15 lr 0.000906 time 2.1723 (2.2116) loss 3.3254 (3.8999) grad_norm 1.0514 (1.1434) [2022-01-19 10:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1010/1251] eta 0:08:53 lr 0.000906 time 2.5486 (2.2123) loss 2.9551 (3.8997) grad_norm 1.0240 (1.1430) [2022-01-19 10:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1020/1251] eta 0:08:31 lr 0.000906 time 2.1386 (2.2127) loss 4.0573 (3.8989) grad_norm 1.1299 (1.1428) [2022-01-19 10:40:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1030/1251] eta 0:08:09 lr 0.000906 time 1.9063 (2.2138) loss 3.0367 (3.8978) grad_norm 1.0300 (1.1418) [2022-01-19 10:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1040/1251] eta 0:07:47 lr 0.000906 time 1.8691 (2.2145) loss 4.2108 (3.8983) grad_norm 1.7059 (1.1421) [2022-01-19 10:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1050/1251] eta 0:07:24 lr 0.000906 time 2.4043 (2.2134) loss 3.5693 (3.8999) grad_norm 1.1910 (1.1421) [2022-01-19 10:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1060/1251] eta 0:07:02 lr 0.000906 time 1.9630 (2.2116) loss 4.0069 (3.9004) grad_norm 1.1574 (1.1416) [2022-01-19 10:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1070/1251] eta 0:06:40 lr 0.000906 time 2.0616 (2.2112) loss 4.0398 (3.8987) grad_norm 1.0656 (1.1412) [2022-01-19 10:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1080/1251] eta 0:06:18 lr 0.000906 time 1.9421 (2.2125) loss 4.0078 (3.8996) grad_norm 1.1259 (1.1410) [2022-01-19 10:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1090/1251] eta 0:05:56 lr 0.000906 time 2.8173 (2.2137) loss 3.6631 (3.8993) grad_norm 1.1193 (1.1407) [2022-01-19 10:42:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1100/1251] eta 0:05:34 lr 0.000906 time 2.1254 (2.2130) loss 3.8437 (3.9017) grad_norm 1.5541 (1.1414) [2022-01-19 10:43:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1110/1251] eta 0:05:11 lr 0.000906 time 1.9179 (2.2115) loss 3.7173 (3.9003) grad_norm 1.3110 (1.1413) [2022-01-19 10:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1120/1251] eta 0:04:49 lr 0.000906 time 1.8535 (2.2116) loss 3.0005 (3.9013) grad_norm 1.2180 (1.1409) [2022-01-19 10:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1130/1251] eta 0:04:27 lr 0.000906 time 2.6279 (2.2110) loss 4.6765 (3.9017) grad_norm 1.0259 (1.1405) [2022-01-19 10:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1140/1251] eta 0:04:05 lr 0.000906 time 2.1238 (2.2107) loss 4.0252 (3.9003) grad_norm 1.1267 (1.1399) [2022-01-19 10:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1150/1251] eta 0:03:43 lr 0.000906 time 1.5743 (2.2112) loss 3.9499 (3.9011) grad_norm 1.1815 (1.1398) [2022-01-19 10:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1160/1251] eta 0:03:21 lr 0.000906 time 1.9608 (2.2107) loss 3.1589 (3.8993) grad_norm 1.1302 (1.1396) [2022-01-19 10:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1170/1251] eta 0:02:58 lr 0.000906 time 1.6198 (2.2091) loss 4.1934 (3.8993) grad_norm 1.0121 (1.1391) [2022-01-19 10:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1180/1251] eta 0:02:36 lr 0.000906 time 2.2749 (2.2080) loss 4.6283 (3.8988) grad_norm 1.2994 (1.1389) [2022-01-19 10:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1190/1251] eta 0:02:14 lr 0.000906 time 2.0701 (2.2084) loss 3.6397 (3.8966) grad_norm 1.3599 (1.1391) [2022-01-19 10:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1200/1251] eta 0:01:52 lr 
0.000906 time 4.3798 (2.2087) loss 4.1546 (3.8944) grad_norm 1.0143 (1.1380) [2022-01-19 10:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1210/1251] eta 0:01:30 lr 0.000906 time 1.9403 (2.2084) loss 3.9986 (3.8937) grad_norm 0.9812 (1.1375) [2022-01-19 10:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1220/1251] eta 0:01:08 lr 0.000906 time 2.4136 (2.2079) loss 4.4680 (3.8940) grad_norm 1.0766 (1.1368) [2022-01-19 10:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1230/1251] eta 0:00:46 lr 0.000906 time 2.2353 (2.2085) loss 3.2055 (3.8925) grad_norm 1.1882 (1.1367) [2022-01-19 10:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1240/1251] eta 0:00:24 lr 0.000905 time 2.4204 (2.2081) loss 4.2167 (3.8937) grad_norm 1.0308 (1.1371) [2022-01-19 10:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [59/300][1250/1251] eta 0:00:02 lr 0.000905 time 1.1492 (2.2022) loss 4.6230 (3.8939) grad_norm 1.2233 (1.1379) [2022-01-19 10:48:16 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 59 training takes 0:45:55 [2022-01-19 10:48:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.633 (18.633) Loss 1.2471 (1.2471) Acc@1 72.559 (72.559) Acc@5 89.746 (89.746) [2022-01-19 10:48:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.518 (3.399) Loss 1.2219 (1.2568) Acc@1 72.363 (71.218) Acc@5 91.504 (90.154) [2022-01-19 10:49:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.938 (2.672) Loss 1.2188 (1.2421) Acc@1 71.191 (71.447) Acc@5 90.527 (90.453) [2022-01-19 10:49:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.645 (2.279) Loss 1.2549 (1.2492) Acc@1 71.387 (71.176) Acc@5 90.723 (90.543) [2022-01-19 10:49:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 5.397 (2.181) Loss 1.2652 (1.2449) Acc@1 70.996 (71.225) Acc@5 89.062 (90.644) [2022-01-19 10:49:52 
swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.080 Acc@5 90.598 [2022-01-19 10:49:52 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.1% [2022-01-19 10:49:52 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 71.28% [2022-01-19 10:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][0/1251] eta 8:09:04 lr 0.000905 time 23.4566 (23.4566) loss 3.7538 (3.7538) grad_norm 1.3162 (1.3162) [2022-01-19 10:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][10/1251] eta 1:29:30 lr 0.000905 time 1.8495 (4.3276) loss 4.0114 (3.8008) grad_norm 1.1703 (1.2156) [2022-01-19 10:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][20/1251] eta 1:07:18 lr 0.000905 time 1.9029 (3.2810) loss 3.3841 (3.7295) grad_norm 1.3234 (1.1924) [2022-01-19 10:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][30/1251] eta 0:59:22 lr 0.000905 time 1.9416 (2.9173) loss 4.3795 (3.8754) grad_norm 1.3181 (1.1790) [2022-01-19 10:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][40/1251] eta 0:55:41 lr 0.000905 time 3.1309 (2.7596) loss 3.4401 (3.8419) grad_norm 1.0630 (1.1625) [2022-01-19 10:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][50/1251] eta 0:53:19 lr 0.000905 time 3.1358 (2.6637) loss 4.0362 (3.9005) grad_norm 1.1627 (1.1689) [2022-01-19 10:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][60/1251] eta 0:51:00 lr 0.000905 time 1.5808 (2.5694) loss 4.2375 (3.9052) grad_norm 0.9811 (1.1653) [2022-01-19 10:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][70/1251] eta 0:49:31 lr 0.000905 time 2.0734 (2.5159) loss 4.7408 (3.9144) grad_norm 1.1154 (1.1659) [2022-01-19 10:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][80/1251] eta 0:48:11 lr 0.000905 time 2.2723 (2.4696) loss 4.0932 (3.9118) grad_norm 1.2790 (1.1710) 
[2022-01-19 10:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][90/1251] eta 0:47:09 lr 0.000905 time 1.8956 (2.4368) loss 4.5472 (3.8927) grad_norm 1.2828 (1.1688) [2022-01-19 10:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][100/1251] eta 0:46:19 lr 0.000905 time 2.6391 (2.4150) loss 4.7946 (3.8731) grad_norm 1.0248 (1.1618) [2022-01-19 10:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][110/1251] eta 0:45:37 lr 0.000905 time 2.1713 (2.3989) loss 4.2991 (3.8809) grad_norm 1.0347 (1.1577) [2022-01-19 10:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][120/1251] eta 0:44:49 lr 0.000905 time 2.3113 (2.3778) loss 3.9724 (3.8790) grad_norm 1.4185 (1.1605) [2022-01-19 10:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][130/1251] eta 0:44:09 lr 0.000905 time 1.9837 (2.3634) loss 3.9951 (3.8783) grad_norm 1.3738 (1.1571) [2022-01-19 10:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][140/1251] eta 0:43:25 lr 0.000905 time 2.5901 (2.3455) loss 4.5821 (3.8862) grad_norm 1.9438 (1.1633) [2022-01-19 10:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][150/1251] eta 0:42:41 lr 0.000905 time 2.5915 (2.3269) loss 3.7014 (3.8878) grad_norm 1.3747 (1.1659) [2022-01-19 10:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][160/1251] eta 0:42:03 lr 0.000905 time 2.3587 (2.3126) loss 3.4964 (3.8891) grad_norm 1.3918 (1.1665) [2022-01-19 10:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][170/1251] eta 0:41:39 lr 0.000905 time 2.0713 (2.3118) loss 3.9815 (3.9094) grad_norm 1.0379 (1.1632) [2022-01-19 10:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][180/1251] eta 0:41:11 lr 0.000905 time 3.0509 (2.3077) loss 3.1339 (3.8907) grad_norm 1.3353 (1.1575) [2022-01-19 10:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][190/1251] eta 0:40:46 
lr 0.000905 time 1.7587 (2.3055) loss 4.0336 (3.8965) grad_norm 1.2512 (1.1561) [2022-01-19 10:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][200/1251] eta 0:40:17 lr 0.000905 time 2.2264 (2.3006) loss 4.1008 (3.8937) grad_norm 1.1561 (1.1583) [2022-01-19 10:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][210/1251] eta 0:39:48 lr 0.000905 time 1.8234 (2.2941) loss 4.3903 (3.8869) grad_norm 1.2211 (1.1568) [2022-01-19 10:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][220/1251] eta 0:39:21 lr 0.000905 time 3.1149 (2.2903) loss 3.7556 (3.8892) grad_norm 0.9201 (1.1545) [2022-01-19 10:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][230/1251] eta 0:38:57 lr 0.000905 time 2.9358 (2.2896) loss 3.2139 (3.8874) grad_norm 1.0456 (1.1544) [2022-01-19 10:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][240/1251] eta 0:38:27 lr 0.000905 time 2.3011 (2.2820) loss 4.8553 (3.8912) grad_norm 1.2486 (1.1549) [2022-01-19 10:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][250/1251] eta 0:37:58 lr 0.000905 time 2.4283 (2.2765) loss 4.0887 (3.8955) grad_norm 1.1476 (1.1576) [2022-01-19 10:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][260/1251] eta 0:37:30 lr 0.000905 time 1.9927 (2.2709) loss 3.6563 (3.8941) grad_norm 1.1872 (1.1570) [2022-01-19 11:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][270/1251] eta 0:37:02 lr 0.000905 time 2.5684 (2.2660) loss 4.6081 (3.9006) grad_norm 0.9865 (1.1546) [2022-01-19 11:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][280/1251] eta 0:36:40 lr 0.000905 time 2.1811 (2.2658) loss 3.1600 (3.9034) grad_norm 1.1849 (1.1547) [2022-01-19 11:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][290/1251] eta 0:36:13 lr 0.000905 time 2.1687 (2.2618) loss 3.9641 (3.8991) grad_norm 1.0430 (1.1583) [2022-01-19 11:01:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][300/1251] eta 0:35:51 lr 0.000905 time 3.1155 (2.2618) loss 4.4597 (3.8913) grad_norm 1.1342 (1.1565) [2022-01-19 11:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][310/1251] eta 0:35:28 lr 0.000905 time 2.6449 (2.2620) loss 4.1344 (3.8899) grad_norm 0.9717 (1.1531) [2022-01-19 11:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][320/1251] eta 0:34:59 lr 0.000905 time 2.0183 (2.2555) loss 3.1976 (3.8880) grad_norm 1.2287 (1.1534) [2022-01-19 11:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][330/1251] eta 0:34:35 lr 0.000905 time 2.4920 (2.2539) loss 3.7817 (3.8787) grad_norm 1.2113 (1.1557) [2022-01-19 11:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][340/1251] eta 0:34:08 lr 0.000905 time 1.9534 (2.2483) loss 3.8431 (3.8832) grad_norm 1.2090 (1.1565) [2022-01-19 11:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][350/1251] eta 0:33:43 lr 0.000905 time 3.0484 (2.2460) loss 3.1696 (3.8884) grad_norm 1.3163 (1.1565) [2022-01-19 11:03:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][360/1251] eta 0:33:21 lr 0.000905 time 2.5882 (2.2459) loss 4.3811 (3.8923) grad_norm 0.9763 (1.1555) [2022-01-19 11:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][370/1251] eta 0:32:54 lr 0.000905 time 1.6185 (2.2409) loss 3.8617 (3.8940) grad_norm 1.0199 (1.1577) [2022-01-19 11:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][380/1251] eta 0:32:31 lr 0.000905 time 2.4859 (2.2401) loss 2.9272 (3.8994) grad_norm 1.1574 (1.1578) [2022-01-19 11:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][390/1251] eta 0:32:07 lr 0.000905 time 2.7790 (2.2392) loss 4.2320 (3.8981) grad_norm 1.2051 (1.1577) [2022-01-19 11:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][400/1251] eta 0:31:45 lr 0.000904 time 
2.4438 (2.2392) loss 4.6219 (3.9003) grad_norm 0.9899 (1.1572) [2022-01-19 11:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][410/1251] eta 0:31:21 lr 0.000904 time 2.7797 (2.2377) loss 3.3902 (3.9017) grad_norm 1.3852 (1.1567) [2022-01-19 11:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][420/1251] eta 0:30:55 lr 0.000904 time 1.9950 (2.2332) loss 4.4915 (3.8981) grad_norm 1.4688 (1.1570) [2022-01-19 11:05:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][430/1251] eta 0:30:34 lr 0.000904 time 2.8226 (2.2341) loss 4.5213 (3.8992) grad_norm 1.2436 (1.1576) [2022-01-19 11:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][440/1251] eta 0:30:14 lr 0.000904 time 3.4608 (2.2369) loss 4.1171 (3.8981) grad_norm 1.2301 (1.1587) [2022-01-19 11:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][450/1251] eta 0:29:47 lr 0.000904 time 1.8141 (2.2311) loss 4.5119 (3.9005) grad_norm 1.0662 (1.1561) [2022-01-19 11:07:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][460/1251] eta 0:29:23 lr 0.000904 time 1.9620 (2.2291) loss 4.3333 (3.9043) grad_norm 1.2808 (1.1560) [2022-01-19 11:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][470/1251] eta 0:28:58 lr 0.000904 time 2.6609 (2.2266) loss 3.0134 (3.9032) grad_norm 1.1517 (1.1541) [2022-01-19 11:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][480/1251] eta 0:28:38 lr 0.000904 time 3.4308 (2.2285) loss 4.2000 (3.9052) grad_norm 1.2389 (1.1549) [2022-01-19 11:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][490/1251] eta 0:28:17 lr 0.000904 time 2.8183 (2.2301) loss 3.5747 (3.9016) grad_norm 1.3611 (1.1556) [2022-01-19 11:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][500/1251] eta 0:27:55 lr 0.000904 time 1.5338 (2.2313) loss 4.0010 (3.9001) grad_norm 1.0848 (1.1552) [2022-01-19 11:08:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][510/1251] eta 0:27:34 lr 0.000904 time 3.1670 (2.2324) loss 3.4119 (3.9020) grad_norm 1.3514 (1.1559) [2022-01-19 11:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][520/1251] eta 0:27:11 lr 0.000904 time 2.5715 (2.2315) loss 4.5434 (3.9036) grad_norm 1.1969 (1.1557) [2022-01-19 11:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][530/1251] eta 0:26:47 lr 0.000904 time 2.1202 (2.2294) loss 3.6895 (3.9084) grad_norm 1.2273 (1.1569) [2022-01-19 11:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][540/1251] eta 0:26:22 lr 0.000904 time 1.9555 (2.2262) loss 3.8938 (3.9078) grad_norm 0.9857 (1.1568) [2022-01-19 11:10:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][550/1251] eta 0:26:00 lr 0.000904 time 3.4272 (2.2266) loss 4.2998 (3.9108) grad_norm 1.0759 (1.1556) [2022-01-19 11:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][560/1251] eta 0:25:36 lr 0.000904 time 1.9628 (2.2232) loss 4.3510 (3.9108) grad_norm 1.1368 (1.1549) [2022-01-19 11:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][570/1251] eta 0:25:13 lr 0.000904 time 1.5110 (2.2227) loss 4.2862 (3.9107) grad_norm 1.2312 (1.1550) [2022-01-19 11:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][580/1251] eta 0:24:49 lr 0.000904 time 1.7478 (2.2196) loss 3.3793 (3.9126) grad_norm 1.0657 (1.1536) [2022-01-19 11:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][590/1251] eta 0:24:27 lr 0.000904 time 2.8108 (2.2195) loss 3.6937 (3.9138) grad_norm 1.1503 (1.1522) [2022-01-19 11:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][600/1251] eta 0:24:06 lr 0.000904 time 3.1228 (2.2215) loss 3.6940 (3.9137) grad_norm 1.0943 (1.1530) [2022-01-19 11:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][610/1251] eta 0:23:42 lr 0.000904 time 
1.9835 (2.2187) loss 4.5215 (3.9154) grad_norm 1.2121 (1.1534) [2022-01-19 11:12:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][620/1251] eta 0:23:20 lr 0.000904 time 2.2271 (2.2188) loss 2.7978 (3.9172) grad_norm 1.2681 (1.1540) [2022-01-19 11:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][630/1251] eta 0:22:58 lr 0.000904 time 2.3667 (2.2195) loss 3.1715 (3.9168) grad_norm 1.6116 (1.1550) [2022-01-19 11:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][640/1251] eta 0:22:37 lr 0.000904 time 3.1448 (2.2211) loss 4.6200 (3.9193) grad_norm 1.0120 (1.1540) [2022-01-19 11:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][650/1251] eta 0:22:14 lr 0.000904 time 1.8972 (2.2209) loss 3.7410 (3.9190) grad_norm 0.9115 (1.1529) [2022-01-19 11:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][660/1251] eta 0:21:51 lr 0.000904 time 1.7728 (2.2200) loss 4.4157 (3.9192) grad_norm 1.0052 (1.1529) [2022-01-19 11:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][670/1251] eta 0:21:28 lr 0.000904 time 1.7753 (2.2176) loss 4.2579 (3.9185) grad_norm 1.0684 (1.1547) [2022-01-19 11:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][680/1251] eta 0:21:04 lr 0.000904 time 2.4250 (2.2152) loss 4.3176 (3.9219) grad_norm 1.4002 (1.1551) [2022-01-19 11:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][690/1251] eta 0:20:41 lr 0.000904 time 1.9431 (2.2124) loss 3.5871 (3.9187) grad_norm 1.1593 (1.1548) [2022-01-19 11:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][700/1251] eta 0:20:18 lr 0.000904 time 2.2630 (2.2116) loss 3.9109 (3.9199) grad_norm 1.0889 (1.1536) [2022-01-19 11:16:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][710/1251] eta 0:19:57 lr 0.000904 time 2.5379 (2.2140) loss 4.0368 (3.9214) grad_norm 1.0797 (1.1526) [2022-01-19 11:16:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][720/1251] eta 0:19:35 lr 0.000904 time 2.5254 (2.2138) loss 4.2502 (3.9212) grad_norm 1.0916 (1.1514) [2022-01-19 11:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][730/1251] eta 0:19:14 lr 0.000904 time 2.8625 (2.2152) loss 2.9823 (3.9195) grad_norm 1.2440 (1.1507) [2022-01-19 11:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][740/1251] eta 0:18:52 lr 0.000904 time 2.5146 (2.2153) loss 3.8222 (3.9217) grad_norm 1.1500 (1.1508) [2022-01-19 11:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][750/1251] eta 0:18:30 lr 0.000904 time 2.5198 (2.2163) loss 4.5936 (3.9180) grad_norm 1.0803 (1.1510) [2022-01-19 11:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][760/1251] eta 0:18:07 lr 0.000904 time 1.6152 (2.2139) loss 4.1127 (3.9136) grad_norm 1.1648 (1.1503) [2022-01-19 11:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][770/1251] eta 0:17:43 lr 0.000904 time 1.8558 (2.2119) loss 4.0464 (3.9144) grad_norm 1.1618 (1.1506) [2022-01-19 11:18:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][780/1251] eta 0:17:20 lr 0.000904 time 2.4680 (2.2096) loss 3.8844 (3.9154) grad_norm 1.1810 (1.1513) [2022-01-19 11:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][790/1251] eta 0:16:58 lr 0.000904 time 2.5194 (2.2094) loss 3.9709 (3.9160) grad_norm 1.1575 (1.1519) [2022-01-19 11:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][800/1251] eta 0:16:36 lr 0.000904 time 2.4876 (2.2100) loss 2.9782 (3.9146) grad_norm 0.9358 (1.1524) [2022-01-19 11:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][810/1251] eta 0:16:15 lr 0.000903 time 2.8704 (2.2110) loss 3.6606 (3.9130) grad_norm 1.2340 (1.1535) [2022-01-19 11:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][820/1251] eta 0:15:52 lr 0.000903 time 
1.8667 (2.2106) loss 3.7528 (3.9151) grad_norm 1.2964 (1.1541) [2022-01-19 11:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][830/1251] eta 0:15:30 lr 0.000903 time 2.0012 (2.2107) loss 3.7683 (3.9122) grad_norm 0.9316 (1.1530) [2022-01-19 11:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][840/1251] eta 0:15:07 lr 0.000903 time 1.9363 (2.2084) loss 4.0518 (3.9112) grad_norm 1.0210 (1.1521) [2022-01-19 11:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][850/1251] eta 0:14:45 lr 0.000903 time 1.9577 (2.2082) loss 2.7385 (3.9100) grad_norm 1.7675 (1.1524) [2022-01-19 11:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][860/1251] eta 0:14:23 lr 0.000903 time 2.1438 (2.2089) loss 4.4000 (3.9113) grad_norm 1.0470 (1.1521) [2022-01-19 11:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][870/1251] eta 0:14:01 lr 0.000903 time 2.2072 (2.2090) loss 4.2596 (3.9079) grad_norm 1.1233 (1.1517) [2022-01-19 11:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][880/1251] eta 0:13:39 lr 0.000903 time 2.1025 (2.2088) loss 2.6184 (3.9078) grad_norm 1.1580 (1.1522) [2022-01-19 11:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][890/1251] eta 0:13:17 lr 0.000903 time 2.2511 (2.2095) loss 3.3320 (3.9087) grad_norm 1.2379 (1.1519) [2022-01-19 11:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][900/1251] eta 0:12:54 lr 0.000903 time 2.3455 (2.2075) loss 2.6039 (3.9060) grad_norm 1.1436 (1.1519) [2022-01-19 11:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][910/1251] eta 0:12:31 lr 0.000903 time 1.8861 (2.2046) loss 3.9367 (3.9058) grad_norm 1.2892 (1.1521) [2022-01-19 11:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][920/1251] eta 0:12:09 lr 0.000903 time 2.0965 (2.2039) loss 4.3917 (3.9035) grad_norm 1.1190 (1.1520) [2022-01-19 11:24:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][930/1251] eta 0:11:46 lr 0.000903 time 2.1974 (2.2020) loss 4.0446 (3.9023) grad_norm 1.1867 (1.1516) [2022-01-19 11:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][940/1251] eta 0:11:24 lr 0.000903 time 2.3318 (2.2013) loss 3.8799 (3.9017) grad_norm 1.2056 (1.1513) [2022-01-19 11:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][950/1251] eta 0:11:02 lr 0.000903 time 1.9903 (2.2007) loss 3.8642 (3.8994) grad_norm 1.2341 (1.1525) [2022-01-19 11:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][960/1251] eta 0:10:40 lr 0.000903 time 2.0624 (2.1999) loss 4.1668 (3.8976) grad_norm 1.6219 (1.1536) [2022-01-19 11:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][970/1251] eta 0:10:18 lr 0.000903 time 2.4083 (2.1996) loss 4.1595 (3.8995) grad_norm 1.2911 (1.1544) [2022-01-19 11:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][980/1251] eta 0:09:56 lr 0.000903 time 2.9942 (2.2021) loss 3.3827 (3.8982) grad_norm 1.0086 (1.1537) [2022-01-19 11:26:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][990/1251] eta 0:09:35 lr 0.000903 time 1.8308 (2.2036) loss 3.7589 (3.8981) grad_norm 1.0709 (1.1532) [2022-01-19 11:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1000/1251] eta 0:09:13 lr 0.000903 time 3.0928 (2.2052) loss 3.1357 (3.8966) grad_norm 0.9877 (1.1532) [2022-01-19 11:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1010/1251] eta 0:08:51 lr 0.000903 time 2.2150 (2.2055) loss 4.2090 (3.8968) grad_norm 1.0279 (1.1527) [2022-01-19 11:27:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1020/1251] eta 0:08:29 lr 0.000903 time 1.8330 (2.2047) loss 2.8780 (3.8970) grad_norm 1.0734 (1.1517) [2022-01-19 11:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1030/1251] eta 0:08:07 lr 0.000903 time 
2.3236 (2.2040) loss 4.2481 (3.8993) grad_norm 0.9927 (1.1515) [2022-01-19 11:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1040/1251] eta 0:07:44 lr 0.000903 time 1.8340 (2.2027) loss 4.0839 (3.8988) grad_norm 1.0742 (1.1507) [2022-01-19 11:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1050/1251] eta 0:07:22 lr 0.000903 time 2.2112 (2.2011) loss 4.7658 (3.9003) grad_norm 1.0647 (1.1501) [2022-01-19 11:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1060/1251] eta 0:07:00 lr 0.000903 time 2.1730 (2.1995) loss 4.6623 (3.9031) grad_norm 1.0901 (1.1497) [2022-01-19 11:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1070/1251] eta 0:06:38 lr 0.000903 time 2.3588 (2.1997) loss 3.7783 (3.9042) grad_norm 0.9809 (1.1491) [2022-01-19 11:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1080/1251] eta 0:06:16 lr 0.000903 time 2.0749 (2.2000) loss 4.8942 (3.9063) grad_norm 1.2283 (1.1495) [2022-01-19 11:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1090/1251] eta 0:05:54 lr 0.000903 time 2.2527 (2.1998) loss 4.0425 (3.9057) grad_norm 1.0491 (1.1491) [2022-01-19 11:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1100/1251] eta 0:05:32 lr 0.000903 time 2.1661 (2.1997) loss 2.6105 (3.9024) grad_norm 1.1400 (1.1487) [2022-01-19 11:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1110/1251] eta 0:05:10 lr 0.000903 time 1.9473 (2.2003) loss 4.3744 (3.9036) grad_norm 1.1570 (1.1480) [2022-01-19 11:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1120/1251] eta 0:04:48 lr 0.000903 time 1.6843 (2.2004) loss 3.3524 (3.9026) grad_norm 1.5838 (1.1485) [2022-01-19 11:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1130/1251] eta 0:04:26 lr 0.000903 time 3.0174 (2.2011) loss 2.7252 (3.9022) grad_norm 1.3295 (1.1492) [2022-01-19 11:31:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1140/1251] eta 0:04:04 lr 0.000903 time 2.3700 (2.2012) loss 4.1284 (3.9030) grad_norm 1.4161 (1.1494) [2022-01-19 11:32:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1150/1251] eta 0:03:42 lr 0.000903 time 2.1914 (2.2014) loss 4.5631 (3.8993) grad_norm 1.1556 (1.1500) [2022-01-19 11:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1160/1251] eta 0:03:20 lr 0.000903 time 1.9036 (2.2001) loss 4.3837 (3.9013) grad_norm 1.0682 (1.1492) [2022-01-19 11:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1170/1251] eta 0:02:58 lr 0.000903 time 2.7869 (2.2009) loss 3.5221 (3.9021) grad_norm 1.0602 (1.1489) [2022-01-19 11:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1180/1251] eta 0:02:36 lr 0.000903 time 1.8839 (2.2001) loss 3.8955 (3.9024) grad_norm 1.1333 (1.1485) [2022-01-19 11:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1190/1251] eta 0:02:14 lr 0.000903 time 2.6046 (2.1996) loss 4.9202 (3.9025) grad_norm 1.0495 (1.1490) [2022-01-19 11:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1200/1251] eta 0:01:52 lr 0.000903 time 1.9383 (2.1989) loss 4.5670 (3.9030) grad_norm 1.1430 (1.1489) [2022-01-19 11:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1210/1251] eta 0:01:30 lr 0.000902 time 2.5057 (2.1992) loss 4.2665 (3.9026) grad_norm 1.2423 (1.1491) [2022-01-19 11:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1220/1251] eta 0:01:08 lr 0.000902 time 2.4235 (2.1998) loss 4.3657 (3.9038) grad_norm 1.0475 (1.1490) [2022-01-19 11:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1230/1251] eta 0:00:46 lr 0.000902 time 2.0504 (2.1994) loss 4.4958 (3.9028) grad_norm 1.1519 (1.1488) [2022-01-19 11:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1240/1251] eta 0:00:24 lr 
0.000902 time 1.4479 (2.1982) loss 3.0590 (3.9007) grad_norm 1.0040 (1.1489) [2022-01-19 11:35:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [60/300][1250/1251] eta 0:00:02 lr 0.000902 time 1.1677 (2.1930) loss 4.1357 (3.9006) grad_norm 1.0011 (1.1486) [2022-01-19 11:35:36 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 60 training takes 0:45:43 [2022-01-19 11:35:36 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_60 saving...... [2022-01-19 11:35:47 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_60 saved !!! [2022-01-19 11:36:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 13.509 (13.509) Loss 1.2100 (1.2100) Acc@1 71.875 (71.875) Acc@5 91.406 (91.406) [2022-01-19 11:36:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.086 (2.506) Loss 1.2556 (1.2318) Acc@1 69.727 (71.342) Acc@5 90.430 (90.811) [2022-01-19 11:36:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.430 (2.133) Loss 1.1997 (1.2134) Acc@1 71.094 (71.610) Acc@5 90.723 (90.992) [2022-01-19 11:36:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.980 (2.059) Loss 1.2328 (1.2233) Acc@1 72.363 (71.327) Acc@5 91.113 (90.978) [2022-01-19 11:37:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.711 (1.916) Loss 1.1780 (1.2263) Acc@1 73.145 (71.394) Acc@5 91.016 (90.923) [2022-01-19 11:37:14 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.448 Acc@5 90.886 [2022-01-19 11:37:14 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.4% [2022-01-19 11:37:14 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 71.45% [2022-01-19 11:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][0/1251] eta 7:28:12 lr 0.000902 time 21.4969 (21.4969) loss 4.3168 (4.3168) grad_norm 1.0581 (1.0581) 
[2022-01-19 11:37:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][10/1251] eta 1:24:22 lr 0.000902 time 1.4643 (4.0797) loss 3.9479 (3.7114) grad_norm 1.1270 (1.1475) [2022-01-19 11:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][20/1251] eta 1:07:37 lr 0.000902 time 2.1634 (3.2962) loss 3.2931 (3.7876) grad_norm 1.0350 (1.1801) [2022-01-19 11:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][30/1251] eta 0:59:42 lr 0.000902 time 1.5833 (2.9342) loss 4.0747 (3.7888) grad_norm 1.1567 (1.1748) [2022-01-19 11:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][40/1251] eta 0:56:33 lr 0.000902 time 3.8068 (2.8022) loss 3.7527 (3.8191) grad_norm 1.1422 (1.1689) [2022-01-19 11:39:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][50/1251] eta 0:53:30 lr 0.000902 time 1.5305 (2.6729) loss 3.3355 (3.8161) grad_norm 1.1169 (1.1547) [2022-01-19 11:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][60/1251] eta 0:51:38 lr 0.000902 time 3.0045 (2.6018) loss 3.6860 (3.7907) grad_norm 1.1166 (1.1558) [2022-01-19 11:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][70/1251] eta 0:49:58 lr 0.000902 time 1.8815 (2.5392) loss 3.3721 (3.7881) grad_norm 1.0310 (1.1710) [2022-01-19 11:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][80/1251] eta 0:48:51 lr 0.000902 time 2.8116 (2.5037) loss 4.0916 (3.7685) grad_norm 1.0350 (1.1584) [2022-01-19 11:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][90/1251] eta 0:47:50 lr 0.000902 time 2.2147 (2.4721) loss 3.8483 (3.8139) grad_norm 1.0184 (1.1584) [2022-01-19 11:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][100/1251] eta 0:46:43 lr 0.000902 time 2.6454 (2.4361) loss 4.4087 (3.8407) grad_norm 1.2621 (1.1539) [2022-01-19 11:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][110/1251] eta 0:45:44 lr 
0.000902 time 1.8305 (2.4054) loss 3.3077 (3.8164) grad_norm 1.1914 (1.1575) [2022-01-19 11:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][120/1251] eta 0:44:56 lr 0.000902 time 2.8527 (2.3846) loss 4.5684 (3.8411) grad_norm 1.1946 (1.1586) [2022-01-19 11:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][130/1251] eta 0:44:26 lr 0.000902 time 2.4847 (2.3789) loss 4.6280 (3.8508) grad_norm 1.3582 (1.1586) [2022-01-19 11:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][140/1251] eta 0:43:43 lr 0.000902 time 2.2098 (2.3614) loss 4.1378 (3.8593) grad_norm 1.1774 (1.1549) [2022-01-19 11:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][150/1251] eta 0:43:01 lr 0.000902 time 1.7894 (2.3446) loss 4.8196 (3.8544) grad_norm 1.1874 (1.1601) [2022-01-19 11:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][160/1251] eta 0:42:24 lr 0.000902 time 2.2181 (2.3327) loss 4.3971 (3.8361) grad_norm 1.1099 (1.1578) [2022-01-19 11:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][170/1251] eta 0:42:00 lr 0.000902 time 2.1238 (2.3315) loss 4.6087 (3.8404) grad_norm 1.4816 (1.1587) [2022-01-19 11:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][180/1251] eta 0:41:23 lr 0.000902 time 2.1286 (2.3193) loss 4.3086 (3.8418) grad_norm 1.0021 (1.1542) [2022-01-19 11:44:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][190/1251] eta 0:40:56 lr 0.000902 time 2.1877 (2.3153) loss 3.1431 (3.8415) grad_norm 1.0564 (1.1540) [2022-01-19 11:44:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][200/1251] eta 0:40:16 lr 0.000902 time 1.6130 (2.2996) loss 3.6926 (3.8469) grad_norm 1.0861 (1.1535) [2022-01-19 11:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][210/1251] eta 0:39:53 lr 0.000902 time 2.3899 (2.2991) loss 4.3525 (3.8521) grad_norm 1.4738 (1.1572) [2022-01-19 11:45:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][220/1251] eta 0:39:17 lr 0.000902 time 1.4904 (2.2869) loss 4.3910 (3.8421) grad_norm 1.1723 (1.1580) [2022-01-19 11:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][230/1251] eta 0:38:50 lr 0.000902 time 2.1847 (2.2822) loss 3.9984 (3.8624) grad_norm 1.2938 (1.1598) [2022-01-19 11:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][240/1251] eta 0:38:20 lr 0.000902 time 2.2020 (2.2758) loss 4.2060 (3.8638) grad_norm 1.1092 (1.1594) [2022-01-19 11:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][250/1251] eta 0:37:56 lr 0.000902 time 1.5273 (2.2738) loss 3.6726 (3.8556) grad_norm 1.1476 (1.1599) [2022-01-19 11:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][260/1251] eta 0:37:30 lr 0.000902 time 2.6786 (2.2708) loss 3.5105 (3.8472) grad_norm 1.0733 (1.1573) [2022-01-19 11:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][270/1251] eta 0:37:10 lr 0.000902 time 2.0619 (2.2733) loss 3.8449 (3.8515) grad_norm 1.0043 (1.1576) [2022-01-19 11:47:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][280/1251] eta 0:36:48 lr 0.000902 time 2.8714 (2.2742) loss 3.5143 (3.8579) grad_norm 1.0123 (1.1572) [2022-01-19 11:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][290/1251] eta 0:36:25 lr 0.000902 time 1.8758 (2.2741) loss 4.1802 (3.8592) grad_norm 1.7383 (1.1608) [2022-01-19 11:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][300/1251] eta 0:35:56 lr 0.000902 time 2.0899 (2.2676) loss 4.3936 (3.8560) grad_norm 1.0144 (1.1616) [2022-01-19 11:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][310/1251] eta 0:35:27 lr 0.000902 time 1.5919 (2.2604) loss 2.7052 (3.8527) grad_norm 1.0005 (1.1569) [2022-01-19 11:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][320/1251] eta 0:34:55 lr 0.000902 time 
1.7424 (2.2508) loss 4.3471 (3.8607) grad_norm 1.0129 (1.1540) [2022-01-19 11:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][330/1251] eta 0:34:32 lr 0.000902 time 2.8312 (2.2503) loss 3.2400 (3.8568) grad_norm 1.1861 (1.1542) [2022-01-19 11:50:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][340/1251] eta 0:34:08 lr 0.000902 time 2.5267 (2.2483) loss 4.7209 (3.8515) grad_norm 1.0281 (1.1547) [2022-01-19 11:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][350/1251] eta 0:33:41 lr 0.000902 time 1.9692 (2.2432) loss 3.8596 (3.8500) grad_norm 1.2049 (1.1532) [2022-01-19 11:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][360/1251] eta 0:33:17 lr 0.000902 time 2.2375 (2.2413) loss 3.5115 (3.8483) grad_norm 1.1323 (1.1530) [2022-01-19 11:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][370/1251] eta 0:32:53 lr 0.000901 time 2.8927 (2.2406) loss 4.2566 (3.8543) grad_norm 1.2138 (1.1534) [2022-01-19 11:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][380/1251] eta 0:32:30 lr 0.000901 time 2.4242 (2.2391) loss 3.4189 (3.8580) grad_norm 1.1120 (1.1558) [2022-01-19 11:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][390/1251] eta 0:32:05 lr 0.000901 time 2.1155 (2.2364) loss 3.2082 (3.8578) grad_norm 1.1639 (1.1572) [2022-01-19 11:52:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][400/1251] eta 0:31:46 lr 0.000901 time 3.1519 (2.2406) loss 4.2267 (3.8565) grad_norm 1.1200 (1.1574) [2022-01-19 11:52:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][410/1251] eta 0:31:24 lr 0.000901 time 2.3304 (2.2413) loss 3.3803 (3.8548) grad_norm 1.1149 (1.1564) [2022-01-19 11:52:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][420/1251] eta 0:31:00 lr 0.000901 time 2.1504 (2.2392) loss 3.0066 (3.8559) grad_norm 1.1546 (1.1560) [2022-01-19 11:53:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][430/1251] eta 0:30:36 lr 0.000901 time 2.2189 (2.2363) loss 3.6949 (3.8538) grad_norm 0.9655 (1.1540) [2022-01-19 11:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][440/1251] eta 0:30:11 lr 0.000901 time 2.5939 (2.2331) loss 3.3896 (3.8520) grad_norm 1.2931 (1.1540) [2022-01-19 11:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][450/1251] eta 0:29:46 lr 0.000901 time 2.1407 (2.2298) loss 4.1094 (3.8509) grad_norm 0.9886 (1.1528) [2022-01-19 11:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][460/1251] eta 0:29:21 lr 0.000901 time 2.4750 (2.2275) loss 4.1245 (3.8513) grad_norm 1.3216 (1.1533) [2022-01-19 11:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][470/1251] eta 0:29:00 lr 0.000901 time 1.9211 (2.2282) loss 4.4651 (3.8543) grad_norm 1.4047 (1.1525) [2022-01-19 11:55:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][480/1251] eta 0:28:38 lr 0.000901 time 2.6997 (2.2287) loss 2.8982 (3.8508) grad_norm 1.2180 (1.1521) [2022-01-19 11:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][490/1251] eta 0:28:14 lr 0.000901 time 1.7918 (2.2268) loss 3.1665 (3.8542) grad_norm 1.0057 (1.1510) [2022-01-19 11:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][500/1251] eta 0:27:52 lr 0.000901 time 1.8703 (2.2274) loss 2.5196 (3.8557) grad_norm 1.0999 (1.1513) [2022-01-19 11:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][510/1251] eta 0:27:30 lr 0.000901 time 2.1535 (2.2275) loss 4.6815 (3.8587) grad_norm 1.0693 (1.1505) [2022-01-19 11:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][520/1251] eta 0:27:08 lr 0.000901 time 2.1524 (2.2279) loss 4.6109 (3.8637) grad_norm 1.2394 (1.1501) [2022-01-19 11:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][530/1251] eta 0:26:45 lr 0.000901 time 
2.2188 (2.2265) loss 4.3279 (3.8647) grad_norm 1.2241 (1.1514) [2022-01-19 11:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][540/1251] eta 0:26:20 lr 0.000901 time 1.8981 (2.2229) loss 3.6277 (3.8629) grad_norm 1.2134 (1.1539) [2022-01-19 11:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][550/1251] eta 0:25:55 lr 0.000901 time 1.8382 (2.2184) loss 4.5403 (3.8665) grad_norm 1.2789 (1.1541) [2022-01-19 11:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][560/1251] eta 0:25:31 lr 0.000901 time 2.4624 (2.2165) loss 4.6231 (3.8689) grad_norm 1.3088 (1.1549) [2022-01-19 11:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][570/1251] eta 0:25:09 lr 0.000901 time 2.4075 (2.2167) loss 3.6307 (3.8706) grad_norm 1.4447 (1.1550) [2022-01-19 11:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][580/1251] eta 0:24:47 lr 0.000901 time 2.3151 (2.2165) loss 3.8871 (3.8732) grad_norm 1.1297 (1.1545) [2022-01-19 11:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][590/1251] eta 0:24:25 lr 0.000901 time 2.1651 (2.2172) loss 4.2146 (3.8738) grad_norm 1.0907 (1.1539) [2022-01-19 11:59:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][600/1251] eta 0:24:04 lr 0.000901 time 2.7077 (2.2185) loss 3.4698 (3.8728) grad_norm 0.8652 (1.1534) [2022-01-19 11:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][610/1251] eta 0:23:42 lr 0.000901 time 2.0792 (2.2185) loss 4.4968 (3.8727) grad_norm 1.3478 (1.1544) [2022-01-19 12:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][620/1251] eta 0:23:19 lr 0.000901 time 2.1310 (2.2178) loss 3.7754 (3.8712) grad_norm 1.3096 (1.1548) [2022-01-19 12:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][630/1251] eta 0:22:56 lr 0.000901 time 2.6303 (2.2168) loss 4.6114 (3.8739) grad_norm 1.1465 (1.1559) [2022-01-19 12:00:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][640/1251] eta 0:22:34 lr 0.000901 time 2.6685 (2.2160) loss 3.3874 (3.8715) grad_norm 1.0976 (1.1555) [2022-01-19 12:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][650/1251] eta 0:22:11 lr 0.000901 time 2.2631 (2.2148) loss 3.1313 (3.8705) grad_norm 1.0696 (1.1555) [2022-01-19 12:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][660/1251] eta 0:21:48 lr 0.000901 time 2.2771 (2.2143) loss 2.9833 (3.8681) grad_norm 1.0743 (1.1547) [2022-01-19 12:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][670/1251] eta 0:21:25 lr 0.000901 time 1.8954 (2.2127) loss 4.8433 (3.8651) grad_norm 1.2175 (1.1548) [2022-01-19 12:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][680/1251] eta 0:21:04 lr 0.000901 time 3.0213 (2.2144) loss 4.1619 (3.8656) grad_norm 1.0484 (1.1548) [2022-01-19 12:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][690/1251] eta 0:20:42 lr 0.000901 time 2.6727 (2.2153) loss 4.3543 (3.8644) grad_norm 1.0090 (1.1536) [2022-01-19 12:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][700/1251] eta 0:20:20 lr 0.000901 time 2.9862 (2.2154) loss 3.3213 (3.8647) grad_norm 1.0859 (1.1525) [2022-01-19 12:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][710/1251] eta 0:19:57 lr 0.000901 time 2.2372 (2.2141) loss 3.8420 (3.8642) grad_norm 1.0072 (1.1522) [2022-01-19 12:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][720/1251] eta 0:19:35 lr 0.000901 time 1.9871 (2.2129) loss 3.6444 (3.8642) grad_norm 1.2692 (1.1525) [2022-01-19 12:04:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][730/1251] eta 0:19:11 lr 0.000901 time 2.0208 (2.2110) loss 4.2435 (3.8647) grad_norm 1.3119 (1.1528) [2022-01-19 12:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][740/1251] eta 0:18:48 lr 0.000901 time 
1.6053 (2.2090) loss 4.5193 (3.8662) grad_norm 1.1780 (1.1530) [2022-01-19 12:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][750/1251] eta 0:18:26 lr 0.000901 time 1.9046 (2.2084) loss 4.3918 (3.8646) grad_norm 1.1165 (1.1529) [2022-01-19 12:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][760/1251] eta 0:18:04 lr 0.000901 time 2.1272 (2.2095) loss 4.2112 (3.8684) grad_norm 1.2058 (1.1527) [2022-01-19 12:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][770/1251] eta 0:17:42 lr 0.000900 time 2.2362 (2.2081) loss 4.3234 (3.8708) grad_norm 0.9701 (1.1520) [2022-01-19 12:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][780/1251] eta 0:17:18 lr 0.000900 time 2.1995 (2.2059) loss 3.6094 (3.8703) grad_norm 1.2049 (1.1522) [2022-01-19 12:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][790/1251] eta 0:16:56 lr 0.000900 time 1.4793 (2.2050) loss 3.7723 (3.8723) grad_norm 1.1671 (1.1532) [2022-01-19 12:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][800/1251] eta 0:16:35 lr 0.000900 time 1.8543 (2.2073) loss 4.1145 (3.8725) grad_norm 1.0426 (1.1529) [2022-01-19 12:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][810/1251] eta 0:16:13 lr 0.000900 time 2.7801 (2.2082) loss 4.1003 (3.8735) grad_norm 0.8765 (1.1522) [2022-01-19 12:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][820/1251] eta 0:15:51 lr 0.000900 time 1.8269 (2.2078) loss 3.4004 (3.8700) grad_norm 1.6078 (1.1523) [2022-01-19 12:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][830/1251] eta 0:15:29 lr 0.000900 time 1.8808 (2.2067) loss 4.2470 (3.8714) grad_norm 1.0250 (1.1531) [2022-01-19 12:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][840/1251] eta 0:15:06 lr 0.000900 time 1.8930 (2.2058) loss 3.7911 (3.8714) grad_norm 1.1116 (1.1536) [2022-01-19 12:08:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][850/1251] eta 0:14:44 lr 0.000900 time 2.7639 (2.2052) loss 2.8022 (3.8720) grad_norm 1.1588 (1.1537) [2022-01-19 12:08:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][860/1251] eta 0:14:22 lr 0.000900 time 2.2660 (2.2066) loss 3.9798 (3.8764) grad_norm 1.4247 (1.1550) [2022-01-19 12:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][870/1251] eta 0:14:00 lr 0.000900 time 1.9468 (2.2050) loss 3.5420 (3.8781) grad_norm 1.0109 (1.1545) [2022-01-19 12:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][880/1251] eta 0:13:38 lr 0.000900 time 2.1569 (2.2054) loss 4.5738 (3.8787) grad_norm 1.1338 (1.1553) [2022-01-19 12:10:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][890/1251] eta 0:13:16 lr 0.000900 time 2.6255 (2.2061) loss 4.3348 (3.8801) grad_norm 1.0931 (1.1558) [2022-01-19 12:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][900/1251] eta 0:12:54 lr 0.000900 time 2.1027 (2.2066) loss 4.3210 (3.8816) grad_norm 1.0232 (1.1551) [2022-01-19 12:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][910/1251] eta 0:12:31 lr 0.000900 time 1.6391 (2.2041) loss 2.7728 (3.8795) grad_norm 1.3606 (1.1549) [2022-01-19 12:11:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][920/1251] eta 0:12:09 lr 0.000900 time 1.8329 (2.2030) loss 4.8401 (3.8834) grad_norm 1.1130 (1.1540) [2022-01-19 12:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][930/1251] eta 0:11:46 lr 0.000900 time 1.5508 (2.2018) loss 3.2139 (3.8828) grad_norm 1.0747 (1.1536) [2022-01-19 12:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][940/1251] eta 0:11:24 lr 0.000900 time 2.4301 (2.2017) loss 2.8270 (3.8837) grad_norm 1.1814 (1.1544) [2022-01-19 12:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][950/1251] eta 0:11:02 lr 0.000900 time 
1.8969 (2.2011) loss 3.2745 (3.8832) grad_norm 1.3303 (1.1543) [2022-01-19 12:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][960/1251] eta 0:10:40 lr 0.000900 time 2.2214 (2.2008) loss 3.2717 (3.8847) grad_norm 0.9756 (1.1540) [2022-01-19 12:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][970/1251] eta 0:10:18 lr 0.000900 time 1.8588 (2.2009) loss 3.6979 (3.8848) grad_norm 1.3730 (1.1539) [2022-01-19 12:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][980/1251] eta 0:09:56 lr 0.000900 time 1.5476 (2.2002) loss 4.3267 (3.8838) grad_norm 1.0793 (1.1535) [2022-01-19 12:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][990/1251] eta 0:09:33 lr 0.000900 time 1.9360 (2.1981) loss 4.2302 (3.8843) grad_norm 1.3090 (1.1534) [2022-01-19 12:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1000/1251] eta 0:09:11 lr 0.000900 time 1.8573 (2.1977) loss 4.1656 (3.8828) grad_norm 1.1357 (1.1531) [2022-01-19 12:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1010/1251] eta 0:08:49 lr 0.000900 time 1.8667 (2.1979) loss 3.8352 (3.8835) grad_norm 1.1505 (1.1528) [2022-01-19 12:14:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1020/1251] eta 0:08:27 lr 0.000900 time 1.9135 (2.1978) loss 3.7376 (3.8849) grad_norm 1.2283 (1.1528) [2022-01-19 12:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1030/1251] eta 0:08:06 lr 0.000900 time 2.1938 (2.1995) loss 4.2872 (3.8851) grad_norm 1.2436 (1.1532) [2022-01-19 12:15:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1040/1251] eta 0:07:44 lr 0.000900 time 2.1768 (2.1993) loss 4.6057 (3.8871) grad_norm 1.2078 (1.1536) [2022-01-19 12:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1050/1251] eta 0:07:22 lr 0.000900 time 1.5376 (2.1998) loss 2.9335 (3.8878) grad_norm 1.1042 (1.1527) [2022-01-19 12:16:09 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1060/1251] eta 0:07:00 lr 0.000900 time 2.4166 (2.2010) loss 3.1967 (3.8865) grad_norm 1.1648 (1.1526) [2022-01-19 12:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1070/1251] eta 0:06:38 lr 0.000900 time 2.2000 (2.2012) loss 4.5560 (3.8866) grad_norm 1.0905 (1.1522) [2022-01-19 12:16:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1080/1251] eta 0:06:16 lr 0.000900 time 1.9517 (2.2006) loss 3.9163 (3.8879) grad_norm 1.2676 (1.1520) [2022-01-19 12:17:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1090/1251] eta 0:05:54 lr 0.000900 time 1.8695 (2.1991) loss 4.1180 (3.8891) grad_norm 1.2169 (1.1514) [2022-01-19 12:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1100/1251] eta 0:05:31 lr 0.000900 time 1.9334 (2.1974) loss 3.4717 (3.8902) grad_norm 1.1943 (1.1511) [2022-01-19 12:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1110/1251] eta 0:05:09 lr 0.000900 time 2.1713 (2.1956) loss 4.5161 (3.8917) grad_norm 1.1079 (1.1507) [2022-01-19 12:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1120/1251] eta 0:04:47 lr 0.000900 time 1.9781 (2.1947) loss 3.7307 (3.8912) grad_norm 1.2105 (1.1505) [2022-01-19 12:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1130/1251] eta 0:04:25 lr 0.000900 time 2.3007 (2.1955) loss 3.7517 (3.8871) grad_norm 1.0006 (1.1500) [2022-01-19 12:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1140/1251] eta 0:04:03 lr 0.000900 time 2.1418 (2.1963) loss 4.3049 (3.8883) grad_norm 1.0768 (1.1493) [2022-01-19 12:19:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1150/1251] eta 0:03:41 lr 0.000900 time 2.3381 (2.1968) loss 3.8811 (3.8879) grad_norm 1.2857 (1.1487) [2022-01-19 12:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1160/1251] eta 0:03:20 lr 
0.000900 time 3.1583 (2.1981) loss 4.0931 (3.8895) grad_norm 1.1910 (1.1487) [2022-01-19 12:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1170/1251] eta 0:02:58 lr 0.000899 time 2.1606 (2.1991) loss 2.2840 (3.8895) grad_norm 1.0022 (1.1484) [2022-01-19 12:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1180/1251] eta 0:02:36 lr 0.000899 time 2.3756 (2.2002) loss 4.2731 (3.8893) grad_norm 1.0146 (1.1484) [2022-01-19 12:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1190/1251] eta 0:02:14 lr 0.000899 time 2.3165 (2.2002) loss 2.7472 (3.8887) grad_norm 0.9484 (1.1486) [2022-01-19 12:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1200/1251] eta 0:01:52 lr 0.000899 time 2.3145 (2.1992) loss 2.6592 (3.8881) grad_norm 1.3177 (1.1482) [2022-01-19 12:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1210/1251] eta 0:01:30 lr 0.000899 time 3.1010 (2.1990) loss 3.1522 (3.8871) grad_norm 1.0486 (1.1478) [2022-01-19 12:21:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1220/1251] eta 0:01:08 lr 0.000899 time 2.1909 (2.1978) loss 3.1170 (3.8859) grad_norm 1.0822 (1.1477) [2022-01-19 12:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1230/1251] eta 0:00:46 lr 0.000899 time 2.4189 (2.1988) loss 3.9217 (3.8848) grad_norm 1.1880 (1.1477) [2022-01-19 12:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1240/1251] eta 0:00:24 lr 0.000899 time 1.5734 (2.1972) loss 4.0803 (3.8845) grad_norm 1.0685 (1.1477) [2022-01-19 12:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [61/300][1250/1251] eta 0:00:02 lr 0.000899 time 1.3234 (2.1922) loss 3.6387 (3.8830) grad_norm 1.2125 (1.1476) [2022-01-19 12:22:57 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 61 training takes 0:45:42 [2022-01-19 12:23:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.902 (18.902) Loss 
1.2485 (1.2485) Acc@1 71.289 (71.289) Acc@5 90.625 (90.625) [2022-01-19 12:23:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.528 (3.343) Loss 1.2065 (1.1980) Acc@1 72.070 (71.795) Acc@5 91.406 (91.335) [2022-01-19 12:23:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.622 (2.598) Loss 1.1573 (1.2140) Acc@1 72.266 (71.666) Acc@5 91.504 (91.095) [2022-01-19 12:24:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.963 (2.306) Loss 1.2067 (1.2239) Acc@1 72.852 (71.506) Acc@5 90.918 (90.984) [2022-01-19 12:24:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.772 (2.216) Loss 1.1629 (1.2304) Acc@1 73.242 (71.470) Acc@5 92.090 (90.873) [2022-01-19 12:24:35 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.460 Acc@5 90.812 [2022-01-19 12:24:35 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.5% [2022-01-19 12:24:35 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 71.46% [2022-01-19 12:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][0/1251] eta 7:23:59 lr 0.000899 time 21.2946 (21.2946) loss 3.9585 (3.9585) grad_norm 1.1042 (1.1042) [2022-01-19 12:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][10/1251] eta 1:22:30 lr 0.000899 time 2.1014 (3.9892) loss 2.5480 (3.8261) grad_norm 1.3124 (1.1711) [2022-01-19 12:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][20/1251] eta 1:04:39 lr 0.000899 time 2.6030 (3.1519) loss 2.7906 (3.7657) grad_norm 1.2702 (1.1839) [2022-01-19 12:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][30/1251] eta 0:57:27 lr 0.000899 time 1.8638 (2.8232) loss 2.5354 (3.8479) grad_norm 1.0976 (1.1653) [2022-01-19 12:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][40/1251] eta 0:54:31 lr 0.000899 time 3.5922 (2.7016) loss 4.2444 (3.8190) grad_norm 0.9427 (1.1490) 
[2022-01-19 12:26:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][50/1251] eta 0:52:13 lr 0.000899 time 2.0960 (2.6089) loss 3.4325 (3.8502) grad_norm 1.0509 (1.1430) [2022-01-19 12:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][60/1251] eta 0:50:58 lr 0.000899 time 2.4356 (2.5684) loss 4.2084 (3.8617) grad_norm 1.0091 (1.1363) [2022-01-19 12:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][70/1251] eta 0:49:28 lr 0.000899 time 1.5019 (2.5140) loss 4.0489 (3.8614) grad_norm 1.2562 (1.1345) [2022-01-19 12:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][80/1251] eta 0:48:31 lr 0.000899 time 3.1225 (2.4862) loss 2.8999 (3.8257) grad_norm 1.3222 (1.1311) [2022-01-19 12:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][90/1251] eta 0:47:47 lr 0.000899 time 2.5262 (2.4700) loss 3.9580 (3.8127) grad_norm 1.2461 (1.1380) [2022-01-19 12:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][100/1251] eta 0:46:32 lr 0.000899 time 2.2427 (2.4261) loss 4.3347 (3.7984) grad_norm 1.0563 (1.1358) [2022-01-19 12:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][110/1251] eta 0:45:20 lr 0.000899 time 1.9284 (2.3841) loss 4.0549 (3.8012) grad_norm 1.1008 (1.1333) [2022-01-19 12:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][120/1251] eta 0:44:27 lr 0.000899 time 2.6818 (2.3590) loss 3.6318 (3.8062) grad_norm 1.5244 (1.1315) [2022-01-19 12:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][130/1251] eta 0:43:40 lr 0.000899 time 1.9238 (2.3376) loss 3.5929 (3.7856) grad_norm 1.2473 (1.1327) [2022-01-19 12:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][140/1251] eta 0:43:08 lr 0.000899 time 1.6664 (2.3297) loss 4.3693 (3.7822) grad_norm 1.1188 (1.1366) [2022-01-19 12:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][150/1251] eta 0:42:48 lr 
0.000899 time 1.8839 (2.3325) loss 4.4429 (3.7799) grad_norm 1.0993 (1.1372) [2022-01-19 12:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][160/1251] eta 0:42:28 lr 0.000899 time 2.9830 (2.3356) loss 4.7655 (3.7755) grad_norm 1.2317 (1.1414) [2022-01-19 12:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][170/1251] eta 0:42:05 lr 0.000899 time 2.3959 (2.3366) loss 2.9240 (3.7616) grad_norm 1.0696 (1.1390) [2022-01-19 12:31:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][180/1251] eta 0:41:37 lr 0.000899 time 1.8439 (2.3324) loss 4.5535 (3.7754) grad_norm 1.1184 (1.1395) [2022-01-19 12:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][190/1251] eta 0:41:00 lr 0.000899 time 1.9306 (2.3193) loss 4.1461 (3.7670) grad_norm 1.0403 (1.1439) [2022-01-19 12:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][200/1251] eta 0:40:18 lr 0.000899 time 2.2476 (2.3009) loss 4.1741 (3.7783) grad_norm 1.0427 (1.1439) [2022-01-19 12:32:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][210/1251] eta 0:39:37 lr 0.000899 time 2.2641 (2.2839) loss 4.3164 (3.7968) grad_norm 1.0451 (1.1445) [2022-01-19 12:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][220/1251] eta 0:39:05 lr 0.000899 time 2.1574 (2.2745) loss 3.2424 (3.7937) grad_norm 1.1269 (1.1410) [2022-01-19 12:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][230/1251] eta 0:38:35 lr 0.000899 time 2.1752 (2.2677) loss 4.7997 (3.7996) grad_norm 1.1914 (1.1403) [2022-01-19 12:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][240/1251] eta 0:38:15 lr 0.000899 time 2.5214 (2.2700) loss 3.4888 (3.8059) grad_norm 1.1803 (1.1427) [2022-01-19 12:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][250/1251] eta 0:37:54 lr 0.000899 time 2.4353 (2.2718) loss 3.7629 (3.8015) grad_norm 0.9776 (1.1406) [2022-01-19 12:34:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][260/1251] eta 0:37:30 lr 0.000899 time 2.2779 (2.2712) loss 4.0838 (3.8010) grad_norm 1.1573 (1.1383) [2022-01-19 12:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][270/1251] eta 0:37:09 lr 0.000899 time 1.6495 (2.2722) loss 2.9850 (3.7915) grad_norm 1.0245 (1.1385) [2022-01-19 12:35:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][280/1251] eta 0:36:44 lr 0.000899 time 1.8319 (2.2701) loss 3.6092 (3.7947) grad_norm 1.1473 (1.1357) [2022-01-19 12:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][290/1251] eta 0:36:19 lr 0.000899 time 1.9014 (2.2675) loss 4.4413 (3.8006) grad_norm 1.1227 (1.1333) [2022-01-19 12:35:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][300/1251] eta 0:35:54 lr 0.000899 time 2.2764 (2.2659) loss 4.0957 (3.8051) grad_norm 1.1434 (1.1324) [2022-01-19 12:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][310/1251] eta 0:35:31 lr 0.000899 time 2.5694 (2.2655) loss 2.7337 (3.8088) grad_norm 1.1346 (1.1322) [2022-01-19 12:36:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][320/1251] eta 0:35:03 lr 0.000898 time 1.6096 (2.2598) loss 3.3326 (3.8038) grad_norm 1.1574 (1.1330) [2022-01-19 12:37:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][330/1251] eta 0:34:40 lr 0.000898 time 1.6552 (2.2585) loss 3.0819 (3.8046) grad_norm 1.2084 (1.1339) [2022-01-19 12:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][340/1251] eta 0:34:10 lr 0.000898 time 1.6066 (2.2508) loss 3.8385 (3.8037) grad_norm 1.0073 (1.1338) [2022-01-19 12:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][350/1251] eta 0:33:46 lr 0.000898 time 1.8539 (2.2490) loss 4.7512 (3.8045) grad_norm 1.4304 (1.1382) [2022-01-19 12:38:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][360/1251] eta 0:33:21 lr 0.000898 time 
2.2146 (2.2469) loss 4.2486 (3.8108) grad_norm 1.3054 (1.1389) [2022-01-19 12:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][370/1251] eta 0:32:58 lr 0.000898 time 2.2081 (2.2456) loss 3.9637 (3.8183) grad_norm 1.3640 (1.1388) [2022-01-19 12:38:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][380/1251] eta 0:32:32 lr 0.000898 time 1.8567 (2.2415) loss 3.9871 (3.8222) grad_norm 1.1990 (1.1413) [2022-01-19 12:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][390/1251] eta 0:32:12 lr 0.000898 time 2.2248 (2.2441) loss 3.5625 (3.8220) grad_norm 1.0114 (1.1428) [2022-01-19 12:39:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][400/1251] eta 0:31:49 lr 0.000898 time 1.5453 (2.2435) loss 4.2872 (3.8236) grad_norm 1.0488 (1.1417) [2022-01-19 12:39:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][410/1251] eta 0:31:25 lr 0.000898 time 1.8954 (2.2414) loss 4.3324 (3.8242) grad_norm 1.4664 (1.1431) [2022-01-19 12:40:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][420/1251] eta 0:30:59 lr 0.000898 time 1.9830 (2.2380) loss 4.6277 (3.8276) grad_norm 1.0229 (1.1443) [2022-01-19 12:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][430/1251] eta 0:30:37 lr 0.000898 time 2.0450 (2.2386) loss 4.1789 (3.8252) grad_norm 1.1751 (1.1424) [2022-01-19 12:41:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][440/1251] eta 0:30:15 lr 0.000898 time 2.9664 (2.2384) loss 3.0174 (3.8202) grad_norm 1.1634 (1.1416) [2022-01-19 12:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][450/1251] eta 0:29:49 lr 0.000898 time 2.0271 (2.2335) loss 2.9081 (3.8176) grad_norm 1.0511 (1.1424) [2022-01-19 12:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][460/1251] eta 0:29:23 lr 0.000898 time 2.1561 (2.2296) loss 3.5768 (3.8178) grad_norm 1.1157 (1.1424) [2022-01-19 12:42:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][470/1251] eta 0:29:03 lr 0.000898 time 2.7830 (2.2328) loss 4.4912 (3.8212) grad_norm 0.9695 (1.1422) [2022-01-19 12:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][480/1251] eta 0:28:45 lr 0.000898 time 2.8263 (2.2377) loss 3.8965 (3.8224) grad_norm 1.1440 (1.1437) [2022-01-19 12:42:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][490/1251] eta 0:28:21 lr 0.000898 time 1.8224 (2.2355) loss 4.2901 (3.8247) grad_norm 1.2358 (1.1429) [2022-01-19 12:43:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][500/1251] eta 0:27:58 lr 0.000898 time 1.7175 (2.2356) loss 3.8495 (3.8293) grad_norm 1.1304 (1.1425) [2022-01-19 12:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][510/1251] eta 0:27:35 lr 0.000898 time 2.2577 (2.2336) loss 4.3242 (3.8281) grad_norm 0.9567 (1.1427) [2022-01-19 12:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][520/1251] eta 0:27:08 lr 0.000898 time 1.9385 (2.2284) loss 4.3168 (3.8282) grad_norm 1.0935 (1.1438) [2022-01-19 12:44:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][530/1251] eta 0:26:43 lr 0.000898 time 1.5377 (2.2235) loss 4.1940 (3.8290) grad_norm 1.1847 (1.1451) [2022-01-19 12:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][540/1251] eta 0:26:19 lr 0.000898 time 1.7460 (2.2212) loss 3.5675 (3.8333) grad_norm 1.1301 (1.1467) [2022-01-19 12:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][550/1251] eta 0:25:54 lr 0.000898 time 1.4309 (2.2171) loss 4.2359 (3.8338) grad_norm 1.1007 (1.1459) [2022-01-19 12:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][560/1251] eta 0:25:33 lr 0.000898 time 2.7068 (2.2199) loss 4.3401 (3.8342) grad_norm 1.1469 (1.1460) [2022-01-19 12:45:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][570/1251] eta 0:25:12 lr 0.000898 time 
2.5711 (2.2207) loss 4.3958 (3.8365) grad_norm 1.4250 (1.1469) [2022-01-19 12:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][580/1251] eta 0:24:50 lr 0.000898 time 1.8973 (2.2215) loss 3.2817 (3.8382) grad_norm 1.0992 (1.1460) [2022-01-19 12:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][590/1251] eta 0:24:28 lr 0.000898 time 1.4544 (2.2214) loss 4.2775 (3.8390) grad_norm 1.5843 (1.1465) [2022-01-19 12:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][600/1251] eta 0:24:06 lr 0.000898 time 2.2728 (2.2219) loss 3.5942 (3.8417) grad_norm 1.0003 (1.1472) [2022-01-19 12:47:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][610/1251] eta 0:23:43 lr 0.000898 time 2.7591 (2.2213) loss 4.2771 (3.8467) grad_norm 1.1156 (1.1487) [2022-01-19 12:47:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][620/1251] eta 0:23:19 lr 0.000898 time 1.5538 (2.2183) loss 2.5622 (3.8479) grad_norm 1.2330 (1.1488) [2022-01-19 12:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][630/1251] eta 0:22:57 lr 0.000898 time 2.2261 (2.2180) loss 4.0993 (3.8518) grad_norm 1.2458 (1.1487) [2022-01-19 12:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][640/1251] eta 0:22:35 lr 0.000898 time 2.4803 (2.2177) loss 3.2702 (3.8470) grad_norm 1.1590 (1.1481) [2022-01-19 12:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][650/1251] eta 0:22:13 lr 0.000898 time 2.3563 (2.2187) loss 3.9938 (3.8463) grad_norm 1.0167 (1.1474) [2022-01-19 12:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][660/1251] eta 0:21:50 lr 0.000898 time 1.9921 (2.2182) loss 3.8643 (3.8506) grad_norm 1.0067 (1.1480) [2022-01-19 12:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][670/1251] eta 0:21:28 lr 0.000898 time 1.6892 (2.2177) loss 2.9792 (3.8506) grad_norm 1.2261 (1.1485) [2022-01-19 12:49:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][680/1251] eta 0:21:05 lr 0.000898 time 1.9403 (2.2169) loss 4.7380 (3.8533) grad_norm 1.5500 (1.1496) [2022-01-19 12:50:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][690/1251] eta 0:20:43 lr 0.000898 time 2.2429 (2.2159) loss 4.3644 (3.8483) grad_norm 1.2219 (1.1500) [2022-01-19 12:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][700/1251] eta 0:20:19 lr 0.000898 time 1.9373 (2.2130) loss 4.0651 (3.8471) grad_norm 1.2402 (1.1498) [2022-01-19 12:50:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][710/1251] eta 0:19:56 lr 0.000897 time 2.1906 (2.2120) loss 3.4589 (3.8455) grad_norm 1.2121 (1.1503) [2022-01-19 12:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][720/1251] eta 0:19:34 lr 0.000897 time 2.1435 (2.2127) loss 4.2507 (3.8451) grad_norm 1.3445 (1.1517) [2022-01-19 12:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][730/1251] eta 0:19:12 lr 0.000897 time 2.2545 (2.2124) loss 3.7973 (3.8441) grad_norm 0.9898 (1.1520) [2022-01-19 12:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][740/1251] eta 0:18:50 lr 0.000897 time 2.0905 (2.2130) loss 3.6789 (3.8483) grad_norm 1.1192 (1.1522) [2022-01-19 12:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][750/1251] eta 0:18:28 lr 0.000897 time 1.9117 (2.2132) loss 4.6891 (3.8493) grad_norm 1.2506 (1.1535) [2022-01-19 12:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][760/1251] eta 0:18:06 lr 0.000897 time 2.2807 (2.2124) loss 4.3386 (3.8518) grad_norm 1.0524 (1.1537) [2022-01-19 12:53:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][770/1251] eta 0:17:43 lr 0.000897 time 1.8570 (2.2104) loss 4.2246 (3.8527) grad_norm 1.1694 (1.1532) [2022-01-19 12:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][780/1251] eta 0:17:20 lr 0.000897 time 
2.2258 (2.2095) loss 4.1045 (3.8500) grad_norm 1.0536 (1.1527) [2022-01-19 12:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][790/1251] eta 0:17:00 lr 0.000897 time 2.5429 (2.2126) loss 3.1117 (3.8485) grad_norm 1.0604 (1.1527) [2022-01-19 12:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][800/1251] eta 0:16:38 lr 0.000897 time 3.4353 (2.2146) loss 2.7340 (3.8490) grad_norm 1.0632 (1.1520) [2022-01-19 12:54:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][810/1251] eta 0:16:16 lr 0.000897 time 1.9151 (2.2133) loss 4.3140 (3.8522) grad_norm 1.0306 (1.1513) [2022-01-19 12:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][820/1251] eta 0:15:53 lr 0.000897 time 1.8531 (2.2116) loss 4.0629 (3.8539) grad_norm 1.1483 (1.1521) [2022-01-19 12:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][830/1251] eta 0:15:30 lr 0.000897 time 2.1231 (2.2101) loss 3.1712 (3.8511) grad_norm 1.1071 (1.1535) [2022-01-19 12:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][840/1251] eta 0:15:08 lr 0.000897 time 3.3980 (2.2106) loss 3.4290 (3.8493) grad_norm 1.1758 (1.1532) [2022-01-19 12:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][850/1251] eta 0:14:46 lr 0.000897 time 2.1719 (2.2097) loss 3.9913 (3.8511) grad_norm 0.9752 (1.1529) [2022-01-19 12:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][860/1251] eta 0:14:23 lr 0.000897 time 1.9139 (2.2085) loss 4.0979 (3.8503) grad_norm 1.1435 (1.1536) [2022-01-19 12:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][870/1251] eta 0:14:02 lr 0.000897 time 2.2691 (2.2110) loss 3.2229 (3.8433) grad_norm 1.4547 (1.1536) [2022-01-19 12:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][880/1251] eta 0:13:40 lr 0.000897 time 3.7336 (2.2110) loss 4.1448 (3.8439) grad_norm 1.2772 (1.1541) [2022-01-19 12:57:25 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][890/1251] eta 0:13:18 lr 0.000897 time 1.7338 (2.2105) loss 4.6572 (3.8455) grad_norm 1.1494 (1.1538) [2022-01-19 12:57:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][900/1251] eta 0:12:55 lr 0.000897 time 2.2392 (2.2108) loss 3.3117 (3.8443) grad_norm 1.0026 (1.1535) [2022-01-19 12:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][910/1251] eta 0:12:33 lr 0.000897 time 2.1553 (2.2098) loss 3.8490 (3.8446) grad_norm 1.0794 (1.1531) [2022-01-19 12:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][920/1251] eta 0:12:11 lr 0.000897 time 2.8204 (2.2102) loss 4.3945 (3.8450) grad_norm 1.1683 (1.1531) [2022-01-19 12:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][930/1251] eta 0:11:48 lr 0.000897 time 1.8801 (2.2081) loss 3.6610 (3.8438) grad_norm 1.0326 (1.1533) [2022-01-19 12:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][940/1251] eta 0:11:27 lr 0.000897 time 2.8812 (2.2094) loss 3.8233 (3.8431) grad_norm 1.2549 (1.1536) [2022-01-19 12:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][950/1251] eta 0:11:04 lr 0.000897 time 1.5858 (2.2084) loss 4.0701 (3.8442) grad_norm 0.9268 (1.1531) [2022-01-19 12:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][960/1251] eta 0:10:42 lr 0.000897 time 2.6126 (2.2086) loss 3.3745 (3.8424) grad_norm 1.0229 (1.1525) [2022-01-19 13:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][970/1251] eta 0:10:20 lr 0.000897 time 1.8750 (2.2092) loss 4.0903 (3.8451) grad_norm 1.2498 (1.1528) [2022-01-19 13:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][980/1251] eta 0:09:58 lr 0.000897 time 2.2547 (2.2100) loss 3.8962 (3.8449) grad_norm 1.3066 (1.1532) [2022-01-19 13:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][990/1251] eta 0:09:36 lr 0.000897 time 
1.8726 (2.2092) loss 4.1094 (3.8463) grad_norm 1.3664 (1.1537) [2022-01-19 13:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1000/1251] eta 0:09:14 lr 0.000897 time 2.2298 (2.2081) loss 3.7618 (3.8451) grad_norm 1.1403 (1.1538) [2022-01-19 13:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1010/1251] eta 0:08:51 lr 0.000897 time 2.0257 (2.2061) loss 4.3675 (3.8458) grad_norm 1.0150 (1.1533) [2022-01-19 13:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1020/1251] eta 0:08:29 lr 0.000897 time 2.1541 (2.2065) loss 4.2019 (3.8492) grad_norm 1.1016 (1.1532) [2022-01-19 13:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1030/1251] eta 0:08:07 lr 0.000897 time 1.5270 (2.2059) loss 4.6778 (3.8521) grad_norm 1.3454 (1.1526) [2022-01-19 13:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1040/1251] eta 0:07:45 lr 0.000897 time 1.9580 (2.2069) loss 4.7252 (3.8515) grad_norm 1.4981 (1.1525) [2022-01-19 13:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1050/1251] eta 0:07:23 lr 0.000897 time 2.0292 (2.2061) loss 4.3434 (3.8500) grad_norm 1.0866 (1.1531) [2022-01-19 13:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1060/1251] eta 0:07:01 lr 0.000897 time 2.2907 (2.2048) loss 2.5585 (3.8499) grad_norm 1.2245 (1.1527) [2022-01-19 13:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1070/1251] eta 0:06:38 lr 0.000897 time 1.5944 (2.2041) loss 4.2109 (3.8487) grad_norm 1.2160 (1.1526) [2022-01-19 13:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1080/1251] eta 0:06:17 lr 0.000897 time 1.8555 (2.2048) loss 3.7674 (3.8492) grad_norm 1.4323 (1.1529) [2022-01-19 13:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1090/1251] eta 0:05:54 lr 0.000897 time 2.5045 (2.2043) loss 4.2983 (3.8504) grad_norm 1.1403 (1.1524) [2022-01-19 13:05:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1100/1251] eta 0:05:32 lr 0.000897 time 2.4312 (2.2043) loss 3.9946 (3.8517) grad_norm 1.2098 (1.1521) [2022-01-19 13:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1110/1251] eta 0:05:10 lr 0.000896 time 2.5110 (2.2048) loss 3.3578 (3.8525) grad_norm 1.1838 (1.1518) [2022-01-19 13:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1120/1251] eta 0:04:48 lr 0.000896 time 2.1665 (2.2053) loss 3.8533 (3.8527) grad_norm 1.2665 (1.1522) [2022-01-19 13:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1130/1251] eta 0:04:26 lr 0.000896 time 1.9414 (2.2041) loss 4.2620 (3.8539) grad_norm 0.9945 (1.1527) [2022-01-19 13:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1140/1251] eta 0:04:04 lr 0.000896 time 1.7508 (2.2024) loss 4.2114 (3.8549) grad_norm 1.0130 (1.1521) [2022-01-19 13:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1150/1251] eta 0:03:42 lr 0.000896 time 1.8549 (2.2025) loss 4.4813 (3.8543) grad_norm 1.0324 (1.1522) [2022-01-19 13:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1160/1251] eta 0:03:20 lr 0.000896 time 2.3739 (2.2034) loss 2.7856 (3.8539) grad_norm 1.2321 (1.1527) [2022-01-19 13:07:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1170/1251] eta 0:02:58 lr 0.000896 time 1.6241 (2.2030) loss 4.0293 (3.8553) grad_norm 1.1661 (1.1520) [2022-01-19 13:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1180/1251] eta 0:02:36 lr 0.000896 time 1.9050 (2.2029) loss 2.5250 (3.8553) grad_norm 1.0679 (1.1522) [2022-01-19 13:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1190/1251] eta 0:02:14 lr 0.000896 time 2.3683 (2.2028) loss 3.3517 (3.8575) grad_norm 1.0832 (1.1520) [2022-01-19 13:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1200/1251] eta 0:01:52 lr 
0.000896 time 2.1141 (2.2023) loss 4.3855 (3.8573) grad_norm 1.1860 (1.1516) [2022-01-19 13:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1210/1251] eta 0:01:30 lr 0.000896 time 2.1026 (2.2003) loss 4.3420 (3.8577) grad_norm 0.9656 (1.1516) [2022-01-19 13:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1220/1251] eta 0:01:08 lr 0.000896 time 1.9844 (2.1993) loss 4.1561 (3.8566) grad_norm 1.0296 (1.1511) [2022-01-19 13:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1230/1251] eta 0:00:46 lr 0.000896 time 2.2972 (2.1988) loss 3.4477 (3.8543) grad_norm 0.9741 (1.1506) [2022-01-19 13:10:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1240/1251] eta 0:00:24 lr 0.000896 time 2.1072 (2.1986) loss 4.7680 (3.8523) grad_norm 1.0372 (1.1501) [2022-01-19 13:10:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [62/300][1250/1251] eta 0:00:02 lr 0.000896 time 1.1710 (2.1935) loss 4.1358 (3.8538) grad_norm 1.0942 (1.1497) [2022-01-19 13:10:20 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 62 training takes 0:45:44 [2022-01-19 13:10:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.356 (18.356) Loss 1.2660 (1.2660) Acc@1 70.020 (70.020) Acc@5 89.941 (89.941) [2022-01-19 13:10:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.593 (3.557) Loss 1.1631 (1.2006) Acc@1 72.754 (72.106) Acc@5 92.090 (90.661) [2022-01-19 13:11:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.961 (2.759) Loss 1.2729 (1.2270) Acc@1 70.312 (71.647) Acc@5 89.453 (90.402) [2022-01-19 13:11:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.013 (2.442) Loss 1.1114 (1.2162) Acc@1 73.926 (71.755) Acc@5 91.016 (90.612) [2022-01-19 13:11:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.549 (2.209) Loss 1.2330 (1.2164) Acc@1 70.605 (71.668) Acc@5 90.234 (90.663) [2022-01-19 13:11:57 
swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.532 Acc@5 90.762 [2022-01-19 13:11:57 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.5% [2022-01-19 13:11:57 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 71.53% [2022-01-19 13:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][0/1251] eta 7:23:01 lr 0.000896 time 21.2479 (21.2479) loss 4.0373 (4.0373) grad_norm 1.1460 (1.1460) [2022-01-19 13:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][10/1251] eta 1:24:31 lr 0.000896 time 1.8577 (4.0864) loss 2.6073 (3.6368) grad_norm 1.0581 (1.1278) [2022-01-19 13:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][20/1251] eta 1:05:21 lr 0.000896 time 2.1888 (3.1860) loss 3.1629 (3.7300) grad_norm 0.9793 (1.1316) [2022-01-19 13:13:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][30/1251] eta 0:56:39 lr 0.000896 time 1.5866 (2.7839) loss 3.1166 (3.7923) grad_norm 1.2634 (1.1433) [2022-01-19 13:13:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][40/1251] eta 0:54:22 lr 0.000896 time 3.8180 (2.6937) loss 3.2533 (3.8078) grad_norm 1.0559 (1.1456) [2022-01-19 13:14:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][50/1251] eta 0:51:45 lr 0.000896 time 2.0535 (2.5856) loss 3.6410 (3.8179) grad_norm 1.1287 (1.1562) [2022-01-19 13:14:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][60/1251] eta 0:49:52 lr 0.000896 time 2.1995 (2.5128) loss 3.0894 (3.8340) grad_norm 0.9979 (1.1499) [2022-01-19 13:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][70/1251] eta 0:48:39 lr 0.000896 time 1.6849 (2.4720) loss 3.6498 (3.8181) grad_norm 1.3498 (1.1694) [2022-01-19 13:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][80/1251] eta 0:48:06 lr 0.000896 time 4.1427 (2.4646) loss 2.9454 (3.7792) grad_norm 1.2091 (1.1739) 
[2022-01-19 13:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][90/1251] eta 0:47:32 lr 0.000896 time 2.0727 (2.4568) loss 3.9386 (3.8068) grad_norm 0.9980 (1.1693) [2022-01-19 13:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][100/1251] eta 0:46:32 lr 0.000896 time 2.1142 (2.4266) loss 4.2357 (3.8234) grad_norm 1.1889 (1.1689) [2022-01-19 13:16:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][110/1251] eta 0:45:42 lr 0.000896 time 1.6774 (2.4033) loss 4.6225 (3.8423) grad_norm 1.0798 (1.1693) [2022-01-19 13:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][120/1251] eta 0:45:06 lr 0.000896 time 3.0602 (2.3930) loss 2.7813 (3.8286) grad_norm 1.4109 (1.1828) [2022-01-19 13:17:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][130/1251] eta 0:44:24 lr 0.000896 time 1.5949 (2.3770) loss 3.0842 (3.8213) grad_norm 1.2436 (1.1788) [2022-01-19 13:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][140/1251] eta 0:43:43 lr 0.000896 time 1.7826 (2.3618) loss 4.3524 (3.8441) grad_norm 1.5943 (1.1819) [2022-01-19 13:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][150/1251] eta 0:43:01 lr 0.000896 time 1.8802 (2.3451) loss 3.8058 (3.8315) grad_norm 1.1906 (1.1826) [2022-01-19 13:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][160/1251] eta 0:42:13 lr 0.000896 time 1.9154 (2.3223) loss 4.2841 (3.8122) grad_norm 1.2459 (1.1825) [2022-01-19 13:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][170/1251] eta 0:41:31 lr 0.000896 time 1.6971 (2.3051) loss 4.6130 (3.8136) grad_norm 1.2085 (1.1867) [2022-01-19 13:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][180/1251] eta 0:41:00 lr 0.000896 time 2.4993 (2.2972) loss 3.8554 (3.8170) grad_norm 1.2182 (1.1845) [2022-01-19 13:19:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][190/1251] eta 0:40:19 
lr 0.000896 time 1.5653 (2.2806) loss 4.0411 (3.8136) grad_norm 1.2594 (1.1834) [2022-01-19 13:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][200/1251] eta 0:39:50 lr 0.000896 time 2.1912 (2.2747) loss 3.8278 (3.8129) grad_norm 1.1369 (1.1856) [2022-01-19 13:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][210/1251] eta 0:39:30 lr 0.000896 time 1.8700 (2.2768) loss 2.7744 (3.8154) grad_norm 1.0896 (1.1853) [2022-01-19 13:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][220/1251] eta 0:39:14 lr 0.000896 time 2.6278 (2.2839) loss 4.6453 (3.8170) grad_norm 1.0784 (1.1844) [2022-01-19 13:20:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][230/1251] eta 0:38:55 lr 0.000896 time 2.8629 (2.2877) loss 3.8918 (3.8284) grad_norm 1.1310 (1.1850) [2022-01-19 13:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][240/1251] eta 0:38:35 lr 0.000896 time 1.8674 (2.2900) loss 4.3946 (3.8258) grad_norm 1.1455 (1.1820) [2022-01-19 13:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][250/1251] eta 0:38:17 lr 0.000895 time 2.3296 (2.2949) loss 2.7090 (3.8282) grad_norm 1.0811 (1.1804) [2022-01-19 13:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][260/1251] eta 0:37:53 lr 0.000895 time 1.9622 (2.2946) loss 3.2316 (3.8301) grad_norm 1.1069 (1.1804) [2022-01-19 13:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][270/1251] eta 0:37:20 lr 0.000895 time 1.8573 (2.2835) loss 2.8168 (3.8241) grad_norm 0.9486 (1.1800) [2022-01-19 13:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][280/1251] eta 0:36:44 lr 0.000895 time 1.9498 (2.2699) loss 3.9369 (3.8169) grad_norm 0.9714 (1.1785) [2022-01-19 13:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][290/1251] eta 0:36:15 lr 0.000895 time 2.2162 (2.2637) loss 3.8653 (3.8073) grad_norm 1.0251 (1.1762) [2022-01-19 13:23:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][300/1251] eta 0:35:49 lr 0.000895 time 1.5499 (2.2604) loss 3.2028 (3.8100) grad_norm 1.0688 (1.1740) [2022-01-19 13:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][310/1251] eta 0:35:27 lr 0.000895 time 1.8805 (2.2608) loss 4.2104 (3.8115) grad_norm 1.2069 (1.1745) [2022-01-19 13:24:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][320/1251] eta 0:35:05 lr 0.000895 time 2.2739 (2.2615) loss 4.6510 (3.8149) grad_norm 1.5433 (1.1758) [2022-01-19 13:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][330/1251] eta 0:34:42 lr 0.000895 time 1.8868 (2.2614) loss 3.6117 (3.8138) grad_norm 1.0531 (1.1772) [2022-01-19 13:24:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][340/1251] eta 0:34:18 lr 0.000895 time 2.8728 (2.2601) loss 3.5590 (3.8187) grad_norm 1.0164 (1.1760) [2022-01-19 13:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][350/1251] eta 0:33:53 lr 0.000895 time 1.4883 (2.2565) loss 4.4526 (3.8226) grad_norm 1.1965 (1.1741) [2022-01-19 13:25:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][360/1251] eta 0:33:23 lr 0.000895 time 1.9516 (2.2490) loss 3.9381 (3.8272) grad_norm 1.0273 (1.1718) [2022-01-19 13:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][370/1251] eta 0:32:59 lr 0.000895 time 2.6488 (2.2473) loss 3.8606 (3.8256) grad_norm 1.1932 (1.1721) [2022-01-19 13:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][380/1251] eta 0:32:38 lr 0.000895 time 2.8162 (2.2484) loss 4.0507 (3.8290) grad_norm 1.1881 (1.1721) [2022-01-19 13:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][390/1251] eta 0:32:15 lr 0.000895 time 1.9022 (2.2483) loss 4.3686 (3.8330) grad_norm 1.0779 (1.1728) [2022-01-19 13:26:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][400/1251] eta 0:31:50 lr 0.000895 time 
2.3806 (2.2449) loss 3.6705 (3.8354) grad_norm 1.0017 (1.1711) [2022-01-19 13:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][410/1251] eta 0:31:22 lr 0.000895 time 1.7080 (2.2389) loss 3.9254 (3.8425) grad_norm 1.2245 (1.1702) [2022-01-19 13:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][420/1251] eta 0:31:01 lr 0.000895 time 2.9373 (2.2397) loss 3.4477 (3.8500) grad_norm 1.2482 (1.1719) [2022-01-19 13:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][430/1251] eta 0:30:37 lr 0.000895 time 1.9385 (2.2382) loss 4.7521 (3.8518) grad_norm 1.3888 (1.1728) [2022-01-19 13:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][440/1251] eta 0:30:15 lr 0.000895 time 2.4748 (2.2380) loss 4.1943 (3.8624) grad_norm 1.9141 (1.1746) [2022-01-19 13:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][450/1251] eta 0:29:53 lr 0.000895 time 2.1878 (2.2388) loss 4.0451 (3.8622) grad_norm 1.1653 (1.1761) [2022-01-19 13:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][460/1251] eta 0:29:31 lr 0.000895 time 2.1485 (2.2391) loss 2.6206 (3.8593) grad_norm 1.2375 (1.1771) [2022-01-19 13:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][470/1251] eta 0:29:07 lr 0.000895 time 1.6374 (2.2381) loss 3.4250 (3.8584) grad_norm 1.0942 (1.1746) [2022-01-19 13:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][480/1251] eta 0:28:42 lr 0.000895 time 1.6099 (2.2338) loss 2.8228 (3.8565) grad_norm 1.3968 (1.1734) [2022-01-19 13:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][490/1251] eta 0:28:15 lr 0.000895 time 2.1846 (2.2274) loss 4.3270 (3.8579) grad_norm 0.9386 (1.1739) [2022-01-19 13:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][500/1251] eta 0:27:49 lr 0.000895 time 1.8705 (2.2235) loss 3.8445 (3.8632) grad_norm 0.9589 (1.1736) [2022-01-19 13:30:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][510/1251] eta 0:27:26 lr 0.000895 time 2.2978 (2.2225) loss 4.3008 (3.8622) grad_norm 1.1010 (1.1728) [2022-01-19 13:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][520/1251] eta 0:27:02 lr 0.000895 time 1.8882 (2.2195) loss 4.7467 (3.8616) grad_norm 0.9880 (1.1722) [2022-01-19 13:31:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][530/1251] eta 0:26:41 lr 0.000895 time 2.5674 (2.2210) loss 4.2296 (3.8644) grad_norm 0.9966 (1.1700) [2022-01-19 13:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][540/1251] eta 0:26:19 lr 0.000895 time 2.5147 (2.2210) loss 4.3729 (3.8666) grad_norm 1.1811 (1.1693) [2022-01-19 13:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][550/1251] eta 0:25:56 lr 0.000895 time 3.2852 (2.2208) loss 4.6184 (3.8724) grad_norm 1.3092 (1.1696) [2022-01-19 13:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][560/1251] eta 0:25:33 lr 0.000895 time 2.2315 (2.2195) loss 3.9239 (3.8747) grad_norm 1.1287 (1.1708) [2022-01-19 13:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][570/1251] eta 0:25:13 lr 0.000895 time 2.7599 (2.2221) loss 4.1551 (3.8702) grad_norm 0.9953 (1.1704) [2022-01-19 13:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][580/1251] eta 0:24:52 lr 0.000895 time 2.4472 (2.2237) loss 3.0757 (3.8704) grad_norm 1.1729 (1.1703) [2022-01-19 13:33:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][590/1251] eta 0:24:33 lr 0.000895 time 3.6939 (2.2287) loss 3.0818 (3.8667) grad_norm 0.9822 (1.1683) [2022-01-19 13:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][600/1251] eta 0:24:10 lr 0.000895 time 1.8594 (2.2278) loss 3.1592 (3.8634) grad_norm 1.1733 (1.1674) [2022-01-19 13:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][610/1251] eta 0:23:47 lr 0.000895 time 
1.7090 (2.2265) loss 3.8768 (3.8662) grad_norm 1.2853 (1.1662) [2022-01-19 13:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][620/1251] eta 0:23:23 lr 0.000895 time 1.8465 (2.2240) loss 4.2350 (3.8728) grad_norm 1.1586 (1.1648) [2022-01-19 13:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][630/1251] eta 0:23:00 lr 0.000895 time 2.8889 (2.2232) loss 3.6562 (3.8726) grad_norm 1.3323 (1.1644) [2022-01-19 13:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][640/1251] eta 0:22:36 lr 0.000894 time 1.9292 (2.2196) loss 3.9962 (3.8762) grad_norm 1.3657 (1.1643) [2022-01-19 13:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][650/1251] eta 0:22:12 lr 0.000894 time 1.5724 (2.2174) loss 4.3468 (3.8779) grad_norm 1.2428 (1.1648) [2022-01-19 13:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][660/1251] eta 0:21:50 lr 0.000894 time 2.1768 (2.2175) loss 3.3484 (3.8743) grad_norm 1.0202 (1.1639) [2022-01-19 13:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][670/1251] eta 0:21:29 lr 0.000894 time 3.6604 (2.2199) loss 3.5578 (3.8766) grad_norm 1.1390 (1.1648) [2022-01-19 13:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][680/1251] eta 0:21:09 lr 0.000894 time 2.5159 (2.2229) loss 4.2554 (3.8774) grad_norm 1.4356 (1.1656) [2022-01-19 13:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][690/1251] eta 0:20:46 lr 0.000894 time 1.9002 (2.2217) loss 3.9464 (3.8737) grad_norm 1.1503 (1.1658) [2022-01-19 13:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][700/1251] eta 0:20:22 lr 0.000894 time 1.6085 (2.2184) loss 4.2080 (3.8781) grad_norm 1.1845 (1.1658) [2022-01-19 13:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][710/1251] eta 0:19:58 lr 0.000894 time 2.0382 (2.2158) loss 3.6959 (3.8772) grad_norm 1.0564 (1.1654) [2022-01-19 13:38:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][720/1251] eta 0:19:36 lr 0.000894 time 2.1844 (2.2155) loss 4.6219 (3.8748) grad_norm 1.1160 (1.1645) [2022-01-19 13:38:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][730/1251] eta 0:19:15 lr 0.000894 time 2.0858 (2.2174) loss 3.8718 (3.8707) grad_norm 1.0187 (1.1637) [2022-01-19 13:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][740/1251] eta 0:18:53 lr 0.000894 time 2.2360 (2.2186) loss 3.0819 (3.8673) grad_norm 1.1413 (1.1647) [2022-01-19 13:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][750/1251] eta 0:18:31 lr 0.000894 time 1.8989 (2.2187) loss 3.8296 (3.8691) grad_norm 1.0179 (1.1643) [2022-01-19 13:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][760/1251] eta 0:18:08 lr 0.000894 time 1.9513 (2.2173) loss 3.7045 (3.8702) grad_norm 1.1221 (1.1631) [2022-01-19 13:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][770/1251] eta 0:17:46 lr 0.000894 time 2.7245 (2.2173) loss 4.3442 (3.8702) grad_norm 1.2030 (1.1628) [2022-01-19 13:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][780/1251] eta 0:17:24 lr 0.000894 time 2.2596 (2.2167) loss 3.7771 (3.8667) grad_norm 1.0602 (1.1637) [2022-01-19 13:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][790/1251] eta 0:17:02 lr 0.000894 time 2.1207 (2.2171) loss 4.0276 (3.8676) grad_norm 1.1532 (1.1629) [2022-01-19 13:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][800/1251] eta 0:16:39 lr 0.000894 time 1.6933 (2.2158) loss 4.3720 (3.8639) grad_norm 1.2638 (1.1633) [2022-01-19 13:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][810/1251] eta 0:16:16 lr 0.000894 time 2.6465 (2.2135) loss 4.3674 (3.8661) grad_norm 1.0028 (1.1627) [2022-01-19 13:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][820/1251] eta 0:15:52 lr 0.000894 time 
1.8855 (2.2109) loss 4.4242 (3.8658) grad_norm 1.1551 (1.1636) [2022-01-19 13:42:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][830/1251] eta 0:15:29 lr 0.000894 time 2.4899 (2.2090) loss 4.0676 (3.8634) grad_norm 0.9778 (1.1642) [2022-01-19 13:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][840/1251] eta 0:15:07 lr 0.000894 time 1.6600 (2.2081) loss 4.5815 (3.8644) grad_norm 1.0439 (1.1637) [2022-01-19 13:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][850/1251] eta 0:14:46 lr 0.000894 time 2.1705 (2.2095) loss 3.5916 (3.8614) grad_norm 1.1446 (1.1644) [2022-01-19 13:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][860/1251] eta 0:14:24 lr 0.000894 time 2.2697 (2.2102) loss 3.7136 (3.8597) grad_norm 1.0085 (1.1648) [2022-01-19 13:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][870/1251] eta 0:14:02 lr 0.000894 time 1.8305 (2.2105) loss 4.4195 (3.8593) grad_norm 1.1077 (1.1647) [2022-01-19 13:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][880/1251] eta 0:13:40 lr 0.000894 time 1.9914 (2.2117) loss 3.9137 (3.8607) grad_norm 1.2074 (1.1658) [2022-01-19 13:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][890/1251] eta 0:13:18 lr 0.000894 time 2.2695 (2.2123) loss 4.1432 (3.8574) grad_norm 1.1619 (1.1654) [2022-01-19 13:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][900/1251] eta 0:12:56 lr 0.000894 time 2.2481 (2.2121) loss 3.6086 (3.8582) grad_norm 1.0937 (1.1651) [2022-01-19 13:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][910/1251] eta 0:12:33 lr 0.000894 time 1.6172 (2.2088) loss 4.6727 (3.8577) grad_norm 1.3149 (1.1655) [2022-01-19 13:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][920/1251] eta 0:12:10 lr 0.000894 time 2.2623 (2.2074) loss 4.2964 (3.8557) grad_norm 1.2683 (1.1654) [2022-01-19 13:46:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][930/1251] eta 0:11:48 lr 0.000894 time 1.8249 (2.2063) loss 4.6046 (3.8584) grad_norm 1.1256 (1.1647) [2022-01-19 13:46:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][940/1251] eta 0:11:26 lr 0.000894 time 2.5128 (2.2081) loss 4.0097 (3.8578) grad_norm 1.2547 (1.1653) [2022-01-19 13:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][950/1251] eta 0:11:04 lr 0.000894 time 1.5645 (2.2081) loss 3.1232 (3.8565) grad_norm 0.9628 (1.1648) [2022-01-19 13:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][960/1251] eta 0:10:42 lr 0.000894 time 1.9469 (2.2083) loss 3.4351 (3.8529) grad_norm 1.0144 (1.1645) [2022-01-19 13:47:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][970/1251] eta 0:10:20 lr 0.000894 time 2.3678 (2.2088) loss 4.2864 (3.8535) grad_norm 1.0622 (1.1635) [2022-01-19 13:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][980/1251] eta 0:09:58 lr 0.000894 time 1.8611 (2.2079) loss 4.4128 (3.8519) grad_norm 1.0783 (1.1634) [2022-01-19 13:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][990/1251] eta 0:09:36 lr 0.000894 time 1.9166 (2.2078) loss 4.7107 (3.8528) grad_norm 1.2063 (1.1632) [2022-01-19 13:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1000/1251] eta 0:09:14 lr 0.000894 time 2.4531 (2.2076) loss 4.3427 (3.8582) grad_norm 1.0948 (1.1625) [2022-01-19 13:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1010/1251] eta 0:08:52 lr 0.000894 time 2.4250 (2.2093) loss 3.8458 (3.8565) grad_norm 0.9950 (1.1626) [2022-01-19 13:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1020/1251] eta 0:08:30 lr 0.000894 time 2.1235 (2.2087) loss 3.5154 (3.8565) grad_norm 1.0336 (1.1624) [2022-01-19 13:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1030/1251] eta 0:08:08 lr 0.000893 time 
1.9206 (2.2084) loss 3.5916 (3.8550) grad_norm 1.0992 (1.1620) [2022-01-19 13:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1040/1251] eta 0:07:45 lr 0.000893 time 1.6318 (2.2081) loss 3.8055 (3.8564) grad_norm 1.4282 (1.1615) [2022-01-19 13:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1050/1251] eta 0:07:23 lr 0.000893 time 1.7941 (2.2058) loss 3.3213 (3.8549) grad_norm 1.1188 (1.1617) [2022-01-19 13:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1060/1251] eta 0:07:00 lr 0.000893 time 2.1603 (2.2035) loss 3.9861 (3.8551) grad_norm 0.9982 (1.1611) [2022-01-19 13:51:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1070/1251] eta 0:06:38 lr 0.000893 time 2.2439 (2.2026) loss 2.9231 (3.8540) grad_norm 1.1642 (1.1610) [2022-01-19 13:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1080/1251] eta 0:06:16 lr 0.000893 time 2.2205 (2.2018) loss 4.1763 (3.8545) grad_norm 1.2160 (1.1607) [2022-01-19 13:52:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1090/1251] eta 0:05:54 lr 0.000893 time 2.2547 (2.2043) loss 4.5367 (3.8560) grad_norm 1.0950 (1.1610) [2022-01-19 13:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1100/1251] eta 0:05:32 lr 0.000893 time 2.1375 (2.2041) loss 3.5140 (3.8541) grad_norm 1.2949 (1.1617) [2022-01-19 13:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1110/1251] eta 0:05:11 lr 0.000893 time 3.0657 (2.2065) loss 4.6148 (3.8551) grad_norm 1.1242 (1.1609) [2022-01-19 13:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1120/1251] eta 0:04:48 lr 0.000893 time 2.4136 (2.2056) loss 4.1694 (3.8551) grad_norm 1.1744 (1.1607) [2022-01-19 13:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1130/1251] eta 0:04:26 lr 0.000893 time 1.9337 (2.2047) loss 3.8306 (3.8552) grad_norm 1.1973 (1.1610) [2022-01-19 13:53:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1140/1251] eta 0:04:04 lr 0.000893 time 3.0969 (2.2029) loss 4.1012 (3.8573) grad_norm 1.2836 (1.1607) [2022-01-19 13:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1150/1251] eta 0:03:42 lr 0.000893 time 2.2286 (2.2017) loss 3.9803 (3.8565) grad_norm 1.0508 (1.1600) [2022-01-19 13:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1160/1251] eta 0:03:20 lr 0.000893 time 2.1636 (2.2009) loss 4.0867 (3.8589) grad_norm 1.1020 (1.1598) [2022-01-19 13:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1170/1251] eta 0:02:58 lr 0.000893 time 1.8489 (2.2015) loss 4.2556 (3.8603) grad_norm 1.1308 (1.1600) [2022-01-19 13:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1180/1251] eta 0:02:36 lr 0.000893 time 3.3946 (2.2029) loss 4.0255 (3.8600) grad_norm 1.0977 (1.1599) [2022-01-19 13:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1190/1251] eta 0:02:14 lr 0.000893 time 2.2738 (2.2046) loss 4.5809 (3.8605) grad_norm 1.2932 (1.1605) [2022-01-19 13:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1200/1251] eta 0:01:52 lr 0.000893 time 2.2326 (2.2056) loss 3.1050 (3.8589) grad_norm 1.1123 (1.1605) [2022-01-19 13:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1210/1251] eta 0:01:30 lr 0.000893 time 2.2161 (2.2055) loss 4.2392 (3.8606) grad_norm 1.0975 (1.1604) [2022-01-19 13:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1220/1251] eta 0:01:08 lr 0.000893 time 2.2100 (2.2033) loss 2.8094 (3.8571) grad_norm 1.0316 (1.1597) [2022-01-19 13:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1230/1251] eta 0:00:46 lr 0.000893 time 1.9196 (2.2013) loss 4.0049 (3.8573) grad_norm 1.1257 (1.1592) [2022-01-19 13:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1240/1251] eta 0:00:24 lr 
0.000893 time 1.3599 (2.1995) loss 2.6728 (3.8567) grad_norm 1.2884 (1.1592) [2022-01-19 13:57:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [63/300][1250/1251] eta 0:00:02 lr 0.000893 time 1.2224 (2.1939) loss 3.0112 (3.8544) grad_norm 1.3154 (1.1590) [2022-01-19 13:57:42 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 63 training takes 0:45:45 [2022-01-19 13:58:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.680 (18.680) Loss 1.2637 (1.2637) Acc@1 70.508 (70.508) Acc@5 90.723 (90.723) [2022-01-19 13:58:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.310 (3.460) Loss 1.2286 (1.2706) Acc@1 71.094 (70.446) Acc@5 92.383 (90.696) [2022-01-19 13:58:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.610 (2.648) Loss 1.2689 (1.2576) Acc@1 70.703 (70.861) Acc@5 89.941 (90.695) [2022-01-19 13:58:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.610 (2.303) Loss 1.1931 (1.2543) Acc@1 73.242 (71.087) Acc@5 91.797 (90.653) [2022-01-19 13:59:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.576 (2.182) Loss 1.2684 (1.2512) Acc@1 70.020 (71.160) Acc@5 91.016 (90.711) [2022-01-19 13:59:19 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.220 Acc@5 90.756 [2022-01-19 13:59:19 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.2% [2022-01-19 13:59:19 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 71.53% [2022-01-19 13:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][0/1251] eta 7:30:59 lr 0.000893 time 21.6307 (21.6307) loss 3.9226 (3.9226) grad_norm 1.1066 (1.1066) [2022-01-19 14:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][10/1251] eta 1:26:43 lr 0.000893 time 2.5044 (4.1927) loss 4.5285 (4.1094) grad_norm 1.3617 (1.1158) [2022-01-19 14:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[64/300][20/1251] eta 1:08:33 lr 0.000893 time 2.4841 (3.3416) loss 3.9883 (3.9101) grad_norm 0.9633 (1.1689) [2022-01-19 14:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][30/1251] eta 1:00:35 lr 0.000893 time 2.1878 (2.9771) loss 2.6538 (3.8751) grad_norm 1.2599 (1.1854) [2022-01-19 14:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][40/1251] eta 0:56:32 lr 0.000893 time 2.2571 (2.8013) loss 3.4140 (3.7828) grad_norm 1.0980 (1.1943) [2022-01-19 14:01:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][50/1251] eta 0:54:12 lr 0.000893 time 1.5543 (2.7083) loss 3.5940 (3.8440) grad_norm 1.0667 (1.1848) [2022-01-19 14:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][60/1251] eta 0:51:38 lr 0.000893 time 1.8988 (2.6020) loss 4.3503 (3.8755) grad_norm 1.1110 (1.1712) [2022-01-19 14:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][70/1251] eta 0:49:32 lr 0.000893 time 2.5007 (2.5173) loss 3.9180 (3.8826) grad_norm 1.1365 (1.1924) [2022-01-19 14:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][80/1251] eta 0:48:02 lr 0.000893 time 1.9579 (2.4615) loss 4.2710 (3.8607) grad_norm 1.1810 (1.1855) [2022-01-19 14:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][90/1251] eta 0:47:01 lr 0.000893 time 1.8827 (2.4299) loss 3.9934 (3.8407) grad_norm 1.0865 (1.1819) [2022-01-19 14:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][100/1251] eta 0:46:09 lr 0.000893 time 1.4857 (2.4058) loss 4.0736 (3.8492) grad_norm 1.1649 (1.1759) [2022-01-19 14:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][110/1251] eta 0:45:10 lr 0.000893 time 2.3723 (2.3760) loss 2.7064 (3.8101) grad_norm 1.2185 (1.1675) [2022-01-19 14:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][120/1251] eta 0:44:25 lr 0.000893 time 2.1807 (2.3567) loss 4.0791 (3.7946) grad_norm 1.4311 (1.1713) 
[2022-01-19 14:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][130/1251] eta 0:43:55 lr 0.000893 time 1.9503 (2.3513) loss 3.4790 (3.7876) grad_norm 1.2022 (1.1689) [2022-01-19 14:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][140/1251] eta 0:43:29 lr 0.000893 time 2.1808 (2.3490) loss 3.9866 (3.7974) grad_norm 1.5437 (1.1736) [2022-01-19 14:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][150/1251] eta 0:43:06 lr 0.000893 time 2.7789 (2.3492) loss 4.2558 (3.8086) grad_norm 1.2266 (1.1752) [2022-01-19 14:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][160/1251] eta 0:42:35 lr 0.000893 time 2.1231 (2.3424) loss 4.2009 (3.8106) grad_norm 1.0485 (1.1738) [2022-01-19 14:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][170/1251] eta 0:42:01 lr 0.000892 time 2.0707 (2.3324) loss 4.3422 (3.8232) grad_norm 1.2113 (1.1685) [2022-01-19 14:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][180/1251] eta 0:41:17 lr 0.000892 time 1.8976 (2.3129) loss 4.1245 (3.8248) grad_norm 1.0702 (1.1624) [2022-01-19 14:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][190/1251] eta 0:40:41 lr 0.000892 time 2.1547 (2.3012) loss 4.2152 (3.8338) grad_norm 1.1483 (1.1605) [2022-01-19 14:07:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][200/1251] eta 0:40:12 lr 0.000892 time 2.8233 (2.2953) loss 3.5340 (3.8427) grad_norm 1.1438 (1.1596) [2022-01-19 14:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][210/1251] eta 0:39:42 lr 0.000892 time 1.9088 (2.2885) loss 4.5755 (3.8411) grad_norm 1.1935 (1.1577) [2022-01-19 14:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][220/1251] eta 0:39:12 lr 0.000892 time 2.3014 (2.2817) loss 4.5649 (3.8412) grad_norm 1.1038 (1.1601) [2022-01-19 14:08:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][230/1251] eta 0:38:34 
lr 0.000892 time 1.9652 (2.2673) loss 4.2886 (3.8378) grad_norm 1.4854 (1.1594) [2022-01-19 14:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][240/1251] eta 0:38:06 lr 0.000892 time 1.9591 (2.2620) loss 4.0870 (3.8508) grad_norm 1.0064 (1.1604) [2022-01-19 14:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][250/1251] eta 0:37:43 lr 0.000892 time 2.0803 (2.2614) loss 3.3943 (3.8437) grad_norm 1.1150 (1.1594) [2022-01-19 14:09:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][260/1251] eta 0:37:16 lr 0.000892 time 1.8987 (2.2567) loss 4.5472 (3.8491) grad_norm 1.1309 (1.1569) [2022-01-19 14:09:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][270/1251] eta 0:36:52 lr 0.000892 time 2.8223 (2.2556) loss 3.9024 (3.8628) grad_norm 1.1238 (1.1567) [2022-01-19 14:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][280/1251] eta 0:36:25 lr 0.000892 time 1.7171 (2.2508) loss 4.4351 (3.8621) grad_norm 1.1882 (1.1542) [2022-01-19 14:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][290/1251] eta 0:36:06 lr 0.000892 time 2.4255 (2.2543) loss 3.8345 (3.8555) grad_norm 1.4660 (1.1559) [2022-01-19 14:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][300/1251] eta 0:35:41 lr 0.000892 time 1.9252 (2.2520) loss 3.6568 (3.8480) grad_norm 1.1010 (1.1569) [2022-01-19 14:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][310/1251] eta 0:35:18 lr 0.000892 time 2.9042 (2.2509) loss 4.2090 (3.8435) grad_norm 1.1215 (1.1546) [2022-01-19 14:11:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][320/1251] eta 0:34:52 lr 0.000892 time 1.5269 (2.2480) loss 3.1466 (3.8426) grad_norm 1.1989 (1.1532) [2022-01-19 14:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][330/1251] eta 0:34:32 lr 0.000892 time 2.4137 (2.2507) loss 4.2133 (3.8520) grad_norm 1.1450 (1.1524) [2022-01-19 14:12:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][340/1251] eta 0:34:05 lr 0.000892 time 1.9835 (2.2455) loss 4.0393 (3.8488) grad_norm 1.2100 (1.1534) [2022-01-19 14:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][350/1251] eta 0:33:43 lr 0.000892 time 2.5133 (2.2456) loss 2.9359 (3.8529) grad_norm 1.1241 (1.1519) [2022-01-19 14:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][360/1251] eta 0:33:15 lr 0.000892 time 1.9328 (2.2402) loss 3.6916 (3.8510) grad_norm 1.2871 (1.1535) [2022-01-19 14:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][370/1251] eta 0:32:57 lr 0.000892 time 3.2214 (2.2441) loss 3.5891 (3.8502) grad_norm 1.4230 (1.1567) [2022-01-19 14:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][380/1251] eta 0:32:30 lr 0.000892 time 1.8486 (2.2393) loss 3.9669 (3.8477) grad_norm 1.0654 (1.1583) [2022-01-19 14:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][390/1251] eta 0:32:07 lr 0.000892 time 2.5622 (2.2391) loss 4.1211 (3.8459) grad_norm 1.6915 (1.1614) [2022-01-19 14:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][400/1251] eta 0:31:42 lr 0.000892 time 1.8363 (2.2358) loss 3.7258 (3.8449) grad_norm 1.3263 (1.1636) [2022-01-19 14:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][410/1251] eta 0:31:22 lr 0.000892 time 2.9346 (2.2385) loss 3.6782 (3.8453) grad_norm 1.2394 (1.1644) [2022-01-19 14:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][420/1251] eta 0:30:57 lr 0.000892 time 1.8114 (2.2347) loss 3.2261 (3.8395) grad_norm 1.1427 (1.1649) [2022-01-19 14:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][430/1251] eta 0:30:30 lr 0.000892 time 1.9441 (2.2293) loss 2.7178 (3.8394) grad_norm 1.0724 (1.1650) [2022-01-19 14:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][440/1251] eta 0:30:05 lr 0.000892 time 
2.2994 (2.2266) loss 3.9588 (3.8376) grad_norm 1.2929 (1.1665) [2022-01-19 14:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][450/1251] eta 0:29:45 lr 0.000892 time 2.8423 (2.2287) loss 3.9345 (3.8433) grad_norm 1.2302 (1.1667) [2022-01-19 14:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][460/1251] eta 0:29:23 lr 0.000892 time 1.7724 (2.2294) loss 4.2207 (3.8462) grad_norm 1.0489 (1.1654) [2022-01-19 14:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][470/1251] eta 0:29:00 lr 0.000892 time 2.1678 (2.2280) loss 3.9740 (3.8464) grad_norm 1.0392 (1.1652) [2022-01-19 14:17:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][480/1251] eta 0:28:37 lr 0.000892 time 2.6156 (2.2278) loss 3.6578 (3.8430) grad_norm 1.2149 (1.1662) [2022-01-19 14:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][490/1251] eta 0:28:15 lr 0.000892 time 2.7596 (2.2283) loss 4.0411 (3.8400) grad_norm 1.0141 (1.1666) [2022-01-19 14:17:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][500/1251] eta 0:27:52 lr 0.000892 time 2.2581 (2.2272) loss 3.6250 (3.8416) grad_norm 1.0152 (1.1667) [2022-01-19 14:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][510/1251] eta 0:27:27 lr 0.000892 time 1.9645 (2.2237) loss 3.2969 (3.8442) grad_norm 1.0800 (1.1646) [2022-01-19 14:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][520/1251] eta 0:27:03 lr 0.000892 time 1.9399 (2.2209) loss 4.4098 (3.8474) grad_norm 0.9533 (1.1637) [2022-01-19 14:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][530/1251] eta 0:26:37 lr 0.000892 time 1.8343 (2.2160) loss 4.1014 (3.8478) grad_norm 1.1125 (1.1652) [2022-01-19 14:19:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][540/1251] eta 0:26:15 lr 0.000892 time 2.5412 (2.2161) loss 2.7285 (3.8497) grad_norm 1.1882 (1.1653) [2022-01-19 14:19:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][550/1251] eta 0:25:52 lr 0.000892 time 1.8140 (2.2153) loss 4.6848 (3.8529) grad_norm 1.0682 (1.1648) [2022-01-19 14:20:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][560/1251] eta 0:25:31 lr 0.000891 time 2.4036 (2.2157) loss 4.4711 (3.8522) grad_norm 1.6149 (1.1671) [2022-01-19 14:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][570/1251] eta 0:25:10 lr 0.000891 time 2.2574 (2.2181) loss 3.7608 (3.8492) grad_norm 1.4850 (1.1676) [2022-01-19 14:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][580/1251] eta 0:24:47 lr 0.000891 time 3.0026 (2.2173) loss 4.5616 (3.8510) grad_norm 1.1459 (1.1676) [2022-01-19 14:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][590/1251] eta 0:24:23 lr 0.000891 time 1.5570 (2.2144) loss 4.4639 (3.8545) grad_norm 1.1226 (1.1694) [2022-01-19 14:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][600/1251] eta 0:24:00 lr 0.000891 time 2.1252 (2.2127) loss 4.6875 (3.8541) grad_norm 1.0528 (1.1689) [2022-01-19 14:21:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][610/1251] eta 0:23:38 lr 0.000891 time 1.8243 (2.2130) loss 4.0332 (3.8534) grad_norm 1.0808 (1.1684) [2022-01-19 14:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][620/1251] eta 0:23:17 lr 0.000891 time 2.6353 (2.2152) loss 3.8292 (3.8544) grad_norm 1.2018 (1.1681) [2022-01-19 14:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][630/1251] eta 0:22:57 lr 0.000891 time 2.8817 (2.2179) loss 4.5428 (3.8550) grad_norm 1.2674 (1.1675) [2022-01-19 14:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][640/1251] eta 0:22:34 lr 0.000891 time 2.0385 (2.2171) loss 3.1247 (3.8518) grad_norm 1.0162 (1.1659) [2022-01-19 14:23:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][650/1251] eta 0:22:10 lr 0.000891 time 
2.1025 (2.2143) loss 2.7995 (3.8470) grad_norm 1.0899 (1.1652) [2022-01-19 14:23:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][660/1251] eta 0:21:46 lr 0.000891 time 2.0726 (2.2106) loss 4.1511 (3.8464) grad_norm 1.0540 (1.1636) [2022-01-19 14:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][670/1251] eta 0:21:23 lr 0.000891 time 2.2216 (2.2095) loss 3.8999 (3.8477) grad_norm 1.0918 (1.1634) [2022-01-19 14:24:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][680/1251] eta 0:21:01 lr 0.000891 time 1.9227 (2.2094) loss 3.8702 (3.8482) grad_norm 1.5571 (1.1651) [2022-01-19 14:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][690/1251] eta 0:20:39 lr 0.000891 time 2.1754 (2.2098) loss 4.3015 (3.8494) grad_norm 1.1017 (1.1655) [2022-01-19 14:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][700/1251] eta 0:20:18 lr 0.000891 time 2.5326 (2.2115) loss 3.5577 (3.8452) grad_norm 1.1438 (1.1660) [2022-01-19 14:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][710/1251] eta 0:19:57 lr 0.000891 time 1.8458 (2.2135) loss 2.6176 (3.8427) grad_norm 1.0017 (1.1656) [2022-01-19 14:25:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][720/1251] eta 0:19:35 lr 0.000891 time 2.9250 (2.2144) loss 4.2321 (3.8434) grad_norm 1.0874 (1.1655) [2022-01-19 14:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][730/1251] eta 0:19:12 lr 0.000891 time 1.6166 (2.2113) loss 4.2766 (3.8418) grad_norm 1.1773 (1.1643) [2022-01-19 14:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][740/1251] eta 0:18:49 lr 0.000891 time 2.2307 (2.2107) loss 3.3638 (3.8426) grad_norm 1.1684 (1.1649) [2022-01-19 14:26:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][750/1251] eta 0:18:26 lr 0.000891 time 2.1813 (2.2090) loss 4.3290 (3.8447) grad_norm 1.0701 (1.1650) [2022-01-19 14:27:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][760/1251] eta 0:18:03 lr 0.000891 time 2.0911 (2.2075) loss 4.7720 (3.8455) grad_norm 1.2935 (1.1650) [2022-01-19 14:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][770/1251] eta 0:17:41 lr 0.000891 time 2.0215 (2.2072) loss 3.6185 (3.8467) grad_norm 1.0382 (1.1649) [2022-01-19 14:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][780/1251] eta 0:17:19 lr 0.000891 time 2.4988 (2.2067) loss 4.3646 (3.8484) grad_norm 1.0748 (1.1644) [2022-01-19 14:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][790/1251] eta 0:16:56 lr 0.000891 time 1.8746 (2.2056) loss 4.1228 (3.8500) grad_norm 1.1462 (1.1633) [2022-01-19 14:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][800/1251] eta 0:16:34 lr 0.000891 time 2.3251 (2.2045) loss 4.7070 (3.8514) grad_norm 1.2862 (1.1629) [2022-01-19 14:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][810/1251] eta 0:16:12 lr 0.000891 time 2.4866 (2.2043) loss 3.8937 (3.8494) grad_norm 1.0193 (1.1621) [2022-01-19 14:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][820/1251] eta 0:15:51 lr 0.000891 time 2.5278 (2.2073) loss 3.5563 (3.8501) grad_norm 1.2750 (1.1620) [2022-01-19 14:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][830/1251] eta 0:15:30 lr 0.000891 time 3.2244 (2.2105) loss 4.5064 (3.8525) grad_norm 1.0012 (1.1621) [2022-01-19 14:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][840/1251] eta 0:15:08 lr 0.000891 time 2.4563 (2.2107) loss 4.2993 (3.8535) grad_norm 1.0041 (1.1617) [2022-01-19 14:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][850/1251] eta 0:14:45 lr 0.000891 time 1.9271 (2.2076) loss 4.3227 (3.8536) grad_norm 1.2197 (1.1619) [2022-01-19 14:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][860/1251] eta 0:14:21 lr 0.000891 time 
1.8073 (2.2045) loss 3.3372 (3.8550) grad_norm 1.0934 (1.1614) [2022-01-19 14:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][870/1251] eta 0:13:59 lr 0.000891 time 1.8611 (2.2039) loss 3.9115 (3.8576) grad_norm 1.1961 (1.1608) [2022-01-19 14:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][880/1251] eta 0:13:37 lr 0.000891 time 2.4677 (2.2041) loss 4.7520 (3.8573) grad_norm 1.0139 (1.1611) [2022-01-19 14:32:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][890/1251] eta 0:13:15 lr 0.000891 time 1.6687 (2.2028) loss 3.6494 (3.8607) grad_norm 0.9216 (1.1604) [2022-01-19 14:32:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][900/1251] eta 0:12:53 lr 0.000891 time 1.8654 (2.2029) loss 4.4186 (3.8617) grad_norm 1.2065 (1.1599) [2022-01-19 14:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][910/1251] eta 0:12:31 lr 0.000891 time 2.7327 (2.2036) loss 3.7277 (3.8640) grad_norm 1.0801 (1.1595) [2022-01-19 14:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][920/1251] eta 0:12:09 lr 0.000891 time 2.6419 (2.2046) loss 3.4678 (3.8649) grad_norm 1.3293 (1.1593) [2022-01-19 14:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][930/1251] eta 0:11:47 lr 0.000891 time 1.5188 (2.2034) loss 4.1242 (3.8670) grad_norm 1.0271 (1.1589) [2022-01-19 14:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][940/1251] eta 0:11:25 lr 0.000890 time 2.0830 (2.2044) loss 4.6502 (3.8691) grad_norm 1.2735 (1.1599) [2022-01-19 14:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][950/1251] eta 0:11:03 lr 0.000890 time 2.0949 (2.2055) loss 3.6263 (3.8696) grad_norm 1.0138 (1.1598) [2022-01-19 14:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][960/1251] eta 0:10:41 lr 0.000890 time 3.0487 (2.2054) loss 3.6238 (3.8670) grad_norm 0.9710 (1.1589) [2022-01-19 14:34:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][970/1251] eta 0:10:19 lr 0.000890 time 1.6033 (2.2038) loss 2.9956 (3.8653) grad_norm 1.2921 (1.1583) [2022-01-19 14:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][980/1251] eta 0:09:56 lr 0.000890 time 1.7929 (2.2023) loss 4.7679 (3.8672) grad_norm 1.1241 (1.1580) [2022-01-19 14:35:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][990/1251] eta 0:09:34 lr 0.000890 time 2.0619 (2.2023) loss 2.5942 (3.8651) grad_norm 1.1382 (1.1580) [2022-01-19 14:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1000/1251] eta 0:09:12 lr 0.000890 time 2.6151 (2.2026) loss 3.8299 (3.8641) grad_norm 1.0428 (1.1579) [2022-01-19 14:36:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1010/1251] eta 0:08:50 lr 0.000890 time 2.1076 (2.2030) loss 4.6668 (3.8662) grad_norm 1.2647 (1.1578) [2022-01-19 14:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1020/1251] eta 0:08:28 lr 0.000890 time 1.5691 (2.2030) loss 4.1858 (3.8650) grad_norm 1.0264 (1.1570) [2022-01-19 14:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1030/1251] eta 0:08:06 lr 0.000890 time 1.8942 (2.2036) loss 4.1367 (3.8650) grad_norm 1.0032 (1.1562) [2022-01-19 14:37:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1040/1251] eta 0:07:45 lr 0.000890 time 3.3084 (2.2052) loss 4.1923 (3.8643) grad_norm 1.2390 (1.1562) [2022-01-19 14:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1050/1251] eta 0:07:23 lr 0.000890 time 1.5864 (2.2041) loss 2.6875 (3.8616) grad_norm 1.0198 (1.1554) [2022-01-19 14:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1060/1251] eta 0:07:00 lr 0.000890 time 1.6400 (2.2024) loss 4.2161 (3.8624) grad_norm 1.0061 (1.1547) [2022-01-19 14:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1070/1251] eta 0:06:38 lr 0.000890 
time 1.7712 (2.2013) loss 4.3373 (3.8616) grad_norm 1.0998 (1.1536) [2022-01-19 14:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1080/1251] eta 0:06:16 lr 0.000890 time 2.8519 (2.2015) loss 2.6358 (3.8614) grad_norm 1.1595 (1.1532) [2022-01-19 14:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1090/1251] eta 0:05:54 lr 0.000890 time 2.2565 (2.2027) loss 4.3375 (3.8646) grad_norm 1.4812 (1.1531) [2022-01-19 14:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1100/1251] eta 0:05:32 lr 0.000890 time 2.3478 (2.2025) loss 3.6313 (3.8628) grad_norm 1.2236 (1.1526) [2022-01-19 14:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1110/1251] eta 0:05:10 lr 0.000890 time 1.9619 (2.2017) loss 4.5588 (3.8639) grad_norm 1.1107 (1.1528) [2022-01-19 14:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1120/1251] eta 0:04:48 lr 0.000890 time 1.9755 (2.1995) loss 4.2425 (3.8648) grad_norm 1.3687 (1.1533) [2022-01-19 14:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1130/1251] eta 0:04:26 lr 0.000890 time 2.0298 (2.1989) loss 3.5651 (3.8636) grad_norm 1.1852 (1.1533) [2022-01-19 14:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1140/1251] eta 0:04:03 lr 0.000890 time 1.7439 (2.1981) loss 4.1970 (3.8616) grad_norm 1.3666 (1.1531) [2022-01-19 14:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1150/1251] eta 0:03:42 lr 0.000890 time 3.3573 (2.1987) loss 3.6134 (3.8619) grad_norm 1.2407 (1.1529) [2022-01-19 14:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1160/1251] eta 0:03:20 lr 0.000890 time 2.4302 (2.1993) loss 3.7384 (3.8582) grad_norm 1.2418 (1.1532) [2022-01-19 14:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1170/1251] eta 0:02:58 lr 0.000890 time 1.8447 (2.2009) loss 4.6553 (3.8597) grad_norm 1.1560 (1.1537) [2022-01-19 14:42:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1180/1251] eta 0:02:36 lr 0.000890 time 1.5659 (2.2009) loss 2.8872 (3.8599) grad_norm 1.0419 (1.1534) [2022-01-19 14:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1190/1251] eta 0:02:14 lr 0.000890 time 3.4180 (2.2015) loss 4.1164 (3.8586) grad_norm 1.3461 (1.1537) [2022-01-19 14:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1200/1251] eta 0:01:52 lr 0.000890 time 2.7912 (2.2012) loss 4.0404 (3.8601) grad_norm 0.9883 (1.1531) [2022-01-19 14:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1210/1251] eta 0:01:30 lr 0.000890 time 2.0640 (2.1994) loss 2.8858 (3.8572) grad_norm 0.9689 (1.1527) [2022-01-19 14:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1220/1251] eta 0:01:08 lr 0.000890 time 1.8750 (2.1984) loss 3.3255 (3.8568) grad_norm 1.2248 (1.1526) [2022-01-19 14:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1230/1251] eta 0:00:46 lr 0.000890 time 3.3057 (2.1981) loss 4.1316 (3.8582) grad_norm 1.2515 (1.1528) [2022-01-19 14:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1240/1251] eta 0:00:24 lr 0.000890 time 1.5535 (2.1972) loss 3.1788 (3.8567) grad_norm 1.1269 (1.1537) [2022-01-19 14:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [64/300][1250/1251] eta 0:00:02 lr 0.000890 time 1.1521 (2.1920) loss 3.6783 (3.8564) grad_norm 1.1576 (1.1539) [2022-01-19 14:45:02 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 64 training takes 0:45:42 [2022-01-19 14:45:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.760 (18.760) Loss 1.2430 (1.2430) Acc@1 72.949 (72.949) Acc@5 90.332 (90.332) [2022-01-19 14:45:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.863 (3.581) Loss 1.2525 (1.2332) Acc@1 70.703 (71.520) Acc@5 90.234 (90.598) [2022-01-19 14:45:59 swin_tiny_patch4_window7_224] 
(main.py 239): INFO Test: [20/49] Time 1.275 (2.747) Loss 1.3071 (1.2236) Acc@1 68.457 (71.661) Acc@5 89.941 (90.834) [2022-01-19 14:46:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.591 (2.503) Loss 1.2639 (1.2259) Acc@1 70.312 (71.434) Acc@5 91.113 (90.959) [2022-01-19 14:46:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.910 (2.236) Loss 1.2971 (1.2254) Acc@1 71.289 (71.406) Acc@5 89.160 (90.939) [2022-01-19 14:46:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.442 Acc@5 90.898 [2022-01-19 14:46:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.4% [2022-01-19 14:46:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 71.53% [2022-01-19 14:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][0/1251] eta 7:42:09 lr 0.000890 time 22.1659 (22.1659) loss 3.4222 (3.4222) grad_norm 1.1777 (1.1777) [2022-01-19 14:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][10/1251] eta 1:22:53 lr 0.000890 time 2.1911 (4.0077) loss 4.2362 (3.7988) grad_norm 1.0047 (1.0857) [2022-01-19 14:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][20/1251] eta 1:06:13 lr 0.000890 time 2.1473 (3.2276) loss 3.8032 (3.7738) grad_norm 1.2047 (1.1342) [2022-01-19 14:48:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][30/1251] eta 0:58:46 lr 0.000890 time 1.9324 (2.8885) loss 3.8392 (3.8410) grad_norm 1.1955 (1.1475) [2022-01-19 14:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][40/1251] eta 0:55:23 lr 0.000890 time 3.6469 (2.7442) loss 3.8605 (3.8042) grad_norm 1.1050 (1.1505) [2022-01-19 14:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][50/1251] eta 0:52:55 lr 0.000890 time 2.9809 (2.6439) loss 3.2297 (3.8204) grad_norm 1.0880 (1.1385) [2022-01-19 14:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][60/1251] 
eta 0:50:43 lr 0.000890 time 1.5561 (2.5557) loss 4.4757 (3.8178) grad_norm 1.2220 (1.1350) [2022-01-19 14:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][70/1251] eta 0:48:57 lr 0.000890 time 1.6565 (2.4872) loss 4.1852 (3.8275) grad_norm 1.2657 (1.1428) [2022-01-19 14:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][80/1251] eta 0:47:34 lr 0.000889 time 2.3516 (2.4380) loss 3.0933 (3.7976) grad_norm 1.1797 (1.1484) [2022-01-19 14:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][90/1251] eta 0:46:51 lr 0.000889 time 2.2893 (2.4216) loss 2.7646 (3.7746) grad_norm 1.3622 (1.1586) [2022-01-19 14:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][100/1251] eta 0:45:46 lr 0.000889 time 1.9233 (2.3865) loss 4.4948 (3.7752) grad_norm 1.1285 (1.1583) [2022-01-19 14:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][110/1251] eta 0:45:03 lr 0.000889 time 1.9484 (2.3692) loss 3.9269 (3.7842) grad_norm 1.0596 (1.1613) [2022-01-19 14:51:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][120/1251] eta 0:44:21 lr 0.000889 time 2.1335 (2.3530) loss 3.5148 (3.7742) grad_norm 1.3934 (1.1583) [2022-01-19 14:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][130/1251] eta 0:43:40 lr 0.000889 time 2.2098 (2.3377) loss 3.8975 (3.7788) grad_norm 1.2100 (1.1523) [2022-01-19 14:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][140/1251] eta 0:43:11 lr 0.000889 time 1.7992 (2.3329) loss 2.6282 (3.7971) grad_norm 1.0592 (1.1500) [2022-01-19 14:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][150/1251] eta 0:42:36 lr 0.000889 time 1.9086 (2.3223) loss 4.2231 (3.8126) grad_norm 1.2654 (1.1542) [2022-01-19 14:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][160/1251] eta 0:41:52 lr 0.000889 time 1.7190 (2.3030) loss 4.1026 (3.8046) grad_norm 1.2702 (1.1590) [2022-01-19 14:53:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][170/1251] eta 0:41:21 lr 0.000889 time 2.1879 (2.2956) loss 4.2191 (3.8105) grad_norm 1.0625 (1.1599) [2022-01-19 14:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][180/1251] eta 0:40:54 lr 0.000889 time 2.5253 (2.2922) loss 4.2440 (3.7855) grad_norm 1.4059 (1.1598) [2022-01-19 14:53:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][190/1251] eta 0:40:24 lr 0.000889 time 2.1656 (2.2847) loss 4.1419 (3.7704) grad_norm 1.0010 (1.1615) [2022-01-19 14:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][200/1251] eta 0:39:57 lr 0.000889 time 2.2454 (2.2815) loss 4.6277 (3.7834) grad_norm 1.1336 (1.1614) [2022-01-19 14:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][210/1251] eta 0:39:32 lr 0.000889 time 1.8833 (2.2794) loss 4.8784 (3.7913) grad_norm 1.0499 (1.1597) [2022-01-19 14:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][220/1251] eta 0:39:04 lr 0.000889 time 1.8898 (2.2737) loss 3.5325 (3.7893) grad_norm 1.1186 (1.1638) [2022-01-19 14:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][230/1251] eta 0:38:46 lr 0.000889 time 1.9020 (2.2784) loss 4.1909 (3.7817) grad_norm 1.3270 (1.1651) [2022-01-19 14:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][240/1251] eta 0:38:25 lr 0.000889 time 3.1133 (2.2803) loss 3.8818 (3.7836) grad_norm 0.9570 (1.1617) [2022-01-19 14:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][250/1251] eta 0:37:56 lr 0.000889 time 1.6062 (2.2743) loss 4.7891 (3.7822) grad_norm 1.1448 (1.1606) [2022-01-19 14:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][260/1251] eta 0:37:24 lr 0.000889 time 1.9360 (2.2649) loss 2.9791 (3.7842) grad_norm 0.9400 (1.1639) [2022-01-19 14:56:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][270/1251] eta 0:36:54 lr 0.000889 time 
2.4634 (2.2571) loss 3.9778 (3.7858) grad_norm 1.0027 (1.1647) [2022-01-19 14:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][280/1251] eta 0:36:22 lr 0.000889 time 1.7226 (2.2472) loss 4.5663 (3.7891) grad_norm 1.0972 (1.1645) [2022-01-19 14:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][290/1251] eta 0:36:00 lr 0.000889 time 1.8380 (2.2482) loss 4.0494 (3.7886) grad_norm 1.1505 (1.1615) [2022-01-19 14:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][300/1251] eta 0:35:35 lr 0.000889 time 2.4277 (2.2456) loss 3.7195 (3.7853) grad_norm 1.1318 (1.1619) [2022-01-19 14:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][310/1251] eta 0:35:15 lr 0.000889 time 3.7397 (2.2485) loss 4.6380 (3.7916) grad_norm 1.2682 (1.1619) [2022-01-19 14:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][320/1251] eta 0:34:52 lr 0.000889 time 1.9547 (2.2480) loss 4.0453 (3.8000) grad_norm 1.0088 (1.1632) [2022-01-19 14:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][330/1251] eta 0:34:31 lr 0.000889 time 1.7522 (2.2496) loss 3.1486 (3.7958) grad_norm 1.0974 (1.1631) [2022-01-19 14:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][340/1251] eta 0:34:10 lr 0.000889 time 1.8632 (2.2510) loss 4.2455 (3.7954) grad_norm 1.1393 (1.1628) [2022-01-19 14:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][350/1251] eta 0:33:46 lr 0.000889 time 2.3871 (2.2488) loss 3.5596 (3.8059) grad_norm 1.1377 (1.1634) [2022-01-19 15:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][360/1251] eta 0:33:19 lr 0.000889 time 2.0757 (2.2440) loss 2.7076 (3.8090) grad_norm 1.0768 (1.1623) [2022-01-19 15:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][370/1251] eta 0:32:50 lr 0.000889 time 1.8976 (2.2368) loss 3.2124 (3.8083) grad_norm 1.3040 (1.1619) [2022-01-19 15:00:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][380/1251] eta 0:32:24 lr 0.000889 time 1.9213 (2.2324) loss 4.4510 (3.8116) grad_norm 1.0588 (1.1604) [2022-01-19 15:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][390/1251] eta 0:32:01 lr 0.000889 time 2.0005 (2.2314) loss 3.6295 (3.8104) grad_norm 1.2529 (1.1595) [2022-01-19 15:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][400/1251] eta 0:31:35 lr 0.000889 time 2.1976 (2.2275) loss 2.9499 (3.8078) grad_norm 1.2019 (1.1574) [2022-01-19 15:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][410/1251] eta 0:31:15 lr 0.000889 time 2.2163 (2.2298) loss 3.9423 (3.8076) grad_norm 1.4290 (1.1560) [2022-01-19 15:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][420/1251] eta 0:30:51 lr 0.000889 time 1.8800 (2.2281) loss 3.7579 (3.8120) grad_norm 1.0455 (1.1553) [2022-01-19 15:02:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][430/1251] eta 0:30:27 lr 0.000889 time 2.2064 (2.2263) loss 4.5640 (3.8117) grad_norm 1.1185 (1.1564) [2022-01-19 15:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][440/1251] eta 0:30:06 lr 0.000889 time 2.3588 (2.2269) loss 3.2282 (3.8070) grad_norm 1.3181 (1.1568) [2022-01-19 15:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][450/1251] eta 0:29:44 lr 0.000889 time 2.2928 (2.2277) loss 4.4617 (3.8079) grad_norm 1.0360 (1.1569) [2022-01-19 15:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][460/1251] eta 0:29:23 lr 0.000888 time 2.5498 (2.2292) loss 3.2162 (3.8077) grad_norm 1.2535 (1.1563) [2022-01-19 15:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][470/1251] eta 0:29:00 lr 0.000888 time 1.5477 (2.2288) loss 4.1212 (3.8099) grad_norm 0.9774 (1.1556) [2022-01-19 15:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][480/1251] eta 0:28:37 lr 0.000888 time 
1.8986 (2.2281) loss 4.3304 (3.8112) grad_norm 1.4326 (1.1569) [2022-01-19 15:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][490/1251] eta 0:28:11 lr 0.000888 time 1.7003 (2.2227) loss 3.9742 (3.8141) grad_norm 1.2622 (1.1573) [2022-01-19 15:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][500/1251] eta 0:27:49 lr 0.000888 time 2.8788 (2.2234) loss 2.8508 (3.8156) grad_norm 1.1208 (1.1570) [2022-01-19 15:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][510/1251] eta 0:27:26 lr 0.000888 time 1.9053 (2.2215) loss 4.1886 (3.8177) grad_norm 1.0121 (1.1565) [2022-01-19 15:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][520/1251] eta 0:27:02 lr 0.000888 time 1.9200 (2.2202) loss 3.1765 (3.8156) grad_norm 1.0668 (1.1567) [2022-01-19 15:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][530/1251] eta 0:26:39 lr 0.000888 time 1.7928 (2.2183) loss 4.1740 (3.8211) grad_norm 1.2862 (1.1577) [2022-01-19 15:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][540/1251] eta 0:26:15 lr 0.000888 time 2.6012 (2.2162) loss 3.4595 (3.8222) grad_norm 1.0966 (1.1590) [2022-01-19 15:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][550/1251] eta 0:25:53 lr 0.000888 time 2.5834 (2.2154) loss 3.9991 (3.8259) grad_norm 1.1325 (1.1573) [2022-01-19 15:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][560/1251] eta 0:25:29 lr 0.000888 time 2.1915 (2.2133) loss 3.4738 (3.8285) grad_norm 1.3735 (1.1558) [2022-01-19 15:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][570/1251] eta 0:25:08 lr 0.000888 time 1.9734 (2.2144) loss 3.1867 (3.8258) grad_norm 1.0892 (1.1556) [2022-01-19 15:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][580/1251] eta 0:24:45 lr 0.000888 time 2.4356 (2.2134) loss 4.4710 (3.8279) grad_norm 1.1081 (1.1549) [2022-01-19 15:08:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][590/1251] eta 0:24:23 lr 0.000888 time 2.5730 (2.2144) loss 3.2288 (3.8329) grad_norm 1.3297 (1.1561) [2022-01-19 15:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][600/1251] eta 0:24:01 lr 0.000888 time 2.0569 (2.2144) loss 4.1938 (3.8339) grad_norm 1.0031 (1.1559) [2022-01-19 15:09:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][610/1251] eta 0:23:39 lr 0.000888 time 2.1045 (2.2146) loss 3.1548 (3.8307) grad_norm 1.2284 (1.1558) [2022-01-19 15:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][620/1251] eta 0:23:15 lr 0.000888 time 1.6868 (2.2123) loss 4.0781 (3.8373) grad_norm 1.0780 (1.1570) [2022-01-19 15:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][630/1251] eta 0:22:55 lr 0.000888 time 3.2187 (2.2151) loss 3.2334 (3.8347) grad_norm 1.0818 (1.1570) [2022-01-19 15:10:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][640/1251] eta 0:22:33 lr 0.000888 time 2.1948 (2.2144) loss 4.1762 (3.8382) grad_norm 1.0375 (1.1557) [2022-01-19 15:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][650/1251] eta 0:22:11 lr 0.000888 time 2.2328 (2.2151) loss 4.3821 (3.8391) grad_norm 1.0677 (1.1549) [2022-01-19 15:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][660/1251] eta 0:21:48 lr 0.000888 time 1.7169 (2.2146) loss 4.7873 (3.8423) grad_norm 1.1797 (1.1555) [2022-01-19 15:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][670/1251] eta 0:21:26 lr 0.000888 time 2.9187 (2.2139) loss 4.4663 (3.8451) grad_norm 1.0387 (1.1545) [2022-01-19 15:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][680/1251] eta 0:21:02 lr 0.000888 time 1.6917 (2.2107) loss 3.9200 (3.8488) grad_norm 1.0030 (1.1533) [2022-01-19 15:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][690/1251] eta 0:20:39 lr 0.000888 time 
2.2945 (2.2092) loss 4.3524 (3.8499) grad_norm 1.0576 (1.1522) [2022-01-19 15:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][700/1251] eta 0:20:17 lr 0.000888 time 1.9188 (2.2090) loss 4.2768 (3.8498) grad_norm 1.2053 (1.1513) [2022-01-19 15:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][710/1251] eta 0:19:55 lr 0.000888 time 2.5965 (2.2094) loss 3.1273 (3.8535) grad_norm 1.0162 (1.1505) [2022-01-19 15:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][720/1251] eta 0:19:33 lr 0.000888 time 2.5126 (2.2096) loss 3.8465 (3.8542) grad_norm 1.0555 (1.1504) [2022-01-19 15:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][730/1251] eta 0:19:10 lr 0.000888 time 2.0736 (2.2089) loss 4.1026 (3.8571) grad_norm 1.2220 (1.1511) [2022-01-19 15:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][740/1251] eta 0:18:48 lr 0.000888 time 2.3069 (2.2078) loss 4.0249 (3.8614) grad_norm 1.2924 (1.1519) [2022-01-19 15:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][750/1251] eta 0:18:26 lr 0.000888 time 2.7856 (2.2082) loss 3.1084 (3.8639) grad_norm 1.1489 (1.1518) [2022-01-19 15:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][760/1251] eta 0:18:03 lr 0.000888 time 1.7561 (2.2068) loss 3.8182 (3.8666) grad_norm 1.2951 (1.1517) [2022-01-19 15:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][770/1251] eta 0:17:40 lr 0.000888 time 1.9350 (2.2051) loss 4.2744 (3.8710) grad_norm 1.2209 (1.1518) [2022-01-19 15:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][780/1251] eta 0:17:17 lr 0.000888 time 2.2293 (2.2032) loss 4.3723 (3.8703) grad_norm 1.0933 (1.1529) [2022-01-19 15:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][790/1251] eta 0:16:55 lr 0.000888 time 2.1849 (2.2018) loss 4.1427 (3.8717) grad_norm 1.2101 (1.1555) [2022-01-19 15:16:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][800/1251] eta 0:16:34 lr 0.000888 time 2.1418 (2.2043) loss 3.4180 (3.8693) grad_norm 0.9893 (1.1549) [2022-01-19 15:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][810/1251] eta 0:16:12 lr 0.000888 time 2.5757 (2.2061) loss 4.1454 (3.8726) grad_norm 1.4000 (1.1552) [2022-01-19 15:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][820/1251] eta 0:15:50 lr 0.000888 time 1.7359 (2.2049) loss 3.1901 (3.8694) grad_norm 1.1231 (1.1553) [2022-01-19 15:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][830/1251] eta 0:15:27 lr 0.000888 time 1.6404 (2.2040) loss 4.2019 (3.8716) grad_norm 1.1866 (1.1545) [2022-01-19 15:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][840/1251] eta 0:15:06 lr 0.000887 time 2.5029 (2.2054) loss 2.9814 (3.8684) grad_norm 1.1021 (1.1539) [2022-01-19 15:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][850/1251] eta 0:14:44 lr 0.000887 time 2.1393 (2.2047) loss 3.7364 (3.8670) grad_norm 1.4155 (1.1536) [2022-01-19 15:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][860/1251] eta 0:14:21 lr 0.000887 time 1.9644 (2.2023) loss 3.4636 (3.8686) grad_norm 1.2512 (1.1531) [2022-01-19 15:18:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][870/1251] eta 0:13:58 lr 0.000887 time 1.9200 (2.2017) loss 3.9023 (3.8652) grad_norm 1.2507 (1.1538) [2022-01-19 15:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][880/1251] eta 0:13:36 lr 0.000887 time 2.1522 (2.2008) loss 4.7755 (3.8660) grad_norm 0.9492 (1.1540) [2022-01-19 15:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][890/1251] eta 0:13:13 lr 0.000887 time 2.4267 (2.1982) loss 3.7118 (3.8678) grad_norm 0.9726 (1.1528) [2022-01-19 15:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][900/1251] eta 0:12:51 lr 0.000887 time 
2.2989 (2.1973) loss 3.3842 (3.8658) grad_norm 1.0742 (1.1526) [2022-01-19 15:20:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][910/1251] eta 0:12:29 lr 0.000887 time 2.4906 (2.1969) loss 4.7407 (3.8658) grad_norm 0.9999 (1.1517) [2022-01-19 15:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][920/1251] eta 0:12:07 lr 0.000887 time 2.1100 (2.1976) loss 3.1316 (3.8669) grad_norm 1.2407 (1.1512) [2022-01-19 15:20:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][930/1251] eta 0:11:45 lr 0.000887 time 1.5586 (2.1970) loss 3.7428 (3.8656) grad_norm 1.1650 (1.1517) [2022-01-19 15:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][940/1251] eta 0:11:23 lr 0.000887 time 1.6696 (2.1970) loss 3.8311 (3.8645) grad_norm 1.3948 (1.1526) [2022-01-19 15:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][950/1251] eta 0:11:02 lr 0.000887 time 2.7795 (2.1998) loss 4.1103 (3.8653) grad_norm 1.0862 (1.1542) [2022-01-19 15:21:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][960/1251] eta 0:10:40 lr 0.000887 time 1.4916 (2.1996) loss 2.7119 (3.8637) grad_norm 1.1871 (1.1549) [2022-01-19 15:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][970/1251] eta 0:10:17 lr 0.000887 time 1.9037 (2.1986) loss 3.0696 (3.8630) grad_norm 1.2531 (1.1553) [2022-01-19 15:22:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][980/1251] eta 0:09:55 lr 0.000887 time 1.5823 (2.1966) loss 4.3135 (3.8651) grad_norm 1.1727 (1.1557) [2022-01-19 15:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][990/1251] eta 0:09:33 lr 0.000887 time 2.4847 (2.1974) loss 3.3902 (3.8647) grad_norm 1.1955 (1.1555) [2022-01-19 15:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1000/1251] eta 0:09:12 lr 0.000887 time 2.1243 (2.1992) loss 2.6857 (3.8653) grad_norm 1.0450 (1.1549) [2022-01-19 15:23:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1010/1251] eta 0:08:50 lr 0.000887 time 1.8039 (2.2003) loss 2.9134 (3.8633) grad_norm 0.9921 (1.1543) [2022-01-19 15:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1020/1251] eta 0:08:27 lr 0.000887 time 1.8109 (2.1991) loss 4.0077 (3.8630) grad_norm 1.2087 (1.1541) [2022-01-19 15:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1030/1251] eta 0:08:05 lr 0.000887 time 2.0247 (2.1970) loss 4.5016 (3.8631) grad_norm 1.4035 (1.1546) [2022-01-19 15:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1040/1251] eta 0:07:43 lr 0.000887 time 1.9660 (2.1954) loss 4.5720 (3.8631) grad_norm 1.2678 (1.1539) [2022-01-19 15:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1050/1251] eta 0:07:21 lr 0.000887 time 2.5427 (2.1953) loss 4.1472 (3.8630) grad_norm 0.8966 (1.1537) [2022-01-19 15:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1060/1251] eta 0:06:59 lr 0.000887 time 2.5887 (2.1951) loss 3.4779 (3.8632) grad_norm 1.1835 (1.1528) [2022-01-19 15:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1070/1251] eta 0:06:37 lr 0.000887 time 2.1924 (2.1945) loss 4.1960 (3.8631) grad_norm 1.3226 (1.1524) [2022-01-19 15:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1080/1251] eta 0:06:15 lr 0.000887 time 2.4946 (2.1957) loss 2.4644 (3.8594) grad_norm 1.3590 (1.1528) [2022-01-19 15:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1090/1251] eta 0:05:53 lr 0.000887 time 2.1425 (2.1962) loss 3.7450 (3.8586) grad_norm 1.0965 (1.1531) [2022-01-19 15:26:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1100/1251] eta 0:05:31 lr 0.000887 time 1.9483 (2.1958) loss 3.6830 (3.8617) grad_norm 1.1250 (1.1529) [2022-01-19 15:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1110/1251] eta 0:05:09 lr 
0.000887 time 1.5879 (2.1961) loss 3.0258 (3.8621) grad_norm 1.0082 (1.1531) [2022-01-19 15:27:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1120/1251] eta 0:04:47 lr 0.000887 time 2.7835 (2.1964) loss 3.9980 (3.8653) grad_norm 1.1375 (1.1529) [2022-01-19 15:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1130/1251] eta 0:04:25 lr 0.000887 time 2.1850 (2.1955) loss 4.5677 (3.8647) grad_norm 1.1376 (1.1532) [2022-01-19 15:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1140/1251] eta 0:04:03 lr 0.000887 time 1.9708 (2.1941) loss 3.4694 (3.8653) grad_norm 0.9760 (1.1524) [2022-01-19 15:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1150/1251] eta 0:03:41 lr 0.000887 time 1.8018 (2.1935) loss 3.1573 (3.8642) grad_norm 1.0323 (1.1523) [2022-01-19 15:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1160/1251] eta 0:03:19 lr 0.000887 time 2.2450 (2.1923) loss 4.3193 (3.8654) grad_norm 1.1707 (1.1524) [2022-01-19 15:29:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1170/1251] eta 0:02:57 lr 0.000887 time 2.4779 (2.1918) loss 3.6060 (3.8634) grad_norm 0.9955 (1.1521) [2022-01-19 15:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1180/1251] eta 0:02:35 lr 0.000887 time 1.8823 (2.1914) loss 3.7025 (3.8660) grad_norm 1.2611 (1.1522) [2022-01-19 15:30:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1190/1251] eta 0:02:13 lr 0.000887 time 1.8335 (2.1923) loss 4.4629 (3.8675) grad_norm 1.1465 (1.1517) [2022-01-19 15:30:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1200/1251] eta 0:01:51 lr 0.000887 time 2.3294 (2.1926) loss 4.3303 (3.8688) grad_norm 1.1168 (1.1519) [2022-01-19 15:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1210/1251] eta 0:01:29 lr 0.000887 time 2.8626 (2.1932) loss 4.6280 (3.8666) grad_norm 1.1027 (1.1513) [2022-01-19 15:31:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1220/1251] eta 0:01:07 lr 0.000886 time 1.6081 (2.1920) loss 4.2538 (3.8678) grad_norm 1.1798 (1.1514)
[2022-01-19 15:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1230/1251] eta 0:00:46 lr 0.000886 time 1.9319 (2.1920) loss 4.2288 (3.8693) grad_norm 0.9551 (1.1516)
[2022-01-19 15:32:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1240/1251] eta 0:00:24 lr 0.000886 time 2.6004 (2.1925) loss 3.6123 (3.8702) grad_norm 1.0906 (1.1516)
[2022-01-19 15:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [65/300][1250/1251] eta 0:00:02 lr 0.000886 time 1.1689 (2.1870) loss 4.0631 (3.8698) grad_norm 0.9762 (1.1517)
[2022-01-19 15:32:17 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 65 training takes 0:45:36
[2022-01-19 15:32:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.498 (18.498) Loss 1.2459 (1.2459) Acc@1 72.070 (72.070) Acc@5 91.016 (91.016)
[2022-01-19 15:32:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.565 (3.228) Loss 1.1969 (1.2451) Acc@1 74.121 (72.088) Acc@5 92.188 (91.282)
[2022-01-19 15:33:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.895 (2.505) Loss 1.2774 (1.2526) Acc@1 69.238 (71.731) Acc@5 91.406 (91.062)
[2022-01-19 15:33:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.301 (2.270) Loss 1.2282 (1.2608) Acc@1 72.461 (71.440) Acc@5 90.820 (90.918)
[2022-01-19 15:33:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.073 (2.150) Loss 1.2407 (1.2567) Acc@1 72.363 (71.499) Acc@5 91.309 (90.932)
[2022-01-19 15:33:53 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.698 Acc@5 91.018
[2022-01-19 15:33:53 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.7%
[2022-01-19 15:33:53 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy:
71.70% [2022-01-19 15:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][0/1251] eta 7:29:59 lr 0.000886 time 21.5823 (21.5823) loss 3.1934 (3.1934) grad_norm 1.0109 (1.0109) [2022-01-19 15:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][10/1251] eta 1:21:36 lr 0.000886 time 1.6363 (3.9460) loss 3.4308 (3.9958) grad_norm 1.1064 (1.1690) [2022-01-19 15:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][20/1251] eta 1:04:12 lr 0.000886 time 1.2326 (3.1296) loss 3.1981 (3.9072) grad_norm 1.1172 (1.1596) [2022-01-19 15:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][30/1251] eta 0:57:37 lr 0.000886 time 1.5608 (2.8319) loss 4.2226 (3.8925) grad_norm 1.1286 (1.1628) [2022-01-19 15:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][40/1251] eta 0:54:22 lr 0.000886 time 3.9947 (2.6941) loss 4.4350 (3.9047) grad_norm 1.1514 (1.1672) [2022-01-19 15:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][50/1251] eta 0:52:02 lr 0.000886 time 1.7919 (2.6001) loss 4.3386 (3.9592) grad_norm 1.5237 (1.1725) [2022-01-19 15:36:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][60/1251] eta 0:50:08 lr 0.000886 time 1.7874 (2.5260) loss 3.7696 (3.9238) grad_norm 1.1111 (1.1740) [2022-01-19 15:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][70/1251] eta 0:49:01 lr 0.000886 time 2.3319 (2.4904) loss 4.0230 (3.9007) grad_norm 1.0612 (1.1878) [2022-01-19 15:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][80/1251] eta 0:48:24 lr 0.000886 time 3.4300 (2.4808) loss 4.5897 (3.8937) grad_norm 1.2601 (1.1840) [2022-01-19 15:37:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][90/1251] eta 0:47:43 lr 0.000886 time 1.6212 (2.4665) loss 3.1582 (3.8931) grad_norm 1.1110 (1.1810) [2022-01-19 15:37:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][100/1251] eta 0:46:48 lr 
0.000886 time 2.1987 (2.4400) loss 3.6597 (3.8632) grad_norm 1.1553 (1.1750) [2022-01-19 15:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][110/1251] eta 0:45:42 lr 0.000886 time 1.8720 (2.4033) loss 4.1394 (3.8645) grad_norm 1.1270 (1.1720) [2022-01-19 15:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][120/1251] eta 0:44:53 lr 0.000886 time 3.1425 (2.3814) loss 2.8924 (3.8627) grad_norm 1.3860 (1.1797) [2022-01-19 15:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][130/1251] eta 0:44:12 lr 0.000886 time 1.8761 (2.3663) loss 4.6973 (3.8874) grad_norm 1.1874 (1.1816) [2022-01-19 15:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][140/1251] eta 0:43:41 lr 0.000886 time 2.2591 (2.3596) loss 4.2027 (3.8907) grad_norm 1.2899 (1.1772) [2022-01-19 15:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][150/1251] eta 0:43:06 lr 0.000886 time 2.2690 (2.3491) loss 4.3410 (3.8946) grad_norm 1.1185 (1.1763) [2022-01-19 15:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][160/1251] eta 0:42:29 lr 0.000886 time 2.2105 (2.3369) loss 2.9193 (3.8872) grad_norm 1.2261 (1.1834) [2022-01-19 15:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][170/1251] eta 0:41:54 lr 0.000886 time 3.1834 (2.3263) loss 2.7139 (3.8722) grad_norm 1.2182 (1.1781) [2022-01-19 15:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][180/1251] eta 0:41:24 lr 0.000886 time 1.9479 (2.3196) loss 2.5185 (3.8543) grad_norm 1.1577 (1.1763) [2022-01-19 15:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][190/1251] eta 0:40:47 lr 0.000886 time 2.0399 (2.3069) loss 3.4373 (3.8653) grad_norm 1.0521 (1.1744) [2022-01-19 15:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][200/1251] eta 0:40:13 lr 0.000886 time 2.5870 (2.2960) loss 2.9519 (3.8729) grad_norm 1.1222 (1.1733) [2022-01-19 15:41:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][210/1251] eta 0:39:39 lr 0.000886 time 2.2399 (2.2861) loss 4.3578 (3.8756) grad_norm 1.2802 (1.1770) [2022-01-19 15:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][220/1251] eta 0:39:10 lr 0.000886 time 2.6030 (2.2801) loss 2.7052 (3.8702) grad_norm 1.1694 (1.1771) [2022-01-19 15:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][230/1251] eta 0:38:41 lr 0.000886 time 2.2053 (2.2741) loss 2.7583 (3.8677) grad_norm 1.2324 (1.1740) [2022-01-19 15:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][240/1251] eta 0:38:15 lr 0.000886 time 2.4632 (2.2709) loss 4.1233 (3.8668) grad_norm 1.0653 (1.1734) [2022-01-19 15:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][250/1251] eta 0:37:51 lr 0.000886 time 1.8600 (2.2696) loss 3.2749 (3.8495) grad_norm 1.2451 (1.1732) [2022-01-19 15:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][260/1251] eta 0:37:32 lr 0.000886 time 2.4639 (2.2730) loss 3.9827 (3.8528) grad_norm 1.0750 (1.1721) [2022-01-19 15:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][270/1251] eta 0:37:06 lr 0.000886 time 1.8844 (2.2699) loss 3.2477 (3.8567) grad_norm 0.9914 (1.1729) [2022-01-19 15:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][280/1251] eta 0:36:39 lr 0.000886 time 2.2607 (2.2650) loss 4.2310 (3.8595) grad_norm 1.0911 (1.1713) [2022-01-19 15:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][290/1251] eta 0:36:09 lr 0.000886 time 1.9905 (2.2573) loss 3.7419 (3.8668) grad_norm 1.2159 (1.1714) [2022-01-19 15:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][300/1251] eta 0:35:40 lr 0.000886 time 1.9094 (2.2511) loss 3.9051 (3.8646) grad_norm 1.1034 (1.1693) [2022-01-19 15:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][310/1251] eta 0:35:15 lr 0.000886 time 
2.2169 (2.2477) loss 4.1636 (3.8630) grad_norm 1.0266 (1.1703) [2022-01-19 15:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][320/1251] eta 0:34:51 lr 0.000886 time 1.7662 (2.2461) loss 3.5733 (3.8600) grad_norm 1.1933 (1.1684) [2022-01-19 15:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][330/1251] eta 0:34:29 lr 0.000886 time 2.8120 (2.2466) loss 4.5618 (3.8640) grad_norm 1.3347 (1.1687) [2022-01-19 15:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][340/1251] eta 0:34:01 lr 0.000886 time 1.6735 (2.2413) loss 4.6120 (3.8656) grad_norm 1.1633 (1.1676) [2022-01-19 15:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][350/1251] eta 0:33:40 lr 0.000885 time 2.3616 (2.2428) loss 3.0928 (3.8583) grad_norm 1.2004 (1.1662) [2022-01-19 15:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][360/1251] eta 0:33:20 lr 0.000885 time 2.1424 (2.2449) loss 4.6421 (3.8654) grad_norm 1.1915 (1.1658) [2022-01-19 15:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][370/1251] eta 0:32:58 lr 0.000885 time 2.0893 (2.2461) loss 4.1422 (3.8708) grad_norm 1.1623 (1.1672) [2022-01-19 15:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][380/1251] eta 0:32:32 lr 0.000885 time 2.0481 (2.2420) loss 4.3626 (3.8717) grad_norm 1.0808 (1.1673) [2022-01-19 15:48:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][390/1251] eta 0:32:05 lr 0.000885 time 1.9762 (2.2359) loss 4.5557 (3.8697) grad_norm 1.1251 (1.1694) [2022-01-19 15:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][400/1251] eta 0:31:38 lr 0.000885 time 2.2137 (2.2306) loss 4.3337 (3.8731) grad_norm 1.0241 (1.1685) [2022-01-19 15:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][410/1251] eta 0:31:13 lr 0.000885 time 1.8784 (2.2282) loss 3.6205 (3.8661) grad_norm 1.1644 (1.1694) [2022-01-19 15:49:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][420/1251] eta 0:30:53 lr 0.000885 time 2.5334 (2.2301) loss 3.4101 (3.8641) grad_norm 1.0795 (1.1700) [2022-01-19 15:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][430/1251] eta 0:30:29 lr 0.000885 time 2.9815 (2.2288) loss 3.5937 (3.8623) grad_norm 1.1635 (1.1710) [2022-01-19 15:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][440/1251] eta 0:30:06 lr 0.000885 time 2.1502 (2.2279) loss 2.8225 (3.8631) grad_norm 1.0078 (1.1686) [2022-01-19 15:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][450/1251] eta 0:29:45 lr 0.000885 time 1.8313 (2.2287) loss 3.2700 (3.8602) grad_norm 1.1982 (1.1684) [2022-01-19 15:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][460/1251] eta 0:29:23 lr 0.000885 time 3.1082 (2.2289) loss 4.4415 (3.8588) grad_norm 1.2049 (1.1693) [2022-01-19 15:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][470/1251] eta 0:29:00 lr 0.000885 time 2.7703 (2.2284) loss 2.9594 (3.8562) grad_norm 0.9874 (1.1695) [2022-01-19 15:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][480/1251] eta 0:28:36 lr 0.000885 time 1.8051 (2.2259) loss 3.8805 (3.8564) grad_norm 1.3662 (1.1684) [2022-01-19 15:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][490/1251] eta 0:28:11 lr 0.000885 time 1.9273 (2.2227) loss 3.8401 (3.8516) grad_norm 1.2254 (1.1687) [2022-01-19 15:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][500/1251] eta 0:27:49 lr 0.000885 time 3.4457 (2.2234) loss 3.9305 (3.8507) grad_norm 1.0239 (1.1693) [2022-01-19 15:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][510/1251] eta 0:27:25 lr 0.000885 time 2.5762 (2.2212) loss 3.8874 (3.8540) grad_norm 1.0929 (1.1676) [2022-01-19 15:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][520/1251] eta 0:27:01 lr 0.000885 time 
1.8448 (2.2189) loss 3.5441 (3.8562) grad_norm 1.3203 (1.1674) [2022-01-19 15:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][530/1251] eta 0:26:38 lr 0.000885 time 1.7793 (2.2173) loss 3.6522 (3.8581) grad_norm 1.2157 (1.1674) [2022-01-19 15:53:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][540/1251] eta 0:26:15 lr 0.000885 time 2.7954 (2.2165) loss 2.6198 (3.8529) grad_norm 1.1234 (1.1675) [2022-01-19 15:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][550/1251] eta 0:25:53 lr 0.000885 time 2.4760 (2.2156) loss 4.2040 (3.8564) grad_norm 1.0059 (1.1660) [2022-01-19 15:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][560/1251] eta 0:25:30 lr 0.000885 time 1.8671 (2.2151) loss 4.0789 (3.8555) grad_norm 1.4043 (1.1659) [2022-01-19 15:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][570/1251] eta 0:25:10 lr 0.000885 time 2.2188 (2.2178) loss 3.1529 (3.8586) grad_norm 1.7887 (1.1688) [2022-01-19 15:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][580/1251] eta 0:24:47 lr 0.000885 time 2.2600 (2.2170) loss 3.2909 (3.8583) grad_norm 1.0155 (1.1691) [2022-01-19 15:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][590/1251] eta 0:24:24 lr 0.000885 time 2.1786 (2.2161) loss 2.8965 (3.8583) grad_norm 1.3956 (1.1685) [2022-01-19 15:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][600/1251] eta 0:24:01 lr 0.000885 time 1.8694 (2.2148) loss 3.7938 (3.8602) grad_norm 1.1275 (1.1705) [2022-01-19 15:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][610/1251] eta 0:23:40 lr 0.000885 time 2.4666 (2.2159) loss 2.8416 (3.8596) grad_norm 1.0850 (1.1706) [2022-01-19 15:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][620/1251] eta 0:23:17 lr 0.000885 time 2.2737 (2.2144) loss 3.8619 (3.8573) grad_norm 0.9456 (1.1696) [2022-01-19 15:57:09 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][630/1251] eta 0:22:53 lr 0.000885 time 2.1728 (2.2117) loss 2.8024 (3.8579) grad_norm 1.1784 (1.1694) [2022-01-19 15:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][640/1251] eta 0:22:30 lr 0.000885 time 2.3552 (2.2111) loss 4.1880 (3.8606) grad_norm 1.2475 (1.1697) [2022-01-19 15:57:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][650/1251] eta 0:22:09 lr 0.000885 time 2.4919 (2.2123) loss 4.3073 (3.8614) grad_norm 1.3444 (1.1708) [2022-01-19 15:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][660/1251] eta 0:21:49 lr 0.000885 time 3.1818 (2.2158) loss 3.5239 (3.8635) grad_norm 1.1549 (1.1718) [2022-01-19 15:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][670/1251] eta 0:21:27 lr 0.000885 time 2.3567 (2.2153) loss 2.7831 (3.8628) grad_norm 1.0994 (1.1713) [2022-01-19 15:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][680/1251] eta 0:21:03 lr 0.000885 time 1.8551 (2.2134) loss 3.3847 (3.8601) grad_norm 1.4622 (1.1717) [2022-01-19 15:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][690/1251] eta 0:20:40 lr 0.000885 time 1.9825 (2.2107) loss 4.3959 (3.8587) grad_norm 1.1296 (1.1717) [2022-01-19 15:59:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][700/1251] eta 0:20:15 lr 0.000885 time 1.9139 (2.2064) loss 3.1662 (3.8625) grad_norm 1.7654 (1.1725) [2022-01-19 16:00:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][710/1251] eta 0:19:52 lr 0.000885 time 2.1139 (2.2045) loss 3.7636 (3.8630) grad_norm 1.0704 (1.1734) [2022-01-19 16:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][720/1251] eta 0:19:30 lr 0.000884 time 1.9398 (2.2042) loss 4.2984 (3.8644) grad_norm 1.2036 (1.1732) [2022-01-19 16:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][730/1251] eta 0:19:09 lr 0.000884 time 
2.7645 (2.2067) loss 3.8195 (3.8619) grad_norm 0.9923 (1.1722) [2022-01-19 16:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][740/1251] eta 0:18:47 lr 0.000884 time 2.0808 (2.2072) loss 4.1542 (3.8607) grad_norm 1.1884 (1.1724) [2022-01-19 16:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][750/1251] eta 0:18:26 lr 0.000884 time 2.2687 (2.2078) loss 3.5717 (3.8628) grad_norm 1.0527 (1.1725) [2022-01-19 16:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][760/1251] eta 0:18:04 lr 0.000884 time 1.9196 (2.2089) loss 2.8571 (3.8615) grad_norm 1.0362 (1.1711) [2022-01-19 16:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][770/1251] eta 0:17:42 lr 0.000884 time 2.0983 (2.2087) loss 4.4271 (3.8602) grad_norm 1.1613 (1.1702) [2022-01-19 16:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][780/1251] eta 0:17:19 lr 0.000884 time 1.8869 (2.2069) loss 3.9182 (3.8606) grad_norm 1.2617 (1.1707) [2022-01-19 16:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][790/1251] eta 0:16:56 lr 0.000884 time 2.0015 (2.2056) loss 3.6359 (3.8607) grad_norm 1.1967 (1.1697) [2022-01-19 16:03:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][800/1251] eta 0:16:33 lr 0.000884 time 1.8885 (2.2034) loss 3.4269 (3.8583) grad_norm 1.0420 (1.1687) [2022-01-19 16:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][810/1251] eta 0:16:11 lr 0.000884 time 2.2823 (2.2032) loss 4.0779 (3.8572) grad_norm 1.1040 (1.1680) [2022-01-19 16:04:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][820/1251] eta 0:15:50 lr 0.000884 time 2.5150 (2.2042) loss 4.0807 (3.8555) grad_norm 1.0596 (1.1694) [2022-01-19 16:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][830/1251] eta 0:15:27 lr 0.000884 time 1.7338 (2.2036) loss 3.6460 (3.8557) grad_norm 1.0687 (1.1692) [2022-01-19 16:04:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][840/1251] eta 0:15:05 lr 0.000884 time 1.9004 (2.2032) loss 4.1051 (3.8547) grad_norm 0.9822 (1.1686) [2022-01-19 16:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][850/1251] eta 0:14:43 lr 0.000884 time 2.0093 (2.2024) loss 4.1044 (3.8557) grad_norm 1.2709 (1.1684) [2022-01-19 16:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][860/1251] eta 0:14:21 lr 0.000884 time 3.0782 (2.2041) loss 2.6910 (3.8550) grad_norm 1.0747 (1.1676) [2022-01-19 16:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][870/1251] eta 0:13:59 lr 0.000884 time 1.4075 (2.2043) loss 3.1904 (3.8516) grad_norm 1.0768 (1.1666) [2022-01-19 16:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][880/1251] eta 0:13:38 lr 0.000884 time 1.9369 (2.2072) loss 4.4477 (3.8505) grad_norm 1.0317 (1.1662) [2022-01-19 16:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][890/1251] eta 0:13:17 lr 0.000884 time 2.6029 (2.2083) loss 4.7768 (3.8538) grad_norm 1.1679 (1.1666) [2022-01-19 16:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][900/1251] eta 0:12:54 lr 0.000884 time 2.1709 (2.2067) loss 3.0328 (3.8564) grad_norm 1.0306 (1.1667) [2022-01-19 16:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][910/1251] eta 0:12:31 lr 0.000884 time 1.6925 (2.2032) loss 4.2919 (3.8581) grad_norm 1.1476 (1.1667) [2022-01-19 16:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][920/1251] eta 0:12:08 lr 0.000884 time 1.7762 (2.1999) loss 4.1095 (3.8599) grad_norm 0.9849 (1.1669) [2022-01-19 16:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][930/1251] eta 0:11:45 lr 0.000884 time 1.6623 (2.1981) loss 3.9349 (3.8584) grad_norm 1.1266 (1.1654) [2022-01-19 16:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][940/1251] eta 0:11:24 lr 0.000884 time 
2.2451 (2.2011) loss 3.2564 (3.8595) grad_norm 1.1748 (1.1666) [2022-01-19 16:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][950/1251] eta 0:11:02 lr 0.000884 time 2.4704 (2.2014) loss 3.1715 (3.8570) grad_norm 1.2578 (1.1667) [2022-01-19 16:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][960/1251] eta 0:10:41 lr 0.000884 time 2.8778 (2.2029) loss 3.5912 (3.8554) grad_norm 1.0685 (1.1662) [2022-01-19 16:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][970/1251] eta 0:10:19 lr 0.000884 time 1.7332 (2.2044) loss 4.1835 (3.8563) grad_norm 1.2273 (1.1661) [2022-01-19 16:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][980/1251] eta 0:09:57 lr 0.000884 time 2.1754 (2.2051) loss 4.1458 (3.8553) grad_norm 0.9711 (1.1661) [2022-01-19 16:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][990/1251] eta 0:09:35 lr 0.000884 time 1.9280 (2.2045) loss 2.7978 (3.8537) grad_norm 0.9981 (1.1656) [2022-01-19 16:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1000/1251] eta 0:09:12 lr 0.000884 time 1.9068 (2.2023) loss 2.9073 (3.8564) grad_norm 1.1798 (1.1656) [2022-01-19 16:10:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1010/1251] eta 0:08:50 lr 0.000884 time 1.8756 (2.2011) loss 4.4812 (3.8575) grad_norm 1.2791 (1.1660) [2022-01-19 16:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1020/1251] eta 0:08:28 lr 0.000884 time 2.1705 (2.2001) loss 3.5441 (3.8571) grad_norm 1.2314 (1.1667) [2022-01-19 16:11:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1030/1251] eta 0:08:06 lr 0.000884 time 2.3811 (2.1998) loss 4.2870 (3.8598) grad_norm 1.0713 (1.1660) [2022-01-19 16:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1040/1251] eta 0:07:44 lr 0.000884 time 3.0754 (2.2004) loss 4.6060 (3.8599) grad_norm 1.5542 (1.1666) [2022-01-19 16:12:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1050/1251] eta 0:07:22 lr 0.000884 time 2.3363 (2.2011) loss 3.7373 (3.8604) grad_norm 1.1368 (1.1660) [2022-01-19 16:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1060/1251] eta 0:07:00 lr 0.000884 time 2.5412 (2.2014) loss 3.8289 (3.8628) grad_norm 1.2932 (1.1661) [2022-01-19 16:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1070/1251] eta 0:06:38 lr 0.000884 time 1.9008 (2.2014) loss 3.7550 (3.8615) grad_norm 1.2111 (1.1654) [2022-01-19 16:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1080/1251] eta 0:06:16 lr 0.000884 time 1.8173 (2.2021) loss 3.7829 (3.8603) grad_norm 1.1210 (1.1652) [2022-01-19 16:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1090/1251] eta 0:05:54 lr 0.000884 time 3.0848 (2.2027) loss 4.1599 (3.8598) grad_norm 1.1772 (1.1646) [2022-01-19 16:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1100/1251] eta 0:05:32 lr 0.000883 time 2.3141 (2.2017) loss 4.1318 (3.8618) grad_norm 0.9927 (1.1642) [2022-01-19 16:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1110/1251] eta 0:05:10 lr 0.000883 time 2.0806 (2.1996) loss 4.3142 (3.8642) grad_norm 1.0582 (1.1638) [2022-01-19 16:14:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1120/1251] eta 0:04:47 lr 0.000883 time 1.8707 (2.1983) loss 3.9036 (3.8647) grad_norm 1.1509 (1.1638) [2022-01-19 16:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1130/1251] eta 0:04:26 lr 0.000883 time 3.0125 (2.1991) loss 4.2947 (3.8653) grad_norm 1.1137 (1.1637) [2022-01-19 16:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1140/1251] eta 0:04:04 lr 0.000883 time 2.1079 (2.1993) loss 2.6393 (3.8637) grad_norm 1.3155 (1.1642) [2022-01-19 16:16:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1150/1251] eta 0:03:42 lr 
0.000883 time 2.8646 (2.2003) loss 4.1468 (3.8620) grad_norm 1.1434 (1.1642) [2022-01-19 16:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1160/1251] eta 0:03:20 lr 0.000883 time 2.3216 (2.2003) loss 3.1678 (3.8603) grad_norm 0.9540 (1.1638) [2022-01-19 16:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1170/1251] eta 0:02:58 lr 0.000883 time 1.8786 (2.2004) loss 3.7507 (3.8593) grad_norm 1.0693 (1.1630) [2022-01-19 16:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1180/1251] eta 0:02:36 lr 0.000883 time 1.9423 (2.1988) loss 3.7583 (3.8604) grad_norm 1.0866 (1.1629) [2022-01-19 16:17:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1190/1251] eta 0:02:14 lr 0.000883 time 1.8514 (2.1980) loss 4.6458 (3.8615) grad_norm 1.0907 (1.1635) [2022-01-19 16:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1200/1251] eta 0:01:52 lr 0.000883 time 2.4728 (2.1977) loss 3.9810 (3.8610) grad_norm 1.0773 (1.1635) [2022-01-19 16:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1210/1251] eta 0:01:30 lr 0.000883 time 2.2274 (2.1983) loss 3.7557 (3.8615) grad_norm 1.2673 (1.1635) [2022-01-19 16:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1220/1251] eta 0:01:08 lr 0.000883 time 2.0691 (2.1980) loss 4.3001 (3.8618) grad_norm 1.2265 (1.1635) [2022-01-19 16:18:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1230/1251] eta 0:00:46 lr 0.000883 time 2.1316 (2.1984) loss 4.1334 (3.8612) grad_norm 1.1072 (1.1642) [2022-01-19 16:19:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1240/1251] eta 0:00:24 lr 0.000883 time 2.1022 (2.1975) loss 2.9529 (3.8621) grad_norm 1.1941 (1.1644) [2022-01-19 16:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [66/300][1250/1251] eta 0:00:02 lr 0.000883 time 1.1780 (2.1927) loss 4.0723 (3.8640) grad_norm 1.0268 (1.1634) [2022-01-19 16:19:36 
swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 66 training takes 0:45:43 [2022-01-19 16:19:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.783 (18.783) Loss 1.2677 (1.2677) Acc@1 71.875 (71.875) Acc@5 90.820 (90.820) [2022-01-19 16:20:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.933 (3.434) Loss 1.2233 (1.2413) Acc@1 72.949 (71.600) Acc@5 91.309 (90.723) [2022-01-19 16:20:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.250 (2.578) Loss 1.1833 (1.2331) Acc@1 74.121 (71.735) Acc@5 91.992 (90.946) [2022-01-19 16:20:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.856 (2.362) Loss 1.2428 (1.2350) Acc@1 72.266 (71.554) Acc@5 90.918 (91.044) [2022-01-19 16:21:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.299 (2.189) Loss 1.2897 (1.2338) Acc@1 70.410 (71.601) Acc@5 91.211 (91.063) [2022-01-19 16:21:13 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.552 Acc@5 90.992 [2022-01-19 16:21:13 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.6% [2022-01-19 16:21:13 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 71.70% [2022-01-19 16:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][0/1251] eta 7:21:17 lr 0.000883 time 21.1652 (21.1652) loss 3.7135 (3.7135) grad_norm 1.1821 (1.1821) [2022-01-19 16:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][10/1251] eta 1:22:52 lr 0.000883 time 3.0057 (4.0066) loss 3.4832 (3.8675) grad_norm 1.3010 (1.1895) [2022-01-19 16:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][20/1251] eta 1:05:56 lr 0.000883 time 1.5393 (3.2139) loss 4.1145 (3.7493) grad_norm 1.3016 (1.1983) [2022-01-19 16:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][30/1251] eta 0:57:35 lr 0.000883 time 1.5180 (2.8298) loss 4.1800 (3.8311) grad_norm 1.3615 (1.2182) 
[2022-01-19 16:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][40/1251] eta 0:56:33 lr 0.000883 time 8.2855 (2.8023) loss 3.7601 (3.7854) grad_norm 1.1083 (1.2084) [2022-01-19 16:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][50/1251] eta 0:54:28 lr 0.000883 time 2.2790 (2.7213) loss 4.4393 (3.7785) grad_norm 1.0496 (1.2009) [2022-01-19 16:23:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][60/1251] eta 0:52:00 lr 0.000883 time 1.9892 (2.6202) loss 4.8535 (3.8301) grad_norm 1.0447 (1.1885) [2022-01-19 16:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][70/1251] eta 0:49:59 lr 0.000883 time 1.5949 (2.5395) loss 4.1173 (3.8418) grad_norm 1.1831 (1.1852) [2022-01-19 16:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][80/1251] eta 0:48:34 lr 0.000883 time 3.6211 (2.4888) loss 3.5231 (3.8356) grad_norm 1.2481 (1.1740) [2022-01-19 16:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][90/1251] eta 0:47:28 lr 0.000883 time 2.2288 (2.4535) loss 4.0404 (3.8132) grad_norm 1.1127 (1.1713) [2022-01-19 16:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][100/1251] eta 0:46:46 lr 0.000883 time 2.4994 (2.4379) loss 3.0710 (3.7792) grad_norm 1.4053 (1.1660) [2022-01-19 16:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][110/1251] eta 0:45:42 lr 0.000883 time 1.6457 (2.4032) loss 4.1153 (3.7905) grad_norm 1.0686 (1.1625) [2022-01-19 16:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][120/1251] eta 0:45:15 lr 0.000883 time 3.8674 (2.4008) loss 4.4222 (3.8088) grad_norm 1.2137 (1.1734) [2022-01-19 16:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][130/1251] eta 0:44:38 lr 0.000883 time 2.2775 (2.3890) loss 2.8112 (3.7938) grad_norm 1.2270 (1.1755) [2022-01-19 16:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][140/1251] eta 0:43:49 lr 
0.000883 time 2.5775 (2.3669) loss 3.8180 (3.7793) grad_norm 1.0156 (1.1709) [2022-01-19 16:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][150/1251] eta 0:43:09 lr 0.000883 time 1.9578 (2.3521) loss 3.0932 (3.7824) grad_norm 1.0761 (1.1664) [2022-01-19 16:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][160/1251] eta 0:42:32 lr 0.000883 time 2.2109 (2.3394) loss 3.8401 (3.7788) grad_norm 1.1334 (1.1644) [2022-01-19 16:27:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][170/1251] eta 0:42:17 lr 0.000883 time 1.9688 (2.3471) loss 4.1864 (3.7734) grad_norm 1.5997 (1.1621) [2022-01-19 16:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][180/1251] eta 0:41:41 lr 0.000883 time 2.0341 (2.3352) loss 3.8549 (3.7903) grad_norm 1.0256 (1.1579) [2022-01-19 16:28:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][190/1251] eta 0:41:03 lr 0.000883 time 1.6255 (2.3215) loss 4.1941 (3.7874) grad_norm 1.0620 (1.1555) [2022-01-19 16:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][200/1251] eta 0:40:31 lr 0.000883 time 1.9432 (2.3135) loss 4.2612 (3.8069) grad_norm 1.0807 (1.1527) [2022-01-19 16:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][210/1251] eta 0:40:04 lr 0.000883 time 2.2333 (2.3101) loss 2.7703 (3.7988) grad_norm 1.0917 (1.1538) [2022-01-19 16:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][220/1251] eta 0:39:30 lr 0.000882 time 1.8699 (2.2990) loss 3.9497 (3.8020) grad_norm 1.1366 (1.1533) [2022-01-19 16:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][230/1251] eta 0:38:58 lr 0.000882 time 2.0229 (2.2906) loss 3.5135 (3.8101) grad_norm 1.1188 (1.1526) [2022-01-19 16:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][240/1251] eta 0:38:28 lr 0.000882 time 2.2552 (2.2837) loss 3.8691 (3.8093) grad_norm 1.1148 (1.1543) [2022-01-19 16:30:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][250/1251] eta 0:38:05 lr 0.000882 time 2.2162 (2.2836) loss 4.1421 (3.8006) grad_norm 1.2373 (1.1526) [2022-01-19 16:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][260/1251] eta 0:37:38 lr 0.000882 time 2.4994 (2.2793) loss 3.5261 (3.7989) grad_norm 1.1084 (1.1527) [2022-01-19 16:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][270/1251] eta 0:37:09 lr 0.000882 time 1.8591 (2.2722) loss 4.3860 (3.8077) grad_norm 1.1826 (1.1552) [2022-01-19 16:31:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][280/1251] eta 0:36:49 lr 0.000882 time 2.5343 (2.2753) loss 4.1421 (3.8220) grad_norm 1.4207 (1.1568) [2022-01-19 16:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][290/1251] eta 0:36:23 lr 0.000882 time 1.9609 (2.2722) loss 3.7107 (3.8208) grad_norm 0.9676 (1.1556) [2022-01-19 16:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][300/1251] eta 0:35:53 lr 0.000882 time 1.8658 (2.2640) loss 4.2279 (3.8130) grad_norm 0.9652 (1.1536) [2022-01-19 16:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][310/1251] eta 0:35:23 lr 0.000882 time 2.5280 (2.2563) loss 3.8002 (3.8127) grad_norm 1.3138 (1.1524) [2022-01-19 16:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][320/1251] eta 0:34:52 lr 0.000882 time 1.9105 (2.2475) loss 4.1509 (3.8135) grad_norm 1.1192 (1.1500) [2022-01-19 16:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][330/1251] eta 0:34:22 lr 0.000882 time 2.0411 (2.2397) loss 3.7996 (3.8179) grad_norm 1.2186 (1.1493) [2022-01-19 16:33:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][340/1251] eta 0:33:57 lr 0.000882 time 2.1991 (2.2369) loss 4.5904 (3.8221) grad_norm 1.0336 (1.1506) [2022-01-19 16:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][350/1251] eta 0:33:36 lr 0.000882 time 
2.5601 (2.2380) loss 4.3872 (3.8273) grad_norm 1.2207 (1.1510) [2022-01-19 16:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][360/1251] eta 0:33:12 lr 0.000882 time 1.8836 (2.2359) loss 3.9536 (3.8311) grad_norm 1.3555 (1.1522) [2022-01-19 16:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][370/1251] eta 0:32:51 lr 0.000882 time 1.8917 (2.2383) loss 3.2129 (3.8302) grad_norm 1.3021 (1.1522) [2022-01-19 16:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][380/1251] eta 0:32:27 lr 0.000882 time 2.3416 (2.2364) loss 3.6572 (3.8284) grad_norm 1.1293 (1.1521) [2022-01-19 16:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][390/1251] eta 0:32:04 lr 0.000882 time 2.3603 (2.2357) loss 4.3642 (3.8331) grad_norm 1.1800 (1.1513) [2022-01-19 16:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][400/1251] eta 0:31:40 lr 0.000882 time 2.0800 (2.2329) loss 3.4221 (3.8293) grad_norm 1.1716 (1.1508) [2022-01-19 16:36:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][410/1251] eta 0:31:15 lr 0.000882 time 1.9291 (2.2298) loss 4.1298 (3.8336) grad_norm 1.1752 (1.1496) [2022-01-19 16:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][420/1251] eta 0:30:51 lr 0.000882 time 1.9558 (2.2278) loss 4.5193 (3.8379) grad_norm 0.9727 (1.1499) [2022-01-19 16:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][430/1251] eta 0:30:31 lr 0.000882 time 3.1228 (2.2304) loss 2.7893 (3.8335) grad_norm 1.1749 (1.1505) [2022-01-19 16:37:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][440/1251] eta 0:30:09 lr 0.000882 time 2.5034 (2.2306) loss 4.2056 (3.8331) grad_norm 1.4460 (1.1525) [2022-01-19 16:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][450/1251] eta 0:29:50 lr 0.000882 time 1.6784 (2.2356) loss 4.2833 (3.8337) grad_norm 1.1324 (1.1533) [2022-01-19 16:38:24 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][460/1251] eta 0:29:28 lr 0.000882 time 2.1724 (2.2355) loss 3.7569 (3.8328) grad_norm 1.2330 (1.1533) [2022-01-19 16:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][470/1251] eta 0:29:05 lr 0.000882 time 2.9278 (2.2345) loss 3.3248 (3.8353) grad_norm 1.0644 (1.1532) [2022-01-19 16:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][480/1251] eta 0:28:41 lr 0.000882 time 1.8348 (2.2324) loss 4.0465 (3.8336) grad_norm 1.1130 (1.1563) [2022-01-19 16:39:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][490/1251] eta 0:28:16 lr 0.000882 time 1.9006 (2.2297) loss 3.6540 (3.8367) grad_norm 1.1780 (1.1563) [2022-01-19 16:39:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][500/1251] eta 0:27:51 lr 0.000882 time 1.8750 (2.2260) loss 3.8654 (3.8377) grad_norm 1.0432 (1.1569) [2022-01-19 16:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][510/1251] eta 0:27:28 lr 0.000882 time 2.8423 (2.2253) loss 4.2750 (3.8422) grad_norm 1.0449 (1.1584) [2022-01-19 16:40:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][520/1251] eta 0:27:05 lr 0.000882 time 2.2161 (2.2238) loss 3.9885 (3.8460) grad_norm 1.0587 (1.1590) [2022-01-19 16:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][530/1251] eta 0:26:44 lr 0.000882 time 3.0805 (2.2252) loss 4.2621 (3.8449) grad_norm 1.2255 (1.1611) [2022-01-19 16:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][540/1251] eta 0:26:23 lr 0.000882 time 2.7197 (2.2268) loss 3.6834 (3.8450) grad_norm 1.2847 (1.1625) [2022-01-19 16:41:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][550/1251] eta 0:26:01 lr 0.000882 time 2.6779 (2.2270) loss 3.9762 (3.8490) grad_norm 1.0575 (1.1627) [2022-01-19 16:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][560/1251] eta 0:25:37 lr 0.000882 time 
2.1592 (2.2247) loss 2.6899 (3.8438) grad_norm 1.1700 (1.1615) [2022-01-19 16:42:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][570/1251] eta 0:25:14 lr 0.000882 time 3.1200 (2.2233) loss 4.1880 (3.8455) grad_norm 1.1810 (1.1614) [2022-01-19 16:42:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][580/1251] eta 0:24:49 lr 0.000882 time 2.0990 (2.2203) loss 3.9582 (3.8446) grad_norm 1.1711 (1.1609) [2022-01-19 16:43:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][590/1251] eta 0:24:27 lr 0.000881 time 2.2811 (2.2204) loss 4.3125 (3.8436) grad_norm 1.4003 (1.1618) [2022-01-19 16:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][600/1251] eta 0:24:05 lr 0.000881 time 2.2612 (2.2204) loss 4.3183 (3.8473) grad_norm 1.1424 (1.1631) [2022-01-19 16:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][610/1251] eta 0:23:45 lr 0.000881 time 3.8148 (2.2240) loss 3.5767 (3.8458) grad_norm 1.1864 (1.1643) [2022-01-19 16:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][620/1251] eta 0:23:21 lr 0.000881 time 1.8816 (2.2218) loss 3.8582 (3.8414) grad_norm 1.4644 (1.1652) [2022-01-19 16:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][630/1251] eta 0:22:58 lr 0.000881 time 1.9128 (2.2198) loss 4.2513 (3.8401) grad_norm 1.0574 (1.1642) [2022-01-19 16:44:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][640/1251] eta 0:22:35 lr 0.000881 time 1.8060 (2.2189) loss 3.3234 (3.8406) grad_norm 1.0718 (1.1630) [2022-01-19 16:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][650/1251] eta 0:22:14 lr 0.000881 time 3.3206 (2.2199) loss 2.6438 (3.8426) grad_norm 1.0705 (1.1617) [2022-01-19 16:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][660/1251] eta 0:21:51 lr 0.000881 time 1.9618 (2.2185) loss 4.1220 (3.8456) grad_norm 1.1455 (1.1612) [2022-01-19 16:46:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][670/1251] eta 0:21:27 lr 0.000881 time 1.9578 (2.2153) loss 3.7269 (3.8442) grad_norm 1.2443 (1.1618) [2022-01-19 16:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][680/1251] eta 0:21:03 lr 0.000881 time 2.4053 (2.2128) loss 4.7349 (3.8489) grad_norm 0.8970 (1.1610) [2022-01-19 16:46:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][690/1251] eta 0:20:41 lr 0.000881 time 1.9322 (2.2126) loss 4.0489 (3.8512) grad_norm 1.0486 (1.1599) [2022-01-19 16:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][700/1251] eta 0:20:21 lr 0.000881 time 2.7502 (2.2167) loss 3.7348 (3.8460) grad_norm 1.4074 (1.1600) [2022-01-19 16:47:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][710/1251] eta 0:19:58 lr 0.000881 time 1.5082 (2.2151) loss 4.4541 (3.8482) grad_norm 1.1414 (1.1602) [2022-01-19 16:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][720/1251] eta 0:19:35 lr 0.000881 time 2.1125 (2.2143) loss 3.9718 (3.8458) grad_norm 1.2242 (1.1613) [2022-01-19 16:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][730/1251] eta 0:19:13 lr 0.000881 time 2.0994 (2.2148) loss 4.1893 (3.8452) grad_norm 1.4501 (1.1614) [2022-01-19 16:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][740/1251] eta 0:18:52 lr 0.000881 time 2.2116 (2.2156) loss 3.9574 (3.8442) grad_norm 0.9915 (1.1611) [2022-01-19 16:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][750/1251] eta 0:18:29 lr 0.000881 time 1.5501 (2.2142) loss 4.3475 (3.8430) grad_norm 1.2074 (1.1616) [2022-01-19 16:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][760/1251] eta 0:18:06 lr 0.000881 time 2.0724 (2.2130) loss 4.3524 (3.8439) grad_norm 1.1521 (1.1612) [2022-01-19 16:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][770/1251] eta 0:17:44 lr 0.000881 time 
1.8788 (2.2121) loss 4.0285 (3.8462) grad_norm 1.0996 (1.1611) [2022-01-19 16:50:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][780/1251] eta 0:17:21 lr 0.000881 time 2.2901 (2.2122) loss 4.5686 (3.8460) grad_norm 0.9900 (1.1620) [2022-01-19 16:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][790/1251] eta 0:16:58 lr 0.000881 time 2.4530 (2.2103) loss 3.2946 (3.8472) grad_norm 1.0785 (1.1621) [2022-01-19 16:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][800/1251] eta 0:16:36 lr 0.000881 time 2.3504 (2.2094) loss 2.7544 (3.8445) grad_norm 0.9944 (1.1614) [2022-01-19 16:51:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][810/1251] eta 0:16:14 lr 0.000881 time 2.1180 (2.2104) loss 4.5653 (3.8449) grad_norm 1.1214 (1.1630) [2022-01-19 16:51:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][820/1251] eta 0:15:53 lr 0.000881 time 2.4152 (2.2132) loss 4.1817 (3.8485) grad_norm 1.1046 (1.1645) [2022-01-19 16:51:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][830/1251] eta 0:15:31 lr 0.000881 time 1.6613 (2.2129) loss 2.7260 (3.8491) grad_norm 1.1887 (1.1644) [2022-01-19 16:52:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][840/1251] eta 0:15:08 lr 0.000881 time 1.5798 (2.2107) loss 4.3408 (3.8500) grad_norm 1.0020 (1.1637) [2022-01-19 16:52:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][850/1251] eta 0:14:45 lr 0.000881 time 1.6449 (2.2090) loss 3.1118 (3.8506) grad_norm 1.0886 (1.1636) [2022-01-19 16:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][860/1251] eta 0:14:23 lr 0.000881 time 2.0601 (2.2075) loss 4.5133 (3.8546) grad_norm 1.0659 (1.1632) [2022-01-19 16:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][870/1251] eta 0:14:00 lr 0.000881 time 2.2970 (2.2060) loss 3.2849 (3.8577) grad_norm 1.0102 (1.1622) [2022-01-19 16:53:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][880/1251] eta 0:13:38 lr 0.000881 time 2.3848 (2.2066) loss 3.8752 (3.8577) grad_norm 1.1342 (1.1616) [2022-01-19 16:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][890/1251] eta 0:13:16 lr 0.000881 time 1.9029 (2.2064) loss 4.4367 (3.8577) grad_norm 1.0440 (1.1604) [2022-01-19 16:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][900/1251] eta 0:12:54 lr 0.000881 time 1.9835 (2.2078) loss 2.9478 (3.8591) grad_norm 1.0661 (1.1597) [2022-01-19 16:54:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][910/1251] eta 0:12:33 lr 0.000881 time 1.5593 (2.2087) loss 3.1025 (3.8595) grad_norm 1.0747 (1.1590) [2022-01-19 16:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][920/1251] eta 0:12:12 lr 0.000881 time 3.2061 (2.2120) loss 4.0147 (3.8574) grad_norm 1.1748 (1.1589) [2022-01-19 16:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][930/1251] eta 0:11:49 lr 0.000881 time 1.8001 (2.2101) loss 3.2101 (3.8577) grad_norm 1.0445 (1.1583) [2022-01-19 16:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][940/1251] eta 0:11:26 lr 0.000881 time 1.9567 (2.2074) loss 4.4824 (3.8604) grad_norm 1.5995 (1.1600) [2022-01-19 16:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][950/1251] eta 0:11:03 lr 0.000881 time 2.0127 (2.2048) loss 3.5805 (3.8563) grad_norm 0.9945 (1.1603) [2022-01-19 16:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][960/1251] eta 0:10:41 lr 0.000880 time 3.3384 (2.2049) loss 3.1719 (3.8541) grad_norm 0.9886 (1.1606) [2022-01-19 16:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][970/1251] eta 0:10:19 lr 0.000880 time 1.9014 (2.2058) loss 3.4022 (3.8525) grad_norm 1.1308 (1.1606) [2022-01-19 16:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][980/1251] eta 0:09:57 lr 0.000880 time 
2.2337 (2.2059) loss 4.1778 (3.8523) grad_norm 1.0694 (1.1601) [2022-01-19 16:57:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][990/1251] eta 0:09:35 lr 0.000880 time 2.1485 (2.2055) loss 4.2285 (3.8560) grad_norm 1.3881 (1.1607) [2022-01-19 16:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1000/1251] eta 0:09:13 lr 0.000880 time 2.5234 (2.2034) loss 3.5741 (3.8567) grad_norm 1.3961 (1.1612) [2022-01-19 16:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1010/1251] eta 0:08:51 lr 0.000880 time 2.5945 (2.2034) loss 4.3182 (3.8565) grad_norm 1.0891 (1.1610) [2022-01-19 16:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1020/1251] eta 0:08:28 lr 0.000880 time 1.9067 (2.2022) loss 3.7476 (3.8565) grad_norm 1.1177 (1.1601) [2022-01-19 16:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1030/1251] eta 0:08:07 lr 0.000880 time 2.4638 (2.2039) loss 4.3348 (3.8601) grad_norm 1.0652 (1.1607) [2022-01-19 16:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1040/1251] eta 0:07:45 lr 0.000880 time 2.4771 (2.2047) loss 4.0942 (3.8598) grad_norm 1.2247 (1.1606) [2022-01-19 16:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1050/1251] eta 0:07:23 lr 0.000880 time 2.4864 (2.2046) loss 3.1278 (3.8589) grad_norm 1.3233 (1.1601) [2022-01-19 17:00:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1060/1251] eta 0:07:01 lr 0.000880 time 1.5586 (2.2042) loss 3.2670 (3.8593) grad_norm 1.4643 (1.1608) [2022-01-19 17:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1070/1251] eta 0:06:38 lr 0.000880 time 1.9477 (2.2032) loss 3.1635 (3.8580) grad_norm 0.9861 (1.1610) [2022-01-19 17:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1080/1251] eta 0:06:16 lr 0.000880 time 2.1960 (2.2018) loss 4.0734 (3.8590) grad_norm 1.3920 (1.1606) [2022-01-19 17:01:15 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1090/1251] eta 0:05:54 lr 0.000880 time 1.7685 (2.2012) loss 4.2948 (3.8583) grad_norm 1.5268 (1.1605) [2022-01-19 17:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1100/1251] eta 0:05:32 lr 0.000880 time 2.1961 (2.2019) loss 4.0032 (3.8615) grad_norm 1.3037 (1.1606) [2022-01-19 17:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1110/1251] eta 0:05:10 lr 0.000880 time 1.7170 (2.2013) loss 4.0504 (3.8628) grad_norm 1.0220 (1.1603) [2022-01-19 17:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1120/1251] eta 0:04:48 lr 0.000880 time 2.5078 (2.2034) loss 4.3103 (3.8627) grad_norm 1.3024 (1.1598) [2022-01-19 17:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1130/1251] eta 0:04:26 lr 0.000880 time 2.4327 (2.2041) loss 3.6169 (3.8643) grad_norm 1.0250 (1.1594) [2022-01-19 17:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1140/1251] eta 0:04:04 lr 0.000880 time 1.7373 (2.2035) loss 3.4438 (3.8649) grad_norm 1.1189 (1.1585) [2022-01-19 17:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1150/1251] eta 0:03:42 lr 0.000880 time 1.6631 (2.2017) loss 4.1286 (3.8640) grad_norm 1.1818 (1.1586) [2022-01-19 17:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1160/1251] eta 0:03:20 lr 0.000880 time 2.0027 (2.2018) loss 4.5790 (3.8650) grad_norm 1.0805 (1.1580) [2022-01-19 17:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1170/1251] eta 0:02:58 lr 0.000880 time 2.0349 (2.2015) loss 3.5887 (3.8621) grad_norm 1.0059 (1.1574) [2022-01-19 17:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1180/1251] eta 0:02:36 lr 0.000880 time 2.0159 (2.2009) loss 3.9234 (3.8634) grad_norm 1.1808 (1.1580) [2022-01-19 17:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1190/1251] eta 0:02:14 lr 
0.000880 time 1.8239 (2.2004) loss 4.0269 (3.8644) grad_norm 0.9845 (1.1577) [2022-01-19 17:05:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1200/1251] eta 0:01:52 lr 0.000880 time 2.2631 (2.2008) loss 4.4381 (3.8612) grad_norm 1.0939 (1.1577) [2022-01-19 17:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1210/1251] eta 0:01:30 lr 0.000880 time 1.9037 (2.1997) loss 2.8778 (3.8589) grad_norm 1.0498 (1.1578) [2022-01-19 17:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1220/1251] eta 0:01:08 lr 0.000880 time 2.4445 (2.2016) loss 3.9650 (3.8605) grad_norm 1.1254 (1.1584) [2022-01-19 17:06:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1230/1251] eta 0:00:46 lr 0.000880 time 1.9015 (2.2009) loss 3.5544 (3.8605) grad_norm 0.8904 (1.1585) [2022-01-19 17:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1240/1251] eta 0:00:24 lr 0.000880 time 1.6748 (2.1991) loss 3.2950 (3.8585) grad_norm 1.2541 (1.1587) [2022-01-19 17:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [67/300][1250/1251] eta 0:00:02 lr 0.000880 time 1.3495 (2.1938) loss 4.2983 (3.8570) grad_norm 1.4103 (1.1587) [2022-01-19 17:06:58 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 67 training takes 0:45:44 [2022-01-19 17:07:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.642 (18.642) Loss 1.2078 (1.2078) Acc@1 70.215 (70.215) Acc@5 90.820 (90.820) [2022-01-19 17:07:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.931 (3.300) Loss 1.1690 (1.2004) Acc@1 71.973 (71.973) Acc@5 91.016 (90.989) [2022-01-19 17:07:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.693 (2.482) Loss 1.1746 (1.1862) Acc@1 72.754 (72.214) Acc@5 91.699 (91.127) [2022-01-19 17:08:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.964 (2.240) Loss 1.2327 (1.1934) Acc@1 70.215 (71.979) Acc@5 91.602 (91.079) 
[2022-01-19 17:08:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.096 (2.182) Loss 1.1154 (1.1969) Acc@1 73.438 (71.901) Acc@5 92.969 (91.078) [2022-01-19 17:08:35 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.960 Acc@5 91.110 [2022-01-19 17:08:35 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.0% [2022-01-19 17:08:35 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 71.96% [2022-01-19 17:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][0/1251] eta 7:32:40 lr 0.000880 time 21.7107 (21.7107) loss 4.8565 (4.8565) grad_norm 1.2441 (1.2441) [2022-01-19 17:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][10/1251] eta 1:25:47 lr 0.000880 time 3.0141 (4.1479) loss 4.2140 (3.9006) grad_norm 1.0854 (1.1172) [2022-01-19 17:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][20/1251] eta 1:06:55 lr 0.000880 time 2.1266 (3.2620) loss 3.6546 (3.7223) grad_norm 1.1122 (1.1282) [2022-01-19 17:10:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][30/1251] eta 0:59:10 lr 0.000880 time 1.8851 (2.9077) loss 4.4660 (3.7978) grad_norm 1.0964 (1.1241) [2022-01-19 17:10:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][40/1251] eta 0:55:15 lr 0.000880 time 3.5506 (2.7381) loss 3.9523 (3.8654) grad_norm 1.1928 (1.1132) [2022-01-19 17:10:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][50/1251] eta 0:53:06 lr 0.000880 time 2.6911 (2.6532) loss 4.1277 (3.8315) grad_norm 1.3906 (1.1255) [2022-01-19 17:11:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][60/1251] eta 0:50:50 lr 0.000880 time 1.6329 (2.5611) loss 4.4231 (3.8106) grad_norm 1.1010 (1.1316) [2022-01-19 17:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][70/1251] eta 0:49:22 lr 0.000880 time 2.5189 (2.5088) loss 3.3679 (3.8319) grad_norm 1.0365 (1.1327) 
[2022-01-19 17:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][80/1251] eta 0:48:40 lr 0.000879 time 3.5904 (2.4943) loss 4.3900 (3.8590) grad_norm 1.1867 (1.1396) [2022-01-19 17:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][90/1251] eta 0:47:43 lr 0.000879 time 2.7851 (2.4665) loss 4.2397 (3.8630) grad_norm 1.3279 (1.1387) [2022-01-19 17:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][100/1251] eta 0:46:43 lr 0.000879 time 1.9215 (2.4361) loss 2.6335 (3.8638) grad_norm 1.1126 (1.1452) [2022-01-19 17:13:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][110/1251] eta 0:45:31 lr 0.000879 time 1.7215 (2.3941) loss 3.6553 (3.8802) grad_norm 1.4027 (1.1442) [2022-01-19 17:13:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][120/1251] eta 0:44:54 lr 0.000879 time 3.2389 (2.3825) loss 3.7653 (3.8765) grad_norm 1.2182 (1.1472) [2022-01-19 17:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][130/1251] eta 0:44:08 lr 0.000879 time 1.9169 (2.3628) loss 4.6670 (3.8560) grad_norm 1.7069 (1.1556) [2022-01-19 17:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][140/1251] eta 0:43:19 lr 0.000879 time 2.0089 (2.3402) loss 4.4973 (3.8446) grad_norm 1.2721 (1.1612) [2022-01-19 17:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][150/1251] eta 0:42:36 lr 0.000879 time 1.6635 (2.3219) loss 3.3369 (3.8473) grad_norm 1.3080 (1.1595) [2022-01-19 17:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][160/1251] eta 0:42:03 lr 0.000879 time 2.4337 (2.3130) loss 3.0244 (3.8549) grad_norm 1.0620 (1.1619) [2022-01-19 17:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][170/1251] eta 0:41:31 lr 0.000879 time 1.8771 (2.3053) loss 4.6604 (3.8609) grad_norm 1.2253 (1.1677) [2022-01-19 17:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][180/1251] eta 0:41:06 lr 
0.000879 time 2.0562 (2.3029) loss 3.4757 (3.8532) grad_norm 1.1539 (1.1695) [2022-01-19 17:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][190/1251] eta 0:40:30 lr 0.000879 time 1.9439 (2.2911) loss 3.9179 (3.8514) grad_norm 1.3176 (1.1710) [2022-01-19 17:16:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][200/1251] eta 0:39:57 lr 0.000879 time 2.0636 (2.2807) loss 3.1076 (3.8381) grad_norm 1.0883 (1.1703) [2022-01-19 17:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][210/1251] eta 0:39:35 lr 0.000879 time 1.8585 (2.2820) loss 3.1247 (3.8367) grad_norm 0.9791 (1.1713) [2022-01-19 17:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][220/1251] eta 0:39:12 lr 0.000879 time 1.8530 (2.2817) loss 4.3787 (3.8281) grad_norm 1.3507 (1.1692) [2022-01-19 17:17:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][230/1251] eta 0:38:41 lr 0.000879 time 1.8433 (2.2740) loss 3.3651 (3.8257) grad_norm 1.2261 (1.1711) [2022-01-19 17:17:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][240/1251] eta 0:38:15 lr 0.000879 time 2.4816 (2.2707) loss 3.2406 (3.8130) grad_norm 1.2240 (1.1714) [2022-01-19 17:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][250/1251] eta 0:38:01 lr 0.000879 time 3.6550 (2.2794) loss 3.7329 (3.8231) grad_norm 1.2629 (1.1687) [2022-01-19 17:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][260/1251] eta 0:37:32 lr 0.000879 time 1.9263 (2.2726) loss 3.6605 (3.8242) grad_norm 1.2319 (1.1685) [2022-01-19 17:18:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][270/1251] eta 0:37:01 lr 0.000879 time 1.8000 (2.2648) loss 4.5319 (3.8282) grad_norm 1.4080 (1.1712) [2022-01-19 17:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][280/1251] eta 0:36:26 lr 0.000879 time 1.6684 (2.2521) loss 4.1553 (3.8259) grad_norm 0.9565 (1.1699) [2022-01-19 17:19:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][290/1251] eta 0:36:02 lr 0.000879 time 3.3973 (2.2504) loss 3.8151 (3.8328) grad_norm 1.0630 (1.1676) [2022-01-19 17:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][300/1251] eta 0:35:37 lr 0.000879 time 1.8825 (2.2479) loss 4.7145 (3.8369) grad_norm 1.0265 (1.1651) [2022-01-19 17:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][310/1251] eta 0:35:15 lr 0.000879 time 2.0345 (2.2486) loss 4.5292 (3.8275) grad_norm 1.1467 (1.1633) [2022-01-19 17:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][320/1251] eta 0:34:57 lr 0.000879 time 2.4013 (2.2533) loss 4.6557 (3.8290) grad_norm 1.0276 (1.1613) [2022-01-19 17:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][330/1251] eta 0:34:37 lr 0.000879 time 3.0489 (2.2556) loss 4.7238 (3.8279) grad_norm 1.0817 (1.1615) [2022-01-19 17:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][340/1251] eta 0:34:12 lr 0.000879 time 1.9809 (2.2535) loss 3.8246 (3.8317) grad_norm 1.1992 (1.1627) [2022-01-19 17:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][350/1251] eta 0:33:48 lr 0.000879 time 2.0742 (2.2509) loss 4.2898 (3.8240) grad_norm 1.0818 (1.1613) [2022-01-19 17:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][360/1251] eta 0:33:21 lr 0.000879 time 2.4381 (2.2466) loss 2.7502 (3.8222) grad_norm 1.2649 (1.1640) [2022-01-19 17:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][370/1251] eta 0:32:58 lr 0.000879 time 3.1905 (2.2462) loss 4.4525 (3.8354) grad_norm 1.2106 (1.1648) [2022-01-19 17:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][380/1251] eta 0:32:33 lr 0.000879 time 2.2256 (2.2429) loss 3.8563 (3.8361) grad_norm 1.0259 (1.1645) [2022-01-19 17:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][390/1251] eta 0:32:08 lr 0.000879 time 
1.8528 (2.2394) loss 3.8820 (3.8365) grad_norm 1.2166 (1.1667) [2022-01-19 17:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][400/1251] eta 0:31:44 lr 0.000879 time 2.3578 (2.2374) loss 4.2511 (3.8386) grad_norm 1.2454 (1.1690) [2022-01-19 17:23:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][410/1251] eta 0:31:20 lr 0.000879 time 2.2990 (2.2357) loss 4.0645 (3.8411) grad_norm 1.2763 (1.1683) [2022-01-19 17:24:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][420/1251] eta 0:30:55 lr 0.000879 time 2.2181 (2.2334) loss 4.4300 (3.8415) grad_norm 1.2902 (1.1687) [2022-01-19 17:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][430/1251] eta 0:30:33 lr 0.000879 time 2.8987 (2.2332) loss 3.8669 (3.8402) grad_norm 1.3322 (1.1685) [2022-01-19 17:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][440/1251] eta 0:30:08 lr 0.000879 time 1.9079 (2.2296) loss 4.3768 (3.8353) grad_norm 1.1095 (1.1706) [2022-01-19 17:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][450/1251] eta 0:29:44 lr 0.000878 time 2.2249 (2.2280) loss 4.3856 (3.8328) grad_norm 1.0562 (1.1710) [2022-01-19 17:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][460/1251] eta 0:29:20 lr 0.000878 time 2.2830 (2.2252) loss 4.2983 (3.8357) grad_norm 1.2819 (1.1709) [2022-01-19 17:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][470/1251] eta 0:28:58 lr 0.000878 time 2.0573 (2.2255) loss 3.8711 (3.8339) grad_norm 1.0780 (1.1701) [2022-01-19 17:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][480/1251] eta 0:28:34 lr 0.000878 time 2.1237 (2.2241) loss 3.9893 (3.8364) grad_norm 1.2777 (1.1692) [2022-01-19 17:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][490/1251] eta 0:28:12 lr 0.000878 time 2.0729 (2.2244) loss 4.5683 (3.8405) grad_norm 1.3345 (1.1697) [2022-01-19 17:27:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][500/1251] eta 0:27:51 lr 0.000878 time 2.3946 (2.2263) loss 3.9711 (3.8462) grad_norm 1.0001 (1.1694) [2022-01-19 17:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][510/1251] eta 0:27:30 lr 0.000878 time 2.1932 (2.2278) loss 4.4951 (3.8471) grad_norm 1.0980 (1.1690) [2022-01-19 17:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][520/1251] eta 0:27:06 lr 0.000878 time 1.8276 (2.2254) loss 3.8621 (3.8496) grad_norm 1.1121 (1.1697) [2022-01-19 17:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][530/1251] eta 0:26:42 lr 0.000878 time 2.2030 (2.2222) loss 3.6699 (3.8525) grad_norm 1.2349 (1.1714) [2022-01-19 17:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][540/1251] eta 0:26:18 lr 0.000878 time 2.4029 (2.2203) loss 3.6408 (3.8530) grad_norm 0.9985 (1.1737) [2022-01-19 17:28:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][550/1251] eta 0:25:54 lr 0.000878 time 2.2446 (2.2179) loss 2.9221 (3.8507) grad_norm 1.0334 (1.1721) [2022-01-19 17:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][560/1251] eta 0:25:33 lr 0.000878 time 2.1959 (2.2195) loss 4.0827 (3.8526) grad_norm 1.2733 (1.1722) [2022-01-19 17:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][570/1251] eta 0:25:10 lr 0.000878 time 1.8476 (2.2186) loss 3.5831 (3.8517) grad_norm 1.1565 (1.1726) [2022-01-19 17:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][580/1251] eta 0:24:47 lr 0.000878 time 1.5086 (2.2162) loss 3.8197 (3.8521) grad_norm 1.0206 (1.1713) [2022-01-19 17:30:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][590/1251] eta 0:24:24 lr 0.000878 time 2.2264 (2.2162) loss 3.2490 (3.8547) grad_norm 1.2654 (1.1713) [2022-01-19 17:30:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][600/1251] eta 0:24:03 lr 0.000878 time 
2.8997 (2.2180) loss 3.3715 (3.8510) grad_norm 1.2823 (1.1737) [2022-01-19 17:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][610/1251] eta 0:23:40 lr 0.000878 time 1.8170 (2.2163) loss 4.0451 (3.8490) grad_norm 1.6433 (1.1746) [2022-01-19 17:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][620/1251] eta 0:23:17 lr 0.000878 time 2.0214 (2.2152) loss 3.2607 (3.8490) grad_norm 1.1931 (1.1754) [2022-01-19 17:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][630/1251] eta 0:22:54 lr 0.000878 time 2.2044 (2.2133) loss 3.8798 (3.8501) grad_norm 1.2688 (1.1746) [2022-01-19 17:32:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][640/1251] eta 0:22:31 lr 0.000878 time 2.1678 (2.2118) loss 3.2289 (3.8489) grad_norm 1.2212 (1.1752) [2022-01-19 17:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][650/1251] eta 0:22:09 lr 0.000878 time 1.6609 (2.2121) loss 4.3518 (3.8517) grad_norm 1.0427 (1.1746) [2022-01-19 17:32:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][660/1251] eta 0:21:47 lr 0.000878 time 2.5585 (2.2123) loss 4.2482 (3.8520) grad_norm 1.1391 (1.1747) [2022-01-19 17:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][670/1251] eta 0:21:24 lr 0.000878 time 1.7202 (2.2105) loss 3.7757 (3.8494) grad_norm 1.3110 (1.1745) [2022-01-19 17:33:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][680/1251] eta 0:21:01 lr 0.000878 time 2.1337 (2.2093) loss 3.9909 (3.8503) grad_norm 1.0557 (1.1743) [2022-01-19 17:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][690/1251] eta 0:20:38 lr 0.000878 time 1.8249 (2.2076) loss 3.8966 (3.8534) grad_norm 0.9819 (1.1731) [2022-01-19 17:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][700/1251] eta 0:20:16 lr 0.000878 time 2.7642 (2.2074) loss 4.3630 (3.8539) grad_norm 1.2933 (1.1723) [2022-01-19 17:34:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][710/1251] eta 0:19:54 lr 0.000878 time 1.9032 (2.2079) loss 4.2408 (3.8548) grad_norm 1.1003 (1.1729) [2022-01-19 17:35:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][720/1251] eta 0:19:33 lr 0.000878 time 2.4770 (2.2091) loss 4.5679 (3.8552) grad_norm 1.2395 (1.1739) [2022-01-19 17:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][730/1251] eta 0:19:11 lr 0.000878 time 1.8570 (2.2111) loss 2.9366 (3.8530) grad_norm 1.1716 (1.1739) [2022-01-19 17:35:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][740/1251] eta 0:18:51 lr 0.000878 time 2.9539 (2.2143) loss 4.2627 (3.8493) grad_norm 1.0547 (1.1732) [2022-01-19 17:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][750/1251] eta 0:18:29 lr 0.000878 time 1.8610 (2.2139) loss 3.9164 (3.8514) grad_norm 1.2674 (1.1734) [2022-01-19 17:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][760/1251] eta 0:18:06 lr 0.000878 time 2.3226 (2.2130) loss 4.1210 (3.8511) grad_norm 1.0959 (1.1719) [2022-01-19 17:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][770/1251] eta 0:17:42 lr 0.000878 time 1.8829 (2.2090) loss 4.1235 (3.8512) grad_norm 1.0812 (1.1716) [2022-01-19 17:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][780/1251] eta 0:17:18 lr 0.000878 time 1.9872 (2.2055) loss 4.8605 (3.8548) grad_norm 1.1293 (1.1720) [2022-01-19 17:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][790/1251] eta 0:16:56 lr 0.000878 time 2.1596 (2.2046) loss 4.4732 (3.8591) grad_norm 1.0986 (1.1719) [2022-01-19 17:38:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][800/1251] eta 0:16:34 lr 0.000878 time 2.1678 (2.2053) loss 2.9418 (3.8583) grad_norm 1.1346 (1.1707) [2022-01-19 17:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][810/1251] eta 0:16:12 lr 0.000878 time 
3.0574 (2.2052) loss 4.0530 (3.8584) grad_norm 1.3597 (1.1704) [2022-01-19 17:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][820/1251] eta 0:15:51 lr 0.000877 time 3.5712 (2.2075) loss 4.0001 (3.8557) grad_norm 1.0691 (1.1702) [2022-01-19 17:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][830/1251] eta 0:15:29 lr 0.000877 time 1.8628 (2.2072) loss 4.4533 (3.8538) grad_norm 0.9870 (1.1692) [2022-01-19 17:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][840/1251] eta 0:15:07 lr 0.000877 time 2.9863 (2.2080) loss 3.3630 (3.8536) grad_norm 1.0210 (1.1687) [2022-01-19 17:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][850/1251] eta 0:14:44 lr 0.000877 time 1.8307 (2.2069) loss 4.2628 (3.8524) grad_norm 1.1860 (1.1697) [2022-01-19 17:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][860/1251] eta 0:14:23 lr 0.000877 time 3.1641 (2.2095) loss 2.6927 (3.8507) grad_norm 0.9288 (1.1687) [2022-01-19 17:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][870/1251] eta 0:14:02 lr 0.000877 time 2.4926 (2.2108) loss 3.9004 (3.8482) grad_norm 1.1543 (1.1683) [2022-01-19 17:41:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][880/1251] eta 0:13:40 lr 0.000877 time 2.5817 (2.2104) loss 3.5072 (3.8497) grad_norm 1.0186 (1.1674) [2022-01-19 17:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][890/1251] eta 0:13:17 lr 0.000877 time 1.8552 (2.2084) loss 4.8614 (3.8500) grad_norm 1.1071 (1.1665) [2022-01-19 17:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][900/1251] eta 0:12:54 lr 0.000877 time 3.1161 (2.2065) loss 3.7286 (3.8487) grad_norm 1.1756 (1.1662) [2022-01-19 17:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][910/1251] eta 0:12:31 lr 0.000877 time 1.6359 (2.2040) loss 4.1465 (3.8499) grad_norm 1.1037 (1.1658) [2022-01-19 17:42:24 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][920/1251] eta 0:12:09 lr 0.000877 time 2.2841 (2.2032) loss 4.1160 (3.8504) grad_norm 0.9944 (1.1652) [2022-01-19 17:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][930/1251] eta 0:11:46 lr 0.000877 time 1.9534 (2.2020) loss 3.9614 (3.8514) grad_norm 1.0372 (1.1651) [2022-01-19 17:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][940/1251] eta 0:11:25 lr 0.000877 time 3.6821 (2.2026) loss 4.0550 (3.8527) grad_norm 1.1619 (1.1656) [2022-01-19 17:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][950/1251] eta 0:11:03 lr 0.000877 time 2.8031 (2.2033) loss 3.9875 (3.8531) grad_norm 0.9846 (1.1647) [2022-01-19 17:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][960/1251] eta 0:10:41 lr 0.000877 time 2.4865 (2.2039) loss 3.7873 (3.8519) grad_norm 1.1885 (1.1647) [2022-01-19 17:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][970/1251] eta 0:10:19 lr 0.000877 time 2.1656 (2.2034) loss 4.3588 (3.8510) grad_norm 1.1828 (1.1646) [2022-01-19 17:44:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][980/1251] eta 0:09:57 lr 0.000877 time 2.5413 (2.2034) loss 3.9251 (3.8459) grad_norm 1.1187 (1.1645) [2022-01-19 17:45:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][990/1251] eta 0:09:35 lr 0.000877 time 3.0806 (2.2046) loss 3.0044 (3.8465) grad_norm 1.1886 (1.1643) [2022-01-19 17:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1000/1251] eta 0:09:13 lr 0.000877 time 2.1883 (2.2038) loss 4.7245 (3.8438) grad_norm 1.1657 (1.1637) [2022-01-19 17:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1010/1251] eta 0:08:50 lr 0.000877 time 2.2140 (2.2022) loss 3.6258 (3.8431) grad_norm 0.9997 (1.1637) [2022-01-19 17:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1020/1251] eta 0:08:28 lr 0.000877 time 
2.1843 (2.2032) loss 3.4231 (3.8430) grad_norm 0.9683 (1.1633) [2022-01-19 17:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1030/1251] eta 0:08:07 lr 0.000877 time 2.7833 (2.2052) loss 3.6886 (3.8441) grad_norm 1.4712 (1.1637) [2022-01-19 17:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1040/1251] eta 0:07:45 lr 0.000877 time 1.7904 (2.2044) loss 3.1318 (3.8424) grad_norm 1.2465 (1.1638) [2022-01-19 17:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1050/1251] eta 0:07:23 lr 0.000877 time 2.9213 (2.2044) loss 4.7104 (3.8430) grad_norm 1.1542 (1.1633) [2022-01-19 17:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1060/1251] eta 0:07:00 lr 0.000877 time 1.7674 (2.2024) loss 4.5630 (3.8462) grad_norm 1.1884 (1.1638) [2022-01-19 17:47:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1070/1251] eta 0:06:38 lr 0.000877 time 2.8633 (2.2021) loss 4.0829 (3.8430) grad_norm 1.0116 (1.1637) [2022-01-19 17:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1080/1251] eta 0:06:16 lr 0.000877 time 1.5606 (2.1997) loss 4.0205 (3.8413) grad_norm 1.4358 (1.1637) [2022-01-19 17:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1090/1251] eta 0:05:54 lr 0.000877 time 1.9087 (2.1996) loss 4.5728 (3.8425) grad_norm 1.1111 (1.1636) [2022-01-19 17:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1100/1251] eta 0:05:32 lr 0.000877 time 2.1768 (2.1991) loss 2.8696 (3.8397) grad_norm 1.2165 (1.1643) [2022-01-19 17:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1110/1251] eta 0:05:10 lr 0.000877 time 2.2309 (2.1998) loss 4.3180 (3.8408) grad_norm 1.0393 (1.1649) [2022-01-19 17:49:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1120/1251] eta 0:04:48 lr 0.000877 time 1.8345 (2.2004) loss 4.1265 (3.8408) grad_norm 0.9595 (1.1640) [2022-01-19 17:50:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1130/1251] eta 0:04:26 lr 0.000877 time 2.0542 (2.2016) loss 3.6405 (3.8406) grad_norm 1.2564 (1.1646) [2022-01-19 17:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1140/1251] eta 0:04:04 lr 0.000877 time 2.5165 (2.2019) loss 4.3482 (3.8421) grad_norm 1.1069 (1.1642) [2022-01-19 17:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1150/1251] eta 0:03:42 lr 0.000877 time 3.1162 (2.2018) loss 4.2227 (3.8413) grad_norm 1.4262 (1.1644) [2022-01-19 17:51:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1160/1251] eta 0:03:20 lr 0.000877 time 1.8534 (2.2009) loss 4.0507 (3.8385) grad_norm 1.1142 (1.1641) [2022-01-19 17:51:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1170/1251] eta 0:02:58 lr 0.000877 time 1.6465 (2.1997) loss 3.8264 (3.8367) grad_norm 1.0936 (1.1638) [2022-01-19 17:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1180/1251] eta 0:02:36 lr 0.000876 time 1.8281 (2.1985) loss 3.6919 (3.8374) grad_norm 1.2311 (1.1637) [2022-01-19 17:52:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1190/1251] eta 0:02:14 lr 0.000876 time 2.6817 (2.1995) loss 4.2288 (3.8382) grad_norm 1.0236 (1.1637) [2022-01-19 17:52:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1200/1251] eta 0:01:52 lr 0.000876 time 2.3116 (2.1994) loss 2.7750 (3.8369) grad_norm 1.2761 (1.1634) [2022-01-19 17:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1210/1251] eta 0:01:30 lr 0.000876 time 2.5336 (2.2001) loss 3.9231 (3.8346) grad_norm 1.0789 (1.1632) [2022-01-19 17:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1220/1251] eta 0:01:08 lr 0.000876 time 1.8551 (2.1997) loss 3.0416 (3.8328) grad_norm 1.1320 (1.1628) [2022-01-19 17:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1230/1251] eta 0:00:46 lr 
0.000876 time 3.0902 (2.2006) loss 4.2913 (3.8353) grad_norm 1.1509 (1.1627) [2022-01-19 17:54:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1240/1251] eta 0:00:24 lr 0.000876 time 1.2855 (2.1980) loss 4.0149 (3.8357) grad_norm 1.1977 (1.1631) [2022-01-19 17:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [68/300][1250/1251] eta 0:00:02 lr 0.000876 time 1.1860 (2.1928) loss 2.9706 (3.8362) grad_norm 1.2788 (1.1632) [2022-01-19 17:54:19 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 68 training takes 0:45:43 [2022-01-19 17:54:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.943 (17.943) Loss 1.2786 (1.2786) Acc@1 71.875 (71.875) Acc@5 89.844 (89.844) [2022-01-19 17:54:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.077 (3.193) Loss 1.2314 (1.2346) Acc@1 70.898 (71.573) Acc@5 90.625 (90.714) [2022-01-19 17:55:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.248 (2.588) Loss 1.1694 (1.2290) Acc@1 73.438 (71.731) Acc@5 92.285 (90.839) [2022-01-19 17:55:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.960 (2.345) Loss 1.1903 (1.2313) Acc@1 72.852 (71.673) Acc@5 91.699 (90.839) [2022-01-19 17:55:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.849 (2.152) Loss 1.2500 (1.2274) Acc@1 70.801 (71.684) Acc@5 90.527 (90.899) [2022-01-19 17:55:54 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.644 Acc@5 90.944 [2022-01-19 17:55:54 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.6% [2022-01-19 17:55:54 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 71.96% [2022-01-19 17:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][0/1251] eta 7:20:57 lr 0.000876 time 21.1488 (21.1488) loss 3.3905 (3.3905) grad_norm 1.0954 (1.0954) [2022-01-19 17:56:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[69/300][10/1251] eta 1:23:59 lr 0.000876 time 2.8857 (4.0609) loss 3.9018 (3.9317) grad_norm 1.1315 (1.1472) [2022-01-19 17:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][20/1251] eta 1:03:19 lr 0.000876 time 1.3635 (3.0866) loss 4.6414 (3.9631) grad_norm 1.1052 (1.1267) [2022-01-19 17:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][30/1251] eta 0:56:26 lr 0.000876 time 1.5954 (2.7738) loss 3.7203 (3.9002) grad_norm 1.3022 (1.1342) [2022-01-19 17:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][40/1251] eta 0:53:41 lr 0.000876 time 3.1695 (2.6603) loss 3.5665 (3.8415) grad_norm 1.0331 (1.1549) [2022-01-19 17:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][50/1251] eta 0:52:18 lr 0.000876 time 2.6107 (2.6129) loss 4.2086 (3.8282) grad_norm 1.4668 (1.1623) [2022-01-19 17:58:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][60/1251] eta 0:51:09 lr 0.000876 time 1.7809 (2.5776) loss 3.9385 (3.8438) grad_norm 1.0753 (1.1687) [2022-01-19 17:58:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][70/1251] eta 0:49:58 lr 0.000876 time 2.5470 (2.5393) loss 3.9221 (3.8334) grad_norm 1.2410 (1.1792) [2022-01-19 17:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][80/1251] eta 0:48:21 lr 0.000876 time 1.9295 (2.4780) loss 4.5589 (3.8219) grad_norm 1.1532 (1.1809) [2022-01-19 17:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][90/1251] eta 0:47:08 lr 0.000876 time 2.2866 (2.4367) loss 4.0810 (3.8152) grad_norm 1.3015 (1.1825) [2022-01-19 17:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][100/1251] eta 0:45:47 lr 0.000876 time 1.9379 (2.3872) loss 4.0591 (3.8331) grad_norm 1.0719 (1.1791) [2022-01-19 18:00:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][110/1251] eta 0:44:36 lr 0.000876 time 1.7002 (2.3460) loss 4.3956 (3.8380) grad_norm 1.1521 (1.1767) 
[2022-01-19 18:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][120/1251] eta 0:43:46 lr 0.000876 time 2.0756 (2.3221) loss 4.1003 (3.8401) grad_norm 1.5913 (1.1774) [2022-01-19 18:00:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][130/1251] eta 0:43:11 lr 0.000876 time 2.0100 (2.3115) loss 4.5051 (3.8658) grad_norm 1.2564 (1.1724) [2022-01-19 18:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][140/1251] eta 0:42:51 lr 0.000876 time 1.8930 (2.3142) loss 4.2943 (3.8508) grad_norm 1.0562 (1.1684) [2022-01-19 18:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][150/1251] eta 0:42:32 lr 0.000876 time 2.5136 (2.3185) loss 3.4489 (3.8503) grad_norm 1.1990 (1.1667) [2022-01-19 18:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][160/1251] eta 0:42:04 lr 0.000876 time 1.8142 (2.3142) loss 3.7184 (3.8460) grad_norm 1.1417 (1.1623) [2022-01-19 18:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][170/1251] eta 0:41:43 lr 0.000876 time 2.1041 (2.3155) loss 2.7625 (3.8340) grad_norm 1.1271 (1.1598) [2022-01-19 18:02:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][180/1251] eta 0:41:20 lr 0.000876 time 2.2377 (2.3161) loss 2.5989 (3.8085) grad_norm 1.0445 (1.1604) [2022-01-19 18:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][190/1251] eta 0:40:42 lr 0.000876 time 1.8918 (2.3019) loss 4.1925 (3.8057) grad_norm 1.0431 (1.1564) [2022-01-19 18:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][200/1251] eta 0:40:10 lr 0.000876 time 2.1044 (2.2934) loss 4.3830 (3.8116) grad_norm 1.1336 (1.1573) [2022-01-19 18:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][210/1251] eta 0:39:35 lr 0.000876 time 1.5889 (2.2823) loss 2.7426 (3.8143) grad_norm 0.9974 (1.1586) [2022-01-19 18:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][220/1251] eta 0:39:11 
lr 0.000876 time 2.3113 (2.2809) loss 4.4875 (3.8115) grad_norm 1.2205 (1.1623) [2022-01-19 18:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][230/1251] eta 0:38:47 lr 0.000876 time 1.9221 (2.2794) loss 2.5863 (3.8144) grad_norm 1.1794 (1.1660) [2022-01-19 18:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][240/1251] eta 0:38:27 lr 0.000876 time 1.8852 (2.2819) loss 3.9825 (3.8069) grad_norm 1.1772 (1.1678) [2022-01-19 18:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][250/1251] eta 0:38:01 lr 0.000876 time 2.1670 (2.2792) loss 3.7050 (3.8007) grad_norm 0.9623 (1.1667) [2022-01-19 18:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][260/1251] eta 0:37:30 lr 0.000876 time 1.9993 (2.2708) loss 3.4383 (3.8065) grad_norm 0.9447 (1.1665) [2022-01-19 18:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][270/1251] eta 0:36:58 lr 0.000876 time 1.8589 (2.2618) loss 4.1619 (3.8041) grad_norm 1.1212 (1.1668) [2022-01-19 18:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][280/1251] eta 0:36:26 lr 0.000876 time 1.5817 (2.2517) loss 4.1901 (3.8032) grad_norm 1.1189 (1.1665) [2022-01-19 18:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][290/1251] eta 0:35:58 lr 0.000876 time 1.9401 (2.2457) loss 3.8014 (3.8039) grad_norm 1.3155 (1.1661) [2022-01-19 18:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][300/1251] eta 0:35:31 lr 0.000875 time 2.5341 (2.2412) loss 4.1922 (3.8048) grad_norm 1.1316 (1.1633) [2022-01-19 18:07:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][310/1251] eta 0:35:04 lr 0.000875 time 2.1915 (2.2364) loss 3.2388 (3.8027) grad_norm 1.1109 (1.1634) [2022-01-19 18:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][320/1251] eta 0:34:43 lr 0.000875 time 2.5575 (2.2382) loss 4.2093 (3.8042) grad_norm 1.0894 (1.1655) [2022-01-19 18:08:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][330/1251] eta 0:34:16 lr 0.000875 time 1.3972 (2.2326) loss 3.1082 (3.7987) grad_norm 1.0865 (1.1667) [2022-01-19 18:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][340/1251] eta 0:33:54 lr 0.000875 time 1.8846 (2.2331) loss 4.7775 (3.8016) grad_norm 1.4566 (1.1660) [2022-01-19 18:08:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][350/1251] eta 0:33:33 lr 0.000875 time 2.4924 (2.2343) loss 3.9690 (3.8012) grad_norm 1.3718 (1.1660) [2022-01-19 18:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][360/1251] eta 0:33:10 lr 0.000875 time 2.7428 (2.2344) loss 4.1655 (3.7996) grad_norm 1.2768 (1.1676) [2022-01-19 18:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][370/1251] eta 0:32:53 lr 0.000875 time 1.5811 (2.2400) loss 3.3756 (3.7972) grad_norm 1.1605 (1.1689) [2022-01-19 18:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][380/1251] eta 0:32:37 lr 0.000875 time 2.1933 (2.2471) loss 3.7168 (3.7892) grad_norm 0.9730 (1.1700) [2022-01-19 18:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][390/1251] eta 0:32:16 lr 0.000875 time 2.4060 (2.2493) loss 4.0932 (3.7949) grad_norm 1.4111 (1.1723) [2022-01-19 18:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][400/1251] eta 0:31:48 lr 0.000875 time 1.8508 (2.2432) loss 3.8775 (3.7977) grad_norm 1.1299 (1.1726) [2022-01-19 18:11:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][410/1251] eta 0:31:18 lr 0.000875 time 2.1007 (2.2342) loss 3.6387 (3.7964) grad_norm 1.2998 (1.1719) [2022-01-19 18:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][420/1251] eta 0:30:53 lr 0.000875 time 1.7060 (2.2308) loss 3.8934 (3.7912) grad_norm 1.1329 (1.1722) [2022-01-19 18:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][430/1251] eta 0:30:28 lr 0.000875 time 
2.2162 (2.2272) loss 2.9038 (3.7932) grad_norm 1.0656 (1.1708) [2022-01-19 18:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][440/1251] eta 0:30:03 lr 0.000875 time 1.8227 (2.2233) loss 4.2701 (3.7917) grad_norm 1.2448 (1.1722) [2022-01-19 18:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][450/1251] eta 0:29:39 lr 0.000875 time 2.4567 (2.2220) loss 3.0316 (3.7877) grad_norm 1.2919 (1.1762) [2022-01-19 18:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][460/1251] eta 0:29:18 lr 0.000875 time 2.8169 (2.2233) loss 3.8093 (3.7821) grad_norm 1.0938 (1.1760) [2022-01-19 18:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][470/1251] eta 0:28:55 lr 0.000875 time 2.0544 (2.2227) loss 3.8061 (3.7897) grad_norm 1.2366 (1.1748) [2022-01-19 18:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][480/1251] eta 0:28:34 lr 0.000875 time 2.2864 (2.2232) loss 3.5226 (3.7866) grad_norm 1.1886 (1.1737) [2022-01-19 18:14:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][490/1251] eta 0:28:14 lr 0.000875 time 2.8106 (2.2266) loss 4.1229 (3.7853) grad_norm 1.1237 (1.1715) [2022-01-19 18:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][500/1251] eta 0:27:55 lr 0.000875 time 1.9560 (2.2317) loss 3.8931 (3.7868) grad_norm 1.1762 (1.1713) [2022-01-19 18:14:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][510/1251] eta 0:27:34 lr 0.000875 time 2.2291 (2.2328) loss 4.5468 (3.7941) grad_norm 1.0853 (1.1710) [2022-01-19 18:15:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][520/1251] eta 0:27:12 lr 0.000875 time 2.0981 (2.2331) loss 4.4310 (3.7963) grad_norm 1.0752 (1.1705) [2022-01-19 18:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][530/1251] eta 0:26:49 lr 0.000875 time 2.8388 (2.2330) loss 4.2032 (3.7969) grad_norm 1.0565 (1.1692) [2022-01-19 18:16:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][540/1251] eta 0:26:26 lr 0.000875 time 2.1935 (2.2314) loss 3.9803 (3.7932) grad_norm 1.2177 (1.1696) [2022-01-19 18:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][550/1251] eta 0:26:01 lr 0.000875 time 1.8770 (2.2276) loss 4.3369 (3.7932) grad_norm 1.1208 (1.1693) [2022-01-19 18:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][560/1251] eta 0:25:36 lr 0.000875 time 2.1877 (2.2242) loss 3.3847 (3.7919) grad_norm 1.0787 (1.1695) [2022-01-19 18:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][570/1251] eta 0:25:13 lr 0.000875 time 2.3866 (2.2224) loss 4.3317 (3.7965) grad_norm 1.1072 (1.1687) [2022-01-19 18:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][580/1251] eta 0:24:49 lr 0.000875 time 1.5745 (2.2204) loss 3.7144 (3.7963) grad_norm 1.2074 (1.1679) [2022-01-19 18:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][590/1251] eta 0:24:28 lr 0.000875 time 2.6435 (2.2212) loss 4.5191 (3.8041) grad_norm 1.5100 (1.1687) [2022-01-19 18:18:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][600/1251] eta 0:24:05 lr 0.000875 time 2.1420 (2.2201) loss 4.1086 (3.8046) grad_norm 1.2013 (1.1689) [2022-01-19 18:18:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][610/1251] eta 0:23:42 lr 0.000875 time 2.6335 (2.2187) loss 4.6164 (3.8083) grad_norm 1.0657 (1.1692) [2022-01-19 18:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][620/1251] eta 0:23:19 lr 0.000875 time 1.8830 (2.2183) loss 4.3853 (3.8129) grad_norm 1.0940 (1.1682) [2022-01-19 18:19:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][630/1251] eta 0:22:57 lr 0.000875 time 2.8017 (2.2187) loss 3.0437 (3.8106) grad_norm 1.7869 (1.1680) [2022-01-19 18:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][640/1251] eta 0:22:37 lr 0.000875 time 
2.1746 (2.2212) loss 3.7356 (3.8081) grad_norm 1.0125 (1.1676) [2022-01-19 18:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][650/1251] eta 0:22:14 lr 0.000875 time 2.7131 (2.2205) loss 4.1948 (3.8067) grad_norm 1.1273 (1.1678) [2022-01-19 18:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][660/1251] eta 0:21:53 lr 0.000874 time 2.6041 (2.2221) loss 3.6089 (3.8122) grad_norm 1.0186 (1.1680) [2022-01-19 18:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][670/1251] eta 0:21:30 lr 0.000874 time 2.7546 (2.2217) loss 4.0563 (3.8159) grad_norm 1.3212 (1.1688) [2022-01-19 18:21:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][680/1251] eta 0:21:08 lr 0.000874 time 1.8486 (2.2210) loss 4.2184 (3.8163) grad_norm 0.9999 (1.1684) [2022-01-19 18:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][690/1251] eta 0:20:44 lr 0.000874 time 2.1790 (2.2185) loss 3.6245 (3.8116) grad_norm 1.1818 (1.1686) [2022-01-19 18:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][700/1251] eta 0:20:20 lr 0.000874 time 2.2041 (2.2159) loss 3.8321 (3.8046) grad_norm 1.2976 (1.1678) [2022-01-19 18:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][710/1251] eta 0:19:57 lr 0.000874 time 1.9444 (2.2134) loss 4.6048 (3.8076) grad_norm 1.0966 (1.1676) [2022-01-19 18:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][720/1251] eta 0:19:35 lr 0.000874 time 2.7090 (2.2145) loss 3.6092 (3.8092) grad_norm 1.0863 (1.1683) [2022-01-19 18:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][730/1251] eta 0:19:13 lr 0.000874 time 1.8283 (2.2150) loss 3.1148 (3.8105) grad_norm 1.2225 (1.1690) [2022-01-19 18:23:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][740/1251] eta 0:18:52 lr 0.000874 time 2.2550 (2.2159) loss 3.8923 (3.8079) grad_norm 1.2964 (1.1698) [2022-01-19 18:23:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][750/1251] eta 0:18:30 lr 0.000874 time 1.9052 (2.2158) loss 3.7027 (3.8096) grad_norm 1.3530 (1.1704) [2022-01-19 18:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][760/1251] eta 0:18:08 lr 0.000874 time 2.7251 (2.2173) loss 4.3839 (3.8096) grad_norm 1.1401 (1.1698) [2022-01-19 18:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][770/1251] eta 0:17:46 lr 0.000874 time 1.4814 (2.2164) loss 3.8053 (3.8075) grad_norm 1.5977 (1.1693) [2022-01-19 18:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][780/1251] eta 0:17:23 lr 0.000874 time 1.7149 (2.2152) loss 3.9761 (3.8075) grad_norm 1.0968 (1.1706) [2022-01-19 18:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][790/1251] eta 0:17:00 lr 0.000874 time 1.5616 (2.2140) loss 2.6863 (3.8052) grad_norm 1.1449 (1.1704) [2022-01-19 18:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][800/1251] eta 0:16:38 lr 0.000874 time 1.9248 (2.2138) loss 4.4177 (3.8111) grad_norm 0.9553 (1.1697) [2022-01-19 18:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][810/1251] eta 0:16:15 lr 0.000874 time 2.1929 (2.2124) loss 2.9677 (3.8106) grad_norm 0.9963 (1.1691) [2022-01-19 18:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][820/1251] eta 0:15:54 lr 0.000874 time 1.5602 (2.2135) loss 3.7591 (3.8078) grad_norm 1.0974 (1.1696) [2022-01-19 18:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][830/1251] eta 0:15:31 lr 0.000874 time 1.5775 (2.2137) loss 4.2971 (3.8072) grad_norm 1.0799 (1.1692) [2022-01-19 18:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][840/1251] eta 0:15:10 lr 0.000874 time 1.9476 (2.2152) loss 4.1794 (3.8072) grad_norm 0.9962 (1.1689) [2022-01-19 18:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][850/1251] eta 0:14:47 lr 0.000874 time 
2.0682 (2.2137) loss 3.1123 (3.8049) grad_norm 1.1766 (1.1684) [2022-01-19 18:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][860/1251] eta 0:14:24 lr 0.000874 time 1.9208 (2.2105) loss 3.5395 (3.8074) grad_norm 1.3840 (1.1688) [2022-01-19 18:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][870/1251] eta 0:14:01 lr 0.000874 time 1.6133 (2.2092) loss 4.0579 (3.8111) grad_norm 1.2906 (1.1683) [2022-01-19 18:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][880/1251] eta 0:13:40 lr 0.000874 time 2.3517 (2.2114) loss 3.4351 (3.8125) grad_norm 1.0549 (1.1696) [2022-01-19 18:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][890/1251] eta 0:13:18 lr 0.000874 time 2.2279 (2.2110) loss 3.2793 (3.8103) grad_norm 1.1821 (1.1691) [2022-01-19 18:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][900/1251] eta 0:12:55 lr 0.000874 time 2.2613 (2.2106) loss 3.4264 (3.8105) grad_norm 0.9920 (1.1696) [2022-01-19 18:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][910/1251] eta 0:12:33 lr 0.000874 time 1.9292 (2.2105) loss 3.5345 (3.8114) grad_norm 1.2426 (1.1691) [2022-01-19 18:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][920/1251] eta 0:12:11 lr 0.000874 time 2.1937 (2.2101) loss 4.1572 (3.8140) grad_norm 1.1032 (1.1686) [2022-01-19 18:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][930/1251] eta 0:11:49 lr 0.000874 time 1.7921 (2.2090) loss 4.1691 (3.8154) grad_norm 1.1471 (1.1673) [2022-01-19 18:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][940/1251] eta 0:11:26 lr 0.000874 time 2.0041 (2.2078) loss 4.5324 (3.8179) grad_norm 1.3213 (1.1677) [2022-01-19 18:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][950/1251] eta 0:11:03 lr 0.000874 time 1.5774 (2.2053) loss 3.2881 (3.8199) grad_norm 1.2142 (1.1684) [2022-01-19 18:31:15 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][960/1251] eta 0:10:42 lr 0.000874 time 1.9090 (2.2063) loss 3.0089 (3.8178) grad_norm 1.3870 (1.1679) [2022-01-19 18:31:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][970/1251] eta 0:10:20 lr 0.000874 time 1.7916 (2.2069) loss 3.9629 (3.8199) grad_norm 1.4815 (1.1678) [2022-01-19 18:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][980/1251] eta 0:09:58 lr 0.000874 time 2.1683 (2.2084) loss 3.6482 (3.8186) grad_norm 1.1163 (1.1673) [2022-01-19 18:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][990/1251] eta 0:09:36 lr 0.000874 time 2.1873 (2.2083) loss 4.3216 (3.8211) grad_norm 1.1602 (1.1672) [2022-01-19 18:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1000/1251] eta 0:09:14 lr 0.000874 time 2.4753 (2.2084) loss 3.9328 (3.8219) grad_norm 1.0721 (1.1667) [2022-01-19 18:33:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1010/1251] eta 0:08:51 lr 0.000874 time 1.7870 (2.2057) loss 4.2751 (3.8232) grad_norm 1.1758 (1.1664) [2022-01-19 18:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1020/1251] eta 0:08:29 lr 0.000873 time 2.0014 (2.2036) loss 3.7608 (3.8240) grad_norm 1.2292 (1.1665) [2022-01-19 18:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1030/1251] eta 0:08:06 lr 0.000873 time 2.2669 (2.2033) loss 4.3884 (3.8226) grad_norm 1.1345 (1.1658) [2022-01-19 18:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1040/1251] eta 0:07:44 lr 0.000873 time 1.8417 (2.2024) loss 3.3321 (3.8238) grad_norm 1.7459 (1.1657) [2022-01-19 18:34:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1050/1251] eta 0:07:22 lr 0.000873 time 2.3453 (2.2025) loss 4.7116 (3.8277) grad_norm 1.5090 (1.1662) [2022-01-19 18:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1060/1251] eta 0:07:00 lr 0.000873 
time 2.8147 (2.2034) loss 2.7928 (3.8265) grad_norm 1.1658 (1.1677) [2022-01-19 18:35:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1070/1251] eta 0:06:38 lr 0.000873 time 2.0915 (2.2034) loss 3.6695 (3.8263) grad_norm 1.2374 (1.1679) [2022-01-19 18:35:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1080/1251] eta 0:06:16 lr 0.000873 time 2.1712 (2.2032) loss 3.0499 (3.8261) grad_norm 1.1935 (1.1681) [2022-01-19 18:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1090/1251] eta 0:05:54 lr 0.000873 time 2.6353 (2.2033) loss 3.9667 (3.8257) grad_norm 1.3531 (1.1674) [2022-01-19 18:36:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1100/1251] eta 0:05:32 lr 0.000873 time 2.8370 (2.2037) loss 3.5246 (3.8262) grad_norm 0.9117 (1.1677) [2022-01-19 18:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1110/1251] eta 0:05:10 lr 0.000873 time 1.9746 (2.2030) loss 3.6647 (3.8280) grad_norm 1.1973 (1.1678) [2022-01-19 18:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1120/1251] eta 0:04:48 lr 0.000873 time 1.9448 (2.2014) loss 2.8923 (3.8251) grad_norm 1.3391 (1.1681) [2022-01-19 18:37:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1130/1251] eta 0:04:26 lr 0.000873 time 2.1061 (2.2011) loss 4.6405 (3.8277) grad_norm 1.0680 (1.1682) [2022-01-19 18:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1140/1251] eta 0:04:04 lr 0.000873 time 2.1587 (2.2006) loss 4.0545 (3.8279) grad_norm 1.1883 (1.1681) [2022-01-19 18:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1150/1251] eta 0:03:42 lr 0.000873 time 2.4715 (2.2012) loss 3.1111 (3.8273) grad_norm 1.1254 (1.1676) [2022-01-19 18:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1160/1251] eta 0:03:20 lr 0.000873 time 2.8726 (2.2023) loss 3.5804 (3.8255) grad_norm 1.3434 (1.1679) [2022-01-19 18:38:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1170/1251] eta 0:02:58 lr 0.000873 time 1.5216 (2.2030) loss 4.0106 (3.8262) grad_norm 1.0588 (1.1677) [2022-01-19 18:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1180/1251] eta 0:02:36 lr 0.000873 time 2.7551 (2.2040) loss 4.2173 (3.8280) grad_norm 1.1389 (1.1673) [2022-01-19 18:39:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1190/1251] eta 0:02:14 lr 0.000873 time 1.8744 (2.2037) loss 3.9668 (3.8245) grad_norm 1.0758 (1.1678) [2022-01-19 18:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1200/1251] eta 0:01:52 lr 0.000873 time 1.8547 (2.2021) loss 3.0357 (3.8248) grad_norm 1.0252 (1.1678) [2022-01-19 18:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1210/1251] eta 0:01:30 lr 0.000873 time 1.9901 (2.2006) loss 4.7462 (3.8268) grad_norm 1.1311 (1.1674) [2022-01-19 18:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1220/1251] eta 0:01:08 lr 0.000873 time 2.1738 (2.2002) loss 2.8959 (3.8266) grad_norm 1.0312 (1.1671) [2022-01-19 18:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1230/1251] eta 0:00:46 lr 0.000873 time 2.1595 (2.2002) loss 3.9961 (3.8270) grad_norm 1.0021 (1.1667) [2022-01-19 18:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1240/1251] eta 0:00:24 lr 0.000873 time 1.5434 (2.1999) loss 3.3976 (3.8276) grad_norm 1.0083 (1.1663) [2022-01-19 18:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [69/300][1250/1251] eta 0:00:02 lr 0.000873 time 1.3537 (2.1953) loss 4.3256 (3.8287) grad_norm 1.2738 (1.1661) [2022-01-19 18:41:41 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 69 training takes 0:45:46 [2022-01-19 18:41:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.055 (18.055) Loss 1.2438 (1.2438) Acc@1 72.266 (72.266) Acc@5 90.137 (90.137) [2022-01-19 18:42:16 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.328 (3.218) Loss 1.2650 (1.2286) Acc@1 72.168 (72.132) Acc@5 90.039 (90.732) [2022-01-19 18:42:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.233 (2.596) Loss 1.2310 (1.2149) Acc@1 72.070 (72.145) Acc@5 90.137 (90.974) [2022-01-19 18:42:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.213 (2.313) Loss 1.1975 (1.2147) Acc@1 71.289 (72.143) Acc@5 91.992 (91.041) [2022-01-19 18:43:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.964 (2.163) Loss 1.2014 (1.2161) Acc@1 71.387 (72.025) Acc@5 91.211 (90.978) [2022-01-19 18:43:17 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.996 Acc@5 91.026 [2022-01-19 18:43:17 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.0% [2022-01-19 18:43:17 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.00% [2022-01-19 18:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][0/1251] eta 7:32:37 lr 0.000873 time 21.7090 (21.7090) loss 4.0663 (4.0663) grad_norm 1.1383 (1.1383) [2022-01-19 18:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][10/1251] eta 1:24:00 lr 0.000873 time 2.2286 (4.0614) loss 3.2290 (4.0304) grad_norm 1.3083 (1.0809) [2022-01-19 18:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][20/1251] eta 1:01:59 lr 0.000873 time 1.5688 (3.0219) loss 3.9898 (3.8889) grad_norm 1.1715 (1.0721) [2022-01-19 18:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][30/1251] eta 0:57:46 lr 0.000873 time 1.5673 (2.8390) loss 3.6633 (3.8640) grad_norm 1.2204 (1.1013) [2022-01-19 18:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][40/1251] eta 0:54:27 lr 0.000873 time 3.3223 (2.6983) loss 4.2639 (3.9094) grad_norm 1.1286 (1.1214) [2022-01-19 18:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[70/300][50/1251] eta 0:52:53 lr 0.000873 time 2.8073 (2.6426) loss 3.3341 (3.8235) grad_norm 1.0099 (1.1216) [2022-01-19 18:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][60/1251] eta 0:51:21 lr 0.000873 time 2.1739 (2.5875) loss 4.2885 (3.8740) grad_norm 1.0251 (1.1218) [2022-01-19 18:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][70/1251] eta 0:49:41 lr 0.000873 time 1.9596 (2.5246) loss 2.5479 (3.8386) grad_norm 1.1684 (1.1254) [2022-01-19 18:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][80/1251] eta 0:48:06 lr 0.000873 time 1.8385 (2.4650) loss 2.7839 (3.8325) grad_norm 1.2902 (1.1323) [2022-01-19 18:46:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][90/1251] eta 0:47:00 lr 0.000873 time 1.7762 (2.4291) loss 4.3650 (3.8257) grad_norm 1.2490 (1.1426) [2022-01-19 18:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][100/1251] eta 0:45:51 lr 0.000873 time 1.9061 (2.3905) loss 3.2873 (3.8171) grad_norm 1.0887 (1.1425) [2022-01-19 18:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][110/1251] eta 0:44:47 lr 0.000873 time 1.9398 (2.3556) loss 4.1109 (3.8182) grad_norm 1.2078 (1.1471) [2022-01-19 18:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][120/1251] eta 0:44:07 lr 0.000873 time 1.6902 (2.3407) loss 3.3957 (3.7965) grad_norm 1.1856 (1.1467) [2022-01-19 18:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][130/1251] eta 0:43:39 lr 0.000872 time 2.6401 (2.3365) loss 4.5332 (3.8132) grad_norm 1.2316 (1.1530) [2022-01-19 18:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][140/1251] eta 0:43:06 lr 0.000872 time 2.4873 (2.3284) loss 4.1350 (3.7981) grad_norm 1.2574 (1.1511) [2022-01-19 18:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][150/1251] eta 0:42:35 lr 0.000872 time 2.5905 (2.3213) loss 3.9922 (3.8076) grad_norm 1.2004 (1.1582) 
[2022-01-19 18:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][160/1251] eta 0:42:04 lr 0.000872 time 1.6786 (2.3137) loss 4.1465 (3.8051) grad_norm 1.0215 (1.1574) [2022-01-19 18:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][170/1251] eta 0:41:43 lr 0.000872 time 2.5690 (2.3158) loss 3.3058 (3.7974) grad_norm 1.2229 (1.1566) [2022-01-19 18:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][180/1251] eta 0:41:13 lr 0.000872 time 2.4159 (2.3094) loss 4.1007 (3.8082) grad_norm 1.1357 (1.1567) [2022-01-19 18:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][190/1251] eta 0:40:49 lr 0.000872 time 2.2142 (2.3087) loss 4.2170 (3.8113) grad_norm 1.1966 (1.1570) [2022-01-19 18:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][200/1251] eta 0:40:20 lr 0.000872 time 1.5632 (2.3027) loss 3.3380 (3.8205) grad_norm 1.4520 (1.1594) [2022-01-19 18:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][210/1251] eta 0:39:45 lr 0.000872 time 1.6991 (2.2920) loss 3.9940 (3.8159) grad_norm 1.1229 (1.1556) [2022-01-19 18:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][220/1251] eta 0:39:10 lr 0.000872 time 1.8915 (2.2799) loss 4.2436 (3.8170) grad_norm 1.1631 (1.1530) [2022-01-19 18:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][230/1251] eta 0:38:44 lr 0.000872 time 2.7459 (2.2768) loss 3.9805 (3.8173) grad_norm 1.1553 (1.1544) [2022-01-19 18:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][240/1251] eta 0:38:14 lr 0.000872 time 1.8978 (2.2698) loss 4.5273 (3.8207) grad_norm 1.0686 (1.1553) [2022-01-19 18:52:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][250/1251] eta 0:37:44 lr 0.000872 time 1.9286 (2.2624) loss 3.6799 (3.8143) grad_norm 0.9932 (1.1547) [2022-01-19 18:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][260/1251] eta 0:37:23 
lr 0.000872 time 2.5149 (2.2638) loss 4.0245 (3.8150) grad_norm 1.2836 (1.1566) [2022-01-19 18:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][270/1251] eta 0:37:01 lr 0.000872 time 3.1406 (2.2642) loss 2.4008 (3.8119) grad_norm 1.1424 (1.1573) [2022-01-19 18:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][280/1251] eta 0:36:38 lr 0.000872 time 1.5634 (2.2644) loss 4.0691 (3.8123) grad_norm 1.1161 (1.1571) [2022-01-19 18:54:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][290/1251] eta 0:36:19 lr 0.000872 time 2.4537 (2.2675) loss 3.8929 (3.8066) grad_norm 1.3240 (1.1617) [2022-01-19 18:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][300/1251] eta 0:35:51 lr 0.000872 time 2.0801 (2.2626) loss 4.3878 (3.8099) grad_norm 1.2057 (1.1637) [2022-01-19 18:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][310/1251] eta 0:35:21 lr 0.000872 time 2.2031 (2.2549) loss 4.0407 (3.8191) grad_norm 1.1777 (1.1646) [2022-01-19 18:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][320/1251] eta 0:34:50 lr 0.000872 time 1.7560 (2.2452) loss 3.7951 (3.8181) grad_norm 1.2016 (1.1649) [2022-01-19 18:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][330/1251] eta 0:34:23 lr 0.000872 time 2.0031 (2.2406) loss 4.5286 (3.8151) grad_norm 1.0858 (1.1643) [2022-01-19 18:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][340/1251] eta 0:33:55 lr 0.000872 time 1.7322 (2.2345) loss 3.9335 (3.8116) grad_norm 1.1315 (1.1636) [2022-01-19 18:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][350/1251] eta 0:33:27 lr 0.000872 time 1.8840 (2.2285) loss 3.9251 (3.8075) grad_norm 1.1230 (1.1621) [2022-01-19 18:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][360/1251] eta 0:33:04 lr 0.000872 time 2.1558 (2.2270) loss 3.5244 (3.8112) grad_norm 1.0877 (1.1607) [2022-01-19 18:57:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][370/1251] eta 0:32:43 lr 0.000872 time 2.8616 (2.2282) loss 3.2117 (3.8090) grad_norm 1.1789 (1.1627) [2022-01-19 18:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][380/1251] eta 0:32:22 lr 0.000872 time 2.0941 (2.2302) loss 3.8789 (3.8091) grad_norm 1.1107 (1.1639) [2022-01-19 18:57:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][390/1251] eta 0:32:03 lr 0.000872 time 2.5044 (2.2344) loss 4.6116 (3.8142) grad_norm 1.0615 (1.1650) [2022-01-19 18:58:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][400/1251] eta 0:31:43 lr 0.000872 time 2.6294 (2.2368) loss 4.1415 (3.8058) grad_norm 1.1316 (1.1645) [2022-01-19 18:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][410/1251] eta 0:31:23 lr 0.000872 time 2.8688 (2.2392) loss 4.1585 (3.8147) grad_norm 1.3277 (1.1656) [2022-01-19 18:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][420/1251] eta 0:30:57 lr 0.000872 time 1.5407 (2.2353) loss 4.5881 (3.8198) grad_norm 1.0537 (1.1678) [2022-01-19 18:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][430/1251] eta 0:30:32 lr 0.000872 time 1.9966 (2.2317) loss 3.9269 (3.8175) grad_norm 1.2140 (1.1687) [2022-01-19 18:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][440/1251] eta 0:30:09 lr 0.000872 time 2.4638 (2.2313) loss 3.9321 (3.8132) grad_norm 1.1881 (1.1680) [2022-01-19 19:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][450/1251] eta 0:29:47 lr 0.000872 time 2.4594 (2.2316) loss 2.7947 (3.8075) grad_norm 1.0297 (1.1667) [2022-01-19 19:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][460/1251] eta 0:29:22 lr 0.000872 time 1.7088 (2.2288) loss 3.8170 (3.8067) grad_norm 1.3790 (1.1688) [2022-01-19 19:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][470/1251] eta 0:28:59 lr 0.000872 time 
2.4563 (2.2270) loss 4.7189 (3.8100) grad_norm 1.3470 (1.1704) [2022-01-19 19:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][480/1251] eta 0:28:36 lr 0.000872 time 2.8409 (2.2266) loss 4.3637 (3.8090) grad_norm 1.3535 (1.1712) [2022-01-19 19:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][490/1251] eta 0:28:15 lr 0.000871 time 2.2062 (2.2281) loss 3.5136 (3.8044) grad_norm 1.1303 (1.1699) [2022-01-19 19:01:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][500/1251] eta 0:27:52 lr 0.000871 time 1.8974 (2.2275) loss 3.6582 (3.8080) grad_norm 0.9736 (1.1679) [2022-01-19 19:02:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][510/1251] eta 0:27:29 lr 0.000871 time 2.3152 (2.2259) loss 3.3145 (3.8039) grad_norm 1.0802 (1.1668) [2022-01-19 19:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][520/1251] eta 0:27:04 lr 0.000871 time 1.9064 (2.2225) loss 4.1473 (3.8033) grad_norm 1.3362 (1.1669) [2022-01-19 19:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][530/1251] eta 0:26:41 lr 0.000871 time 2.3143 (2.2207) loss 3.0382 (3.7997) grad_norm 1.1637 (1.1657) [2022-01-19 19:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][540/1251] eta 0:26:18 lr 0.000871 time 1.8029 (2.2203) loss 4.0831 (3.8026) grad_norm 1.0953 (1.1676) [2022-01-19 19:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][550/1251] eta 0:25:55 lr 0.000871 time 1.8210 (2.2191) loss 3.3108 (3.8006) grad_norm 1.1292 (1.1666) [2022-01-19 19:04:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][560/1251] eta 0:25:32 lr 0.000871 time 2.4161 (2.2177) loss 4.3616 (3.7987) grad_norm 1.0474 (1.1666) [2022-01-19 19:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][570/1251] eta 0:25:10 lr 0.000871 time 2.8533 (2.2187) loss 3.8471 (3.7940) grad_norm 1.3063 (1.1671) [2022-01-19 19:04:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][580/1251] eta 0:24:47 lr 0.000871 time 1.8065 (2.2175) loss 4.5714 (3.7967) grad_norm 1.1155 (1.1667) [2022-01-19 19:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][590/1251] eta 0:24:23 lr 0.000871 time 1.9597 (2.2142) loss 4.1805 (3.7978) grad_norm 1.0972 (1.1680) [2022-01-19 19:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][600/1251] eta 0:23:59 lr 0.000871 time 2.0581 (2.2106) loss 4.2003 (3.7994) grad_norm 1.0449 (1.1682) [2022-01-19 19:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][610/1251] eta 0:23:35 lr 0.000871 time 2.2559 (2.2087) loss 4.3427 (3.8003) grad_norm 1.2657 (1.1690) [2022-01-19 19:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][620/1251] eta 0:23:12 lr 0.000871 time 1.8293 (2.2069) loss 3.1053 (3.8034) grad_norm 1.1684 (1.1686) [2022-01-19 19:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][630/1251] eta 0:22:50 lr 0.000871 time 1.9732 (2.2074) loss 4.0280 (3.8058) grad_norm 1.1885 (1.1680) [2022-01-19 19:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][640/1251] eta 0:22:28 lr 0.000871 time 2.2468 (2.2072) loss 3.1868 (3.8028) grad_norm 1.1100 (1.1683) [2022-01-19 19:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][650/1251] eta 0:22:06 lr 0.000871 time 1.8259 (2.2068) loss 4.3876 (3.8050) grad_norm 1.0736 (1.1679) [2022-01-19 19:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][660/1251] eta 0:21:44 lr 0.000871 time 2.1318 (2.2080) loss 3.9569 (3.7975) grad_norm 1.2851 (1.1672) [2022-01-19 19:08:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][670/1251] eta 0:21:23 lr 0.000871 time 2.2442 (2.2089) loss 4.1989 (3.7987) grad_norm 1.1903 (1.1672) [2022-01-19 19:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][680/1251] eta 0:21:02 lr 0.000871 time 
1.8861 (2.2109) loss 4.4778 (3.7983) grad_norm 1.1210 (1.1670) [2022-01-19 19:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][690/1251] eta 0:20:40 lr 0.000871 time 2.4729 (2.2111) loss 3.9306 (3.7972) grad_norm 1.2410 (1.1665) [2022-01-19 19:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][700/1251] eta 0:20:19 lr 0.000871 time 1.8588 (2.2134) loss 3.8688 (3.8024) grad_norm 1.2208 (1.1662) [2022-01-19 19:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][710/1251] eta 0:19:58 lr 0.000871 time 3.0569 (2.2158) loss 4.0788 (3.8019) grad_norm 1.2196 (1.1657) [2022-01-19 19:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][720/1251] eta 0:19:35 lr 0.000871 time 1.5207 (2.2146) loss 3.4471 (3.8028) grad_norm 1.2694 (1.1653) [2022-01-19 19:10:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][730/1251] eta 0:19:11 lr 0.000871 time 1.5465 (2.2108) loss 2.8545 (3.7993) grad_norm 1.2003 (1.1657) [2022-01-19 19:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][740/1251] eta 0:18:48 lr 0.000871 time 1.8688 (2.2080) loss 3.2512 (3.8000) grad_norm 1.1181 (1.1662) [2022-01-19 19:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][750/1251] eta 0:18:25 lr 0.000871 time 2.8507 (2.2071) loss 4.6196 (3.8026) grad_norm 1.1273 (1.1674) [2022-01-19 19:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][760/1251] eta 0:18:02 lr 0.000871 time 1.8724 (2.2054) loss 2.9507 (3.8012) grad_norm 1.1597 (1.1681) [2022-01-19 19:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][770/1251] eta 0:17:41 lr 0.000871 time 2.4463 (2.2068) loss 4.1601 (3.7992) grad_norm 1.2490 (1.1677) [2022-01-19 19:12:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][780/1251] eta 0:17:19 lr 0.000871 time 2.3345 (2.2079) loss 3.2975 (3.7996) grad_norm 1.1434 (1.1682) [2022-01-19 19:12:25 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][790/1251] eta 0:16:58 lr 0.000871 time 3.0588 (2.2094) loss 4.4658 (3.8035) grad_norm 1.0016 (1.1685) [2022-01-19 19:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][800/1251] eta 0:16:37 lr 0.000871 time 1.6054 (2.2119) loss 4.2332 (3.8049) grad_norm 0.9432 (1.1677) [2022-01-19 19:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][810/1251] eta 0:16:16 lr 0.000871 time 2.0856 (2.2135) loss 4.4615 (3.8076) grad_norm 1.1639 (1.1676) [2022-01-19 19:13:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][820/1251] eta 0:15:55 lr 0.000871 time 2.8161 (2.2160) loss 4.3316 (3.8108) grad_norm 1.1764 (1.1673) [2022-01-19 19:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][830/1251] eta 0:15:32 lr 0.000871 time 2.5334 (2.2152) loss 4.5302 (3.8121) grad_norm 1.1166 (1.1669) [2022-01-19 19:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][840/1251] eta 0:15:09 lr 0.000871 time 1.8022 (2.2122) loss 4.1526 (3.8124) grad_norm 1.2726 (1.1667) [2022-01-19 19:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][850/1251] eta 0:14:45 lr 0.000870 time 1.9394 (2.2086) loss 2.7084 (3.8115) grad_norm 1.8276 (1.1670) [2022-01-19 19:14:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][860/1251] eta 0:14:22 lr 0.000870 time 1.5621 (2.2062) loss 4.1235 (3.8148) grad_norm 1.0028 (1.1676) [2022-01-19 19:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][870/1251] eta 0:13:59 lr 0.000870 time 2.0458 (2.2039) loss 2.9019 (3.8151) grad_norm 1.2209 (1.1669) [2022-01-19 19:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][880/1251] eta 0:13:37 lr 0.000870 time 1.7664 (2.2032) loss 4.3306 (3.8177) grad_norm 1.2915 (1.1674) [2022-01-19 19:16:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][890/1251] eta 0:13:15 lr 0.000870 time 
2.9500 (2.2035) loss 3.9997 (3.8185) grad_norm 1.5759 (1.1682) [2022-01-19 19:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][900/1251] eta 0:12:53 lr 0.000870 time 1.6454 (2.2023) loss 4.3547 (3.8194) grad_norm 1.1218 (1.1687) [2022-01-19 19:16:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][910/1251] eta 0:12:31 lr 0.000870 time 2.0369 (2.2035) loss 3.5662 (3.8219) grad_norm 0.9695 (1.1677) [2022-01-19 19:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][920/1251] eta 0:12:10 lr 0.000870 time 3.0791 (2.2067) loss 4.3108 (3.8222) grad_norm 1.1221 (1.1673) [2022-01-19 19:17:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][930/1251] eta 0:11:50 lr 0.000870 time 2.2134 (2.2131) loss 4.2469 (3.8229) grad_norm 1.1494 (1.1668) [2022-01-19 19:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][940/1251] eta 0:11:29 lr 0.000870 time 2.5121 (2.2156) loss 2.8108 (3.8239) grad_norm 1.2896 (1.1667) [2022-01-19 19:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][950/1251] eta 0:11:07 lr 0.000870 time 1.7639 (2.2161) loss 4.3706 (3.8232) grad_norm 1.0556 (1.1669) [2022-01-19 19:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][960/1251] eta 0:10:44 lr 0.000870 time 1.9733 (2.2145) loss 3.5086 (3.8235) grad_norm 0.9856 (1.1660) [2022-01-19 19:19:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][970/1251] eta 0:10:21 lr 0.000870 time 1.8876 (2.2108) loss 4.1652 (3.8246) grad_norm 1.2360 (1.1658) [2022-01-19 19:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][980/1251] eta 0:09:58 lr 0.000870 time 2.2235 (2.2088) loss 4.5553 (3.8235) grad_norm 1.0664 (1.1653) [2022-01-19 19:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][990/1251] eta 0:09:36 lr 0.000870 time 2.7981 (2.2096) loss 4.0540 (3.8244) grad_norm 1.0787 (1.1654) [2022-01-19 19:20:09 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1000/1251] eta 0:09:14 lr 0.000870 time 2.2591 (2.2093) loss 4.4040 (3.8238) grad_norm 1.1260 (1.1654) [2022-01-19 19:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1010/1251] eta 0:08:52 lr 0.000870 time 2.5623 (2.2100) loss 3.2832 (3.8226) grad_norm 1.2045 (1.1653) [2022-01-19 19:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1020/1251] eta 0:08:30 lr 0.000870 time 2.5349 (2.2103) loss 4.6729 (3.8265) grad_norm 1.1318 (1.1656) [2022-01-19 19:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1030/1251] eta 0:08:08 lr 0.000870 time 1.9396 (2.2101) loss 4.1193 (3.8290) grad_norm 1.1448 (1.1656) [2022-01-19 19:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1040/1251] eta 0:07:46 lr 0.000870 time 1.8776 (2.2104) loss 3.7326 (3.8276) grad_norm 1.2009 (1.1661) [2022-01-19 19:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1050/1251] eta 0:07:24 lr 0.000870 time 3.1575 (2.2115) loss 3.7984 (3.8303) grad_norm 1.1482 (1.1659) [2022-01-19 19:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1060/1251] eta 0:07:02 lr 0.000870 time 1.9603 (2.2110) loss 4.1133 (3.8326) grad_norm 1.0913 (1.1663) [2022-01-19 19:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1070/1251] eta 0:06:39 lr 0.000870 time 2.2630 (2.2099) loss 3.0849 (3.8321) grad_norm 0.9529 (1.1656) [2022-01-19 19:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1080/1251] eta 0:06:17 lr 0.000870 time 1.5582 (2.2097) loss 4.8674 (3.8342) grad_norm 1.3434 (1.1660) [2022-01-19 19:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1090/1251] eta 0:05:55 lr 0.000870 time 2.4737 (2.2096) loss 3.2146 (3.8339) grad_norm 1.2427 (1.1664) [2022-01-19 19:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1100/1251] eta 0:05:33 lr 
0.000870 time 1.8891 (2.2087) loss 4.5865 (3.8348) grad_norm 1.1037 (1.1660) [2022-01-19 19:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1110/1251] eta 0:05:11 lr 0.000870 time 2.9661 (2.2085) loss 3.4667 (3.8347) grad_norm 1.4781 (1.1663) [2022-01-19 19:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1120/1251] eta 0:04:48 lr 0.000870 time 1.6738 (2.2060) loss 4.3372 (3.8351) grad_norm 1.2041 (1.1665) [2022-01-19 19:24:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1130/1251] eta 0:04:26 lr 0.000870 time 2.4963 (2.2052) loss 4.4843 (3.8366) grad_norm 1.0108 (1.1660) [2022-01-19 19:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1140/1251] eta 0:04:04 lr 0.000870 time 1.8758 (2.2049) loss 3.7005 (3.8358) grad_norm 1.1347 (1.1661) [2022-01-19 19:25:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1150/1251] eta 0:03:42 lr 0.000870 time 2.4203 (2.2055) loss 3.8334 (3.8361) grad_norm 0.9971 (1.1656) [2022-01-19 19:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1160/1251] eta 0:03:20 lr 0.000870 time 1.9387 (2.2056) loss 3.0383 (3.8362) grad_norm 1.0196 (1.1655) [2022-01-19 19:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1170/1251] eta 0:02:58 lr 0.000870 time 2.8056 (2.2056) loss 3.4445 (3.8333) grad_norm 0.9987 (1.1664) [2022-01-19 19:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1180/1251] eta 0:02:36 lr 0.000870 time 1.9082 (2.2052) loss 3.3690 (3.8329) grad_norm 1.1124 (1.1669) [2022-01-19 19:27:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1190/1251] eta 0:02:14 lr 0.000870 time 2.4689 (2.2045) loss 3.2743 (3.8325) grad_norm 1.3044 (1.1668) [2022-01-19 19:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1200/1251] eta 0:01:52 lr 0.000870 time 1.8218 (2.2041) loss 3.8034 (3.8319) grad_norm 1.1495 (1.1672) [2022-01-19 19:27:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1210/1251] eta 0:01:30 lr 0.000869 time 1.7417 (2.2037) loss 4.3265 (3.8301) grad_norm 1.1603 (1.1672) [2022-01-19 19:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1220/1251] eta 0:01:08 lr 0.000869 time 1.5365 (2.2034) loss 4.3048 (3.8291) grad_norm 1.1694 (1.1671) [2022-01-19 19:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1230/1251] eta 0:00:46 lr 0.000869 time 2.9132 (2.2037) loss 4.1900 (3.8297) grad_norm 1.1508 (1.1668) [2022-01-19 19:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1240/1251] eta 0:00:24 lr 0.000869 time 1.4347 (2.2023) loss 4.0592 (3.8280) grad_norm 1.2212 (1.1672) [2022-01-19 19:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [70/300][1250/1251] eta 0:00:02 lr 0.000869 time 1.1240 (2.1973) loss 4.0034 (3.8268) grad_norm 1.2232 (1.1671) [2022-01-19 19:29:07 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 70 training takes 0:45:49 [2022-01-19 19:29:07 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_70 saving...... [2022-01-19 19:29:18 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_70 saved !!! 
[2022-01-19 19:29:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.900 (16.900) Loss 1.1781 (1.1781) Acc@1 70.996 (70.996) Acc@5 90.527 (90.527) [2022-01-19 19:29:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.660 (2.814) Loss 1.1416 (1.2048) Acc@1 73.145 (71.813) Acc@5 91.992 (91.202) [2022-01-19 19:30:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.590 (2.388) Loss 1.1710 (1.2042) Acc@1 73.438 (71.917) Acc@5 91.699 (91.090) [2022-01-19 19:30:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.264 (2.082) Loss 1.2411 (1.2128) Acc@1 71.484 (71.888) Acc@5 90.332 (90.921) [2022-01-19 19:30:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.097 (2.045) Loss 1.1573 (1.2098) Acc@1 72.070 (71.858) Acc@5 91.895 (90.994) [2022-01-19 19:30:50 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 71.784 Acc@5 91.102 [2022-01-19 19:30:50 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 71.8% [2022-01-19 19:30:50 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.00% [2022-01-19 19:31:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][0/1251] eta 7:35:36 lr 0.000869 time 21.8515 (21.8515) loss 3.2346 (3.2346) grad_norm 1.2926 (1.2926) [2022-01-19 19:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][10/1251] eta 1:23:12 lr 0.000869 time 2.6388 (4.0228) loss 4.4890 (3.9898) grad_norm 1.0160 (1.1168) [2022-01-19 19:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][20/1251] eta 1:05:04 lr 0.000869 time 1.9521 (3.1715) loss 3.8652 (3.9698) grad_norm 1.1180 (1.1170) [2022-01-19 19:32:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][30/1251] eta 1:00:22 lr 0.000869 time 1.4664 (2.9665) loss 4.5333 (3.8351) grad_norm 1.3712 (1.1502) [2022-01-19 19:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[71/300][40/1251] eta 0:57:38 lr 0.000869 time 2.6944 (2.8557) loss 3.5197 (3.8453) grad_norm 1.0976 (1.1600) [2022-01-19 19:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][50/1251] eta 0:55:19 lr 0.000869 time 3.4305 (2.7636) loss 3.6979 (3.8539) grad_norm 1.0455 (1.1432) [2022-01-19 19:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][60/1251] eta 0:52:49 lr 0.000869 time 1.8778 (2.6610) loss 3.8442 (3.8706) grad_norm 1.0464 (1.1360) [2022-01-19 19:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][70/1251] eta 0:50:55 lr 0.000869 time 2.0525 (2.5873) loss 3.7958 (3.8752) grad_norm 1.1404 (1.1430) [2022-01-19 19:34:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][80/1251] eta 0:49:01 lr 0.000869 time 2.0163 (2.5118) loss 3.9058 (3.8388) grad_norm 1.3244 (1.1514) [2022-01-19 19:34:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][90/1251] eta 0:47:37 lr 0.000869 time 2.5759 (2.4615) loss 4.0603 (3.8379) grad_norm 1.1234 (1.1545) [2022-01-19 19:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][100/1251] eta 0:46:34 lr 0.000869 time 2.3263 (2.4276) loss 4.1649 (3.8338) grad_norm 1.0814 (1.1594) [2022-01-19 19:35:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][110/1251] eta 0:45:44 lr 0.000869 time 2.2871 (2.4056) loss 3.8489 (3.8388) grad_norm 1.1690 (1.1646) [2022-01-19 19:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][120/1251] eta 0:45:12 lr 0.000869 time 2.0878 (2.3982) loss 4.1862 (3.8354) grad_norm 1.2542 (1.1661) [2022-01-19 19:36:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][130/1251] eta 0:44:37 lr 0.000869 time 2.1046 (2.3888) loss 3.2743 (3.8073) grad_norm 1.1748 (1.1630) [2022-01-19 19:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][140/1251] eta 0:44:04 lr 0.000869 time 2.3081 (2.3804) loss 4.5660 (3.8186) grad_norm 1.2746 (1.1591) 
[2022-01-19 19:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][150/1251] eta 0:43:18 lr 0.000869 time 1.8927 (2.3602) loss 4.5595 (3.8144) grad_norm 1.1895 (1.1599) [2022-01-19 19:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][160/1251] eta 0:42:36 lr 0.000869 time 1.6023 (2.3430) loss 3.2176 (3.8085) grad_norm 0.9320 (1.1597) [2022-01-19 19:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][170/1251] eta 0:41:51 lr 0.000869 time 1.7761 (2.3233) loss 3.2463 (3.8120) grad_norm 1.7075 (1.1675) [2022-01-19 19:37:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][180/1251] eta 0:41:26 lr 0.000869 time 2.3080 (2.3215) loss 3.8870 (3.8271) grad_norm 1.1737 (1.1715) [2022-01-19 19:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][190/1251] eta 0:41:10 lr 0.000869 time 2.1736 (2.3281) loss 3.5050 (3.8178) grad_norm 1.3416 (1.1784) [2022-01-19 19:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][200/1251] eta 0:40:48 lr 0.000869 time 2.1846 (2.3301) loss 3.8396 (3.7979) grad_norm 1.0885 (1.1798) [2022-01-19 19:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][210/1251] eta 0:40:19 lr 0.000869 time 1.9024 (2.3238) loss 4.2950 (3.8039) grad_norm 1.0353 (1.1776) [2022-01-19 19:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][220/1251] eta 0:39:43 lr 0.000869 time 1.9406 (2.3115) loss 2.7470 (3.8000) grad_norm 1.1945 (1.1802) [2022-01-19 19:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][230/1251] eta 0:39:07 lr 0.000869 time 1.8472 (2.2996) loss 3.9792 (3.7953) grad_norm 1.5844 (1.1799) [2022-01-19 19:40:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][240/1251] eta 0:38:34 lr 0.000869 time 1.9281 (2.2891) loss 3.9854 (3.8049) grad_norm 1.0022 (1.1801) [2022-01-19 19:40:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][250/1251] eta 0:38:07 
lr 0.000869 time 2.1720 (2.2851) loss 3.4298 (3.7859) grad_norm 1.4263 (1.1789) [2022-01-19 19:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][260/1251] eta 0:37:43 lr 0.000869 time 2.5586 (2.2840) loss 3.7160 (3.7892) grad_norm 1.1288 (1.1784) [2022-01-19 19:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][270/1251] eta 0:37:14 lr 0.000869 time 1.9565 (2.2778) loss 4.7694 (3.7862) grad_norm 1.0858 (1.1783) [2022-01-19 19:41:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][280/1251] eta 0:36:48 lr 0.000869 time 1.6688 (2.2740) loss 3.5018 (3.7732) grad_norm 0.9776 (1.1769) [2022-01-19 19:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][290/1251] eta 0:36:25 lr 0.000869 time 2.5682 (2.2743) loss 3.6301 (3.7773) grad_norm 1.3155 (1.1796) [2022-01-19 19:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][300/1251] eta 0:36:06 lr 0.000869 time 1.7777 (2.2778) loss 4.3330 (3.7733) grad_norm 1.2379 (1.1803) [2022-01-19 19:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][310/1251] eta 0:35:44 lr 0.000868 time 1.9109 (2.2787) loss 3.0865 (3.7756) grad_norm 1.0941 (1.1797) [2022-01-19 19:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][320/1251] eta 0:35:19 lr 0.000868 time 1.8600 (2.2764) loss 4.2967 (3.7817) grad_norm 1.0674 (1.1786) [2022-01-19 19:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][330/1251] eta 0:34:47 lr 0.000868 time 1.7029 (2.2669) loss 4.3509 (3.7810) grad_norm 1.1205 (1.1777) [2022-01-19 19:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][340/1251] eta 0:34:18 lr 0.000868 time 1.7603 (2.2592) loss 3.1596 (3.7768) grad_norm 1.1985 (1.1770) [2022-01-19 19:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][350/1251] eta 0:33:53 lr 0.000868 time 1.9077 (2.2574) loss 3.8543 (3.7668) grad_norm 1.5024 (1.1815) [2022-01-19 19:44:25 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][360/1251] eta 0:33:33 lr 0.000868 time 1.8610 (2.2596) loss 2.5549 (3.7655) grad_norm 1.4179 (1.1841) [2022-01-19 19:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][370/1251] eta 0:33:10 lr 0.000868 time 2.8407 (2.2591) loss 4.7445 (3.7732) grad_norm 1.2719 (1.1829) [2022-01-19 19:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][380/1251] eta 0:32:48 lr 0.000868 time 2.2477 (2.2596) loss 3.7124 (3.7721) grad_norm 1.1124 (1.1811) [2022-01-19 19:45:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][390/1251] eta 0:32:24 lr 0.000868 time 2.1773 (2.2587) loss 4.4787 (3.7755) grad_norm 1.0056 (1.1801) [2022-01-19 19:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][400/1251] eta 0:32:01 lr 0.000868 time 2.2918 (2.2574) loss 3.2279 (3.7697) grad_norm 1.1748 (1.1798) [2022-01-19 19:46:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][410/1251] eta 0:31:36 lr 0.000868 time 2.5944 (2.2548) loss 3.5123 (3.7682) grad_norm 1.1092 (1.1799) [2022-01-19 19:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][420/1251] eta 0:31:10 lr 0.000868 time 2.1705 (2.2506) loss 2.7351 (3.7671) grad_norm 1.0174 (1.1787) [2022-01-19 19:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][430/1251] eta 0:30:49 lr 0.000868 time 2.2055 (2.2528) loss 4.2278 (3.7639) grad_norm 1.0406 (1.1774) [2022-01-19 19:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][440/1251] eta 0:30:27 lr 0.000868 time 2.5581 (2.2534) loss 3.2962 (3.7646) grad_norm 1.2054 (1.1773) [2022-01-19 19:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][450/1251] eta 0:30:02 lr 0.000868 time 1.9902 (2.2500) loss 3.1918 (3.7659) grad_norm 1.1878 (1.1776) [2022-01-19 19:48:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][460/1251] eta 0:29:35 lr 0.000868 time 
2.0210 (2.2450) loss 4.4845 (3.7725) grad_norm 1.2425 (1.1776) [2022-01-19 19:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][470/1251] eta 0:29:11 lr 0.000868 time 1.8534 (2.2429) loss 4.2378 (3.7760) grad_norm 1.2848 (1.1775) [2022-01-19 19:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][480/1251] eta 0:28:48 lr 0.000868 time 2.4928 (2.2413) loss 2.7598 (3.7764) grad_norm 1.3669 (1.1790) [2022-01-19 19:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][490/1251] eta 0:28:24 lr 0.000868 time 1.6597 (2.2397) loss 3.9954 (3.7797) grad_norm 1.0464 (1.1805) [2022-01-19 19:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][500/1251] eta 0:28:01 lr 0.000868 time 1.9012 (2.2386) loss 3.5979 (3.7770) grad_norm 0.9362 (1.1805) [2022-01-19 19:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][510/1251] eta 0:27:40 lr 0.000868 time 2.6686 (2.2416) loss 3.3322 (3.7774) grad_norm 0.9668 (1.1798) [2022-01-19 19:50:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][520/1251] eta 0:27:20 lr 0.000868 time 3.6605 (2.2438) loss 4.7012 (3.7827) grad_norm 1.1538 (1.1796) [2022-01-19 19:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][530/1251] eta 0:26:56 lr 0.000868 time 2.3141 (2.2419) loss 3.7901 (3.7780) grad_norm 1.2135 (1.1790) [2022-01-19 19:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][540/1251] eta 0:26:31 lr 0.000868 time 1.6575 (2.2388) loss 4.4076 (3.7786) grad_norm 1.1024 (1.1789) [2022-01-19 19:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][550/1251] eta 0:26:06 lr 0.000868 time 1.8133 (2.2346) loss 3.4172 (3.7770) grad_norm 1.1076 (1.1786) [2022-01-19 19:51:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][560/1251] eta 0:25:40 lr 0.000868 time 1.8789 (2.2296) loss 3.3564 (3.7757) grad_norm 1.3760 (1.1794) [2022-01-19 19:52:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][570/1251] eta 0:25:16 lr 0.000868 time 2.1348 (2.2265) loss 3.3520 (3.7791) grad_norm 1.2398 (1.1790) [2022-01-19 19:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][580/1251] eta 0:24:51 lr 0.000868 time 1.6921 (2.2231) loss 3.7896 (3.7735) grad_norm 1.0157 (1.1793) [2022-01-19 19:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][590/1251] eta 0:24:29 lr 0.000868 time 1.8668 (2.2225) loss 4.0599 (3.7778) grad_norm 1.0599 (1.1784) [2022-01-19 19:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][600/1251] eta 0:24:05 lr 0.000868 time 1.6613 (2.2211) loss 3.0542 (3.7780) grad_norm 1.1322 (1.1788) [2022-01-19 19:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][610/1251] eta 0:23:43 lr 0.000868 time 2.4220 (2.2203) loss 3.3036 (3.7845) grad_norm 1.2819 (1.1783) [2022-01-19 19:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][620/1251] eta 0:23:20 lr 0.000868 time 2.8951 (2.2197) loss 4.3276 (3.7859) grad_norm 1.2522 (1.1785) [2022-01-19 19:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][630/1251] eta 0:22:58 lr 0.000868 time 1.6756 (2.2191) loss 3.6783 (3.7886) grad_norm 1.1436 (1.1786) [2022-01-19 19:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][640/1251] eta 0:22:36 lr 0.000868 time 1.8678 (2.2200) loss 3.5123 (3.7871) grad_norm 0.9413 (1.1775) [2022-01-19 19:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][650/1251] eta 0:22:15 lr 0.000868 time 1.5085 (2.2228) loss 3.6600 (3.7887) grad_norm 0.9765 (1.1769) [2022-01-19 19:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][660/1251] eta 0:21:55 lr 0.000868 time 3.1080 (2.2258) loss 4.4061 (3.7903) grad_norm 1.2388 (1.1780) [2022-01-19 19:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][670/1251] eta 0:21:32 lr 0.000867 time 
1.9541 (2.2245) loss 4.3063 (3.7934) grad_norm 1.1226 (1.1776) [2022-01-19 19:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][680/1251] eta 0:21:09 lr 0.000867 time 1.6650 (2.2235) loss 4.3887 (3.7943) grad_norm 1.0664 (1.1770) [2022-01-19 19:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][690/1251] eta 0:20:45 lr 0.000867 time 2.2567 (2.2208) loss 3.7620 (3.7969) grad_norm 1.2916 (1.1761) [2022-01-19 19:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][700/1251] eta 0:20:23 lr 0.000867 time 1.6874 (2.2196) loss 4.0986 (3.7958) grad_norm 1.1793 (1.1762) [2022-01-19 19:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][710/1251] eta 0:20:00 lr 0.000867 time 1.9134 (2.2188) loss 4.1320 (3.7934) grad_norm 1.1116 (1.1756) [2022-01-19 19:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][720/1251] eta 0:19:37 lr 0.000867 time 1.5615 (2.2182) loss 3.9458 (3.7923) grad_norm 1.2437 (1.1747) [2022-01-19 19:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][730/1251] eta 0:19:14 lr 0.000867 time 1.7064 (2.2160) loss 4.1225 (3.7927) grad_norm 1.0751 (1.1752) [2022-01-19 19:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][740/1251] eta 0:18:53 lr 0.000867 time 2.2651 (2.2173) loss 3.5036 (3.7953) grad_norm 1.1871 (1.1760) [2022-01-19 19:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][750/1251] eta 0:18:32 lr 0.000867 time 3.1590 (2.2204) loss 4.4783 (3.7948) grad_norm 1.4758 (1.1766) [2022-01-19 19:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][760/1251] eta 0:18:11 lr 0.000867 time 1.9828 (2.2223) loss 4.0752 (3.7951) grad_norm 1.1398 (1.1770) [2022-01-19 19:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][770/1251] eta 0:17:48 lr 0.000867 time 1.5508 (2.2213) loss 3.9787 (3.7990) grad_norm 1.1746 (1.1764) [2022-01-19 19:59:42 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][780/1251] eta 0:17:24 lr 0.000867 time 1.5580 (2.2186) loss 4.0199 (3.8020) grad_norm 1.0374 (1.1755) [2022-01-19 20:00:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][790/1251] eta 0:17:02 lr 0.000867 time 2.2159 (2.2169) loss 4.0203 (3.8040) grad_norm 1.0999 (1.1749) [2022-01-19 20:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][800/1251] eta 0:16:41 lr 0.000867 time 2.7076 (2.2216) loss 4.4054 (3.8096) grad_norm 1.2135 (1.1752) [2022-01-19 20:00:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][810/1251] eta 0:16:18 lr 0.000867 time 1.9074 (2.2193) loss 3.5409 (3.8062) grad_norm 1.1608 (1.1763) [2022-01-19 20:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][820/1251] eta 0:15:56 lr 0.000867 time 2.1793 (2.2184) loss 3.0196 (3.8061) grad_norm 0.9833 (1.1764) [2022-01-19 20:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][830/1251] eta 0:15:33 lr 0.000867 time 1.9504 (2.2182) loss 4.4660 (3.8106) grad_norm 1.2209 (1.1759) [2022-01-19 20:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][840/1251] eta 0:15:12 lr 0.000867 time 2.1323 (2.2193) loss 4.1236 (3.8103) grad_norm 1.0856 (1.1761) [2022-01-19 20:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][850/1251] eta 0:14:49 lr 0.000867 time 1.9493 (2.2187) loss 3.7882 (3.8101) grad_norm 0.9892 (1.1755) [2022-01-19 20:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][860/1251] eta 0:14:26 lr 0.000867 time 1.8499 (2.2160) loss 3.3605 (3.8096) grad_norm 1.0696 (1.1750) [2022-01-19 20:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][870/1251] eta 0:14:03 lr 0.000867 time 1.5855 (2.2151) loss 2.8317 (3.8079) grad_norm 1.2214 (1.1745) [2022-01-19 20:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][880/1251] eta 0:13:41 lr 0.000867 time 
1.7068 (2.2136) loss 3.9700 (3.8086) grad_norm 1.2376 (1.1753) [2022-01-19 20:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][890/1251] eta 0:13:19 lr 0.000867 time 2.1621 (2.2140) loss 3.6627 (3.8069) grad_norm 1.2377 (1.1753) [2022-01-19 20:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][900/1251] eta 0:12:57 lr 0.000867 time 2.8254 (2.2157) loss 2.8104 (3.8073) grad_norm 1.1363 (1.1756) [2022-01-19 20:04:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][910/1251] eta 0:12:35 lr 0.000867 time 2.8414 (2.2165) loss 3.7138 (3.8065) grad_norm 1.0030 (1.1746) [2022-01-19 20:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][920/1251] eta 0:12:13 lr 0.000867 time 2.7612 (2.2163) loss 2.9303 (3.8051) grad_norm 1.0862 (1.1742) [2022-01-19 20:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][930/1251] eta 0:11:51 lr 0.000867 time 1.5896 (2.2156) loss 4.1356 (3.8062) grad_norm 1.1506 (1.1739) [2022-01-19 20:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][940/1251] eta 0:11:29 lr 0.000867 time 1.8312 (2.2158) loss 4.0850 (3.8059) grad_norm 1.1970 (1.1738) [2022-01-19 20:05:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][950/1251] eta 0:11:06 lr 0.000867 time 1.8286 (2.2139) loss 2.7811 (3.8085) grad_norm 1.1353 (1.1742) [2022-01-19 20:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][960/1251] eta 0:10:44 lr 0.000867 time 3.6203 (2.2147) loss 3.7338 (3.8096) grad_norm 1.2739 (1.1745) [2022-01-19 20:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][970/1251] eta 0:10:21 lr 0.000867 time 2.0075 (2.2128) loss 4.2322 (3.8085) grad_norm 1.1601 (1.1749) [2022-01-19 20:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][980/1251] eta 0:09:59 lr 0.000867 time 2.1185 (2.2106) loss 3.7692 (3.8105) grad_norm 1.0765 (1.1744) [2022-01-19 20:07:20 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][990/1251] eta 0:09:36 lr 0.000867 time 2.4671 (2.2101) loss 3.9322 (3.8121) grad_norm 1.1290 (1.1735) [2022-01-19 20:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1000/1251] eta 0:09:14 lr 0.000867 time 2.3897 (2.2097) loss 3.8770 (3.8124) grad_norm 1.0787 (1.1735) [2022-01-19 20:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1010/1251] eta 0:08:52 lr 0.000867 time 1.8830 (2.2096) loss 3.9509 (3.8133) grad_norm 1.0374 (1.1729) [2022-01-19 20:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1020/1251] eta 0:08:30 lr 0.000866 time 2.3067 (2.2093) loss 3.1622 (3.8129) grad_norm 1.1295 (1.1732) [2022-01-19 20:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1030/1251] eta 0:08:08 lr 0.000866 time 3.0308 (2.2109) loss 3.6015 (3.8109) grad_norm 1.1290 (1.1736) [2022-01-19 20:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1040/1251] eta 0:07:46 lr 0.000866 time 2.4496 (2.2115) loss 3.0113 (3.8105) grad_norm 1.1771 (1.1734) [2022-01-19 20:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1050/1251] eta 0:07:24 lr 0.000866 time 2.2474 (2.2116) loss 4.2337 (3.8090) grad_norm 1.1017 (1.1724) [2022-01-19 20:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1060/1251] eta 0:07:02 lr 0.000866 time 1.9240 (2.2118) loss 3.4260 (3.8106) grad_norm 1.3179 (1.1727) [2022-01-19 20:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1070/1251] eta 0:06:40 lr 0.000866 time 1.9671 (2.2117) loss 4.2205 (3.8101) grad_norm 1.1437 (1.1725) [2022-01-19 20:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1080/1251] eta 0:06:18 lr 0.000866 time 1.8152 (2.2109) loss 3.1473 (3.8108) grad_norm 1.3166 (1.1727) [2022-01-19 20:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1090/1251] eta 0:05:55 lr 0.000866 
time 1.8809 (2.2099) loss 4.0565 (3.8141) grad_norm 1.2411 (1.1726) [2022-01-19 20:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1100/1251] eta 0:05:33 lr 0.000866 time 1.6303 (2.2092) loss 3.9939 (3.8133) grad_norm 1.1339 (1.1730) [2022-01-19 20:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1110/1251] eta 0:05:11 lr 0.000866 time 1.8308 (2.2096) loss 3.8222 (3.8097) grad_norm 1.1548 (1.1728) [2022-01-19 20:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1120/1251] eta 0:04:49 lr 0.000866 time 2.3925 (2.2105) loss 4.4577 (3.8104) grad_norm 1.4854 (1.1736) [2022-01-19 20:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1130/1251] eta 0:04:27 lr 0.000866 time 1.7650 (2.2103) loss 4.7231 (3.8123) grad_norm 1.1000 (1.1735) [2022-01-19 20:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1140/1251] eta 0:04:05 lr 0.000866 time 1.5322 (2.2095) loss 3.3234 (3.8120) grad_norm 1.2637 (1.1736) [2022-01-19 20:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1150/1251] eta 0:03:43 lr 0.000866 time 1.8247 (2.2082) loss 4.2170 (3.8137) grad_norm 1.0966 (1.1734) [2022-01-19 20:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1160/1251] eta 0:03:20 lr 0.000866 time 2.4104 (2.2083) loss 3.0596 (3.8136) grad_norm 0.9838 (1.1738) [2022-01-19 20:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1170/1251] eta 0:02:58 lr 0.000866 time 1.8204 (2.2082) loss 3.6817 (3.8122) grad_norm 1.2538 (1.1743) [2022-01-19 20:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1180/1251] eta 0:02:36 lr 0.000866 time 1.8362 (2.2085) loss 3.7267 (3.8107) grad_norm 1.1768 (1.1748) [2022-01-19 20:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1190/1251] eta 0:02:14 lr 0.000866 time 2.2855 (2.2080) loss 3.0385 (3.8079) grad_norm 1.1421 (1.1746) [2022-01-19 20:15:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1200/1251] eta 0:01:52 lr 0.000866 time 3.0515 (2.2068) loss 4.1066 (3.8086) grad_norm 1.2082 (1.1753) [2022-01-19 20:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1210/1251] eta 0:01:30 lr 0.000866 time 1.6069 (2.2051) loss 4.1130 (3.8079) grad_norm 1.1006 (1.1751) [2022-01-19 20:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1220/1251] eta 0:01:08 lr 0.000866 time 2.7740 (2.2053) loss 4.0406 (3.8067) grad_norm 1.3114 (1.1750) [2022-01-19 20:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1230/1251] eta 0:00:46 lr 0.000866 time 2.2493 (2.2048) loss 4.3000 (3.8063) grad_norm 1.2062 (1.1753) [2022-01-19 20:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1240/1251] eta 0:00:24 lr 0.000866 time 2.1610 (2.2040) loss 3.9307 (3.8057) grad_norm 1.1547 (1.1753) [2022-01-19 20:16:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [71/300][1250/1251] eta 0:00:02 lr 0.000866 time 1.1682 (2.1983) loss 2.4055 (3.8055) grad_norm 1.0829 (1.1751) [2022-01-19 20:16:40 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 71 training takes 0:45:50 [2022-01-19 20:16:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.043 (19.043) Loss 1.1913 (1.1913) Acc@1 72.363 (72.363) Acc@5 91.504 (91.504) [2022-01-19 20:17:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.647 (3.299) Loss 1.2302 (1.1990) Acc@1 72.168 (72.257) Acc@5 92.188 (91.486) [2022-01-19 20:17:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.614 (2.552) Loss 1.1183 (1.1874) Acc@1 74.023 (72.307) Acc@5 91.211 (91.443) [2022-01-19 20:17:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.569 (2.256) Loss 1.2852 (1.1954) Acc@1 69.629 (72.118) Acc@5 89.746 (91.287) [2022-01-19 20:18:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.083 
(2.145) Loss 1.1356 (1.1993) Acc@1 72.656 (71.913) Acc@5 91.602 (91.278) [2022-01-19 20:18:16 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.034 Acc@5 91.282 [2022-01-19 20:18:16 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.0% [2022-01-19 20:18:16 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.03% [2022-01-19 20:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][0/1251] eta 7:26:38 lr 0.000866 time 21.4214 (21.4214) loss 4.2115 (4.2115) grad_norm 1.0996 (1.0996) [2022-01-19 20:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][10/1251] eta 1:25:00 lr 0.000866 time 2.5483 (4.1103) loss 4.0010 (3.9890) grad_norm 0.9866 (1.1075) [2022-01-19 20:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][20/1251] eta 1:04:42 lr 0.000866 time 1.5253 (3.1543) loss 4.5295 (3.8989) grad_norm 1.0876 (1.1390) [2022-01-19 20:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][30/1251] eta 0:58:17 lr 0.000866 time 1.5628 (2.8645) loss 4.0125 (3.8639) grad_norm 1.2708 (1.1399) [2022-01-19 20:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][40/1251] eta 0:54:33 lr 0.000866 time 4.6078 (2.7032) loss 3.8409 (3.9207) grad_norm 1.1280 (1.1585) [2022-01-19 20:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][50/1251] eta 0:52:46 lr 0.000866 time 2.8290 (2.6367) loss 3.1700 (3.9173) grad_norm 1.2767 (1.1686) [2022-01-19 20:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][60/1251] eta 0:51:16 lr 0.000866 time 2.1864 (2.5829) loss 4.6406 (3.9240) grad_norm 1.2779 (1.1931) [2022-01-19 20:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][70/1251] eta 0:49:43 lr 0.000866 time 1.8715 (2.5260) loss 3.9485 (3.9121) grad_norm 1.1656 (1.2148) [2022-01-19 20:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][80/1251] eta 
0:48:36 lr 0.000866 time 3.5816 (2.4905) loss 3.9799 (3.8630) grad_norm 1.1510 (1.2028) [2022-01-19 20:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][90/1251] eta 0:47:27 lr 0.000866 time 2.6300 (2.4528) loss 3.3785 (3.8087) grad_norm 1.1029 (1.2022) [2022-01-19 20:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][100/1251] eta 0:46:11 lr 0.000866 time 2.0355 (2.4076) loss 2.8198 (3.8079) grad_norm 1.5126 (1.2059) [2022-01-19 20:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][110/1251] eta 0:45:04 lr 0.000866 time 1.8844 (2.3700) loss 4.0579 (3.8340) grad_norm 1.2101 (1.2068) [2022-01-19 20:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][120/1251] eta 0:44:18 lr 0.000865 time 2.4976 (2.3507) loss 4.2026 (3.8375) grad_norm 1.5918 (1.2083) [2022-01-19 20:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][130/1251] eta 0:43:47 lr 0.000865 time 3.1185 (2.3437) loss 4.2479 (3.8263) grad_norm 1.0542 (1.2025) [2022-01-19 20:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][140/1251] eta 0:43:11 lr 0.000865 time 1.8574 (2.3323) loss 3.6408 (3.8251) grad_norm 1.0367 (1.1958) [2022-01-19 20:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][150/1251] eta 0:42:47 lr 0.000865 time 2.3984 (2.3323) loss 3.2623 (3.8048) grad_norm 1.0406 (1.1920) [2022-01-19 20:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][160/1251] eta 0:42:22 lr 0.000865 time 2.2985 (2.3301) loss 4.2439 (3.8096) grad_norm 0.9680 (1.1850) [2022-01-19 20:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][170/1251] eta 0:42:05 lr 0.000865 time 3.7654 (2.3365) loss 3.8557 (3.8026) grad_norm 1.2768 (1.1796) [2022-01-19 20:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][180/1251] eta 0:41:30 lr 0.000865 time 1.9461 (2.3256) loss 3.6561 (3.8199) grad_norm 1.1930 (1.1762) [2022-01-19 20:25:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][190/1251] eta 0:40:49 lr 0.000865 time 1.7563 (2.3089) loss 4.1938 (3.8254) grad_norm 1.2035 (1.1765) [2022-01-19 20:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][200/1251] eta 0:40:11 lr 0.000865 time 1.9595 (2.2947) loss 4.7561 (3.8310) grad_norm 1.1025 (1.1735) [2022-01-19 20:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][210/1251] eta 0:39:42 lr 0.000865 time 2.3466 (2.2886) loss 3.8897 (3.8337) grad_norm 1.1564 (1.1710) [2022-01-19 20:26:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][220/1251] eta 0:39:19 lr 0.000865 time 1.7024 (2.2887) loss 3.0628 (3.8259) grad_norm 1.3100 (1.1726) [2022-01-19 20:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][230/1251] eta 0:38:48 lr 0.000865 time 1.5312 (2.2806) loss 3.8500 (3.8165) grad_norm 1.3940 (1.1734) [2022-01-19 20:27:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][240/1251] eta 0:38:18 lr 0.000865 time 1.8720 (2.2734) loss 2.9674 (3.8092) grad_norm 1.2004 (1.1800) [2022-01-19 20:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][250/1251] eta 0:37:54 lr 0.000865 time 1.8969 (2.2718) loss 4.4582 (3.8079) grad_norm 1.1520 (1.1792) [2022-01-19 20:28:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][260/1251] eta 0:37:31 lr 0.000865 time 2.0166 (2.2721) loss 3.8736 (3.8071) grad_norm 1.1069 (1.1807) [2022-01-19 20:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][270/1251] eta 0:37:06 lr 0.000865 time 2.7394 (2.2696) loss 4.2270 (3.7974) grad_norm 1.0374 (1.1802) [2022-01-19 20:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][280/1251] eta 0:36:38 lr 0.000865 time 1.8128 (2.2640) loss 3.3634 (3.8033) grad_norm 1.1178 (1.1788) [2022-01-19 20:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][290/1251] eta 0:36:11 lr 0.000865 time 
2.1881 (2.2600) loss 3.7746 (3.7956) grad_norm 1.0135 (1.1826) [2022-01-19 20:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][300/1251] eta 0:35:46 lr 0.000865 time 1.9443 (2.2576) loss 4.3446 (3.8044) grad_norm 1.2142 (1.1826) [2022-01-19 20:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][310/1251] eta 0:35:26 lr 0.000865 time 3.1003 (2.2601) loss 3.5826 (3.8030) grad_norm 1.2870 (1.1809) [2022-01-19 20:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][320/1251] eta 0:34:59 lr 0.000865 time 1.7838 (2.2548) loss 3.9440 (3.8018) grad_norm 1.0808 (1.1792) [2022-01-19 20:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][330/1251] eta 0:34:36 lr 0.000865 time 1.6598 (2.2541) loss 4.1230 (3.8009) grad_norm 1.1633 (1.1761) [2022-01-19 20:31:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][340/1251] eta 0:34:11 lr 0.000865 time 1.8491 (2.2520) loss 4.1224 (3.8116) grad_norm 1.0244 (1.1747) [2022-01-19 20:31:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][350/1251] eta 0:33:47 lr 0.000865 time 2.5892 (2.2508) loss 4.1088 (3.8141) grad_norm 1.5631 (1.1744) [2022-01-19 20:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][360/1251] eta 0:33:23 lr 0.000865 time 1.8112 (2.2483) loss 4.4083 (3.8165) grad_norm 1.3554 (1.1734) [2022-01-19 20:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][370/1251] eta 0:33:01 lr 0.000865 time 1.8800 (2.2493) loss 4.3158 (3.8215) grad_norm 1.1270 (1.1743) [2022-01-19 20:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][380/1251] eta 0:32:36 lr 0.000865 time 2.0458 (2.2468) loss 3.5265 (3.8242) grad_norm 0.9651 (1.1727) [2022-01-19 20:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][390/1251] eta 0:32:13 lr 0.000865 time 3.0736 (2.2459) loss 4.2082 (3.8145) grad_norm 1.2729 (1.1750) [2022-01-19 20:33:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][400/1251] eta 0:31:50 lr 0.000865 time 2.4574 (2.2447) loss 4.5193 (3.8166) grad_norm 0.9634 (1.1749) [2022-01-19 20:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][410/1251] eta 0:31:25 lr 0.000865 time 2.4661 (2.2425) loss 2.7052 (3.8200) grad_norm 1.1766 (1.1741) [2022-01-19 20:33:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][420/1251] eta 0:31:00 lr 0.000865 time 1.9104 (2.2392) loss 3.9412 (3.8220) grad_norm 1.0317 (1.1729) [2022-01-19 20:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][430/1251] eta 0:30:38 lr 0.000865 time 3.4971 (2.2396) loss 4.0682 (3.8226) grad_norm 1.0469 (1.1720) [2022-01-19 20:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][440/1251] eta 0:30:14 lr 0.000865 time 1.9193 (2.2374) loss 4.2219 (3.8216) grad_norm 1.4724 (1.1739) [2022-01-19 20:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][450/1251] eta 0:29:51 lr 0.000865 time 1.8906 (2.2366) loss 2.7544 (3.8153) grad_norm 1.2908 (1.1738) [2022-01-19 20:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][460/1251] eta 0:29:24 lr 0.000865 time 1.8960 (2.2312) loss 3.7074 (3.8235) grad_norm 1.1609 (1.1739) [2022-01-19 20:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][470/1251] eta 0:29:05 lr 0.000865 time 2.7358 (2.2343) loss 4.4297 (3.8255) grad_norm 1.1269 (1.1740) [2022-01-19 20:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][480/1251] eta 0:28:39 lr 0.000864 time 1.5643 (2.2299) loss 3.9772 (3.8297) grad_norm 1.1790 (1.1724) [2022-01-19 20:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][490/1251] eta 0:28:14 lr 0.000864 time 1.9585 (2.2272) loss 2.6817 (3.8285) grad_norm 1.1399 (1.1730) [2022-01-19 20:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][500/1251] eta 0:27:52 lr 0.000864 time 
2.2119 (2.2265) loss 3.1991 (3.8237) grad_norm 1.2036 (1.1753) [2022-01-19 20:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][510/1251] eta 0:27:31 lr 0.000864 time 3.3778 (2.2292) loss 4.1783 (3.8198) grad_norm 1.0950 (1.1761) [2022-01-19 20:37:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][520/1251] eta 0:27:08 lr 0.000864 time 2.0415 (2.2282) loss 3.9201 (3.8185) grad_norm 1.2916 (1.1767) [2022-01-19 20:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][530/1251] eta 0:26:45 lr 0.000864 time 1.6767 (2.2273) loss 4.0781 (3.8186) grad_norm 1.2338 (1.1784) [2022-01-19 20:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][540/1251] eta 0:26:22 lr 0.000864 time 1.9287 (2.2262) loss 4.0940 (3.8175) grad_norm 1.2151 (1.1811) [2022-01-19 20:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][550/1251] eta 0:25:58 lr 0.000864 time 2.6662 (2.2233) loss 3.0159 (3.8139) grad_norm 0.9479 (1.1808) [2022-01-19 20:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][560/1251] eta 0:25:35 lr 0.000864 time 2.2979 (2.2217) loss 4.0635 (3.8167) grad_norm 1.0696 (1.1808) [2022-01-19 20:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][570/1251] eta 0:25:12 lr 0.000864 time 1.8616 (2.2209) loss 3.3534 (3.8112) grad_norm 1.2940 (1.1825) [2022-01-19 20:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][580/1251] eta 0:24:50 lr 0.000864 time 2.1475 (2.2215) loss 4.0303 (3.8115) grad_norm 1.1411 (1.1816) [2022-01-19 20:40:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][590/1251] eta 0:24:30 lr 0.000864 time 3.6276 (2.2250) loss 3.0772 (3.8082) grad_norm 1.1155 (1.1824) [2022-01-19 20:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][600/1251] eta 0:24:06 lr 0.000864 time 2.0430 (2.2219) loss 3.9290 (3.8009) grad_norm 1.0209 (1.1807) [2022-01-19 20:40:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][610/1251] eta 0:23:40 lr 0.000864 time 2.2716 (2.2162) loss 4.1283 (3.8029) grad_norm 1.2013 (1.1813) [2022-01-19 20:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][620/1251] eta 0:23:17 lr 0.000864 time 2.0319 (2.2152) loss 4.0492 (3.8029) grad_norm 1.1135 (1.1813) [2022-01-19 20:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][630/1251] eta 0:22:54 lr 0.000864 time 2.1805 (2.2130) loss 2.6918 (3.8066) grad_norm 1.3555 (1.1800) [2022-01-19 20:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][640/1251] eta 0:22:33 lr 0.000864 time 2.5520 (2.2147) loss 4.4291 (3.8108) grad_norm 1.0172 (1.1786) [2022-01-19 20:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][650/1251] eta 0:22:10 lr 0.000864 time 1.8147 (2.2141) loss 4.2942 (3.8142) grad_norm 0.9754 (1.1778) [2022-01-19 20:42:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][660/1251] eta 0:21:48 lr 0.000864 time 2.8686 (2.2146) loss 2.9140 (3.8125) grad_norm 1.1801 (1.1771) [2022-01-19 20:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][670/1251] eta 0:21:25 lr 0.000864 time 2.0229 (2.2133) loss 4.5910 (3.8108) grad_norm 1.0932 (1.1765) [2022-01-19 20:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][680/1251] eta 0:21:04 lr 0.000864 time 2.4620 (2.2144) loss 3.9094 (3.8085) grad_norm 1.2570 (1.1762) [2022-01-19 20:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][690/1251] eta 0:20:42 lr 0.000864 time 2.1407 (2.2145) loss 4.6035 (3.8094) grad_norm 1.4487 (1.1770) [2022-01-19 20:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][700/1251] eta 0:20:20 lr 0.000864 time 1.6621 (2.2146) loss 4.3162 (3.8082) grad_norm 1.2772 (1.1768) [2022-01-19 20:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][710/1251] eta 0:19:57 lr 0.000864 time 
1.7882 (2.2129) loss 4.0810 (3.8106) grad_norm 1.2466 (1.1786) [2022-01-19 20:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][720/1251] eta 0:19:34 lr 0.000864 time 2.6085 (2.2116) loss 3.9220 (3.8143) grad_norm 0.9681 (1.1786) [2022-01-19 20:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][730/1251] eta 0:19:13 lr 0.000864 time 3.2421 (2.2139) loss 4.3262 (3.8116) grad_norm 1.3488 (1.1784) [2022-01-19 20:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][740/1251] eta 0:18:51 lr 0.000864 time 2.1836 (2.2141) loss 2.8005 (3.8113) grad_norm 1.0137 (1.1782) [2022-01-19 20:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][750/1251] eta 0:18:27 lr 0.000864 time 2.1705 (2.2115) loss 4.1629 (3.8111) grad_norm 1.0308 (1.1776) [2022-01-19 20:46:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][760/1251] eta 0:18:04 lr 0.000864 time 1.7702 (2.2083) loss 4.3881 (3.8129) grad_norm 1.2546 (1.1790) [2022-01-19 20:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][770/1251] eta 0:17:42 lr 0.000864 time 1.9447 (2.2080) loss 3.1903 (3.8120) grad_norm 1.0723 (1.1787) [2022-01-19 20:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][780/1251] eta 0:17:20 lr 0.000864 time 1.8897 (2.2084) loss 2.9748 (3.8130) grad_norm 1.1528 (1.1788) [2022-01-19 20:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][790/1251] eta 0:16:58 lr 0.000864 time 2.7830 (2.2102) loss 4.0284 (3.8165) grad_norm 1.0760 (1.1786) [2022-01-19 20:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][800/1251] eta 0:16:36 lr 0.000864 time 1.8250 (2.2101) loss 4.1873 (3.8182) grad_norm 1.1922 (1.1786) [2022-01-19 20:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][810/1251] eta 0:16:13 lr 0.000864 time 1.5896 (2.2082) loss 4.1010 (3.8211) grad_norm 1.2065 (1.1805) [2022-01-19 20:48:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][820/1251] eta 0:15:51 lr 0.000864 time 2.1819 (2.2069) loss 2.8510 (3.8221) grad_norm 1.1673 (1.1807) [2022-01-19 20:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][830/1251] eta 0:15:28 lr 0.000863 time 1.9123 (2.2053) loss 4.1925 (3.8221) grad_norm 1.0679 (1.1796) [2022-01-19 20:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][840/1251] eta 0:15:05 lr 0.000863 time 2.4084 (2.2041) loss 4.1794 (3.8186) grad_norm 1.0513 (1.1799) [2022-01-19 20:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][850/1251] eta 0:14:43 lr 0.000863 time 1.9293 (2.2026) loss 2.9513 (3.8199) grad_norm 1.1930 (1.1802) [2022-01-19 20:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][860/1251] eta 0:14:20 lr 0.000863 time 1.7149 (2.2011) loss 4.6131 (3.8220) grad_norm 1.1462 (1.1801) [2022-01-19 20:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][870/1251] eta 0:13:59 lr 0.000863 time 2.1068 (2.2022) loss 4.1658 (3.8212) grad_norm 1.2272 (1.1795) [2022-01-19 20:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][880/1251] eta 0:13:37 lr 0.000863 time 1.9318 (2.2033) loss 3.9982 (3.8210) grad_norm 1.0811 (1.1799) [2022-01-19 20:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][890/1251] eta 0:13:15 lr 0.000863 time 2.1226 (2.2046) loss 3.3112 (3.8216) grad_norm 1.1882 (1.1797) [2022-01-19 20:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][900/1251] eta 0:12:54 lr 0.000863 time 2.2140 (2.2057) loss 4.0755 (3.8210) grad_norm 1.3135 (1.1795) [2022-01-19 20:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][910/1251] eta 0:12:32 lr 0.000863 time 2.1963 (2.2058) loss 4.6214 (3.8198) grad_norm 1.1399 (1.1787) [2022-01-19 20:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][920/1251] eta 0:12:09 lr 0.000863 time 
1.5736 (2.2051) loss 4.1897 (3.8202) grad_norm 1.4682 (1.1786) [2022-01-19 20:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][930/1251] eta 0:11:47 lr 0.000863 time 2.1988 (2.2044) loss 3.0718 (3.8182) grad_norm 1.2588 (1.1782) [2022-01-19 20:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][940/1251] eta 0:11:25 lr 0.000863 time 2.0069 (2.2032) loss 3.3767 (3.8168) grad_norm 1.3102 (1.1778) [2022-01-19 20:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][950/1251] eta 0:11:03 lr 0.000863 time 2.2489 (2.2034) loss 3.1317 (3.8130) grad_norm 1.1018 (1.1781) [2022-01-19 20:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][960/1251] eta 0:10:40 lr 0.000863 time 1.6780 (2.2026) loss 3.1073 (3.8132) grad_norm 1.2774 (1.1785) [2022-01-19 20:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][970/1251] eta 0:10:18 lr 0.000863 time 1.8061 (2.2027) loss 4.2987 (3.8129) grad_norm 1.6981 (1.1784) [2022-01-19 20:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][980/1251] eta 0:09:56 lr 0.000863 time 1.8833 (2.2023) loss 4.3039 (3.8150) grad_norm 1.4816 (1.1801) [2022-01-19 20:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][990/1251] eta 0:09:34 lr 0.000863 time 2.1879 (2.2029) loss 4.5892 (3.8151) grad_norm 1.5731 (1.1812) [2022-01-19 20:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1000/1251] eta 0:09:12 lr 0.000863 time 2.1192 (2.2015) loss 3.7425 (3.8160) grad_norm 1.2741 (1.1816) [2022-01-19 20:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1010/1251] eta 0:08:50 lr 0.000863 time 2.1986 (2.2015) loss 4.0479 (3.8165) grad_norm 1.1486 (1.1808) [2022-01-19 20:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1020/1251] eta 0:08:28 lr 0.000863 time 1.5630 (2.1997) loss 4.1960 (3.8185) grad_norm 1.5052 (1.1809) [2022-01-19 20:56:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1030/1251] eta 0:08:05 lr 0.000863 time 1.8191 (2.1987) loss 3.1500 (3.8150) grad_norm 1.2139 (1.1813) [2022-01-19 20:56:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1040/1251] eta 0:07:44 lr 0.000863 time 2.9452 (2.1997) loss 4.7619 (3.8174) grad_norm 1.4947 (1.1812) [2022-01-19 20:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1050/1251] eta 0:07:22 lr 0.000863 time 2.2425 (2.2024) loss 2.8678 (3.8151) grad_norm 1.1568 (1.1801) [2022-01-19 20:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1060/1251] eta 0:07:00 lr 0.000863 time 2.0607 (2.2017) loss 4.2688 (3.8166) grad_norm 1.3042 (1.1796) [2022-01-19 20:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1070/1251] eta 0:06:38 lr 0.000863 time 1.6048 (2.2002) loss 4.3262 (3.8194) grad_norm 1.2995 (1.1795) [2022-01-19 20:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1080/1251] eta 0:06:16 lr 0.000863 time 2.6693 (2.2004) loss 3.9831 (3.8215) grad_norm 1.1794 (1.1792) [2022-01-19 20:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1090/1251] eta 0:05:54 lr 0.000863 time 2.6791 (2.1997) loss 3.9104 (3.8219) grad_norm 1.3411 (1.1793) [2022-01-19 20:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1100/1251] eta 0:05:32 lr 0.000863 time 1.8760 (2.1991) loss 4.5968 (3.8236) grad_norm 1.3367 (1.1795) [2022-01-19 20:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1110/1251] eta 0:05:10 lr 0.000863 time 2.0105 (2.1993) loss 3.1272 (3.8233) grad_norm 1.1556 (1.1791) [2022-01-19 20:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1120/1251] eta 0:04:48 lr 0.000863 time 2.1878 (2.2012) loss 4.2451 (3.8196) grad_norm 1.1082 (1.1787) [2022-01-19 20:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1130/1251] eta 0:04:26 lr 
0.000863 time 2.6271 (2.2040) loss 4.3141 (3.8222) grad_norm 1.0417 (1.1783) [2022-01-19 21:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1140/1251] eta 0:04:04 lr 0.000863 time 2.0452 (2.2038) loss 2.6907 (3.8207) grad_norm 1.1676 (1.1777) [2022-01-19 21:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1150/1251] eta 0:03:42 lr 0.000863 time 2.0351 (2.2029) loss 2.7521 (3.8174) grad_norm 1.2344 (1.1774) [2022-01-19 21:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1160/1251] eta 0:03:20 lr 0.000863 time 1.6273 (2.2006) loss 3.7390 (3.8171) grad_norm 1.1222 (1.1772) [2022-01-19 21:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1170/1251] eta 0:02:58 lr 0.000863 time 1.6079 (2.2000) loss 3.8205 (3.8162) grad_norm 1.2175 (1.1772) [2022-01-19 21:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1180/1251] eta 0:02:36 lr 0.000862 time 2.4024 (2.1995) loss 4.0487 (3.8175) grad_norm 1.4536 (1.1777) [2022-01-19 21:01:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1190/1251] eta 0:02:14 lr 0.000862 time 2.1303 (2.1991) loss 4.2759 (3.8189) grad_norm 1.0884 (1.1782) [2022-01-19 21:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1200/1251] eta 0:01:52 lr 0.000862 time 1.8431 (2.1988) loss 3.0404 (3.8197) grad_norm 1.0746 (1.1779) [2022-01-19 21:02:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1210/1251] eta 0:01:30 lr 0.000862 time 1.6268 (2.2000) loss 4.2045 (3.8186) grad_norm 0.9482 (1.1769) [2022-01-19 21:03:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1220/1251] eta 0:01:08 lr 0.000862 time 2.5507 (2.2003) loss 3.8887 (3.8186) grad_norm 1.3093 (1.1764) [2022-01-19 21:03:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1230/1251] eta 0:00:46 lr 0.000862 time 1.9656 (2.1995) loss 4.0202 (3.8205) grad_norm 1.3022 (1.1764) [2022-01-19 21:03:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1240/1251] eta 0:00:24 lr 0.000862 time 2.2331 (2.1988) loss 4.1524 (3.8210) grad_norm 1.3132 (1.1769) [2022-01-19 21:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [72/300][1250/1251] eta 0:00:02 lr 0.000862 time 1.1842 (2.1934) loss 4.0246 (3.8212) grad_norm 1.0050 (1.1767) [2022-01-19 21:04:00 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 72 training takes 0:45:44 [2022-01-19 21:04:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.322 (18.322) Loss 1.2328 (1.2328) Acc@1 71.680 (71.680) Acc@5 90.527 (90.527) [2022-01-19 21:04:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 3.014 (3.533) Loss 1.1766 (1.2047) Acc@1 73.047 (72.559) Acc@5 92.188 (91.362) [2022-01-19 21:04:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.955 (2.740) Loss 1.2063 (1.2033) Acc@1 72.266 (72.684) Acc@5 91.504 (91.406) [2022-01-19 21:05:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.571 (2.356) Loss 1.2149 (1.2079) Acc@1 71.973 (72.483) Acc@5 90.723 (91.350) [2022-01-19 21:05:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.673 (2.169) Loss 1.2475 (1.2087) Acc@1 71.191 (72.401) Acc@5 90.234 (91.318) [2022-01-19 21:05:37 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.296 Acc@5 91.338 [2022-01-19 21:05:37 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.3% [2022-01-19 21:05:37 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.30% [2022-01-19 21:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][0/1251] eta 7:26:49 lr 0.000862 time 21.4308 (21.4308) loss 4.2262 (4.2262) grad_norm 1.2283 (1.2283) [2022-01-19 21:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][10/1251] eta 1:25:23 lr 0.000862 time 2.6542 (4.1288) loss 2.6190 (3.5913) grad_norm 1.0482 
(1.0861) [2022-01-19 21:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][20/1251] eta 1:04:56 lr 0.000862 time 1.7586 (3.1656) loss 3.7886 (3.6702) grad_norm 1.0094 (1.1086) [2022-01-19 21:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][30/1251] eta 0:58:17 lr 0.000862 time 1.9513 (2.8642) loss 2.9879 (3.5905) grad_norm 1.0419 (1.1374) [2022-01-19 21:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][40/1251] eta 0:55:30 lr 0.000862 time 3.8722 (2.7503) loss 3.3564 (3.5884) grad_norm 1.0614 (1.1379) [2022-01-19 21:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][50/1251] eta 0:53:37 lr 0.000862 time 3.4014 (2.6789) loss 3.6452 (3.6275) grad_norm 1.1109 (1.1497) [2022-01-19 21:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][60/1251] eta 0:51:33 lr 0.000862 time 2.2784 (2.5973) loss 3.9403 (3.6485) grad_norm 1.2132 (1.1504) [2022-01-19 21:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][70/1251] eta 0:49:48 lr 0.000862 time 1.8545 (2.5301) loss 4.2337 (3.6631) grad_norm 1.2528 (1.1694) [2022-01-19 21:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][80/1251] eta 0:48:22 lr 0.000862 time 2.9269 (2.4787) loss 3.1874 (3.6791) grad_norm 1.1101 (1.1744) [2022-01-19 21:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][90/1251] eta 0:47:00 lr 0.000862 time 1.8913 (2.4296) loss 3.8373 (3.6819) grad_norm 1.0923 (1.1797) [2022-01-19 21:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][100/1251] eta 0:46:00 lr 0.000862 time 1.8317 (2.3984) loss 3.4168 (3.6735) grad_norm 1.1592 (1.1737) [2022-01-19 21:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][110/1251] eta 0:45:33 lr 0.000862 time 1.8470 (2.3956) loss 4.1931 (3.6841) grad_norm 1.1159 (1.1668) [2022-01-19 21:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][120/1251] eta 0:45:06 
lr 0.000862 time 2.2020 (2.3929) loss 4.3541 (3.6964) grad_norm 1.1667 (1.1624) [2022-01-19 21:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][130/1251] eta 0:44:20 lr 0.000862 time 2.2779 (2.3734) loss 4.5468 (3.7246) grad_norm 1.2798 (1.1604) [2022-01-19 21:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][140/1251] eta 0:43:30 lr 0.000862 time 2.1802 (2.3499) loss 3.2379 (3.7166) grad_norm 1.2460 (1.1625) [2022-01-19 21:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][150/1251] eta 0:42:49 lr 0.000862 time 1.8499 (2.3342) loss 3.9729 (3.7307) grad_norm 1.0818 (1.1617) [2022-01-19 21:11:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][160/1251] eta 0:42:23 lr 0.000862 time 2.1773 (2.3315) loss 2.6983 (3.7375) grad_norm 1.0893 (1.1621) [2022-01-19 21:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][170/1251] eta 0:41:47 lr 0.000862 time 2.1126 (2.3201) loss 4.0589 (3.7329) grad_norm 1.3012 (1.1672) [2022-01-19 21:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][180/1251] eta 0:41:08 lr 0.000862 time 1.9281 (2.3051) loss 4.2773 (3.7279) grad_norm 1.5150 (1.1697) [2022-01-19 21:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][190/1251] eta 0:40:31 lr 0.000862 time 1.5964 (2.2917) loss 4.4285 (3.7186) grad_norm 1.0022 (1.1731) [2022-01-19 21:13:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][200/1251] eta 0:40:07 lr 0.000862 time 1.7493 (2.2902) loss 3.3181 (3.7311) grad_norm 1.1267 (1.1734) [2022-01-19 21:13:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][210/1251] eta 0:39:32 lr 0.000862 time 1.7280 (2.2795) loss 4.5018 (3.7406) grad_norm 1.1844 (1.1769) [2022-01-19 21:14:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][220/1251] eta 0:39:06 lr 0.000862 time 1.7613 (2.2760) loss 3.8738 (3.7540) grad_norm 1.1919 (1.1737) [2022-01-19 21:14:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][230/1251] eta 0:38:42 lr 0.000862 time 1.4810 (2.2745) loss 3.2923 (3.7500) grad_norm 1.0459 (1.1707) [2022-01-19 21:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][240/1251] eta 0:38:21 lr 0.000862 time 1.8835 (2.2760) loss 2.9583 (3.7493) grad_norm 1.0526 (1.1725) [2022-01-19 21:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][250/1251] eta 0:37:58 lr 0.000862 time 1.5204 (2.2759) loss 4.2046 (3.7640) grad_norm 0.9955 (1.1672) [2022-01-19 21:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][260/1251] eta 0:37:32 lr 0.000862 time 2.3027 (2.2731) loss 4.1949 (3.7602) grad_norm 1.1012 (1.1674) [2022-01-19 21:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][270/1251] eta 0:37:06 lr 0.000861 time 1.8605 (2.2698) loss 4.1517 (3.7633) grad_norm 1.0488 (1.1724) [2022-01-19 21:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][280/1251] eta 0:36:32 lr 0.000861 time 1.8823 (2.2584) loss 3.9542 (3.7659) grad_norm 1.2404 (1.1724) [2022-01-19 21:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][290/1251] eta 0:36:04 lr 0.000861 time 1.8296 (2.2524) loss 3.5092 (3.7603) grad_norm 1.3299 (1.1726) [2022-01-19 21:16:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][300/1251] eta 0:35:39 lr 0.000861 time 1.9042 (2.2493) loss 3.6583 (3.7592) grad_norm 1.1262 (1.1737) [2022-01-19 21:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][310/1251] eta 0:35:15 lr 0.000861 time 2.3903 (2.2484) loss 3.7855 (3.7659) grad_norm 1.0092 (1.1719) [2022-01-19 21:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][320/1251] eta 0:34:50 lr 0.000861 time 2.2635 (2.2456) loss 4.2058 (3.7555) grad_norm 1.1161 (1.1700) [2022-01-19 21:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][330/1251] eta 0:34:30 lr 0.000861 time 
2.5016 (2.2485) loss 4.7563 (3.7650) grad_norm 1.1085 (1.1692) [2022-01-19 21:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][340/1251] eta 0:34:08 lr 0.000861 time 1.6394 (2.2489) loss 4.1052 (3.7689) grad_norm 1.1619 (1.1689) [2022-01-19 21:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][350/1251] eta 0:33:46 lr 0.000861 time 2.5346 (2.2494) loss 3.9641 (3.7620) grad_norm 1.1235 (1.1677) [2022-01-19 21:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][360/1251] eta 0:33:19 lr 0.000861 time 2.4420 (2.2436) loss 4.2472 (3.7629) grad_norm 1.2613 (1.1696) [2022-01-19 21:19:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][370/1251] eta 0:32:50 lr 0.000861 time 1.8514 (2.2369) loss 4.0055 (3.7642) grad_norm 1.1590 (1.1698) [2022-01-19 21:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][380/1251] eta 0:32:25 lr 0.000861 time 2.1355 (2.2333) loss 3.2543 (3.7649) grad_norm 1.1306 (1.1703) [2022-01-19 21:20:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][390/1251] eta 0:31:59 lr 0.000861 time 2.2511 (2.2297) loss 3.9697 (3.7670) grad_norm 1.1770 (1.1722) [2022-01-19 21:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][400/1251] eta 0:31:35 lr 0.000861 time 2.5664 (2.2276) loss 3.4182 (3.7657) grad_norm 1.0335 (1.1711) [2022-01-19 21:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][410/1251] eta 0:31:15 lr 0.000861 time 2.5943 (2.2295) loss 3.4868 (3.7650) grad_norm 1.4859 (1.1754) [2022-01-19 21:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][420/1251] eta 0:30:53 lr 0.000861 time 1.5751 (2.2309) loss 2.6537 (3.7564) grad_norm 1.1324 (1.1754) [2022-01-19 21:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][430/1251] eta 0:30:34 lr 0.000861 time 2.9131 (2.2339) loss 2.8803 (3.7601) grad_norm 1.0693 (1.1762) [2022-01-19 21:22:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][440/1251] eta 0:30:12 lr 0.000861 time 2.9577 (2.2345) loss 3.3774 (3.7645) grad_norm 1.4235 (1.1785) [2022-01-19 21:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][450/1251] eta 0:29:46 lr 0.000861 time 1.9787 (2.2305) loss 3.7380 (3.7562) grad_norm 1.1622 (1.1799) [2022-01-19 21:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][460/1251] eta 0:29:20 lr 0.000861 time 1.6644 (2.2251) loss 4.5766 (3.7594) grad_norm 1.5197 (1.1830) [2022-01-19 21:23:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][470/1251] eta 0:28:56 lr 0.000861 time 2.1746 (2.2239) loss 2.8326 (3.7594) grad_norm 1.0778 (1.1817) [2022-01-19 21:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][480/1251] eta 0:28:35 lr 0.000861 time 3.0663 (2.2252) loss 2.8214 (3.7542) grad_norm 1.1635 (1.1801) [2022-01-19 21:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][490/1251] eta 0:28:12 lr 0.000861 time 1.8801 (2.2239) loss 3.8903 (3.7571) grad_norm 1.0536 (1.1797) [2022-01-19 21:24:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][500/1251] eta 0:27:47 lr 0.000861 time 1.6076 (2.2202) loss 4.2933 (3.7513) grad_norm 1.0998 (1.1795) [2022-01-19 21:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][510/1251] eta 0:27:24 lr 0.000861 time 2.5055 (2.2187) loss 4.4241 (3.7546) grad_norm 1.2534 (1.1791) [2022-01-19 21:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][520/1251] eta 0:27:02 lr 0.000861 time 2.9426 (2.2191) loss 4.1423 (3.7526) grad_norm 1.1343 (1.1799) [2022-01-19 21:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][530/1251] eta 0:26:41 lr 0.000861 time 2.3007 (2.2213) loss 2.7374 (3.7494) grad_norm 1.1387 (1.1798) [2022-01-19 21:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][540/1251] eta 0:26:20 lr 0.000861 time 
1.6662 (2.2222) loss 4.3882 (3.7461) grad_norm 1.1442 (1.1805) [2022-01-19 21:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][550/1251] eta 0:25:56 lr 0.000861 time 1.5713 (2.2200) loss 2.9952 (3.7468) grad_norm 1.1723 (1.1802) [2022-01-19 21:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][560/1251] eta 0:25:32 lr 0.000861 time 2.4977 (2.2181) loss 4.1027 (3.7431) grad_norm 1.1754 (1.1800) [2022-01-19 21:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][570/1251] eta 0:25:09 lr 0.000861 time 1.8883 (2.2160) loss 4.6589 (3.7467) grad_norm 1.1437 (1.1791) [2022-01-19 21:27:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][580/1251] eta 0:24:45 lr 0.000861 time 2.0931 (2.2145) loss 4.7049 (3.7520) grad_norm 1.0975 (1.1783) [2022-01-19 21:27:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][590/1251] eta 0:24:22 lr 0.000861 time 1.8358 (2.2123) loss 3.2846 (3.7500) grad_norm 1.3703 (1.1786) [2022-01-19 21:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][600/1251] eta 0:23:58 lr 0.000861 time 1.9652 (2.2091) loss 3.4647 (3.7502) grad_norm 1.0437 (1.1792) [2022-01-19 21:28:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][610/1251] eta 0:23:35 lr 0.000861 time 1.7983 (2.2085) loss 4.2116 (3.7471) grad_norm 1.2848 (1.1799) [2022-01-19 21:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][620/1251] eta 0:23:13 lr 0.000860 time 2.5034 (2.2091) loss 2.9594 (3.7493) grad_norm 1.4073 (1.1797) [2022-01-19 21:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][630/1251] eta 0:22:51 lr 0.000860 time 1.4647 (2.2080) loss 4.5180 (3.7492) grad_norm 1.3957 (1.1810) [2022-01-19 21:29:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][640/1251] eta 0:22:28 lr 0.000860 time 2.5580 (2.2073) loss 4.4629 (3.7488) grad_norm 1.2325 (1.1820) [2022-01-19 21:29:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][650/1251] eta 0:22:05 lr 0.000860 time 1.7804 (2.2050) loss 4.2769 (3.7517) grad_norm 1.0385 (1.1819) [2022-01-19 21:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][660/1251] eta 0:21:43 lr 0.000860 time 2.0630 (2.2048) loss 4.1485 (3.7547) grad_norm 1.1558 (1.1810) [2022-01-19 21:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][670/1251] eta 0:21:21 lr 0.000860 time 1.6718 (2.2057) loss 4.0250 (3.7547) grad_norm 1.0916 (1.1811) [2022-01-19 21:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][680/1251] eta 0:21:02 lr 0.000860 time 3.3692 (2.2115) loss 2.8945 (3.7562) grad_norm 1.4025 (1.1816) [2022-01-19 21:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][690/1251] eta 0:20:43 lr 0.000860 time 2.0074 (2.2167) loss 4.1802 (3.7588) grad_norm 1.1167 (1.1803) [2022-01-19 21:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][700/1251] eta 0:20:21 lr 0.000860 time 1.8625 (2.2163) loss 4.3421 (3.7569) grad_norm 1.2707 (1.1798) [2022-01-19 21:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][710/1251] eta 0:19:58 lr 0.000860 time 2.1194 (2.2149) loss 3.7251 (3.7533) grad_norm 1.0402 (1.1795) [2022-01-19 21:32:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][720/1251] eta 0:19:34 lr 0.000860 time 2.3288 (2.2115) loss 3.7904 (3.7533) grad_norm 1.1444 (1.1793) [2022-01-19 21:32:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][730/1251] eta 0:19:10 lr 0.000860 time 1.5966 (2.2076) loss 3.8150 (3.7515) grad_norm 1.0758 (1.1781) [2022-01-19 21:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][740/1251] eta 0:18:47 lr 0.000860 time 2.1540 (2.2065) loss 4.1748 (3.7532) grad_norm 0.9323 (1.1777) [2022-01-19 21:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][750/1251] eta 0:18:25 lr 0.000860 time 
1.6166 (2.2069) loss 3.0281 (3.7563) grad_norm 1.2124 (1.1787) [2022-01-19 21:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][760/1251] eta 0:18:03 lr 0.000860 time 2.1934 (2.2076) loss 4.2893 (3.7586) grad_norm 1.0773 (1.1794) [2022-01-19 21:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][770/1251] eta 0:17:42 lr 0.000860 time 1.8790 (2.2080) loss 4.0126 (3.7562) grad_norm 1.3256 (1.1795) [2022-01-19 21:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][780/1251] eta 0:17:20 lr 0.000860 time 2.3371 (2.2086) loss 4.0637 (3.7559) grad_norm 1.3213 (1.1795) [2022-01-19 21:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][790/1251] eta 0:16:57 lr 0.000860 time 1.5707 (2.2069) loss 3.7043 (3.7537) grad_norm 1.1359 (1.1804) [2022-01-19 21:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][800/1251] eta 0:16:35 lr 0.000860 time 2.5094 (2.2075) loss 4.1871 (3.7563) grad_norm 1.1912 (1.1799) [2022-01-19 21:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][810/1251] eta 0:16:13 lr 0.000860 time 2.1771 (2.2077) loss 4.2728 (3.7555) grad_norm 1.2836 (1.1799) [2022-01-19 21:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][820/1251] eta 0:15:51 lr 0.000860 time 2.5273 (2.2081) loss 3.9403 (3.7543) grad_norm 1.1272 (1.1807) [2022-01-19 21:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][830/1251] eta 0:15:29 lr 0.000860 time 2.0256 (2.2071) loss 3.8344 (3.7567) grad_norm 0.9404 (1.1797) [2022-01-19 21:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][840/1251] eta 0:15:06 lr 0.000860 time 2.4580 (2.2064) loss 3.7924 (3.7587) grad_norm 1.1792 (1.1793) [2022-01-19 21:36:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][850/1251] eta 0:14:44 lr 0.000860 time 2.0046 (2.2046) loss 4.0298 (3.7622) grad_norm 1.1480 (1.1788) [2022-01-19 21:37:15 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][860/1251] eta 0:14:21 lr 0.000860 time 2.2230 (2.2044) loss 3.3609 (3.7621) grad_norm 1.0655 (1.1792) [2022-01-19 21:37:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][870/1251] eta 0:13:59 lr 0.000860 time 1.9466 (2.2045) loss 4.6502 (3.7613) grad_norm 1.1136 (1.1789) [2022-01-19 21:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][880/1251] eta 0:13:38 lr 0.000860 time 2.3505 (2.2063) loss 3.6317 (3.7633) grad_norm 1.0885 (1.1792) [2022-01-19 21:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][890/1251] eta 0:13:16 lr 0.000860 time 1.7206 (2.2059) loss 4.1404 (3.7642) grad_norm 1.0976 (1.1786) [2022-01-19 21:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][900/1251] eta 0:12:54 lr 0.000860 time 2.4476 (2.2068) loss 4.4644 (3.7675) grad_norm 1.3496 (1.1794) [2022-01-19 21:39:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][910/1251] eta 0:12:31 lr 0.000860 time 1.6492 (2.2046) loss 4.0348 (3.7661) grad_norm 1.0616 (1.1784) [2022-01-19 21:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][920/1251] eta 0:12:09 lr 0.000860 time 2.1859 (2.2031) loss 3.8451 (3.7667) grad_norm 1.0259 (1.1777) [2022-01-19 21:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][930/1251] eta 0:11:46 lr 0.000860 time 1.6016 (2.2007) loss 4.3490 (3.7687) grad_norm 1.0766 (1.1774) [2022-01-19 21:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][940/1251] eta 0:11:24 lr 0.000860 time 2.2015 (2.1999) loss 4.2385 (3.7685) grad_norm 1.1828 (1.1772) [2022-01-19 21:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][950/1251] eta 0:11:02 lr 0.000860 time 2.2156 (2.2016) loss 3.4262 (3.7710) grad_norm 1.1051 (1.1764) [2022-01-19 21:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][960/1251] eta 0:10:41 lr 0.000860 time 
3.4988 (2.2028) loss 2.6737 (3.7694) grad_norm 1.1922 (1.1754) [2022-01-19 21:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][970/1251] eta 0:10:19 lr 0.000859 time 1.8462 (2.2044) loss 3.2308 (3.7663) grad_norm 0.9938 (1.1750) [2022-01-19 21:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][980/1251] eta 0:09:57 lr 0.000859 time 2.2493 (2.2042) loss 2.9152 (3.7655) grad_norm 1.2456 (1.1758) [2022-01-19 21:42:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][990/1251] eta 0:09:35 lr 0.000859 time 2.1681 (2.2051) loss 3.4106 (3.7661) grad_norm 1.3223 (1.1762) [2022-01-19 21:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1000/1251] eta 0:09:13 lr 0.000859 time 3.1748 (2.2051) loss 4.1466 (3.7687) grad_norm 1.1576 (1.1763) [2022-01-19 21:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1010/1251] eta 0:08:50 lr 0.000859 time 1.8255 (2.2027) loss 3.8217 (3.7674) grad_norm 1.0865 (1.1765) [2022-01-19 21:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1020/1251] eta 0:08:28 lr 0.000859 time 1.9959 (2.1996) loss 3.5712 (3.7681) grad_norm 1.1713 (1.1763) [2022-01-19 21:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1030/1251] eta 0:08:05 lr 0.000859 time 1.8975 (2.1985) loss 4.6921 (3.7691) grad_norm 1.4269 (1.1764) [2022-01-19 21:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1040/1251] eta 0:07:43 lr 0.000859 time 1.9826 (2.1977) loss 3.0840 (3.7692) grad_norm 1.2810 (1.1762) [2022-01-19 21:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1050/1251] eta 0:07:22 lr 0.000859 time 2.5650 (2.1991) loss 4.5597 (3.7682) grad_norm 1.3166 (1.1755) [2022-01-19 21:44:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1060/1251] eta 0:07:00 lr 0.000859 time 2.4911 (2.1992) loss 3.9890 (3.7689) grad_norm 1.2522 (1.1757) [2022-01-19 21:44:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1070/1251] eta 0:06:38 lr 0.000859 time 2.5231 (2.1989) loss 3.3258 (3.7694) grad_norm 1.1265 (1.1753) [2022-01-19 21:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1080/1251] eta 0:06:16 lr 0.000859 time 2.1414 (2.1996) loss 3.3908 (3.7694) grad_norm 1.0801 (1.1749) [2022-01-19 21:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1090/1251] eta 0:05:54 lr 0.000859 time 2.2628 (2.2000) loss 4.1259 (3.7706) grad_norm 1.2352 (1.1745) [2022-01-19 21:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1100/1251] eta 0:05:32 lr 0.000859 time 2.4303 (2.2004) loss 4.1944 (3.7664) grad_norm 1.2153 (1.1752) [2022-01-19 21:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1110/1251] eta 0:05:10 lr 0.000859 time 2.5915 (2.2013) loss 4.1218 (3.7658) grad_norm 1.1707 (1.1765) [2022-01-19 21:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1120/1251] eta 0:04:48 lr 0.000859 time 2.7413 (2.2029) loss 4.3222 (3.7672) grad_norm 1.1949 (1.1768) [2022-01-19 21:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1130/1251] eta 0:04:26 lr 0.000859 time 1.7353 (2.2028) loss 4.3198 (3.7677) grad_norm 1.0908 (1.1763) [2022-01-19 21:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1140/1251] eta 0:04:04 lr 0.000859 time 1.6752 (2.2003) loss 2.8600 (3.7695) grad_norm 1.2985 (1.1759) [2022-01-19 21:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1150/1251] eta 0:03:42 lr 0.000859 time 2.0284 (2.1993) loss 3.4681 (3.7697) grad_norm 1.2727 (1.1754) [2022-01-19 21:48:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1160/1251] eta 0:03:20 lr 0.000859 time 1.9546 (2.1991) loss 2.6493 (3.7680) grad_norm 1.1720 (1.1754) [2022-01-19 21:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1170/1251] eta 0:02:58 lr 
0.000859 time 1.9754 (2.2004) loss 3.2712 (3.7679) grad_norm 1.0316 (1.1754) [2022-01-19 21:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1180/1251] eta 0:02:36 lr 0.000859 time 2.2010 (2.1986) loss 3.4793 (3.7683) grad_norm 1.3843 (1.1754) [2022-01-19 21:49:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1190/1251] eta 0:02:14 lr 0.000859 time 1.8274 (2.1970) loss 2.9352 (3.7682) grad_norm 1.1437 (1.1751) [2022-01-19 21:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1200/1251] eta 0:01:52 lr 0.000859 time 2.2476 (2.1968) loss 2.9251 (3.7675) grad_norm 1.0500 (1.1752) [2022-01-19 21:49:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1210/1251] eta 0:01:30 lr 0.000859 time 1.5700 (2.1971) loss 3.1020 (3.7679) grad_norm 1.1254 (1.1747) [2022-01-19 21:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1220/1251] eta 0:01:08 lr 0.000859 time 1.8632 (2.1960) loss 4.8009 (3.7673) grad_norm 1.0844 (1.1743) [2022-01-19 21:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1230/1251] eta 0:00:46 lr 0.000859 time 2.1730 (2.1965) loss 3.6277 (3.7676) grad_norm 1.2440 (1.1746) [2022-01-19 21:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1240/1251] eta 0:00:24 lr 0.000859 time 1.9051 (2.1970) loss 3.6801 (3.7684) grad_norm 1.2617 (1.1747) [2022-01-19 21:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [73/300][1250/1251] eta 0:00:02 lr 0.000859 time 1.1408 (2.1920) loss 3.7120 (3.7681) grad_norm 1.1161 (1.1740) [2022-01-19 21:51:19 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 73 training takes 0:45:42 [2022-01-19 21:51:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.442 (18.442) Loss 1.2187 (1.2187) Acc@1 70.605 (70.605) Acc@5 91.113 (91.113) [2022-01-19 21:51:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.259 (3.532) Loss 1.2217 (1.1968) 
Acc@1 72.070 (72.337) Acc@5 91.895 (91.735) [2022-01-19 21:52:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.915 (2.489) Loss 1.2880 (1.2162) Acc@1 71.289 (71.931) Acc@5 90.039 (91.527) [2022-01-19 21:52:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.016 (2.222) Loss 1.2821 (1.2128) Acc@1 69.727 (71.966) Acc@5 90.918 (91.479) [2022-01-19 21:52:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.147 (2.135) Loss 1.2889 (1.2174) Acc@1 71.387 (71.982) Acc@5 89.746 (91.328) [2022-01-19 21:52:54 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.022 Acc@5 91.300 [2022-01-19 21:52:54 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.0% [2022-01-19 21:52:54 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.30% [2022-01-19 21:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][0/1251] eta 7:18:10 lr 0.000859 time 21.0156 (21.0156) loss 3.6904 (3.6904) grad_norm 1.5208 (1.5208) [2022-01-19 21:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][10/1251] eta 1:23:53 lr 0.000859 time 2.5749 (4.0558) loss 3.9285 (3.7807) grad_norm 1.0184 (1.1312) [2022-01-19 21:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][20/1251] eta 1:04:20 lr 0.000859 time 1.4198 (3.1361) loss 3.2566 (3.5968) grad_norm 1.1442 (1.1535) [2022-01-19 21:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][30/1251] eta 0:57:12 lr 0.000859 time 1.6016 (2.8116) loss 4.3932 (3.5627) grad_norm 1.5111 (1.1830) [2022-01-19 21:54:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][40/1251] eta 0:55:19 lr 0.000859 time 3.5058 (2.7410) loss 4.4347 (3.6201) grad_norm 1.0162 (1.1857) [2022-01-19 21:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][50/1251] eta 0:53:19 lr 0.000859 time 2.7481 (2.6642) loss 3.0339 (3.6524) grad_norm 1.2267 (1.1750) 
[2022-01-19 21:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][60/1251] eta 0:51:25 lr 0.000858 time 1.6017 (2.5910) loss 3.6214 (3.6881) grad_norm 1.1574 (1.1739) [2022-01-19 21:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][70/1251] eta 0:49:52 lr 0.000858 time 1.8494 (2.5343) loss 3.7972 (3.7189) grad_norm 1.0360 (1.1805) [2022-01-19 21:56:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][80/1251] eta 0:48:21 lr 0.000858 time 1.9650 (2.4778) loss 4.1865 (3.7211) grad_norm 1.2243 (1.1861) [2022-01-19 21:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][90/1251] eta 0:46:57 lr 0.000858 time 2.3470 (2.4269) loss 4.3681 (3.7288) grad_norm 1.5669 (1.1914) [2022-01-19 21:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][100/1251] eta 0:45:43 lr 0.000858 time 1.8734 (2.3837) loss 3.8263 (3.7609) grad_norm 1.2708 (1.1927) [2022-01-19 21:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][110/1251] eta 0:44:49 lr 0.000858 time 2.0147 (2.3569) loss 2.6949 (3.7329) grad_norm 1.3292 (1.1912) [2022-01-19 21:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][120/1251] eta 0:44:00 lr 0.000858 time 2.0330 (2.3347) loss 3.4483 (3.7472) grad_norm 1.3710 (1.1914) [2022-01-19 21:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][130/1251] eta 0:43:34 lr 0.000858 time 3.2745 (2.3327) loss 3.9933 (3.7607) grad_norm 1.1215 (1.1903) [2022-01-19 21:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][140/1251] eta 0:43:02 lr 0.000858 time 2.2589 (2.3244) loss 3.9555 (3.7695) grad_norm 1.2814 (1.1856) [2022-01-19 21:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][150/1251] eta 0:42:37 lr 0.000858 time 2.2301 (2.3231) loss 4.3152 (3.7796) grad_norm 0.9490 (1.1808) [2022-01-19 21:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][160/1251] eta 0:41:59 lr 
0.000858 time 1.9495 (2.3093) loss 4.1951 (3.7844) grad_norm 1.0859 (1.1788) [2022-01-19 21:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][170/1251] eta 0:41:34 lr 0.000858 time 2.6775 (2.3080) loss 4.0470 (3.7835) grad_norm 1.2933 (1.1814) [2022-01-19 21:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][180/1251] eta 0:41:16 lr 0.000858 time 2.8351 (2.3119) loss 4.4990 (3.7864) grad_norm 1.6279 (1.1839) [2022-01-19 22:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][190/1251] eta 0:40:56 lr 0.000858 time 2.4724 (2.3155) loss 4.0550 (3.7934) grad_norm 0.9799 (1.1819) [2022-01-19 22:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][200/1251] eta 0:40:25 lr 0.000858 time 1.8090 (2.3079) loss 2.6865 (3.7963) grad_norm 1.1154 (1.1817) [2022-01-19 22:00:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][210/1251] eta 0:39:52 lr 0.000858 time 2.4502 (2.2987) loss 2.4074 (3.7878) grad_norm 1.0470 (1.1784) [2022-01-19 22:01:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][220/1251] eta 0:39:12 lr 0.000858 time 1.9296 (2.2820) loss 2.5843 (3.7792) grad_norm 1.7085 (1.1827) [2022-01-19 22:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][230/1251] eta 0:38:43 lr 0.000858 time 2.2565 (2.2753) loss 3.8447 (3.7823) grad_norm 1.1641 (1.1836) [2022-01-19 22:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][240/1251] eta 0:38:18 lr 0.000858 time 2.2449 (2.2735) loss 3.4336 (3.7766) grad_norm 1.2193 (1.1834) [2022-01-19 22:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][250/1251] eta 0:37:55 lr 0.000858 time 2.5165 (2.2735) loss 2.7783 (3.7673) grad_norm 1.1381 (1.1822) [2022-01-19 22:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][260/1251] eta 0:37:22 lr 0.000858 time 1.9249 (2.2625) loss 4.0112 (3.7753) grad_norm 1.0685 (1.1815) [2022-01-19 22:03:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][270/1251] eta 0:36:55 lr 0.000858 time 2.2751 (2.2583) loss 4.3314 (3.7757) grad_norm 1.0307 (1.1807) [2022-01-19 22:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][280/1251] eta 0:36:35 lr 0.000858 time 2.4377 (2.2614) loss 4.0293 (3.7722) grad_norm 0.9960 (1.1806) [2022-01-19 22:03:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][290/1251] eta 0:36:12 lr 0.000858 time 1.9440 (2.2604) loss 4.0064 (3.7710) grad_norm 1.1243 (1.1800) [2022-01-19 22:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][300/1251] eta 0:35:43 lr 0.000858 time 1.8962 (2.2539) loss 4.4152 (3.7644) grad_norm 1.1234 (1.1778) [2022-01-19 22:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][310/1251] eta 0:35:15 lr 0.000858 time 2.2147 (2.2482) loss 4.3496 (3.7642) grad_norm 1.1323 (1.1762) [2022-01-19 22:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][320/1251] eta 0:34:49 lr 0.000858 time 1.8326 (2.2439) loss 4.1505 (3.7667) grad_norm 1.6884 (1.1767) [2022-01-19 22:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][330/1251] eta 0:34:23 lr 0.000858 time 1.9848 (2.2402) loss 3.9385 (3.7677) grad_norm 1.1570 (1.1762) [2022-01-19 22:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][340/1251] eta 0:33:58 lr 0.000858 time 2.5242 (2.2374) loss 4.0675 (3.7665) grad_norm 1.3782 (1.1782) [2022-01-19 22:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][350/1251] eta 0:33:33 lr 0.000858 time 1.9519 (2.2352) loss 3.3561 (3.7734) grad_norm 1.0682 (1.1770) [2022-01-19 22:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][360/1251] eta 0:33:15 lr 0.000858 time 2.9176 (2.2394) loss 4.4699 (3.7721) grad_norm 1.0308 (1.1756) [2022-01-19 22:06:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][370/1251] eta 0:32:51 lr 0.000858 time 
1.4725 (2.2384) loss 4.3968 (3.7833) grad_norm 1.0338 (1.1757) [2022-01-19 22:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][380/1251] eta 0:32:33 lr 0.000858 time 2.5270 (2.2424) loss 4.4606 (3.7909) grad_norm 1.1973 (1.1761) [2022-01-19 22:07:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][390/1251] eta 0:32:11 lr 0.000858 time 1.7740 (2.2431) loss 4.3980 (3.8018) grad_norm 1.3090 (1.1762) [2022-01-19 22:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][400/1251] eta 0:31:46 lr 0.000858 time 2.2428 (2.2405) loss 3.7768 (3.7991) grad_norm 1.1203 (1.1769) [2022-01-19 22:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][410/1251] eta 0:31:19 lr 0.000857 time 1.7162 (2.2343) loss 3.5781 (3.8042) grad_norm 1.3950 (1.1757) [2022-01-19 22:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][420/1251] eta 0:30:51 lr 0.000857 time 2.0591 (2.2281) loss 4.2500 (3.8015) grad_norm 1.0577 (1.1757) [2022-01-19 22:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][430/1251] eta 0:30:24 lr 0.000857 time 1.5460 (2.2228) loss 3.2075 (3.8013) grad_norm 1.2031 (1.1751) [2022-01-19 22:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][440/1251] eta 0:30:03 lr 0.000857 time 2.6401 (2.2243) loss 4.1833 (3.8042) grad_norm 1.0600 (1.1738) [2022-01-19 22:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][450/1251] eta 0:29:41 lr 0.000857 time 2.0958 (2.2236) loss 4.0880 (3.8140) grad_norm 1.1566 (1.1723) [2022-01-19 22:09:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][460/1251] eta 0:29:19 lr 0.000857 time 2.2294 (2.2239) loss 3.9065 (3.8109) grad_norm 1.3402 (1.1741) [2022-01-19 22:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][470/1251] eta 0:28:56 lr 0.000857 time 2.2034 (2.2237) loss 3.9453 (3.8084) grad_norm 1.3269 (1.1742) [2022-01-19 22:10:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][480/1251] eta 0:28:34 lr 0.000857 time 1.7557 (2.2235) loss 3.7318 (3.8100) grad_norm 1.0580 (1.1734) [2022-01-19 22:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][490/1251] eta 0:28:12 lr 0.000857 time 2.2440 (2.2242) loss 4.4266 (3.8055) grad_norm 1.0276 (1.1728) [2022-01-19 22:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][500/1251] eta 0:27:51 lr 0.000857 time 2.1677 (2.2256) loss 3.3593 (3.7999) grad_norm 1.1024 (1.1744) [2022-01-19 22:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][510/1251] eta 0:27:29 lr 0.000857 time 2.7345 (2.2257) loss 3.5963 (3.7990) grad_norm 1.0724 (1.1750) [2022-01-19 22:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][520/1251] eta 0:27:07 lr 0.000857 time 1.6722 (2.2258) loss 4.0354 (3.8017) grad_norm 1.3147 (1.1747) [2022-01-19 22:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][530/1251] eta 0:26:43 lr 0.000857 time 1.9798 (2.2244) loss 4.1560 (3.8051) grad_norm 1.1660 (1.1738) [2022-01-19 22:12:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][540/1251] eta 0:26:18 lr 0.000857 time 1.9324 (2.2198) loss 4.0563 (3.8104) grad_norm 1.2126 (1.1758) [2022-01-19 22:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][550/1251] eta 0:25:52 lr 0.000857 time 2.0995 (2.2151) loss 4.2428 (3.8133) grad_norm 1.1876 (1.1762) [2022-01-19 22:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][560/1251] eta 0:25:27 lr 0.000857 time 1.9277 (2.2108) loss 4.2738 (3.8155) grad_norm 1.1692 (1.1761) [2022-01-19 22:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][570/1251] eta 0:25:04 lr 0.000857 time 2.4818 (2.2088) loss 3.6401 (3.8178) grad_norm 1.1074 (1.1763) [2022-01-19 22:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][580/1251] eta 0:24:40 lr 0.000857 time 
1.7597 (2.2060) loss 4.3194 (3.8186) grad_norm 1.0752 (1.1758) [2022-01-19 22:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][590/1251] eta 0:24:17 lr 0.000857 time 1.9425 (2.2052) loss 4.2910 (3.8150) grad_norm 0.9953 (1.1748) [2022-01-19 22:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][600/1251] eta 0:23:57 lr 0.000857 time 2.8492 (2.2074) loss 3.0581 (3.8114) grad_norm 1.1786 (1.1764) [2022-01-19 22:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][610/1251] eta 0:23:34 lr 0.000857 time 1.9203 (2.2072) loss 4.0613 (3.8132) grad_norm 1.2189 (1.1769) [2022-01-19 22:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][620/1251] eta 0:23:12 lr 0.000857 time 2.6243 (2.2072) loss 3.2142 (3.8098) grad_norm 1.3120 (1.1774) [2022-01-19 22:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][630/1251] eta 0:22:50 lr 0.000857 time 2.3752 (2.2074) loss 4.2241 (3.8078) grad_norm 1.0595 (1.1776) [2022-01-19 22:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][640/1251] eta 0:22:29 lr 0.000857 time 3.1357 (2.2093) loss 4.2332 (3.8064) grad_norm 0.9476 (1.1772) [2022-01-19 22:16:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][650/1251] eta 0:22:09 lr 0.000857 time 2.7369 (2.2122) loss 4.3360 (3.8049) grad_norm 1.0375 (1.1757) [2022-01-19 22:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][660/1251] eta 0:21:48 lr 0.000857 time 2.5378 (2.2149) loss 4.3791 (3.8058) grad_norm 1.3579 (1.1761) [2022-01-19 22:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][670/1251] eta 0:21:25 lr 0.000857 time 2.4533 (2.2129) loss 3.2841 (3.8020) grad_norm 1.1481 (1.1778) [2022-01-19 22:17:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][680/1251] eta 0:21:02 lr 0.000857 time 1.9519 (2.2103) loss 3.9587 (3.8002) grad_norm 1.1467 (1.1791) [2022-01-19 22:18:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][690/1251] eta 0:20:39 lr 0.000857 time 2.5203 (2.2097) loss 4.0850 (3.8038) grad_norm 1.2484 (1.1782) [2022-01-19 22:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][700/1251] eta 0:20:17 lr 0.000857 time 2.2554 (2.2105) loss 4.1869 (3.8032) grad_norm 1.2175 (1.1778) [2022-01-19 22:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][710/1251] eta 0:19:56 lr 0.000857 time 2.2061 (2.2108) loss 4.3109 (3.8049) grad_norm 1.1406 (1.1777) [2022-01-19 22:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][720/1251] eta 0:19:33 lr 0.000857 time 1.9270 (2.2104) loss 3.0921 (3.8032) grad_norm 1.2035 (1.1784) [2022-01-19 22:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][730/1251] eta 0:19:11 lr 0.000857 time 2.4389 (2.2102) loss 4.4144 (3.8023) grad_norm 1.2791 (1.1791) [2022-01-19 22:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][740/1251] eta 0:18:48 lr 0.000857 time 1.8795 (2.2074) loss 3.7480 (3.7994) grad_norm 1.0810 (1.1795) [2022-01-19 22:20:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][750/1251] eta 0:18:25 lr 0.000856 time 2.5888 (2.2064) loss 3.9402 (3.7994) grad_norm 1.0699 (1.1782) [2022-01-19 22:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][760/1251] eta 0:18:04 lr 0.000856 time 1.7616 (2.2079) loss 3.1710 (3.8004) grad_norm 1.6240 (1.1783) [2022-01-19 22:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][770/1251] eta 0:17:41 lr 0.000856 time 2.5044 (2.2077) loss 4.1180 (3.8032) grad_norm 1.0152 (1.1777) [2022-01-19 22:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][780/1251] eta 0:17:19 lr 0.000856 time 1.5800 (2.2067) loss 3.3955 (3.8049) grad_norm 1.1462 (1.1773) [2022-01-19 22:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][790/1251] eta 0:16:57 lr 0.000856 time 
2.2911 (2.2062) loss 3.5812 (3.8030) grad_norm 1.1079 (1.1774) [2022-01-19 22:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][800/1251] eta 0:16:34 lr 0.000856 time 1.9097 (2.2058) loss 4.4475 (3.8025) grad_norm 1.0845 (1.1774) [2022-01-19 22:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][810/1251] eta 0:16:12 lr 0.000856 time 2.0739 (2.2043) loss 4.0135 (3.8018) grad_norm 1.1688 (1.1786) [2022-01-19 22:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][820/1251] eta 0:15:49 lr 0.000856 time 2.2473 (2.2033) loss 2.7794 (3.7993) grad_norm 1.2244 (1.1792) [2022-01-19 22:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][830/1251] eta 0:15:27 lr 0.000856 time 2.1547 (2.2020) loss 3.4899 (3.7985) grad_norm 1.2359 (1.1801) [2022-01-19 22:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][840/1251] eta 0:15:04 lr 0.000856 time 2.0684 (2.2017) loss 3.6044 (3.7983) grad_norm 1.0772 (1.1804) [2022-01-19 22:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][850/1251] eta 0:14:42 lr 0.000856 time 2.6502 (2.2014) loss 3.4136 (3.7976) grad_norm 1.1087 (1.1801) [2022-01-19 22:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][860/1251] eta 0:14:20 lr 0.000856 time 1.9045 (2.2010) loss 4.1518 (3.7987) grad_norm 1.1120 (1.1800) [2022-01-19 22:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][870/1251] eta 0:13:58 lr 0.000856 time 1.7973 (2.2014) loss 3.6885 (3.7996) grad_norm 0.9855 (1.1790) [2022-01-19 22:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][880/1251] eta 0:13:36 lr 0.000856 time 2.0015 (2.2007) loss 3.8289 (3.8032) grad_norm 1.1683 (1.1793) [2022-01-19 22:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][890/1251] eta 0:13:15 lr 0.000856 time 2.8533 (2.2032) loss 4.4090 (3.8047) grad_norm 1.0175 (1.1784) [2022-01-19 22:25:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][900/1251] eta 0:12:52 lr 0.000856 time 1.6108 (2.2014) loss 4.2590 (3.8063) grad_norm 1.0738 (1.1779) [2022-01-19 22:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][910/1251] eta 0:12:30 lr 0.000856 time 1.9035 (2.2008) loss 4.0819 (3.8084) grad_norm 1.1917 (1.1779) [2022-01-19 22:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][920/1251] eta 0:12:07 lr 0.000856 time 1.7662 (2.1983) loss 4.1007 (3.8096) grad_norm 1.2660 (1.1777) [2022-01-19 22:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][930/1251] eta 0:11:45 lr 0.000856 time 2.5204 (2.1979) loss 4.0680 (3.8082) grad_norm 1.1974 (1.1777) [2022-01-19 22:27:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][940/1251] eta 0:11:23 lr 0.000856 time 1.9289 (2.1978) loss 4.1500 (3.8072) grad_norm 1.3531 (1.1781) [2022-01-19 22:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][950/1251] eta 0:11:01 lr 0.000856 time 1.6567 (2.1991) loss 3.5112 (3.8076) grad_norm 1.1646 (1.1783) [2022-01-19 22:28:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][960/1251] eta 0:10:40 lr 0.000856 time 2.5362 (2.2013) loss 3.7959 (3.8087) grad_norm 1.0790 (1.1784) [2022-01-19 22:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][970/1251] eta 0:10:18 lr 0.000856 time 2.7535 (2.2023) loss 3.2791 (3.8066) grad_norm 1.1689 (1.1785) [2022-01-19 22:28:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][980/1251] eta 0:09:56 lr 0.000856 time 1.9456 (2.2021) loss 3.6256 (3.8055) grad_norm 1.2451 (1.1794) [2022-01-19 22:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][990/1251] eta 0:09:34 lr 0.000856 time 1.7168 (2.2005) loss 3.2844 (3.8049) grad_norm 1.2503 (1.1794) [2022-01-19 22:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1000/1251] eta 0:09:11 lr 0.000856 time 
1.8287 (2.1980) loss 4.1677 (3.8061) grad_norm 1.0335 (1.1796) [2022-01-19 22:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1010/1251] eta 0:08:49 lr 0.000856 time 1.9178 (2.1976) loss 3.5367 (3.8029) grad_norm 0.9437 (1.1795) [2022-01-19 22:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1020/1251] eta 0:08:27 lr 0.000856 time 1.8602 (2.1970) loss 3.9680 (3.8052) grad_norm 1.1989 (1.1791) [2022-01-19 22:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1030/1251] eta 0:08:05 lr 0.000856 time 1.9846 (2.1966) loss 4.7130 (3.8026) grad_norm 1.1166 (1.1789) [2022-01-19 22:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1040/1251] eta 0:07:43 lr 0.000856 time 1.8917 (2.1975) loss 4.2888 (3.8009) grad_norm 1.1365 (1.1793) [2022-01-19 22:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1050/1251] eta 0:07:21 lr 0.000856 time 2.1904 (2.1985) loss 3.4344 (3.8016) grad_norm 1.1435 (1.1791) [2022-01-19 22:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1060/1251] eta 0:07:00 lr 0.000856 time 1.9386 (2.2000) loss 3.2959 (3.8040) grad_norm 1.1940 (1.1792) [2022-01-19 22:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1070/1251] eta 0:06:38 lr 0.000856 time 2.1224 (2.2003) loss 4.1435 (3.8050) grad_norm 0.9938 (1.1789) [2022-01-19 22:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1080/1251] eta 0:06:16 lr 0.000856 time 2.1810 (2.1994) loss 3.3166 (3.8065) grad_norm 1.4256 (1.1788) [2022-01-19 22:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1090/1251] eta 0:05:53 lr 0.000855 time 1.7752 (2.1975) loss 4.0983 (3.8068) grad_norm 1.0909 (1.1799) [2022-01-19 22:33:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1100/1251] eta 0:05:31 lr 0.000855 time 2.4905 (2.1975) loss 3.9764 (3.8089) grad_norm 1.1208 (1.1799) [2022-01-19 22:33:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1110/1251] eta 0:05:09 lr 0.000855 time 2.0945 (2.1962) loss 4.4687 (3.8098) grad_norm 1.4427 (1.1801) [2022-01-19 22:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1120/1251] eta 0:04:47 lr 0.000855 time 2.4789 (2.1971) loss 4.5154 (3.8080) grad_norm 1.2595 (1.1805) [2022-01-19 22:34:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1130/1251] eta 0:04:25 lr 0.000855 time 2.1873 (2.1966) loss 4.5053 (3.8091) grad_norm 1.2345 (1.1809) [2022-01-19 22:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1140/1251] eta 0:04:04 lr 0.000855 time 2.1452 (2.1989) loss 3.8320 (3.8106) grad_norm 1.0753 (1.1808) [2022-01-19 22:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1150/1251] eta 0:03:42 lr 0.000855 time 1.9270 (2.1987) loss 4.1944 (3.8100) grad_norm 1.0541 (1.1798) [2022-01-19 22:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1160/1251] eta 0:03:20 lr 0.000855 time 2.5019 (2.1980) loss 3.5236 (3.8093) grad_norm 1.0814 (1.1805) [2022-01-19 22:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1170/1251] eta 0:02:57 lr 0.000855 time 1.8849 (2.1964) loss 3.9352 (3.8103) grad_norm 1.2693 (1.1806) [2022-01-19 22:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1180/1251] eta 0:02:35 lr 0.000855 time 1.8796 (2.1970) loss 2.7637 (3.8090) grad_norm 1.0534 (1.1808) [2022-01-19 22:36:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1190/1251] eta 0:02:14 lr 0.000855 time 2.3958 (2.1971) loss 3.4973 (3.8072) grad_norm 1.0893 (1.1810) [2022-01-19 22:36:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1200/1251] eta 0:01:52 lr 0.000855 time 2.7830 (2.1976) loss 2.9949 (3.8052) grad_norm 1.0123 (1.1809) [2022-01-19 22:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1210/1251] eta 0:01:30 lr 
0.000855 time 2.8378 (2.1976) loss 4.4391 (3.8067) grad_norm 1.0842 (1.1805) [2022-01-19 22:37:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1220/1251] eta 0:01:08 lr 0.000855 time 1.9267 (2.1972) loss 3.6556 (3.8090) grad_norm 1.1208 (1.1804) [2022-01-19 22:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1230/1251] eta 0:00:46 lr 0.000855 time 2.2019 (2.1966) loss 4.3486 (3.8097) grad_norm 1.1504 (1.1803) [2022-01-19 22:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1240/1251] eta 0:00:24 lr 0.000855 time 2.3278 (2.1951) loss 4.1018 (3.8105) grad_norm 1.4006 (1.1802) [2022-01-19 22:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [74/300][1250/1251] eta 0:00:02 lr 0.000855 time 1.2041 (2.1889) loss 3.8498 (3.8106) grad_norm 1.1551 (1.1802) [2022-01-19 22:38:33 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 74 training takes 0:45:38 [2022-01-19 22:38:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.651 (18.651) Loss 1.1708 (1.1708) Acc@1 72.266 (72.266) Acc@5 91.016 (91.016) [2022-01-19 22:39:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.613 (3.255) Loss 1.2261 (1.1780) Acc@1 72.168 (72.967) Acc@5 91.602 (91.460) [2022-01-19 22:39:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.496 (2.474) Loss 1.1570 (1.1970) Acc@1 73.438 (72.252) Acc@5 91.113 (91.178) [2022-01-19 22:39:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.605 (2.204) Loss 1.2281 (1.2029) Acc@1 72.363 (72.092) Acc@5 90.430 (91.145) [2022-01-19 22:40:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.494 (2.126) Loss 1.2079 (1.1990) Acc@1 69.727 (72.263) Acc@5 91.211 (91.182) [2022-01-19 22:40:08 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.336 Acc@5 91.230 [2022-01-19 22:40:08 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test 
images: 72.3% [2022-01-19 22:40:08 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.34% [2022-01-19 22:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][0/1251] eta 7:33:42 lr 0.000855 time 21.7609 (21.7609) loss 4.0784 (4.0784) grad_norm 1.1691 (1.1691) [2022-01-19 22:40:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][10/1251] eta 1:23:16 lr 0.000855 time 1.5510 (4.0265) loss 4.5222 (3.8766) grad_norm 1.0108 (1.1515) [2022-01-19 22:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][20/1251] eta 1:06:17 lr 0.000855 time 1.3545 (3.2312) loss 3.7081 (3.7452) grad_norm 1.1236 (1.1654) [2022-01-19 22:41:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][30/1251] eta 0:58:19 lr 0.000855 time 1.5688 (2.8663) loss 3.5987 (3.7724) grad_norm 1.0567 (1.1582) [2022-01-19 22:41:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][40/1251] eta 0:53:54 lr 0.000855 time 3.0236 (2.6706) loss 4.1543 (3.8308) grad_norm 1.0966 (1.1639) [2022-01-19 22:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][50/1251] eta 0:52:21 lr 0.000855 time 2.4262 (2.6157) loss 4.1020 (3.8370) grad_norm 1.2668 (1.1703) [2022-01-19 22:42:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][60/1251] eta 0:50:23 lr 0.000855 time 1.8548 (2.5387) loss 3.9145 (3.8501) grad_norm 1.1667 (1.1695) [2022-01-19 22:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][70/1251] eta 0:49:06 lr 0.000855 time 1.6855 (2.4952) loss 3.6486 (3.7894) grad_norm 1.0766 (1.1733) [2022-01-19 22:43:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][80/1251] eta 0:48:27 lr 0.000855 time 3.7924 (2.4827) loss 4.0836 (3.7616) grad_norm 1.5038 (1.1748) [2022-01-19 22:43:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][90/1251] eta 0:47:29 lr 0.000855 time 1.6753 (2.4546) loss 4.1913 (3.7762) grad_norm 1.0482 (1.1803) [2022-01-19 
22:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][100/1251] eta 0:46:25 lr 0.000855 time 1.5993 (2.4202) loss 3.8879 (3.7849) grad_norm 1.0313 (1.1782) [2022-01-19 22:44:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][110/1251] eta 0:45:21 lr 0.000855 time 1.6579 (2.3852) loss 4.0242 (3.7864) grad_norm 1.0822 (1.1770) [2022-01-19 22:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][120/1251] eta 0:44:50 lr 0.000855 time 3.3582 (2.3787) loss 4.3938 (3.7987) grad_norm 1.3133 (1.1772) [2022-01-19 22:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][130/1251] eta 0:44:08 lr 0.000855 time 1.5992 (2.3623) loss 3.1328 (3.7908) grad_norm 1.1618 (1.1844) [2022-01-19 22:45:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][140/1251] eta 0:43:29 lr 0.000855 time 2.1576 (2.3484) loss 4.3017 (3.8113) grad_norm 1.3920 (1.1859) [2022-01-19 22:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][150/1251] eta 0:42:54 lr 0.000855 time 1.8601 (2.3380) loss 3.3327 (3.8160) grad_norm 1.1340 (1.1837) [2022-01-19 22:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][160/1251] eta 0:42:19 lr 0.000855 time 2.9880 (2.3279) loss 4.1140 (3.8200) grad_norm 1.2502 (1.1887) [2022-01-19 22:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][170/1251] eta 0:41:39 lr 0.000855 time 1.5056 (2.3124) loss 4.2126 (3.8281) grad_norm 1.2530 (1.1878) [2022-01-19 22:47:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][180/1251] eta 0:41:10 lr 0.000854 time 1.9222 (2.3066) loss 3.0555 (3.8174) grad_norm 1.1361 (1.1835) [2022-01-19 22:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][190/1251] eta 0:40:39 lr 0.000854 time 1.8597 (2.2996) loss 4.5734 (3.8182) grad_norm 1.0113 (1.1814) [2022-01-19 22:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][200/1251] eta 0:40:16 lr 0.000854 
time 2.4456 (2.2996) loss 4.2666 (3.8238) grad_norm 1.3268 (1.1820) [2022-01-19 22:48:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][210/1251] eta 0:39:46 lr 0.000854 time 1.9644 (2.2923) loss 3.0749 (3.8210) grad_norm 1.1409 (1.1851) [2022-01-19 22:48:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][220/1251] eta 0:39:20 lr 0.000854 time 1.9352 (2.2898) loss 2.7260 (3.8217) grad_norm 1.4058 (1.1874) [2022-01-19 22:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][230/1251] eta 0:38:58 lr 0.000854 time 1.9000 (2.2902) loss 2.6211 (3.8157) grad_norm 1.0756 (1.1856) [2022-01-19 22:49:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][240/1251] eta 0:38:37 lr 0.000854 time 1.6123 (2.2924) loss 4.4064 (3.8162) grad_norm 1.0520 (1.1823) [2022-01-19 22:49:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][250/1251] eta 0:38:01 lr 0.000854 time 1.8528 (2.2794) loss 3.9913 (3.8161) grad_norm 1.0361 (1.1802) [2022-01-19 22:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][260/1251] eta 0:37:27 lr 0.000854 time 1.8667 (2.2675) loss 3.0848 (3.8118) grad_norm 1.2588 (1.1785) [2022-01-19 22:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][270/1251] eta 0:36:56 lr 0.000854 time 1.9590 (2.2592) loss 3.2320 (3.8048) grad_norm 1.1062 (1.1815) [2022-01-19 22:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][280/1251] eta 0:36:31 lr 0.000854 time 2.3169 (2.2565) loss 4.2810 (3.8134) grad_norm 1.2128 (1.1835) [2022-01-19 22:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][290/1251] eta 0:36:07 lr 0.000854 time 1.5573 (2.2555) loss 4.5946 (3.8276) grad_norm 1.0787 (1.1843) [2022-01-19 22:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][300/1251] eta 0:35:45 lr 0.000854 time 2.5123 (2.2562) loss 3.9634 (3.8276) grad_norm 1.1772 (1.1850) [2022-01-19 22:51:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][310/1251] eta 0:35:17 lr 0.000854 time 2.1490 (2.2508) loss 4.3302 (3.8186) grad_norm 1.1086 (1.1836) [2022-01-19 22:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][320/1251] eta 0:34:54 lr 0.000854 time 2.1679 (2.2496) loss 4.7184 (3.8248) grad_norm 1.4975 (1.1835) [2022-01-19 22:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][330/1251] eta 0:34:30 lr 0.000854 time 1.6045 (2.2481) loss 4.3880 (3.8195) grad_norm 1.2159 (1.1833) [2022-01-19 22:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][340/1251] eta 0:34:06 lr 0.000854 time 2.4140 (2.2469) loss 4.2369 (3.8247) grad_norm 0.9900 (1.1807) [2022-01-19 22:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][350/1251] eta 0:33:41 lr 0.000854 time 1.8618 (2.2442) loss 4.0042 (3.8278) grad_norm 1.0682 (1.1783) [2022-01-19 22:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][360/1251] eta 0:33:15 lr 0.000854 time 2.0549 (2.2400) loss 3.3844 (3.8316) grad_norm 1.2156 (1.1796) [2022-01-19 22:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][370/1251] eta 0:32:51 lr 0.000854 time 1.8527 (2.2377) loss 3.9500 (3.8286) grad_norm 1.0013 (1.1789) [2022-01-19 22:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][380/1251] eta 0:32:26 lr 0.000854 time 2.5003 (2.2353) loss 4.5174 (3.8238) grad_norm 1.0627 (1.1803) [2022-01-19 22:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][390/1251] eta 0:32:02 lr 0.000854 time 1.6253 (2.2329) loss 4.1699 (3.8261) grad_norm 1.1814 (1.1800) [2022-01-19 22:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][400/1251] eta 0:31:42 lr 0.000854 time 2.3025 (2.2360) loss 4.3196 (3.8285) grad_norm 0.9983 (1.1790) [2022-01-19 22:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][410/1251] eta 0:31:21 lr 0.000854 time 
2.7535 (2.2367) loss 4.2693 (3.8308) grad_norm 1.2743 (1.1804) [2022-01-19 22:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][420/1251] eta 0:30:59 lr 0.000854 time 2.8070 (2.2381) loss 4.1040 (3.8227) grad_norm 1.0761 (1.1798) [2022-01-19 22:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][430/1251] eta 0:30:33 lr 0.000854 time 1.8249 (2.2331) loss 3.9437 (3.8276) grad_norm 1.1790 (1.1800) [2022-01-19 22:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][440/1251] eta 0:30:09 lr 0.000854 time 1.9279 (2.2315) loss 3.9515 (3.8259) grad_norm 1.1384 (1.1803) [2022-01-19 22:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][450/1251] eta 0:29:46 lr 0.000854 time 2.5763 (2.2300) loss 3.0193 (3.8244) grad_norm 1.0779 (1.1794) [2022-01-19 22:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][460/1251] eta 0:29:21 lr 0.000854 time 2.1734 (2.2275) loss 2.6568 (3.8256) grad_norm 1.1983 (1.1798) [2022-01-19 22:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][470/1251] eta 0:28:57 lr 0.000854 time 2.0054 (2.2245) loss 4.3373 (3.8271) grad_norm 1.1091 (1.1806) [2022-01-19 22:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][480/1251] eta 0:28:33 lr 0.000854 time 1.9099 (2.2224) loss 3.1134 (3.8267) grad_norm 1.1319 (1.1807) [2022-01-19 22:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][490/1251] eta 0:28:13 lr 0.000854 time 2.2937 (2.2255) loss 3.8731 (3.8246) grad_norm 1.2267 (1.1811) [2022-01-19 22:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][500/1251] eta 0:27:49 lr 0.000854 time 1.6028 (2.2235) loss 3.4096 (3.8260) grad_norm 1.1062 (1.1799) [2022-01-19 22:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][510/1251] eta 0:27:27 lr 0.000854 time 2.1641 (2.2236) loss 4.0926 (3.8228) grad_norm 1.1197 (1.1795) [2022-01-19 22:59:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][520/1251] eta 0:27:08 lr 0.000853 time 3.6985 (2.2278) loss 3.4323 (3.8236) grad_norm 1.1649 (1.1800) [2022-01-19 22:59:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][530/1251] eta 0:26:44 lr 0.000853 time 1.6192 (2.2249) loss 4.6959 (3.8283) grad_norm 1.2736 (1.1798) [2022-01-19 23:00:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][540/1251] eta 0:26:17 lr 0.000853 time 1.8588 (2.2184) loss 3.4549 (3.8312) grad_norm 1.2353 (1.1795) [2022-01-19 23:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][550/1251] eta 0:25:53 lr 0.000853 time 1.8614 (2.2157) loss 3.2045 (3.8286) grad_norm 0.9624 (1.1795) [2022-01-19 23:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][560/1251] eta 0:25:31 lr 0.000853 time 3.1290 (2.2170) loss 4.0095 (3.8258) grad_norm 1.2160 (1.1808) [2022-01-19 23:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][570/1251] eta 0:25:09 lr 0.000853 time 1.7598 (2.2163) loss 4.3136 (3.8292) grad_norm 1.0363 (1.1812) [2022-01-19 23:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][580/1251] eta 0:24:47 lr 0.000853 time 2.2598 (2.2172) loss 3.9163 (3.8313) grad_norm 1.1410 (1.1803) [2022-01-19 23:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][590/1251] eta 0:24:27 lr 0.000853 time 2.2377 (2.2194) loss 4.3587 (3.8348) grad_norm 1.4047 (1.1809) [2022-01-19 23:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][600/1251] eta 0:24:05 lr 0.000853 time 2.2049 (2.2203) loss 3.6411 (3.8338) grad_norm 1.2608 (1.1822) [2022-01-19 23:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][610/1251] eta 0:23:43 lr 0.000853 time 1.7911 (2.2214) loss 3.9666 (3.8374) grad_norm 1.1166 (1.1823) [2022-01-19 23:03:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][620/1251] eta 0:23:18 lr 0.000853 time 
1.5882 (2.2164) loss 3.8391 (3.8368) grad_norm 0.9900 (1.1818) [2022-01-19 23:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][630/1251] eta 0:22:54 lr 0.000853 time 1.7276 (2.2128) loss 3.8227 (3.8380) grad_norm 1.1226 (1.1814) [2022-01-19 23:03:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][640/1251] eta 0:22:30 lr 0.000853 time 1.8950 (2.2104) loss 4.0151 (3.8372) grad_norm 1.1576 (1.1806) [2022-01-19 23:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][650/1251] eta 0:22:06 lr 0.000853 time 1.5919 (2.2079) loss 3.9068 (3.8391) grad_norm 1.2180 (1.1795) [2022-01-19 23:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][660/1251] eta 0:21:44 lr 0.000853 time 2.3490 (2.2067) loss 4.1347 (3.8415) grad_norm 1.1179 (1.1784) [2022-01-19 23:04:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][670/1251] eta 0:21:21 lr 0.000853 time 1.9973 (2.2065) loss 4.1509 (3.8423) grad_norm 1.1806 (1.1779) [2022-01-19 23:05:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][680/1251] eta 0:21:00 lr 0.000853 time 2.4907 (2.2077) loss 3.5299 (3.8421) grad_norm 1.0054 (1.1771) [2022-01-19 23:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][690/1251] eta 0:20:38 lr 0.000853 time 2.1732 (2.2075) loss 4.2479 (3.8425) grad_norm 1.2426 (1.1780) [2022-01-19 23:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][700/1251] eta 0:20:17 lr 0.000853 time 2.5286 (2.2102) loss 3.5691 (3.8417) grad_norm 1.3616 (1.1787) [2022-01-19 23:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][710/1251] eta 0:19:55 lr 0.000853 time 1.8972 (2.2104) loss 2.8604 (3.8387) grad_norm 1.0135 (1.1786) [2022-01-19 23:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][720/1251] eta 0:19:34 lr 0.000853 time 2.5870 (2.2114) loss 3.8345 (3.8388) grad_norm 1.4564 (1.1783) [2022-01-19 23:07:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][730/1251] eta 0:19:11 lr 0.000853 time 1.4974 (2.2110) loss 4.1539 (3.8386) grad_norm 1.3451 (1.1785) [2022-01-19 23:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][740/1251] eta 0:18:50 lr 0.000853 time 2.7927 (2.2114) loss 2.9282 (3.8380) grad_norm 1.2917 (1.1791) [2022-01-19 23:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][750/1251] eta 0:18:26 lr 0.000853 time 1.9531 (2.2084) loss 3.8929 (3.8336) grad_norm 1.1635 (1.1788) [2022-01-19 23:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][760/1251] eta 0:18:03 lr 0.000853 time 1.8872 (2.2063) loss 4.1454 (3.8318) grad_norm 1.3799 (1.1806) [2022-01-19 23:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][770/1251] eta 0:17:40 lr 0.000853 time 2.1669 (2.2051) loss 2.6377 (3.8340) grad_norm 1.5515 (1.1807) [2022-01-19 23:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][780/1251] eta 0:17:18 lr 0.000853 time 2.6225 (2.2039) loss 3.8872 (3.8374) grad_norm 1.1581 (1.1806) [2022-01-19 23:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][790/1251] eta 0:16:55 lr 0.000853 time 1.8724 (2.2022) loss 4.0915 (3.8404) grad_norm 1.1759 (1.1815) [2022-01-19 23:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][800/1251] eta 0:16:33 lr 0.000853 time 2.1119 (2.2019) loss 3.0481 (3.8403) grad_norm 0.9075 (1.1827) [2022-01-19 23:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][810/1251] eta 0:16:10 lr 0.000853 time 2.1915 (2.2014) loss 3.7467 (3.8395) grad_norm 1.0854 (1.1834) [2022-01-19 23:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][820/1251] eta 0:15:49 lr 0.000853 time 2.2993 (2.2022) loss 3.7239 (3.8379) grad_norm 1.2230 (1.1830) [2022-01-19 23:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][830/1251] eta 0:15:27 lr 0.000853 time 
2.4169 (2.2023) loss 3.7394 (3.8390) grad_norm 1.2751 (1.1827) [2022-01-19 23:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][840/1251] eta 0:15:05 lr 0.000853 time 2.4332 (2.2024) loss 3.5886 (3.8389) grad_norm 1.6524 (1.1865) [2022-01-19 23:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][850/1251] eta 0:14:43 lr 0.000853 time 2.2303 (2.2021) loss 3.9949 (3.8377) grad_norm 1.3072 (1.1868) [2022-01-19 23:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][860/1251] eta 0:14:20 lr 0.000852 time 2.2462 (2.2020) loss 3.8260 (3.8358) grad_norm 0.9857 (1.1870) [2022-01-19 23:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][870/1251] eta 0:13:58 lr 0.000852 time 1.5619 (2.1998) loss 2.7132 (3.8359) grad_norm 1.1563 (1.1873) [2022-01-19 23:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][880/1251] eta 0:13:36 lr 0.000852 time 2.0676 (2.1996) loss 3.9880 (3.8383) grad_norm 1.0579 (1.1863) [2022-01-19 23:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][890/1251] eta 0:13:14 lr 0.000852 time 2.2170 (2.2001) loss 3.4786 (3.8413) grad_norm 1.2735 (1.1858) [2022-01-19 23:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][900/1251] eta 0:12:52 lr 0.000852 time 2.2141 (2.2001) loss 4.1665 (3.8380) grad_norm 1.0224 (1.1865) [2022-01-19 23:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][910/1251] eta 0:12:29 lr 0.000852 time 1.8816 (2.1985) loss 3.4869 (3.8342) grad_norm 1.0075 (1.1855) [2022-01-19 23:13:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][920/1251] eta 0:12:07 lr 0.000852 time 2.5651 (2.1980) loss 4.4623 (3.8336) grad_norm 1.1889 (1.1853) [2022-01-19 23:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][930/1251] eta 0:11:45 lr 0.000852 time 1.9513 (2.1973) loss 3.8923 (3.8338) grad_norm 1.2429 (1.1854) [2022-01-19 23:14:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][940/1251] eta 0:11:23 lr 0.000852 time 2.5185 (2.1979) loss 4.3691 (3.8319) grad_norm 1.3408 (1.1858) [2022-01-19 23:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][950/1251] eta 0:11:02 lr 0.000852 time 2.1160 (2.1994) loss 4.5147 (3.8363) grad_norm 1.0468 (1.1861) [2022-01-19 23:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][960/1251] eta 0:10:40 lr 0.000852 time 2.0932 (2.2001) loss 4.8001 (3.8410) grad_norm 0.9649 (1.1862) [2022-01-19 23:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][970/1251] eta 0:10:18 lr 0.000852 time 2.0028 (2.1998) loss 3.8353 (3.8417) grad_norm 1.1030 (1.1856) [2022-01-19 23:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][980/1251] eta 0:09:55 lr 0.000852 time 1.8051 (2.1989) loss 3.6391 (3.8399) grad_norm 1.0083 (1.1846) [2022-01-19 23:16:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][990/1251] eta 0:09:33 lr 0.000852 time 1.8790 (2.1979) loss 2.8713 (3.8389) grad_norm 1.5388 (1.1848) [2022-01-19 23:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1000/1251] eta 0:09:11 lr 0.000852 time 2.5606 (2.1980) loss 4.3571 (3.8409) grad_norm 1.2777 (1.1844) [2022-01-19 23:17:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1010/1251] eta 0:08:50 lr 0.000852 time 1.8208 (2.1992) loss 4.1238 (3.8403) grad_norm 1.2609 (1.1847) [2022-01-19 23:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1020/1251] eta 0:08:28 lr 0.000852 time 1.8903 (2.1998) loss 3.7178 (3.8424) grad_norm 1.2197 (1.1854) [2022-01-19 23:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1030/1251] eta 0:08:05 lr 0.000852 time 1.6078 (2.1986) loss 3.9563 (3.8437) grad_norm 1.1207 (1.1848) [2022-01-19 23:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1040/1251] eta 0:07:43 lr 0.000852 time 
2.0863 (2.1978) loss 3.2571 (3.8424) grad_norm 1.1340 (1.1841) [2022-01-19 23:18:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1050/1251] eta 0:07:21 lr 0.000852 time 1.6094 (2.1959) loss 4.8504 (3.8455) grad_norm 1.3016 (1.1839) [2022-01-19 23:18:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1060/1251] eta 0:06:59 lr 0.000852 time 2.2844 (2.1956) loss 4.2482 (3.8463) grad_norm 1.0468 (1.1837) [2022-01-19 23:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1070/1251] eta 0:06:37 lr 0.000852 time 2.2770 (2.1951) loss 4.3860 (3.8472) grad_norm 1.0940 (1.1833) [2022-01-19 23:19:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1080/1251] eta 0:06:15 lr 0.000852 time 1.7926 (2.1944) loss 3.8314 (3.8447) grad_norm 1.2488 (1.1828) [2022-01-19 23:20:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1090/1251] eta 0:05:53 lr 0.000852 time 1.9796 (2.1946) loss 3.4543 (3.8416) grad_norm 1.0211 (1.1824) [2022-01-19 23:20:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1100/1251] eta 0:05:31 lr 0.000852 time 1.9099 (2.1947) loss 3.6668 (3.8416) grad_norm 1.3204 (1.1820) [2022-01-19 23:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1110/1251] eta 0:05:09 lr 0.000852 time 3.0577 (2.1964) loss 4.0162 (3.8415) grad_norm 1.1652 (1.1816) [2022-01-19 23:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1120/1251] eta 0:04:47 lr 0.000852 time 1.6787 (2.1956) loss 3.6759 (3.8390) grad_norm 1.0261 (1.1817) [2022-01-19 23:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1130/1251] eta 0:04:25 lr 0.000852 time 1.7979 (2.1947) loss 3.9679 (3.8413) grad_norm 1.2313 (1.1816) [2022-01-19 23:21:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1140/1251] eta 0:04:03 lr 0.000852 time 2.1367 (2.1941) loss 3.1290 (3.8401) grad_norm 1.0140 (1.1814) [2022-01-19 23:22:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1150/1251] eta 0:03:41 lr 0.000852 time 2.6777 (2.1931) loss 3.7191 (3.8389) grad_norm 1.3508 (1.1812) [2022-01-19 23:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1160/1251] eta 0:03:19 lr 0.000852 time 1.8505 (2.1928) loss 3.8602 (3.8392) grad_norm 1.1319 (1.1810) [2022-01-19 23:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1170/1251] eta 0:02:57 lr 0.000852 time 1.8168 (2.1934) loss 4.0595 (3.8377) grad_norm 1.0759 (1.1809) [2022-01-19 23:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1180/1251] eta 0:02:35 lr 0.000852 time 1.9184 (2.1932) loss 3.5340 (3.8389) grad_norm 1.2082 (1.1810) [2022-01-19 23:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1190/1251] eta 0:02:13 lr 0.000852 time 2.4641 (2.1945) loss 3.8040 (3.8389) grad_norm 1.0525 (1.1811) [2022-01-19 23:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1200/1251] eta 0:01:51 lr 0.000851 time 2.7333 (2.1948) loss 4.4882 (3.8377) grad_norm 1.2365 (1.1809) [2022-01-19 23:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1210/1251] eta 0:01:29 lr 0.000851 time 1.6577 (2.1948) loss 4.5986 (3.8357) grad_norm 1.0379 (1.1805) [2022-01-19 23:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1220/1251] eta 0:01:08 lr 0.000851 time 2.6225 (2.1945) loss 4.4968 (3.8350) grad_norm 1.1063 (1.1806) [2022-01-19 23:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1230/1251] eta 0:00:46 lr 0.000851 time 1.6846 (2.1938) loss 3.5968 (3.8358) grad_norm 1.1246 (1.1804) [2022-01-19 23:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1240/1251] eta 0:00:24 lr 0.000851 time 1.2969 (2.1919) loss 4.2453 (3.8358) grad_norm 1.0963 (1.1798) [2022-01-19 23:25:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [75/300][1250/1251] eta 0:00:02 lr 
0.000851 time 1.1694 (2.1868) loss 4.7262 (3.8342) grad_norm 1.2045 (1.1796) [2022-01-19 23:25:44 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 75 training takes 0:45:36 [2022-01-19 23:26:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.427 (18.427) Loss 1.1381 (1.1381) Acc@1 73.340 (73.340) Acc@5 92.188 (92.188) [2022-01-19 23:26:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.007 (3.251) Loss 1.2389 (1.2147) Acc@1 70.703 (71.973) Acc@5 90.039 (90.900) [2022-01-19 23:26:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.653 (2.551) Loss 1.1764 (1.1956) Acc@1 72.949 (72.424) Acc@5 91.504 (91.309) [2022-01-19 23:26:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.281 (2.243) Loss 1.2312 (1.1970) Acc@1 71.680 (72.262) Acc@5 91.211 (91.211) [2022-01-19 23:27:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.942 (2.168) Loss 1.1682 (1.1955) Acc@1 73.633 (72.192) Acc@5 91.699 (91.273) [2022-01-19 23:27:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.180 Acc@5 91.272 [2022-01-19 23:27:20 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.2% [2022-01-19 23:27:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.34% [2022-01-19 23:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][0/1251] eta 7:30:04 lr 0.000851 time 21.5863 (21.5863) loss 4.0257 (4.0257) grad_norm 1.1909 (1.1909) [2022-01-19 23:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][10/1251] eta 1:25:05 lr 0.000851 time 2.4395 (4.1141) loss 3.5432 (3.8318) grad_norm 1.0300 (1.1595) [2022-01-19 23:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][20/1251] eta 1:07:16 lr 0.000851 time 3.1252 (3.2789) loss 3.2922 (3.7819) grad_norm 1.3653 (1.1601) [2022-01-19 23:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[76/300][30/1251] eta 0:59:32 lr 0.000851 time 1.4893 (2.9263) loss 3.8295 (3.8658) grad_norm 1.3228 (1.1722) [2022-01-19 23:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][40/1251] eta 0:57:07 lr 0.000851 time 3.9116 (2.8307) loss 3.7473 (3.8800) grad_norm 1.2558 (1.1846) [2022-01-19 23:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][50/1251] eta 0:54:26 lr 0.000851 time 2.1502 (2.7199) loss 3.3465 (3.8694) grad_norm 1.0602 (1.1761) [2022-01-19 23:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][60/1251] eta 0:52:23 lr 0.000851 time 2.6279 (2.6394) loss 4.5791 (3.8716) grad_norm 1.0166 (1.1628) [2022-01-19 23:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][70/1251] eta 0:50:20 lr 0.000851 time 1.8041 (2.5576) loss 4.3738 (3.8556) grad_norm 1.3291 (1.1711) [2022-01-19 23:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][80/1251] eta 0:48:49 lr 0.000851 time 2.7977 (2.5021) loss 2.6390 (3.8250) grad_norm 1.2718 (1.1677) [2022-01-19 23:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][90/1251] eta 0:47:17 lr 0.000851 time 1.7743 (2.4440) loss 4.0594 (3.8596) grad_norm 1.0648 (1.1687) [2022-01-19 23:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][100/1251] eta 0:46:06 lr 0.000851 time 1.5972 (2.4035) loss 3.7782 (3.8577) grad_norm 1.0126 (1.1617) [2022-01-19 23:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][110/1251] eta 0:45:20 lr 0.000851 time 2.0356 (2.3842) loss 3.2202 (3.8711) grad_norm 1.2217 (1.1592) [2022-01-19 23:32:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][120/1251] eta 0:44:38 lr 0.000851 time 2.5395 (2.3679) loss 3.6253 (3.8734) grad_norm 1.3380 (1.1549) [2022-01-19 23:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][130/1251] eta 0:43:57 lr 0.000851 time 2.2094 (2.3529) loss 4.3512 (3.8857) grad_norm 1.2107 (1.1507) 
[2022-01-19 23:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][140/1251] eta 0:43:25 lr 0.000851 time 2.4660 (2.3449) loss 3.5945 (3.8804) grad_norm 1.0390 (1.1496) [2022-01-19 23:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][150/1251] eta 0:43:04 lr 0.000851 time 2.2682 (2.3473) loss 3.6786 (3.8624) grad_norm 1.1017 (1.1501) [2022-01-19 23:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][160/1251] eta 0:42:27 lr 0.000851 time 2.0365 (2.3350) loss 4.3873 (3.8633) grad_norm 1.2381 (1.1539) [2022-01-19 23:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][170/1251] eta 0:41:54 lr 0.000851 time 1.9369 (2.3260) loss 3.7030 (3.8518) grad_norm 1.2588 (1.1583) [2022-01-19 23:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][180/1251] eta 0:41:08 lr 0.000851 time 1.7461 (2.3049) loss 3.7575 (3.8259) grad_norm 1.1433 (1.1597) [2022-01-19 23:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][190/1251] eta 0:40:34 lr 0.000851 time 2.0291 (2.2942) loss 4.6260 (3.8375) grad_norm 1.0169 (1.1637) [2022-01-19 23:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][200/1251] eta 0:39:56 lr 0.000851 time 1.5744 (2.2805) loss 2.8881 (3.8373) grad_norm 1.1113 (1.1646) [2022-01-19 23:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][210/1251] eta 0:39:36 lr 0.000851 time 2.2269 (2.2826) loss 3.3727 (3.8307) grad_norm 0.9847 (1.1634) [2022-01-19 23:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][220/1251] eta 0:39:03 lr 0.000851 time 1.7881 (2.2734) loss 4.3832 (3.8273) grad_norm 1.0924 (1.1613) [2022-01-19 23:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][230/1251] eta 0:38:38 lr 0.000851 time 2.2256 (2.2707) loss 4.4088 (3.8240) grad_norm 1.3621 (1.1605) [2022-01-19 23:36:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][240/1251] eta 0:38:14 
lr 0.000851 time 1.6301 (2.2698) loss 4.0576 (3.8341) grad_norm 1.1490 (1.1621) [2022-01-19 23:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][250/1251] eta 0:37:57 lr 0.000851 time 1.8650 (2.2752) loss 2.8388 (3.8094) grad_norm 1.0771 (1.1641) [2022-01-19 23:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][260/1251] eta 0:37:29 lr 0.000851 time 2.1675 (2.2696) loss 4.2395 (3.8053) grad_norm 1.0609 (1.1688) [2022-01-19 23:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][270/1251] eta 0:37:01 lr 0.000851 time 3.0806 (2.2641) loss 3.8214 (3.8089) grad_norm 1.1515 (1.1675) [2022-01-19 23:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][280/1251] eta 0:36:34 lr 0.000851 time 2.3184 (2.2599) loss 3.9241 (3.8068) grad_norm 1.1484 (1.1676) [2022-01-19 23:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][290/1251] eta 0:36:08 lr 0.000850 time 2.3593 (2.2562) loss 2.5386 (3.8042) grad_norm 1.2360 (1.1673) [2022-01-19 23:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][300/1251] eta 0:35:43 lr 0.000850 time 2.3408 (2.2542) loss 3.3486 (3.8006) grad_norm 1.2334 (1.1668) [2022-01-19 23:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][310/1251] eta 0:35:21 lr 0.000850 time 2.4764 (2.2548) loss 2.7181 (3.7966) grad_norm 1.1175 (1.1657) [2022-01-19 23:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][320/1251] eta 0:34:57 lr 0.000850 time 2.4242 (2.2530) loss 3.1632 (3.8089) grad_norm 0.9035 (1.1647) [2022-01-19 23:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][330/1251] eta 0:34:31 lr 0.000850 time 1.8616 (2.2495) loss 3.2767 (3.8060) grad_norm 1.2818 (1.1663) [2022-01-19 23:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][340/1251] eta 0:34:07 lr 0.000850 time 1.9239 (2.2472) loss 4.0649 (3.8044) grad_norm 1.1276 (1.1681) [2022-01-19 23:40:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][350/1251] eta 0:33:41 lr 0.000850 time 1.8822 (2.2433) loss 4.1178 (3.8073) grad_norm 1.0399 (1.1666) [2022-01-19 23:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][360/1251] eta 0:33:17 lr 0.000850 time 3.1132 (2.2420) loss 4.0453 (3.8094) grad_norm 1.3801 (1.1683) [2022-01-19 23:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][370/1251] eta 0:32:51 lr 0.000850 time 2.1931 (2.2381) loss 2.4264 (3.7991) grad_norm 1.2575 (1.1696) [2022-01-19 23:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][380/1251] eta 0:32:24 lr 0.000850 time 1.8491 (2.2327) loss 3.7952 (3.7968) grad_norm 1.1413 (1.1685) [2022-01-19 23:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][390/1251] eta 0:32:02 lr 0.000850 time 2.2984 (2.2331) loss 2.7439 (3.7864) grad_norm 1.4965 (1.1693) [2022-01-19 23:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][400/1251] eta 0:31:41 lr 0.000850 time 2.8064 (2.2340) loss 3.2450 (3.7895) grad_norm 1.2647 (1.1739) [2022-01-19 23:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][410/1251] eta 0:31:17 lr 0.000850 time 2.4848 (2.2330) loss 4.1363 (3.7930) grad_norm 1.2092 (1.1754) [2022-01-19 23:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][420/1251] eta 0:30:57 lr 0.000850 time 1.8638 (2.2349) loss 3.7550 (3.7945) grad_norm 1.0697 (1.1762) [2022-01-19 23:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][430/1251] eta 0:30:32 lr 0.000850 time 1.7642 (2.2325) loss 4.1690 (3.7998) grad_norm 1.2169 (1.1766) [2022-01-19 23:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][440/1251] eta 0:30:08 lr 0.000850 time 2.8124 (2.2297) loss 4.0282 (3.8021) grad_norm 1.2061 (1.1765) [2022-01-19 23:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][450/1251] eta 0:29:40 lr 0.000850 time 
1.8943 (2.2231) loss 3.5553 (3.7975) grad_norm 1.0040 (1.1754) [2022-01-19 23:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][460/1251] eta 0:29:15 lr 0.000850 time 2.0526 (2.2196) loss 3.4988 (3.7973) grad_norm 1.2090 (1.1757) [2022-01-19 23:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][470/1251] eta 0:28:50 lr 0.000850 time 2.0291 (2.2162) loss 2.6841 (3.7969) grad_norm 1.0203 (1.1740) [2022-01-19 23:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][480/1251] eta 0:28:34 lr 0.000850 time 4.4497 (2.2234) loss 3.1204 (3.7947) grad_norm 1.3031 (1.1749) [2022-01-19 23:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][490/1251] eta 0:28:11 lr 0.000850 time 1.7335 (2.2233) loss 4.0749 (3.7960) grad_norm 1.2334 (1.1759) [2022-01-19 23:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][500/1251] eta 0:27:50 lr 0.000850 time 1.7273 (2.2238) loss 2.9125 (3.7955) grad_norm 1.0807 (1.1761) [2022-01-19 23:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][510/1251] eta 0:27:30 lr 0.000850 time 2.2967 (2.2275) loss 4.2811 (3.7946) grad_norm 1.0583 (1.1771) [2022-01-19 23:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][520/1251] eta 0:27:09 lr 0.000850 time 3.2243 (2.2291) loss 3.9627 (3.7911) grad_norm 1.4505 (1.1789) [2022-01-19 23:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][530/1251] eta 0:26:45 lr 0.000850 time 1.8844 (2.2267) loss 3.6829 (3.7921) grad_norm 1.0459 (1.1782) [2022-01-19 23:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][540/1251] eta 0:26:21 lr 0.000850 time 2.2607 (2.2240) loss 3.6403 (3.7956) grad_norm 1.2235 (1.1782) [2022-01-19 23:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][550/1251] eta 0:25:55 lr 0.000850 time 1.5608 (2.2193) loss 4.3481 (3.8017) grad_norm 1.3299 (1.1781) [2022-01-19 23:48:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][560/1251] eta 0:25:32 lr 0.000850 time 2.5029 (2.2179) loss 3.4083 (3.8013) grad_norm 1.3325 (1.1780) [2022-01-19 23:48:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][570/1251] eta 0:25:11 lr 0.000850 time 1.9756 (2.2188) loss 3.3174 (3.7966) grad_norm 1.2623 (1.1791) [2022-01-19 23:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][580/1251] eta 0:24:49 lr 0.000850 time 1.9277 (2.2203) loss 3.9587 (3.7969) grad_norm 1.1618 (1.1787) [2022-01-19 23:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][590/1251] eta 0:24:27 lr 0.000850 time 2.1898 (2.2208) loss 3.7378 (3.7977) grad_norm 1.2958 (1.1804) [2022-01-19 23:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][600/1251] eta 0:24:05 lr 0.000850 time 2.2226 (2.2204) loss 4.0490 (3.7973) grad_norm 0.9903 (1.1801) [2022-01-19 23:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][610/1251] eta 0:23:42 lr 0.000850 time 2.5905 (2.2188) loss 4.4933 (3.7954) grad_norm 1.2259 (1.1804) [2022-01-19 23:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][620/1251] eta 0:23:18 lr 0.000849 time 2.3655 (2.2157) loss 4.3610 (3.7950) grad_norm 1.1420 (1.1810) [2022-01-19 23:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][630/1251] eta 0:22:54 lr 0.000849 time 2.5954 (2.2140) loss 3.9931 (3.8019) grad_norm 1.0883 (1.1812) [2022-01-19 23:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][640/1251] eta 0:22:32 lr 0.000849 time 2.1301 (2.2128) loss 3.2847 (3.7981) grad_norm 1.2124 (1.1806) [2022-01-19 23:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][650/1251] eta 0:22:09 lr 0.000849 time 1.8149 (2.2113) loss 4.3266 (3.8027) grad_norm 1.3014 (1.1801) [2022-01-19 23:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][660/1251] eta 0:21:47 lr 0.000849 time 
2.7604 (2.2117) loss 4.5570 (3.8067) grad_norm 1.0818 (1.1811) [2022-01-19 23:52:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][670/1251] eta 0:21:23 lr 0.000849 time 2.3036 (2.2096) loss 3.2982 (3.8059) grad_norm 1.0855 (1.1815) [2022-01-19 23:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][680/1251] eta 0:21:01 lr 0.000849 time 2.3076 (2.2089) loss 4.5522 (3.8121) grad_norm 1.1359 (1.1806) [2022-01-19 23:52:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][690/1251] eta 0:20:38 lr 0.000849 time 1.5955 (2.2077) loss 3.9922 (3.8075) grad_norm 1.1351 (1.1795) [2022-01-19 23:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][700/1251] eta 0:20:16 lr 0.000849 time 1.8562 (2.2086) loss 4.6159 (3.8094) grad_norm 1.1635 (1.1783) [2022-01-19 23:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][710/1251] eta 0:19:56 lr 0.000849 time 2.5259 (2.2117) loss 4.3225 (3.8115) grad_norm 1.2542 (1.1782) [2022-01-19 23:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][720/1251] eta 0:19:35 lr 0.000849 time 2.7299 (2.2129) loss 2.9454 (3.8122) grad_norm 1.4767 (1.1786) [2022-01-19 23:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][730/1251] eta 0:19:11 lr 0.000849 time 1.7234 (2.2099) loss 4.4259 (3.8139) grad_norm 1.1145 (1.1787) [2022-01-19 23:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][740/1251] eta 0:18:49 lr 0.000849 time 2.1802 (2.2096) loss 4.1833 (3.8147) grad_norm 1.1000 (1.1784) [2022-01-19 23:54:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][750/1251] eta 0:18:26 lr 0.000849 time 2.2000 (2.2087) loss 4.2334 (3.8155) grad_norm 1.1001 (1.1799) [2022-01-19 23:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][760/1251] eta 0:18:04 lr 0.000849 time 3.2074 (2.2094) loss 3.6655 (3.8177) grad_norm 1.1346 (1.1807) [2022-01-19 23:55:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][770/1251] eta 0:17:43 lr 0.000849 time 1.9644 (2.2103) loss 3.4010 (3.8185) grad_norm 1.1241 (1.1811) [2022-01-19 23:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][780/1251] eta 0:17:21 lr 0.000849 time 2.4471 (2.2106) loss 3.3535 (3.8199) grad_norm 1.4973 (1.1828) [2022-01-19 23:56:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][790/1251] eta 0:16:57 lr 0.000849 time 1.8854 (2.2077) loss 3.4203 (3.8168) grad_norm 1.0898 (1.1834) [2022-01-19 23:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][800/1251] eta 0:16:34 lr 0.000849 time 2.2209 (2.2060) loss 4.1081 (3.8196) grad_norm 0.9348 (1.1821) [2022-01-19 23:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][810/1251] eta 0:16:13 lr 0.000849 time 2.3925 (2.2065) loss 4.3835 (3.8223) grad_norm 1.1943 (1.1818) [2022-01-19 23:57:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][820/1251] eta 0:15:51 lr 0.000849 time 2.7634 (2.2085) loss 3.2507 (3.8196) grad_norm 1.2324 (1.1817) [2022-01-19 23:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][830/1251] eta 0:15:29 lr 0.000849 time 2.4229 (2.2077) loss 3.9671 (3.8218) grad_norm 1.2649 (1.1810) [2022-01-19 23:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][840/1251] eta 0:15:07 lr 0.000849 time 1.8805 (2.2074) loss 4.7289 (3.8215) grad_norm 1.2369 (1.1806) [2022-01-19 23:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][850/1251] eta 0:14:45 lr 0.000849 time 2.8001 (2.2075) loss 3.8878 (3.8213) grad_norm 1.0964 (1.1809) [2022-01-19 23:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][860/1251] eta 0:14:22 lr 0.000849 time 1.9642 (2.2064) loss 3.0294 (3.8180) grad_norm 1.3595 (1.1815) [2022-01-19 23:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][870/1251] eta 0:14:00 lr 0.000849 time 
2.5562 (2.2065) loss 4.0808 (3.8176) grad_norm 1.2841 (1.1817) [2022-01-19 23:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][880/1251] eta 0:13:37 lr 0.000849 time 1.5016 (2.2045) loss 4.3034 (3.8156) grad_norm 1.3651 (1.1820) [2022-01-20 00:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][890/1251] eta 0:13:15 lr 0.000849 time 2.4742 (2.2048) loss 3.4001 (3.8163) grad_norm 1.1267 (1.1813) [2022-01-20 00:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][900/1251] eta 0:12:53 lr 0.000849 time 1.9054 (2.2037) loss 4.0900 (3.8182) grad_norm 1.0903 (1.1817) [2022-01-20 00:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][910/1251] eta 0:12:32 lr 0.000849 time 3.5679 (2.2067) loss 3.1610 (3.8134) grad_norm 1.1500 (1.1806) [2022-01-20 00:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][920/1251] eta 0:12:10 lr 0.000849 time 1.9922 (2.2070) loss 4.4126 (3.8103) grad_norm 1.3194 (1.1812) [2022-01-20 00:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][930/1251] eta 0:11:47 lr 0.000849 time 2.1521 (2.2054) loss 4.0893 (3.8093) grad_norm 1.3803 (1.1817) [2022-01-20 00:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][940/1251] eta 0:11:25 lr 0.000849 time 1.8485 (2.2026) loss 4.7314 (3.8069) grad_norm 1.3046 (1.1825) [2022-01-20 00:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][950/1251] eta 0:11:02 lr 0.000849 time 2.0891 (2.2005) loss 3.8706 (3.8061) grad_norm 1.1774 (1.1820) [2022-01-20 00:02:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][960/1251] eta 0:10:40 lr 0.000848 time 1.9477 (2.1995) loss 4.4756 (3.8069) grad_norm 1.1202 (1.1819) [2022-01-20 00:02:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][970/1251] eta 0:10:17 lr 0.000848 time 2.2101 (2.1982) loss 3.9835 (3.8073) grad_norm 1.3390 (1.1822) [2022-01-20 00:03:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][980/1251] eta 0:09:55 lr 0.000848 time 2.0969 (2.1978) loss 3.7114 (3.8087) grad_norm 1.1343 (1.1821) [2022-01-20 00:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][990/1251] eta 0:09:34 lr 0.000848 time 2.5112 (2.1995) loss 4.0329 (3.8083) grad_norm 1.3251 (1.1819) [2022-01-20 00:04:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1000/1251] eta 0:09:12 lr 0.000848 time 2.1432 (2.2009) loss 4.3796 (3.8083) grad_norm 1.1110 (1.1815) [2022-01-20 00:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1010/1251] eta 0:08:50 lr 0.000848 time 2.0263 (2.2005) loss 3.5017 (3.8070) grad_norm 1.0503 (1.1818) [2022-01-20 00:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1020/1251] eta 0:08:28 lr 0.000848 time 1.5698 (2.2014) loss 3.2606 (3.8059) grad_norm 1.1133 (1.1823) [2022-01-20 00:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1030/1251] eta 0:08:06 lr 0.000848 time 3.0979 (2.2021) loss 4.1187 (3.8042) grad_norm 1.2947 (1.1824) [2022-01-20 00:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1040/1251] eta 0:07:44 lr 0.000848 time 2.4993 (2.2007) loss 3.8080 (3.8060) grad_norm 1.3801 (1.1822) [2022-01-20 00:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1050/1251] eta 0:07:22 lr 0.000848 time 1.9100 (2.1992) loss 4.1869 (3.8088) grad_norm 0.9867 (1.1815) [2022-01-20 00:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1060/1251] eta 0:06:59 lr 0.000848 time 2.2222 (2.1982) loss 4.1778 (3.8085) grad_norm 1.1665 (1.1815) [2022-01-20 00:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1070/1251] eta 0:06:38 lr 0.000848 time 2.7991 (2.1998) loss 3.6574 (3.8082) grad_norm 1.1279 (1.1812) [2022-01-20 00:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1080/1251] eta 0:06:16 lr 0.000848 
time 3.4102 (2.2014) loss 3.7788 (3.8098) grad_norm 1.0596 (1.1816) [2022-01-20 00:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1090/1251] eta 0:05:54 lr 0.000848 time 1.6958 (2.2007) loss 4.2314 (3.8102) grad_norm 1.2335 (1.1817) [2022-01-20 00:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1100/1251] eta 0:05:32 lr 0.000848 time 2.3487 (2.2001) loss 3.6742 (3.8086) grad_norm 1.0778 (1.1817) [2022-01-20 00:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1110/1251] eta 0:05:09 lr 0.000848 time 2.1972 (2.1982) loss 4.1910 (3.8086) grad_norm 1.0798 (1.1816) [2022-01-20 00:08:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1120/1251] eta 0:04:47 lr 0.000848 time 2.1725 (2.1962) loss 2.9213 (3.8087) grad_norm 1.2572 (1.1818) [2022-01-20 00:08:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1130/1251] eta 0:04:25 lr 0.000848 time 1.8831 (2.1951) loss 3.0242 (3.8053) grad_norm 1.5287 (1.1827) [2022-01-20 00:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1140/1251] eta 0:04:03 lr 0.000848 time 1.5132 (2.1946) loss 3.2592 (3.8041) grad_norm 1.2279 (1.1827) [2022-01-20 00:09:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1150/1251] eta 0:03:41 lr 0.000848 time 2.4920 (2.1939) loss 2.6294 (3.8020) grad_norm 1.0741 (1.1828) [2022-01-20 00:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1160/1251] eta 0:03:19 lr 0.000848 time 2.7065 (2.1944) loss 3.4086 (3.8033) grad_norm 1.1258 (1.1827) [2022-01-20 00:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1170/1251] eta 0:02:57 lr 0.000848 time 2.8744 (2.1960) loss 3.4756 (3.8026) grad_norm 1.2136 (1.1827) [2022-01-20 00:10:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1180/1251] eta 0:02:36 lr 0.000848 time 1.8103 (2.1974) loss 3.1298 (3.8017) grad_norm 1.0713 (1.1823) [2022-01-20 00:10:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1190/1251] eta 0:02:14 lr 0.000848 time 1.9333 (2.1989) loss 4.3038 (3.8016) grad_norm 1.1280 (1.1821) [2022-01-20 00:11:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1200/1251] eta 0:01:52 lr 0.000848 time 2.7817 (2.1983) loss 4.5773 (3.8021) grad_norm 1.0003 (1.1820) [2022-01-20 00:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1210/1251] eta 0:01:30 lr 0.000848 time 1.6120 (2.1959) loss 3.8812 (3.8023) grad_norm 1.2755 (1.1818) [2022-01-20 00:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1220/1251] eta 0:01:08 lr 0.000848 time 1.8544 (2.1947) loss 3.7018 (3.7998) grad_norm 1.0468 (1.1815) [2022-01-20 00:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1230/1251] eta 0:00:46 lr 0.000848 time 2.5234 (2.1935) loss 4.4838 (3.8015) grad_norm 1.2634 (1.1815) [2022-01-20 00:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1240/1251] eta 0:00:24 lr 0.000848 time 1.4660 (2.1924) loss 3.1460 (3.7996) grad_norm 1.2902 (1.1817) [2022-01-20 00:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [76/300][1250/1251] eta 0:00:02 lr 0.000848 time 1.1601 (2.1874) loss 3.3590 (3.7976) grad_norm 1.3247 (1.1815) [2022-01-20 00:12:57 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 76 training takes 0:45:36 [2022-01-20 00:13:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.529 (18.529) Loss 1.2341 (1.2341) Acc@1 71.191 (71.191) Acc@5 91.016 (91.016) [2022-01-20 00:13:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.961 (3.571) Loss 1.0828 (1.1781) Acc@1 73.633 (72.461) Acc@5 93.457 (91.602) [2022-01-20 00:13:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.356 (2.611) Loss 1.2203 (1.1838) Acc@1 72.168 (72.480) Acc@5 90.430 (91.430) [2022-01-20 00:14:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: 
[30/49] Time 0.646 (2.330) Loss 1.1381 (1.1824) Acc@1 72.656 (72.521) Acc@5 92.285 (91.447) [2022-01-20 00:14:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.161 (2.220) Loss 1.1560 (1.1775) Acc@1 72.266 (72.604) Acc@5 92.383 (91.552) [2022-01-20 00:14:35 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.532 Acc@5 91.490 [2022-01-20 00:14:35 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.5% [2022-01-20 00:14:35 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.53% [2022-01-20 00:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][0/1251] eta 7:28:34 lr 0.000848 time 21.5143 (21.5143) loss 2.8776 (2.8776) grad_norm 1.2023 (1.2023) [2022-01-20 00:15:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][10/1251] eta 1:22:27 lr 0.000848 time 2.4730 (3.9863) loss 3.6330 (3.8203) grad_norm 1.1059 (1.1689) [2022-01-20 00:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][20/1251] eta 1:04:40 lr 0.000848 time 2.2427 (3.1526) loss 2.6077 (3.7081) grad_norm 1.0554 (1.1285) [2022-01-20 00:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][30/1251] eta 0:57:42 lr 0.000848 time 1.7721 (2.8355) loss 4.5171 (3.6524) grad_norm 1.5484 (1.1796) [2022-01-20 00:16:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][40/1251] eta 0:54:55 lr 0.000847 time 3.5544 (2.7212) loss 4.2504 (3.7361) grad_norm 1.2696 (1.1940) [2022-01-20 00:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][50/1251] eta 0:53:28 lr 0.000847 time 3.5009 (2.6715) loss 3.2440 (3.7330) grad_norm 1.1824 (1.1859) [2022-01-20 00:17:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][60/1251] eta 0:51:38 lr 0.000847 time 2.2743 (2.6017) loss 2.7740 (3.7390) grad_norm 1.1359 (1.1853) [2022-01-20 00:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][70/1251] eta 
0:49:40 lr 0.000847 time 1.9496 (2.5235) loss 3.8292 (3.7918) grad_norm 1.0817 (1.2142) [2022-01-20 00:17:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][80/1251] eta 0:48:25 lr 0.000847 time 3.2105 (2.4813) loss 3.3892 (3.8087) grad_norm 1.1534 (1.2123) [2022-01-20 00:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][90/1251] eta 0:47:19 lr 0.000847 time 2.3141 (2.4455) loss 4.3034 (3.8441) grad_norm 1.0543 (1.2003) [2022-01-20 00:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][100/1251] eta 0:46:23 lr 0.000847 time 1.8133 (2.4186) loss 3.5801 (3.8620) grad_norm 1.1013 (1.1912) [2022-01-20 00:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][110/1251] eta 0:45:38 lr 0.000847 time 1.8331 (2.4001) loss 3.8429 (3.8546) grad_norm 1.3070 (1.1947) [2022-01-20 00:19:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][120/1251] eta 0:44:58 lr 0.000847 time 2.8031 (2.3862) loss 3.9056 (3.8487) grad_norm 1.4554 (1.1941) [2022-01-20 00:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][130/1251] eta 0:44:24 lr 0.000847 time 2.9248 (2.3767) loss 4.5565 (3.8521) grad_norm 1.3652 (1.1988) [2022-01-20 00:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][140/1251] eta 0:43:41 lr 0.000847 time 1.5577 (2.3593) loss 4.6548 (3.8631) grad_norm 1.1994 (1.2027) [2022-01-20 00:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][150/1251] eta 0:42:59 lr 0.000847 time 1.7689 (2.3433) loss 3.2665 (3.8572) grad_norm 1.4335 (1.2067) [2022-01-20 00:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][160/1251] eta 0:42:24 lr 0.000847 time 2.4474 (2.3321) loss 4.7646 (3.8564) grad_norm 1.2091 (1.2083) [2022-01-20 00:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][170/1251] eta 0:41:43 lr 0.000847 time 1.6239 (2.3158) loss 4.4517 (3.8629) grad_norm 1.1071 (1.2059) [2022-01-20 00:21:33 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][180/1251] eta 0:41:15 lr 0.000847 time 2.1292 (2.3110) loss 2.7364 (3.8604) grad_norm 1.1557 (1.2016) [2022-01-20 00:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][190/1251] eta 0:40:42 lr 0.000847 time 2.0615 (2.3022) loss 4.5040 (3.8653) grad_norm 1.0860 (1.1960) [2022-01-20 00:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][200/1251] eta 0:40:13 lr 0.000847 time 2.2054 (2.2964) loss 2.8589 (3.8562) grad_norm 1.2605 (1.1961) [2022-01-20 00:22:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][210/1251] eta 0:39:41 lr 0.000847 time 2.1272 (2.2874) loss 4.0677 (3.8628) grad_norm 1.1996 (1.1982) [2022-01-20 00:22:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][220/1251] eta 0:39:12 lr 0.000847 time 2.1928 (2.2821) loss 3.7396 (3.8545) grad_norm 1.0572 (1.1964) [2022-01-20 00:23:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][230/1251] eta 0:38:47 lr 0.000847 time 2.1491 (2.2800) loss 4.2623 (3.8550) grad_norm 1.3814 (1.1954) [2022-01-20 00:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][240/1251] eta 0:38:18 lr 0.000847 time 2.1499 (2.2739) loss 4.0970 (3.8605) grad_norm 1.3683 (1.1954) [2022-01-20 00:24:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][250/1251] eta 0:37:53 lr 0.000847 time 1.5880 (2.2710) loss 4.0368 (3.8553) grad_norm 1.1826 (1.1930) [2022-01-20 00:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][260/1251] eta 0:37:26 lr 0.000847 time 2.2253 (2.2673) loss 4.1274 (3.8595) grad_norm 1.1862 (1.1910) [2022-01-20 00:24:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][270/1251] eta 0:37:02 lr 0.000847 time 2.1485 (2.2660) loss 4.0940 (3.8601) grad_norm 1.1042 (1.1909) [2022-01-20 00:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][280/1251] eta 0:36:32 lr 0.000847 time 
1.8602 (2.2580) loss 3.8461 (3.8663) grad_norm 1.2798 (1.1907) [2022-01-20 00:25:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][290/1251] eta 0:36:01 lr 0.000847 time 2.4836 (2.2497) loss 3.7801 (3.8565) grad_norm 1.1304 (1.1905) [2022-01-20 00:25:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][300/1251] eta 0:35:33 lr 0.000847 time 1.4675 (2.2433) loss 3.9029 (3.8597) grad_norm 1.0343 (1.1922) [2022-01-20 00:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][310/1251] eta 0:35:10 lr 0.000847 time 2.2563 (2.2423) loss 3.9901 (3.8595) grad_norm 1.1748 (1.1916) [2022-01-20 00:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][320/1251] eta 0:34:54 lr 0.000847 time 1.8556 (2.2495) loss 3.2384 (3.8580) grad_norm 1.1003 (1.1900) [2022-01-20 00:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][330/1251] eta 0:34:31 lr 0.000847 time 2.1897 (2.2488) loss 3.8776 (3.8506) grad_norm 1.0905 (1.1921) [2022-01-20 00:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][340/1251] eta 0:34:07 lr 0.000847 time 1.9452 (2.2474) loss 4.5049 (3.8614) grad_norm 1.0455 (1.1927) [2022-01-20 00:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][350/1251] eta 0:33:45 lr 0.000847 time 1.9125 (2.2484) loss 4.0813 (3.8556) grad_norm 1.1352 (1.1927) [2022-01-20 00:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][360/1251] eta 0:33:19 lr 0.000847 time 2.1978 (2.2439) loss 3.5534 (3.8563) grad_norm 1.2083 (1.1925) [2022-01-20 00:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][370/1251] eta 0:32:53 lr 0.000847 time 1.5792 (2.2401) loss 3.6441 (3.8544) grad_norm 1.1583 (1.1916) [2022-01-20 00:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][380/1251] eta 0:32:28 lr 0.000846 time 1.9479 (2.2368) loss 4.0543 (3.8602) grad_norm 1.1106 (1.1904) [2022-01-20 00:29:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][390/1251] eta 0:32:07 lr 0.000846 time 1.9363 (2.2390) loss 4.0431 (3.8557) grad_norm 1.4262 (1.1905) [2022-01-20 00:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][400/1251] eta 0:31:44 lr 0.000846 time 3.2797 (2.2378) loss 3.7091 (3.8515) grad_norm 1.3944 (1.1925) [2022-01-20 00:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][410/1251] eta 0:31:19 lr 0.000846 time 1.9142 (2.2344) loss 4.3796 (3.8547) grad_norm 1.1583 (1.1918) [2022-01-20 00:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][420/1251] eta 0:30:55 lr 0.000846 time 1.9990 (2.2329) loss 3.9200 (3.8538) grad_norm 1.3389 (1.1941) [2022-01-20 00:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][430/1251] eta 0:30:35 lr 0.000846 time 2.2539 (2.2362) loss 3.6709 (3.8584) grad_norm 1.1763 (1.1951) [2022-01-20 00:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][440/1251] eta 0:30:13 lr 0.000846 time 2.9551 (2.2361) loss 3.8319 (3.8569) grad_norm 1.1497 (1.1948) [2022-01-20 00:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][450/1251] eta 0:29:48 lr 0.000846 time 1.9286 (2.2330) loss 4.3040 (3.8552) grad_norm 1.1594 (1.1949) [2022-01-20 00:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][460/1251] eta 0:29:23 lr 0.000846 time 1.8598 (2.2294) loss 4.2094 (3.8572) grad_norm 1.1851 (1.1942) [2022-01-20 00:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][470/1251] eta 0:28:58 lr 0.000846 time 2.3152 (2.2264) loss 4.0526 (3.8553) grad_norm 1.1662 (1.1922) [2022-01-20 00:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][480/1251] eta 0:28:37 lr 0.000846 time 2.7347 (2.2271) loss 3.4988 (3.8500) grad_norm 1.3359 (1.1912) [2022-01-20 00:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][490/1251] eta 0:28:14 lr 0.000846 time 
1.6131 (2.2270) loss 3.0226 (3.8504) grad_norm 1.1971 (1.1921) [2022-01-20 00:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][500/1251] eta 0:27:50 lr 0.000846 time 2.1104 (2.2244) loss 2.6547 (3.8505) grad_norm 1.2219 (1.1921) [2022-01-20 00:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][510/1251] eta 0:27:25 lr 0.000846 time 1.6171 (2.2211) loss 3.8818 (3.8516) grad_norm 1.2703 (1.1917) [2022-01-20 00:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][520/1251] eta 0:27:03 lr 0.000846 time 2.3981 (2.2208) loss 3.0342 (3.8452) grad_norm 1.1148 (1.1911) [2022-01-20 00:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][530/1251] eta 0:26:42 lr 0.000846 time 2.7188 (2.2228) loss 4.2901 (3.8407) grad_norm 1.2031 (1.1905) [2022-01-20 00:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][540/1251] eta 0:26:21 lr 0.000846 time 2.4825 (2.2244) loss 4.6474 (3.8381) grad_norm 1.1677 (1.1901) [2022-01-20 00:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][550/1251] eta 0:25:58 lr 0.000846 time 2.0076 (2.2231) loss 4.4980 (3.8408) grad_norm 1.2701 (1.1899) [2022-01-20 00:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][560/1251] eta 0:25:34 lr 0.000846 time 2.1955 (2.2208) loss 4.0618 (3.8417) grad_norm 1.4852 (1.1901) [2022-01-20 00:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][570/1251] eta 0:25:09 lr 0.000846 time 1.9792 (2.2160) loss 3.6222 (3.8375) grad_norm 1.6316 (1.1904) [2022-01-20 00:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][580/1251] eta 0:24:44 lr 0.000846 time 2.4766 (2.2120) loss 4.1331 (3.8366) grad_norm 1.0595 (1.1904) [2022-01-20 00:36:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][590/1251] eta 0:24:21 lr 0.000846 time 2.1541 (2.2108) loss 3.9042 (3.8345) grad_norm 1.0313 (1.1896) [2022-01-20 00:36:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][600/1251] eta 0:24:00 lr 0.000846 time 2.6265 (2.2124) loss 3.9145 (3.8380) grad_norm 1.2692 (1.1898) [2022-01-20 00:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][610/1251] eta 0:23:40 lr 0.000846 time 2.8794 (2.2161) loss 4.3355 (3.8372) grad_norm 1.1220 (1.1900) [2022-01-20 00:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][620/1251] eta 0:23:17 lr 0.000846 time 1.9321 (2.2154) loss 3.8019 (3.8355) grad_norm 1.1040 (1.1902) [2022-01-20 00:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][630/1251] eta 0:22:56 lr 0.000846 time 2.4415 (2.2170) loss 2.7754 (3.8359) grad_norm 1.0308 (1.1892) [2022-01-20 00:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][640/1251] eta 0:22:34 lr 0.000846 time 2.2429 (2.2172) loss 3.4265 (3.8325) grad_norm 1.1168 (1.1882) [2022-01-20 00:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][650/1251] eta 0:22:11 lr 0.000846 time 2.0535 (2.2151) loss 3.5898 (3.8295) grad_norm 1.1706 (1.1872) [2022-01-20 00:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][660/1251] eta 0:21:47 lr 0.000846 time 1.6150 (2.2122) loss 3.9311 (3.8279) grad_norm 1.0911 (1.1857) [2022-01-20 00:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][670/1251] eta 0:21:26 lr 0.000846 time 2.5120 (2.2137) loss 2.5699 (3.8264) grad_norm 1.1475 (1.1855) [2022-01-20 00:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][680/1251] eta 0:21:03 lr 0.000846 time 2.0665 (2.2129) loss 4.3150 (3.8275) grad_norm 1.2929 (1.1856) [2022-01-20 00:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][690/1251] eta 0:20:41 lr 0.000846 time 1.8060 (2.2123) loss 4.2809 (3.8292) grad_norm 1.1978 (1.1851) [2022-01-20 00:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][700/1251] eta 0:20:18 lr 0.000846 time 
2.1143 (2.2112) loss 4.4257 (3.8310) grad_norm 1.0681 (1.1846) [2022-01-20 00:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][710/1251] eta 0:19:56 lr 0.000845 time 1.9253 (2.2111) loss 3.9373 (3.8328) grad_norm 1.1627 (1.1853) [2022-01-20 00:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][720/1251] eta 0:19:33 lr 0.000845 time 2.6674 (2.2093) loss 4.0754 (3.8303) grad_norm 1.0052 (1.1853) [2022-01-20 00:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][730/1251] eta 0:19:11 lr 0.000845 time 2.2423 (2.2096) loss 3.6140 (3.8292) grad_norm 1.2299 (1.1844) [2022-01-20 00:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][740/1251] eta 0:18:48 lr 0.000845 time 1.8958 (2.2085) loss 4.2955 (3.8322) grad_norm 1.0757 (1.1833) [2022-01-20 00:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][750/1251] eta 0:18:25 lr 0.000845 time 1.7703 (2.2072) loss 4.3428 (3.8316) grad_norm 1.0945 (1.1837) [2022-01-20 00:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][760/1251] eta 0:18:03 lr 0.000845 time 2.0749 (2.2066) loss 3.8181 (3.8300) grad_norm 1.1585 (1.1847) [2022-01-20 00:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][770/1251] eta 0:17:41 lr 0.000845 time 2.2113 (2.2059) loss 4.2060 (3.8325) grad_norm 1.2492 (1.1853) [2022-01-20 00:43:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][780/1251] eta 0:17:18 lr 0.000845 time 2.9179 (2.2058) loss 4.3471 (3.8335) grad_norm 1.1683 (1.1861) [2022-01-20 00:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][790/1251] eta 0:16:56 lr 0.000845 time 1.8530 (2.2054) loss 3.7972 (3.8349) grad_norm 1.2051 (1.1861) [2022-01-20 00:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][800/1251] eta 0:16:35 lr 0.000845 time 2.2140 (2.2065) loss 2.8974 (3.8325) grad_norm 1.0138 (1.1862) [2022-01-20 00:44:25 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][810/1251] eta 0:16:13 lr 0.000845 time 2.5136 (2.2080) loss 3.7119 (3.8332) grad_norm 1.1922 (1.1872) [2022-01-20 00:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][820/1251] eta 0:15:51 lr 0.000845 time 2.2362 (2.2087) loss 3.9968 (3.8337) grad_norm 1.1284 (1.1881) [2022-01-20 00:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][830/1251] eta 0:15:29 lr 0.000845 time 1.9071 (2.2090) loss 3.0968 (3.8328) grad_norm 1.1541 (1.1879) [2022-01-20 00:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][840/1251] eta 0:15:07 lr 0.000845 time 1.8448 (2.2072) loss 4.3207 (3.8336) grad_norm 1.1954 (1.1888) [2022-01-20 00:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][850/1251] eta 0:14:43 lr 0.000845 time 1.5780 (2.2044) loss 4.0261 (3.8325) grad_norm 1.1274 (1.1881) [2022-01-20 00:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][860/1251] eta 0:14:21 lr 0.000845 time 2.3682 (2.2040) loss 3.0440 (3.8327) grad_norm 1.1530 (1.1882) [2022-01-20 00:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][870/1251] eta 0:13:59 lr 0.000845 time 1.8422 (2.2040) loss 3.9953 (3.8306) grad_norm 1.2116 (1.1879) [2022-01-20 00:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][880/1251] eta 0:13:37 lr 0.000845 time 2.8604 (2.2041) loss 3.8996 (3.8284) grad_norm 1.1426 (1.1886) [2022-01-20 00:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][890/1251] eta 0:13:16 lr 0.000845 time 2.0524 (2.2057) loss 3.6555 (3.8287) grad_norm 1.1855 (1.1885) [2022-01-20 00:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][900/1251] eta 0:12:54 lr 0.000845 time 2.2245 (2.2068) loss 2.9072 (3.8285) grad_norm 1.1874 (1.1886) [2022-01-20 00:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][910/1251] eta 0:12:32 lr 0.000845 time 
1.8158 (2.2079) loss 3.0492 (3.8297) grad_norm 1.1982 (1.1888) [2022-01-20 00:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][920/1251] eta 0:12:11 lr 0.000845 time 3.1431 (2.2087) loss 2.9451 (3.8286) grad_norm 1.1041 (1.1890) [2022-01-20 00:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][930/1251] eta 0:11:48 lr 0.000845 time 2.2000 (2.2066) loss 4.1699 (3.8285) grad_norm 1.2761 (1.1891) [2022-01-20 00:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][940/1251] eta 0:11:25 lr 0.000845 time 2.1651 (2.2049) loss 4.1436 (3.8268) grad_norm 1.3126 (1.1899) [2022-01-20 00:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][950/1251] eta 0:11:03 lr 0.000845 time 1.8884 (2.2028) loss 3.8673 (3.8284) grad_norm 1.1648 (1.1907) [2022-01-20 00:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][960/1251] eta 0:10:40 lr 0.000845 time 1.5414 (2.2005) loss 3.8201 (3.8277) grad_norm 1.1860 (1.1904) [2022-01-20 00:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][970/1251] eta 0:10:18 lr 0.000845 time 2.0421 (2.2009) loss 4.4147 (3.8300) grad_norm 1.2761 (1.1913) [2022-01-20 00:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][980/1251] eta 0:09:56 lr 0.000845 time 2.5291 (2.2009) loss 4.0607 (3.8309) grad_norm 1.0866 (1.1923) [2022-01-20 00:50:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][990/1251] eta 0:09:34 lr 0.000845 time 2.2088 (2.2022) loss 2.9253 (3.8300) grad_norm 1.2481 (1.1926) [2022-01-20 00:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1000/1251] eta 0:09:12 lr 0.000845 time 1.5466 (2.2029) loss 3.0560 (3.8280) grad_norm 1.1723 (1.1922) [2022-01-20 00:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1010/1251] eta 0:08:51 lr 0.000845 time 2.1294 (2.2036) loss 3.2386 (3.8292) grad_norm 1.0996 (1.1915) [2022-01-20 00:52:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1020/1251] eta 0:08:29 lr 0.000845 time 2.8777 (2.2047) loss 4.0354 (3.8304) grad_norm 0.9818 (1.1912) [2022-01-20 00:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1030/1251] eta 0:08:07 lr 0.000845 time 1.8296 (2.2052) loss 4.0708 (3.8339) grad_norm 1.5666 (1.1915) [2022-01-20 00:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1040/1251] eta 0:07:45 lr 0.000844 time 1.9645 (2.2060) loss 2.9672 (3.8322) grad_norm 1.6733 (1.1911) [2022-01-20 00:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1050/1251] eta 0:07:22 lr 0.000844 time 1.6576 (2.2038) loss 4.4383 (3.8318) grad_norm 1.2226 (1.1904) [2022-01-20 00:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1060/1251] eta 0:07:00 lr 0.000844 time 1.9525 (2.2005) loss 4.1227 (3.8317) grad_norm 1.3855 (1.1906) [2022-01-20 00:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1070/1251] eta 0:06:37 lr 0.000844 time 1.9118 (2.1981) loss 3.2483 (3.8345) grad_norm 1.1812 (1.1908) [2022-01-20 00:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1080/1251] eta 0:06:15 lr 0.000844 time 1.8227 (2.1970) loss 2.7513 (3.8321) grad_norm 1.3814 (1.1912) [2022-01-20 00:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1090/1251] eta 0:05:53 lr 0.000844 time 2.5558 (2.1973) loss 4.2347 (3.8309) grad_norm 1.3829 (1.1917) [2022-01-20 00:54:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1100/1251] eta 0:05:31 lr 0.000844 time 2.1313 (2.1977) loss 4.4910 (3.8289) grad_norm 1.1535 (1.1911) [2022-01-20 00:55:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1110/1251] eta 0:05:09 lr 0.000844 time 2.9060 (2.1979) loss 4.0709 (3.8294) grad_norm 1.1994 (1.1906) [2022-01-20 00:55:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1120/1251] eta 0:04:47 lr 
0.000844 time 1.9129 (2.1976) loss 4.2723 (3.8286) grad_norm 1.1293 (1.1906) [2022-01-20 00:56:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1130/1251] eta 0:04:25 lr 0.000844 time 1.8657 (2.1978) loss 3.2826 (3.8277) grad_norm 1.1274 (1.1902) [2022-01-20 00:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1140/1251] eta 0:04:04 lr 0.000844 time 2.4043 (2.1988) loss 3.8675 (3.8245) grad_norm 1.1458 (1.1899) [2022-01-20 00:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1150/1251] eta 0:03:42 lr 0.000844 time 2.7845 (2.1997) loss 3.6614 (3.8240) grad_norm 1.1802 (1.1901) [2022-01-20 00:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1160/1251] eta 0:03:20 lr 0.000844 time 2.0710 (2.1994) loss 4.0100 (3.8218) grad_norm 1.1193 (1.1894) [2022-01-20 00:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1170/1251] eta 0:02:58 lr 0.000844 time 1.5872 (2.1990) loss 3.5483 (3.8180) grad_norm 1.0759 (1.1892) [2022-01-20 00:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1180/1251] eta 0:02:36 lr 0.000844 time 1.9033 (2.1995) loss 2.9599 (3.8177) grad_norm 1.1320 (1.1886) [2022-01-20 00:58:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1190/1251] eta 0:02:14 lr 0.000844 time 2.1859 (2.1995) loss 3.6650 (3.8166) grad_norm 1.1127 (1.1886) [2022-01-20 00:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1200/1251] eta 0:01:52 lr 0.000844 time 2.4105 (2.1992) loss 3.9540 (3.8149) grad_norm 1.0415 (1.1893) [2022-01-20 00:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1210/1251] eta 0:01:30 lr 0.000844 time 2.1172 (2.1987) loss 4.0322 (3.8156) grad_norm 0.9747 (1.1894) [2022-01-20 00:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1220/1251] eta 0:01:08 lr 0.000844 time 1.5667 (2.1984) loss 3.9860 (3.8151) grad_norm 1.0323 (1.1889) [2022-01-20 00:59:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1230/1251] eta 0:00:46 lr 0.000844 time 2.0276 (2.1977) loss 4.3011 (3.8172) grad_norm 1.1726 (1.1891) [2022-01-20 01:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1240/1251] eta 0:00:24 lr 0.000844 time 1.9812 (2.1967) loss 4.3226 (3.8165) grad_norm 1.3037 (1.1888) [2022-01-20 01:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [77/300][1250/1251] eta 0:00:02 lr 0.000844 time 1.3109 (2.1910) loss 3.7844 (3.8179) grad_norm 1.0955 (1.1884) [2022-01-20 01:00:16 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 77 training takes 0:45:41 [2022-01-20 01:00:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.864 (17.864) Loss 1.1873 (1.1873) Acc@1 71.094 (71.094) Acc@5 90.918 (90.918) [2022-01-20 01:00:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.989 (3.356) Loss 1.2174 (1.1788) Acc@1 71.777 (72.656) Acc@5 90.820 (91.575) [2022-01-20 01:01:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.626 (2.551) Loss 1.2052 (1.1976) Acc@1 72.070 (72.015) Acc@5 91.016 (91.434) [2022-01-20 01:01:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.323 (2.216) Loss 1.1758 (1.1868) Acc@1 72.461 (72.212) Acc@5 91.602 (91.561) [2022-01-20 01:01:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.074 (2.122) Loss 1.2236 (1.1889) Acc@1 70.410 (72.099) Acc@5 89.551 (91.502) [2022-01-20 01:01:51 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.188 Acc@5 91.452 [2022-01-20 01:01:51 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.2% [2022-01-20 01:01:51 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.53% [2022-01-20 01:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][0/1251] eta 7:41:30 lr 0.000844 time 22.1345 (22.1345) loss 4.1198 (4.1198) grad_norm 1.1999 
(1.1999) [2022-01-20 01:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][10/1251] eta 1:22:58 lr 0.000844 time 1.8711 (4.0115) loss 4.5209 (4.1783) grad_norm 1.0196 (1.2087) [2022-01-20 01:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][20/1251] eta 1:04:50 lr 0.000844 time 1.5415 (3.1607) loss 3.6322 (4.1156) grad_norm 1.1740 (1.1719) [2022-01-20 01:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][30/1251] eta 0:56:45 lr 0.000844 time 1.8836 (2.7894) loss 4.2494 (3.9927) grad_norm 1.4268 (1.1908) [2022-01-20 01:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][40/1251] eta 0:55:00 lr 0.000844 time 5.5366 (2.7256) loss 3.9989 (3.9809) grad_norm 1.0531 (1.1854) [2022-01-20 01:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][50/1251] eta 0:53:01 lr 0.000844 time 2.6073 (2.6492) loss 2.7857 (3.8815) grad_norm 1.4197 (1.1899) [2022-01-20 01:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][60/1251] eta 0:50:58 lr 0.000844 time 1.4113 (2.5681) loss 3.3283 (3.7995) grad_norm 1.1567 (1.2049) [2022-01-20 01:04:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][70/1251] eta 0:49:19 lr 0.000844 time 1.7357 (2.5055) loss 3.9093 (3.7637) grad_norm 1.1380 (1.2131) [2022-01-20 01:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][80/1251] eta 0:48:04 lr 0.000844 time 2.8764 (2.4633) loss 4.1376 (3.7748) grad_norm 1.2682 (1.2064) [2022-01-20 01:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][90/1251] eta 0:47:07 lr 0.000844 time 1.8985 (2.4356) loss 2.7564 (3.7861) grad_norm 1.0858 (1.2016) [2022-01-20 01:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][100/1251] eta 0:46:03 lr 0.000844 time 2.1668 (2.4007) loss 3.9941 (3.8129) grad_norm 1.1136 (1.1937) [2022-01-20 01:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][110/1251] eta 0:45:12 
lr 0.000844 time 1.8800 (2.3773) loss 4.2487 (3.8010) grad_norm 1.0487 (1.1856) [2022-01-20 01:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][120/1251] eta 0:44:44 lr 0.000843 time 2.4816 (2.3732) loss 2.3367 (3.7989) grad_norm 1.2206 (1.1839) [2022-01-20 01:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][130/1251] eta 0:44:14 lr 0.000843 time 1.7725 (2.3684) loss 4.0289 (3.8042) grad_norm 1.0845 (1.1830) [2022-01-20 01:07:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][140/1251] eta 0:43:37 lr 0.000843 time 2.9950 (2.3560) loss 3.2240 (3.7971) grad_norm 1.0512 (1.1776) [2022-01-20 01:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][150/1251] eta 0:42:55 lr 0.000843 time 1.8032 (2.3392) loss 3.1056 (3.7978) grad_norm 1.0088 (1.1753) [2022-01-20 01:08:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][160/1251] eta 0:42:17 lr 0.000843 time 1.9017 (2.3259) loss 4.1329 (3.8197) grad_norm 1.1728 (1.1800) [2022-01-20 01:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][170/1251] eta 0:41:45 lr 0.000843 time 1.7689 (2.3181) loss 3.4734 (3.8005) grad_norm 1.3034 (1.1805) [2022-01-20 01:08:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][180/1251] eta 0:41:08 lr 0.000843 time 2.7280 (2.3047) loss 4.0033 (3.7941) grad_norm 1.3606 (1.1828) [2022-01-20 01:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][190/1251] eta 0:40:34 lr 0.000843 time 2.3954 (2.2945) loss 2.8377 (3.7587) grad_norm 1.2078 (1.1907) [2022-01-20 01:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][200/1251] eta 0:40:05 lr 0.000843 time 2.0895 (2.2890) loss 4.3066 (3.7640) grad_norm 1.1956 (1.1901) [2022-01-20 01:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][210/1251] eta 0:39:42 lr 0.000843 time 2.0762 (2.2882) loss 3.7295 (3.7623) grad_norm 1.0648 (1.1908) [2022-01-20 01:10:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][220/1251] eta 0:39:21 lr 0.000843 time 2.1615 (2.2908) loss 2.8934 (3.7656) grad_norm 1.2660 (1.1911) [2022-01-20 01:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][230/1251] eta 0:38:56 lr 0.000843 time 2.8556 (2.2882) loss 3.5845 (3.7681) grad_norm 1.1175 (1.1913) [2022-01-20 01:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][240/1251] eta 0:38:20 lr 0.000843 time 2.5022 (2.2752) loss 4.1079 (3.7658) grad_norm 1.3485 (1.1959) [2022-01-20 01:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][250/1251] eta 0:37:47 lr 0.000843 time 1.8408 (2.2655) loss 3.9497 (3.7595) grad_norm 1.3074 (1.1982) [2022-01-20 01:11:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][260/1251] eta 0:37:17 lr 0.000843 time 2.3632 (2.2582) loss 3.9301 (3.7463) grad_norm 1.0628 (1.1985) [2022-01-20 01:12:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][270/1251] eta 0:36:52 lr 0.000843 time 2.0269 (2.2553) loss 4.1466 (3.7568) grad_norm 0.9875 (1.1987) [2022-01-20 01:12:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][280/1251] eta 0:36:32 lr 0.000843 time 2.9134 (2.2578) loss 4.2661 (3.7620) grad_norm 1.0839 (1.1981) [2022-01-20 01:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][290/1251] eta 0:36:10 lr 0.000843 time 2.0761 (2.2581) loss 3.8815 (3.7610) grad_norm 1.3043 (1.2009) [2022-01-20 01:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][300/1251] eta 0:35:51 lr 0.000843 time 2.5812 (2.2626) loss 2.7453 (3.7612) grad_norm 1.3028 (1.2021) [2022-01-20 01:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][310/1251] eta 0:35:24 lr 0.000843 time 1.8361 (2.2577) loss 3.4921 (3.7595) grad_norm 1.0677 (1.1998) [2022-01-20 01:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][320/1251] eta 0:34:57 lr 0.000843 time 
1.8382 (2.2532) loss 4.8151 (3.7571) grad_norm 1.1831 (1.1974) [2022-01-20 01:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][330/1251] eta 0:34:28 lr 0.000843 time 1.9414 (2.2455) loss 3.6590 (3.7477) grad_norm 1.1703 (1.1959) [2022-01-20 01:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][340/1251] eta 0:34:00 lr 0.000843 time 2.2260 (2.2403) loss 4.1223 (3.7449) grad_norm 1.0987 (1.1943) [2022-01-20 01:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][350/1251] eta 0:33:35 lr 0.000843 time 2.1811 (2.2366) loss 2.9512 (3.7498) grad_norm 1.2226 (1.1941) [2022-01-20 01:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][360/1251] eta 0:33:17 lr 0.000843 time 2.3313 (2.2421) loss 3.2198 (3.7523) grad_norm 1.1397 (1.1936) [2022-01-20 01:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][370/1251] eta 0:32:54 lr 0.000843 time 2.0880 (2.2408) loss 4.0516 (3.7614) grad_norm 1.1000 (1.1942) [2022-01-20 01:16:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][380/1251] eta 0:32:35 lr 0.000843 time 2.2380 (2.2447) loss 3.7326 (3.7599) grad_norm 1.1792 (1.1931) [2022-01-20 01:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][390/1251] eta 0:32:11 lr 0.000843 time 1.9673 (2.2433) loss 3.2865 (3.7589) grad_norm 1.0575 (1.1941) [2022-01-20 01:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][400/1251] eta 0:31:43 lr 0.000843 time 2.5109 (2.2369) loss 4.5316 (3.7591) grad_norm 1.1661 (1.1955) [2022-01-20 01:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][410/1251] eta 0:31:18 lr 0.000843 time 2.2565 (2.2331) loss 4.2883 (3.7625) grad_norm 1.0422 (1.1938) [2022-01-20 01:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][420/1251] eta 0:30:53 lr 0.000843 time 2.1548 (2.2310) loss 3.8829 (3.7682) grad_norm 1.0754 (1.1918) [2022-01-20 01:17:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][430/1251] eta 0:30:27 lr 0.000843 time 2.2636 (2.2263) loss 4.2747 (3.7653) grad_norm 1.0984 (1.1900) [2022-01-20 01:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][440/1251] eta 0:30:03 lr 0.000843 time 1.9383 (2.2236) loss 3.9781 (3.7656) grad_norm 1.1951 (1.1931) [2022-01-20 01:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][450/1251] eta 0:29:41 lr 0.000842 time 2.1825 (2.2238) loss 4.3642 (3.7630) grad_norm 1.1260 (1.1914) [2022-01-20 01:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][460/1251] eta 0:29:16 lr 0.000842 time 1.7122 (2.2200) loss 3.3808 (3.7634) grad_norm 1.0345 (1.1912) [2022-01-20 01:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][470/1251] eta 0:28:52 lr 0.000842 time 2.2378 (2.2179) loss 3.5231 (3.7605) grad_norm 1.3639 (1.1907) [2022-01-20 01:19:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][480/1251] eta 0:28:31 lr 0.000842 time 2.3162 (2.2203) loss 4.4142 (3.7611) grad_norm 1.4485 (1.1906) [2022-01-20 01:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][490/1251] eta 0:28:10 lr 0.000842 time 1.9047 (2.2211) loss 2.6816 (3.7638) grad_norm 1.1466 (1.1911) [2022-01-20 01:20:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][500/1251] eta 0:27:48 lr 0.000842 time 2.1819 (2.2220) loss 4.4605 (3.7690) grad_norm 1.2001 (1.1927) [2022-01-20 01:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][510/1251] eta 0:27:27 lr 0.000842 time 2.4575 (2.2236) loss 3.2719 (3.7706) grad_norm 1.0808 (1.1907) [2022-01-20 01:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][520/1251] eta 0:27:04 lr 0.000842 time 2.1547 (2.2228) loss 4.7502 (3.7703) grad_norm 1.0955 (1.1906) [2022-01-20 01:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][530/1251] eta 0:26:40 lr 0.000842 time 
2.2372 (2.2198) loss 3.9869 (3.7675) grad_norm 1.1781 (1.1898) [2022-01-20 01:21:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][540/1251] eta 0:26:15 lr 0.000842 time 1.5658 (2.2159) loss 4.2451 (3.7739) grad_norm 1.3625 (1.1920) [2022-01-20 01:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][550/1251] eta 0:25:52 lr 0.000842 time 2.2919 (2.2149) loss 3.5658 (3.7658) grad_norm 1.1923 (1.1920) [2022-01-20 01:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][560/1251] eta 0:25:30 lr 0.000842 time 2.7955 (2.2146) loss 3.0235 (3.7640) grad_norm 1.2225 (1.1919) [2022-01-20 01:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][570/1251] eta 0:25:10 lr 0.000842 time 2.7803 (2.2188) loss 4.2787 (3.7639) grad_norm 1.3281 (1.1921) [2022-01-20 01:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][580/1251] eta 0:24:48 lr 0.000842 time 1.8866 (2.2185) loss 3.0180 (3.7663) grad_norm 1.0790 (1.1909) [2022-01-20 01:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][590/1251] eta 0:24:25 lr 0.000842 time 1.8676 (2.2168) loss 3.2870 (3.7673) grad_norm 1.0863 (1.1910) [2022-01-20 01:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][600/1251] eta 0:24:00 lr 0.000842 time 2.0553 (2.2124) loss 2.8387 (3.7604) grad_norm 1.3037 (1.1924) [2022-01-20 01:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][610/1251] eta 0:23:37 lr 0.000842 time 1.8545 (2.2118) loss 3.2242 (3.7570) grad_norm 1.2495 (1.1914) [2022-01-20 01:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][620/1251] eta 0:23:16 lr 0.000842 time 1.8393 (2.2132) loss 4.5988 (3.7590) grad_norm 1.2947 (1.1907) [2022-01-20 01:25:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][630/1251] eta 0:22:54 lr 0.000842 time 1.9506 (2.2128) loss 4.2635 (3.7558) grad_norm 1.3834 (1.1907) [2022-01-20 01:25:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][640/1251] eta 0:22:32 lr 0.000842 time 2.1845 (2.2141) loss 3.3757 (3.7531) grad_norm 1.0469 (1.1906) [2022-01-20 01:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][650/1251] eta 0:22:10 lr 0.000842 time 1.8870 (2.2136) loss 4.0556 (3.7571) grad_norm 1.2442 (1.1905) [2022-01-20 01:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][660/1251] eta 0:21:47 lr 0.000842 time 2.6696 (2.2120) loss 2.6162 (3.7542) grad_norm 1.2445 (1.1903) [2022-01-20 01:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][670/1251] eta 0:21:22 lr 0.000842 time 1.8544 (2.2079) loss 4.6710 (3.7584) grad_norm 1.1280 (1.1902) [2022-01-20 01:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][680/1251] eta 0:21:00 lr 0.000842 time 2.6715 (2.2076) loss 3.6698 (3.7623) grad_norm 1.0861 (1.1902) [2022-01-20 01:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][690/1251] eta 0:20:38 lr 0.000842 time 1.5763 (2.2075) loss 4.6454 (3.7627) grad_norm 1.2277 (1.1896) [2022-01-20 01:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][700/1251] eta 0:20:17 lr 0.000842 time 2.4656 (2.2102) loss 4.1124 (3.7635) grad_norm 1.2454 (1.1901) [2022-01-20 01:28:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][710/1251] eta 0:19:54 lr 0.000842 time 2.0726 (2.2085) loss 4.3941 (3.7666) grad_norm 1.2698 (1.1910) [2022-01-20 01:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][720/1251] eta 0:19:33 lr 0.000842 time 2.9190 (2.2094) loss 3.5466 (3.7660) grad_norm 1.1623 (1.1908) [2022-01-20 01:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][730/1251] eta 0:19:10 lr 0.000842 time 1.9276 (2.2086) loss 3.7221 (3.7671) grad_norm 1.2779 (1.1901) [2022-01-20 01:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][740/1251] eta 0:18:48 lr 0.000842 time 
2.0393 (2.2085) loss 4.4537 (3.7655) grad_norm 1.2482 (1.1902) [2022-01-20 01:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][750/1251] eta 0:18:26 lr 0.000842 time 2.0610 (2.2080) loss 3.8665 (3.7688) grad_norm 1.4019 (1.1910) [2022-01-20 01:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][760/1251] eta 0:18:04 lr 0.000842 time 3.1155 (2.2090) loss 3.4682 (3.7690) grad_norm 1.2178 (1.1914) [2022-01-20 01:30:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][770/1251] eta 0:17:41 lr 0.000842 time 1.8083 (2.2074) loss 4.1876 (3.7698) grad_norm 1.2720 (1.1909) [2022-01-20 01:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][780/1251] eta 0:17:18 lr 0.000841 time 2.5414 (2.2054) loss 4.2538 (3.7720) grad_norm 1.1861 (1.1916) [2022-01-20 01:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][790/1251] eta 0:16:56 lr 0.000841 time 1.9840 (2.2046) loss 4.1378 (3.7736) grad_norm 1.2326 (1.1911) [2022-01-20 01:31:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][800/1251] eta 0:16:34 lr 0.000841 time 2.7793 (2.2047) loss 2.7729 (3.7763) grad_norm 1.0250 (1.1904) [2022-01-20 01:31:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][810/1251] eta 0:16:11 lr 0.000841 time 1.6883 (2.2024) loss 2.4494 (3.7755) grad_norm 1.2068 (1.1904) [2022-01-20 01:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][820/1251] eta 0:15:49 lr 0.000841 time 2.1817 (2.2025) loss 3.0162 (3.7767) grad_norm 1.2918 (1.1925) [2022-01-20 01:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][830/1251] eta 0:15:26 lr 0.000841 time 2.1623 (2.2017) loss 4.5769 (3.7782) grad_norm 1.1599 (1.1924) [2022-01-20 01:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][840/1251] eta 0:15:04 lr 0.000841 time 2.2992 (2.2017) loss 3.1609 (3.7738) grad_norm 1.1581 (1.1919) [2022-01-20 01:33:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][850/1251] eta 0:14:42 lr 0.000841 time 1.7569 (2.2019) loss 3.6894 (3.7736) grad_norm 1.1325 (1.1929) [2022-01-20 01:33:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][860/1251] eta 0:14:21 lr 0.000841 time 2.4272 (2.2035) loss 4.0507 (3.7796) grad_norm 1.1361 (1.1928) [2022-01-20 01:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][870/1251] eta 0:13:58 lr 0.000841 time 1.8928 (2.2020) loss 4.7087 (3.7806) grad_norm 1.0731 (1.1925) [2022-01-20 01:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][880/1251] eta 0:13:36 lr 0.000841 time 1.7177 (2.2011) loss 3.4316 (3.7812) grad_norm 1.2806 (1.1926) [2022-01-20 01:34:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][890/1251] eta 0:13:15 lr 0.000841 time 1.9820 (2.2035) loss 3.8307 (3.7841) grad_norm 1.1494 (1.1926) [2022-01-20 01:34:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][900/1251] eta 0:12:53 lr 0.000841 time 1.9589 (2.2037) loss 3.6676 (3.7832) grad_norm 1.0341 (1.1926) [2022-01-20 01:35:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][910/1251] eta 0:12:30 lr 0.000841 time 1.5122 (2.2008) loss 4.2365 (3.7851) grad_norm 1.3970 (1.1922) [2022-01-20 01:35:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][920/1251] eta 0:12:08 lr 0.000841 time 1.5026 (2.1994) loss 3.9819 (3.7854) grad_norm 1.1603 (1.1913) [2022-01-20 01:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][930/1251] eta 0:11:46 lr 0.000841 time 2.0195 (2.1995) loss 3.7355 (3.7868) grad_norm 1.3355 (1.1920) [2022-01-20 01:36:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][940/1251] eta 0:11:24 lr 0.000841 time 2.5820 (2.1999) loss 4.3248 (3.7883) grad_norm 1.1236 (1.1916) [2022-01-20 01:36:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][950/1251] eta 0:11:02 lr 0.000841 time 
1.9335 (2.2002) loss 3.9299 (3.7877) grad_norm 1.1403 (1.1914) [2022-01-20 01:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][960/1251] eta 0:10:40 lr 0.000841 time 1.8731 (2.2006) loss 3.3467 (3.7880) grad_norm 0.9498 (1.1912) [2022-01-20 01:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][970/1251] eta 0:10:18 lr 0.000841 time 2.1688 (2.2001) loss 4.0596 (3.7908) grad_norm 1.1089 (1.1908) [2022-01-20 01:37:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][980/1251] eta 0:09:55 lr 0.000841 time 1.8778 (2.1986) loss 3.2526 (3.7926) grad_norm 1.0391 (1.1909) [2022-01-20 01:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][990/1251] eta 0:09:33 lr 0.000841 time 1.9371 (2.1975) loss 4.0809 (3.7958) grad_norm 1.2259 (1.1905) [2022-01-20 01:38:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1000/1251] eta 0:09:11 lr 0.000841 time 1.5369 (2.1963) loss 2.4741 (3.7923) grad_norm 1.0596 (1.1900) [2022-01-20 01:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1010/1251] eta 0:08:49 lr 0.000841 time 2.3799 (2.1963) loss 4.4440 (3.7912) grad_norm 1.1391 (1.1898) [2022-01-20 01:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1020/1251] eta 0:08:27 lr 0.000841 time 1.9226 (2.1963) loss 2.6841 (3.7907) grad_norm 1.1233 (1.1894) [2022-01-20 01:39:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1030/1251] eta 0:08:05 lr 0.000841 time 2.5919 (2.1964) loss 3.8341 (3.7926) grad_norm 1.3069 (1.1888) [2022-01-20 01:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1040/1251] eta 0:07:43 lr 0.000841 time 2.0490 (2.1969) loss 3.5138 (3.7936) grad_norm 1.1159 (1.1883) [2022-01-20 01:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1050/1251] eta 0:07:21 lr 0.000841 time 2.1538 (2.1969) loss 3.8820 (3.7962) grad_norm 1.2657 (1.1879) [2022-01-20 01:40:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1060/1251] eta 0:06:59 lr 0.000841 time 1.7031 (2.1964) loss 3.5610 (3.7983) grad_norm 1.3346 (1.1883) [2022-01-20 01:41:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1070/1251] eta 0:06:37 lr 0.000841 time 3.1880 (2.1987) loss 3.6571 (3.7986) grad_norm 1.1850 (1.1882) [2022-01-20 01:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1080/1251] eta 0:06:15 lr 0.000841 time 2.5721 (2.1988) loss 3.7230 (3.7971) grad_norm 1.1459 (1.1883) [2022-01-20 01:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1090/1251] eta 0:05:53 lr 0.000841 time 2.0088 (2.1975) loss 3.4876 (3.7963) grad_norm 1.3341 (1.1885) [2022-01-20 01:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1100/1251] eta 0:05:31 lr 0.000841 time 1.6021 (2.1962) loss 2.8759 (3.7963) grad_norm 1.1749 (1.1888) [2022-01-20 01:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1110/1251] eta 0:05:09 lr 0.000840 time 1.8257 (2.1947) loss 4.3231 (3.7994) grad_norm 1.1141 (1.1880) [2022-01-20 01:42:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1120/1251] eta 0:04:47 lr 0.000840 time 3.0379 (2.1953) loss 4.4606 (3.8004) grad_norm 1.1606 (1.1875) [2022-01-20 01:43:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1130/1251] eta 0:04:25 lr 0.000840 time 1.8640 (2.1944) loss 2.6157 (3.7996) grad_norm 1.3217 (1.1877) [2022-01-20 01:43:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1140/1251] eta 0:04:03 lr 0.000840 time 1.5317 (2.1945) loss 3.9500 (3.7991) grad_norm 1.1318 (1.1874) [2022-01-20 01:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1150/1251] eta 0:03:41 lr 0.000840 time 1.8965 (2.1948) loss 3.9012 (3.7995) grad_norm 1.0698 (1.1870) [2022-01-20 01:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1160/1251] eta 0:03:19 lr 
0.000840 time 2.5385 (2.1963) loss 4.1163 (3.7984) grad_norm 1.1175 (1.1865) [2022-01-20 01:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1170/1251] eta 0:02:57 lr 0.000840 time 1.8865 (2.1961) loss 4.1587 (3.7997) grad_norm 0.9611 (1.1857) [2022-01-20 01:45:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1180/1251] eta 0:02:35 lr 0.000840 time 2.6154 (2.1955) loss 3.1868 (3.8009) grad_norm 1.1252 (1.1859) [2022-01-20 01:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1190/1251] eta 0:02:13 lr 0.000840 time 1.8514 (2.1957) loss 3.5193 (3.8008) grad_norm 0.9563 (1.1860) [2022-01-20 01:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1200/1251] eta 0:01:51 lr 0.000840 time 2.5900 (2.1958) loss 4.9208 (3.8017) grad_norm 1.1498 (1.1859) [2022-01-20 01:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1210/1251] eta 0:01:30 lr 0.000840 time 1.9877 (2.1953) loss 3.2226 (3.8013) grad_norm 1.1183 (1.1859) [2022-01-20 01:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1220/1251] eta 0:01:08 lr 0.000840 time 1.8735 (2.1946) loss 4.1185 (3.8009) grad_norm 1.1025 (1.1857) [2022-01-20 01:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1230/1251] eta 0:00:46 lr 0.000840 time 2.1038 (2.1945) loss 3.5060 (3.8016) grad_norm 1.1524 (1.1855) [2022-01-20 01:47:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1240/1251] eta 0:00:24 lr 0.000840 time 1.8994 (2.1933) loss 4.0791 (3.8006) grad_norm 1.0477 (1.1858) [2022-01-20 01:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [78/300][1250/1251] eta 0:00:02 lr 0.000840 time 1.2082 (2.1876) loss 4.1262 (3.7998) grad_norm 1.4235 (1.1856) [2022-01-20 01:47:28 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 78 training takes 0:45:37 [2022-01-20 01:47:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.261 (18.261) Loss 
1.0615 (1.0615) Acc@1 73.535 (73.535) Acc@5 92.773 (92.773) [2022-01-20 01:48:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.572 (3.529) Loss 1.1308 (1.1488) Acc@1 73.535 (72.514) Acc@5 92.871 (91.744) [2022-01-20 01:48:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.622 (2.581) Loss 1.1160 (1.1607) Acc@1 73.633 (72.638) Acc@5 92.090 (91.336) [2022-01-20 01:48:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.921 (2.318) Loss 1.0405 (1.1607) Acc@1 75.879 (72.713) Acc@5 92.480 (91.343) [2022-01-20 01:48:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.946 (2.159) Loss 1.1459 (1.1638) Acc@1 72.363 (72.559) Acc@5 92.578 (91.316) [2022-01-20 01:49:03 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.590 Acc@5 91.364 [2022-01-20 01:49:03 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.6% [2022-01-20 01:49:03 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.59% [2022-01-20 01:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][0/1251] eta 7:45:07 lr 0.000840 time 22.3079 (22.3079) loss 3.0877 (3.0877) grad_norm 1.4405 (1.4405) [2022-01-20 01:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][10/1251] eta 1:25:05 lr 0.000840 time 2.1677 (4.1136) loss 2.7692 (3.8081) grad_norm 1.4923 (1.2000) [2022-01-20 01:50:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][20/1251] eta 1:04:56 lr 0.000840 time 1.3089 (3.1653) loss 3.8385 (3.7640) grad_norm 1.1980 (1.1856) [2022-01-20 01:50:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][30/1251] eta 0:57:54 lr 0.000840 time 1.9815 (2.8458) loss 4.3920 (3.8690) grad_norm 1.0568 (1.1867) [2022-01-20 01:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][40/1251] eta 0:55:10 lr 0.000840 time 3.7795 (2.7339) loss 4.2418 (3.8185) grad_norm 1.1812 (1.1871) 
[2022-01-20 01:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][50/1251] eta 0:53:54 lr 0.000840 time 2.6780 (2.6932) loss 2.5503 (3.7576) grad_norm 1.2689 (1.1797) [2022-01-20 01:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][60/1251] eta 0:51:44 lr 0.000840 time 1.3084 (2.6068) loss 3.9367 (3.7459) grad_norm 1.2994 (1.1895) [2022-01-20 01:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][70/1251] eta 0:49:59 lr 0.000840 time 1.7975 (2.5401) loss 4.0973 (3.7201) grad_norm 1.1156 (1.1906) [2022-01-20 01:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][80/1251] eta 0:48:59 lr 0.000840 time 3.4131 (2.5104) loss 3.8512 (3.7199) grad_norm 1.2215 (1.1827) [2022-01-20 01:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][90/1251] eta 0:47:44 lr 0.000840 time 1.5250 (2.4676) loss 3.9044 (3.7117) grad_norm 1.1744 (1.2016) [2022-01-20 01:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][100/1251] eta 0:46:40 lr 0.000840 time 1.8058 (2.4333) loss 3.5443 (3.7109) grad_norm 1.1129 (1.2073) [2022-01-20 01:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][110/1251] eta 0:45:54 lr 0.000840 time 2.4339 (2.4139) loss 4.0991 (3.7169) grad_norm 1.0828 (1.2031) [2022-01-20 01:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][120/1251] eta 0:45:09 lr 0.000840 time 2.2078 (2.3959) loss 3.7290 (3.7207) grad_norm 1.2154 (1.1991) [2022-01-20 01:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][130/1251] eta 0:44:19 lr 0.000840 time 1.6217 (2.3723) loss 4.5033 (3.7426) grad_norm 1.1626 (1.1934) [2022-01-20 01:54:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][140/1251] eta 0:43:30 lr 0.000840 time 1.9070 (2.3493) loss 2.7730 (3.7308) grad_norm 1.5296 (1.1949) [2022-01-20 01:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][150/1251] eta 0:43:00 lr 
0.000840 time 2.2514 (2.3436) loss 3.9705 (3.7381) grad_norm 1.2603 (1.2017) [2022-01-20 01:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][160/1251] eta 0:42:21 lr 0.000840 time 1.9586 (2.3293) loss 3.7644 (3.7523) grad_norm 1.0665 (1.1983) [2022-01-20 01:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][170/1251] eta 0:41:46 lr 0.000840 time 2.2600 (2.3182) loss 3.2224 (3.7519) grad_norm 1.3999 (1.2013) [2022-01-20 01:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][180/1251] eta 0:41:21 lr 0.000840 time 3.0670 (2.3166) loss 4.1632 (3.7626) grad_norm 1.2216 (1.1963) [2022-01-20 01:56:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][190/1251] eta 0:40:52 lr 0.000839 time 2.5454 (2.3115) loss 4.5072 (3.7621) grad_norm 0.9972 (1.1953) [2022-01-20 01:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][200/1251] eta 0:40:19 lr 0.000839 time 2.1776 (2.3023) loss 3.7638 (3.7678) grad_norm 1.0436 (1.1929) [2022-01-20 01:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][210/1251] eta 0:39:48 lr 0.000839 time 2.1904 (2.2940) loss 4.5183 (3.7701) grad_norm 1.1287 (1.1928) [2022-01-20 01:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][220/1251] eta 0:39:16 lr 0.000839 time 2.3255 (2.2859) loss 2.7672 (3.7724) grad_norm 1.3772 (1.1940) [2022-01-20 01:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][230/1251] eta 0:38:55 lr 0.000839 time 2.6834 (2.2874) loss 2.9449 (3.7674) grad_norm 1.4721 (1.1953) [2022-01-20 01:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][240/1251] eta 0:38:21 lr 0.000839 time 2.5433 (2.2768) loss 3.9806 (3.7660) grad_norm 1.1278 (1.1947) [2022-01-20 01:58:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][250/1251] eta 0:37:53 lr 0.000839 time 2.7439 (2.2710) loss 2.5765 (3.7591) grad_norm 1.3502 (1.1955) [2022-01-20 01:58:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][260/1251] eta 0:37:23 lr 0.000839 time 1.9768 (2.2638) loss 4.2053 (3.7550) grad_norm 1.1053 (1.1946) [2022-01-20 01:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][270/1251] eta 0:37:01 lr 0.000839 time 2.1279 (2.2643) loss 3.7272 (3.7654) grad_norm 1.2500 (1.1924) [2022-01-20 01:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][280/1251] eta 0:36:44 lr 0.000839 time 3.3445 (2.2698) loss 3.5694 (3.7619) grad_norm 1.0519 (1.1931) [2022-01-20 02:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][290/1251] eta 0:36:23 lr 0.000839 time 3.0596 (2.2716) loss 4.6475 (3.7636) grad_norm 1.5271 (1.1953) [2022-01-20 02:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][300/1251] eta 0:35:52 lr 0.000839 time 1.8812 (2.2637) loss 3.7128 (3.7681) grad_norm 1.0809 (1.1954) [2022-01-20 02:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][310/1251] eta 0:35:21 lr 0.000839 time 1.7425 (2.2544) loss 2.8182 (3.7734) grad_norm 1.3082 (1.1940) [2022-01-20 02:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][320/1251] eta 0:34:53 lr 0.000839 time 2.1343 (2.2483) loss 4.3412 (3.7665) grad_norm 1.1028 (1.1915) [2022-01-20 02:01:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][330/1251] eta 0:34:30 lr 0.000839 time 3.0904 (2.2483) loss 4.3565 (3.7714) grad_norm 1.2217 (1.1898) [2022-01-20 02:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][340/1251] eta 0:34:10 lr 0.000839 time 2.2978 (2.2505) loss 3.3087 (3.7674) grad_norm 1.0460 (1.1885) [2022-01-20 02:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][350/1251] eta 0:33:45 lr 0.000839 time 2.1776 (2.2483) loss 4.1686 (3.7668) grad_norm 1.1572 (1.1885) [2022-01-20 02:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][360/1251] eta 0:33:22 lr 0.000839 time 
2.0875 (2.2475) loss 4.6705 (3.7732) grad_norm 1.0229 (1.1881) [2022-01-20 02:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][370/1251] eta 0:33:00 lr 0.000839 time 3.2828 (2.2477) loss 4.1006 (3.7739) grad_norm 1.2002 (1.1893) [2022-01-20 02:03:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][380/1251] eta 0:32:34 lr 0.000839 time 1.6173 (2.2438) loss 4.1773 (3.7710) grad_norm 1.0775 (1.1879) [2022-01-20 02:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][390/1251] eta 0:32:09 lr 0.000839 time 1.9048 (2.2409) loss 3.3880 (3.7748) grad_norm 1.2266 (1.1863) [2022-01-20 02:04:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][400/1251] eta 0:31:45 lr 0.000839 time 1.8217 (2.2396) loss 3.0952 (3.7662) grad_norm 1.2497 (1.1868) [2022-01-20 02:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][410/1251] eta 0:31:20 lr 0.000839 time 2.6413 (2.2359) loss 4.1331 (3.7751) grad_norm 1.1022 (1.1872) [2022-01-20 02:04:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][420/1251] eta 0:30:55 lr 0.000839 time 1.8969 (2.2327) loss 4.5107 (3.7825) grad_norm 1.1361 (1.1875) [2022-01-20 02:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][430/1251] eta 0:30:29 lr 0.000839 time 1.6474 (2.2280) loss 4.2198 (3.7888) grad_norm 1.1975 (1.1866) [2022-01-20 02:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][440/1251] eta 0:30:06 lr 0.000839 time 2.1742 (2.2279) loss 3.4301 (3.7916) grad_norm 1.1509 (1.1871) [2022-01-20 02:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][450/1251] eta 0:29:44 lr 0.000839 time 2.8984 (2.2283) loss 4.6047 (3.7906) grad_norm 1.2352 (1.1874) [2022-01-20 02:06:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][460/1251] eta 0:29:20 lr 0.000839 time 1.7790 (2.2263) loss 3.7778 (3.7894) grad_norm 1.4264 (1.1872) [2022-01-20 02:06:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][470/1251] eta 0:28:57 lr 0.000839 time 2.0452 (2.2243) loss 2.9312 (3.7879) grad_norm 1.2861 (1.1867) [2022-01-20 02:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][480/1251] eta 0:28:34 lr 0.000839 time 1.5859 (2.2239) loss 2.5869 (3.7889) grad_norm 1.2404 (1.1861) [2022-01-20 02:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][490/1251] eta 0:28:13 lr 0.000839 time 3.0087 (2.2259) loss 4.1607 (3.7910) grad_norm 1.0715 (1.1859) [2022-01-20 02:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][500/1251] eta 0:27:52 lr 0.000839 time 1.5170 (2.2272) loss 3.3586 (3.7879) grad_norm 1.0166 (1.1868) [2022-01-20 02:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][510/1251] eta 0:27:30 lr 0.000838 time 1.7662 (2.2277) loss 3.3875 (3.7854) grad_norm 1.0126 (1.1859) [2022-01-20 02:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][520/1251] eta 0:27:07 lr 0.000838 time 1.7993 (2.2267) loss 3.8944 (3.7860) grad_norm 1.0969 (1.1863) [2022-01-20 02:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][530/1251] eta 0:26:43 lr 0.000838 time 1.9102 (2.2236) loss 3.1742 (3.7877) grad_norm 1.3167 (1.1861) [2022-01-20 02:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][540/1251] eta 0:26:19 lr 0.000838 time 1.8302 (2.2217) loss 4.5434 (3.7910) grad_norm 1.0639 (1.1857) [2022-01-20 02:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][550/1251] eta 0:25:56 lr 0.000838 time 1.9982 (2.2206) loss 4.4956 (3.7970) grad_norm 1.1210 (1.1838) [2022-01-20 02:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][560/1251] eta 0:25:33 lr 0.000838 time 2.1081 (2.2195) loss 3.9227 (3.7996) grad_norm 1.3920 (1.1848) [2022-01-20 02:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][570/1251] eta 0:25:10 lr 0.000838 time 
1.9469 (2.2174) loss 4.0509 (3.7975) grad_norm 1.4093 (1.1868) [2022-01-20 02:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][580/1251] eta 0:24:48 lr 0.000838 time 1.9435 (2.2177) loss 4.0055 (3.7998) grad_norm 1.0714 (1.1871) [2022-01-20 02:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][590/1251] eta 0:24:26 lr 0.000838 time 1.9253 (2.2191) loss 4.5198 (3.8029) grad_norm 1.2529 (1.1878) [2022-01-20 02:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][600/1251] eta 0:24:04 lr 0.000838 time 2.1708 (2.2187) loss 3.3667 (3.8043) grad_norm 1.1836 (1.1879) [2022-01-20 02:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][610/1251] eta 0:23:42 lr 0.000838 time 1.8372 (2.2186) loss 4.8213 (3.8066) grad_norm 1.0995 (1.1884) [2022-01-20 02:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][620/1251] eta 0:23:18 lr 0.000838 time 2.0559 (2.2164) loss 4.4692 (3.8088) grad_norm 1.3905 (1.1892) [2022-01-20 02:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][630/1251] eta 0:22:54 lr 0.000838 time 2.1651 (2.2135) loss 4.4836 (3.8144) grad_norm 1.0053 (1.1878) [2022-01-20 02:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][640/1251] eta 0:22:29 lr 0.000838 time 2.2094 (2.2094) loss 3.6532 (3.8141) grad_norm 1.1389 (1.1872) [2022-01-20 02:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][650/1251] eta 0:22:05 lr 0.000838 time 1.8622 (2.2053) loss 3.5168 (3.8095) grad_norm 1.0469 (1.1853) [2022-01-20 02:13:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][660/1251] eta 0:21:41 lr 0.000838 time 2.2297 (2.2027) loss 3.2385 (3.8080) grad_norm 0.9954 (1.1849) [2022-01-20 02:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][670/1251] eta 0:21:18 lr 0.000838 time 2.1546 (2.2004) loss 4.2736 (3.8106) grad_norm 1.1548 (1.1846) [2022-01-20 02:14:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][680/1251] eta 0:20:59 lr 0.000838 time 3.8589 (2.2059) loss 3.9471 (3.8125) grad_norm 1.0136 (1.1851) [2022-01-20 02:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][690/1251] eta 0:20:38 lr 0.000838 time 2.3770 (2.2074) loss 4.1593 (3.8156) grad_norm 1.3388 (1.1859) [2022-01-20 02:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][700/1251] eta 0:20:16 lr 0.000838 time 2.6881 (2.2084) loss 4.3064 (3.8120) grad_norm 1.3139 (1.1863) [2022-01-20 02:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][710/1251] eta 0:19:54 lr 0.000838 time 1.9610 (2.2074) loss 3.2915 (3.8103) grad_norm 1.1297 (1.1863) [2022-01-20 02:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][720/1251] eta 0:19:32 lr 0.000838 time 2.0780 (2.2084) loss 2.8797 (3.8112) grad_norm 1.2547 (1.1868) [2022-01-20 02:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][730/1251] eta 0:19:10 lr 0.000838 time 1.6119 (2.2074) loss 3.8102 (3.8092) grad_norm 1.4971 (1.1874) [2022-01-20 02:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][740/1251] eta 0:18:48 lr 0.000838 time 2.8214 (2.2081) loss 4.2712 (3.8105) grad_norm 1.2731 (1.1867) [2022-01-20 02:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][750/1251] eta 0:18:25 lr 0.000838 time 1.9131 (2.2071) loss 3.4109 (3.8084) grad_norm 1.2085 (1.1866) [2022-01-20 02:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][760/1251] eta 0:18:03 lr 0.000838 time 2.1605 (2.2075) loss 3.8799 (3.8109) grad_norm 1.1847 (1.1867) [2022-01-20 02:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][770/1251] eta 0:17:42 lr 0.000838 time 2.7421 (2.2097) loss 4.2729 (3.8064) grad_norm 1.0997 (1.1863) [2022-01-20 02:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][780/1251] eta 0:17:20 lr 0.000838 time 
1.6535 (2.2097) loss 3.9536 (3.8041) grad_norm 1.2428 (1.1874) [2022-01-20 02:18:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][790/1251] eta 0:16:57 lr 0.000838 time 1.6576 (2.2063) loss 4.7239 (3.8053) grad_norm 1.0831 (1.1872) [2022-01-20 02:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][800/1251] eta 0:16:33 lr 0.000838 time 1.9478 (2.2032) loss 3.8779 (3.8021) grad_norm 1.2141 (1.1867) [2022-01-20 02:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][810/1251] eta 0:16:11 lr 0.000838 time 1.9115 (2.2026) loss 3.9705 (3.8022) grad_norm 0.9539 (1.1859) [2022-01-20 02:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][820/1251] eta 0:15:48 lr 0.000838 time 1.7157 (2.2016) loss 4.0207 (3.8041) grad_norm 1.1725 (1.1860) [2022-01-20 02:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][830/1251] eta 0:15:26 lr 0.000838 time 1.8912 (2.2013) loss 4.2292 (3.8035) grad_norm 1.2133 (1.1852) [2022-01-20 02:19:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][840/1251] eta 0:15:05 lr 0.000837 time 2.7045 (2.2024) loss 4.0344 (3.8050) grad_norm 0.9656 (1.1846) [2022-01-20 02:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][850/1251] eta 0:14:43 lr 0.000837 time 2.5100 (2.2035) loss 3.8182 (3.8036) grad_norm 1.1028 (1.1845) [2022-01-20 02:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][860/1251] eta 0:14:21 lr 0.000837 time 1.8608 (2.2040) loss 4.5025 (3.8040) grad_norm 1.0817 (1.1843) [2022-01-20 02:21:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][870/1251] eta 0:13:59 lr 0.000837 time 1.8961 (2.2033) loss 4.0629 (3.8027) grad_norm 1.0168 (1.1837) [2022-01-20 02:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][880/1251] eta 0:13:37 lr 0.000837 time 1.8417 (2.2031) loss 3.4951 (3.8002) grad_norm 1.1767 (1.1841) [2022-01-20 02:21:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][890/1251] eta 0:13:15 lr 0.000837 time 1.8667 (2.2033) loss 3.8244 (3.7987) grad_norm 0.9542 (1.1832) [2022-01-20 02:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][900/1251] eta 0:12:53 lr 0.000837 time 2.2149 (2.2030) loss 3.4587 (3.7979) grad_norm 1.0013 (1.1835) [2022-01-20 02:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][910/1251] eta 0:12:30 lr 0.000837 time 1.9037 (2.2015) loss 3.3750 (3.7986) grad_norm 1.2696 (1.1853) [2022-01-20 02:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][920/1251] eta 0:12:08 lr 0.000837 time 1.6513 (2.2005) loss 3.5189 (3.7984) grad_norm 1.2433 (1.1860) [2022-01-20 02:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][930/1251] eta 0:11:46 lr 0.000837 time 2.6559 (2.2003) loss 2.8804 (3.7965) grad_norm 1.0681 (1.1862) [2022-01-20 02:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][940/1251] eta 0:11:24 lr 0.000837 time 1.7541 (2.1996) loss 2.7598 (3.7939) grad_norm 1.3546 (1.1865) [2022-01-20 02:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][950/1251] eta 0:11:01 lr 0.000837 time 2.2681 (2.1990) loss 4.0613 (3.7941) grad_norm 1.2629 (1.1875) [2022-01-20 02:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][960/1251] eta 0:10:39 lr 0.000837 time 1.9032 (2.1993) loss 3.2532 (3.7940) grad_norm 1.1837 (1.1877) [2022-01-20 02:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][970/1251] eta 0:10:18 lr 0.000837 time 2.4498 (2.2014) loss 4.0893 (3.7938) grad_norm 1.0775 (1.1880) [2022-01-20 02:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][980/1251] eta 0:09:56 lr 0.000837 time 2.4945 (2.2028) loss 3.7932 (3.7945) grad_norm 1.0585 (1.1870) [2022-01-20 02:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][990/1251] eta 0:09:34 lr 0.000837 time 
2.2038 (2.2024) loss 4.4327 (3.7962) grad_norm 1.1597 (1.1862) [2022-01-20 02:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1000/1251] eta 0:09:12 lr 0.000837 time 1.8252 (2.2022) loss 3.3780 (3.7966) grad_norm 1.1121 (1.1854) [2022-01-20 02:26:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1010/1251] eta 0:08:50 lr 0.000837 time 1.9489 (2.2014) loss 3.8362 (3.7991) grad_norm 1.1519 (1.1852) [2022-01-20 02:26:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1020/1251] eta 0:08:28 lr 0.000837 time 2.4501 (2.2004) loss 3.8548 (3.7964) grad_norm 1.1215 (1.1851) [2022-01-20 02:26:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1030/1251] eta 0:08:05 lr 0.000837 time 2.2629 (2.1983) loss 3.8374 (3.7970) grad_norm 1.1685 (1.1849) [2022-01-20 02:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1040/1251] eta 0:07:43 lr 0.000837 time 2.3833 (2.1976) loss 2.9461 (3.7960) grad_norm 1.1283 (1.1841) [2022-01-20 02:27:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1050/1251] eta 0:07:21 lr 0.000837 time 2.2343 (2.1969) loss 4.3934 (3.7982) grad_norm 1.2084 (1.1838) [2022-01-20 02:27:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1060/1251] eta 0:06:59 lr 0.000837 time 2.2733 (2.1984) loss 4.2086 (3.8009) grad_norm 1.1634 (1.1834) [2022-01-20 02:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1070/1251] eta 0:06:38 lr 0.000837 time 2.1325 (2.2000) loss 3.4928 (3.7994) grad_norm 0.9957 (1.1827) [2022-01-20 02:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1080/1251] eta 0:06:16 lr 0.000837 time 1.9546 (2.1998) loss 4.2584 (3.8012) grad_norm 1.2855 (1.1836) [2022-01-20 02:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1090/1251] eta 0:05:54 lr 0.000837 time 1.8886 (2.1990) loss 4.2024 (3.8007) grad_norm 1.2191 (1.1833) [2022-01-20 02:29:24 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1100/1251] eta 0:05:32 lr 0.000837 time 1.9165 (2.1988) loss 4.2763 (3.7987) grad_norm 1.1080 (1.1836) [2022-01-20 02:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1110/1251] eta 0:05:10 lr 0.000837 time 1.8233 (2.1991) loss 4.3475 (3.7967) grad_norm 1.1135 (1.1833) [2022-01-20 02:30:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1120/1251] eta 0:04:48 lr 0.000837 time 2.2086 (2.1993) loss 4.4815 (3.7971) grad_norm 1.0375 (1.1827) [2022-01-20 02:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1130/1251] eta 0:04:26 lr 0.000837 time 2.1904 (2.1991) loss 4.2399 (3.7989) grad_norm 1.0785 (1.1821) [2022-01-20 02:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1140/1251] eta 0:04:04 lr 0.000837 time 2.1674 (2.1993) loss 4.1640 (3.7991) grad_norm 1.2173 (1.1823) [2022-01-20 02:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1150/1251] eta 0:03:42 lr 0.000837 time 2.1091 (2.1990) loss 4.1891 (3.8010) grad_norm 1.3018 (1.1821) [2022-01-20 02:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1160/1251] eta 0:03:20 lr 0.000836 time 2.0215 (2.1984) loss 4.1085 (3.8014) grad_norm 1.1449 (1.1826) [2022-01-20 02:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1170/1251] eta 0:02:57 lr 0.000836 time 1.9391 (2.1974) loss 3.0511 (3.8024) grad_norm 0.9502 (1.1824) [2022-01-20 02:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1180/1251] eta 0:02:35 lr 0.000836 time 2.6213 (2.1969) loss 2.8115 (3.8012) grad_norm 1.0439 (1.1822) [2022-01-20 02:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1190/1251] eta 0:02:14 lr 0.000836 time 2.2392 (2.1973) loss 3.7261 (3.8007) grad_norm 1.1040 (1.1819) [2022-01-20 02:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1200/1251] eta 0:01:51 lr 
0.000836 time 1.8914 (2.1960) loss 3.3188 (3.7994) grad_norm 1.0716 (1.1821) [2022-01-20 02:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1210/1251] eta 0:01:30 lr 0.000836 time 1.5786 (2.1960) loss 2.7025 (3.7971) grad_norm 1.1028 (1.1819) [2022-01-20 02:33:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1220/1251] eta 0:01:08 lr 0.000836 time 2.2841 (2.1949) loss 4.4770 (3.7970) grad_norm 1.1803 (1.1819) [2022-01-20 02:34:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1230/1251] eta 0:00:46 lr 0.000836 time 1.5954 (2.1937) loss 3.3050 (3.7949) grad_norm 1.2185 (1.1814) [2022-01-20 02:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1240/1251] eta 0:00:24 lr 0.000836 time 1.4030 (2.1934) loss 4.3660 (3.7944) grad_norm 1.0133 (1.1811) [2022-01-20 02:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [79/300][1250/1251] eta 0:00:02 lr 0.000836 time 1.1738 (2.1882) loss 4.5753 (3.7942) grad_norm 1.1140 (1.1805) [2022-01-20 02:34:41 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 79 training takes 0:45:37 [2022-01-20 02:35:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.287 (18.287) Loss 1.1729 (1.1729) Acc@1 73.730 (73.730) Acc@5 92.090 (92.090) [2022-01-20 02:35:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.382 (3.605) Loss 1.2164 (1.2235) Acc@1 71.582 (72.479) Acc@5 90.820 (91.211) [2022-01-20 02:35:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.244 (2.666) Loss 1.2223 (1.2084) Acc@1 72.168 (72.656) Acc@5 90.625 (91.504) [2022-01-20 02:35:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.890 (2.355) Loss 1.2437 (1.2116) Acc@1 72.852 (72.571) Acc@5 90.723 (91.469) [2022-01-20 02:36:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.955 (2.185) Loss 1.1774 (1.2167) Acc@1 73.438 (72.432) Acc@5 93.164 (91.380) [2022-01-20 02:36:18 
swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.516 Acc@5 91.458 [2022-01-20 02:36:18 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.5% [2022-01-20 02:36:18 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.59% [2022-01-20 02:36:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][0/1251] eta 7:33:07 lr 0.000836 time 21.7328 (21.7328) loss 3.1691 (3.1691) grad_norm 1.0197 (1.0197) [2022-01-20 02:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][10/1251] eta 1:22:37 lr 0.000836 time 2.9624 (3.9949) loss 4.4038 (3.6233) grad_norm 1.0780 (1.1164) [2022-01-20 02:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][20/1251] eta 1:02:59 lr 0.000836 time 1.7717 (3.0700) loss 3.3501 (3.5944) grad_norm 1.1690 (1.1275) [2022-01-20 02:37:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][30/1251] eta 0:55:41 lr 0.000836 time 1.5899 (2.7364) loss 3.2478 (3.6337) grad_norm 1.1533 (1.1482) [2022-01-20 02:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][40/1251] eta 0:53:52 lr 0.000836 time 3.8612 (2.6697) loss 4.2400 (3.6776) grad_norm 1.2462 (1.1979) [2022-01-20 02:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][50/1251] eta 0:52:29 lr 0.000836 time 3.9372 (2.6220) loss 3.9439 (3.6392) grad_norm 1.1822 (1.1856) [2022-01-20 02:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][60/1251] eta 0:51:06 lr 0.000836 time 2.6468 (2.5749) loss 2.6333 (3.6684) grad_norm 0.9607 (1.1896) [2022-01-20 02:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][70/1251] eta 0:49:34 lr 0.000836 time 1.6654 (2.5189) loss 4.0152 (3.6446) grad_norm 1.2474 (1.2027) [2022-01-20 02:39:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][80/1251] eta 0:48:40 lr 0.000836 time 3.7228 (2.4942) loss 3.9486 (3.6810) grad_norm 1.2547 (1.1989) 
[2022-01-20 02:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][90/1251] eta 0:48:07 lr 0.000836 time 3.2511 (2.4875) loss 4.5280 (3.7360) grad_norm 1.1238 (1.1969) [2022-01-20 02:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][100/1251] eta 0:47:05 lr 0.000836 time 1.9119 (2.4553) loss 4.0365 (3.7581) grad_norm 1.2637 (1.1997) [2022-01-20 02:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][110/1251] eta 0:45:54 lr 0.000836 time 1.6658 (2.4141) loss 4.0357 (3.7433) grad_norm 1.0999 (1.2086) [2022-01-20 02:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][120/1251] eta 0:45:11 lr 0.000836 time 3.1633 (2.3973) loss 3.8851 (3.7616) grad_norm 1.3167 (1.2117) [2022-01-20 02:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][130/1251] eta 0:44:33 lr 0.000836 time 3.1358 (2.3848) loss 4.1936 (3.7729) grad_norm 0.9859 (1.2061) [2022-01-20 02:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][140/1251] eta 0:43:54 lr 0.000836 time 2.2943 (2.3709) loss 4.0988 (3.7705) grad_norm 1.2939 (1.2023) [2022-01-20 02:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][150/1251] eta 0:43:18 lr 0.000836 time 1.9124 (2.3603) loss 3.7761 (3.7655) grad_norm 1.1505 (1.1994) [2022-01-20 02:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][160/1251] eta 0:42:48 lr 0.000836 time 3.2643 (2.3541) loss 3.6873 (3.7672) grad_norm 1.2901 (1.1977) [2022-01-20 02:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][170/1251] eta 0:42:14 lr 0.000836 time 1.8545 (2.3445) loss 4.3426 (3.7910) grad_norm 1.2132 (1.1990) [2022-01-20 02:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][180/1251] eta 0:41:33 lr 0.000836 time 1.9798 (2.3281) loss 4.4223 (3.7899) grad_norm 1.1482 (1.1954) [2022-01-20 02:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][190/1251] eta 0:40:54 
lr 0.000836 time 2.0574 (2.3130) loss 2.8512 (3.7805) grad_norm 1.4607 (1.1947) [2022-01-20 02:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][200/1251] eta 0:40:21 lr 0.000836 time 1.9094 (2.3043) loss 4.2985 (3.7962) grad_norm 1.1337 (1.1931) [2022-01-20 02:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][210/1251] eta 0:40:01 lr 0.000836 time 2.1829 (2.3066) loss 3.8268 (3.7964) grad_norm 1.3937 (1.1951) [2022-01-20 02:44:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][220/1251] eta 0:39:31 lr 0.000836 time 1.7959 (2.2998) loss 3.2466 (3.7814) grad_norm 1.2189 (1.1968) [2022-01-20 02:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][230/1251] eta 0:38:59 lr 0.000836 time 2.0763 (2.2910) loss 3.9802 (3.7864) grad_norm 1.1423 (1.1979) [2022-01-20 02:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][240/1251] eta 0:38:31 lr 0.000835 time 2.1983 (2.2867) loss 4.1092 (3.8079) grad_norm 1.1358 (1.1989) [2022-01-20 02:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][250/1251] eta 0:38:13 lr 0.000835 time 3.0127 (2.2911) loss 3.8687 (3.8083) grad_norm 1.2312 (1.1978) [2022-01-20 02:46:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][260/1251] eta 0:37:48 lr 0.000835 time 1.8028 (2.2887) loss 2.9784 (3.8010) grad_norm 1.2783 (1.1991) [2022-01-20 02:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][270/1251] eta 0:37:23 lr 0.000835 time 1.8808 (2.2872) loss 4.0651 (3.8022) grad_norm 1.1321 (1.1971) [2022-01-20 02:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][280/1251] eta 0:36:56 lr 0.000835 time 1.9282 (2.2829) loss 4.2696 (3.8096) grad_norm 1.0657 (1.1982) [2022-01-20 02:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][290/1251] eta 0:36:27 lr 0.000835 time 2.8170 (2.2767) loss 3.4969 (3.8089) grad_norm 1.4270 (1.1996) [2022-01-20 02:47:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][300/1251] eta 0:35:58 lr 0.000835 time 1.9478 (2.2692) loss 3.3428 (3.8105) grad_norm 1.1632 (1.1989) [2022-01-20 02:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][310/1251] eta 0:35:32 lr 0.000835 time 2.2897 (2.2659) loss 4.3167 (3.8033) grad_norm 1.1318 (1.1969) [2022-01-20 02:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][320/1251] eta 0:35:05 lr 0.000835 time 2.0067 (2.2616) loss 4.4833 (3.8016) grad_norm 1.2609 (1.2003) [2022-01-20 02:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][330/1251] eta 0:34:40 lr 0.000835 time 2.8975 (2.2589) loss 4.4043 (3.8092) grad_norm 1.2247 (1.2011) [2022-01-20 02:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][340/1251] eta 0:34:11 lr 0.000835 time 1.9928 (2.2515) loss 3.8303 (3.8022) grad_norm 1.0690 (1.1998) [2022-01-20 02:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][350/1251] eta 0:33:45 lr 0.000835 time 1.9209 (2.2479) loss 4.0365 (3.7971) grad_norm 1.0992 (1.1979) [2022-01-20 02:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][360/1251] eta 0:33:20 lr 0.000835 time 2.2721 (2.2458) loss 4.4689 (3.8011) grad_norm 1.2585 (1.1972) [2022-01-20 02:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][370/1251] eta 0:32:59 lr 0.000835 time 3.0070 (2.2469) loss 4.6645 (3.7927) grad_norm 1.0437 (1.1988) [2022-01-20 02:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][380/1251] eta 0:32:40 lr 0.000835 time 1.5065 (2.2506) loss 3.8620 (3.7930) grad_norm 1.1646 (1.1980) [2022-01-20 02:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][390/1251] eta 0:32:21 lr 0.000835 time 2.7345 (2.2546) loss 4.2113 (3.7880) grad_norm 1.4231 (1.1988) [2022-01-20 02:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][400/1251] eta 0:31:59 lr 0.000835 time 
2.4021 (2.2554) loss 3.8255 (3.7867) grad_norm 1.0655 (1.1980) [2022-01-20 02:51:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][410/1251] eta 0:31:37 lr 0.000835 time 3.4097 (2.2559) loss 4.0581 (3.7834) grad_norm 1.4502 (1.1982) [2022-01-20 02:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][420/1251] eta 0:31:11 lr 0.000835 time 1.7140 (2.2516) loss 4.6191 (3.7836) grad_norm 1.3391 (1.1991) [2022-01-20 02:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][430/1251] eta 0:30:45 lr 0.000835 time 1.9400 (2.2480) loss 3.3513 (3.7822) grad_norm 1.0805 (1.1983) [2022-01-20 02:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][440/1251] eta 0:30:20 lr 0.000835 time 1.8234 (2.2446) loss 3.9310 (3.7842) grad_norm 1.1915 (1.1985) [2022-01-20 02:53:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][450/1251] eta 0:30:01 lr 0.000835 time 3.5641 (2.2493) loss 3.6591 (3.7863) grad_norm 1.0929 (1.1981) [2022-01-20 02:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][460/1251] eta 0:29:36 lr 0.000835 time 1.9228 (2.2456) loss 3.5020 (3.7852) grad_norm 1.1715 (1.1964) [2022-01-20 02:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][470/1251] eta 0:29:11 lr 0.000835 time 1.9475 (2.2421) loss 4.2405 (3.7794) grad_norm 1.3088 (1.1966) [2022-01-20 02:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][480/1251] eta 0:28:50 lr 0.000835 time 2.2877 (2.2439) loss 4.4481 (3.7797) grad_norm 1.2464 (1.1981) [2022-01-20 02:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][490/1251] eta 0:28:27 lr 0.000835 time 3.4065 (2.2444) loss 4.3523 (3.7757) grad_norm 1.3130 (1.1983) [2022-01-20 02:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][500/1251] eta 0:28:04 lr 0.000835 time 1.8639 (2.2424) loss 4.1310 (3.7796) grad_norm 1.2101 (1.1991) [2022-01-20 02:55:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][510/1251] eta 0:27:39 lr 0.000835 time 1.6771 (2.2395) loss 3.6260 (3.7803) grad_norm 1.1611 (1.1984) [2022-01-20 02:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][520/1251] eta 0:27:19 lr 0.000835 time 2.2879 (2.2430) loss 2.8901 (3.7780) grad_norm 1.0727 (1.1981) [2022-01-20 02:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][530/1251] eta 0:26:56 lr 0.000835 time 3.0454 (2.2415) loss 4.0210 (3.7795) grad_norm 1.4168 (1.1986) [2022-01-20 02:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][540/1251] eta 0:26:30 lr 0.000835 time 1.8993 (2.2375) loss 4.2445 (3.7742) grad_norm 1.1364 (1.2010) [2022-01-20 02:56:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][550/1251] eta 0:26:05 lr 0.000835 time 1.9447 (2.2331) loss 3.9431 (3.7745) grad_norm 1.3002 (1.2017) [2022-01-20 02:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][560/1251] eta 0:25:40 lr 0.000834 time 2.0394 (2.2297) loss 4.1208 (3.7755) grad_norm 1.4467 (1.2029) [2022-01-20 02:57:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][570/1251] eta 0:25:17 lr 0.000834 time 2.1744 (2.2284) loss 3.7959 (3.7810) grad_norm 1.4400 (1.2039) [2022-01-20 02:57:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][580/1251] eta 0:24:55 lr 0.000834 time 2.6775 (2.2288) loss 2.7711 (3.7761) grad_norm 1.2833 (1.2049) [2022-01-20 02:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][590/1251] eta 0:24:32 lr 0.000834 time 1.4617 (2.2273) loss 4.0106 (3.7811) grad_norm 1.4563 (1.2052) [2022-01-20 02:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][600/1251] eta 0:24:10 lr 0.000834 time 2.2154 (2.2282) loss 3.1188 (3.7743) grad_norm 1.0890 (1.2054) [2022-01-20 02:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][610/1251] eta 0:23:49 lr 0.000834 time 
2.7302 (2.2294) loss 3.8755 (3.7742) grad_norm 1.2622 (1.2048) [2022-01-20 02:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][620/1251] eta 0:23:28 lr 0.000834 time 2.1310 (2.2325) loss 3.2813 (3.7732) grad_norm 1.2937 (1.2041) [2022-01-20 02:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][630/1251] eta 0:23:07 lr 0.000834 time 1.9735 (2.2344) loss 3.0454 (3.7717) grad_norm 1.4322 (1.2042) [2022-01-20 03:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][640/1251] eta 0:22:44 lr 0.000834 time 1.9944 (2.2329) loss 2.8886 (3.7729) grad_norm 1.2993 (1.2031) [2022-01-20 03:00:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][650/1251] eta 0:22:20 lr 0.000834 time 2.2408 (2.2299) loss 3.4895 (3.7761) grad_norm 1.1865 (1.2026) [2022-01-20 03:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][660/1251] eta 0:21:56 lr 0.000834 time 1.9475 (2.2274) loss 3.7542 (3.7777) grad_norm 1.0363 (1.2017) [2022-01-20 03:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][670/1251] eta 0:21:33 lr 0.000834 time 2.2919 (2.2262) loss 3.9323 (3.7768) grad_norm 1.1369 (1.2015) [2022-01-20 03:01:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][680/1251] eta 0:21:11 lr 0.000834 time 2.8035 (2.2268) loss 3.9481 (3.7753) grad_norm 1.0612 (1.2009) [2022-01-20 03:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][690/1251] eta 0:20:51 lr 0.000834 time 2.9219 (2.2313) loss 4.2433 (3.7753) grad_norm 1.1378 (1.1992) [2022-01-20 03:02:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][700/1251] eta 0:20:30 lr 0.000834 time 2.6410 (2.2331) loss 3.2866 (3.7765) grad_norm 1.0623 (1.1992) [2022-01-20 03:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][710/1251] eta 0:20:06 lr 0.000834 time 1.9196 (2.2301) loss 4.4205 (3.7804) grad_norm 1.1887 (1.1982) [2022-01-20 03:03:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][720/1251] eta 0:19:42 lr 0.000834 time 2.3291 (2.2267) loss 2.6344 (3.7766) grad_norm 1.1992 (1.1980) [2022-01-20 03:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][730/1251] eta 0:19:19 lr 0.000834 time 2.2540 (2.2259) loss 4.7078 (3.7777) grad_norm 1.2340 (1.1976) [2022-01-20 03:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][740/1251] eta 0:18:56 lr 0.000834 time 1.9966 (2.2236) loss 4.4234 (3.7782) grad_norm 1.0914 (1.1983) [2022-01-20 03:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][750/1251] eta 0:18:32 lr 0.000834 time 1.8145 (2.2215) loss 3.7265 (3.7784) grad_norm 1.2670 (1.2000) [2022-01-20 03:04:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][760/1251] eta 0:18:11 lr 0.000834 time 3.1675 (2.2231) loss 4.4324 (3.7793) grad_norm 1.2318 (1.2006) [2022-01-20 03:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][770/1251] eta 0:17:50 lr 0.000834 time 1.7822 (2.2250) loss 4.3298 (3.7803) grad_norm 1.2800 (1.2009) [2022-01-20 03:05:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][780/1251] eta 0:17:28 lr 0.000834 time 1.9051 (2.2254) loss 4.0543 (3.7824) grad_norm 1.3963 (1.2018) [2022-01-20 03:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][790/1251] eta 0:17:06 lr 0.000834 time 2.2261 (2.2256) loss 3.6474 (3.7807) grad_norm 1.3769 (1.2032) [2022-01-20 03:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][800/1251] eta 0:16:43 lr 0.000834 time 3.2170 (2.2251) loss 4.0600 (3.7814) grad_norm 1.1706 (1.2031) [2022-01-20 03:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][810/1251] eta 0:16:19 lr 0.000834 time 1.8324 (2.2218) loss 3.5552 (3.7796) grad_norm 1.2874 (1.2037) [2022-01-20 03:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][820/1251] eta 0:15:56 lr 0.000834 time 
1.9711 (2.2194) loss 3.1752 (3.7778) grad_norm 1.0961 (1.2028) [2022-01-20 03:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][830/1251] eta 0:15:34 lr 0.000834 time 2.2807 (2.2185) loss 2.7506 (3.7757) grad_norm 1.0777 (1.2020) [2022-01-20 03:07:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][840/1251] eta 0:15:11 lr 0.000834 time 1.5045 (2.2167) loss 2.6685 (3.7745) grad_norm 0.9553 (1.2007) [2022-01-20 03:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][850/1251] eta 0:14:47 lr 0.000834 time 1.5517 (2.2140) loss 4.2991 (3.7747) grad_norm 1.2184 (1.2005) [2022-01-20 03:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][860/1251] eta 0:14:26 lr 0.000834 time 2.7980 (2.2150) loss 3.8952 (3.7717) grad_norm 1.0472 (1.1995) [2022-01-20 03:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][870/1251] eta 0:14:03 lr 0.000834 time 2.1649 (2.2138) loss 4.1746 (3.7694) grad_norm 1.1547 (1.1986) [2022-01-20 03:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][880/1251] eta 0:13:41 lr 0.000834 time 1.5910 (2.2134) loss 3.1266 (3.7690) grad_norm 1.1671 (1.1985) [2022-01-20 03:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][890/1251] eta 0:13:19 lr 0.000833 time 1.6028 (2.2136) loss 4.2916 (3.7715) grad_norm 1.0693 (1.1973) [2022-01-20 03:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][900/1251] eta 0:12:57 lr 0.000833 time 2.9165 (2.2149) loss 3.9719 (3.7711) grad_norm 1.0871 (1.1970) [2022-01-20 03:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][910/1251] eta 0:12:34 lr 0.000833 time 1.8535 (2.2135) loss 4.3187 (3.7717) grad_norm 1.1397 (1.1964) [2022-01-20 03:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][920/1251] eta 0:12:12 lr 0.000833 time 2.2580 (2.2121) loss 3.3295 (3.7719) grad_norm 1.2809 (1.1959) [2022-01-20 03:10:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][930/1251] eta 0:11:49 lr 0.000833 time 1.8106 (2.2108) loss 4.3099 (3.7740) grad_norm 1.2371 (1.1960) [2022-01-20 03:11:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][940/1251] eta 0:11:28 lr 0.000833 time 5.0835 (2.2148) loss 3.5956 (3.7751) grad_norm 1.3977 (1.1962) [2022-01-20 03:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][950/1251] eta 0:11:06 lr 0.000833 time 1.8559 (2.2151) loss 4.0709 (3.7730) grad_norm 0.9286 (1.1958) [2022-01-20 03:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][960/1251] eta 0:10:45 lr 0.000833 time 3.0188 (2.2172) loss 4.4040 (3.7732) grad_norm 1.2110 (1.1951) [2022-01-20 03:12:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][970/1251] eta 0:10:22 lr 0.000833 time 2.0706 (2.2163) loss 4.0439 (3.7721) grad_norm 1.2168 (1.1946) [2022-01-20 03:12:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][980/1251] eta 0:10:00 lr 0.000833 time 3.2301 (2.2157) loss 3.9732 (3.7744) grad_norm 1.0675 (1.1943) [2022-01-20 03:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][990/1251] eta 0:09:37 lr 0.000833 time 1.9316 (2.2139) loss 4.2117 (3.7769) grad_norm 1.0593 (1.1938) [2022-01-20 03:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1000/1251] eta 0:09:15 lr 0.000833 time 1.7039 (2.2130) loss 2.9715 (3.7743) grad_norm 1.0358 (1.1933) [2022-01-20 03:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1010/1251] eta 0:08:53 lr 0.000833 time 2.3661 (2.2124) loss 3.7351 (3.7724) grad_norm 1.0479 (1.1932) [2022-01-20 03:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1020/1251] eta 0:08:30 lr 0.000833 time 2.1028 (2.2111) loss 3.6266 (3.7748) grad_norm 1.1428 (1.1932) [2022-01-20 03:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1030/1251] eta 0:08:08 lr 0.000833 time 
1.9252 (2.2110) loss 4.0947 (3.7752) grad_norm 1.0736 (1.1928) [2022-01-20 03:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1040/1251] eta 0:07:46 lr 0.000833 time 2.5144 (2.2110) loss 2.9032 (3.7754) grad_norm 2.2433 (1.1937) [2022-01-20 03:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1050/1251] eta 0:07:24 lr 0.000833 time 2.4592 (2.2122) loss 4.0344 (3.7720) grad_norm 1.2025 (1.1940) [2022-01-20 03:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1060/1251] eta 0:07:02 lr 0.000833 time 2.1612 (2.2122) loss 4.0908 (3.7730) grad_norm 1.2999 (1.1937) [2022-01-20 03:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1070/1251] eta 0:06:40 lr 0.000833 time 1.6672 (2.2123) loss 4.5186 (3.7739) grad_norm 1.2100 (1.1939) [2022-01-20 03:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1080/1251] eta 0:06:18 lr 0.000833 time 1.6357 (2.2110) loss 3.9447 (3.7761) grad_norm 1.1135 (1.1934) [2022-01-20 03:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1090/1251] eta 0:05:55 lr 0.000833 time 2.5394 (2.2110) loss 3.3638 (3.7778) grad_norm 1.1380 (1.1936) [2022-01-20 03:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1100/1251] eta 0:05:33 lr 0.000833 time 2.0339 (2.2103) loss 2.8430 (3.7775) grad_norm 1.2213 (1.1941) [2022-01-20 03:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1110/1251] eta 0:05:11 lr 0.000833 time 1.9131 (2.2099) loss 2.9711 (3.7736) grad_norm 1.1975 (1.1957) [2022-01-20 03:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1120/1251] eta 0:04:49 lr 0.000833 time 1.9320 (2.2084) loss 3.7399 (3.7717) grad_norm 1.1001 (1.1953) [2022-01-20 03:17:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1130/1251] eta 0:04:27 lr 0.000833 time 2.5672 (2.2085) loss 4.0772 (3.7714) grad_norm 1.0581 (1.1945) [2022-01-20 03:18:20 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1140/1251] eta 0:04:05 lr 0.000833 time 2.4078 (2.2095) loss 4.1401 (3.7724) grad_norm 1.3148 (1.1942) [2022-01-20 03:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1150/1251] eta 0:03:43 lr 0.000833 time 2.7436 (2.2117) loss 2.8035 (3.7703) grad_norm 1.0887 (1.1937) [2022-01-20 03:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1160/1251] eta 0:03:21 lr 0.000833 time 1.8899 (2.2121) loss 3.7934 (3.7705) grad_norm 1.5729 (1.1940) [2022-01-20 03:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1170/1251] eta 0:02:59 lr 0.000833 time 2.2482 (2.2106) loss 4.5089 (3.7731) grad_norm 1.4775 (1.1944) [2022-01-20 03:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1180/1251] eta 0:02:36 lr 0.000833 time 2.2558 (2.2086) loss 4.3395 (3.7746) grad_norm 1.5341 (1.1947) [2022-01-20 03:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1190/1251] eta 0:02:14 lr 0.000833 time 1.9405 (2.2069) loss 4.7273 (3.7757) grad_norm 1.0458 (1.1950) [2022-01-20 03:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1200/1251] eta 0:01:52 lr 0.000833 time 1.7895 (2.2047) loss 4.4180 (3.7766) grad_norm 1.1012 (1.1945) [2022-01-20 03:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1210/1251] eta 0:01:30 lr 0.000832 time 1.8844 (2.2036) loss 4.3101 (3.7774) grad_norm 1.1209 (1.1940) [2022-01-20 03:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1220/1251] eta 0:01:08 lr 0.000832 time 2.5161 (2.2027) loss 4.5123 (3.7797) grad_norm 1.1182 (1.1939) [2022-01-20 03:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1230/1251] eta 0:00:46 lr 0.000832 time 1.9163 (2.2018) loss 4.2242 (3.7814) grad_norm 1.3379 (1.1943) [2022-01-20 03:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1240/1251] eta 0:00:24 lr 
0.000832 time 1.4101 (2.2029) loss 4.2441 (3.7834) grad_norm 1.2157 (1.1944) [2022-01-20 03:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [80/300][1250/1251] eta 0:00:02 lr 0.000832 time 1.1519 (2.1983) loss 3.6630 (3.7853) grad_norm 1.3822 (1.1939) [2022-01-20 03:22:09 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 80 training takes 0:45:50 [2022-01-20 03:22:09 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_80 saving...... [2022-01-20 03:22:20 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_80 saved !!! [2022-01-20 03:22:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.845 (16.845) Loss 1.2917 (1.2917) Acc@1 69.727 (69.727) Acc@5 89.648 (89.648) [2022-01-20 03:22:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.916 (2.879) Loss 1.1384 (1.2377) Acc@1 74.121 (71.493) Acc@5 91.602 (90.669) [2022-01-20 03:23:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.343 (2.149) Loss 1.1860 (1.2112) Acc@1 71.875 (72.028) Acc@5 92.285 (91.090) [2022-01-20 03:23:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.613 (2.086) Loss 1.1967 (1.2014) Acc@1 72.266 (72.051) Acc@5 90.820 (91.249) [2022-01-20 03:23:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.497 (2.024) Loss 1.1917 (1.1991) Acc@1 71.875 (72.180) Acc@5 91.406 (91.332) [2022-01-20 03:23:51 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.358 Acc@5 91.332 [2022-01-20 03:23:51 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.4% [2022-01-20 03:23:51 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.59% [2022-01-20 03:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][0/1251] eta 7:26:25 lr 0.000832 time 21.4112 (21.4112) loss 3.1441 (3.1441) grad_norm 1.1664 (1.1664) 
[2022-01-20 03:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][10/1251] eta 1:24:39 lr 0.000832 time 2.4587 (4.0932) loss 4.1283 (3.5694) grad_norm 1.2875 (1.1487) [2022-01-20 03:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][20/1251] eta 1:03:32 lr 0.000832 time 1.3887 (3.0969) loss 3.9824 (3.6982) grad_norm 1.1201 (1.1447) [2022-01-20 03:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][30/1251] eta 0:57:52 lr 0.000832 time 1.9466 (2.8438) loss 4.2747 (3.6749) grad_norm 1.1500 (1.1479) [2022-01-20 03:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][40/1251] eta 0:54:50 lr 0.000832 time 4.6944 (2.7171) loss 3.6913 (3.6524) grad_norm 0.9924 (1.1413) [2022-01-20 03:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][50/1251] eta 0:52:06 lr 0.000832 time 2.0890 (2.6032) loss 4.5905 (3.7009) grad_norm 1.2556 (1.1428) [2022-01-20 03:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][60/1251] eta 0:50:36 lr 0.000832 time 2.5247 (2.5494) loss 3.0875 (3.7068) grad_norm 1.1088 (1.1452) [2022-01-20 03:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][70/1251] eta 0:49:34 lr 0.000832 time 1.5883 (2.5182) loss 3.7688 (3.7304) grad_norm 1.1882 (1.1592) [2022-01-20 03:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][80/1251] eta 0:48:28 lr 0.000832 time 3.8151 (2.4834) loss 3.8910 (3.7608) grad_norm 1.5387 (1.1649) [2022-01-20 03:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][90/1251] eta 0:47:43 lr 0.000832 time 2.3576 (2.4662) loss 4.3040 (3.7712) grad_norm 1.2291 (1.1758) [2022-01-20 03:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][100/1251] eta 0:46:58 lr 0.000832 time 1.5918 (2.4485) loss 3.9095 (3.7712) grad_norm 1.1580 (1.1742) [2022-01-20 03:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][110/1251] eta 0:46:00 lr 
0.000832 time 1.7235 (2.4191) loss 2.6875 (3.7272) grad_norm 1.0075 (1.1704) [2022-01-20 03:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][120/1251] eta 0:44:55 lr 0.000832 time 2.0607 (2.3830) loss 3.1314 (3.7331) grad_norm 1.9582 (1.1781) [2022-01-20 03:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][130/1251] eta 0:43:53 lr 0.000832 time 1.8710 (2.3490) loss 4.2023 (3.7439) grad_norm 1.1862 (1.1780) [2022-01-20 03:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][140/1251] eta 0:43:11 lr 0.000832 time 2.3194 (2.3324) loss 4.9260 (3.7376) grad_norm 1.5179 (1.1830) [2022-01-20 03:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][150/1251] eta 0:42:34 lr 0.000832 time 2.1182 (2.3205) loss 3.7941 (3.7561) grad_norm 1.3130 (1.1850) [2022-01-20 03:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][160/1251] eta 0:42:00 lr 0.000832 time 1.7075 (2.3105) loss 4.1866 (3.7635) grad_norm 1.3105 (1.1905) [2022-01-20 03:30:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][170/1251] eta 0:41:41 lr 0.000832 time 2.3537 (2.3140) loss 4.5866 (3.7828) grad_norm 1.1194 (1.1936) [2022-01-20 03:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][180/1251] eta 0:41:17 lr 0.000832 time 2.8266 (2.3128) loss 4.1429 (3.7800) grad_norm 1.1388 (1.1884) [2022-01-20 03:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][190/1251] eta 0:40:51 lr 0.000832 time 1.6742 (2.3105) loss 4.2392 (3.7773) grad_norm 1.0907 (1.1847) [2022-01-20 03:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][200/1251] eta 0:40:26 lr 0.000832 time 1.6781 (2.3087) loss 3.4052 (3.7744) grad_norm 1.1914 (1.1810) [2022-01-20 03:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][210/1251] eta 0:40:02 lr 0.000832 time 1.7076 (2.3079) loss 3.9268 (3.7781) grad_norm 1.4162 (1.1805) [2022-01-20 03:32:20 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][220/1251] eta 0:39:35 lr 0.000832 time 2.7043 (2.3038) loss 4.4964 (3.7784) grad_norm 1.2475 (1.1820) [2022-01-20 03:32:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][230/1251] eta 0:39:00 lr 0.000832 time 1.6180 (2.2928) loss 3.8583 (3.7684) grad_norm 1.1501 (1.1811) [2022-01-20 03:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][240/1251] eta 0:38:29 lr 0.000832 time 2.0205 (2.2845) loss 3.3452 (3.7758) grad_norm 1.1954 (1.1811) [2022-01-20 03:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][250/1251] eta 0:38:01 lr 0.000832 time 1.8234 (2.2797) loss 3.6703 (3.7697) grad_norm 1.0296 (1.1777) [2022-01-20 03:33:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][260/1251] eta 0:37:35 lr 0.000832 time 2.1701 (2.2763) loss 3.6977 (3.7685) grad_norm 1.0758 (1.1778) [2022-01-20 03:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][270/1251] eta 0:37:07 lr 0.000832 time 1.9265 (2.2706) loss 3.6999 (3.7722) grad_norm 1.1697 (1.1778) [2022-01-20 03:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][280/1251] eta 0:36:42 lr 0.000831 time 2.2171 (2.2686) loss 4.5451 (3.7689) grad_norm 1.1206 (1.1774) [2022-01-20 03:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][290/1251] eta 0:36:16 lr 0.000831 time 1.8540 (2.2645) loss 3.8298 (3.7681) grad_norm 1.2092 (1.1782) [2022-01-20 03:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][300/1251] eta 0:35:50 lr 0.000831 time 1.8176 (2.2615) loss 4.0249 (3.7691) grad_norm 1.0967 (1.1769) [2022-01-20 03:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][310/1251] eta 0:35:27 lr 0.000831 time 2.2532 (2.2607) loss 3.6864 (3.7703) grad_norm 1.1712 (1.1748) [2022-01-20 03:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][320/1251] eta 0:35:08 lr 0.000831 time 
3.2969 (2.2646) loss 4.3124 (3.7678) grad_norm 1.1583 (1.1757) [2022-01-20 03:36:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][330/1251] eta 0:34:48 lr 0.000831 time 1.8936 (2.2673) loss 3.6703 (3.7652) grad_norm 1.0408 (1.1746) [2022-01-20 03:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][340/1251] eta 0:34:20 lr 0.000831 time 1.5508 (2.2622) loss 3.4369 (3.7579) grad_norm 1.3587 (1.1755) [2022-01-20 03:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][350/1251] eta 0:33:51 lr 0.000831 time 2.2328 (2.2552) loss 3.3597 (3.7605) grad_norm 1.0527 (1.1770) [2022-01-20 03:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][360/1251] eta 0:33:24 lr 0.000831 time 1.9241 (2.2498) loss 4.1711 (3.7673) grad_norm 1.1040 (1.1774) [2022-01-20 03:37:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][370/1251] eta 0:33:02 lr 0.000831 time 1.9401 (2.2504) loss 4.1600 (3.7691) grad_norm 1.4146 (1.1777) [2022-01-20 03:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][380/1251] eta 0:32:41 lr 0.000831 time 2.2034 (2.2518) loss 2.4818 (3.7638) grad_norm 0.9781 (1.1784) [2022-01-20 03:38:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][390/1251] eta 0:32:15 lr 0.000831 time 2.8336 (2.2481) loss 3.0034 (3.7670) grad_norm 1.2602 (1.1781) [2022-01-20 03:38:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][400/1251] eta 0:31:48 lr 0.000831 time 1.9811 (2.2426) loss 4.5107 (3.7783) grad_norm 1.3958 (1.1789) [2022-01-20 03:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][410/1251] eta 0:31:22 lr 0.000831 time 1.7225 (2.2390) loss 4.4941 (3.7752) grad_norm 1.2622 (1.1789) [2022-01-20 03:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][420/1251] eta 0:30:57 lr 0.000831 time 1.9007 (2.2351) loss 4.5155 (3.7835) grad_norm 1.1319 (1.1794) [2022-01-20 03:39:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][430/1251] eta 0:30:34 lr 0.000831 time 2.4974 (2.2345) loss 4.0138 (3.7824) grad_norm 1.1860 (1.1805) [2022-01-20 03:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][440/1251] eta 0:30:12 lr 0.000831 time 1.8202 (2.2344) loss 3.8977 (3.7827) grad_norm 1.3318 (1.1810) [2022-01-20 03:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][450/1251] eta 0:29:51 lr 0.000831 time 2.6117 (2.2360) loss 3.4098 (3.7837) grad_norm 1.1539 (1.1817) [2022-01-20 03:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][460/1251] eta 0:29:27 lr 0.000831 time 1.8152 (2.2342) loss 4.0825 (3.7879) grad_norm 1.2591 (1.1848) [2022-01-20 03:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][470/1251] eta 0:29:05 lr 0.000831 time 3.3430 (2.2356) loss 3.4049 (3.7895) grad_norm 1.0949 (1.1864) [2022-01-20 03:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][480/1251] eta 0:28:44 lr 0.000831 time 3.0595 (2.2368) loss 4.2262 (3.7860) grad_norm 1.2417 (1.1864) [2022-01-20 03:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][490/1251] eta 0:28:20 lr 0.000831 time 2.2800 (2.2352) loss 4.1474 (3.7875) grad_norm 1.3718 (1.1866) [2022-01-20 03:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][500/1251] eta 0:27:55 lr 0.000831 time 1.7103 (2.2313) loss 4.0561 (3.7928) grad_norm 0.9165 (1.1866) [2022-01-20 03:42:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][510/1251] eta 0:27:30 lr 0.000831 time 1.8108 (2.2269) loss 4.1349 (3.7875) grad_norm 1.3926 (1.1873) [2022-01-20 03:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][520/1251] eta 0:27:06 lr 0.000831 time 2.7622 (2.2256) loss 3.1155 (3.7842) grad_norm 1.3433 (1.1882) [2022-01-20 03:43:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][530/1251] eta 0:26:43 lr 0.000831 time 
3.1069 (2.2242) loss 4.3230 (3.7830) grad_norm 1.2518 (1.1881) [2022-01-20 03:43:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][540/1251] eta 0:26:21 lr 0.000831 time 2.1551 (2.2240) loss 4.6661 (3.7838) grad_norm 1.0470 (1.1883) [2022-01-20 03:44:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][550/1251] eta 0:25:59 lr 0.000831 time 2.5082 (2.2247) loss 4.2273 (3.7862) grad_norm 1.2512 (1.1885) [2022-01-20 03:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][560/1251] eta 0:25:37 lr 0.000831 time 2.1290 (2.2245) loss 3.1611 (3.7862) grad_norm 1.1697 (1.1885) [2022-01-20 03:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][570/1251] eta 0:25:16 lr 0.000831 time 2.9630 (2.2262) loss 3.9105 (3.7888) grad_norm 1.5176 (1.1885) [2022-01-20 03:45:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][580/1251] eta 0:24:52 lr 0.000831 time 1.9455 (2.2250) loss 3.8761 (3.7923) grad_norm 1.0859 (1.1885) [2022-01-20 03:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][590/1251] eta 0:24:31 lr 0.000831 time 2.8813 (2.2257) loss 4.2340 (3.7900) grad_norm 1.1858 (1.1884) [2022-01-20 03:46:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][600/1251] eta 0:24:08 lr 0.000830 time 1.9007 (2.2249) loss 4.0713 (3.7923) grad_norm 1.0234 (1.1891) [2022-01-20 03:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][610/1251] eta 0:23:45 lr 0.000830 time 2.7874 (2.2242) loss 3.3162 (3.7935) grad_norm 1.4137 (1.1897) [2022-01-20 03:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][620/1251] eta 0:23:21 lr 0.000830 time 1.9019 (2.2207) loss 2.3834 (3.7929) grad_norm 1.1350 (1.1894) [2022-01-20 03:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][630/1251] eta 0:22:56 lr 0.000830 time 2.1135 (2.2164) loss 3.9853 (3.7955) grad_norm 1.1066 (1.1907) [2022-01-20 03:47:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][640/1251] eta 0:22:33 lr 0.000830 time 2.0222 (2.2148) loss 3.9090 (3.7990) grad_norm 1.2132 (1.1920) [2022-01-20 03:47:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][650/1251] eta 0:22:11 lr 0.000830 time 3.0131 (2.2152) loss 3.7119 (3.7993) grad_norm 1.0690 (1.1914) [2022-01-20 03:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][660/1251] eta 0:21:50 lr 0.000830 time 2.0946 (2.2174) loss 4.3497 (3.8015) grad_norm 1.2603 (1.1926) [2022-01-20 03:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][670/1251] eta 0:21:29 lr 0.000830 time 2.6772 (2.2198) loss 4.1386 (3.8002) grad_norm 1.0738 (1.1918) [2022-01-20 03:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][680/1251] eta 0:21:07 lr 0.000830 time 1.8376 (2.2198) loss 3.8642 (3.8037) grad_norm 1.0837 (1.1911) [2022-01-20 03:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][690/1251] eta 0:20:45 lr 0.000830 time 2.5229 (2.2209) loss 4.1922 (3.8000) grad_norm 1.1015 (1.1912) [2022-01-20 03:49:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][700/1251] eta 0:20:24 lr 0.000830 time 1.9145 (2.2222) loss 4.1306 (3.7987) grad_norm 1.1711 (1.1919) [2022-01-20 03:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][710/1251] eta 0:20:00 lr 0.000830 time 2.1670 (2.2197) loss 4.2427 (3.7968) grad_norm 1.1651 (1.1931) [2022-01-20 03:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][720/1251] eta 0:19:37 lr 0.000830 time 1.9116 (2.2167) loss 3.8111 (3.7959) grad_norm 1.3140 (1.1933) [2022-01-20 03:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][730/1251] eta 0:19:13 lr 0.000830 time 2.2819 (2.2149) loss 4.1352 (3.7976) grad_norm 1.1091 (1.1929) [2022-01-20 03:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][740/1251] eta 0:18:50 lr 0.000830 time 
1.5886 (2.2131) loss 3.8117 (3.7958) grad_norm 1.3153 (1.1929) [2022-01-20 03:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][750/1251] eta 0:18:28 lr 0.000830 time 3.0373 (2.2130) loss 3.5108 (3.7951) grad_norm 1.1204 (1.1931) [2022-01-20 03:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][760/1251] eta 0:18:06 lr 0.000830 time 1.6635 (2.2128) loss 4.3479 (3.7969) grad_norm 1.3421 (1.1945) [2022-01-20 03:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][770/1251] eta 0:17:44 lr 0.000830 time 1.9225 (2.2128) loss 4.2876 (3.7939) grad_norm 1.1261 (1.1952) [2022-01-20 03:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][780/1251] eta 0:17:22 lr 0.000830 time 2.3688 (2.2137) loss 3.3569 (3.7887) grad_norm 1.4093 (1.1963) [2022-01-20 03:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][790/1251] eta 0:17:00 lr 0.000830 time 3.0204 (2.2146) loss 3.6999 (3.7895) grad_norm 1.2856 (1.1957) [2022-01-20 03:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][800/1251] eta 0:16:39 lr 0.000830 time 2.2304 (2.2152) loss 4.5191 (3.7908) grad_norm 1.0047 (1.1956) [2022-01-20 03:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][810/1251] eta 0:16:16 lr 0.000830 time 2.2632 (2.2150) loss 4.3193 (3.7951) grad_norm 1.2097 (1.1963) [2022-01-20 03:54:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][820/1251] eta 0:15:54 lr 0.000830 time 2.1281 (2.2135) loss 3.7135 (3.7983) grad_norm 1.0263 (1.1964) [2022-01-20 03:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][830/1251] eta 0:15:31 lr 0.000830 time 1.8817 (2.2121) loss 4.0096 (3.7975) grad_norm 1.0948 (1.1960) [2022-01-20 03:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][840/1251] eta 0:15:08 lr 0.000830 time 2.2284 (2.2108) loss 4.0992 (3.7989) grad_norm 1.2129 (1.1960) [2022-01-20 03:55:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][850/1251] eta 0:14:46 lr 0.000830 time 2.5714 (2.2110) loss 3.9353 (3.8006) grad_norm 1.1820 (1.1957) [2022-01-20 03:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][860/1251] eta 0:14:24 lr 0.000830 time 2.5962 (2.2107) loss 4.5820 (3.7991) grad_norm 1.2105 (1.1958) [2022-01-20 03:55:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][870/1251] eta 0:14:01 lr 0.000830 time 1.4880 (2.2097) loss 3.0497 (3.7987) grad_norm 1.3112 (1.1959) [2022-01-20 03:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][880/1251] eta 0:13:39 lr 0.000830 time 1.8447 (2.2089) loss 4.0374 (3.7957) grad_norm 1.1000 (1.1964) [2022-01-20 03:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][890/1251] eta 0:13:16 lr 0.000830 time 1.8035 (2.2076) loss 3.7569 (3.7961) grad_norm 1.3892 (1.1968) [2022-01-20 03:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][900/1251] eta 0:12:55 lr 0.000830 time 3.2865 (2.2086) loss 4.2629 (3.7972) grad_norm 0.9887 (1.1981) [2022-01-20 03:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][910/1251] eta 0:12:33 lr 0.000830 time 1.7254 (2.2102) loss 3.7903 (3.7946) grad_norm 1.2405 (1.1975) [2022-01-20 03:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][920/1251] eta 0:12:12 lr 0.000829 time 2.4935 (2.2141) loss 3.3164 (3.7925) grad_norm 1.3455 (1.1991) [2022-01-20 03:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][930/1251] eta 0:11:50 lr 0.000829 time 2.2004 (2.2140) loss 3.6858 (3.7940) grad_norm 1.2797 (1.2002) [2022-01-20 03:58:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][940/1251] eta 0:11:28 lr 0.000829 time 2.3946 (2.2137) loss 3.9847 (3.7932) grad_norm 1.3551 (1.2004) [2022-01-20 03:58:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][950/1251] eta 0:11:05 lr 0.000829 time 
1.8310 (2.2120) loss 4.1036 (3.7937) grad_norm 1.0311 (1.2006) [2022-01-20 03:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][960/1251] eta 0:10:43 lr 0.000829 time 1.7218 (2.2099) loss 3.8264 (3.7926) grad_norm 1.1313 (1.2003) [2022-01-20 03:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][970/1251] eta 0:10:20 lr 0.000829 time 1.9686 (2.2080) loss 2.7902 (3.7941) grad_norm 1.6787 (1.2002) [2022-01-20 03:59:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][980/1251] eta 0:09:58 lr 0.000829 time 2.0692 (2.2069) loss 3.5485 (3.7934) grad_norm 1.0038 (1.1994) [2022-01-20 04:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][990/1251] eta 0:09:35 lr 0.000829 time 2.5464 (2.2067) loss 4.0497 (3.7937) grad_norm 1.4506 (1.1990) [2022-01-20 04:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1000/1251] eta 0:09:14 lr 0.000829 time 1.9619 (2.2095) loss 3.9697 (3.7932) grad_norm 1.1997 (1.1995) [2022-01-20 04:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1010/1251] eta 0:08:53 lr 0.000829 time 2.4940 (2.2133) loss 3.4390 (3.7941) grad_norm 1.2016 (1.1994) [2022-01-20 04:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1020/1251] eta 0:08:31 lr 0.000829 time 2.4496 (2.2155) loss 2.7873 (3.7942) grad_norm 1.4547 (1.2005) [2022-01-20 04:01:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1030/1251] eta 0:08:09 lr 0.000829 time 2.2226 (2.2156) loss 4.1374 (3.7944) grad_norm 1.4247 (1.2003) [2022-01-20 04:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1040/1251] eta 0:07:47 lr 0.000829 time 1.9199 (2.2154) loss 2.9107 (3.7929) grad_norm 1.2024 (1.2006) [2022-01-20 04:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1050/1251] eta 0:07:24 lr 0.000829 time 1.9949 (2.2127) loss 4.2959 (3.7959) grad_norm 1.3688 (1.2000) [2022-01-20 04:02:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1060/1251] eta 0:07:02 lr 0.000829 time 2.2195 (2.2106) loss 3.9472 (3.7945) grad_norm 1.1801 (1.1994) [2022-01-20 04:03:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1070/1251] eta 0:06:39 lr 0.000829 time 1.8973 (2.2080) loss 2.7539 (3.7893) grad_norm 1.1138 (1.1983) [2022-01-20 04:03:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1080/1251] eta 0:06:17 lr 0.000829 time 2.5704 (2.2086) loss 3.9522 (3.7888) grad_norm 1.3277 (1.1981) [2022-01-20 04:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1090/1251] eta 0:05:55 lr 0.000829 time 1.8505 (2.2075) loss 4.4610 (3.7885) grad_norm 1.3859 (1.1982) [2022-01-20 04:04:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1100/1251] eta 0:05:33 lr 0.000829 time 1.8038 (2.2066) loss 3.2526 (3.7874) grad_norm 1.1314 (1.1983) [2022-01-20 04:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1110/1251] eta 0:05:11 lr 0.000829 time 2.5379 (2.2078) loss 4.2762 (3.7857) grad_norm 1.0675 (1.1976) [2022-01-20 04:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1120/1251] eta 0:04:49 lr 0.000829 time 2.0930 (2.2092) loss 3.0423 (3.7841) grad_norm 1.1897 (1.1969) [2022-01-20 04:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1130/1251] eta 0:04:27 lr 0.000829 time 1.8993 (2.2105) loss 3.6438 (3.7840) grad_norm 1.1782 (1.1970) [2022-01-20 04:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1140/1251] eta 0:04:05 lr 0.000829 time 2.1860 (2.2111) loss 4.0722 (3.7825) grad_norm 1.1347 (1.1969) [2022-01-20 04:06:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1150/1251] eta 0:03:43 lr 0.000829 time 2.1256 (2.2112) loss 4.3448 (3.7829) grad_norm 1.1470 (1.1970) [2022-01-20 04:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1160/1251] eta 0:03:21 lr 
0.000829 time 1.6663 (2.2109) loss 3.0924 (3.7830) grad_norm 1.0126 (1.1962) [2022-01-20 04:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1170/1251] eta 0:02:59 lr 0.000829 time 1.9339 (2.2101) loss 4.2219 (3.7832) grad_norm 1.0696 (1.1956) [2022-01-20 04:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1180/1251] eta 0:02:36 lr 0.000829 time 2.2614 (2.2099) loss 3.8028 (3.7843) grad_norm 1.2734 (1.1958) [2022-01-20 04:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1190/1251] eta 0:02:14 lr 0.000829 time 2.2813 (2.2092) loss 4.2572 (3.7843) grad_norm 1.1274 (1.1963) [2022-01-20 04:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1200/1251] eta 0:01:52 lr 0.000829 time 1.6722 (2.2089) loss 3.3758 (3.7848) grad_norm 1.1816 (1.1959) [2022-01-20 04:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1210/1251] eta 0:01:30 lr 0.000829 time 1.8946 (2.2083) loss 3.1100 (3.7836) grad_norm 1.1013 (1.1960) [2022-01-20 04:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1220/1251] eta 0:01:08 lr 0.000829 time 1.9458 (2.2080) loss 3.5500 (3.7842) grad_norm 1.0648 (1.1956) [2022-01-20 04:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1230/1251] eta 0:00:46 lr 0.000829 time 2.4446 (2.2098) loss 4.3935 (3.7852) grad_norm 1.1747 (1.1957) [2022-01-20 04:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1240/1251] eta 0:00:24 lr 0.000828 time 1.1855 (2.2094) loss 3.4348 (3.7846) grad_norm 1.2608 (1.1958) [2022-01-20 04:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [81/300][1250/1251] eta 0:00:02 lr 0.000828 time 1.1709 (2.2034) loss 3.9642 (3.7875) grad_norm 1.2563 (1.1965) [2022-01-20 04:09:47 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 81 training takes 0:45:56 [2022-01-20 04:10:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.455 (18.455) Loss 
1.2080 (1.2080) Acc@1 71.289 (71.289) Acc@5 91.016 (91.016) [2022-01-20 04:10:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.988 (3.385) Loss 1.1678 (1.1768) Acc@1 71.973 (71.964) Acc@5 91.602 (91.468) [2022-01-20 04:10:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.303 (2.483) Loss 1.1568 (1.1770) Acc@1 72.754 (72.270) Acc@5 91.602 (91.448) [2022-01-20 04:10:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.958 (2.301) Loss 1.2065 (1.1708) Acc@1 71.582 (72.499) Acc@5 91.113 (91.554) [2022-01-20 04:11:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.856 (2.145) Loss 1.1031 (1.1652) Acc@1 73.340 (72.649) Acc@5 93.262 (91.592) [2022-01-20 04:11:24 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.686 Acc@5 91.536 [2022-01-20 04:11:24 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.7% [2022-01-20 04:11:24 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.69% [2022-01-20 04:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][0/1251] eta 7:26:03 lr 0.000828 time 21.3936 (21.3936) loss 2.8072 (2.8072) grad_norm 1.3173 (1.3173) [2022-01-20 04:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][10/1251] eta 1:24:02 lr 0.000828 time 2.5875 (4.0636) loss 3.9719 (3.8596) grad_norm 1.1013 (1.1682) [2022-01-20 04:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][20/1251] eta 1:05:14 lr 0.000828 time 1.8873 (3.1796) loss 3.9473 (3.7703) grad_norm 1.0068 (1.1778) [2022-01-20 04:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][30/1251] eta 0:58:09 lr 0.000828 time 1.8525 (2.8580) loss 3.8479 (3.7242) grad_norm 1.1911 (1.1598) [2022-01-20 04:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][40/1251] eta 0:55:23 lr 0.000828 time 4.7228 (2.7446) loss 4.0142 (3.8068) grad_norm 1.1378 (1.1796) 
[2022-01-20 04:13:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][50/1251] eta 0:52:35 lr 0.000828 time 2.1723 (2.6278) loss 4.3508 (3.8156) grad_norm 1.3873 (1.1964) [2022-01-20 04:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][60/1251] eta 0:50:12 lr 0.000828 time 1.2227 (2.5295) loss 4.2342 (3.7944) grad_norm 1.1027 (1.2018) [2022-01-20 04:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][70/1251] eta 0:48:33 lr 0.000828 time 1.8692 (2.4667) loss 3.7488 (3.7913) grad_norm 1.2437 (1.2095) [2022-01-20 04:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][80/1251] eta 0:47:25 lr 0.000828 time 3.0506 (2.4298) loss 2.7745 (3.7483) grad_norm 1.1999 (1.2049) [2022-01-20 04:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][90/1251] eta 0:46:40 lr 0.000828 time 2.9836 (2.4123) loss 4.8239 (3.7726) grad_norm 1.2289 (1.2046) [2022-01-20 04:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][100/1251] eta 0:45:54 lr 0.000828 time 2.1830 (2.3933) loss 4.1280 (3.7773) grad_norm 1.2294 (1.2009) [2022-01-20 04:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][110/1251] eta 0:45:10 lr 0.000828 time 1.5496 (2.3757) loss 3.6813 (3.7848) grad_norm 1.1217 (1.2017) [2022-01-20 04:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][120/1251] eta 0:45:00 lr 0.000828 time 3.7814 (2.3877) loss 4.2191 (3.7839) grad_norm 1.4869 (1.2046) [2022-01-20 04:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][130/1251] eta 0:44:27 lr 0.000828 time 2.5062 (2.3798) loss 4.1826 (3.7770) grad_norm 1.3004 (1.2042) [2022-01-20 04:16:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][140/1251] eta 0:43:56 lr 0.000828 time 2.6292 (2.3729) loss 3.5072 (3.7825) grad_norm 1.2839 (1.2019) [2022-01-20 04:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][150/1251] eta 0:43:10 lr 
0.000828 time 1.6073 (2.3528) loss 3.6590 (3.7929) grad_norm 1.1136 (1.2017) [2022-01-20 04:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][160/1251] eta 0:42:27 lr 0.000828 time 2.9157 (2.3350) loss 4.1081 (3.8012) grad_norm 1.0939 (1.1989) [2022-01-20 04:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][170/1251] eta 0:41:48 lr 0.000828 time 2.8655 (2.3209) loss 4.2359 (3.7865) grad_norm 1.2327 (1.2005) [2022-01-20 04:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][180/1251] eta 0:41:16 lr 0.000828 time 2.2889 (2.3125) loss 3.4900 (3.7889) grad_norm 1.2339 (1.1997) [2022-01-20 04:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][190/1251] eta 0:40:42 lr 0.000828 time 1.8572 (2.3021) loss 3.6395 (3.8000) grad_norm 1.0230 (1.2002) [2022-01-20 04:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][200/1251] eta 0:40:19 lr 0.000828 time 2.5873 (2.3025) loss 2.9618 (3.7974) grad_norm 1.2196 (1.2023) [2022-01-20 04:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][210/1251] eta 0:39:47 lr 0.000828 time 1.5814 (2.2931) loss 3.4630 (3.7969) grad_norm 1.2967 (1.2033) [2022-01-20 04:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][220/1251] eta 0:39:10 lr 0.000828 time 1.6703 (2.2798) loss 3.2510 (3.7876) grad_norm 1.6780 (1.2063) [2022-01-20 04:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][230/1251] eta 0:38:45 lr 0.000828 time 2.1004 (2.2776) loss 4.4306 (3.7892) grad_norm 1.3565 (1.2072) [2022-01-20 04:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][240/1251] eta 0:38:13 lr 0.000828 time 1.8597 (2.2689) loss 3.0749 (3.7953) grad_norm 1.1701 (1.2057) [2022-01-20 04:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][250/1251] eta 0:37:51 lr 0.000828 time 1.5674 (2.2691) loss 3.3381 (3.7917) grad_norm 1.1220 (1.2026) [2022-01-20 04:21:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][260/1251] eta 0:37:22 lr 0.000828 time 1.5583 (2.2629) loss 4.2443 (3.7986) grad_norm 1.2205 (1.2033) [2022-01-20 04:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][270/1251] eta 0:37:01 lr 0.000828 time 2.5868 (2.2642) loss 4.1352 (3.8021) grad_norm 1.1845 (1.2009) [2022-01-20 04:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][280/1251] eta 0:36:35 lr 0.000828 time 1.7283 (2.2615) loss 3.8533 (3.8013) grad_norm 1.1249 (1.2021) [2022-01-20 04:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][290/1251] eta 0:36:16 lr 0.000828 time 2.1494 (2.2645) loss 4.3634 (3.8098) grad_norm 1.0579 (1.2009) [2022-01-20 04:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][300/1251] eta 0:35:51 lr 0.000828 time 1.8950 (2.2620) loss 4.2652 (3.8023) grad_norm 1.1119 (1.2006) [2022-01-20 04:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][310/1251] eta 0:35:28 lr 0.000827 time 2.6113 (2.2617) loss 2.8506 (3.7982) grad_norm 1.1128 (1.1990) [2022-01-20 04:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][320/1251] eta 0:34:59 lr 0.000827 time 1.8284 (2.2555) loss 3.0350 (3.7919) grad_norm 1.3515 (1.1985) [2022-01-20 04:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][330/1251] eta 0:34:35 lr 0.000827 time 2.5437 (2.2538) loss 2.8175 (3.7950) grad_norm 1.0610 (1.1965) [2022-01-20 04:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][340/1251] eta 0:34:09 lr 0.000827 time 2.2444 (2.2498) loss 4.3789 (3.7966) grad_norm 1.1544 (1.1947) [2022-01-20 04:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][350/1251] eta 0:33:40 lr 0.000827 time 1.6601 (2.2428) loss 4.3198 (3.7963) grad_norm 1.2402 (1.1935) [2022-01-20 04:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][360/1251] eta 0:33:18 lr 0.000827 time 
2.0841 (2.2425) loss 3.5665 (3.8005) grad_norm 1.1116 (1.1938) [2022-01-20 04:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][370/1251] eta 0:33:00 lr 0.000827 time 2.6271 (2.2477) loss 3.5930 (3.7969) grad_norm 1.1158 (1.1919) [2022-01-20 04:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][380/1251] eta 0:32:37 lr 0.000827 time 2.5101 (2.2470) loss 4.3133 (3.7945) grad_norm 1.2138 (1.1917) [2022-01-20 04:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][390/1251] eta 0:32:11 lr 0.000827 time 1.7331 (2.2431) loss 4.5556 (3.7954) grad_norm 1.3160 (1.1944) [2022-01-20 04:26:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][400/1251] eta 0:31:46 lr 0.000827 time 2.5015 (2.2400) loss 3.0416 (3.7969) grad_norm 1.2032 (1.1957) [2022-01-20 04:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][410/1251] eta 0:31:22 lr 0.000827 time 2.2112 (2.2385) loss 3.9589 (3.8004) grad_norm 1.3180 (1.1955) [2022-01-20 04:27:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][420/1251] eta 0:30:58 lr 0.000827 time 2.3375 (2.2366) loss 4.3146 (3.7996) grad_norm 1.0830 (1.1953) [2022-01-20 04:27:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][430/1251] eta 0:30:36 lr 0.000827 time 2.0424 (2.2365) loss 2.6297 (3.7919) grad_norm 1.3131 (1.1939) [2022-01-20 04:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][440/1251] eta 0:30:15 lr 0.000827 time 2.8863 (2.2386) loss 4.2090 (3.7904) grad_norm 1.2508 (1.1950) [2022-01-20 04:28:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][450/1251] eta 0:29:51 lr 0.000827 time 2.1477 (2.2362) loss 4.1235 (3.7906) grad_norm 1.2003 (1.1946) [2022-01-20 04:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][460/1251] eta 0:29:27 lr 0.000827 time 1.8595 (2.2341) loss 3.9740 (3.7905) grad_norm 1.2542 (1.1937) [2022-01-20 04:28:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][470/1251] eta 0:29:02 lr 0.000827 time 2.2032 (2.2309) loss 3.5293 (3.7867) grad_norm 1.3651 (1.1928) [2022-01-20 04:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][480/1251] eta 0:28:37 lr 0.000827 time 1.6812 (2.2274) loss 3.8742 (3.7841) grad_norm 1.1348 (1.1926) [2022-01-20 04:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][490/1251] eta 0:28:15 lr 0.000827 time 2.3026 (2.2282) loss 3.3358 (3.7878) grad_norm 1.0831 (1.1912) [2022-01-20 04:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][500/1251] eta 0:27:51 lr 0.000827 time 1.9434 (2.2259) loss 3.8053 (3.7841) grad_norm 1.1386 (1.1912) [2022-01-20 04:30:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][510/1251] eta 0:27:28 lr 0.000827 time 1.7142 (2.2245) loss 4.3739 (3.7822) grad_norm 1.0458 (1.1916) [2022-01-20 04:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][520/1251] eta 0:27:04 lr 0.000827 time 1.6627 (2.2222) loss 4.4490 (3.7798) grad_norm 1.2139 (1.1911) [2022-01-20 04:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][530/1251] eta 0:26:41 lr 0.000827 time 1.7353 (2.2214) loss 4.2079 (3.7818) grad_norm 1.1593 (1.1906) [2022-01-20 04:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][540/1251] eta 0:26:21 lr 0.000827 time 2.7244 (2.2239) loss 4.2163 (3.7815) grad_norm 1.0230 (1.1909) [2022-01-20 04:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][550/1251] eta 0:26:01 lr 0.000827 time 2.9234 (2.2274) loss 3.4336 (3.7837) grad_norm 1.1489 (1.1913) [2022-01-20 04:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][560/1251] eta 0:25:40 lr 0.000827 time 1.5684 (2.2293) loss 3.6055 (3.7815) grad_norm 1.0823 (1.1938) [2022-01-20 04:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][570/1251] eta 0:25:17 lr 0.000827 time 
1.6387 (2.2287) loss 3.9540 (3.7795) grad_norm 1.3863 (1.1960) [2022-01-20 04:32:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][580/1251] eta 0:24:53 lr 0.000827 time 2.1273 (2.2265) loss 4.1069 (3.7828) grad_norm 0.9431 (1.1954) [2022-01-20 04:33:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][590/1251] eta 0:24:27 lr 0.000827 time 1.8502 (2.2204) loss 3.0804 (3.7794) grad_norm 1.2201 (1.1952) [2022-01-20 04:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][600/1251] eta 0:24:03 lr 0.000827 time 2.1002 (2.2175) loss 4.4855 (3.7768) grad_norm 1.0428 (1.1941) [2022-01-20 04:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][610/1251] eta 0:23:40 lr 0.000827 time 2.1774 (2.2157) loss 4.0547 (3.7807) grad_norm 1.1104 (1.1954) [2022-01-20 04:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][620/1251] eta 0:23:17 lr 0.000826 time 1.9586 (2.2154) loss 3.2422 (3.7811) grad_norm 1.3425 (1.1953) [2022-01-20 04:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][630/1251] eta 0:22:54 lr 0.000826 time 2.1823 (2.2138) loss 4.2059 (3.7838) grad_norm 1.6272 (1.1962) [2022-01-20 04:35:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][640/1251] eta 0:22:32 lr 0.000826 time 2.5012 (2.2143) loss 3.8152 (3.7834) grad_norm 1.0865 (1.1961) [2022-01-20 04:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][650/1251] eta 0:22:11 lr 0.000826 time 2.5238 (2.2148) loss 4.7039 (3.7865) grad_norm 1.0495 (1.1953) [2022-01-20 04:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][660/1251] eta 0:21:48 lr 0.000826 time 2.3517 (2.2133) loss 3.1879 (3.7885) grad_norm 1.2036 (1.1945) [2022-01-20 04:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][670/1251] eta 0:21:25 lr 0.000826 time 2.0736 (2.2124) loss 2.5435 (3.7833) grad_norm 1.0343 (1.1932) [2022-01-20 04:36:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][680/1251] eta 0:21:03 lr 0.000826 time 3.1442 (2.2122) loss 3.0487 (3.7829) grad_norm 1.0066 (1.1922) [2022-01-20 04:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][690/1251] eta 0:20:40 lr 0.000826 time 1.9531 (2.2113) loss 4.5261 (3.7865) grad_norm 1.2040 (1.1923) [2022-01-20 04:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][700/1251] eta 0:20:18 lr 0.000826 time 2.2406 (2.2120) loss 4.3631 (3.7899) grad_norm 1.2480 (1.1922) [2022-01-20 04:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][710/1251] eta 0:19:56 lr 0.000826 time 1.9032 (2.2122) loss 2.8994 (3.7897) grad_norm 1.3404 (1.1945) [2022-01-20 04:38:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][720/1251] eta 0:19:36 lr 0.000826 time 3.1101 (2.2155) loss 4.0653 (3.7893) grad_norm 1.2606 (1.1947) [2022-01-20 04:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][730/1251] eta 0:19:15 lr 0.000826 time 2.2947 (2.2187) loss 4.2773 (3.7880) grad_norm 1.1349 (1.1954) [2022-01-20 04:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][740/1251] eta 0:18:53 lr 0.000826 time 1.6265 (2.2178) loss 4.3761 (3.7871) grad_norm 1.0801 (1.1941) [2022-01-20 04:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][750/1251] eta 0:18:29 lr 0.000826 time 1.9226 (2.2144) loss 4.1160 (3.7881) grad_norm 1.1939 (1.1940) [2022-01-20 04:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][760/1251] eta 0:18:06 lr 0.000826 time 2.8208 (2.2124) loss 3.8418 (3.7879) grad_norm 1.3802 (1.1937) [2022-01-20 04:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][770/1251] eta 0:17:43 lr 0.000826 time 2.2362 (2.2113) loss 2.6961 (3.7846) grad_norm 1.1769 (1.1945) [2022-01-20 04:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][780/1251] eta 0:17:21 lr 0.000826 time 
1.9384 (2.2108) loss 3.8043 (3.7824) grad_norm 1.4218 (1.1945) [2022-01-20 04:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][790/1251] eta 0:16:59 lr 0.000826 time 2.3904 (2.2120) loss 4.3959 (3.7872) grad_norm 1.1542 (1.1949) [2022-01-20 04:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][800/1251] eta 0:16:38 lr 0.000826 time 3.3457 (2.2143) loss 3.0474 (3.7879) grad_norm 0.9612 (1.1948) [2022-01-20 04:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][810/1251] eta 0:16:16 lr 0.000826 time 2.3122 (2.2148) loss 3.9787 (3.7894) grad_norm 1.1592 (1.1951) [2022-01-20 04:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][820/1251] eta 0:15:54 lr 0.000826 time 1.9513 (2.2150) loss 4.2950 (3.7909) grad_norm 1.5794 (1.1969) [2022-01-20 04:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][830/1251] eta 0:15:32 lr 0.000826 time 1.5301 (2.2139) loss 3.3530 (3.7917) grad_norm 1.1516 (1.1967) [2022-01-20 04:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][840/1251] eta 0:15:09 lr 0.000826 time 3.0059 (2.2134) loss 3.4548 (3.7946) grad_norm 1.1538 (1.1967) [2022-01-20 04:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][850/1251] eta 0:14:46 lr 0.000826 time 2.1093 (2.2119) loss 4.6550 (3.7977) grad_norm 1.2239 (1.1963) [2022-01-20 04:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][860/1251] eta 0:14:24 lr 0.000826 time 2.2745 (2.2114) loss 4.7191 (3.7945) grad_norm 1.1761 (1.1972) [2022-01-20 04:43:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][870/1251] eta 0:14:02 lr 0.000826 time 1.8190 (2.2101) loss 3.4003 (3.7897) grad_norm 1.1975 (1.1972) [2022-01-20 04:43:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][880/1251] eta 0:13:39 lr 0.000826 time 2.4005 (2.2093) loss 3.8669 (3.7906) grad_norm 1.1285 (1.1975) [2022-01-20 04:44:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][890/1251] eta 0:13:16 lr 0.000826 time 2.3978 (2.2076) loss 4.2210 (3.7915) grad_norm 1.1616 (1.1973) [2022-01-20 04:44:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][900/1251] eta 0:12:54 lr 0.000826 time 2.1661 (2.2072) loss 3.9387 (3.7918) grad_norm 1.1664 (1.1978) [2022-01-20 04:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][910/1251] eta 0:12:32 lr 0.000826 time 2.0011 (2.2075) loss 4.1540 (3.7907) grad_norm 1.3508 (1.1982) [2022-01-20 04:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][920/1251] eta 0:12:10 lr 0.000826 time 2.4429 (2.2081) loss 3.9968 (3.7914) grad_norm 1.1209 (1.1987) [2022-01-20 04:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][930/1251] eta 0:11:49 lr 0.000826 time 2.5060 (2.2091) loss 4.2459 (3.7920) grad_norm 1.0536 (1.1989) [2022-01-20 04:46:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][940/1251] eta 0:11:26 lr 0.000825 time 1.9334 (2.2090) loss 3.5975 (3.7919) grad_norm 1.6617 (1.2014) [2022-01-20 04:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][950/1251] eta 0:11:04 lr 0.000825 time 2.2553 (2.2075) loss 3.9366 (3.7915) grad_norm 1.1100 (1.2016) [2022-01-20 04:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][960/1251] eta 0:10:41 lr 0.000825 time 1.7486 (2.2051) loss 3.5853 (3.7938) grad_norm 1.2311 (1.2017) [2022-01-20 04:47:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][970/1251] eta 0:10:19 lr 0.000825 time 2.2280 (2.2062) loss 4.0160 (3.7894) grad_norm 1.3454 (1.2012) [2022-01-20 04:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][980/1251] eta 0:09:58 lr 0.000825 time 2.3112 (2.2079) loss 3.3623 (3.7911) grad_norm 1.1534 (1.2014) [2022-01-20 04:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][990/1251] eta 0:09:36 lr 0.000825 time 
2.2498 (2.2073) loss 4.1103 (3.7930) grad_norm 1.3759 (1.2021) [2022-01-20 04:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1000/1251] eta 0:09:14 lr 0.000825 time 1.7803 (2.2072) loss 3.1212 (3.7916) grad_norm 1.1226 (1.2016) [2022-01-20 04:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1010/1251] eta 0:08:51 lr 0.000825 time 2.3131 (2.2073) loss 2.4630 (3.7908) grad_norm 1.0909 (1.2007) [2022-01-20 04:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1020/1251] eta 0:08:29 lr 0.000825 time 1.8016 (2.2063) loss 3.9070 (3.7898) grad_norm 1.1462 (1.2003) [2022-01-20 04:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1030/1251] eta 0:08:07 lr 0.000825 time 2.1499 (2.2045) loss 3.6512 (3.7876) grad_norm 1.2477 (1.1998) [2022-01-20 04:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1040/1251] eta 0:07:44 lr 0.000825 time 1.5980 (2.2035) loss 2.8595 (3.7863) grad_norm 1.2552 (1.2000) [2022-01-20 04:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1050/1251] eta 0:07:22 lr 0.000825 time 2.5688 (2.2032) loss 4.5099 (3.7852) grad_norm 1.1854 (1.2000) [2022-01-20 04:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1060/1251] eta 0:07:00 lr 0.000825 time 1.8756 (2.2029) loss 3.6426 (3.7862) grad_norm 1.1696 (1.2006) [2022-01-20 04:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1070/1251] eta 0:06:38 lr 0.000825 time 2.2978 (2.2034) loss 3.7177 (3.7865) grad_norm 1.1170 (1.2001) [2022-01-20 04:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1080/1251] eta 0:06:16 lr 0.000825 time 2.2112 (2.2019) loss 4.0333 (3.7862) grad_norm 1.1911 (1.1999) [2022-01-20 04:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1090/1251] eta 0:05:54 lr 0.000825 time 2.4174 (2.2016) loss 4.4880 (3.7874) grad_norm 1.4929 (1.1998) [2022-01-20 04:51:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1100/1251] eta 0:05:32 lr 0.000825 time 2.4613 (2.2026) loss 4.0466 (3.7882) grad_norm 1.3475 (1.1993) [2022-01-20 04:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1110/1251] eta 0:05:10 lr 0.000825 time 3.0594 (2.2040) loss 4.0409 (3.7883) grad_norm 1.0675 (1.1988) [2022-01-20 04:52:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1120/1251] eta 0:04:48 lr 0.000825 time 2.2396 (2.2040) loss 4.1318 (3.7877) grad_norm 1.0905 (1.1980) [2022-01-20 04:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1130/1251] eta 0:04:26 lr 0.000825 time 2.2899 (2.2040) loss 4.0127 (3.7876) grad_norm 1.1899 (1.1979) [2022-01-20 04:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1140/1251] eta 0:04:04 lr 0.000825 time 2.7377 (2.2042) loss 3.2005 (3.7859) grad_norm 1.0788 (1.1978) [2022-01-20 04:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1150/1251] eta 0:03:42 lr 0.000825 time 1.6398 (2.2025) loss 3.0094 (3.7855) grad_norm 1.1257 (1.1972) [2022-01-20 04:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1160/1251] eta 0:03:20 lr 0.000825 time 1.5300 (2.2008) loss 3.8056 (3.7837) grad_norm 1.0724 (1.1969) [2022-01-20 04:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1170/1251] eta 0:02:58 lr 0.000825 time 1.8984 (2.1991) loss 2.9375 (3.7824) grad_norm 1.2177 (1.1967) [2022-01-20 04:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1180/1251] eta 0:02:36 lr 0.000825 time 1.8919 (2.1977) loss 2.8940 (3.7826) grad_norm 1.2271 (1.1975) [2022-01-20 04:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1190/1251] eta 0:02:14 lr 0.000825 time 2.1090 (2.1971) loss 3.9085 (3.7800) grad_norm 1.2776 (1.1979) [2022-01-20 04:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1200/1251] eta 0:01:52 lr 
0.000825 time 2.1819 (2.1984) loss 4.4651 (3.7806) grad_norm 1.0517 (1.1973) [2022-01-20 04:55:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1210/1251] eta 0:01:30 lr 0.000825 time 2.5047 (2.2006) loss 4.0352 (3.7811) grad_norm 1.1042 (1.1967) [2022-01-20 04:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1220/1251] eta 0:01:08 lr 0.000825 time 3.4767 (2.2024) loss 3.4997 (3.7799) grad_norm 1.2226 (1.1965) [2022-01-20 04:56:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1230/1251] eta 0:00:46 lr 0.000825 time 2.2065 (2.2017) loss 4.0661 (3.7804) grad_norm 1.1683 (1.1966) [2022-01-20 04:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1240/1251] eta 0:00:24 lr 0.000825 time 1.8003 (2.1997) loss 2.7383 (3.7801) grad_norm 1.0461 (1.1962) [2022-01-20 04:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [82/300][1250/1251] eta 0:00:02 lr 0.000825 time 1.2125 (2.1937) loss 3.9581 (3.7795) grad_norm 0.9867 (1.1955) [2022-01-20 04:57:08 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 82 training takes 0:45:44 [2022-01-20 04:57:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.447 (18.447) Loss 1.2359 (1.2359) Acc@1 71.582 (71.582) Acc@5 91.309 (91.309) [2022-01-20 04:57:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.659 (3.444) Loss 1.2331 (1.1964) Acc@1 71.582 (72.363) Acc@5 90.430 (91.744) [2022-01-20 04:58:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.285 (2.479) Loss 1.0665 (1.1812) Acc@1 75.977 (72.917) Acc@5 92.969 (91.713) [2022-01-20 04:58:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.634 (2.243) Loss 1.2169 (1.1872) Acc@1 73.145 (72.763) Acc@5 91.504 (91.696) [2022-01-20 04:58:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.061 (2.145) Loss 1.1712 (1.1890) Acc@1 72.070 (72.694) Acc@5 92.480 (91.647) [2022-01-20 04:58:44 
swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.656 Acc@5 91.582 [2022-01-20 04:58:44 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.7% [2022-01-20 04:58:44 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.69% [2022-01-20 04:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][0/1251] eta 7:21:35 lr 0.000825 time 21.1794 (21.1794) loss 3.4715 (3.4715) grad_norm 1.2809 (1.2809) [2022-01-20 04:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][10/1251] eta 1:24:37 lr 0.000824 time 1.8704 (4.0913) loss 3.6579 (3.7615) grad_norm 1.0710 (1.1443) [2022-01-20 04:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][20/1251] eta 1:04:52 lr 0.000824 time 2.3047 (3.1619) loss 4.6743 (3.7436) grad_norm 1.1104 (1.1594) [2022-01-20 05:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][30/1251] eta 0:57:11 lr 0.000824 time 1.8967 (2.8103) loss 3.7312 (3.7341) grad_norm 1.2691 (1.1890) [2022-01-20 05:00:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][40/1251] eta 0:54:30 lr 0.000824 time 3.9039 (2.7007) loss 3.2993 (3.7348) grad_norm 1.1052 (1.1965) [2022-01-20 05:00:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][50/1251] eta 0:52:52 lr 0.000824 time 1.8586 (2.6412) loss 3.9060 (3.7760) grad_norm 1.0027 (1.1934) [2022-01-20 05:01:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][60/1251] eta 0:50:45 lr 0.000824 time 1.4619 (2.5573) loss 4.1505 (3.8153) grad_norm 1.0681 (1.1864) [2022-01-20 05:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][70/1251] eta 0:49:39 lr 0.000824 time 1.7875 (2.5232) loss 4.1542 (3.7958) grad_norm 1.2716 (1.1893) [2022-01-20 05:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][80/1251] eta 0:49:08 lr 0.000824 time 3.8411 (2.5181) loss 4.5413 (3.8156) grad_norm 1.5300 (1.1911) 
[2022-01-20 05:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][90/1251] eta 0:48:05 lr 0.000824 time 1.8665 (2.4851) loss 3.5554 (3.8246) grad_norm 1.4369 (1.2118) [2022-01-20 05:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][100/1251] eta 0:46:48 lr 0.000824 time 1.8854 (2.4401) loss 3.4656 (3.8445) grad_norm 1.4028 (1.2136) [2022-01-20 05:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][110/1251] eta 0:45:35 lr 0.000824 time 1.8695 (2.3973) loss 3.3871 (3.8288) grad_norm 1.1523 (1.2069) [2022-01-20 05:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][120/1251] eta 0:45:05 lr 0.000824 time 4.4323 (2.3921) loss 4.2274 (3.8523) grad_norm 1.3775 (1.2113) [2022-01-20 05:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][130/1251] eta 0:44:28 lr 0.000824 time 1.7694 (2.3806) loss 2.7494 (3.8210) grad_norm 1.3457 (1.2127) [2022-01-20 05:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][140/1251] eta 0:44:02 lr 0.000824 time 2.1706 (2.3781) loss 4.2252 (3.8343) grad_norm 1.1704 (1.2074) [2022-01-20 05:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][150/1251] eta 0:43:23 lr 0.000824 time 2.1479 (2.3645) loss 3.7576 (3.8244) grad_norm 1.0582 (1.2069) [2022-01-20 05:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][160/1251] eta 0:42:31 lr 0.000824 time 2.1290 (2.3389) loss 4.1730 (3.8254) grad_norm 1.1513 (1.2018) [2022-01-20 05:05:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][170/1251] eta 0:41:52 lr 0.000824 time 2.4168 (2.3245) loss 4.1203 (3.8196) grad_norm 1.0151 (1.2000) [2022-01-20 05:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][180/1251] eta 0:41:24 lr 0.000824 time 2.0023 (2.3198) loss 4.1693 (3.8141) grad_norm 1.2980 (1.2001) [2022-01-20 05:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][190/1251] eta 0:40:56 
lr 0.000824 time 2.2768 (2.3155) loss 3.8869 (3.8095) grad_norm 1.1102 (1.1975) [2022-01-20 05:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][200/1251] eta 0:40:26 lr 0.000824 time 1.9697 (2.3089) loss 3.3496 (3.8094) grad_norm 1.2239 (1.1942) [2022-01-20 05:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][210/1251] eta 0:39:58 lr 0.000824 time 2.2531 (2.3042) loss 4.4077 (3.8048) grad_norm 1.2613 (1.1950) [2022-01-20 05:07:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][220/1251] eta 0:39:34 lr 0.000824 time 3.6350 (2.3026) loss 4.3456 (3.8164) grad_norm 1.1121 (1.1933) [2022-01-20 05:07:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][230/1251] eta 0:39:01 lr 0.000824 time 2.1565 (2.2937) loss 4.1145 (3.8166) grad_norm 1.2916 (1.1955) [2022-01-20 05:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][240/1251] eta 0:38:28 lr 0.000824 time 1.6203 (2.2834) loss 3.1180 (3.8187) grad_norm 1.1548 (1.1993) [2022-01-20 05:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][250/1251] eta 0:38:00 lr 0.000824 time 2.3281 (2.2781) loss 3.3480 (3.8123) grad_norm 1.1644 (1.1980) [2022-01-20 05:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][260/1251] eta 0:37:39 lr 0.000824 time 4.0139 (2.2799) loss 2.5810 (3.8051) grad_norm 1.1728 (1.2002) [2022-01-20 05:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][270/1251] eta 0:37:12 lr 0.000824 time 1.6711 (2.2761) loss 2.7046 (3.7972) grad_norm 1.1391 (1.1990) [2022-01-20 05:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][280/1251] eta 0:36:46 lr 0.000824 time 2.1334 (2.2728) loss 2.7733 (3.7965) grad_norm 1.1503 (1.1992) [2022-01-20 05:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][290/1251] eta 0:36:21 lr 0.000824 time 1.8637 (2.2698) loss 4.4280 (3.7911) grad_norm 1.0975 (1.1996) [2022-01-20 05:10:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][300/1251] eta 0:35:50 lr 0.000824 time 1.8684 (2.2615) loss 4.1137 (3.7934) grad_norm 1.0975 (1.1988) [2022-01-20 05:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][310/1251] eta 0:35:19 lr 0.000824 time 2.1550 (2.2525) loss 2.7721 (3.7767) grad_norm 1.3704 (1.1959) [2022-01-20 05:10:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][320/1251] eta 0:34:56 lr 0.000823 time 2.9808 (2.2516) loss 3.3508 (3.7745) grad_norm 0.9896 (1.1940) [2022-01-20 05:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][330/1251] eta 0:34:34 lr 0.000823 time 2.0881 (2.2525) loss 4.4965 (3.7721) grad_norm 1.1327 (1.1976) [2022-01-20 05:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][340/1251] eta 0:34:14 lr 0.000823 time 2.0053 (2.2549) loss 3.9549 (3.7749) grad_norm 1.2243 (1.1998) [2022-01-20 05:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][350/1251] eta 0:33:53 lr 0.000823 time 1.9537 (2.2566) loss 3.9270 (3.7779) grad_norm 1.2382 (1.2009) [2022-01-20 05:12:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][360/1251] eta 0:33:28 lr 0.000823 time 2.4734 (2.2542) loss 4.0024 (3.7794) grad_norm 1.1626 (1.2012) [2022-01-20 05:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][370/1251] eta 0:33:03 lr 0.000823 time 2.2221 (2.2518) loss 3.2587 (3.7786) grad_norm 1.1863 (1.2009) [2022-01-20 05:13:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][380/1251] eta 0:32:39 lr 0.000823 time 1.6521 (2.2500) loss 4.3630 (3.7774) grad_norm 1.1781 (1.2012) [2022-01-20 05:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][390/1251] eta 0:32:14 lr 0.000823 time 1.9923 (2.2469) loss 4.3185 (3.7801) grad_norm 1.0847 (1.2010) [2022-01-20 05:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][400/1251] eta 0:31:50 lr 0.000823 time 
2.1175 (2.2449) loss 3.5955 (3.7796) grad_norm 1.0727 (1.1998) [2022-01-20 05:14:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][410/1251] eta 0:31:31 lr 0.000823 time 1.7757 (2.2491) loss 4.0101 (3.7879) grad_norm 1.4426 (1.1988) [2022-01-20 05:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][420/1251] eta 0:31:08 lr 0.000823 time 1.6937 (2.2485) loss 3.4472 (3.7833) grad_norm 1.1890 (1.1977) [2022-01-20 05:14:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][430/1251] eta 0:30:42 lr 0.000823 time 1.6959 (2.2443) loss 4.2982 (3.7814) grad_norm 1.1883 (1.1968) [2022-01-20 05:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][440/1251] eta 0:30:15 lr 0.000823 time 1.5355 (2.2389) loss 3.6130 (3.7811) grad_norm 1.1843 (1.1983) [2022-01-20 05:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][450/1251] eta 0:29:52 lr 0.000823 time 2.1133 (2.2374) loss 4.8122 (3.7747) grad_norm 1.1504 (1.1971) [2022-01-20 05:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][460/1251] eta 0:29:28 lr 0.000823 time 2.1560 (2.2363) loss 3.5365 (3.7675) grad_norm 1.5347 (1.1983) [2022-01-20 05:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][470/1251] eta 0:29:03 lr 0.000823 time 1.9415 (2.2327) loss 3.9301 (3.7679) grad_norm 1.2678 (1.1980) [2022-01-20 05:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][480/1251] eta 0:28:42 lr 0.000823 time 2.7298 (2.2336) loss 3.5060 (3.7705) grad_norm 1.0824 (1.1970) [2022-01-20 05:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][490/1251] eta 0:28:18 lr 0.000823 time 1.8459 (2.2315) loss 3.5799 (3.7717) grad_norm 1.1373 (1.1959) [2022-01-20 05:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][500/1251] eta 0:27:56 lr 0.000823 time 2.0159 (2.2324) loss 3.2049 (3.7680) grad_norm 1.0843 (1.1955) [2022-01-20 05:17:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][510/1251] eta 0:27:31 lr 0.000823 time 2.2189 (2.2294) loss 3.9258 (3.7702) grad_norm 1.3517 (1.1961) [2022-01-20 05:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][520/1251] eta 0:27:08 lr 0.000823 time 3.1686 (2.2279) loss 4.7669 (3.7710) grad_norm 1.4863 (1.1971) [2022-01-20 05:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][530/1251] eta 0:26:45 lr 0.000823 time 2.0232 (2.2267) loss 3.3671 (3.7642) grad_norm 1.1230 (1.1983) [2022-01-20 05:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][540/1251] eta 0:26:22 lr 0.000823 time 2.3328 (2.2259) loss 4.3507 (3.7674) grad_norm 1.2678 (1.1995) [2022-01-20 05:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][550/1251] eta 0:25:59 lr 0.000823 time 1.7392 (2.2253) loss 3.8980 (3.7693) grad_norm 1.0464 (1.1989) [2022-01-20 05:19:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][560/1251] eta 0:25:36 lr 0.000823 time 2.8465 (2.2239) loss 3.8889 (3.7681) grad_norm 1.4244 (1.1991) [2022-01-20 05:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][570/1251] eta 0:25:13 lr 0.000823 time 2.5215 (2.2222) loss 3.2530 (3.7730) grad_norm 1.2144 (1.2001) [2022-01-20 05:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][580/1251] eta 0:24:49 lr 0.000823 time 1.9239 (2.2199) loss 2.5102 (3.7705) grad_norm 1.0975 (1.1992) [2022-01-20 05:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][590/1251] eta 0:24:26 lr 0.000823 time 2.5921 (2.2185) loss 4.0522 (3.7669) grad_norm 1.1863 (1.1985) [2022-01-20 05:20:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][600/1251] eta 0:24:04 lr 0.000823 time 2.8177 (2.2182) loss 2.8153 (3.7644) grad_norm 1.1734 (1.1986) [2022-01-20 05:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][610/1251] eta 0:23:41 lr 0.000823 time 
1.8896 (2.2181) loss 4.0491 (3.7691) grad_norm 1.3682 (1.1987) [2022-01-20 05:21:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][620/1251] eta 0:23:20 lr 0.000823 time 2.0684 (2.2191) loss 4.3043 (3.7728) grad_norm 1.0481 (1.1976) [2022-01-20 05:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][630/1251] eta 0:22:59 lr 0.000823 time 2.7891 (2.2216) loss 3.7386 (3.7706) grad_norm 1.2609 (1.1967) [2022-01-20 05:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][640/1251] eta 0:22:38 lr 0.000822 time 2.5017 (2.2226) loss 4.0396 (3.7728) grad_norm 1.0312 (1.1960) [2022-01-20 05:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][650/1251] eta 0:22:14 lr 0.000822 time 1.9299 (2.2197) loss 4.4978 (3.7751) grad_norm 1.2090 (1.1946) [2022-01-20 05:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][660/1251] eta 0:21:49 lr 0.000822 time 1.9359 (2.2155) loss 4.4545 (3.7764) grad_norm 1.0621 (1.1934) [2022-01-20 05:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][670/1251] eta 0:21:26 lr 0.000822 time 2.8626 (2.2142) loss 2.9605 (3.7755) grad_norm 1.0602 (1.1923) [2022-01-20 05:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][680/1251] eta 0:21:02 lr 0.000822 time 2.0294 (2.2118) loss 2.6656 (3.7723) grad_norm 1.4483 (1.1926) [2022-01-20 05:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][690/1251] eta 0:20:39 lr 0.000822 time 1.5701 (2.2100) loss 4.4312 (3.7708) grad_norm 1.3579 (1.1929) [2022-01-20 05:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][700/1251] eta 0:20:17 lr 0.000822 time 2.1784 (2.2102) loss 2.9232 (3.7689) grad_norm 1.3622 (1.1936) [2022-01-20 05:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][710/1251] eta 0:19:56 lr 0.000822 time 3.0447 (2.2125) loss 4.0474 (3.7711) grad_norm 1.1566 (1.1940) [2022-01-20 05:25:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][720/1251] eta 0:19:34 lr 0.000822 time 2.1796 (2.2119) loss 4.0308 (3.7712) grad_norm 1.1585 (1.1927) [2022-01-20 05:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][730/1251] eta 0:19:11 lr 0.000822 time 1.6741 (2.2111) loss 3.9896 (3.7692) grad_norm 1.1887 (1.1935) [2022-01-20 05:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][740/1251] eta 0:18:50 lr 0.000822 time 2.1878 (2.2126) loss 3.5093 (3.7680) grad_norm 1.2429 (1.1940) [2022-01-20 05:26:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][750/1251] eta 0:18:30 lr 0.000822 time 3.8036 (2.2163) loss 4.0609 (3.7701) grad_norm 1.2462 (1.1945) [2022-01-20 05:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][760/1251] eta 0:18:09 lr 0.000822 time 3.1082 (2.2185) loss 4.4923 (3.7671) grad_norm 1.3238 (1.1948) [2022-01-20 05:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][770/1251] eta 0:17:46 lr 0.000822 time 1.9397 (2.2181) loss 3.1123 (3.7640) grad_norm 1.4062 (1.1949) [2022-01-20 05:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][780/1251] eta 0:17:23 lr 0.000822 time 1.8237 (2.2161) loss 4.4245 (3.7664) grad_norm 1.1107 (1.1958) [2022-01-20 05:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][790/1251] eta 0:17:00 lr 0.000822 time 1.9559 (2.2134) loss 4.4320 (3.7661) grad_norm 1.2579 (1.1964) [2022-01-20 05:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][800/1251] eta 0:16:37 lr 0.000822 time 3.5899 (2.2128) loss 3.3534 (3.7640) grad_norm 1.1495 (1.1957) [2022-01-20 05:28:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][810/1251] eta 0:16:14 lr 0.000822 time 2.1599 (2.2107) loss 3.3997 (3.7597) grad_norm 1.0661 (1.1950) [2022-01-20 05:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][820/1251] eta 0:15:52 lr 0.000822 time 
2.2080 (2.2107) loss 3.5170 (3.7593) grad_norm 1.2949 (1.1959) [2022-01-20 05:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][830/1251] eta 0:15:31 lr 0.000822 time 2.1495 (2.2114) loss 3.9470 (3.7581) grad_norm 1.1235 (1.1960) [2022-01-20 05:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][840/1251] eta 0:15:09 lr 0.000822 time 3.2648 (2.2123) loss 3.8003 (3.7579) grad_norm 1.2567 (1.1966) [2022-01-20 05:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][850/1251] eta 0:14:46 lr 0.000822 time 2.3034 (2.2113) loss 3.6990 (3.7581) grad_norm 1.2250 (1.1957) [2022-01-20 05:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][860/1251] eta 0:14:24 lr 0.000822 time 1.7980 (2.2113) loss 4.5304 (3.7581) grad_norm 1.1246 (1.1953) [2022-01-20 05:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][870/1251] eta 0:14:02 lr 0.000822 time 1.9101 (2.2117) loss 4.1047 (3.7603) grad_norm 1.3574 (1.1949) [2022-01-20 05:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][880/1251] eta 0:13:40 lr 0.000822 time 2.8495 (2.2117) loss 3.3641 (3.7627) grad_norm 1.2342 (1.1950) [2022-01-20 05:31:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][890/1251] eta 0:13:17 lr 0.000822 time 2.5660 (2.2102) loss 3.2614 (3.7648) grad_norm 1.2206 (1.1954) [2022-01-20 05:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][900/1251] eta 0:12:55 lr 0.000822 time 2.5677 (2.2087) loss 3.4530 (3.7646) grad_norm 1.1579 (1.1954) [2022-01-20 05:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][910/1251] eta 0:12:32 lr 0.000822 time 1.5635 (2.2071) loss 3.5422 (3.7658) grad_norm 1.1251 (1.1958) [2022-01-20 05:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][920/1251] eta 0:12:10 lr 0.000822 time 1.5534 (2.2059) loss 4.5899 (3.7641) grad_norm 1.2994 (1.1970) [2022-01-20 05:32:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][930/1251] eta 0:11:48 lr 0.000822 time 2.5729 (2.2070) loss 4.2881 (3.7630) grad_norm 1.0743 (1.1970) [2022-01-20 05:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][940/1251] eta 0:11:26 lr 0.000822 time 2.3625 (2.2078) loss 4.1075 (3.7639) grad_norm 1.0088 (1.1972) [2022-01-20 05:33:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][950/1251] eta 0:11:04 lr 0.000821 time 1.8337 (2.2080) loss 4.1785 (3.7650) grad_norm 1.2087 (1.1975) [2022-01-20 05:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][960/1251] eta 0:10:42 lr 0.000821 time 1.6333 (2.2086) loss 4.3246 (3.7658) grad_norm 1.1646 (1.1972) [2022-01-20 05:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][970/1251] eta 0:10:20 lr 0.000821 time 2.7504 (2.2082) loss 2.6032 (3.7671) grad_norm 1.2049 (1.1975) [2022-01-20 05:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][980/1251] eta 0:09:57 lr 0.000821 time 1.8252 (2.2050) loss 3.8669 (3.7669) grad_norm 1.1331 (1.1981) [2022-01-20 05:35:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][990/1251] eta 0:09:35 lr 0.000821 time 1.9511 (2.2047) loss 3.0363 (3.7669) grad_norm 1.3477 (1.1976) [2022-01-20 05:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1000/1251] eta 0:09:13 lr 0.000821 time 1.8878 (2.2046) loss 4.4407 (3.7657) grad_norm 1.2628 (1.1971) [2022-01-20 05:35:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1010/1251] eta 0:08:51 lr 0.000821 time 2.5648 (2.2037) loss 4.2796 (3.7637) grad_norm 0.9940 (1.1967) [2022-01-20 05:36:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1020/1251] eta 0:08:29 lr 0.000821 time 2.0055 (2.2038) loss 4.1910 (3.7648) grad_norm 1.1196 (1.1977) [2022-01-20 05:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1030/1251] eta 0:08:07 lr 0.000821 time 
2.1213 (2.2053) loss 3.7948 (3.7608) grad_norm 1.1718 (1.1975) [2022-01-20 05:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1040/1251] eta 0:07:45 lr 0.000821 time 1.6638 (2.2047) loss 3.4645 (3.7589) grad_norm 1.4343 (1.1976) [2022-01-20 05:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1050/1251] eta 0:07:22 lr 0.000821 time 1.7762 (2.2040) loss 4.1834 (3.7584) grad_norm 1.3132 (1.1978) [2022-01-20 05:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1060/1251] eta 0:07:00 lr 0.000821 time 2.6585 (2.2035) loss 3.5239 (3.7585) grad_norm 1.2709 (1.1977) [2022-01-20 05:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1070/1251] eta 0:06:38 lr 0.000821 time 2.7354 (2.2026) loss 3.9546 (3.7598) grad_norm 1.0422 (1.1971) [2022-01-20 05:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1080/1251] eta 0:06:16 lr 0.000821 time 1.4659 (2.2025) loss 4.2054 (3.7590) grad_norm 1.1041 (1.1962) [2022-01-20 05:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1090/1251] eta 0:05:54 lr 0.000821 time 1.5193 (2.2030) loss 3.2572 (3.7582) grad_norm 1.4565 (1.1955) [2022-01-20 05:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1100/1251] eta 0:05:32 lr 0.000821 time 2.1754 (2.2039) loss 3.4359 (3.7598) grad_norm 1.3963 (1.1962) [2022-01-20 05:39:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1110/1251] eta 0:05:11 lr 0.000821 time 4.0265 (2.2061) loss 3.6018 (3.7598) grad_norm 1.0048 (1.1954) [2022-01-20 05:39:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1120/1251] eta 0:04:48 lr 0.000821 time 1.8359 (2.2055) loss 4.3692 (3.7595) grad_norm 1.1056 (1.1953) [2022-01-20 05:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1130/1251] eta 0:04:26 lr 0.000821 time 1.7951 (2.2034) loss 4.1374 (3.7605) grad_norm 1.4629 (1.1960) [2022-01-20 05:40:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1140/1251] eta 0:04:04 lr 0.000821 time 2.4740 (2.2025) loss 4.4263 (3.7600) grad_norm 1.0864 (1.1954) [2022-01-20 05:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1150/1251] eta 0:03:42 lr 0.000821 time 2.5334 (2.2018) loss 3.0355 (3.7619) grad_norm 1.1461 (1.1950) [2022-01-20 05:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1160/1251] eta 0:03:20 lr 0.000821 time 2.1685 (2.2011) loss 4.5824 (3.7614) grad_norm 1.0722 (1.1942) [2022-01-20 05:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1170/1251] eta 0:02:58 lr 0.000821 time 1.9250 (2.2011) loss 4.1884 (3.7642) grad_norm 1.1813 (1.1942) [2022-01-20 05:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1180/1251] eta 0:02:36 lr 0.000821 time 2.4066 (2.2008) loss 3.0242 (3.7634) grad_norm 1.0667 (1.1942) [2022-01-20 05:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1190/1251] eta 0:02:14 lr 0.000821 time 2.2009 (2.2000) loss 3.6917 (3.7642) grad_norm 1.0727 (1.1946) [2022-01-20 05:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1200/1251] eta 0:01:52 lr 0.000821 time 2.6267 (2.1999) loss 3.4373 (3.7637) grad_norm 1.1577 (1.1942) [2022-01-20 05:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1210/1251] eta 0:01:30 lr 0.000821 time 2.5043 (2.2000) loss 4.0742 (3.7632) grad_norm 1.0408 (1.1941) [2022-01-20 05:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1220/1251] eta 0:01:08 lr 0.000821 time 2.5381 (2.2010) loss 4.2537 (3.7630) grad_norm 1.0989 (1.1938) [2022-01-20 05:43:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1230/1251] eta 0:00:46 lr 0.000821 time 2.5969 (2.2014) loss 3.4100 (3.7627) grad_norm 1.2578 (1.1936) [2022-01-20 05:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1240/1251] eta 0:00:24 lr 
0.000821 time 1.2034 (2.1996) loss 3.7387 (3.7631) grad_norm 1.1323 (1.1939) [2022-01-20 05:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [83/300][1250/1251] eta 0:00:02 lr 0.000821 time 1.1961 (2.1943) loss 4.0677 (3.7616) grad_norm 1.5061 (1.1942) [2022-01-20 05:44:29 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 83 training takes 0:45:45 [2022-01-20 05:44:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.093 (18.093) Loss 1.1823 (1.1823) Acc@1 72.949 (72.949) Acc@5 92.285 (92.285) [2022-01-20 05:45:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.306 (3.420) Loss 1.1401 (1.1744) Acc@1 74.219 (73.136) Acc@5 91.113 (91.744) [2022-01-20 05:45:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.605 (2.575) Loss 1.1563 (1.1740) Acc@1 72.852 (73.042) Acc@5 91.797 (91.825) [2022-01-20 05:45:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.006 (2.373) Loss 1.1388 (1.1755) Acc@1 72.266 (72.880) Acc@5 92.480 (91.794) [2022-01-20 05:45:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.405 (2.175) Loss 1.2063 (1.1780) Acc@1 73.730 (72.842) Acc@5 90.918 (91.740) [2022-01-20 05:46:05 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.936 Acc@5 91.666 [2022-01-20 05:46:05 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.9% [2022-01-20 05:46:05 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.94% [2022-01-20 05:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][0/1251] eta 7:20:11 lr 0.000821 time 21.1119 (21.1119) loss 3.5240 (3.5240) grad_norm 1.4185 (1.4185) [2022-01-20 05:46:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][10/1251] eta 1:21:20 lr 0.000820 time 1.8499 (3.9324) loss 2.9182 (3.6047) grad_norm 1.1942 (1.2492) [2022-01-20 05:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[84/300][20/1251] eta 1:02:07 lr 0.000820 time 1.4995 (3.0278) loss 3.9732 (3.7512) grad_norm 1.1561 (1.2486) [2022-01-20 05:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][30/1251] eta 0:56:07 lr 0.000820 time 1.4918 (2.7576) loss 4.1474 (3.7385) grad_norm 0.9991 (1.2189) [2022-01-20 05:47:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][40/1251] eta 0:52:55 lr 0.000820 time 3.3344 (2.6225) loss 4.2389 (3.7942) grad_norm 1.1367 (1.1985) [2022-01-20 05:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][50/1251] eta 0:51:42 lr 0.000820 time 2.2056 (2.5837) loss 3.3841 (3.7796) grad_norm 1.1620 (1.1971) [2022-01-20 05:48:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][60/1251] eta 0:50:43 lr 0.000820 time 1.8481 (2.5555) loss 3.9851 (3.8128) grad_norm 1.0847 (1.1869) [2022-01-20 05:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][70/1251] eta 0:49:00 lr 0.000820 time 1.6709 (2.4898) loss 3.7934 (3.7939) grad_norm 1.2024 (1.1874) [2022-01-20 05:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][80/1251] eta 0:48:01 lr 0.000820 time 3.4475 (2.4609) loss 2.5024 (3.7844) grad_norm 1.3765 (1.1870) [2022-01-20 05:49:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][90/1251] eta 0:47:10 lr 0.000820 time 2.1389 (2.4383) loss 4.4472 (3.7902) grad_norm 1.1445 (1.2004) [2022-01-20 05:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][100/1251] eta 0:46:18 lr 0.000820 time 1.5128 (2.4143) loss 2.8406 (3.7713) grad_norm 1.0601 (1.2042) [2022-01-20 05:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][110/1251] eta 0:45:32 lr 0.000820 time 1.7647 (2.3946) loss 3.4205 (3.7564) grad_norm 1.0019 (1.1983) [2022-01-20 05:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][120/1251] eta 0:44:39 lr 0.000820 time 1.8841 (2.3694) loss 4.2893 (3.7485) grad_norm 1.9653 (1.2043) 
[2022-01-20 05:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][130/1251] eta 0:44:06 lr 0.000820 time 1.6668 (2.3608) loss 4.3351 (3.7726) grad_norm 1.5065 (1.2093) [2022-01-20 05:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][140/1251] eta 0:43:19 lr 0.000820 time 1.6854 (2.3400) loss 3.8062 (3.7702) grad_norm 1.1138 (1.2071) [2022-01-20 05:51:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][150/1251] eta 0:42:56 lr 0.000820 time 2.1226 (2.3400) loss 3.1039 (3.7769) grad_norm 1.1622 (1.2080) [2022-01-20 05:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][160/1251] eta 0:42:18 lr 0.000820 time 1.8839 (2.3271) loss 4.0960 (3.7755) grad_norm 1.0050 (1.2057) [2022-01-20 05:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][170/1251] eta 0:41:51 lr 0.000820 time 1.7197 (2.3237) loss 3.6966 (3.7616) grad_norm 1.2363 (1.2069) [2022-01-20 05:53:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][180/1251] eta 0:41:21 lr 0.000820 time 2.1847 (2.3166) loss 3.9644 (3.7624) grad_norm 1.2237 (1.2038) [2022-01-20 05:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][190/1251] eta 0:40:43 lr 0.000820 time 1.7555 (2.3027) loss 4.0761 (3.7598) grad_norm 1.1208 (1.1986) [2022-01-20 05:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][200/1251] eta 0:40:08 lr 0.000820 time 1.8804 (2.2915) loss 4.2139 (3.7601) grad_norm 1.0764 (1.1958) [2022-01-20 05:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][210/1251] eta 0:39:43 lr 0.000820 time 1.9053 (2.2897) loss 4.4672 (3.7453) grad_norm 1.1538 (1.1970) [2022-01-20 05:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][220/1251] eta 0:39:22 lr 0.000820 time 2.1302 (2.2919) loss 4.1002 (3.7457) grad_norm 1.0963 (1.1992) [2022-01-20 05:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][230/1251] eta 0:38:52 
lr 0.000820 time 1.8967 (2.2842) loss 2.8088 (3.7312) grad_norm 1.2479 (1.1988) [2022-01-20 05:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][240/1251] eta 0:38:28 lr 0.000820 time 1.6773 (2.2830) loss 3.7681 (3.7430) grad_norm 1.2408 (1.1978) [2022-01-20 05:55:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][250/1251] eta 0:38:02 lr 0.000820 time 2.1725 (2.2804) loss 3.0543 (3.7342) grad_norm 1.0858 (1.1977) [2022-01-20 05:55:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][260/1251] eta 0:37:26 lr 0.000820 time 1.9769 (2.2671) loss 4.3000 (3.7424) grad_norm 1.1946 (1.1980) [2022-01-20 05:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][270/1251] eta 0:36:57 lr 0.000820 time 1.7865 (2.2602) loss 3.9027 (3.7459) grad_norm 1.2399 (1.2022) [2022-01-20 05:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][280/1251] eta 0:36:34 lr 0.000820 time 2.2376 (2.2602) loss 2.5374 (3.7338) grad_norm 1.1854 (1.2031) [2022-01-20 05:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][290/1251] eta 0:36:08 lr 0.000820 time 1.8956 (2.2565) loss 4.0874 (3.7370) grad_norm 1.1568 (1.2032) [2022-01-20 05:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][300/1251] eta 0:35:42 lr 0.000820 time 2.1111 (2.2532) loss 2.7718 (3.7365) grad_norm 1.0613 (1.2034) [2022-01-20 05:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][310/1251] eta 0:35:20 lr 0.000820 time 1.8569 (2.2536) loss 4.5347 (3.7459) grad_norm 1.1967 (1.2031) [2022-01-20 05:58:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][320/1251] eta 0:35:01 lr 0.000820 time 2.7433 (2.2570) loss 2.8394 (3.7471) grad_norm 1.1204 (1.2009) [2022-01-20 05:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][330/1251] eta 0:34:39 lr 0.000819 time 2.1375 (2.2577) loss 4.3153 (3.7540) grad_norm 1.1300 (1.1992) [2022-01-20 05:58:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][340/1251] eta 0:34:12 lr 0.000819 time 1.8859 (2.2532) loss 3.7123 (3.7491) grad_norm 1.1727 (1.1971) [2022-01-20 05:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][350/1251] eta 0:33:44 lr 0.000819 time 1.8600 (2.2475) loss 2.7811 (3.7495) grad_norm 1.3840 (1.1971) [2022-01-20 05:59:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][360/1251] eta 0:33:14 lr 0.000819 time 1.8528 (2.2383) loss 4.6085 (3.7561) grad_norm 1.3203 (1.1983) [2022-01-20 05:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][370/1251] eta 0:32:48 lr 0.000819 time 1.9770 (2.2344) loss 4.0253 (3.7572) grad_norm 1.1916 (1.1992) [2022-01-20 06:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][380/1251] eta 0:32:23 lr 0.000819 time 2.2204 (2.2314) loss 4.2438 (3.7612) grad_norm 1.3504 (1.1998) [2022-01-20 06:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][390/1251] eta 0:32:03 lr 0.000819 time 2.4017 (2.2345) loss 3.2248 (3.7608) grad_norm 1.1235 (1.2016) [2022-01-20 06:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][400/1251] eta 0:31:44 lr 0.000819 time 1.9300 (2.2374) loss 2.6045 (3.7619) grad_norm 1.2810 (1.2022) [2022-01-20 06:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][410/1251] eta 0:31:21 lr 0.000819 time 2.8350 (2.2368) loss 2.6842 (3.7557) grad_norm 1.0777 (1.2017) [2022-01-20 06:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][420/1251] eta 0:30:59 lr 0.000819 time 2.1812 (2.2372) loss 3.7794 (3.7590) grad_norm 1.1283 (1.2016) [2022-01-20 06:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][430/1251] eta 0:30:39 lr 0.000819 time 2.1662 (2.2405) loss 4.4691 (3.7616) grad_norm 1.2131 (1.2031) [2022-01-20 06:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][440/1251] eta 0:30:14 lr 0.000819 time 
2.2750 (2.2377) loss 4.3187 (3.7639) grad_norm 1.0916 (1.2027) [2022-01-20 06:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][450/1251] eta 0:29:48 lr 0.000819 time 2.3653 (2.2331) loss 3.9312 (3.7669) grad_norm 0.9486 (1.2020) [2022-01-20 06:03:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][460/1251] eta 0:29:22 lr 0.000819 time 1.9009 (2.2288) loss 4.1470 (3.7684) grad_norm 1.1615 (1.2005) [2022-01-20 06:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][470/1251] eta 0:29:01 lr 0.000819 time 1.8832 (2.2304) loss 3.4683 (3.7769) grad_norm 1.2414 (1.2006) [2022-01-20 06:03:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][480/1251] eta 0:28:39 lr 0.000819 time 1.9250 (2.2307) loss 4.1141 (3.7811) grad_norm 1.0153 (1.1986) [2022-01-20 06:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][490/1251] eta 0:28:15 lr 0.000819 time 2.3624 (2.2279) loss 4.1108 (3.7770) grad_norm 1.0892 (1.1979) [2022-01-20 06:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][500/1251] eta 0:27:52 lr 0.000819 time 1.7897 (2.2265) loss 3.8101 (3.7760) grad_norm 1.0410 (1.1979) [2022-01-20 06:05:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][510/1251] eta 0:27:32 lr 0.000819 time 2.6385 (2.2296) loss 3.2469 (3.7739) grad_norm 1.0677 (1.1973) [2022-01-20 06:05:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][520/1251] eta 0:27:09 lr 0.000819 time 1.7258 (2.2286) loss 3.6051 (3.7745) grad_norm 1.2575 (1.1971) [2022-01-20 06:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][530/1251] eta 0:26:45 lr 0.000819 time 2.1650 (2.2271) loss 4.1597 (3.7747) grad_norm 1.1196 (1.1971) [2022-01-20 06:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][540/1251] eta 0:26:20 lr 0.000819 time 1.8507 (2.2224) loss 2.6429 (3.7747) grad_norm 1.0450 (1.1981) [2022-01-20 06:06:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][550/1251] eta 0:25:56 lr 0.000819 time 2.6315 (2.2201) loss 3.5938 (3.7736) grad_norm 1.3675 (1.1991) [2022-01-20 06:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][560/1251] eta 0:25:33 lr 0.000819 time 2.4024 (2.2197) loss 4.1758 (3.7719) grad_norm 1.2575 (1.1994) [2022-01-20 06:07:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][570/1251] eta 0:25:11 lr 0.000819 time 2.2462 (2.2192) loss 3.7118 (3.7657) grad_norm 1.5300 (1.1995) [2022-01-20 06:07:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][580/1251] eta 0:24:48 lr 0.000819 time 1.3340 (2.2189) loss 2.9892 (3.7680) grad_norm 1.1826 (1.1983) [2022-01-20 06:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][590/1251] eta 0:24:25 lr 0.000819 time 1.9721 (2.2166) loss 3.4148 (3.7657) grad_norm 1.2800 (1.1991) [2022-01-20 06:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][600/1251] eta 0:24:02 lr 0.000819 time 1.8885 (2.2158) loss 3.8931 (3.7659) grad_norm 1.1341 (1.1995) [2022-01-20 06:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][610/1251] eta 0:23:40 lr 0.000819 time 2.7238 (2.2165) loss 3.4498 (3.7652) grad_norm 1.1246 (1.1989) [2022-01-20 06:09:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][620/1251] eta 0:23:18 lr 0.000819 time 1.8909 (2.2170) loss 4.3009 (3.7660) grad_norm 1.1285 (1.1983) [2022-01-20 06:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][630/1251] eta 0:22:57 lr 0.000819 time 1.5818 (2.2187) loss 2.8417 (3.7662) grad_norm 1.2794 (1.1983) [2022-01-20 06:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][640/1251] eta 0:22:36 lr 0.000818 time 2.2537 (2.2209) loss 3.4674 (3.7687) grad_norm 1.0172 (1.1967) [2022-01-20 06:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][650/1251] eta 0:22:14 lr 0.000818 time 
2.8172 (2.2211) loss 4.0994 (3.7649) grad_norm 1.1986 (1.1960) [2022-01-20 06:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][660/1251] eta 0:21:50 lr 0.000818 time 1.7046 (2.2176) loss 4.0893 (3.7669) grad_norm 1.0893 (1.1971) [2022-01-20 06:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][670/1251] eta 0:21:26 lr 0.000818 time 1.7372 (2.2134) loss 4.0380 (3.7719) grad_norm 1.1344 (1.1961) [2022-01-20 06:11:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][680/1251] eta 0:21:02 lr 0.000818 time 2.1832 (2.2106) loss 4.5396 (3.7726) grad_norm 1.1965 (1.1960) [2022-01-20 06:11:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][690/1251] eta 0:20:38 lr 0.000818 time 1.9505 (2.2078) loss 3.6155 (3.7737) grad_norm 0.9718 (1.1956) [2022-01-20 06:11:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][700/1251] eta 0:20:15 lr 0.000818 time 1.9899 (2.2063) loss 3.0196 (3.7752) grad_norm 1.0488 (1.1948) [2022-01-20 06:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][710/1251] eta 0:19:52 lr 0.000818 time 1.7164 (2.2043) loss 4.1145 (3.7753) grad_norm 1.0221 (1.1939) [2022-01-20 06:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][720/1251] eta 0:19:30 lr 0.000818 time 2.2195 (2.2041) loss 3.9219 (3.7727) grad_norm 1.1556 (1.1940) [2022-01-20 06:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][730/1251] eta 0:19:08 lr 0.000818 time 2.1607 (2.2052) loss 4.6056 (3.7770) grad_norm 1.1278 (1.1936) [2022-01-20 06:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][740/1251] eta 0:18:47 lr 0.000818 time 1.6069 (2.2063) loss 4.1243 (3.7783) grad_norm 1.0168 (1.1941) [2022-01-20 06:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][750/1251] eta 0:18:26 lr 0.000818 time 1.9562 (2.2082) loss 4.1276 (3.7790) grad_norm 1.0751 (1.1956) [2022-01-20 06:14:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][760/1251] eta 0:18:05 lr 0.000818 time 2.8761 (2.2112) loss 3.4282 (3.7789) grad_norm 1.3394 (1.1973) [2022-01-20 06:14:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][770/1251] eta 0:17:43 lr 0.000818 time 1.9719 (2.2115) loss 4.7281 (3.7748) grad_norm 1.3265 (1.1982) [2022-01-20 06:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][780/1251] eta 0:17:20 lr 0.000818 time 2.2692 (2.2100) loss 2.7977 (3.7752) grad_norm 1.2834 (1.2002) [2022-01-20 06:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][790/1251] eta 0:16:57 lr 0.000818 time 1.8341 (2.2078) loss 4.4177 (3.7768) grad_norm 1.0750 (1.2006) [2022-01-20 06:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][800/1251] eta 0:16:34 lr 0.000818 time 1.9806 (2.2053) loss 4.0609 (3.7779) grad_norm 1.0229 (1.2007) [2022-01-20 06:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][810/1251] eta 0:16:12 lr 0.000818 time 2.1976 (2.2063) loss 3.3900 (3.7789) grad_norm 1.1795 (1.2006) [2022-01-20 06:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][820/1251] eta 0:15:50 lr 0.000818 time 2.2413 (2.2063) loss 4.3975 (3.7818) grad_norm 1.3733 (1.2012) [2022-01-20 06:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][830/1251] eta 0:15:28 lr 0.000818 time 1.8699 (2.2066) loss 3.7508 (3.7825) grad_norm 1.0961 (1.2011) [2022-01-20 06:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][840/1251] eta 0:15:07 lr 0.000818 time 2.1915 (2.2070) loss 3.9951 (3.7815) grad_norm 1.1332 (1.2012) [2022-01-20 06:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][850/1251] eta 0:14:45 lr 0.000818 time 1.9041 (2.2081) loss 3.8802 (3.7821) grad_norm 1.2503 (1.2017) [2022-01-20 06:17:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][860/1251] eta 0:14:22 lr 0.000818 time 
1.6828 (2.2058) loss 4.2544 (3.7829) grad_norm 1.2480 (1.2022) [2022-01-20 06:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][870/1251] eta 0:13:59 lr 0.000818 time 1.7517 (2.2036) loss 2.9260 (3.7825) grad_norm 1.1943 (1.2017) [2022-01-20 06:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][880/1251] eta 0:13:37 lr 0.000818 time 2.1641 (2.2031) loss 3.7622 (3.7828) grad_norm 1.0708 (1.2012) [2022-01-20 06:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][890/1251] eta 0:13:16 lr 0.000818 time 2.4519 (2.2054) loss 4.1561 (3.7829) grad_norm 1.0908 (1.2010) [2022-01-20 06:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][900/1251] eta 0:12:54 lr 0.000818 time 2.2598 (2.2052) loss 4.3213 (3.7839) grad_norm 1.2266 (1.2004) [2022-01-20 06:19:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][910/1251] eta 0:12:31 lr 0.000818 time 1.9185 (2.2052) loss 3.7710 (3.7837) grad_norm 1.2647 (1.2002) [2022-01-20 06:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][920/1251] eta 0:12:10 lr 0.000818 time 2.5572 (2.2058) loss 3.9025 (3.7855) grad_norm 1.2809 (1.2005) [2022-01-20 06:20:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][930/1251] eta 0:11:47 lr 0.000818 time 1.5620 (2.2041) loss 4.3899 (3.7858) grad_norm 1.3476 (1.2011) [2022-01-20 06:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][940/1251] eta 0:11:24 lr 0.000818 time 1.6279 (2.2019) loss 3.6811 (3.7863) grad_norm 1.3560 (1.2013) [2022-01-20 06:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][950/1251] eta 0:11:02 lr 0.000817 time 2.6122 (2.2018) loss 3.8215 (3.7841) grad_norm 1.1888 (1.2018) [2022-01-20 06:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][960/1251] eta 0:10:40 lr 0.000817 time 1.8545 (2.2027) loss 4.6120 (3.7854) grad_norm 1.0766 (1.2015) [2022-01-20 06:21:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][970/1251] eta 0:10:19 lr 0.000817 time 2.3262 (2.2037) loss 4.5938 (3.7838) grad_norm 1.2086 (1.2027) [2022-01-20 06:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][980/1251] eta 0:09:57 lr 0.000817 time 2.1329 (2.2038) loss 4.2243 (3.7848) grad_norm 1.0955 (1.2021) [2022-01-20 06:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][990/1251] eta 0:09:34 lr 0.000817 time 1.8664 (2.2027) loss 4.0653 (3.7853) grad_norm 1.0909 (1.2023) [2022-01-20 06:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1000/1251] eta 0:09:12 lr 0.000817 time 2.1078 (2.2012) loss 2.8561 (3.7851) grad_norm 1.0314 (1.2024) [2022-01-20 06:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1010/1251] eta 0:08:50 lr 0.000817 time 2.8533 (2.2020) loss 3.9716 (3.7829) grad_norm 1.0920 (1.2029) [2022-01-20 06:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1020/1251] eta 0:08:28 lr 0.000817 time 2.6838 (2.2017) loss 4.3356 (3.7862) grad_norm 1.2783 (1.2033) [2022-01-20 06:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1030/1251] eta 0:08:06 lr 0.000817 time 2.7847 (2.2025) loss 3.3062 (3.7870) grad_norm 1.2517 (1.2036) [2022-01-20 06:24:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1040/1251] eta 0:07:44 lr 0.000817 time 1.8703 (2.2037) loss 4.5538 (3.7884) grad_norm 1.1028 (1.2032) [2022-01-20 06:24:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1050/1251] eta 0:07:23 lr 0.000817 time 2.1021 (2.2040) loss 4.1940 (3.7861) grad_norm 1.2473 (1.2031) [2022-01-20 06:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1060/1251] eta 0:07:00 lr 0.000817 time 1.8826 (2.2021) loss 3.7896 (3.7851) grad_norm 1.2491 (1.2031) [2022-01-20 06:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1070/1251] eta 0:06:38 lr 0.000817 
time 2.6366 (2.2000) loss 3.8031 (3.7852) grad_norm 1.0423 (1.2022) [2022-01-20 06:25:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1080/1251] eta 0:06:16 lr 0.000817 time 2.1829 (2.2002) loss 2.9785 (3.7841) grad_norm 1.4547 (1.2022) [2022-01-20 06:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1090/1251] eta 0:05:54 lr 0.000817 time 1.9265 (2.2007) loss 2.8058 (3.7832) grad_norm 1.0910 (1.2021) [2022-01-20 06:26:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1100/1251] eta 0:05:32 lr 0.000817 time 2.8698 (2.2021) loss 2.8517 (3.7842) grad_norm 1.1540 (1.2022) [2022-01-20 06:26:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1110/1251] eta 0:05:10 lr 0.000817 time 3.1057 (2.2030) loss 3.9621 (3.7836) grad_norm 1.2629 (1.2022) [2022-01-20 06:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1120/1251] eta 0:04:48 lr 0.000817 time 2.1083 (2.2028) loss 2.5438 (3.7828) grad_norm 1.3904 (1.2022) [2022-01-20 06:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1130/1251] eta 0:04:26 lr 0.000817 time 2.0664 (2.2001) loss 4.1012 (3.7824) grad_norm 1.3589 (1.2022) [2022-01-20 06:27:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1140/1251] eta 0:04:04 lr 0.000817 time 2.5118 (2.1991) loss 4.1292 (3.7827) grad_norm 1.0919 (1.2020) [2022-01-20 06:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1150/1251] eta 0:03:42 lr 0.000817 time 2.2977 (2.1986) loss 3.8221 (3.7819) grad_norm 1.1116 (1.2016) [2022-01-20 06:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1160/1251] eta 0:03:20 lr 0.000817 time 2.1276 (2.1989) loss 3.9721 (3.7831) grad_norm 1.1626 (1.2012) [2022-01-20 06:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1170/1251] eta 0:02:58 lr 0.000817 time 1.8721 (2.1979) loss 3.9571 (3.7859) grad_norm 1.3325 (1.2017) [2022-01-20 06:29:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1180/1251] eta 0:02:36 lr 0.000817 time 2.4339 (2.1983) loss 4.0012 (3.7863) grad_norm 1.2520 (1.2026) [2022-01-20 06:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1190/1251] eta 0:02:14 lr 0.000817 time 2.5645 (2.1989) loss 4.0787 (3.7839) grad_norm 1.0527 (1.2022) [2022-01-20 06:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1200/1251] eta 0:01:52 lr 0.000817 time 1.6828 (2.1989) loss 4.4710 (3.7821) grad_norm 1.2493 (1.2024) [2022-01-20 06:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1210/1251] eta 0:01:30 lr 0.000817 time 2.1617 (2.1985) loss 4.2149 (3.7816) grad_norm 1.3878 (1.2029) [2022-01-20 06:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1220/1251] eta 0:01:08 lr 0.000817 time 3.1495 (2.2005) loss 4.0143 (3.7809) grad_norm 1.0664 (1.2028) [2022-01-20 06:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1230/1251] eta 0:00:46 lr 0.000817 time 1.8772 (2.2008) loss 3.4074 (3.7805) grad_norm 1.2498 (1.2026) [2022-01-20 06:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1240/1251] eta 0:00:24 lr 0.000817 time 1.4454 (2.1986) loss 3.9241 (3.7816) grad_norm 1.3292 (1.2024) [2022-01-20 06:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [84/300][1250/1251] eta 0:00:02 lr 0.000817 time 1.2227 (2.1925) loss 4.0268 (3.7810) grad_norm 1.1127 (1.2020) [2022-01-20 06:31:49 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 84 training takes 0:45:43 [2022-01-20 06:32:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.777 (18.777) Loss 1.1787 (1.1787) Acc@1 73.535 (73.535) Acc@5 90.723 (90.723) [2022-01-20 06:32:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.501 (3.462) Loss 1.2107 (1.2016) Acc@1 71.680 (72.541) Acc@5 91.406 (91.388) [2022-01-20 06:32:42 swin_tiny_patch4_window7_224] 
(main.py 239): INFO Test: [20/49] Time 0.317 (2.547) Loss 1.2408 (1.2037) Acc@1 71.875 (72.698) Acc@5 90.527 (91.420) [2022-01-20 06:32:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.179 (2.242) Loss 1.1933 (1.2004) Acc@1 73.828 (72.883) Acc@5 91.992 (91.457) [2022-01-20 06:33:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.249 (2.205) Loss 1.1138 (1.1971) Acc@1 76.074 (72.942) Acc@5 91.895 (91.556) [2022-01-20 06:33:26 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.882 Acc@5 91.498 [2022-01-20 06:33:26 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.9% [2022-01-20 06:33:26 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 72.94% [2022-01-20 06:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][0/1251] eta 7:33:59 lr 0.000817 time 21.7742 (21.7742) loss 2.7172 (2.7172) grad_norm 1.4122 (1.4122) [2022-01-20 06:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][10/1251] eta 1:23:38 lr 0.000816 time 2.8919 (4.0439) loss 3.8288 (3.5676) grad_norm 0.9456 (1.1699) [2022-01-20 06:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][20/1251] eta 1:04:44 lr 0.000816 time 1.5437 (3.1558) loss 3.4302 (3.7384) grad_norm 1.0252 (1.1601) [2022-01-20 06:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][30/1251] eta 0:57:42 lr 0.000816 time 1.8997 (2.8357) loss 3.1745 (3.7713) grad_norm 1.4341 (1.1582) [2022-01-20 06:35:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][40/1251] eta 0:54:12 lr 0.000816 time 3.9991 (2.6861) loss 4.1410 (3.7624) grad_norm 1.1671 (1.1582) [2022-01-20 06:35:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][50/1251] eta 0:51:53 lr 0.000816 time 2.0868 (2.5921) loss 4.5318 (3.7857) grad_norm 1.0941 (1.1538) [2022-01-20 06:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][60/1251] 
eta 0:49:32 lr 0.000816 time 1.2867 (2.4959) loss 4.6365 (3.7891) grad_norm 1.3535 (1.1685) [2022-01-20 06:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][70/1251] eta 0:48:11 lr 0.000816 time 1.5076 (2.4480) loss 4.2864 (3.7789) grad_norm 1.2522 (1.1823) [2022-01-20 06:36:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][80/1251] eta 0:47:06 lr 0.000816 time 2.7969 (2.4140) loss 4.1803 (3.7789) grad_norm 1.3452 (1.1926) [2022-01-20 06:37:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][90/1251] eta 0:46:19 lr 0.000816 time 2.2813 (2.3937) loss 3.4674 (3.7665) grad_norm 1.2440 (1.2071) [2022-01-20 06:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][100/1251] eta 0:45:35 lr 0.000816 time 1.4552 (2.3766) loss 2.6842 (3.7600) grad_norm 1.0681 (1.2099) [2022-01-20 06:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][110/1251] eta 0:44:58 lr 0.000816 time 1.8754 (2.3652) loss 3.9579 (3.7747) grad_norm 1.3132 (1.2108) [2022-01-20 06:38:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][120/1251] eta 0:44:29 lr 0.000816 time 2.6038 (2.3600) loss 4.1326 (3.7730) grad_norm 1.2319 (1.2118) [2022-01-20 06:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][130/1251] eta 0:44:15 lr 0.000816 time 2.9983 (2.3686) loss 3.4976 (3.7775) grad_norm 1.2746 (1.2084) [2022-01-20 06:38:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][140/1251] eta 0:43:38 lr 0.000816 time 1.8005 (2.3573) loss 2.3187 (3.7436) grad_norm 1.4855 (1.2054) [2022-01-20 06:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][150/1251] eta 0:42:57 lr 0.000816 time 2.1895 (2.3414) loss 3.7950 (3.7479) grad_norm 1.0960 (1.2025) [2022-01-20 06:39:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][160/1251] eta 0:42:05 lr 0.000816 time 2.0354 (2.3145) loss 4.3039 (3.7454) grad_norm 1.1757 (1.2037) [2022-01-20 06:39:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][170/1251] eta 0:41:20 lr 0.000816 time 1.8166 (2.2949) loss 4.1107 (3.7559) grad_norm 1.4185 (1.2035) [2022-01-20 06:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][180/1251] eta 0:40:45 lr 0.000816 time 1.8792 (2.2835) loss 4.1101 (3.7596) grad_norm 1.1562 (1.2015) [2022-01-20 06:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][190/1251] eta 0:40:08 lr 0.000816 time 2.0426 (2.2704) loss 4.0263 (3.7525) grad_norm 1.1703 (1.2012) [2022-01-20 06:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][200/1251] eta 0:39:39 lr 0.000816 time 1.9515 (2.2642) loss 4.0094 (3.7604) grad_norm 1.2875 (1.2020) [2022-01-20 06:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][210/1251] eta 0:39:14 lr 0.000816 time 1.9760 (2.2615) loss 3.7797 (3.7495) grad_norm 1.1282 (1.1999) [2022-01-20 06:41:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][220/1251] eta 0:38:55 lr 0.000816 time 2.4673 (2.2649) loss 2.7184 (3.7501) grad_norm 1.2758 (1.2014) [2022-01-20 06:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][230/1251] eta 0:38:44 lr 0.000816 time 2.9338 (2.2769) loss 3.3722 (3.7457) grad_norm 1.8667 (1.2079) [2022-01-20 06:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][240/1251] eta 0:38:27 lr 0.000816 time 2.1263 (2.2823) loss 4.0992 (3.7464) grad_norm 1.1405 (1.2083) [2022-01-20 06:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][250/1251] eta 0:38:12 lr 0.000816 time 2.0436 (2.2898) loss 4.0099 (3.7435) grad_norm 1.2265 (1.2094) [2022-01-20 06:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][260/1251] eta 0:37:47 lr 0.000816 time 1.9068 (2.2877) loss 3.5576 (3.7448) grad_norm 1.2535 (1.2112) [2022-01-20 06:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][270/1251] eta 0:37:09 lr 0.000816 time 
1.9854 (2.2727) loss 3.0918 (3.7460) grad_norm 1.1055 (1.2119) [2022-01-20 06:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][280/1251] eta 0:36:33 lr 0.000816 time 1.7701 (2.2594) loss 3.5145 (3.7508) grad_norm 1.1467 (1.2096) [2022-01-20 06:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][290/1251] eta 0:36:03 lr 0.000816 time 1.7899 (2.2516) loss 3.5694 (3.7530) grad_norm 1.2591 (1.2106) [2022-01-20 06:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][300/1251] eta 0:35:38 lr 0.000816 time 2.6090 (2.2482) loss 4.6159 (3.7620) grad_norm 1.0973 (1.2120) [2022-01-20 06:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][310/1251] eta 0:35:14 lr 0.000816 time 2.2271 (2.2468) loss 4.1874 (3.7561) grad_norm 1.2294 (1.2096) [2022-01-20 06:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][320/1251] eta 0:34:49 lr 0.000815 time 1.8429 (2.2447) loss 4.0904 (3.7598) grad_norm 0.9952 (1.2060) [2022-01-20 06:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][330/1251] eta 0:34:32 lr 0.000815 time 1.4699 (2.2498) loss 4.7602 (3.7664) grad_norm 1.0539 (1.2036) [2022-01-20 06:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][340/1251] eta 0:34:19 lr 0.000815 time 3.0807 (2.2610) loss 3.2855 (3.7680) grad_norm 1.2611 (1.2034) [2022-01-20 06:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][350/1251] eta 0:33:56 lr 0.000815 time 1.6765 (2.2603) loss 4.5020 (3.7658) grad_norm 1.1924 (1.2015) [2022-01-20 06:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][360/1251] eta 0:33:33 lr 0.000815 time 1.9881 (2.2595) loss 2.8461 (3.7634) grad_norm 1.0277 (1.2009) [2022-01-20 06:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][370/1251] eta 0:33:03 lr 0.000815 time 2.0795 (2.2518) loss 4.7590 (3.7644) grad_norm 1.1686 (1.2028) [2022-01-20 06:47:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][380/1251] eta 0:32:35 lr 0.000815 time 2.2559 (2.2453) loss 4.2737 (3.7676) grad_norm 1.0873 (1.2041) [2022-01-20 06:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][390/1251] eta 0:32:10 lr 0.000815 time 2.3829 (2.2427) loss 4.1918 (3.7688) grad_norm 1.2706 (1.2056) [2022-01-20 06:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][400/1251] eta 0:31:48 lr 0.000815 time 1.8360 (2.2426) loss 4.0498 (3.7688) grad_norm 1.2284 (1.2081) [2022-01-20 06:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][410/1251] eta 0:31:23 lr 0.000815 time 1.8754 (2.2401) loss 3.6575 (3.7695) grad_norm 1.3488 (1.2091) [2022-01-20 06:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][420/1251] eta 0:31:02 lr 0.000815 time 2.3067 (2.2407) loss 3.0247 (3.7713) grad_norm 1.1472 (1.2109) [2022-01-20 06:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][430/1251] eta 0:30:39 lr 0.000815 time 2.0644 (2.2401) loss 3.5581 (3.7675) grad_norm 1.1642 (1.2118) [2022-01-20 06:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][440/1251] eta 0:30:18 lr 0.000815 time 2.5802 (2.2424) loss 3.7901 (3.7637) grad_norm 1.4566 (1.2110) [2022-01-20 06:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][450/1251] eta 0:29:57 lr 0.000815 time 1.9504 (2.2437) loss 4.0597 (3.7654) grad_norm 1.2718 (1.2113) [2022-01-20 06:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][460/1251] eta 0:29:33 lr 0.000815 time 1.8721 (2.2417) loss 2.9334 (3.7610) grad_norm 1.3386 (1.2116) [2022-01-20 06:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][470/1251] eta 0:29:07 lr 0.000815 time 1.7645 (2.2382) loss 3.0220 (3.7599) grad_norm 1.3702 (1.2125) [2022-01-20 06:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][480/1251] eta 0:28:43 lr 0.000815 time 
2.8030 (2.2349) loss 3.6103 (3.7549) grad_norm 1.2897 (1.2131) [2022-01-20 06:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][490/1251] eta 0:28:17 lr 0.000815 time 1.9708 (2.2301) loss 4.0208 (3.7555) grad_norm 1.1080 (1.2123) [2022-01-20 06:52:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][500/1251] eta 0:27:52 lr 0.000815 time 1.8077 (2.2268) loss 3.3648 (3.7567) grad_norm 1.1355 (1.2113) [2022-01-20 06:52:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][510/1251] eta 0:27:27 lr 0.000815 time 1.9280 (2.2236) loss 4.0690 (3.7566) grad_norm 1.1989 (1.2112) [2022-01-20 06:52:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][520/1251] eta 0:27:06 lr 0.000815 time 2.1609 (2.2255) loss 4.1450 (3.7588) grad_norm 1.3154 (1.2121) [2022-01-20 06:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][530/1251] eta 0:26:45 lr 0.000815 time 1.5889 (2.2262) loss 2.7617 (3.7569) grad_norm 1.0560 (1.2121) [2022-01-20 06:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][540/1251] eta 0:26:22 lr 0.000815 time 2.2054 (2.2251) loss 2.7710 (3.7540) grad_norm 1.1430 (1.2159) [2022-01-20 06:53:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][550/1251] eta 0:25:59 lr 0.000815 time 1.7418 (2.2248) loss 3.7428 (3.7536) grad_norm 1.1259 (1.2157) [2022-01-20 06:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][560/1251] eta 0:25:37 lr 0.000815 time 2.8465 (2.2253) loss 3.9108 (3.7539) grad_norm 1.0526 (1.2162) [2022-01-20 06:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][570/1251] eta 0:25:15 lr 0.000815 time 1.8814 (2.2253) loss 4.2231 (3.7571) grad_norm 1.3162 (1.2152) [2022-01-20 06:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][580/1251] eta 0:24:53 lr 0.000815 time 2.2494 (2.2263) loss 4.0844 (3.7626) grad_norm 1.1952 (1.2148) [2022-01-20 06:55:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][590/1251] eta 0:24:33 lr 0.000815 time 1.5553 (2.2288) loss 4.3130 (3.7643) grad_norm 1.1613 (1.2144) [2022-01-20 06:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][600/1251] eta 0:24:10 lr 0.000815 time 1.5852 (2.2282) loss 3.2313 (3.7604) grad_norm 1.0033 (1.2151) [2022-01-20 06:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][610/1251] eta 0:23:47 lr 0.000815 time 1.9376 (2.2271) loss 3.0472 (3.7595) grad_norm 1.1163 (1.2166) [2022-01-20 06:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][620/1251] eta 0:23:23 lr 0.000815 time 1.9126 (2.2242) loss 4.3996 (3.7634) grad_norm 1.0058 (1.2167) [2022-01-20 06:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][630/1251] eta 0:23:00 lr 0.000814 time 1.6927 (2.2224) loss 3.6034 (3.7627) grad_norm 1.1521 (1.2157) [2022-01-20 06:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][640/1251] eta 0:22:36 lr 0.000814 time 1.5194 (2.2198) loss 3.6867 (3.7669) grad_norm 1.1122 (1.2145) [2022-01-20 06:57:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][650/1251] eta 0:22:14 lr 0.000814 time 1.9284 (2.2198) loss 3.6124 (3.7647) grad_norm 1.2230 (1.2146) [2022-01-20 06:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][660/1251] eta 0:21:52 lr 0.000814 time 1.7754 (2.2207) loss 4.1360 (3.7630) grad_norm 1.2410 (1.2153) [2022-01-20 06:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][670/1251] eta 0:21:30 lr 0.000814 time 2.9103 (2.2217) loss 3.2138 (3.7609) grad_norm 1.1735 (1.2151) [2022-01-20 06:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][680/1251] eta 0:21:06 lr 0.000814 time 1.9113 (2.2189) loss 3.9016 (3.7599) grad_norm 1.3116 (1.2155) [2022-01-20 06:58:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][690/1251] eta 0:20:43 lr 0.000814 time 
1.8659 (2.2168) loss 3.8411 (3.7641) grad_norm 1.1759 (1.2160) [2022-01-20 06:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][700/1251] eta 0:20:21 lr 0.000814 time 1.9541 (2.2166) loss 4.5498 (3.7616) grad_norm 1.0694 (1.2152) [2022-01-20 06:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][710/1251] eta 0:19:59 lr 0.000814 time 2.7646 (2.2180) loss 3.7435 (3.7617) grad_norm 1.1294 (1.2145) [2022-01-20 07:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][720/1251] eta 0:19:37 lr 0.000814 time 1.9260 (2.2171) loss 4.1660 (3.7618) grad_norm 1.0953 (1.2139) [2022-01-20 07:00:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][730/1251] eta 0:19:15 lr 0.000814 time 2.2068 (2.2172) loss 3.5705 (3.7601) grad_norm 1.0535 (1.2143) [2022-01-20 07:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][740/1251] eta 0:18:52 lr 0.000814 time 1.8436 (2.2164) loss 3.9918 (3.7597) grad_norm 1.2120 (1.2136) [2022-01-20 07:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][750/1251] eta 0:18:30 lr 0.000814 time 2.3940 (2.2169) loss 4.2560 (3.7619) grad_norm 1.0830 (1.2139) [2022-01-20 07:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][760/1251] eta 0:18:07 lr 0.000814 time 2.0112 (2.2146) loss 4.2442 (3.7649) grad_norm 1.1133 (1.2141) [2022-01-20 07:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][770/1251] eta 0:17:44 lr 0.000814 time 2.2920 (2.2125) loss 2.6936 (3.7617) grad_norm 1.2561 (1.2141) [2022-01-20 07:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][780/1251] eta 0:17:21 lr 0.000814 time 1.6812 (2.2107) loss 2.9015 (3.7613) grad_norm 1.2630 (1.2131) [2022-01-20 07:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][790/1251] eta 0:16:59 lr 0.000814 time 2.1959 (2.2112) loss 4.4596 (3.7599) grad_norm 1.1653 (1.2133) [2022-01-20 07:02:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][800/1251] eta 0:16:37 lr 0.000814 time 1.9373 (2.2124) loss 3.7753 (3.7588) grad_norm 1.0690 (1.2128) [2022-01-20 07:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][810/1251] eta 0:16:16 lr 0.000814 time 2.2604 (2.2143) loss 4.1311 (3.7569) grad_norm 1.1136 (1.2129) [2022-01-20 07:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][820/1251] eta 0:15:53 lr 0.000814 time 1.7186 (2.2133) loss 3.2905 (3.7576) grad_norm 1.1796 (1.2127) [2022-01-20 07:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][830/1251] eta 0:15:31 lr 0.000814 time 2.6058 (2.2136) loss 3.9529 (3.7600) grad_norm 1.2711 (1.2124) [2022-01-20 07:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][840/1251] eta 0:15:09 lr 0.000814 time 1.7634 (2.2122) loss 3.3881 (3.7594) grad_norm 1.0939 (1.2125) [2022-01-20 07:04:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][850/1251] eta 0:14:47 lr 0.000814 time 1.7444 (2.2121) loss 4.1021 (3.7560) grad_norm 1.0720 (1.2114) [2022-01-20 07:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][860/1251] eta 0:14:24 lr 0.000814 time 2.1187 (2.2115) loss 4.2464 (3.7543) grad_norm 1.1723 (1.2108) [2022-01-20 07:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][870/1251] eta 0:14:02 lr 0.000814 time 2.5442 (2.2116) loss 3.2939 (3.7522) grad_norm 1.1093 (1.2096) [2022-01-20 07:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][880/1251] eta 0:13:40 lr 0.000814 time 1.9341 (2.2116) loss 3.3135 (3.7519) grad_norm 1.2007 (1.2107) [2022-01-20 07:06:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][890/1251] eta 0:13:17 lr 0.000814 time 2.0672 (2.2102) loss 4.1133 (3.7524) grad_norm 1.2531 (1.2106) [2022-01-20 07:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][900/1251] eta 0:12:55 lr 0.000814 time 
1.8949 (2.2089) loss 4.2306 (3.7549) grad_norm 1.1842 (1.2107) [2022-01-20 07:06:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][910/1251] eta 0:12:32 lr 0.000814 time 2.3340 (2.2078) loss 3.5054 (3.7558) grad_norm 1.2651 (1.2101) [2022-01-20 07:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][920/1251] eta 0:12:10 lr 0.000814 time 1.5589 (2.2055) loss 4.3156 (3.7588) grad_norm 1.1543 (1.2098) [2022-01-20 07:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][930/1251] eta 0:11:47 lr 0.000814 time 1.8162 (2.2052) loss 2.5968 (3.7584) grad_norm 1.4359 (1.2092) [2022-01-20 07:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][940/1251] eta 0:11:25 lr 0.000813 time 2.5056 (2.2048) loss 3.7091 (3.7581) grad_norm 1.2194 (1.2092) [2022-01-20 07:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][950/1251] eta 0:11:03 lr 0.000813 time 2.0332 (2.2051) loss 4.1647 (3.7580) grad_norm 1.3763 (1.2096) [2022-01-20 07:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][960/1251] eta 0:10:41 lr 0.000813 time 1.9230 (2.2053) loss 2.6123 (3.7554) grad_norm 1.2657 (1.2104) [2022-01-20 07:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][970/1251] eta 0:10:20 lr 0.000813 time 1.8910 (2.2080) loss 4.3728 (3.7571) grad_norm 1.2901 (1.2107) [2022-01-20 07:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][980/1251] eta 0:09:58 lr 0.000813 time 2.2261 (2.2100) loss 2.4842 (3.7590) grad_norm 1.5063 (1.2109) [2022-01-20 07:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][990/1251] eta 0:09:36 lr 0.000813 time 2.3989 (2.2099) loss 4.1136 (3.7609) grad_norm 1.3812 (1.2117) [2022-01-20 07:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1000/1251] eta 0:09:13 lr 0.000813 time 1.8857 (2.2071) loss 3.1166 (3.7568) grad_norm 1.1585 (1.2120) [2022-01-20 07:10:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1010/1251] eta 0:08:51 lr 0.000813 time 1.8631 (2.2049) loss 3.7920 (3.7562) grad_norm 1.3826 (1.2118) [2022-01-20 07:10:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1020/1251] eta 0:08:29 lr 0.000813 time 1.9877 (2.2037) loss 3.7299 (3.7568) grad_norm 1.4965 (1.2115) [2022-01-20 07:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1030/1251] eta 0:08:06 lr 0.000813 time 2.0905 (2.2032) loss 4.1760 (3.7567) grad_norm 1.5764 (1.2116) [2022-01-20 07:11:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1040/1251] eta 0:07:45 lr 0.000813 time 2.2567 (2.2039) loss 4.0540 (3.7569) grad_norm 1.4371 (1.2115) [2022-01-20 07:12:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1050/1251] eta 0:07:22 lr 0.000813 time 2.4817 (2.2038) loss 4.7197 (3.7580) grad_norm 1.0661 (1.2107) [2022-01-20 07:12:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1060/1251] eta 0:07:00 lr 0.000813 time 1.7757 (2.2036) loss 3.4293 (3.7612) grad_norm 1.0556 (1.2101) [2022-01-20 07:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1070/1251] eta 0:06:39 lr 0.000813 time 2.0177 (2.2061) loss 2.6786 (3.7598) grad_norm 1.2457 (1.2093) [2022-01-20 07:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1080/1251] eta 0:06:17 lr 0.000813 time 2.7978 (2.2073) loss 4.0542 (3.7602) grad_norm 1.1476 (1.2094) [2022-01-20 07:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1090/1251] eta 0:05:55 lr 0.000813 time 1.5202 (2.2065) loss 4.5729 (3.7594) grad_norm 1.2397 (1.2096) [2022-01-20 07:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1100/1251] eta 0:05:32 lr 0.000813 time 1.7825 (2.2051) loss 4.3373 (3.7594) grad_norm 1.5072 (1.2102) [2022-01-20 07:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1110/1251] eta 0:05:10 lr 
0.000813 time 2.0706 (2.2034) loss 4.3029 (3.7602) grad_norm 1.1448 (1.2105) [2022-01-20 07:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1120/1251] eta 0:04:48 lr 0.000813 time 1.9144 (2.2026) loss 4.3480 (3.7630) grad_norm 1.3449 (1.2108) [2022-01-20 07:14:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1130/1251] eta 0:04:26 lr 0.000813 time 2.4932 (2.2029) loss 4.0059 (3.7647) grad_norm 1.3364 (1.2101) [2022-01-20 07:15:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1140/1251] eta 0:04:04 lr 0.000813 time 2.5449 (2.2027) loss 3.4663 (3.7645) grad_norm 1.2719 (1.2099) [2022-01-20 07:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1150/1251] eta 0:03:42 lr 0.000813 time 2.4540 (2.2027) loss 4.4414 (3.7667) grad_norm 1.2660 (1.2102) [2022-01-20 07:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1160/1251] eta 0:03:20 lr 0.000813 time 2.0792 (2.2028) loss 4.0723 (3.7634) grad_norm 1.2637 (1.2094) [2022-01-20 07:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1170/1251] eta 0:02:58 lr 0.000813 time 2.2130 (2.2026) loss 3.4169 (3.7629) grad_norm 1.1340 (1.2091) [2022-01-20 07:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1180/1251] eta 0:02:36 lr 0.000813 time 2.1534 (2.2032) loss 2.7919 (3.7625) grad_norm 1.1047 (1.2088) [2022-01-20 07:17:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1190/1251] eta 0:02:14 lr 0.000813 time 2.4874 (2.2037) loss 4.0823 (3.7635) grad_norm 1.1186 (1.2092) [2022-01-20 07:17:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1200/1251] eta 0:01:52 lr 0.000813 time 1.6696 (2.2028) loss 4.1038 (3.7626) grad_norm 1.1317 (1.2092) [2022-01-20 07:17:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1210/1251] eta 0:01:30 lr 0.000813 time 1.9986 (2.2006) loss 2.3981 (3.7617) grad_norm 1.1231 (1.2101) [2022-01-20 07:18:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1220/1251] eta 0:01:08 lr 0.000813 time 2.5696 (2.2013) loss 4.1549 (3.7642) grad_norm 1.0968 (1.2094) [2022-01-20 07:18:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1230/1251] eta 0:00:46 lr 0.000813 time 1.6913 (2.2011) loss 4.2625 (3.7620) grad_norm 1.2329 (1.2092) [2022-01-20 07:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1240/1251] eta 0:00:24 lr 0.000813 time 1.5427 (2.1998) loss 4.0355 (3.7636) grad_norm 1.2720 (1.2091) [2022-01-20 07:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [85/300][1250/1251] eta 0:00:02 lr 0.000812 time 1.2175 (2.1941) loss 4.0842 (3.7639) grad_norm 0.9484 (1.2082) [2022-01-20 07:19:11 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 85 training takes 0:45:45 [2022-01-20 07:19:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.494 (18.494) Loss 1.1592 (1.1592) Acc@1 71.191 (71.191) Acc@5 92.188 (92.188) [2022-01-20 07:19:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.556 (3.365) Loss 1.1456 (1.1621) Acc@1 74.219 (72.940) Acc@5 91.895 (91.939) [2022-01-20 07:20:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.925 (2.574) Loss 1.1347 (1.1661) Acc@1 73.535 (72.959) Acc@5 92.969 (91.783) [2022-01-20 07:20:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.486 (2.219) Loss 1.1491 (1.1735) Acc@1 73.926 (72.855) Acc@5 91.602 (91.728) [2022-01-20 07:20:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.576 (2.146) Loss 1.2041 (1.1704) Acc@1 71.387 (73.023) Acc@5 92.090 (91.730) [2022-01-20 07:20:48 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.018 Acc@5 91.758 [2022-01-20 07:20:48 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-01-20 07:20:48 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 
73.02% [2022-01-20 07:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][0/1251] eta 7:22:13 lr 0.000812 time 21.2100 (21.2100) loss 4.1619 (4.1619) grad_norm 1.2232 (1.2232) [2022-01-20 07:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][10/1251] eta 1:26:13 lr 0.000812 time 2.1759 (4.1687) loss 3.5965 (3.8532) grad_norm 1.1028 (1.2434) [2022-01-20 07:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][20/1251] eta 1:05:56 lr 0.000812 time 1.3382 (3.2144) loss 3.2341 (3.8314) grad_norm 1.0934 (1.2006) [2022-01-20 07:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][30/1251] eta 0:57:56 lr 0.000812 time 1.4100 (2.8472) loss 4.7365 (3.9318) grad_norm 1.1537 (1.1961) [2022-01-20 07:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][40/1251] eta 0:56:07 lr 0.000812 time 3.9355 (2.7810) loss 4.3859 (3.8482) grad_norm 1.4060 (1.1966) [2022-01-20 07:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][50/1251] eta 0:53:48 lr 0.000812 time 2.7391 (2.6882) loss 3.0840 (3.8463) grad_norm 1.2198 (1.1896) [2022-01-20 07:23:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][60/1251] eta 0:51:26 lr 0.000812 time 1.6191 (2.5913) loss 3.6288 (3.7977) grad_norm 1.1461 (1.1931) [2022-01-20 07:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][70/1251] eta 0:49:39 lr 0.000812 time 1.9994 (2.5224) loss 4.3015 (3.8153) grad_norm 1.1174 (1.1984) [2022-01-20 07:24:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][80/1251] eta 0:48:20 lr 0.000812 time 3.0703 (2.4765) loss 4.0670 (3.8090) grad_norm 1.4026 (1.1980) [2022-01-20 07:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][90/1251] eta 0:47:17 lr 0.000812 time 2.5393 (2.4438) loss 4.2878 (3.8325) grad_norm 1.2195 (1.2099) [2022-01-20 07:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][100/1251] eta 0:46:07 lr 
0.000812 time 1.5889 (2.4048) loss 4.2112 (3.8370) grad_norm 1.2641 (1.2143) [2022-01-20 07:25:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][110/1251] eta 0:45:07 lr 0.000812 time 2.2095 (2.3727) loss 3.8588 (3.8370) grad_norm 1.1693 (1.2193) [2022-01-20 07:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][120/1251] eta 0:44:24 lr 0.000812 time 2.8192 (2.3561) loss 3.8403 (3.8276) grad_norm 1.2959 (1.2179) [2022-01-20 07:25:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][130/1251] eta 0:43:39 lr 0.000812 time 1.9272 (2.3364) loss 3.9706 (3.8261) grad_norm 1.1308 (1.2155) [2022-01-20 07:26:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][140/1251] eta 0:43:06 lr 0.000812 time 2.1858 (2.3279) loss 3.7438 (3.8308) grad_norm 1.1198 (1.2116) [2022-01-20 07:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][150/1251] eta 0:42:34 lr 0.000812 time 2.5158 (2.3205) loss 3.0680 (3.8080) grad_norm 1.1202 (1.2092) [2022-01-20 07:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][160/1251] eta 0:42:15 lr 0.000812 time 2.5889 (2.3240) loss 3.8812 (3.8001) grad_norm 1.1439 (1.2072) [2022-01-20 07:27:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][170/1251] eta 0:42:01 lr 0.000812 time 2.7490 (2.3327) loss 4.4839 (3.7940) grad_norm 1.3192 (1.2090) [2022-01-20 07:27:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][180/1251] eta 0:41:29 lr 0.000812 time 1.7151 (2.3242) loss 3.7962 (3.8115) grad_norm 1.4922 (1.2126) [2022-01-20 07:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][190/1251] eta 0:40:58 lr 0.000812 time 2.5352 (2.3167) loss 3.6352 (3.8054) grad_norm 1.0927 (1.2070) [2022-01-20 07:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][200/1251] eta 0:40:11 lr 0.000812 time 1.7379 (2.2940) loss 4.3481 (3.8047) grad_norm 1.2742 (1.2070) [2022-01-20 07:28:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][210/1251] eta 0:39:32 lr 0.000812 time 1.8909 (2.2794) loss 3.5840 (3.8114) grad_norm 1.2263 (1.2058) [2022-01-20 07:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][220/1251] eta 0:39:01 lr 0.000812 time 2.4510 (2.2715) loss 3.3520 (3.8175) grad_norm 1.3280 (1.2081) [2022-01-20 07:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][230/1251] eta 0:38:31 lr 0.000812 time 2.4388 (2.2640) loss 4.1462 (3.8221) grad_norm 1.1256 (1.2084) [2022-01-20 07:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][240/1251] eta 0:38:01 lr 0.000812 time 2.0575 (2.2563) loss 4.3686 (3.8263) grad_norm 1.2383 (1.2072) [2022-01-20 07:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][250/1251] eta 0:37:37 lr 0.000812 time 1.8423 (2.2548) loss 3.8934 (3.8288) grad_norm 1.1769 (1.2062) [2022-01-20 07:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][260/1251] eta 0:37:11 lr 0.000812 time 2.5734 (2.2516) loss 4.3194 (3.8373) grad_norm 1.4050 (1.2071) [2022-01-20 07:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][270/1251] eta 0:36:50 lr 0.000812 time 2.3477 (2.2533) loss 4.0756 (3.8387) grad_norm 1.2266 (1.2076) [2022-01-20 07:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][280/1251] eta 0:36:29 lr 0.000812 time 1.9549 (2.2544) loss 3.4053 (3.8448) grad_norm 1.0162 (1.2051) [2022-01-20 07:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][290/1251] eta 0:36:09 lr 0.000812 time 3.0190 (2.2571) loss 3.9891 (3.8462) grad_norm 1.0906 (1.2041) [2022-01-20 07:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][300/1251] eta 0:35:44 lr 0.000811 time 2.2733 (2.2552) loss 4.1554 (3.8460) grad_norm 1.1092 (1.2017) [2022-01-20 07:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][310/1251] eta 0:35:19 lr 0.000811 time 
1.9240 (2.2521) loss 3.3388 (3.8420) grad_norm 1.0648 (1.2012) [2022-01-20 07:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][320/1251] eta 0:34:55 lr 0.000811 time 2.4979 (2.2508) loss 3.2451 (3.8367) grad_norm 1.0849 (1.1998) [2022-01-20 07:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][330/1251] eta 0:34:30 lr 0.000811 time 1.5831 (2.2485) loss 3.1635 (3.8303) grad_norm 1.0908 (1.1994) [2022-01-20 07:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][340/1251] eta 0:34:17 lr 0.000811 time 4.0505 (2.2581) loss 4.0715 (3.8324) grad_norm 1.1945 (1.1996) [2022-01-20 07:34:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][350/1251] eta 0:33:56 lr 0.000811 time 1.7276 (2.2601) loss 3.6955 (3.8315) grad_norm 1.1205 (1.1982) [2022-01-20 07:34:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][360/1251] eta 0:33:32 lr 0.000811 time 1.6947 (2.2590) loss 3.7826 (3.8212) grad_norm 1.2792 (1.1999) [2022-01-20 07:34:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][370/1251] eta 0:33:05 lr 0.000811 time 2.2913 (2.2540) loss 3.0960 (3.8239) grad_norm 1.2707 (1.2026) [2022-01-20 07:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][380/1251] eta 0:32:43 lr 0.000811 time 3.7563 (2.2541) loss 3.6511 (3.8208) grad_norm 1.1002 (1.2016) [2022-01-20 07:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][390/1251] eta 0:32:12 lr 0.000811 time 1.6822 (2.2448) loss 2.9450 (3.8214) grad_norm 1.3626 (1.2046) [2022-01-20 07:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][400/1251] eta 0:31:47 lr 0.000811 time 2.4963 (2.2417) loss 4.0583 (3.8192) grad_norm 1.2472 (1.2061) [2022-01-20 07:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][410/1251] eta 0:31:20 lr 0.000811 time 1.8206 (2.2365) loss 3.3695 (3.8213) grad_norm 1.1068 (1.2065) [2022-01-20 07:36:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][420/1251] eta 0:30:57 lr 0.000811 time 2.4896 (2.2350) loss 2.9610 (3.8146) grad_norm 1.0790 (1.2061) [2022-01-20 07:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][430/1251] eta 0:30:34 lr 0.000811 time 1.9445 (2.2347) loss 3.2052 (3.8123) grad_norm 1.3669 (1.2067) [2022-01-20 07:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][440/1251] eta 0:30:12 lr 0.000811 time 2.4304 (2.2345) loss 4.1867 (3.8066) grad_norm 1.3822 (1.2091) [2022-01-20 07:37:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][450/1251] eta 0:29:47 lr 0.000811 time 1.6652 (2.2319) loss 4.2258 (3.8075) grad_norm 1.3479 (1.2089) [2022-01-20 07:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][460/1251] eta 0:29:25 lr 0.000811 time 2.2080 (2.2325) loss 3.5462 (3.8013) grad_norm 1.1691 (1.2109) [2022-01-20 07:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][470/1251] eta 0:29:02 lr 0.000811 time 2.4520 (2.2316) loss 4.2931 (3.7987) grad_norm 1.0827 (1.2097) [2022-01-20 07:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][480/1251] eta 0:28:39 lr 0.000811 time 2.0976 (2.2308) loss 3.7225 (3.7999) grad_norm 1.1947 (1.2089) [2022-01-20 07:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][490/1251] eta 0:28:17 lr 0.000811 time 2.2450 (2.2310) loss 3.9237 (3.7964) grad_norm 1.2030 (1.2102) [2022-01-20 07:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][500/1251] eta 0:27:55 lr 0.000811 time 1.7508 (2.2304) loss 4.0987 (3.7962) grad_norm 1.0358 (1.2091) [2022-01-20 07:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][510/1251] eta 0:27:32 lr 0.000811 time 2.0914 (2.2298) loss 4.0968 (3.7991) grad_norm 1.0724 (1.2095) [2022-01-20 07:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][520/1251] eta 0:27:09 lr 0.000811 time 
1.8467 (2.2298) loss 3.1050 (3.7998) grad_norm 1.1047 (1.2096) [2022-01-20 07:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][530/1251] eta 0:26:46 lr 0.000811 time 1.8875 (2.2280) loss 4.2554 (3.8011) grad_norm 1.1781 (1.2094) [2022-01-20 07:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][540/1251] eta 0:26:23 lr 0.000811 time 1.6187 (2.2269) loss 4.1243 (3.7989) grad_norm 1.0671 (1.2104) [2022-01-20 07:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][550/1251] eta 0:26:00 lr 0.000811 time 1.8261 (2.2260) loss 2.6981 (3.7964) grad_norm 1.0930 (1.2090) [2022-01-20 07:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][560/1251] eta 0:25:37 lr 0.000811 time 2.2004 (2.2257) loss 3.7505 (3.7946) grad_norm 1.2502 (1.2095) [2022-01-20 07:41:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][570/1251] eta 0:25:13 lr 0.000811 time 2.0397 (2.2227) loss 3.5355 (3.7973) grad_norm 1.2896 (1.2104) [2022-01-20 07:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][580/1251] eta 0:24:49 lr 0.000811 time 1.9472 (2.2200) loss 3.8362 (3.7951) grad_norm 1.1003 (1.2093) [2022-01-20 07:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][590/1251] eta 0:24:26 lr 0.000811 time 1.8834 (2.2185) loss 4.6536 (3.7917) grad_norm 1.0873 (1.2102) [2022-01-20 07:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][600/1251] eta 0:24:03 lr 0.000811 time 2.2809 (2.2172) loss 4.3519 (3.7922) grad_norm 1.1320 (1.2096) [2022-01-20 07:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][610/1251] eta 0:23:41 lr 0.000810 time 1.8921 (2.2174) loss 4.6586 (3.7964) grad_norm 1.0152 (1.2097) [2022-01-20 07:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][620/1251] eta 0:23:18 lr 0.000810 time 2.1747 (2.2170) loss 4.2321 (3.7977) grad_norm 1.2815 (1.2096) [2022-01-20 07:44:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][630/1251] eta 0:22:57 lr 0.000810 time 2.1698 (2.2177) loss 3.7902 (3.7970) grad_norm 1.2465 (1.2097) [2022-01-20 07:44:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][640/1251] eta 0:22:35 lr 0.000810 time 2.1588 (2.2189) loss 3.5063 (3.8010) grad_norm 1.2225 (1.2094) [2022-01-20 07:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][650/1251] eta 0:22:13 lr 0.000810 time 1.9558 (2.2188) loss 4.6028 (3.8028) grad_norm 1.1186 (1.2095) [2022-01-20 07:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][660/1251] eta 0:21:50 lr 0.000810 time 2.3675 (2.2182) loss 4.1771 (3.7988) grad_norm 1.1378 (1.2081) [2022-01-20 07:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][670/1251] eta 0:21:27 lr 0.000810 time 1.9404 (2.2168) loss 3.6886 (3.7990) grad_norm 1.1593 (1.2076) [2022-01-20 07:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][680/1251] eta 0:21:04 lr 0.000810 time 1.5835 (2.2152) loss 3.3678 (3.7999) grad_norm 1.1023 (1.2077) [2022-01-20 07:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][690/1251] eta 0:20:42 lr 0.000810 time 2.5690 (2.2156) loss 2.4154 (3.7990) grad_norm 1.1990 (1.2072) [2022-01-20 07:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][700/1251] eta 0:20:19 lr 0.000810 time 1.7799 (2.2125) loss 3.0361 (3.7975) grad_norm 1.2037 (1.2062) [2022-01-20 07:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][710/1251] eta 0:19:56 lr 0.000810 time 2.2255 (2.2112) loss 3.7236 (3.7939) grad_norm 1.1185 (1.2054) [2022-01-20 07:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][720/1251] eta 0:19:35 lr 0.000810 time 2.4235 (2.2133) loss 4.3720 (3.7972) grad_norm 1.2150 (1.2052) [2022-01-20 07:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][730/1251] eta 0:19:14 lr 0.000810 time 
2.1783 (2.2156) loss 2.7179 (3.7970) grad_norm 1.1876 (1.2049) [2022-01-20 07:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][740/1251] eta 0:18:50 lr 0.000810 time 1.8378 (2.2133) loss 4.1938 (3.7963) grad_norm 1.0942 (1.2045) [2022-01-20 07:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][750/1251] eta 0:18:27 lr 0.000810 time 1.9004 (2.2106) loss 2.4113 (3.7940) grad_norm 1.1985 (1.2040) [2022-01-20 07:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][760/1251] eta 0:18:04 lr 0.000810 time 2.3588 (2.2085) loss 3.8552 (3.7912) grad_norm 1.5226 (1.2045) [2022-01-20 07:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][770/1251] eta 0:17:42 lr 0.000810 time 1.8919 (2.2092) loss 4.2509 (3.7923) grad_norm 1.2227 (1.2042) [2022-01-20 07:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][780/1251] eta 0:17:20 lr 0.000810 time 1.8842 (2.2100) loss 2.8864 (3.7886) grad_norm 0.9460 (1.2042) [2022-01-20 07:49:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][790/1251] eta 0:16:58 lr 0.000810 time 2.1856 (2.2102) loss 4.1736 (3.7876) grad_norm 1.2212 (1.2039) [2022-01-20 07:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][800/1251] eta 0:16:36 lr 0.000810 time 2.1768 (2.2099) loss 4.1432 (3.7893) grad_norm 1.1271 (1.2036) [2022-01-20 07:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][810/1251] eta 0:16:15 lr 0.000810 time 2.8374 (2.2123) loss 3.0989 (3.7909) grad_norm 1.1941 (1.2040) [2022-01-20 07:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][820/1251] eta 0:15:53 lr 0.000810 time 1.5271 (2.2115) loss 4.5114 (3.7940) grad_norm 1.2457 (1.2040) [2022-01-20 07:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][830/1251] eta 0:15:30 lr 0.000810 time 1.7284 (2.2097) loss 3.8142 (3.7935) grad_norm 1.3744 (1.2041) [2022-01-20 07:51:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][840/1251] eta 0:15:06 lr 0.000810 time 1.5907 (2.2067) loss 4.2787 (3.7913) grad_norm 1.0714 (1.2041) [2022-01-20 07:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][850/1251] eta 0:14:44 lr 0.000810 time 1.9221 (2.2061) loss 4.0008 (3.7916) grad_norm 1.2668 (1.2041) [2022-01-20 07:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][860/1251] eta 0:14:22 lr 0.000810 time 1.8311 (2.2061) loss 3.0494 (3.7932) grad_norm 1.0786 (1.2056) [2022-01-20 07:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][870/1251] eta 0:14:00 lr 0.000810 time 2.4793 (2.2054) loss 3.0369 (3.7921) grad_norm 1.3752 (1.2053) [2022-01-20 07:53:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][880/1251] eta 0:13:39 lr 0.000810 time 3.0413 (2.2086) loss 3.3579 (3.7933) grad_norm 1.1972 (1.2049) [2022-01-20 07:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][890/1251] eta 0:13:17 lr 0.000810 time 2.5059 (2.2102) loss 4.6325 (3.7903) grad_norm 1.3053 (1.2044) [2022-01-20 07:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][900/1251] eta 0:12:55 lr 0.000810 time 2.1711 (2.2098) loss 4.5231 (3.7879) grad_norm 1.2999 (1.2046) [2022-01-20 07:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][910/1251] eta 0:12:33 lr 0.000810 time 2.5428 (2.2088) loss 4.3942 (3.7891) grad_norm 1.1152 (1.2038) [2022-01-20 07:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][920/1251] eta 0:12:10 lr 0.000809 time 1.9613 (2.2078) loss 4.1580 (3.7889) grad_norm 1.0476 (1.2031) [2022-01-20 07:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][930/1251] eta 0:11:48 lr 0.000809 time 1.5815 (2.2083) loss 3.5151 (3.7900) grad_norm 1.1918 (1.2032) [2022-01-20 07:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][940/1251] eta 0:11:26 lr 0.000809 time 
1.8647 (2.2088) loss 3.2633 (3.7891) grad_norm 1.0877 (1.2034) [2022-01-20 07:55:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][950/1251] eta 0:11:04 lr 0.000809 time 2.4232 (2.2082) loss 3.8261 (3.7876) grad_norm 1.1670 (1.2035) [2022-01-20 07:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][960/1251] eta 0:10:42 lr 0.000809 time 2.1285 (2.2084) loss 3.9672 (3.7875) grad_norm 1.2190 (1.2037) [2022-01-20 07:56:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][970/1251] eta 0:10:20 lr 0.000809 time 1.9354 (2.2086) loss 3.3151 (3.7876) grad_norm 1.2187 (1.2038) [2022-01-20 07:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][980/1251] eta 0:09:58 lr 0.000809 time 2.2390 (2.2093) loss 3.8336 (3.7861) grad_norm 1.3264 (1.2037) [2022-01-20 07:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][990/1251] eta 0:09:35 lr 0.000809 time 2.1191 (2.2068) loss 3.1858 (3.7823) grad_norm 1.2162 (1.2035) [2022-01-20 07:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1000/1251] eta 0:09:13 lr 0.000809 time 2.0779 (2.2035) loss 3.8960 (3.7783) grad_norm 1.1855 (1.2035) [2022-01-20 07:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1010/1251] eta 0:08:50 lr 0.000809 time 1.7073 (2.2015) loss 3.7299 (3.7772) grad_norm 1.0166 (1.2023) [2022-01-20 07:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1020/1251] eta 0:08:28 lr 0.000809 time 1.9078 (2.2009) loss 2.9436 (3.7760) grad_norm 0.9866 (1.2020) [2022-01-20 07:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1030/1251] eta 0:08:06 lr 0.000809 time 2.3249 (2.2012) loss 4.0581 (3.7755) grad_norm 1.3044 (1.2023) [2022-01-20 07:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1040/1251] eta 0:07:44 lr 0.000809 time 2.4159 (2.2010) loss 3.6170 (3.7752) grad_norm 1.6729 (1.2036) [2022-01-20 07:59:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1050/1251] eta 0:07:22 lr 0.000809 time 2.1517 (2.2016) loss 3.1010 (3.7746) grad_norm 1.3465 (1.2039) [2022-01-20 07:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1060/1251] eta 0:07:00 lr 0.000809 time 2.2415 (2.2034) loss 4.5291 (3.7756) grad_norm 1.1707 (1.2036) [2022-01-20 08:00:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1070/1251] eta 0:06:38 lr 0.000809 time 2.2140 (2.2026) loss 3.6890 (3.7749) grad_norm 1.1651 (1.2030) [2022-01-20 08:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1080/1251] eta 0:06:16 lr 0.000809 time 1.5848 (2.2029) loss 4.4109 (3.7741) grad_norm 1.3292 (1.2036) [2022-01-20 08:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1090/1251] eta 0:05:54 lr 0.000809 time 3.1825 (2.2047) loss 3.4497 (3.7719) grad_norm 1.2091 (1.2037) [2022-01-20 08:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1100/1251] eta 0:05:32 lr 0.000809 time 1.8305 (2.2041) loss 3.8586 (3.7746) grad_norm 1.1141 (1.2035) [2022-01-20 08:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1110/1251] eta 0:05:10 lr 0.000809 time 2.1628 (2.2036) loss 4.0127 (3.7739) grad_norm 1.1544 (1.2035) [2022-01-20 08:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1120/1251] eta 0:04:48 lr 0.000809 time 1.8712 (2.2026) loss 3.9933 (3.7761) grad_norm 1.1712 (1.2034) [2022-01-20 08:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1130/1251] eta 0:04:26 lr 0.000809 time 1.7948 (2.2012) loss 4.3668 (3.7757) grad_norm 1.1218 (1.2036) [2022-01-20 08:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1140/1251] eta 0:04:04 lr 0.000809 time 2.4684 (2.2009) loss 3.1729 (3.7768) grad_norm 1.0622 (1.2032) [2022-01-20 08:03:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1150/1251] eta 0:03:42 lr 
0.000809 time 2.1211 (2.2016) loss 4.1808 (3.7791) grad_norm 1.2041 (1.2031) [2022-01-20 08:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1160/1251] eta 0:03:20 lr 0.000809 time 2.0557 (2.2024) loss 4.0803 (3.7779) grad_norm 1.2219 (1.2029) [2022-01-20 08:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1170/1251] eta 0:02:58 lr 0.000809 time 2.4626 (2.2015) loss 3.8743 (3.7756) grad_norm 1.0813 (1.2027) [2022-01-20 08:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1180/1251] eta 0:02:36 lr 0.000809 time 1.5642 (2.2005) loss 3.8390 (3.7718) grad_norm 1.2505 (1.2029) [2022-01-20 08:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1190/1251] eta 0:02:14 lr 0.000809 time 1.5573 (2.1997) loss 4.2378 (3.7727) grad_norm 1.1711 (1.2027) [2022-01-20 08:04:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1200/1251] eta 0:01:52 lr 0.000809 time 2.8794 (2.1992) loss 4.2025 (3.7719) grad_norm 1.3248 (1.2028) [2022-01-20 08:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1210/1251] eta 0:01:30 lr 0.000809 time 2.1768 (2.1982) loss 4.1554 (3.7703) grad_norm 1.3612 (1.2030) [2022-01-20 08:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1220/1251] eta 0:01:08 lr 0.000808 time 2.2510 (2.1989) loss 3.4629 (3.7716) grad_norm 1.0641 (1.2026) [2022-01-20 08:05:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1230/1251] eta 0:00:46 lr 0.000808 time 1.7449 (2.1988) loss 4.0398 (3.7725) grad_norm 1.0278 (1.2022) [2022-01-20 08:06:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1240/1251] eta 0:00:24 lr 0.000808 time 2.0028 (2.1981) loss 4.0086 (3.7736) grad_norm 1.0614 (1.2025) [2022-01-20 08:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [86/300][1250/1251] eta 0:00:02 lr 0.000808 time 1.1231 (2.1929) loss 3.8983 (3.7722) grad_norm 1.4291 (1.2025) [2022-01-20 08:06:32 
swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 86 training takes 0:45:43 [2022-01-20 08:06:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.708 (18.708) Loss 1.0918 (1.0918) Acc@1 73.730 (73.730) Acc@5 93.066 (93.066) [2022-01-20 08:07:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.378 (3.476) Loss 1.1886 (1.1558) Acc@1 72.266 (72.949) Acc@5 91.113 (91.850) [2022-01-20 08:07:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.628 (2.715) Loss 1.2145 (1.1652) Acc@1 71.191 (72.670) Acc@5 90.625 (91.648) [2022-01-20 08:07:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.286 (2.314) Loss 1.1942 (1.1644) Acc@1 70.996 (72.650) Acc@5 91.895 (91.690) [2022-01-20 08:08:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.990 (2.228) Loss 1.2018 (1.1654) Acc@1 71.289 (72.725) Acc@5 91.699 (91.661) [2022-01-20 08:08:10 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.786 Acc@5 91.664 [2022-01-20 08:08:10 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 72.8% [2022-01-20 08:08:10 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.02% [2022-01-20 08:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][0/1251] eta 7:23:16 lr 0.000808 time 21.2605 (21.2605) loss 3.9448 (3.9448) grad_norm 1.1242 (1.1242) [2022-01-20 08:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][10/1251] eta 1:27:15 lr 0.000808 time 2.7722 (4.2188) loss 2.7397 (3.7382) grad_norm 1.2473 (1.2075) [2022-01-20 08:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][20/1251] eta 1:06:42 lr 0.000808 time 2.4812 (3.2515) loss 3.5718 (3.7246) grad_norm 1.1399 (1.2139) [2022-01-20 08:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][30/1251] eta 0:58:34 lr 0.000808 time 1.3714 (2.8783) loss 3.2014 (3.7266) grad_norm 1.1878 (1.2159) 
[2022-01-20 08:10:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][40/1251] eta 0:56:06 lr 0.000808 time 3.3724 (2.7796) loss 3.9617 (3.7058) grad_norm 1.0636 (1.2178) [2022-01-20 08:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][50/1251] eta 0:53:53 lr 0.000808 time 2.3837 (2.6926) loss 3.7507 (3.7235) grad_norm 1.3823 (1.2010) [2022-01-20 08:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][60/1251] eta 0:51:23 lr 0.000808 time 1.8160 (2.5889) loss 4.5452 (3.7364) grad_norm 1.0333 (1.1991) [2022-01-20 08:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][70/1251] eta 0:49:28 lr 0.000808 time 1.6199 (2.5137) loss 4.2141 (3.7770) grad_norm 1.0880 (1.1949) [2022-01-20 08:11:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][80/1251] eta 0:48:15 lr 0.000808 time 3.3311 (2.4728) loss 4.0356 (3.7797) grad_norm 1.4428 (1.1994) [2022-01-20 08:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][90/1251] eta 0:47:27 lr 0.000808 time 2.1516 (2.4525) loss 2.3706 (3.7483) grad_norm 1.3107 (1.2178) [2022-01-20 08:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][100/1251] eta 0:46:32 lr 0.000808 time 1.9447 (2.4261) loss 4.0890 (3.7740) grad_norm 1.1165 (1.2248) [2022-01-20 08:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][110/1251] eta 0:45:38 lr 0.000808 time 1.6108 (2.4003) loss 3.2789 (3.7606) grad_norm 1.0421 (1.2186) [2022-01-20 08:13:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][120/1251] eta 0:45:07 lr 0.000808 time 3.6101 (2.3941) loss 3.1444 (3.7247) grad_norm 1.4252 (1.2173) [2022-01-20 08:13:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][130/1251] eta 0:44:34 lr 0.000808 time 1.8698 (2.3854) loss 2.7835 (3.7095) grad_norm 1.2651 (1.2107) [2022-01-20 08:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][140/1251] eta 0:43:36 lr 
0.000808 time 1.6429 (2.3549) loss 3.7025 (3.7055) grad_norm 1.1326 (1.2062) [2022-01-20 08:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][150/1251] eta 0:42:43 lr 0.000808 time 1.6930 (2.3285) loss 4.2383 (3.6952) grad_norm 1.1536 (1.2085) [2022-01-20 08:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][160/1251] eta 0:42:08 lr 0.000808 time 3.7495 (2.3176) loss 3.5897 (3.7085) grad_norm 1.1215 (1.2101) [2022-01-20 08:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][170/1251] eta 0:41:48 lr 0.000808 time 2.2226 (2.3209) loss 2.8014 (3.7064) grad_norm 1.2280 (1.2041) [2022-01-20 08:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][180/1251] eta 0:41:16 lr 0.000808 time 2.0960 (2.3124) loss 3.2152 (3.6940) grad_norm 1.1231 (1.1993) [2022-01-20 08:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][190/1251] eta 0:40:44 lr 0.000808 time 1.8422 (2.3037) loss 3.1461 (3.6882) grad_norm 1.2934 (1.1974) [2022-01-20 08:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][200/1251] eta 0:40:21 lr 0.000808 time 3.4316 (2.3044) loss 3.4253 (3.6782) grad_norm 1.1344 (1.1966) [2022-01-20 08:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][210/1251] eta 0:39:54 lr 0.000808 time 2.2393 (2.2998) loss 3.2295 (3.6778) grad_norm 1.2323 (1.2011) [2022-01-20 08:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][220/1251] eta 0:39:26 lr 0.000808 time 2.4366 (2.2958) loss 4.5148 (3.6866) grad_norm 1.3452 (1.2033) [2022-01-20 08:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][230/1251] eta 0:38:54 lr 0.000808 time 2.3105 (2.2868) loss 3.0604 (3.6938) grad_norm 1.2302 (1.2035) [2022-01-20 08:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][240/1251] eta 0:38:23 lr 0.000808 time 2.7361 (2.2784) loss 4.2554 (3.6985) grad_norm 1.1850 (1.2034) [2022-01-20 08:17:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][250/1251] eta 0:37:51 lr 0.000808 time 1.9467 (2.2688) loss 3.6895 (3.6982) grad_norm 1.1146 (1.2037) [2022-01-20 08:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][260/1251] eta 0:37:25 lr 0.000808 time 2.0436 (2.2664) loss 3.8576 (3.6868) grad_norm 1.2417 (1.2046) [2022-01-20 08:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][270/1251] eta 0:36:59 lr 0.000808 time 2.2045 (2.2624) loss 4.4473 (3.6879) grad_norm 1.2226 (1.2083) [2022-01-20 08:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][280/1251] eta 0:36:33 lr 0.000807 time 2.7335 (2.2594) loss 4.3867 (3.6952) grad_norm 1.2940 (1.2111) [2022-01-20 08:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][290/1251] eta 0:36:09 lr 0.000807 time 1.8801 (2.2572) loss 4.5162 (3.7044) grad_norm 1.3686 (1.2127) [2022-01-20 08:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][300/1251] eta 0:35:43 lr 0.000807 time 1.5729 (2.2540) loss 3.4024 (3.7016) grad_norm 1.1235 (1.2126) [2022-01-20 08:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][310/1251] eta 0:35:18 lr 0.000807 time 2.4920 (2.2511) loss 2.8055 (3.6962) grad_norm 1.0105 (1.2108) [2022-01-20 08:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][320/1251] eta 0:34:55 lr 0.000807 time 2.7495 (2.2504) loss 4.2063 (3.6980) grad_norm 1.5331 (1.2102) [2022-01-20 08:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][330/1251] eta 0:34:32 lr 0.000807 time 2.7920 (2.2499) loss 4.3213 (3.6967) grad_norm 1.3353 (1.2126) [2022-01-20 08:20:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][340/1251] eta 0:34:07 lr 0.000807 time 2.5243 (2.2471) loss 4.0927 (3.6956) grad_norm 1.2038 (1.2120) [2022-01-20 08:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][350/1251] eta 0:33:44 lr 0.000807 time 
2.0903 (2.2472) loss 3.0344 (3.6928) grad_norm 1.3029 (1.2128) [2022-01-20 08:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][360/1251] eta 0:33:20 lr 0.000807 time 1.9807 (2.2456) loss 3.7836 (3.6991) grad_norm 1.1337 (1.2113) [2022-01-20 08:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][370/1251] eta 0:32:55 lr 0.000807 time 2.2474 (2.2426) loss 4.0580 (3.7035) grad_norm 1.1581 (1.2112) [2022-01-20 08:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][380/1251] eta 0:32:31 lr 0.000807 time 2.3974 (2.2407) loss 4.4360 (3.7106) grad_norm 1.1359 (1.2115) [2022-01-20 08:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][390/1251] eta 0:32:07 lr 0.000807 time 1.6953 (2.2384) loss 3.8932 (3.7169) grad_norm 1.1852 (1.2122) [2022-01-20 08:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][400/1251] eta 0:31:41 lr 0.000807 time 2.0354 (2.2341) loss 2.6976 (3.7074) grad_norm 1.3327 (1.2121) [2022-01-20 08:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][410/1251] eta 0:31:22 lr 0.000807 time 3.3955 (2.2386) loss 4.2850 (3.7036) grad_norm 1.2556 (1.2116) [2022-01-20 08:23:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][420/1251] eta 0:30:59 lr 0.000807 time 1.8786 (2.2378) loss 3.6283 (3.7041) grad_norm 1.2081 (1.2128) [2022-01-20 08:24:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][430/1251] eta 0:30:36 lr 0.000807 time 1.6527 (2.2374) loss 3.4352 (3.7029) grad_norm 1.2839 (1.2134) [2022-01-20 08:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][440/1251] eta 0:30:12 lr 0.000807 time 2.1984 (2.2343) loss 3.9272 (3.6987) grad_norm 1.2273 (1.2121) [2022-01-20 08:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][450/1251] eta 0:29:43 lr 0.000807 time 2.0349 (2.2271) loss 3.8803 (3.7033) grad_norm 1.1487 (1.2118) [2022-01-20 08:25:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][460/1251] eta 0:29:20 lr 0.000807 time 1.8693 (2.2258) loss 3.8408 (3.6996) grad_norm 1.3241 (1.2137) [2022-01-20 08:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][470/1251] eta 0:28:56 lr 0.000807 time 1.8135 (2.2239) loss 4.0153 (3.7040) grad_norm 1.1643 (1.2141) [2022-01-20 08:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][480/1251] eta 0:28:34 lr 0.000807 time 1.8831 (2.2235) loss 3.9322 (3.7118) grad_norm 1.3278 (1.2133) [2022-01-20 08:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][490/1251] eta 0:28:12 lr 0.000807 time 2.0064 (2.2244) loss 3.3945 (3.7129) grad_norm 1.1812 (1.2121) [2022-01-20 08:26:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][500/1251] eta 0:27:52 lr 0.000807 time 2.1642 (2.2276) loss 3.9562 (3.7161) grad_norm 1.1062 (1.2111) [2022-01-20 08:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][510/1251] eta 0:27:30 lr 0.000807 time 1.9411 (2.2278) loss 4.0835 (3.7199) grad_norm 1.1060 (1.2113) [2022-01-20 08:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][520/1251] eta 0:27:08 lr 0.000807 time 1.4892 (2.2273) loss 4.6098 (3.7195) grad_norm 1.2180 (1.2119) [2022-01-20 08:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][530/1251] eta 0:26:43 lr 0.000807 time 2.2177 (2.2240) loss 4.1611 (3.7212) grad_norm 1.3131 (1.2117) [2022-01-20 08:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][540/1251] eta 0:26:18 lr 0.000807 time 1.7973 (2.2195) loss 4.0839 (3.7230) grad_norm 1.0732 (1.2115) [2022-01-20 08:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][550/1251] eta 0:25:53 lr 0.000807 time 1.8615 (2.2154) loss 3.5613 (3.7211) grad_norm 1.3916 (1.2125) [2022-01-20 08:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][560/1251] eta 0:25:30 lr 0.000807 time 
2.1662 (2.2154) loss 3.7458 (3.7233) grad_norm 1.1467 (1.2115) [2022-01-20 08:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][570/1251] eta 0:25:06 lr 0.000807 time 2.2603 (2.2127) loss 4.2149 (3.7254) grad_norm 1.1931 (1.2113) [2022-01-20 08:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][580/1251] eta 0:24:43 lr 0.000806 time 1.6908 (2.2112) loss 3.6139 (3.7288) grad_norm 1.0530 (1.2133) [2022-01-20 08:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][590/1251] eta 0:24:23 lr 0.000806 time 2.8612 (2.2140) loss 3.9398 (3.7309) grad_norm 1.3924 (1.2133) [2022-01-20 08:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][600/1251] eta 0:24:01 lr 0.000806 time 1.6061 (2.2139) loss 3.4723 (3.7310) grad_norm 1.2051 (1.2133) [2022-01-20 08:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][610/1251] eta 0:23:40 lr 0.000806 time 2.8156 (2.2155) loss 3.2817 (3.7307) grad_norm 1.2755 (1.2126) [2022-01-20 08:31:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][620/1251] eta 0:23:19 lr 0.000806 time 2.2362 (2.2176) loss 2.9418 (3.7320) grad_norm 1.1203 (1.2130) [2022-01-20 08:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][630/1251] eta 0:22:57 lr 0.000806 time 2.1461 (2.2175) loss 2.9607 (3.7321) grad_norm 1.2616 (1.2151) [2022-01-20 08:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][640/1251] eta 0:22:32 lr 0.000806 time 1.9114 (2.2141) loss 4.1001 (3.7338) grad_norm 1.0528 (1.2139) [2022-01-20 08:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][650/1251] eta 0:22:09 lr 0.000806 time 2.3099 (2.2119) loss 4.0595 (3.7385) grad_norm 1.1071 (1.2132) [2022-01-20 08:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][660/1251] eta 0:21:45 lr 0.000806 time 1.9633 (2.2096) loss 2.7372 (3.7359) grad_norm 1.2133 (1.2133) [2022-01-20 08:32:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][670/1251] eta 0:21:23 lr 0.000806 time 2.4647 (2.2083) loss 3.8827 (3.7354) grad_norm 1.0058 (1.2144) [2022-01-20 08:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][680/1251] eta 0:21:01 lr 0.000806 time 2.1960 (2.2089) loss 4.5589 (3.7354) grad_norm 1.1395 (1.2145) [2022-01-20 08:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][690/1251] eta 0:20:38 lr 0.000806 time 1.8646 (2.2085) loss 2.5982 (3.7332) grad_norm 1.3845 (1.2152) [2022-01-20 08:33:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][700/1251] eta 0:20:16 lr 0.000806 time 2.2138 (2.2081) loss 3.1138 (3.7350) grad_norm 1.2409 (1.2146) [2022-01-20 08:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][710/1251] eta 0:19:55 lr 0.000806 time 2.1258 (2.2095) loss 3.8975 (3.7369) grad_norm 1.2271 (1.2137) [2022-01-20 08:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][720/1251] eta 0:19:34 lr 0.000806 time 1.5725 (2.2115) loss 3.7108 (3.7380) grad_norm 1.1368 (1.2136) [2022-01-20 08:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][730/1251] eta 0:19:12 lr 0.000806 time 1.8924 (2.2113) loss 4.0285 (3.7389) grad_norm 1.1388 (1.2127) [2022-01-20 08:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][740/1251] eta 0:18:49 lr 0.000806 time 1.7165 (2.2105) loss 2.9409 (3.7366) grad_norm 1.1216 (1.2131) [2022-01-20 08:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][750/1251] eta 0:18:26 lr 0.000806 time 2.3179 (2.2091) loss 3.2236 (3.7360) grad_norm 1.2281 (1.2133) [2022-01-20 08:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][760/1251] eta 0:18:04 lr 0.000806 time 1.8746 (2.2086) loss 3.9574 (3.7389) grad_norm 1.3497 (1.2144) [2022-01-20 08:36:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][770/1251] eta 0:17:42 lr 0.000806 time 
2.1473 (2.2091) loss 3.9352 (3.7383) grad_norm 1.1299 (1.2149) [2022-01-20 08:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][780/1251] eta 0:17:21 lr 0.000806 time 2.1490 (2.2106) loss 3.2822 (3.7401) grad_norm 1.4929 (1.2179) [2022-01-20 08:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][790/1251] eta 0:16:59 lr 0.000806 time 2.2059 (2.2115) loss 2.7005 (3.7353) grad_norm 1.1250 (1.2175) [2022-01-20 08:37:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][800/1251] eta 0:16:37 lr 0.000806 time 2.0958 (2.2108) loss 3.5439 (3.7375) grad_norm 1.1618 (1.2170) [2022-01-20 08:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][810/1251] eta 0:16:13 lr 0.000806 time 1.6007 (2.2065) loss 3.6545 (3.7338) grad_norm 1.1215 (1.2162) [2022-01-20 08:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][820/1251] eta 0:15:49 lr 0.000806 time 2.0474 (2.2040) loss 2.6926 (3.7314) grad_norm 1.1240 (1.2165) [2022-01-20 08:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][830/1251] eta 0:15:27 lr 0.000806 time 2.5087 (2.2034) loss 3.4012 (3.7297) grad_norm 1.0386 (1.2155) [2022-01-20 08:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][840/1251] eta 0:15:05 lr 0.000806 time 2.2612 (2.2033) loss 4.2953 (3.7301) grad_norm 1.0679 (1.2150) [2022-01-20 08:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][850/1251] eta 0:14:43 lr 0.000806 time 1.9078 (2.2026) loss 2.8901 (3.7290) grad_norm 1.2080 (1.2148) [2022-01-20 08:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][860/1251] eta 0:14:20 lr 0.000806 time 2.3163 (2.2017) loss 4.1003 (3.7281) grad_norm 0.9766 (1.2138) [2022-01-20 08:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][870/1251] eta 0:13:58 lr 0.000806 time 2.1730 (2.2019) loss 4.0681 (3.7307) grad_norm 1.0922 (1.2133) [2022-01-20 08:40:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][880/1251] eta 0:13:37 lr 0.000805 time 2.5710 (2.2044) loss 2.7172 (3.7277) grad_norm 1.2214 (1.2126) [2022-01-20 08:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][890/1251] eta 0:13:16 lr 0.000805 time 2.9666 (2.2052) loss 4.5446 (3.7269) grad_norm 1.3080 (1.2129) [2022-01-20 08:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][900/1251] eta 0:12:53 lr 0.000805 time 1.5395 (2.2045) loss 3.7992 (3.7289) grad_norm 1.2898 (1.2125) [2022-01-20 08:41:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][910/1251] eta 0:12:32 lr 0.000805 time 2.1870 (2.2062) loss 3.7714 (3.7305) grad_norm 1.1999 (1.2123) [2022-01-20 08:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][920/1251] eta 0:12:09 lr 0.000805 time 2.1822 (2.2050) loss 2.8960 (3.7278) grad_norm 1.5274 (1.2131) [2022-01-20 08:42:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][930/1251] eta 0:11:47 lr 0.000805 time 3.4915 (2.2047) loss 4.1754 (3.7268) grad_norm 1.1612 (1.2130) [2022-01-20 08:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][940/1251] eta 0:11:25 lr 0.000805 time 1.9806 (2.2032) loss 4.2707 (3.7295) grad_norm 1.2311 (1.2123) [2022-01-20 08:43:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][950/1251] eta 0:11:02 lr 0.000805 time 1.5907 (2.2017) loss 3.4795 (3.7307) grad_norm 1.2316 (1.2127) [2022-01-20 08:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][960/1251] eta 0:10:40 lr 0.000805 time 2.1408 (2.1997) loss 2.7031 (3.7315) grad_norm 1.1116 (1.2127) [2022-01-20 08:43:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][970/1251] eta 0:10:18 lr 0.000805 time 2.9287 (2.2002) loss 2.8673 (3.7286) grad_norm 1.4049 (1.2123) [2022-01-20 08:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][980/1251] eta 0:09:56 lr 0.000805 time 
1.9396 (2.1999) loss 3.5441 (3.7309) grad_norm 1.2361 (1.2126) [2022-01-20 08:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][990/1251] eta 0:09:35 lr 0.000805 time 3.4395 (2.2032) loss 3.5344 (3.7305) grad_norm 1.2064 (1.2125) [2022-01-20 08:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1000/1251] eta 0:09:13 lr 0.000805 time 3.2840 (2.2050) loss 2.6006 (3.7302) grad_norm 1.1504 (1.2115) [2022-01-20 08:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1010/1251] eta 0:08:51 lr 0.000805 time 3.1623 (2.2069) loss 4.0351 (3.7317) grad_norm 1.3110 (1.2125) [2022-01-20 08:45:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1020/1251] eta 0:08:29 lr 0.000805 time 1.8653 (2.2061) loss 3.1898 (3.7337) grad_norm 1.0386 (1.2129) [2022-01-20 08:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1030/1251] eta 0:08:07 lr 0.000805 time 2.1367 (2.2039) loss 3.5267 (3.7333) grad_norm 1.2668 (1.2127) [2022-01-20 08:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1040/1251] eta 0:07:44 lr 0.000805 time 2.9133 (2.2018) loss 4.5812 (3.7352) grad_norm 1.3103 (1.2126) [2022-01-20 08:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1050/1251] eta 0:07:22 lr 0.000805 time 1.9106 (2.2006) loss 3.6835 (3.7357) grad_norm 1.0860 (1.2121) [2022-01-20 08:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1060/1251] eta 0:07:00 lr 0.000805 time 1.8561 (2.1998) loss 4.3323 (3.7361) grad_norm 1.3406 (1.2121) [2022-01-20 08:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1070/1251] eta 0:06:38 lr 0.000805 time 2.4998 (2.1998) loss 3.6573 (3.7341) grad_norm 1.1404 (1.2123) [2022-01-20 08:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1080/1251] eta 0:06:16 lr 0.000805 time 2.7564 (2.2022) loss 3.4172 (3.7340) grad_norm 0.9457 (1.2121) [2022-01-20 08:48:15 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1090/1251] eta 0:05:54 lr 0.000805 time 2.3025 (2.2038) loss 3.8256 (3.7345) grad_norm 1.2626 (1.2124) [2022-01-20 08:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1100/1251] eta 0:05:33 lr 0.000805 time 2.8089 (2.2053) loss 4.3722 (3.7377) grad_norm 1.1861 (1.2123) [2022-01-20 08:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1110/1251] eta 0:05:10 lr 0.000805 time 2.0393 (2.2041) loss 4.3125 (3.7379) grad_norm 1.0597 (1.2119) [2022-01-20 08:49:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1120/1251] eta 0:04:48 lr 0.000805 time 1.6402 (2.2015) loss 4.2713 (3.7395) grad_norm 1.1419 (1.2111) [2022-01-20 08:49:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1130/1251] eta 0:04:26 lr 0.000805 time 1.9197 (2.1993) loss 3.4391 (3.7408) grad_norm 1.1488 (1.2103) [2022-01-20 08:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1140/1251] eta 0:04:04 lr 0.000805 time 2.4809 (2.1985) loss 3.9421 (3.7404) grad_norm 1.0530 (1.2095) [2022-01-20 08:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1150/1251] eta 0:03:42 lr 0.000805 time 3.2785 (2.1999) loss 3.8689 (3.7407) grad_norm 1.3205 (1.2098) [2022-01-20 08:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1160/1251] eta 0:03:20 lr 0.000805 time 2.8228 (2.2003) loss 4.4053 (3.7404) grad_norm 1.0949 (1.2096) [2022-01-20 08:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1170/1251] eta 0:02:58 lr 0.000805 time 1.8699 (2.2006) loss 3.4790 (3.7431) grad_norm 1.2080 (1.2100) [2022-01-20 08:51:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1180/1251] eta 0:02:36 lr 0.000805 time 2.3001 (2.2005) loss 3.9799 (3.7415) grad_norm 1.1029 (1.2100) [2022-01-20 08:51:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1190/1251] eta 0:02:14 lr 
0.000804 time 2.8067 (2.2008) loss 4.1035 (3.7410) grad_norm 1.2644 (1.2097) [2022-01-20 08:52:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1200/1251] eta 0:01:52 lr 0.000804 time 2.6924 (2.2015) loss 3.8136 (3.7420) grad_norm 1.1129 (1.2092) [2022-01-20 08:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1210/1251] eta 0:01:30 lr 0.000804 time 1.9027 (2.2021) loss 3.7281 (3.7410) grad_norm 1.2979 (1.2092) [2022-01-20 08:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1220/1251] eta 0:01:08 lr 0.000804 time 2.5817 (2.2018) loss 3.5341 (3.7429) grad_norm 1.1101 (1.2090) [2022-01-20 08:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1230/1251] eta 0:00:46 lr 0.000804 time 2.7517 (2.2018) loss 3.3679 (3.7422) grad_norm 1.0730 (1.2084) [2022-01-20 08:53:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1240/1251] eta 0:00:24 lr 0.000804 time 1.3003 (2.1997) loss 2.8882 (3.7416) grad_norm 1.2732 (1.2084) [2022-01-20 08:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [87/300][1250/1251] eta 0:00:02 lr 0.000804 time 1.1942 (2.1943) loss 2.9273 (3.7414) grad_norm 1.1600 (1.2080) [2022-01-20 08:53:56 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 87 training takes 0:45:45 [2022-01-20 08:54:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.340 (18.340) Loss 1.1011 (1.1011) Acc@1 72.070 (72.070) Acc@5 92.969 (92.969) [2022-01-20 08:54:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.777 (3.396) Loss 1.1769 (1.1447) Acc@1 71.973 (72.763) Acc@5 91.797 (92.045) [2022-01-20 08:54:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.564 (2.564) Loss 1.1579 (1.1453) Acc@1 72.559 (73.191) Acc@5 91.309 (91.978) [2022-01-20 08:55:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.562 (2.275) Loss 1.1808 (1.1480) Acc@1 72.070 (73.129) Acc@5 91.406 (91.819) 
[2022-01-20 08:55:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.254 (2.176) Loss 1.1270 (1.1462) Acc@1 72.559 (73.168) Acc@5 92.383 (91.883) [2022-01-20 08:55:31 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.192 Acc@5 91.858 [2022-01-20 08:55:31 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.2% [2022-01-20 08:55:31 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.19% [2022-01-20 08:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][0/1251] eta 7:19:44 lr 0.000804 time 21.0903 (21.0903) loss 4.4419 (4.4419) grad_norm 1.8322 (1.8322) [2022-01-20 08:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][10/1251] eta 1:27:33 lr 0.000804 time 2.8756 (4.2336) loss 3.9983 (3.4016) grad_norm 1.1597 (1.2889) [2022-01-20 08:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][20/1251] eta 1:06:36 lr 0.000804 time 1.4810 (3.2466) loss 4.2242 (3.5443) grad_norm 1.1868 (1.2549) [2022-01-20 08:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][30/1251] eta 0:59:06 lr 0.000804 time 1.2975 (2.9043) loss 3.9846 (3.5372) grad_norm 1.1713 (1.2516) [2022-01-20 08:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][40/1251] eta 0:56:06 lr 0.000804 time 3.7391 (2.7802) loss 3.9586 (3.6714) grad_norm 1.0224 (1.2193) [2022-01-20 08:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][50/1251] eta 0:54:36 lr 0.000804 time 2.5541 (2.7282) loss 2.8312 (3.7281) grad_norm 1.0025 (1.2076) [2022-01-20 08:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][60/1251] eta 0:52:24 lr 0.000804 time 1.5791 (2.6401) loss 4.4186 (3.7344) grad_norm 1.1128 (1.2160) [2022-01-20 08:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][70/1251] eta 0:50:21 lr 0.000804 time 1.9175 (2.5586) loss 2.8439 (3.7007) grad_norm 1.0957 (1.2116) 
[2022-01-20 08:58:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][80/1251] eta 0:48:45 lr 0.000804 time 2.3213 (2.4982) loss 4.3693 (3.6723) grad_norm 1.1428 (1.2095) [2022-01-20 08:59:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][90/1251] eta 0:47:41 lr 0.000804 time 2.0517 (2.4647) loss 4.4497 (3.7089) grad_norm 1.1585 (1.2097) [2022-01-20 08:59:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][100/1251] eta 0:46:45 lr 0.000804 time 2.7298 (2.4377) loss 4.2289 (3.7072) grad_norm 1.4272 (1.2137) [2022-01-20 08:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][110/1251] eta 0:45:49 lr 0.000804 time 1.5018 (2.4097) loss 3.9432 (3.7150) grad_norm 1.1322 (1.2073) [2022-01-20 09:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][120/1251] eta 0:44:55 lr 0.000804 time 2.0521 (2.3834) loss 3.1243 (3.7146) grad_norm 1.1719 (1.2060) [2022-01-20 09:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][130/1251] eta 0:44:17 lr 0.000804 time 1.9030 (2.3704) loss 4.5990 (3.7008) grad_norm 1.4124 (1.2024) [2022-01-20 09:01:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][140/1251] eta 0:43:43 lr 0.000804 time 2.2731 (2.3614) loss 3.6866 (3.7058) grad_norm 1.2537 (1.2017) [2022-01-20 09:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][150/1251] eta 0:43:04 lr 0.000804 time 2.1145 (2.3471) loss 4.3715 (3.7169) grad_norm 1.3788 (1.2045) [2022-01-20 09:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][160/1251] eta 0:42:28 lr 0.000804 time 1.7641 (2.3359) loss 3.4592 (3.7155) grad_norm 1.2216 (1.2044) [2022-01-20 09:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][170/1251] eta 0:41:56 lr 0.000804 time 1.9842 (2.3280) loss 3.0219 (3.6960) grad_norm 1.5368 (1.2039) [2022-01-20 09:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][180/1251] eta 0:41:29 lr 
0.000804 time 2.9905 (2.3246) loss 4.1220 (3.7018) grad_norm 1.5112 (1.2060) [2022-01-20 09:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][190/1251] eta 0:40:51 lr 0.000804 time 1.8829 (2.3109) loss 4.2522 (3.7094) grad_norm 1.3292 (1.2091) [2022-01-20 09:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][200/1251] eta 0:40:18 lr 0.000804 time 1.9428 (2.3014) loss 4.5954 (3.7178) grad_norm 1.1469 (1.2085) [2022-01-20 09:03:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][210/1251] eta 0:39:56 lr 0.000804 time 2.3753 (2.3020) loss 4.0106 (3.7200) grad_norm 1.1812 (1.2076) [2022-01-20 09:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][220/1251] eta 0:39:31 lr 0.000804 time 2.8291 (2.3004) loss 3.6273 (3.7206) grad_norm 1.3230 (1.2080) [2022-01-20 09:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][230/1251] eta 0:39:05 lr 0.000804 time 1.9181 (2.2975) loss 2.6328 (3.7193) grad_norm 1.0579 (1.2045) [2022-01-20 09:04:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][240/1251] eta 0:38:29 lr 0.000803 time 1.5904 (2.2846) loss 2.6671 (3.7096) grad_norm 1.0802 (1.2104) [2022-01-20 09:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][250/1251] eta 0:37:58 lr 0.000803 time 2.5628 (2.2763) loss 4.1674 (3.7134) grad_norm 1.2219 (1.2129) [2022-01-20 09:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][260/1251] eta 0:37:33 lr 0.000803 time 2.7704 (2.2744) loss 3.0969 (3.7160) grad_norm 1.1961 (1.2126) [2022-01-20 09:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][270/1251] eta 0:37:05 lr 0.000803 time 1.9973 (2.2690) loss 4.1979 (3.7165) grad_norm 1.3060 (1.2116) [2022-01-20 09:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][280/1251] eta 0:36:39 lr 0.000803 time 2.2027 (2.2657) loss 3.1590 (3.7227) grad_norm 1.3012 (1.2167) [2022-01-20 09:06:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][290/1251] eta 0:36:08 lr 0.000803 time 1.8952 (2.2567) loss 4.3897 (3.7182) grad_norm 1.5595 (1.2201) [2022-01-20 09:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][300/1251] eta 0:35:45 lr 0.000803 time 2.1629 (2.2560) loss 4.0738 (3.7130) grad_norm 1.2705 (1.2199) [2022-01-20 09:07:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][310/1251] eta 0:35:23 lr 0.000803 time 2.5227 (2.2562) loss 4.2450 (3.7094) grad_norm 1.1075 (1.2195) [2022-01-20 09:07:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][320/1251] eta 0:34:58 lr 0.000803 time 2.1983 (2.2541) loss 4.0645 (3.6976) grad_norm 1.2227 (1.2202) [2022-01-20 09:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][330/1251] eta 0:34:32 lr 0.000803 time 2.0598 (2.2504) loss 4.6443 (3.7021) grad_norm 1.4024 (1.2232) [2022-01-20 09:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][340/1251] eta 0:34:06 lr 0.000803 time 2.4578 (2.2462) loss 3.8343 (3.7018) grad_norm 0.9735 (1.2223) [2022-01-20 09:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][350/1251] eta 0:33:42 lr 0.000803 time 1.4737 (2.2445) loss 2.5685 (3.6996) grad_norm 1.2676 (1.2212) [2022-01-20 09:09:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][360/1251] eta 0:33:20 lr 0.000803 time 2.5713 (2.2455) loss 4.6899 (3.7034) grad_norm 1.2126 (1.2208) [2022-01-20 09:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][370/1251] eta 0:32:57 lr 0.000803 time 2.0927 (2.2450) loss 3.7569 (3.7122) grad_norm 1.4159 (1.2221) [2022-01-20 09:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][380/1251] eta 0:32:32 lr 0.000803 time 2.5098 (2.2413) loss 4.1468 (3.7192) grad_norm 1.3812 (1.2240) [2022-01-20 09:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][390/1251] eta 0:32:05 lr 0.000803 time 
2.2493 (2.2360) loss 4.1259 (3.7142) grad_norm 1.2035 (1.2241) [2022-01-20 09:10:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][400/1251] eta 0:31:40 lr 0.000803 time 2.6370 (2.2336) loss 3.5153 (3.7083) grad_norm 1.3764 (1.2259) [2022-01-20 09:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][410/1251] eta 0:31:17 lr 0.000803 time 2.6066 (2.2322) loss 4.2989 (3.7117) grad_norm 1.3707 (1.2269) [2022-01-20 09:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][420/1251] eta 0:30:53 lr 0.000803 time 2.7987 (2.2306) loss 4.0860 (3.7159) grad_norm 1.0401 (1.2264) [2022-01-20 09:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][430/1251] eta 0:30:30 lr 0.000803 time 1.7597 (2.2291) loss 3.8058 (3.7176) grad_norm 1.3682 (1.2263) [2022-01-20 09:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][440/1251] eta 0:30:09 lr 0.000803 time 2.9630 (2.2311) loss 4.1209 (3.7204) grad_norm 1.4248 (1.2280) [2022-01-20 09:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][450/1251] eta 0:29:49 lr 0.000803 time 3.0587 (2.2345) loss 4.3185 (3.7279) grad_norm 1.1122 (1.2275) [2022-01-20 09:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][460/1251] eta 0:29:31 lr 0.000803 time 2.3690 (2.2392) loss 3.3949 (3.7236) grad_norm 1.2621 (1.2274) [2022-01-20 09:13:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][470/1251] eta 0:29:09 lr 0.000803 time 2.1311 (2.2405) loss 3.4025 (3.7259) grad_norm 1.1269 (1.2270) [2022-01-20 09:13:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][480/1251] eta 0:28:47 lr 0.000803 time 3.0931 (2.2406) loss 3.7914 (3.7265) grad_norm 1.1103 (1.2262) [2022-01-20 09:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][490/1251] eta 0:28:21 lr 0.000803 time 1.9069 (2.2364) loss 3.9714 (3.7288) grad_norm 1.3649 (1.2267) [2022-01-20 09:14:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][500/1251] eta 0:27:54 lr 0.000803 time 1.9704 (2.2296) loss 2.9337 (3.7256) grad_norm 1.1406 (1.2278) [2022-01-20 09:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][510/1251] eta 0:27:31 lr 0.000803 time 2.0327 (2.2292) loss 3.7695 (3.7229) grad_norm 1.3274 (1.2279) [2022-01-20 09:14:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][520/1251] eta 0:27:07 lr 0.000803 time 2.8722 (2.2267) loss 3.5795 (3.7225) grad_norm 1.1502 (1.2299) [2022-01-20 09:15:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][530/1251] eta 0:26:46 lr 0.000803 time 1.7476 (2.2275) loss 3.6219 (3.7240) grad_norm 1.0912 (1.2291) [2022-01-20 09:15:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][540/1251] eta 0:26:25 lr 0.000802 time 2.6989 (2.2299) loss 3.7537 (3.7236) grad_norm 1.1456 (1.2287) [2022-01-20 09:15:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][550/1251] eta 0:26:00 lr 0.000802 time 1.5546 (2.2266) loss 3.2464 (3.7243) grad_norm 1.4478 (1.2307) [2022-01-20 09:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][560/1251] eta 0:25:37 lr 0.000802 time 2.4666 (2.2249) loss 3.7903 (3.7253) grad_norm 1.2756 (1.2308) [2022-01-20 09:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][570/1251] eta 0:25:13 lr 0.000802 time 1.5752 (2.2229) loss 2.5295 (3.7229) grad_norm 1.3617 (1.2308) [2022-01-20 09:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][580/1251] eta 0:24:50 lr 0.000802 time 1.6497 (2.2216) loss 3.6103 (3.7265) grad_norm 1.0531 (1.2297) [2022-01-20 09:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][590/1251] eta 0:24:28 lr 0.000802 time 2.4902 (2.2224) loss 3.2915 (3.7303) grad_norm 1.6850 (1.2308) [2022-01-20 09:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][600/1251] eta 0:24:08 lr 0.000802 time 
2.4859 (2.2256) loss 4.1373 (3.7307) grad_norm 1.0284 (1.2294) [2022-01-20 09:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][610/1251] eta 0:23:46 lr 0.000802 time 2.1422 (2.2250) loss 3.3068 (3.7295) grad_norm 1.1809 (1.2280) [2022-01-20 09:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][620/1251] eta 0:23:24 lr 0.000802 time 1.9864 (2.2258) loss 3.7345 (3.7266) grad_norm 1.3454 (1.2275) [2022-01-20 09:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][630/1251] eta 0:23:01 lr 0.000802 time 2.6694 (2.2252) loss 3.3712 (3.7257) grad_norm 1.6407 (1.2283) [2022-01-20 09:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][640/1251] eta 0:22:37 lr 0.000802 time 1.8631 (2.2226) loss 3.7266 (3.7258) grad_norm 1.1741 (1.2267) [2022-01-20 09:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][650/1251] eta 0:22:15 lr 0.000802 time 1.5603 (2.2214) loss 4.2565 (3.7254) grad_norm 1.0229 (1.2258) [2022-01-20 09:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][660/1251] eta 0:21:51 lr 0.000802 time 1.6458 (2.2193) loss 3.7028 (3.7195) grad_norm 1.0590 (1.2244) [2022-01-20 09:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][670/1251] eta 0:21:27 lr 0.000802 time 1.6017 (2.2165) loss 3.4448 (3.7213) grad_norm 1.2794 (1.2256) [2022-01-20 09:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][680/1251] eta 0:21:05 lr 0.000802 time 2.0806 (2.2168) loss 3.9519 (3.7236) grad_norm 1.0818 (1.2262) [2022-01-20 09:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][690/1251] eta 0:20:44 lr 0.000802 time 1.8442 (2.2181) loss 2.9054 (3.7246) grad_norm 1.6063 (1.2266) [2022-01-20 09:21:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][700/1251] eta 0:20:21 lr 0.000802 time 1.8922 (2.2174) loss 4.4277 (3.7285) grad_norm 1.1119 (1.2257) [2022-01-20 09:21:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][710/1251] eta 0:19:58 lr 0.000802 time 1.9939 (2.2157) loss 3.3019 (3.7284) grad_norm 1.2113 (1.2251) [2022-01-20 09:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][720/1251] eta 0:19:36 lr 0.000802 time 2.1407 (2.2154) loss 3.5167 (3.7260) grad_norm 1.1731 (1.2254) [2022-01-20 09:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][730/1251] eta 0:19:14 lr 0.000802 time 1.9395 (2.2153) loss 4.0788 (3.7273) grad_norm 1.0997 (1.2263) [2022-01-20 09:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][740/1251] eta 0:18:51 lr 0.000802 time 1.8053 (2.2143) loss 3.2158 (3.7220) grad_norm 1.0840 (1.2255) [2022-01-20 09:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][750/1251] eta 0:18:29 lr 0.000802 time 1.6697 (2.2146) loss 2.8654 (3.7207) grad_norm 1.1344 (1.2248) [2022-01-20 09:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][760/1251] eta 0:18:07 lr 0.000802 time 2.4487 (2.2142) loss 2.7382 (3.7229) grad_norm 1.2471 (1.2269) [2022-01-20 09:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][770/1251] eta 0:17:44 lr 0.000802 time 2.4924 (2.2127) loss 4.2464 (3.7239) grad_norm 1.1391 (1.2265) [2022-01-20 09:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][780/1251] eta 0:17:21 lr 0.000802 time 1.8899 (2.2108) loss 3.6326 (3.7292) grad_norm 1.1389 (1.2264) [2022-01-20 09:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][790/1251] eta 0:16:59 lr 0.000802 time 1.9308 (2.2108) loss 4.3643 (3.7293) grad_norm 1.4252 (1.2267) [2022-01-20 09:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][800/1251] eta 0:16:37 lr 0.000802 time 1.5469 (2.2110) loss 2.8200 (3.7286) grad_norm 1.2046 (1.2271) [2022-01-20 09:25:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][810/1251] eta 0:16:14 lr 0.000802 time 
2.5312 (2.2109) loss 3.8312 (3.7311) grad_norm 1.1916 (1.2272) [2022-01-20 09:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][820/1251] eta 0:15:52 lr 0.000802 time 1.4833 (2.2095) loss 4.0743 (3.7338) grad_norm 1.1358 (1.2276) [2022-01-20 09:26:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][830/1251] eta 0:15:30 lr 0.000802 time 2.2029 (2.2113) loss 4.0053 (3.7342) grad_norm 0.9648 (1.2270) [2022-01-20 09:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][840/1251] eta 0:15:09 lr 0.000801 time 1.7896 (2.2119) loss 3.0054 (3.7340) grad_norm 1.0820 (1.2262) [2022-01-20 09:26:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][850/1251] eta 0:14:46 lr 0.000801 time 2.1869 (2.2114) loss 3.8190 (3.7340) grad_norm 1.2108 (1.2265) [2022-01-20 09:27:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][860/1251] eta 0:14:23 lr 0.000801 time 1.9605 (2.2091) loss 4.5359 (3.7353) grad_norm 1.2174 (1.2267) [2022-01-20 09:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][870/1251] eta 0:14:01 lr 0.000801 time 1.9480 (2.2080) loss 2.6635 (3.7355) grad_norm 1.2449 (1.2260) [2022-01-20 09:27:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][880/1251] eta 0:13:38 lr 0.000801 time 1.7771 (2.2070) loss 2.6828 (3.7374) grad_norm 1.0362 (1.2262) [2022-01-20 09:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][890/1251] eta 0:13:16 lr 0.000801 time 1.9172 (2.2062) loss 3.9566 (3.7375) grad_norm 1.2281 (1.2258) [2022-01-20 09:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][900/1251] eta 0:12:54 lr 0.000801 time 1.5363 (2.2064) loss 3.0788 (3.7383) grad_norm 1.2568 (1.2254) [2022-01-20 09:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][910/1251] eta 0:12:32 lr 0.000801 time 2.2914 (2.2075) loss 4.1393 (3.7420) grad_norm 1.2602 (1.2260) [2022-01-20 09:29:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][920/1251] eta 0:12:11 lr 0.000801 time 2.8449 (2.2104) loss 4.2041 (3.7421) grad_norm 1.0767 (1.2250) [2022-01-20 09:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][930/1251] eta 0:11:48 lr 0.000801 time 1.9140 (2.2085) loss 4.4169 (3.7434) grad_norm 1.1242 (1.2246) [2022-01-20 09:30:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][940/1251] eta 0:11:26 lr 0.000801 time 1.6518 (2.2075) loss 4.2402 (3.7422) grad_norm 1.1167 (1.2247) [2022-01-20 09:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][950/1251] eta 0:11:03 lr 0.000801 time 2.2354 (2.2047) loss 3.7305 (3.7428) grad_norm 1.3769 (1.2253) [2022-01-20 09:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][960/1251] eta 0:10:41 lr 0.000801 time 2.2298 (2.2037) loss 4.1139 (3.7432) grad_norm 1.1616 (1.2244) [2022-01-20 09:31:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][970/1251] eta 0:10:19 lr 0.000801 time 2.1873 (2.2031) loss 3.4545 (3.7445) grad_norm 0.9988 (1.2233) [2022-01-20 09:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][980/1251] eta 0:09:57 lr 0.000801 time 2.1657 (2.2048) loss 4.2373 (3.7468) grad_norm 1.1568 (1.2229) [2022-01-20 09:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][990/1251] eta 0:09:35 lr 0.000801 time 2.7083 (2.2066) loss 4.0660 (3.7490) grad_norm 1.1229 (1.2223) [2022-01-20 09:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1000/1251] eta 0:09:13 lr 0.000801 time 2.2406 (2.2067) loss 4.1262 (3.7495) grad_norm 1.3446 (1.2225) [2022-01-20 09:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1010/1251] eta 0:08:51 lr 0.000801 time 2.0405 (2.2066) loss 4.5084 (3.7506) grad_norm 1.2049 (1.2234) [2022-01-20 09:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1020/1251] eta 0:08:29 lr 0.000801 time 
1.8519 (2.2077) loss 2.8823 (3.7511) grad_norm 1.1562 (1.2229) [2022-01-20 09:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1030/1251] eta 0:08:07 lr 0.000801 time 2.3627 (2.2076) loss 4.4949 (3.7528) grad_norm 1.2233 (1.2230) [2022-01-20 09:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1040/1251] eta 0:07:45 lr 0.000801 time 2.5460 (2.2078) loss 3.7325 (3.7519) grad_norm 1.2570 (1.2229) [2022-01-20 09:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1050/1251] eta 0:07:23 lr 0.000801 time 1.6334 (2.2050) loss 3.5235 (3.7517) grad_norm 1.4031 (1.2223) [2022-01-20 09:34:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1060/1251] eta 0:07:00 lr 0.000801 time 1.9060 (2.2035) loss 3.9688 (3.7542) grad_norm 1.2670 (1.2223) [2022-01-20 09:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1070/1251] eta 0:06:38 lr 0.000801 time 1.9326 (2.2019) loss 3.4715 (3.7531) grad_norm 1.1189 (1.2217) [2022-01-20 09:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1080/1251] eta 0:06:16 lr 0.000801 time 2.2102 (2.2015) loss 3.7041 (3.7545) grad_norm 1.4054 (1.2220) [2022-01-20 09:35:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1090/1251] eta 0:05:54 lr 0.000801 time 1.9413 (2.2000) loss 4.4821 (3.7565) grad_norm 1.2236 (1.2220) [2022-01-20 09:35:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1100/1251] eta 0:05:32 lr 0.000801 time 1.8226 (2.1990) loss 3.2145 (3.7552) grad_norm 1.3311 (1.2221) [2022-01-20 09:36:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1110/1251] eta 0:05:10 lr 0.000801 time 2.9943 (2.1997) loss 4.1910 (3.7545) grad_norm 1.1563 (1.2220) [2022-01-20 09:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1120/1251] eta 0:04:48 lr 0.000801 time 1.8721 (2.2001) loss 2.8559 (3.7520) grad_norm 1.4023 (1.2219) [2022-01-20 09:37:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1130/1251] eta 0:04:26 lr 0.000801 time 2.5160 (2.2005) loss 3.7654 (3.7524) grad_norm 1.2948 (1.2214) [2022-01-20 09:37:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1140/1251] eta 0:04:04 lr 0.000801 time 1.8156 (2.2023) loss 4.3675 (3.7529) grad_norm 1.1261 (1.2204) [2022-01-20 09:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1150/1251] eta 0:03:42 lr 0.000800 time 3.1712 (2.2043) loss 4.2113 (3.7551) grad_norm 1.1534 (1.2196) [2022-01-20 09:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1160/1251] eta 0:03:20 lr 0.000800 time 1.4431 (2.2053) loss 4.5351 (3.7559) grad_norm 1.3204 (1.2196) [2022-01-20 09:38:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1170/1251] eta 0:02:58 lr 0.000800 time 2.5338 (2.2059) loss 4.4337 (3.7563) grad_norm 1.3640 (1.2202) [2022-01-20 09:38:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1180/1251] eta 0:02:36 lr 0.000800 time 1.6866 (2.2029) loss 3.8442 (3.7563) grad_norm 1.2682 (1.2204) [2022-01-20 09:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1190/1251] eta 0:02:14 lr 0.000800 time 1.8696 (2.1999) loss 4.8202 (3.7563) grad_norm 1.4258 (1.2207) [2022-01-20 09:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1200/1251] eta 0:01:52 lr 0.000800 time 2.5767 (2.1987) loss 3.9632 (3.7554) grad_norm 1.3845 (1.2216) [2022-01-20 09:39:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1210/1251] eta 0:01:30 lr 0.000800 time 1.9329 (2.1974) loss 3.2850 (3.7540) grad_norm 1.0886 (1.2213) [2022-01-20 09:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1220/1251] eta 0:01:08 lr 0.000800 time 1.7905 (2.1977) loss 3.5441 (3.7550) grad_norm 1.2063 (1.2208) [2022-01-20 09:40:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1230/1251] eta 0:00:46 lr 
0.000800 time 2.3917 (2.1985) loss 2.7516 (3.7540) grad_norm 1.1451 (1.2209) [2022-01-20 09:41:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1240/1251] eta 0:00:24 lr 0.000800 time 2.4582 (2.1986) loss 3.6544 (3.7536) grad_norm 1.2454 (1.2208) [2022-01-20 09:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [88/300][1250/1251] eta 0:00:02 lr 0.000800 time 1.1372 (2.1935) loss 4.0375 (3.7538) grad_norm 1.2909 (1.2203) [2022-01-20 09:41:16 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 88 training takes 0:45:44 [2022-01-20 09:41:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.365 (18.365) Loss 1.1451 (1.1451) Acc@1 73.633 (73.633) Acc@5 91.699 (91.699) [2022-01-20 09:41:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.672 (3.249) Loss 1.1875 (1.1349) Acc@1 72.852 (73.544) Acc@5 91.309 (92.028) [2022-01-20 09:42:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 2.234 (2.600) Loss 1.1245 (1.1590) Acc@1 74.609 (73.228) Acc@5 91.699 (91.853) [2022-01-20 09:42:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.922 (2.282) Loss 1.1594 (1.1705) Acc@1 74.023 (72.886) Acc@5 91.797 (91.718) [2022-01-20 09:42:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.022 (2.196) Loss 1.2423 (1.1703) Acc@1 71.289 (72.942) Acc@5 91.504 (91.733) [2022-01-20 09:42:53 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 72.964 Acc@5 91.712 [2022-01-20 09:42:53 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-01-20 09:42:53 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.19% [2022-01-20 09:43:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][0/1251] eta 7:27:14 lr 0.000800 time 21.4505 (21.4505) loss 3.8520 (3.8520) grad_norm 1.2463 (1.2463) [2022-01-20 09:43:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[89/300][10/1251] eta 1:24:08 lr 0.000800 time 1.6143 (4.0681) loss 4.3168 (3.4488) grad_norm 1.0514 (1.1849) [2022-01-20 09:44:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][20/1251] eta 1:05:25 lr 0.000800 time 1.3856 (3.1891) loss 3.3608 (3.5921) grad_norm 1.1063 (1.1998) [2022-01-20 09:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][30/1251] eta 0:58:58 lr 0.000800 time 1.8684 (2.8977) loss 3.5151 (3.6274) grad_norm 1.0919 (1.2326) [2022-01-20 09:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][40/1251] eta 0:55:00 lr 0.000800 time 3.4887 (2.7251) loss 2.9181 (3.6061) grad_norm 1.2245 (1.2589) [2022-01-20 09:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][50/1251] eta 0:52:28 lr 0.000800 time 2.2018 (2.6214) loss 4.2391 (3.6524) grad_norm 1.3444 (1.2598) [2022-01-20 09:45:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][60/1251] eta 0:50:26 lr 0.000800 time 1.2248 (2.5407) loss 4.0702 (3.6409) grad_norm 0.9867 (1.2445) [2022-01-20 09:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][70/1251] eta 0:49:14 lr 0.000800 time 1.9685 (2.5020) loss 3.7716 (3.6265) grad_norm 1.0755 (1.2272) [2022-01-20 09:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][80/1251] eta 0:48:28 lr 0.000800 time 2.8783 (2.4834) loss 3.1180 (3.6258) grad_norm 1.2533 (1.2202) [2022-01-20 09:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][90/1251] eta 0:47:45 lr 0.000800 time 1.9449 (2.4685) loss 2.7692 (3.6314) grad_norm 1.1144 (1.2250) [2022-01-20 09:46:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][100/1251] eta 0:46:43 lr 0.000800 time 1.7484 (2.4353) loss 3.5248 (3.6252) grad_norm 1.2296 (1.2249) [2022-01-20 09:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][110/1251] eta 0:45:30 lr 0.000800 time 1.5922 (2.3932) loss 3.3299 (3.6401) grad_norm 1.2565 (1.2231) 
[2022-01-20 09:47:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][120/1251] eta 0:44:41 lr 0.000800 time 2.5674 (2.3711) loss 3.2280 (3.6268) grad_norm 1.5692 (1.2264) [2022-01-20 09:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][130/1251] eta 0:44:18 lr 0.000800 time 2.7995 (2.3712) loss 3.1405 (3.6532) grad_norm 1.5161 (1.2310) [2022-01-20 09:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][140/1251] eta 0:43:41 lr 0.000800 time 2.5142 (2.3599) loss 2.9546 (3.6528) grad_norm 1.1706 (1.2245) [2022-01-20 09:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][150/1251] eta 0:42:59 lr 0.000800 time 1.8147 (2.3425) loss 3.9364 (3.6652) grad_norm 1.1629 (1.2261) [2022-01-20 09:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][160/1251] eta 0:42:26 lr 0.000800 time 2.7828 (2.3343) loss 3.9146 (3.6850) grad_norm 1.1557 (1.2249) [2022-01-20 09:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][170/1251] eta 0:41:49 lr 0.000800 time 2.2781 (2.3212) loss 3.8411 (3.6772) grad_norm 1.1732 (1.2216) [2022-01-20 09:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][180/1251] eta 0:41:10 lr 0.000800 time 2.1267 (2.3063) loss 3.9000 (3.6805) grad_norm 1.2674 (1.2168) [2022-01-20 09:50:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][190/1251] eta 0:40:42 lr 0.000799 time 2.9839 (2.3022) loss 3.8014 (3.6963) grad_norm 1.2916 (1.2159) [2022-01-20 09:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][200/1251] eta 0:40:18 lr 0.000799 time 2.7866 (2.3010) loss 4.0646 (3.6949) grad_norm 1.1727 (1.2173) [2022-01-20 09:50:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][210/1251] eta 0:39:45 lr 0.000799 time 2.2077 (2.2912) loss 4.1184 (3.7055) grad_norm 1.0079 (1.2127) [2022-01-20 09:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][220/1251] eta 0:39:15 
lr 0.000799 time 1.8642 (2.2848) loss 3.8817 (3.7061) grad_norm 1.1633 (1.2157) [2022-01-20 09:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][230/1251] eta 0:38:51 lr 0.000799 time 3.2470 (2.2838) loss 4.7378 (3.7142) grad_norm 1.2140 (1.2172) [2022-01-20 09:52:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][240/1251] eta 0:38:24 lr 0.000799 time 1.9530 (2.2794) loss 4.2558 (3.7099) grad_norm 1.1032 (1.2171) [2022-01-20 09:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][250/1251] eta 0:37:58 lr 0.000799 time 2.0439 (2.2758) loss 3.6376 (3.7150) grad_norm 1.0993 (1.2148) [2022-01-20 09:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][260/1251] eta 0:37:37 lr 0.000799 time 1.9171 (2.2779) loss 3.9704 (3.7265) grad_norm 1.2103 (1.2184) [2022-01-20 09:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][270/1251] eta 0:37:07 lr 0.000799 time 2.5430 (2.2709) loss 2.4650 (3.7151) grad_norm 1.2353 (1.2199) [2022-01-20 09:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][280/1251] eta 0:36:38 lr 0.000799 time 1.9928 (2.2645) loss 4.0942 (3.7131) grad_norm 1.1698 (1.2209) [2022-01-20 09:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][290/1251] eta 0:36:13 lr 0.000799 time 2.1482 (2.2620) loss 3.9822 (3.7180) grad_norm 1.1211 (1.2199) [2022-01-20 09:54:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][300/1251] eta 0:35:49 lr 0.000799 time 2.1285 (2.2598) loss 4.4854 (3.7261) grad_norm 1.1970 (1.2178) [2022-01-20 09:54:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][310/1251] eta 0:35:23 lr 0.000799 time 2.5312 (2.2561) loss 3.9766 (3.7301) grad_norm 1.0041 (1.2133) [2022-01-20 09:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][320/1251] eta 0:34:59 lr 0.000799 time 1.7803 (2.2549) loss 3.8525 (3.7283) grad_norm 1.2379 (1.2133) [2022-01-20 09:55:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][330/1251] eta 0:34:33 lr 0.000799 time 1.8167 (2.2511) loss 4.2330 (3.7285) grad_norm 1.2565 (1.2145) [2022-01-20 09:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][340/1251] eta 0:34:06 lr 0.000799 time 2.0735 (2.2467) loss 3.1117 (3.7242) grad_norm 1.2234 (1.2157) [2022-01-20 09:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][350/1251] eta 0:33:42 lr 0.000799 time 2.0737 (2.2452) loss 3.5758 (3.7347) grad_norm 1.1588 (1.2145) [2022-01-20 09:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][360/1251] eta 0:33:17 lr 0.000799 time 1.8955 (2.2419) loss 4.5365 (3.7373) grad_norm 1.2384 (1.2139) [2022-01-20 09:56:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][370/1251] eta 0:32:53 lr 0.000799 time 2.2458 (2.2400) loss 3.8103 (3.7331) grad_norm 1.3113 (1.2162) [2022-01-20 09:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][380/1251] eta 0:32:31 lr 0.000799 time 2.8593 (2.2406) loss 3.1511 (3.7289) grad_norm 1.2486 (1.2177) [2022-01-20 09:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][390/1251] eta 0:32:08 lr 0.000799 time 2.2431 (2.2400) loss 3.7047 (3.7211) grad_norm 1.5939 (1.2223) [2022-01-20 09:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][400/1251] eta 0:31:44 lr 0.000799 time 1.9378 (2.2378) loss 4.2562 (3.7229) grad_norm 1.3310 (1.2242) [2022-01-20 09:58:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][410/1251] eta 0:31:24 lr 0.000799 time 2.5880 (2.2404) loss 4.2747 (3.7217) grad_norm 1.4650 (1.2238) [2022-01-20 09:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][420/1251] eta 0:31:01 lr 0.000799 time 2.4775 (2.2398) loss 4.1504 (3.7246) grad_norm 1.1152 (1.2233) [2022-01-20 09:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][430/1251] eta 0:30:36 lr 0.000799 time 
2.0159 (2.2364) loss 4.2972 (3.7237) grad_norm 1.2667 (1.2226) [2022-01-20 09:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][440/1251] eta 0:30:09 lr 0.000799 time 1.7274 (2.2317) loss 3.3993 (3.7247) grad_norm 1.1241 (1.2221) [2022-01-20 09:59:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][450/1251] eta 0:29:46 lr 0.000799 time 2.4759 (2.2303) loss 4.5269 (3.7232) grad_norm 1.0847 (1.2224) [2022-01-20 10:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][460/1251] eta 0:29:24 lr 0.000799 time 2.2002 (2.2303) loss 4.7318 (3.7284) grad_norm 1.0824 (1.2240) [2022-01-20 10:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][470/1251] eta 0:28:59 lr 0.000799 time 2.7801 (2.2269) loss 3.7782 (3.7322) grad_norm 1.3827 (1.2243) [2022-01-20 10:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][480/1251] eta 0:28:34 lr 0.000799 time 1.7366 (2.2235) loss 3.5335 (3.7258) grad_norm 1.0357 (1.2229) [2022-01-20 10:01:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][490/1251] eta 0:28:10 lr 0.000798 time 1.9936 (2.2212) loss 2.8198 (3.7273) grad_norm 1.4781 (1.2236) [2022-01-20 10:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][500/1251] eta 0:27:48 lr 0.000798 time 2.2105 (2.2215) loss 3.9571 (3.7253) grad_norm 1.2316 (1.2262) [2022-01-20 10:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][510/1251] eta 0:27:29 lr 0.000798 time 2.5276 (2.2260) loss 3.2206 (3.7214) grad_norm 1.1502 (1.2267) [2022-01-20 10:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][520/1251] eta 0:27:07 lr 0.000798 time 2.0511 (2.2265) loss 2.6860 (3.7202) grad_norm 1.0595 (1.2252) [2022-01-20 10:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][530/1251] eta 0:26:44 lr 0.000798 time 1.9806 (2.2250) loss 3.3306 (3.7229) grad_norm 1.0519 (1.2243) [2022-01-20 10:02:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][540/1251] eta 0:26:21 lr 0.000798 time 1.8755 (2.2244) loss 3.9467 (3.7282) grad_norm 1.1469 (1.2249) [2022-01-20 10:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][550/1251] eta 0:25:59 lr 0.000798 time 3.6084 (2.2251) loss 3.9845 (3.7219) grad_norm 1.0629 (1.2238) [2022-01-20 10:03:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][560/1251] eta 0:25:36 lr 0.000798 time 2.0822 (2.2239) loss 4.2924 (3.7244) grad_norm 1.5006 (1.2251) [2022-01-20 10:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][570/1251] eta 0:25:12 lr 0.000798 time 2.0018 (2.2212) loss 3.9959 (3.7248) grad_norm 1.5055 (1.2273) [2022-01-20 10:04:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][580/1251] eta 0:24:49 lr 0.000798 time 1.7218 (2.2197) loss 3.2810 (3.7223) grad_norm 1.2066 (1.2259) [2022-01-20 10:04:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][590/1251] eta 0:24:24 lr 0.000798 time 2.1880 (2.2155) loss 3.3391 (3.7223) grad_norm 1.2323 (1.2252) [2022-01-20 10:05:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][600/1251] eta 0:24:03 lr 0.000798 time 2.2822 (2.2168) loss 4.4651 (3.7260) grad_norm 1.0751 (1.2259) [2022-01-20 10:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][610/1251] eta 0:23:42 lr 0.000798 time 3.1640 (2.2186) loss 4.1024 (3.7278) grad_norm 1.3098 (1.2256) [2022-01-20 10:05:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][620/1251] eta 0:23:19 lr 0.000798 time 2.0590 (2.2180) loss 3.6400 (3.7267) grad_norm 1.3005 (1.2258) [2022-01-20 10:06:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][630/1251] eta 0:22:57 lr 0.000798 time 1.5635 (2.2180) loss 3.8099 (3.7253) grad_norm 1.0623 (1.2269) [2022-01-20 10:06:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][640/1251] eta 0:22:35 lr 0.000798 time 
2.1131 (2.2184) loss 3.1960 (3.7222) grad_norm 1.1284 (1.2259) [2022-01-20 10:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][650/1251] eta 0:22:12 lr 0.000798 time 2.1656 (2.2168) loss 3.1473 (3.7243) grad_norm 1.1851 (1.2259) [2022-01-20 10:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][660/1251] eta 0:21:48 lr 0.000798 time 1.6021 (2.2143) loss 4.3071 (3.7228) grad_norm 1.0220 (1.2248) [2022-01-20 10:07:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][670/1251] eta 0:21:26 lr 0.000798 time 1.6290 (2.2137) loss 3.9116 (3.7282) grad_norm 1.0464 (1.2237) [2022-01-20 10:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][680/1251] eta 0:21:04 lr 0.000798 time 2.2401 (2.2144) loss 3.7939 (3.7332) grad_norm 1.4441 (1.2252) [2022-01-20 10:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][690/1251] eta 0:20:42 lr 0.000798 time 1.9183 (2.2154) loss 2.9870 (3.7354) grad_norm 1.2321 (1.2261) [2022-01-20 10:08:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][700/1251] eta 0:20:20 lr 0.000798 time 1.7212 (2.2147) loss 4.2389 (3.7386) grad_norm 1.2989 (1.2267) [2022-01-20 10:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][710/1251] eta 0:19:56 lr 0.000798 time 2.0878 (2.2121) loss 3.3643 (3.7408) grad_norm 1.1860 (1.2265) [2022-01-20 10:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][720/1251] eta 0:19:33 lr 0.000798 time 2.2132 (2.2091) loss 4.5320 (3.7432) grad_norm 1.3737 (1.2267) [2022-01-20 10:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][730/1251] eta 0:19:11 lr 0.000798 time 2.8050 (2.2095) loss 4.3523 (3.7448) grad_norm 1.3320 (1.2267) [2022-01-20 10:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][740/1251] eta 0:18:48 lr 0.000798 time 1.9297 (2.2078) loss 4.2827 (3.7448) grad_norm 1.1766 (1.2275) [2022-01-20 10:10:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][750/1251] eta 0:18:25 lr 0.000798 time 2.3639 (2.2071) loss 2.9623 (3.7391) grad_norm 1.3255 (1.2282) [2022-01-20 10:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][760/1251] eta 0:18:03 lr 0.000798 time 2.0006 (2.2068) loss 3.9377 (3.7428) grad_norm 1.2145 (1.2275) [2022-01-20 10:11:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][770/1251] eta 0:17:41 lr 0.000798 time 2.2554 (2.2072) loss 3.8297 (3.7460) grad_norm 1.1192 (1.2270) [2022-01-20 10:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][780/1251] eta 0:17:18 lr 0.000798 time 2.0077 (2.2058) loss 4.1617 (3.7466) grad_norm 1.2031 (1.2274) [2022-01-20 10:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][790/1251] eta 0:16:56 lr 0.000797 time 2.5116 (2.2054) loss 4.7056 (3.7469) grad_norm 1.1260 (1.2272) [2022-01-20 10:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][800/1251] eta 0:16:35 lr 0.000797 time 2.1671 (2.2067) loss 3.8379 (3.7456) grad_norm 0.9982 (1.2260) [2022-01-20 10:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][810/1251] eta 0:16:13 lr 0.000797 time 2.5612 (2.2066) loss 3.6384 (3.7484) grad_norm 1.0887 (1.2256) [2022-01-20 10:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][820/1251] eta 0:15:50 lr 0.000797 time 2.1243 (2.2050) loss 3.9120 (3.7490) grad_norm 1.3342 (1.2257) [2022-01-20 10:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][830/1251] eta 0:15:28 lr 0.000797 time 3.4416 (2.2059) loss 3.9389 (3.7513) grad_norm 1.1440 (1.2254) [2022-01-20 10:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][840/1251] eta 0:15:07 lr 0.000797 time 1.9230 (2.2073) loss 2.5795 (3.7475) grad_norm 1.0374 (1.2241) [2022-01-20 10:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][850/1251] eta 0:14:45 lr 0.000797 time 
2.7930 (2.2084) loss 3.7887 (3.7499) grad_norm 1.3693 (1.2239) [2022-01-20 10:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][860/1251] eta 0:14:22 lr 0.000797 time 2.1523 (2.2063) loss 2.7320 (3.7468) grad_norm 0.9942 (1.2238) [2022-01-20 10:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][870/1251] eta 0:13:59 lr 0.000797 time 2.1186 (2.2042) loss 3.9163 (3.7493) grad_norm 1.2254 (1.2229) [2022-01-20 10:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][880/1251] eta 0:13:38 lr 0.000797 time 2.4261 (2.2052) loss 3.3161 (3.7467) grad_norm 1.1516 (1.2222) [2022-01-20 10:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][890/1251] eta 0:13:16 lr 0.000797 time 2.9812 (2.2059) loss 3.9462 (3.7476) grad_norm 1.1017 (1.2212) [2022-01-20 10:16:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][900/1251] eta 0:12:53 lr 0.000797 time 1.8320 (2.2051) loss 3.6224 (3.7506) grad_norm 1.1087 (1.2216) [2022-01-20 10:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][910/1251] eta 0:12:31 lr 0.000797 time 1.9443 (2.2040) loss 4.2402 (3.7521) grad_norm 1.2876 (1.2211) [2022-01-20 10:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][920/1251] eta 0:12:09 lr 0.000797 time 1.5433 (2.2027) loss 4.5618 (3.7549) grad_norm 1.1861 (1.2212) [2022-01-20 10:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][930/1251] eta 0:11:46 lr 0.000797 time 2.8918 (2.2020) loss 4.1151 (3.7559) grad_norm 1.1489 (1.2208) [2022-01-20 10:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][940/1251] eta 0:11:24 lr 0.000797 time 1.9149 (2.2015) loss 4.1737 (3.7577) grad_norm 1.3059 (1.2214) [2022-01-20 10:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][950/1251] eta 0:11:02 lr 0.000797 time 2.1898 (2.2025) loss 2.4359 (3.7566) grad_norm 1.3269 (1.2209) [2022-01-20 10:18:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][960/1251] eta 0:10:40 lr 0.000797 time 2.1030 (2.2026) loss 3.8241 (3.7558) grad_norm 1.0886 (1.2203) [2022-01-20 10:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][970/1251] eta 0:10:18 lr 0.000797 time 1.8002 (2.2018) loss 3.0339 (3.7550) grad_norm 1.2254 (1.2200) [2022-01-20 10:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][980/1251] eta 0:09:56 lr 0.000797 time 1.6791 (2.2006) loss 3.5175 (3.7554) grad_norm 1.1769 (1.2194) [2022-01-20 10:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][990/1251] eta 0:09:33 lr 0.000797 time 2.2593 (2.1992) loss 4.2401 (3.7557) grad_norm 0.9497 (1.2190) [2022-01-20 10:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1000/1251] eta 0:09:11 lr 0.000797 time 2.0197 (2.1982) loss 3.3521 (3.7526) grad_norm 1.1043 (1.2199) [2022-01-20 10:19:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1010/1251] eta 0:08:49 lr 0.000797 time 1.7026 (2.1975) loss 2.8699 (3.7543) grad_norm 1.1762 (1.2200) [2022-01-20 10:20:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1020/1251] eta 0:08:27 lr 0.000797 time 2.1495 (2.1988) loss 4.7233 (3.7564) grad_norm 1.1290 (1.2191) [2022-01-20 10:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1030/1251] eta 0:08:06 lr 0.000797 time 3.0362 (2.2007) loss 4.0588 (3.7556) grad_norm 1.5323 (1.2193) [2022-01-20 10:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1040/1251] eta 0:07:44 lr 0.000797 time 2.0952 (2.2003) loss 3.3805 (3.7542) grad_norm 1.2817 (1.2192) [2022-01-20 10:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1050/1251] eta 0:07:21 lr 0.000797 time 2.0577 (2.1988) loss 3.5900 (3.7545) grad_norm 1.2453 (1.2188) [2022-01-20 10:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1060/1251] eta 0:06:59 lr 0.000797 
time 2.1990 (2.1985) loss 3.3455 (3.7550) grad_norm 1.3585 (1.2189) [2022-01-20 10:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1070/1251] eta 0:06:38 lr 0.000797 time 3.2628 (2.1997) loss 3.8814 (3.7561) grad_norm 1.3006 (1.2199) [2022-01-20 10:22:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1080/1251] eta 0:06:15 lr 0.000797 time 1.8724 (2.1984) loss 4.2781 (3.7555) grad_norm 1.2928 (1.2203) [2022-01-20 10:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1090/1251] eta 0:05:53 lr 0.000796 time 1.9471 (2.1974) loss 4.0342 (3.7559) grad_norm 1.6806 (1.2210) [2022-01-20 10:23:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1100/1251] eta 0:05:31 lr 0.000796 time 2.2575 (2.1981) loss 4.1239 (3.7570) grad_norm 1.0753 (1.2220) [2022-01-20 10:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1110/1251] eta 0:05:09 lr 0.000796 time 2.2908 (2.1983) loss 3.5157 (3.7568) grad_norm 1.1261 (1.2213) [2022-01-20 10:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1120/1251] eta 0:04:47 lr 0.000796 time 2.2097 (2.1978) loss 2.9478 (3.7559) grad_norm 1.4092 (1.2211) [2022-01-20 10:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1130/1251] eta 0:04:25 lr 0.000796 time 1.6083 (2.1966) loss 4.4777 (3.7553) grad_norm 1.2314 (1.2208) [2022-01-20 10:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1140/1251] eta 0:04:03 lr 0.000796 time 2.8571 (2.1974) loss 3.1768 (3.7537) grad_norm 1.0752 (1.2202) [2022-01-20 10:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1150/1251] eta 0:03:41 lr 0.000796 time 1.7980 (2.1956) loss 3.5923 (3.7504) grad_norm 1.1642 (1.2201) [2022-01-20 10:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1160/1251] eta 0:03:19 lr 0.000796 time 1.8387 (2.1951) loss 4.3586 (3.7512) grad_norm 1.2835 (1.2203) [2022-01-20 10:25:42 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1170/1251] eta 0:02:57 lr 0.000796 time 1.5449 (2.1942) loss 3.6442 (3.7511) grad_norm 0.9902 (1.2199) [2022-01-20 10:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1180/1251] eta 0:02:35 lr 0.000796 time 2.4582 (2.1932) loss 3.8384 (3.7509) grad_norm 1.1357 (1.2202) [2022-01-20 10:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1190/1251] eta 0:02:13 lr 0.000796 time 2.2025 (2.1925) loss 3.8605 (3.7491) grad_norm 1.1938 (1.2201) [2022-01-20 10:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1200/1251] eta 0:01:51 lr 0.000796 time 2.0329 (2.1918) loss 4.0373 (3.7479) grad_norm 1.2376 (1.2198) [2022-01-20 10:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1210/1251] eta 0:01:29 lr 0.000796 time 1.5656 (2.1925) loss 3.9570 (3.7483) grad_norm 1.2187 (1.2194) [2022-01-20 10:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1220/1251] eta 0:01:07 lr 0.000796 time 2.4770 (2.1927) loss 3.9108 (3.7487) grad_norm 1.2410 (1.2195) [2022-01-20 10:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1230/1251] eta 0:00:46 lr 0.000796 time 1.6754 (2.1912) loss 4.0872 (3.7466) grad_norm 1.1626 (1.2193) [2022-01-20 10:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1240/1251] eta 0:00:24 lr 0.000796 time 1.7054 (2.1900) loss 3.9715 (3.7453) grad_norm 1.1581 (1.2193) [2022-01-20 10:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [89/300][1250/1251] eta 0:00:02 lr 0.000796 time 1.1859 (2.1844) loss 3.7164 (3.7448) grad_norm 1.1445 (1.2185) [2022-01-20 10:28:26 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 89 training takes 0:45:33 [2022-01-20 10:28:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.228 (18.228) Loss 1.1066 (1.1066) Acc@1 74.219 (74.219) Acc@5 93.164 (93.164) [2022-01-20 10:29:05 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.284 (3.552) Loss 1.1745 (1.1445) Acc@1 72.949 (73.331) Acc@5 91.699 (92.232) [2022-01-20 10:29:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.657 (2.635) Loss 1.1516 (1.1529) Acc@1 72.363 (73.093) Acc@5 91.797 (91.983) [2022-01-20 10:29:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.990 (2.365) Loss 1.2064 (1.1561) Acc@1 71.777 (73.157) Acc@5 90.625 (91.866) [2022-01-20 10:29:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.975 (2.173) Loss 1.1906 (1.1584) Acc@1 72.363 (73.090) Acc@5 91.504 (91.830) [2022-01-20 10:30:02 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.040 Acc@5 91.902 [2022-01-20 10:30:02 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-01-20 10:30:02 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.19% [2022-01-20 10:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][0/1251] eta 7:31:27 lr 0.000796 time 21.6525 (21.6525) loss 3.9997 (3.9997) grad_norm 1.8258 (1.8258) [2022-01-20 10:30:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][10/1251] eta 1:23:25 lr 0.000796 time 2.2030 (4.0332) loss 2.8382 (3.8304) grad_norm 1.0503 (1.3113) [2022-01-20 10:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][20/1251] eta 1:04:29 lr 0.000796 time 1.4152 (3.1432) loss 4.3396 (3.8790) grad_norm 1.2760 (1.3127) [2022-01-20 10:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][30/1251] eta 0:57:37 lr 0.000796 time 1.8566 (2.8317) loss 4.4523 (3.7895) grad_norm 1.4118 (1.2886) [2022-01-20 10:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][40/1251] eta 0:54:57 lr 0.000796 time 3.3689 (2.7230) loss 3.9617 (3.7696) grad_norm 1.3415 (1.2690) [2022-01-20 10:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[90/300][50/1251] eta 0:53:14 lr 0.000796 time 2.4621 (2.6596) loss 3.4849 (3.7567) grad_norm 1.2386 (1.2548) [2022-01-20 10:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][60/1251] eta 0:51:03 lr 0.000796 time 1.9749 (2.5718) loss 2.6647 (3.7423) grad_norm 1.2659 (1.2426) [2022-01-20 10:32:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][70/1251] eta 0:49:04 lr 0.000796 time 1.8949 (2.4935) loss 4.3048 (3.7733) grad_norm 1.1164 (1.2396) [2022-01-20 10:33:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][80/1251] eta 0:47:34 lr 0.000796 time 2.7564 (2.4378) loss 3.8650 (3.7620) grad_norm 1.1620 (1.2322) [2022-01-20 10:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][90/1251] eta 0:46:17 lr 0.000796 time 1.9003 (2.3922) loss 4.4884 (3.7565) grad_norm 1.1512 (1.2310) [2022-01-20 10:34:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][100/1251] eta 0:45:23 lr 0.000796 time 1.6462 (2.3662) loss 4.2583 (3.7732) grad_norm 1.2523 (1.2346) [2022-01-20 10:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][110/1251] eta 0:45:02 lr 0.000796 time 1.8637 (2.3687) loss 4.1829 (3.7446) grad_norm 1.1474 (1.2292) [2022-01-20 10:34:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][120/1251] eta 0:44:34 lr 0.000796 time 3.4503 (2.3645) loss 2.5656 (3.7555) grad_norm 1.3515 (1.2301) [2022-01-20 10:35:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][130/1251] eta 0:43:54 lr 0.000796 time 1.9375 (2.3504) loss 3.4928 (3.7401) grad_norm 1.3111 (1.2268) [2022-01-20 10:35:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][140/1251] eta 0:43:17 lr 0.000795 time 1.6414 (2.3379) loss 3.9981 (3.7329) grad_norm 1.1156 (1.2232) [2022-01-20 10:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][150/1251] eta 0:42:40 lr 0.000795 time 2.5388 (2.3258) loss 3.2290 (3.7104) grad_norm 1.3457 (1.2197) 
[2022-01-20 10:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][160/1251] eta 0:42:17 lr 0.000795 time 3.4403 (2.3256) loss 3.5643 (3.7228) grad_norm 1.0062 (1.2174) [2022-01-20 10:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][170/1251] eta 0:41:38 lr 0.000795 time 1.9648 (2.3115) loss 2.9417 (3.7329) grad_norm 1.1977 (1.2161) [2022-01-20 10:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][180/1251] eta 0:40:59 lr 0.000795 time 1.5996 (2.2963) loss 3.2242 (3.7193) grad_norm 1.1220 (1.2157) [2022-01-20 10:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][190/1251] eta 0:40:27 lr 0.000795 time 2.3075 (2.2879) loss 4.1177 (3.7004) grad_norm 1.2795 (1.2138) [2022-01-20 10:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][200/1251] eta 0:40:04 lr 0.000795 time 3.6483 (2.2874) loss 4.1071 (3.7013) grad_norm 1.0877 (1.2128) [2022-01-20 10:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][210/1251] eta 0:39:30 lr 0.000795 time 2.1762 (2.2768) loss 3.8951 (3.7026) grad_norm 1.2604 (1.2144) [2022-01-20 10:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][220/1251] eta 0:38:59 lr 0.000795 time 1.7968 (2.2687) loss 4.0911 (3.7067) grad_norm 1.0703 (1.2130) [2022-01-20 10:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][230/1251] eta 0:38:35 lr 0.000795 time 2.4430 (2.2676) loss 3.7881 (3.7037) grad_norm 1.2213 (1.2151) [2022-01-20 10:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][240/1251] eta 0:38:17 lr 0.000795 time 3.7020 (2.2724) loss 4.3292 (3.7079) grad_norm 1.4003 (1.2196) [2022-01-20 10:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][250/1251] eta 0:37:55 lr 0.000795 time 2.2208 (2.2729) loss 4.2224 (3.7061) grad_norm 1.4629 (1.2238) [2022-01-20 10:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][260/1251] eta 0:37:29 
lr 0.000795 time 2.2154 (2.2704) loss 3.9194 (3.7076) grad_norm 1.1538 (1.2278) [2022-01-20 10:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][270/1251] eta 0:36:57 lr 0.000795 time 1.9758 (2.2606) loss 4.2431 (3.7089) grad_norm 1.1685 (1.2270) [2022-01-20 10:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][280/1251] eta 0:36:34 lr 0.000795 time 3.5196 (2.2598) loss 3.3426 (3.7027) grad_norm 1.1049 (1.2253) [2022-01-20 10:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][290/1251] eta 0:36:07 lr 0.000795 time 1.9895 (2.2560) loss 4.2306 (3.7154) grad_norm 1.1816 (1.2251) [2022-01-20 10:41:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][300/1251] eta 0:35:44 lr 0.000795 time 2.1292 (2.2547) loss 4.4054 (3.7162) grad_norm 1.2546 (1.2265) [2022-01-20 10:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][310/1251] eta 0:35:16 lr 0.000795 time 1.6909 (2.2494) loss 4.5146 (3.7251) grad_norm 1.2234 (1.2291) [2022-01-20 10:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][320/1251] eta 0:34:56 lr 0.000795 time 3.3543 (2.2520) loss 4.0665 (3.7305) grad_norm 1.1459 (1.2278) [2022-01-20 10:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][330/1251] eta 0:34:26 lr 0.000795 time 1.7285 (2.2442) loss 3.9054 (3.7349) grad_norm 1.1594 (1.2286) [2022-01-20 10:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][340/1251] eta 0:33:59 lr 0.000795 time 1.9211 (2.2389) loss 3.2717 (3.7321) grad_norm 1.1187 (1.2275) [2022-01-20 10:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][350/1251] eta 0:33:36 lr 0.000795 time 2.8605 (2.2380) loss 4.1464 (3.7267) grad_norm 1.1099 (1.2266) [2022-01-20 10:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][360/1251] eta 0:33:13 lr 0.000795 time 2.0928 (2.2378) loss 4.2780 (3.7267) grad_norm 1.0966 (1.2245) [2022-01-20 10:43:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][370/1251] eta 0:32:53 lr 0.000795 time 2.4789 (2.2396) loss 3.1379 (3.7240) grad_norm 1.3615 (1.2247) [2022-01-20 10:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][380/1251] eta 0:32:28 lr 0.000795 time 2.1952 (2.2369) loss 3.3668 (3.7298) grad_norm 1.1552 (1.2263) [2022-01-20 10:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][390/1251] eta 0:32:06 lr 0.000795 time 2.4744 (2.2375) loss 3.1688 (3.7311) grad_norm 1.1929 (1.2253) [2022-01-20 10:44:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][400/1251] eta 0:31:43 lr 0.000795 time 2.3202 (2.2363) loss 2.8736 (3.7328) grad_norm 1.3969 (1.2266) [2022-01-20 10:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][410/1251] eta 0:31:19 lr 0.000795 time 2.3201 (2.2351) loss 4.1137 (3.7342) grad_norm 1.2306 (1.2267) [2022-01-20 10:45:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][420/1251] eta 0:30:56 lr 0.000795 time 2.5334 (2.2343) loss 3.7806 (3.7348) grad_norm 1.2853 (1.2274) [2022-01-20 10:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][430/1251] eta 0:30:33 lr 0.000795 time 1.8928 (2.2330) loss 4.3116 (3.7327) grad_norm 1.1452 (1.2260) [2022-01-20 10:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][440/1251] eta 0:30:11 lr 0.000794 time 2.1280 (2.2340) loss 3.5829 (3.7369) grad_norm 1.3569 (1.2279) [2022-01-20 10:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][450/1251] eta 0:29:51 lr 0.000794 time 2.9197 (2.2360) loss 3.6654 (3.7428) grad_norm 1.1872 (1.2280) [2022-01-20 10:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][460/1251] eta 0:29:27 lr 0.000794 time 1.9043 (2.2339) loss 4.2931 (3.7450) grad_norm 1.2404 (1.2268) [2022-01-20 10:47:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][470/1251] eta 0:29:01 lr 0.000794 time 
2.1053 (2.2302) loss 3.9434 (3.7459) grad_norm 1.2749 (1.2251) [2022-01-20 10:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][480/1251] eta 0:28:37 lr 0.000794 time 2.3673 (2.2277) loss 2.6437 (3.7479) grad_norm 1.3228 (1.2245) [2022-01-20 10:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][490/1251] eta 0:28:13 lr 0.000794 time 1.8673 (2.2259) loss 3.1624 (3.7459) grad_norm 1.0475 (1.2230) [2022-01-20 10:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][500/1251] eta 0:27:49 lr 0.000794 time 2.3030 (2.2235) loss 3.6895 (3.7376) grad_norm 1.0699 (1.2215) [2022-01-20 10:48:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][510/1251] eta 0:27:26 lr 0.000794 time 1.8393 (2.2224) loss 3.6248 (3.7414) grad_norm 1.3886 (1.2222) [2022-01-20 10:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][520/1251] eta 0:27:03 lr 0.000794 time 2.2821 (2.2204) loss 4.2466 (3.7424) grad_norm 1.0363 (1.2228) [2022-01-20 10:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][530/1251] eta 0:26:41 lr 0.000794 time 2.5461 (2.2209) loss 4.0420 (3.7425) grad_norm 1.0842 (1.2209) [2022-01-20 10:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][540/1251] eta 0:26:19 lr 0.000794 time 2.9630 (2.2208) loss 3.7916 (3.7409) grad_norm 1.0020 (1.2208) [2022-01-20 10:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][550/1251] eta 0:25:56 lr 0.000794 time 1.8270 (2.2207) loss 4.4869 (3.7351) grad_norm 1.1110 (1.2209) [2022-01-20 10:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][560/1251] eta 0:25:35 lr 0.000794 time 1.5948 (2.2225) loss 4.1402 (3.7319) grad_norm 1.3859 (1.2215) [2022-01-20 10:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][570/1251] eta 0:25:12 lr 0.000794 time 2.3030 (2.2212) loss 3.7855 (3.7294) grad_norm 1.5363 (1.2241) [2022-01-20 10:51:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][580/1251] eta 0:24:49 lr 0.000794 time 3.0185 (2.2201) loss 3.0568 (3.7299) grad_norm 1.3634 (1.2241) [2022-01-20 10:51:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][590/1251] eta 0:24:25 lr 0.000794 time 1.6390 (2.2167) loss 3.5443 (3.7306) grad_norm 1.3245 (1.2239) [2022-01-20 10:52:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][600/1251] eta 0:24:01 lr 0.000794 time 2.2213 (2.2150) loss 3.6684 (3.7321) grad_norm 1.0603 (1.2245) [2022-01-20 10:52:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][610/1251] eta 0:23:40 lr 0.000794 time 2.8181 (2.2157) loss 4.4762 (3.7346) grad_norm 1.0502 (1.2239) [2022-01-20 10:52:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][620/1251] eta 0:23:17 lr 0.000794 time 3.0198 (2.2154) loss 4.5312 (3.7406) grad_norm 1.3388 (1.2231) [2022-01-20 10:53:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][630/1251] eta 0:22:55 lr 0.000794 time 2.2201 (2.2153) loss 3.6207 (3.7390) grad_norm 1.2503 (1.2231) [2022-01-20 10:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][640/1251] eta 0:22:32 lr 0.000794 time 1.8709 (2.2141) loss 2.4133 (3.7356) grad_norm 1.1073 (1.2234) [2022-01-20 10:54:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][650/1251] eta 0:22:10 lr 0.000794 time 2.8562 (2.2130) loss 3.5639 (3.7382) grad_norm 1.2283 (1.2234) [2022-01-20 10:54:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][660/1251] eta 0:21:48 lr 0.000794 time 2.8139 (2.2140) loss 4.2186 (3.7405) grad_norm 1.2873 (1.2242) [2022-01-20 10:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][670/1251] eta 0:21:27 lr 0.000794 time 2.2302 (2.2156) loss 3.3723 (3.7403) grad_norm 1.1932 (1.2243) [2022-01-20 10:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][680/1251] eta 0:21:05 lr 0.000794 time 
1.6172 (2.2156) loss 2.5278 (3.7380) grad_norm 1.1618 (1.2241) [2022-01-20 10:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][690/1251] eta 0:20:42 lr 0.000794 time 2.9943 (2.2142) loss 4.1058 (3.7368) grad_norm 1.3306 (1.2235) [2022-01-20 10:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][700/1251] eta 0:20:18 lr 0.000794 time 1.9860 (2.2121) loss 4.2574 (3.7370) grad_norm 1.2355 (1.2226) [2022-01-20 10:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][710/1251] eta 0:19:55 lr 0.000794 time 2.1184 (2.2101) loss 4.2920 (3.7399) grad_norm 1.2644 (1.2218) [2022-01-20 10:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][720/1251] eta 0:19:33 lr 0.000794 time 2.0670 (2.2092) loss 3.9602 (3.7395) grad_norm 1.2864 (1.2217) [2022-01-20 10:56:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][730/1251] eta 0:19:11 lr 0.000794 time 2.6199 (2.2093) loss 4.4236 (3.7389) grad_norm 1.2454 (1.2227) [2022-01-20 10:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][740/1251] eta 0:18:47 lr 0.000793 time 1.8952 (2.2067) loss 4.2432 (3.7416) grad_norm 1.0223 (1.2220) [2022-01-20 10:57:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][750/1251] eta 0:18:25 lr 0.000793 time 2.2470 (2.2063) loss 3.9869 (3.7417) grad_norm 1.2525 (1.2223) [2022-01-20 10:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][760/1251] eta 0:18:02 lr 0.000793 time 1.6193 (2.2047) loss 3.7974 (3.7378) grad_norm 1.4980 (1.2237) [2022-01-20 10:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][770/1251] eta 0:17:40 lr 0.000793 time 2.4278 (2.2045) loss 3.2877 (3.7382) grad_norm 1.4081 (1.2234) [2022-01-20 10:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][780/1251] eta 0:17:18 lr 0.000793 time 1.4911 (2.2039) loss 4.1835 (3.7368) grad_norm 1.1192 (1.2231) [2022-01-20 10:59:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][790/1251] eta 0:16:56 lr 0.000793 time 1.8175 (2.2054) loss 4.6390 (3.7352) grad_norm 1.1000 (1.2226) [2022-01-20 10:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][800/1251] eta 0:16:34 lr 0.000793 time 1.8698 (2.2054) loss 4.3885 (3.7372) grad_norm 1.1045 (1.2223) [2022-01-20 10:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][810/1251] eta 0:16:13 lr 0.000793 time 2.1803 (2.2079) loss 4.0322 (3.7376) grad_norm 1.3341 (1.2222) [2022-01-20 11:00:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][820/1251] eta 0:15:51 lr 0.000793 time 2.0272 (2.2084) loss 2.9374 (3.7387) grad_norm 1.4722 (1.2238) [2022-01-20 11:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][830/1251] eta 0:15:29 lr 0.000793 time 1.9197 (2.2085) loss 4.5296 (3.7401) grad_norm 1.2351 (1.2242) [2022-01-20 11:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][840/1251] eta 0:15:08 lr 0.000793 time 1.9614 (2.2094) loss 3.8764 (3.7427) grad_norm 1.2453 (1.2239) [2022-01-20 11:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][850/1251] eta 0:14:45 lr 0.000793 time 2.2178 (2.2083) loss 3.6654 (3.7426) grad_norm 1.4888 (1.2242) [2022-01-20 11:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][860/1251] eta 0:14:23 lr 0.000793 time 1.6701 (2.2080) loss 3.1316 (3.7393) grad_norm 1.0331 (1.2247) [2022-01-20 11:02:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][870/1251] eta 0:14:01 lr 0.000793 time 3.0853 (2.2076) loss 3.8363 (3.7383) grad_norm 1.0990 (1.2239) [2022-01-20 11:02:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][880/1251] eta 0:13:38 lr 0.000793 time 1.5876 (2.2071) loss 4.2484 (3.7402) grad_norm 1.4151 (1.2240) [2022-01-20 11:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][890/1251] eta 0:13:15 lr 0.000793 time 
1.9198 (2.2049) loss 3.0938 (3.7396) grad_norm 1.1361 (1.2231) [2022-01-20 11:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][900/1251] eta 0:12:54 lr 0.000793 time 1.8794 (2.2056) loss 3.7055 (3.7410) grad_norm 1.1061 (1.2235) [2022-01-20 11:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][910/1251] eta 0:12:31 lr 0.000793 time 2.0530 (2.2040) loss 3.8093 (3.7404) grad_norm 1.3628 (1.2257) [2022-01-20 11:03:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][920/1251] eta 0:12:09 lr 0.000793 time 1.8654 (2.2029) loss 4.1779 (3.7395) grad_norm 1.2280 (1.2253) [2022-01-20 11:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][930/1251] eta 0:11:46 lr 0.000793 time 2.1100 (2.2018) loss 3.8666 (3.7401) grad_norm 1.1019 (1.2255) [2022-01-20 11:04:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][940/1251] eta 0:11:24 lr 0.000793 time 2.0894 (2.2015) loss 4.4199 (3.7418) grad_norm 1.2858 (1.2265) [2022-01-20 11:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][950/1251] eta 0:11:02 lr 0.000793 time 2.3007 (2.2008) loss 3.2521 (3.7429) grad_norm 1.1837 (1.2263) [2022-01-20 11:05:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][960/1251] eta 0:10:40 lr 0.000793 time 2.1469 (2.2013) loss 2.6347 (3.7408) grad_norm 1.1168 (1.2256) [2022-01-20 11:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][970/1251] eta 0:10:18 lr 0.000793 time 1.6038 (2.2012) loss 4.2959 (3.7423) grad_norm 1.3582 (1.2252) [2022-01-20 11:06:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][980/1251] eta 0:09:56 lr 0.000793 time 2.4808 (2.2028) loss 2.7518 (3.7408) grad_norm 1.3750 (1.2255) [2022-01-20 11:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][990/1251] eta 0:09:36 lr 0.000793 time 5.1488 (2.2075) loss 2.7612 (3.7416) grad_norm 1.3794 (1.2250) [2022-01-20 11:06:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1000/1251] eta 0:09:14 lr 0.000793 time 1.8660 (2.2082) loss 3.4824 (3.7415) grad_norm 1.2535 (1.2250) [2022-01-20 11:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1010/1251] eta 0:08:52 lr 0.000793 time 1.5560 (2.2078) loss 3.7590 (3.7397) grad_norm 1.2494 (1.2255) [2022-01-20 11:07:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1020/1251] eta 0:08:29 lr 0.000793 time 2.1024 (2.2064) loss 4.4531 (3.7404) grad_norm 1.1911 (1.2253) [2022-01-20 11:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1030/1251] eta 0:08:07 lr 0.000792 time 3.1616 (2.2057) loss 2.9070 (3.7373) grad_norm 1.0471 (1.2244) [2022-01-20 11:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1040/1251] eta 0:07:45 lr 0.000792 time 1.6185 (2.2044) loss 4.0375 (3.7387) grad_norm 1.2333 (1.2241) [2022-01-20 11:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1050/1251] eta 0:07:23 lr 0.000792 time 1.8660 (2.2041) loss 3.8967 (3.7386) grad_norm 1.5442 (1.2245) [2022-01-20 11:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1060/1251] eta 0:07:00 lr 0.000792 time 1.9011 (2.2039) loss 2.7649 (3.7389) grad_norm 1.3857 (1.2245) [2022-01-20 11:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1070/1251] eta 0:06:39 lr 0.000792 time 3.5137 (2.2051) loss 4.3100 (3.7401) grad_norm 1.3074 (1.2248) [2022-01-20 11:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1080/1251] eta 0:06:16 lr 0.000792 time 1.8785 (2.2037) loss 2.5495 (3.7397) grad_norm 1.2879 (1.2247) [2022-01-20 11:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1090/1251] eta 0:05:54 lr 0.000792 time 1.9074 (2.2039) loss 4.2791 (3.7426) grad_norm 1.3025 (1.2254) [2022-01-20 11:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1100/1251] eta 0:05:32 lr 
0.000792 time 2.5597 (2.2039) loss 3.8426 (3.7409) grad_norm 1.2408 (1.2243) [2022-01-20 11:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1110/1251] eta 0:05:10 lr 0.000792 time 3.3902 (2.2046) loss 3.0932 (3.7419) grad_norm 1.1570 (1.2236) [2022-01-20 11:11:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1120/1251] eta 0:04:48 lr 0.000792 time 1.9625 (2.2037) loss 3.1896 (3.7418) grad_norm 1.2093 (1.2236) [2022-01-20 11:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1130/1251] eta 0:04:26 lr 0.000792 time 2.4613 (2.2024) loss 3.3450 (3.7378) grad_norm 1.2047 (1.2232) [2022-01-20 11:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1140/1251] eta 0:04:04 lr 0.000792 time 2.2328 (2.2010) loss 3.6665 (3.7366) grad_norm 1.0849 (1.2224) [2022-01-20 11:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1150/1251] eta 0:03:42 lr 0.000792 time 1.9285 (2.2006) loss 2.5906 (3.7368) grad_norm 1.1321 (1.2217) [2022-01-20 11:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1160/1251] eta 0:03:20 lr 0.000792 time 1.8498 (2.2004) loss 3.9796 (3.7370) grad_norm 1.3230 (1.2214) [2022-01-20 11:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1170/1251] eta 0:02:58 lr 0.000792 time 2.5740 (2.2008) loss 2.8482 (3.7367) grad_norm 1.1553 (1.2213) [2022-01-20 11:13:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1180/1251] eta 0:02:36 lr 0.000792 time 2.7302 (2.2031) loss 4.1923 (3.7371) grad_norm 1.4751 (1.2222) [2022-01-20 11:13:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1190/1251] eta 0:02:14 lr 0.000792 time 2.1945 (2.2044) loss 4.2990 (3.7366) grad_norm 1.1158 (1.2229) [2022-01-20 11:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1200/1251] eta 0:01:52 lr 0.000792 time 1.8585 (2.2045) loss 3.0111 (3.7356) grad_norm 1.3525 (1.2230) [2022-01-20 11:14:33 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1210/1251] eta 0:01:30 lr 0.000792 time 2.5038 (2.2052) loss 3.9252 (3.7374) grad_norm 1.3273 (1.2232) [2022-01-20 11:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1220/1251] eta 0:01:08 lr 0.000792 time 1.9450 (2.2037) loss 4.2301 (3.7360) grad_norm 1.2374 (1.2237) [2022-01-20 11:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1230/1251] eta 0:00:46 lr 0.000792 time 2.0071 (2.2016) loss 3.8037 (3.7348) grad_norm 1.1824 (1.2239) [2022-01-20 11:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1240/1251] eta 0:00:24 lr 0.000792 time 1.3128 (2.1993) loss 4.0704 (3.7357) grad_norm 1.0616 (1.2239) [2022-01-20 11:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [90/300][1250/1251] eta 0:00:02 lr 0.000792 time 1.1769 (2.1937) loss 3.6996 (3.7371) grad_norm 1.3790 (1.2246) [2022-01-20 11:15:47 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 90 training takes 0:45:44 [2022-01-20 11:15:47 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_90 saving...... [2022-01-20 11:15:58 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_90 saved !!! 
[2022-01-20 11:16:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.700 (16.700) Loss 1.1173 (1.1173) Acc@1 73.730 (73.730) Acc@5 92.773 (92.773) [2022-01-20 11:16:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.611 (2.800) Loss 1.1408 (1.1570) Acc@1 73.926 (73.455) Acc@5 91.992 (91.575) [2022-01-20 11:16:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.658 (2.298) Loss 1.2252 (1.1595) Acc@1 71.875 (73.293) Acc@5 90.527 (91.643) [2022-01-20 11:17:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.301 (2.153) Loss 1.1913 (1.1616) Acc@1 72.363 (73.163) Acc@5 91.992 (91.690) [2022-01-20 11:17:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.116 (2.063) Loss 1.1457 (1.1613) Acc@1 72.461 (73.087) Acc@5 92.480 (91.866) [2022-01-20 11:17:30 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.022 Acc@5 91.768 [2022-01-20 11:17:30 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.0% [2022-01-20 11:17:30 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.19% [2022-01-20 11:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][0/1251] eta 7:34:48 lr 0.000792 time 21.8135 (21.8135) loss 3.5715 (3.5715) grad_norm 1.2160 (1.2160) [2022-01-20 11:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][10/1251] eta 1:23:16 lr 0.000792 time 1.5314 (4.0261) loss 3.8375 (3.5886) grad_norm 1.2148 (1.2077) [2022-01-20 11:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][20/1251] eta 1:04:54 lr 0.000792 time 1.4840 (3.1635) loss 3.0808 (3.6572) grad_norm 1.1038 (1.2236) [2022-01-20 11:18:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][30/1251] eta 0:56:54 lr 0.000792 time 1.5466 (2.7964) loss 4.4278 (3.7717) grad_norm 1.7749 (1.2320) [2022-01-20 11:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[91/300][40/1251] eta 0:55:02 lr 0.000792 time 6.4872 (2.7273) loss 3.3516 (3.7506) grad_norm 1.0831 (1.2147) [2022-01-20 11:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][50/1251] eta 0:52:33 lr 0.000792 time 2.7821 (2.6254) loss 4.1858 (3.7782) grad_norm 1.1458 (1.2051) [2022-01-20 11:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][60/1251] eta 0:50:36 lr 0.000792 time 1.5154 (2.5499) loss 3.8394 (3.8353) grad_norm 1.2445 (1.1960) [2022-01-20 11:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][70/1251] eta 0:48:54 lr 0.000792 time 2.0288 (2.4847) loss 2.8268 (3.8183) grad_norm 1.2365 (1.2216) [2022-01-20 11:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][80/1251] eta 0:48:04 lr 0.000791 time 4.0983 (2.4637) loss 3.8163 (3.8297) grad_norm 1.0232 (1.2178) [2022-01-20 11:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][90/1251] eta 0:47:24 lr 0.000791 time 2.1751 (2.4501) loss 3.9450 (3.7875) grad_norm 1.1902 (1.2299) [2022-01-20 11:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][100/1251] eta 0:46:29 lr 0.000791 time 1.6059 (2.4237) loss 4.1851 (3.7890) grad_norm 1.1333 (1.2321) [2022-01-20 11:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][110/1251] eta 0:45:40 lr 0.000791 time 1.7999 (2.4020) loss 4.3444 (3.7903) grad_norm 1.2437 (1.2315) [2022-01-20 11:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][120/1251] eta 0:45:13 lr 0.000791 time 3.4553 (2.3988) loss 3.3806 (3.7846) grad_norm 1.1760 (1.2326) [2022-01-20 11:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][130/1251] eta 0:44:37 lr 0.000791 time 1.5819 (2.3884) loss 4.3232 (3.7745) grad_norm 1.1600 (1.2343) [2022-01-20 11:23:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][140/1251] eta 0:43:50 lr 0.000791 time 2.2530 (2.3679) loss 3.9875 (3.7826) grad_norm 1.2119 (1.2345) 
[2022-01-20 11:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][150/1251] eta 0:43:02 lr 0.000791 time 1.6308 (2.3460) loss 3.7954 (3.7663) grad_norm 1.1773 (1.2359) [2022-01-20 11:23:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][160/1251] eta 0:42:37 lr 0.000791 time 4.0982 (2.3442) loss 3.0046 (3.7585) grad_norm 1.2374 (1.2329) [2022-01-20 11:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][170/1251] eta 0:42:06 lr 0.000791 time 1.9947 (2.3372) loss 3.8583 (3.7684) grad_norm 1.1449 (1.2253) [2022-01-20 11:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][180/1251] eta 0:41:40 lr 0.000791 time 2.9043 (2.3344) loss 4.1263 (3.7756) grad_norm 1.3229 (1.2225) [2022-01-20 11:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][190/1251] eta 0:41:07 lr 0.000791 time 2.0686 (2.3260) loss 2.8733 (3.7568) grad_norm 1.0784 (1.2174) [2022-01-20 11:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][200/1251] eta 0:40:40 lr 0.000791 time 2.9791 (2.3218) loss 4.3344 (3.7688) grad_norm 1.1452 (1.2164) [2022-01-20 11:25:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][210/1251] eta 0:39:58 lr 0.000791 time 1.5515 (2.3043) loss 4.5622 (3.7759) grad_norm 1.0727 (1.2148) [2022-01-20 11:25:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][220/1251] eta 0:39:21 lr 0.000791 time 1.8447 (2.2905) loss 3.0633 (3.7566) grad_norm 1.3328 (1.2144) [2022-01-20 11:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][230/1251] eta 0:38:46 lr 0.000791 time 1.8813 (2.2783) loss 3.8114 (3.7475) grad_norm 1.2006 (1.2151) [2022-01-20 11:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][240/1251] eta 0:38:14 lr 0.000791 time 2.2240 (2.2696) loss 4.6223 (3.7579) grad_norm 1.1310 (1.2156) [2022-01-20 11:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][250/1251] eta 0:37:49 
lr 0.000791 time 1.7352 (2.2673) loss 2.9756 (3.7553) grad_norm 1.1292 (1.2153) [2022-01-20 11:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][260/1251] eta 0:37:24 lr 0.000791 time 2.3939 (2.2645) loss 4.4344 (3.7531) grad_norm 1.2880 (1.2213) [2022-01-20 11:27:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][270/1251] eta 0:36:58 lr 0.000791 time 1.9906 (2.2615) loss 3.9804 (3.7563) grad_norm 1.2272 (1.2234) [2022-01-20 11:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][280/1251] eta 0:36:32 lr 0.000791 time 1.8267 (2.2576) loss 4.3188 (3.7563) grad_norm 1.0600 (1.2239) [2022-01-20 11:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][290/1251] eta 0:36:07 lr 0.000791 time 1.8453 (2.2555) loss 3.9336 (3.7557) grad_norm 1.1123 (1.2205) [2022-01-20 11:28:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][300/1251] eta 0:35:41 lr 0.000791 time 1.5983 (2.2519) loss 3.8606 (3.7542) grad_norm 1.2483 (1.2217) [2022-01-20 11:29:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][310/1251] eta 0:35:15 lr 0.000791 time 1.8607 (2.2477) loss 4.2955 (3.7622) grad_norm 1.2926 (1.2200) [2022-01-20 11:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][320/1251] eta 0:34:48 lr 0.000791 time 1.8734 (2.2438) loss 4.1026 (3.7641) grad_norm 1.2200 (1.2208) [2022-01-20 11:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][330/1251] eta 0:34:25 lr 0.000791 time 1.8366 (2.2428) loss 3.3188 (3.7614) grad_norm 1.1778 (1.2211) [2022-01-20 11:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][340/1251] eta 0:34:05 lr 0.000791 time 2.5589 (2.2458) loss 2.7985 (3.7570) grad_norm 1.3359 (1.2216) [2022-01-20 11:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][350/1251] eta 0:33:47 lr 0.000791 time 2.8955 (2.2498) loss 4.2930 (3.7593) grad_norm 1.0523 (1.2219) [2022-01-20 11:31:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][360/1251] eta 0:33:30 lr 0.000791 time 2.0806 (2.2563) loss 3.1027 (3.7568) grad_norm 1.3373 (1.2225) [2022-01-20 11:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][370/1251] eta 0:33:06 lr 0.000790 time 1.8943 (2.2544) loss 2.9355 (3.7571) grad_norm 1.3748 (1.2228) [2022-01-20 11:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][380/1251] eta 0:32:41 lr 0.000790 time 2.2087 (2.2514) loss 4.5739 (3.7637) grad_norm 1.1529 (1.2241) [2022-01-20 11:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][390/1251] eta 0:32:10 lr 0.000790 time 1.8212 (2.2427) loss 3.8660 (3.7642) grad_norm 1.2913 (1.2251) [2022-01-20 11:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][400/1251] eta 0:31:43 lr 0.000790 time 2.5072 (2.2373) loss 2.9435 (3.7674) grad_norm 1.2593 (1.2287) [2022-01-20 11:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][410/1251] eta 0:31:17 lr 0.000790 time 1.2409 (2.2329) loss 2.6748 (3.7622) grad_norm 1.4415 (1.2310) [2022-01-20 11:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][420/1251] eta 0:30:51 lr 0.000790 time 2.2389 (2.2282) loss 3.9642 (3.7586) grad_norm 1.1201 (1.2313) [2022-01-20 11:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][430/1251] eta 0:30:28 lr 0.000790 time 2.2155 (2.2276) loss 3.6716 (3.7563) grad_norm 1.3369 (1.2295) [2022-01-20 11:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][440/1251] eta 0:30:08 lr 0.000790 time 2.7380 (2.2302) loss 3.4448 (3.7554) grad_norm 1.1480 (1.2307) [2022-01-20 11:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][450/1251] eta 0:29:46 lr 0.000790 time 1.7852 (2.2300) loss 3.2012 (3.7507) grad_norm 1.0924 (1.2306) [2022-01-20 11:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][460/1251] eta 0:29:24 lr 0.000790 time 
1.5632 (2.2313) loss 4.4006 (3.7530) grad_norm 1.4353 (1.2324) [2022-01-20 11:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][470/1251] eta 0:29:02 lr 0.000790 time 2.3371 (2.2318) loss 3.9329 (3.7545) grad_norm 1.2796 (1.2313) [2022-01-20 11:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][480/1251] eta 0:28:42 lr 0.000790 time 2.8791 (2.2346) loss 3.7795 (3.7552) grad_norm 1.8340 (1.2311) [2022-01-20 11:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][490/1251] eta 0:28:19 lr 0.000790 time 1.5628 (2.2338) loss 3.8281 (3.7526) grad_norm 1.4067 (1.2323) [2022-01-20 11:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][500/1251] eta 0:27:55 lr 0.000790 time 1.7315 (2.2306) loss 3.5172 (3.7592) grad_norm 1.0278 (1.2313) [2022-01-20 11:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][510/1251] eta 0:27:31 lr 0.000790 time 1.8457 (2.2288) loss 2.8211 (3.7601) grad_norm 1.3381 (1.2319) [2022-01-20 11:36:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][520/1251] eta 0:27:10 lr 0.000790 time 2.8450 (2.2308) loss 3.7419 (3.7642) grad_norm 1.2662 (1.2319) [2022-01-20 11:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][530/1251] eta 0:26:49 lr 0.000790 time 1.9689 (2.2322) loss 2.8771 (3.7612) grad_norm 1.1895 (1.2324) [2022-01-20 11:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][540/1251] eta 0:26:24 lr 0.000790 time 1.5017 (2.2283) loss 2.8156 (3.7660) grad_norm 1.0831 (1.2323) [2022-01-20 11:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][550/1251] eta 0:25:59 lr 0.000790 time 1.9370 (2.2251) loss 4.1570 (3.7669) grad_norm 1.0853 (1.2314) [2022-01-20 11:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][560/1251] eta 0:25:35 lr 0.000790 time 1.8644 (2.2221) loss 3.8831 (3.7632) grad_norm 1.2816 (1.2301) [2022-01-20 11:38:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][570/1251] eta 0:25:13 lr 0.000790 time 2.2612 (2.2230) loss 4.3613 (3.7596) grad_norm 1.4273 (1.2301) [2022-01-20 11:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][580/1251] eta 0:24:50 lr 0.000790 time 2.1164 (2.2214) loss 4.4267 (3.7611) grad_norm 1.1820 (1.2294) [2022-01-20 11:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][590/1251] eta 0:24:28 lr 0.000790 time 1.9231 (2.2217) loss 4.1948 (3.7627) grad_norm 1.6063 (1.2310) [2022-01-20 11:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][600/1251] eta 0:24:04 lr 0.000790 time 1.7413 (2.2192) loss 3.9057 (3.7643) grad_norm 1.2048 (1.2316) [2022-01-20 11:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][610/1251] eta 0:23:43 lr 0.000790 time 2.4799 (2.2212) loss 4.2383 (3.7657) grad_norm 1.3785 (1.2334) [2022-01-20 11:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][620/1251] eta 0:23:21 lr 0.000790 time 2.0312 (2.2206) loss 3.9821 (3.7631) grad_norm 1.3291 (1.2342) [2022-01-20 11:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][630/1251] eta 0:22:58 lr 0.000790 time 2.4006 (2.2204) loss 3.9210 (3.7631) grad_norm 1.1902 (1.2333) [2022-01-20 11:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][640/1251] eta 0:22:35 lr 0.000790 time 1.6898 (2.2177) loss 3.8203 (3.7683) grad_norm 1.1924 (1.2316) [2022-01-20 11:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][650/1251] eta 0:22:11 lr 0.000790 time 1.5920 (2.2161) loss 4.3108 (3.7695) grad_norm 1.2964 (1.2309) [2022-01-20 11:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][660/1251] eta 0:21:48 lr 0.000790 time 1.8897 (2.2146) loss 3.9824 (3.7695) grad_norm 1.2404 (1.2305) [2022-01-20 11:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][670/1251] eta 0:21:27 lr 0.000789 time 
2.4877 (2.2153) loss 4.1846 (3.7720) grad_norm 1.3350 (1.2303) [2022-01-20 11:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][680/1251] eta 0:21:04 lr 0.000789 time 2.2426 (2.2137) loss 3.0694 (3.7707) grad_norm 1.1297 (1.2315) [2022-01-20 11:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][690/1251] eta 0:20:42 lr 0.000789 time 1.9572 (2.2154) loss 2.9173 (3.7724) grad_norm 1.1971 (1.2309) [2022-01-20 11:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][700/1251] eta 0:20:21 lr 0.000789 time 2.4998 (2.2167) loss 4.2781 (3.7756) grad_norm 1.1645 (1.2302) [2022-01-20 11:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][710/1251] eta 0:19:59 lr 0.000789 time 2.4133 (2.2165) loss 4.2178 (3.7762) grad_norm 1.0966 (1.2307) [2022-01-20 11:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][720/1251] eta 0:19:36 lr 0.000789 time 1.9579 (2.2148) loss 4.2336 (3.7746) grad_norm 1.0887 (1.2299) [2022-01-20 11:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][730/1251] eta 0:19:12 lr 0.000789 time 1.8290 (2.2130) loss 3.8623 (3.7719) grad_norm 1.1337 (1.2299) [2022-01-20 11:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][740/1251] eta 0:18:49 lr 0.000789 time 1.9666 (2.2097) loss 4.0576 (3.7747) grad_norm 1.1532 (1.2302) [2022-01-20 11:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][750/1251] eta 0:18:25 lr 0.000789 time 1.9128 (2.2071) loss 2.9334 (3.7721) grad_norm 1.0896 (1.2300) [2022-01-20 11:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][760/1251] eta 0:18:03 lr 0.000789 time 2.8994 (2.2074) loss 4.0157 (3.7725) grad_norm 1.4044 (1.2297) [2022-01-20 11:45:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][770/1251] eta 0:17:42 lr 0.000789 time 1.8784 (2.2091) loss 4.3080 (3.7781) grad_norm 1.3887 (1.2300) [2022-01-20 11:46:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][780/1251] eta 0:17:21 lr 0.000789 time 2.3582 (2.2120) loss 3.9206 (3.7798) grad_norm 1.3754 (1.2309) [2022-01-20 11:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][790/1251] eta 0:17:00 lr 0.000789 time 2.8176 (2.2131) loss 3.7857 (3.7771) grad_norm 1.0660 (1.2310) [2022-01-20 11:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][800/1251] eta 0:16:37 lr 0.000789 time 1.9061 (2.2129) loss 4.0917 (3.7796) grad_norm 1.1962 (1.2317) [2022-01-20 11:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][810/1251] eta 0:16:15 lr 0.000789 time 1.8405 (2.2122) loss 3.6426 (3.7803) grad_norm 1.2134 (1.2323) [2022-01-20 11:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][820/1251] eta 0:15:52 lr 0.000789 time 1.6470 (2.2098) loss 3.1953 (3.7791) grad_norm 1.1060 (1.2329) [2022-01-20 11:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][830/1251] eta 0:15:29 lr 0.000789 time 1.8888 (2.2085) loss 2.7802 (3.7806) grad_norm 1.0534 (1.2316) [2022-01-20 11:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][840/1251] eta 0:15:08 lr 0.000789 time 2.2093 (2.2107) loss 4.0264 (3.7795) grad_norm 1.1680 (1.2311) [2022-01-20 11:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][850/1251] eta 0:14:47 lr 0.000789 time 2.2042 (2.2125) loss 3.6704 (3.7815) grad_norm 1.1738 (1.2308) [2022-01-20 11:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][860/1251] eta 0:14:25 lr 0.000789 time 2.2846 (2.2143) loss 3.8097 (3.7807) grad_norm 1.2306 (1.2310) [2022-01-20 11:49:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][870/1251] eta 0:14:03 lr 0.000789 time 1.8779 (2.2134) loss 3.4803 (3.7806) grad_norm 1.3133 (1.2304) [2022-01-20 11:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][880/1251] eta 0:13:40 lr 0.000789 time 
1.9154 (2.2105) loss 3.7348 (3.7799) grad_norm 1.4712 (1.2299) [2022-01-20 11:50:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][890/1251] eta 0:13:17 lr 0.000789 time 1.8517 (2.2079) loss 3.6569 (3.7820) grad_norm 1.5624 (1.2299) [2022-01-20 11:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][900/1251] eta 0:12:54 lr 0.000789 time 1.8648 (2.2063) loss 4.0188 (3.7815) grad_norm 1.1652 (1.2303) [2022-01-20 11:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][910/1251] eta 0:12:32 lr 0.000789 time 2.0244 (2.2060) loss 3.4613 (3.7804) grad_norm 1.0720 (1.2294) [2022-01-20 11:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][920/1251] eta 0:12:10 lr 0.000789 time 2.3757 (2.2072) loss 4.2624 (3.7804) grad_norm 1.3461 (1.2292) [2022-01-20 11:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][930/1251] eta 0:11:48 lr 0.000789 time 1.6056 (2.2070) loss 4.2764 (3.7797) grad_norm 1.1518 (1.2286) [2022-01-20 11:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][940/1251] eta 0:11:26 lr 0.000789 time 2.5365 (2.2071) loss 3.9164 (3.7788) grad_norm 1.1560 (1.2293) [2022-01-20 11:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][950/1251] eta 0:11:04 lr 0.000789 time 1.6448 (2.2073) loss 2.7514 (3.7802) grad_norm 1.3277 (1.2296) [2022-01-20 11:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][960/1251] eta 0:10:42 lr 0.000788 time 1.9949 (2.2065) loss 4.1012 (3.7811) grad_norm 1.2282 (1.2288) [2022-01-20 11:53:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][970/1251] eta 0:10:20 lr 0.000788 time 2.1784 (2.2079) loss 3.9044 (3.7823) grad_norm 1.2937 (1.2293) [2022-01-20 11:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][980/1251] eta 0:09:58 lr 0.000788 time 1.6666 (2.2072) loss 2.8006 (3.7807) grad_norm 1.2142 (1.2293) [2022-01-20 11:53:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][990/1251] eta 0:09:36 lr 0.000788 time 1.7195 (2.2080) loss 3.8667 (3.7831) grad_norm 1.1747 (1.2292) [2022-01-20 11:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1000/1251] eta 0:09:14 lr 0.000788 time 2.4186 (2.2079) loss 4.1159 (3.7825) grad_norm 1.1919 (1.2289) [2022-01-20 11:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1010/1251] eta 0:08:52 lr 0.000788 time 1.7588 (2.2079) loss 3.2347 (3.7833) grad_norm 1.1227 (1.2285) [2022-01-20 11:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1020/1251] eta 0:08:29 lr 0.000788 time 1.6640 (2.2054) loss 4.4172 (3.7842) grad_norm 1.1536 (1.2281) [2022-01-20 11:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1030/1251] eta 0:08:06 lr 0.000788 time 1.9795 (2.2034) loss 3.9349 (3.7852) grad_norm 1.2982 (1.2280) [2022-01-20 11:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1040/1251] eta 0:07:44 lr 0.000788 time 1.5332 (2.2021) loss 3.5655 (3.7821) grad_norm 1.2684 (1.2273) [2022-01-20 11:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1050/1251] eta 0:07:22 lr 0.000788 time 1.9194 (2.2018) loss 4.5801 (3.7811) grad_norm 1.1019 (1.2270) [2022-01-20 11:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1060/1251] eta 0:07:00 lr 0.000788 time 2.5681 (2.2021) loss 4.4374 (3.7819) grad_norm 1.5593 (1.2272) [2022-01-20 11:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1070/1251] eta 0:06:38 lr 0.000788 time 1.6625 (2.2029) loss 4.2301 (3.7833) grad_norm 1.2400 (1.2276) [2022-01-20 11:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1080/1251] eta 0:06:16 lr 0.000788 time 2.0671 (2.2037) loss 3.9775 (3.7823) grad_norm 1.3362 (1.2274) [2022-01-20 11:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1090/1251] eta 0:05:55 lr 0.000788 
time 2.8200 (2.2059) loss 2.3167 (3.7802) grad_norm 1.1904 (1.2275) [2022-01-20 11:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1100/1251] eta 0:05:33 lr 0.000788 time 2.2206 (2.2064) loss 3.3841 (3.7811) grad_norm 1.2510 (1.2276) [2022-01-20 11:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1110/1251] eta 0:05:11 lr 0.000788 time 2.2298 (2.2059) loss 3.9961 (3.7813) grad_norm 1.2676 (1.2283) [2022-01-20 11:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1120/1251] eta 0:04:48 lr 0.000788 time 1.7392 (2.2043) loss 4.3252 (3.7821) grad_norm 1.3243 (1.2283) [2022-01-20 11:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1130/1251] eta 0:04:26 lr 0.000788 time 2.2405 (2.2022) loss 3.6761 (3.7819) grad_norm 1.3502 (1.2288) [2022-01-20 11:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1140/1251] eta 0:04:04 lr 0.000788 time 2.0990 (2.2002) loss 3.7459 (3.7821) grad_norm 1.1757 (1.2291) [2022-01-20 11:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1150/1251] eta 0:03:42 lr 0.000788 time 2.4723 (2.2007) loss 3.6740 (3.7803) grad_norm 1.1480 (1.2289) [2022-01-20 12:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1160/1251] eta 0:03:20 lr 0.000788 time 1.5193 (2.2004) loss 3.9361 (3.7791) grad_norm 1.1578 (1.2288) [2022-01-20 12:00:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1170/1251] eta 0:02:58 lr 0.000788 time 1.8791 (2.2015) loss 4.5185 (3.7807) grad_norm 1.2348 (1.2287) [2022-01-20 12:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1180/1251] eta 0:02:36 lr 0.000788 time 2.8512 (2.2031) loss 3.6509 (3.7796) grad_norm 1.3174 (1.2287) [2022-01-20 12:01:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1190/1251] eta 0:02:14 lr 0.000788 time 2.7707 (2.2057) loss 3.9968 (3.7784) grad_norm 1.2305 (1.2286) [2022-01-20 12:01:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1200/1251] eta 0:01:52 lr 0.000788 time 1.9018 (2.2046) loss 2.8222 (3.7759) grad_norm 1.5232 (1.2285) [2022-01-20 12:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1210/1251] eta 0:01:30 lr 0.000788 time 1.9116 (2.2061) loss 3.2849 (3.7749) grad_norm 1.2300 (1.2284) [2022-01-20 12:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1220/1251] eta 0:01:08 lr 0.000788 time 1.8058 (2.2044) loss 3.9909 (3.7758) grad_norm 1.0857 (1.2281) [2022-01-20 12:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1230/1251] eta 0:00:46 lr 0.000788 time 1.9352 (2.2034) loss 4.1247 (3.7779) grad_norm 1.3316 (1.2281) [2022-01-20 12:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1240/1251] eta 0:00:24 lr 0.000788 time 1.5168 (2.2023) loss 3.8215 (3.7744) grad_norm 1.4169 (1.2285) [2022-01-20 12:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [91/300][1250/1251] eta 0:00:02 lr 0.000788 time 1.2029 (2.1969) loss 4.1192 (3.7734) grad_norm 1.5379 (1.2285) [2022-01-20 12:03:19 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 91 training takes 0:45:48 [2022-01-20 12:03:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 21.026 (21.026) Loss 1.1828 (1.1828) Acc@1 72.656 (72.656) Acc@5 91.309 (91.309) [2022-01-20 12:03:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.926 (3.521) Loss 1.2107 (1.1304) Acc@1 71.680 (73.571) Acc@5 90.723 (91.983) [2022-01-20 12:04:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.650 (2.794) Loss 1.1526 (1.1309) Acc@1 74.121 (73.679) Acc@5 90.820 (91.895) [2022-01-20 12:04:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.650 (2.318) Loss 1.0862 (1.1239) Acc@1 74.023 (73.831) Acc@5 92.285 (91.951) [2022-01-20 12:04:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.047 
(2.193) Loss 1.0635 (1.1227) Acc@1 75.977 (73.857) Acc@5 92.676 (91.992) [2022-01-20 12:04:56 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.718 Acc@5 91.930 [2022-01-20 12:04:56 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.7% [2022-01-20 12:04:56 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.72% [2022-01-20 12:05:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][0/1251] eta 7:18:59 lr 0.000788 time 21.0547 (21.0547) loss 3.7688 (3.7688) grad_norm 1.1792 (1.1792) [2022-01-20 12:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][10/1251] eta 1:21:36 lr 0.000787 time 1.6257 (3.9457) loss 3.8560 (3.5799) grad_norm 1.2304 (1.2252) [2022-01-20 12:06:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][20/1251] eta 1:04:56 lr 0.000787 time 1.3280 (3.1656) loss 4.3660 (3.6771) grad_norm 1.0924 (1.2086) [2022-01-20 12:06:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][30/1251] eta 0:57:10 lr 0.000787 time 1.8970 (2.8093) loss 4.2598 (3.6629) grad_norm 1.2919 (1.1949) [2022-01-20 12:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][40/1251] eta 0:54:48 lr 0.000787 time 3.9026 (2.7156) loss 4.5459 (3.7104) grad_norm 1.0019 (1.1877) [2022-01-20 12:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][50/1251] eta 0:52:35 lr 0.000787 time 2.5192 (2.6278) loss 3.8704 (3.6774) grad_norm 1.2665 (1.1928) [2022-01-20 12:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][60/1251] eta 0:50:55 lr 0.000787 time 1.6168 (2.5651) loss 3.9740 (3.6472) grad_norm 1.1199 (1.1898) [2022-01-20 12:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][70/1251] eta 0:49:31 lr 0.000787 time 1.6676 (2.5164) loss 4.3896 (3.6564) grad_norm 1.2356 (1.2000) [2022-01-20 12:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][80/1251] eta 
0:48:24 lr 0.000787 time 2.8585 (2.4803) loss 4.1092 (3.6716) grad_norm 1.1371 (1.1993) [2022-01-20 12:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][90/1251] eta 0:47:33 lr 0.000787 time 2.4448 (2.4580) loss 3.6697 (3.6717) grad_norm 1.1055 (1.2065) [2022-01-20 12:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][100/1251] eta 0:46:28 lr 0.000787 time 1.9548 (2.4230) loss 3.8830 (3.6557) grad_norm 1.1334 (1.2067) [2022-01-20 12:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][110/1251] eta 0:45:25 lr 0.000787 time 2.1662 (2.3886) loss 3.9876 (3.6702) grad_norm 1.0606 (1.2076) [2022-01-20 12:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][120/1251] eta 0:44:30 lr 0.000787 time 1.8805 (2.3616) loss 4.3228 (3.6943) grad_norm 1.4944 (1.2083) [2022-01-20 12:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][130/1251] eta 0:43:50 lr 0.000787 time 2.8452 (2.3465) loss 4.2064 (3.7137) grad_norm 1.3334 (1.2087) [2022-01-20 12:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][140/1251] eta 0:43:16 lr 0.000787 time 2.7821 (2.3373) loss 4.1007 (3.7304) grad_norm 1.2920 (1.2069) [2022-01-20 12:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][150/1251] eta 0:42:46 lr 0.000787 time 2.0492 (2.3311) loss 3.4829 (3.7291) grad_norm 1.2488 (1.2114) [2022-01-20 12:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][160/1251] eta 0:42:13 lr 0.000787 time 1.6083 (2.3225) loss 4.0353 (3.7447) grad_norm 1.0547 (1.2157) [2022-01-20 12:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][170/1251] eta 0:41:50 lr 0.000787 time 2.2328 (2.3226) loss 3.3299 (3.7509) grad_norm 1.0609 (1.2127) [2022-01-20 12:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][180/1251] eta 0:41:26 lr 0.000787 time 3.0675 (2.3219) loss 3.3905 (3.7571) grad_norm 1.1182 (1.2100) [2022-01-20 12:12:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][190/1251] eta 0:40:43 lr 0.000787 time 2.1866 (2.3031) loss 4.1865 (3.7577) grad_norm 1.2976 (1.2090) [2022-01-20 12:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][200/1251] eta 0:39:59 lr 0.000787 time 1.5443 (2.2828) loss 3.4763 (3.7392) grad_norm 1.4062 (1.2093) [2022-01-20 12:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][210/1251] eta 0:39:33 lr 0.000787 time 2.2732 (2.2803) loss 3.9833 (3.7474) grad_norm 1.4038 (1.2122) [2022-01-20 12:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][220/1251] eta 0:39:10 lr 0.000787 time 1.7959 (2.2795) loss 3.8064 (3.7473) grad_norm 1.3189 (1.2149) [2022-01-20 12:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][230/1251] eta 0:38:40 lr 0.000787 time 2.4968 (2.2727) loss 3.9201 (3.7541) grad_norm 1.7852 (1.2186) [2022-01-20 12:14:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][240/1251] eta 0:38:13 lr 0.000787 time 2.1369 (2.2685) loss 3.5020 (3.7571) grad_norm 1.2432 (1.2218) [2022-01-20 12:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][250/1251] eta 0:37:52 lr 0.000787 time 1.8552 (2.2700) loss 2.8816 (3.7589) grad_norm 1.4060 (1.2244) [2022-01-20 12:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][260/1251] eta 0:37:26 lr 0.000787 time 2.1290 (2.2669) loss 2.9833 (3.7616) grad_norm 1.1304 (1.2275) [2022-01-20 12:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][270/1251] eta 0:37:02 lr 0.000787 time 3.5907 (2.2655) loss 3.7390 (3.7606) grad_norm 1.0273 (1.2270) [2022-01-20 12:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][280/1251] eta 0:36:33 lr 0.000787 time 1.8414 (2.2592) loss 4.3924 (3.7532) grad_norm 1.1444 (1.2280) [2022-01-20 12:15:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][290/1251] eta 0:36:02 lr 0.000787 time 
1.6911 (2.2498) loss 2.4186 (3.7483) grad_norm 1.2117 (1.2289) [2022-01-20 12:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][300/1251] eta 0:35:34 lr 0.000786 time 2.2967 (2.2448) loss 3.6996 (3.7465) grad_norm 1.2504 (1.2297) [2022-01-20 12:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][310/1251] eta 0:35:14 lr 0.000786 time 3.4073 (2.2468) loss 4.4113 (3.7510) grad_norm 1.1196 (1.2294) [2022-01-20 12:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][320/1251] eta 0:34:52 lr 0.000786 time 1.8610 (2.2475) loss 3.2675 (3.7490) grad_norm 1.2237 (1.2287) [2022-01-20 12:17:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][330/1251] eta 0:34:30 lr 0.000786 time 2.2343 (2.2485) loss 4.1372 (3.7525) grad_norm 1.1917 (1.2295) [2022-01-20 12:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][340/1251] eta 0:34:04 lr 0.000786 time 2.2928 (2.2447) loss 4.0951 (3.7491) grad_norm 1.3276 (1.2323) [2022-01-20 12:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][350/1251] eta 0:33:45 lr 0.000786 time 3.8433 (2.2477) loss 3.8588 (3.7541) grad_norm 1.1693 (1.2312) [2022-01-20 12:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][360/1251] eta 0:33:21 lr 0.000786 time 1.6500 (2.2465) loss 4.0612 (3.7569) grad_norm 1.2052 (1.2318) [2022-01-20 12:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][370/1251] eta 0:32:56 lr 0.000786 time 1.8608 (2.2433) loss 3.0574 (3.7618) grad_norm 1.2796 (1.2327) [2022-01-20 12:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][380/1251] eta 0:32:28 lr 0.000786 time 1.9154 (2.2368) loss 3.3634 (3.7561) grad_norm 1.3611 (1.2329) [2022-01-20 12:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][390/1251] eta 0:32:10 lr 0.000786 time 4.1742 (2.2422) loss 3.5157 (3.7565) grad_norm 1.0580 (1.2344) [2022-01-20 12:19:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][400/1251] eta 0:31:45 lr 0.000786 time 1.8890 (2.2388) loss 3.8123 (3.7609) grad_norm 1.2350 (1.2366) [2022-01-20 12:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][410/1251] eta 0:31:17 lr 0.000786 time 1.8778 (2.2321) loss 2.7140 (3.7559) grad_norm 1.1654 (1.2365) [2022-01-20 12:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][420/1251] eta 0:30:54 lr 0.000786 time 1.9267 (2.2319) loss 2.2410 (3.7556) grad_norm 1.2071 (1.2357) [2022-01-20 12:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][430/1251] eta 0:30:34 lr 0.000786 time 2.9022 (2.2350) loss 4.0600 (3.7587) grad_norm 1.1150 (1.2339) [2022-01-20 12:21:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][440/1251] eta 0:30:09 lr 0.000786 time 1.5632 (2.2314) loss 3.8646 (3.7579) grad_norm 1.4159 (1.2352) [2022-01-20 12:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][450/1251] eta 0:29:45 lr 0.000786 time 1.9321 (2.2286) loss 3.2714 (3.7602) grad_norm 1.2551 (1.2360) [2022-01-20 12:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][460/1251] eta 0:29:21 lr 0.000786 time 1.8397 (2.2264) loss 4.2229 (3.7542) grad_norm 1.3273 (1.2356) [2022-01-20 12:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][470/1251] eta 0:29:03 lr 0.000786 time 3.6066 (2.2318) loss 3.7292 (3.7621) grad_norm 1.2179 (1.2355) [2022-01-20 12:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][480/1251] eta 0:28:39 lr 0.000786 time 2.7225 (2.2302) loss 4.0467 (3.7633) grad_norm 1.2770 (1.2341) [2022-01-20 12:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][490/1251] eta 0:28:12 lr 0.000786 time 1.6374 (2.2247) loss 3.5636 (3.7686) grad_norm 1.2896 (1.2341) [2022-01-20 12:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][500/1251] eta 0:27:49 lr 0.000786 time 
2.1271 (2.2232) loss 2.7833 (3.7681) grad_norm 1.0311 (1.2320) [2022-01-20 12:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][510/1251] eta 0:27:25 lr 0.000786 time 2.1120 (2.2201) loss 4.0792 (3.7650) grad_norm 0.9947 (1.2317) [2022-01-20 12:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][520/1251] eta 0:27:00 lr 0.000786 time 2.2837 (2.2173) loss 4.1636 (3.7640) grad_norm 1.3614 (1.2319) [2022-01-20 12:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][530/1251] eta 0:26:37 lr 0.000786 time 1.9166 (2.2156) loss 4.3723 (3.7666) grad_norm 1.2079 (1.2304) [2022-01-20 12:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][540/1251] eta 0:26:13 lr 0.000786 time 1.8425 (2.2128) loss 3.3008 (3.7706) grad_norm 1.0853 (1.2301) [2022-01-20 12:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][550/1251] eta 0:25:50 lr 0.000786 time 2.1699 (2.2121) loss 4.0702 (3.7700) grad_norm 1.2332 (1.2299) [2022-01-20 12:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][560/1251] eta 0:25:30 lr 0.000786 time 3.0321 (2.2143) loss 4.3652 (3.7754) grad_norm 1.1766 (1.2294) [2022-01-20 12:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][570/1251] eta 0:25:07 lr 0.000786 time 1.8932 (2.2131) loss 3.3487 (3.7681) grad_norm 1.6243 (1.2297) [2022-01-20 12:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][580/1251] eta 0:24:44 lr 0.000786 time 1.6142 (2.2117) loss 4.0595 (3.7663) grad_norm 1.0608 (1.2289) [2022-01-20 12:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][590/1251] eta 0:24:22 lr 0.000785 time 1.8146 (2.2127) loss 4.2901 (3.7652) grad_norm 1.3734 (1.2294) [2022-01-20 12:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][600/1251] eta 0:24:02 lr 0.000785 time 3.2033 (2.2155) loss 4.3661 (3.7691) grad_norm 1.2427 (1.2314) [2022-01-20 12:27:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][610/1251] eta 0:23:40 lr 0.000785 time 2.7852 (2.2162) loss 3.1469 (3.7714) grad_norm 1.1776 (1.2317) [2022-01-20 12:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][620/1251] eta 0:23:17 lr 0.000785 time 2.0013 (2.2150) loss 4.1488 (3.7694) grad_norm 1.1002 (1.2313) [2022-01-20 12:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][630/1251] eta 0:22:54 lr 0.000785 time 2.0256 (2.2139) loss 4.4726 (3.7668) grad_norm 1.5742 (1.2321) [2022-01-20 12:28:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][640/1251] eta 0:22:32 lr 0.000785 time 2.7414 (2.2139) loss 3.6559 (3.7684) grad_norm 1.3671 (1.2318) [2022-01-20 12:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][650/1251] eta 0:22:09 lr 0.000785 time 2.4037 (2.2119) loss 4.1371 (3.7702) grad_norm 0.9608 (1.2301) [2022-01-20 12:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][660/1251] eta 0:21:45 lr 0.000785 time 1.8798 (2.2085) loss 3.9699 (3.7720) grad_norm 1.0736 (1.2292) [2022-01-20 12:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][670/1251] eta 0:21:22 lr 0.000785 time 1.9424 (2.2081) loss 3.7885 (3.7732) grad_norm 1.1213 (1.2291) [2022-01-20 12:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][680/1251] eta 0:21:00 lr 0.000785 time 1.8421 (2.2078) loss 3.8678 (3.7761) grad_norm 1.2417 (1.2315) [2022-01-20 12:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][690/1251] eta 0:20:40 lr 0.000785 time 3.5837 (2.2105) loss 4.2734 (3.7740) grad_norm 1.3251 (1.2323) [2022-01-20 12:30:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][700/1251] eta 0:20:18 lr 0.000785 time 2.2158 (2.2116) loss 3.8198 (3.7696) grad_norm 1.3493 (1.2321) [2022-01-20 12:31:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][710/1251] eta 0:19:55 lr 0.000785 time 
2.2020 (2.2099) loss 4.6107 (3.7736) grad_norm 1.1677 (1.2328) [2022-01-20 12:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][720/1251] eta 0:19:33 lr 0.000785 time 2.3072 (2.2093) loss 3.5248 (3.7743) grad_norm 1.3140 (1.2322) [2022-01-20 12:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][730/1251] eta 0:19:09 lr 0.000785 time 2.2264 (2.2073) loss 2.5935 (3.7701) grad_norm 1.3935 (1.2336) [2022-01-20 12:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][740/1251] eta 0:18:47 lr 0.000785 time 1.6805 (2.2057) loss 4.5869 (3.7727) grad_norm 1.1842 (1.2342) [2022-01-20 12:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][750/1251] eta 0:18:25 lr 0.000785 time 2.8140 (2.2057) loss 4.3855 (3.7747) grad_norm 1.0906 (1.2332) [2022-01-20 12:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][760/1251] eta 0:18:03 lr 0.000785 time 2.1867 (2.2075) loss 4.2743 (3.7729) grad_norm 1.1311 (1.2337) [2022-01-20 12:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][770/1251] eta 0:17:41 lr 0.000785 time 2.3460 (2.2077) loss 3.7641 (3.7718) grad_norm 1.0969 (1.2331) [2022-01-20 12:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][780/1251] eta 0:17:19 lr 0.000785 time 1.6614 (2.2070) loss 3.4096 (3.7693) grad_norm 1.1431 (1.2344) [2022-01-20 12:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][790/1251] eta 0:16:56 lr 0.000785 time 2.0924 (2.2051) loss 4.1152 (3.7707) grad_norm 1.1039 (1.2338) [2022-01-20 12:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][800/1251] eta 0:16:33 lr 0.000785 time 1.5881 (2.2027) loss 3.9851 (3.7699) grad_norm 1.2856 (1.2337) [2022-01-20 12:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][810/1251] eta 0:16:10 lr 0.000785 time 1.8235 (2.2015) loss 3.4447 (3.7689) grad_norm 1.2889 (1.2341) [2022-01-20 12:35:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][820/1251] eta 0:15:48 lr 0.000785 time 2.2834 (2.2014) loss 3.7159 (3.7704) grad_norm 1.1883 (1.2341) [2022-01-20 12:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][830/1251] eta 0:15:26 lr 0.000785 time 2.8456 (2.2018) loss 3.6863 (3.7682) grad_norm 1.1781 (1.2332) [2022-01-20 12:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][840/1251] eta 0:15:04 lr 0.000785 time 2.2360 (2.2016) loss 4.3546 (3.7665) grad_norm 1.2809 (1.2323) [2022-01-20 12:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][850/1251] eta 0:14:43 lr 0.000785 time 2.0923 (2.2045) loss 3.6278 (3.7659) grad_norm 1.3658 (1.2324) [2022-01-20 12:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][860/1251] eta 0:14:22 lr 0.000785 time 1.9532 (2.2056) loss 2.1993 (3.7650) grad_norm 1.2116 (1.2332) [2022-01-20 12:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][870/1251] eta 0:14:00 lr 0.000785 time 2.7516 (2.2052) loss 4.0498 (3.7646) grad_norm 1.0789 (1.2332) [2022-01-20 12:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][880/1251] eta 0:13:36 lr 0.000785 time 1.6475 (2.2020) loss 3.8115 (3.7631) grad_norm 1.1046 (1.2336) [2022-01-20 12:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][890/1251] eta 0:13:14 lr 0.000784 time 1.6595 (2.1998) loss 3.9192 (3.7655) grad_norm 1.1348 (1.2332) [2022-01-20 12:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][900/1251] eta 0:12:52 lr 0.000784 time 2.4009 (2.1996) loss 4.2056 (3.7646) grad_norm 1.2102 (1.2333) [2022-01-20 12:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][910/1251] eta 0:12:30 lr 0.000784 time 2.8647 (2.1996) loss 2.7113 (3.7636) grad_norm 1.1865 (1.2329) [2022-01-20 12:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][920/1251] eta 0:12:07 lr 0.000784 time 
2.1057 (2.1988) loss 3.5840 (3.7640) grad_norm 1.3132 (1.2339) [2022-01-20 12:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][930/1251] eta 0:11:45 lr 0.000784 time 1.9123 (2.1993) loss 4.0827 (3.7631) grad_norm 1.2289 (1.2331) [2022-01-20 12:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][940/1251] eta 0:11:24 lr 0.000784 time 2.2375 (2.2010) loss 3.6168 (3.7629) grad_norm 1.2796 (1.2331) [2022-01-20 12:39:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][950/1251] eta 0:11:03 lr 0.000784 time 3.1559 (2.2034) loss 2.6216 (3.7620) grad_norm 1.2453 (1.2346) [2022-01-20 12:40:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][960/1251] eta 0:10:40 lr 0.000784 time 1.8907 (2.2027) loss 3.7519 (3.7592) grad_norm 1.1254 (1.2344) [2022-01-20 12:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][970/1251] eta 0:10:18 lr 0.000784 time 1.6785 (2.2004) loss 4.1488 (3.7586) grad_norm 1.1677 (1.2348) [2022-01-20 12:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][980/1251] eta 0:09:55 lr 0.000784 time 1.5683 (2.1984) loss 4.1400 (3.7585) grad_norm 1.1767 (1.2354) [2022-01-20 12:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][990/1251] eta 0:09:33 lr 0.000784 time 1.8255 (2.1978) loss 4.1302 (3.7591) grad_norm 1.1764 (1.2358) [2022-01-20 12:41:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1000/1251] eta 0:09:11 lr 0.000784 time 2.0976 (2.1983) loss 4.3501 (3.7605) grad_norm 1.1103 (1.2351) [2022-01-20 12:42:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1010/1251] eta 0:08:50 lr 0.000784 time 1.8321 (2.1995) loss 4.5406 (3.7596) grad_norm 1.2055 (1.2349) [2022-01-20 12:42:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1020/1251] eta 0:08:28 lr 0.000784 time 2.1771 (2.2003) loss 3.8870 (3.7601) grad_norm 1.1118 (1.2348) [2022-01-20 12:42:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1030/1251] eta 0:08:06 lr 0.000784 time 2.1129 (2.1997) loss 3.0345 (3.7594) grad_norm 1.1081 (1.2353) [2022-01-20 12:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1040/1251] eta 0:07:44 lr 0.000784 time 2.4999 (2.1992) loss 4.1949 (3.7601) grad_norm 1.5195 (1.2357) [2022-01-20 12:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1050/1251] eta 0:07:21 lr 0.000784 time 1.8365 (2.1980) loss 3.7900 (3.7606) grad_norm 1.6233 (1.2358) [2022-01-20 12:43:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1060/1251] eta 0:06:59 lr 0.000784 time 1.8957 (2.1972) loss 3.5188 (3.7566) grad_norm 1.1559 (1.2354) [2022-01-20 12:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1070/1251] eta 0:06:38 lr 0.000784 time 2.8143 (2.2006) loss 3.7598 (3.7590) grad_norm 1.4332 (1.2358) [2022-01-20 12:44:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1080/1251] eta 0:06:16 lr 0.000784 time 3.0501 (2.1991) loss 3.3555 (3.7577) grad_norm 1.2712 (1.2358) [2022-01-20 12:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1090/1251] eta 0:05:53 lr 0.000784 time 1.9499 (2.1971) loss 4.2060 (3.7586) grad_norm 1.3614 (1.2355) [2022-01-20 12:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1100/1251] eta 0:05:31 lr 0.000784 time 1.9523 (2.1946) loss 4.0525 (3.7594) grad_norm 1.2752 (1.2358) [2022-01-20 12:45:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1110/1251] eta 0:05:09 lr 0.000784 time 2.1055 (2.1939) loss 3.9564 (3.7607) grad_norm 1.2460 (1.2357) [2022-01-20 12:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1120/1251] eta 0:04:47 lr 0.000784 time 1.9603 (2.1939) loss 4.3070 (3.7587) grad_norm 1.2474 (1.2363) [2022-01-20 12:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1130/1251] eta 0:04:25 lr 
0.000784 time 2.5885 (2.1947) loss 4.3046 (3.7609) grad_norm 1.0615 (1.2367) [2022-01-20 12:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1140/1251] eta 0:04:03 lr 0.000784 time 2.4269 (2.1953) loss 4.2407 (3.7625) grad_norm 1.2129 (1.2366) [2022-01-20 12:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1150/1251] eta 0:03:41 lr 0.000784 time 2.4165 (2.1953) loss 3.9353 (3.7611) grad_norm 1.3144 (1.2371) [2022-01-20 12:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1160/1251] eta 0:03:19 lr 0.000784 time 1.8282 (2.1967) loss 3.9194 (3.7632) grad_norm 1.0966 (1.2371) [2022-01-20 12:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1170/1251] eta 0:02:57 lr 0.000784 time 1.9403 (2.1972) loss 3.8485 (3.7630) grad_norm 1.0673 (1.2365) [2022-01-20 12:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1180/1251] eta 0:02:36 lr 0.000783 time 2.3745 (2.1984) loss 3.6878 (3.7621) grad_norm 1.4942 (1.2367) [2022-01-20 12:48:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1190/1251] eta 0:02:14 lr 0.000783 time 2.6103 (2.1978) loss 3.1529 (3.7603) grad_norm 1.1944 (1.2364) [2022-01-20 12:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1200/1251] eta 0:01:51 lr 0.000783 time 1.8039 (2.1958) loss 3.4136 (3.7601) grad_norm 1.0345 (1.2363) [2022-01-20 12:49:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1210/1251] eta 0:01:29 lr 0.000783 time 1.8776 (2.1949) loss 3.9194 (3.7619) grad_norm 1.0675 (1.2360) [2022-01-20 12:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1220/1251] eta 0:01:08 lr 0.000783 time 2.2318 (2.1944) loss 4.3056 (3.7611) grad_norm 1.1110 (1.2357) [2022-01-20 12:49:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1230/1251] eta 0:00:46 lr 0.000783 time 1.9882 (2.1943) loss 3.9209 (3.7610) grad_norm 1.2388 (1.2351) [2022-01-20 12:50:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1240/1251] eta 0:00:24 lr 0.000783 time 2.2646 (2.1941) loss 3.4652 (3.7611) grad_norm 1.3156 (1.2351) [2022-01-20 12:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [92/300][1250/1251] eta 0:00:02 lr 0.000783 time 1.1609 (2.1887) loss 3.7396 (3.7616) grad_norm 1.1610 (1.2348) [2022-01-20 12:50:34 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 92 training takes 0:45:38 [2022-01-20 12:50:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.654 (18.654) Loss 1.1378 (1.1378) Acc@1 73.340 (73.340) Acc@5 91.504 (91.504) [2022-01-20 12:51:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.210 (3.327) Loss 1.0602 (1.1367) Acc@1 75.977 (73.713) Acc@5 94.141 (91.966) [2022-01-20 12:51:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.611 (2.691) Loss 1.0733 (1.1382) Acc@1 73.828 (73.419) Acc@5 93.066 (91.969) [2022-01-20 12:51:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.582 (2.321) Loss 1.0685 (1.1341) Acc@1 75.195 (73.463) Acc@5 92.773 (92.046) [2022-01-20 12:52:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.633 (2.136) Loss 1.2764 (1.1477) Acc@1 70.898 (73.237) Acc@5 90.430 (91.916) [2022-01-20 12:52:11 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.284 Acc@5 91.972 [2022-01-20 12:52:11 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.3% [2022-01-20 12:52:11 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.72% [2022-01-20 12:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][0/1251] eta 7:11:10 lr 0.000783 time 20.6802 (20.6802) loss 4.2206 (4.2206) grad_norm 1.2168 (1.2168) [2022-01-20 12:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][10/1251] eta 1:21:55 lr 0.000783 time 2.5467 (3.9608) loss 4.1512 (3.6941) grad_norm 1.0660 
(1.1843) [2022-01-20 12:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][20/1251] eta 1:03:27 lr 0.000783 time 1.7107 (3.0929) loss 3.5229 (3.8045) grad_norm 1.1571 (1.1935) [2022-01-20 12:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][30/1251] eta 0:57:01 lr 0.000783 time 1.5632 (2.8022) loss 4.0183 (3.7670) grad_norm 1.3163 (1.2036) [2022-01-20 12:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][40/1251] eta 0:54:16 lr 0.000783 time 3.2022 (2.6891) loss 3.8828 (3.7998) grad_norm 1.1011 (1.2120) [2022-01-20 12:54:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][50/1251] eta 0:52:35 lr 0.000783 time 2.7413 (2.6272) loss 3.4409 (3.7407) grad_norm 1.1616 (1.2063) [2022-01-20 12:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][60/1251] eta 0:50:53 lr 0.000783 time 1.9527 (2.5637) loss 4.5302 (3.7630) grad_norm 1.0978 (1.2002) [2022-01-20 12:55:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][70/1251] eta 0:49:11 lr 0.000783 time 2.2176 (2.4988) loss 3.2378 (3.7781) grad_norm 1.0908 (1.2108) [2022-01-20 12:55:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][80/1251] eta 0:47:53 lr 0.000783 time 2.8631 (2.4542) loss 4.4537 (3.7787) grad_norm 1.4513 (1.2130) [2022-01-20 12:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][90/1251] eta 0:47:07 lr 0.000783 time 3.2028 (2.4351) loss 4.4822 (3.7945) grad_norm 1.2895 (1.2175) [2022-01-20 12:56:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][100/1251] eta 0:46:22 lr 0.000783 time 2.6723 (2.4173) loss 4.2220 (3.8193) grad_norm 1.1532 (1.2163) [2022-01-20 12:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][110/1251] eta 0:45:49 lr 0.000783 time 2.7189 (2.4100) loss 4.3044 (3.8205) grad_norm 1.1809 (1.2116) [2022-01-20 12:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][120/1251] eta 0:45:14 
lr 0.000783 time 2.5242 (2.4004) loss 3.4660 (3.8181) grad_norm 1.5091 (1.2201) [2022-01-20 12:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][130/1251] eta 0:44:36 lr 0.000783 time 2.5080 (2.3876) loss 4.1956 (3.8067) grad_norm 1.2416 (1.2228) [2022-01-20 12:57:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][140/1251] eta 0:43:55 lr 0.000783 time 2.5130 (2.3718) loss 3.8193 (3.7927) grad_norm 1.2206 (1.2203) [2022-01-20 12:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][150/1251] eta 0:43:08 lr 0.000783 time 2.2905 (2.3507) loss 3.9147 (3.7956) grad_norm 1.1862 (1.2213) [2022-01-20 12:58:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][160/1251] eta 0:42:22 lr 0.000783 time 1.5366 (2.3301) loss 4.0073 (3.8026) grad_norm 1.0824 (1.2238) [2022-01-20 12:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][170/1251] eta 0:41:55 lr 0.000783 time 3.1185 (2.3273) loss 4.1383 (3.7934) grad_norm 1.2317 (1.2231) [2022-01-20 12:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][180/1251] eta 0:41:23 lr 0.000783 time 2.5290 (2.3193) loss 4.2935 (3.8061) grad_norm 1.3769 (1.2257) [2022-01-20 12:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][190/1251] eta 0:40:55 lr 0.000783 time 2.4174 (2.3146) loss 3.3446 (3.8005) grad_norm 1.1033 (1.2279) [2022-01-20 12:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][200/1251] eta 0:40:25 lr 0.000783 time 2.1308 (2.3078) loss 4.1743 (3.7892) grad_norm 1.2702 (1.2275) [2022-01-20 13:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][210/1251] eta 0:39:54 lr 0.000783 time 2.2349 (2.3001) loss 3.2733 (3.7900) grad_norm 1.2410 (1.2287) [2022-01-20 13:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][220/1251] eta 0:39:31 lr 0.000782 time 2.1389 (2.3002) loss 2.7268 (3.7754) grad_norm 1.3244 (1.2295) [2022-01-20 13:01:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][230/1251] eta 0:39:02 lr 0.000782 time 2.3919 (2.2942) loss 3.8185 (3.7838) grad_norm 1.1430 (1.2304) [2022-01-20 13:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][240/1251] eta 0:38:34 lr 0.000782 time 1.9296 (2.2893) loss 2.9921 (3.7668) grad_norm 1.3492 (1.2334) [2022-01-20 13:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][250/1251] eta 0:38:13 lr 0.000782 time 2.5511 (2.2913) loss 4.3564 (3.7730) grad_norm 1.3200 (1.2332) [2022-01-20 13:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][260/1251] eta 0:37:50 lr 0.000782 time 2.0361 (2.2911) loss 3.9752 (3.7777) grad_norm 1.3450 (1.2389) [2022-01-20 13:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][270/1251] eta 0:37:18 lr 0.000782 time 2.0671 (2.2821) loss 3.7334 (3.7807) grad_norm 1.2554 (1.2432) [2022-01-20 13:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][280/1251] eta 0:36:53 lr 0.000782 time 2.4102 (2.2796) loss 3.4925 (3.7871) grad_norm 1.2045 (1.2466) [2022-01-20 13:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][290/1251] eta 0:36:24 lr 0.000782 time 1.9312 (2.2728) loss 4.2548 (3.7865) grad_norm 1.3078 (1.2451) [2022-01-20 13:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][300/1251] eta 0:35:53 lr 0.000782 time 2.4843 (2.2645) loss 2.4569 (3.7902) grad_norm 1.2556 (1.2441) [2022-01-20 13:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][310/1251] eta 0:35:24 lr 0.000782 time 2.5216 (2.2578) loss 4.4935 (3.7980) grad_norm 1.1410 (1.2422) [2022-01-20 13:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][320/1251] eta 0:34:57 lr 0.000782 time 2.4422 (2.2528) loss 4.0513 (3.7925) grad_norm 1.1422 (1.2404) [2022-01-20 13:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][330/1251] eta 0:34:33 lr 0.000782 time 
1.8894 (2.2511) loss 4.6678 (3.7846) grad_norm 1.1921 (1.2393) [2022-01-20 13:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][340/1251] eta 0:34:11 lr 0.000782 time 2.4509 (2.2523) loss 3.8783 (3.7790) grad_norm 1.1354 (1.2383) [2022-01-20 13:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][350/1251] eta 0:33:50 lr 0.000782 time 1.9239 (2.2532) loss 4.1507 (3.7831) grad_norm 1.2634 (1.2378) [2022-01-20 13:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][360/1251] eta 0:33:30 lr 0.000782 time 1.6409 (2.2570) loss 3.9172 (3.7916) grad_norm 1.3615 (1.2358) [2022-01-20 13:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][370/1251] eta 0:33:10 lr 0.000782 time 2.0101 (2.2591) loss 2.9184 (3.7861) grad_norm 1.2589 (1.2360) [2022-01-20 13:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][380/1251] eta 0:32:44 lr 0.000782 time 1.7712 (2.2552) loss 3.3769 (3.7776) grad_norm 1.1925 (1.2368) [2022-01-20 13:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][390/1251] eta 0:32:17 lr 0.000782 time 1.9005 (2.2504) loss 3.8024 (3.7741) grad_norm 1.4080 (1.2377) [2022-01-20 13:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][400/1251] eta 0:31:49 lr 0.000782 time 1.5822 (2.2436) loss 3.0793 (3.7758) grad_norm 1.2752 (1.2391) [2022-01-20 13:07:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][410/1251] eta 0:31:23 lr 0.000782 time 1.9566 (2.2390) loss 4.1642 (3.7808) grad_norm 1.5416 (1.2395) [2022-01-20 13:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][420/1251] eta 0:30:55 lr 0.000782 time 1.9005 (2.2331) loss 2.9577 (3.7776) grad_norm 1.2948 (1.2379) [2022-01-20 13:08:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][430/1251] eta 0:30:33 lr 0.000782 time 2.5038 (2.2332) loss 3.4610 (3.7772) grad_norm 1.0921 (1.2368) [2022-01-20 13:08:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][440/1251] eta 0:30:12 lr 0.000782 time 2.1731 (2.2353) loss 3.3622 (3.7744) grad_norm 1.2710 (1.2366) [2022-01-20 13:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][450/1251] eta 0:29:51 lr 0.000782 time 2.1498 (2.2370) loss 3.2171 (3.7756) grad_norm 1.1024 (1.2357) [2022-01-20 13:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][460/1251] eta 0:29:29 lr 0.000782 time 1.6625 (2.2368) loss 3.9322 (3.7822) grad_norm 1.2678 (1.2347) [2022-01-20 13:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][470/1251] eta 0:29:06 lr 0.000782 time 1.9419 (2.2367) loss 3.8355 (3.7783) grad_norm 1.2301 (1.2366) [2022-01-20 13:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][480/1251] eta 0:28:44 lr 0.000782 time 2.2713 (2.2367) loss 4.4657 (3.7774) grad_norm 1.3497 (1.2352) [2022-01-20 13:10:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][490/1251] eta 0:28:19 lr 0.000782 time 1.5680 (2.2334) loss 2.7288 (3.7773) grad_norm 1.5858 (1.2352) [2022-01-20 13:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][500/1251] eta 0:27:54 lr 0.000782 time 1.7469 (2.2291) loss 4.1249 (3.7785) grad_norm 1.0755 (1.2365) [2022-01-20 13:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][510/1251] eta 0:27:32 lr 0.000781 time 1.8076 (2.2296) loss 4.2934 (3.7769) grad_norm 1.1092 (1.2355) [2022-01-20 13:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][520/1251] eta 0:27:09 lr 0.000781 time 2.3708 (2.2296) loss 4.5704 (3.7766) grad_norm 1.4645 (1.2378) [2022-01-20 13:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][530/1251] eta 0:26:46 lr 0.000781 time 2.5022 (2.2285) loss 3.7229 (3.7698) grad_norm 1.0365 (1.2376) [2022-01-20 13:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][540/1251] eta 0:26:23 lr 0.000781 time 
1.9288 (2.2267) loss 4.1303 (3.7655) grad_norm 1.2153 (1.2392) [2022-01-20 13:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][550/1251] eta 0:26:02 lr 0.000781 time 2.2407 (2.2284) loss 4.2005 (3.7638) grad_norm 1.0397 (1.2386) [2022-01-20 13:13:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][560/1251] eta 0:25:38 lr 0.000781 time 1.8774 (2.2263) loss 3.5013 (3.7672) grad_norm 1.3873 (1.2375) [2022-01-20 13:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][570/1251] eta 0:25:14 lr 0.000781 time 1.9301 (2.2243) loss 4.2291 (3.7673) grad_norm 1.4964 (1.2389) [2022-01-20 13:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][580/1251] eta 0:24:51 lr 0.000781 time 1.9665 (2.2225) loss 4.6055 (3.7676) grad_norm 1.3078 (1.2390) [2022-01-20 13:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][590/1251] eta 0:24:30 lr 0.000781 time 1.8762 (2.2249) loss 4.2579 (3.7667) grad_norm 1.2181 (1.2387) [2022-01-20 13:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][600/1251] eta 0:24:06 lr 0.000781 time 1.5514 (2.2222) loss 3.6640 (3.7695) grad_norm 1.1491 (1.2387) [2022-01-20 13:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][610/1251] eta 0:23:43 lr 0.000781 time 2.2454 (2.2209) loss 4.5100 (3.7692) grad_norm 1.1397 (1.2377) [2022-01-20 13:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][620/1251] eta 0:23:21 lr 0.000781 time 1.6616 (2.2217) loss 4.0126 (3.7679) grad_norm 1.4460 (1.2376) [2022-01-20 13:15:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][630/1251] eta 0:23:00 lr 0.000781 time 2.2641 (2.2232) loss 4.2522 (3.7705) grad_norm 1.3219 (1.2379) [2022-01-20 13:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][640/1251] eta 0:22:36 lr 0.000781 time 2.6349 (2.2198) loss 4.0852 (3.7699) grad_norm 1.0622 (1.2363) [2022-01-20 13:16:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][650/1251] eta 0:22:13 lr 0.000781 time 2.3779 (2.2194) loss 3.9629 (3.7705) grad_norm 1.2300 (1.2359) [2022-01-20 13:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][660/1251] eta 0:21:50 lr 0.000781 time 1.6556 (2.2169) loss 3.7652 (3.7694) grad_norm 1.1670 (1.2360) [2022-01-20 13:16:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][670/1251] eta 0:21:27 lr 0.000781 time 2.5734 (2.2168) loss 4.5506 (3.7709) grad_norm 1.2223 (1.2366) [2022-01-20 13:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][680/1251] eta 0:21:04 lr 0.000781 time 2.0353 (2.2147) loss 3.8696 (3.7696) grad_norm 1.4357 (1.2375) [2022-01-20 13:17:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][690/1251] eta 0:20:42 lr 0.000781 time 2.8008 (2.2152) loss 4.4866 (3.7716) grad_norm 1.2871 (1.2364) [2022-01-20 13:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][700/1251] eta 0:20:21 lr 0.000781 time 2.6621 (2.2170) loss 4.4381 (3.7690) grad_norm 1.2547 (1.2371) [2022-01-20 13:18:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][710/1251] eta 0:20:00 lr 0.000781 time 2.6180 (2.2190) loss 3.7610 (3.7690) grad_norm 1.1112 (1.2379) [2022-01-20 13:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][720/1251] eta 0:19:39 lr 0.000781 time 1.8080 (2.2211) loss 3.7300 (3.7683) grad_norm 1.1575 (1.2370) [2022-01-20 13:19:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][730/1251] eta 0:19:16 lr 0.000781 time 1.5585 (2.2191) loss 4.2965 (3.7730) grad_norm 1.2248 (1.2367) [2022-01-20 13:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][740/1251] eta 0:18:54 lr 0.000781 time 3.4937 (2.2196) loss 4.1117 (3.7707) grad_norm 1.1501 (1.2369) [2022-01-20 13:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][750/1251] eta 0:18:33 lr 0.000781 time 
3.7315 (2.2220) loss 4.3039 (3.7731) grad_norm 1.3443 (1.2370) [2022-01-20 13:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][760/1251] eta 0:18:10 lr 0.000781 time 1.8618 (2.2209) loss 3.6727 (3.7755) grad_norm 1.1888 (1.2371) [2022-01-20 13:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][770/1251] eta 0:17:47 lr 0.000781 time 1.8707 (2.2189) loss 4.0291 (3.7771) grad_norm 1.2901 (1.2374) [2022-01-20 13:21:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][780/1251] eta 0:17:24 lr 0.000781 time 2.9276 (2.2182) loss 2.7991 (3.7766) grad_norm 1.1756 (1.2385) [2022-01-20 13:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][790/1251] eta 0:17:02 lr 0.000781 time 3.0241 (2.2180) loss 4.1607 (3.7769) grad_norm 1.2425 (1.2379) [2022-01-20 13:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][800/1251] eta 0:16:39 lr 0.000780 time 1.9431 (2.2159) loss 3.0123 (3.7775) grad_norm 1.0722 (1.2369) [2022-01-20 13:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][810/1251] eta 0:16:16 lr 0.000780 time 2.6072 (2.2145) loss 4.1315 (3.7755) grad_norm 1.0982 (1.2368) [2022-01-20 13:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][820/1251] eta 0:15:53 lr 0.000780 time 1.9589 (2.2130) loss 4.5526 (3.7741) grad_norm 1.3005 (1.2375) [2022-01-20 13:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][830/1251] eta 0:15:31 lr 0.000780 time 2.7043 (2.2115) loss 4.2811 (3.7731) grad_norm 1.1638 (1.2368) [2022-01-20 13:23:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][840/1251] eta 0:15:09 lr 0.000780 time 2.1641 (2.2119) loss 3.0041 (3.7731) grad_norm 1.0773 (1.2358) [2022-01-20 13:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][850/1251] eta 0:14:47 lr 0.000780 time 2.8799 (2.2140) loss 4.1097 (3.7721) grad_norm 1.1858 (1.2352) [2022-01-20 13:23:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][860/1251] eta 0:14:25 lr 0.000780 time 1.5588 (2.2135) loss 3.9673 (3.7739) grad_norm 1.1518 (1.2348) [2022-01-20 13:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][870/1251] eta 0:14:03 lr 0.000780 time 1.8216 (2.2132) loss 3.8213 (3.7767) grad_norm 1.1608 (1.2338) [2022-01-20 13:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][880/1251] eta 0:13:40 lr 0.000780 time 2.2438 (2.2123) loss 3.3290 (3.7726) grad_norm 1.2048 (1.2346) [2022-01-20 13:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][890/1251] eta 0:13:17 lr 0.000780 time 2.1435 (2.2100) loss 2.4168 (3.7728) grad_norm 1.0882 (1.2345) [2022-01-20 13:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][900/1251] eta 0:12:54 lr 0.000780 time 1.8885 (2.2080) loss 2.7387 (3.7714) grad_norm 1.2548 (1.2347) [2022-01-20 13:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][910/1251] eta 0:12:32 lr 0.000780 time 1.5926 (2.2066) loss 3.7189 (3.7700) grad_norm 1.0492 (1.2346) [2022-01-20 13:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][920/1251] eta 0:12:10 lr 0.000780 time 1.8283 (2.2072) loss 3.9884 (3.7689) grad_norm 1.3302 (1.2342) [2022-01-20 13:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][930/1251] eta 0:11:48 lr 0.000780 time 1.8471 (2.2079) loss 3.9605 (3.7701) grad_norm 1.3951 (1.2343) [2022-01-20 13:26:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][940/1251] eta 0:11:27 lr 0.000780 time 2.1286 (2.2102) loss 4.3050 (3.7710) grad_norm 1.3574 (1.2354) [2022-01-20 13:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][950/1251] eta 0:11:05 lr 0.000780 time 1.6147 (2.2095) loss 3.5526 (3.7696) grad_norm 1.1938 (1.2363) [2022-01-20 13:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][960/1251] eta 0:10:42 lr 0.000780 time 
1.8671 (2.2090) loss 3.1356 (3.7668) grad_norm 1.1545 (1.2367) [2022-01-20 13:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][970/1251] eta 0:10:20 lr 0.000780 time 1.8241 (2.2073) loss 3.3406 (3.7669) grad_norm 1.2684 (1.2365) [2022-01-20 13:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][980/1251] eta 0:09:57 lr 0.000780 time 1.7992 (2.2065) loss 3.5291 (3.7689) grad_norm 1.0743 (1.2355) [2022-01-20 13:28:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][990/1251] eta 0:09:35 lr 0.000780 time 1.9536 (2.2060) loss 3.2655 (3.7689) grad_norm 1.5587 (1.2354) [2022-01-20 13:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1000/1251] eta 0:09:14 lr 0.000780 time 1.8406 (2.2074) loss 4.4867 (3.7674) grad_norm 1.1677 (1.2358) [2022-01-20 13:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1010/1251] eta 0:08:51 lr 0.000780 time 2.5304 (2.2065) loss 3.4162 (3.7653) grad_norm 1.2840 (1.2372) [2022-01-20 13:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1020/1251] eta 0:08:29 lr 0.000780 time 2.2489 (2.2064) loss 4.1188 (3.7657) grad_norm 1.1494 (1.2368) [2022-01-20 13:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1030/1251] eta 0:08:07 lr 0.000780 time 1.8563 (2.2070) loss 4.6870 (3.7639) grad_norm 1.2526 (1.2361) [2022-01-20 13:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1040/1251] eta 0:07:46 lr 0.000780 time 2.4250 (2.2091) loss 4.8089 (3.7681) grad_norm 1.4778 (1.2359) [2022-01-20 13:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1050/1251] eta 0:07:24 lr 0.000780 time 2.6336 (2.2096) loss 3.3513 (3.7663) grad_norm 1.2356 (1.2365) [2022-01-20 13:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1060/1251] eta 0:07:01 lr 0.000780 time 1.8314 (2.2087) loss 4.0847 (3.7676) grad_norm 1.3896 (1.2374) [2022-01-20 13:31:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1070/1251] eta 0:06:39 lr 0.000780 time 1.5884 (2.2069) loss 3.0832 (3.7661) grad_norm 1.1384 (1.2377) [2022-01-20 13:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1080/1251] eta 0:06:17 lr 0.000780 time 2.2556 (2.2073) loss 4.0927 (3.7667) grad_norm 1.2605 (1.2374) [2022-01-20 13:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1090/1251] eta 0:05:55 lr 0.000779 time 1.5840 (2.2066) loss 3.4131 (3.7657) grad_norm 1.3620 (1.2371) [2022-01-20 13:32:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1100/1251] eta 0:05:33 lr 0.000779 time 1.8886 (2.2060) loss 3.5775 (3.7669) grad_norm 1.1846 (1.2367) [2022-01-20 13:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1110/1251] eta 0:05:10 lr 0.000779 time 1.9309 (2.2054) loss 3.4165 (3.7662) grad_norm 1.2776 (1.2364) [2022-01-20 13:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1120/1251] eta 0:04:48 lr 0.000779 time 2.5995 (2.2058) loss 4.0343 (3.7646) grad_norm 1.2399 (1.2365) [2022-01-20 13:33:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1130/1251] eta 0:04:26 lr 0.000779 time 1.9474 (2.2054) loss 3.8422 (3.7648) grad_norm 1.1600 (1.2373) [2022-01-20 13:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1140/1251] eta 0:04:04 lr 0.000779 time 1.8571 (2.2056) loss 3.5964 (3.7625) grad_norm 1.3613 (1.2369) [2022-01-20 13:34:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1150/1251] eta 0:03:42 lr 0.000779 time 2.0531 (2.2050) loss 3.2547 (3.7623) grad_norm 1.1722 (1.2376) [2022-01-20 13:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1160/1251] eta 0:03:20 lr 0.000779 time 1.8989 (2.2064) loss 4.2761 (3.7628) grad_norm 1.1770 (1.2379) [2022-01-20 13:35:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1170/1251] eta 0:02:58 lr 
0.000779 time 1.9261 (2.2054) loss 4.3688 (3.7665) grad_norm 1.2366 (1.2382) [2022-01-20 13:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1180/1251] eta 0:02:36 lr 0.000779 time 1.5784 (2.2043) loss 3.5396 (3.7662) grad_norm 1.1551 (1.2387) [2022-01-20 13:35:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1190/1251] eta 0:02:14 lr 0.000779 time 2.0989 (2.2041) loss 4.1848 (3.7656) grad_norm 1.1940 (1.2383) [2022-01-20 13:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1200/1251] eta 0:01:52 lr 0.000779 time 1.7841 (2.2024) loss 3.4449 (3.7640) grad_norm 1.3523 (1.2383) [2022-01-20 13:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1210/1251] eta 0:01:30 lr 0.000779 time 2.5685 (2.2034) loss 4.2988 (3.7644) grad_norm 1.1424 (1.2377) [2022-01-20 13:37:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1220/1251] eta 0:01:08 lr 0.000779 time 2.1984 (2.2058) loss 4.0798 (3.7661) grad_norm 1.2741 (1.2374) [2022-01-20 13:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1230/1251] eta 0:00:46 lr 0.000779 time 2.4548 (2.2057) loss 4.3482 (3.7656) grad_norm 1.2187 (1.2368) [2022-01-20 13:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1240/1251] eta 0:00:24 lr 0.000779 time 1.6618 (2.2052) loss 3.4185 (3.7657) grad_norm 1.3379 (1.2372) [2022-01-20 13:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [93/300][1250/1251] eta 0:00:02 lr 0.000779 time 1.1659 (2.1989) loss 4.1051 (3.7664) grad_norm 1.2511 (1.2372) [2022-01-20 13:38:02 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 93 training takes 0:45:51 [2022-01-20 13:38:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.585 (18.585) Loss 1.1304 (1.1304) Acc@1 72.949 (72.949) Acc@5 91.797 (91.797) [2022-01-20 13:38:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.904 (3.447) Loss 1.1671 (1.1196) 
Acc@1 74.219 (74.361) Acc@5 91.602 (92.108) [2022-01-20 13:38:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.580 (2.727) Loss 1.0658 (1.1350) Acc@1 75.879 (73.796) Acc@5 93.066 (92.067) [2022-01-20 13:39:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.937 (2.291) Loss 1.1298 (1.1440) Acc@1 73.730 (73.538) Acc@5 92.383 (91.961) [2022-01-20 13:39:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.273 (2.202) Loss 1.1361 (1.1440) Acc@1 73.828 (73.438) Acc@5 91.992 (91.964) [2022-01-20 13:39:40 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.402 Acc@5 91.970 [2022-01-20 13:39:40 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-01-20 13:39:40 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.72% [2022-01-20 13:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][0/1251] eta 7:31:33 lr 0.000779 time 21.6572 (21.6572) loss 3.6734 (3.6734) grad_norm 1.1462 (1.1462) [2022-01-20 13:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][10/1251] eta 1:25:40 lr 0.000779 time 2.7290 (4.1419) loss 4.0455 (3.8429) grad_norm 1.0280 (1.1414) [2022-01-20 13:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][20/1251] eta 1:05:36 lr 0.000779 time 1.9269 (3.1975) loss 4.2341 (3.6838) grad_norm 1.1786 (1.2001) [2022-01-20 13:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][30/1251] eta 0:58:46 lr 0.000779 time 1.9215 (2.8880) loss 4.1867 (3.7084) grad_norm 1.5053 (1.2341) [2022-01-20 13:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][40/1251] eta 0:56:02 lr 0.000779 time 5.4699 (2.7763) loss 3.2467 (3.6781) grad_norm 1.1912 (1.2327) [2022-01-20 13:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][50/1251] eta 0:52:56 lr 0.000779 time 2.8336 (2.6449) loss 3.9833 (3.7088) grad_norm 1.1664 (1.2267) 
[2022-01-20 13:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][60/1251] eta 0:51:08 lr 0.000779 time 1.8181 (2.5762) loss 4.0572 (3.6770) grad_norm 1.1761 (1.2162) [2022-01-20 13:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][70/1251] eta 0:49:14 lr 0.000779 time 1.6939 (2.5013) loss 2.9436 (3.6866) grad_norm 1.2880 (1.2218) [2022-01-20 13:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][80/1251] eta 0:48:13 lr 0.000779 time 3.5464 (2.4712) loss 3.7386 (3.6988) grad_norm 1.4629 (1.2354) [2022-01-20 13:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][90/1251] eta 0:46:58 lr 0.000779 time 2.1150 (2.4275) loss 4.3693 (3.6965) grad_norm 1.3747 (1.2490) [2022-01-20 13:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][100/1251] eta 0:46:03 lr 0.000779 time 1.4879 (2.4013) loss 4.6743 (3.7136) grad_norm 1.3667 (1.2563) [2022-01-20 13:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][110/1251] eta 0:45:09 lr 0.000779 time 1.5669 (2.3750) loss 2.6254 (3.7070) grad_norm 1.1875 (1.2538) [2022-01-20 13:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][120/1251] eta 0:44:50 lr 0.000779 time 3.8367 (2.3793) loss 4.3849 (3.6908) grad_norm 1.5021 (1.2501) [2022-01-20 13:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][130/1251] eta 0:44:08 lr 0.000778 time 2.1342 (2.3630) loss 4.0364 (3.7123) grad_norm 1.2321 (1.2438) [2022-01-20 13:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][140/1251] eta 0:43:44 lr 0.000778 time 1.5317 (2.3627) loss 3.8740 (3.7193) grad_norm 1.2733 (1.2390) [2022-01-20 13:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][150/1251] eta 0:43:13 lr 0.000778 time 1.9338 (2.3556) loss 4.0036 (3.7338) grad_norm 1.1791 (1.2360) [2022-01-20 13:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][160/1251] eta 0:42:37 lr 
0.000778 time 2.8230 (2.3443) loss 3.4860 (3.7407) grad_norm 1.1591 (1.2359) [2022-01-20 13:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][170/1251] eta 0:41:55 lr 0.000778 time 1.9750 (2.3268) loss 3.8010 (3.7432) grad_norm 1.2002 (1.2421) [2022-01-20 13:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][180/1251] eta 0:41:16 lr 0.000778 time 1.8730 (2.3123) loss 4.1933 (3.7330) grad_norm 1.3800 (1.2414) [2022-01-20 13:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][190/1251] eta 0:40:45 lr 0.000778 time 2.0088 (2.3047) loss 3.6465 (3.7396) grad_norm 1.0844 (1.2403) [2022-01-20 13:47:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][200/1251] eta 0:40:18 lr 0.000778 time 2.3556 (2.3010) loss 2.9121 (3.7318) grad_norm 1.3383 (1.2389) [2022-01-20 13:47:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][210/1251] eta 0:39:40 lr 0.000778 time 1.8118 (2.2869) loss 2.5258 (3.7395) grad_norm 1.1205 (1.2391) [2022-01-20 13:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][220/1251] eta 0:39:09 lr 0.000778 time 1.8021 (2.2786) loss 4.4126 (3.7490) grad_norm 1.1597 (1.2385) [2022-01-20 13:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][230/1251] eta 0:38:43 lr 0.000778 time 2.4821 (2.2758) loss 4.0727 (3.7319) grad_norm 1.2222 (1.2399) [2022-01-20 13:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][240/1251] eta 0:38:18 lr 0.000778 time 2.8928 (2.2740) loss 4.4682 (3.7329) grad_norm 1.1484 (1.2384) [2022-01-20 13:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][250/1251] eta 0:37:47 lr 0.000778 time 1.6990 (2.2653) loss 3.9186 (3.7355) grad_norm 1.2781 (1.2398) [2022-01-20 13:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][260/1251] eta 0:37:20 lr 0.000778 time 1.8171 (2.2606) loss 3.4623 (3.7347) grad_norm 1.2280 (1.2393) [2022-01-20 13:49:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][270/1251] eta 0:37:00 lr 0.000778 time 1.8960 (2.2640) loss 2.6761 (3.7196) grad_norm 1.1288 (1.2387) [2022-01-20 13:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][280/1251] eta 0:36:39 lr 0.000778 time 2.2139 (2.2648) loss 4.1218 (3.7323) grad_norm 1.0842 (1.2367) [2022-01-20 13:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][290/1251] eta 0:36:08 lr 0.000778 time 1.6875 (2.2570) loss 2.1869 (3.7266) grad_norm 1.2949 (1.2370) [2022-01-20 13:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][300/1251] eta 0:35:43 lr 0.000778 time 3.4604 (2.2543) loss 3.5185 (3.7310) grad_norm 1.4013 (1.2391) [2022-01-20 13:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][310/1251] eta 0:35:18 lr 0.000778 time 1.8155 (2.2516) loss 3.8927 (3.7359) grad_norm 1.0753 (1.2384) [2022-01-20 13:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][320/1251] eta 0:34:53 lr 0.000778 time 1.8811 (2.2488) loss 4.0091 (3.7389) grad_norm 1.1540 (1.2368) [2022-01-20 13:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][330/1251] eta 0:34:27 lr 0.000778 time 2.0224 (2.2447) loss 4.4472 (3.7496) grad_norm 1.1976 (1.2378) [2022-01-20 13:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][340/1251] eta 0:34:04 lr 0.000778 time 3.2320 (2.2443) loss 4.3094 (3.7496) grad_norm 1.1370 (1.2361) [2022-01-20 13:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][350/1251] eta 0:33:40 lr 0.000778 time 1.9157 (2.2424) loss 4.4247 (3.7485) grad_norm 1.0479 (1.2347) [2022-01-20 13:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][360/1251] eta 0:33:19 lr 0.000778 time 1.6492 (2.2444) loss 3.6113 (3.7513) grad_norm 1.1658 (1.2365) [2022-01-20 13:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][370/1251] eta 0:32:58 lr 0.000778 time 
2.5282 (2.2454) loss 4.2250 (3.7535) grad_norm 1.6001 (1.2390) [2022-01-20 13:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][380/1251] eta 0:32:35 lr 0.000778 time 2.8514 (2.2455) loss 4.1692 (3.7558) grad_norm 1.3331 (1.2403) [2022-01-20 13:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][390/1251] eta 0:32:13 lr 0.000778 time 1.8882 (2.2458) loss 4.4779 (3.7558) grad_norm 1.2790 (1.2411) [2022-01-20 13:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][400/1251] eta 0:31:43 lr 0.000778 time 1.6922 (2.2370) loss 3.8389 (3.7635) grad_norm 1.2237 (1.2417) [2022-01-20 13:54:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][410/1251] eta 0:31:18 lr 0.000778 time 2.1379 (2.2336) loss 3.0202 (3.7560) grad_norm 1.6307 (1.2450) [2022-01-20 13:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][420/1251] eta 0:30:55 lr 0.000777 time 2.7267 (2.2331) loss 3.7675 (3.7515) grad_norm 1.2013 (1.2457) [2022-01-20 13:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][430/1251] eta 0:30:34 lr 0.000777 time 1.8376 (2.2344) loss 2.6110 (3.7516) grad_norm 1.1167 (1.2444) [2022-01-20 13:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][440/1251] eta 0:30:13 lr 0.000777 time 2.5616 (2.2356) loss 4.2494 (3.7576) grad_norm 1.2525 (1.2451) [2022-01-20 13:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][450/1251] eta 0:29:48 lr 0.000777 time 1.9165 (2.2334) loss 4.0803 (3.7586) grad_norm 1.3232 (1.2460) [2022-01-20 13:56:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][460/1251] eta 0:29:25 lr 0.000777 time 2.7719 (2.2326) loss 3.8304 (3.7564) grad_norm 1.4468 (1.2459) [2022-01-20 13:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][470/1251] eta 0:29:02 lr 0.000777 time 1.8279 (2.2311) loss 4.3622 (3.7512) grad_norm 1.2197 (1.2452) [2022-01-20 13:57:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][480/1251] eta 0:28:39 lr 0.000777 time 1.9048 (2.2296) loss 2.7183 (3.7449) grad_norm 1.3044 (1.2448) [2022-01-20 13:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][490/1251] eta 0:28:16 lr 0.000777 time 1.8120 (2.2297) loss 4.0476 (3.7403) grad_norm 1.2433 (1.2441) [2022-01-20 13:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][500/1251] eta 0:27:54 lr 0.000777 time 2.4012 (2.2302) loss 4.2123 (3.7445) grad_norm 1.1610 (1.2440) [2022-01-20 13:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][510/1251] eta 0:27:30 lr 0.000777 time 1.6287 (2.2276) loss 3.4392 (3.7492) grad_norm 1.2018 (1.2424) [2022-01-20 13:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][520/1251] eta 0:27:07 lr 0.000777 time 1.8438 (2.2268) loss 4.5116 (3.7523) grad_norm 1.1109 (1.2434) [2022-01-20 13:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][530/1251] eta 0:26:43 lr 0.000777 time 1.6039 (2.2242) loss 3.9241 (3.7494) grad_norm 1.4118 (1.2440) [2022-01-20 13:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][540/1251] eta 0:26:19 lr 0.000777 time 1.9838 (2.2209) loss 4.0891 (3.7520) grad_norm 1.0933 (1.2442) [2022-01-20 14:00:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][550/1251] eta 0:25:56 lr 0.000777 time 2.5629 (2.2205) loss 3.5419 (3.7474) grad_norm 1.2119 (1.2443) [2022-01-20 14:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][560/1251] eta 0:25:32 lr 0.000777 time 2.2089 (2.2184) loss 2.5201 (3.7466) grad_norm 1.4527 (1.2447) [2022-01-20 14:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][570/1251] eta 0:25:09 lr 0.000777 time 1.8943 (2.2163) loss 4.3347 (3.7430) grad_norm 1.2406 (1.2463) [2022-01-20 14:01:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][580/1251] eta 0:24:45 lr 0.000777 time 
1.8973 (2.2144) loss 3.7853 (3.7449) grad_norm 1.1368 (1.2461) [2022-01-20 14:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][590/1251] eta 0:24:26 lr 0.000777 time 2.8699 (2.2180) loss 3.6888 (3.7393) grad_norm 1.2639 (1.2468) [2022-01-20 14:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][600/1251] eta 0:24:04 lr 0.000777 time 2.4948 (2.2191) loss 2.7437 (3.7407) grad_norm 1.0051 (1.2467) [2022-01-20 14:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][610/1251] eta 0:23:43 lr 0.000777 time 2.5423 (2.2210) loss 3.6943 (3.7432) grad_norm 1.4097 (1.2479) [2022-01-20 14:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][620/1251] eta 0:23:20 lr 0.000777 time 2.0040 (2.2198) loss 4.0835 (3.7420) grad_norm 1.1790 (1.2469) [2022-01-20 14:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][630/1251] eta 0:22:56 lr 0.000777 time 1.9498 (2.2159) loss 2.6459 (3.7393) grad_norm 1.1068 (1.2469) [2022-01-20 14:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][640/1251] eta 0:22:32 lr 0.000777 time 1.8714 (2.2132) loss 2.8566 (3.7365) grad_norm 1.2495 (1.2455) [2022-01-20 14:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][650/1251] eta 0:22:08 lr 0.000777 time 2.1153 (2.2104) loss 3.8927 (3.7382) grad_norm 1.2491 (1.2457) [2022-01-20 14:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][660/1251] eta 0:21:45 lr 0.000777 time 1.9296 (2.2098) loss 3.8951 (3.7391) grad_norm 1.1806 (1.2470) [2022-01-20 14:04:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][670/1251] eta 0:21:23 lr 0.000777 time 2.2538 (2.2099) loss 3.9114 (3.7439) grad_norm 1.3277 (1.2471) [2022-01-20 14:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][680/1251] eta 0:21:02 lr 0.000777 time 2.7876 (2.2118) loss 4.3650 (3.7425) grad_norm 1.0690 (1.2466) [2022-01-20 14:05:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][690/1251] eta 0:20:41 lr 0.000777 time 2.1819 (2.2137) loss 4.2115 (3.7486) grad_norm 1.1036 (1.2464) [2022-01-20 14:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][700/1251] eta 0:20:19 lr 0.000777 time 1.8446 (2.2137) loss 3.7584 (3.7501) grad_norm 1.4735 (1.2466) [2022-01-20 14:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][710/1251] eta 0:19:57 lr 0.000776 time 1.8860 (2.2127) loss 3.4580 (3.7517) grad_norm 1.3120 (1.2470) [2022-01-20 14:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][720/1251] eta 0:19:34 lr 0.000776 time 2.5160 (2.2114) loss 3.6080 (3.7541) grad_norm 1.3155 (1.2465) [2022-01-20 14:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][730/1251] eta 0:19:11 lr 0.000776 time 2.5420 (2.2109) loss 3.9522 (3.7558) grad_norm 1.1252 (1.2468) [2022-01-20 14:06:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][740/1251] eta 0:18:49 lr 0.000776 time 2.5169 (2.2096) loss 3.8384 (3.7534) grad_norm 1.2088 (1.2472) [2022-01-20 14:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][750/1251] eta 0:18:27 lr 0.000776 time 2.6128 (2.2106) loss 4.5022 (3.7554) grad_norm 1.4480 (1.2489) [2022-01-20 14:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][760/1251] eta 0:18:05 lr 0.000776 time 2.4459 (2.2108) loss 2.6313 (3.7567) grad_norm 1.2611 (1.2492) [2022-01-20 14:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][770/1251] eta 0:17:43 lr 0.000776 time 2.1327 (2.2104) loss 3.9080 (3.7579) grad_norm 1.2785 (1.2490) [2022-01-20 14:08:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][780/1251] eta 0:17:20 lr 0.000776 time 2.2157 (2.2102) loss 2.8997 (3.7591) grad_norm 1.2478 (1.2484) [2022-01-20 14:08:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][790/1251] eta 0:16:58 lr 0.000776 time 
2.2451 (2.2103) loss 2.7418 (3.7592) grad_norm 1.2888 (1.2472) [2022-01-20 14:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][800/1251] eta 0:16:36 lr 0.000776 time 2.7852 (2.2087) loss 3.7540 (3.7585) grad_norm 1.0718 (1.2460) [2022-01-20 14:09:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][810/1251] eta 0:16:12 lr 0.000776 time 1.8782 (2.2063) loss 2.5881 (3.7571) grad_norm 1.3942 (1.2460) [2022-01-20 14:09:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][820/1251] eta 0:15:50 lr 0.000776 time 2.6906 (2.2051) loss 4.2099 (3.7587) grad_norm 1.2499 (1.2472) [2022-01-20 14:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][830/1251] eta 0:15:28 lr 0.000776 time 1.9058 (2.2051) loss 3.7718 (3.7568) grad_norm 1.1331 (1.2474) [2022-01-20 14:10:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][840/1251] eta 0:15:06 lr 0.000776 time 3.0260 (2.2061) loss 4.2438 (3.7533) grad_norm 1.2153 (1.2457) [2022-01-20 14:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][850/1251] eta 0:14:45 lr 0.000776 time 2.1666 (2.2088) loss 4.0732 (3.7502) grad_norm 1.2481 (1.2453) [2022-01-20 14:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][860/1251] eta 0:14:24 lr 0.000776 time 1.9538 (2.2103) loss 3.4840 (3.7519) grad_norm 1.3692 (1.2453) [2022-01-20 14:11:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][870/1251] eta 0:14:01 lr 0.000776 time 1.9134 (2.2079) loss 4.0623 (3.7505) grad_norm 1.3185 (1.2445) [2022-01-20 14:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][880/1251] eta 0:13:38 lr 0.000776 time 1.8820 (2.2068) loss 4.1349 (3.7520) grad_norm 1.2243 (1.2443) [2022-01-20 14:12:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][890/1251] eta 0:13:16 lr 0.000776 time 2.2749 (2.2057) loss 3.1175 (3.7486) grad_norm 1.3066 (1.2445) [2022-01-20 14:12:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][900/1251] eta 0:12:54 lr 0.000776 time 1.8642 (2.2057) loss 4.1279 (3.7526) grad_norm 1.2909 (1.2450) [2022-01-20 14:13:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][910/1251] eta 0:12:32 lr 0.000776 time 1.8956 (2.2053) loss 3.0889 (3.7519) grad_norm 1.2053 (1.2446) [2022-01-20 14:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][920/1251] eta 0:12:09 lr 0.000776 time 2.0086 (2.2043) loss 3.8923 (3.7504) grad_norm 1.1105 (1.2439) [2022-01-20 14:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][930/1251] eta 0:11:47 lr 0.000776 time 1.8461 (2.2033) loss 3.8056 (3.7470) grad_norm 1.2362 (1.2428) [2022-01-20 14:14:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][940/1251] eta 0:11:25 lr 0.000776 time 3.1713 (2.2051) loss 3.1181 (3.7472) grad_norm 1.3283 (1.2432) [2022-01-20 14:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][950/1251] eta 0:11:03 lr 0.000776 time 1.7489 (2.2041) loss 2.6101 (3.7452) grad_norm 1.1956 (1.2434) [2022-01-20 14:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][960/1251] eta 0:10:41 lr 0.000776 time 2.2915 (2.2036) loss 4.3589 (3.7463) grad_norm 1.1679 (1.2427) [2022-01-20 14:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][970/1251] eta 0:10:19 lr 0.000776 time 1.2542 (2.2037) loss 3.4590 (3.7447) grad_norm 1.1663 (1.2422) [2022-01-20 14:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][980/1251] eta 0:09:57 lr 0.000776 time 2.5008 (2.2060) loss 4.1130 (3.7455) grad_norm 1.2720 (1.2419) [2022-01-20 14:16:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][990/1251] eta 0:09:35 lr 0.000776 time 1.8318 (2.2061) loss 3.9003 (3.7459) grad_norm 1.1017 (1.2415) [2022-01-20 14:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1000/1251] eta 0:09:13 lr 0.000775 time 
2.1774 (2.2063) loss 4.2654 (3.7481) grad_norm 1.2077 (1.2419) [2022-01-20 14:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1010/1251] eta 0:08:51 lr 0.000775 time 1.8382 (2.2062) loss 4.0771 (3.7477) grad_norm 1.2313 (1.2426) [2022-01-20 14:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1020/1251] eta 0:08:29 lr 0.000775 time 2.4773 (2.2075) loss 4.5203 (3.7464) grad_norm 1.2877 (1.2426) [2022-01-20 14:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1030/1251] eta 0:08:07 lr 0.000775 time 1.8907 (2.2060) loss 4.0184 (3.7446) grad_norm 1.3152 (1.2427) [2022-01-20 14:17:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1040/1251] eta 0:07:45 lr 0.000775 time 1.8664 (2.2045) loss 3.7399 (3.7444) grad_norm 1.4323 (1.2422) [2022-01-20 14:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1050/1251] eta 0:07:22 lr 0.000775 time 2.2459 (2.2036) loss 3.9266 (3.7467) grad_norm 1.4140 (1.2421) [2022-01-20 14:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1060/1251] eta 0:07:01 lr 0.000775 time 2.6979 (2.2048) loss 4.5035 (3.7481) grad_norm 1.1607 (1.2423) [2022-01-20 14:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1070/1251] eta 0:06:39 lr 0.000775 time 2.0936 (2.2048) loss 2.9770 (3.7478) grad_norm 1.1132 (1.2419) [2022-01-20 14:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1080/1251] eta 0:06:16 lr 0.000775 time 1.6853 (2.2037) loss 2.8548 (3.7481) grad_norm 1.1511 (1.2416) [2022-01-20 14:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1090/1251] eta 0:05:54 lr 0.000775 time 2.6512 (2.2030) loss 3.5113 (3.7457) grad_norm 1.2440 (1.2412) [2022-01-20 14:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1100/1251] eta 0:05:32 lr 0.000775 time 3.0983 (2.2043) loss 3.6562 (3.7470) grad_norm 1.2220 (1.2411) [2022-01-20 14:20:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1110/1251] eta 0:05:10 lr 0.000775 time 1.8980 (2.2043) loss 4.0281 (3.7473) grad_norm 1.1011 (1.2411) [2022-01-20 14:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1120/1251] eta 0:04:48 lr 0.000775 time 1.9674 (2.2026) loss 2.5939 (3.7455) grad_norm 1.0772 (1.2406) [2022-01-20 14:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1130/1251] eta 0:04:26 lr 0.000775 time 2.5881 (2.2015) loss 3.9354 (3.7460) grad_norm 1.3361 (1.2415) [2022-01-20 14:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1140/1251] eta 0:04:04 lr 0.000775 time 2.2259 (2.1997) loss 4.0000 (3.7452) grad_norm 1.1380 (1.2413) [2022-01-20 14:21:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1150/1251] eta 0:03:42 lr 0.000775 time 1.7069 (2.1985) loss 3.5534 (3.7445) grad_norm 1.2085 (1.2410) [2022-01-20 14:22:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1160/1251] eta 0:03:20 lr 0.000775 time 1.6888 (2.1985) loss 3.1203 (3.7443) grad_norm 1.0613 (1.2411) [2022-01-20 14:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1170/1251] eta 0:02:58 lr 0.000775 time 2.5965 (2.1987) loss 2.6694 (3.7415) grad_norm 1.1838 (1.2407) [2022-01-20 14:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1180/1251] eta 0:02:36 lr 0.000775 time 1.8655 (2.1980) loss 4.3033 (3.7426) grad_norm 1.0838 (1.2407) [2022-01-20 14:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1190/1251] eta 0:02:14 lr 0.000775 time 1.9232 (2.1990) loss 3.4776 (3.7401) grad_norm 1.1431 (1.2402) [2022-01-20 14:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1200/1251] eta 0:01:52 lr 0.000775 time 1.9895 (2.2001) loss 3.1087 (3.7404) grad_norm 1.0376 (1.2393) [2022-01-20 14:24:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1210/1251] eta 0:01:30 lr 
0.000775 time 2.1487 (2.2011) loss 3.2456 (3.7403) grad_norm 1.0653 (1.2387) [2022-01-20 14:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1220/1251] eta 0:01:08 lr 0.000775 time 1.7144 (2.1998) loss 4.2398 (3.7392) grad_norm 1.2453 (1.2385) [2022-01-20 14:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1230/1251] eta 0:00:46 lr 0.000775 time 2.1197 (2.1990) loss 4.2528 (3.7379) grad_norm 1.3715 (1.2389) [2022-01-20 14:25:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1240/1251] eta 0:00:24 lr 0.000775 time 1.5562 (2.1976) loss 4.0260 (3.7367) grad_norm 1.2472 (1.2390) [2022-01-20 14:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [94/300][1250/1251] eta 0:00:02 lr 0.000775 time 1.1787 (2.1916) loss 4.1264 (3.7371) grad_norm 1.5930 (1.2394) [2022-01-20 14:25:22 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 94 training takes 0:45:42 [2022-01-20 14:25:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.596 (18.596) Loss 1.1059 (1.1059) Acc@1 74.805 (74.805) Acc@5 92.480 (92.480) [2022-01-20 14:25:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.607 (3.341) Loss 1.0841 (1.1615) Acc@1 76.270 (73.686) Acc@5 92.578 (91.868) [2022-01-20 14:26:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.288 (2.564) Loss 1.1591 (1.1565) Acc@1 73.730 (73.624) Acc@5 91.504 (92.006) [2022-01-20 14:26:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.946 (2.319) Loss 1.1895 (1.1612) Acc@1 73.438 (73.415) Acc@5 91.406 (91.932) [2022-01-20 14:26:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.377 (2.150) Loss 1.1428 (1.1626) Acc@1 74.219 (73.383) Acc@5 91.406 (91.928) [2022-01-20 14:26:57 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.494 Acc@5 91.908 [2022-01-20 14:26:57 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test 
images: 73.5% [2022-01-20 14:26:57 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.72% [2022-01-20 14:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][0/1251] eta 7:33:36 lr 0.000775 time 21.7555 (21.7555) loss 3.9539 (3.9539) grad_norm 1.2353 (1.2353) [2022-01-20 14:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][10/1251] eta 1:23:46 lr 0.000775 time 2.3057 (4.0506) loss 3.7433 (3.4003) grad_norm 1.6619 (1.3464) [2022-01-20 14:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][20/1251] eta 1:04:55 lr 0.000775 time 1.4750 (3.1642) loss 3.8320 (3.5192) grad_norm 1.0993 (1.2644) [2022-01-20 14:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][30/1251] eta 0:56:57 lr 0.000774 time 1.9211 (2.7986) loss 3.8235 (3.5036) grad_norm 1.0196 (1.2256) [2022-01-20 14:28:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][40/1251] eta 0:54:49 lr 0.000774 time 4.2734 (2.7161) loss 4.1460 (3.5753) grad_norm 1.3495 (1.2191) [2022-01-20 14:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][50/1251] eta 0:52:18 lr 0.000774 time 2.4241 (2.6134) loss 3.6416 (3.6315) grad_norm 1.3167 (1.2256) [2022-01-20 14:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][60/1251] eta 0:49:57 lr 0.000774 time 1.2561 (2.5170) loss 4.2265 (3.6723) grad_norm 1.1793 (1.2396) [2022-01-20 14:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][70/1251] eta 0:48:25 lr 0.000774 time 1.5422 (2.4600) loss 3.2668 (3.6969) grad_norm 1.3102 (1.2526) [2022-01-20 14:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][80/1251] eta 0:47:55 lr 0.000774 time 3.5408 (2.4552) loss 4.7486 (3.7083) grad_norm 1.1239 (1.2508) [2022-01-20 14:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][90/1251] eta 0:46:57 lr 0.000774 time 2.1716 (2.4267) loss 4.6378 (3.7526) grad_norm 1.2952 (1.2513) [2022-01-20 
14:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][100/1251] eta 0:46:18 lr 0.000774 time 1.5057 (2.4142) loss 2.9989 (3.7488) grad_norm 1.0493 (1.2468) [2022-01-20 14:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][110/1251] eta 0:45:29 lr 0.000774 time 1.9012 (2.3926) loss 4.3482 (3.7553) grad_norm 1.0623 (1.2474) [2022-01-20 14:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][120/1251] eta 0:44:42 lr 0.000774 time 2.7491 (2.3715) loss 4.0984 (3.7603) grad_norm 1.6634 (1.2501) [2022-01-20 14:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][130/1251] eta 0:43:50 lr 0.000774 time 1.8497 (2.3468) loss 3.3744 (3.7410) grad_norm 1.3507 (1.2477) [2022-01-20 14:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][140/1251] eta 0:43:20 lr 0.000774 time 2.2923 (2.3409) loss 4.0527 (3.7392) grad_norm 1.3672 (1.2468) [2022-01-20 14:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][150/1251] eta 0:42:39 lr 0.000774 time 1.9026 (2.3245) loss 4.1631 (3.7482) grad_norm 1.0331 (1.2419) [2022-01-20 14:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][160/1251] eta 0:42:12 lr 0.000774 time 3.1750 (2.3212) loss 3.8606 (3.7340) grad_norm 1.2521 (1.2386) [2022-01-20 14:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][170/1251] eta 0:41:31 lr 0.000774 time 2.1562 (2.3053) loss 4.0727 (3.7294) grad_norm 1.3981 (1.2362) [2022-01-20 14:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][180/1251] eta 0:40:53 lr 0.000774 time 1.6165 (2.2909) loss 3.2971 (3.7300) grad_norm 1.1337 (1.2327) [2022-01-20 14:34:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][190/1251] eta 0:40:24 lr 0.000774 time 1.6843 (2.2854) loss 4.3581 (3.7256) grad_norm 1.1012 (1.2332) [2022-01-20 14:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][200/1251] eta 0:40:16 lr 0.000774 
time 3.1545 (2.2993) loss 3.1766 (3.7249) grad_norm 1.2574 (1.2371) [2022-01-20 14:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][210/1251] eta 0:39:54 lr 0.000774 time 1.7665 (2.3006) loss 3.8063 (3.7262) grad_norm 1.1253 (1.2414) [2022-01-20 14:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][220/1251] eta 0:39:30 lr 0.000774 time 1.8655 (2.2992) loss 4.0528 (3.7350) grad_norm 1.3476 (1.2432) [2022-01-20 14:35:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][230/1251] eta 0:38:52 lr 0.000774 time 1.8487 (2.2849) loss 2.6379 (3.7309) grad_norm 1.2563 (1.2459) [2022-01-20 14:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][240/1251] eta 0:38:22 lr 0.000774 time 2.1177 (2.2779) loss 3.8977 (3.7365) grad_norm 1.2486 (1.2468) [2022-01-20 14:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][250/1251] eta 0:37:56 lr 0.000774 time 2.1533 (2.2741) loss 4.3540 (3.7411) grad_norm 1.1658 (1.2461) [2022-01-20 14:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][260/1251] eta 0:37:33 lr 0.000774 time 2.1378 (2.2736) loss 3.6402 (3.7343) grad_norm 1.0701 (1.2451) [2022-01-20 14:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][270/1251] eta 0:37:07 lr 0.000774 time 2.0363 (2.2708) loss 3.8997 (3.7409) grad_norm 1.3354 (1.2476) [2022-01-20 14:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][280/1251] eta 0:36:37 lr 0.000774 time 2.1991 (2.2628) loss 4.1531 (3.7432) grad_norm 1.1823 (1.2480) [2022-01-20 14:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][290/1251] eta 0:36:08 lr 0.000774 time 1.8944 (2.2563) loss 3.1383 (3.7448) grad_norm 1.3913 (1.2487) [2022-01-20 14:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][300/1251] eta 0:35:46 lr 0.000774 time 1.5283 (2.2573) loss 4.4863 (3.7578) grad_norm 1.1178 (1.2451) [2022-01-20 14:38:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][310/1251] eta 0:35:24 lr 0.000774 time 1.8093 (2.2573) loss 3.6626 (3.7568) grad_norm 1.1183 (1.2430) [2022-01-20 14:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][320/1251] eta 0:34:57 lr 0.000773 time 2.0777 (2.2530) loss 2.8954 (3.7550) grad_norm 1.2645 (1.2433) [2022-01-20 14:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][330/1251] eta 0:34:30 lr 0.000773 time 2.0637 (2.2481) loss 4.2479 (3.7566) grad_norm 1.3061 (1.2437) [2022-01-20 14:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][340/1251] eta 0:34:07 lr 0.000773 time 2.1354 (2.2472) loss 4.5651 (3.7675) grad_norm 1.1845 (1.2420) [2022-01-20 14:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][350/1251] eta 0:33:44 lr 0.000773 time 2.5133 (2.2466) loss 3.2157 (3.7634) grad_norm 1.2610 (1.2419) [2022-01-20 14:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][360/1251] eta 0:33:20 lr 0.000773 time 1.5654 (2.2451) loss 2.8373 (3.7622) grad_norm 1.1296 (1.2413) [2022-01-20 14:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][370/1251] eta 0:32:58 lr 0.000773 time 2.1370 (2.2455) loss 3.6931 (3.7653) grad_norm 1.2801 (1.2410) [2022-01-20 14:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][380/1251] eta 0:32:36 lr 0.000773 time 1.8746 (2.2464) loss 2.8879 (3.7623) grad_norm 1.1243 (1.2414) [2022-01-20 14:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][390/1251] eta 0:32:12 lr 0.000773 time 2.1050 (2.2441) loss 2.7867 (3.7644) grad_norm 1.2852 (1.2421) [2022-01-20 14:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][400/1251] eta 0:31:45 lr 0.000773 time 1.8062 (2.2396) loss 4.2400 (3.7612) grad_norm 1.0604 (1.2427) [2022-01-20 14:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][410/1251] eta 0:31:20 lr 0.000773 time 
1.8458 (2.2360) loss 3.6690 (3.7616) grad_norm 1.2760 (1.2408) [2022-01-20 14:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][420/1251] eta 0:30:58 lr 0.000773 time 2.1354 (2.2369) loss 3.3599 (3.7519) grad_norm 1.1405 (1.2394) [2022-01-20 14:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][430/1251] eta 0:30:33 lr 0.000773 time 1.7729 (2.2338) loss 4.2475 (3.7513) grad_norm 1.1568 (1.2385) [2022-01-20 14:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][440/1251] eta 0:30:11 lr 0.000773 time 2.8687 (2.2341) loss 4.0369 (3.7534) grad_norm 1.2215 (1.2378) [2022-01-20 14:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][450/1251] eta 0:29:48 lr 0.000773 time 1.5437 (2.2326) loss 3.8429 (3.7464) grad_norm 1.0897 (1.2372) [2022-01-20 14:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][460/1251] eta 0:29:26 lr 0.000773 time 2.1871 (2.2330) loss 4.5406 (3.7498) grad_norm 1.3158 (1.2390) [2022-01-20 14:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][470/1251] eta 0:29:03 lr 0.000773 time 2.0258 (2.2328) loss 2.8214 (3.7481) grad_norm 1.0964 (1.2382) [2022-01-20 14:44:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][480/1251] eta 0:28:41 lr 0.000773 time 2.8584 (2.2328) loss 3.7465 (3.7500) grad_norm 1.1144 (1.2381) [2022-01-20 14:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][490/1251] eta 0:28:17 lr 0.000773 time 1.7154 (2.2308) loss 3.9839 (3.7483) grad_norm 1.2229 (1.2381) [2022-01-20 14:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][500/1251] eta 0:27:55 lr 0.000773 time 2.1189 (2.2311) loss 2.4982 (3.7438) grad_norm 1.0568 (1.2375) [2022-01-20 14:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][510/1251] eta 0:27:31 lr 0.000773 time 1.8674 (2.2286) loss 3.7486 (3.7447) grad_norm 1.1812 (1.2375) [2022-01-20 14:46:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][520/1251] eta 0:27:07 lr 0.000773 time 2.5418 (2.2268) loss 4.1488 (3.7491) grad_norm 1.5149 (1.2379) [2022-01-20 14:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][530/1251] eta 0:26:43 lr 0.000773 time 1.8558 (2.2242) loss 3.0346 (3.7459) grad_norm 1.0013 (1.2370) [2022-01-20 14:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][540/1251] eta 0:26:20 lr 0.000773 time 2.2672 (2.2233) loss 4.7155 (3.7505) grad_norm 1.2181 (1.2370) [2022-01-20 14:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][550/1251] eta 0:26:00 lr 0.000773 time 3.4209 (2.2264) loss 3.6374 (3.7541) grad_norm 1.0970 (1.2386) [2022-01-20 14:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][560/1251] eta 0:25:39 lr 0.000773 time 2.5421 (2.2274) loss 4.4338 (3.7544) grad_norm 1.2111 (1.2377) [2022-01-20 14:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][570/1251] eta 0:25:17 lr 0.000773 time 1.5292 (2.2278) loss 4.0795 (3.7569) grad_norm 1.2852 (1.2396) [2022-01-20 14:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][580/1251] eta 0:24:51 lr 0.000773 time 1.7541 (2.2235) loss 3.2766 (3.7608) grad_norm 1.1604 (1.2398) [2022-01-20 14:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][590/1251] eta 0:24:29 lr 0.000773 time 2.1584 (2.2224) loss 3.7880 (3.7594) grad_norm 1.2640 (1.2398) [2022-01-20 14:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][600/1251] eta 0:24:04 lr 0.000773 time 1.7543 (2.2191) loss 2.5912 (3.7520) grad_norm 1.1953 (1.2401) [2022-01-20 14:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][610/1251] eta 0:23:42 lr 0.000772 time 2.1815 (2.2191) loss 2.8010 (3.7514) grad_norm 1.2152 (1.2401) [2022-01-20 14:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][620/1251] eta 0:23:18 lr 0.000772 time 
1.9182 (2.2171) loss 4.1888 (3.7508) grad_norm 1.1089 (1.2392) [2022-01-20 14:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][630/1251] eta 0:22:56 lr 0.000772 time 2.2624 (2.2158) loss 3.7220 (3.7504) grad_norm 1.1534 (1.2387) [2022-01-20 14:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][640/1251] eta 0:22:33 lr 0.000772 time 1.5016 (2.2153) loss 3.9620 (3.7524) grad_norm 1.1755 (1.2376) [2022-01-20 14:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][650/1251] eta 0:22:10 lr 0.000772 time 1.9437 (2.2143) loss 4.4942 (3.7485) grad_norm 1.3281 (1.2378) [2022-01-20 14:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][660/1251] eta 0:21:48 lr 0.000772 time 2.2792 (2.2146) loss 3.5426 (3.7542) grad_norm 1.1938 (1.2397) [2022-01-20 14:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][670/1251] eta 0:21:28 lr 0.000772 time 2.5514 (2.2171) loss 4.0822 (3.7528) grad_norm 1.2637 (1.2396) [2022-01-20 14:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][680/1251] eta 0:21:06 lr 0.000772 time 1.6624 (2.2174) loss 2.7727 (3.7474) grad_norm 1.3618 (1.2398) [2022-01-20 14:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][690/1251] eta 0:20:42 lr 0.000772 time 1.8396 (2.2143) loss 4.0851 (3.7427) grad_norm 1.6156 (1.2402) [2022-01-20 14:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][700/1251] eta 0:20:19 lr 0.000772 time 1.8560 (2.2130) loss 3.3506 (3.7455) grad_norm 1.3267 (1.2405) [2022-01-20 14:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][710/1251] eta 0:19:58 lr 0.000772 time 2.4868 (2.2155) loss 4.1558 (3.7449) grad_norm 1.1961 (1.2417) [2022-01-20 14:53:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][720/1251] eta 0:19:37 lr 0.000772 time 1.6551 (2.2170) loss 4.4333 (3.7419) grad_norm 1.2259 (1.2412) [2022-01-20 14:53:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][730/1251] eta 0:19:14 lr 0.000772 time 1.6482 (2.2158) loss 3.9996 (3.7401) grad_norm 1.2239 (1.2411) [2022-01-20 14:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][740/1251] eta 0:18:51 lr 0.000772 time 1.5912 (2.2140) loss 3.8099 (3.7388) grad_norm 1.1269 (1.2413) [2022-01-20 14:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][750/1251] eta 0:18:28 lr 0.000772 time 3.0258 (2.2134) loss 3.7400 (3.7371) grad_norm 1.1825 (1.2414) [2022-01-20 14:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][760/1251] eta 0:18:06 lr 0.000772 time 2.1470 (2.2122) loss 4.0946 (3.7346) grad_norm 1.3103 (1.2416) [2022-01-20 14:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][770/1251] eta 0:17:43 lr 0.000772 time 2.2825 (2.2115) loss 3.9560 (3.7362) grad_norm 1.3059 (1.2417) [2022-01-20 14:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][780/1251] eta 0:17:21 lr 0.000772 time 1.9549 (2.2106) loss 4.2852 (3.7377) grad_norm 1.4402 (1.2419) [2022-01-20 14:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][790/1251] eta 0:16:59 lr 0.000772 time 2.4524 (2.2104) loss 3.8072 (3.7369) grad_norm 1.1499 (1.2419) [2022-01-20 14:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][800/1251] eta 0:16:36 lr 0.000772 time 2.2577 (2.2106) loss 3.6380 (3.7387) grad_norm 1.0990 (1.2418) [2022-01-20 14:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][810/1251] eta 0:16:14 lr 0.000772 time 1.9308 (2.2109) loss 3.5493 (3.7378) grad_norm 1.2203 (1.2410) [2022-01-20 14:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][820/1251] eta 0:15:52 lr 0.000772 time 2.7017 (2.2101) loss 3.4651 (3.7357) grad_norm 1.0802 (1.2413) [2022-01-20 14:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][830/1251] eta 0:15:29 lr 0.000772 time 
1.7586 (2.2086) loss 4.5869 (3.7379) grad_norm 1.2011 (1.2412) [2022-01-20 14:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][840/1251] eta 0:15:07 lr 0.000772 time 2.4988 (2.2086) loss 4.3059 (3.7401) grad_norm 1.1188 (1.2406) [2022-01-20 14:58:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][850/1251] eta 0:14:44 lr 0.000772 time 1.8775 (2.2061) loss 3.9590 (3.7398) grad_norm 1.3899 (1.2403) [2022-01-20 14:58:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][860/1251] eta 0:14:21 lr 0.000772 time 2.6123 (2.2044) loss 3.5742 (3.7422) grad_norm 1.1508 (1.2400) [2022-01-20 14:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][870/1251] eta 0:13:59 lr 0.000772 time 2.2097 (2.2040) loss 2.5509 (3.7389) grad_norm 1.3087 (1.2397) [2022-01-20 14:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][880/1251] eta 0:13:38 lr 0.000772 time 2.4935 (2.2055) loss 4.7869 (3.7388) grad_norm 1.2520 (1.2392) [2022-01-20 14:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][890/1251] eta 0:13:16 lr 0.000771 time 1.7993 (2.2052) loss 3.5434 (3.7351) grad_norm 1.2354 (1.2403) [2022-01-20 15:00:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][900/1251] eta 0:12:53 lr 0.000771 time 2.1967 (2.2046) loss 4.0804 (3.7381) grad_norm 1.1190 (1.2403) [2022-01-20 15:00:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][910/1251] eta 0:12:32 lr 0.000771 time 2.0019 (2.2061) loss 3.6401 (3.7396) grad_norm 1.0540 (1.2397) [2022-01-20 15:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][920/1251] eta 0:12:11 lr 0.000771 time 2.5469 (2.2089) loss 3.3710 (3.7402) grad_norm 1.1880 (1.2398) [2022-01-20 15:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][930/1251] eta 0:11:48 lr 0.000771 time 2.2231 (2.2081) loss 4.0281 (3.7418) grad_norm 1.5042 (1.2406) [2022-01-20 15:01:33 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][940/1251] eta 0:11:26 lr 0.000771 time 2.4966 (2.2065) loss 4.1651 (3.7444) grad_norm 1.4322 (1.2407) [2022-01-20 15:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][950/1251] eta 0:11:03 lr 0.000771 time 2.0274 (2.2055) loss 3.5853 (3.7438) grad_norm 1.2674 (1.2412) [2022-01-20 15:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][960/1251] eta 0:10:41 lr 0.000771 time 2.7094 (2.2059) loss 4.0187 (3.7436) grad_norm 1.0589 (1.2409) [2022-01-20 15:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][970/1251] eta 0:10:19 lr 0.000771 time 1.6128 (2.2051) loss 3.7465 (3.7426) grad_norm 1.4358 (1.2408) [2022-01-20 15:03:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][980/1251] eta 0:09:57 lr 0.000771 time 1.5711 (2.2055) loss 2.8675 (3.7395) grad_norm 1.2039 (1.2408) [2022-01-20 15:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][990/1251] eta 0:09:35 lr 0.000771 time 1.8555 (2.2047) loss 3.9802 (3.7388) grad_norm 1.3429 (1.2408) [2022-01-20 15:03:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1000/1251] eta 0:09:13 lr 0.000771 time 2.7816 (2.2054) loss 3.1757 (3.7382) grad_norm 1.1581 (1.2406) [2022-01-20 15:04:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1010/1251] eta 0:08:51 lr 0.000771 time 2.1987 (2.2063) loss 4.3382 (3.7400) grad_norm 1.0705 (1.2399) [2022-01-20 15:04:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1020/1251] eta 0:08:29 lr 0.000771 time 2.0966 (2.2071) loss 3.7513 (3.7380) grad_norm 1.1936 (1.2404) [2022-01-20 15:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1030/1251] eta 0:08:07 lr 0.000771 time 1.9007 (2.2059) loss 3.9476 (3.7369) grad_norm 1.2943 (1.2398) [2022-01-20 15:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1040/1251] eta 0:07:45 lr 0.000771 time 
2.2174 (2.2046) loss 4.4447 (3.7376) grad_norm 1.2311 (1.2394) [2022-01-20 15:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1050/1251] eta 0:07:22 lr 0.000771 time 1.6066 (2.2028) loss 3.3448 (3.7368) grad_norm 1.4304 (1.2400) [2022-01-20 15:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1060/1251] eta 0:07:00 lr 0.000771 time 1.9515 (2.2017) loss 4.5855 (3.7386) grad_norm 1.5506 (1.2406) [2022-01-20 15:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1070/1251] eta 0:06:38 lr 0.000771 time 1.8648 (2.2006) loss 4.3949 (3.7384) grad_norm 1.2777 (1.2405) [2022-01-20 15:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1080/1251] eta 0:06:16 lr 0.000771 time 2.1488 (2.2005) loss 3.4417 (3.7376) grad_norm 1.5488 (1.2406) [2022-01-20 15:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1090/1251] eta 0:05:54 lr 0.000771 time 2.4720 (2.2016) loss 4.6741 (3.7392) grad_norm 1.3088 (1.2408) [2022-01-20 15:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1100/1251] eta 0:05:32 lr 0.000771 time 2.4922 (2.2027) loss 4.5306 (3.7411) grad_norm 1.3557 (1.2405) [2022-01-20 15:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1110/1251] eta 0:05:10 lr 0.000771 time 1.8024 (2.2017) loss 2.7481 (3.7392) grad_norm 1.2278 (1.2402) [2022-01-20 15:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1120/1251] eta 0:04:48 lr 0.000771 time 1.8998 (2.2010) loss 4.2384 (3.7375) grad_norm 1.1940 (1.2399) [2022-01-20 15:08:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1130/1251] eta 0:04:26 lr 0.000771 time 1.9377 (2.2009) loss 2.9689 (3.7387) grad_norm 1.2058 (1.2399) [2022-01-20 15:08:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1140/1251] eta 0:04:04 lr 0.000771 time 2.2134 (2.2008) loss 4.6041 (3.7412) grad_norm 1.1704 (1.2399) [2022-01-20 15:09:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1150/1251] eta 0:03:42 lr 0.000771 time 2.2186 (2.2010) loss 4.3207 (3.7393) grad_norm 1.1497 (1.2401) [2022-01-20 15:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1160/1251] eta 0:03:20 lr 0.000771 time 1.9909 (2.2006) loss 4.2083 (3.7374) grad_norm 0.9985 (1.2397) [2022-01-20 15:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1170/1251] eta 0:02:58 lr 0.000771 time 2.0769 (2.2003) loss 3.9533 (3.7382) grad_norm 1.1656 (1.2389) [2022-01-20 15:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1180/1251] eta 0:02:36 lr 0.000770 time 2.1791 (2.2004) loss 2.9201 (3.7373) grad_norm 1.3471 (1.2396) [2022-01-20 15:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1190/1251] eta 0:02:14 lr 0.000770 time 2.7928 (2.1996) loss 3.7690 (3.7361) grad_norm 1.2254 (1.2400) [2022-01-20 15:10:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1200/1251] eta 0:01:52 lr 0.000770 time 2.3211 (2.1986) loss 3.8471 (3.7383) grad_norm 1.2296 (1.2400) [2022-01-20 15:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1210/1251] eta 0:01:30 lr 0.000770 time 1.6161 (2.1971) loss 4.3072 (3.7388) grad_norm 1.1958 (1.2399) [2022-01-20 15:11:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1220/1251] eta 0:01:08 lr 0.000770 time 2.7884 (2.1979) loss 3.6830 (3.7397) grad_norm 1.1210 (1.2393) [2022-01-20 15:12:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1230/1251] eta 0:00:46 lr 0.000770 time 1.8545 (2.1971) loss 3.5765 (3.7409) grad_norm 1.1686 (1.2396) [2022-01-20 15:12:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1240/1251] eta 0:00:24 lr 0.000770 time 2.0057 (2.1974) loss 4.3545 (3.7404) grad_norm 1.1828 (1.2401) [2022-01-20 15:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [95/300][1250/1251] eta 0:00:02 lr 
0.000770 time 1.2865 (2.1930) loss 4.2889 (3.7393) grad_norm 1.2015 (1.2402) [2022-01-20 15:12:41 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 95 training takes 0:45:43 [2022-01-20 15:12:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.321 (18.321) Loss 1.1285 (1.1285) Acc@1 73.145 (73.145) Acc@5 91.992 (91.992) [2022-01-20 15:13:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.614 (3.213) Loss 1.1868 (1.1389) Acc@1 72.168 (72.718) Acc@5 91.113 (91.921) [2022-01-20 15:13:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.344 (2.590) Loss 1.1480 (1.1419) Acc@1 73.438 (72.949) Acc@5 91.211 (91.913) [2022-01-20 15:13:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.300 (2.251) Loss 1.1791 (1.1396) Acc@1 72.852 (73.082) Acc@5 91.797 (91.983) [2022-01-20 15:14:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.574 (2.180) Loss 1.2010 (1.1411) Acc@1 72.266 (73.068) Acc@5 90.918 (91.937) [2022-01-20 15:14:17 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.290 Acc@5 92.034 [2022-01-20 15:14:17 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.3% [2022-01-20 15:14:17 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.72% [2022-01-20 15:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][0/1251] eta 7:31:11 lr 0.000770 time 21.6403 (21.6403) loss 4.6918 (4.6918) grad_norm 1.2080 (1.2080) [2022-01-20 15:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][10/1251] eta 1:21:31 lr 0.000770 time 3.2267 (3.9412) loss 3.9488 (3.7988) grad_norm 1.2248 (1.2708) [2022-01-20 15:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][20/1251] eta 1:02:57 lr 0.000770 time 1.5305 (3.0686) loss 3.3903 (3.6848) grad_norm 1.2237 (1.2463) [2022-01-20 15:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[96/300][30/1251] eta 0:59:24 lr 0.000770 time 1.6571 (2.9194) loss 3.9973 (3.7728) grad_norm 1.3799 (1.2511) [2022-01-20 15:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][40/1251] eta 0:55:11 lr 0.000770 time 2.4059 (2.7342) loss 3.2801 (3.6886) grad_norm 1.2089 (1.2471) [2022-01-20 15:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][50/1251] eta 0:54:22 lr 0.000770 time 2.7628 (2.7162) loss 3.4498 (3.6155) grad_norm 1.2591 (1.2412) [2022-01-20 15:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][60/1251] eta 0:51:40 lr 0.000770 time 1.6801 (2.6036) loss 3.0821 (3.6438) grad_norm 1.3470 (1.2337) [2022-01-20 15:17:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][70/1251] eta 0:49:50 lr 0.000770 time 1.9228 (2.5319) loss 2.4880 (3.6432) grad_norm 1.2445 (1.2361) [2022-01-20 15:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][80/1251] eta 0:49:01 lr 0.000770 time 2.0097 (2.5116) loss 3.9895 (3.6443) grad_norm 1.5492 (1.2391) [2022-01-20 15:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][90/1251] eta 0:48:00 lr 0.000770 time 1.8860 (2.4811) loss 4.2631 (3.6669) grad_norm 1.2423 (1.2407) [2022-01-20 15:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][100/1251] eta 0:46:44 lr 0.000770 time 1.9255 (2.4368) loss 4.4332 (3.6525) grad_norm 1.4626 (1.2455) [2022-01-20 15:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][110/1251] eta 0:45:43 lr 0.000770 time 2.2569 (2.4044) loss 3.2653 (3.6612) grad_norm 1.4552 (1.2466) [2022-01-20 15:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][120/1251] eta 0:44:57 lr 0.000770 time 2.1383 (2.3850) loss 4.3918 (3.6933) grad_norm 1.4185 (1.2488) [2022-01-20 15:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][130/1251] eta 0:44:27 lr 0.000770 time 1.8368 (2.3795) loss 3.8660 (3.7049) grad_norm 1.2470 (1.2463) 
[2022-01-20 15:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][140/1251] eta 0:43:40 lr 0.000770 time 1.7067 (2.3590) loss 4.1225 (3.7038) grad_norm 1.1480 (1.2372) [2022-01-20 15:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][150/1251] eta 0:42:59 lr 0.000770 time 1.9770 (2.3432) loss 4.3696 (3.7118) grad_norm 1.2189 (1.2380) [2022-01-20 15:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][160/1251] eta 0:42:07 lr 0.000770 time 1.5837 (2.3168) loss 3.9027 (3.7066) grad_norm 1.1452 (1.2397) [2022-01-20 15:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][170/1251] eta 0:41:28 lr 0.000770 time 2.1983 (2.3019) loss 4.0326 (3.7247) grad_norm 1.1354 (1.2428) [2022-01-20 15:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][180/1251] eta 0:40:51 lr 0.000770 time 2.5566 (2.2889) loss 3.9573 (3.7355) grad_norm 1.2126 (1.2392) [2022-01-20 15:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][190/1251] eta 0:40:24 lr 0.000770 time 2.5084 (2.2855) loss 3.9446 (3.7241) grad_norm 1.1735 (1.2387) [2022-01-20 15:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][200/1251] eta 0:39:59 lr 0.000770 time 2.1034 (2.2829) loss 2.6990 (3.7219) grad_norm 1.2004 (1.2378) [2022-01-20 15:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][210/1251] eta 0:39:41 lr 0.000769 time 2.0148 (2.2874) loss 4.1584 (3.7429) grad_norm 1.3515 (1.2413) [2022-01-20 15:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][220/1251] eta 0:39:19 lr 0.000769 time 2.1011 (2.2885) loss 2.7453 (3.7237) grad_norm 1.5377 (1.2424) [2022-01-20 15:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][230/1251] eta 0:38:51 lr 0.000769 time 2.2426 (2.2838) loss 3.1673 (3.7205) grad_norm 1.2174 (1.2418) [2022-01-20 15:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][240/1251] eta 0:38:28 
lr 0.000769 time 2.2432 (2.2830) loss 3.9517 (3.7179) grad_norm 1.0527 (1.2406) [2022-01-20 15:23:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][250/1251] eta 0:37:57 lr 0.000769 time 1.9477 (2.2755) loss 3.9983 (3.7076) grad_norm 1.1733 (1.2434) [2022-01-20 15:24:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][260/1251] eta 0:37:28 lr 0.000769 time 2.0314 (2.2687) loss 4.1367 (3.7131) grad_norm 1.3176 (1.2442) [2022-01-20 15:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][270/1251] eta 0:37:11 lr 0.000769 time 5.3926 (2.2743) loss 3.6843 (3.7166) grad_norm 1.2327 (1.2460) [2022-01-20 15:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][280/1251] eta 0:36:47 lr 0.000769 time 2.5604 (2.2735) loss 3.7695 (3.7115) grad_norm 1.1726 (1.2488) [2022-01-20 15:25:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][290/1251] eta 0:36:22 lr 0.000769 time 1.9101 (2.2715) loss 3.4608 (3.7150) grad_norm 1.2583 (1.2500) [2022-01-20 15:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][300/1251] eta 0:35:57 lr 0.000769 time 1.5275 (2.2686) loss 2.5726 (3.7141) grad_norm 1.3132 (1.2520) [2022-01-20 15:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][310/1251] eta 0:35:30 lr 0.000769 time 3.3880 (2.2643) loss 4.3877 (3.7115) grad_norm 1.2189 (1.2515) [2022-01-20 15:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][320/1251] eta 0:35:05 lr 0.000769 time 1.8633 (2.2619) loss 2.7646 (3.7134) grad_norm 1.2236 (1.2507) [2022-01-20 15:26:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][330/1251] eta 0:34:45 lr 0.000769 time 2.5954 (2.2639) loss 3.9373 (3.7161) grad_norm 1.2622 (1.2505) [2022-01-20 15:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][340/1251] eta 0:34:20 lr 0.000769 time 1.8047 (2.2623) loss 3.4648 (3.7183) grad_norm 1.1218 (1.2491) [2022-01-20 15:27:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][350/1251] eta 0:33:58 lr 0.000769 time 2.5106 (2.2622) loss 4.5657 (3.7192) grad_norm 1.2861 (1.2475) [2022-01-20 15:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][360/1251] eta 0:33:27 lr 0.000769 time 1.7351 (2.2530) loss 3.8483 (3.7219) grad_norm 1.1352 (1.2460) [2022-01-20 15:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][370/1251] eta 0:32:58 lr 0.000769 time 2.3352 (2.2453) loss 3.4196 (3.7303) grad_norm 1.2103 (1.2435) [2022-01-20 15:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][380/1251] eta 0:32:36 lr 0.000769 time 2.1902 (2.2463) loss 4.1013 (3.7326) grad_norm 1.0982 (1.2451) [2022-01-20 15:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][390/1251] eta 0:32:13 lr 0.000769 time 2.9615 (2.2455) loss 4.3120 (3.7326) grad_norm 2.0164 (1.2485) [2022-01-20 15:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][400/1251] eta 0:31:49 lr 0.000769 time 2.1953 (2.2438) loss 2.6514 (3.7271) grad_norm 1.3742 (1.2514) [2022-01-20 15:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][410/1251] eta 0:31:25 lr 0.000769 time 2.8905 (2.2416) loss 3.4940 (3.7299) grad_norm 1.4074 (1.2532) [2022-01-20 15:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][420/1251] eta 0:30:58 lr 0.000769 time 1.5790 (2.2369) loss 3.3437 (3.7265) grad_norm 1.1110 (1.2524) [2022-01-20 15:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][430/1251] eta 0:30:37 lr 0.000769 time 2.2319 (2.2384) loss 4.0998 (3.7298) grad_norm 1.2515 (1.2529) [2022-01-20 15:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][440/1251] eta 0:30:16 lr 0.000769 time 2.1513 (2.2400) loss 3.9010 (3.7311) grad_norm 1.1446 (1.2517) [2022-01-20 15:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][450/1251] eta 0:29:54 lr 0.000769 time 
2.2279 (2.2408) loss 4.2711 (3.7352) grad_norm 1.5117 (1.2539) [2022-01-20 15:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][460/1251] eta 0:29:30 lr 0.000769 time 1.6230 (2.2389) loss 3.5062 (3.7324) grad_norm 1.2408 (1.2538) [2022-01-20 15:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][470/1251] eta 0:29:06 lr 0.000769 time 2.2130 (2.2368) loss 4.1951 (3.7274) grad_norm 1.1805 (1.2524) [2022-01-20 15:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][480/1251] eta 0:28:38 lr 0.000769 time 1.9744 (2.2287) loss 3.8379 (3.7278) grad_norm 1.3429 (1.2528) [2022-01-20 15:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][490/1251] eta 0:28:12 lr 0.000769 time 1.7880 (2.2240) loss 3.7977 (3.7233) grad_norm 1.3573 (1.2529) [2022-01-20 15:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][500/1251] eta 0:27:49 lr 0.000768 time 2.5278 (2.2227) loss 3.1986 (3.7233) grad_norm 1.0726 (1.2535) [2022-01-20 15:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][510/1251] eta 0:27:25 lr 0.000768 time 2.0195 (2.2207) loss 4.0464 (3.7276) grad_norm 1.2972 (1.2536) [2022-01-20 15:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][520/1251] eta 0:27:04 lr 0.000768 time 2.5378 (2.2223) loss 3.4086 (3.7255) grad_norm 1.0623 (1.2550) [2022-01-20 15:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][530/1251] eta 0:26:42 lr 0.000768 time 2.4556 (2.2224) loss 4.5947 (3.7279) grad_norm 1.1815 (1.2543) [2022-01-20 15:34:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][540/1251] eta 0:26:18 lr 0.000768 time 1.8544 (2.2205) loss 4.0422 (3.7300) grad_norm 1.2116 (1.2566) [2022-01-20 15:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][550/1251] eta 0:25:56 lr 0.000768 time 1.9818 (2.2204) loss 3.8862 (3.7368) grad_norm 1.1158 (1.2566) [2022-01-20 15:35:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][560/1251] eta 0:25:35 lr 0.000768 time 3.1752 (2.2224) loss 3.9592 (3.7424) grad_norm 1.2631 (1.2562) [2022-01-20 15:35:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][570/1251] eta 0:25:17 lr 0.000768 time 2.1257 (2.2282) loss 3.6966 (3.7440) grad_norm 1.3339 (1.2572) [2022-01-20 15:35:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][580/1251] eta 0:24:55 lr 0.000768 time 2.0473 (2.2291) loss 3.7215 (3.7419) grad_norm 1.1033 (1.2568) [2022-01-20 15:36:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][590/1251] eta 0:24:33 lr 0.000768 time 1.9242 (2.2286) loss 4.2075 (3.7451) grad_norm 1.4187 (1.2571) [2022-01-20 15:36:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][600/1251] eta 0:24:08 lr 0.000768 time 1.8367 (2.2248) loss 3.9549 (3.7463) grad_norm 1.0690 (1.2566) [2022-01-20 15:36:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][610/1251] eta 0:23:42 lr 0.000768 time 1.8600 (2.2192) loss 4.2887 (3.7475) grad_norm 1.3135 (1.2556) [2022-01-20 15:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][620/1251] eta 0:23:19 lr 0.000768 time 1.8232 (2.2175) loss 4.2396 (3.7493) grad_norm 1.1316 (1.2539) [2022-01-20 15:37:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][630/1251] eta 0:22:58 lr 0.000768 time 2.4210 (2.2191) loss 4.3846 (3.7502) grad_norm 1.1802 (1.2526) [2022-01-20 15:37:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][640/1251] eta 0:22:35 lr 0.000768 time 1.8205 (2.2183) loss 3.7017 (3.7486) grad_norm 0.9876 (1.2510) [2022-01-20 15:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][650/1251] eta 0:22:13 lr 0.000768 time 2.0001 (2.2180) loss 3.4825 (3.7540) grad_norm 1.1574 (1.2499) [2022-01-20 15:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][660/1251] eta 0:21:51 lr 0.000768 time 
2.1293 (2.2190) loss 3.2842 (3.7532) grad_norm 1.1063 (1.2494) [2022-01-20 15:39:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][670/1251] eta 0:21:27 lr 0.000768 time 2.1493 (2.2168) loss 3.4984 (3.7541) grad_norm 1.1931 (1.2490) [2022-01-20 15:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][680/1251] eta 0:21:06 lr 0.000768 time 2.2451 (2.2172) loss 2.9186 (3.7510) grad_norm 1.5687 (1.2491) [2022-01-20 15:39:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][690/1251] eta 0:20:45 lr 0.000768 time 2.1751 (2.2193) loss 4.0357 (3.7485) grad_norm 1.1126 (1.2494) [2022-01-20 15:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][700/1251] eta 0:20:22 lr 0.000768 time 2.0719 (2.2179) loss 4.3568 (3.7474) grad_norm 1.4705 (1.2490) [2022-01-20 15:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][710/1251] eta 0:20:00 lr 0.000768 time 2.3704 (2.2188) loss 3.1619 (3.7483) grad_norm 1.3256 (1.2504) [2022-01-20 15:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][720/1251] eta 0:19:36 lr 0.000768 time 2.2146 (2.2165) loss 3.3346 (3.7483) grad_norm 1.1826 (1.2497) [2022-01-20 15:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][730/1251] eta 0:19:14 lr 0.000768 time 1.9565 (2.2150) loss 2.8230 (3.7464) grad_norm 1.4799 (1.2509) [2022-01-20 15:41:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][740/1251] eta 0:18:50 lr 0.000768 time 1.7720 (2.2123) loss 4.3103 (3.7453) grad_norm 1.1916 (1.2511) [2022-01-20 15:42:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][750/1251] eta 0:18:29 lr 0.000768 time 2.7342 (2.2136) loss 3.5455 (3.7462) grad_norm 1.1150 (1.2514) [2022-01-20 15:42:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][760/1251] eta 0:18:07 lr 0.000768 time 2.4514 (2.2150) loss 4.5718 (3.7476) grad_norm 1.1305 (1.2517) [2022-01-20 15:42:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][770/1251] eta 0:17:46 lr 0.000768 time 2.0308 (2.2168) loss 3.2487 (3.7488) grad_norm 1.5288 (1.2523) [2022-01-20 15:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][780/1251] eta 0:17:23 lr 0.000767 time 1.8226 (2.2155) loss 3.5700 (3.7509) grad_norm 1.2227 (1.2524) [2022-01-20 15:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][790/1251] eta 0:17:00 lr 0.000767 time 1.9185 (2.2134) loss 4.0139 (3.7526) grad_norm 1.2895 (1.2530) [2022-01-20 15:43:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][800/1251] eta 0:16:38 lr 0.000767 time 2.8109 (2.2147) loss 4.4061 (3.7542) grad_norm 1.1713 (1.2532) [2022-01-20 15:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][810/1251] eta 0:16:16 lr 0.000767 time 1.8662 (2.2136) loss 3.9073 (3.7536) grad_norm 1.1388 (1.2527) [2022-01-20 15:44:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][820/1251] eta 0:15:53 lr 0.000767 time 2.4147 (2.2119) loss 2.7616 (3.7527) grad_norm 1.5664 (1.2530) [2022-01-20 15:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][830/1251] eta 0:15:30 lr 0.000767 time 1.7428 (2.2108) loss 3.6535 (3.7530) grad_norm 1.2877 (1.2541) [2022-01-20 15:45:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][840/1251] eta 0:15:08 lr 0.000767 time 2.1665 (2.2097) loss 2.7657 (3.7574) grad_norm 1.2942 (1.2537) [2022-01-20 15:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][850/1251] eta 0:14:45 lr 0.000767 time 1.8363 (2.2090) loss 4.3961 (3.7562) grad_norm 1.3032 (1.2537) [2022-01-20 15:45:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][860/1251] eta 0:14:23 lr 0.000767 time 2.0830 (2.2079) loss 4.0006 (3.7566) grad_norm 1.3749 (1.2541) [2022-01-20 15:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][870/1251] eta 0:14:00 lr 0.000767 time 
1.8445 (2.2063) loss 4.0810 (3.7540) grad_norm 1.2921 (1.2532) [2022-01-20 15:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][880/1251] eta 0:13:39 lr 0.000767 time 2.7017 (2.2085) loss 3.4622 (3.7541) grad_norm 1.0935 (1.2528) [2022-01-20 15:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][890/1251] eta 0:13:17 lr 0.000767 time 1.6874 (2.2079) loss 3.8725 (3.7534) grad_norm 1.2093 (1.2529) [2022-01-20 15:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][900/1251] eta 0:12:54 lr 0.000767 time 2.2559 (2.2076) loss 4.3393 (3.7568) grad_norm 1.0636 (1.2524) [2022-01-20 15:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][910/1251] eta 0:12:32 lr 0.000767 time 1.9179 (2.2069) loss 4.0378 (3.7537) grad_norm 1.0678 (1.2524) [2022-01-20 15:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][920/1251] eta 0:12:10 lr 0.000767 time 1.9078 (2.2060) loss 2.8936 (3.7527) grad_norm 1.2824 (1.2529) [2022-01-20 15:48:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][930/1251] eta 0:11:48 lr 0.000767 time 1.8166 (2.2075) loss 3.7891 (3.7500) grad_norm 1.3925 (1.2523) [2022-01-20 15:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][940/1251] eta 0:11:26 lr 0.000767 time 2.2542 (2.2069) loss 3.2889 (3.7506) grad_norm 1.3534 (1.2529) [2022-01-20 15:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][950/1251] eta 0:11:04 lr 0.000767 time 2.2530 (2.2070) loss 3.0703 (3.7491) grad_norm 1.0952 (1.2529) [2022-01-20 15:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][960/1251] eta 0:10:42 lr 0.000767 time 1.9149 (2.2075) loss 2.8627 (3.7495) grad_norm 1.2496 (1.2539) [2022-01-20 15:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][970/1251] eta 0:10:20 lr 0.000767 time 1.9206 (2.2070) loss 4.1607 (3.7500) grad_norm 1.3365 (1.2545) [2022-01-20 15:50:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][980/1251] eta 0:09:57 lr 0.000767 time 2.7518 (2.2066) loss 4.3836 (3.7518) grad_norm 1.1854 (1.2543) [2022-01-20 15:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][990/1251] eta 0:09:35 lr 0.000767 time 1.6548 (2.2049) loss 3.1363 (3.7513) grad_norm 1.3323 (1.2546) [2022-01-20 15:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1000/1251] eta 0:09:13 lr 0.000767 time 1.9229 (2.2046) loss 3.5441 (3.7493) grad_norm 1.2488 (1.2549) [2022-01-20 15:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1010/1251] eta 0:08:51 lr 0.000767 time 2.5091 (2.2049) loss 4.0034 (3.7492) grad_norm 1.3327 (1.2544) [2022-01-20 15:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1020/1251] eta 0:08:29 lr 0.000767 time 2.5293 (2.2044) loss 3.1143 (3.7492) grad_norm 1.0674 (1.2535) [2022-01-20 15:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1030/1251] eta 0:08:07 lr 0.000767 time 1.7711 (2.2042) loss 2.6270 (3.7491) grad_norm 1.2960 (1.2527) [2022-01-20 15:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1040/1251] eta 0:07:45 lr 0.000767 time 1.7337 (2.2040) loss 3.7494 (3.7460) grad_norm 1.7074 (1.2531) [2022-01-20 15:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1050/1251] eta 0:07:22 lr 0.000767 time 2.1424 (2.2035) loss 3.6146 (3.7432) grad_norm 1.4372 (1.2532) [2022-01-20 15:53:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1060/1251] eta 0:07:00 lr 0.000767 time 2.1812 (2.2025) loss 4.4238 (3.7444) grad_norm 1.2512 (1.2532) [2022-01-20 15:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1070/1251] eta 0:06:38 lr 0.000766 time 1.9203 (2.2022) loss 4.2223 (3.7462) grad_norm 1.3195 (1.2526) [2022-01-20 15:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1080/1251] eta 0:06:16 lr 0.000766 
time 1.8568 (2.2029) loss 4.6975 (3.7488) grad_norm 1.2068 (1.2516) [2022-01-20 15:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1090/1251] eta 0:05:54 lr 0.000766 time 2.4323 (2.2032) loss 4.1693 (3.7484) grad_norm 1.5516 (1.2512) [2022-01-20 15:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1100/1251] eta 0:05:32 lr 0.000766 time 2.3816 (2.2033) loss 3.1985 (3.7495) grad_norm 1.3538 (1.2517) [2022-01-20 15:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1110/1251] eta 0:05:10 lr 0.000766 time 1.7111 (2.2023) loss 4.0801 (3.7507) grad_norm 1.1553 (1.2512) [2022-01-20 15:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1120/1251] eta 0:04:48 lr 0.000766 time 2.0971 (2.2027) loss 4.1700 (3.7509) grad_norm 1.2375 (1.2509) [2022-01-20 15:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1130/1251] eta 0:04:26 lr 0.000766 time 1.8993 (2.2018) loss 3.7909 (3.7506) grad_norm 1.0897 (1.2508) [2022-01-20 15:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1140/1251] eta 0:04:04 lr 0.000766 time 2.4823 (2.2017) loss 4.0940 (3.7522) grad_norm 1.0664 (1.2497) [2022-01-20 15:56:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1150/1251] eta 0:03:42 lr 0.000766 time 2.3216 (2.2014) loss 3.6184 (3.7508) grad_norm 1.2157 (1.2493) [2022-01-20 15:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1160/1251] eta 0:03:20 lr 0.000766 time 1.8076 (2.2019) loss 3.4671 (3.7513) grad_norm 1.1368 (1.2491) [2022-01-20 15:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1170/1251] eta 0:02:58 lr 0.000766 time 1.5651 (2.2013) loss 3.4224 (3.7479) grad_norm 1.1114 (1.2485) [2022-01-20 15:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1180/1251] eta 0:02:36 lr 0.000766 time 2.0200 (2.2002) loss 4.1730 (3.7496) grad_norm 1.0910 (1.2478) [2022-01-20 15:57:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1190/1251] eta 0:02:14 lr 0.000766 time 2.0558 (2.1997) loss 3.6568 (3.7498) grad_norm 1.0659 (1.2477) [2022-01-20 15:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1200/1251] eta 0:01:52 lr 0.000766 time 2.1155 (2.1996) loss 3.9278 (3.7516) grad_norm 1.0550 (1.2470) [2022-01-20 15:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1210/1251] eta 0:01:30 lr 0.000766 time 1.9150 (2.1991) loss 4.5219 (3.7527) grad_norm 1.1905 (1.2472) [2022-01-20 15:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1220/1251] eta 0:01:08 lr 0.000766 time 2.8111 (2.1990) loss 3.5252 (3.7508) grad_norm 1.2218 (1.2471) [2022-01-20 15:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1230/1251] eta 0:00:46 lr 0.000766 time 2.1869 (2.2000) loss 3.5660 (3.7512) grad_norm 1.1455 (1.2466) [2022-01-20 15:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1240/1251] eta 0:00:24 lr 0.000766 time 1.4557 (2.1987) loss 2.7143 (3.7482) grad_norm 1.6618 (1.2468) [2022-01-20 16:00:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [96/300][1250/1251] eta 0:00:02 lr 0.000766 time 1.3142 (2.1927) loss 4.1738 (3.7487) grad_norm 1.1578 (1.2472) [2022-01-20 16:00:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 96 training takes 0:45:43 [2022-01-20 16:00:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.712 (17.712) Loss 1.0944 (1.0944) Acc@1 73.633 (73.633) Acc@5 92.871 (92.871) [2022-01-20 16:00:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.612 (3.459) Loss 1.1466 (1.1456) Acc@1 74.023 (73.216) Acc@5 92.285 (92.205) [2022-01-20 16:00:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.297 (2.653) Loss 1.0630 (1.1455) Acc@1 76.270 (73.484) Acc@5 93.750 (92.197) [2022-01-20 16:01:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: 
[30/49] Time 0.619 (2.390) Loss 1.1987 (1.1562) Acc@1 72.461 (73.400) Acc@5 91.504 (91.945) [2022-01-20 16:01:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.951 (2.236) Loss 1.1825 (1.1549) Acc@1 72.852 (73.449) Acc@5 92.090 (91.942) [2022-01-20 16:01:39 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.416 Acc@5 91.974 [2022-01-20 16:01:39 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.4% [2022-01-20 16:01:39 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.72% [2022-01-20 16:02:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][0/1251] eta 7:34:16 lr 0.000766 time 21.7880 (21.7880) loss 3.6861 (3.6861) grad_norm 1.2490 (1.2490) [2022-01-20 16:02:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][10/1251] eta 1:24:40 lr 0.000766 time 1.6747 (4.0939) loss 3.8660 (3.6875) grad_norm 1.1635 (1.2834) [2022-01-20 16:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][20/1251] eta 1:04:30 lr 0.000766 time 1.3350 (3.1438) loss 3.9523 (3.8768) grad_norm 1.1848 (1.2713) [2022-01-20 16:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][30/1251] eta 0:57:23 lr 0.000766 time 1.5748 (2.8198) loss 3.7364 (3.8403) grad_norm 1.1942 (1.2474) [2022-01-20 16:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][40/1251] eta 0:54:34 lr 0.000766 time 3.9813 (2.7041) loss 3.2736 (3.7792) grad_norm 1.3079 (1.2415) [2022-01-20 16:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][50/1251] eta 0:52:35 lr 0.000766 time 2.4477 (2.6278) loss 3.7948 (3.7730) grad_norm 1.0732 (1.2363) [2022-01-20 16:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][60/1251] eta 0:50:45 lr 0.000766 time 1.8538 (2.5570) loss 2.7751 (3.7519) grad_norm 1.2951 (1.2375) [2022-01-20 16:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][70/1251] eta 
0:49:07 lr 0.000766 time 2.3313 (2.4960) loss 4.3892 (3.7251) grad_norm 1.1935 (1.2377) [2022-01-20 16:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][80/1251] eta 0:48:27 lr 0.000766 time 3.5239 (2.4829) loss 3.4098 (3.7153) grad_norm 1.5430 (1.2430) [2022-01-20 16:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][90/1251] eta 0:47:41 lr 0.000766 time 2.5311 (2.4649) loss 3.5963 (3.7055) grad_norm 1.2353 (1.2565) [2022-01-20 16:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][100/1251] eta 0:46:54 lr 0.000765 time 2.2434 (2.4456) loss 4.0923 (3.6844) grad_norm 1.1298 (1.2485) [2022-01-20 16:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][110/1251] eta 0:45:45 lr 0.000765 time 1.9518 (2.4065) loss 3.9784 (3.7041) grad_norm 1.5161 (1.2517) [2022-01-20 16:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][120/1251] eta 0:44:52 lr 0.000765 time 2.8573 (2.3807) loss 3.4660 (3.7272) grad_norm 1.2164 (1.2526) [2022-01-20 16:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][130/1251] eta 0:44:05 lr 0.000765 time 2.1277 (2.3596) loss 3.8901 (3.7493) grad_norm 1.3333 (1.2502) [2022-01-20 16:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][140/1251] eta 0:43:29 lr 0.000765 time 2.5116 (2.3487) loss 2.6567 (3.7470) grad_norm 1.3064 (1.2457) [2022-01-20 16:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][150/1251] eta 0:42:57 lr 0.000765 time 2.0974 (2.3410) loss 4.2856 (3.7590) grad_norm 1.2670 (1.2474) [2022-01-20 16:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][160/1251] eta 0:42:27 lr 0.000765 time 2.3485 (2.3351) loss 4.2787 (3.7501) grad_norm 1.0823 (1.2439) [2022-01-20 16:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][170/1251] eta 0:41:53 lr 0.000765 time 2.5339 (2.3251) loss 4.0598 (3.7453) grad_norm 1.2172 (1.2410) [2022-01-20 16:08:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][180/1251] eta 0:41:21 lr 0.000765 time 2.3898 (2.3167) loss 2.9792 (3.7616) grad_norm 1.2454 (1.2403) [2022-01-20 16:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][190/1251] eta 0:40:50 lr 0.000765 time 1.8990 (2.3098) loss 4.1179 (3.7597) grad_norm 1.0919 (1.2394) [2022-01-20 16:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][200/1251] eta 0:40:20 lr 0.000765 time 2.2318 (2.3028) loss 3.4336 (3.7661) grad_norm 1.2446 (1.2366) [2022-01-20 16:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][210/1251] eta 0:39:54 lr 0.000765 time 2.6886 (2.3005) loss 3.8692 (3.7682) grad_norm 1.3855 (1.2387) [2022-01-20 16:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][220/1251] eta 0:39:22 lr 0.000765 time 2.5084 (2.2919) loss 2.7914 (3.7443) grad_norm 1.3993 (1.2386) [2022-01-20 16:10:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][230/1251] eta 0:38:52 lr 0.000765 time 2.1955 (2.2845) loss 4.0055 (3.7467) grad_norm 1.6245 (1.2386) [2022-01-20 16:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][240/1251] eta 0:38:24 lr 0.000765 time 1.8681 (2.2795) loss 2.8601 (3.7556) grad_norm 1.3167 (1.2417) [2022-01-20 16:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][250/1251] eta 0:37:54 lr 0.000765 time 2.6114 (2.2722) loss 3.8036 (3.7625) grad_norm 1.1198 (1.2394) [2022-01-20 16:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][260/1251] eta 0:37:30 lr 0.000765 time 2.8645 (2.2705) loss 3.1978 (3.7546) grad_norm 1.2306 (1.2407) [2022-01-20 16:11:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][270/1251] eta 0:36:59 lr 0.000765 time 1.9133 (2.2629) loss 4.2319 (3.7566) grad_norm 1.1964 (1.2400) [2022-01-20 16:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][280/1251] eta 0:36:34 lr 0.000765 time 
2.5621 (2.2599) loss 4.4108 (3.7551) grad_norm 1.1622 (1.2398) [2022-01-20 16:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][290/1251] eta 0:36:11 lr 0.000765 time 2.3594 (2.2596) loss 4.4142 (3.7539) grad_norm 1.1698 (1.2397) [2022-01-20 16:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][300/1251] eta 0:35:47 lr 0.000765 time 2.7750 (2.2578) loss 4.3832 (3.7559) grad_norm 1.1654 (1.2415) [2022-01-20 16:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][310/1251] eta 0:35:21 lr 0.000765 time 2.5482 (2.2549) loss 2.5863 (3.7598) grad_norm 1.5575 (1.2407) [2022-01-20 16:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][320/1251] eta 0:34:54 lr 0.000765 time 2.5906 (2.2496) loss 4.1354 (3.7592) grad_norm 1.0577 (1.2393) [2022-01-20 16:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][330/1251] eta 0:34:25 lr 0.000765 time 1.9706 (2.2430) loss 2.6138 (3.7581) grad_norm 1.2097 (1.2373) [2022-01-20 16:14:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][340/1251] eta 0:33:57 lr 0.000765 time 1.8781 (2.2368) loss 4.5021 (3.7601) grad_norm 1.3145 (1.2355) [2022-01-20 16:14:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][350/1251] eta 0:33:32 lr 0.000765 time 1.8357 (2.2338) loss 2.9106 (3.7529) grad_norm 1.1582 (1.2355) [2022-01-20 16:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][360/1251] eta 0:33:08 lr 0.000765 time 3.0988 (2.2316) loss 4.0706 (3.7545) grad_norm 1.0248 (1.2343) [2022-01-20 16:15:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][370/1251] eta 0:32:49 lr 0.000765 time 1.8146 (2.2352) loss 2.8280 (3.7490) grad_norm 1.3071 (1.2334) [2022-01-20 16:15:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][380/1251] eta 0:32:25 lr 0.000765 time 2.1166 (2.2342) loss 3.6123 (3.7536) grad_norm 1.3437 (1.2347) [2022-01-20 16:16:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][390/1251] eta 0:32:02 lr 0.000764 time 2.1664 (2.2334) loss 4.1370 (3.7555) grad_norm 1.6639 (1.2351) [2022-01-20 16:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][400/1251] eta 0:31:43 lr 0.000764 time 3.7237 (2.2366) loss 4.1810 (3.7613) grad_norm 1.3596 (1.2354) [2022-01-20 16:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][410/1251] eta 0:31:18 lr 0.000764 time 2.2111 (2.2336) loss 2.6604 (3.7599) grad_norm 1.3721 (1.2347) [2022-01-20 16:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][420/1251] eta 0:30:53 lr 0.000764 time 2.5025 (2.2300) loss 4.0718 (3.7538) grad_norm 1.1907 (1.2354) [2022-01-20 16:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][430/1251] eta 0:30:30 lr 0.000764 time 2.1704 (2.2293) loss 4.1060 (3.7490) grad_norm 1.1291 (1.2351) [2022-01-20 16:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][440/1251] eta 0:30:06 lr 0.000764 time 2.5449 (2.2277) loss 4.3102 (3.7463) grad_norm 1.6159 (1.2377) [2022-01-20 16:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][450/1251] eta 0:29:42 lr 0.000764 time 2.1160 (2.2248) loss 3.8523 (3.7461) grad_norm 1.0124 (1.2367) [2022-01-20 16:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][460/1251] eta 0:29:19 lr 0.000764 time 1.9696 (2.2239) loss 4.5929 (3.7466) grad_norm 1.2309 (1.2373) [2022-01-20 16:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][470/1251] eta 0:28:55 lr 0.000764 time 1.7744 (2.2219) loss 4.3761 (3.7470) grad_norm 1.1919 (1.2367) [2022-01-20 16:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][480/1251] eta 0:28:34 lr 0.000764 time 2.5633 (2.2241) loss 3.3539 (3.7444) grad_norm 1.4949 (1.2371) [2022-01-20 16:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][490/1251] eta 0:28:11 lr 0.000764 time 
1.9415 (2.2231) loss 4.2088 (3.7469) grad_norm 1.0224 (1.2366) [2022-01-20 16:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][500/1251] eta 0:27:47 lr 0.000764 time 1.7375 (2.2207) loss 4.2082 (3.7466) grad_norm 1.1118 (1.2367) [2022-01-20 16:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][510/1251] eta 0:27:26 lr 0.000764 time 2.1555 (2.2220) loss 4.0256 (3.7418) grad_norm 1.3948 (1.2368) [2022-01-20 16:20:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][520/1251] eta 0:27:05 lr 0.000764 time 2.3271 (2.2231) loss 3.8069 (3.7420) grad_norm 1.2341 (1.2364) [2022-01-20 16:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][530/1251] eta 0:26:41 lr 0.000764 time 2.2912 (2.2210) loss 3.6924 (3.7452) grad_norm 1.3026 (1.2360) [2022-01-20 16:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][540/1251] eta 0:26:19 lr 0.000764 time 2.5109 (2.2214) loss 3.6794 (3.7419) grad_norm 1.2480 (1.2368) [2022-01-20 16:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][550/1251] eta 0:25:55 lr 0.000764 time 1.8388 (2.2188) loss 4.3174 (3.7409) grad_norm 1.1586 (1.2367) [2022-01-20 16:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][560/1251] eta 0:25:31 lr 0.000764 time 1.8543 (2.2159) loss 4.3829 (3.7432) grad_norm 1.2277 (1.2368) [2022-01-20 16:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][570/1251] eta 0:25:08 lr 0.000764 time 2.4912 (2.2147) loss 4.1733 (3.7387) grad_norm 1.1892 (1.2367) [2022-01-20 16:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][580/1251] eta 0:24:45 lr 0.000764 time 1.8344 (2.2145) loss 2.5359 (3.7394) grad_norm 1.0455 (1.2358) [2022-01-20 16:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][590/1251] eta 0:24:22 lr 0.000764 time 1.8945 (2.2133) loss 3.2612 (3.7341) grad_norm 1.1849 (1.2372) [2022-01-20 16:23:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][600/1251] eta 0:24:02 lr 0.000764 time 2.4833 (2.2158) loss 4.4290 (3.7392) grad_norm 1.1604 (1.2384) [2022-01-20 16:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][610/1251] eta 0:23:40 lr 0.000764 time 2.7286 (2.2162) loss 3.6854 (3.7423) grad_norm 1.2776 (1.2390) [2022-01-20 16:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][620/1251] eta 0:23:17 lr 0.000764 time 2.2258 (2.2146) loss 4.4954 (3.7441) grad_norm 1.3183 (1.2378) [2022-01-20 16:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][630/1251] eta 0:22:53 lr 0.000764 time 1.8939 (2.2123) loss 3.9722 (3.7450) grad_norm 1.1865 (1.2374) [2022-01-20 16:25:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][640/1251] eta 0:22:32 lr 0.000764 time 2.8383 (2.2140) loss 4.1122 (3.7472) grad_norm 1.3008 (1.2370) [2022-01-20 16:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][650/1251] eta 0:22:11 lr 0.000764 time 1.9392 (2.2150) loss 3.7033 (3.7436) grad_norm 1.2570 (1.2378) [2022-01-20 16:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][660/1251] eta 0:21:48 lr 0.000764 time 2.2628 (2.2141) loss 3.8532 (3.7440) grad_norm 1.2170 (1.2382) [2022-01-20 16:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][670/1251] eta 0:21:25 lr 0.000763 time 1.8831 (2.2124) loss 4.5372 (3.7476) grad_norm 1.0681 (1.2394) [2022-01-20 16:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][680/1251] eta 0:21:01 lr 0.000763 time 1.8843 (2.2094) loss 3.4282 (3.7450) grad_norm 1.1749 (1.2383) [2022-01-20 16:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][690/1251] eta 0:20:39 lr 0.000763 time 2.2510 (2.2090) loss 3.5386 (3.7444) grad_norm 1.2171 (1.2372) [2022-01-20 16:27:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][700/1251] eta 0:20:16 lr 0.000763 time 
2.0426 (2.2079) loss 3.7533 (3.7387) grad_norm 1.5974 (1.2374) [2022-01-20 16:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][710/1251] eta 0:19:53 lr 0.000763 time 1.6597 (2.2055) loss 2.6438 (3.7320) grad_norm 1.2007 (1.2378) [2022-01-20 16:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][720/1251] eta 0:19:30 lr 0.000763 time 2.2117 (2.2042) loss 3.7012 (3.7321) grad_norm 1.2821 (1.2388) [2022-01-20 16:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][730/1251] eta 0:19:07 lr 0.000763 time 1.7450 (2.2034) loss 2.6575 (3.7311) grad_norm 1.2696 (1.2396) [2022-01-20 16:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][740/1251] eta 0:18:46 lr 0.000763 time 2.2984 (2.2036) loss 4.2249 (3.7319) grad_norm 1.0778 (1.2380) [2022-01-20 16:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][750/1251] eta 0:18:23 lr 0.000763 time 2.1512 (2.2027) loss 2.5219 (3.7285) grad_norm 1.3787 (1.2395) [2022-01-20 16:29:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][760/1251] eta 0:18:02 lr 0.000763 time 2.1575 (2.2050) loss 2.7962 (3.7302) grad_norm 1.2034 (1.2402) [2022-01-20 16:30:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][770/1251] eta 0:17:40 lr 0.000763 time 1.9115 (2.2055) loss 3.7722 (3.7309) grad_norm 1.4966 (1.2400) [2022-01-20 16:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][780/1251] eta 0:17:18 lr 0.000763 time 2.2355 (2.2056) loss 3.8304 (3.7311) grad_norm 1.3989 (1.2403) [2022-01-20 16:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][790/1251] eta 0:16:56 lr 0.000763 time 2.6533 (2.2051) loss 3.2049 (3.7300) grad_norm 1.1673 (1.2396) [2022-01-20 16:31:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][800/1251] eta 0:16:35 lr 0.000763 time 2.2611 (2.2072) loss 4.7971 (3.7320) grad_norm 1.1460 (1.2391) [2022-01-20 16:31:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][810/1251] eta 0:16:13 lr 0.000763 time 1.8395 (2.2085) loss 4.1486 (3.7334) grad_norm 1.2301 (1.2392) [2022-01-20 16:31:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][820/1251] eta 0:15:51 lr 0.000763 time 2.2417 (2.2086) loss 3.0705 (3.7348) grad_norm 1.3070 (1.2406) [2022-01-20 16:32:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][830/1251] eta 0:15:29 lr 0.000763 time 1.6295 (2.2072) loss 4.2004 (3.7341) grad_norm 1.1273 (1.2407) [2022-01-20 16:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][840/1251] eta 0:15:07 lr 0.000763 time 1.8478 (2.2073) loss 4.2428 (3.7342) grad_norm 1.2279 (1.2408) [2022-01-20 16:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][850/1251] eta 0:14:46 lr 0.000763 time 1.8278 (2.2097) loss 3.9481 (3.7351) grad_norm 1.3323 (1.2415) [2022-01-20 16:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][860/1251] eta 0:14:23 lr 0.000763 time 1.7043 (2.2085) loss 3.1334 (3.7350) grad_norm 1.2231 (1.2417) [2022-01-20 16:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][870/1251] eta 0:14:01 lr 0.000763 time 1.7990 (2.2078) loss 4.3293 (3.7384) grad_norm 1.2116 (1.2411) [2022-01-20 16:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][880/1251] eta 0:13:38 lr 0.000763 time 1.8756 (2.2061) loss 3.9548 (3.7382) grad_norm 1.1857 (1.2412) [2022-01-20 16:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][890/1251] eta 0:13:16 lr 0.000763 time 2.0369 (2.2059) loss 3.4520 (3.7368) grad_norm 1.0913 (1.2412) [2022-01-20 16:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][900/1251] eta 0:12:54 lr 0.000763 time 1.8556 (2.2067) loss 4.3847 (3.7379) grad_norm 1.2588 (1.2411) [2022-01-20 16:35:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][910/1251] eta 0:12:32 lr 0.000763 time 
1.6884 (2.2071) loss 3.6495 (3.7396) grad_norm 1.1212 (1.2405) [2022-01-20 16:35:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][920/1251] eta 0:12:10 lr 0.000763 time 1.9981 (2.2068) loss 2.7181 (3.7350) grad_norm 1.2646 (1.2403) [2022-01-20 16:35:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][930/1251] eta 0:11:48 lr 0.000763 time 2.3938 (2.2080) loss 4.3149 (3.7343) grad_norm 1.1302 (1.2402) [2022-01-20 16:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][940/1251] eta 0:11:26 lr 0.000763 time 1.7397 (2.2071) loss 4.5753 (3.7309) grad_norm 1.1902 (1.2403) [2022-01-20 16:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][950/1251] eta 0:11:03 lr 0.000762 time 1.7872 (2.2055) loss 3.5145 (3.7282) grad_norm 1.5326 (1.2415) [2022-01-20 16:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][960/1251] eta 0:10:41 lr 0.000762 time 1.7119 (2.2041) loss 3.4809 (3.7295) grad_norm 1.1863 (1.2423) [2022-01-20 16:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][970/1251] eta 0:10:19 lr 0.000762 time 1.8410 (2.2050) loss 2.7687 (3.7273) grad_norm 1.2233 (1.2427) [2022-01-20 16:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][980/1251] eta 0:09:57 lr 0.000762 time 1.7509 (2.2044) loss 4.0266 (3.7287) grad_norm 1.0410 (1.2417) [2022-01-20 16:38:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][990/1251] eta 0:09:35 lr 0.000762 time 1.9560 (2.2051) loss 3.9127 (3.7265) grad_norm 1.2146 (1.2411) [2022-01-20 16:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1000/1251] eta 0:09:13 lr 0.000762 time 2.4718 (2.2062) loss 2.8497 (3.7256) grad_norm 1.0828 (1.2407) [2022-01-20 16:38:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1010/1251] eta 0:08:51 lr 0.000762 time 2.3074 (2.2060) loss 3.8415 (3.7246) grad_norm 1.0404 (1.2407) [2022-01-20 16:39:09 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1020/1251] eta 0:08:29 lr 0.000762 time 1.9380 (2.2039) loss 4.0643 (3.7265) grad_norm 1.3064 (1.2414) [2022-01-20 16:39:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1030/1251] eta 0:08:06 lr 0.000762 time 1.9518 (2.2028) loss 4.4924 (3.7264) grad_norm 1.4720 (1.2418) [2022-01-20 16:39:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1040/1251] eta 0:07:44 lr 0.000762 time 2.3956 (2.2027) loss 3.0034 (3.7232) grad_norm 1.5253 (1.2422) [2022-01-20 16:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1050/1251] eta 0:07:22 lr 0.000762 time 2.6566 (2.2032) loss 4.1911 (3.7248) grad_norm 1.1373 (1.2421) [2022-01-20 16:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1060/1251] eta 0:07:00 lr 0.000762 time 2.1597 (2.2028) loss 3.4629 (3.7225) grad_norm 1.2121 (1.2425) [2022-01-20 16:41:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1070/1251] eta 0:06:38 lr 0.000762 time 2.0543 (2.2038) loss 2.8955 (3.7212) grad_norm 1.1768 (1.2419) [2022-01-20 16:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1080/1251] eta 0:06:17 lr 0.000762 time 3.1098 (2.2048) loss 3.3904 (3.7204) grad_norm 1.4799 (1.2423) [2022-01-20 16:41:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1090/1251] eta 0:05:55 lr 0.000762 time 2.9621 (2.2053) loss 2.9233 (3.7192) grad_norm 1.2271 (1.2417) [2022-01-20 16:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1100/1251] eta 0:05:32 lr 0.000762 time 1.6915 (2.2033) loss 3.9146 (3.7187) grad_norm 1.2082 (1.2413) [2022-01-20 16:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1110/1251] eta 0:05:10 lr 0.000762 time 1.9055 (2.2011) loss 2.7450 (3.7167) grad_norm 1.1873 (1.2409) [2022-01-20 16:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1120/1251] eta 0:04:48 lr 
0.000762 time 1.9602 (2.1998) loss 4.4695 (3.7173) grad_norm 1.2851 (1.2402) [2022-01-20 16:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1130/1251] eta 0:04:26 lr 0.000762 time 2.8886 (2.2001) loss 3.8303 (3.7199) grad_norm 1.5058 (1.2415) [2022-01-20 16:43:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1140/1251] eta 0:04:04 lr 0.000762 time 1.8955 (2.1999) loss 3.9876 (3.7200) grad_norm 1.1570 (1.2416) [2022-01-20 16:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1150/1251] eta 0:03:42 lr 0.000762 time 2.1435 (2.2002) loss 3.4042 (3.7188) grad_norm 1.1222 (1.2412) [2022-01-20 16:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1160/1251] eta 0:03:20 lr 0.000762 time 2.2365 (2.2001) loss 3.2310 (3.7175) grad_norm 1.1569 (1.2407) [2022-01-20 16:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1170/1251] eta 0:02:58 lr 0.000762 time 2.4569 (2.2015) loss 4.0519 (3.7183) grad_norm 1.1036 (1.2410) [2022-01-20 16:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1180/1251] eta 0:02:36 lr 0.000762 time 2.1566 (2.2028) loss 4.1300 (3.7203) grad_norm 1.1265 (1.2414) [2022-01-20 16:45:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1190/1251] eta 0:02:14 lr 0.000762 time 2.2641 (2.2028) loss 4.2185 (3.7233) grad_norm 1.1793 (1.2412) [2022-01-20 16:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1200/1251] eta 0:01:52 lr 0.000762 time 1.9360 (2.2007) loss 3.3551 (3.7224) grad_norm 1.1109 (1.2405) [2022-01-20 16:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1210/1251] eta 0:01:30 lr 0.000762 time 3.0470 (2.2004) loss 2.6914 (3.7210) grad_norm 1.0607 (1.2404) [2022-01-20 16:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1220/1251] eta 0:01:08 lr 0.000762 time 1.9424 (2.1999) loss 3.1144 (3.7211) grad_norm 1.3770 (1.2403) [2022-01-20 16:46:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1230/1251] eta 0:00:46 lr 0.000761 time 2.2538 (2.2004) loss 3.4476 (3.7218) grad_norm 1.3626 (1.2411) [2022-01-20 16:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1240/1251] eta 0:00:24 lr 0.000761 time 2.5590 (2.1998) loss 4.2844 (3.7198) grad_norm 1.1520 (1.2407) [2022-01-20 16:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [97/300][1250/1251] eta 0:00:02 lr 0.000761 time 1.1723 (2.1944) loss 4.2055 (3.7190) grad_norm 1.2302 (1.2402) [2022-01-20 16:47:25 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 97 training takes 0:45:45 [2022-01-20 16:47:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.440 (18.440) Loss 1.0585 (1.0585) Acc@1 75.293 (75.293) Acc@5 93.164 (93.164) [2022-01-20 16:48:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.889 (3.510) Loss 1.1907 (1.1041) Acc@1 71.582 (74.503) Acc@5 92.188 (92.152) [2022-01-20 16:48:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.262 (2.702) Loss 1.1307 (1.1093) Acc@1 72.656 (74.228) Acc@5 92.285 (92.118) [2022-01-20 16:48:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.956 (2.392) Loss 1.1443 (1.1186) Acc@1 73.633 (74.017) Acc@5 91.602 (92.011) [2022-01-20 16:48:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.257 (2.304) Loss 1.0817 (1.1163) Acc@1 74.609 (73.919) Acc@5 92.188 (92.047) [2022-01-20 16:49:07 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.884 Acc@5 92.060 [2022-01-20 16:49:07 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.9% [2022-01-20 16:49:07 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.88% [2022-01-20 16:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][0/1251] eta 7:28:13 lr 0.000761 time 21.4976 (21.4976) loss 4.4248 (4.4248) grad_norm 1.3332 
(1.3332) [2022-01-20 16:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][10/1251] eta 1:22:20 lr 0.000761 time 2.1477 (3.9813) loss 4.4051 (3.5519) grad_norm 1.0998 (1.2097) [2022-01-20 16:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][20/1251] eta 1:05:19 lr 0.000761 time 1.9195 (3.1839) loss 4.0061 (3.6784) grad_norm 1.2071 (1.1953) [2022-01-20 16:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][30/1251] eta 0:58:27 lr 0.000761 time 1.2846 (2.8729) loss 4.7085 (3.7441) grad_norm 1.4420 (1.2101) [2022-01-20 16:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][40/1251] eta 0:55:45 lr 0.000761 time 3.8609 (2.7622) loss 3.5145 (3.7625) grad_norm 1.2369 (1.2450) [2022-01-20 16:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][50/1251] eta 0:53:46 lr 0.000761 time 1.6765 (2.6862) loss 3.9114 (3.7475) grad_norm 1.1980 (1.2330) [2022-01-20 16:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][60/1251] eta 0:51:30 lr 0.000761 time 1.6493 (2.5953) loss 2.5695 (3.7111) grad_norm 1.2036 (1.2272) [2022-01-20 16:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][70/1251] eta 0:49:35 lr 0.000761 time 1.8455 (2.5191) loss 3.1525 (3.6811) grad_norm 1.1505 (1.2313) [2022-01-20 16:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][80/1251] eta 0:48:06 lr 0.000761 time 2.6918 (2.4652) loss 3.5186 (3.6682) grad_norm 1.2763 (1.2293) [2022-01-20 16:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][90/1251] eta 0:47:13 lr 0.000761 time 1.5624 (2.4403) loss 3.7420 (3.6978) grad_norm 1.3978 (1.2426) [2022-01-20 16:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][100/1251] eta 0:46:24 lr 0.000761 time 1.8734 (2.4190) loss 4.2977 (3.6999) grad_norm 1.4443 (1.2579) [2022-01-20 16:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][110/1251] eta 0:45:35 
lr 0.000761 time 1.7729 (2.3977) loss 3.3045 (3.6902) grad_norm 1.3379 (1.2566) [2022-01-20 16:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][120/1251] eta 0:44:47 lr 0.000761 time 2.5693 (2.3766) loss 3.2611 (3.6655) grad_norm 1.1764 (1.2498) [2022-01-20 16:54:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][130/1251] eta 0:44:10 lr 0.000761 time 2.2233 (2.3645) loss 3.7080 (3.6854) grad_norm 1.4346 (1.2428) [2022-01-20 16:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][140/1251] eta 0:43:32 lr 0.000761 time 2.7000 (2.3515) loss 4.0421 (3.6799) grad_norm 1.0365 (1.2417) [2022-01-20 16:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][150/1251] eta 0:42:57 lr 0.000761 time 1.6568 (2.3413) loss 3.8670 (3.6773) grad_norm 1.0993 (1.2429) [2022-01-20 16:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][160/1251] eta 0:42:20 lr 0.000761 time 1.9039 (2.3285) loss 3.7711 (3.6868) grad_norm 1.0872 (1.2394) [2022-01-20 16:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][170/1251] eta 0:41:38 lr 0.000761 time 1.8610 (2.3111) loss 4.1961 (3.6802) grad_norm 1.4541 (1.2395) [2022-01-20 16:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][180/1251] eta 0:41:13 lr 0.000761 time 2.8339 (2.3091) loss 4.4489 (3.6837) grad_norm 1.4880 (1.2384) [2022-01-20 16:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][190/1251] eta 0:40:43 lr 0.000761 time 2.2898 (2.3035) loss 4.0935 (3.6947) grad_norm 1.0942 (1.2398) [2022-01-20 16:56:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][200/1251] eta 0:40:17 lr 0.000761 time 1.8013 (2.3001) loss 4.4513 (3.6941) grad_norm 1.2723 (1.2404) [2022-01-20 16:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][210/1251] eta 0:39:42 lr 0.000761 time 1.9090 (2.2884) loss 3.8822 (3.6955) grad_norm 1.2110 (1.2424) [2022-01-20 16:57:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][220/1251] eta 0:39:10 lr 0.000761 time 2.2257 (2.2795) loss 3.7279 (3.7108) grad_norm 1.2734 (1.2422) [2022-01-20 16:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][230/1251] eta 0:38:49 lr 0.000761 time 3.0182 (2.2813) loss 3.6094 (3.7187) grad_norm 1.5183 (1.2450) [2022-01-20 16:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][240/1251] eta 0:38:23 lr 0.000761 time 2.2550 (2.2781) loss 4.0598 (3.7179) grad_norm 1.0218 (1.2431) [2022-01-20 16:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][250/1251] eta 0:38:00 lr 0.000761 time 2.7489 (2.2787) loss 3.0642 (3.7186) grad_norm 1.3725 (1.2444) [2022-01-20 16:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][260/1251] eta 0:37:36 lr 0.000761 time 1.8951 (2.2765) loss 3.9697 (3.7098) grad_norm 1.2166 (1.2440) [2022-01-20 16:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][270/1251] eta 0:37:09 lr 0.000760 time 2.8155 (2.2730) loss 4.0787 (3.7074) grad_norm 1.0816 (1.2435) [2022-01-20 16:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][280/1251] eta 0:36:34 lr 0.000760 time 1.5716 (2.2600) loss 4.1467 (3.7011) grad_norm 1.2923 (1.2415) [2022-01-20 17:00:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][290/1251] eta 0:36:04 lr 0.000760 time 2.2831 (2.2528) loss 4.1186 (3.7046) grad_norm 1.2627 (1.2409) [2022-01-20 17:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][300/1251] eta 0:35:37 lr 0.000760 time 2.2186 (2.2480) loss 3.8741 (3.7106) grad_norm 1.1641 (1.2432) [2022-01-20 17:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][310/1251] eta 0:35:11 lr 0.000760 time 2.2071 (2.2443) loss 4.5928 (3.7049) grad_norm 1.1633 (1.2423) [2022-01-20 17:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][320/1251] eta 0:34:48 lr 0.000760 time 
2.1383 (2.2433) loss 2.8148 (3.7012) grad_norm 1.3221 (1.2421) [2022-01-20 17:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][330/1251] eta 0:34:29 lr 0.000760 time 2.0140 (2.2475) loss 3.8664 (3.7029) grad_norm 1.4798 (1.2444) [2022-01-20 17:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][340/1251] eta 0:34:09 lr 0.000760 time 2.1125 (2.2502) loss 4.3998 (3.7007) grad_norm 1.1122 (1.2443) [2022-01-20 17:02:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][350/1251] eta 0:33:43 lr 0.000760 time 2.2152 (2.2458) loss 3.9776 (3.7091) grad_norm 1.2482 (1.2441) [2022-01-20 17:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][360/1251] eta 0:33:17 lr 0.000760 time 2.2344 (2.2419) loss 3.5445 (3.7105) grad_norm 1.2973 (1.2471) [2022-01-20 17:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][370/1251] eta 0:32:54 lr 0.000760 time 1.9098 (2.2408) loss 3.9887 (3.7054) grad_norm 1.2796 (1.2483) [2022-01-20 17:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][380/1251] eta 0:32:28 lr 0.000760 time 2.2009 (2.2372) loss 3.9493 (3.7098) grad_norm 1.4652 (1.2502) [2022-01-20 17:03:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][390/1251] eta 0:32:04 lr 0.000760 time 2.0280 (2.2351) loss 3.6089 (3.7091) grad_norm 1.3449 (1.2508) [2022-01-20 17:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][400/1251] eta 0:31:44 lr 0.000760 time 2.2486 (2.2381) loss 3.2227 (3.7145) grad_norm 1.0858 (1.2512) [2022-01-20 17:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][410/1251] eta 0:31:21 lr 0.000760 time 1.5616 (2.2376) loss 4.0371 (3.7194) grad_norm 1.2273 (1.2503) [2022-01-20 17:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][420/1251] eta 0:30:56 lr 0.000760 time 1.9521 (2.2335) loss 3.6158 (3.7218) grad_norm 1.1734 (1.2504) [2022-01-20 17:05:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][430/1251] eta 0:30:30 lr 0.000760 time 2.6458 (2.2300) loss 3.7915 (3.7183) grad_norm 1.3635 (1.2509) [2022-01-20 17:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][440/1251] eta 0:30:06 lr 0.000760 time 1.7890 (2.2279) loss 4.2613 (3.7239) grad_norm 1.4292 (1.2510) [2022-01-20 17:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][450/1251] eta 0:29:43 lr 0.000760 time 1.8321 (2.2267) loss 4.0697 (3.7248) grad_norm 1.1483 (1.2516) [2022-01-20 17:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][460/1251] eta 0:29:19 lr 0.000760 time 1.9433 (2.2243) loss 3.2082 (3.7233) grad_norm 1.3886 (1.2527) [2022-01-20 17:06:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][470/1251] eta 0:28:58 lr 0.000760 time 3.0975 (2.2255) loss 2.9905 (3.7166) grad_norm 1.0792 (1.2508) [2022-01-20 17:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][480/1251] eta 0:28:37 lr 0.000760 time 2.0658 (2.2270) loss 2.8695 (3.7169) grad_norm 1.2104 (1.2503) [2022-01-20 17:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][490/1251] eta 0:28:15 lr 0.000760 time 2.7061 (2.2286) loss 2.6237 (3.7089) grad_norm 1.4030 (1.2507) [2022-01-20 17:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][500/1251] eta 0:27:52 lr 0.000760 time 1.6767 (2.2266) loss 3.7341 (3.7097) grad_norm 1.2074 (1.2490) [2022-01-20 17:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][510/1251] eta 0:27:25 lr 0.000760 time 2.0660 (2.2205) loss 4.1360 (3.7090) grad_norm 1.2236 (1.2492) [2022-01-20 17:08:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][520/1251] eta 0:26:59 lr 0.000760 time 2.1442 (2.2161) loss 4.2274 (3.7107) grad_norm 1.2173 (1.2484) [2022-01-20 17:08:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][530/1251] eta 0:26:35 lr 0.000760 time 
2.5575 (2.2129) loss 4.1619 (3.7119) grad_norm 1.1529 (1.2455) [2022-01-20 17:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][540/1251] eta 0:26:13 lr 0.000760 time 2.6384 (2.2124) loss 3.3277 (3.7093) grad_norm 1.1988 (1.2457) [2022-01-20 17:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][550/1251] eta 0:25:50 lr 0.000759 time 1.8440 (2.2119) loss 2.6798 (3.7037) grad_norm 1.2374 (1.2457) [2022-01-20 17:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][560/1251] eta 0:25:29 lr 0.000759 time 1.9485 (2.2136) loss 4.3539 (3.7056) grad_norm 1.1964 (1.2470) [2022-01-20 17:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][570/1251] eta 0:25:07 lr 0.000759 time 1.8532 (2.2141) loss 3.6569 (3.7092) grad_norm 1.3828 (1.2459) [2022-01-20 17:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][580/1251] eta 0:24:44 lr 0.000759 time 1.8824 (2.2131) loss 3.2137 (3.7083) grad_norm 1.1519 (1.2457) [2022-01-20 17:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][590/1251] eta 0:24:22 lr 0.000759 time 2.0583 (2.2120) loss 4.1008 (3.7076) grad_norm 1.1606 (1.2452) [2022-01-20 17:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][600/1251] eta 0:23:59 lr 0.000759 time 2.2396 (2.2114) loss 3.9900 (3.7120) grad_norm 1.2110 (1.2452) [2022-01-20 17:11:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][610/1251] eta 0:23:39 lr 0.000759 time 2.0892 (2.2141) loss 3.5629 (3.7071) grad_norm 1.3806 (1.2454) [2022-01-20 17:12:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][620/1251] eta 0:23:18 lr 0.000759 time 1.8981 (2.2159) loss 3.0715 (3.7040) grad_norm 1.2217 (1.2452) [2022-01-20 17:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][630/1251] eta 0:22:56 lr 0.000759 time 2.1249 (2.2168) loss 4.1324 (3.7052) grad_norm 1.3640 (1.2466) [2022-01-20 17:12:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][640/1251] eta 0:22:34 lr 0.000759 time 1.5312 (2.2170) loss 2.7202 (3.7032) grad_norm 1.1885 (1.2467) [2022-01-20 17:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][650/1251] eta 0:22:12 lr 0.000759 time 3.6357 (2.2175) loss 3.2829 (3.7012) grad_norm 1.2947 (1.2468) [2022-01-20 17:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][660/1251] eta 0:21:47 lr 0.000759 time 1.6441 (2.2130) loss 2.9225 (3.7050) grad_norm 1.2861 (1.2484) [2022-01-20 17:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][670/1251] eta 0:21:24 lr 0.000759 time 1.8630 (2.2111) loss 3.7603 (3.7043) grad_norm 1.2024 (1.2476) [2022-01-20 17:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][680/1251] eta 0:21:02 lr 0.000759 time 1.6501 (2.2106) loss 4.4606 (3.7063) grad_norm 1.1527 (1.2469) [2022-01-20 17:14:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][690/1251] eta 0:20:39 lr 0.000759 time 2.7670 (2.2098) loss 4.1112 (3.7083) grad_norm 1.1140 (1.2459) [2022-01-20 17:14:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][700/1251] eta 0:20:18 lr 0.000759 time 1.8656 (2.2110) loss 2.9084 (3.7111) grad_norm 1.0472 (1.2448) [2022-01-20 17:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][710/1251] eta 0:19:57 lr 0.000759 time 1.8753 (2.2128) loss 2.7618 (3.7098) grad_norm 1.1353 (1.2446) [2022-01-20 17:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][720/1251] eta 0:19:35 lr 0.000759 time 2.3437 (2.2144) loss 3.5035 (3.7114) grad_norm 1.3245 (1.2442) [2022-01-20 17:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][730/1251] eta 0:19:14 lr 0.000759 time 3.3998 (2.2160) loss 4.0738 (3.7122) grad_norm 1.2384 (1.2440) [2022-01-20 17:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][740/1251] eta 0:18:50 lr 0.000759 time 
1.9842 (2.2131) loss 3.3815 (3.7148) grad_norm 1.1582 (1.2447) [2022-01-20 17:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][750/1251] eta 0:18:27 lr 0.000759 time 2.1783 (2.2101) loss 3.0363 (3.7131) grad_norm 1.3042 (1.2443) [2022-01-20 17:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][760/1251] eta 0:18:04 lr 0.000759 time 2.2358 (2.2087) loss 3.1663 (3.7091) grad_norm 1.5121 (1.2445) [2022-01-20 17:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][770/1251] eta 0:17:42 lr 0.000759 time 2.1326 (2.2086) loss 4.0054 (3.7093) grad_norm 1.1576 (1.2446) [2022-01-20 17:17:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][780/1251] eta 0:17:19 lr 0.000759 time 2.5164 (2.2081) loss 4.0107 (3.7112) grad_norm 1.3232 (1.2459) [2022-01-20 17:18:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][790/1251] eta 0:16:57 lr 0.000759 time 2.5572 (2.2071) loss 4.5441 (3.7153) grad_norm 1.2192 (1.2476) [2022-01-20 17:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][800/1251] eta 0:16:34 lr 0.000759 time 1.9251 (2.2060) loss 3.1881 (3.7141) grad_norm 1.1116 (1.2471) [2022-01-20 17:18:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][810/1251] eta 0:16:13 lr 0.000759 time 1.9040 (2.2067) loss 3.7884 (3.7089) grad_norm 1.2648 (1.2481) [2022-01-20 17:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][820/1251] eta 0:15:51 lr 0.000759 time 2.3507 (2.2076) loss 3.2951 (3.7081) grad_norm 1.2239 (1.2491) [2022-01-20 17:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][830/1251] eta 0:15:29 lr 0.000758 time 2.5201 (2.2072) loss 4.0599 (3.7048) grad_norm 1.2175 (1.2490) [2022-01-20 17:20:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][840/1251] eta 0:15:07 lr 0.000758 time 2.7525 (2.2075) loss 2.6485 (3.7029) grad_norm 1.1195 (1.2488) [2022-01-20 17:20:25 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][850/1251] eta 0:14:45 lr 0.000758 time 1.6901 (2.2072) loss 4.1175 (3.7026) grad_norm 1.3158 (1.2490) [2022-01-20 17:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][860/1251] eta 0:14:23 lr 0.000758 time 2.4984 (2.2076) loss 4.3095 (3.7048) grad_norm 1.2469 (1.2489) [2022-01-20 17:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][870/1251] eta 0:14:01 lr 0.000758 time 2.0768 (2.2075) loss 3.7467 (3.7032) grad_norm 1.1168 (1.2475) [2022-01-20 17:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][880/1251] eta 0:13:38 lr 0.000758 time 1.5923 (2.2068) loss 4.0586 (3.7056) grad_norm 1.3648 (1.2483) [2022-01-20 17:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][890/1251] eta 0:13:16 lr 0.000758 time 1.9523 (2.2068) loss 4.0063 (3.7074) grad_norm 1.3006 (1.2479) [2022-01-20 17:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][900/1251] eta 0:12:54 lr 0.000758 time 2.1198 (2.2051) loss 3.3547 (3.7073) grad_norm 1.0169 (1.2472) [2022-01-20 17:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][910/1251] eta 0:12:31 lr 0.000758 time 2.6496 (2.2040) loss 4.2038 (3.7078) grad_norm 0.9805 (1.2462) [2022-01-20 17:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][920/1251] eta 0:12:09 lr 0.000758 time 1.8740 (2.2025) loss 4.6044 (3.7092) grad_norm 1.0182 (1.2456) [2022-01-20 17:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][930/1251] eta 0:11:46 lr 0.000758 time 1.6265 (2.2019) loss 3.9735 (3.7106) grad_norm 1.3504 (1.2446) [2022-01-20 17:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][940/1251] eta 0:11:24 lr 0.000758 time 2.3112 (2.2022) loss 2.8868 (3.7114) grad_norm 1.3280 (1.2444) [2022-01-20 17:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][950/1251] eta 0:11:03 lr 0.000758 time 
3.6812 (2.2047) loss 4.1410 (3.7149) grad_norm 1.1526 (1.2436) [2022-01-20 17:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][960/1251] eta 0:10:41 lr 0.000758 time 2.1921 (2.2044) loss 3.8468 (3.7154) grad_norm 1.2656 (1.2441) [2022-01-20 17:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][970/1251] eta 0:10:19 lr 0.000758 time 1.9133 (2.2032) loss 3.9571 (3.7144) grad_norm 1.4004 (1.2440) [2022-01-20 17:25:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][980/1251] eta 0:09:56 lr 0.000758 time 2.1566 (2.2016) loss 3.1087 (3.7158) grad_norm 1.1916 (1.2435) [2022-01-20 17:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][990/1251] eta 0:09:34 lr 0.000758 time 2.5049 (2.2007) loss 4.3292 (3.7144) grad_norm 1.2717 (1.2436) [2022-01-20 17:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1000/1251] eta 0:09:12 lr 0.000758 time 2.1916 (2.2000) loss 4.0164 (3.7145) grad_norm 1.1083 (1.2436) [2022-01-20 17:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1010/1251] eta 0:08:50 lr 0.000758 time 1.4538 (2.2005) loss 2.9003 (3.7142) grad_norm 1.1547 (1.2439) [2022-01-20 17:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1020/1251] eta 0:08:28 lr 0.000758 time 1.8865 (2.2012) loss 4.4873 (3.7153) grad_norm 1.1547 (1.2431) [2022-01-20 17:26:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1030/1251] eta 0:08:06 lr 0.000758 time 1.9520 (2.2025) loss 3.6869 (3.7148) grad_norm 1.3162 (1.2429) [2022-01-20 17:27:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1040/1251] eta 0:07:45 lr 0.000758 time 3.0666 (2.2046) loss 3.9005 (3.7168) grad_norm 1.3154 (1.2432) [2022-01-20 17:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1050/1251] eta 0:07:23 lr 0.000758 time 1.4655 (2.2050) loss 4.0510 (3.7179) grad_norm 1.6459 (1.2436) [2022-01-20 17:28:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1060/1251] eta 0:07:00 lr 0.000758 time 1.7411 (2.2036) loss 4.1602 (3.7183) grad_norm 1.1831 (1.2442) [2022-01-20 17:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1070/1251] eta 0:06:38 lr 0.000758 time 2.2022 (2.2015) loss 2.6453 (3.7172) grad_norm 1.1584 (1.2443) [2022-01-20 17:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1080/1251] eta 0:06:16 lr 0.000758 time 1.6302 (2.2006) loss 4.2308 (3.7196) grad_norm 1.2193 (1.2443) [2022-01-20 17:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1090/1251] eta 0:05:54 lr 0.000758 time 1.7916 (2.2006) loss 3.9071 (3.7189) grad_norm 1.4408 (1.2446) [2022-01-20 17:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1100/1251] eta 0:05:32 lr 0.000758 time 2.2445 (2.2022) loss 3.8371 (3.7184) grad_norm 1.3561 (1.2450) [2022-01-20 17:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1110/1251] eta 0:05:10 lr 0.000757 time 2.1941 (2.2026) loss 4.0538 (3.7174) grad_norm 1.3189 (1.2452) [2022-01-20 17:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1120/1251] eta 0:04:48 lr 0.000757 time 2.2080 (2.2047) loss 3.7354 (3.7189) grad_norm 1.2146 (1.2451) [2022-01-20 17:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1130/1251] eta 0:04:26 lr 0.000757 time 1.6044 (2.2039) loss 2.7514 (3.7181) grad_norm 1.2078 (1.2447) [2022-01-20 17:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1140/1251] eta 0:04:04 lr 0.000757 time 1.6000 (2.2021) loss 3.7259 (3.7174) grad_norm 1.1924 (1.2454) [2022-01-20 17:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1150/1251] eta 0:03:42 lr 0.000757 time 1.6344 (2.1997) loss 4.3087 (3.7181) grad_norm 1.2862 (1.2451) [2022-01-20 17:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1160/1251] eta 0:03:20 lr 
0.000757 time 1.9705 (2.2002) loss 4.1227 (3.7169) grad_norm 1.1367 (1.2449) [2022-01-20 17:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1170/1251] eta 0:02:58 lr 0.000757 time 2.4581 (2.1994) loss 4.6168 (3.7188) grad_norm 1.3244 (1.2447) [2022-01-20 17:32:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1180/1251] eta 0:02:36 lr 0.000757 time 1.9382 (2.1989) loss 2.5326 (3.7190) grad_norm 1.2249 (1.2443) [2022-01-20 17:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1190/1251] eta 0:02:14 lr 0.000757 time 1.8716 (2.1984) loss 4.1011 (3.7177) grad_norm 1.2732 (1.2444) [2022-01-20 17:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1200/1251] eta 0:01:52 lr 0.000757 time 2.2404 (2.1991) loss 2.7750 (3.7192) grad_norm 1.3626 (1.2440) [2022-01-20 17:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1210/1251] eta 0:01:30 lr 0.000757 time 2.4492 (2.1994) loss 3.8853 (3.7168) grad_norm 1.1158 (1.2440) [2022-01-20 17:33:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1220/1251] eta 0:01:08 lr 0.000757 time 2.5151 (2.2016) loss 3.5105 (3.7168) grad_norm 1.0930 (1.2440) [2022-01-20 17:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1230/1251] eta 0:00:46 lr 0.000757 time 2.5626 (2.2031) loss 3.2546 (3.7162) grad_norm 1.1067 (1.2435) [2022-01-20 17:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1240/1251] eta 0:00:24 lr 0.000757 time 1.8063 (2.2018) loss 4.3438 (3.7137) grad_norm 1.1963 (1.2430) [2022-01-20 17:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [98/300][1250/1251] eta 0:00:02 lr 0.000757 time 1.1909 (2.1957) loss 4.0004 (3.7160) grad_norm 1.1702 (1.2427) [2022-01-20 17:34:54 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 98 training takes 0:45:47 [2022-01-20 17:35:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.992 (17.992) Loss 
1.0703 (1.0703) Acc@1 73.730 (73.730) Acc@5 92.285 (92.285) [2022-01-20 17:35:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.325 (3.444) Loss 1.1786 (1.1107) Acc@1 73.438 (73.580) Acc@5 91.211 (92.312) [2022-01-20 17:35:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.597 (2.692) Loss 1.0629 (1.1004) Acc@1 75.879 (73.893) Acc@5 92.285 (92.229) [2022-01-20 17:36:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.295 (2.299) Loss 1.0627 (1.1028) Acc@1 73.730 (73.866) Acc@5 92.969 (92.210) [2022-01-20 17:36:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.805 (2.137) Loss 1.1058 (1.1116) Acc@1 73.828 (73.690) Acc@5 92.773 (92.126) [2022-01-20 17:36:29 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.618 Acc@5 92.116 [2022-01-20 17:36:29 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-01-20 17:36:29 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.88% [2022-01-20 17:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][0/1251] eta 7:29:56 lr 0.000757 time 21.5798 (21.5798) loss 2.9616 (2.9616) grad_norm 1.2396 (1.2396) [2022-01-20 17:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][10/1251] eta 1:23:34 lr 0.000757 time 1.8712 (4.0405) loss 3.8724 (3.3987) grad_norm 1.1150 (1.1627) [2022-01-20 17:37:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][20/1251] eta 1:04:04 lr 0.000757 time 1.3985 (3.1230) loss 4.0261 (3.6906) grad_norm 1.2398 (1.1914) [2022-01-20 17:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][30/1251] eta 0:56:07 lr 0.000757 time 1.9452 (2.7581) loss 3.3058 (3.7341) grad_norm 1.4269 (1.2144) [2022-01-20 17:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][40/1251] eta 0:55:09 lr 0.000757 time 5.5570 (2.7330) loss 3.8552 (3.7544) grad_norm 1.2529 (1.2061) 
[2022-01-20 17:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][50/1251] eta 0:52:40 lr 0.000757 time 1.4520 (2.6319) loss 3.7721 (3.7528) grad_norm 1.2952 (1.2157) [2022-01-20 17:39:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][60/1251] eta 0:50:56 lr 0.000757 time 1.9325 (2.5664) loss 3.7256 (3.6968) grad_norm 1.1889 (1.2158) [2022-01-20 17:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][70/1251] eta 0:49:56 lr 0.000757 time 2.1549 (2.5377) loss 3.3690 (3.7065) grad_norm 1.3920 (1.2314) [2022-01-20 17:39:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][80/1251] eta 0:48:50 lr 0.000757 time 3.9740 (2.5025) loss 3.0915 (3.7177) grad_norm 1.1812 (1.2351) [2022-01-20 17:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][90/1251] eta 0:47:23 lr 0.000757 time 1.6352 (2.4490) loss 3.4081 (3.7213) grad_norm 1.1979 (1.2385) [2022-01-20 17:40:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][100/1251] eta 0:46:04 lr 0.000757 time 1.9096 (2.4017) loss 2.6268 (3.6654) grad_norm 1.4692 (1.2489) [2022-01-20 17:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][110/1251] eta 0:45:23 lr 0.000757 time 2.2763 (2.3868) loss 2.8430 (3.6637) grad_norm 1.1671 (1.2486) [2022-01-20 17:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][120/1251] eta 0:44:54 lr 0.000757 time 2.8030 (2.3821) loss 2.9950 (3.6671) grad_norm 1.6062 (1.2514) [2022-01-20 17:41:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][130/1251] eta 0:44:16 lr 0.000757 time 2.1322 (2.3697) loss 2.6396 (3.6357) grad_norm 1.3159 (1.2526) [2022-01-20 17:42:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][140/1251] eta 0:43:39 lr 0.000756 time 1.8981 (2.3576) loss 3.1249 (3.6273) grad_norm 1.2501 (1.2506) [2022-01-20 17:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][150/1251] eta 0:43:03 lr 
0.000756 time 1.9779 (2.3467) loss 3.8801 (3.6263) grad_norm 1.1481 (1.2503) [2022-01-20 17:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][160/1251] eta 0:42:25 lr 0.000756 time 2.8948 (2.3336) loss 4.2783 (3.6490) grad_norm 1.3143 (1.2520) [2022-01-20 17:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][170/1251] eta 0:41:43 lr 0.000756 time 1.6265 (2.3157) loss 2.8434 (3.6561) grad_norm 1.4003 (1.2555) [2022-01-20 17:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][180/1251] eta 0:41:07 lr 0.000756 time 2.1593 (2.3043) loss 4.1321 (3.6568) grad_norm 1.2213 (1.2523) [2022-01-20 17:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][190/1251] eta 0:40:37 lr 0.000756 time 2.0388 (2.2975) loss 3.7266 (3.6720) grad_norm 1.1009 (1.2502) [2022-01-20 17:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][200/1251] eta 0:40:16 lr 0.000756 time 3.0700 (2.2996) loss 3.2379 (3.6587) grad_norm 1.1782 (1.2492) [2022-01-20 17:44:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][210/1251] eta 0:39:56 lr 0.000756 time 2.1503 (2.3020) loss 3.8449 (3.6631) grad_norm 1.1120 (1.2450) [2022-01-20 17:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][220/1251] eta 0:39:33 lr 0.000756 time 2.0507 (2.3021) loss 3.3499 (3.6557) grad_norm 1.3888 (1.2422) [2022-01-20 17:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][230/1251] eta 0:38:59 lr 0.000756 time 1.9222 (2.2915) loss 2.9153 (3.6546) grad_norm 1.2082 (1.2401) [2022-01-20 17:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][240/1251] eta 0:38:18 lr 0.000756 time 1.8530 (2.2737) loss 4.5423 (3.6709) grad_norm 1.1535 (1.2378) [2022-01-20 17:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][250/1251] eta 0:37:40 lr 0.000756 time 1.6907 (2.2585) loss 3.8655 (3.6731) grad_norm 1.3652 (1.2380) [2022-01-20 17:46:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][260/1251] eta 0:37:13 lr 0.000756 time 2.1829 (2.2537) loss 3.5535 (3.6835) grad_norm 1.0751 (1.2374) [2022-01-20 17:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][270/1251] eta 0:36:46 lr 0.000756 time 2.6038 (2.2495) loss 4.0701 (3.6817) grad_norm 1.3541 (1.2373) [2022-01-20 17:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][280/1251] eta 0:36:31 lr 0.000756 time 2.3418 (2.2566) loss 3.4810 (3.6770) grad_norm 1.1532 (1.2395) [2022-01-20 17:47:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][290/1251] eta 0:36:19 lr 0.000756 time 2.1660 (2.2683) loss 4.0438 (3.6758) grad_norm 1.2615 (1.2417) [2022-01-20 17:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][300/1251] eta 0:36:04 lr 0.000756 time 2.5309 (2.2756) loss 2.6764 (3.6673) grad_norm 1.0810 (1.2418) [2022-01-20 17:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][310/1251] eta 0:35:40 lr 0.000756 time 2.0755 (2.2752) loss 3.9076 (3.6722) grad_norm 1.2844 (1.2426) [2022-01-20 17:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][320/1251] eta 0:35:13 lr 0.000756 time 1.9726 (2.2698) loss 3.7895 (3.6728) grad_norm 1.1816 (1.2445) [2022-01-20 17:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][330/1251] eta 0:34:41 lr 0.000756 time 2.0656 (2.2598) loss 4.0143 (3.6747) grad_norm 1.3978 (1.2440) [2022-01-20 17:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][340/1251] eta 0:34:10 lr 0.000756 time 2.4336 (2.2513) loss 3.7442 (3.6807) grad_norm 1.1765 (1.2456) [2022-01-20 17:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][350/1251] eta 0:33:43 lr 0.000756 time 2.3433 (2.2453) loss 3.4196 (3.6849) grad_norm 1.1796 (1.2462) [2022-01-20 17:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][360/1251] eta 0:33:15 lr 0.000756 time 
1.8742 (2.2402) loss 3.9604 (3.6794) grad_norm 1.1528 (1.2473) [2022-01-20 17:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][370/1251] eta 0:32:52 lr 0.000756 time 1.5637 (2.2390) loss 3.9607 (3.6880) grad_norm 1.4162 (1.2497) [2022-01-20 17:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][380/1251] eta 0:32:31 lr 0.000756 time 2.3289 (2.2402) loss 3.9716 (3.6945) grad_norm 1.7197 (1.2545) [2022-01-20 17:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][390/1251] eta 0:32:07 lr 0.000756 time 1.6349 (2.2385) loss 4.3425 (3.7006) grad_norm 1.4130 (1.2565) [2022-01-20 17:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][400/1251] eta 0:31:44 lr 0.000756 time 2.8552 (2.2379) loss 3.4346 (3.7033) grad_norm 1.3341 (1.2580) [2022-01-20 17:51:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][410/1251] eta 0:31:23 lr 0.000756 time 2.0274 (2.2400) loss 3.8775 (3.7050) grad_norm 1.3215 (1.2590) [2022-01-20 17:52:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][420/1251] eta 0:31:04 lr 0.000755 time 2.7756 (2.2442) loss 3.4522 (3.7030) grad_norm 1.1469 (1.2579) [2022-01-20 17:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][430/1251] eta 0:30:42 lr 0.000755 time 1.8507 (2.2442) loss 3.9578 (3.7072) grad_norm 1.1790 (1.2566) [2022-01-20 17:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][440/1251] eta 0:30:19 lr 0.000755 time 2.8356 (2.2441) loss 4.1409 (3.7048) grad_norm 1.0993 (1.2570) [2022-01-20 17:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][450/1251] eta 0:29:56 lr 0.000755 time 1.6333 (2.2433) loss 3.1046 (3.7095) grad_norm 1.1833 (1.2566) [2022-01-20 17:53:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][460/1251] eta 0:29:34 lr 0.000755 time 3.2518 (2.2434) loss 4.4867 (3.7139) grad_norm 1.4149 (1.2584) [2022-01-20 17:54:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][470/1251] eta 0:29:07 lr 0.000755 time 1.5806 (2.2374) loss 2.8342 (3.7096) grad_norm 1.2900 (1.2583) [2022-01-20 17:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][480/1251] eta 0:28:42 lr 0.000755 time 1.8113 (2.2341) loss 3.9073 (3.7055) grad_norm 1.2697 (1.2581) [2022-01-20 17:54:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][490/1251] eta 0:28:20 lr 0.000755 time 2.2180 (2.2342) loss 2.5618 (3.7066) grad_norm 1.0586 (1.2579) [2022-01-20 17:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][500/1251] eta 0:28:02 lr 0.000755 time 3.3981 (2.2406) loss 2.8732 (3.7062) grad_norm 1.2719 (1.2569) [2022-01-20 17:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][510/1251] eta 0:27:39 lr 0.000755 time 1.7702 (2.2400) loss 2.7279 (3.7009) grad_norm 1.2581 (1.2567) [2022-01-20 17:55:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][520/1251] eta 0:27:15 lr 0.000755 time 1.9180 (2.2370) loss 2.4093 (3.7009) grad_norm 1.2285 (1.2575) [2022-01-20 17:56:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][530/1251] eta 0:26:50 lr 0.000755 time 1.9349 (2.2336) loss 4.1471 (3.6940) grad_norm 1.3369 (1.2582) [2022-01-20 17:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][540/1251] eta 0:26:28 lr 0.000755 time 2.7404 (2.2335) loss 4.0333 (3.6917) grad_norm 1.2067 (1.2580) [2022-01-20 17:57:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][550/1251] eta 0:26:06 lr 0.000755 time 2.2861 (2.2341) loss 2.7716 (3.6935) grad_norm 1.3887 (1.2578) [2022-01-20 17:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][560/1251] eta 0:25:45 lr 0.000755 time 1.9999 (2.2363) loss 3.9366 (3.6922) grad_norm 1.2337 (1.2584) [2022-01-20 17:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][570/1251] eta 0:25:22 lr 0.000755 time 
1.8930 (2.2357) loss 4.1119 (3.6937) grad_norm 1.4377 (1.2606) [2022-01-20 17:58:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][580/1251] eta 0:25:02 lr 0.000755 time 2.7830 (2.2387) loss 2.9087 (3.6923) grad_norm 1.2314 (1.2598) [2022-01-20 17:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][590/1251] eta 0:24:37 lr 0.000755 time 1.8402 (2.2350) loss 3.9406 (3.6916) grad_norm 1.2476 (1.2584) [2022-01-20 17:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][600/1251] eta 0:24:11 lr 0.000755 time 1.6061 (2.2291) loss 4.2392 (3.6938) grad_norm 1.2251 (1.2587) [2022-01-20 17:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][610/1251] eta 0:23:46 lr 0.000755 time 2.0856 (2.2256) loss 3.7794 (3.6949) grad_norm 1.1410 (1.2589) [2022-01-20 17:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][620/1251] eta 0:23:23 lr 0.000755 time 2.1988 (2.2242) loss 4.0745 (3.6917) grad_norm 1.5273 (1.2589) [2022-01-20 17:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][630/1251] eta 0:23:01 lr 0.000755 time 2.0508 (2.2239) loss 4.2021 (3.6894) grad_norm 1.2599 (1.2596) [2022-01-20 18:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][640/1251] eta 0:22:38 lr 0.000755 time 2.3022 (2.2235) loss 4.4815 (3.6928) grad_norm 1.0894 (1.2584) [2022-01-20 18:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][650/1251] eta 0:22:17 lr 0.000755 time 1.9157 (2.2255) loss 4.0447 (3.6962) grad_norm 1.1783 (1.2571) [2022-01-20 18:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][660/1251] eta 0:21:55 lr 0.000755 time 2.5635 (2.2261) loss 4.0829 (3.6968) grad_norm 1.1635 (1.2567) [2022-01-20 18:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][670/1251] eta 0:21:32 lr 0.000755 time 1.9063 (2.2239) loss 4.1775 (3.6948) grad_norm 1.2375 (1.2561) [2022-01-20 18:01:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][680/1251] eta 0:21:08 lr 0.000755 time 2.0233 (2.2222) loss 4.1771 (3.6934) grad_norm 1.1780 (1.2562) [2022-01-20 18:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][690/1251] eta 0:20:45 lr 0.000755 time 1.9462 (2.2199) loss 3.1112 (3.6906) grad_norm 1.2229 (1.2547) [2022-01-20 18:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][700/1251] eta 0:20:23 lr 0.000754 time 2.5724 (2.2200) loss 3.5625 (3.6937) grad_norm 1.4791 (1.2539) [2022-01-20 18:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][710/1251] eta 0:20:01 lr 0.000754 time 2.3123 (2.2208) loss 3.7413 (3.6900) grad_norm 1.2370 (1.2540) [2022-01-20 18:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][720/1251] eta 0:19:39 lr 0.000754 time 2.5042 (2.2204) loss 3.6211 (3.6889) grad_norm 1.2210 (1.2545) [2022-01-20 18:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][730/1251] eta 0:19:17 lr 0.000754 time 1.8485 (2.2217) loss 4.5272 (3.6881) grad_norm 1.1942 (1.2556) [2022-01-20 18:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][740/1251] eta 0:18:54 lr 0.000754 time 1.7468 (2.2201) loss 3.5440 (3.6899) grad_norm 1.3582 (1.2562) [2022-01-20 18:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][750/1251] eta 0:18:31 lr 0.000754 time 1.8677 (2.2178) loss 4.0050 (3.6942) grad_norm 1.1939 (1.2572) [2022-01-20 18:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][760/1251] eta 0:18:08 lr 0.000754 time 3.2684 (2.2174) loss 4.2068 (3.6948) grad_norm 1.1377 (1.2563) [2022-01-20 18:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][770/1251] eta 0:17:46 lr 0.000754 time 1.8860 (2.2180) loss 4.1882 (3.6923) grad_norm 1.2048 (1.2563) [2022-01-20 18:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][780/1251] eta 0:17:23 lr 0.000754 time 
1.6761 (2.2160) loss 4.4086 (3.6933) grad_norm 1.6622 (1.2587) [2022-01-20 18:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][790/1251] eta 0:17:01 lr 0.000754 time 1.7889 (2.2163) loss 4.3989 (3.6971) grad_norm 1.4457 (1.2594) [2022-01-20 18:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][800/1251] eta 0:16:40 lr 0.000754 time 3.7673 (2.2175) loss 4.1228 (3.6966) grad_norm 1.0861 (1.2600) [2022-01-20 18:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][810/1251] eta 0:16:17 lr 0.000754 time 2.2229 (2.2170) loss 3.7927 (3.7019) grad_norm 1.3848 (1.2605) [2022-01-20 18:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][820/1251] eta 0:15:54 lr 0.000754 time 1.6377 (2.2141) loss 4.4335 (3.7034) grad_norm 1.2402 (1.2597) [2022-01-20 18:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][830/1251] eta 0:15:31 lr 0.000754 time 1.8480 (2.2136) loss 3.7143 (3.7054) grad_norm 1.2672 (1.2589) [2022-01-20 18:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][840/1251] eta 0:15:10 lr 0.000754 time 2.7010 (2.2153) loss 3.1565 (3.7047) grad_norm 1.2957 (1.2590) [2022-01-20 18:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][850/1251] eta 0:14:47 lr 0.000754 time 1.5615 (2.2138) loss 3.2401 (3.7045) grad_norm 1.3149 (1.2582) [2022-01-20 18:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][860/1251] eta 0:14:24 lr 0.000754 time 1.7504 (2.2117) loss 3.9542 (3.7078) grad_norm 1.1781 (1.2575) [2022-01-20 18:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][870/1251] eta 0:14:02 lr 0.000754 time 1.6680 (2.2104) loss 2.6443 (3.7062) grad_norm 1.0559 (1.2560) [2022-01-20 18:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][880/1251] eta 0:13:40 lr 0.000754 time 1.6936 (2.2110) loss 4.5878 (3.7077) grad_norm 1.2610 (1.2558) [2022-01-20 18:09:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][890/1251] eta 0:13:17 lr 0.000754 time 1.8902 (2.2104) loss 3.6526 (3.7075) grad_norm 1.1808 (1.2553) [2022-01-20 18:09:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][900/1251] eta 0:12:55 lr 0.000754 time 2.5174 (2.2104) loss 3.1060 (3.7082) grad_norm 1.1511 (1.2551) [2022-01-20 18:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][910/1251] eta 0:12:33 lr 0.000754 time 1.9142 (2.2104) loss 2.8219 (3.7049) grad_norm 1.0361 (1.2540) [2022-01-20 18:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][920/1251] eta 0:12:11 lr 0.000754 time 1.6518 (2.2107) loss 4.0802 (3.7067) grad_norm 1.1742 (1.2533) [2022-01-20 18:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][930/1251] eta 0:11:49 lr 0.000754 time 1.8877 (2.2100) loss 3.8866 (3.7041) grad_norm 1.2063 (1.2530) [2022-01-20 18:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][940/1251] eta 0:11:26 lr 0.000754 time 2.1537 (2.2090) loss 4.1205 (3.7069) grad_norm 1.3688 (1.2532) [2022-01-20 18:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][950/1251] eta 0:11:04 lr 0.000754 time 2.4296 (2.2087) loss 4.5621 (3.7106) grad_norm 1.1493 (1.2533) [2022-01-20 18:11:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][960/1251] eta 0:10:42 lr 0.000754 time 1.8540 (2.2092) loss 2.7482 (3.7076) grad_norm 1.1122 (1.2530) [2022-01-20 18:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][970/1251] eta 0:10:20 lr 0.000754 time 2.0233 (2.2073) loss 3.9519 (3.7102) grad_norm 1.1621 (1.2524) [2022-01-20 18:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][980/1251] eta 0:09:58 lr 0.000753 time 2.1023 (2.2071) loss 4.5167 (3.7108) grad_norm 1.2760 (1.2520) [2022-01-20 18:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][990/1251] eta 0:09:35 lr 0.000753 time 
1.8302 (2.2064) loss 3.5506 (3.7113) grad_norm 1.4897 (1.2526) [2022-01-20 18:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1000/1251] eta 0:09:13 lr 0.000753 time 1.5221 (2.2067) loss 3.2286 (3.7124) grad_norm 1.0836 (1.2518) [2022-01-20 18:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1010/1251] eta 0:08:51 lr 0.000753 time 2.9119 (2.2074) loss 4.0046 (3.7132) grad_norm 1.0267 (1.2512) [2022-01-20 18:14:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1020/1251] eta 0:08:29 lr 0.000753 time 2.4512 (2.2077) loss 4.1046 (3.7137) grad_norm 1.2009 (1.2512) [2022-01-20 18:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1030/1251] eta 0:08:08 lr 0.000753 time 1.7018 (2.2084) loss 3.6896 (3.7143) grad_norm 1.1620 (1.2511) [2022-01-20 18:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1040/1251] eta 0:07:45 lr 0.000753 time 2.0611 (2.2083) loss 3.1730 (3.7141) grad_norm 1.4410 (1.2504) [2022-01-20 18:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1050/1251] eta 0:07:23 lr 0.000753 time 3.4063 (2.2077) loss 2.9229 (3.7126) grad_norm 1.2145 (1.2501) [2022-01-20 18:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1060/1251] eta 0:07:01 lr 0.000753 time 2.1687 (2.2068) loss 4.0207 (3.7111) grad_norm 1.3324 (1.2505) [2022-01-20 18:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1070/1251] eta 0:06:39 lr 0.000753 time 1.7905 (2.2079) loss 4.0505 (3.7100) grad_norm 1.3693 (1.2499) [2022-01-20 18:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1080/1251] eta 0:06:17 lr 0.000753 time 2.2729 (2.2077) loss 4.3285 (3.7096) grad_norm 1.1936 (1.2503) [2022-01-20 18:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1090/1251] eta 0:05:55 lr 0.000753 time 2.5439 (2.2073) loss 3.8458 (3.7120) grad_norm 1.2863 (1.2499) [2022-01-20 18:16:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1100/1251] eta 0:05:32 lr 0.000753 time 1.7108 (2.2049) loss 2.8484 (3.7120) grad_norm 1.2844 (1.2504) [2022-01-20 18:17:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1110/1251] eta 0:05:10 lr 0.000753 time 1.9417 (2.2035) loss 4.2066 (3.7098) grad_norm 1.4830 (1.2518) [2022-01-20 18:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1120/1251] eta 0:04:48 lr 0.000753 time 2.1496 (2.2036) loss 2.5654 (3.7086) grad_norm 1.2526 (1.2517) [2022-01-20 18:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1130/1251] eta 0:04:26 lr 0.000753 time 2.5164 (2.2053) loss 4.5395 (3.7109) grad_norm 1.3669 (1.2517) [2022-01-20 18:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1140/1251] eta 0:04:04 lr 0.000753 time 1.6587 (2.2067) loss 3.9301 (3.7111) grad_norm 1.0443 (1.2515) [2022-01-20 18:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1150/1251] eta 0:03:42 lr 0.000753 time 2.4742 (2.2077) loss 4.4049 (3.7116) grad_norm 1.1521 (1.2515) [2022-01-20 18:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1160/1251] eta 0:03:20 lr 0.000753 time 1.6585 (2.2054) loss 4.1570 (3.7110) grad_norm 1.2052 (1.2511) [2022-01-20 18:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1170/1251] eta 0:02:58 lr 0.000753 time 1.9883 (2.2036) loss 3.7152 (3.7112) grad_norm 1.0781 (1.2497) [2022-01-20 18:19:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1180/1251] eta 0:02:36 lr 0.000753 time 1.5804 (2.2012) loss 4.3288 (3.7129) grad_norm 1.3390 (1.2493) [2022-01-20 18:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1190/1251] eta 0:02:14 lr 0.000753 time 1.5203 (2.2005) loss 4.3774 (3.7145) grad_norm 1.1976 (1.2494) [2022-01-20 18:20:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1200/1251] eta 0:01:52 lr 
0.000753 time 1.8575 (2.2000) loss 3.8446 (3.7169) grad_norm 1.4140 (1.2500) [2022-01-20 18:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1210/1251] eta 0:01:30 lr 0.000753 time 1.8696 (2.2020) loss 4.1137 (3.7184) grad_norm 1.3026 (1.2502) [2022-01-20 18:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1220/1251] eta 0:01:08 lr 0.000753 time 2.2580 (2.2028) loss 3.7511 (3.7188) grad_norm 1.4597 (1.2501) [2022-01-20 18:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1230/1251] eta 0:00:46 lr 0.000753 time 2.1541 (2.2047) loss 4.3333 (3.7226) grad_norm 1.3400 (1.2502) [2022-01-20 18:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1240/1251] eta 0:00:24 lr 0.000753 time 1.7183 (2.2044) loss 3.6580 (3.7227) grad_norm 1.1059 (1.2505) [2022-01-20 18:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [99/300][1250/1251] eta 0:00:02 lr 0.000753 time 1.1833 (2.1981) loss 4.4907 (3.7218) grad_norm 1.2398 (1.2505) [2022-01-20 18:22:20 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 99 training takes 0:45:50 [2022-01-20 18:22:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.629 (18.629) Loss 1.1411 (1.1411) Acc@1 72.852 (72.852) Acc@5 92.090 (92.090) [2022-01-20 18:22:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.869 (3.361) Loss 1.1635 (1.1259) Acc@1 71.094 (73.580) Acc@5 92.676 (92.152) [2022-01-20 18:23:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 2.400 (2.602) Loss 1.1868 (1.1269) Acc@1 73.535 (73.744) Acc@5 91.016 (92.192) [2022-01-20 18:23:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.904 (2.271) Loss 1.1236 (1.1319) Acc@1 75.293 (73.740) Acc@5 91.797 (92.137) [2022-01-20 18:23:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.121 (2.174) Loss 1.1933 (1.1411) Acc@1 72.754 (73.540) Acc@5 91.406 (92.026) [2022-01-20 18:23:56 
swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.598 Acc@5 92.086 [2022-01-20 18:23:56 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-01-20 18:23:56 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.88% [2022-01-20 18:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][0/1251] eta 8:11:48 lr 0.000753 time 23.5883 (23.5883) loss 2.8116 (2.8116) grad_norm 1.4042 (1.4042) [2022-01-20 18:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][10/1251] eta 1:28:43 lr 0.000752 time 1.9948 (4.2900) loss 4.0054 (3.5784) grad_norm 1.2456 (1.2877) [2022-01-20 18:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][20/1251] eta 1:08:05 lr 0.000752 time 1.7565 (3.3189) loss 3.7076 (3.7180) grad_norm 1.3898 (1.2925) [2022-01-20 18:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][30/1251] eta 0:59:50 lr 0.000752 time 1.5787 (2.9406) loss 4.3817 (3.7886) grad_norm 1.1864 (1.2719) [2022-01-20 18:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][40/1251] eta 0:55:18 lr 0.000752 time 3.0059 (2.7400) loss 4.3803 (3.7534) grad_norm 1.2736 (1.2688) [2022-01-20 18:26:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][50/1251] eta 0:52:45 lr 0.000752 time 2.2860 (2.6358) loss 4.3755 (3.7783) grad_norm 1.2472 (1.2674) [2022-01-20 18:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][60/1251] eta 0:50:30 lr 0.000752 time 1.9182 (2.5449) loss 3.9196 (3.7452) grad_norm 1.1765 (1.2618) [2022-01-20 18:26:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][70/1251] eta 0:48:36 lr 0.000752 time 1.7380 (2.4696) loss 3.7464 (3.7353) grad_norm 1.2794 (1.2709) [2022-01-20 18:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][80/1251] eta 0:47:43 lr 0.000752 time 2.9323 (2.4449) loss 2.6473 (3.7088) grad_norm 1.3293 (1.2719) 
[2022-01-20 18:27:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][90/1251] eta 0:46:47 lr 0.000752 time 2.1324 (2.4183) loss 4.5059 (3.7460) grad_norm 1.3069 (1.2697) [2022-01-20 18:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][100/1251] eta 0:45:54 lr 0.000752 time 1.5031 (2.3932) loss 3.0315 (3.7381) grad_norm 1.2148 (1.2631) [2022-01-20 18:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][110/1251] eta 0:45:06 lr 0.000752 time 1.9368 (2.3722) loss 4.0689 (3.6967) grad_norm 1.1689 (1.2579) [2022-01-20 18:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][120/1251] eta 0:44:38 lr 0.000752 time 3.1195 (2.3683) loss 3.7713 (3.6888) grad_norm 1.1288 (1.2597) [2022-01-20 18:29:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][130/1251] eta 0:43:58 lr 0.000752 time 1.5442 (2.3536) loss 3.9310 (3.6772) grad_norm 1.3348 (1.2582) [2022-01-20 18:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][140/1251] eta 0:43:20 lr 0.000752 time 1.5112 (2.3404) loss 4.0945 (3.6485) grad_norm 1.3183 (1.2521) [2022-01-20 18:29:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][150/1251] eta 0:42:46 lr 0.000752 time 1.8087 (2.3315) loss 4.3747 (3.6740) grad_norm 1.3367 (1.2486) [2022-01-20 18:30:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][160/1251] eta 0:42:10 lr 0.000752 time 2.1487 (2.3197) loss 2.6161 (3.6551) grad_norm 1.1583 (1.2482) [2022-01-20 18:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][170/1251] eta 0:41:28 lr 0.000752 time 1.9888 (2.3023) loss 3.6178 (3.6507) grad_norm 1.2509 (1.2489) [2022-01-20 18:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][180/1251] eta 0:41:06 lr 0.000752 time 2.0642 (2.3029) loss 4.1275 (3.6611) grad_norm 1.1881 (1.2475) [2022-01-20 18:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][190/1251] 
eta 0:40:40 lr 0.000752 time 1.8695 (2.3005) loss 4.2989 (3.6710) grad_norm 1.2477 (1.2520) [2022-01-20 18:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][200/1251] eta 0:40:20 lr 0.000752 time 2.8516 (2.3033) loss 3.3197 (3.6656) grad_norm 1.3064 (1.2556) [2022-01-20 18:32:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][210/1251] eta 0:39:48 lr 0.000752 time 1.9800 (2.2949) loss 4.0357 (3.6718) grad_norm 1.2943 (1.2578) [2022-01-20 18:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][220/1251] eta 0:39:18 lr 0.000752 time 2.2326 (2.2872) loss 3.6990 (3.6832) grad_norm 1.8734 (1.2658) [2022-01-20 18:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][230/1251] eta 0:38:43 lr 0.000752 time 1.9826 (2.2755) loss 4.1525 (3.6790) grad_norm 1.3218 (1.2677) [2022-01-20 18:33:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][240/1251] eta 0:38:15 lr 0.000752 time 1.8999 (2.2709) loss 3.7695 (3.6932) grad_norm 1.1283 (1.2709) [2022-01-20 18:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][250/1251] eta 0:37:43 lr 0.000752 time 1.8254 (2.2609) loss 3.6213 (3.6830) grad_norm 1.2334 (1.2711) [2022-01-20 18:33:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][260/1251] eta 0:37:15 lr 0.000752 time 1.9015 (2.2557) loss 4.5214 (3.6832) grad_norm 1.3806 (1.2717) [2022-01-20 18:34:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][270/1251] eta 0:36:55 lr 0.000752 time 3.4710 (2.2582) loss 3.5616 (3.6787) grad_norm 1.1186 (1.2741) [2022-01-20 18:34:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][280/1251] eta 0:36:29 lr 0.000751 time 1.9452 (2.2554) loss 4.2563 (3.6787) grad_norm 1.1135 (1.2712) [2022-01-20 18:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][290/1251] eta 0:36:01 lr 0.000751 time 1.8351 (2.2490) loss 4.1844 (3.6833) grad_norm 1.1559 (1.2687) 
[2022-01-20 18:35:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][300/1251] eta 0:35:36 lr 0.000751 time 1.7890 (2.2467) loss 3.5311 (3.6755) grad_norm 1.2397 (1.2666) [2022-01-20 18:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][310/1251] eta 0:35:16 lr 0.000751 time 2.6093 (2.2490) loss 3.5687 (3.6808) grad_norm 1.2691 (1.2664) [2022-01-20 18:35:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][320/1251] eta 0:34:51 lr 0.000751 time 2.1441 (2.2464) loss 4.2068 (3.6846) grad_norm 1.1943 (1.2652) [2022-01-20 18:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][330/1251] eta 0:34:28 lr 0.000751 time 2.1394 (2.2457) loss 4.4994 (3.6877) grad_norm 1.1752 (1.2648) [2022-01-20 18:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][340/1251] eta 0:34:07 lr 0.000751 time 2.5436 (2.2471) loss 4.3310 (3.6945) grad_norm 1.1755 (1.2631) [2022-01-20 18:37:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][350/1251] eta 0:33:41 lr 0.000751 time 2.9376 (2.2438) loss 3.6543 (3.6950) grad_norm 1.1219 (1.2615) [2022-01-20 18:37:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][360/1251] eta 0:33:17 lr 0.000751 time 3.1026 (2.2422) loss 3.6990 (3.6945) grad_norm 1.1094 (1.2597) [2022-01-20 18:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][370/1251] eta 0:32:56 lr 0.000751 time 3.1850 (2.2430) loss 3.7179 (3.6937) grad_norm 1.3089 (1.2606) [2022-01-20 18:38:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][380/1251] eta 0:32:35 lr 0.000751 time 2.5450 (2.2452) loss 4.3445 (3.6895) grad_norm 1.2636 (1.2616) [2022-01-20 18:38:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][390/1251] eta 0:32:11 lr 0.000751 time 1.9185 (2.2439) loss 3.7345 (3.6869) grad_norm 1.3854 (1.2622) [2022-01-20 18:38:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][400/1251] 
eta 0:31:48 lr 0.000751 time 3.1012 (2.2429) loss 3.5717 (3.6915) grad_norm 1.2956 (1.2649) [2022-01-20 18:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][410/1251] eta 0:31:24 lr 0.000751 time 2.9110 (2.2412) loss 4.6021 (3.6922) grad_norm 1.2738 (1.2649) [2022-01-20 18:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][420/1251] eta 0:30:59 lr 0.000751 time 1.6418 (2.2377) loss 4.2162 (3.6929) grad_norm 1.1304 (1.2635) [2022-01-20 18:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][430/1251] eta 0:30:34 lr 0.000751 time 1.8573 (2.2341) loss 3.3254 (3.6929) grad_norm 1.1823 (1.2622) [2022-01-20 18:40:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][440/1251] eta 0:30:12 lr 0.000751 time 2.1169 (2.2348) loss 4.0310 (3.6925) grad_norm 1.1634 (1.2620) [2022-01-20 18:40:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][450/1251] eta 0:29:48 lr 0.000751 time 2.7700 (2.2333) loss 3.0979 (3.6889) grad_norm 1.1665 (1.2618) [2022-01-20 18:41:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][460/1251] eta 0:29:25 lr 0.000751 time 2.1265 (2.2326) loss 3.8252 (3.6921) grad_norm 1.2079 (1.2594) [2022-01-20 18:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][470/1251] eta 0:29:00 lr 0.000751 time 1.8732 (2.2283) loss 3.6904 (3.6980) grad_norm 1.3395 (1.2595) [2022-01-20 18:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][480/1251] eta 0:28:38 lr 0.000751 time 2.1642 (2.2287) loss 4.1585 (3.6953) grad_norm 1.1685 (1.2589) [2022-01-20 18:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][490/1251] eta 0:28:14 lr 0.000751 time 2.7111 (2.2271) loss 3.8358 (3.7036) grad_norm 1.3175 (1.2591) [2022-01-20 18:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][500/1251] eta 0:27:52 lr 0.000751 time 1.7565 (2.2272) loss 3.7973 (3.7048) grad_norm 1.1968 (1.2604) 
[2022-01-20 18:42:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][510/1251] eta 0:27:29 lr 0.000751 time 1.6808 (2.2265) loss 4.0259 (3.7039) grad_norm 1.1513 (1.2595) [2022-01-20 18:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][520/1251] eta 0:27:08 lr 0.000751 time 2.2436 (2.2281) loss 4.5749 (3.7054) grad_norm 1.2437 (1.2590) [2022-01-20 18:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][530/1251] eta 0:26:46 lr 0.000751 time 2.9824 (2.2285) loss 3.9242 (3.7081) grad_norm 1.1757 (1.2576) [2022-01-20 18:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][540/1251] eta 0:26:21 lr 0.000751 time 1.9268 (2.2240) loss 2.9679 (3.7105) grad_norm 1.3132 (1.2594) [2022-01-20 18:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][550/1251] eta 0:25:59 lr 0.000751 time 2.4970 (2.2253) loss 4.3764 (3.7123) grad_norm 1.1349 (1.2583) [2022-01-20 18:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][560/1251] eta 0:25:37 lr 0.000750 time 2.2956 (2.2244) loss 3.5849 (3.7131) grad_norm 1.2501 (1.2569) [2022-01-20 18:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][570/1251] eta 0:25:14 lr 0.000750 time 2.7148 (2.2235) loss 4.3463 (3.7147) grad_norm 1.2926 (1.2599) [2022-01-20 18:45:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][580/1251] eta 0:24:48 lr 0.000750 time 1.7024 (2.2189) loss 2.8638 (3.7177) grad_norm 1.2330 (1.2593) [2022-01-20 18:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][590/1251] eta 0:24:25 lr 0.000750 time 2.0503 (2.2169) loss 4.1671 (3.7194) grad_norm 1.1765 (1.2610) [2022-01-20 18:46:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][600/1251] eta 0:24:02 lr 0.000750 time 1.9916 (2.2156) loss 3.4975 (3.7176) grad_norm 1.2935 (1.2622) [2022-01-20 18:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][610/1251] 
eta 0:23:40 lr 0.000750 time 3.2567 (2.2154) loss 3.0437 (3.7125) grad_norm 1.2757 (1.2614) [2022-01-20 18:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][620/1251] eta 0:23:17 lr 0.000750 time 1.8798 (2.2152) loss 3.9983 (3.7131) grad_norm 1.2514 (1.2603) [2022-01-20 18:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][630/1251] eta 0:22:54 lr 0.000750 time 1.5523 (2.2136) loss 3.4779 (3.7114) grad_norm 1.5993 (1.2603) [2022-01-20 18:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][640/1251] eta 0:22:33 lr 0.000750 time 2.5756 (2.2151) loss 4.1373 (3.7095) grad_norm 1.2478 (1.2595) [2022-01-20 18:47:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][650/1251] eta 0:22:11 lr 0.000750 time 2.9340 (2.2162) loss 2.7076 (3.7102) grad_norm 1.2493 (1.2598) [2022-01-20 18:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][660/1251] eta 0:21:50 lr 0.000750 time 1.9210 (2.2169) loss 4.3898 (3.7155) grad_norm 1.2484 (1.2603) [2022-01-20 18:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][670/1251] eta 0:21:29 lr 0.000750 time 2.0346 (2.2188) loss 4.1460 (3.7197) grad_norm 1.1115 (1.2592) [2022-01-20 18:49:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][680/1251] eta 0:21:07 lr 0.000750 time 2.1147 (2.2194) loss 4.2458 (3.7193) grad_norm 1.2396 (1.2601) [2022-01-20 18:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][690/1251] eta 0:20:42 lr 0.000750 time 1.8649 (2.2150) loss 3.7950 (3.7201) grad_norm 1.5890 (1.2601) [2022-01-20 18:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][700/1251] eta 0:20:18 lr 0.000750 time 1.5912 (2.2121) loss 4.1815 (3.7238) grad_norm 1.2306 (1.2591) [2022-01-20 18:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][710/1251] eta 0:19:55 lr 0.000750 time 2.5650 (2.2104) loss 3.9250 (3.7234) grad_norm 1.1828 (1.2588) 
[2022-01-20 18:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][720/1251] eta 0:19:32 lr 0.000750 time 2.0008 (2.2090) loss 4.2329 (3.7208) grad_norm 1.3212 (1.2580) [2022-01-20 18:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][730/1251] eta 0:19:09 lr 0.000750 time 1.9528 (2.2067) loss 2.6747 (3.7212) grad_norm 1.1106 (1.2593) [2022-01-20 18:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][740/1251] eta 0:18:47 lr 0.000750 time 2.5197 (2.2066) loss 3.2242 (3.7198) grad_norm 1.1157 (1.2601) [2022-01-20 18:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][750/1251] eta 0:18:25 lr 0.000750 time 1.8682 (2.2067) loss 4.3114 (3.7184) grad_norm 1.3828 (1.2610) [2022-01-20 18:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][760/1251] eta 0:18:04 lr 0.000750 time 1.8504 (2.2087) loss 2.9290 (3.7161) grad_norm 1.4000 (1.2622) [2022-01-20 18:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][770/1251] eta 0:17:42 lr 0.000750 time 2.3053 (2.2092) loss 4.6048 (3.7173) grad_norm 1.1612 (1.2617) [2022-01-20 18:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][780/1251] eta 0:17:21 lr 0.000750 time 2.5356 (2.2107) loss 3.9202 (3.7142) grad_norm 1.1428 (1.2617) [2022-01-20 18:53:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][790/1251] eta 0:16:58 lr 0.000750 time 1.9040 (2.2095) loss 4.0561 (3.7149) grad_norm 1.2135 (1.2613) [2022-01-20 18:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][800/1251] eta 0:16:36 lr 0.000750 time 1.6075 (2.2088) loss 3.8951 (3.7140) grad_norm 1.2157 (1.2613) [2022-01-20 18:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][810/1251] eta 0:16:13 lr 0.000750 time 2.2705 (2.2084) loss 3.6435 (3.7120) grad_norm 1.2280 (1.2621) [2022-01-20 18:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][820/1251] 
eta 0:15:50 lr 0.000750 time 1.8762 (2.2063) loss 2.7037 (3.7109) grad_norm 1.3627 (1.2635) [2022-01-20 18:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][830/1251] eta 0:15:28 lr 0.000750 time 2.5420 (2.2062) loss 4.1609 (3.7065) grad_norm 1.1129 (1.2622) [2022-01-20 18:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][840/1251] eta 0:15:07 lr 0.000749 time 1.6085 (2.2072) loss 4.4425 (3.7090) grad_norm 1.1809 (1.2620) [2022-01-20 18:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][850/1251] eta 0:14:45 lr 0.000749 time 2.2604 (2.2086) loss 3.2701 (3.7080) grad_norm 1.2142 (1.2625) [2022-01-20 18:55:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][860/1251] eta 0:14:24 lr 0.000749 time 2.1837 (2.2097) loss 3.5296 (3.7080) grad_norm 1.1800 (1.2624) [2022-01-20 18:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][870/1251] eta 0:14:01 lr 0.000749 time 1.9110 (2.2074) loss 3.0125 (3.7056) grad_norm 1.1351 (1.2622) [2022-01-20 18:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][880/1251] eta 0:13:38 lr 0.000749 time 1.9638 (2.2050) loss 3.6447 (3.7070) grad_norm 1.2239 (1.2622) [2022-01-20 18:56:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][890/1251] eta 0:13:15 lr 0.000749 time 2.1699 (2.2039) loss 3.8475 (3.7071) grad_norm 1.2221 (1.2618) [2022-01-20 18:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][900/1251] eta 0:12:53 lr 0.000749 time 2.1879 (2.2040) loss 4.0524 (3.7098) grad_norm 1.1432 (1.2614) [2022-01-20 18:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][910/1251] eta 0:12:31 lr 0.000749 time 1.6671 (2.2042) loss 3.0600 (3.7085) grad_norm 1.1732 (1.2604) [2022-01-20 18:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][920/1251] eta 0:12:09 lr 0.000749 time 1.8966 (2.2041) loss 2.8346 (3.7075) grad_norm 1.0868 (1.2601) 
[2022-01-20 18:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][930/1251] eta 0:11:47 lr 0.000749 time 2.3313 (2.2036) loss 3.4682 (3.7084) grad_norm 1.1589 (1.2604) [2022-01-20 18:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][940/1251] eta 0:11:25 lr 0.000749 time 2.5904 (2.2040) loss 4.0815 (3.7121) grad_norm 1.3921 (1.2610) [2022-01-20 18:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][950/1251] eta 0:11:02 lr 0.000749 time 1.9306 (2.2017) loss 3.8108 (3.7124) grad_norm 1.1397 (1.2619) [2022-01-20 18:59:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][960/1251] eta 0:10:40 lr 0.000749 time 2.2197 (2.2003) loss 3.1371 (3.7108) grad_norm 1.1865 (1.2623) [2022-01-20 18:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][970/1251] eta 0:10:18 lr 0.000749 time 2.5405 (2.1998) loss 3.6359 (3.7122) grad_norm 1.4062 (1.2630) [2022-01-20 18:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][980/1251] eta 0:09:56 lr 0.000749 time 2.8537 (2.1996) loss 2.6454 (3.7109) grad_norm 1.2443 (1.2626) [2022-01-20 19:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][990/1251] eta 0:09:34 lr 0.000749 time 2.6957 (2.2002) loss 2.3565 (3.7085) grad_norm 1.1908 (1.2623) [2022-01-20 19:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1000/1251] eta 0:09:12 lr 0.000749 time 2.5755 (2.2011) loss 3.4142 (3.7077) grad_norm 1.1516 (1.2617) [2022-01-20 19:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1010/1251] eta 0:08:50 lr 0.000749 time 2.0779 (2.2030) loss 4.2004 (3.7062) grad_norm 1.1694 (1.2618) [2022-01-20 19:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1020/1251] eta 0:08:28 lr 0.000749 time 1.9214 (2.2028) loss 3.1781 (3.7046) grad_norm 1.2515 (1.2621) [2022-01-20 19:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[100/300][1030/1251] eta 0:08:06 lr 0.000749 time 2.4778 (2.2018) loss 3.1303 (3.7025) grad_norm 1.2109 (1.2619) [2022-01-20 19:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1040/1251] eta 0:07:44 lr 0.000749 time 1.9156 (2.1999) loss 4.3225 (3.7047) grad_norm 1.3973 (1.2616) [2022-01-20 19:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1050/1251] eta 0:07:21 lr 0.000749 time 1.8059 (2.1987) loss 3.8210 (3.7025) grad_norm 1.4962 (1.2620) [2022-01-20 19:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1060/1251] eta 0:06:59 lr 0.000749 time 1.8856 (2.1982) loss 3.8095 (3.7050) grad_norm 1.1792 (1.2615) [2022-01-20 19:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1070/1251] eta 0:06:37 lr 0.000749 time 1.8139 (2.1985) loss 3.9678 (3.7072) grad_norm 1.1103 (1.2611) [2022-01-20 19:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1080/1251] eta 0:06:15 lr 0.000749 time 1.9054 (2.1985) loss 3.9605 (3.7089) grad_norm 1.1671 (1.2609) [2022-01-20 19:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1090/1251] eta 0:05:54 lr 0.000749 time 2.3437 (2.2008) loss 3.0368 (3.7128) grad_norm 1.2295 (1.2601) [2022-01-20 19:04:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1100/1251] eta 0:05:32 lr 0.000749 time 2.4158 (2.2018) loss 3.2987 (3.7116) grad_norm 1.2708 (1.2595) [2022-01-20 19:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1110/1251] eta 0:05:10 lr 0.000749 time 2.3351 (2.2006) loss 4.1296 (3.7112) grad_norm 1.2286 (1.2588) [2022-01-20 19:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1120/1251] eta 0:04:48 lr 0.000748 time 1.5903 (2.1988) loss 2.5525 (3.7104) grad_norm 1.4384 (1.2586) [2022-01-20 19:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1130/1251] eta 0:04:26 lr 0.000748 time 3.1862 (2.1986) loss 4.0650 (3.7125) 
grad_norm 1.3301 (1.2590) [2022-01-20 19:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1140/1251] eta 0:04:03 lr 0.000748 time 2.0158 (2.1974) loss 3.6908 (3.7138) grad_norm 1.6491 (1.2602) [2022-01-20 19:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1150/1251] eta 0:03:41 lr 0.000748 time 1.9442 (2.1968) loss 3.1736 (3.7112) grad_norm 1.2002 (1.2599) [2022-01-20 19:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1160/1251] eta 0:03:20 lr 0.000748 time 2.1044 (2.1978) loss 3.0571 (3.7099) grad_norm 1.1111 (1.2597) [2022-01-20 19:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1170/1251] eta 0:02:58 lr 0.000748 time 2.4994 (2.1987) loss 2.5295 (3.7097) grad_norm 1.2199 (1.2589) [2022-01-20 19:07:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1180/1251] eta 0:02:36 lr 0.000748 time 2.1471 (2.1990) loss 4.0233 (3.7101) grad_norm 1.3002 (1.2591) [2022-01-20 19:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1190/1251] eta 0:02:14 lr 0.000748 time 1.9215 (2.1980) loss 3.0091 (3.7085) grad_norm 1.1227 (1.2593) [2022-01-20 19:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1200/1251] eta 0:01:52 lr 0.000748 time 2.1285 (2.1977) loss 3.0591 (3.7089) grad_norm 1.2349 (1.2596) [2022-01-20 19:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1210/1251] eta 0:01:30 lr 0.000748 time 2.2606 (2.1973) loss 4.1530 (3.7109) grad_norm 1.2967 (1.2593) [2022-01-20 19:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1220/1251] eta 0:01:08 lr 0.000748 time 2.2061 (2.1977) loss 2.8789 (3.7087) grad_norm 1.4512 (1.2588) [2022-01-20 19:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1230/1251] eta 0:00:46 lr 0.000748 time 1.8478 (2.1971) loss 3.6302 (3.7078) grad_norm 1.2944 (1.2581) [2022-01-20 19:09:22 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [100/300][1240/1251] eta 0:00:24 lr 0.000748 time 2.0834 (2.1971) loss 3.6996 (3.7081) grad_norm 1.7163 (1.2589) [2022-01-20 19:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [100/300][1250/1251] eta 0:00:02 lr 0.000748 time 1.1874 (2.1911) loss 3.6536 (3.7088) grad_norm 1.1312 (1.2582) [2022-01-20 19:09:37 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 100 training takes 0:45:41 [2022-01-20 19:09:37 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_100 saving...... [2022-01-20 19:09:48 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_100 saved !!! [2022-01-20 19:10:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.990 (15.990) Loss 1.1089 (1.1089) Acc@1 74.902 (74.902) Acc@5 92.773 (92.773) [2022-01-20 19:10:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.951 (2.910) Loss 1.1033 (1.1430) Acc@1 73.340 (73.304) Acc@5 93.457 (92.116) [2022-01-20 19:10:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.228 (2.432) Loss 1.1867 (1.1334) Acc@1 74.707 (73.670) Acc@5 90.527 (92.225) [2022-01-20 19:10:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.930 (2.143) Loss 1.0401 (1.1371) Acc@1 76.270 (73.617) Acc@5 93.066 (92.128) [2022-01-20 19:11:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 0.841 (2.025) Loss 1.1538 (1.1335) Acc@1 73.633 (73.695) Acc@5 91.992 (92.157) [2022-01-20 19:11:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.638 Acc@5 92.150 [2022-01-20 19:11:20 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-01-20 19:11:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.88% [2022-01-20 19:11:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][0/1251] eta 7:28:33 lr 0.000748 time 21.5133 
(21.5133) loss 3.1974 (3.1974) grad_norm 1.1067 (1.1067) [2022-01-20 19:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][10/1251] eta 1:25:58 lr 0.000748 time 2.8006 (4.1569) loss 4.3138 (3.5698) grad_norm 1.3169 (1.2148) [2022-01-20 19:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][20/1251] eta 1:06:56 lr 0.000748 time 1.4429 (3.2632) loss 3.7703 (3.4982) grad_norm 1.2111 (1.2368) [2022-01-20 19:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][30/1251] eta 0:59:41 lr 0.000748 time 1.8854 (2.9333) loss 3.5078 (3.5049) grad_norm 1.1829 (1.2767) [2022-01-20 19:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][40/1251] eta 0:56:46 lr 0.000748 time 3.7266 (2.8129) loss 2.9400 (3.6032) grad_norm 1.1952 (1.3014) [2022-01-20 19:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][50/1251] eta 0:55:03 lr 0.000748 time 2.7711 (2.7503) loss 4.1055 (3.6170) grad_norm 1.2897 (1.2803) [2022-01-20 19:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][60/1251] eta 0:52:56 lr 0.000748 time 1.9281 (2.6669) loss 3.8075 (3.6443) grad_norm 1.2299 (1.2642) [2022-01-20 19:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][70/1251] eta 0:50:59 lr 0.000748 time 1.9617 (2.5909) loss 2.5425 (3.6754) grad_norm 1.1892 (1.2669) [2022-01-20 19:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][80/1251] eta 0:49:34 lr 0.000748 time 3.0278 (2.5401) loss 3.4494 (3.6948) grad_norm 1.2734 (1.2614) [2022-01-20 19:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][90/1251] eta 0:47:55 lr 0.000748 time 1.8430 (2.4770) loss 3.1110 (3.6959) grad_norm 1.0861 (1.2632) [2022-01-20 19:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][100/1251] eta 0:46:39 lr 0.000748 time 1.6864 (2.4318) loss 4.0503 (3.7058) grad_norm 1.1865 (1.2585) [2022-01-20 19:15:46 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [101/300][110/1251] eta 0:45:40 lr 0.000748 time 1.7960 (2.4015) loss 4.0379 (3.7093) grad_norm 1.1159 (1.2578) [2022-01-20 19:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][120/1251] eta 0:44:51 lr 0.000748 time 2.1413 (2.3799) loss 2.6422 (3.7165) grad_norm 1.1992 (1.2613) [2022-01-20 19:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][130/1251] eta 0:44:17 lr 0.000748 time 1.8665 (2.3704) loss 3.6609 (3.7160) grad_norm 1.4359 (1.2611) [2022-01-20 19:16:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][140/1251] eta 0:43:57 lr 0.000747 time 1.5947 (2.3736) loss 3.2490 (3.7139) grad_norm 1.1252 (1.2532) [2022-01-20 19:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][150/1251] eta 0:43:32 lr 0.000747 time 2.4659 (2.3724) loss 3.7801 (3.7171) grad_norm 1.1614 (1.2512) [2022-01-20 19:17:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][160/1251] eta 0:43:13 lr 0.000747 time 2.1688 (2.3767) loss 3.8757 (3.7251) grad_norm 1.2041 (1.2504) [2022-01-20 19:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][170/1251] eta 0:42:31 lr 0.000747 time 1.7615 (2.3600) loss 4.3791 (3.7195) grad_norm 1.2539 (1.2454) [2022-01-20 19:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][180/1251] eta 0:41:45 lr 0.000747 time 1.6541 (2.3397) loss 3.5292 (3.7228) grad_norm 1.6652 (1.2472) [2022-01-20 19:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][190/1251] eta 0:41:03 lr 0.000747 time 1.9056 (2.3218) loss 3.7504 (3.7161) grad_norm 1.0863 (1.2461) [2022-01-20 19:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][200/1251] eta 0:40:33 lr 0.000747 time 1.8973 (2.3158) loss 3.6560 (3.7169) grad_norm 1.2711 (1.2467) [2022-01-20 19:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][210/1251] eta 0:40:10 lr 0.000747 time 1.8811 (2.3158) loss 
2.7130 (3.7093) grad_norm 1.2187 (1.2478) [2022-01-20 19:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][220/1251] eta 0:39:39 lr 0.000747 time 1.9489 (2.3080) loss 3.2221 (3.7160) grad_norm 1.3502 (1.2507) [2022-01-20 19:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][230/1251] eta 0:39:07 lr 0.000747 time 2.1697 (2.2996) loss 2.9553 (3.7220) grad_norm 1.7472 (1.2587) [2022-01-20 19:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][240/1251] eta 0:38:35 lr 0.000747 time 1.4883 (2.2899) loss 3.6778 (3.7102) grad_norm 1.2136 (1.2604) [2022-01-20 19:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][250/1251] eta 0:38:06 lr 0.000747 time 3.0283 (2.2846) loss 3.1447 (3.6971) grad_norm 1.4289 (1.2606) [2022-01-20 19:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][260/1251] eta 0:37:43 lr 0.000747 time 2.2608 (2.2837) loss 4.3320 (3.6979) grad_norm 1.5131 (1.2618) [2022-01-20 19:21:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][270/1251] eta 0:37:17 lr 0.000747 time 1.8829 (2.2812) loss 3.7107 (3.6973) grad_norm 1.0275 (1.2632) [2022-01-20 19:22:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][280/1251] eta 0:36:51 lr 0.000747 time 2.2841 (2.2775) loss 3.6267 (3.7019) grad_norm 1.1590 (1.2612) [2022-01-20 19:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][290/1251] eta 0:36:26 lr 0.000747 time 3.3315 (2.2749) loss 3.2214 (3.6973) grad_norm 1.1963 (1.2604) [2022-01-20 19:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][300/1251] eta 0:35:59 lr 0.000747 time 1.9532 (2.2709) loss 3.9635 (3.6969) grad_norm 1.2959 (1.2593) [2022-01-20 19:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][310/1251] eta 0:35:38 lr 0.000747 time 2.3725 (2.2723) loss 4.2429 (3.6967) grad_norm 1.3362 (1.2603) [2022-01-20 19:23:30 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [101/300][320/1251] eta 0:35:17 lr 0.000747 time 2.4794 (2.2748) loss 3.5262 (3.6941) grad_norm 1.2563 (1.2594) [2022-01-20 19:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][330/1251] eta 0:34:50 lr 0.000747 time 2.4843 (2.2697) loss 4.1858 (3.7015) grad_norm 1.4385 (1.2596) [2022-01-20 19:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][340/1251] eta 0:34:21 lr 0.000747 time 1.7147 (2.2624) loss 3.9930 (3.7040) grad_norm 1.1664 (1.2583) [2022-01-20 19:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][350/1251] eta 0:33:53 lr 0.000747 time 2.5694 (2.2567) loss 3.6743 (3.7005) grad_norm 1.6084 (1.2581) [2022-01-20 19:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][360/1251] eta 0:33:28 lr 0.000747 time 2.3048 (2.2538) loss 3.2650 (3.6954) grad_norm 1.1992 (1.2581) [2022-01-20 19:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][370/1251] eta 0:33:06 lr 0.000747 time 2.7556 (2.2548) loss 4.1359 (3.6969) grad_norm 1.1759 (1.2588) [2022-01-20 19:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][380/1251] eta 0:32:43 lr 0.000747 time 1.5635 (2.2541) loss 4.1461 (3.6946) grad_norm 1.4087 (1.2607) [2022-01-20 19:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][390/1251] eta 0:32:24 lr 0.000747 time 2.9954 (2.2580) loss 2.9884 (3.6902) grad_norm 1.4332 (1.2633) [2022-01-20 19:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][400/1251] eta 0:31:59 lr 0.000747 time 1.6537 (2.2552) loss 3.1203 (3.6937) grad_norm 1.0545 (1.2637) [2022-01-20 19:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][410/1251] eta 0:31:38 lr 0.000747 time 2.5773 (2.2572) loss 3.9272 (3.6963) grad_norm 1.5655 (1.2630) [2022-01-20 19:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][420/1251] eta 0:31:10 lr 0.000746 time 1.6254 (2.2507) loss 
3.9953 (3.6895) grad_norm 1.3816 (1.2643) [2022-01-20 19:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][430/1251] eta 0:30:49 lr 0.000746 time 2.6745 (2.2524) loss 3.2214 (3.6918) grad_norm 1.8863 (1.2657) [2022-01-20 19:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][440/1251] eta 0:30:23 lr 0.000746 time 2.2551 (2.2480) loss 4.3812 (3.6942) grad_norm 1.5538 (1.2671) [2022-01-20 19:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][450/1251] eta 0:29:59 lr 0.000746 time 2.8653 (2.2467) loss 3.1943 (3.6914) grad_norm 1.1236 (1.2659) [2022-01-20 19:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][460/1251] eta 0:29:35 lr 0.000746 time 1.8854 (2.2444) loss 3.8675 (3.6953) grad_norm 1.4756 (1.2649) [2022-01-20 19:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][470/1251] eta 0:29:11 lr 0.000746 time 2.5245 (2.2430) loss 3.4720 (3.6916) grad_norm 1.2097 (1.2650) [2022-01-20 19:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][480/1251] eta 0:28:47 lr 0.000746 time 2.0041 (2.2407) loss 4.1423 (3.6913) grad_norm 1.1964 (1.2650) [2022-01-20 19:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][490/1251] eta 0:28:22 lr 0.000746 time 2.1238 (2.2373) loss 4.0169 (3.6898) grad_norm 1.1705 (1.2651) [2022-01-20 19:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][500/1251] eta 0:27:57 lr 0.000746 time 1.9875 (2.2342) loss 2.8469 (3.6889) grad_norm 1.0552 (1.2647) [2022-01-20 19:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][510/1251] eta 0:27:36 lr 0.000746 time 2.5237 (2.2354) loss 4.0555 (3.6921) grad_norm 1.2592 (1.2649) [2022-01-20 19:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][520/1251] eta 0:27:13 lr 0.000746 time 2.1069 (2.2344) loss 2.7255 (3.6907) grad_norm 1.7777 (1.2664) [2022-01-20 19:31:07 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [101/300][530/1251] eta 0:26:51 lr 0.000746 time 2.5223 (2.2351) loss 2.6099 (3.6911) grad_norm 1.5311 (1.2663) [2022-01-20 19:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][540/1251] eta 0:26:27 lr 0.000746 time 1.9145 (2.2326) loss 4.3277 (3.6921) grad_norm 1.2830 (1.2667) [2022-01-20 19:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][550/1251] eta 0:26:06 lr 0.000746 time 3.2179 (2.2344) loss 3.6803 (3.6950) grad_norm 1.1182 (1.2655) [2022-01-20 19:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][560/1251] eta 0:25:42 lr 0.000746 time 2.5614 (2.2321) loss 3.4022 (3.6985) grad_norm 1.1395 (1.2654) [2022-01-20 19:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][570/1251] eta 0:25:16 lr 0.000746 time 2.5010 (2.2275) loss 3.5276 (3.7025) grad_norm 1.2477 (1.2650) [2022-01-20 19:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][580/1251] eta 0:24:54 lr 0.000746 time 1.7762 (2.2274) loss 4.2376 (3.6977) grad_norm 1.3621 (1.2643) [2022-01-20 19:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][590/1251] eta 0:24:34 lr 0.000746 time 2.6642 (2.2300) loss 4.1982 (3.7015) grad_norm 1.4120 (1.2641) [2022-01-20 19:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][600/1251] eta 0:24:13 lr 0.000746 time 2.4703 (2.2331) loss 3.5860 (3.7003) grad_norm 1.0745 (1.2646) [2022-01-20 19:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][610/1251] eta 0:23:49 lr 0.000746 time 1.9299 (2.2307) loss 4.6079 (3.7011) grad_norm 1.3963 (1.2639) [2022-01-20 19:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][620/1251] eta 0:23:25 lr 0.000746 time 2.0330 (2.2267) loss 4.1412 (3.7026) grad_norm 1.3357 (1.2632) [2022-01-20 19:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][630/1251] eta 0:23:03 lr 0.000746 time 2.6535 (2.2275) loss 
3.2930 (3.7016) grad_norm 1.2083 (1.2646) [2022-01-20 19:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][640/1251] eta 0:22:40 lr 0.000746 time 2.7537 (2.2271) loss 4.1197 (3.7012) grad_norm 1.3751 (1.2643) [2022-01-20 19:35:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][650/1251] eta 0:22:17 lr 0.000746 time 2.1263 (2.2259) loss 4.2387 (3.7011) grad_norm 1.0449 (1.2637) [2022-01-20 19:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][660/1251] eta 0:21:54 lr 0.000746 time 2.1795 (2.2236) loss 3.2762 (3.6993) grad_norm 1.3388 (1.2639) [2022-01-20 19:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][670/1251] eta 0:21:30 lr 0.000746 time 2.8298 (2.2220) loss 3.4134 (3.6942) grad_norm 1.2397 (1.2648) [2022-01-20 19:36:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][680/1251] eta 0:21:09 lr 0.000746 time 3.5144 (2.2233) loss 3.4259 (3.6941) grad_norm 1.3385 (1.2659) [2022-01-20 19:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][690/1251] eta 0:20:47 lr 0.000746 time 2.5362 (2.2231) loss 4.0054 (3.6956) grad_norm 1.2742 (1.2648) [2022-01-20 19:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][700/1251] eta 0:20:25 lr 0.000745 time 1.9492 (2.2239) loss 4.1289 (3.6990) grad_norm 1.3409 (1.2639) [2022-01-20 19:37:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][710/1251] eta 0:20:02 lr 0.000745 time 2.7254 (2.2236) loss 4.0135 (3.7021) grad_norm 1.2190 (1.2637) [2022-01-20 19:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][720/1251] eta 0:19:39 lr 0.000745 time 1.8094 (2.2222) loss 4.0051 (3.7031) grad_norm 1.1569 (1.2630) [2022-01-20 19:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][730/1251] eta 0:19:16 lr 0.000745 time 2.4495 (2.2199) loss 3.7055 (3.7031) grad_norm 1.1443 (1.2619) [2022-01-20 19:38:42 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [101/300][740/1251] eta 0:18:52 lr 0.000745 time 1.7760 (2.2166) loss 2.4837 (3.7026) grad_norm 1.1769 (1.2627) [2022-01-20 19:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][750/1251] eta 0:18:29 lr 0.000745 time 2.2834 (2.2155) loss 3.8445 (3.7028) grad_norm 1.2476 (1.2637) [2022-01-20 19:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][760/1251] eta 0:18:07 lr 0.000745 time 2.6990 (2.2152) loss 3.7512 (3.7045) grad_norm 1.3395 (1.2638) [2022-01-20 19:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][770/1251] eta 0:17:45 lr 0.000745 time 2.5451 (2.2161) loss 3.4561 (3.7070) grad_norm 1.1975 (1.2629) [2022-01-20 19:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][780/1251] eta 0:17:23 lr 0.000745 time 2.3399 (2.2157) loss 4.1069 (3.7099) grad_norm 1.0865 (1.2622) [2022-01-20 19:40:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][790/1251] eta 0:17:01 lr 0.000745 time 1.8420 (2.2149) loss 3.9438 (3.7125) grad_norm 1.1496 (1.2621) [2022-01-20 19:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][800/1251] eta 0:16:38 lr 0.000745 time 2.1860 (2.2135) loss 3.9024 (3.7090) grad_norm 1.1717 (1.2609) [2022-01-20 19:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][810/1251] eta 0:16:15 lr 0.000745 time 2.3369 (2.2111) loss 3.8586 (3.7073) grad_norm 1.0917 (1.2599) [2022-01-20 19:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][820/1251] eta 0:15:53 lr 0.000745 time 1.8182 (2.2130) loss 3.2414 (3.7071) grad_norm 1.4584 (1.2603) [2022-01-20 19:42:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][830/1251] eta 0:15:32 lr 0.000745 time 2.2265 (2.2150) loss 3.1611 (3.7070) grad_norm 1.2095 (1.2605) [2022-01-20 19:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][840/1251] eta 0:15:11 lr 0.000745 time 2.0735 (2.2171) loss 
2.7762 (3.7074) grad_norm 1.3724 (1.2619) [2022-01-20 19:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][850/1251] eta 0:14:49 lr 0.000745 time 2.2592 (2.2174) loss 4.2154 (3.7085) grad_norm 1.2706 (1.2625) [2022-01-20 19:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][860/1251] eta 0:14:26 lr 0.000745 time 1.5396 (2.2164) loss 3.6132 (3.7103) grad_norm 1.0957 (1.2614) [2022-01-20 19:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][870/1251] eta 0:14:03 lr 0.000745 time 1.5925 (2.2138) loss 4.3193 (3.7095) grad_norm 1.1868 (1.2606) [2022-01-20 19:43:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][880/1251] eta 0:13:41 lr 0.000745 time 2.2893 (2.2134) loss 4.3717 (3.7080) grad_norm 1.1800 (1.2613) [2022-01-20 19:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][890/1251] eta 0:13:18 lr 0.000745 time 2.0785 (2.2129) loss 4.0668 (3.7085) grad_norm 1.0999 (1.2614) [2022-01-20 19:44:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][900/1251] eta 0:12:56 lr 0.000745 time 1.5135 (2.2122) loss 3.6151 (3.7086) grad_norm 1.1478 (1.2610) [2022-01-20 19:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][910/1251] eta 0:12:34 lr 0.000745 time 2.4406 (2.2117) loss 2.7160 (3.7075) grad_norm 1.4646 (1.2614) [2022-01-20 19:45:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][920/1251] eta 0:12:11 lr 0.000745 time 1.8961 (2.2109) loss 3.4579 (3.7082) grad_norm 1.2063 (1.2616) [2022-01-20 19:45:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][930/1251] eta 0:11:49 lr 0.000745 time 2.2081 (2.2108) loss 2.3162 (3.7049) grad_norm 1.1814 (1.2608) [2022-01-20 19:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][940/1251] eta 0:11:27 lr 0.000745 time 1.9222 (2.2103) loss 3.5316 (3.7075) grad_norm 1.2237 (1.2601) [2022-01-20 19:46:23 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [101/300][950/1251] eta 0:11:05 lr 0.000745 time 2.5808 (2.2117) loss 2.4833 (3.7078) grad_norm 1.1681 (1.2604) [2022-01-20 19:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][960/1251] eta 0:10:43 lr 0.000745 time 2.4726 (2.2114) loss 3.3686 (3.7078) grad_norm 1.1172 (1.2594) [2022-01-20 19:47:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][970/1251] eta 0:10:21 lr 0.000744 time 1.8553 (2.2102) loss 3.7989 (3.7090) grad_norm 1.2194 (1.2589) [2022-01-20 19:47:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][980/1251] eta 0:09:58 lr 0.000744 time 2.4703 (2.2103) loss 3.9968 (3.7096) grad_norm 1.1709 (1.2585) [2022-01-20 19:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][990/1251] eta 0:09:37 lr 0.000744 time 3.0119 (2.2112) loss 4.0221 (3.7072) grad_norm 1.4477 (1.2590) [2022-01-20 19:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1000/1251] eta 0:09:15 lr 0.000744 time 2.5149 (2.2114) loss 3.1027 (3.7071) grad_norm 1.5249 (1.2595) [2022-01-20 19:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1010/1251] eta 0:08:52 lr 0.000744 time 1.9025 (2.2106) loss 4.0121 (3.7080) grad_norm 1.2209 (1.2594) [2022-01-20 19:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1020/1251] eta 0:08:30 lr 0.000744 time 1.7223 (2.2100) loss 4.2812 (3.7095) grad_norm 1.2460 (1.2593) [2022-01-20 19:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1030/1251] eta 0:08:07 lr 0.000744 time 1.5759 (2.2079) loss 3.1620 (3.7076) grad_norm 1.1696 (1.2590) [2022-01-20 19:49:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1040/1251] eta 0:07:45 lr 0.000744 time 2.1132 (2.2059) loss 2.5661 (3.7064) grad_norm 1.3819 (1.2592) [2022-01-20 19:49:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1050/1251] eta 0:07:23 lr 0.000744 time 1.8264 (2.2051) 
loss 4.2430 (3.7097) grad_norm 1.2426 (1.2582) [2022-01-20 19:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1060/1251] eta 0:07:00 lr 0.000744 time 1.8487 (2.2040) loss 4.2777 (3.7110) grad_norm 1.3585 (1.2580) [2022-01-20 19:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1070/1251] eta 0:06:39 lr 0.000744 time 2.1091 (2.2057) loss 4.0785 (3.7084) grad_norm 1.1177 (1.2578) [2022-01-20 19:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1080/1251] eta 0:06:17 lr 0.000744 time 2.2889 (2.2070) loss 3.7298 (3.7096) grad_norm 1.4555 (1.2578) [2022-01-20 19:51:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1090/1251] eta 0:05:55 lr 0.000744 time 2.8024 (2.2073) loss 4.4360 (3.7081) grad_norm 1.1924 (1.2586) [2022-01-20 19:51:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1100/1251] eta 0:05:33 lr 0.000744 time 1.7873 (2.2071) loss 3.4385 (3.7083) grad_norm 1.3301 (1.2582) [2022-01-20 19:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1110/1251] eta 0:05:11 lr 0.000744 time 2.7444 (2.2076) loss 4.1833 (3.7073) grad_norm 1.2918 (1.2577) [2022-01-20 19:52:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1120/1251] eta 0:04:49 lr 0.000744 time 2.1146 (2.2074) loss 3.8566 (3.7090) grad_norm 1.2054 (1.2578) [2022-01-20 19:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1130/1251] eta 0:04:26 lr 0.000744 time 1.7262 (2.2056) loss 4.1937 (3.7114) grad_norm 1.3110 (1.2592) [2022-01-20 19:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1140/1251] eta 0:04:04 lr 0.000744 time 1.6250 (2.2054) loss 4.0521 (3.7103) grad_norm 1.3944 (1.2590) [2022-01-20 19:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1150/1251] eta 0:03:42 lr 0.000744 time 1.8897 (2.2045) loss 4.1661 (3.7090) grad_norm 1.4690 (1.2591) [2022-01-20 19:53:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1160/1251] eta 0:03:20 lr 0.000744 time 2.8440 (2.2047) loss 3.4263 (3.7088) grad_norm 1.2314 (1.2590) [2022-01-20 19:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1170/1251] eta 0:02:58 lr 0.000744 time 1.9676 (2.2057) loss 4.0197 (3.7100) grad_norm 1.3776 (1.2599) [2022-01-20 19:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1180/1251] eta 0:02:36 lr 0.000744 time 1.5831 (2.2050) loss 2.4219 (3.7087) grad_norm 1.2018 (1.2600) [2022-01-20 19:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1190/1251] eta 0:02:14 lr 0.000744 time 1.9354 (2.2046) loss 3.5360 (3.7104) grad_norm 1.2269 (1.2602) [2022-01-20 19:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1200/1251] eta 0:01:52 lr 0.000744 time 2.3095 (2.2032) loss 3.9780 (3.7126) grad_norm 1.2448 (1.2600) [2022-01-20 19:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1210/1251] eta 0:01:30 lr 0.000744 time 2.4249 (2.2029) loss 4.0479 (3.7145) grad_norm 1.1539 (1.2595) [2022-01-20 19:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1220/1251] eta 0:01:08 lr 0.000744 time 1.9510 (2.2024) loss 3.7837 (3.7118) grad_norm 1.1901 (1.2595) [2022-01-20 19:56:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1230/1251] eta 0:00:46 lr 0.000744 time 2.5032 (2.2038) loss 3.6725 (3.7130) grad_norm 1.3242 (1.2592) [2022-01-20 19:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1240/1251] eta 0:00:24 lr 0.000744 time 1.7341 (2.2029) loss 2.7490 (3.7135) grad_norm 1.3801 (1.2590) [2022-01-20 19:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [101/300][1250/1251] eta 0:00:02 lr 0.000743 time 1.1948 (2.1971) loss 4.2586 (3.7141) grad_norm 1.3783 (1.2599) [2022-01-20 19:57:09 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 101 training takes 0:45:49 
[2022-01-20 19:57:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.747 (17.747) Loss 1.0720 (1.0720) Acc@1 74.805 (74.805) Acc@5 92.773 (92.773) [2022-01-20 19:57:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.218 (3.356) Loss 1.0211 (1.0812) Acc@1 74.121 (74.885) Acc@5 92.773 (92.551) [2022-01-20 19:58:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.935 (2.680) Loss 1.1281 (1.1048) Acc@1 74.707 (74.461) Acc@5 92.090 (92.267) [2022-01-20 19:58:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.989 (2.418) Loss 1.1439 (1.1118) Acc@1 73.633 (74.354) Acc@5 91.895 (92.191) [2022-01-20 19:58:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.316 (2.208) Loss 1.1017 (1.1216) Acc@1 74.121 (74.040) Acc@5 93.262 (92.085) [2022-01-20 19:58:47 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.982 Acc@5 92.090 [2022-01-20 19:58:47 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.0% [2022-01-20 19:58:47 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.98% [2022-01-20 19:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][0/1251] eta 7:27:10 lr 0.000743 time 21.4476 (21.4476) loss 4.0122 (4.0122) grad_norm 1.2566 (1.2566) [2022-01-20 19:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][10/1251] eta 1:23:24 lr 0.000743 time 2.1001 (4.0322) loss 3.0531 (3.5591) grad_norm 1.2568 (1.2848) [2022-01-20 19:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][20/1251] eta 1:04:47 lr 0.000743 time 1.5351 (3.1581) loss 3.9507 (3.5666) grad_norm 1.0432 (1.2972) [2022-01-20 20:00:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][30/1251] eta 0:58:35 lr 0.000743 time 1.5254 (2.8795) loss 4.4682 (3.5982) grad_norm 1.1170 (1.2671) [2022-01-20 20:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[102/300][40/1251] eta 0:55:00 lr 0.000743 time 4.2030 (2.7254) loss 3.8833 (3.6962) grad_norm 1.2198 (1.2427) [2022-01-20 20:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][50/1251] eta 0:52:42 lr 0.000743 time 2.3158 (2.6329) loss 4.2893 (3.7164) grad_norm 1.3771 (1.2441) [2022-01-20 20:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][60/1251] eta 0:51:33 lr 0.000743 time 1.6342 (2.5972) loss 3.6344 (3.7048) grad_norm 1.2623 (1.2574) [2022-01-20 20:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][70/1251] eta 0:49:55 lr 0.000743 time 1.9464 (2.5367) loss 3.5569 (3.6064) grad_norm 1.2129 (1.2599) [2022-01-20 20:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][80/1251] eta 0:48:34 lr 0.000743 time 2.6616 (2.4888) loss 4.4612 (3.6438) grad_norm 1.4307 (1.2708) [2022-01-20 20:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][90/1251] eta 0:47:19 lr 0.000743 time 2.2236 (2.4457) loss 3.8802 (3.6456) grad_norm 1.3364 (1.2869) [2022-01-20 20:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][100/1251] eta 0:46:05 lr 0.000743 time 1.7418 (2.4031) loss 4.5319 (3.6649) grad_norm 1.3099 (1.2865) [2022-01-20 20:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][110/1251] eta 0:45:12 lr 0.000743 time 2.0207 (2.3773) loss 4.0673 (3.6615) grad_norm 1.4147 (1.2805) [2022-01-20 20:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][120/1251] eta 0:44:22 lr 0.000743 time 3.2503 (2.3545) loss 4.3466 (3.6262) grad_norm 1.4636 (1.2761) [2022-01-20 20:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][130/1251] eta 0:43:44 lr 0.000743 time 2.7009 (2.3416) loss 4.0161 (3.6335) grad_norm 1.2212 (1.2715) [2022-01-20 20:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][140/1251] eta 0:43:13 lr 0.000743 time 1.9554 (2.3348) loss 3.9155 (3.6309) grad_norm 1.2814 
(1.2699) [2022-01-20 20:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][150/1251] eta 0:42:46 lr 0.000743 time 1.9579 (2.3307) loss 2.5802 (3.6298) grad_norm 1.4351 (1.2732) [2022-01-20 20:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][160/1251] eta 0:42:22 lr 0.000743 time 2.5859 (2.3304) loss 4.4914 (3.6304) grad_norm 1.1406 (1.2698) [2022-01-20 20:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][170/1251] eta 0:41:50 lr 0.000743 time 2.4359 (2.3228) loss 2.5427 (3.6326) grad_norm 1.4383 (1.2748) [2022-01-20 20:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][180/1251] eta 0:41:26 lr 0.000743 time 1.9108 (2.3218) loss 3.9346 (3.6371) grad_norm 1.5879 (1.2747) [2022-01-20 20:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][190/1251] eta 0:40:55 lr 0.000743 time 1.9161 (2.3139) loss 2.8206 (3.6274) grad_norm 1.2942 (1.2824) [2022-01-20 20:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][200/1251] eta 0:40:18 lr 0.000743 time 1.8684 (2.3011) loss 4.3343 (3.6436) grad_norm 1.2100 (1.2807) [2022-01-20 20:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][210/1251] eta 0:39:47 lr 0.000743 time 1.9255 (2.2937) loss 3.1920 (3.6431) grad_norm 1.1702 (1.2772) [2022-01-20 20:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][220/1251] eta 0:39:28 lr 0.000743 time 2.2127 (2.2976) loss 4.4295 (3.6470) grad_norm 1.2956 (1.2761) [2022-01-20 20:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][230/1251] eta 0:38:57 lr 0.000743 time 1.4601 (2.2891) loss 3.3721 (3.6575) grad_norm 1.1998 (1.2729) [2022-01-20 20:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][240/1251] eta 0:38:33 lr 0.000743 time 2.1963 (2.2884) loss 3.4046 (3.6665) grad_norm 1.1452 (1.2731) [2022-01-20 20:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[102/300][250/1251] eta 0:38:01 lr 0.000743 time 1.6408 (2.2794) loss 4.3055 (3.6669) grad_norm 1.1830 (1.2708) [2022-01-20 20:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][260/1251] eta 0:37:33 lr 0.000743 time 1.9629 (2.2744) loss 4.3317 (3.6689) grad_norm 1.1536 (1.2695) [2022-01-20 20:09:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][270/1251] eta 0:37:05 lr 0.000742 time 1.8393 (2.2688) loss 4.1540 (3.6731) grad_norm 0.9556 (1.2656) [2022-01-20 20:09:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][280/1251] eta 0:36:42 lr 0.000742 time 2.2204 (2.2679) loss 4.1045 (3.6817) grad_norm 1.2499 (1.2637) [2022-01-20 20:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][290/1251] eta 0:36:16 lr 0.000742 time 2.4962 (2.2650) loss 3.3143 (3.6901) grad_norm 1.1819 (1.2626) [2022-01-20 20:10:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][300/1251] eta 0:35:51 lr 0.000742 time 2.1356 (2.2624) loss 3.8584 (3.6995) grad_norm 1.1710 (1.2616) [2022-01-20 20:10:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][310/1251] eta 0:35:26 lr 0.000742 time 2.1339 (2.2602) loss 3.5459 (3.7072) grad_norm 1.2853 (1.2595) [2022-01-20 20:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][320/1251] eta 0:35:00 lr 0.000742 time 2.2384 (2.2562) loss 4.2469 (3.7076) grad_norm 1.5512 (1.2613) [2022-01-20 20:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][330/1251] eta 0:34:37 lr 0.000742 time 2.1818 (2.2553) loss 4.2735 (3.7177) grad_norm 1.2017 (1.2604) [2022-01-20 20:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][340/1251] eta 0:34:14 lr 0.000742 time 2.2981 (2.2555) loss 3.9061 (3.7202) grad_norm 1.1255 (1.2598) [2022-01-20 20:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][350/1251] eta 0:33:49 lr 0.000742 time 1.6168 (2.2524) loss 3.1413 (3.7178) grad_norm 
1.1128 (1.2595) [2022-01-20 20:12:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][360/1251] eta 0:33:20 lr 0.000742 time 2.2958 (2.2457) loss 4.0586 (3.7173) grad_norm 1.1925 (1.2611) [2022-01-20 20:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][370/1251] eta 0:32:54 lr 0.000742 time 1.7216 (2.2410) loss 3.2087 (3.7086) grad_norm 1.3017 (1.2619) [2022-01-20 20:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][380/1251] eta 0:32:27 lr 0.000742 time 1.9048 (2.2355) loss 3.8185 (3.7103) grad_norm 1.2330 (1.2636) [2022-01-20 20:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][390/1251] eta 0:32:02 lr 0.000742 time 2.3968 (2.2332) loss 3.4529 (3.7079) grad_norm 1.4863 (1.2657) [2022-01-20 20:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][400/1251] eta 0:31:43 lr 0.000742 time 2.7449 (2.2367) loss 4.0485 (3.7183) grad_norm 1.2745 (1.2652) [2022-01-20 20:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][410/1251] eta 0:31:20 lr 0.000742 time 2.6950 (2.2365) loss 3.8751 (3.7136) grad_norm 1.5066 (1.2671) [2022-01-20 20:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][420/1251] eta 0:30:59 lr 0.000742 time 2.5283 (2.2373) loss 3.0426 (3.7147) grad_norm 1.3845 (1.2690) [2022-01-20 20:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][430/1251] eta 0:30:37 lr 0.000742 time 2.2747 (2.2379) loss 3.9452 (3.7190) grad_norm 1.5362 (1.2694) [2022-01-20 20:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][440/1251] eta 0:30:16 lr 0.000742 time 2.5176 (2.2394) loss 3.4189 (3.7159) grad_norm 1.6476 (1.2710) [2022-01-20 20:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][450/1251] eta 0:29:53 lr 0.000742 time 2.1793 (2.2390) loss 3.3721 (3.7173) grad_norm 1.3204 (1.2719) [2022-01-20 20:15:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[102/300][460/1251] eta 0:29:27 lr 0.000742 time 1.9436 (2.2350) loss 3.9942 (3.7203) grad_norm 1.3217 (1.2713) [2022-01-20 20:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][470/1251] eta 0:29:00 lr 0.000742 time 1.9229 (2.2283) loss 4.1681 (3.7156) grad_norm 1.2717 (1.2704) [2022-01-20 20:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][480/1251] eta 0:28:34 lr 0.000742 time 2.2437 (2.2244) loss 2.9843 (3.7141) grad_norm 1.1740 (1.2701) [2022-01-20 20:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][490/1251] eta 0:28:13 lr 0.000742 time 2.4434 (2.2257) loss 3.6071 (3.7097) grad_norm 1.3192 (1.2689) [2022-01-20 20:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][500/1251] eta 0:27:53 lr 0.000742 time 2.5822 (2.2283) loss 3.3107 (3.7106) grad_norm 1.2672 (1.2691) [2022-01-20 20:17:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][510/1251] eta 0:27:30 lr 0.000742 time 1.9671 (2.2268) loss 3.9542 (3.7041) grad_norm 1.2879 (1.2680) [2022-01-20 20:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][520/1251] eta 0:27:06 lr 0.000742 time 2.1587 (2.2256) loss 3.7270 (3.7069) grad_norm 1.2471 (1.2677) [2022-01-20 20:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][530/1251] eta 0:26:42 lr 0.000742 time 2.1873 (2.2223) loss 3.9603 (3.7030) grad_norm 1.0709 (1.2664) [2022-01-20 20:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][540/1251] eta 0:26:20 lr 0.000742 time 1.8161 (2.2227) loss 4.0240 (3.7052) grad_norm 1.2115 (1.2683) [2022-01-20 20:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][550/1251] eta 0:25:57 lr 0.000741 time 1.9519 (2.2218) loss 4.1132 (3.7051) grad_norm 1.3386 (1.2680) [2022-01-20 20:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][560/1251] eta 0:25:36 lr 0.000741 time 2.7402 (2.2238) loss 2.6722 (3.7000) grad_norm 
1.2640 (1.2678) [2022-01-20 20:19:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][570/1251] eta 0:25:13 lr 0.000741 time 2.2378 (2.2225) loss 3.6881 (3.7026) grad_norm 1.1049 (1.2682) [2022-01-20 20:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][580/1251] eta 0:24:52 lr 0.000741 time 1.6360 (2.2240) loss 3.3359 (3.7026) grad_norm 1.4063 (1.2681) [2022-01-20 20:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][590/1251] eta 0:24:31 lr 0.000741 time 2.1223 (2.2254) loss 4.0398 (3.7022) grad_norm 1.4236 (1.2682) [2022-01-20 20:21:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][600/1251] eta 0:24:07 lr 0.000741 time 2.5882 (2.2230) loss 3.6534 (3.7035) grad_norm 1.0852 (1.2683) [2022-01-20 20:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][610/1251] eta 0:23:43 lr 0.000741 time 2.3506 (2.2206) loss 4.3576 (3.7065) grad_norm 1.2923 (1.2678) [2022-01-20 20:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][620/1251] eta 0:23:19 lr 0.000741 time 2.4246 (2.2176) loss 3.7870 (3.7110) grad_norm 1.3576 (1.2673) [2022-01-20 20:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][630/1251] eta 0:22:55 lr 0.000741 time 1.6858 (2.2148) loss 4.0339 (3.7094) grad_norm 1.2576 (1.2682) [2022-01-20 20:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][640/1251] eta 0:22:33 lr 0.000741 time 2.8190 (2.2152) loss 3.2722 (3.7056) grad_norm 1.0889 (1.2677) [2022-01-20 20:22:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][650/1251] eta 0:22:09 lr 0.000741 time 2.1541 (2.2127) loss 3.2312 (3.7029) grad_norm 1.2337 (1.2679) [2022-01-20 20:23:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][660/1251] eta 0:21:48 lr 0.000741 time 2.5077 (2.2138) loss 3.3900 (3.7062) grad_norm 1.1647 (1.2693) [2022-01-20 20:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[102/300][670/1251] eta 0:21:27 lr 0.000741 time 1.5109 (2.2164) loss 3.4332 (3.7063) grad_norm 1.2056 (1.2709) [2022-01-20 20:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][680/1251] eta 0:21:05 lr 0.000741 time 1.8250 (2.2166) loss 4.6178 (3.7069) grad_norm 1.1980 (1.2711) [2022-01-20 20:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][690/1251] eta 0:20:42 lr 0.000741 time 2.2939 (2.2147) loss 4.1742 (3.7034) grad_norm 1.1341 (1.2706) [2022-01-20 20:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][700/1251] eta 0:20:19 lr 0.000741 time 2.3383 (2.2134) loss 4.3472 (3.7033) grad_norm 1.2112 (1.2694) [2022-01-20 20:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][710/1251] eta 0:19:57 lr 0.000741 time 1.9778 (2.2139) loss 4.2883 (3.7014) grad_norm 1.1867 (1.2696) [2022-01-20 20:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][720/1251] eta 0:19:35 lr 0.000741 time 2.1659 (2.2134) loss 2.7633 (3.6957) grad_norm 1.3025 (1.2697) [2022-01-20 20:25:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][730/1251] eta 0:19:13 lr 0.000741 time 2.2539 (2.2137) loss 4.3079 (3.6977) grad_norm 1.3227 (1.2692) [2022-01-20 20:26:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][740/1251] eta 0:18:51 lr 0.000741 time 2.8198 (2.2149) loss 4.2382 (3.6970) grad_norm 1.2834 (1.2691) [2022-01-20 20:26:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][750/1251] eta 0:18:29 lr 0.000741 time 1.6342 (2.2145) loss 3.2554 (3.6987) grad_norm 1.0984 (1.2689) [2022-01-20 20:26:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][760/1251] eta 0:18:06 lr 0.000741 time 2.2636 (2.2127) loss 4.4029 (3.6993) grad_norm 1.2350 (1.2689) [2022-01-20 20:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][770/1251] eta 0:17:43 lr 0.000741 time 1.5424 (2.2102) loss 3.7736 (3.7018) grad_norm 
1.2372 (1.2683) [2022-01-20 20:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][780/1251] eta 0:17:20 lr 0.000741 time 1.9409 (2.2096) loss 3.9880 (3.7026) grad_norm 1.5477 (1.2698) [2022-01-20 20:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][790/1251] eta 0:16:58 lr 0.000741 time 2.5327 (2.2084) loss 2.8142 (3.7054) grad_norm 1.1465 (1.2691) [2022-01-20 20:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][800/1251] eta 0:16:36 lr 0.000741 time 2.2370 (2.2103) loss 3.9907 (3.7067) grad_norm 1.2015 (1.2681) [2022-01-20 20:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][810/1251] eta 0:16:14 lr 0.000741 time 1.9614 (2.2103) loss 4.0167 (3.7051) grad_norm 1.1048 (1.2677) [2022-01-20 20:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][820/1251] eta 0:15:52 lr 0.000740 time 2.1339 (2.2099) loss 4.4697 (3.7063) grad_norm 1.3162 (1.2684) [2022-01-20 20:29:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][830/1251] eta 0:15:30 lr 0.000740 time 2.1559 (2.2094) loss 4.0865 (3.7050) grad_norm 1.5625 (1.2681) [2022-01-20 20:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][840/1251] eta 0:15:08 lr 0.000740 time 1.9004 (2.2095) loss 4.2806 (3.7075) grad_norm 1.1383 (1.2678) [2022-01-20 20:30:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][850/1251] eta 0:14:45 lr 0.000740 time 2.1806 (2.2084) loss 2.5851 (3.7076) grad_norm 1.2272 (1.2675) [2022-01-20 20:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][860/1251] eta 0:14:22 lr 0.000740 time 2.5599 (2.2071) loss 3.0970 (3.7027) grad_norm 1.1102 (1.2690) [2022-01-20 20:30:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][870/1251] eta 0:14:00 lr 0.000740 time 1.8407 (2.2052) loss 3.8658 (3.7016) grad_norm 1.1327 (1.2684) [2022-01-20 20:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[102/300][880/1251] eta 0:13:39 lr 0.000740 time 2.2505 (2.2077) loss 3.7870 (3.7026) grad_norm 1.3942 (1.2686) [2022-01-20 20:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][890/1251] eta 0:13:16 lr 0.000740 time 2.2015 (2.2069) loss 3.2932 (3.7040) grad_norm 1.2607 (1.2681) [2022-01-20 20:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][900/1251] eta 0:12:54 lr 0.000740 time 2.4824 (2.2076) loss 4.3272 (3.7076) grad_norm 1.1310 (1.2681) [2022-01-20 20:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][910/1251] eta 0:12:32 lr 0.000740 time 2.1231 (2.2077) loss 2.6665 (3.7085) grad_norm 1.1870 (1.2672) [2022-01-20 20:32:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][920/1251] eta 0:12:10 lr 0.000740 time 1.6075 (2.2065) loss 4.2207 (3.7112) grad_norm 1.2699 (1.2670) [2022-01-20 20:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][930/1251] eta 0:11:47 lr 0.000740 time 1.8842 (2.2045) loss 4.1613 (3.7099) grad_norm 1.2021 (1.2658) [2022-01-20 20:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][940/1251] eta 0:11:25 lr 0.000740 time 1.8747 (2.2036) loss 4.0438 (3.7130) grad_norm 1.2809 (1.2658) [2022-01-20 20:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][950/1251] eta 0:11:02 lr 0.000740 time 2.2385 (2.2025) loss 3.9380 (3.7146) grad_norm 1.2159 (1.2667) [2022-01-20 20:34:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][960/1251] eta 0:10:41 lr 0.000740 time 1.8903 (2.2031) loss 3.8007 (3.7143) grad_norm 1.0932 (1.2663) [2022-01-20 20:34:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][970/1251] eta 0:10:19 lr 0.000740 time 2.5996 (2.2033) loss 4.2663 (3.7142) grad_norm 1.3138 (1.2657) [2022-01-20 20:34:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][980/1251] eta 0:09:57 lr 0.000740 time 2.3763 (2.2039) loss 3.9984 (3.7134) grad_norm 
1.2462 (1.2660) [2022-01-20 20:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][990/1251] eta 0:09:35 lr 0.000740 time 1.6405 (2.2033) loss 3.2689 (3.7112) grad_norm 1.1954 (1.2656) [2022-01-20 20:35:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1000/1251] eta 0:09:12 lr 0.000740 time 1.6163 (2.2031) loss 4.1604 (3.7112) grad_norm 1.1679 (1.2656) [2022-01-20 20:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1010/1251] eta 0:08:50 lr 0.000740 time 2.2056 (2.2015) loss 3.4077 (3.7112) grad_norm 1.1550 (1.2648) [2022-01-20 20:36:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1020/1251] eta 0:08:28 lr 0.000740 time 3.0113 (2.2011) loss 3.2851 (3.7117) grad_norm 1.1845 (1.2648) [2022-01-20 20:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1030/1251] eta 0:08:06 lr 0.000740 time 2.1443 (2.2013) loss 4.4094 (3.7125) grad_norm 1.4572 (1.2651) [2022-01-20 20:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1040/1251] eta 0:07:44 lr 0.000740 time 1.5703 (2.2012) loss 3.9183 (3.7124) grad_norm 1.5209 (1.2655) [2022-01-20 20:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1050/1251] eta 0:07:22 lr 0.000740 time 1.6315 (2.2000) loss 4.7448 (3.7120) grad_norm 1.5840 (1.2655) [2022-01-20 20:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1060/1251] eta 0:07:00 lr 0.000740 time 2.4372 (2.2007) loss 4.0706 (3.7119) grad_norm 1.2729 (1.2653) [2022-01-20 20:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1070/1251] eta 0:06:38 lr 0.000740 time 1.5742 (2.2005) loss 4.3847 (3.7128) grad_norm 1.4053 (1.2652) [2022-01-20 20:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1080/1251] eta 0:06:16 lr 0.000740 time 2.3686 (2.2020) loss 4.0304 (3.7126) grad_norm 1.2318 (1.2649) [2022-01-20 20:38:49 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [102/300][1090/1251] eta 0:05:54 lr 0.000740 time 1.9479 (2.2016) loss 3.7038 (3.7112) grad_norm 1.4160 (1.2649) [2022-01-20 20:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1100/1251] eta 0:05:32 lr 0.000739 time 2.2171 (2.2005) loss 4.3859 (3.7109) grad_norm 1.3462 (1.2650) [2022-01-20 20:39:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1110/1251] eta 0:05:10 lr 0.000739 time 1.5489 (2.1987) loss 2.3354 (3.7116) grad_norm 1.2039 (1.2653) [2022-01-20 20:39:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1120/1251] eta 0:04:47 lr 0.000739 time 1.9573 (2.1975) loss 3.9238 (3.7106) grad_norm 1.1265 (1.2651) [2022-01-20 20:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1130/1251] eta 0:04:25 lr 0.000739 time 1.9534 (2.1971) loss 3.8723 (3.7107) grad_norm 1.1655 (1.2643) [2022-01-20 20:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1140/1251] eta 0:04:03 lr 0.000739 time 2.4169 (2.1979) loss 2.8091 (3.7083) grad_norm 1.1139 (1.2639) [2022-01-20 20:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1150/1251] eta 0:03:42 lr 0.000739 time 2.2003 (2.1995) loss 3.7381 (3.7061) grad_norm 1.2791 (1.2636) [2022-01-20 20:41:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1160/1251] eta 0:03:20 lr 0.000739 time 2.7526 (2.1997) loss 3.3815 (3.7062) grad_norm 1.0391 (1.2632) [2022-01-20 20:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1170/1251] eta 0:02:58 lr 0.000739 time 2.1947 (2.1999) loss 4.0691 (3.7086) grad_norm 1.1471 (1.2630) [2022-01-20 20:42:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1180/1251] eta 0:02:36 lr 0.000739 time 2.2048 (2.2002) loss 3.2451 (3.7088) grad_norm 1.2075 (1.2623) [2022-01-20 20:42:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1190/1251] eta 0:02:14 lr 0.000739 time 1.6323 (2.1995) loss 3.0957 
(3.7077) grad_norm 1.3551 (1.2631) [2022-01-20 20:42:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1200/1251] eta 0:01:52 lr 0.000739 time 2.1485 (2.1995) loss 3.2541 (3.7066) grad_norm 1.1152 (1.2627) [2022-01-20 20:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1210/1251] eta 0:01:30 lr 0.000739 time 1.6441 (2.1986) loss 3.2025 (3.7042) grad_norm 1.1851 (1.2625) [2022-01-20 20:43:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1220/1251] eta 0:01:08 lr 0.000739 time 3.1325 (2.1990) loss 3.7768 (3.7053) grad_norm 1.1754 (1.2624) [2022-01-20 20:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1230/1251] eta 0:00:46 lr 0.000739 time 2.4031 (2.2004) loss 3.5154 (3.7043) grad_norm 1.1677 (1.2621) [2022-01-20 20:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1240/1251] eta 0:00:24 lr 0.000739 time 1.2564 (2.1975) loss 4.2360 (3.7035) grad_norm 1.2700 (1.2623) [2022-01-20 20:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [102/300][1250/1251] eta 0:00:02 lr 0.000739 time 1.1896 (2.1916) loss 4.1369 (3.7055) grad_norm 1.2084 (1.2617) [2022-01-20 20:44:29 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 102 training takes 0:45:42 [2022-01-20 20:44:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.787 (18.787) Loss 1.1350 (1.1350) Acc@1 72.852 (72.852) Acc@5 92.578 (92.578) [2022-01-20 20:45:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.693 (3.460) Loss 1.0913 (1.1264) Acc@1 73.438 (73.509) Acc@5 93.164 (92.285) [2022-01-20 20:45:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.271 (2.482) Loss 1.1779 (1.1374) Acc@1 71.680 (73.498) Acc@5 91.504 (92.090) [2022-01-20 20:45:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.636 (2.233) Loss 1.1785 (1.1305) Acc@1 72.949 (73.671) Acc@5 90.918 (92.080) [2022-01-20 20:45:58 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.827 (2.162) Loss 1.1580 (1.1335) Acc@1 73.242 (73.552) Acc@5 90.918 (92.054) [2022-01-20 20:46:05 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.596 Acc@5 92.040 [2022-01-20 20:46:05 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.6% [2022-01-20 20:46:05 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.98% [2022-01-20 20:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][0/1251] eta 7:40:25 lr 0.000739 time 22.0824 (22.0824) loss 3.7372 (3.7372) grad_norm 1.1489 (1.1489) [2022-01-20 20:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][10/1251] eta 1:23:27 lr 0.000739 time 2.5328 (4.0347) loss 3.0550 (3.5069) grad_norm 1.2587 (1.1680) [2022-01-20 20:47:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][20/1251] eta 1:03:55 lr 0.000739 time 1.4664 (3.1154) loss 3.2562 (3.6429) grad_norm 1.2959 (1.2222) [2022-01-20 20:47:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][30/1251] eta 0:57:29 lr 0.000739 time 1.9197 (2.8250) loss 4.3172 (3.6639) grad_norm 1.3547 (1.2611) [2022-01-20 20:47:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][40/1251] eta 0:54:53 lr 0.000739 time 3.7461 (2.7196) loss 3.0574 (3.6502) grad_norm 1.2063 (1.2650) [2022-01-20 20:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][50/1251] eta 0:52:36 lr 0.000739 time 2.2141 (2.6283) loss 4.1299 (3.5744) grad_norm 1.0609 (1.2456) [2022-01-20 20:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][60/1251] eta 0:50:13 lr 0.000739 time 1.5377 (2.5305) loss 4.3172 (3.6345) grad_norm 1.2995 (1.2390) [2022-01-20 20:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][70/1251] eta 0:48:44 lr 0.000739 time 1.8454 (2.4764) loss 3.7278 (3.6555) grad_norm 1.1082 (1.2458) [2022-01-20 20:49:24 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][80/1251] eta 0:48:00 lr 0.000739 time 3.3560 (2.4594) loss 4.1342 (3.6612) grad_norm 1.3283 (1.2492) [2022-01-20 20:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][90/1251] eta 0:47:00 lr 0.000739 time 2.3397 (2.4290) loss 4.4713 (3.6805) grad_norm 1.3437 (1.2571) [2022-01-20 20:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][100/1251] eta 0:46:09 lr 0.000739 time 1.7500 (2.4062) loss 3.8984 (3.6581) grad_norm 1.3429 (1.2784) [2022-01-20 20:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][110/1251] eta 0:45:23 lr 0.000739 time 1.9690 (2.3870) loss 3.4662 (3.6328) grad_norm 1.4909 (1.2791) [2022-01-20 20:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][120/1251] eta 0:44:52 lr 0.000738 time 3.0818 (2.3805) loss 3.9081 (3.6431) grad_norm 1.4361 (1.2853) [2022-01-20 20:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][130/1251] eta 0:44:09 lr 0.000738 time 2.1966 (2.3638) loss 2.5915 (3.6390) grad_norm 1.3169 (1.2800) [2022-01-20 20:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][140/1251] eta 0:43:23 lr 0.000738 time 1.9267 (2.3433) loss 4.6604 (3.6453) grad_norm 1.2714 (1.2741) [2022-01-20 20:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][150/1251] eta 0:42:43 lr 0.000738 time 1.9658 (2.3279) loss 4.1587 (3.6519) grad_norm 1.1500 (1.2700) [2022-01-20 20:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][160/1251] eta 0:42:09 lr 0.000738 time 2.5557 (2.3183) loss 3.3827 (3.6621) grad_norm 1.1313 (1.2700) [2022-01-20 20:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][170/1251] eta 0:41:37 lr 0.000738 time 1.8833 (2.3107) loss 3.0973 (3.6631) grad_norm 1.4495 (1.2718) [2022-01-20 20:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][180/1251] eta 0:41:08 lr 0.000738 
time 2.2320 (2.3051) loss 3.6888 (3.6548) grad_norm 1.1394 (1.2702) [2022-01-20 20:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][190/1251] eta 0:40:39 lr 0.000738 time 2.1716 (2.2988) loss 2.9548 (3.6463) grad_norm 1.4253 (1.2666) [2022-01-20 20:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][200/1251] eta 0:40:13 lr 0.000738 time 2.8607 (2.2961) loss 3.7034 (3.6628) grad_norm 1.1557 (1.2667) [2022-01-20 20:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][210/1251] eta 0:39:44 lr 0.000738 time 2.1152 (2.2910) loss 3.7848 (3.6676) grad_norm 1.2182 (1.2667) [2022-01-20 20:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][220/1251] eta 0:39:12 lr 0.000738 time 1.8572 (2.2815) loss 3.8990 (3.6674) grad_norm 1.3793 (1.2660) [2022-01-20 20:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][230/1251] eta 0:38:43 lr 0.000738 time 2.2036 (2.2759) loss 3.1349 (3.6648) grad_norm 1.4120 (1.2667) [2022-01-20 20:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][240/1251] eta 0:38:19 lr 0.000738 time 2.7948 (2.2741) loss 4.3177 (3.6670) grad_norm 1.2103 (1.2672) [2022-01-20 20:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][250/1251] eta 0:37:51 lr 0.000738 time 1.7893 (2.2692) loss 2.5820 (3.6703) grad_norm 1.4548 (1.2654) [2022-01-20 20:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][260/1251] eta 0:37:31 lr 0.000738 time 2.0818 (2.2715) loss 3.8032 (3.6704) grad_norm 1.3531 (1.2673) [2022-01-20 20:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][270/1251] eta 0:37:06 lr 0.000738 time 2.2058 (2.2695) loss 3.6686 (3.6663) grad_norm 1.1049 (1.2661) [2022-01-20 20:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][280/1251] eta 0:36:38 lr 0.000738 time 2.4786 (2.2640) loss 4.0689 (3.6669) grad_norm 1.3599 (1.2654) [2022-01-20 20:57:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][290/1251] eta 0:36:04 lr 0.000738 time 1.5735 (2.2520) loss 2.4580 (3.6606) grad_norm 1.4036 (1.2646) [2022-01-20 20:57:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][300/1251] eta 0:35:39 lr 0.000738 time 2.4132 (2.2492) loss 3.0977 (3.6552) grad_norm 1.1492 (1.2636) [2022-01-20 20:57:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][310/1251] eta 0:35:16 lr 0.000738 time 3.1423 (2.2490) loss 3.1757 (3.6535) grad_norm 1.0991 (1.2618) [2022-01-20 20:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][320/1251] eta 0:34:55 lr 0.000738 time 2.7276 (2.2512) loss 4.0692 (3.6435) grad_norm 1.3956 (1.2619) [2022-01-20 20:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][330/1251] eta 0:34:28 lr 0.000738 time 1.9522 (2.2457) loss 4.2085 (3.6475) grad_norm 1.4633 (1.2686) [2022-01-20 20:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][340/1251] eta 0:34:05 lr 0.000738 time 2.2041 (2.2459) loss 2.6903 (3.6460) grad_norm 1.1404 (1.2685) [2022-01-20 20:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][350/1251] eta 0:33:45 lr 0.000738 time 3.1500 (2.2485) loss 3.0064 (3.6496) grad_norm 1.5071 (1.2698) [2022-01-20 20:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][360/1251] eta 0:33:19 lr 0.000738 time 2.5563 (2.2445) loss 3.9760 (3.6569) grad_norm 1.5863 (1.2725) [2022-01-20 20:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][370/1251] eta 0:32:50 lr 0.000738 time 1.9174 (2.2371) loss 3.8611 (3.6547) grad_norm 1.3004 (1.2721) [2022-01-20 21:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][380/1251] eta 0:32:24 lr 0.000738 time 1.9431 (2.2322) loss 4.4676 (3.6593) grad_norm 1.1061 (1.2714) [2022-01-20 21:00:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][390/1251] eta 0:32:00 lr 
0.000737 time 1.8792 (2.2306) loss 3.7700 (3.6639) grad_norm 1.5799 (1.2727) [2022-01-20 21:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][400/1251] eta 0:31:38 lr 0.000737 time 2.6738 (2.2307) loss 4.3044 (3.6657) grad_norm 1.2157 (1.2751) [2022-01-20 21:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][410/1251] eta 0:31:16 lr 0.000737 time 2.7394 (2.2318) loss 3.4030 (3.6741) grad_norm 1.4617 (1.2763) [2022-01-20 21:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][420/1251] eta 0:30:55 lr 0.000737 time 3.1036 (2.2324) loss 3.5798 (3.6753) grad_norm 1.1836 (1.2760) [2022-01-20 21:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][430/1251] eta 0:30:30 lr 0.000737 time 2.0468 (2.2291) loss 3.9490 (3.6753) grad_norm 1.1943 (1.2748) [2022-01-20 21:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][440/1251] eta 0:30:07 lr 0.000737 time 2.4470 (2.2291) loss 4.5255 (3.6803) grad_norm 1.2189 (1.2741) [2022-01-20 21:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][450/1251] eta 0:29:47 lr 0.000737 time 2.5104 (2.2310) loss 3.7327 (3.6712) grad_norm 1.1240 (1.2739) [2022-01-20 21:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][460/1251] eta 0:29:25 lr 0.000737 time 2.5618 (2.2321) loss 2.9730 (3.6641) grad_norm 1.2126 (1.2724) [2022-01-20 21:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][470/1251] eta 0:29:00 lr 0.000737 time 1.9647 (2.2286) loss 4.4757 (3.6667) grad_norm 1.6471 (1.2721) [2022-01-20 21:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][480/1251] eta 0:28:33 lr 0.000737 time 1.9312 (2.2231) loss 4.2017 (3.6659) grad_norm 1.3449 (1.2730) [2022-01-20 21:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][490/1251] eta 0:28:08 lr 0.000737 time 2.0134 (2.2190) loss 3.9629 (3.6646) grad_norm 1.1332 (1.2722) [2022-01-20 21:04:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][500/1251] eta 0:27:45 lr 0.000737 time 2.5684 (2.2178) loss 3.8264 (3.6642) grad_norm 1.2775 (1.2731) [2022-01-20 21:04:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][510/1251] eta 0:27:22 lr 0.000737 time 2.1945 (2.2163) loss 4.0792 (3.6689) grad_norm 1.3590 (1.2728) [2022-01-20 21:05:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][520/1251] eta 0:27:01 lr 0.000737 time 1.4591 (2.2181) loss 4.1104 (3.6715) grad_norm 1.2406 (1.2730) [2022-01-20 21:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][530/1251] eta 0:26:40 lr 0.000737 time 2.2176 (2.2200) loss 4.0891 (3.6681) grad_norm 1.2222 (1.2738) [2022-01-20 21:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][540/1251] eta 0:26:18 lr 0.000737 time 2.3646 (2.2201) loss 2.7520 (3.6647) grad_norm 1.2387 (1.2740) [2022-01-20 21:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][550/1251] eta 0:25:55 lr 0.000737 time 2.1040 (2.2195) loss 3.9114 (3.6620) grad_norm 1.2752 (1.2749) [2022-01-20 21:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][560/1251] eta 0:25:32 lr 0.000737 time 1.8407 (2.2183) loss 3.0806 (3.6614) grad_norm 1.0730 (1.2746) [2022-01-20 21:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][570/1251] eta 0:25:11 lr 0.000737 time 2.1471 (2.2195) loss 3.9881 (3.6670) grad_norm 1.3040 (1.2751) [2022-01-20 21:07:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][580/1251] eta 0:24:50 lr 0.000737 time 2.1574 (2.2207) loss 4.6602 (3.6715) grad_norm 1.2449 (1.2748) [2022-01-20 21:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][590/1251] eta 0:24:29 lr 0.000737 time 2.6176 (2.2230) loss 4.0407 (3.6720) grad_norm 1.4439 (1.2749) [2022-01-20 21:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][600/1251] eta 0:24:07 lr 
0.000737 time 1.9347 (2.2228) loss 2.9121 (3.6677) grad_norm 1.3307 (1.2764) [2022-01-20 21:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][610/1251] eta 0:23:42 lr 0.000737 time 2.2506 (2.2195) loss 4.1069 (3.6716) grad_norm 1.3766 (1.2765) [2022-01-20 21:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][620/1251] eta 0:23:16 lr 0.000737 time 1.9129 (2.2138) loss 3.7329 (3.6717) grad_norm 1.2620 (1.2758) [2022-01-20 21:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][630/1251] eta 0:22:52 lr 0.000737 time 1.8604 (2.2107) loss 4.0685 (3.6730) grad_norm 1.4392 (1.2757) [2022-01-20 21:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][640/1251] eta 0:22:30 lr 0.000737 time 2.5864 (2.2099) loss 4.0368 (3.6752) grad_norm 1.2576 (1.2760) [2022-01-20 21:10:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][650/1251] eta 0:22:08 lr 0.000737 time 2.4956 (2.2110) loss 3.1563 (3.6801) grad_norm 1.2501 (1.2752) [2022-01-20 21:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][660/1251] eta 0:21:45 lr 0.000736 time 1.9130 (2.2090) loss 3.4921 (3.6815) grad_norm 1.3139 (1.2751) [2022-01-20 21:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][670/1251] eta 0:21:24 lr 0.000736 time 2.5069 (2.2105) loss 3.7273 (3.6819) grad_norm 1.2350 (1.2760) [2022-01-20 21:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][680/1251] eta 0:21:03 lr 0.000736 time 2.4968 (2.2122) loss 4.6304 (3.6826) grad_norm 1.3136 (1.2762) [2022-01-20 21:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][690/1251] eta 0:20:41 lr 0.000736 time 2.7326 (2.2129) loss 4.1411 (3.6848) grad_norm 1.2320 (1.2751) [2022-01-20 21:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][700/1251] eta 0:20:18 lr 0.000736 time 1.9741 (2.2116) loss 2.4682 (3.6832) grad_norm 1.6777 (1.2745) [2022-01-20 21:12:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][710/1251] eta 0:19:55 lr 0.000736 time 1.9628 (2.2105) loss 3.9000 (3.6823) grad_norm 1.4556 (1.2754) [2022-01-20 21:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][720/1251] eta 0:19:32 lr 0.000736 time 2.1820 (2.2080) loss 3.6110 (3.6820) grad_norm 1.1800 (1.2747) [2022-01-20 21:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][730/1251] eta 0:19:09 lr 0.000736 time 2.8715 (2.2072) loss 3.7498 (3.6821) grad_norm 1.0880 (1.2744) [2022-01-20 21:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][740/1251] eta 0:18:47 lr 0.000736 time 1.8933 (2.2070) loss 4.4053 (3.6819) grad_norm 1.3733 (1.2752) [2022-01-20 21:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][750/1251] eta 0:18:25 lr 0.000736 time 2.1617 (2.2059) loss 4.0206 (3.6849) grad_norm 1.4016 (1.2760) [2022-01-20 21:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][760/1251] eta 0:18:03 lr 0.000736 time 2.3979 (2.2077) loss 4.3098 (3.6802) grad_norm 1.2049 (1.2767) [2022-01-20 21:14:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][770/1251] eta 0:17:42 lr 0.000736 time 2.9493 (2.2085) loss 3.9341 (3.6813) grad_norm 1.1393 (1.2756) [2022-01-20 21:14:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][780/1251] eta 0:17:20 lr 0.000736 time 2.0155 (2.2092) loss 3.4804 (3.6806) grad_norm 1.1131 (1.2761) [2022-01-20 21:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][790/1251] eta 0:16:58 lr 0.000736 time 2.2424 (2.2096) loss 4.5421 (3.6801) grad_norm 1.0817 (1.2764) [2022-01-20 21:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][800/1251] eta 0:16:37 lr 0.000736 time 2.4080 (2.2106) loss 4.2220 (3.6753) grad_norm 1.1297 (1.2762) [2022-01-20 21:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][810/1251] eta 0:16:14 lr 
0.000736 time 2.2597 (2.2099) loss 3.2919 (3.6738) grad_norm 1.1321 (1.2750) [2022-01-20 21:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][820/1251] eta 0:15:50 lr 0.000736 time 1.5901 (2.2062) loss 4.0602 (3.6749) grad_norm 1.2879 (1.2751) [2022-01-20 21:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][830/1251] eta 0:15:27 lr 0.000736 time 1.9720 (2.2042) loss 3.8528 (3.6763) grad_norm 1.3756 (1.2750) [2022-01-20 21:16:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][840/1251] eta 0:15:05 lr 0.000736 time 2.5554 (2.2025) loss 3.9678 (3.6817) grad_norm 1.1718 (1.2755) [2022-01-20 21:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][850/1251] eta 0:14:42 lr 0.000736 time 1.8475 (2.2014) loss 4.2149 (3.6814) grad_norm 1.3972 (1.2746) [2022-01-20 21:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][860/1251] eta 0:14:20 lr 0.000736 time 1.9104 (2.2020) loss 4.2599 (3.6813) grad_norm 1.2091 (1.2748) [2022-01-20 21:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][870/1251] eta 0:13:59 lr 0.000736 time 2.2716 (2.2034) loss 2.7614 (3.6807) grad_norm 1.0626 (1.2750) [2022-01-20 21:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][880/1251] eta 0:13:37 lr 0.000736 time 2.2498 (2.2047) loss 3.5964 (3.6810) grad_norm 1.3945 (1.2750) [2022-01-20 21:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][890/1251] eta 0:13:16 lr 0.000736 time 2.1470 (2.2062) loss 2.6905 (3.6796) grad_norm 1.2288 (1.2747) [2022-01-20 21:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][900/1251] eta 0:12:55 lr 0.000736 time 3.0415 (2.2087) loss 3.7725 (3.6781) grad_norm 1.0900 (1.2742) [2022-01-20 21:19:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][910/1251] eta 0:12:33 lr 0.000736 time 2.1958 (2.2108) loss 2.6170 (3.6788) grad_norm 1.3127 (1.2737) [2022-01-20 21:20:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][920/1251] eta 0:12:11 lr 0.000736 time 2.1635 (2.2102) loss 3.8912 (3.6780) grad_norm 1.3373 (1.2733) [2022-01-20 21:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][930/1251] eta 0:11:49 lr 0.000736 time 2.0016 (2.2088) loss 3.9882 (3.6803) grad_norm 1.1950 (1.2727) [2022-01-20 21:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][940/1251] eta 0:11:26 lr 0.000735 time 1.9340 (2.2074) loss 4.1674 (3.6806) grad_norm 1.4383 (1.2737) [2022-01-20 21:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][950/1251] eta 0:11:04 lr 0.000735 time 2.2825 (2.2067) loss 4.0298 (3.6850) grad_norm 1.4602 (1.2737) [2022-01-20 21:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][960/1251] eta 0:10:41 lr 0.000735 time 2.2192 (2.2059) loss 3.6504 (3.6866) grad_norm 1.0780 (1.2729) [2022-01-20 21:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][970/1251] eta 0:10:19 lr 0.000735 time 1.8070 (2.2044) loss 3.9882 (3.6886) grad_norm 1.2881 (1.2727) [2022-01-20 21:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][980/1251] eta 0:09:57 lr 0.000735 time 2.5153 (2.2041) loss 3.4059 (3.6898) grad_norm 1.1279 (1.2729) [2022-01-20 21:22:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][990/1251] eta 0:09:35 lr 0.000735 time 2.1819 (2.2047) loss 4.1110 (3.6927) grad_norm 1.1477 (1.2730) [2022-01-20 21:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1000/1251] eta 0:09:13 lr 0.000735 time 2.4582 (2.2059) loss 3.8312 (3.6926) grad_norm 1.3901 (1.2726) [2022-01-20 21:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1010/1251] eta 0:08:51 lr 0.000735 time 1.7924 (2.2050) loss 3.9345 (3.6933) grad_norm 1.1882 (1.2722) [2022-01-20 21:23:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1020/1251] eta 0:08:29 lr 
0.000735 time 2.7516 (2.2053) loss 3.5499 (3.6936) grad_norm 1.3995 (1.2729) [2022-01-20 21:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1030/1251] eta 0:08:07 lr 0.000735 time 2.6751 (2.2050) loss 3.2167 (3.6926) grad_norm 1.2063 (1.2721) [2022-01-20 21:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1040/1251] eta 0:07:45 lr 0.000735 time 3.0985 (2.2054) loss 2.9776 (3.6939) grad_norm 1.3457 (1.2725) [2022-01-20 21:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1050/1251] eta 0:07:22 lr 0.000735 time 1.5711 (2.2033) loss 4.8188 (3.6951) grad_norm 1.4669 (1.2728) [2022-01-20 21:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1060/1251] eta 0:07:00 lr 0.000735 time 2.0377 (2.2013) loss 3.5499 (3.6938) grad_norm 1.2024 (1.2728) [2022-01-20 21:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1070/1251] eta 0:06:38 lr 0.000735 time 2.6331 (2.2007) loss 4.0241 (3.6920) grad_norm 1.1032 (1.2719) [2022-01-20 21:25:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1080/1251] eta 0:06:16 lr 0.000735 time 2.4373 (2.1997) loss 3.5649 (3.6911) grad_norm 1.3281 (1.2713) [2022-01-20 21:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1090/1251] eta 0:05:53 lr 0.000735 time 1.9140 (2.1984) loss 4.0080 (3.6916) grad_norm 1.2522 (1.2710) [2022-01-20 21:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1100/1251] eta 0:05:31 lr 0.000735 time 1.9428 (2.1980) loss 4.1246 (3.6927) grad_norm 1.2372 (1.2701) [2022-01-20 21:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1110/1251] eta 0:05:09 lr 0.000735 time 2.1496 (2.1979) loss 3.9096 (3.6930) grad_norm 1.2105 (1.2699) [2022-01-20 21:27:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1120/1251] eta 0:04:48 lr 0.000735 time 2.5781 (2.2011) loss 3.1265 (3.6923) grad_norm 1.2237 (1.2690) [2022-01-20 
21:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1130/1251] eta 0:04:26 lr 0.000735 time 2.4607 (2.2031) loss 4.3512 (3.6947) grad_norm 1.3920 (1.2692) [2022-01-20 21:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1140/1251] eta 0:04:04 lr 0.000735 time 1.5300 (2.2041) loss 4.4400 (3.6939) grad_norm 1.0667 (1.2691) [2022-01-20 21:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1150/1251] eta 0:03:42 lr 0.000735 time 1.9954 (2.2026) loss 3.5164 (3.6949) grad_norm 1.5148 (1.2694) [2022-01-20 21:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1160/1251] eta 0:03:20 lr 0.000735 time 1.8840 (2.2003) loss 2.8800 (3.6972) grad_norm 1.1031 (1.2695) [2022-01-20 21:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1170/1251] eta 0:02:58 lr 0.000735 time 1.9106 (2.1995) loss 4.2032 (3.6987) grad_norm 1.3434 (1.2697) [2022-01-20 21:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1180/1251] eta 0:02:36 lr 0.000735 time 1.9980 (2.1986) loss 3.5408 (3.6984) grad_norm 1.2711 (1.2707) [2022-01-20 21:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1190/1251] eta 0:02:14 lr 0.000735 time 2.2478 (2.2002) loss 3.9974 (3.6983) grad_norm 1.2893 (1.2713) [2022-01-20 21:30:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1200/1251] eta 0:01:52 lr 0.000735 time 1.8988 (2.2016) loss 4.2147 (3.6989) grad_norm 1.4107 (1.2711) [2022-01-20 21:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1210/1251] eta 0:01:30 lr 0.000734 time 1.8688 (2.2020) loss 3.4879 (3.6992) grad_norm 1.0701 (1.2709) [2022-01-20 21:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1220/1251] eta 0:01:08 lr 0.000734 time 1.4784 (2.2007) loss 3.4458 (3.6975) grad_norm 1.0841 (1.2705) [2022-01-20 21:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1230/1251] 
eta 0:00:46 lr 0.000734 time 3.0354 (2.2002) loss 3.9134 (3.6982) grad_norm 1.2470 (1.2704) [2022-01-20 21:31:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1240/1251] eta 0:00:24 lr 0.000734 time 1.5715 (2.1984) loss 4.2689 (3.6985) grad_norm 1.1933 (1.2705) [2022-01-20 21:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [103/300][1250/1251] eta 0:00:02 lr 0.000734 time 1.1970 (2.1924) loss 3.6612 (3.6984) grad_norm 1.1618 (1.2698) [2022-01-20 21:31:48 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 103 training takes 0:45:43 [2022-01-20 21:32:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.314 (18.314) Loss 1.0247 (1.0247) Acc@1 75.977 (75.977) Acc@5 92.676 (92.676) [2022-01-20 21:32:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.264 (3.404) Loss 1.0253 (1.0854) Acc@1 75.684 (74.414) Acc@5 92.285 (92.214) [2022-01-20 21:32:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.949 (2.581) Loss 1.2002 (1.0979) Acc@1 72.070 (74.172) Acc@5 90.430 (92.225) [2022-01-20 21:33:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.317 (2.302) Loss 1.1341 (1.0978) Acc@1 70.898 (73.913) Acc@5 91.602 (92.232) [2022-01-20 21:33:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.776 (2.163) Loss 1.1227 (1.1002) Acc@1 73.633 (73.983) Acc@5 91.797 (92.221) [2022-01-20 21:33:24 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.922 Acc@5 92.216 [2022-01-20 21:33:24 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.9% [2022-01-20 21:33:24 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.98% [2022-01-20 21:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][0/1251] eta 7:28:22 lr 0.000734 time 21.5049 (21.5049) loss 3.6833 (3.6833) grad_norm 1.2427 (1.2427) [2022-01-20 21:34:09 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [104/300][10/1251] eta 1:24:00 lr 0.000734 time 2.4209 (4.0613) loss 2.9496 (3.5552) grad_norm 1.1218 (1.2691) [2022-01-20 21:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][20/1251] eta 1:09:28 lr 0.000734 time 2.0798 (3.3862) loss 3.2290 (3.4376) grad_norm 1.1198 (1.2727) [2022-01-20 21:34:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][30/1251] eta 1:01:08 lr 0.000734 time 1.8924 (3.0048) loss 3.0295 (3.5298) grad_norm 1.3051 (1.2742) [2022-01-20 21:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][40/1251] eta 0:57:47 lr 0.000734 time 3.7496 (2.8635) loss 3.1588 (3.5697) grad_norm 1.3262 (1.2877) [2022-01-20 21:35:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][50/1251] eta 0:55:03 lr 0.000734 time 1.8025 (2.7510) loss 3.9115 (3.5195) grad_norm 1.0917 (1.2768) [2022-01-20 21:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][60/1251] eta 0:52:20 lr 0.000734 time 1.8075 (2.6372) loss 3.5417 (3.5750) grad_norm 1.0748 (1.2653) [2022-01-20 21:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][70/1251] eta 0:50:11 lr 0.000734 time 1.9581 (2.5498) loss 3.1055 (3.5757) grad_norm 1.3371 (1.2760) [2022-01-20 21:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][80/1251] eta 0:48:42 lr 0.000734 time 2.5024 (2.4954) loss 4.2794 (3.6065) grad_norm 1.6352 (1.2825) [2022-01-20 21:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][90/1251] eta 0:47:40 lr 0.000734 time 2.2070 (2.4639) loss 4.1668 (3.6111) grad_norm 1.3519 (1.2926) [2022-01-20 21:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][100/1251] eta 0:46:51 lr 0.000734 time 1.6134 (2.4430) loss 3.9835 (3.6314) grad_norm 1.2275 (1.2950) [2022-01-20 21:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][110/1251] eta 0:46:00 lr 0.000734 time 2.0099 (2.4194) loss 3.0155 (3.6372) grad_norm 
1.3331 (1.2943) [2022-01-20 21:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][120/1251] eta 0:45:08 lr 0.000734 time 1.8877 (2.3948) loss 2.9642 (3.6362) grad_norm 1.4341 (1.2908) [2022-01-20 21:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][130/1251] eta 0:44:28 lr 0.000734 time 1.9857 (2.3805) loss 4.4388 (3.6606) grad_norm 1.2310 (1.2859) [2022-01-20 21:38:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][140/1251] eta 0:43:50 lr 0.000734 time 3.1224 (2.3679) loss 2.3953 (3.6548) grad_norm 1.1745 (1.2840) [2022-01-20 21:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][150/1251] eta 0:43:19 lr 0.000734 time 2.2948 (2.3610) loss 3.0749 (3.6602) grad_norm 1.1460 (1.2837) [2022-01-20 21:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][160/1251] eta 0:42:49 lr 0.000734 time 1.6257 (2.3555) loss 3.8334 (3.6790) grad_norm 1.0258 (1.2803) [2022-01-20 21:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][170/1251] eta 0:42:11 lr 0.000734 time 1.5254 (2.3422) loss 4.4195 (3.6891) grad_norm 1.5452 (1.2783) [2022-01-20 21:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][180/1251] eta 0:41:35 lr 0.000734 time 2.8812 (2.3298) loss 3.7023 (3.6878) grad_norm 1.3696 (1.2796) [2022-01-20 21:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][190/1251] eta 0:40:58 lr 0.000734 time 1.6416 (2.3171) loss 4.0566 (3.6861) grad_norm 1.3555 (1.2776) [2022-01-20 21:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][200/1251] eta 0:40:17 lr 0.000734 time 1.9573 (2.3005) loss 3.2120 (3.6917) grad_norm 1.2382 (1.2757) [2022-01-20 21:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][210/1251] eta 0:39:44 lr 0.000734 time 2.1134 (2.2905) loss 3.2956 (3.6872) grad_norm 1.2012 (1.2757) [2022-01-20 21:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[104/300][220/1251] eta 0:39:15 lr 0.000734 time 1.9712 (2.2848) loss 4.5751 (3.6834) grad_norm 1.4011 (1.2764) [2022-01-20 21:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][230/1251] eta 0:38:53 lr 0.000733 time 1.5525 (2.2854) loss 4.4352 (3.6932) grad_norm 1.5276 (1.2771) [2022-01-20 21:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][240/1251] eta 0:38:38 lr 0.000733 time 2.5609 (2.2935) loss 3.2400 (3.6865) grad_norm 1.3187 (1.2802) [2022-01-20 21:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][250/1251] eta 0:38:19 lr 0.000733 time 2.4902 (2.2976) loss 3.5929 (3.6920) grad_norm 1.1014 (1.2792) [2022-01-20 21:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][260/1251] eta 0:37:53 lr 0.000733 time 1.5631 (2.2938) loss 2.7954 (3.6866) grad_norm 1.3257 (1.2793) [2022-01-20 21:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][270/1251] eta 0:37:17 lr 0.000733 time 1.7840 (2.2811) loss 4.4094 (3.6913) grad_norm 1.1832 (1.2778) [2022-01-20 21:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][280/1251] eta 0:36:45 lr 0.000733 time 2.0595 (2.2713) loss 3.2253 (3.6949) grad_norm 1.1798 (1.2745) [2022-01-20 21:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][290/1251] eta 0:36:22 lr 0.000733 time 1.6971 (2.2712) loss 3.9756 (3.7022) grad_norm 1.1786 (1.2769) [2022-01-20 21:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][300/1251] eta 0:35:54 lr 0.000733 time 1.9514 (2.2651) loss 4.0374 (3.7001) grad_norm 1.2082 (1.2755) [2022-01-20 21:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][310/1251] eta 0:35:27 lr 0.000733 time 2.1711 (2.2609) loss 4.6194 (3.7001) grad_norm 1.3656 (1.2770) [2022-01-20 21:45:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][320/1251] eta 0:35:00 lr 0.000733 time 1.9141 (2.2559) loss 3.6420 (3.6904) grad_norm 
1.3132 (1.2774) [2022-01-20 21:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][330/1251] eta 0:34:38 lr 0.000733 time 1.9492 (2.2572) loss 4.2596 (3.6910) grad_norm 1.2475 (1.2761) [2022-01-20 21:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][340/1251] eta 0:34:17 lr 0.000733 time 2.0503 (2.2586) loss 3.9656 (3.6887) grad_norm 1.0795 (1.2741) [2022-01-20 21:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][350/1251] eta 0:33:59 lr 0.000733 time 2.4727 (2.2635) loss 2.9736 (3.6806) grad_norm 1.3058 (1.2749) [2022-01-20 21:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][360/1251] eta 0:33:37 lr 0.000733 time 2.1174 (2.2640) loss 4.2706 (3.6762) grad_norm 1.1533 (1.2739) [2022-01-20 21:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][370/1251] eta 0:33:10 lr 0.000733 time 1.9411 (2.2597) loss 4.1086 (3.6871) grad_norm 1.4847 (1.2758) [2022-01-20 21:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][380/1251] eta 0:32:44 lr 0.000733 time 2.0749 (2.2556) loss 3.1980 (3.6969) grad_norm 1.3320 (1.2758) [2022-01-20 21:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][390/1251] eta 0:32:16 lr 0.000733 time 1.9404 (2.2494) loss 4.0648 (3.7002) grad_norm 1.4584 (1.2748) [2022-01-20 21:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][400/1251] eta 0:31:53 lr 0.000733 time 1.9030 (2.2481) loss 3.7547 (3.6962) grad_norm 1.4840 (1.2765) [2022-01-20 21:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][410/1251] eta 0:31:31 lr 0.000733 time 2.1458 (2.2486) loss 4.0079 (3.6923) grad_norm 1.3941 (1.2792) [2022-01-20 21:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][420/1251] eta 0:31:06 lr 0.000733 time 1.9082 (2.2463) loss 3.8502 (3.6884) grad_norm 1.2566 (1.2788) [2022-01-20 21:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[104/300][430/1251] eta 0:30:43 lr 0.000733 time 1.5586 (2.2456) loss 4.0070 (3.6846) grad_norm 1.1459 (1.2766) [2022-01-20 21:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][440/1251] eta 0:30:20 lr 0.000733 time 2.1882 (2.2445) loss 2.6443 (3.6813) grad_norm 1.5535 (1.2764) [2022-01-20 21:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][450/1251] eta 0:29:56 lr 0.000733 time 2.6432 (2.2431) loss 3.8695 (3.6790) grad_norm 1.0552 (1.2772) [2022-01-20 21:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][460/1251] eta 0:29:31 lr 0.000733 time 1.5315 (2.2390) loss 3.1453 (3.6775) grad_norm 1.5130 (1.2779) [2022-01-20 21:50:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][470/1251] eta 0:29:06 lr 0.000733 time 1.7937 (2.2357) loss 3.6225 (3.6728) grad_norm 1.2179 (1.2771) [2022-01-20 21:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][480/1251] eta 0:28:42 lr 0.000733 time 2.1564 (2.2343) loss 4.3017 (3.6793) grad_norm 1.1751 (1.2763) [2022-01-20 21:51:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][490/1251] eta 0:28:18 lr 0.000733 time 1.9034 (2.2325) loss 4.2520 (3.6856) grad_norm 1.3366 (1.2769) [2022-01-20 21:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][500/1251] eta 0:27:56 lr 0.000732 time 1.5935 (2.2325) loss 3.8671 (3.6913) grad_norm 1.1275 (1.2772) [2022-01-20 21:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][510/1251] eta 0:27:32 lr 0.000732 time 2.1900 (2.2297) loss 4.4872 (3.6972) grad_norm 1.4569 (1.2771) [2022-01-20 21:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][520/1251] eta 0:27:09 lr 0.000732 time 2.8288 (2.2290) loss 4.3758 (3.6987) grad_norm 1.3169 (1.2797) [2022-01-20 21:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][530/1251] eta 0:26:49 lr 0.000732 time 2.2284 (2.2322) loss 3.3531 (3.7020) grad_norm 
1.2630 (1.2808) [2022-01-20 21:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][540/1251] eta 0:26:30 lr 0.000732 time 1.7505 (2.2367) loss 4.1231 (3.6990) grad_norm 1.0900 (1.2808) [2022-01-20 21:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][550/1251] eta 0:26:09 lr 0.000732 time 2.1602 (2.2383) loss 4.2210 (3.7036) grad_norm 1.1528 (1.2795) [2022-01-20 21:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][560/1251] eta 0:25:46 lr 0.000732 time 2.8873 (2.2379) loss 2.7023 (3.7031) grad_norm 1.2563 (1.2793) [2022-01-20 21:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][570/1251] eta 0:25:20 lr 0.000732 time 1.8806 (2.2334) loss 4.1440 (3.7028) grad_norm 1.2863 (1.2797) [2022-01-20 21:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][580/1251] eta 0:24:55 lr 0.000732 time 1.9401 (2.2287) loss 3.7313 (3.7006) grad_norm 1.1741 (1.2803) [2022-01-20 21:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][590/1251] eta 0:24:31 lr 0.000732 time 2.5237 (2.2258) loss 3.8878 (3.6973) grad_norm 1.3366 (1.2795) [2022-01-20 21:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][600/1251] eta 0:24:09 lr 0.000732 time 2.3580 (2.2262) loss 3.5060 (3.6942) grad_norm 1.2504 (1.2791) [2022-01-20 21:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][610/1251] eta 0:23:47 lr 0.000732 time 2.1315 (2.2262) loss 3.7395 (3.7010) grad_norm 1.3468 (1.2804) [2022-01-20 21:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][620/1251] eta 0:23:24 lr 0.000732 time 1.5928 (2.2259) loss 3.8823 (3.7029) grad_norm 1.3231 (1.2810) [2022-01-20 21:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][630/1251] eta 0:23:04 lr 0.000732 time 3.1298 (2.2292) loss 2.9061 (3.7029) grad_norm 1.2058 (1.2798) [2022-01-20 21:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[104/300][640/1251] eta 0:22:43 lr 0.000732 time 2.8403 (2.2323) loss 3.4160 (3.6977) grad_norm 1.2271 (1.2794) [2022-01-20 21:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][650/1251] eta 0:22:19 lr 0.000732 time 1.6867 (2.2293) loss 2.8193 (3.6926) grad_norm 1.2408 (1.2788) [2022-01-20 21:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][660/1251] eta 0:21:55 lr 0.000732 time 1.7095 (2.2255) loss 4.4027 (3.6882) grad_norm 1.1288 (1.2789) [2022-01-20 21:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][670/1251] eta 0:21:31 lr 0.000732 time 2.2323 (2.2225) loss 2.9641 (3.6894) grad_norm 1.2580 (1.2794) [2022-01-20 21:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][680/1251] eta 0:21:09 lr 0.000732 time 2.1768 (2.2224) loss 3.8936 (3.6929) grad_norm 1.1545 (1.2782) [2022-01-20 21:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][690/1251] eta 0:20:47 lr 0.000732 time 1.4544 (2.2232) loss 3.7204 (3.6934) grad_norm 1.2210 (1.2773) [2022-01-20 21:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][700/1251] eta 0:20:25 lr 0.000732 time 2.0180 (2.2235) loss 3.8034 (3.6912) grad_norm 1.3539 (1.2773) [2022-01-20 21:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][710/1251] eta 0:20:04 lr 0.000732 time 2.3585 (2.2268) loss 3.5086 (3.6912) grad_norm 1.2113 (1.2768) [2022-01-20 22:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][720/1251] eta 0:19:41 lr 0.000732 time 1.9510 (2.2252) loss 3.9281 (3.6891) grad_norm 1.2783 (1.2764) [2022-01-20 22:00:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][730/1251] eta 0:19:17 lr 0.000732 time 1.8931 (2.2217) loss 3.5209 (3.6913) grad_norm 1.7599 (1.2766) [2022-01-20 22:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][740/1251] eta 0:18:53 lr 0.000732 time 1.9071 (2.2186) loss 2.6517 (3.6904) grad_norm 
1.2535 (1.2760) [2022-01-20 22:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][750/1251] eta 0:18:32 lr 0.000732 time 2.8187 (2.2196) loss 3.4222 (3.6936) grad_norm 1.1737 (1.2757) [2022-01-20 22:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][760/1251] eta 0:18:09 lr 0.000732 time 2.4183 (2.2180) loss 2.7217 (3.6918) grad_norm 1.2496 (1.2774) [2022-01-20 22:01:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][770/1251] eta 0:17:45 lr 0.000731 time 2.1248 (2.2160) loss 4.0061 (3.6918) grad_norm 1.1261 (1.2770) [2022-01-20 22:02:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][780/1251] eta 0:17:23 lr 0.000731 time 2.2392 (2.2146) loss 4.2015 (3.6938) grad_norm 1.2724 (1.2762) [2022-01-20 22:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][790/1251] eta 0:17:00 lr 0.000731 time 1.8003 (2.2144) loss 3.7057 (3.6937) grad_norm 1.2953 (1.2757) [2022-01-20 22:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][800/1251] eta 0:16:38 lr 0.000731 time 2.1456 (2.2137) loss 4.2162 (3.6965) grad_norm 1.1480 (1.2756) [2022-01-20 22:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][810/1251] eta 0:16:16 lr 0.000731 time 2.7592 (2.2135) loss 4.0766 (3.6959) grad_norm 1.2891 (1.2758) [2022-01-20 22:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][820/1251] eta 0:15:53 lr 0.000731 time 2.1975 (2.2121) loss 3.7893 (3.6963) grad_norm 1.2322 (1.2759) [2022-01-20 22:04:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][830/1251] eta 0:15:31 lr 0.000731 time 1.7080 (2.2132) loss 4.4461 (3.6968) grad_norm 1.3492 (1.2758) [2022-01-20 22:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][840/1251] eta 0:15:09 lr 0.000731 time 1.7336 (2.2138) loss 4.1761 (3.6974) grad_norm 1.2858 (1.2768) [2022-01-20 22:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[104/300][850/1251] eta 0:14:48 lr 0.000731 time 2.8824 (2.2163) loss 3.9562 (3.6969) grad_norm 1.1940 (1.2771) [2022-01-20 22:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][860/1251] eta 0:14:26 lr 0.000731 time 2.2142 (2.2161) loss 3.3573 (3.6947) grad_norm 1.1183 (1.2765) [2022-01-20 22:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][870/1251] eta 0:14:04 lr 0.000731 time 1.8653 (2.2169) loss 3.6064 (3.6928) grad_norm 1.3627 (1.2756) [2022-01-20 22:05:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][880/1251] eta 0:13:41 lr 0.000731 time 1.7318 (2.2146) loss 4.3571 (3.6945) grad_norm 1.2989 (1.2752) [2022-01-20 22:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][890/1251] eta 0:13:19 lr 0.000731 time 2.3128 (2.2139) loss 4.0054 (3.6939) grad_norm 1.1473 (1.2739) [2022-01-20 22:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][900/1251] eta 0:12:56 lr 0.000731 time 1.7847 (2.2126) loss 4.0093 (3.6942) grad_norm 1.0910 (1.2732) [2022-01-20 22:07:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][910/1251] eta 0:12:34 lr 0.000731 time 2.4877 (2.2130) loss 4.3475 (3.6955) grad_norm 1.3647 (1.2732) [2022-01-20 22:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][920/1251] eta 0:12:11 lr 0.000731 time 2.2014 (2.2102) loss 2.4706 (3.6934) grad_norm 1.1137 (1.2727) [2022-01-20 22:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][930/1251] eta 0:11:49 lr 0.000731 time 2.3714 (2.2089) loss 2.9736 (3.6900) grad_norm 1.3540 (1.2723) [2022-01-20 22:08:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][940/1251] eta 0:11:26 lr 0.000731 time 2.2322 (2.2085) loss 3.3396 (3.6880) grad_norm 1.3195 (1.2726) [2022-01-20 22:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][950/1251] eta 0:11:04 lr 0.000731 time 2.5305 (2.2084) loss 2.9798 (3.6880) grad_norm 
1.1642 (1.2725) [2022-01-20 22:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][960/1251] eta 0:10:42 lr 0.000731 time 2.1197 (2.2090) loss 3.6714 (3.6875) grad_norm 1.1592 (1.2717) [2022-01-20 22:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][970/1251] eta 0:10:21 lr 0.000731 time 2.8407 (2.2106) loss 3.1750 (3.6856) grad_norm 1.4844 (1.2722) [2022-01-20 22:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][980/1251] eta 0:09:58 lr 0.000731 time 1.9845 (2.2097) loss 3.5768 (3.6856) grad_norm 1.0535 (1.2727) [2022-01-20 22:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][990/1251] eta 0:09:36 lr 0.000731 time 2.8865 (2.2099) loss 4.5033 (3.6870) grad_norm 1.5075 (1.2730) [2022-01-20 22:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1000/1251] eta 0:09:14 lr 0.000731 time 2.3255 (2.2096) loss 3.6850 (3.6877) grad_norm 1.2983 (1.2727) [2022-01-20 22:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1010/1251] eta 0:08:52 lr 0.000731 time 2.7598 (2.2098) loss 3.1017 (3.6869) grad_norm 1.2386 (1.2725) [2022-01-20 22:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1020/1251] eta 0:08:30 lr 0.000731 time 1.8081 (2.2100) loss 4.2304 (3.6848) grad_norm 1.1516 (1.2722) [2022-01-20 22:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1030/1251] eta 0:08:08 lr 0.000731 time 2.7451 (2.2100) loss 4.4275 (3.6868) grad_norm 1.1391 (1.2719) [2022-01-20 22:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1040/1251] eta 0:07:46 lr 0.000731 time 1.9077 (2.2097) loss 3.8714 (3.6900) grad_norm 1.6493 (1.2721) [2022-01-20 22:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1050/1251] eta 0:07:23 lr 0.000730 time 1.9361 (2.2074) loss 4.0039 (3.6891) grad_norm 1.1331 (1.2718) [2022-01-20 22:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[104/300][1060/1251] eta 0:07:01 lr 0.000730 time 1.8181 (2.2076) loss 3.6567 (3.6902) grad_norm 1.2904 (1.2714) [2022-01-20 22:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1070/1251] eta 0:06:39 lr 0.000730 time 2.5082 (2.2075) loss 3.9044 (3.6907) grad_norm 1.4563 (1.2713) [2022-01-20 22:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1080/1251] eta 0:06:17 lr 0.000730 time 2.2368 (2.2079) loss 3.8960 (3.6899) grad_norm 1.3823 (1.2711) [2022-01-20 22:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1090/1251] eta 0:05:55 lr 0.000730 time 1.8879 (2.2082) loss 3.1210 (3.6894) grad_norm 1.3114 (1.2711) [2022-01-20 22:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1100/1251] eta 0:05:33 lr 0.000730 time 1.8662 (2.2090) loss 4.1238 (3.6886) grad_norm 1.2912 (1.2706) [2022-01-20 22:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1110/1251] eta 0:05:11 lr 0.000730 time 1.7584 (2.2080) loss 3.8928 (3.6856) grad_norm 1.1516 (1.2702) [2022-01-20 22:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1120/1251] eta 0:04:49 lr 0.000730 time 1.9125 (2.2072) loss 3.5931 (3.6859) grad_norm 1.1166 (1.2696) [2022-01-20 22:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1130/1251] eta 0:04:26 lr 0.000730 time 1.8868 (2.2061) loss 3.1440 (3.6857) grad_norm 1.2316 (1.2695) [2022-01-20 22:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1140/1251] eta 0:04:05 lr 0.000730 time 1.9777 (2.2072) loss 3.6398 (3.6866) grad_norm 1.4325 (1.2690) [2022-01-20 22:15:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1150/1251] eta 0:03:42 lr 0.000730 time 1.9778 (2.2060) loss 3.8409 (3.6884) grad_norm 1.0874 (1.2687) [2022-01-20 22:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1160/1251] eta 0:03:20 lr 0.000730 time 1.8410 (2.2057) loss 4.4269 (3.6871) 
grad_norm 1.2462 (1.2688) [2022-01-20 22:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1170/1251] eta 0:02:58 lr 0.000730 time 2.0936 (2.2058) loss 3.9005 (3.6886) grad_norm 1.3393 (1.2692) [2022-01-20 22:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1180/1251] eta 0:02:36 lr 0.000730 time 2.1941 (2.2049) loss 3.0928 (3.6876) grad_norm 1.0974 (1.2694) [2022-01-20 22:17:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1190/1251] eta 0:02:14 lr 0.000730 time 2.1798 (2.2040) loss 3.6355 (3.6891) grad_norm 1.2921 (1.2697) [2022-01-20 22:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1200/1251] eta 0:01:52 lr 0.000730 time 2.1554 (2.2027) loss 2.5207 (3.6887) grad_norm 1.4386 (1.2703) [2022-01-20 22:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1210/1251] eta 0:01:30 lr 0.000730 time 1.8475 (2.2027) loss 4.4276 (3.6906) grad_norm 1.3698 (1.2703) [2022-01-20 22:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1220/1251] eta 0:01:08 lr 0.000730 time 2.3590 (2.2029) loss 4.2654 (3.6913) grad_norm 1.1604 (1.2701) [2022-01-20 22:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1230/1251] eta 0:00:46 lr 0.000730 time 2.2151 (2.2027) loss 3.7258 (3.6904) grad_norm 1.1995 (1.2700) [2022-01-20 22:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1240/1251] eta 0:00:24 lr 0.000730 time 1.2820 (2.2008) loss 4.2278 (3.6899) grad_norm 1.5330 (1.2707) [2022-01-20 22:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [104/300][1250/1251] eta 0:00:02 lr 0.000730 time 1.1931 (2.1953) loss 3.9246 (3.6887) grad_norm 1.0848 (1.2699) [2022-01-20 22:19:11 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 104 training takes 0:45:46 [2022-01-20 22:19:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.589 (18.589) Loss 1.0873 (1.0873) Acc@1 74.414 (74.414) 
Acc@5 93.457 (93.457) [2022-01-20 22:19:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.231 (3.330) Loss 1.1863 (1.1285) Acc@1 72.949 (72.994) Acc@5 91.797 (92.374) [2022-01-20 22:20:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.212 (2.683) Loss 1.1436 (1.1240) Acc@1 74.805 (73.479) Acc@5 91.211 (92.267) [2022-01-20 22:20:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.614 (2.335) Loss 1.0768 (1.1211) Acc@1 73.828 (73.551) Acc@5 93.457 (92.269) [2022-01-20 22:20:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.760 (2.185) Loss 1.2278 (1.1182) Acc@1 71.582 (73.676) Acc@5 90.039 (92.235) [2022-01-20 22:20:48 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 73.682 Acc@5 92.208 [2022-01-20 22:20:48 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 73.7% [2022-01-20 22:20:48 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 73.98% [2022-01-20 22:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][0/1251] eta 7:24:11 lr 0.000730 time 21.3042 (21.3042) loss 3.7641 (3.7641) grad_norm 1.3067 (1.3067) [2022-01-20 22:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][10/1251] eta 1:20:24 lr 0.000730 time 1.9451 (3.8880) loss 3.9858 (3.9229) grad_norm 1.2011 (1.1682) [2022-01-20 22:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][20/1251] eta 1:05:09 lr 0.000730 time 1.8656 (3.1758) loss 3.1726 (3.8608) grad_norm 1.3397 (1.2063) [2022-01-20 22:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][30/1251] eta 0:56:55 lr 0.000730 time 1.9338 (2.7975) loss 3.4127 (3.7948) grad_norm 1.4725 (1.2587) [2022-01-20 22:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][40/1251] eta 0:55:13 lr 0.000730 time 6.6171 (2.7366) loss 4.1994 (3.8076) grad_norm 1.1401 (1.2613) [2022-01-20 22:23:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][50/1251] eta 0:52:39 lr 0.000730 time 1.7787 (2.6308) loss 3.7016 (3.7094) grad_norm 1.2531 (1.2799) [2022-01-20 22:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][60/1251] eta 0:50:28 lr 0.000730 time 1.9378 (2.5430) loss 4.1290 (3.6422) grad_norm 1.2297 (1.2795) [2022-01-20 22:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][70/1251] eta 0:49:03 lr 0.000729 time 1.9842 (2.4923) loss 4.0860 (3.6567) grad_norm 1.1632 (1.2806) [2022-01-20 22:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][80/1251] eta 0:48:17 lr 0.000729 time 3.8461 (2.4747) loss 3.8271 (3.6221) grad_norm 1.2935 (1.2810) [2022-01-20 22:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][90/1251] eta 0:47:49 lr 0.000729 time 2.4250 (2.4712) loss 3.5692 (3.6028) grad_norm 1.2282 (1.2954) [2022-01-20 22:24:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][100/1251] eta 0:46:47 lr 0.000729 time 1.8769 (2.4393) loss 4.1496 (3.6111) grad_norm 1.3982 (1.2984) [2022-01-20 22:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][110/1251] eta 0:45:45 lr 0.000729 time 1.7281 (2.4058) loss 3.3344 (3.6117) grad_norm 1.4863 (1.2970) [2022-01-20 22:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][120/1251] eta 0:45:04 lr 0.000729 time 2.6694 (2.3911) loss 4.1811 (3.6095) grad_norm 1.3349 (1.2987) [2022-01-20 22:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][130/1251] eta 0:44:27 lr 0.000729 time 1.6849 (2.3794) loss 3.4031 (3.6225) grad_norm 1.6070 (1.2924) [2022-01-20 22:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][140/1251] eta 0:43:42 lr 0.000729 time 1.9142 (2.3601) loss 4.5174 (3.6247) grad_norm 1.4453 (1.2902) [2022-01-20 22:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][150/1251] eta 0:43:00 lr 0.000729 
time 1.9602 (2.3436) loss 4.3215 (3.6364) grad_norm 1.0922 (1.2861) [2022-01-20 22:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][160/1251] eta 0:42:27 lr 0.000729 time 2.7517 (2.3351) loss 3.3200 (3.6261) grad_norm 1.1627 (1.2836) [2022-01-20 22:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][170/1251] eta 0:41:57 lr 0.000729 time 1.9088 (2.3286) loss 3.9238 (3.6403) grad_norm 1.4238 (1.2834) [2022-01-20 22:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][180/1251] eta 0:41:20 lr 0.000729 time 1.5122 (2.3157) loss 4.4668 (3.6477) grad_norm 1.6162 (1.2819) [2022-01-20 22:28:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][190/1251] eta 0:40:50 lr 0.000729 time 2.2099 (2.3094) loss 3.5501 (3.6407) grad_norm 1.2343 (1.2795) [2022-01-20 22:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][200/1251] eta 0:40:24 lr 0.000729 time 3.1285 (2.3073) loss 4.2022 (3.6323) grad_norm 1.2389 (1.2780) [2022-01-20 22:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][210/1251] eta 0:39:54 lr 0.000729 time 1.9589 (2.2999) loss 3.6604 (3.6496) grad_norm 1.1079 (1.2793) [2022-01-20 22:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][220/1251] eta 0:39:26 lr 0.000729 time 1.9516 (2.2954) loss 3.6242 (3.6499) grad_norm 1.1164 (1.2786) [2022-01-20 22:29:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][230/1251] eta 0:39:00 lr 0.000729 time 1.9138 (2.2921) loss 4.0594 (3.6433) grad_norm 1.4460 (1.2781) [2022-01-20 22:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][240/1251] eta 0:38:34 lr 0.000729 time 2.7675 (2.2892) loss 4.0559 (3.6565) grad_norm 1.2060 (1.2756) [2022-01-20 22:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][250/1251] eta 0:38:09 lr 0.000729 time 1.8343 (2.2874) loss 3.7656 (3.6483) grad_norm 1.2918 (1.2782) [2022-01-20 22:30:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][260/1251] eta 0:37:42 lr 0.000729 time 1.5915 (2.2832) loss 3.5044 (3.6369) grad_norm 1.2541 (1.2763) [2022-01-20 22:31:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][270/1251] eta 0:37:13 lr 0.000729 time 1.8378 (2.2772) loss 3.8493 (3.6447) grad_norm 1.1509 (1.2747) [2022-01-20 22:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][280/1251] eta 0:36:47 lr 0.000729 time 2.2138 (2.2732) loss 3.1105 (3.6440) grad_norm 1.1523 (1.2742) [2022-01-20 22:31:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][290/1251] eta 0:36:13 lr 0.000729 time 1.8447 (2.2617) loss 2.6366 (3.6493) grad_norm 1.1803 (1.2749) [2022-01-20 22:32:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][300/1251] eta 0:35:44 lr 0.000729 time 1.9351 (2.2549) loss 4.1968 (3.6569) grad_norm 1.2296 (1.2770) [2022-01-20 22:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][310/1251] eta 0:35:19 lr 0.000729 time 1.8881 (2.2521) loss 2.8409 (3.6547) grad_norm 1.4995 (1.2773) [2022-01-20 22:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][320/1251] eta 0:34:51 lr 0.000729 time 1.9252 (2.2467) loss 4.0375 (3.6523) grad_norm 1.2601 (1.2784) [2022-01-20 22:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][330/1251] eta 0:34:26 lr 0.000729 time 1.9263 (2.2436) loss 4.2591 (3.6520) grad_norm 1.1462 (1.2801) [2022-01-20 22:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][340/1251] eta 0:34:06 lr 0.000728 time 2.8031 (2.2465) loss 2.9421 (3.6520) grad_norm 1.2381 (1.2787) [2022-01-20 22:33:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][350/1251] eta 0:33:47 lr 0.000728 time 1.8308 (2.2501) loss 3.6407 (3.6444) grad_norm 1.3760 (1.2778) [2022-01-20 22:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][360/1251] eta 0:33:24 lr 
0.000728 time 1.9210 (2.2495) loss 4.4829 (3.6406) grad_norm 1.2765 (1.2767) [2022-01-20 22:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][370/1251] eta 0:33:01 lr 0.000728 time 1.8564 (2.2488) loss 4.2304 (3.6425) grad_norm 1.1093 (1.2757) [2022-01-20 22:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][380/1251] eta 0:32:36 lr 0.000728 time 2.1488 (2.2466) loss 3.8433 (3.6422) grad_norm 1.1428 (1.2757) [2022-01-20 22:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][390/1251] eta 0:32:09 lr 0.000728 time 1.8963 (2.2411) loss 3.6683 (3.6547) grad_norm 1.2863 (1.2762) [2022-01-20 22:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][400/1251] eta 0:31:41 lr 0.000728 time 1.6496 (2.2349) loss 3.6183 (3.6500) grad_norm 1.4106 (1.2760) [2022-01-20 22:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][410/1251] eta 0:31:20 lr 0.000728 time 2.2740 (2.2355) loss 4.3200 (3.6564) grad_norm 1.4128 (1.2751) [2022-01-20 22:36:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][420/1251] eta 0:30:58 lr 0.000728 time 2.6297 (2.2370) loss 3.2509 (3.6547) grad_norm 1.0426 (1.2745) [2022-01-20 22:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][430/1251] eta 0:30:36 lr 0.000728 time 2.1214 (2.2374) loss 2.4525 (3.6535) grad_norm 1.3424 (1.2741) [2022-01-20 22:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][440/1251] eta 0:30:15 lr 0.000728 time 2.2617 (2.2387) loss 4.6174 (3.6596) grad_norm 1.6540 (1.2750) [2022-01-20 22:37:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][450/1251] eta 0:29:51 lr 0.000728 time 2.0931 (2.2371) loss 4.1061 (3.6541) grad_norm 1.1550 (1.2760) [2022-01-20 22:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][460/1251] eta 0:29:26 lr 0.000728 time 1.6330 (2.2334) loss 3.7964 (3.6519) grad_norm 1.1995 (1.2780) [2022-01-20 22:38:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][470/1251] eta 0:29:01 lr 0.000728 time 1.8766 (2.2300) loss 4.2652 (3.6530) grad_norm 1.1946 (1.2773) [2022-01-20 22:38:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][480/1251] eta 0:28:41 lr 0.000728 time 3.3588 (2.2324) loss 3.6416 (3.6574) grad_norm 1.1096 (1.2753) [2022-01-20 22:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][490/1251] eta 0:28:18 lr 0.000728 time 1.9286 (2.2322) loss 3.0978 (3.6563) grad_norm 1.1514 (1.2735) [2022-01-20 22:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][500/1251] eta 0:27:54 lr 0.000728 time 1.5995 (2.2293) loss 3.0194 (3.6578) grad_norm 1.2798 (1.2747) [2022-01-20 22:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][510/1251] eta 0:27:29 lr 0.000728 time 1.8604 (2.2261) loss 3.0365 (3.6545) grad_norm 1.2070 (1.2741) [2022-01-20 22:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][520/1251] eta 0:27:07 lr 0.000728 time 1.9753 (2.2263) loss 4.2576 (3.6591) grad_norm 1.4000 (1.2765) [2022-01-20 22:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][530/1251] eta 0:26:43 lr 0.000728 time 2.1637 (2.2241) loss 2.8531 (3.6601) grad_norm 1.3298 (1.2753) [2022-01-20 22:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][540/1251] eta 0:26:21 lr 0.000728 time 2.7383 (2.2239) loss 2.8193 (3.6583) grad_norm 1.4405 (1.2772) [2022-01-20 22:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][550/1251] eta 0:25:59 lr 0.000728 time 2.8665 (2.2242) loss 4.0643 (3.6596) grad_norm 2.0701 (1.2785) [2022-01-20 22:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][560/1251] eta 0:25:35 lr 0.000728 time 1.7169 (2.2215) loss 4.3698 (3.6593) grad_norm 1.1730 (1.2776) [2022-01-20 22:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][570/1251] eta 0:25:11 lr 
0.000728 time 2.1541 (2.2192) loss 3.1613 (3.6614) grad_norm 1.3185 (1.2767) [2022-01-20 22:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][580/1251] eta 0:24:49 lr 0.000728 time 3.4866 (2.2197) loss 3.8000 (3.6605) grad_norm 1.0655 (1.2774) [2022-01-20 22:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][590/1251] eta 0:24:26 lr 0.000728 time 2.3730 (2.2183) loss 4.1116 (3.6572) grad_norm 1.1619 (1.2777) [2022-01-20 22:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][600/1251] eta 0:24:03 lr 0.000728 time 1.4740 (2.2175) loss 3.6908 (3.6566) grad_norm 1.3482 (1.2801) [2022-01-20 22:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][610/1251] eta 0:23:41 lr 0.000727 time 2.5655 (2.2173) loss 3.4964 (3.6578) grad_norm 1.4393 (1.2819) [2022-01-20 22:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][620/1251] eta 0:23:22 lr 0.000727 time 3.6435 (2.2231) loss 3.7667 (3.6567) grad_norm 1.4465 (1.2815) [2022-01-20 22:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][630/1251] eta 0:23:00 lr 0.000727 time 2.2879 (2.2235) loss 4.6523 (3.6594) grad_norm 1.4893 (1.2826) [2022-01-20 22:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][640/1251] eta 0:22:37 lr 0.000727 time 1.5504 (2.2211) loss 3.1870 (3.6576) grad_norm 1.0886 (1.2824) [2022-01-20 22:44:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][650/1251] eta 0:22:12 lr 0.000727 time 1.8502 (2.2175) loss 4.4008 (3.6588) grad_norm 1.1209 (1.2825) [2022-01-20 22:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][660/1251] eta 0:21:50 lr 0.000727 time 2.5131 (2.2167) loss 4.0749 (3.6613) grad_norm 1.1215 (1.2819) [2022-01-20 22:45:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][670/1251] eta 0:21:26 lr 0.000727 time 1.9475 (2.2139) loss 4.3251 (3.6613) grad_norm 1.2965 (1.2821) [2022-01-20 22:45:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][680/1251] eta 0:21:05 lr 0.000727 time 2.4805 (2.2159) loss 3.5642 (3.6627) grad_norm 1.1899 (1.2821) [2022-01-20 22:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][690/1251] eta 0:20:42 lr 0.000727 time 1.5852 (2.2157) loss 4.3176 (3.6640) grad_norm 1.2235 (1.2812) [2022-01-20 22:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][700/1251] eta 0:20:22 lr 0.000727 time 3.0575 (2.2184) loss 4.2129 (3.6635) grad_norm 1.4400 (1.2809) [2022-01-20 22:47:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][710/1251] eta 0:19:59 lr 0.000727 time 2.0730 (2.2178) loss 3.8711 (3.6624) grad_norm 1.1527 (1.2813) [2022-01-20 22:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][720/1251] eta 0:19:36 lr 0.000727 time 1.8378 (2.2163) loss 4.2084 (3.6651) grad_norm 1.6259 (1.2817) [2022-01-20 22:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][730/1251] eta 0:19:12 lr 0.000727 time 1.8864 (2.2130) loss 2.9925 (3.6699) grad_norm 1.4149 (1.2822) [2022-01-20 22:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][740/1251] eta 0:18:50 lr 0.000727 time 2.5405 (2.2119) loss 3.8847 (3.6725) grad_norm 1.3285 (1.2824) [2022-01-20 22:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][750/1251] eta 0:18:28 lr 0.000727 time 1.8398 (2.2132) loss 4.4251 (3.6756) grad_norm 1.1402 (1.2828) [2022-01-20 22:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][760/1251] eta 0:18:07 lr 0.000727 time 1.8673 (2.2144) loss 3.9355 (3.6749) grad_norm 1.4999 (1.2831) [2022-01-20 22:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][770/1251] eta 0:17:45 lr 0.000727 time 1.8852 (2.2151) loss 2.5528 (3.6735) grad_norm 1.4589 (1.2836) [2022-01-20 22:49:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][780/1251] eta 0:17:23 lr 
0.000727 time 2.2327 (2.2158) loss 3.7718 (3.6712) grad_norm 1.3830 (1.2842) [2022-01-20 22:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][790/1251] eta 0:17:01 lr 0.000727 time 1.5674 (2.2150) loss 3.0761 (3.6677) grad_norm 1.1779 (1.2838) [2022-01-20 22:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][800/1251] eta 0:16:37 lr 0.000727 time 2.0869 (2.2127) loss 4.1464 (3.6721) grad_norm 1.2884 (1.2835) [2022-01-20 22:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][810/1251] eta 0:16:14 lr 0.000727 time 1.9631 (2.2101) loss 2.8081 (3.6732) grad_norm 1.4402 (1.2840) [2022-01-20 22:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][820/1251] eta 0:15:51 lr 0.000727 time 1.8756 (2.2084) loss 3.8400 (3.6755) grad_norm 1.1486 (1.2832) [2022-01-20 22:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][830/1251] eta 0:15:29 lr 0.000727 time 1.7656 (2.2074) loss 4.4774 (3.6770) grad_norm 1.1401 (1.2821) [2022-01-20 22:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][840/1251] eta 0:15:07 lr 0.000727 time 2.2658 (2.2079) loss 3.6650 (3.6757) grad_norm 1.3217 (1.2813) [2022-01-20 22:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][850/1251] eta 0:14:45 lr 0.000727 time 2.5378 (2.2085) loss 2.8443 (3.6745) grad_norm 1.1297 (1.2812) [2022-01-20 22:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][860/1251] eta 0:14:23 lr 0.000727 time 1.8098 (2.2079) loss 4.1233 (3.6765) grad_norm 1.2379 (1.2813) [2022-01-20 22:52:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][870/1251] eta 0:14:00 lr 0.000727 time 1.4976 (2.2066) loss 3.6488 (3.6750) grad_norm 1.3406 (1.2810) [2022-01-20 22:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][880/1251] eta 0:13:38 lr 0.000726 time 2.3709 (2.2070) loss 3.7417 (3.6766) grad_norm 1.2519 (1.2812) [2022-01-20 22:53:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][890/1251] eta 0:13:16 lr 0.000726 time 2.5342 (2.2066) loss 4.2668 (3.6808) grad_norm 1.3944 (1.2811) [2022-01-20 22:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][900/1251] eta 0:12:54 lr 0.000726 time 2.0843 (2.2062) loss 3.4970 (3.6768) grad_norm 1.3221 (1.2813) [2022-01-20 22:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][910/1251] eta 0:12:33 lr 0.000726 time 1.9336 (2.2082) loss 4.6476 (3.6789) grad_norm 1.2716 (1.2803) [2022-01-20 22:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][920/1251] eta 0:12:11 lr 0.000726 time 2.9905 (2.2089) loss 3.2572 (3.6814) grad_norm 1.2116 (1.2798) [2022-01-20 22:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][930/1251] eta 0:11:48 lr 0.000726 time 1.9075 (2.2076) loss 3.4349 (3.6801) grad_norm 1.4827 (1.2804) [2022-01-20 22:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][940/1251] eta 0:11:26 lr 0.000726 time 2.3333 (2.2068) loss 4.2850 (3.6783) grad_norm 1.4364 (1.2811) [2022-01-20 22:55:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][950/1251] eta 0:11:03 lr 0.000726 time 1.7648 (2.2060) loss 3.9731 (3.6816) grad_norm 1.2942 (1.2808) [2022-01-20 22:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][960/1251] eta 0:10:41 lr 0.000726 time 2.5700 (2.2054) loss 4.1960 (3.6820) grad_norm 1.1781 (1.2805) [2022-01-20 22:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][970/1251] eta 0:10:19 lr 0.000726 time 1.8217 (2.2041) loss 3.6464 (3.6780) grad_norm 1.1683 (1.2795) [2022-01-20 22:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][980/1251] eta 0:09:57 lr 0.000726 time 2.0384 (2.2036) loss 3.7117 (3.6771) grad_norm 1.1925 (1.2793) [2022-01-20 22:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][990/1251] eta 0:09:35 lr 
0.000726 time 2.1956 (2.2043) loss 4.3834 (3.6778) grad_norm 1.3017 (1.2790) [2022-01-20 22:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1000/1251] eta 0:09:13 lr 0.000726 time 2.7252 (2.2038) loss 3.8752 (3.6781) grad_norm 1.0317 (1.2790) [2022-01-20 22:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1010/1251] eta 0:08:50 lr 0.000726 time 2.0396 (2.2022) loss 3.5997 (3.6774) grad_norm 1.1639 (1.2785) [2022-01-20 22:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1020/1251] eta 0:08:28 lr 0.000726 time 1.6392 (2.2014) loss 4.1015 (3.6782) grad_norm 1.1655 (1.2780) [2022-01-20 22:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1030/1251] eta 0:08:06 lr 0.000726 time 2.6441 (2.2008) loss 3.5453 (3.6784) grad_norm 1.3637 (1.2777) [2022-01-20 22:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1040/1251] eta 0:07:44 lr 0.000726 time 2.5659 (2.2008) loss 3.9477 (3.6787) grad_norm 1.2707 (1.2780) [2022-01-20 22:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1050/1251] eta 0:07:22 lr 0.000726 time 1.5593 (2.1998) loss 4.2171 (3.6811) grad_norm 1.1407 (1.2775) [2022-01-20 22:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1060/1251] eta 0:07:00 lr 0.000726 time 1.8985 (2.1996) loss 4.1198 (3.6803) grad_norm 1.3164 (1.2771) [2022-01-20 23:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1070/1251] eta 0:06:38 lr 0.000726 time 2.3754 (2.2013) loss 3.2549 (3.6822) grad_norm 1.3877 (1.2761) [2022-01-20 23:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1080/1251] eta 0:06:16 lr 0.000726 time 1.5126 (2.2004) loss 4.1806 (3.6816) grad_norm 1.3984 (1.2759) [2022-01-20 23:00:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1090/1251] eta 0:05:54 lr 0.000726 time 1.7275 (2.1994) loss 4.0943 (3.6832) grad_norm 1.4824 (1.2760) [2022-01-20 
23:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1100/1251] eta 0:05:32 lr 0.000726 time 2.3899 (2.1998) loss 3.0239 (3.6828) grad_norm 1.5129 (1.2766) [2022-01-20 23:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1110/1251] eta 0:05:10 lr 0.000726 time 2.7590 (2.2010) loss 2.8441 (3.6845) grad_norm 1.3996 (1.2772) [2022-01-20 23:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1120/1251] eta 0:04:48 lr 0.000726 time 1.8603 (2.2025) loss 4.4960 (3.6849) grad_norm 1.3831 (1.2771) [2022-01-20 23:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1130/1251] eta 0:04:26 lr 0.000726 time 1.8126 (2.2018) loss 3.9699 (3.6839) grad_norm 1.1282 (1.2770) [2022-01-20 23:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1140/1251] eta 0:04:04 lr 0.000726 time 1.8781 (2.2012) loss 3.9401 (3.6859) grad_norm 1.2738 (1.2769) [2022-01-20 23:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1150/1251] eta 0:03:42 lr 0.000725 time 2.0222 (2.2002) loss 4.0793 (3.6885) grad_norm 1.2816 (1.2769) [2022-01-20 23:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1160/1251] eta 0:03:20 lr 0.000725 time 2.2724 (2.2003) loss 3.9970 (3.6890) grad_norm 1.5649 (1.2769) [2022-01-20 23:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1170/1251] eta 0:02:58 lr 0.000725 time 1.9721 (2.1996) loss 4.2047 (3.6914) grad_norm 1.2889 (1.2771) [2022-01-20 23:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1180/1251] eta 0:02:36 lr 0.000725 time 2.2488 (2.2009) loss 4.2423 (3.6911) grad_norm 1.4336 (1.2775) [2022-01-20 23:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1190/1251] eta 0:02:14 lr 0.000725 time 2.2588 (2.2004) loss 3.8112 (3.6910) grad_norm 1.6838 (1.2781) [2022-01-20 23:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1200/1251] 
eta 0:01:52 lr 0.000725 time 1.9311 (2.2000) loss 4.1108 (3.6921) grad_norm 1.2951 (1.2788) [2022-01-20 23:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1210/1251] eta 0:01:30 lr 0.000725 time 2.1729 (2.1981) loss 3.2077 (3.6951) grad_norm 1.0486 (1.2785) [2022-01-20 23:05:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1220/1251] eta 0:01:08 lr 0.000725 time 1.8997 (2.1966) loss 4.4227 (3.6927) grad_norm 1.0720 (1.2780) [2022-01-20 23:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1230/1251] eta 0:00:46 lr 0.000725 time 2.1695 (2.1959) loss 3.7611 (3.6939) grad_norm 1.0984 (1.2777) [2022-01-20 23:06:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1240/1251] eta 0:00:24 lr 0.000725 time 1.5357 (2.1961) loss 3.6028 (3.6949) grad_norm 1.3215 (1.2780) [2022-01-20 23:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [105/300][1250/1251] eta 0:00:02 lr 0.000725 time 1.1615 (2.1905) loss 3.1865 (3.6918) grad_norm 1.3545 (1.2788) [2022-01-20 23:06:28 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 105 training takes 0:45:40 [2022-01-20 23:06:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.344 (18.344) Loss 1.0316 (1.0316) Acc@1 75.195 (75.195) Acc@5 93.359 (93.359) [2022-01-20 23:07:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.967 (3.542) Loss 1.1388 (1.1072) Acc@1 72.754 (74.112) Acc@5 91.504 (92.516) [2022-01-20 23:07:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.661 (2.724) Loss 1.1105 (1.1099) Acc@1 75.488 (74.126) Acc@5 92.090 (92.332) [2022-01-20 23:07:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.298 (2.236) Loss 1.0345 (1.1046) Acc@1 75.586 (74.159) Acc@5 92.188 (92.329) [2022-01-20 23:07:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.973 (2.195) Loss 1.1810 (1.1070) Acc@1 72.070 (74.166) Acc@5 90.723 (92.288) 
[2022-01-20 23:08:07 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.228 Acc@5 92.234 [2022-01-20 23:08:07 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-01-20 23:08:07 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.23% [2022-01-20 23:08:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][0/1251] eta 7:39:32 lr 0.000725 time 22.0405 (22.0405) loss 3.8532 (3.8532) grad_norm 1.3487 (1.3487) [2022-01-20 23:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][10/1251] eta 1:23:38 lr 0.000725 time 1.9970 (4.0437) loss 3.3312 (3.6121) grad_norm 1.5003 (1.2738) [2022-01-20 23:09:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][20/1251] eta 1:06:05 lr 0.000725 time 1.9157 (3.2214) loss 3.9418 (3.6103) grad_norm 1.1448 (1.2275) [2022-01-20 23:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][30/1251] eta 0:58:12 lr 0.000725 time 1.6100 (2.8603) loss 3.7019 (3.6432) grad_norm 1.3352 (1.2198) [2022-01-20 23:09:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][40/1251] eta 0:55:06 lr 0.000725 time 4.1768 (2.7306) loss 4.6931 (3.6743) grad_norm 1.1790 (1.2308) [2022-01-20 23:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][50/1251] eta 0:52:44 lr 0.000725 time 1.7082 (2.6349) loss 4.3336 (3.6598) grad_norm 1.2618 (1.2425) [2022-01-20 23:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][60/1251] eta 0:52:08 lr 0.000725 time 2.3057 (2.6268) loss 3.7939 (3.6989) grad_norm 1.2219 (1.2392) [2022-01-20 23:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][70/1251] eta 0:50:47 lr 0.000725 time 1.8490 (2.5806) loss 3.1954 (3.6845) grad_norm 1.2233 (1.2514) [2022-01-20 23:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][80/1251] eta 0:49:45 lr 0.000725 time 2.6288 (2.5496) loss 3.8020 (3.6237) 
grad_norm 1.2859 (1.2517) [2022-01-20 23:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][90/1251] eta 0:48:19 lr 0.000725 time 1.9385 (2.4973) loss 4.0527 (3.6186) grad_norm 1.1450 (1.2570) [2022-01-20 23:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][100/1251] eta 0:46:41 lr 0.000725 time 2.2930 (2.4340) loss 3.5294 (3.6251) grad_norm 1.6445 (1.2666) [2022-01-20 23:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][110/1251] eta 0:45:50 lr 0.000725 time 2.2455 (2.4109) loss 3.9111 (3.6126) grad_norm 1.1349 (1.2601) [2022-01-20 23:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][120/1251] eta 0:45:09 lr 0.000725 time 2.3697 (2.3958) loss 3.0591 (3.6167) grad_norm 1.6650 (1.2649) [2022-01-20 23:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][130/1251] eta 0:44:26 lr 0.000725 time 2.1110 (2.3788) loss 3.7947 (3.6138) grad_norm 1.1692 (1.2638) [2022-01-20 23:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][140/1251] eta 0:44:06 lr 0.000725 time 4.0493 (2.3819) loss 3.8830 (3.6005) grad_norm 1.6491 (1.2661) [2022-01-20 23:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][150/1251] eta 0:43:43 lr 0.000725 time 1.6907 (2.3829) loss 3.4675 (3.5786) grad_norm 1.4679 (1.2666) [2022-01-20 23:14:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][160/1251] eta 0:43:05 lr 0.000725 time 1.9188 (2.3696) loss 4.1089 (3.5987) grad_norm 1.5502 (1.2704) [2022-01-20 23:14:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][170/1251] eta 0:42:27 lr 0.000724 time 1.9498 (2.3562) loss 3.7656 (3.6132) grad_norm 1.1712 (1.2677) [2022-01-20 23:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][180/1251] eta 0:41:39 lr 0.000724 time 2.1046 (2.3339) loss 4.1648 (3.6111) grad_norm 1.4618 (1.2675) [2022-01-20 23:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [106/300][190/1251] eta 0:41:07 lr 0.000724 time 1.9398 (2.3259) loss 4.0356 (3.6259) grad_norm 1.2256 (1.2678) [2022-01-20 23:15:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][200/1251] eta 0:40:37 lr 0.000724 time 2.0585 (2.3193) loss 3.9680 (3.6269) grad_norm 1.2381 (1.2733) [2022-01-20 23:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][210/1251] eta 0:40:05 lr 0.000724 time 2.2504 (2.3108) loss 3.2755 (3.6365) grad_norm 1.3977 (1.2770) [2022-01-20 23:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][220/1251] eta 0:39:28 lr 0.000724 time 1.6774 (2.2970) loss 2.9693 (3.6352) grad_norm 1.1096 (1.2791) [2022-01-20 23:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][230/1251] eta 0:39:03 lr 0.000724 time 2.2790 (2.2952) loss 3.9261 (3.6416) grad_norm 1.1242 (1.2763) [2022-01-20 23:17:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][240/1251] eta 0:38:40 lr 0.000724 time 2.3513 (2.2950) loss 3.7748 (3.6537) grad_norm 1.1976 (1.2726) [2022-01-20 23:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][250/1251] eta 0:38:17 lr 0.000724 time 2.6177 (2.2948) loss 3.5256 (3.6582) grad_norm 1.2730 (1.2727) [2022-01-20 23:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][260/1251] eta 0:37:49 lr 0.000724 time 1.8868 (2.2904) loss 3.8202 (3.6672) grad_norm 1.2197 (1.2751) [2022-01-20 23:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][270/1251] eta 0:37:24 lr 0.000724 time 2.5502 (2.2882) loss 3.8849 (3.6657) grad_norm 1.1389 (1.2732) [2022-01-20 23:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][280/1251] eta 0:36:57 lr 0.000724 time 2.9260 (2.2838) loss 4.3334 (3.6703) grad_norm 1.4015 (1.2718) [2022-01-20 23:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][290/1251] eta 0:36:22 lr 0.000724 time 2.1075 (2.2716) loss 3.9867 (3.6687) 
grad_norm 1.7712 (1.2758) [2022-01-20 23:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][300/1251] eta 0:35:53 lr 0.000724 time 1.8564 (2.2649) loss 3.7470 (3.6668) grad_norm 1.1757 (1.2741) [2022-01-20 23:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][310/1251] eta 0:35:28 lr 0.000724 time 2.1465 (2.2620) loss 4.0713 (3.6684) grad_norm 1.4011 (1.2723) [2022-01-20 23:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][320/1251] eta 0:35:07 lr 0.000724 time 3.0589 (2.2639) loss 3.7602 (3.6764) grad_norm 1.1772 (1.2720) [2022-01-20 23:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][330/1251] eta 0:34:42 lr 0.000724 time 1.9546 (2.2616) loss 3.9137 (3.6782) grad_norm 1.2818 (1.2702) [2022-01-20 23:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][340/1251] eta 0:34:21 lr 0.000724 time 2.1650 (2.2634) loss 3.8805 (3.6819) grad_norm 1.2062 (1.2679) [2022-01-20 23:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][350/1251] eta 0:34:01 lr 0.000724 time 2.1933 (2.2653) loss 3.9067 (3.6807) grad_norm 1.1482 (1.2671) [2022-01-20 23:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][360/1251] eta 0:33:35 lr 0.000724 time 2.2541 (2.2619) loss 4.0478 (3.6828) grad_norm 1.0689 (1.2706) [2022-01-20 23:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][370/1251] eta 0:33:03 lr 0.000724 time 1.6463 (2.2511) loss 4.2623 (3.6850) grad_norm 1.4451 (1.2725) [2022-01-20 23:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][380/1251] eta 0:32:34 lr 0.000724 time 2.1793 (2.2440) loss 2.8198 (3.6850) grad_norm 1.3990 (1.2759) [2022-01-20 23:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][390/1251] eta 0:32:09 lr 0.000724 time 2.1322 (2.2411) loss 3.6433 (3.6823) grad_norm 1.2096 (1.2750) [2022-01-20 23:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [106/300][400/1251] eta 0:31:46 lr 0.000724 time 2.3244 (2.2399) loss 3.5028 (3.6847) grad_norm 1.2569 (1.2750) [2022-01-20 23:23:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][410/1251] eta 0:31:21 lr 0.000724 time 1.7585 (2.2371) loss 3.5059 (3.6777) grad_norm 1.5097 (1.2758) [2022-01-20 23:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][420/1251] eta 0:31:00 lr 0.000724 time 2.6201 (2.2391) loss 3.8223 (3.6808) grad_norm 1.1778 (1.2745) [2022-01-20 23:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][430/1251] eta 0:30:37 lr 0.000723 time 2.1965 (2.2387) loss 3.5325 (3.6790) grad_norm 1.5427 (1.2751) [2022-01-20 23:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][440/1251] eta 0:30:18 lr 0.000723 time 2.8091 (2.2417) loss 4.0221 (3.6761) grad_norm 1.4682 (1.2753) [2022-01-20 23:24:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][450/1251] eta 0:29:57 lr 0.000723 time 2.4616 (2.2446) loss 3.8392 (3.6841) grad_norm 1.1728 (1.2757) [2022-01-20 23:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][460/1251] eta 0:29:34 lr 0.000723 time 1.9579 (2.2434) loss 4.1727 (3.6883) grad_norm 1.3117 (1.2763) [2022-01-20 23:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][470/1251] eta 0:29:08 lr 0.000723 time 1.9276 (2.2394) loss 3.0976 (3.6824) grad_norm 1.3269 (1.2768) [2022-01-20 23:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][480/1251] eta 0:28:43 lr 0.000723 time 2.0016 (2.2348) loss 3.8990 (3.6844) grad_norm 1.2629 (1.2757) [2022-01-20 23:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][490/1251] eta 0:28:21 lr 0.000723 time 2.0086 (2.2357) loss 4.2499 (3.6876) grad_norm 1.1936 (1.2759) [2022-01-20 23:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][500/1251] eta 0:27:59 lr 0.000723 time 2.4578 (2.2369) loss 4.1112 (3.6965) 
grad_norm 1.2932 (1.2786) [2022-01-20 23:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][510/1251] eta 0:27:36 lr 0.000723 time 1.9800 (2.2361) loss 4.2436 (3.7037) grad_norm 1.2588 (1.2781) [2022-01-20 23:27:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][520/1251] eta 0:27:14 lr 0.000723 time 2.7672 (2.2364) loss 3.3670 (3.7009) grad_norm 1.3256 (1.2782) [2022-01-20 23:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][530/1251] eta 0:26:50 lr 0.000723 time 1.6240 (2.2339) loss 4.1766 (3.7024) grad_norm 1.2493 (1.2773) [2022-01-20 23:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][540/1251] eta 0:26:26 lr 0.000723 time 2.4783 (2.2318) loss 4.2403 (3.7031) grad_norm 1.1443 (1.2770) [2022-01-20 23:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][550/1251] eta 0:26:03 lr 0.000723 time 1.5747 (2.2304) loss 3.3304 (3.6983) grad_norm 1.2645 (1.2764) [2022-01-20 23:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][560/1251] eta 0:25:41 lr 0.000723 time 2.4677 (2.2307) loss 4.1243 (3.6992) grad_norm 1.2584 (1.2766) [2022-01-20 23:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][570/1251] eta 0:25:18 lr 0.000723 time 1.9064 (2.2293) loss 4.3949 (3.6938) grad_norm 1.5077 (1.2781) [2022-01-20 23:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][580/1251] eta 0:24:55 lr 0.000723 time 1.8935 (2.2295) loss 4.2862 (3.6917) grad_norm 1.2376 (1.2770) [2022-01-20 23:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][590/1251] eta 0:24:33 lr 0.000723 time 2.1572 (2.2287) loss 4.3277 (3.6918) grad_norm 1.2546 (1.2777) [2022-01-20 23:30:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][600/1251] eta 0:24:09 lr 0.000723 time 1.8760 (2.2269) loss 3.5184 (3.6932) grad_norm 1.3362 (1.2792) [2022-01-20 23:30:47 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [106/300][610/1251] eta 0:23:46 lr 0.000723 time 1.8467 (2.2260) loss 4.1357 (3.6944) grad_norm 1.1750 (1.2788) [2022-01-20 23:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][620/1251] eta 0:23:23 lr 0.000723 time 1.9450 (2.2238) loss 3.8747 (3.6987) grad_norm 1.3992 (1.2789) [2022-01-20 23:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][630/1251] eta 0:22:59 lr 0.000723 time 2.3049 (2.2209) loss 4.0418 (3.7010) grad_norm 1.4620 (1.2793) [2022-01-20 23:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][640/1251] eta 0:22:37 lr 0.000723 time 2.8074 (2.2211) loss 3.9165 (3.7032) grad_norm 1.1918 (1.2791) [2022-01-20 23:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][650/1251] eta 0:22:16 lr 0.000723 time 1.9718 (2.2230) loss 3.0863 (3.7028) grad_norm 1.1687 (1.2784) [2022-01-20 23:32:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][660/1251] eta 0:21:54 lr 0.000723 time 2.1091 (2.2246) loss 3.3769 (3.7009) grad_norm 1.4177 (1.2777) [2022-01-20 23:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][670/1251] eta 0:21:30 lr 0.000723 time 1.8084 (2.2219) loss 3.0158 (3.7019) grad_norm 1.1112 (1.2777) [2022-01-20 23:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][680/1251] eta 0:21:08 lr 0.000723 time 2.1551 (2.2210) loss 3.9017 (3.7069) grad_norm 1.4999 (1.2795) [2022-01-20 23:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][690/1251] eta 0:20:44 lr 0.000723 time 1.6376 (2.2185) loss 3.0922 (3.7084) grad_norm 1.4095 (1.2797) [2022-01-20 23:34:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][700/1251] eta 0:20:21 lr 0.000722 time 2.1124 (2.2176) loss 2.6433 (3.7034) grad_norm 1.4881 (1.2797) [2022-01-20 23:34:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][710/1251] eta 0:20:00 lr 0.000722 time 1.6199 (2.2191) loss 2.9877 (3.7023) 
grad_norm 1.2588 (1.2784) [2022-01-20 23:34:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][720/1251] eta 0:19:39 lr 0.000722 time 2.1053 (2.2215) loss 2.9945 (3.6996) grad_norm 1.1685 (1.2783) [2022-01-20 23:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][730/1251] eta 0:19:16 lr 0.000722 time 1.5600 (2.2194) loss 4.3775 (3.7029) grad_norm 1.3707 (1.2784) [2022-01-20 23:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][740/1251] eta 0:18:53 lr 0.000722 time 1.5872 (2.2182) loss 3.9687 (3.7056) grad_norm 1.0791 (1.2782) [2022-01-20 23:35:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][750/1251] eta 0:18:31 lr 0.000722 time 1.8357 (2.2179) loss 4.0661 (3.7081) grad_norm 1.1083 (1.2780) [2022-01-20 23:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][760/1251] eta 0:18:08 lr 0.000722 time 1.7783 (2.2165) loss 2.9043 (3.7067) grad_norm 1.3919 (1.2772) [2022-01-20 23:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][770/1251] eta 0:17:45 lr 0.000722 time 2.1943 (2.2155) loss 4.5178 (3.7059) grad_norm 1.2212 (1.2765) [2022-01-20 23:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][780/1251] eta 0:17:22 lr 0.000722 time 1.8626 (2.2141) loss 4.1748 (3.7068) grad_norm 1.2761 (1.2763) [2022-01-20 23:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][790/1251] eta 0:17:00 lr 0.000722 time 2.5853 (2.2142) loss 3.8261 (3.7058) grad_norm 1.4338 (1.2763) [2022-01-20 23:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][800/1251] eta 0:16:37 lr 0.000722 time 1.6479 (2.2127) loss 3.9530 (3.7057) grad_norm 1.2572 (1.2756) [2022-01-20 23:38:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][810/1251] eta 0:16:15 lr 0.000722 time 1.9208 (2.2127) loss 3.5760 (3.7071) grad_norm 1.1656 (1.2748) [2022-01-20 23:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [106/300][820/1251] eta 0:15:53 lr 0.000722 time 1.9883 (2.2125) loss 3.8131 (3.7076) grad_norm 1.1173 (1.2752) [2022-01-20 23:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][830/1251] eta 0:15:32 lr 0.000722 time 2.4289 (2.2141) loss 4.5728 (3.7091) grad_norm 1.1423 (1.2741) [2022-01-20 23:39:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][840/1251] eta 0:15:09 lr 0.000722 time 1.5369 (2.2127) loss 4.1728 (3.7097) grad_norm 1.1405 (1.2738) [2022-01-20 23:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][850/1251] eta 0:14:46 lr 0.000722 time 1.8070 (2.2114) loss 3.9052 (3.7063) grad_norm 1.4918 (1.2733) [2022-01-20 23:39:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][860/1251] eta 0:14:23 lr 0.000722 time 2.2588 (2.2094) loss 4.3853 (3.7063) grad_norm 1.2891 (1.2729) [2022-01-20 23:40:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][870/1251] eta 0:14:01 lr 0.000722 time 1.7883 (2.2089) loss 3.7510 (3.7084) grad_norm 1.2307 (1.2730) [2022-01-20 23:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][880/1251] eta 0:13:39 lr 0.000722 time 1.8277 (2.2093) loss 3.6318 (3.7097) grad_norm 1.2739 (1.2738) [2022-01-20 23:40:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][890/1251] eta 0:13:17 lr 0.000722 time 1.4345 (2.2104) loss 2.6279 (3.7117) grad_norm 1.3121 (1.2745) [2022-01-20 23:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][900/1251] eta 0:12:55 lr 0.000722 time 2.3950 (2.2107) loss 3.6157 (3.7093) grad_norm 1.0404 (1.2734) [2022-01-20 23:41:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][910/1251] eta 0:12:33 lr 0.000722 time 1.8918 (2.2099) loss 4.2508 (3.7129) grad_norm 1.4173 (1.2731) [2022-01-20 23:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][920/1251] eta 0:12:11 lr 0.000722 time 2.2319 (2.2088) loss 3.7295 (3.7100) 
grad_norm 1.3729 (1.2729) [2022-01-20 23:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][930/1251] eta 0:11:48 lr 0.000722 time 1.8781 (2.2063) loss 2.6813 (3.7088) grad_norm 1.2007 (1.2727) [2022-01-20 23:42:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][940/1251] eta 0:11:26 lr 0.000722 time 2.5038 (2.2059) loss 3.5504 (3.7089) grad_norm 1.2124 (1.2729) [2022-01-20 23:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][950/1251] eta 0:11:04 lr 0.000722 time 1.9255 (2.2060) loss 4.5247 (3.7069) grad_norm 1.1657 (1.2734) [2022-01-20 23:43:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][960/1251] eta 0:10:42 lr 0.000722 time 2.2499 (2.2069) loss 4.3850 (3.7060) grad_norm 1.2411 (1.2734) [2022-01-20 23:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][970/1251] eta 0:10:19 lr 0.000721 time 1.7224 (2.2061) loss 3.9154 (3.7054) grad_norm 1.1987 (1.2735) [2022-01-20 23:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][980/1251] eta 0:09:58 lr 0.000721 time 2.2266 (2.2071) loss 4.6091 (3.7071) grad_norm 1.1929 (1.2729) [2022-01-20 23:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][990/1251] eta 0:09:36 lr 0.000721 time 1.7719 (2.2072) loss 3.7541 (3.7090) grad_norm 1.2306 (1.2721) [2022-01-20 23:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1000/1251] eta 0:09:14 lr 0.000721 time 2.2850 (2.2077) loss 3.9269 (3.7101) grad_norm 1.1527 (1.2710) [2022-01-20 23:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1010/1251] eta 0:08:51 lr 0.000721 time 1.5406 (2.2064) loss 4.0634 (3.7094) grad_norm 1.1240 (1.2708) [2022-01-20 23:45:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1020/1251] eta 0:08:29 lr 0.000721 time 1.6334 (2.2051) loss 3.6112 (3.7084) grad_norm 1.0825 (1.2700) [2022-01-20 23:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [106/300][1030/1251] eta 0:08:07 lr 0.000721 time 2.1601 (2.2048) loss 3.0285 (3.7093) grad_norm 1.5939 (1.2698) [2022-01-20 23:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1040/1251] eta 0:07:45 lr 0.000721 time 2.2565 (2.2065) loss 4.1586 (3.7108) grad_norm 1.4338 (1.2695) [2022-01-20 23:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1050/1251] eta 0:07:23 lr 0.000721 time 2.0873 (2.2063) loss 3.0362 (3.7093) grad_norm 1.3491 (1.2692) [2022-01-20 23:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1060/1251] eta 0:07:01 lr 0.000721 time 1.8486 (2.2054) loss 3.9669 (3.7095) grad_norm 1.2261 (1.2700) [2022-01-20 23:47:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1070/1251] eta 0:06:39 lr 0.000721 time 1.8631 (2.2046) loss 3.1297 (3.7095) grad_norm 1.5645 (1.2701) [2022-01-20 23:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1080/1251] eta 0:06:16 lr 0.000721 time 2.2716 (2.2041) loss 3.9820 (3.7120) grad_norm 1.4697 (1.2705) [2022-01-20 23:48:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1090/1251] eta 0:05:54 lr 0.000721 time 1.9217 (2.2028) loss 3.3913 (3.7114) grad_norm 1.3776 (1.2702) [2022-01-20 23:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1100/1251] eta 0:05:32 lr 0.000721 time 3.1245 (2.2037) loss 3.9995 (3.7118) grad_norm 1.3590 (1.2709) [2022-01-20 23:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1110/1251] eta 0:05:10 lr 0.000721 time 1.8736 (2.2027) loss 3.2995 (3.7110) grad_norm 1.2492 (1.2720) [2022-01-20 23:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1120/1251] eta 0:04:48 lr 0.000721 time 2.0076 (2.2018) loss 2.8287 (3.7077) grad_norm 1.3671 (1.2720) [2022-01-20 23:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1130/1251] eta 0:04:26 lr 0.000721 time 2.3067 (2.2015) loss 4.5878 
(3.7101) grad_norm 1.1715 (1.2719) [2022-01-20 23:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1140/1251] eta 0:04:04 lr 0.000721 time 1.4703 (2.2016) loss 2.9022 (3.7082) grad_norm 1.4250 (1.2722) [2022-01-20 23:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1150/1251] eta 0:03:42 lr 0.000721 time 2.6396 (2.2023) loss 4.1838 (3.7086) grad_norm 1.2667 (1.2721) [2022-01-20 23:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1160/1251] eta 0:03:20 lr 0.000721 time 2.1538 (2.2018) loss 3.4889 (3.7081) grad_norm 1.3789 (1.2725) [2022-01-20 23:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1170/1251] eta 0:02:58 lr 0.000721 time 2.3064 (2.2014) loss 3.6278 (3.7074) grad_norm 1.3980 (1.2727) [2022-01-20 23:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1180/1251] eta 0:02:36 lr 0.000721 time 1.5081 (2.2009) loss 4.0230 (3.7066) grad_norm 1.2538 (1.2727) [2022-01-20 23:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1190/1251] eta 0:02:14 lr 0.000721 time 2.5662 (2.2003) loss 3.9544 (3.7058) grad_norm 1.2087 (1.2729) [2022-01-20 23:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1200/1251] eta 0:01:52 lr 0.000721 time 2.2899 (2.2004) loss 2.6684 (3.7042) grad_norm 1.2538 (1.2726) [2022-01-20 23:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1210/1251] eta 0:01:30 lr 0.000721 time 1.7231 (2.2000) loss 3.7791 (3.7048) grad_norm 1.2857 (1.2729) [2022-01-20 23:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1220/1251] eta 0:01:08 lr 0.000721 time 2.1302 (2.2011) loss 3.7074 (3.7066) grad_norm 1.2680 (1.2722) [2022-01-20 23:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1230/1251] eta 0:00:46 lr 0.000721 time 2.5489 (2.2014) loss 3.4305 (3.7059) grad_norm 1.1500 (1.2719) [2022-01-20 23:53:37 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [106/300][1240/1251] eta 0:00:24 lr 0.000720 time 1.4007 (2.2004) loss 4.1781 (3.7046) grad_norm 1.2516 (1.2720) [2022-01-20 23:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [106/300][1250/1251] eta 0:00:02 lr 0.000720 time 1.1910 (2.1950) loss 3.6321 (3.7040) grad_norm 1.2323 (1.2719) [2022-01-20 23:53:53 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 106 training takes 0:45:46 [2022-01-20 23:54:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.129 (18.129) Loss 1.0796 (1.0796) Acc@1 74.902 (74.902) Acc@5 93.359 (93.359) [2022-01-20 23:54:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.885 (3.348) Loss 1.2235 (1.1152) Acc@1 72.168 (74.290) Acc@5 90.527 (92.276) [2022-01-20 23:54:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.592 (2.468) Loss 1.1081 (1.1141) Acc@1 75.000 (74.358) Acc@5 90.723 (92.160) [2022-01-20 23:55:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.040 (2.190) Loss 1.1102 (1.1172) Acc@1 74.805 (74.083) Acc@5 92.383 (92.238) [2022-01-20 23:55:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.682 (2.134) Loss 1.0852 (1.1086) Acc@1 73.633 (74.204) Acc@5 92.090 (92.297) [2022-01-20 23:55:28 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.166 Acc@5 92.188 [2022-01-20 23:55:28 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-01-20 23:55:28 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.23% [2022-01-20 23:55:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][0/1251] eta 7:29:52 lr 0.000720 time 21.5764 (21.5764) loss 2.5680 (2.5680) grad_norm 1.4775 (1.4775) [2022-01-20 23:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][10/1251] eta 1:30:30 lr 0.000720 time 2.6139 (4.3762) loss 3.4718 (3.5144) grad_norm 1.1910 (1.2890) [2022-01-20 23:56:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][20/1251] eta 1:07:52 lr 0.000720 time 1.4292 (3.3082) loss 4.5692 (3.5667) grad_norm 1.1842 (1.2773) [2022-01-20 23:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][30/1251] eta 1:00:06 lr 0.000720 time 1.5585 (2.9535) loss 2.6323 (3.5467) grad_norm 1.4231 (1.3173) [2022-01-20 23:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][40/1251] eta 0:57:24 lr 0.000720 time 3.4062 (2.8443) loss 3.9116 (3.5861) grad_norm 1.4016 (1.3062) [2022-01-20 23:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][50/1251] eta 0:55:08 lr 0.000720 time 2.2537 (2.7545) loss 3.7589 (3.6091) grad_norm 1.4417 (1.3115) [2022-01-20 23:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][60/1251] eta 0:52:15 lr 0.000720 time 1.6326 (2.6325) loss 4.4935 (3.6206) grad_norm 1.2443 (1.3036) [2022-01-20 23:58:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][70/1251] eta 0:49:41 lr 0.000720 time 2.0563 (2.5243) loss 3.7207 (3.6204) grad_norm 1.3940 (1.3057) [2022-01-20 23:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][80/1251] eta 0:47:48 lr 0.000720 time 1.8728 (2.4497) loss 4.2070 (3.6457) grad_norm 1.1971 (1.3183) [2022-01-20 23:59:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][90/1251] eta 0:46:43 lr 0.000720 time 1.7627 (2.4151) loss 2.4378 (3.6259) grad_norm 1.2091 (1.3183) [2022-01-20 23:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][100/1251] eta 0:46:01 lr 0.000720 time 3.0408 (2.3989) loss 3.9867 (3.6509) grad_norm 1.2423 (1.3168) [2022-01-20 23:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][110/1251] eta 0:45:19 lr 0.000720 time 2.0802 (2.3837) loss 3.5925 (3.6371) grad_norm 1.2360 (1.3065) [2022-01-21 00:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][120/1251] eta 0:44:52 lr 0.000720 time 
1.8416 (2.3803) loss 3.4703 (3.6330) grad_norm 1.1254 (1.3078) [2022-01-21 00:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][130/1251] eta 0:44:42 lr 0.000720 time 3.1242 (2.3931) loss 2.9335 (3.6320) grad_norm 1.2520 (1.3075) [2022-01-21 00:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][140/1251] eta 0:44:04 lr 0.000720 time 2.1873 (2.3801) loss 3.3778 (3.6246) grad_norm 1.3287 (1.2997) [2022-01-21 00:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][150/1251] eta 0:43:17 lr 0.000720 time 1.9194 (2.3596) loss 3.6398 (3.6269) grad_norm 1.2401 (1.2982) [2022-01-21 00:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][160/1251] eta 0:42:20 lr 0.000720 time 1.8817 (2.3290) loss 2.7940 (3.6274) grad_norm 1.1670 (1.2961) [2022-01-21 00:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][170/1251] eta 0:41:41 lr 0.000720 time 1.8855 (2.3139) loss 4.2365 (3.6272) grad_norm 1.0269 (1.2902) [2022-01-21 00:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][180/1251] eta 0:41:11 lr 0.000720 time 1.8667 (2.3075) loss 4.1305 (3.6458) grad_norm 1.2407 (1.2851) [2022-01-21 00:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][190/1251] eta 0:40:40 lr 0.000720 time 2.0578 (2.3001) loss 4.0190 (3.6358) grad_norm 1.3845 (1.2874) [2022-01-21 00:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][200/1251] eta 0:40:24 lr 0.000720 time 2.9655 (2.3070) loss 3.0306 (3.6239) grad_norm 1.3116 (1.2922) [2022-01-21 00:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][210/1251] eta 0:39:57 lr 0.000720 time 2.0867 (2.3030) loss 4.5611 (3.6431) grad_norm 1.2061 (1.2896) [2022-01-21 00:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][220/1251] eta 0:39:28 lr 0.000720 time 2.1567 (2.2974) loss 4.0587 (3.6432) grad_norm 1.8647 (1.2942) [2022-01-21 00:04:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][230/1251] eta 0:38:59 lr 0.000720 time 1.9511 (2.2911) loss 3.3816 (3.6336) grad_norm 1.2346 (1.2945) [2022-01-21 00:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][240/1251] eta 0:38:37 lr 0.000720 time 2.9332 (2.2925) loss 4.0645 (3.6505) grad_norm 1.1267 (1.2923) [2022-01-21 00:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][250/1251] eta 0:38:04 lr 0.000720 time 2.1017 (2.2823) loss 3.3970 (3.6543) grad_norm 1.2782 (1.2912) [2022-01-21 00:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][260/1251] eta 0:37:36 lr 0.000719 time 1.8369 (2.2774) loss 3.2605 (3.6543) grad_norm 1.3199 (1.2923) [2022-01-21 00:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][270/1251] eta 0:37:10 lr 0.000719 time 1.8667 (2.2732) loss 4.4319 (3.6575) grad_norm 1.2745 (1.2932) [2022-01-21 00:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][280/1251] eta 0:36:45 lr 0.000719 time 2.4147 (2.2714) loss 3.5891 (3.6541) grad_norm 1.0942 (1.2894) [2022-01-21 00:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][290/1251] eta 0:36:20 lr 0.000719 time 1.8964 (2.2688) loss 3.9439 (3.6665) grad_norm 1.4151 (1.2884) [2022-01-21 00:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][300/1251] eta 0:35:55 lr 0.000719 time 1.9605 (2.2668) loss 3.8534 (3.6624) grad_norm 1.2893 (1.2871) [2022-01-21 00:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][310/1251] eta 0:35:36 lr 0.000719 time 1.8582 (2.2709) loss 4.1803 (3.6714) grad_norm 1.1534 (1.2869) [2022-01-21 00:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][320/1251] eta 0:35:12 lr 0.000719 time 3.3962 (2.2691) loss 4.0078 (3.6698) grad_norm 1.2369 (1.2863) [2022-01-21 00:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][330/1251] eta 0:34:41 lr 
0.000719 time 1.7827 (2.2600) loss 4.3675 (3.6813) grad_norm 1.2559 (1.2861) [2022-01-21 00:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][340/1251] eta 0:34:15 lr 0.000719 time 2.1939 (2.2567) loss 3.8504 (3.6796) grad_norm 1.5166 (1.2893) [2022-01-21 00:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][350/1251] eta 0:33:51 lr 0.000719 time 1.5993 (2.2545) loss 4.0702 (3.6804) grad_norm 1.3819 (1.2898) [2022-01-21 00:09:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][360/1251] eta 0:33:32 lr 0.000719 time 3.3066 (2.2590) loss 3.8356 (3.6782) grad_norm 1.1470 (1.2899) [2022-01-21 00:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][370/1251] eta 0:33:09 lr 0.000719 time 2.0938 (2.2584) loss 3.9835 (3.6779) grad_norm 1.5848 (1.2907) [2022-01-21 00:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][380/1251] eta 0:32:43 lr 0.000719 time 2.1599 (2.2541) loss 4.1286 (3.6833) grad_norm 1.1851 (1.2947) [2022-01-21 00:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][390/1251] eta 0:32:13 lr 0.000719 time 1.9103 (2.2453) loss 3.0229 (3.6803) grad_norm 1.3766 (1.2968) [2022-01-21 00:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][400/1251] eta 0:31:47 lr 0.000719 time 2.3159 (2.2410) loss 4.4293 (3.6824) grad_norm 1.2110 (1.2962) [2022-01-21 00:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][410/1251] eta 0:31:22 lr 0.000719 time 2.1328 (2.2390) loss 4.3980 (3.6757) grad_norm 1.4105 (1.2961) [2022-01-21 00:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][420/1251] eta 0:31:01 lr 0.000719 time 2.7965 (2.2395) loss 4.0671 (3.6808) grad_norm 1.3357 (1.2961) [2022-01-21 00:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][430/1251] eta 0:30:39 lr 0.000719 time 2.1620 (2.2409) loss 3.7798 (3.6798) grad_norm 1.1694 (1.2945) [2022-01-21 00:11:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][440/1251] eta 0:30:17 lr 0.000719 time 2.5057 (2.2408) loss 2.2836 (3.6810) grad_norm 1.0804 (1.2924) [2022-01-21 00:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][450/1251] eta 0:29:56 lr 0.000719 time 2.1905 (2.2432) loss 3.2837 (3.6801) grad_norm 1.1582 (1.2903) [2022-01-21 00:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][460/1251] eta 0:29:33 lr 0.000719 time 2.7732 (2.2423) loss 3.2579 (3.6775) grad_norm 1.4414 (1.2908) [2022-01-21 00:13:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][470/1251] eta 0:29:07 lr 0.000719 time 1.6345 (2.2372) loss 3.5963 (3.6754) grad_norm 1.3132 (1.2917) [2022-01-21 00:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][480/1251] eta 0:28:42 lr 0.000719 time 2.4326 (2.2345) loss 3.6859 (3.6781) grad_norm 1.3157 (1.2910) [2022-01-21 00:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][490/1251] eta 0:28:19 lr 0.000719 time 1.8648 (2.2326) loss 2.8092 (3.6787) grad_norm 1.4411 (1.2907) [2022-01-21 00:14:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][500/1251] eta 0:27:59 lr 0.000719 time 3.4182 (2.2361) loss 2.6396 (3.6802) grad_norm 1.2199 (1.2925) [2022-01-21 00:14:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][510/1251] eta 0:27:38 lr 0.000719 time 2.1181 (2.2378) loss 3.0714 (3.6832) grad_norm 1.1964 (1.2919) [2022-01-21 00:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][520/1251] eta 0:27:14 lr 0.000718 time 2.2177 (2.2358) loss 2.9994 (3.6826) grad_norm 1.4150 (1.2924) [2022-01-21 00:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][530/1251] eta 0:26:48 lr 0.000718 time 1.7169 (2.2306) loss 3.2121 (3.6791) grad_norm 1.3180 (1.2918) [2022-01-21 00:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][540/1251] eta 0:26:22 lr 
0.000718 time 2.0586 (2.2256) loss 3.0053 (3.6826) grad_norm 1.1441 (1.2920) [2022-01-21 00:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][550/1251] eta 0:26:00 lr 0.000718 time 2.4368 (2.2255) loss 4.1807 (3.6808) grad_norm 1.1321 (1.2907) [2022-01-21 00:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][560/1251] eta 0:25:38 lr 0.000718 time 1.8550 (2.2265) loss 4.0356 (3.6819) grad_norm 1.2690 (1.2906) [2022-01-21 00:16:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][570/1251] eta 0:25:17 lr 0.000718 time 2.5793 (2.2287) loss 3.8465 (3.6839) grad_norm 1.2956 (1.2895) [2022-01-21 00:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][580/1251] eta 0:24:53 lr 0.000718 time 1.9858 (2.2262) loss 2.5959 (3.6836) grad_norm 1.3376 (1.2883) [2022-01-21 00:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][590/1251] eta 0:24:30 lr 0.000718 time 2.9137 (2.2247) loss 2.5468 (3.6839) grad_norm 1.4629 (1.2878) [2022-01-21 00:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][600/1251] eta 0:24:06 lr 0.000718 time 1.9566 (2.2218) loss 3.1425 (3.6778) grad_norm 1.1272 (1.2875) [2022-01-21 00:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][610/1251] eta 0:23:43 lr 0.000718 time 2.3547 (2.2205) loss 4.2165 (3.6748) grad_norm 1.2813 (1.2865) [2022-01-21 00:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][620/1251] eta 0:23:19 lr 0.000718 time 1.8973 (2.2174) loss 4.0072 (3.6744) grad_norm 1.1605 (1.2855) [2022-01-21 00:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][630/1251] eta 0:22:57 lr 0.000718 time 2.9937 (2.2189) loss 4.4204 (3.6789) grad_norm 1.1514 (1.2845) [2022-01-21 00:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][640/1251] eta 0:22:34 lr 0.000718 time 1.6976 (2.2169) loss 3.6950 (3.6808) grad_norm 1.1713 (1.2835) [2022-01-21 00:19:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][650/1251] eta 0:22:13 lr 0.000718 time 2.2991 (2.2180) loss 2.9448 (3.6855) grad_norm 1.2650 (1.2849) [2022-01-21 00:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][660/1251] eta 0:21:50 lr 0.000718 time 2.8015 (2.2168) loss 3.6133 (3.6823) grad_norm 1.2999 (1.2844) [2022-01-21 00:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][670/1251] eta 0:21:27 lr 0.000718 time 2.2591 (2.2157) loss 3.9982 (3.6813) grad_norm 1.2633 (1.2833) [2022-01-21 00:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][680/1251] eta 0:21:05 lr 0.000718 time 1.8437 (2.2155) loss 3.9920 (3.6817) grad_norm 1.5397 (1.2847) [2022-01-21 00:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][690/1251] eta 0:20:43 lr 0.000718 time 2.0092 (2.2168) loss 3.6783 (3.6764) grad_norm 1.4449 (1.2852) [2022-01-21 00:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][700/1251] eta 0:20:22 lr 0.000718 time 2.6168 (2.2182) loss 4.2264 (3.6795) grad_norm 1.0680 (1.2851) [2022-01-21 00:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][710/1251] eta 0:20:00 lr 0.000718 time 3.0367 (2.2195) loss 3.2452 (3.6786) grad_norm 1.2752 (1.2843) [2022-01-21 00:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][720/1251] eta 0:19:38 lr 0.000718 time 1.9079 (2.2189) loss 3.9048 (3.6762) grad_norm 1.1194 (1.2843) [2022-01-21 00:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][730/1251] eta 0:19:15 lr 0.000718 time 1.9886 (2.2186) loss 4.6781 (3.6743) grad_norm 1.2226 (1.2851) [2022-01-21 00:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][740/1251] eta 0:18:52 lr 0.000718 time 2.1562 (2.2166) loss 4.0692 (3.6742) grad_norm 1.2333 (1.2869) [2022-01-21 00:23:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][750/1251] eta 0:18:29 lr 
0.000718 time 2.5084 (2.2149) loss 3.0391 (3.6727) grad_norm 1.2078 (1.2871) [2022-01-21 00:23:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][760/1251] eta 0:18:06 lr 0.000718 time 1.9259 (2.2122) loss 2.7683 (3.6737) grad_norm 1.2429 (1.2880) [2022-01-21 00:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][770/1251] eta 0:17:42 lr 0.000718 time 1.5333 (2.2099) loss 3.0245 (3.6721) grad_norm 1.4953 (1.2879) [2022-01-21 00:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][780/1251] eta 0:17:21 lr 0.000718 time 2.9541 (2.2106) loss 3.8206 (3.6719) grad_norm 1.4752 (1.2879) [2022-01-21 00:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][790/1251] eta 0:16:58 lr 0.000717 time 1.7407 (2.2098) loss 3.7282 (3.6769) grad_norm 1.1549 (1.2874) [2022-01-21 00:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][800/1251] eta 0:16:36 lr 0.000717 time 2.2363 (2.2103) loss 3.8795 (3.6770) grad_norm 1.0593 (1.2864) [2022-01-21 00:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][810/1251] eta 0:16:14 lr 0.000717 time 1.9530 (2.2095) loss 4.2244 (3.6768) grad_norm 1.2801 (1.2869) [2022-01-21 00:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][820/1251] eta 0:15:52 lr 0.000717 time 1.8717 (2.2099) loss 3.4947 (3.6814) grad_norm 1.3136 (1.2875) [2022-01-21 00:26:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][830/1251] eta 0:15:30 lr 0.000717 time 2.2469 (2.2105) loss 3.5112 (3.6829) grad_norm 1.1500 (1.2865) [2022-01-21 00:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][840/1251] eta 0:15:08 lr 0.000717 time 2.2946 (2.2114) loss 4.0962 (3.6826) grad_norm 1.1775 (1.2865) [2022-01-21 00:26:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][850/1251] eta 0:14:46 lr 0.000717 time 1.6691 (2.2101) loss 3.9842 (3.6825) grad_norm 1.1274 (1.2864) [2022-01-21 00:27:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][860/1251] eta 0:14:23 lr 0.000717 time 1.8128 (2.2091) loss 4.2611 (3.6816) grad_norm 1.2201 (1.2880) [2022-01-21 00:27:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][870/1251] eta 0:14:01 lr 0.000717 time 1.9204 (2.2095) loss 3.1472 (3.6806) grad_norm 1.3713 (1.2875) [2022-01-21 00:27:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][880/1251] eta 0:13:40 lr 0.000717 time 2.8024 (2.2109) loss 4.2522 (3.6804) grad_norm 1.4392 (1.2881) [2022-01-21 00:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][890/1251] eta 0:13:18 lr 0.000717 time 1.9968 (2.2108) loss 4.0329 (3.6813) grad_norm 1.2360 (1.2877) [2022-01-21 00:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][900/1251] eta 0:12:55 lr 0.000717 time 2.0423 (2.2094) loss 3.2217 (3.6796) grad_norm 1.1165 (1.2882) [2022-01-21 00:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][910/1251] eta 0:12:32 lr 0.000717 time 1.8984 (2.2082) loss 4.1238 (3.6822) grad_norm 1.0520 (1.2876) [2022-01-21 00:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][920/1251] eta 0:12:10 lr 0.000717 time 1.8530 (2.2068) loss 4.1307 (3.6838) grad_norm 1.2204 (1.2865) [2022-01-21 00:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][930/1251] eta 0:11:47 lr 0.000717 time 2.1051 (2.2041) loss 2.8602 (3.6859) grad_norm 1.3013 (1.2865) [2022-01-21 00:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][940/1251] eta 0:11:25 lr 0.000717 time 1.9222 (2.2037) loss 3.1075 (3.6864) grad_norm 1.7058 (1.2875) [2022-01-21 00:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][950/1251] eta 0:11:03 lr 0.000717 time 1.9505 (2.2036) loss 2.5983 (3.6855) grad_norm 1.3896 (1.2876) [2022-01-21 00:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][960/1251] eta 0:10:41 lr 
0.000717 time 1.9477 (2.2036) loss 2.6356 (3.6865) grad_norm 1.2159 (1.2877) [2022-01-21 00:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][970/1251] eta 0:10:19 lr 0.000717 time 2.7724 (2.2049) loss 4.3463 (3.6882) grad_norm 1.3114 (1.2874) [2022-01-21 00:31:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][980/1251] eta 0:09:57 lr 0.000717 time 2.0636 (2.2053) loss 3.9579 (3.6893) grad_norm 1.3531 (1.2876) [2022-01-21 00:31:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][990/1251] eta 0:09:35 lr 0.000717 time 2.1387 (2.2051) loss 3.5803 (3.6867) grad_norm 1.2997 (1.2881) [2022-01-21 00:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1000/1251] eta 0:09:13 lr 0.000717 time 2.7566 (2.2046) loss 3.9385 (3.6871) grad_norm 1.5580 (1.2877) [2022-01-21 00:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1010/1251] eta 0:08:51 lr 0.000717 time 2.7106 (2.2034) loss 3.6505 (3.6891) grad_norm 1.1441 (1.2886) [2022-01-21 00:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1020/1251] eta 0:08:28 lr 0.000717 time 1.9529 (2.2018) loss 4.0033 (3.6919) grad_norm 1.2196 (1.2885) [2022-01-21 00:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1030/1251] eta 0:08:06 lr 0.000717 time 1.8993 (2.2012) loss 4.2114 (3.6929) grad_norm 1.1234 (1.2875) [2022-01-21 00:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1040/1251] eta 0:07:44 lr 0.000717 time 1.8903 (2.2016) loss 3.7376 (3.6935) grad_norm 1.3483 (1.2869) [2022-01-21 00:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1050/1251] eta 0:07:22 lr 0.000717 time 2.7128 (2.2019) loss 3.6186 (3.6925) grad_norm 1.5130 (1.2866) [2022-01-21 00:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1060/1251] eta 0:07:00 lr 0.000716 time 2.1611 (2.2007) loss 4.1170 (3.6926) grad_norm 1.4168 (1.2870) [2022-01-21 
00:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1070/1251] eta 0:06:38 lr 0.000716 time 2.0189 (2.2006) loss 4.0792 (3.6930) grad_norm 1.2104 (1.2874) [2022-01-21 00:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1080/1251] eta 0:06:16 lr 0.000716 time 2.7386 (2.2015) loss 3.3458 (3.6958) grad_norm 1.2168 (1.2876) [2022-01-21 00:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1090/1251] eta 0:05:54 lr 0.000716 time 1.9325 (2.2019) loss 4.1492 (3.6960) grad_norm 1.1753 (1.2870) [2022-01-21 00:35:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1100/1251] eta 0:05:32 lr 0.000716 time 2.4720 (2.2013) loss 4.2232 (3.6938) grad_norm 1.4709 (1.2878) [2022-01-21 00:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1110/1251] eta 0:05:10 lr 0.000716 time 1.4283 (2.1997) loss 4.2579 (3.6936) grad_norm 1.3941 (1.2880) [2022-01-21 00:36:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1120/1251] eta 0:04:48 lr 0.000716 time 2.8871 (2.1999) loss 3.5677 (3.6914) grad_norm 1.2960 (1.2885) [2022-01-21 00:36:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1130/1251] eta 0:04:26 lr 0.000716 time 2.1887 (2.1987) loss 3.9532 (3.6904) grad_norm 1.2018 (1.2891) [2022-01-21 00:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1140/1251] eta 0:04:03 lr 0.000716 time 1.8473 (2.1981) loss 4.4509 (3.6921) grad_norm 1.3380 (1.2890) [2022-01-21 00:37:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1150/1251] eta 0:03:41 lr 0.000716 time 1.7785 (2.1980) loss 3.4285 (3.6913) grad_norm 1.3218 (1.2894) [2022-01-21 00:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1160/1251] eta 0:03:20 lr 0.000716 time 2.4011 (2.1982) loss 4.1630 (3.6920) grad_norm 1.1589 (1.2895) [2022-01-21 00:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1170/1251] 
eta 0:02:58 lr 0.000716 time 2.7125 (2.1984) loss 3.7258 (3.6919) grad_norm 1.2472 (1.2896) [2022-01-21 00:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1180/1251] eta 0:02:36 lr 0.000716 time 2.3950 (2.1993) loss 3.8401 (3.6924) grad_norm 1.2405 (1.2895) [2022-01-21 00:39:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1190/1251] eta 0:02:14 lr 0.000716 time 2.1958 (2.1988) loss 2.5887 (3.6910) grad_norm 1.1133 (1.2890) [2022-01-21 00:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1200/1251] eta 0:01:52 lr 0.000716 time 3.2697 (2.1992) loss 3.9118 (3.6926) grad_norm 1.2940 (1.2885) [2022-01-21 00:39:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1210/1251] eta 0:01:30 lr 0.000716 time 1.5912 (2.1982) loss 2.3864 (3.6898) grad_norm 1.2350 (1.2885) [2022-01-21 00:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1220/1251] eta 0:01:08 lr 0.000716 time 1.6199 (2.1983) loss 4.1498 (3.6890) grad_norm 1.2770 (1.2878) [2022-01-21 00:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1230/1251] eta 0:00:46 lr 0.000716 time 1.8602 (2.1974) loss 4.1868 (3.6884) grad_norm 1.0552 (1.2874) [2022-01-21 00:40:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1240/1251] eta 0:00:24 lr 0.000716 time 2.7089 (2.1982) loss 4.3603 (3.6892) grad_norm 1.2620 (1.2870) [2022-01-21 00:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [107/300][1250/1251] eta 0:00:02 lr 0.000716 time 1.2138 (2.1916) loss 2.8118 (3.6882) grad_norm 1.2200 (1.2869) [2022-01-21 00:41:10 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 107 training takes 0:45:42 [2022-01-21 00:41:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.419 (18.419) Loss 1.0641 (1.0641) Acc@1 76.172 (76.172) Acc@5 92.773 (92.773) [2022-01-21 00:41:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.017 (3.408) 
Loss 1.1609 (1.0727) Acc@1 72.070 (74.707) Acc@5 91.113 (92.694) [2022-01-21 00:42:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.293 (2.763) Loss 1.0478 (1.0827) Acc@1 75.488 (74.456) Acc@5 92.285 (92.480) [2022-01-21 00:42:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.305 (2.274) Loss 1.1104 (1.0860) Acc@1 74.219 (74.291) Acc@5 91.504 (92.408) [2022-01-21 00:42:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.808 (2.169) Loss 1.0935 (1.0865) Acc@1 75.391 (74.219) Acc@5 92.383 (92.445) [2022-01-21 00:42:47 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.062 Acc@5 92.408 [2022-01-21 00:42:47 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-01-21 00:42:47 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.23% [2022-01-21 00:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][0/1251] eta 7:24:49 lr 0.000716 time 21.3346 (21.3346) loss 4.1280 (4.1280) grad_norm 1.1840 (1.1840) [2022-01-21 00:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][10/1251] eta 1:20:51 lr 0.000716 time 1.8392 (3.9090) loss 2.7842 (3.6450) grad_norm 1.1590 (1.2363) [2022-01-21 00:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][20/1251] eta 1:03:39 lr 0.000716 time 1.8744 (3.1026) loss 2.9750 (3.7006) grad_norm 1.1731 (1.2403) [2022-01-21 00:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][30/1251] eta 0:56:56 lr 0.000716 time 1.7178 (2.7980) loss 3.7207 (3.7508) grad_norm 1.2847 (1.2456) [2022-01-21 00:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][40/1251] eta 0:54:11 lr 0.000716 time 4.8357 (2.6851) loss 3.7243 (3.7392) grad_norm 1.1462 (1.2517) [2022-01-21 00:45:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][50/1251] eta 0:52:21 lr 0.000716 time 2.1837 (2.6160) loss 2.8805 (3.7495) 
grad_norm 1.2681 (1.2553) [2022-01-21 00:45:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][60/1251] eta 0:50:32 lr 0.000716 time 1.5006 (2.5466) loss 3.7520 (3.7348) grad_norm 1.1821 (1.2524) [2022-01-21 00:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][70/1251] eta 0:49:12 lr 0.000715 time 2.7584 (2.5004) loss 3.6047 (3.7161) grad_norm 1.4306 (1.2642) [2022-01-21 00:46:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][80/1251] eta 0:48:32 lr 0.000715 time 3.9392 (2.4870) loss 4.0659 (3.7088) grad_norm 1.2273 (1.2669) [2022-01-21 00:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][90/1251] eta 0:47:38 lr 0.000715 time 2.4299 (2.4621) loss 2.7221 (3.6848) grad_norm 1.1645 (1.2796) [2022-01-21 00:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][100/1251] eta 0:46:41 lr 0.000715 time 2.7279 (2.4341) loss 4.1636 (3.6731) grad_norm 1.1915 (1.2868) [2022-01-21 00:47:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][110/1251] eta 0:45:42 lr 0.000715 time 1.6610 (2.4039) loss 3.5135 (3.6812) grad_norm 1.3514 (1.2867) [2022-01-21 00:47:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][120/1251] eta 0:45:08 lr 0.000715 time 3.1565 (2.3951) loss 2.9891 (3.6751) grad_norm 1.3925 (1.2833) [2022-01-21 00:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][130/1251] eta 0:44:42 lr 0.000715 time 3.0638 (2.3932) loss 4.3741 (3.7035) grad_norm 1.5996 (1.2808) [2022-01-21 00:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][140/1251] eta 0:43:49 lr 0.000715 time 1.8907 (2.3671) loss 2.9140 (3.6919) grad_norm 1.2117 (1.2754) [2022-01-21 00:48:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][150/1251] eta 0:43:03 lr 0.000715 time 1.7949 (2.3469) loss 2.6537 (3.7000) grad_norm 1.2395 (1.2729) [2022-01-21 00:49:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[108/300][160/1251] eta 0:42:29 lr 0.000715 time 2.7714 (2.3365) loss 2.6772 (3.6956) grad_norm 1.2972 (1.2709) [2022-01-21 00:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][170/1251] eta 0:41:53 lr 0.000715 time 2.1105 (2.3255) loss 4.0113 (3.7001) grad_norm 1.3444 (1.2684) [2022-01-21 00:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][180/1251] eta 0:41:17 lr 0.000715 time 1.9211 (2.3129) loss 3.9035 (3.7003) grad_norm 1.3502 (1.2637) [2022-01-21 00:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][190/1251] eta 0:40:47 lr 0.000715 time 1.6788 (2.3065) loss 4.1358 (3.7051) grad_norm 1.1233 (1.2597) [2022-01-21 00:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][200/1251] eta 0:40:23 lr 0.000715 time 3.3890 (2.3061) loss 4.3004 (3.7105) grad_norm 1.1548 (1.2592) [2022-01-21 00:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][210/1251] eta 0:39:53 lr 0.000715 time 1.8142 (2.2988) loss 3.9754 (3.7087) grad_norm 1.4622 (1.2612) [2022-01-21 00:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][220/1251] eta 0:39:31 lr 0.000715 time 2.1902 (2.3006) loss 3.5407 (3.7158) grad_norm 1.3997 (1.2642) [2022-01-21 00:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][230/1251] eta 0:39:00 lr 0.000715 time 1.6840 (2.2922) loss 4.1929 (3.7111) grad_norm 1.4345 (1.2699) [2022-01-21 00:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][240/1251] eta 0:38:32 lr 0.000715 time 2.3420 (2.2875) loss 3.9532 (3.7114) grad_norm 1.0959 (1.2762) [2022-01-21 00:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][250/1251] eta 0:38:01 lr 0.000715 time 1.8949 (2.2796) loss 3.3101 (3.7146) grad_norm 1.2168 (1.2785) [2022-01-21 00:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][260/1251] eta 0:37:33 lr 0.000715 time 1.8775 (2.2736) loss 3.6383 (3.7121) grad_norm 
1.1743 (1.2780) [2022-01-21 00:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][270/1251] eta 0:37:08 lr 0.000715 time 2.4136 (2.2716) loss 4.1493 (3.7164) grad_norm 1.5637 (1.2826) [2022-01-21 00:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][280/1251] eta 0:36:49 lr 0.000715 time 3.5604 (2.2758) loss 4.1420 (3.7119) grad_norm 1.2259 (1.2836) [2022-01-21 00:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][290/1251] eta 0:36:23 lr 0.000715 time 1.8813 (2.2724) loss 3.8703 (3.7175) grad_norm 1.6046 (1.2836) [2022-01-21 00:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][300/1251] eta 0:36:01 lr 0.000715 time 2.2487 (2.2727) loss 3.9984 (3.7195) grad_norm 1.2960 (1.2850) [2022-01-21 00:54:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][310/1251] eta 0:35:28 lr 0.000715 time 1.8208 (2.2621) loss 2.8878 (3.7186) grad_norm 1.1984 (1.2839) [2022-01-21 00:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][320/1251] eta 0:35:00 lr 0.000715 time 2.2042 (2.2564) loss 4.1273 (3.7198) grad_norm 1.2451 (1.2835) [2022-01-21 00:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][330/1251] eta 0:34:34 lr 0.000715 time 2.1745 (2.2523) loss 3.9472 (3.7258) grad_norm 1.1711 (1.2833) [2022-01-21 00:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][340/1251] eta 0:34:08 lr 0.000714 time 1.8371 (2.2482) loss 4.0600 (3.7221) grad_norm 1.1337 (1.2810) [2022-01-21 00:55:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][350/1251] eta 0:33:45 lr 0.000714 time 1.9085 (2.2475) loss 3.9372 (3.7244) grad_norm 1.3350 (1.2814) [2022-01-21 00:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][360/1251] eta 0:33:23 lr 0.000714 time 2.3094 (2.2487) loss 3.7984 (3.7225) grad_norm 1.1885 (1.2814) [2022-01-21 00:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[108/300][370/1251] eta 0:32:59 lr 0.000714 time 1.9390 (2.2473) loss 3.0435 (3.7051) grad_norm 1.1293 (1.2827) [2022-01-21 00:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][380/1251] eta 0:32:39 lr 0.000714 time 3.1221 (2.2492) loss 3.4193 (3.7024) grad_norm 1.1826 (1.2823) [2022-01-21 00:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][390/1251] eta 0:32:19 lr 0.000714 time 1.8322 (2.2527) loss 3.4918 (3.6939) grad_norm 1.2771 (1.2847) [2022-01-21 00:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][400/1251] eta 0:31:52 lr 0.000714 time 2.3121 (2.2473) loss 4.1016 (3.6912) grad_norm 1.2112 (1.2847) [2022-01-21 00:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][410/1251] eta 0:31:24 lr 0.000714 time 1.8556 (2.2411) loss 3.9867 (3.6989) grad_norm 1.3027 (1.2873) [2022-01-21 00:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][420/1251] eta 0:30:59 lr 0.000714 time 1.9107 (2.2381) loss 3.6740 (3.6959) grad_norm 1.1491 (1.2864) [2022-01-21 00:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][430/1251] eta 0:30:35 lr 0.000714 time 2.4910 (2.2362) loss 3.5089 (3.6966) grad_norm 1.2835 (1.2845) [2022-01-21 00:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][440/1251] eta 0:30:14 lr 0.000714 time 2.2410 (2.2371) loss 3.7975 (3.7022) grad_norm 1.5095 (1.2843) [2022-01-21 00:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][450/1251] eta 0:29:51 lr 0.000714 time 1.5625 (2.2365) loss 4.0251 (3.7057) grad_norm 1.4265 (1.2841) [2022-01-21 00:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][460/1251] eta 0:29:27 lr 0.000714 time 1.5907 (2.2347) loss 2.8574 (3.7049) grad_norm 1.3955 (1.2846) [2022-01-21 01:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][470/1251] eta 0:29:06 lr 0.000714 time 3.5408 (2.2360) loss 3.5250 (3.6997) grad_norm 
1.3058 (1.2851) [2022-01-21 01:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][480/1251] eta 0:28:43 lr 0.000714 time 2.2992 (2.2348) loss 3.4856 (3.6981) grad_norm 1.2641 (1.2835) [2022-01-21 01:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][490/1251] eta 0:28:19 lr 0.000714 time 1.5021 (2.2338) loss 4.1664 (3.7016) grad_norm 1.3551 (1.2846) [2022-01-21 01:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][500/1251] eta 0:27:53 lr 0.000714 time 2.1383 (2.2288) loss 2.5982 (3.7006) grad_norm 1.2123 (1.2843) [2022-01-21 01:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][510/1251] eta 0:27:31 lr 0.000714 time 3.1327 (2.2286) loss 3.7992 (3.7005) grad_norm 1.2020 (1.2842) [2022-01-21 01:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][520/1251] eta 0:27:07 lr 0.000714 time 2.2649 (2.2271) loss 4.1006 (3.7021) grad_norm 1.1811 (1.2837) [2022-01-21 01:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][530/1251] eta 0:26:45 lr 0.000714 time 1.5189 (2.2265) loss 4.5887 (3.7054) grad_norm 1.3393 (1.2838) [2022-01-21 01:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][540/1251] eta 0:26:22 lr 0.000714 time 2.1058 (2.2259) loss 3.9955 (3.7069) grad_norm 1.1093 (1.2843) [2022-01-21 01:03:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][550/1251] eta 0:26:00 lr 0.000714 time 2.6950 (2.2255) loss 4.0700 (3.7057) grad_norm 1.2223 (1.2833) [2022-01-21 01:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][560/1251] eta 0:25:37 lr 0.000714 time 1.7709 (2.2257) loss 3.5533 (3.7047) grad_norm 1.2926 (1.2837) [2022-01-21 01:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][570/1251] eta 0:25:15 lr 0.000714 time 2.1637 (2.2249) loss 4.3815 (3.7102) grad_norm 1.6642 (1.2839) [2022-01-21 01:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[108/300][580/1251] eta 0:24:50 lr 0.000714 time 1.6481 (2.2215) loss 3.1371 (3.7113) grad_norm 1.1786 (1.2829) [2022-01-21 01:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][590/1251] eta 0:24:28 lr 0.000714 time 3.6584 (2.2219) loss 3.7847 (3.7073) grad_norm 1.2825 (1.2831) [2022-01-21 01:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][600/1251] eta 0:24:05 lr 0.000714 time 2.4809 (2.2203) loss 3.0628 (3.7083) grad_norm 1.2602 (1.2834) [2022-01-21 01:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][610/1251] eta 0:23:41 lr 0.000713 time 1.5211 (2.2178) loss 3.5716 (3.7119) grad_norm 1.1563 (1.2830) [2022-01-21 01:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][620/1251] eta 0:23:18 lr 0.000713 time 1.5969 (2.2165) loss 3.8628 (3.7124) grad_norm 1.2676 (1.2823) [2022-01-21 01:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][630/1251] eta 0:22:56 lr 0.000713 time 3.1172 (2.2174) loss 4.5401 (3.7090) grad_norm 1.2293 (1.2817) [2022-01-21 01:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][640/1251] eta 0:22:35 lr 0.000713 time 2.1380 (2.2182) loss 3.7982 (3.7100) grad_norm 1.3960 (1.2817) [2022-01-21 01:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][650/1251] eta 0:22:12 lr 0.000713 time 1.7254 (2.2165) loss 4.0446 (3.7130) grad_norm 1.1094 (1.2810) [2022-01-21 01:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][660/1251] eta 0:21:49 lr 0.000713 time 1.8367 (2.2158) loss 4.0779 (3.7146) grad_norm 1.5442 (1.2811) [2022-01-21 01:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][670/1251] eta 0:21:26 lr 0.000713 time 2.2430 (2.2144) loss 3.9653 (3.7154) grad_norm 1.1870 (1.2808) [2022-01-21 01:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][680/1251] eta 0:21:04 lr 0.000713 time 2.2377 (2.2142) loss 3.9117 (3.7114) grad_norm 
1.1350 (1.2809) [2022-01-21 01:08:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][690/1251] eta 0:20:41 lr 0.000713 time 2.2409 (2.2131) loss 3.5175 (3.7066) grad_norm 1.2140 (1.2803) [2022-01-21 01:08:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][700/1251] eta 0:20:18 lr 0.000713 time 1.6448 (2.2119) loss 3.0123 (3.7059) grad_norm 1.1596 (1.2792) [2022-01-21 01:08:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][710/1251] eta 0:19:55 lr 0.000713 time 1.8667 (2.2102) loss 2.8016 (3.7060) grad_norm 1.1037 (1.2785) [2022-01-21 01:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][720/1251] eta 0:19:34 lr 0.000713 time 2.8501 (2.2112) loss 4.2458 (3.7070) grad_norm 1.1964 (1.2788) [2022-01-21 01:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][730/1251] eta 0:19:12 lr 0.000713 time 1.9231 (2.2126) loss 3.8945 (3.7080) grad_norm 1.3418 (1.2788) [2022-01-21 01:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][740/1251] eta 0:18:50 lr 0.000713 time 1.8641 (2.2131) loss 3.5710 (3.7106) grad_norm 1.3635 (1.2800) [2022-01-21 01:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][750/1251] eta 0:18:28 lr 0.000713 time 1.9673 (2.2131) loss 4.6081 (3.7091) grad_norm 1.2690 (1.2800) [2022-01-21 01:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][760/1251] eta 0:18:06 lr 0.000713 time 3.4143 (2.2130) loss 4.3827 (3.7081) grad_norm 1.3602 (1.2803) [2022-01-21 01:11:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][770/1251] eta 0:17:44 lr 0.000713 time 1.9419 (2.2131) loss 3.7507 (3.7061) grad_norm 1.3026 (1.2803) [2022-01-21 01:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][780/1251] eta 0:17:22 lr 0.000713 time 1.9632 (2.2125) loss 3.7675 (3.7040) grad_norm 1.3297 (1.2801) [2022-01-21 01:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[108/300][790/1251] eta 0:16:59 lr 0.000713 time 2.2304 (2.2112) loss 2.5555 (3.7036) grad_norm 1.2568 (1.2801) [2022-01-21 01:12:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][800/1251] eta 0:16:37 lr 0.000713 time 3.1574 (2.2107) loss 4.1152 (3.7082) grad_norm 1.1291 (1.2796) [2022-01-21 01:12:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][810/1251] eta 0:16:14 lr 0.000713 time 2.3490 (2.2094) loss 4.0090 (3.7091) grad_norm 1.3031 (1.2791) [2022-01-21 01:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][820/1251] eta 0:15:51 lr 0.000713 time 1.6029 (2.2073) loss 3.3960 (3.7096) grad_norm 1.3968 (1.2794) [2022-01-21 01:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][830/1251] eta 0:15:28 lr 0.000713 time 2.6064 (2.2061) loss 4.5149 (3.7096) grad_norm 1.3349 (1.2789) [2022-01-21 01:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][840/1251] eta 0:15:06 lr 0.000713 time 1.6367 (2.2052) loss 3.6182 (3.7125) grad_norm 1.4446 (1.2794) [2022-01-21 01:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][850/1251] eta 0:14:44 lr 0.000713 time 2.6308 (2.2063) loss 4.1702 (3.7143) grad_norm 1.0869 (1.2790) [2022-01-21 01:14:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][860/1251] eta 0:14:22 lr 0.000713 time 2.1017 (2.2068) loss 4.1630 (3.7163) grad_norm 1.1749 (1.2781) [2022-01-21 01:14:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][870/1251] eta 0:14:01 lr 0.000712 time 2.0639 (2.2093) loss 4.1946 (3.7155) grad_norm 1.1830 (1.2775) [2022-01-21 01:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][880/1251] eta 0:13:39 lr 0.000712 time 1.6781 (2.2096) loss 3.4453 (3.7162) grad_norm 1.2341 (1.2769) [2022-01-21 01:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][890/1251] eta 0:13:17 lr 0.000712 time 2.0297 (2.2099) loss 3.9992 (3.7164) grad_norm 
1.2169 (1.2770) [2022-01-21 01:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][900/1251] eta 0:12:54 lr 0.000712 time 1.9186 (2.2071) loss 3.0273 (3.7131) grad_norm 1.1951 (1.2773) [2022-01-21 01:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][910/1251] eta 0:12:31 lr 0.000712 time 1.8631 (2.2044) loss 3.9941 (3.7132) grad_norm 1.3315 (1.2776) [2022-01-21 01:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][920/1251] eta 0:12:08 lr 0.000712 time 1.8152 (2.2023) loss 3.0220 (3.7097) grad_norm 1.2110 (1.2776) [2022-01-21 01:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][930/1251] eta 0:11:46 lr 0.000712 time 1.9590 (2.2004) loss 4.0544 (3.7085) grad_norm 1.3384 (1.2785) [2022-01-21 01:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][940/1251] eta 0:11:24 lr 0.000712 time 2.7077 (2.2007) loss 3.9716 (3.7092) grad_norm 1.2975 (1.2789) [2022-01-21 01:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][950/1251] eta 0:11:02 lr 0.000712 time 1.9321 (2.2007) loss 3.9663 (3.7108) grad_norm 1.2256 (1.2790) [2022-01-21 01:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][960/1251] eta 0:10:40 lr 0.000712 time 1.8998 (2.2012) loss 4.2331 (3.7094) grad_norm 1.1261 (1.2789) [2022-01-21 01:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][970/1251] eta 0:10:19 lr 0.000712 time 1.9502 (2.2029) loss 3.1710 (3.7097) grad_norm 1.2883 (1.2798) [2022-01-21 01:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][980/1251] eta 0:09:57 lr 0.000712 time 2.8407 (2.2062) loss 3.4666 (3.7088) grad_norm 1.4171 (1.2798) [2022-01-21 01:19:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][990/1251] eta 0:09:35 lr 0.000712 time 1.8108 (2.2059) loss 4.4742 (3.7120) grad_norm 1.3829 (1.2791) [2022-01-21 01:19:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[108/300][1000/1251] eta 0:09:13 lr 0.000712 time 2.1795 (2.2055) loss 3.9653 (3.7148) grad_norm 1.1199 (1.2793) [2022-01-21 01:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1010/1251] eta 0:08:51 lr 0.000712 time 1.5454 (2.2034) loss 3.4335 (3.7130) grad_norm 1.1394 (1.2795) [2022-01-21 01:20:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1020/1251] eta 0:08:28 lr 0.000712 time 2.2515 (2.2023) loss 4.4476 (3.7145) grad_norm 1.3247 (1.2795) [2022-01-21 01:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1030/1251] eta 0:08:06 lr 0.000712 time 2.5832 (2.2020) loss 3.4657 (3.7129) grad_norm 1.5935 (1.2792) [2022-01-21 01:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1040/1251] eta 0:07:44 lr 0.000712 time 2.1239 (2.2018) loss 4.0640 (3.7104) grad_norm 1.4508 (1.2794) [2022-01-21 01:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1050/1251] eta 0:07:22 lr 0.000712 time 2.1592 (2.2036) loss 3.0650 (3.7110) grad_norm 1.2431 (1.2789) [2022-01-21 01:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1060/1251] eta 0:07:00 lr 0.000712 time 2.2457 (2.2033) loss 2.8977 (3.7075) grad_norm 1.2018 (1.2786) [2022-01-21 01:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1070/1251] eta 0:06:38 lr 0.000712 time 2.2108 (2.2030) loss 4.3996 (3.7082) grad_norm 1.1678 (1.2782) [2022-01-21 01:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1080/1251] eta 0:06:16 lr 0.000712 time 2.2576 (2.2031) loss 4.1250 (3.7106) grad_norm 1.2100 (1.2777) [2022-01-21 01:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1090/1251] eta 0:05:54 lr 0.000712 time 2.0130 (2.2033) loss 3.7930 (3.7126) grad_norm 1.6434 (1.2774) [2022-01-21 01:23:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1100/1251] eta 0:05:32 lr 0.000712 time 1.9363 (2.2020) loss 3.2673 (3.7131) 
grad_norm 1.2523 (1.2774) [2022-01-21 01:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1110/1251] eta 0:05:10 lr 0.000712 time 2.3387 (2.2016) loss 4.1356 (3.7131) grad_norm 1.3084 (1.2782) [2022-01-21 01:23:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1120/1251] eta 0:04:48 lr 0.000712 time 1.9586 (2.2002) loss 4.0213 (3.7137) grad_norm 1.2623 (1.2781) [2022-01-21 01:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1130/1251] eta 0:04:26 lr 0.000712 time 2.0222 (2.1991) loss 3.4685 (3.7107) grad_norm 1.4555 (1.2781) [2022-01-21 01:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1140/1251] eta 0:04:04 lr 0.000711 time 1.6178 (2.1985) loss 3.9596 (3.7098) grad_norm 1.1712 (1.2779) [2022-01-21 01:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1150/1251] eta 0:03:42 lr 0.000711 time 2.6828 (2.1996) loss 3.7172 (3.7114) grad_norm 1.4746 (1.2784) [2022-01-21 01:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1160/1251] eta 0:03:20 lr 0.000711 time 2.8903 (2.2007) loss 4.4699 (3.7141) grad_norm 1.4806 (1.2793) [2022-01-21 01:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1170/1251] eta 0:02:58 lr 0.000711 time 1.6884 (2.2018) loss 4.1850 (3.7163) grad_norm 1.1925 (1.2792) [2022-01-21 01:26:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1180/1251] eta 0:02:36 lr 0.000711 time 1.7916 (2.2025) loss 4.2293 (3.7169) grad_norm 1.2922 (1.2800) [2022-01-21 01:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1190/1251] eta 0:02:14 lr 0.000711 time 1.8782 (2.2032) loss 3.0148 (3.7176) grad_norm 1.1499 (1.2805) [2022-01-21 01:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1200/1251] eta 0:01:52 lr 0.000711 time 3.1314 (2.2027) loss 3.0761 (3.7183) grad_norm 1.2188 (1.2802) [2022-01-21 01:27:13 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [108/300][1210/1251] eta 0:01:30 lr 0.000711 time 2.1931 (2.2015) loss 4.2538 (3.7167) grad_norm 1.2213 (1.2798) [2022-01-21 01:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1220/1251] eta 0:01:08 lr 0.000711 time 2.2114 (2.1998) loss 3.5259 (3.7174) grad_norm 1.1241 (1.2801) [2022-01-21 01:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1230/1251] eta 0:00:46 lr 0.000711 time 2.2482 (2.1983) loss 3.7163 (3.7190) grad_norm 1.2731 (1.2804) [2022-01-21 01:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1240/1251] eta 0:00:24 lr 0.000711 time 1.6730 (2.1990) loss 3.4287 (3.7190) grad_norm 1.2827 (1.2804) [2022-01-21 01:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [108/300][1250/1251] eta 0:00:02 lr 0.000711 time 1.2880 (2.1934) loss 3.9799 (3.7207) grad_norm 1.2843 (1.2804) [2022-01-21 01:28:31 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 108 training takes 0:45:44 [2022-01-21 01:28:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.712 (18.712) Loss 1.1144 (1.1144) Acc@1 73.340 (73.340) Acc@5 91.211 (91.211) [2022-01-21 01:29:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.319 (3.219) Loss 1.0935 (1.0923) Acc@1 72.168 (73.917) Acc@5 93.262 (92.356) [2022-01-21 01:29:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.641 (2.547) Loss 1.1945 (1.1028) Acc@1 71.875 (73.772) Acc@5 90.723 (92.188) [2022-01-21 01:29:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.572 (2.328) Loss 1.0149 (1.0846) Acc@1 76.172 (74.162) Acc@5 94.531 (92.468) [2022-01-21 01:30:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 5.037 (2.186) Loss 1.1054 (1.0898) Acc@1 74.121 (74.021) Acc@5 92.480 (92.452) [2022-01-21 01:30:07 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.076 Acc@5 92.410 [2022-01-21 01:30:07 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-01-21 01:30:07 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.23% [2022-01-21 01:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][0/1251] eta 7:31:29 lr 0.000711 time 21.6540 (21.6540) loss 3.4089 (3.4089) grad_norm 1.1617 (1.1617) [2022-01-21 01:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][10/1251] eta 1:23:33 lr 0.000711 time 1.5424 (4.0402) loss 4.3264 (3.5265) grad_norm 1.2256 (1.2304) [2022-01-21 01:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][20/1251] eta 1:04:22 lr 0.000711 time 2.0895 (3.1375) loss 3.2042 (3.5417) grad_norm 1.1555 (1.2383) [2022-01-21 01:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][30/1251] eta 0:56:36 lr 0.000711 time 1.5822 (2.7815) loss 2.9705 (3.5423) grad_norm 1.1456 (1.2423) [2022-01-21 01:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][40/1251] eta 0:54:33 lr 0.000711 time 3.9737 (2.7033) loss 3.9258 (3.6104) grad_norm 1.1387 (1.2445) [2022-01-21 01:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][50/1251] eta 0:51:29 lr 0.000711 time 1.3720 (2.5727) loss 3.8328 (3.6286) grad_norm 1.2881 (1.2459) [2022-01-21 01:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][60/1251] eta 0:49:24 lr 0.000711 time 2.5967 (2.4890) loss 3.8689 (3.6702) grad_norm 1.2103 (1.2514) [2022-01-21 01:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][70/1251] eta 0:47:43 lr 0.000711 time 2.2588 (2.4250) loss 2.8862 (3.6492) grad_norm 1.1840 (1.2625) [2022-01-21 01:33:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][80/1251] eta 0:47:37 lr 0.000711 time 5.3752 (2.4399) loss 3.8629 (3.6328) grad_norm 1.3375 (1.2566) [2022-01-21 01:33:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][90/1251] eta 0:46:35 lr 0.000711 time 
1.4600 (2.4081) loss 3.2457 (3.6081) grad_norm 1.4128 (1.2608) [2022-01-21 01:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][100/1251] eta 0:45:56 lr 0.000711 time 2.8202 (2.3952) loss 3.3170 (3.6121) grad_norm 1.1721 (1.2587) [2022-01-21 01:34:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][110/1251] eta 0:44:56 lr 0.000711 time 1.7460 (2.3636) loss 3.0216 (3.6066) grad_norm 1.1869 (1.2557) [2022-01-21 01:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][120/1251] eta 0:44:15 lr 0.000711 time 2.5298 (2.3478) loss 2.8025 (3.6074) grad_norm 1.3613 (1.2532) [2022-01-21 01:35:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][130/1251] eta 0:43:46 lr 0.000711 time 2.0291 (2.3433) loss 4.4148 (3.6070) grad_norm 1.3731 (1.2517) [2022-01-21 01:35:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][140/1251] eta 0:43:23 lr 0.000711 time 2.8052 (2.3434) loss 2.4794 (3.5769) grad_norm 1.4373 (1.2561) [2022-01-21 01:35:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][150/1251] eta 0:42:43 lr 0.000710 time 1.8860 (2.3283) loss 4.0396 (3.5855) grad_norm 1.2282 (1.2606) [2022-01-21 01:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][160/1251] eta 0:42:16 lr 0.000710 time 2.5913 (2.3246) loss 4.0417 (3.5646) grad_norm 1.3538 (1.2616) [2022-01-21 01:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][170/1251] eta 0:41:49 lr 0.000710 time 2.8704 (2.3219) loss 3.9544 (3.5748) grad_norm 1.3922 (1.2645) [2022-01-21 01:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][180/1251] eta 0:41:19 lr 0.000710 time 2.1724 (2.3149) loss 3.3065 (3.5814) grad_norm 1.2723 (1.2635) [2022-01-21 01:37:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][190/1251] eta 0:40:47 lr 0.000710 time 2.1954 (2.3068) loss 4.0967 (3.5803) grad_norm 1.4032 (1.2650) [2022-01-21 01:37:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][200/1251] eta 0:40:12 lr 0.000710 time 2.5171 (2.2952) loss 3.4609 (3.5870) grad_norm 1.2228 (1.2641) [2022-01-21 01:38:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][210/1251] eta 0:39:40 lr 0.000710 time 2.1689 (2.2866) loss 3.9119 (3.6104) grad_norm 1.2064 (1.2675) [2022-01-21 01:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][220/1251] eta 0:39:14 lr 0.000710 time 2.2280 (2.2837) loss 3.7698 (3.6292) grad_norm 1.6782 (1.2694) [2022-01-21 01:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][230/1251] eta 0:38:46 lr 0.000710 time 1.7036 (2.2784) loss 4.0135 (3.6317) grad_norm 1.2126 (1.2702) [2022-01-21 01:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][240/1251] eta 0:38:23 lr 0.000710 time 1.8434 (2.2789) loss 4.2359 (3.6414) grad_norm 1.0591 (1.2693) [2022-01-21 01:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][250/1251] eta 0:37:53 lr 0.000710 time 2.4517 (2.2710) loss 2.8274 (3.6373) grad_norm 1.2650 (1.2679) [2022-01-21 01:39:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][260/1251] eta 0:37:19 lr 0.000710 time 1.7865 (2.2594) loss 3.6851 (3.6389) grad_norm 1.3126 (1.2663) [2022-01-21 01:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][270/1251] eta 0:36:52 lr 0.000710 time 1.6465 (2.2553) loss 3.3211 (3.6316) grad_norm 1.1026 (1.2687) [2022-01-21 01:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][280/1251] eta 0:36:27 lr 0.000710 time 2.8612 (2.2525) loss 3.9808 (3.6326) grad_norm 1.2041 (1.2701) [2022-01-21 01:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][290/1251] eta 0:35:59 lr 0.000710 time 2.1041 (2.2474) loss 3.8105 (3.6381) grad_norm 1.1116 (1.2696) [2022-01-21 01:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][300/1251] eta 0:35:41 lr 
0.000710 time 1.6632 (2.2519) loss 3.9344 (3.6264) grad_norm 1.1086 (1.2688) [2022-01-21 01:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][310/1251] eta 0:35:19 lr 0.000710 time 1.9318 (2.2528) loss 3.1304 (3.6251) grad_norm 1.2330 (1.2673) [2022-01-21 01:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][320/1251] eta 0:35:02 lr 0.000710 time 2.1483 (2.2584) loss 2.5423 (3.6219) grad_norm 1.2781 (1.2667) [2022-01-21 01:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][330/1251] eta 0:34:36 lr 0.000710 time 2.0980 (2.2548) loss 4.2196 (3.6219) grad_norm 1.4052 (1.2698) [2022-01-21 01:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][340/1251] eta 0:34:07 lr 0.000710 time 1.6545 (2.2472) loss 4.1089 (3.6242) grad_norm 1.2601 (1.2709) [2022-01-21 01:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][350/1251] eta 0:33:43 lr 0.000710 time 1.6771 (2.2453) loss 3.8686 (3.6164) grad_norm 1.1889 (1.2696) [2022-01-21 01:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][360/1251] eta 0:33:22 lr 0.000710 time 2.2175 (2.2475) loss 4.2247 (3.6239) grad_norm 1.1891 (1.2677) [2022-01-21 01:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][370/1251] eta 0:33:00 lr 0.000710 time 1.7269 (2.2481) loss 2.7602 (3.6286) grad_norm 1.3479 (1.2684) [2022-01-21 01:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][380/1251] eta 0:32:36 lr 0.000710 time 1.8175 (2.2458) loss 3.7406 (3.6273) grad_norm 1.1076 (1.2671) [2022-01-21 01:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][390/1251] eta 0:32:07 lr 0.000710 time 1.9507 (2.2392) loss 3.5014 (3.6275) grad_norm 1.3867 (1.2682) [2022-01-21 01:45:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][400/1251] eta 0:31:45 lr 0.000710 time 1.8796 (2.2396) loss 2.6282 (3.6323) grad_norm 1.2796 (1.2690) [2022-01-21 01:45:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][410/1251] eta 0:31:23 lr 0.000710 time 1.6797 (2.2398) loss 4.0160 (3.6400) grad_norm 1.6320 (1.2711) [2022-01-21 01:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][420/1251] eta 0:30:59 lr 0.000709 time 1.5399 (2.2376) loss 3.5871 (3.6423) grad_norm 1.1512 (1.2729) [2022-01-21 01:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][430/1251] eta 0:30:34 lr 0.000709 time 1.9842 (2.2344) loss 2.9073 (3.6397) grad_norm 1.2257 (1.2726) [2022-01-21 01:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][440/1251] eta 0:30:12 lr 0.000709 time 2.2059 (2.2355) loss 2.4761 (3.6373) grad_norm 1.2820 (1.2713) [2022-01-21 01:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][450/1251] eta 0:29:49 lr 0.000709 time 1.8815 (2.2346) loss 3.7854 (3.6383) grad_norm 1.3269 (1.2713) [2022-01-21 01:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][460/1251] eta 0:29:24 lr 0.000709 time 1.8738 (2.2308) loss 4.5719 (3.6373) grad_norm 1.2677 (1.2738) [2022-01-21 01:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][470/1251] eta 0:29:02 lr 0.000709 time 1.9906 (2.2317) loss 2.9893 (3.6390) grad_norm 1.5493 (1.2760) [2022-01-21 01:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][480/1251] eta 0:28:40 lr 0.000709 time 1.6152 (2.2321) loss 4.2954 (3.6440) grad_norm 1.5848 (1.2763) [2022-01-21 01:48:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][490/1251] eta 0:28:16 lr 0.000709 time 1.6132 (2.2291) loss 4.3545 (3.6448) grad_norm 1.2241 (1.2751) [2022-01-21 01:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][500/1251] eta 0:27:52 lr 0.000709 time 1.8241 (2.2267) loss 4.3158 (3.6495) grad_norm 1.2227 (1.2745) [2022-01-21 01:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][510/1251] eta 0:27:28 lr 
0.000709 time 1.8860 (2.2244) loss 4.1621 (3.6536) grad_norm 1.4545 (1.2741) [2022-01-21 01:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][520/1251] eta 0:27:07 lr 0.000709 time 1.7901 (2.2259) loss 3.4306 (3.6566) grad_norm 1.3197 (1.2767) [2022-01-21 01:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][530/1251] eta 0:26:44 lr 0.000709 time 1.5469 (2.2250) loss 4.6223 (3.6538) grad_norm 1.1836 (1.2764) [2022-01-21 01:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][540/1251] eta 0:26:22 lr 0.000709 time 2.6985 (2.2259) loss 3.5744 (3.6530) grad_norm 1.2218 (1.2763) [2022-01-21 01:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][550/1251] eta 0:25:59 lr 0.000709 time 1.7638 (2.2245) loss 4.4011 (3.6545) grad_norm 1.1873 (1.2750) [2022-01-21 01:50:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][560/1251] eta 0:25:39 lr 0.000709 time 2.7516 (2.2273) loss 4.2604 (3.6529) grad_norm 1.1817 (1.2743) [2022-01-21 01:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][570/1251] eta 0:25:17 lr 0.000709 time 1.8225 (2.2280) loss 4.2169 (3.6510) grad_norm 1.4767 (1.2745) [2022-01-21 01:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][580/1251] eta 0:24:53 lr 0.000709 time 2.2206 (2.2263) loss 3.1066 (3.6517) grad_norm 1.1715 (1.2744) [2022-01-21 01:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][590/1251] eta 0:24:29 lr 0.000709 time 1.8089 (2.2229) loss 3.1072 (3.6551) grad_norm 1.1288 (1.2753) [2022-01-21 01:52:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][600/1251] eta 0:24:05 lr 0.000709 time 1.8365 (2.2207) loss 3.7394 (3.6536) grad_norm 1.1871 (1.2751) [2022-01-21 01:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][610/1251] eta 0:23:42 lr 0.000709 time 1.8079 (2.2184) loss 4.0624 (3.6577) grad_norm 1.2087 (1.2756) [2022-01-21 01:53:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][620/1251] eta 0:23:17 lr 0.000709 time 1.9649 (2.2154) loss 4.7847 (3.6605) grad_norm 1.3611 (1.2756) [2022-01-21 01:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][630/1251] eta 0:22:55 lr 0.000709 time 2.2770 (2.2151) loss 4.5072 (3.6610) grad_norm 1.4091 (1.2766) [2022-01-21 01:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][640/1251] eta 0:22:33 lr 0.000709 time 2.4797 (2.2153) loss 3.3609 (3.6621) grad_norm 1.2043 (1.2768) [2022-01-21 01:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][650/1251] eta 0:22:11 lr 0.000709 time 1.9124 (2.2160) loss 3.9748 (3.6640) grad_norm 1.3352 (1.2768) [2022-01-21 01:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][660/1251] eta 0:21:50 lr 0.000709 time 2.2520 (2.2175) loss 3.7430 (3.6619) grad_norm 1.2040 (1.2768) [2022-01-21 01:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][670/1251] eta 0:21:29 lr 0.000709 time 1.8823 (2.2197) loss 4.1913 (3.6683) grad_norm 1.2632 (1.2769) [2022-01-21 01:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][680/1251] eta 0:21:07 lr 0.000708 time 2.1893 (2.2200) loss 3.0133 (3.6663) grad_norm 1.3758 (1.2772) [2022-01-21 01:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][690/1251] eta 0:20:45 lr 0.000708 time 1.8867 (2.2202) loss 4.1440 (3.6698) grad_norm 1.3718 (1.2778) [2022-01-21 01:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][700/1251] eta 0:20:22 lr 0.000708 time 1.6946 (2.2193) loss 3.8375 (3.6704) grad_norm 1.4299 (1.2786) [2022-01-21 01:56:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][710/1251] eta 0:20:01 lr 0.000708 time 1.7861 (2.2204) loss 3.6423 (3.6705) grad_norm 1.1608 (1.2806) [2022-01-21 01:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][720/1251] eta 0:19:37 lr 
0.000708 time 1.9067 (2.2184) loss 3.8706 (3.6705) grad_norm 1.4900 (1.2815) [2022-01-21 01:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][730/1251] eta 0:19:14 lr 0.000708 time 1.8568 (2.2166) loss 3.6475 (3.6700) grad_norm 1.3096 (1.2821) [2022-01-21 01:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][740/1251] eta 0:18:52 lr 0.000708 time 2.1281 (2.2154) loss 4.2665 (3.6700) grad_norm 1.1353 (1.2840) [2022-01-21 01:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][750/1251] eta 0:18:29 lr 0.000708 time 1.8994 (2.2142) loss 4.3130 (3.6749) grad_norm 1.1692 (1.2851) [2022-01-21 01:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][760/1251] eta 0:18:07 lr 0.000708 time 2.2487 (2.2153) loss 3.8750 (3.6728) grad_norm 1.2705 (1.2848) [2022-01-21 01:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][770/1251] eta 0:17:45 lr 0.000708 time 2.6843 (2.2161) loss 3.5892 (3.6738) grad_norm 1.3239 (1.2851) [2022-01-21 01:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][780/1251] eta 0:17:23 lr 0.000708 time 1.7396 (2.2166) loss 4.0611 (3.6721) grad_norm 1.1211 (1.2854) [2022-01-21 01:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][790/1251] eta 0:17:01 lr 0.000708 time 2.1817 (2.2163) loss 2.4279 (3.6689) grad_norm 1.2732 (1.2855) [2022-01-21 01:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][800/1251] eta 0:16:39 lr 0.000708 time 2.0387 (2.2163) loss 4.6402 (3.6713) grad_norm 1.1753 (1.2846) [2022-01-21 02:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][810/1251] eta 0:16:18 lr 0.000708 time 2.9385 (2.2179) loss 3.0089 (3.6743) grad_norm 1.3769 (1.2853) [2022-01-21 02:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][820/1251] eta 0:15:54 lr 0.000708 time 1.7151 (2.2152) loss 4.1493 (3.6740) grad_norm 1.3432 (1.2849) [2022-01-21 02:00:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][830/1251] eta 0:15:31 lr 0.000708 time 1.9389 (2.2136) loss 4.4766 (3.6734) grad_norm 1.2782 (1.2849) [2022-01-21 02:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][840/1251] eta 0:15:09 lr 0.000708 time 2.1420 (2.2137) loss 3.9826 (3.6725) grad_norm 1.1524 (1.2838) [2022-01-21 02:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][850/1251] eta 0:14:48 lr 0.000708 time 3.1679 (2.2157) loss 3.3423 (3.6681) grad_norm 1.3005 (1.2828) [2022-01-21 02:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][860/1251] eta 0:14:25 lr 0.000708 time 2.0657 (2.2141) loss 4.2957 (3.6698) grad_norm 1.2316 (1.2830) [2022-01-21 02:02:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][870/1251] eta 0:14:03 lr 0.000708 time 1.5427 (2.2127) loss 2.7513 (3.6707) grad_norm 1.3989 (1.2833) [2022-01-21 02:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][880/1251] eta 0:13:40 lr 0.000708 time 2.2011 (2.2113) loss 4.4496 (3.6692) grad_norm 1.2801 (1.2847) [2022-01-21 02:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][890/1251] eta 0:13:18 lr 0.000708 time 2.5339 (2.2121) loss 4.0030 (3.6724) grad_norm 1.3397 (1.2841) [2022-01-21 02:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][900/1251] eta 0:12:56 lr 0.000708 time 1.9163 (2.2118) loss 3.2788 (3.6701) grad_norm 1.2794 (1.2858) [2022-01-21 02:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][910/1251] eta 0:12:34 lr 0.000708 time 1.8488 (2.2117) loss 3.8894 (3.6717) grad_norm 1.2537 (1.2858) [2022-01-21 02:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][920/1251] eta 0:12:11 lr 0.000708 time 1.9092 (2.2112) loss 3.3164 (3.6732) grad_norm 1.4431 (1.2853) [2022-01-21 02:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][930/1251] eta 0:11:49 lr 
0.000708 time 1.8752 (2.2098) loss 4.1026 (3.6765) grad_norm 1.2374 (1.2856) [2022-01-21 02:04:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][940/1251] eta 0:11:26 lr 0.000708 time 1.8365 (2.2080) loss 3.8451 (3.6777) grad_norm 1.2425 (1.2853) [2022-01-21 02:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][950/1251] eta 0:11:04 lr 0.000707 time 1.6160 (2.2072) loss 3.2631 (3.6731) grad_norm 1.1074 (1.2853) [2022-01-21 02:05:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][960/1251] eta 0:10:42 lr 0.000707 time 1.6143 (2.2064) loss 3.9594 (3.6711) grad_norm 1.3295 (1.2845) [2022-01-21 02:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][970/1251] eta 0:10:20 lr 0.000707 time 2.8322 (2.2077) loss 2.2044 (3.6708) grad_norm 1.2195 (1.2844) [2022-01-21 02:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][980/1251] eta 0:09:58 lr 0.000707 time 1.7815 (2.2079) loss 2.6258 (3.6713) grad_norm 1.2770 (1.2840) [2022-01-21 02:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][990/1251] eta 0:09:36 lr 0.000707 time 1.5248 (2.2086) loss 3.9323 (3.6736) grad_norm 1.1862 (1.2842) [2022-01-21 02:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1000/1251] eta 0:09:14 lr 0.000707 time 2.0662 (2.2087) loss 2.8519 (3.6723) grad_norm 1.2063 (1.2846) [2022-01-21 02:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1010/1251] eta 0:08:52 lr 0.000707 time 2.4999 (2.2078) loss 3.2408 (3.6692) grad_norm 1.3384 (1.2842) [2022-01-21 02:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1020/1251] eta 0:08:29 lr 0.000707 time 1.6272 (2.2054) loss 4.1827 (3.6684) grad_norm 1.2265 (1.2838) [2022-01-21 02:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1030/1251] eta 0:08:07 lr 0.000707 time 2.0295 (2.2057) loss 2.8929 (3.6652) grad_norm 1.3645 (1.2834) [2022-01-21 
02:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1040/1251] eta 0:07:45 lr 0.000707 time 1.8705 (2.2048) loss 3.6623 (3.6658) grad_norm 1.6664 (1.2834) [2022-01-21 02:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1050/1251] eta 0:07:23 lr 0.000707 time 2.3223 (2.2044) loss 3.2004 (3.6644) grad_norm 1.1976 (1.2825) [2022-01-21 02:09:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1060/1251] eta 0:07:01 lr 0.000707 time 2.1962 (2.2048) loss 3.4560 (3.6647) grad_norm 1.1110 (1.2824) [2022-01-21 02:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1070/1251] eta 0:06:39 lr 0.000707 time 2.1862 (2.2065) loss 3.2513 (3.6663) grad_norm 1.1893 (1.2826) [2022-01-21 02:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1080/1251] eta 0:06:17 lr 0.000707 time 1.7684 (2.2063) loss 3.2284 (3.6669) grad_norm 1.1472 (1.2835) [2022-01-21 02:10:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1090/1251] eta 0:05:55 lr 0.000707 time 2.5110 (2.2052) loss 4.0229 (3.6699) grad_norm 1.2654 (1.2834) [2022-01-21 02:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1100/1251] eta 0:05:32 lr 0.000707 time 1.8245 (2.2030) loss 3.7223 (3.6689) grad_norm 1.0881 (1.2839) [2022-01-21 02:10:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1110/1251] eta 0:05:10 lr 0.000707 time 1.9771 (2.2015) loss 4.1640 (3.6703) grad_norm 1.2756 (1.2848) [2022-01-21 02:11:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1120/1251] eta 0:04:48 lr 0.000707 time 2.2770 (2.2007) loss 4.0308 (3.6700) grad_norm 1.3561 (1.2848) [2022-01-21 02:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1130/1251] eta 0:04:26 lr 0.000707 time 2.9438 (2.2002) loss 4.1698 (3.6706) grad_norm 1.3142 (1.2859) [2022-01-21 02:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1140/1251] 
eta 0:04:04 lr 0.000707 time 2.1699 (2.1990) loss 3.5186 (3.6702) grad_norm 1.0479 (1.2851) [2022-01-21 02:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1150/1251] eta 0:03:42 lr 0.000707 time 2.0052 (2.1991) loss 4.2808 (3.6695) grad_norm 1.4757 (1.2850) [2022-01-21 02:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1160/1251] eta 0:03:20 lr 0.000707 time 2.9336 (2.2007) loss 4.0763 (3.6698) grad_norm 1.6043 (1.2856) [2022-01-21 02:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1170/1251] eta 0:02:58 lr 0.000707 time 1.8986 (2.2006) loss 3.0074 (3.6685) grad_norm 1.4965 (1.2857) [2022-01-21 02:13:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1180/1251] eta 0:02:36 lr 0.000707 time 2.1566 (2.2009) loss 2.7380 (3.6686) grad_norm 1.4512 (1.2856) [2022-01-21 02:13:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1190/1251] eta 0:02:14 lr 0.000707 time 1.6986 (2.2003) loss 3.7294 (3.6679) grad_norm 1.3487 (1.2861) [2022-01-21 02:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1200/1251] eta 0:01:52 lr 0.000707 time 2.1458 (2.2004) loss 3.0086 (3.6669) grad_norm 1.0999 (1.2855) [2022-01-21 02:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1210/1251] eta 0:01:30 lr 0.000706 time 2.1524 (2.2001) loss 3.4177 (3.6672) grad_norm 1.1079 (1.2849) [2022-01-21 02:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1220/1251] eta 0:01:08 lr 0.000706 time 2.5489 (2.2005) loss 3.1716 (3.6677) grad_norm 1.3457 (1.2855) [2022-01-21 02:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1230/1251] eta 0:00:46 lr 0.000706 time 2.2100 (2.2012) loss 3.9488 (3.6667) grad_norm 1.3955 (1.2856) [2022-01-21 02:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1240/1251] eta 0:00:24 lr 0.000706 time 1.7498 (2.2008) loss 4.2633 (3.6676) grad_norm 1.1513 
(1.2859) [2022-01-21 02:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [109/300][1250/1251] eta 0:00:02 lr 0.000706 time 1.1839 (2.1957) loss 3.0463 (3.6668) grad_norm 1.2145 (1.2855) [2022-01-21 02:15:55 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 109 training takes 0:45:47 [2022-01-21 02:16:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.304 (18.304) Loss 1.0103 (1.0103) Acc@1 74.609 (74.609) Acc@5 93.750 (93.750) [2022-01-21 02:16:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.274 (3.322) Loss 0.9434 (1.0681) Acc@1 78.906 (74.494) Acc@5 93.750 (92.658) [2022-01-21 02:16:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.139 (2.488) Loss 1.0972 (1.0890) Acc@1 73.633 (73.982) Acc@5 93.066 (92.304) [2022-01-21 02:17:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.602 (2.208) Loss 1.0223 (1.0903) Acc@1 76.074 (74.061) Acc@5 93.066 (92.273) [2022-01-21 02:17:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.970 (2.113) Loss 1.0353 (1.0838) Acc@1 75.586 (74.307) Acc@5 92.773 (92.369) [2022-01-21 02:17:28 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.252 Acc@5 92.412 [2022-01-21 02:17:28 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-01-21 02:17:28 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.25% [2022-01-21 02:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][0/1251] eta 7:27:23 lr 0.000706 time 21.4576 (21.4576) loss 4.0841 (4.0841) grad_norm 1.2325 (1.2325) [2022-01-21 02:18:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][10/1251] eta 1:23:15 lr 0.000706 time 1.4799 (4.0250) loss 2.7721 (3.7263) grad_norm 1.3950 (1.3028) [2022-01-21 02:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][20/1251] eta 1:02:19 lr 0.000706 time 1.9657 (3.0376) loss 
2.6521 (3.6985) grad_norm 1.3023 (1.3141) [2022-01-21 02:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][30/1251] eta 0:54:39 lr 0.000706 time 1.8758 (2.6858) loss 3.8519 (3.7290) grad_norm 1.2630 (1.3100) [2022-01-21 02:19:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][40/1251] eta 0:54:07 lr 0.000706 time 6.7793 (2.6814) loss 2.7613 (3.6790) grad_norm 1.2432 (1.3358) [2022-01-21 02:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][50/1251] eta 0:52:22 lr 0.000706 time 1.7749 (2.6162) loss 3.4673 (3.6824) grad_norm 1.2687 (1.3212) [2022-01-21 02:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][60/1251] eta 0:50:57 lr 0.000706 time 1.8604 (2.5669) loss 2.6767 (3.6971) grad_norm 1.2328 (1.3063) [2022-01-21 02:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][70/1251] eta 0:49:30 lr 0.000706 time 1.4464 (2.5156) loss 4.1419 (3.6914) grad_norm 1.2653 (1.3007) [2022-01-21 02:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][80/1251] eta 0:48:32 lr 0.000706 time 2.4863 (2.4870) loss 3.8262 (3.6890) grad_norm 1.4700 (1.3026) [2022-01-21 02:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][90/1251] eta 0:47:42 lr 0.000706 time 2.7074 (2.4656) loss 3.7832 (3.6690) grad_norm 1.4856 (1.3168) [2022-01-21 02:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][100/1251] eta 0:46:30 lr 0.000706 time 2.1645 (2.4244) loss 3.3425 (3.6686) grad_norm 1.2154 (1.3143) [2022-01-21 02:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][110/1251] eta 0:45:25 lr 0.000706 time 2.0086 (2.3890) loss 3.1485 (3.6675) grad_norm 1.1285 (1.3096) [2022-01-21 02:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][120/1251] eta 0:44:33 lr 0.000706 time 2.6426 (2.3642) loss 4.4288 (3.6800) grad_norm 1.3797 (1.3116) [2022-01-21 02:22:36 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [110/300][130/1251] eta 0:43:56 lr 0.000706 time 1.7596 (2.3515) loss 4.2595 (3.6951) grad_norm 1.4855 (1.3049) [2022-01-21 02:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][140/1251] eta 0:43:19 lr 0.000706 time 2.1298 (2.3395) loss 4.1127 (3.7063) grad_norm 1.4887 (1.2997) [2022-01-21 02:23:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][150/1251] eta 0:42:53 lr 0.000706 time 3.6507 (2.3375) loss 3.8465 (3.6850) grad_norm 1.1193 (1.3061) [2022-01-21 02:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][160/1251] eta 0:42:27 lr 0.000706 time 2.7669 (2.3348) loss 3.2198 (3.6836) grad_norm 1.1270 (1.3073) [2022-01-21 02:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][170/1251] eta 0:42:03 lr 0.000706 time 2.2021 (2.3347) loss 2.8082 (3.6785) grad_norm 1.1049 (1.3061) [2022-01-21 02:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][180/1251] eta 0:41:26 lr 0.000706 time 1.9416 (2.3215) loss 3.8938 (3.6868) grad_norm 1.2215 (1.3055) [2022-01-21 02:24:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][190/1251] eta 0:40:41 lr 0.000706 time 2.1549 (2.3015) loss 3.5993 (3.6780) grad_norm 1.3453 (1.3047) [2022-01-21 02:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][200/1251] eta 0:40:10 lr 0.000706 time 1.8904 (2.2937) loss 3.8886 (3.6847) grad_norm 1.2475 (1.3032) [2022-01-21 02:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][210/1251] eta 0:39:39 lr 0.000706 time 2.4848 (2.2862) loss 3.9319 (3.6892) grad_norm 1.3596 (1.2993) [2022-01-21 02:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][220/1251] eta 0:39:05 lr 0.000706 time 1.6409 (2.2751) loss 4.0117 (3.6915) grad_norm 1.3655 (1.2982) [2022-01-21 02:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][230/1251] eta 0:38:33 lr 0.000705 time 2.2803 (2.2662) loss 3.2655 
(3.6830) grad_norm 1.3719 (1.2966) [2022-01-21 02:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][240/1251] eta 0:38:06 lr 0.000705 time 2.0347 (2.2614) loss 3.2727 (3.6830) grad_norm 1.1801 (1.2998) [2022-01-21 02:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][250/1251] eta 0:37:46 lr 0.000705 time 2.3246 (2.2645) loss 2.7576 (3.6745) grad_norm 1.3694 (1.2994) [2022-01-21 02:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][260/1251] eta 0:37:30 lr 0.000705 time 1.9160 (2.2714) loss 2.5079 (3.6718) grad_norm 1.2645 (1.2986) [2022-01-21 02:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][270/1251] eta 0:37:12 lr 0.000705 time 2.4619 (2.2759) loss 3.6467 (3.6728) grad_norm 1.2101 (1.2995) [2022-01-21 02:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][280/1251] eta 0:36:49 lr 0.000705 time 2.2610 (2.2755) loss 3.9764 (3.6730) grad_norm 1.1776 (1.2964) [2022-01-21 02:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][290/1251] eta 0:36:20 lr 0.000705 time 2.4028 (2.2691) loss 4.1271 (3.6712) grad_norm 1.4214 (1.2950) [2022-01-21 02:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][300/1251] eta 0:35:49 lr 0.000705 time 1.9337 (2.2605) loss 2.6272 (3.6658) grad_norm 1.1241 (1.2963) [2022-01-21 02:29:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][310/1251] eta 0:35:19 lr 0.000705 time 2.3992 (2.2520) loss 4.0855 (3.6684) grad_norm 1.3912 (1.2956) [2022-01-21 02:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][320/1251] eta 0:34:55 lr 0.000705 time 2.8728 (2.2508) loss 3.5900 (3.6629) grad_norm 1.1459 (1.2946) [2022-01-21 02:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][330/1251] eta 0:34:34 lr 0.000705 time 2.4896 (2.2522) loss 3.4224 (3.6656) grad_norm 1.1920 (1.2944) [2022-01-21 02:30:15 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [110/300][340/1251] eta 0:34:08 lr 0.000705 time 1.8737 (2.2487) loss 4.4431 (3.6627) grad_norm 1.2832 (1.2930) [2022-01-21 02:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][350/1251] eta 0:33:49 lr 0.000705 time 3.7533 (2.2523) loss 3.7381 (3.6570) grad_norm 1.3574 (1.2932) [2022-01-21 02:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][360/1251] eta 0:33:28 lr 0.000705 time 2.7500 (2.2539) loss 3.9104 (3.6545) grad_norm 1.4561 (1.2921) [2022-01-21 02:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][370/1251] eta 0:32:59 lr 0.000705 time 1.6340 (2.2468) loss 3.6816 (3.6535) grad_norm 1.2575 (1.2924) [2022-01-21 02:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][380/1251] eta 0:32:32 lr 0.000705 time 2.2050 (2.2418) loss 3.4808 (3.6556) grad_norm 1.0977 (1.2932) [2022-01-21 02:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][390/1251] eta 0:32:10 lr 0.000705 time 3.0350 (2.2420) loss 4.6194 (3.6667) grad_norm 1.4679 (1.2929) [2022-01-21 02:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][400/1251] eta 0:31:48 lr 0.000705 time 3.3330 (2.2430) loss 3.2095 (3.6702) grad_norm 1.2107 (1.2952) [2022-01-21 02:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][410/1251] eta 0:31:27 lr 0.000705 time 1.7349 (2.2441) loss 2.7759 (3.6724) grad_norm 1.6525 (1.2957) [2022-01-21 02:33:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][420/1251] eta 0:31:04 lr 0.000705 time 2.8367 (2.2433) loss 2.9745 (3.6741) grad_norm 1.2285 (1.2996) [2022-01-21 02:33:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][430/1251] eta 0:30:38 lr 0.000705 time 2.5673 (2.2389) loss 4.0819 (3.6807) grad_norm 1.2228 (1.3001) [2022-01-21 02:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][440/1251] eta 0:30:10 lr 0.000705 time 1.9556 (2.2323) loss 2.8486 
(3.6753) grad_norm 1.2540 (1.3007) [2022-01-21 02:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][450/1251] eta 0:29:46 lr 0.000705 time 1.9410 (2.2302) loss 2.5269 (3.6700) grad_norm 1.1440 (1.2992) [2022-01-21 02:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][460/1251] eta 0:29:23 lr 0.000705 time 2.7289 (2.2300) loss 3.9333 (3.6732) grad_norm 1.2457 (1.2988) [2022-01-21 02:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][470/1251] eta 0:29:02 lr 0.000705 time 2.5919 (2.2309) loss 3.9383 (3.6728) grad_norm 1.5125 (1.2989) [2022-01-21 02:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][480/1251] eta 0:28:40 lr 0.000705 time 2.5047 (2.2321) loss 3.3991 (3.6747) grad_norm 1.1051 (1.2987) [2022-01-21 02:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][490/1251] eta 0:28:21 lr 0.000704 time 2.2835 (2.2353) loss 3.6603 (3.6740) grad_norm 1.2594 (1.2977) [2022-01-21 02:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][500/1251] eta 0:27:58 lr 0.000704 time 1.8335 (2.2353) loss 3.8457 (3.6765) grad_norm 1.1157 (1.2963) [2022-01-21 02:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][510/1251] eta 0:27:33 lr 0.000704 time 1.5472 (2.2314) loss 2.9347 (3.6783) grad_norm 1.1831 (1.2973) [2022-01-21 02:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][520/1251] eta 0:27:08 lr 0.000704 time 1.8230 (2.2277) loss 2.6819 (3.6792) grad_norm 1.4215 (1.2977) [2022-01-21 02:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][530/1251] eta 0:26:43 lr 0.000704 time 1.7009 (2.2245) loss 4.0723 (3.6825) grad_norm 1.4663 (1.2981) [2022-01-21 02:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][540/1251] eta 0:26:22 lr 0.000704 time 2.2796 (2.2252) loss 3.5016 (3.6850) grad_norm 1.3698 (1.3009) [2022-01-21 02:37:55 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [110/300][550/1251] eta 0:25:59 lr 0.000704 time 1.6654 (2.2254) loss 3.6068 (3.6868) grad_norm 1.2377 (1.2998) [2022-01-21 02:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][560/1251] eta 0:25:36 lr 0.000704 time 3.0306 (2.2240) loss 4.1787 (3.6833) grad_norm 1.2515 (1.3007) [2022-01-21 02:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][570/1251] eta 0:25:13 lr 0.000704 time 1.9236 (2.2229) loss 2.7054 (3.6800) grad_norm 1.5214 (1.3024) [2022-01-21 02:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][580/1251] eta 0:24:50 lr 0.000704 time 2.1784 (2.2217) loss 4.5366 (3.6802) grad_norm 1.3132 (1.3028) [2022-01-21 02:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][590/1251] eta 0:24:28 lr 0.000704 time 2.0760 (2.2223) loss 3.9694 (3.6774) grad_norm 1.4987 (1.3037) [2022-01-21 02:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][600/1251] eta 0:24:06 lr 0.000704 time 3.3543 (2.2223) loss 4.4118 (3.6796) grad_norm 1.3199 (1.3050) [2022-01-21 02:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][610/1251] eta 0:23:45 lr 0.000704 time 2.7067 (2.2233) loss 3.2229 (3.6793) grad_norm 1.3261 (1.3039) [2022-01-21 02:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][620/1251] eta 0:23:23 lr 0.000704 time 2.7788 (2.2244) loss 4.4210 (3.6799) grad_norm 1.2980 (1.3037) [2022-01-21 02:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][630/1251] eta 0:22:59 lr 0.000704 time 1.8559 (2.2220) loss 3.8434 (3.6829) grad_norm 1.1789 (1.3017) [2022-01-21 02:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][640/1251] eta 0:22:38 lr 0.000704 time 2.9518 (2.2241) loss 3.7243 (3.6834) grad_norm 1.4389 (1.3010) [2022-01-21 02:41:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][650/1251] eta 0:22:16 lr 0.000704 time 2.3528 (2.2233) loss 3.9901 
(3.6835) grad_norm 1.2380 (1.3003) [2022-01-21 02:41:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][660/1251] eta 0:21:53 lr 0.000704 time 1.9235 (2.2223) loss 3.4765 (3.6794) grad_norm 1.2295 (1.2993) [2022-01-21 02:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][670/1251] eta 0:21:29 lr 0.000704 time 1.8926 (2.2193) loss 3.9453 (3.6788) grad_norm 1.2697 (1.2983) [2022-01-21 02:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][680/1251] eta 0:21:05 lr 0.000704 time 2.3004 (2.2161) loss 3.2266 (3.6761) grad_norm 1.3103 (1.2980) [2022-01-21 02:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][690/1251] eta 0:20:41 lr 0.000704 time 2.1181 (2.2123) loss 4.1989 (3.6769) grad_norm 1.2248 (1.2975) [2022-01-21 02:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][700/1251] eta 0:20:17 lr 0.000704 time 2.6162 (2.2102) loss 4.0735 (3.6742) grad_norm 1.2467 (1.2967) [2022-01-21 02:43:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][710/1251] eta 0:19:54 lr 0.000704 time 2.2214 (2.2082) loss 4.1731 (3.6765) grad_norm 1.2175 (1.2956) [2022-01-21 02:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][720/1251] eta 0:19:34 lr 0.000704 time 3.0096 (2.2114) loss 3.7444 (3.6787) grad_norm 1.3556 (1.2947) [2022-01-21 02:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][730/1251] eta 0:19:12 lr 0.000704 time 1.7902 (2.2125) loss 4.2862 (3.6772) grad_norm 1.2301 (1.2948) [2022-01-21 02:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][740/1251] eta 0:18:50 lr 0.000704 time 2.6967 (2.2126) loss 2.7094 (3.6720) grad_norm 1.4171 (1.2945) [2022-01-21 02:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][750/1251] eta 0:18:29 lr 0.000703 time 2.0281 (2.2141) loss 4.0907 (3.6725) grad_norm 1.1637 (1.2961) [2022-01-21 02:45:34 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [110/300][760/1251] eta 0:18:07 lr 0.000703 time 2.0988 (2.2156) loss 3.1428 (3.6688) grad_norm 1.2805 (1.2971) [2022-01-21 02:45:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][770/1251] eta 0:17:46 lr 0.000703 time 1.9631 (2.2179) loss 3.8281 (3.6693) grad_norm 1.3858 (1.2972) [2022-01-21 02:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][780/1251] eta 0:17:24 lr 0.000703 time 2.5802 (2.2175) loss 3.0986 (3.6676) grad_norm 1.2126 (1.2969) [2022-01-21 02:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][790/1251] eta 0:17:00 lr 0.000703 time 1.5452 (2.2147) loss 4.2565 (3.6707) grad_norm 1.3262 (1.2972) [2022-01-21 02:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][800/1251] eta 0:16:37 lr 0.000703 time 1.8314 (2.2123) loss 3.1437 (3.6692) grad_norm 1.2807 (1.2970) [2022-01-21 02:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][810/1251] eta 0:16:15 lr 0.000703 time 2.3223 (2.2126) loss 2.6784 (3.6679) grad_norm 1.2446 (1.2967) [2022-01-21 02:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][820/1251] eta 0:15:53 lr 0.000703 time 1.9934 (2.2112) loss 4.3052 (3.6680) grad_norm 1.6020 (1.2982) [2022-01-21 02:48:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][830/1251] eta 0:15:30 lr 0.000703 time 1.7443 (2.2097) loss 4.0480 (3.6689) grad_norm 1.3538 (1.2982) [2022-01-21 02:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][840/1251] eta 0:15:07 lr 0.000703 time 1.9087 (2.2088) loss 4.0490 (3.6718) grad_norm 1.1086 (1.2970) [2022-01-21 02:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][850/1251] eta 0:14:46 lr 0.000703 time 2.3006 (2.2101) loss 3.9413 (3.6723) grad_norm 1.4403 (1.2972) [2022-01-21 02:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][860/1251] eta 0:14:24 lr 0.000703 time 1.8968 (2.2112) loss 2.7886 
(3.6729) grad_norm 1.0878 (1.2969) [2022-01-21 02:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][870/1251] eta 0:14:02 lr 0.000703 time 1.8830 (2.2117) loss 2.7655 (3.6725) grad_norm 1.2523 (1.2963) [2022-01-21 02:49:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][880/1251] eta 0:13:40 lr 0.000703 time 1.7340 (2.2108) loss 2.6654 (3.6715) grad_norm 1.4016 (1.2990) [2022-01-21 02:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][890/1251] eta 0:13:18 lr 0.000703 time 1.8552 (2.2123) loss 3.8649 (3.6693) grad_norm 1.8858 (1.3011) [2022-01-21 02:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][900/1251] eta 0:12:56 lr 0.000703 time 1.8526 (2.2115) loss 4.1273 (3.6684) grad_norm 1.2291 (1.3021) [2022-01-21 02:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][910/1251] eta 0:12:33 lr 0.000703 time 1.7522 (2.2109) loss 3.8608 (3.6672) grad_norm 1.2242 (1.3018) [2022-01-21 02:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][920/1251] eta 0:12:11 lr 0.000703 time 1.9679 (2.2101) loss 4.2389 (3.6682) grad_norm 1.0943 (1.3014) [2022-01-21 02:51:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][930/1251] eta 0:11:49 lr 0.000703 time 2.3629 (2.2099) loss 4.3410 (3.6691) grad_norm 1.2770 (1.3004) [2022-01-21 02:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][940/1251] eta 0:11:27 lr 0.000703 time 1.8670 (2.2096) loss 4.1216 (3.6667) grad_norm 1.3007 (1.3006) [2022-01-21 02:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][950/1251] eta 0:11:04 lr 0.000703 time 1.9210 (2.2091) loss 3.1165 (3.6670) grad_norm 1.1758 (1.3007) [2022-01-21 02:52:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][960/1251] eta 0:10:42 lr 0.000703 time 2.0866 (2.2074) loss 4.3717 (3.6660) grad_norm 1.2543 (1.3009) [2022-01-21 02:53:11 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [110/300][970/1251] eta 0:10:20 lr 0.000703 time 2.3955 (2.2069) loss 4.0255 (3.6698) grad_norm 1.7742 (1.3009) [2022-01-21 02:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][980/1251] eta 0:09:58 lr 0.000703 time 2.0349 (2.2075) loss 3.6046 (3.6698) grad_norm 1.3411 (1.3007) [2022-01-21 02:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][990/1251] eta 0:09:36 lr 0.000703 time 1.7973 (2.2072) loss 2.6415 (3.6668) grad_norm 1.1890 (1.3002) [2022-01-21 02:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1000/1251] eta 0:09:14 lr 0.000703 time 1.8769 (2.2079) loss 4.1371 (3.6635) grad_norm 1.2573 (1.3004) [2022-01-21 02:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1010/1251] eta 0:08:52 lr 0.000703 time 3.2300 (2.2079) loss 3.9783 (3.6621) grad_norm 1.2288 (1.3008) [2022-01-21 02:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1020/1251] eta 0:08:29 lr 0.000702 time 1.9217 (2.2061) loss 2.8554 (3.6601) grad_norm 1.1766 (1.3006) [2022-01-21 02:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1030/1251] eta 0:08:07 lr 0.000702 time 1.9589 (2.2048) loss 3.2508 (3.6601) grad_norm 1.3508 (1.3002) [2022-01-21 02:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1040/1251] eta 0:07:45 lr 0.000702 time 2.7699 (2.2051) loss 3.3446 (3.6593) grad_norm 1.7005 (1.3006) [2022-01-21 02:56:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1050/1251] eta 0:07:23 lr 0.000702 time 2.3506 (2.2068) loss 4.5135 (3.6593) grad_norm 1.2062 (1.3005) [2022-01-21 02:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1060/1251] eta 0:07:01 lr 0.000702 time 2.1610 (2.2065) loss 4.3749 (3.6591) grad_norm 1.4090 (1.3006) [2022-01-21 02:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1070/1251] eta 0:06:39 lr 0.000702 time 2.0948 (2.2049) loss 
3.9311 (3.6603) grad_norm 1.3662 (1.3000) [2022-01-21 02:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1080/1251] eta 0:06:16 lr 0.000702 time 3.1215 (2.2045) loss 3.5324 (3.6588) grad_norm 1.4298 (1.2998) [2022-01-21 02:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1090/1251] eta 0:05:54 lr 0.000702 time 2.4169 (2.2031) loss 3.9979 (3.6610) grad_norm 1.3347 (1.3000) [2022-01-21 02:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1100/1251] eta 0:05:32 lr 0.000702 time 1.9159 (2.2013) loss 4.2602 (3.6582) grad_norm 1.2615 (1.2996) [2022-01-21 02:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1110/1251] eta 0:05:10 lr 0.000702 time 1.5979 (2.2002) loss 3.7886 (3.6583) grad_norm 1.2030 (1.2992) [2022-01-21 02:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1120/1251] eta 0:04:48 lr 0.000702 time 1.8188 (2.2008) loss 4.0573 (3.6593) grad_norm 1.3769 (1.2991) [2022-01-21 02:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1130/1251] eta 0:04:26 lr 0.000702 time 2.2969 (2.2027) loss 4.0549 (3.6601) grad_norm 1.1168 (1.2992) [2022-01-21 02:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1140/1251] eta 0:04:04 lr 0.000702 time 2.2151 (2.2039) loss 3.9920 (3.6617) grad_norm 1.2404 (1.2984) [2022-01-21 02:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1150/1251] eta 0:03:42 lr 0.000702 time 1.5384 (2.2038) loss 3.0815 (3.6604) grad_norm 1.3494 (1.2982) [2022-01-21 03:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1160/1251] eta 0:03:20 lr 0.000702 time 2.1918 (2.2033) loss 3.8979 (3.6591) grad_norm 1.2119 (1.2984) [2022-01-21 03:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1170/1251] eta 0:02:58 lr 0.000702 time 2.0356 (2.2033) loss 2.9104 (3.6574) grad_norm 1.1410 (1.2979) [2022-01-21 03:00:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1180/1251] eta 0:02:36 lr 0.000702 time 1.9620 (2.2029) loss 3.7655 (3.6584) grad_norm 1.4078 (1.2975) [2022-01-21 03:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1190/1251] eta 0:02:14 lr 0.000702 time 2.9614 (2.2031) loss 3.9416 (3.6581) grad_norm 1.2234 (1.2972) [2022-01-21 03:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1200/1251] eta 0:01:52 lr 0.000702 time 1.7934 (2.2022) loss 3.8265 (3.6574) grad_norm 1.5591 (1.2978) [2022-01-21 03:01:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1210/1251] eta 0:01:30 lr 0.000702 time 1.9706 (2.2007) loss 4.1157 (3.6601) grad_norm 1.5731 (1.2983) [2022-01-21 03:02:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1220/1251] eta 0:01:08 lr 0.000702 time 1.5538 (2.1997) loss 2.5767 (3.6583) grad_norm 1.3178 (1.2984) [2022-01-21 03:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1230/1251] eta 0:00:46 lr 0.000702 time 3.4633 (2.2006) loss 3.4540 (3.6602) grad_norm 1.2644 (1.2978) [2022-01-21 03:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1240/1251] eta 0:00:24 lr 0.000702 time 1.4631 (2.2004) loss 4.5184 (3.6601) grad_norm 1.1947 (1.2976) [2022-01-21 03:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [110/300][1250/1251] eta 0:00:02 lr 0.000702 time 1.1879 (2.1948) loss 3.8386 (3.6592) grad_norm 1.0667 (1.2974) [2022-01-21 03:03:15 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 110 training takes 0:45:46 [2022-01-21 03:03:15 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_110 saving...... [2022-01-21 03:03:26 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_110 saved !!! 
[2022-01-21 03:03:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.360 (16.360) Loss 1.0318 (1.0318) Acc@1 76.270 (76.270) Acc@5 93.262 (93.262) [2022-01-21 03:03:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.564 (2.764) Loss 1.1205 (1.0771) Acc@1 73.438 (74.627) Acc@5 91.992 (92.543) [2022-01-21 03:04:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.645 (2.318) Loss 1.1025 (1.0888) Acc@1 73.340 (74.186) Acc@5 92.578 (92.401) [2022-01-21 03:04:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.034 (2.104) Loss 1.0342 (1.0946) Acc@1 75.488 (74.197) Acc@5 92.773 (92.203) [2022-01-21 03:04:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.815 (2.018) Loss 1.0836 (1.0922) Acc@1 73.242 (74.162) Acc@5 92.188 (92.342) [2022-01-21 03:04:55 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.214 Acc@5 92.336 [2022-01-21 03:04:55 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-01-21 03:04:55 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.25% [2022-01-21 03:05:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][0/1251] eta 7:46:06 lr 0.000702 time 22.3554 (22.3554) loss 3.4609 (3.4609) grad_norm 1.3444 (1.3444) [2022-01-21 03:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][10/1251] eta 1:26:49 lr 0.000702 time 2.5832 (4.1976) loss 4.2617 (3.7078) grad_norm 1.1330 (1.2514) [2022-01-21 03:06:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][20/1251] eta 1:06:05 lr 0.000702 time 2.0466 (3.2216) loss 3.9192 (3.6238) grad_norm 1.2708 (1.2723) [2022-01-21 03:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][30/1251] eta 0:58:28 lr 0.000701 time 1.5840 (2.8731) loss 2.9162 (3.5905) grad_norm 1.4461 (1.2999) [2022-01-21 03:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[111/300][40/1251] eta 0:54:56 lr 0.000701 time 3.5650 (2.7222) loss 4.4391 (3.5629) grad_norm 1.2623 (1.3020) [2022-01-21 03:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][50/1251] eta 0:52:56 lr 0.000701 time 2.0073 (2.6452) loss 4.5177 (3.5971) grad_norm 1.2930 (1.3025) [2022-01-21 03:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][60/1251] eta 0:51:10 lr 0.000701 time 1.5979 (2.5780) loss 4.1846 (3.6465) grad_norm 1.2285 (1.3014) [2022-01-21 03:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][70/1251] eta 0:49:57 lr 0.000701 time 1.9549 (2.5384) loss 3.8820 (3.6574) grad_norm 1.2490 (1.3194) [2022-01-21 03:08:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][80/1251] eta 0:49:20 lr 0.000701 time 3.3232 (2.5284) loss 4.0480 (3.6579) grad_norm 1.2189 (1.3124) [2022-01-21 03:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][90/1251] eta 0:47:59 lr 0.000701 time 1.8933 (2.4801) loss 3.2619 (3.6525) grad_norm 1.3346 (1.3066) [2022-01-21 03:09:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][100/1251] eta 0:46:47 lr 0.000701 time 1.8803 (2.4388) loss 3.5190 (3.6635) grad_norm 1.1914 (1.3043) [2022-01-21 03:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][110/1251] eta 0:45:36 lr 0.000701 time 1.7322 (2.3986) loss 3.8534 (3.6559) grad_norm 1.2381 (1.2998) [2022-01-21 03:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][120/1251] eta 0:44:41 lr 0.000701 time 2.5134 (2.3705) loss 3.4685 (3.6638) grad_norm 1.3564 (1.3048) [2022-01-21 03:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][130/1251] eta 0:43:52 lr 0.000701 time 1.6406 (2.3481) loss 3.3502 (3.6390) grad_norm 1.6221 (1.3045) [2022-01-21 03:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][140/1251] eta 0:43:19 lr 0.000701 time 2.1953 (2.3400) loss 4.0329 (3.6312) grad_norm 1.4327 
(1.3014) [2022-01-21 03:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][150/1251] eta 0:42:54 lr 0.000701 time 2.8434 (2.3381) loss 3.5504 (3.6279) grad_norm 1.2185 (1.2995) [2022-01-21 03:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][160/1251] eta 0:42:44 lr 0.000701 time 2.9991 (2.3505) loss 4.0080 (3.6302) grad_norm 1.2245 (1.2973) [2022-01-21 03:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][170/1251] eta 0:42:18 lr 0.000701 time 1.9080 (2.3480) loss 4.5364 (3.6367) grad_norm 1.5364 (1.2987) [2022-01-21 03:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][180/1251] eta 0:41:45 lr 0.000701 time 1.9393 (2.3394) loss 4.1434 (3.6472) grad_norm 1.5296 (1.3019) [2022-01-21 03:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][190/1251] eta 0:41:01 lr 0.000701 time 1.9638 (2.3203) loss 2.9964 (3.6450) grad_norm 1.1691 (1.2992) [2022-01-21 03:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][200/1251] eta 0:40:28 lr 0.000701 time 2.3113 (2.3103) loss 3.8831 (3.6493) grad_norm 1.3788 (1.2971) [2022-01-21 03:13:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][210/1251] eta 0:39:57 lr 0.000701 time 1.7072 (2.3029) loss 4.0668 (3.6566) grad_norm 1.2498 (1.2954) [2022-01-21 03:13:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][220/1251] eta 0:39:27 lr 0.000701 time 1.6783 (2.2961) loss 2.6782 (3.6590) grad_norm 1.2716 (1.2963) [2022-01-21 03:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][230/1251] eta 0:39:00 lr 0.000701 time 1.7054 (2.2920) loss 2.7739 (3.6550) grad_norm 1.4922 (1.2959) [2022-01-21 03:14:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][240/1251] eta 0:38:37 lr 0.000701 time 2.5060 (2.2926) loss 4.5306 (3.6587) grad_norm 1.3187 (1.2978) [2022-01-21 03:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[111/300][250/1251] eta 0:38:13 lr 0.000701 time 1.7602 (2.2910) loss 4.3133 (3.6699) grad_norm 1.3202 (1.2967) [2022-01-21 03:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][260/1251] eta 0:37:51 lr 0.000701 time 1.7844 (2.2916) loss 3.4461 (3.6810) grad_norm 1.3454 (1.2972) [2022-01-21 03:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][270/1251] eta 0:37:24 lr 0.000701 time 1.8615 (2.2881) loss 3.0980 (3.6729) grad_norm 1.1193 (1.2964) [2022-01-21 03:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][280/1251] eta 0:36:53 lr 0.000701 time 1.7950 (2.2798) loss 4.2165 (3.6655) grad_norm 1.2660 (1.2963) [2022-01-21 03:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][290/1251] eta 0:36:25 lr 0.000700 time 1.8812 (2.2740) loss 3.5511 (3.6666) grad_norm 1.1393 (1.2953) [2022-01-21 03:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][300/1251] eta 0:36:02 lr 0.000700 time 1.8957 (2.2736) loss 2.5928 (3.6592) grad_norm 1.4570 (1.2980) [2022-01-21 03:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][310/1251] eta 0:35:33 lr 0.000700 time 2.2151 (2.2673) loss 4.0210 (3.6599) grad_norm 1.1936 (1.2984) [2022-01-21 03:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][320/1251] eta 0:35:06 lr 0.000700 time 2.3567 (2.2629) loss 4.1700 (3.6608) grad_norm 1.1406 (1.2969) [2022-01-21 03:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][330/1251] eta 0:34:42 lr 0.000700 time 1.9152 (2.2607) loss 3.9304 (3.6646) grad_norm 1.1290 (1.2959) [2022-01-21 03:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][340/1251] eta 0:34:19 lr 0.000700 time 1.9400 (2.2609) loss 3.6470 (3.6538) grad_norm 1.1888 (1.2949) [2022-01-21 03:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][350/1251] eta 0:33:55 lr 0.000700 time 2.2552 (2.2590) loss 4.1805 (3.6489) grad_norm 
1.4802 (1.2921) [2022-01-21 03:18:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][360/1251] eta 0:33:29 lr 0.000700 time 2.2463 (2.2552) loss 4.0510 (3.6417) grad_norm 1.3748 (1.2930) [2022-01-21 03:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][370/1251] eta 0:33:02 lr 0.000700 time 1.9499 (2.2508) loss 4.0723 (3.6409) grad_norm 1.3236 (1.2921) [2022-01-21 03:19:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][380/1251] eta 0:32:40 lr 0.000700 time 1.7846 (2.2511) loss 3.5651 (3.6400) grad_norm 1.2107 (1.2914) [2022-01-21 03:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][390/1251] eta 0:32:16 lr 0.000700 time 1.7537 (2.2491) loss 4.2246 (3.6486) grad_norm 1.3286 (1.2914) [2022-01-21 03:19:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][400/1251] eta 0:31:57 lr 0.000700 time 2.9756 (2.2527) loss 2.8857 (3.6492) grad_norm 1.3874 (1.2932) [2022-01-21 03:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][410/1251] eta 0:31:34 lr 0.000700 time 1.9069 (2.2523) loss 4.5439 (3.6618) grad_norm 1.2170 (1.2913) [2022-01-21 03:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][420/1251] eta 0:31:11 lr 0.000700 time 2.8623 (2.2524) loss 3.6622 (3.6667) grad_norm 1.3964 (1.2918) [2022-01-21 03:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][430/1251] eta 0:30:44 lr 0.000700 time 1.9429 (2.2463) loss 3.7571 (3.6612) grad_norm 1.1750 (1.2929) [2022-01-21 03:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][440/1251] eta 0:30:17 lr 0.000700 time 1.6590 (2.2414) loss 4.1506 (3.6654) grad_norm 1.0777 (1.2949) [2022-01-21 03:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][450/1251] eta 0:29:53 lr 0.000700 time 1.9309 (2.2390) loss 2.6652 (3.6637) grad_norm 1.3114 (1.2936) [2022-01-21 03:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[111/300][460/1251] eta 0:29:33 lr 0.000700 time 3.7885 (2.2419) loss 3.9968 (3.6650) grad_norm 1.3636 (1.2926) [2022-01-21 03:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][470/1251] eta 0:29:10 lr 0.000700 time 2.8203 (2.2419) loss 3.8771 (3.6653) grad_norm 1.5977 (1.2931) [2022-01-21 03:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][480/1251] eta 0:28:48 lr 0.000700 time 2.7001 (2.2418) loss 4.3867 (3.6649) grad_norm 1.1819 (1.2923) [2022-01-21 03:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][490/1251] eta 0:28:23 lr 0.000700 time 1.8329 (2.2383) loss 3.1129 (3.6676) grad_norm 1.4685 (1.2920) [2022-01-21 03:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][500/1251] eta 0:27:58 lr 0.000700 time 1.6078 (2.2353) loss 3.7142 (3.6666) grad_norm 1.3319 (1.2936) [2022-01-21 03:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][510/1251] eta 0:27:34 lr 0.000700 time 2.3093 (2.2329) loss 3.4146 (3.6656) grad_norm 1.1603 (1.2938) [2022-01-21 03:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][520/1251] eta 0:27:11 lr 0.000700 time 2.3834 (2.2324) loss 3.9256 (3.6684) grad_norm 1.2459 (1.2947) [2022-01-21 03:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][530/1251] eta 0:26:49 lr 0.000700 time 2.1226 (2.2321) loss 3.9779 (3.6662) grad_norm 1.2299 (1.2944) [2022-01-21 03:25:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][540/1251] eta 0:26:26 lr 0.000700 time 2.3614 (2.2315) loss 3.5427 (3.6687) grad_norm 1.2470 (1.2966) [2022-01-21 03:25:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][550/1251] eta 0:26:03 lr 0.000699 time 2.5359 (2.2297) loss 3.7286 (3.6686) grad_norm 1.4451 (1.2971) [2022-01-21 03:25:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][560/1251] eta 0:25:40 lr 0.000699 time 1.9426 (2.2287) loss 3.7177 (3.6692) grad_norm 
1.4030 (1.2988) [2022-01-21 03:26:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][570/1251] eta 0:25:18 lr 0.000699 time 2.8209 (2.2301) loss 3.8329 (3.6709) grad_norm 1.6126 (1.3026) [2022-01-21 03:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][580/1251] eta 0:24:56 lr 0.000699 time 2.2786 (2.2297) loss 4.0089 (3.6733) grad_norm 1.2300 (1.3020) [2022-01-21 03:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][590/1251] eta 0:24:32 lr 0.000699 time 2.4790 (2.2283) loss 4.0010 (3.6741) grad_norm 1.7223 (1.3021) [2022-01-21 03:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][600/1251] eta 0:24:07 lr 0.000699 time 2.0879 (2.2240) loss 4.0347 (3.6752) grad_norm 1.1492 (1.3043) [2022-01-21 03:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][610/1251] eta 0:23:44 lr 0.000699 time 2.1762 (2.2216) loss 4.3360 (3.6767) grad_norm 1.3722 (1.3055) [2022-01-21 03:27:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][620/1251] eta 0:23:22 lr 0.000699 time 2.8690 (2.2224) loss 3.8388 (3.6758) grad_norm 1.1833 (1.3035) [2022-01-21 03:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][630/1251] eta 0:23:00 lr 0.000699 time 1.8814 (2.2225) loss 3.5039 (3.6708) grad_norm 1.4265 (1.3030) [2022-01-21 03:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][640/1251] eta 0:22:38 lr 0.000699 time 1.7939 (2.2227) loss 2.9702 (3.6755) grad_norm 1.4697 (1.3022) [2022-01-21 03:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][650/1251] eta 0:22:16 lr 0.000699 time 1.9944 (2.2237) loss 3.9713 (3.6744) grad_norm 1.5352 (1.3033) [2022-01-21 03:29:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][660/1251] eta 0:21:54 lr 0.000699 time 2.2873 (2.2238) loss 4.0063 (3.6731) grad_norm 1.1755 (1.3035) [2022-01-21 03:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[111/300][670/1251] eta 0:21:31 lr 0.000699 time 1.8084 (2.2226) loss 4.4730 (3.6779) grad_norm 1.2868 (1.3032) [2022-01-21 03:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][680/1251] eta 0:21:08 lr 0.000699 time 1.5454 (2.2207) loss 3.6132 (3.6807) grad_norm 1.1736 (1.3029) [2022-01-21 03:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][690/1251] eta 0:20:45 lr 0.000699 time 2.2044 (2.2203) loss 3.8375 (3.6794) grad_norm 1.1409 (1.3028) [2022-01-21 03:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][700/1251] eta 0:20:21 lr 0.000699 time 2.2903 (2.2177) loss 4.0435 (3.6802) grad_norm 1.1698 (1.3018) [2022-01-21 03:31:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][710/1251] eta 0:19:59 lr 0.000699 time 1.9053 (2.2166) loss 3.0050 (3.6786) grad_norm 1.3391 (1.3018) [2022-01-21 03:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][720/1251] eta 0:19:35 lr 0.000699 time 1.5517 (2.2140) loss 3.1665 (3.6779) grad_norm 1.2721 (1.3020) [2022-01-21 03:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][730/1251] eta 0:19:12 lr 0.000699 time 2.2228 (2.2120) loss 3.9393 (3.6744) grad_norm 1.2380 (1.3027) [2022-01-21 03:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][740/1251] eta 0:18:49 lr 0.000699 time 2.1811 (2.2110) loss 3.7414 (3.6726) grad_norm 1.2270 (1.3028) [2022-01-21 03:32:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][750/1251] eta 0:18:28 lr 0.000699 time 3.1294 (2.2129) loss 4.3800 (3.6759) grad_norm 1.5301 (1.3057) [2022-01-21 03:32:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][760/1251] eta 0:18:06 lr 0.000699 time 1.3009 (2.2129) loss 4.1040 (3.6787) grad_norm 1.1827 (1.3065) [2022-01-21 03:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][770/1251] eta 0:17:44 lr 0.000699 time 2.5047 (2.2141) loss 4.4238 (3.6788) grad_norm 
1.2733 (1.3069) [2022-01-21 03:33:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][780/1251] eta 0:17:23 lr 0.000699 time 2.0603 (2.2152) loss 3.1881 (3.6792) grad_norm 1.6942 (1.3082) [2022-01-21 03:34:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][790/1251] eta 0:17:01 lr 0.000699 time 2.5905 (2.2151) loss 4.2896 (3.6776) grad_norm 1.4385 (1.3080) [2022-01-21 03:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][800/1251] eta 0:16:38 lr 0.000699 time 1.8435 (2.2134) loss 3.1130 (3.6761) grad_norm 1.2815 (1.3082) [2022-01-21 03:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][810/1251] eta 0:16:16 lr 0.000699 time 2.4937 (2.2133) loss 3.9359 (3.6768) grad_norm 1.0666 (1.3092) [2022-01-21 03:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][820/1251] eta 0:15:53 lr 0.000698 time 1.8108 (2.2112) loss 2.6998 (3.6759) grad_norm 1.2106 (1.3090) [2022-01-21 03:35:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][830/1251] eta 0:15:30 lr 0.000698 time 1.8876 (2.2103) loss 4.4955 (3.6798) grad_norm 1.3102 (1.3093) [2022-01-21 03:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][840/1251] eta 0:15:08 lr 0.000698 time 1.8616 (2.2097) loss 4.4094 (3.6789) grad_norm 1.3704 (1.3091) [2022-01-21 03:36:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][850/1251] eta 0:14:45 lr 0.000698 time 1.9993 (2.2088) loss 3.6805 (3.6788) grad_norm 1.0985 (1.3086) [2022-01-21 03:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][860/1251] eta 0:14:23 lr 0.000698 time 1.9531 (2.2083) loss 4.0957 (3.6771) grad_norm 1.1846 (1.3081) [2022-01-21 03:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][870/1251] eta 0:14:01 lr 0.000698 time 2.4760 (2.2075) loss 4.3321 (3.6771) grad_norm 1.3067 (1.3085) [2022-01-21 03:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[111/300][880/1251] eta 0:13:38 lr 0.000698 time 2.3757 (2.2064) loss 3.6668 (3.6762) grad_norm 1.4009 (1.3087) [2022-01-21 03:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][890/1251] eta 0:13:16 lr 0.000698 time 1.5724 (2.2069) loss 3.2289 (3.6757) grad_norm 1.2483 (1.3090) [2022-01-21 03:38:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][900/1251] eta 0:12:55 lr 0.000698 time 2.5915 (2.2080) loss 4.2804 (3.6742) grad_norm 1.3819 (1.3088) [2022-01-21 03:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][910/1251] eta 0:12:32 lr 0.000698 time 2.2053 (2.2072) loss 4.2707 (3.6762) grad_norm 1.3713 (1.3090) [2022-01-21 03:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][920/1251] eta 0:12:10 lr 0.000698 time 1.9182 (2.2067) loss 4.4887 (3.6797) grad_norm 1.5354 (1.3098) [2022-01-21 03:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][930/1251] eta 0:11:48 lr 0.000698 time 2.2055 (2.2068) loss 3.7447 (3.6788) grad_norm 1.4233 (1.3105) [2022-01-21 03:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][940/1251] eta 0:11:26 lr 0.000698 time 2.2545 (2.2067) loss 3.9177 (3.6792) grad_norm 1.5530 (1.3118) [2022-01-21 03:39:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][950/1251] eta 0:11:03 lr 0.000698 time 1.9615 (2.2049) loss 2.7979 (3.6789) grad_norm 1.2868 (1.3125) [2022-01-21 03:40:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][960/1251] eta 0:10:41 lr 0.000698 time 2.3845 (2.2038) loss 4.4159 (3.6792) grad_norm 1.2827 (1.3130) [2022-01-21 03:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][970/1251] eta 0:10:19 lr 0.000698 time 2.6449 (2.2041) loss 4.2480 (3.6815) grad_norm 1.1785 (1.3128) [2022-01-21 03:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][980/1251] eta 0:09:57 lr 0.000698 time 2.1796 (2.2048) loss 4.3431 (3.6845) grad_norm 
1.3611 (1.3122) [2022-01-21 03:41:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][990/1251] eta 0:09:35 lr 0.000698 time 1.5712 (2.2053) loss 3.0898 (3.6833) grad_norm 1.2358 (1.3115) [2022-01-21 03:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1000/1251] eta 0:09:13 lr 0.000698 time 1.9693 (2.2051) loss 3.3459 (3.6855) grad_norm 1.3582 (1.3118) [2022-01-21 03:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1010/1251] eta 0:08:51 lr 0.000698 time 2.4494 (2.2054) loss 4.2347 (3.6864) grad_norm 1.1843 (1.3114) [2022-01-21 03:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1020/1251] eta 0:08:29 lr 0.000698 time 2.2653 (2.2044) loss 3.8394 (3.6886) grad_norm 1.3493 (1.3123) [2022-01-21 03:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1030/1251] eta 0:08:06 lr 0.000698 time 2.3434 (2.2030) loss 4.6375 (3.6883) grad_norm 1.3373 (1.3120) [2022-01-21 03:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1040/1251] eta 0:07:44 lr 0.000698 time 1.6666 (2.2020) loss 4.0672 (3.6906) grad_norm 1.1638 (1.3111) [2022-01-21 03:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1050/1251] eta 0:07:22 lr 0.000698 time 2.4976 (2.2029) loss 2.9809 (3.6915) grad_norm 1.3369 (1.3110) [2022-01-21 03:43:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1060/1251] eta 0:07:01 lr 0.000698 time 3.1035 (2.2050) loss 4.0320 (3.6940) grad_norm 1.2868 (1.3107) [2022-01-21 03:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1070/1251] eta 0:06:39 lr 0.000698 time 2.9867 (2.2060) loss 3.8238 (3.6895) grad_norm 1.2846 (1.3100) [2022-01-21 03:44:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1080/1251] eta 0:06:16 lr 0.000697 time 1.6090 (2.2043) loss 2.9295 (3.6909) grad_norm 1.4483 (1.3097) [2022-01-21 03:44:59 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [111/300][1090/1251] eta 0:05:54 lr 0.000697 time 1.8559 (2.2033) loss 3.6635 (3.6915) grad_norm 1.2805 (1.3096) [2022-01-21 03:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1100/1251] eta 0:05:32 lr 0.000697 time 1.8887 (2.2011) loss 2.6959 (3.6883) grad_norm 1.4404 (1.3098) [2022-01-21 03:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1110/1251] eta 0:05:10 lr 0.000697 time 1.8080 (2.2001) loss 2.6773 (3.6846) grad_norm 1.3040 (1.3106) [2022-01-21 03:46:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1120/1251] eta 0:04:48 lr 0.000697 time 1.9298 (2.2002) loss 3.8967 (3.6842) grad_norm 1.1593 (1.3104) [2022-01-21 03:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1130/1251] eta 0:04:26 lr 0.000697 time 2.1469 (2.1997) loss 3.7077 (3.6860) grad_norm 1.3143 (1.3101) [2022-01-21 03:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1140/1251] eta 0:04:04 lr 0.000697 time 2.4490 (2.2013) loss 3.5081 (3.6843) grad_norm 1.1134 (1.3094) [2022-01-21 03:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1150/1251] eta 0:03:42 lr 0.000697 time 1.9244 (2.2015) loss 3.9369 (3.6834) grad_norm 1.1987 (1.3098) [2022-01-21 03:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1160/1251] eta 0:03:20 lr 0.000697 time 2.2925 (2.2016) loss 2.4680 (3.6839) grad_norm 1.4068 (1.3104) [2022-01-21 03:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1170/1251] eta 0:02:58 lr 0.000697 time 1.5471 (2.2006) loss 3.5329 (3.6842) grad_norm 1.2054 (1.3096) [2022-01-21 03:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1180/1251] eta 0:02:36 lr 0.000697 time 1.8924 (2.2011) loss 4.0037 (3.6861) grad_norm 1.2366 (1.3093) [2022-01-21 03:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1190/1251] eta 0:02:14 lr 0.000697 time 2.2356 (2.1996) loss 2.3388 
(3.6864) grad_norm 1.4068 (1.3099) [2022-01-21 03:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1200/1251] eta 0:01:52 lr 0.000697 time 1.9938 (2.1984) loss 4.5706 (3.6897) grad_norm 1.1959 (1.3098) [2022-01-21 03:49:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1210/1251] eta 0:01:30 lr 0.000697 time 2.2414 (2.1983) loss 3.1701 (3.6894) grad_norm 1.2699 (1.3096) [2022-01-21 03:49:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1220/1251] eta 0:01:08 lr 0.000697 time 2.5186 (2.1997) loss 4.0878 (3.6902) grad_norm 1.1483 (1.3086) [2022-01-21 03:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1230/1251] eta 0:00:46 lr 0.000697 time 2.5183 (2.2002) loss 3.5610 (3.6909) grad_norm 1.2845 (1.3088) [2022-01-21 03:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1240/1251] eta 0:00:24 lr 0.000697 time 1.6346 (2.1995) loss 2.6658 (3.6884) grad_norm 1.2315 (1.3080) [2022-01-21 03:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [111/300][1250/1251] eta 0:00:02 lr 0.000697 time 1.3031 (2.1945) loss 2.9591 (3.6873) grad_norm 1.2522 (1.3076) [2022-01-21 03:50:41 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 111 training takes 0:45:45 [2022-01-21 03:50:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.026 (18.026) Loss 1.1370 (1.1370) Acc@1 72.656 (72.656) Acc@5 92.383 (92.383) [2022-01-21 03:51:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.941 (3.499) Loss 1.0608 (1.0767) Acc@1 75.781 (74.956) Acc@5 93.066 (92.862) [2022-01-21 03:51:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.301 (2.528) Loss 1.1185 (1.1051) Acc@1 73.242 (74.233) Acc@5 93.066 (92.527) [2022-01-21 03:51:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.635 (2.202) Loss 1.1044 (1.1024) Acc@1 72.168 (74.260) Acc@5 92.480 (92.487) [2022-01-21 03:52:10 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.801 (2.158) Loss 1.1236 (1.1075) Acc@1 72.363 (74.019) Acc@5 93.359 (92.476) [2022-01-21 03:52:17 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.096 Acc@5 92.452 [2022-01-21 03:52:17 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-01-21 03:52:17 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.25% [2022-01-21 03:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][0/1251] eta 7:39:41 lr 0.000697 time 22.0473 (22.0473) loss 4.3669 (4.3669) grad_norm 1.2917 (1.2917) [2022-01-21 03:53:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][10/1251] eta 1:26:13 lr 0.000697 time 2.5156 (4.1686) loss 3.5350 (3.7665) grad_norm 1.2701 (1.2595) [2022-01-21 03:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][20/1251] eta 1:05:06 lr 0.000697 time 1.5791 (3.1734) loss 4.1203 (3.6764) grad_norm 1.1825 (1.2339) [2022-01-21 03:53:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][30/1251] eta 0:57:53 lr 0.000697 time 1.5092 (2.8446) loss 2.8272 (3.6240) grad_norm 1.3804 (1.2593) [2022-01-21 03:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][40/1251] eta 0:55:29 lr 0.000697 time 5.3761 (2.7491) loss 3.8857 (3.6358) grad_norm 1.3110 (1.2733) [2022-01-21 03:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][50/1251] eta 0:53:28 lr 0.000697 time 2.8817 (2.6715) loss 3.5450 (3.6151) grad_norm 1.4744 (1.2697) [2022-01-21 03:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][60/1251] eta 0:51:22 lr 0.000697 time 2.5187 (2.5882) loss 4.0057 (3.6217) grad_norm 1.1486 (1.2650) [2022-01-21 03:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][70/1251] eta 0:49:24 lr 0.000697 time 1.6695 (2.5101) loss 4.3497 (3.6449) grad_norm 1.1854 (1.2671) [2022-01-21 03:55:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][80/1251] eta 0:48:10 lr 0.000697 time 2.7021 (2.4686) loss 4.1816 (3.6512) grad_norm 1.3380 (1.2699) [2022-01-21 03:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][90/1251] eta 0:47:13 lr 0.000696 time 2.1620 (2.4409) loss 2.4511 (3.6439) grad_norm 1.2310 (1.2750) [2022-01-21 03:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][100/1251] eta 0:46:26 lr 0.000696 time 2.7953 (2.4211) loss 3.9718 (3.6554) grad_norm 1.3356 (1.2753) [2022-01-21 03:56:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][110/1251] eta 0:45:39 lr 0.000696 time 2.0665 (2.4006) loss 3.6919 (3.6498) grad_norm 1.3315 (1.2778) [2022-01-21 03:57:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][120/1251] eta 0:45:02 lr 0.000696 time 2.5809 (2.3896) loss 2.9180 (3.6354) grad_norm 1.9700 (1.2824) [2022-01-21 03:57:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][130/1251] eta 0:44:18 lr 0.000696 time 3.0451 (2.3712) loss 3.8170 (3.6298) grad_norm 1.6354 (1.2818) [2022-01-21 03:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][140/1251] eta 0:43:29 lr 0.000696 time 1.7228 (2.3483) loss 4.2531 (3.6392) grad_norm 1.2609 (1.2846) [2022-01-21 03:58:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][150/1251] eta 0:42:56 lr 0.000696 time 2.4067 (2.3399) loss 3.8198 (3.6610) grad_norm 1.2132 (1.2872) [2022-01-21 03:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][160/1251] eta 0:42:27 lr 0.000696 time 2.1412 (2.3346) loss 2.7357 (3.6641) grad_norm 1.1968 (1.2846) [2022-01-21 03:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][170/1251] eta 0:41:59 lr 0.000696 time 2.5145 (2.3305) loss 4.0177 (3.6690) grad_norm 1.2494 (1.2835) [2022-01-21 03:59:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][180/1251] eta 0:41:16 lr 0.000696 
time 1.7733 (2.3124) loss 2.6585 (3.6497) grad_norm 1.4101 (1.2827) [2022-01-21 03:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][190/1251] eta 0:40:37 lr 0.000696 time 2.0305 (2.2973) loss 3.7083 (3.6530) grad_norm 1.2451 (1.2846) [2022-01-21 03:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][200/1251] eta 0:40:10 lr 0.000696 time 2.7565 (2.2933) loss 4.2446 (3.6559) grad_norm 1.1349 (1.2870) [2022-01-21 04:00:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][210/1251] eta 0:39:36 lr 0.000696 time 1.9440 (2.2833) loss 3.6660 (3.6520) grad_norm 1.3885 (1.2870) [2022-01-21 04:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][220/1251] eta 0:39:09 lr 0.000696 time 1.8184 (2.2787) loss 3.7055 (3.6537) grad_norm 1.2683 (1.2857) [2022-01-21 04:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][230/1251] eta 0:38:42 lr 0.000696 time 1.9932 (2.2747) loss 4.3268 (3.6446) grad_norm 1.3636 (1.2853) [2022-01-21 04:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][240/1251] eta 0:38:17 lr 0.000696 time 3.2784 (2.2727) loss 4.0894 (3.6447) grad_norm 1.2828 (1.2844) [2022-01-21 04:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][250/1251] eta 0:37:47 lr 0.000696 time 1.7049 (2.2653) loss 3.3686 (3.6434) grad_norm 1.1320 (1.2830) [2022-01-21 04:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][260/1251] eta 0:37:22 lr 0.000696 time 2.2393 (2.2628) loss 4.1478 (3.6393) grad_norm 1.2343 (1.2815) [2022-01-21 04:02:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][270/1251] eta 0:37:00 lr 0.000696 time 2.6215 (2.2634) loss 4.2609 (3.6490) grad_norm 1.2143 (1.2834) [2022-01-21 04:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][280/1251] eta 0:36:32 lr 0.000696 time 2.1655 (2.2575) loss 2.8683 (3.6436) grad_norm 1.3283 (1.2838) [2022-01-21 04:03:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][290/1251] eta 0:36:03 lr 0.000696 time 1.6963 (2.2508) loss 4.3050 (3.6515) grad_norm 1.4432 (1.2861) [2022-01-21 04:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][300/1251] eta 0:35:38 lr 0.000696 time 2.5894 (2.2489) loss 4.1322 (3.6495) grad_norm 1.1920 (1.2880) [2022-01-21 04:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][310/1251] eta 0:35:12 lr 0.000696 time 2.4149 (2.2448) loss 4.2684 (3.6583) grad_norm 1.2141 (1.2874) [2022-01-21 04:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][320/1251] eta 0:34:49 lr 0.000696 time 3.4080 (2.2449) loss 3.5242 (3.6514) grad_norm 1.2159 (1.2883) [2022-01-21 04:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][330/1251] eta 0:34:23 lr 0.000696 time 1.8409 (2.2401) loss 3.6065 (3.6529) grad_norm 1.3181 (1.2876) [2022-01-21 04:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][340/1251] eta 0:33:56 lr 0.000696 time 2.2324 (2.2352) loss 4.3637 (3.6514) grad_norm 1.4306 (1.2886) [2022-01-21 04:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][350/1251] eta 0:33:35 lr 0.000695 time 1.7173 (2.2370) loss 4.4272 (3.6538) grad_norm 1.2949 (1.2880) [2022-01-21 04:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][360/1251] eta 0:33:13 lr 0.000695 time 2.3174 (2.2369) loss 3.8388 (3.6566) grad_norm 1.0724 (1.2876) [2022-01-21 04:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][370/1251] eta 0:32:51 lr 0.000695 time 2.6013 (2.2382) loss 4.0024 (3.6590) grad_norm 1.2717 (1.2889) [2022-01-21 04:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][380/1251] eta 0:32:28 lr 0.000695 time 1.8212 (2.2370) loss 4.3707 (3.6617) grad_norm 1.2039 (1.2901) [2022-01-21 04:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][390/1251] eta 0:32:06 lr 
0.000695 time 2.3567 (2.2372) loss 3.6820 (3.6669) grad_norm 1.4036 (1.2903) [2022-01-21 04:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][400/1251] eta 0:31:44 lr 0.000695 time 1.7774 (2.2381) loss 3.5102 (3.6714) grad_norm 1.4517 (1.2912) [2022-01-21 04:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][410/1251] eta 0:31:22 lr 0.000695 time 2.0973 (2.2386) loss 2.7685 (3.6661) grad_norm 1.2718 (1.2908) [2022-01-21 04:08:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][420/1251] eta 0:31:00 lr 0.000695 time 2.2242 (2.2391) loss 3.4814 (3.6680) grad_norm 1.1447 (1.2926) [2022-01-21 04:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][430/1251] eta 0:30:35 lr 0.000695 time 2.2596 (2.2351) loss 3.7958 (3.6721) grad_norm 1.3092 (1.2923) [2022-01-21 04:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][440/1251] eta 0:30:08 lr 0.000695 time 2.2374 (2.2299) loss 4.0763 (3.6737) grad_norm 1.4783 (1.2931) [2022-01-21 04:09:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][450/1251] eta 0:29:44 lr 0.000695 time 1.8815 (2.2278) loss 3.8337 (3.6730) grad_norm 1.2868 (1.2945) [2022-01-21 04:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][460/1251] eta 0:29:21 lr 0.000695 time 2.2786 (2.2265) loss 3.0845 (3.6693) grad_norm 1.2815 (1.2950) [2022-01-21 04:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][470/1251] eta 0:28:59 lr 0.000695 time 2.5482 (2.2275) loss 4.0255 (3.6726) grad_norm 1.2977 (1.2937) [2022-01-21 04:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][480/1251] eta 0:28:34 lr 0.000695 time 1.7370 (2.2237) loss 2.6419 (3.6687) grad_norm 1.2145 (1.2928) [2022-01-21 04:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][490/1251] eta 0:28:11 lr 0.000695 time 1.9098 (2.2232) loss 3.7131 (3.6725) grad_norm 1.1634 (1.2923) [2022-01-21 04:10:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][500/1251] eta 0:27:50 lr 0.000695 time 2.0987 (2.2239) loss 3.5853 (3.6744) grad_norm 1.1038 (1.2919) [2022-01-21 04:11:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][510/1251] eta 0:27:27 lr 0.000695 time 2.8216 (2.2228) loss 3.6739 (3.6800) grad_norm 1.0917 (1.2906) [2022-01-21 04:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][520/1251] eta 0:27:02 lr 0.000695 time 1.9039 (2.2199) loss 2.6186 (3.6798) grad_norm 1.3077 (1.2908) [2022-01-21 04:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][530/1251] eta 0:26:42 lr 0.000695 time 2.1157 (2.2224) loss 2.8675 (3.6831) grad_norm 1.4355 (1.2909) [2022-01-21 04:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][540/1251] eta 0:26:20 lr 0.000695 time 1.8887 (2.2229) loss 4.1977 (3.6844) grad_norm 1.2997 (1.2927) [2022-01-21 04:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][550/1251] eta 0:25:57 lr 0.000695 time 2.4153 (2.2222) loss 4.2910 (3.6840) grad_norm 1.1513 (1.2923) [2022-01-21 04:13:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][560/1251] eta 0:25:33 lr 0.000695 time 1.9620 (2.2196) loss 3.7916 (3.6793) grad_norm 1.4420 (1.2918) [2022-01-21 04:13:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][570/1251] eta 0:25:09 lr 0.000695 time 1.8942 (2.2170) loss 3.9596 (3.6813) grad_norm 1.3182 (1.2913) [2022-01-21 04:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][580/1251] eta 0:24:45 lr 0.000695 time 1.9629 (2.2133) loss 3.5980 (3.6783) grad_norm 1.2196 (1.2907) [2022-01-21 04:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][590/1251] eta 0:24:22 lr 0.000695 time 2.5212 (2.2122) loss 4.1441 (3.6768) grad_norm 1.3125 (1.2915) [2022-01-21 04:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][600/1251] eta 0:23:59 lr 
0.000695 time 2.1640 (2.2107) loss 3.8948 (3.6786) grad_norm 1.1483 (1.2935) [2022-01-21 04:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][610/1251] eta 0:23:36 lr 0.000694 time 2.8984 (2.2099) loss 4.0503 (3.6824) grad_norm 1.2264 (1.2943) [2022-01-21 04:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][620/1251] eta 0:23:13 lr 0.000694 time 2.1320 (2.2077) loss 3.2259 (3.6756) grad_norm 1.2123 (1.2937) [2022-01-21 04:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][630/1251] eta 0:22:50 lr 0.000694 time 1.8743 (2.2070) loss 3.6618 (3.6736) grad_norm 1.3897 (1.2925) [2022-01-21 04:15:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][640/1251] eta 0:22:29 lr 0.000694 time 2.7343 (2.2080) loss 4.4188 (3.6739) grad_norm 1.2036 (1.2916) [2022-01-21 04:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][650/1251] eta 0:22:07 lr 0.000694 time 2.4088 (2.2093) loss 3.7917 (3.6725) grad_norm 1.3638 (1.2917) [2022-01-21 04:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][660/1251] eta 0:21:44 lr 0.000694 time 1.5470 (2.2076) loss 4.0741 (3.6731) grad_norm 1.1318 (1.2903) [2022-01-21 04:16:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][670/1251] eta 0:21:22 lr 0.000694 time 1.8227 (2.2075) loss 2.6971 (3.6703) grad_norm 1.2491 (1.2912) [2022-01-21 04:17:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][680/1251] eta 0:21:01 lr 0.000694 time 3.3756 (2.2088) loss 2.7226 (3.6686) grad_norm 1.4056 (1.2928) [2022-01-21 04:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][690/1251] eta 0:20:39 lr 0.000694 time 2.8218 (2.2087) loss 3.4823 (3.6699) grad_norm 1.4477 (1.2935) [2022-01-21 04:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][700/1251] eta 0:20:17 lr 0.000694 time 2.2828 (2.2101) loss 3.5184 (3.6727) grad_norm 1.3398 (1.2926) [2022-01-21 04:18:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][710/1251] eta 0:19:56 lr 0.000694 time 1.7567 (2.2111) loss 3.6659 (3.6723) grad_norm 1.3829 (1.2933) [2022-01-21 04:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][720/1251] eta 0:19:34 lr 0.000694 time 2.3245 (2.2118) loss 3.6540 (3.6715) grad_norm 1.1997 (1.2927) [2022-01-21 04:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][730/1251] eta 0:19:14 lr 0.000694 time 6.0746 (2.2150) loss 3.3646 (3.6711) grad_norm 1.1818 (1.2931) [2022-01-21 04:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][740/1251] eta 0:18:53 lr 0.000694 time 1.6490 (2.2185) loss 3.4592 (3.6685) grad_norm 1.3720 (1.2950) [2022-01-21 04:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][750/1251] eta 0:18:29 lr 0.000694 time 1.8852 (2.2139) loss 4.0870 (3.6666) grad_norm 1.3243 (1.2948) [2022-01-21 04:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][760/1251] eta 0:18:06 lr 0.000694 time 1.9053 (2.2132) loss 4.3048 (3.6696) grad_norm 1.2206 (1.2952) [2022-01-21 04:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][770/1251] eta 0:17:45 lr 0.000694 time 3.8865 (2.2155) loss 4.3810 (3.6709) grad_norm 1.7313 (1.2968) [2022-01-21 04:21:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][780/1251] eta 0:17:22 lr 0.000694 time 1.6769 (2.2124) loss 3.8667 (3.6704) grad_norm 1.3458 (1.2979) [2022-01-21 04:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][790/1251] eta 0:16:58 lr 0.000694 time 1.6791 (2.2095) loss 3.4278 (3.6690) grad_norm 1.1295 (1.2972) [2022-01-21 04:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][800/1251] eta 0:16:35 lr 0.000694 time 2.0663 (2.2084) loss 3.1952 (3.6682) grad_norm 1.1875 (1.2969) [2022-01-21 04:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][810/1251] eta 0:16:14 lr 
0.000694 time 3.2747 (2.2097) loss 3.9126 (3.6698) grad_norm 1.2003 (1.2962) [2022-01-21 04:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][820/1251] eta 0:15:52 lr 0.000694 time 2.5061 (2.2100) loss 2.8301 (3.6718) grad_norm 1.4624 (1.2978) [2022-01-21 04:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][830/1251] eta 0:15:30 lr 0.000694 time 1.9387 (2.2101) loss 3.6973 (3.6714) grad_norm 1.1660 (1.2972) [2022-01-21 04:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][840/1251] eta 0:15:08 lr 0.000694 time 1.7268 (2.2114) loss 3.9805 (3.6713) grad_norm 1.2972 (1.2977) [2022-01-21 04:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][850/1251] eta 0:14:48 lr 0.000694 time 2.8632 (2.2145) loss 3.6800 (3.6710) grad_norm 1.2999 (1.2968) [2022-01-21 04:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][860/1251] eta 0:14:24 lr 0.000694 time 1.9119 (2.2117) loss 3.5353 (3.6703) grad_norm 1.1806 (1.2962) [2022-01-21 04:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][870/1251] eta 0:14:01 lr 0.000693 time 2.0378 (2.2097) loss 3.4323 (3.6692) grad_norm 1.2068 (1.2958) [2022-01-21 04:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][880/1251] eta 0:13:39 lr 0.000693 time 1.9250 (2.2090) loss 3.2960 (3.6675) grad_norm 1.0370 (1.2957) [2022-01-21 04:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][890/1251] eta 0:13:17 lr 0.000693 time 2.3094 (2.2079) loss 3.8885 (3.6671) grad_norm 1.2040 (1.2949) [2022-01-21 04:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][900/1251] eta 0:12:54 lr 0.000693 time 2.3273 (2.2076) loss 3.0396 (3.6673) grad_norm 1.2650 (1.2947) [2022-01-21 04:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][910/1251] eta 0:12:32 lr 0.000693 time 2.2068 (2.2070) loss 3.7974 (3.6673) grad_norm 1.3085 (1.2942) [2022-01-21 04:26:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][920/1251] eta 0:12:10 lr 0.000693 time 2.2325 (2.2070) loss 3.6825 (3.6685) grad_norm 1.4233 (1.2945) [2022-01-21 04:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][930/1251] eta 0:11:48 lr 0.000693 time 2.1443 (2.2075) loss 4.2401 (3.6691) grad_norm 1.2334 (1.2946) [2022-01-21 04:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][940/1251] eta 0:11:26 lr 0.000693 time 1.8462 (2.2069) loss 3.2945 (3.6686) grad_norm 2.0171 (1.2953) [2022-01-21 04:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][950/1251] eta 0:11:03 lr 0.000693 time 1.7028 (2.2058) loss 2.4365 (3.6670) grad_norm 1.0940 (1.2948) [2022-01-21 04:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][960/1251] eta 0:10:41 lr 0.000693 time 2.4929 (2.2060) loss 3.6708 (3.6673) grad_norm 1.1953 (1.2946) [2022-01-21 04:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][970/1251] eta 0:10:20 lr 0.000693 time 2.1738 (2.2067) loss 3.4154 (3.6685) grad_norm 1.6516 (1.2946) [2022-01-21 04:28:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][980/1251] eta 0:09:57 lr 0.000693 time 2.3976 (2.2066) loss 3.7245 (3.6684) grad_norm 1.1298 (1.2939) [2022-01-21 04:28:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][990/1251] eta 0:09:35 lr 0.000693 time 1.9796 (2.2058) loss 3.1985 (3.6712) grad_norm 1.6311 (1.2942) [2022-01-21 04:29:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1000/1251] eta 0:09:13 lr 0.000693 time 1.9169 (2.2043) loss 4.3513 (3.6746) grad_norm 1.3651 (1.2939) [2022-01-21 04:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1010/1251] eta 0:08:51 lr 0.000693 time 1.8404 (2.2042) loss 4.1441 (3.6714) grad_norm 1.2098 (1.2934) [2022-01-21 04:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1020/1251] eta 0:08:29 lr 
0.000693 time 1.5851 (2.2038) loss 4.0352 (3.6741) grad_norm 1.4025 (1.2939) [2022-01-21 04:30:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1030/1251] eta 0:08:06 lr 0.000693 time 1.8641 (2.2033) loss 3.3511 (3.6727) grad_norm 1.4552 (1.2942) [2022-01-21 04:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1040/1251] eta 0:07:45 lr 0.000693 time 2.6387 (2.2038) loss 3.0448 (3.6712) grad_norm 1.2912 (1.2936) [2022-01-21 04:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1050/1251] eta 0:07:23 lr 0.000693 time 2.1823 (2.2041) loss 2.8704 (3.6680) grad_norm 1.1769 (1.2933) [2022-01-21 04:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1060/1251] eta 0:07:00 lr 0.000693 time 2.1527 (2.2031) loss 3.9277 (3.6687) grad_norm 1.1957 (1.2934) [2022-01-21 04:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1070/1251] eta 0:06:38 lr 0.000693 time 1.6000 (2.2015) loss 3.9955 (3.6690) grad_norm 1.2830 (1.2929) [2022-01-21 04:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1080/1251] eta 0:06:16 lr 0.000693 time 1.9090 (2.2005) loss 2.9018 (3.6702) grad_norm 1.2708 (1.2928) [2022-01-21 04:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1090/1251] eta 0:05:54 lr 0.000693 time 2.2163 (2.2004) loss 4.0531 (3.6674) grad_norm 1.2121 (1.2931) [2022-01-21 04:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1100/1251] eta 0:05:32 lr 0.000693 time 2.4727 (2.2011) loss 3.4565 (3.6645) grad_norm 1.3581 (1.2934) [2022-01-21 04:33:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1110/1251] eta 0:05:10 lr 0.000693 time 2.5707 (2.2020) loss 3.1670 (3.6605) grad_norm 1.3379 (1.2933) [2022-01-21 04:33:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1120/1251] eta 0:04:48 lr 0.000693 time 2.2091 (2.2018) loss 2.8563 (3.6599) grad_norm 1.4525 (1.2932) [2022-01-21 
04:33:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1130/1251] eta 0:04:26 lr 0.000692 time 1.8363 (2.2010) loss 4.0357 (3.6613) grad_norm 1.2310 (1.2933) [2022-01-21 04:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1140/1251] eta 0:04:04 lr 0.000692 time 1.9823 (2.2001) loss 2.8441 (3.6589) grad_norm 1.3382 (1.2933) [2022-01-21 04:34:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1150/1251] eta 0:03:42 lr 0.000692 time 2.2848 (2.1994) loss 3.3276 (3.6562) grad_norm 1.1512 (1.2934) [2022-01-21 04:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1160/1251] eta 0:03:20 lr 0.000692 time 1.9252 (2.1989) loss 3.5800 (3.6555) grad_norm 1.3030 (1.2938) [2022-01-21 04:35:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1170/1251] eta 0:02:58 lr 0.000692 time 1.8031 (2.1984) loss 3.9892 (3.6553) grad_norm 1.0754 (1.2943) [2022-01-21 04:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1180/1251] eta 0:02:36 lr 0.000692 time 3.6653 (2.2000) loss 3.9249 (3.6555) grad_norm 1.3374 (1.2944) [2022-01-21 04:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1190/1251] eta 0:02:14 lr 0.000692 time 1.9143 (2.2007) loss 3.3645 (3.6540) grad_norm 1.1909 (1.2941) [2022-01-21 04:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1200/1251] eta 0:01:52 lr 0.000692 time 2.3992 (2.2019) loss 3.7441 (3.6559) grad_norm 1.1835 (1.2935) [2022-01-21 04:36:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1210/1251] eta 0:01:30 lr 0.000692 time 1.5561 (2.2015) loss 4.0053 (3.6568) grad_norm 1.3076 (1.2933) [2022-01-21 04:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1220/1251] eta 0:01:08 lr 0.000692 time 2.4149 (2.2011) loss 2.9838 (3.6538) grad_norm 1.2045 (1.2932) [2022-01-21 04:37:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1230/1251] 
eta 0:00:46 lr 0.000692 time 1.5897 (2.1997) loss 4.1556 (3.6562) grad_norm 1.1256 (1.2925) [2022-01-21 04:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1240/1251] eta 0:00:24 lr 0.000692 time 1.9570 (2.1981) loss 3.8067 (3.6550) grad_norm 1.2510 (1.2923) [2022-01-21 04:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [112/300][1250/1251] eta 0:00:02 lr 0.000692 time 1.1691 (2.1925) loss 3.6952 (3.6554) grad_norm 1.3241 (1.2921) [2022-01-21 04:38:00 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 112 training takes 0:45:43 [2022-01-21 04:38:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.311 (18.311) Loss 1.1475 (1.1475) Acc@1 74.609 (74.609) Acc@5 91.016 (91.016) [2022-01-21 04:38:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.894 (3.225) Loss 1.1345 (1.0983) Acc@1 72.266 (74.245) Acc@5 91.602 (92.489) [2022-01-21 04:38:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.600 (2.453) Loss 1.1052 (1.0964) Acc@1 72.363 (74.298) Acc@5 92.090 (92.494) [2022-01-21 04:39:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.557 (2.212) Loss 1.1050 (1.0969) Acc@1 74.512 (74.408) Acc@5 91.602 (92.440) [2022-01-21 04:39:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.775 (2.149) Loss 1.1909 (1.0963) Acc@1 71.094 (74.243) Acc@5 91.211 (92.519) [2022-01-21 04:39:35 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.242 Acc@5 92.550 [2022-01-21 04:39:35 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-01-21 04:39:35 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.25% [2022-01-21 04:39:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][0/1251] eta 7:05:26 lr 0.000692 time 20.4052 (20.4052) loss 2.6393 (2.6393) grad_norm 1.1985 (1.1985) [2022-01-21 04:40:18 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [113/300][10/1251] eta 1:19:18 lr 0.000692 time 2.1578 (3.8346) loss 3.8375 (3.6697) grad_norm 1.4295 (1.3106) [2022-01-21 04:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][20/1251] eta 1:03:23 lr 0.000692 time 1.6558 (3.0894) loss 4.0274 (3.4762) grad_norm 1.2599 (1.3208) [2022-01-21 04:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][30/1251] eta 0:56:09 lr 0.000692 time 1.6557 (2.7598) loss 3.4952 (3.5203) grad_norm 1.3097 (1.3049) [2022-01-21 04:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][40/1251] eta 0:54:11 lr 0.000692 time 3.7804 (2.6846) loss 4.1722 (3.5225) grad_norm 1.3178 (1.2941) [2022-01-21 04:41:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][50/1251] eta 0:52:49 lr 0.000692 time 2.1757 (2.6391) loss 3.9821 (3.5236) grad_norm 1.4881 (1.3065) [2022-01-21 04:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][60/1251] eta 0:51:16 lr 0.000692 time 2.0758 (2.5829) loss 3.4085 (3.5486) grad_norm 1.3818 (1.3038) [2022-01-21 04:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][70/1251] eta 0:49:42 lr 0.000692 time 2.0367 (2.5252) loss 4.2169 (3.5822) grad_norm 1.5488 (1.3250) [2022-01-21 04:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][80/1251] eta 0:48:23 lr 0.000692 time 1.9650 (2.4799) loss 3.8959 (3.5824) grad_norm 1.3057 (1.3181) [2022-01-21 04:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][90/1251] eta 0:47:18 lr 0.000692 time 2.1992 (2.4450) loss 3.9982 (3.5951) grad_norm 1.7931 (1.3314) [2022-01-21 04:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][100/1251] eta 0:46:11 lr 0.000692 time 2.4938 (2.4079) loss 3.7912 (3.5936) grad_norm 1.5692 (1.3343) [2022-01-21 04:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][110/1251] eta 0:45:13 lr 0.000692 time 1.8800 (2.3781) loss 2.6928 (3.5911) grad_norm 
1.3792 (1.3388) [2022-01-21 04:44:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][120/1251] eta 0:44:17 lr 0.000692 time 1.8084 (2.3496) loss 4.0644 (3.6042) grad_norm 1.4027 (1.3413) [2022-01-21 04:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][130/1251] eta 0:43:45 lr 0.000692 time 2.3261 (2.3425) loss 3.5691 (3.6378) grad_norm 1.2982 (1.3351) [2022-01-21 04:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][140/1251] eta 0:43:08 lr 0.000691 time 1.9084 (2.3296) loss 4.3246 (3.6371) grad_norm 1.4930 (1.3323) [2022-01-21 04:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][150/1251] eta 0:42:45 lr 0.000691 time 2.9595 (2.3300) loss 2.7463 (3.6471) grad_norm 1.3589 (1.3344) [2022-01-21 04:45:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][160/1251] eta 0:42:14 lr 0.000691 time 1.5129 (2.3229) loss 3.4380 (3.6405) grad_norm 1.2296 (1.3355) [2022-01-21 04:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][170/1251] eta 0:41:51 lr 0.000691 time 2.4114 (2.3230) loss 2.4585 (3.6407) grad_norm 1.6137 (1.3377) [2022-01-21 04:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][180/1251] eta 0:41:15 lr 0.000691 time 1.7854 (2.3117) loss 3.3436 (3.6470) grad_norm 1.1191 (1.3356) [2022-01-21 04:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][190/1251] eta 0:40:35 lr 0.000691 time 1.8918 (2.2957) loss 3.3964 (3.6412) grad_norm 1.1942 (1.3343) [2022-01-21 04:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][200/1251] eta 0:40:02 lr 0.000691 time 2.3237 (2.2860) loss 4.4572 (3.6458) grad_norm 1.3833 (1.3354) [2022-01-21 04:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][210/1251] eta 0:39:37 lr 0.000691 time 2.4335 (2.2836) loss 3.8918 (3.6493) grad_norm 1.2526 (1.3337) [2022-01-21 04:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[113/300][220/1251] eta 0:39:13 lr 0.000691 time 2.1683 (2.2827) loss 4.2708 (3.6568) grad_norm 1.3375 (1.3332) [2022-01-21 04:48:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][230/1251] eta 0:38:46 lr 0.000691 time 2.1444 (2.2783) loss 4.1205 (3.6590) grad_norm 1.5662 (1.3334) [2022-01-21 04:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][240/1251] eta 0:38:19 lr 0.000691 time 2.2208 (2.2740) loss 3.6209 (3.6571) grad_norm 1.0922 (1.3327) [2022-01-21 04:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][250/1251] eta 0:37:56 lr 0.000691 time 2.7483 (2.2737) loss 3.1505 (3.6518) grad_norm 1.2392 (1.3311) [2022-01-21 04:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][260/1251] eta 0:37:29 lr 0.000691 time 1.9066 (2.2702) loss 3.7071 (3.6561) grad_norm 1.2492 (1.3281) [2022-01-21 04:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][270/1251] eta 0:37:01 lr 0.000691 time 1.9128 (2.2645) loss 3.1217 (3.6470) grad_norm 1.1046 (1.3267) [2022-01-21 04:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][280/1251] eta 0:36:34 lr 0.000691 time 2.9218 (2.2603) loss 3.6475 (3.6499) grad_norm 1.3607 (1.3277) [2022-01-21 04:50:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][290/1251] eta 0:36:09 lr 0.000691 time 2.6904 (2.2574) loss 3.8862 (3.6539) grad_norm 1.2155 (1.3267) [2022-01-21 04:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][300/1251] eta 0:35:46 lr 0.000691 time 2.2385 (2.2575) loss 4.0894 (3.6617) grad_norm 1.1210 (1.3275) [2022-01-21 04:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][310/1251] eta 0:35:24 lr 0.000691 time 2.3330 (2.2576) loss 4.0801 (3.6576) grad_norm 1.2435 (1.3251) [2022-01-21 04:51:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][320/1251] eta 0:34:59 lr 0.000691 time 2.7690 (2.2551) loss 4.0449 (3.6584) grad_norm 
1.3814 (1.3236) [2022-01-21 04:51:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][330/1251] eta 0:34:29 lr 0.000691 time 1.6314 (2.2474) loss 4.4164 (3.6645) grad_norm 1.3083 (1.3239) [2022-01-21 04:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][340/1251] eta 0:34:00 lr 0.000691 time 1.9984 (2.2398) loss 3.9824 (3.6680) grad_norm 1.1526 (1.3213) [2022-01-21 04:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][350/1251] eta 0:33:36 lr 0.000691 time 2.1645 (2.2377) loss 3.1188 (3.6728) grad_norm 1.3816 (1.3225) [2022-01-21 04:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][360/1251] eta 0:33:11 lr 0.000691 time 2.6038 (2.2348) loss 3.9790 (3.6776) grad_norm 1.2196 (1.3206) [2022-01-21 04:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][370/1251] eta 0:32:49 lr 0.000691 time 3.1662 (2.2352) loss 2.5313 (3.6785) grad_norm 1.2030 (1.3196) [2022-01-21 04:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][380/1251] eta 0:32:28 lr 0.000691 time 1.5039 (2.2373) loss 3.5709 (3.6807) grad_norm 1.5302 (1.3210) [2022-01-21 04:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][390/1251] eta 0:32:10 lr 0.000691 time 2.6690 (2.2418) loss 2.7469 (3.6867) grad_norm 1.3958 (1.3193) [2022-01-21 04:54:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][400/1251] eta 0:31:48 lr 0.000690 time 2.4684 (2.2426) loss 3.9612 (3.6864) grad_norm 1.2114 (1.3196) [2022-01-21 04:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][410/1251] eta 0:31:22 lr 0.000690 time 2.5370 (2.2386) loss 3.3192 (3.6836) grad_norm 1.3910 (1.3205) [2022-01-21 04:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][420/1251] eta 0:30:54 lr 0.000690 time 1.8581 (2.2315) loss 3.5579 (3.6813) grad_norm 1.1990 (1.3197) [2022-01-21 04:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[113/300][430/1251] eta 0:30:28 lr 0.000690 time 2.5178 (2.2268) loss 3.4373 (3.6809) grad_norm 1.2479 (1.3176) [2022-01-21 04:55:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][440/1251] eta 0:30:02 lr 0.000690 time 1.8866 (2.2223) loss 2.8923 (3.6771) grad_norm 1.1780 (1.3168) [2022-01-21 04:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][450/1251] eta 0:29:37 lr 0.000690 time 1.8679 (2.2195) loss 4.0883 (3.6785) grad_norm 1.2588 (1.3174) [2022-01-21 04:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][460/1251] eta 0:29:17 lr 0.000690 time 2.2211 (2.2220) loss 4.0293 (3.6771) grad_norm 1.3414 (1.3182) [2022-01-21 04:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][470/1251] eta 0:28:56 lr 0.000690 time 2.5347 (2.2238) loss 4.0914 (3.6792) grad_norm 1.2081 (1.3170) [2022-01-21 04:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][480/1251] eta 0:28:32 lr 0.000690 time 1.8685 (2.2215) loss 3.9810 (3.6794) grad_norm 1.4597 (1.3167) [2022-01-21 04:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][490/1251] eta 0:28:10 lr 0.000690 time 2.0269 (2.2217) loss 3.5410 (3.6809) grad_norm 1.4128 (1.3184) [2022-01-21 04:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][500/1251] eta 0:27:47 lr 0.000690 time 2.2702 (2.2210) loss 3.8763 (3.6817) grad_norm 1.1203 (1.3176) [2022-01-21 04:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][510/1251] eta 0:27:26 lr 0.000690 time 3.0501 (2.2221) loss 2.4993 (3.6735) grad_norm 1.2268 (1.3161) [2022-01-21 04:58:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][520/1251] eta 0:27:04 lr 0.000690 time 1.9193 (2.2217) loss 3.9119 (3.6772) grad_norm 1.4804 (1.3173) [2022-01-21 04:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][530/1251] eta 0:26:38 lr 0.000690 time 1.8727 (2.2178) loss 4.0672 (3.6776) grad_norm 
1.4490 (1.3182) [2022-01-21 04:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][540/1251] eta 0:26:14 lr 0.000690 time 1.9909 (2.2142) loss 3.2887 (3.6769) grad_norm 1.2086 (1.3183) [2022-01-21 04:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][550/1251] eta 0:25:54 lr 0.000690 time 3.3240 (2.2175) loss 3.6038 (3.6796) grad_norm 1.2763 (1.3170) [2022-01-21 05:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][560/1251] eta 0:25:32 lr 0.000690 time 2.6082 (2.2184) loss 3.9851 (3.6808) grad_norm 1.3501 (1.3151) [2022-01-21 05:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][570/1251] eta 0:25:10 lr 0.000690 time 1.8251 (2.2179) loss 4.7131 (3.6772) grad_norm 1.4535 (1.3154) [2022-01-21 05:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][580/1251] eta 0:24:47 lr 0.000690 time 1.9507 (2.2163) loss 2.8796 (3.6742) grad_norm 1.1524 (1.3146) [2022-01-21 05:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][590/1251] eta 0:24:25 lr 0.000690 time 3.0839 (2.2170) loss 4.2712 (3.6760) grad_norm 1.6648 (1.3159) [2022-01-21 05:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][600/1251] eta 0:24:02 lr 0.000690 time 2.1212 (2.2163) loss 3.7921 (3.6742) grad_norm 1.3235 (1.3170) [2022-01-21 05:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][610/1251] eta 0:23:38 lr 0.000690 time 2.0071 (2.2135) loss 3.3553 (3.6692) grad_norm 1.3747 (1.3172) [2022-01-21 05:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][620/1251] eta 0:23:15 lr 0.000690 time 1.9392 (2.2121) loss 4.3712 (3.6703) grad_norm 1.4384 (1.3176) [2022-01-21 05:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][630/1251] eta 0:22:53 lr 0.000690 time 3.0883 (2.2125) loss 2.6178 (3.6678) grad_norm 1.4398 (1.3180) [2022-01-21 05:03:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[113/300][640/1251] eta 0:22:31 lr 0.000690 time 2.6086 (2.2115) loss 4.0482 (3.6669) grad_norm 1.3544 (1.3178) [2022-01-21 05:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][650/1251] eta 0:22:07 lr 0.000690 time 1.5785 (2.2083) loss 4.3530 (3.6668) grad_norm 1.4150 (1.3171) [2022-01-21 05:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][660/1251] eta 0:21:44 lr 0.000689 time 1.6500 (2.2073) loss 4.0399 (3.6656) grad_norm 1.2620 (1.3158) [2022-01-21 05:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][670/1251] eta 0:21:23 lr 0.000689 time 3.0410 (2.2089) loss 3.0466 (3.6670) grad_norm 1.2169 (1.3151) [2022-01-21 05:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][680/1251] eta 0:21:01 lr 0.000689 time 2.2877 (2.2102) loss 2.8234 (3.6642) grad_norm 1.5720 (1.3151) [2022-01-21 05:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][690/1251] eta 0:20:40 lr 0.000689 time 2.1100 (2.2106) loss 3.3007 (3.6647) grad_norm 1.3281 (1.3157) [2022-01-21 05:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][700/1251] eta 0:20:17 lr 0.000689 time 1.9078 (2.2102) loss 4.2822 (3.6653) grad_norm 1.4005 (1.3156) [2022-01-21 05:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][710/1251] eta 0:19:56 lr 0.000689 time 3.0756 (2.2110) loss 3.8592 (3.6639) grad_norm 1.4184 (1.3155) [2022-01-21 05:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][720/1251] eta 0:19:32 lr 0.000689 time 1.8657 (2.2074) loss 4.2516 (3.6641) grad_norm 1.3339 (1.3166) [2022-01-21 05:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][730/1251] eta 0:19:08 lr 0.000689 time 1.8787 (2.2046) loss 3.4015 (3.6650) grad_norm 1.4711 (1.3171) [2022-01-21 05:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][740/1251] eta 0:18:46 lr 0.000689 time 2.2624 (2.2041) loss 3.9714 (3.6671) grad_norm 
1.2650 (1.3167) [2022-01-21 05:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][750/1251] eta 0:18:24 lr 0.000689 time 3.0475 (2.2045) loss 4.5000 (3.6675) grad_norm 1.7391 (1.3188) [2022-01-21 05:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][760/1251] eta 0:18:02 lr 0.000689 time 2.0917 (2.2048) loss 3.1606 (3.6678) grad_norm 1.3615 (1.3200) [2022-01-21 05:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][770/1251] eta 0:17:41 lr 0.000689 time 2.4755 (2.2062) loss 4.5291 (3.6731) grad_norm 1.5059 (1.3196) [2022-01-21 05:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][780/1251] eta 0:17:18 lr 0.000689 time 1.7187 (2.2059) loss 2.6084 (3.6711) grad_norm 1.3071 (1.3208) [2022-01-21 05:08:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][790/1251] eta 0:16:57 lr 0.000689 time 2.8104 (2.2079) loss 4.5261 (3.6712) grad_norm 1.2656 (1.3205) [2022-01-21 05:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][800/1251] eta 0:16:36 lr 0.000689 time 2.1766 (2.2091) loss 3.5809 (3.6727) grad_norm 1.3728 (1.3193) [2022-01-21 05:09:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][810/1251] eta 0:16:13 lr 0.000689 time 2.1697 (2.2070) loss 3.8707 (3.6724) grad_norm 1.1887 (1.3194) [2022-01-21 05:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][820/1251] eta 0:15:49 lr 0.000689 time 2.1260 (2.2040) loss 3.0694 (3.6733) grad_norm 1.5975 (1.3191) [2022-01-21 05:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][830/1251] eta 0:15:27 lr 0.000689 time 2.1632 (2.2032) loss 4.1232 (3.6740) grad_norm 1.2505 (1.3187) [2022-01-21 05:10:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][840/1251] eta 0:15:04 lr 0.000689 time 1.8312 (2.2012) loss 4.2014 (3.6734) grad_norm 1.2407 (1.3187) [2022-01-21 05:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[113/300][850/1251] eta 0:14:42 lr 0.000689 time 2.1420 (2.2007) loss 2.7044 (3.6698) grad_norm 1.3682 (1.3184) [2022-01-21 05:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][860/1251] eta 0:14:21 lr 0.000689 time 2.4861 (2.2026) loss 3.2545 (3.6701) grad_norm 1.4664 (1.3190) [2022-01-21 05:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][870/1251] eta 0:13:59 lr 0.000689 time 1.9111 (2.2028) loss 3.9330 (3.6714) grad_norm 1.3694 (1.3183) [2022-01-21 05:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][880/1251] eta 0:13:37 lr 0.000689 time 2.1793 (2.2039) loss 3.5491 (3.6681) grad_norm 1.7059 (1.3206) [2022-01-21 05:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][890/1251] eta 0:13:16 lr 0.000689 time 3.1825 (2.2055) loss 3.1564 (3.6658) grad_norm 1.2870 (1.3202) [2022-01-21 05:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][900/1251] eta 0:12:54 lr 0.000689 time 1.9623 (2.2053) loss 4.1318 (3.6681) grad_norm 1.3452 (1.3203) [2022-01-21 05:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][910/1251] eta 0:12:31 lr 0.000689 time 1.8659 (2.2053) loss 4.1654 (3.6668) grad_norm 1.2597 (1.3201) [2022-01-21 05:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][920/1251] eta 0:12:09 lr 0.000688 time 1.9055 (2.2035) loss 3.1788 (3.6685) grad_norm 1.2390 (1.3197) [2022-01-21 05:13:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][930/1251] eta 0:11:46 lr 0.000688 time 1.9540 (2.2024) loss 3.8704 (3.6704) grad_norm 1.2212 (1.3192) [2022-01-21 05:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][940/1251] eta 0:11:24 lr 0.000688 time 1.8833 (2.2008) loss 4.0146 (3.6695) grad_norm 1.1844 (1.3186) [2022-01-21 05:14:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][950/1251] eta 0:11:02 lr 0.000688 time 1.5900 (2.2009) loss 3.3396 (3.6704) grad_norm 
1.3580 (1.3182) [2022-01-21 05:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][960/1251] eta 0:10:40 lr 0.000688 time 2.7253 (2.2022) loss 3.9009 (3.6705) grad_norm 1.1711 (1.3180) [2022-01-21 05:15:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][970/1251] eta 0:10:18 lr 0.000688 time 2.2841 (2.2027) loss 3.3389 (3.6693) grad_norm 1.3401 (1.3181) [2022-01-21 05:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][980/1251] eta 0:09:57 lr 0.000688 time 1.9572 (2.2030) loss 3.1731 (3.6686) grad_norm 1.4844 (1.3179) [2022-01-21 05:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][990/1251] eta 0:09:35 lr 0.000688 time 1.9157 (2.2038) loss 4.4916 (3.6710) grad_norm 1.3710 (1.3165) [2022-01-21 05:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1000/1251] eta 0:09:12 lr 0.000688 time 1.7354 (2.2025) loss 3.7776 (3.6693) grad_norm 1.3797 (1.3168) [2022-01-21 05:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1010/1251] eta 0:08:50 lr 0.000688 time 1.9233 (2.1997) loss 4.6662 (3.6701) grad_norm 1.2217 (1.3172) [2022-01-21 05:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1020/1251] eta 0:08:27 lr 0.000688 time 2.2520 (2.1978) loss 4.2714 (3.6701) grad_norm 1.2544 (1.3172) [2022-01-21 05:17:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1030/1251] eta 0:08:05 lr 0.000688 time 1.9072 (2.1977) loss 4.1534 (3.6709) grad_norm 1.4405 (1.3167) [2022-01-21 05:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1040/1251] eta 0:07:43 lr 0.000688 time 1.9691 (2.1979) loss 4.0384 (3.6696) grad_norm 1.5610 (1.3173) [2022-01-21 05:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1050/1251] eta 0:07:22 lr 0.000688 time 2.7029 (2.1990) loss 3.5374 (3.6684) grad_norm 1.2329 (1.3171) [2022-01-21 05:18:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[113/300][1060/1251] eta 0:07:00 lr 0.000688 time 1.9162 (2.1996) loss 4.0763 (3.6676) grad_norm 1.5169 (1.3177) [2022-01-21 05:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1070/1251] eta 0:06:38 lr 0.000688 time 2.0000 (2.1993) loss 4.1441 (3.6666) grad_norm 1.3683 (1.3191) [2022-01-21 05:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1080/1251] eta 0:06:15 lr 0.000688 time 1.6365 (2.1982) loss 4.0160 (3.6666) grad_norm 1.6060 (1.3194) [2022-01-21 05:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1090/1251] eta 0:05:53 lr 0.000688 time 2.4747 (2.1973) loss 2.6325 (3.6684) grad_norm 1.4800 (1.3188) [2022-01-21 05:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1100/1251] eta 0:05:31 lr 0.000688 time 1.8403 (2.1971) loss 4.4002 (3.6695) grad_norm 1.2893 (1.3185) [2022-01-21 05:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1110/1251] eta 0:05:10 lr 0.000688 time 1.8422 (2.1991) loss 3.2290 (3.6673) grad_norm 1.3363 (1.3189) [2022-01-21 05:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1120/1251] eta 0:04:48 lr 0.000688 time 1.9356 (2.1995) loss 2.9507 (3.6683) grad_norm 1.2245 (1.3186) [2022-01-21 05:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1130/1251] eta 0:04:26 lr 0.000688 time 2.1373 (2.1988) loss 3.9393 (3.6675) grad_norm 1.2987 (1.3190) [2022-01-21 05:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1140/1251] eta 0:04:03 lr 0.000688 time 1.6226 (2.1980) loss 4.2884 (3.6687) grad_norm 1.4767 (1.3200) [2022-01-21 05:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1150/1251] eta 0:03:41 lr 0.000688 time 1.8487 (2.1965) loss 2.9762 (3.6680) grad_norm 1.3019 (1.3195) [2022-01-21 05:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1160/1251] eta 0:03:19 lr 0.000688 time 2.2527 (2.1968) loss 3.8844 (3.6687) 
grad_norm 1.1536 (1.3194) [2022-01-21 05:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1170/1251] eta 0:02:57 lr 0.000688 time 2.5540 (2.1970) loss 3.9139 (3.6682) grad_norm 1.3327 (1.3201) [2022-01-21 05:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1180/1251] eta 0:02:35 lr 0.000687 time 1.8964 (2.1970) loss 3.9360 (3.6686) grad_norm 1.3975 (1.3209) [2022-01-21 05:23:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1190/1251] eta 0:02:13 lr 0.000687 time 1.6077 (2.1965) loss 3.9412 (3.6693) grad_norm 1.1873 (1.3207) [2022-01-21 05:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1200/1251] eta 0:01:52 lr 0.000687 time 2.0240 (2.1964) loss 3.7338 (3.6726) grad_norm 1.4872 (1.3201) [2022-01-21 05:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1210/1251] eta 0:01:30 lr 0.000687 time 2.0097 (2.1977) loss 3.2390 (3.6725) grad_norm 1.1755 (1.3195) [2022-01-21 05:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1220/1251] eta 0:01:08 lr 0.000687 time 2.1729 (2.1978) loss 4.1204 (3.6723) grad_norm 1.3283 (1.3194) [2022-01-21 05:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1230/1251] eta 0:00:46 lr 0.000687 time 2.5591 (2.1981) loss 2.8801 (3.6714) grad_norm 1.1786 (1.3189) [2022-01-21 05:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1240/1251] eta 0:00:24 lr 0.000687 time 2.6610 (2.1973) loss 3.8815 (3.6719) grad_norm 1.2637 (1.3182) [2022-01-21 05:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [113/300][1250/1251] eta 0:00:02 lr 0.000687 time 1.1791 (2.1911) loss 3.8114 (3.6708) grad_norm 1.2304 (1.3184) [2022-01-21 05:25:17 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 113 training takes 0:45:41 [2022-01-21 05:25:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.965 (18.965) Loss 1.1364 (1.1364) Acc@1 73.926 (73.926) 
Acc@5 92.676 (92.676) [2022-01-21 05:25:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.531 (3.701) Loss 1.0904 (1.1342) Acc@1 74.902 (73.659) Acc@5 91.699 (92.205) [2022-01-21 05:26:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.589 (2.684) Loss 1.1089 (1.1281) Acc@1 74.316 (73.889) Acc@5 91.797 (92.299) [2022-01-21 05:26:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.956 (2.393) Loss 1.0960 (1.1254) Acc@1 73.438 (73.891) Acc@5 92.676 (92.266) [2022-01-21 05:26:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.053 (2.211) Loss 1.0234 (1.1200) Acc@1 77.344 (74.126) Acc@5 93.945 (92.302) [2022-01-21 05:26:55 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.130 Acc@5 92.338 [2022-01-21 05:26:55 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.1% [2022-01-21 05:26:55 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.25% [2022-01-21 05:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][0/1251] eta 7:22:15 lr 0.000687 time 21.2117 (21.2117) loss 3.8590 (3.8590) grad_norm 1.4323 (1.4323) [2022-01-21 05:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][10/1251] eta 1:22:04 lr 0.000687 time 1.9361 (3.9685) loss 4.1762 (3.6982) grad_norm 1.4278 (1.3779) [2022-01-21 05:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][20/1251] eta 1:03:48 lr 0.000687 time 1.7380 (3.1102) loss 4.1919 (3.7950) grad_norm 1.3057 (1.3104) [2022-01-21 05:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][30/1251] eta 0:56:01 lr 0.000687 time 1.9867 (2.7532) loss 3.3838 (3.6704) grad_norm 1.2176 (1.2965) [2022-01-21 05:28:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][40/1251] eta 0:55:49 lr 0.000687 time 3.9225 (2.7662) loss 3.4391 (3.6108) grad_norm 1.0643 (1.3008) [2022-01-21 05:29:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][50/1251] eta 0:53:35 lr 0.000687 time 2.5014 (2.6770) loss 3.9149 (3.5934) grad_norm 1.3350 (1.2911) [2022-01-21 05:29:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][60/1251] eta 0:51:39 lr 0.000687 time 2.0927 (2.6024) loss 4.1444 (3.6195) grad_norm 1.3306 (1.2843) [2022-01-21 05:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][70/1251] eta 0:49:50 lr 0.000687 time 1.9041 (2.5320) loss 2.8611 (3.6165) grad_norm 1.2203 (1.2864) [2022-01-21 05:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][80/1251] eta 0:48:34 lr 0.000687 time 2.9076 (2.4887) loss 2.9635 (3.5997) grad_norm 1.3245 (1.2903) [2022-01-21 05:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][90/1251] eta 0:47:11 lr 0.000687 time 1.9566 (2.4390) loss 3.4176 (3.6022) grad_norm 1.2984 (1.2949) [2022-01-21 05:30:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][100/1251] eta 0:46:11 lr 0.000687 time 1.9159 (2.4078) loss 4.3831 (3.6177) grad_norm 1.6162 (1.2933) [2022-01-21 05:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][110/1251] eta 0:45:18 lr 0.000687 time 2.0393 (2.3827) loss 4.0171 (3.6116) grad_norm 1.3044 (1.2978) [2022-01-21 05:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][120/1251] eta 0:44:52 lr 0.000687 time 3.5600 (2.3808) loss 4.0313 (3.6255) grad_norm 1.1863 (1.2967) [2022-01-21 05:32:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][130/1251] eta 0:44:21 lr 0.000687 time 1.4777 (2.3746) loss 3.7448 (3.6422) grad_norm 1.6462 (1.3021) [2022-01-21 05:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][140/1251] eta 0:43:40 lr 0.000687 time 1.6964 (2.3586) loss 4.0973 (3.6446) grad_norm 1.4276 (1.3015) [2022-01-21 05:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][150/1251] eta 0:43:00 lr 0.000687 
time 2.2372 (2.3434) loss 3.2369 (3.6524) grad_norm 1.3232 (1.3008) [2022-01-21 05:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][160/1251] eta 0:42:25 lr 0.000687 time 3.3886 (2.3335) loss 3.8166 (3.6610) grad_norm 1.3461 (1.3090) [2022-01-21 05:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][170/1251] eta 0:41:52 lr 0.000687 time 1.8685 (2.3240) loss 3.5631 (3.6576) grad_norm 1.2917 (1.3094) [2022-01-21 05:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][180/1251] eta 0:41:12 lr 0.000687 time 2.1343 (2.3089) loss 3.6590 (3.6602) grad_norm 1.1887 (1.3078) [2022-01-21 05:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][190/1251] eta 0:40:40 lr 0.000686 time 1.6159 (2.2998) loss 3.5113 (3.6672) grad_norm 1.2314 (1.3064) [2022-01-21 05:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][200/1251] eta 0:40:14 lr 0.000686 time 3.0876 (2.2976) loss 4.4454 (3.6656) grad_norm 1.2364 (1.3050) [2022-01-21 05:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][210/1251] eta 0:39:44 lr 0.000686 time 2.2479 (2.2903) loss 2.8463 (3.6543) grad_norm 1.3029 (1.3079) [2022-01-21 05:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][220/1251] eta 0:39:15 lr 0.000686 time 1.7359 (2.2848) loss 3.9456 (3.6494) grad_norm 1.2172 (1.3079) [2022-01-21 05:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][230/1251] eta 0:38:48 lr 0.000686 time 2.1732 (2.2810) loss 3.5798 (3.6507) grad_norm 1.2578 (1.3068) [2022-01-21 05:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][240/1251] eta 0:38:24 lr 0.000686 time 3.0968 (2.2797) loss 2.8859 (3.6475) grad_norm 1.2645 (1.3066) [2022-01-21 05:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][250/1251] eta 0:38:06 lr 0.000686 time 3.1144 (2.2846) loss 3.2369 (3.6485) grad_norm 1.0911 (1.3045) [2022-01-21 05:36:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][260/1251] eta 0:37:42 lr 0.000686 time 1.9300 (2.2826) loss 4.0245 (3.6548) grad_norm 1.3771 (1.3063) [2022-01-21 05:37:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][270/1251] eta 0:37:18 lr 0.000686 time 2.4400 (2.2816) loss 4.1367 (3.6644) grad_norm 1.2379 (1.3075) [2022-01-21 05:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][280/1251] eta 0:36:48 lr 0.000686 time 2.2169 (2.2748) loss 4.0351 (3.6643) grad_norm 1.2473 (1.3069) [2022-01-21 05:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][290/1251] eta 0:36:14 lr 0.000686 time 2.1102 (2.2623) loss 4.0939 (3.6655) grad_norm 1.2777 (1.3067) [2022-01-21 05:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][300/1251] eta 0:35:45 lr 0.000686 time 1.9674 (2.2561) loss 3.7550 (3.6584) grad_norm 1.3512 (1.3072) [2022-01-21 05:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][310/1251] eta 0:35:23 lr 0.000686 time 2.7150 (2.2565) loss 4.2147 (3.6543) grad_norm 1.6050 (1.3100) [2022-01-21 05:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][320/1251] eta 0:35:00 lr 0.000686 time 2.1003 (2.2561) loss 3.3777 (3.6551) grad_norm 1.2299 (1.3099) [2022-01-21 05:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][330/1251] eta 0:34:35 lr 0.000686 time 2.1055 (2.2534) loss 4.1672 (3.6547) grad_norm 1.2359 (1.3084) [2022-01-21 05:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][340/1251] eta 0:34:12 lr 0.000686 time 1.8927 (2.2529) loss 3.0648 (3.6437) grad_norm 1.4307 (1.3078) [2022-01-21 05:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][350/1251] eta 0:33:49 lr 0.000686 time 2.3095 (2.2530) loss 4.5322 (3.6483) grad_norm 1.1757 (1.3054) [2022-01-21 05:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][360/1251] eta 0:33:22 lr 
0.000686 time 1.5589 (2.2471) loss 4.2589 (3.6529) grad_norm 1.2718 (1.3053) [2022-01-21 05:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][370/1251] eta 0:33:00 lr 0.000686 time 3.1814 (2.2476) loss 4.3549 (3.6522) grad_norm 1.2758 (1.3067) [2022-01-21 05:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][380/1251] eta 0:32:33 lr 0.000686 time 1.9107 (2.2429) loss 3.4176 (3.6525) grad_norm 1.2950 (1.3063) [2022-01-21 05:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][390/1251] eta 0:32:04 lr 0.000686 time 2.1652 (2.2347) loss 4.1551 (3.6516) grad_norm 1.5339 (1.3087) [2022-01-21 05:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][400/1251] eta 0:31:37 lr 0.000686 time 1.9887 (2.2293) loss 3.9855 (3.6585) grad_norm 1.3279 (1.3098) [2022-01-21 05:42:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][410/1251] eta 0:31:12 lr 0.000686 time 2.2949 (2.2259) loss 2.3723 (3.6531) grad_norm 1.2702 (1.3090) [2022-01-21 05:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][420/1251] eta 0:30:47 lr 0.000686 time 2.0068 (2.2235) loss 2.8343 (3.6480) grad_norm 1.3868 (1.3092) [2022-01-21 05:42:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][430/1251] eta 0:30:25 lr 0.000686 time 2.2284 (2.2235) loss 3.6039 (3.6514) grad_norm 1.3298 (1.3094) [2022-01-21 05:43:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][440/1251] eta 0:30:06 lr 0.000686 time 2.5483 (2.2271) loss 4.1150 (3.6483) grad_norm 1.7430 (1.3113) [2022-01-21 05:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][450/1251] eta 0:29:46 lr 0.000685 time 1.8417 (2.2302) loss 3.0962 (3.6479) grad_norm 1.2542 (1.3114) [2022-01-21 05:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][460/1251] eta 0:29:25 lr 0.000685 time 2.4447 (2.2322) loss 4.0947 (3.6523) grad_norm 1.4864 (1.3103) [2022-01-21 05:44:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][470/1251] eta 0:29:03 lr 0.000685 time 1.7085 (2.2318) loss 3.6779 (3.6574) grad_norm 1.3659 (1.3108) [2022-01-21 05:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][480/1251] eta 0:28:39 lr 0.000685 time 1.6777 (2.2307) loss 3.6304 (3.6479) grad_norm 1.4081 (1.3106) [2022-01-21 05:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][490/1251] eta 0:28:16 lr 0.000685 time 2.0274 (2.2293) loss 4.4312 (3.6478) grad_norm 1.3642 (1.3109) [2022-01-21 05:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][500/1251] eta 0:27:52 lr 0.000685 time 2.5992 (2.2273) loss 2.7367 (3.6394) grad_norm 1.2352 (1.3109) [2022-01-21 05:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][510/1251] eta 0:27:27 lr 0.000685 time 2.0019 (2.2228) loss 3.8177 (3.6393) grad_norm 1.3822 (1.3101) [2022-01-21 05:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][520/1251] eta 0:27:03 lr 0.000685 time 1.8053 (2.2208) loss 4.5606 (3.6392) grad_norm 1.3136 (1.3106) [2022-01-21 05:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][530/1251] eta 0:26:40 lr 0.000685 time 2.0025 (2.2201) loss 3.9951 (3.6397) grad_norm 1.2758 (1.3105) [2022-01-21 05:46:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][540/1251] eta 0:26:18 lr 0.000685 time 2.1443 (2.2200) loss 3.4993 (3.6363) grad_norm 1.1763 (1.3114) [2022-01-21 05:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][550/1251] eta 0:25:54 lr 0.000685 time 1.6448 (2.2170) loss 3.0613 (3.6320) grad_norm 1.2853 (1.3120) [2022-01-21 05:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][560/1251] eta 0:25:31 lr 0.000685 time 1.8490 (2.2159) loss 4.2547 (3.6324) grad_norm 1.2478 (1.3118) [2022-01-21 05:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][570/1251] eta 0:25:10 lr 
0.000685 time 2.3135 (2.2178) loss 3.2879 (3.6265) grad_norm 1.3833 (1.3115) [2022-01-21 05:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][580/1251] eta 0:24:50 lr 0.000685 time 3.3800 (2.2212) loss 2.5508 (3.6262) grad_norm 1.2609 (1.3119) [2022-01-21 05:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][590/1251] eta 0:24:27 lr 0.000685 time 1.9664 (2.2206) loss 2.7031 (3.6213) grad_norm 1.4317 (1.3121) [2022-01-21 05:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][600/1251] eta 0:24:03 lr 0.000685 time 1.6010 (2.2176) loss 4.2365 (3.6243) grad_norm 1.2388 (1.3113) [2022-01-21 05:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][610/1251] eta 0:23:42 lr 0.000685 time 1.8678 (2.2192) loss 4.0462 (3.6241) grad_norm 1.5614 (1.3123) [2022-01-21 05:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][620/1251] eta 0:23:20 lr 0.000685 time 3.3736 (2.2191) loss 4.1157 (3.6327) grad_norm 1.2919 (1.3120) [2022-01-21 05:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][630/1251] eta 0:22:55 lr 0.000685 time 1.9382 (2.2149) loss 4.3015 (3.6351) grad_norm 1.6947 (1.3119) [2022-01-21 05:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][640/1251] eta 0:22:32 lr 0.000685 time 2.1864 (2.2132) loss 4.3685 (3.6405) grad_norm 1.2881 (1.3106) [2022-01-21 05:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][650/1251] eta 0:22:10 lr 0.000685 time 2.3246 (2.2144) loss 2.6020 (3.6383) grad_norm 1.2197 (1.3099) [2022-01-21 05:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][660/1251] eta 0:21:50 lr 0.000685 time 3.6361 (2.2177) loss 3.9487 (3.6365) grad_norm 1.2584 (1.3102) [2022-01-21 05:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][670/1251] eta 0:21:27 lr 0.000685 time 1.6365 (2.2155) loss 3.8740 (3.6378) grad_norm 1.1711 (1.3097) [2022-01-21 05:52:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][680/1251] eta 0:21:04 lr 0.000685 time 2.0221 (2.2138) loss 3.6332 (3.6383) grad_norm 1.2223 (1.3101) [2022-01-21 05:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][690/1251] eta 0:20:41 lr 0.000685 time 2.1420 (2.2137) loss 4.4076 (3.6394) grad_norm 1.4274 (1.3092) [2022-01-21 05:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][700/1251] eta 0:20:20 lr 0.000685 time 3.4056 (2.2157) loss 2.9885 (3.6400) grad_norm 1.4214 (1.3090) [2022-01-21 05:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][710/1251] eta 0:19:59 lr 0.000684 time 2.2894 (2.2170) loss 4.1780 (3.6416) grad_norm 1.1051 (1.3094) [2022-01-21 05:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][720/1251] eta 0:19:37 lr 0.000684 time 2.6634 (2.2173) loss 4.2455 (3.6401) grad_norm 1.2291 (1.3091) [2022-01-21 05:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][730/1251] eta 0:19:14 lr 0.000684 time 1.7059 (2.2151) loss 3.1201 (3.6414) grad_norm 1.3141 (1.3100) [2022-01-21 05:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][740/1251] eta 0:18:51 lr 0.000684 time 3.1438 (2.2151) loss 2.5375 (3.6394) grad_norm 1.2764 (1.3097) [2022-01-21 05:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][750/1251] eta 0:18:28 lr 0.000684 time 1.9719 (2.2117) loss 3.4021 (3.6440) grad_norm 1.3208 (1.3103) [2022-01-21 05:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][760/1251] eta 0:18:05 lr 0.000684 time 2.5451 (2.2106) loss 4.0493 (3.6436) grad_norm 1.6865 (1.3111) [2022-01-21 05:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][770/1251] eta 0:17:42 lr 0.000684 time 2.0280 (2.2096) loss 3.5930 (3.6464) grad_norm 1.1937 (1.3111) [2022-01-21 05:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][780/1251] eta 0:17:21 lr 
0.000684 time 2.5310 (2.2111) loss 3.7379 (3.6426) grad_norm 1.1351 (1.3114) [2022-01-21 05:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][790/1251] eta 0:16:58 lr 0.000684 time 1.9746 (2.2103) loss 3.8710 (3.6446) grad_norm 1.2674 (1.3109) [2022-01-21 05:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][800/1251] eta 0:16:35 lr 0.000684 time 1.6081 (2.2080) loss 4.3950 (3.6464) grad_norm 1.1782 (1.3105) [2022-01-21 05:56:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][810/1251] eta 0:16:13 lr 0.000684 time 2.1256 (2.2071) loss 2.6809 (3.6451) grad_norm 1.2030 (1.3112) [2022-01-21 05:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][820/1251] eta 0:15:52 lr 0.000684 time 1.8605 (2.2097) loss 3.0662 (3.6417) grad_norm 1.7504 (1.3118) [2022-01-21 05:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][830/1251] eta 0:15:29 lr 0.000684 time 1.5526 (2.2087) loss 4.1676 (3.6434) grad_norm 1.1674 (1.3110) [2022-01-21 05:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][840/1251] eta 0:15:08 lr 0.000684 time 1.9660 (2.2103) loss 4.1010 (3.6479) grad_norm 1.3601 (1.3111) [2022-01-21 05:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][850/1251] eta 0:14:46 lr 0.000684 time 2.7964 (2.2108) loss 3.5836 (3.6476) grad_norm 1.2518 (1.3110) [2022-01-21 05:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][860/1251] eta 0:14:24 lr 0.000684 time 2.2141 (2.2102) loss 3.6444 (3.6524) grad_norm 1.2555 (1.3108) [2022-01-21 05:58:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][870/1251] eta 0:14:01 lr 0.000684 time 1.5928 (2.2087) loss 4.0466 (3.6518) grad_norm 1.2176 (1.3098) [2022-01-21 05:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][880/1251] eta 0:13:38 lr 0.000684 time 2.2316 (2.2072) loss 3.7933 (3.6518) grad_norm 1.2289 (1.3106) [2022-01-21 05:59:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][890/1251] eta 0:13:16 lr 0.000684 time 1.5716 (2.2062) loss 3.8481 (3.6527) grad_norm 1.3307 (1.3108) [2022-01-21 06:00:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][900/1251] eta 0:12:54 lr 0.000684 time 2.0123 (2.2063) loss 3.6352 (3.6535) grad_norm 1.2138 (1.3098) [2022-01-21 06:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][910/1251] eta 0:12:32 lr 0.000684 time 1.9570 (2.2065) loss 3.3560 (3.6516) grad_norm 1.0753 (1.3094) [2022-01-21 06:00:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][920/1251] eta 0:12:10 lr 0.000684 time 2.1996 (2.2064) loss 4.0542 (3.6528) grad_norm 1.1032 (1.3089) [2022-01-21 06:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][930/1251] eta 0:11:48 lr 0.000684 time 2.1941 (2.2057) loss 4.2146 (3.6558) grad_norm 1.3715 (1.3089) [2022-01-21 06:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][940/1251] eta 0:11:25 lr 0.000684 time 2.4402 (2.2054) loss 3.5104 (3.6559) grad_norm 1.2302 (1.3087) [2022-01-21 06:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][950/1251] eta 0:11:03 lr 0.000684 time 2.1399 (2.2034) loss 3.6264 (3.6560) grad_norm 1.1964 (1.3083) [2022-01-21 06:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][960/1251] eta 0:10:40 lr 0.000684 time 1.7881 (2.2020) loss 4.0814 (3.6561) grad_norm 1.2315 (1.3084) [2022-01-21 06:02:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][970/1251] eta 0:10:18 lr 0.000683 time 2.0931 (2.2016) loss 3.6228 (3.6546) grad_norm 1.3907 (1.3091) [2022-01-21 06:02:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][980/1251] eta 0:09:56 lr 0.000683 time 2.2600 (2.2013) loss 3.9834 (3.6550) grad_norm 1.1307 (1.3088) [2022-01-21 06:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][990/1251] eta 0:09:34 lr 
0.000683 time 2.2488 (2.2017) loss 4.2408 (3.6538) grad_norm 1.4407 (1.3087) [2022-01-21 06:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1000/1251] eta 0:09:12 lr 0.000683 time 2.4478 (2.2028) loss 3.5009 (3.6502) grad_norm 1.1740 (1.3077) [2022-01-21 06:04:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1010/1251] eta 0:08:51 lr 0.000683 time 2.2239 (2.2039) loss 3.5878 (3.6487) grad_norm 1.3313 (1.3075) [2022-01-21 06:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1020/1251] eta 0:08:29 lr 0.000683 time 2.4254 (2.2052) loss 3.6074 (3.6464) grad_norm 1.2810 (1.3082) [2022-01-21 06:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1030/1251] eta 0:08:07 lr 0.000683 time 2.2377 (2.2044) loss 3.9652 (3.6466) grad_norm 1.2824 (1.3090) [2022-01-21 06:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1040/1251] eta 0:07:44 lr 0.000683 time 1.8616 (2.2011) loss 4.3190 (3.6468) grad_norm 1.4322 (1.3090) [2022-01-21 06:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1050/1251] eta 0:07:21 lr 0.000683 time 1.8256 (2.1984) loss 3.9975 (3.6480) grad_norm 1.3678 (1.3096) [2022-01-21 06:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1060/1251] eta 0:06:59 lr 0.000683 time 2.2027 (2.1972) loss 3.8761 (3.6482) grad_norm 1.2146 (1.3092) [2022-01-21 06:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1070/1251] eta 0:06:37 lr 0.000683 time 2.7005 (2.1966) loss 3.7790 (3.6474) grad_norm 1.5657 (1.3093) [2022-01-21 06:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1080/1251] eta 0:06:15 lr 0.000683 time 1.6777 (2.1972) loss 3.4916 (3.6466) grad_norm 1.4530 (1.3105) [2022-01-21 06:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1090/1251] eta 0:05:53 lr 0.000683 time 2.0343 (2.1975) loss 3.4842 (3.6437) grad_norm 1.1612 (1.3103) [2022-01-21 
06:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1100/1251] eta 0:05:32 lr 0.000683 time 2.5443 (2.2000) loss 3.8812 (3.6461) grad_norm 1.5263 (1.3112) [2022-01-21 06:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1110/1251] eta 0:05:10 lr 0.000683 time 2.6112 (2.2016) loss 4.0133 (3.6480) grad_norm 1.1618 (1.3109) [2022-01-21 06:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1120/1251] eta 0:04:48 lr 0.000683 time 1.5618 (2.2026) loss 4.0498 (3.6499) grad_norm 1.3580 (1.3108) [2022-01-21 06:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1130/1251] eta 0:04:26 lr 0.000683 time 1.8381 (2.2012) loss 4.2231 (3.6498) grad_norm 1.3181 (1.3109) [2022-01-21 06:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1140/1251] eta 0:04:04 lr 0.000683 time 1.8518 (2.1997) loss 3.7505 (3.6477) grad_norm 1.5315 (1.3110) [2022-01-21 06:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1150/1251] eta 0:03:41 lr 0.000683 time 2.1663 (2.1979) loss 3.6049 (3.6507) grad_norm 1.2410 (1.3110) [2022-01-21 06:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1160/1251] eta 0:03:19 lr 0.000683 time 3.0750 (2.1976) loss 3.9026 (3.6497) grad_norm 1.3120 (1.3111) [2022-01-21 06:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1170/1251] eta 0:02:57 lr 0.000683 time 1.9936 (2.1971) loss 4.1041 (3.6496) grad_norm 1.1429 (1.3116) [2022-01-21 06:10:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1180/1251] eta 0:02:35 lr 0.000683 time 1.7414 (2.1960) loss 3.8498 (3.6495) grad_norm 1.1315 (1.3119) [2022-01-21 06:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1190/1251] eta 0:02:13 lr 0.000683 time 2.0255 (2.1952) loss 4.1568 (3.6472) grad_norm 1.3282 (1.3127) [2022-01-21 06:10:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1200/1251] 
eta 0:01:51 lr 0.000683 time 2.2196 (2.1942) loss 3.8764 (3.6494) grad_norm 1.3113 (1.3127) [2022-01-21 06:11:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1210/1251] eta 0:01:29 lr 0.000683 time 1.8760 (2.1948) loss 4.2726 (3.6457) grad_norm 1.1907 (1.3126) [2022-01-21 06:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1220/1251] eta 0:01:08 lr 0.000683 time 2.2408 (2.1964) loss 3.9935 (3.6443) grad_norm 1.4213 (1.3126) [2022-01-21 06:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1230/1251] eta 0:00:46 lr 0.000682 time 2.7700 (2.1977) loss 4.2326 (3.6447) grad_norm 1.1198 (1.3123) [2022-01-21 06:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1240/1251] eta 0:00:24 lr 0.000682 time 2.0964 (2.1975) loss 4.1808 (3.6461) grad_norm 1.3542 (1.3121) [2022-01-21 06:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [114/300][1250/1251] eta 0:00:02 lr 0.000682 time 1.1857 (2.1920) loss 4.1335 (3.6439) grad_norm 1.3274 (1.3118) [2022-01-21 06:12:37 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 114 training takes 0:45:42 [2022-01-21 06:12:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.790 (18.790) Loss 1.1055 (1.1055) Acc@1 73.340 (73.340) Acc@5 92.578 (92.578) [2022-01-21 06:13:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.286 (3.574) Loss 1.1521 (1.0802) Acc@1 72.754 (74.379) Acc@5 91.797 (92.676) [2022-01-21 06:13:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.645 (2.724) Loss 1.0459 (1.0879) Acc@1 75.195 (74.414) Acc@5 92.969 (92.666) [2022-01-21 06:13:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.241 (2.476) Loss 1.1364 (1.0936) Acc@1 73.926 (74.197) Acc@5 91.602 (92.540) [2022-01-21 06:14:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.073 (2.274) Loss 1.0885 (1.0989) Acc@1 73.438 (74.131) Acc@5 93.164 (92.483) 
[2022-01-21 06:14:18 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.178 Acc@5 92.454 [2022-01-21 06:14:18 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.2% [2022-01-21 06:14:18 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.25% [2022-01-21 06:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][0/1251] eta 7:32:20 lr 0.000682 time 21.6950 (21.6950) loss 4.3458 (4.3458) grad_norm 1.1847 (1.1847) [2022-01-21 06:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][10/1251] eta 1:24:01 lr 0.000682 time 1.9050 (4.0622) loss 3.3009 (3.8459) grad_norm 1.1080 (1.2128) [2022-01-21 06:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][20/1251] eta 1:05:53 lr 0.000682 time 2.0984 (3.2114) loss 4.1216 (3.7608) grad_norm 1.2705 (1.2638) [2022-01-21 06:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][30/1251] eta 0:57:40 lr 0.000682 time 1.2182 (2.8341) loss 3.5616 (3.6323) grad_norm 1.3018 (1.2908) [2022-01-21 06:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][40/1251] eta 0:55:34 lr 0.000682 time 4.4326 (2.7532) loss 4.3891 (3.6317) grad_norm 1.3821 (1.3142) [2022-01-21 06:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][50/1251] eta 0:52:48 lr 0.000682 time 1.2933 (2.6382) loss 3.1167 (3.6019) grad_norm 1.0617 (1.3154) [2022-01-21 06:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][60/1251] eta 0:51:20 lr 0.000682 time 1.4191 (2.5863) loss 3.8897 (3.6280) grad_norm 1.2177 (1.3125) [2022-01-21 06:17:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][70/1251] eta 0:49:40 lr 0.000682 time 1.6768 (2.5234) loss 3.5437 (3.6685) grad_norm 1.3006 (1.3095) [2022-01-21 06:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][80/1251] eta 0:48:41 lr 0.000682 time 3.1542 (2.4950) loss 2.7478 (3.6343) 
grad_norm 1.4658 (1.3144) [2022-01-21 06:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][90/1251] eta 0:47:18 lr 0.000682 time 1.6814 (2.4452) loss 3.5982 (3.6105) grad_norm 1.1885 (1.3209) [2022-01-21 06:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][100/1251] eta 0:46:06 lr 0.000682 time 1.8897 (2.4032) loss 2.7480 (3.6227) grad_norm 1.2047 (1.3156) [2022-01-21 06:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][110/1251] eta 0:45:05 lr 0.000682 time 1.7546 (2.3708) loss 3.7921 (3.6436) grad_norm 1.2123 (1.3114) [2022-01-21 06:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][120/1251] eta 0:44:43 lr 0.000682 time 3.8726 (2.3723) loss 3.9626 (3.6361) grad_norm 1.3171 (1.3083) [2022-01-21 06:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][130/1251] eta 0:44:03 lr 0.000682 time 1.8667 (2.3582) loss 3.6448 (3.6439) grad_norm 1.1497 (1.3041) [2022-01-21 06:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][140/1251] eta 0:43:35 lr 0.000682 time 1.9637 (2.3540) loss 3.8469 (3.6236) grad_norm 1.2568 (1.2987) [2022-01-21 06:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][150/1251] eta 0:42:55 lr 0.000682 time 2.0087 (2.3397) loss 3.6344 (3.6246) grad_norm 1.1341 (1.2978) [2022-01-21 06:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][160/1251] eta 0:42:21 lr 0.000682 time 2.7912 (2.3292) loss 4.2069 (3.6256) grad_norm 1.1795 (1.2991) [2022-01-21 06:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][170/1251] eta 0:41:52 lr 0.000682 time 2.2341 (2.3242) loss 3.6647 (3.6275) grad_norm 1.2504 (1.2979) [2022-01-21 06:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][180/1251] eta 0:41:22 lr 0.000682 time 1.5666 (2.3179) loss 4.0407 (3.6354) grad_norm 1.0983 (1.2945) [2022-01-21 06:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [115/300][190/1251] eta 0:40:54 lr 0.000682 time 2.1173 (2.3138) loss 4.0974 (3.6203) grad_norm 1.2663 (1.2993) [2022-01-21 06:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][200/1251] eta 0:40:24 lr 0.000682 time 2.2586 (2.3073) loss 4.0413 (3.6148) grad_norm 1.3009 (1.3005) [2022-01-21 06:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][210/1251] eta 0:39:47 lr 0.000682 time 2.1446 (2.2939) loss 4.5162 (3.6341) grad_norm 1.0983 (1.2958) [2022-01-21 06:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][220/1251] eta 0:39:12 lr 0.000682 time 1.8212 (2.2822) loss 3.2577 (3.6411) grad_norm 1.2999 (1.2966) [2022-01-21 06:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][230/1251] eta 0:38:42 lr 0.000682 time 1.8803 (2.2748) loss 4.4585 (3.6537) grad_norm 1.5652 (1.2962) [2022-01-21 06:23:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][240/1251] eta 0:38:19 lr 0.000681 time 2.8097 (2.2747) loss 3.8665 (3.6599) grad_norm 1.1580 (1.2982) [2022-01-21 06:23:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][250/1251] eta 0:37:56 lr 0.000681 time 2.2068 (2.2739) loss 4.0619 (3.6491) grad_norm 1.1969 (1.2968) [2022-01-21 06:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][260/1251] eta 0:37:34 lr 0.000681 time 2.6559 (2.2753) loss 4.0368 (3.6519) grad_norm 1.1900 (1.2953) [2022-01-21 06:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][270/1251] eta 0:37:10 lr 0.000681 time 2.7815 (2.2737) loss 3.6750 (3.6574) grad_norm 1.2007 (1.2945) [2022-01-21 06:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][280/1251] eta 0:36:41 lr 0.000681 time 1.9630 (2.2674) loss 3.0312 (3.6434) grad_norm 1.2498 (1.2924) [2022-01-21 06:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][290/1251] eta 0:36:10 lr 0.000681 time 1.9063 (2.2590) loss 3.8078 (3.6466) 
grad_norm 1.2075 (1.2932) [2022-01-21 06:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][300/1251] eta 0:35:48 lr 0.000681 time 2.0617 (2.2589) loss 4.4436 (3.6468) grad_norm 1.3660 (1.2941) [2022-01-21 06:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][310/1251] eta 0:35:27 lr 0.000681 time 3.0272 (2.2610) loss 3.6076 (3.6414) grad_norm 1.6890 (1.2975) [2022-01-21 06:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][320/1251] eta 0:34:58 lr 0.000681 time 1.5612 (2.2542) loss 3.3346 (3.6374) grad_norm 1.3420 (1.2984) [2022-01-21 06:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][330/1251] eta 0:34:32 lr 0.000681 time 1.9473 (2.2500) loss 3.9598 (3.6481) grad_norm 1.3006 (1.2986) [2022-01-21 06:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][340/1251] eta 0:34:07 lr 0.000681 time 2.1778 (2.2473) loss 3.5708 (3.6516) grad_norm 1.3007 (1.2983) [2022-01-21 06:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][350/1251] eta 0:33:43 lr 0.000681 time 2.2526 (2.2460) loss 3.4979 (3.6501) grad_norm 1.3204 (1.2987) [2022-01-21 06:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][360/1251] eta 0:33:18 lr 0.000681 time 1.7292 (2.2435) loss 4.2334 (3.6528) grad_norm 1.1420 (1.2989) [2022-01-21 06:28:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][370/1251] eta 0:32:54 lr 0.000681 time 2.0594 (2.2416) loss 4.7405 (3.6523) grad_norm 1.2833 (1.3003) [2022-01-21 06:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][380/1251] eta 0:32:32 lr 0.000681 time 1.9225 (2.2419) loss 3.3565 (3.6444) grad_norm 1.2537 (1.3020) [2022-01-21 06:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][390/1251] eta 0:32:08 lr 0.000681 time 2.2328 (2.2400) loss 3.9160 (3.6423) grad_norm 1.5259 (1.3035) [2022-01-21 06:29:19 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [115/300][400/1251] eta 0:31:52 lr 0.000681 time 1.8164 (2.2478) loss 2.8137 (3.6425) grad_norm 1.2371 (1.3030) [2022-01-21 06:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][410/1251] eta 0:31:27 lr 0.000681 time 1.5715 (2.2439) loss 4.3807 (3.6409) grad_norm 1.3490 (1.3033) [2022-01-21 06:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][420/1251] eta 0:30:57 lr 0.000681 time 2.0691 (2.2358) loss 4.0516 (3.6372) grad_norm 1.1055 (1.3026) [2022-01-21 06:30:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][430/1251] eta 0:30:32 lr 0.000681 time 2.9612 (2.2326) loss 3.7354 (3.6376) grad_norm 1.2380 (1.3026) [2022-01-21 06:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][440/1251] eta 0:30:13 lr 0.000681 time 2.1444 (2.2367) loss 4.0962 (3.6402) grad_norm 1.3667 (1.3016) [2022-01-21 06:31:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][450/1251] eta 0:29:53 lr 0.000681 time 1.5490 (2.2386) loss 4.1181 (3.6421) grad_norm 1.2949 (1.3025) [2022-01-21 06:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][460/1251] eta 0:29:28 lr 0.000681 time 2.5721 (2.2358) loss 2.8492 (3.6364) grad_norm 1.4289 (1.3031) [2022-01-21 06:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][470/1251] eta 0:29:08 lr 0.000681 time 3.4620 (2.2382) loss 3.5779 (3.6333) grad_norm 1.1518 (1.3026) [2022-01-21 06:32:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][480/1251] eta 0:28:44 lr 0.000681 time 2.5303 (2.2365) loss 2.8350 (3.6318) grad_norm 1.2461 (1.3030) [2022-01-21 06:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][490/1251] eta 0:28:20 lr 0.000680 time 1.6088 (2.2346) loss 3.8333 (3.6288) grad_norm 1.3189 (1.3023) [2022-01-21 06:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][500/1251] eta 0:27:56 lr 0.000680 time 1.8638 (2.2329) loss 2.5477 (3.6306) 
grad_norm 1.2368 (1.3040) [2022-01-21 06:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][510/1251] eta 0:27:31 lr 0.000680 time 1.9338 (2.2294) loss 3.7397 (3.6305) grad_norm 1.2670 (1.3044) [2022-01-21 06:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][520/1251] eta 0:27:05 lr 0.000680 time 1.8274 (2.2240) loss 3.8497 (3.6377) grad_norm 1.7290 (1.3073) [2022-01-21 06:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][530/1251] eta 0:26:40 lr 0.000680 time 1.9392 (2.2203) loss 3.5107 (3.6346) grad_norm 1.1619 (1.3060) [2022-01-21 06:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][540/1251] eta 0:26:15 lr 0.000680 time 2.0640 (2.2165) loss 3.9847 (3.6345) grad_norm 1.2525 (1.3063) [2022-01-21 06:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][550/1251] eta 0:25:53 lr 0.000680 time 2.2305 (2.2166) loss 3.7872 (3.6342) grad_norm 1.2483 (1.3078) [2022-01-21 06:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][560/1251] eta 0:25:33 lr 0.000680 time 2.4318 (2.2187) loss 3.5432 (3.6333) grad_norm 1.4684 (1.3105) [2022-01-21 06:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][570/1251] eta 0:25:11 lr 0.000680 time 2.3378 (2.2192) loss 3.9469 (3.6331) grad_norm 1.1883 (1.3120) [2022-01-21 06:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][580/1251] eta 0:24:49 lr 0.000680 time 2.1018 (2.2198) loss 2.9941 (3.6313) grad_norm 1.1524 (1.3111) [2022-01-21 06:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][590/1251] eta 0:24:28 lr 0.000680 time 2.0095 (2.2218) loss 3.3891 (3.6316) grad_norm 1.4984 (1.3122) [2022-01-21 06:36:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][600/1251] eta 0:24:07 lr 0.000680 time 2.0862 (2.2236) loss 3.9751 (3.6301) grad_norm 1.4599 (1.3123) [2022-01-21 06:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [115/300][610/1251] eta 0:23:45 lr 0.000680 time 1.5965 (2.2233) loss 2.5082 (3.6267) grad_norm 1.2574 (1.3121) [2022-01-21 06:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][620/1251] eta 0:23:21 lr 0.000680 time 3.1587 (2.2217) loss 3.4534 (3.6261) grad_norm 1.3676 (1.3130) [2022-01-21 06:37:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][630/1251] eta 0:22:57 lr 0.000680 time 1.9424 (2.2182) loss 3.0492 (3.6246) grad_norm 1.2887 (1.3127) [2022-01-21 06:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][640/1251] eta 0:22:33 lr 0.000680 time 2.1020 (2.2154) loss 3.4283 (3.6264) grad_norm 1.3183 (1.3123) [2022-01-21 06:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][650/1251] eta 0:22:11 lr 0.000680 time 1.8889 (2.2147) loss 3.8040 (3.6234) grad_norm 1.4269 (1.3109) [2022-01-21 06:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][660/1251] eta 0:21:51 lr 0.000680 time 1.6814 (2.2197) loss 3.2161 (3.6239) grad_norm 1.2063 (1.3102) [2022-01-21 06:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][670/1251] eta 0:21:29 lr 0.000680 time 1.8414 (2.2197) loss 4.1297 (3.6236) grad_norm 1.2064 (1.3091) [2022-01-21 06:39:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][680/1251] eta 0:21:07 lr 0.000680 time 2.1828 (2.2203) loss 4.4079 (3.6295) grad_norm 1.2097 (1.3090) [2022-01-21 06:39:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][690/1251] eta 0:20:44 lr 0.000680 time 1.9328 (2.2187) loss 4.4243 (3.6314) grad_norm 1.2705 (1.3086) [2022-01-21 06:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][700/1251] eta 0:20:22 lr 0.000680 time 2.4695 (2.2179) loss 3.6478 (3.6362) grad_norm 1.2939 (1.3085) [2022-01-21 06:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][710/1251] eta 0:19:58 lr 0.000680 time 2.0186 (2.2157) loss 2.4974 (3.6373) 
grad_norm 1.1754 (1.3089) [2022-01-21 06:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][720/1251] eta 0:19:36 lr 0.000680 time 2.1504 (2.2155) loss 4.2764 (3.6358) grad_norm 1.2895 (1.3084) [2022-01-21 06:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][730/1251] eta 0:19:14 lr 0.000680 time 1.6509 (2.2159) loss 4.5854 (3.6365) grad_norm 1.2812 (1.3083) [2022-01-21 06:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][740/1251] eta 0:18:53 lr 0.000680 time 4.5641 (2.2175) loss 3.4670 (3.6389) grad_norm 1.1819 (1.3101) [2022-01-21 06:42:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][750/1251] eta 0:18:31 lr 0.000679 time 2.1949 (2.2186) loss 3.6608 (3.6410) grad_norm 1.5157 (1.3124) [2022-01-21 06:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][760/1251] eta 0:18:07 lr 0.000679 time 1.9101 (2.2155) loss 3.8114 (3.6436) grad_norm 1.2360 (1.3128) [2022-01-21 06:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][770/1251] eta 0:17:44 lr 0.000679 time 1.7596 (2.2138) loss 3.8985 (3.6455) grad_norm 1.1642 (1.3122) [2022-01-21 06:43:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][780/1251] eta 0:17:25 lr 0.000679 time 6.3836 (2.2188) loss 3.3567 (3.6459) grad_norm 1.1677 (1.3123) [2022-01-21 06:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][790/1251] eta 0:17:01 lr 0.000679 time 1.9168 (2.2161) loss 3.3624 (3.6454) grad_norm 1.2325 (1.3130) [2022-01-21 06:43:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][800/1251] eta 0:16:38 lr 0.000679 time 2.0436 (2.2132) loss 4.1911 (3.6450) grad_norm 1.2271 (1.3133) [2022-01-21 06:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][810/1251] eta 0:16:16 lr 0.000679 time 2.0867 (2.2151) loss 3.3118 (3.6428) grad_norm 1.2041 (1.3123) [2022-01-21 06:44:35 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [115/300][820/1251] eta 0:15:54 lr 0.000679 time 3.0558 (2.2140) loss 3.0877 (3.6431) grad_norm 1.4878 (1.3134) [2022-01-21 06:44:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][830/1251] eta 0:15:31 lr 0.000679 time 1.9571 (2.2123) loss 4.1216 (3.6441) grad_norm 1.2787 (1.3136) [2022-01-21 06:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][840/1251] eta 0:15:08 lr 0.000679 time 2.1351 (2.2104) loss 3.3561 (3.6406) grad_norm 1.4796 (1.3141) [2022-01-21 06:45:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][850/1251] eta 0:14:45 lr 0.000679 time 1.5653 (2.2093) loss 3.3077 (3.6391) grad_norm 1.5717 (1.3142) [2022-01-21 06:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][860/1251] eta 0:14:25 lr 0.000679 time 3.1162 (2.2127) loss 3.6769 (3.6395) grad_norm 1.3108 (1.3140) [2022-01-21 06:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][870/1251] eta 0:14:03 lr 0.000679 time 2.2293 (2.2132) loss 3.4387 (3.6374) grad_norm 1.3616 (1.3134) [2022-01-21 06:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][880/1251] eta 0:13:40 lr 0.000679 time 1.6013 (2.2117) loss 3.6999 (3.6350) grad_norm 1.2620 (1.3136) [2022-01-21 06:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][890/1251] eta 0:13:17 lr 0.000679 time 2.1709 (2.2098) loss 3.7210 (3.6351) grad_norm 1.2070 (1.3134) [2022-01-21 06:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][900/1251] eta 0:12:54 lr 0.000679 time 2.2659 (2.2079) loss 4.1217 (3.6334) grad_norm 1.2109 (1.3138) [2022-01-21 06:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][910/1251] eta 0:12:32 lr 0.000679 time 1.9001 (2.2076) loss 3.8016 (3.6349) grad_norm 1.2940 (1.3140) [2022-01-21 06:48:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][920/1251] eta 0:12:10 lr 0.000679 time 2.0828 (2.2078) loss 3.4018 (3.6341) 
grad_norm 1.2959 (1.3144) [2022-01-21 06:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][930/1251] eta 0:11:48 lr 0.000679 time 2.6168 (2.2082) loss 3.8422 (3.6321) grad_norm 1.2769 (1.3139) [2022-01-21 06:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][940/1251] eta 0:11:27 lr 0.000679 time 2.2822 (2.2092) loss 4.0921 (3.6337) grad_norm 1.4063 (1.3140) [2022-01-21 06:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][950/1251] eta 0:11:04 lr 0.000679 time 1.8557 (2.2080) loss 4.1852 (3.6352) grad_norm 1.3457 (1.3142) [2022-01-21 06:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][960/1251] eta 0:10:42 lr 0.000679 time 2.3758 (2.2076) loss 3.9517 (3.6350) grad_norm 1.2356 (1.3147) [2022-01-21 06:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][970/1251] eta 0:10:20 lr 0.000679 time 2.5138 (2.2067) loss 3.2036 (3.6361) grad_norm 1.2021 (1.3151) [2022-01-21 06:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][980/1251] eta 0:09:57 lr 0.000679 time 2.3102 (2.2063) loss 3.8107 (3.6374) grad_norm 1.4177 (1.3160) [2022-01-21 06:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][990/1251] eta 0:09:35 lr 0.000679 time 2.2153 (2.2046) loss 3.4513 (3.6360) grad_norm 1.3775 (1.3172) [2022-01-21 06:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1000/1251] eta 0:09:13 lr 0.000679 time 2.2268 (2.2040) loss 3.5289 (3.6335) grad_norm 1.4459 (1.3170) [2022-01-21 06:51:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1010/1251] eta 0:08:50 lr 0.000678 time 2.6419 (2.2032) loss 4.0150 (3.6349) grad_norm 1.2514 (1.3170) [2022-01-21 06:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1020/1251] eta 0:08:29 lr 0.000678 time 2.2262 (2.2047) loss 3.5184 (3.6355) grad_norm 1.3802 (1.3166) [2022-01-21 06:52:13 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [115/300][1030/1251] eta 0:08:07 lr 0.000678 time 3.0007 (2.2066) loss 3.8814 (3.6344) grad_norm 1.4861 (1.3166) [2022-01-21 06:52:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1040/1251] eta 0:07:45 lr 0.000678 time 2.7299 (2.2080) loss 4.0147 (3.6373) grad_norm 1.4594 (1.3169) [2022-01-21 06:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1050/1251] eta 0:07:23 lr 0.000678 time 2.4901 (2.2083) loss 3.8093 (3.6357) grad_norm 1.2677 (1.3165) [2022-01-21 06:53:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1060/1251] eta 0:07:01 lr 0.000678 time 3.0856 (2.2081) loss 3.3821 (3.6358) grad_norm 1.4859 (1.3164) [2022-01-21 06:53:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1070/1251] eta 0:06:39 lr 0.000678 time 2.1270 (2.2066) loss 4.0505 (3.6350) grad_norm 1.3703 (1.3167) [2022-01-21 06:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1080/1251] eta 0:06:16 lr 0.000678 time 1.9082 (2.2041) loss 3.1774 (3.6353) grad_norm 1.4837 (1.3168) [2022-01-21 06:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1090/1251] eta 0:05:54 lr 0.000678 time 1.6591 (2.2008) loss 4.4376 (3.6347) grad_norm 1.5427 (1.3172) [2022-01-21 06:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1100/1251] eta 0:05:32 lr 0.000678 time 2.1361 (2.2013) loss 3.2266 (3.6337) grad_norm 1.3226 (1.3176) [2022-01-21 06:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1110/1251] eta 0:05:10 lr 0.000678 time 2.2412 (2.2014) loss 4.2407 (3.6316) grad_norm 1.1783 (1.3173) [2022-01-21 06:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1120/1251] eta 0:04:48 lr 0.000678 time 1.5161 (2.2014) loss 3.6470 (3.6313) grad_norm 1.4112 (1.3171) [2022-01-21 06:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1130/1251] eta 0:04:26 lr 0.000678 time 2.1789 (2.2011) loss 3.9886 
(3.6354) grad_norm 1.3415 (1.3175) [2022-01-21 06:56:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1140/1251] eta 0:04:04 lr 0.000678 time 1.9606 (2.2004) loss 3.9343 (3.6359) grad_norm 1.5757 (1.3177) [2022-01-21 06:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1150/1251] eta 0:03:42 lr 0.000678 time 2.4965 (2.2016) loss 3.2010 (3.6340) grad_norm 1.3055 (1.3177) [2022-01-21 06:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1160/1251] eta 0:03:20 lr 0.000678 time 1.8626 (2.2029) loss 3.9282 (3.6341) grad_norm 1.2057 (1.3169) [2022-01-21 06:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1170/1251] eta 0:02:58 lr 0.000678 time 2.0456 (2.2023) loss 3.2319 (3.6338) grad_norm 1.4702 (1.3167) [2022-01-21 06:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1180/1251] eta 0:02:36 lr 0.000678 time 1.9688 (2.2016) loss 3.4688 (3.6356) grad_norm 2.5683 (1.3177) [2022-01-21 06:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1190/1251] eta 0:02:14 lr 0.000678 time 2.7981 (2.2020) loss 3.0527 (3.6342) grad_norm 1.4366 (1.3187) [2022-01-21 06:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1200/1251] eta 0:01:52 lr 0.000678 time 2.4273 (2.2020) loss 3.5351 (3.6316) grad_norm 1.3566 (1.3183) [2022-01-21 06:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1210/1251] eta 0:01:30 lr 0.000678 time 1.8639 (2.2008) loss 3.0405 (3.6301) grad_norm 1.3812 (1.3180) [2022-01-21 06:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1220/1251] eta 0:01:08 lr 0.000678 time 2.1375 (2.2011) loss 4.0459 (3.6319) grad_norm 1.3408 (1.3175) [2022-01-21 06:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1230/1251] eta 0:00:46 lr 0.000678 time 2.2085 (2.2016) loss 3.9016 (3.6324) grad_norm 1.2223 (1.3171) [2022-01-21 06:59:48 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [115/300][1240/1251] eta 0:00:24 lr 0.000678 time 1.6115 (2.2003) loss 3.1935 (3.6320) grad_norm 1.4103 (1.3179) [2022-01-21 07:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [115/300][1250/1251] eta 0:00:02 lr 0.000678 time 1.1536 (2.1954) loss 3.8454 (3.6316) grad_norm 1.2752 (1.3174) [2022-01-21 07:00:04 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 115 training takes 0:45:46 [2022-01-21 07:00:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.921 (17.921) Loss 1.0901 (1.0901) Acc@1 74.609 (74.609) Acc@5 93.066 (93.066) [2022-01-21 07:00:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.312 (3.329) Loss 1.1253 (1.0810) Acc@1 73.438 (74.663) Acc@5 93.066 (92.782) [2022-01-21 07:01:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.904 (2.685) Loss 1.0945 (1.0868) Acc@1 74.902 (74.656) Acc@5 92.871 (92.592) [2022-01-21 07:01:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.314 (2.424) Loss 1.0946 (1.0926) Acc@1 75.391 (74.361) Acc@5 92.480 (92.471) [2022-01-21 07:01:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.723 (2.179) Loss 1.0888 (1.0863) Acc@1 74.902 (74.481) Acc@5 92.480 (92.616) [2022-01-21 07:01:42 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.404 Acc@5 92.554 [2022-01-21 07:01:42 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.4% [2022-01-21 07:01:42 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.40% [2022-01-21 07:02:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][0/1251] eta 7:42:07 lr 0.000678 time 22.1643 (22.1643) loss 3.8243 (3.8243) grad_norm 1.2981 (1.2981) [2022-01-21 07:02:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][10/1251] eta 1:23:20 lr 0.000678 time 2.1782 (4.0296) loss 3.6794 (3.7207) grad_norm 1.2387 (1.2602) [2022-01-21 07:02:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][20/1251] eta 1:05:39 lr 0.000677 time 2.2140 (3.2000) loss 3.8697 (3.6362) grad_norm 1.2642 (1.2984) [2022-01-21 07:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][30/1251] eta 0:58:10 lr 0.000677 time 1.5156 (2.8590) loss 4.0445 (3.7065) grad_norm 1.2850 (1.3155) [2022-01-21 07:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][40/1251] eta 0:54:40 lr 0.000677 time 3.1717 (2.7093) loss 3.8061 (3.6772) grad_norm 1.2008 (1.3117) [2022-01-21 07:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][50/1251] eta 0:52:36 lr 0.000677 time 1.5564 (2.6278) loss 3.5609 (3.6840) grad_norm 1.3030 (1.3125) [2022-01-21 07:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][60/1251] eta 0:50:51 lr 0.000677 time 2.4387 (2.5619) loss 3.8302 (3.6458) grad_norm 1.3541 (1.3298) [2022-01-21 07:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][70/1251] eta 0:49:22 lr 0.000677 time 1.5052 (2.5081) loss 3.5999 (3.6843) grad_norm 1.5377 (1.3493) [2022-01-21 07:05:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][80/1251] eta 0:48:52 lr 0.000677 time 3.7125 (2.5046) loss 2.7214 (3.6493) grad_norm 1.2588 (1.3488) [2022-01-21 07:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][90/1251] eta 0:48:17 lr 0.000677 time 1.9033 (2.4956) loss 3.2864 (3.6495) grad_norm 1.2331 (1.3513) [2022-01-21 07:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][100/1251] eta 0:47:14 lr 0.000677 time 1.8663 (2.4625) loss 4.3640 (3.6695) grad_norm 1.4338 (1.3491) [2022-01-21 07:06:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][110/1251] eta 0:46:03 lr 0.000677 time 1.6926 (2.4218) loss 2.9814 (3.6291) grad_norm 1.5899 (1.3484) [2022-01-21 07:06:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][120/1251] eta 0:45:11 lr 0.000677 time 
2.7471 (2.3975) loss 4.5387 (3.6270) grad_norm 1.3214 (1.3444) [2022-01-21 07:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][130/1251] eta 0:44:16 lr 0.000677 time 1.6468 (2.3698) loss 4.4201 (3.6426) grad_norm 1.4764 (1.3374) [2022-01-21 07:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][140/1251] eta 0:43:33 lr 0.000677 time 1.4996 (2.3522) loss 4.0048 (3.6535) grad_norm 1.3838 (1.3352) [2022-01-21 07:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][150/1251] eta 0:43:00 lr 0.000677 time 2.1585 (2.3434) loss 4.5102 (3.6390) grad_norm 1.2619 (1.3354) [2022-01-21 07:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][160/1251] eta 0:42:25 lr 0.000677 time 3.0952 (2.3335) loss 4.6888 (3.6590) grad_norm 1.2833 (1.3355) [2022-01-21 07:08:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][170/1251] eta 0:42:05 lr 0.000677 time 2.5599 (2.3359) loss 4.1697 (3.6618) grad_norm 1.2532 (1.3336) [2022-01-21 07:08:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][180/1251] eta 0:41:48 lr 0.000677 time 2.4611 (2.3425) loss 3.7142 (3.6698) grad_norm 1.3085 (1.3264) [2022-01-21 07:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][190/1251] eta 0:41:24 lr 0.000677 time 1.5717 (2.3413) loss 3.9676 (3.6678) grad_norm 1.6172 (1.3331) [2022-01-21 07:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][200/1251] eta 0:40:55 lr 0.000677 time 2.2853 (2.3360) loss 3.5990 (3.6606) grad_norm 1.1476 (1.3326) [2022-01-21 07:09:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][210/1251] eta 0:40:15 lr 0.000677 time 2.1820 (2.3207) loss 3.1600 (3.6515) grad_norm 1.2354 (1.3330) [2022-01-21 07:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][220/1251] eta 0:39:33 lr 0.000677 time 1.9372 (2.3026) loss 3.9286 (3.6643) grad_norm 1.2908 (1.3281) [2022-01-21 07:10:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][230/1251] eta 0:38:55 lr 0.000677 time 1.9858 (2.2870) loss 3.5602 (3.6486) grad_norm 1.5888 (1.3272) [2022-01-21 07:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][240/1251] eta 0:38:25 lr 0.000677 time 2.5512 (2.2805) loss 4.4013 (3.6489) grad_norm 1.3886 (1.3275) [2022-01-21 07:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][250/1251] eta 0:38:02 lr 0.000677 time 2.1418 (2.2799) loss 3.5278 (3.6458) grad_norm 1.2701 (1.3238) [2022-01-21 07:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][260/1251] eta 0:37:38 lr 0.000677 time 2.1719 (2.2792) loss 3.5113 (3.6365) grad_norm 1.2982 (1.3219) [2022-01-21 07:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][270/1251] eta 0:37:17 lr 0.000676 time 1.8867 (2.2813) loss 4.0914 (3.6408) grad_norm 1.1755 (1.3244) [2022-01-21 07:12:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][280/1251] eta 0:36:59 lr 0.000676 time 1.6970 (2.2858) loss 3.9970 (3.6403) grad_norm 1.4084 (1.3266) [2022-01-21 07:12:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][290/1251] eta 0:36:32 lr 0.000676 time 2.9610 (2.2817) loss 3.7505 (3.6413) grad_norm 1.3759 (1.3292) [2022-01-21 07:13:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][300/1251] eta 0:36:10 lr 0.000676 time 2.5438 (2.2819) loss 3.7011 (3.6354) grad_norm 1.3041 (1.3303) [2022-01-21 07:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][310/1251] eta 0:35:46 lr 0.000676 time 2.6617 (2.2814) loss 2.7042 (3.6338) grad_norm 1.3915 (1.3289) [2022-01-21 07:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][320/1251] eta 0:35:18 lr 0.000676 time 1.6621 (2.2757) loss 3.7502 (3.6404) grad_norm 1.3923 (1.3288) [2022-01-21 07:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][330/1251] eta 0:34:47 lr 
0.000676 time 2.2255 (2.2661) loss 4.5442 (3.6537) grad_norm 1.4145 (1.3301) [2022-01-21 07:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][340/1251] eta 0:34:18 lr 0.000676 time 1.9250 (2.2599) loss 4.4736 (3.6609) grad_norm 1.3529 (1.3304) [2022-01-21 07:14:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][350/1251] eta 0:33:54 lr 0.000676 time 3.0921 (2.2585) loss 2.4293 (3.6537) grad_norm 1.3201 (1.3302) [2022-01-21 07:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][360/1251] eta 0:33:29 lr 0.000676 time 1.6950 (2.2551) loss 3.9937 (3.6511) grad_norm 1.0633 (1.3291) [2022-01-21 07:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][370/1251] eta 0:33:06 lr 0.000676 time 2.9111 (2.2548) loss 2.6328 (3.6567) grad_norm 1.3500 (1.3288) [2022-01-21 07:16:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][380/1251] eta 0:32:41 lr 0.000676 time 1.9497 (2.2520) loss 3.4640 (3.6539) grad_norm 1.3735 (1.3285) [2022-01-21 07:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][390/1251] eta 0:32:17 lr 0.000676 time 1.9797 (2.2504) loss 2.8046 (3.6556) grad_norm 1.5678 (1.3311) [2022-01-21 07:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][400/1251] eta 0:31:52 lr 0.000676 time 1.8612 (2.2476) loss 2.8128 (3.6487) grad_norm 1.3567 (1.3314) [2022-01-21 07:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][410/1251] eta 0:31:31 lr 0.000676 time 3.1182 (2.2491) loss 3.7754 (3.6449) grad_norm 1.8644 (1.3328) [2022-01-21 07:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][420/1251] eta 0:31:11 lr 0.000676 time 2.8622 (2.2521) loss 4.1084 (3.6463) grad_norm 1.3215 (1.3332) [2022-01-21 07:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][430/1251] eta 0:30:47 lr 0.000676 time 2.6868 (2.2507) loss 4.3960 (3.6465) grad_norm 1.2500 (1.3320) [2022-01-21 07:18:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][440/1251] eta 0:30:23 lr 0.000676 time 1.8198 (2.2483) loss 2.8628 (3.6455) grad_norm 1.9962 (1.3339) [2022-01-21 07:18:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][450/1251] eta 0:29:58 lr 0.000676 time 2.5551 (2.2454) loss 3.6072 (3.6420) grad_norm 1.1394 (1.3321) [2022-01-21 07:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][460/1251] eta 0:29:34 lr 0.000676 time 2.5131 (2.2429) loss 3.9209 (3.6456) grad_norm 1.2714 (1.3306) [2022-01-21 07:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][470/1251] eta 0:29:09 lr 0.000676 time 2.1018 (2.2405) loss 4.3428 (3.6477) grad_norm 1.1887 (1.3297) [2022-01-21 07:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][480/1251] eta 0:28:44 lr 0.000676 time 1.9004 (2.2370) loss 4.0021 (3.6500) grad_norm 1.5652 (1.3287) [2022-01-21 07:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][490/1251] eta 0:28:21 lr 0.000676 time 2.2463 (2.2361) loss 2.5733 (3.6478) grad_norm 1.3241 (1.3290) [2022-01-21 07:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][500/1251] eta 0:27:58 lr 0.000676 time 2.1982 (2.2353) loss 3.4761 (3.6495) grad_norm 1.1802 (1.3273) [2022-01-21 07:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][510/1251] eta 0:27:37 lr 0.000676 time 2.5551 (2.2372) loss 2.9901 (3.6460) grad_norm 1.1233 (1.3259) [2022-01-21 07:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][520/1251] eta 0:27:17 lr 0.000676 time 1.5916 (2.2394) loss 3.4476 (3.6447) grad_norm 1.2424 (1.3265) [2022-01-21 07:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][530/1251] eta 0:26:56 lr 0.000675 time 2.6096 (2.2419) loss 3.7507 (3.6406) grad_norm 1.3729 (1.3258) [2022-01-21 07:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][540/1251] eta 0:26:30 lr 
0.000675 time 1.8788 (2.2367) loss 2.4598 (3.6363) grad_norm 1.2092 (1.3255) [2022-01-21 07:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][550/1251] eta 0:26:03 lr 0.000675 time 2.2231 (2.2310) loss 2.6476 (3.6365) grad_norm 1.5157 (1.3266) [2022-01-21 07:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][560/1251] eta 0:25:39 lr 0.000675 time 2.3081 (2.2276) loss 3.1534 (3.6379) grad_norm 1.0829 (1.3263) [2022-01-21 07:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][570/1251] eta 0:25:16 lr 0.000675 time 2.5514 (2.2263) loss 4.0949 (3.6400) grad_norm 1.5775 (1.3263) [2022-01-21 07:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][580/1251] eta 0:24:52 lr 0.000675 time 2.2425 (2.2249) loss 3.1641 (3.6440) grad_norm 1.2276 (1.3266) [2022-01-21 07:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][590/1251] eta 0:24:28 lr 0.000675 time 2.2371 (2.2219) loss 3.6124 (3.6441) grad_norm 1.3665 (1.3259) [2022-01-21 07:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][600/1251] eta 0:24:05 lr 0.000675 time 2.5586 (2.2208) loss 3.9778 (3.6490) grad_norm 1.3449 (1.3262) [2022-01-21 07:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][610/1251] eta 0:23:43 lr 0.000675 time 2.6974 (2.2207) loss 3.3395 (3.6491) grad_norm 1.4671 (1.3295) [2022-01-21 07:24:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][620/1251] eta 0:23:21 lr 0.000675 time 1.8471 (2.2217) loss 2.6055 (3.6499) grad_norm 1.4125 (1.3284) [2022-01-21 07:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][630/1251] eta 0:23:00 lr 0.000675 time 2.0839 (2.2229) loss 2.8620 (3.6464) grad_norm 1.6538 (1.3281) [2022-01-21 07:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][640/1251] eta 0:22:38 lr 0.000675 time 2.6472 (2.2231) loss 3.6792 (3.6467) grad_norm 1.2633 (1.3266) [2022-01-21 07:25:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][650/1251] eta 0:22:15 lr 0.000675 time 2.8135 (2.2218) loss 3.6958 (3.6462) grad_norm 1.1711 (1.3252) [2022-01-21 07:26:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][660/1251] eta 0:21:52 lr 0.000675 time 2.0867 (2.2212) loss 3.7557 (3.6449) grad_norm 1.2730 (1.3249) [2022-01-21 07:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][670/1251] eta 0:21:30 lr 0.000675 time 1.9658 (2.2214) loss 3.7219 (3.6437) grad_norm 1.1742 (1.3242) [2022-01-21 07:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][680/1251] eta 0:21:09 lr 0.000675 time 3.3752 (2.2235) loss 3.7311 (3.6484) grad_norm 1.4488 (1.3247) [2022-01-21 07:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][690/1251] eta 0:20:46 lr 0.000675 time 2.6010 (2.2228) loss 4.1549 (3.6451) grad_norm 1.4569 (1.3251) [2022-01-21 07:27:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][700/1251] eta 0:20:24 lr 0.000675 time 1.6521 (2.2214) loss 3.1419 (3.6486) grad_norm 1.3674 (1.3244) [2022-01-21 07:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][710/1251] eta 0:20:00 lr 0.000675 time 1.9651 (2.2182) loss 3.4985 (3.6513) grad_norm 1.2378 (1.3243) [2022-01-21 07:28:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][720/1251] eta 0:19:38 lr 0.000675 time 2.4836 (2.2191) loss 3.8608 (3.6526) grad_norm 1.2761 (1.3233) [2022-01-21 07:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][730/1251] eta 0:19:15 lr 0.000675 time 2.7503 (2.2183) loss 4.3755 (3.6544) grad_norm 1.1270 (1.3227) [2022-01-21 07:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][740/1251] eta 0:18:53 lr 0.000675 time 1.9898 (2.2186) loss 4.1189 (3.6584) grad_norm 1.1445 (1.3229) [2022-01-21 07:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][750/1251] eta 0:18:31 lr 
0.000675 time 2.3091 (2.2181) loss 4.4557 (3.6549) grad_norm 1.2751 (1.3234) [2022-01-21 07:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][760/1251] eta 0:18:09 lr 0.000675 time 2.5405 (2.2180) loss 2.8723 (3.6547) grad_norm 1.6834 (1.3254) [2022-01-21 07:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][770/1251] eta 0:17:46 lr 0.000675 time 1.8971 (2.2166) loss 4.3930 (3.6537) grad_norm 1.3808 (1.3264) [2022-01-21 07:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][780/1251] eta 0:17:22 lr 0.000675 time 1.8948 (2.2137) loss 3.7230 (3.6530) grad_norm 1.3026 (1.3273) [2022-01-21 07:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][790/1251] eta 0:17:00 lr 0.000674 time 1.9676 (2.2127) loss 3.7340 (3.6550) grad_norm 1.3786 (1.3270) [2022-01-21 07:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][800/1251] eta 0:16:37 lr 0.000674 time 2.3181 (2.2127) loss 3.9798 (3.6548) grad_norm 1.1550 (1.3264) [2022-01-21 07:31:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][810/1251] eta 0:16:16 lr 0.000674 time 2.6198 (2.2138) loss 3.5160 (3.6576) grad_norm 1.1709 (1.3254) [2022-01-21 07:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][820/1251] eta 0:15:53 lr 0.000674 time 2.1027 (2.2134) loss 2.5951 (3.6554) grad_norm 1.6876 (1.3256) [2022-01-21 07:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][830/1251] eta 0:15:31 lr 0.000674 time 1.9973 (2.2120) loss 4.0335 (3.6558) grad_norm 1.1188 (1.3247) [2022-01-21 07:32:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][840/1251] eta 0:15:09 lr 0.000674 time 1.9166 (2.2135) loss 4.1251 (3.6561) grad_norm 1.1185 (1.3245) [2022-01-21 07:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][850/1251] eta 0:14:47 lr 0.000674 time 1.9043 (2.2126) loss 3.7199 (3.6573) grad_norm 1.3628 (1.3235) [2022-01-21 07:33:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][860/1251] eta 0:14:25 lr 0.000674 time 2.5107 (2.2127) loss 4.2104 (3.6564) grad_norm 1.1702 (1.3229) [2022-01-21 07:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][870/1251] eta 0:14:02 lr 0.000674 time 2.4986 (2.2122) loss 3.9091 (3.6545) grad_norm 1.2018 (1.3224) [2022-01-21 07:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][880/1251] eta 0:13:40 lr 0.000674 time 1.5515 (2.2112) loss 4.0598 (3.6561) grad_norm 1.3300 (1.3219) [2022-01-21 07:34:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][890/1251] eta 0:13:18 lr 0.000674 time 2.4878 (2.2121) loss 4.0846 (3.6596) grad_norm 1.2412 (1.3219) [2022-01-21 07:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][900/1251] eta 0:12:56 lr 0.000674 time 3.0936 (2.2120) loss 3.9337 (3.6602) grad_norm 1.1246 (1.3219) [2022-01-21 07:35:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][910/1251] eta 0:12:33 lr 0.000674 time 2.1246 (2.2111) loss 3.6987 (3.6600) grad_norm 1.3860 (1.3210) [2022-01-21 07:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][920/1251] eta 0:12:12 lr 0.000674 time 1.9144 (2.2123) loss 3.2062 (3.6611) grad_norm 1.2485 (1.3199) [2022-01-21 07:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][930/1251] eta 0:11:50 lr 0.000674 time 2.7543 (2.2129) loss 3.2232 (3.6613) grad_norm 1.1464 (1.3198) [2022-01-21 07:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][940/1251] eta 0:11:27 lr 0.000674 time 2.5109 (2.2110) loss 4.2158 (3.6631) grad_norm 1.4437 (1.3194) [2022-01-21 07:36:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][950/1251] eta 0:11:05 lr 0.000674 time 1.7705 (2.2097) loss 3.1103 (3.6606) grad_norm 1.2632 (1.3201) [2022-01-21 07:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][960/1251] eta 0:10:42 lr 
0.000674 time 2.3400 (2.2091) loss 4.4003 (3.6608) grad_norm 1.2150 (1.3208) [2022-01-21 07:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][970/1251] eta 0:10:20 lr 0.000674 time 2.3017 (2.2080) loss 3.7541 (3.6616) grad_norm 1.4343 (1.3211) [2022-01-21 07:37:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][980/1251] eta 0:09:57 lr 0.000674 time 2.2531 (2.2066) loss 3.4123 (3.6596) grad_norm 1.2003 (1.3207) [2022-01-21 07:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][990/1251] eta 0:09:35 lr 0.000674 time 2.2426 (2.2068) loss 3.8797 (3.6611) grad_norm 1.4102 (1.3207) [2022-01-21 07:38:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1000/1251] eta 0:09:13 lr 0.000674 time 2.1760 (2.2056) loss 3.5520 (3.6626) grad_norm 1.1935 (1.3197) [2022-01-21 07:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1010/1251] eta 0:08:51 lr 0.000674 time 1.9216 (2.2043) loss 2.9320 (3.6624) grad_norm 1.2075 (1.3194) [2022-01-21 07:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1020/1251] eta 0:08:29 lr 0.000674 time 2.8197 (2.2046) loss 3.2965 (3.6617) grad_norm 1.2544 (1.3196) [2022-01-21 07:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1030/1251] eta 0:08:07 lr 0.000674 time 2.5561 (2.2051) loss 3.9992 (3.6613) grad_norm 1.3159 (1.3190) [2022-01-21 07:39:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1040/1251] eta 0:07:45 lr 0.000673 time 1.8780 (2.2046) loss 2.6740 (3.6592) grad_norm 1.2960 (1.3189) [2022-01-21 07:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1050/1251] eta 0:07:23 lr 0.000673 time 1.7670 (2.2048) loss 3.1829 (3.6611) grad_norm 1.4949 (1.3192) [2022-01-21 07:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1060/1251] eta 0:07:01 lr 0.000673 time 2.7132 (2.2056) loss 3.9117 (3.6637) grad_norm 1.1641 (1.3188) [2022-01-21 
07:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1070/1251] eta 0:06:39 lr 0.000673 time 2.1769 (2.2051) loss 2.7812 (3.6618) grad_norm 1.5998 (1.3186) [2022-01-21 07:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1080/1251] eta 0:06:17 lr 0.000673 time 2.2104 (2.2056) loss 3.8477 (3.6634) grad_norm 1.8332 (1.3188) [2022-01-21 07:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1090/1251] eta 0:05:55 lr 0.000673 time 2.3682 (2.2057) loss 3.8064 (3.6636) grad_norm 1.6176 (1.3188) [2022-01-21 07:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1100/1251] eta 0:05:33 lr 0.000673 time 2.7383 (2.2061) loss 4.0776 (3.6666) grad_norm 1.4498 (1.3185) [2022-01-21 07:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1110/1251] eta 0:05:10 lr 0.000673 time 2.2198 (2.2054) loss 3.4972 (3.6670) grad_norm 1.2351 (1.3183) [2022-01-21 07:42:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1120/1251] eta 0:04:48 lr 0.000673 time 1.7820 (2.2036) loss 3.5401 (3.6678) grad_norm 1.4892 (1.3178) [2022-01-21 07:43:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1130/1251] eta 0:04:26 lr 0.000673 time 1.7389 (2.2018) loss 3.7980 (3.6663) grad_norm 1.5312 (1.3184) [2022-01-21 07:43:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1140/1251] eta 0:04:04 lr 0.000673 time 2.7200 (2.2052) loss 3.4853 (3.6656) grad_norm 1.3258 (1.3183) [2022-01-21 07:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1150/1251] eta 0:03:42 lr 0.000673 time 1.8017 (2.2038) loss 3.3899 (3.6643) grad_norm 1.2245 (1.3176) [2022-01-21 07:44:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1160/1251] eta 0:03:20 lr 0.000673 time 1.9392 (2.2026) loss 3.2122 (3.6632) grad_norm 1.0613 (1.3176) [2022-01-21 07:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1170/1251] 
eta 0:02:58 lr 0.000673 time 2.4728 (2.2029) loss 2.8810 (3.6618) grad_norm 1.2580 (1.3171) [2022-01-21 07:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1180/1251] eta 0:02:36 lr 0.000673 time 2.2393 (2.2029) loss 3.8128 (3.6610) grad_norm 1.1838 (1.3161) [2022-01-21 07:45:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1190/1251] eta 0:02:14 lr 0.000673 time 2.3221 (2.2025) loss 3.8690 (3.6603) grad_norm 1.2874 (1.3156) [2022-01-21 07:45:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1200/1251] eta 0:01:52 lr 0.000673 time 2.7374 (2.2035) loss 3.6893 (3.6605) grad_norm 1.2001 (1.3158) [2022-01-21 07:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1210/1251] eta 0:01:30 lr 0.000673 time 2.5851 (2.2028) loss 4.0179 (3.6592) grad_norm 1.1069 (1.3151) [2022-01-21 07:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1220/1251] eta 0:01:08 lr 0.000673 time 2.2474 (2.2016) loss 3.6505 (3.6606) grad_norm 1.1633 (1.3145) [2022-01-21 07:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1230/1251] eta 0:00:46 lr 0.000673 time 3.1583 (2.2027) loss 3.5945 (3.6602) grad_norm 1.3295 (1.3140) [2022-01-21 07:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1240/1251] eta 0:00:24 lr 0.000673 time 1.5027 (2.2018) loss 2.7502 (3.6590) grad_norm 1.4242 (1.3147) [2022-01-21 07:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [116/300][1250/1251] eta 0:00:02 lr 0.000673 time 1.1911 (2.1968) loss 2.5549 (3.6581) grad_norm 1.3013 (1.3142) [2022-01-21 07:47:31 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 116 training takes 0:45:48 [2022-01-21 07:47:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.381 (18.381) Loss 1.0418 (1.0418) Acc@1 75.488 (75.488) Acc@5 93.555 (93.555) [2022-01-21 07:48:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.968 (3.236) 
Loss 1.0177 (1.0721) Acc@1 75.098 (74.183) Acc@5 93.359 (92.960) [2022-01-21 07:48:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.312 (2.570) Loss 1.0640 (1.0690) Acc@1 74.023 (74.400) Acc@5 93.066 (93.080) [2022-01-21 07:48:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.304 (2.295) Loss 1.1530 (1.0802) Acc@1 73.926 (74.373) Acc@5 91.309 (92.795) [2022-01-21 07:49:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.876 (2.245) Loss 1.0560 (1.0849) Acc@1 73.438 (74.247) Acc@5 92.188 (92.731) [2022-01-21 07:49:10 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.320 Acc@5 92.684 [2022-01-21 07:49:10 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.3% [2022-01-21 07:49:10 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.40% [2022-01-21 07:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][0/1251] eta 7:20:11 lr 0.000673 time 21.1120 (21.1120) loss 3.9214 (3.9214) grad_norm 1.7969 (1.7969) [2022-01-21 07:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][10/1251] eta 1:24:16 lr 0.000673 time 1.5378 (4.0747) loss 2.9477 (3.5819) grad_norm 1.2452 (1.3521) [2022-01-21 07:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][20/1251] eta 1:06:00 lr 0.000673 time 1.5083 (3.2171) loss 3.2975 (3.6654) grad_norm 1.3092 (1.3728) [2022-01-21 07:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][30/1251] eta 0:58:14 lr 0.000673 time 1.2053 (2.8623) loss 3.7505 (3.7690) grad_norm 1.6798 (1.3980) [2022-01-21 07:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][40/1251] eta 0:55:52 lr 0.000673 time 3.8691 (2.7683) loss 2.4034 (3.7241) grad_norm 1.1026 (1.3786) [2022-01-21 07:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][50/1251] eta 0:53:52 lr 0.000672 time 3.2699 (2.6915) loss 2.8087 (3.7493) 
grad_norm 1.5821 (1.3674) [2022-01-21 07:51:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][60/1251] eta 0:52:05 lr 0.000672 time 1.5955 (2.6245) loss 2.7621 (3.7065) grad_norm 1.1246 (1.3503) [2022-01-21 07:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][70/1251] eta 0:50:15 lr 0.000672 time 1.6966 (2.5536) loss 3.9982 (3.6904) grad_norm 1.3117 (1.3533) [2022-01-21 07:52:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][80/1251] eta 0:49:10 lr 0.000672 time 3.0193 (2.5197) loss 4.2347 (3.6797) grad_norm 1.4583 (1.3570) [2022-01-21 07:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][90/1251] eta 0:47:45 lr 0.000672 time 2.1932 (2.4681) loss 3.1759 (3.6847) grad_norm 1.4390 (1.3651) [2022-01-21 07:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][100/1251] eta 0:46:41 lr 0.000672 time 2.1946 (2.4338) loss 4.1687 (3.6918) grad_norm 1.3359 (1.3636) [2022-01-21 07:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][110/1251] eta 0:45:38 lr 0.000672 time 1.5828 (2.3998) loss 3.8260 (3.6854) grad_norm 1.2066 (1.3556) [2022-01-21 07:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][120/1251] eta 0:45:03 lr 0.000672 time 2.5112 (2.3908) loss 3.2222 (3.6752) grad_norm 1.6270 (1.3540) [2022-01-21 07:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][130/1251] eta 0:44:24 lr 0.000672 time 2.7471 (2.3773) loss 2.7510 (3.6715) grad_norm 1.4457 (1.3560) [2022-01-21 07:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][140/1251] eta 0:43:36 lr 0.000672 time 2.0461 (2.3550) loss 3.8902 (3.6706) grad_norm 1.3286 (1.3524) [2022-01-21 07:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][150/1251] eta 0:42:53 lr 0.000672 time 1.5365 (2.3378) loss 4.0516 (3.6806) grad_norm 1.2464 (1.3527) [2022-01-21 07:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[117/300][160/1251] eta 0:42:34 lr 0.000672 time 2.5594 (2.3411) loss 3.0744 (3.6822) grad_norm 1.2306 (1.3504) [2022-01-21 07:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][170/1251] eta 0:42:06 lr 0.000672 time 2.3892 (2.3371) loss 4.3137 (3.6902) grad_norm 1.9941 (1.3517) [2022-01-21 07:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][180/1251] eta 0:41:37 lr 0.000672 time 2.4586 (2.3316) loss 4.4516 (3.6892) grad_norm 1.2444 (1.3470) [2022-01-21 07:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][190/1251] eta 0:41:09 lr 0.000672 time 1.8725 (2.3278) loss 3.9253 (3.7002) grad_norm 1.1751 (1.3428) [2022-01-21 07:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][200/1251] eta 0:40:34 lr 0.000672 time 1.6434 (2.3166) loss 3.2515 (3.6989) grad_norm 1.1922 (1.3397) [2022-01-21 07:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][210/1251] eta 0:39:50 lr 0.000672 time 1.9511 (2.2965) loss 4.3580 (3.6905) grad_norm 1.2656 (1.3390) [2022-01-21 07:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][220/1251] eta 0:39:10 lr 0.000672 time 1.8489 (2.2800) loss 2.8342 (3.6887) grad_norm 1.7119 (1.3464) [2022-01-21 07:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][230/1251] eta 0:38:34 lr 0.000672 time 1.8551 (2.2667) loss 2.3583 (3.6865) grad_norm 1.3770 (1.3477) [2022-01-21 07:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][240/1251] eta 0:38:06 lr 0.000672 time 2.2595 (2.2616) loss 2.9408 (3.6733) grad_norm 1.4377 (1.3472) [2022-01-21 07:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][250/1251] eta 0:37:37 lr 0.000672 time 2.4035 (2.2554) loss 2.7379 (3.6789) grad_norm 1.4352 (1.3457) [2022-01-21 07:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][260/1251] eta 0:37:08 lr 0.000672 time 1.8860 (2.2487) loss 3.6048 (3.6762) grad_norm 
1.3474 (1.3465) [2022-01-21 07:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][270/1251] eta 0:36:46 lr 0.000672 time 2.4872 (2.2490) loss 3.7749 (3.6703) grad_norm 1.2381 (1.3477) [2022-01-21 07:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][280/1251] eta 0:36:26 lr 0.000672 time 2.9225 (2.2517) loss 4.4165 (3.6630) grad_norm 1.7883 (1.3499) [2022-01-21 08:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][290/1251] eta 0:36:02 lr 0.000672 time 2.2513 (2.2501) loss 4.1723 (3.6642) grad_norm 1.4048 (1.3512) [2022-01-21 08:00:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][300/1251] eta 0:35:37 lr 0.000672 time 2.1959 (2.2477) loss 2.4672 (3.6594) grad_norm 1.2266 (1.3512) [2022-01-21 08:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][310/1251] eta 0:35:21 lr 0.000671 time 3.2715 (2.2541) loss 3.9686 (3.6651) grad_norm 1.2045 (1.3500) [2022-01-21 08:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][320/1251] eta 0:34:58 lr 0.000671 time 2.1238 (2.2544) loss 3.9429 (3.6647) grad_norm 1.3323 (1.3485) [2022-01-21 08:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][330/1251] eta 0:34:36 lr 0.000671 time 2.4266 (2.2541) loss 3.5534 (3.6689) grad_norm 1.1753 (1.3453) [2022-01-21 08:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][340/1251] eta 0:34:13 lr 0.000671 time 2.1974 (2.2540) loss 3.9717 (3.6691) grad_norm 1.3199 (1.3469) [2022-01-21 08:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][350/1251] eta 0:33:48 lr 0.000671 time 1.5188 (2.2515) loss 3.9824 (3.6617) grad_norm 1.3674 (1.3463) [2022-01-21 08:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][360/1251] eta 0:33:25 lr 0.000671 time 1.7262 (2.2505) loss 2.8161 (3.6590) grad_norm 1.2791 (1.3454) [2022-01-21 08:03:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[117/300][370/1251] eta 0:33:01 lr 0.000671 time 2.0594 (2.2489) loss 3.3617 (3.6631) grad_norm 1.5095 (1.3511) [2022-01-21 08:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][380/1251] eta 0:32:34 lr 0.000671 time 2.2390 (2.2443) loss 3.4724 (3.6547) grad_norm 1.3582 (1.3519) [2022-01-21 08:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][390/1251] eta 0:32:10 lr 0.000671 time 1.7640 (2.2422) loss 2.5005 (3.6494) grad_norm 1.4250 (1.3518) [2022-01-21 08:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][400/1251] eta 0:31:48 lr 0.000671 time 1.8013 (2.2426) loss 2.6444 (3.6496) grad_norm 1.5493 (1.3547) [2022-01-21 08:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][410/1251] eta 0:31:26 lr 0.000671 time 2.8330 (2.2426) loss 3.2680 (3.6513) grad_norm 1.2991 (1.3527) [2022-01-21 08:04:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][420/1251] eta 0:31:07 lr 0.000671 time 3.1159 (2.2472) loss 3.6665 (3.6537) grad_norm 1.2728 (1.3518) [2022-01-21 08:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][430/1251] eta 0:30:46 lr 0.000671 time 2.1441 (2.2488) loss 2.7653 (3.6544) grad_norm 1.1802 (1.3519) [2022-01-21 08:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][440/1251] eta 0:30:19 lr 0.000671 time 1.8683 (2.2440) loss 3.8578 (3.6578) grad_norm 1.2882 (1.3514) [2022-01-21 08:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][450/1251] eta 0:29:52 lr 0.000671 time 1.8356 (2.2375) loss 4.5658 (3.6619) grad_norm 1.2227 (1.3496) [2022-01-21 08:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][460/1251] eta 0:29:26 lr 0.000671 time 2.4928 (2.2329) loss 3.7580 (3.6647) grad_norm 1.1219 (1.3479) [2022-01-21 08:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][470/1251] eta 0:29:01 lr 0.000671 time 2.5094 (2.2305) loss 4.0544 (3.6715) grad_norm 
1.4714 (1.3484) [2022-01-21 08:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][480/1251] eta 0:28:38 lr 0.000671 time 1.8136 (2.2291) loss 3.7945 (3.6767) grad_norm 1.1851 (1.3475) [2022-01-21 08:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][490/1251] eta 0:28:15 lr 0.000671 time 2.2188 (2.2284) loss 3.7034 (3.6758) grad_norm 1.2992 (1.3462) [2022-01-21 08:07:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][500/1251] eta 0:27:54 lr 0.000671 time 2.4272 (2.2293) loss 3.3837 (3.6712) grad_norm 1.0945 (1.3459) [2022-01-21 08:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][510/1251] eta 0:27:30 lr 0.000671 time 2.1002 (2.2279) loss 4.0124 (3.6657) grad_norm 1.2182 (1.3434) [2022-01-21 08:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][520/1251] eta 0:27:06 lr 0.000671 time 1.5274 (2.2257) loss 2.7122 (3.6654) grad_norm 1.2322 (1.3425) [2022-01-21 08:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][530/1251] eta 0:26:44 lr 0.000671 time 2.6907 (2.2259) loss 3.6178 (3.6694) grad_norm 1.2468 (1.3407) [2022-01-21 08:09:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][540/1251] eta 0:26:21 lr 0.000671 time 1.9964 (2.2246) loss 3.8055 (3.6716) grad_norm 1.1557 (1.3412) [2022-01-21 08:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][550/1251] eta 0:25:58 lr 0.000671 time 2.4508 (2.2239) loss 3.4810 (3.6677) grad_norm 1.3626 (1.3421) [2022-01-21 08:10:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][560/1251] eta 0:25:39 lr 0.000670 time 1.8116 (2.2282) loss 4.0931 (3.6688) grad_norm 1.1257 (1.3399) [2022-01-21 08:10:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][570/1251] eta 0:25:17 lr 0.000670 time 2.5623 (2.2287) loss 4.0568 (3.6683) grad_norm 1.3966 (1.3413) [2022-01-21 08:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[117/300][580/1251] eta 0:24:53 lr 0.000670 time 1.9270 (2.2256) loss 3.4275 (3.6684) grad_norm 1.4825 (1.3413) [2022-01-21 08:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][590/1251] eta 0:24:29 lr 0.000670 time 1.9469 (2.2237) loss 3.8377 (3.6687) grad_norm 1.7080 (1.3412) [2022-01-21 08:11:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][600/1251] eta 0:24:07 lr 0.000670 time 2.4890 (2.2240) loss 3.5266 (3.6704) grad_norm 1.5157 (1.3431) [2022-01-21 08:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][610/1251] eta 0:23:45 lr 0.000670 time 1.8887 (2.2233) loss 3.3158 (3.6667) grad_norm 1.1669 (1.3425) [2022-01-21 08:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][620/1251] eta 0:23:21 lr 0.000670 time 1.5350 (2.2204) loss 3.9206 (3.6661) grad_norm 1.4142 (1.3407) [2022-01-21 08:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][630/1251] eta 0:22:58 lr 0.000670 time 2.7774 (2.2206) loss 3.9720 (3.6683) grad_norm 1.6196 (1.3411) [2022-01-21 08:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][640/1251] eta 0:22:37 lr 0.000670 time 1.7796 (2.2216) loss 3.8672 (3.6687) grad_norm 1.3380 (1.3410) [2022-01-21 08:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][650/1251] eta 0:22:14 lr 0.000670 time 2.0242 (2.2199) loss 3.8478 (3.6695) grad_norm 2.1576 (1.3411) [2022-01-21 08:13:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][660/1251] eta 0:21:51 lr 0.000670 time 1.9144 (2.2184) loss 4.2686 (3.6679) grad_norm 1.2776 (1.3402) [2022-01-21 08:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][670/1251] eta 0:21:28 lr 0.000670 time 1.8962 (2.2170) loss 4.0562 (3.6674) grad_norm 1.3958 (1.3403) [2022-01-21 08:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][680/1251] eta 0:21:05 lr 0.000670 time 1.8320 (2.2154) loss 4.0200 (3.6633) grad_norm 
1.1260 (1.3391) [2022-01-21 08:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][690/1251] eta 0:20:45 lr 0.000670 time 1.8750 (2.2206) loss 3.2051 (3.6617) grad_norm 1.4197 (1.3389) [2022-01-21 08:15:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][700/1251] eta 0:20:23 lr 0.000670 time 1.7227 (2.2198) loss 3.7462 (3.6598) grad_norm 1.5576 (1.3394) [2022-01-21 08:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][710/1251] eta 0:20:02 lr 0.000670 time 1.5923 (2.2233) loss 3.7019 (3.6609) grad_norm 1.1654 (1.3408) [2022-01-21 08:15:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][720/1251] eta 0:19:39 lr 0.000670 time 1.5878 (2.2208) loss 4.0239 (3.6615) grad_norm 1.2622 (1.3399) [2022-01-21 08:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][730/1251] eta 0:19:14 lr 0.000670 time 1.6187 (2.2168) loss 3.8952 (3.6569) grad_norm 1.3036 (1.3395) [2022-01-21 08:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][740/1251] eta 0:18:51 lr 0.000670 time 1.6053 (2.2140) loss 3.6545 (3.6574) grad_norm 1.3353 (1.3392) [2022-01-21 08:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][750/1251] eta 0:18:31 lr 0.000670 time 2.4128 (2.2182) loss 3.6437 (3.6558) grad_norm 1.4768 (1.3391) [2022-01-21 08:17:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][760/1251] eta 0:18:08 lr 0.000670 time 2.0372 (2.2170) loss 3.9244 (3.6582) grad_norm 1.3243 (1.3396) [2022-01-21 08:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][770/1251] eta 0:17:46 lr 0.000670 time 1.8293 (2.2170) loss 3.9693 (3.6605) grad_norm 1.2661 (1.3393) [2022-01-21 08:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][780/1251] eta 0:17:24 lr 0.000670 time 2.3205 (2.2174) loss 3.9409 (3.6620) grad_norm 1.3813 (1.3400) [2022-01-21 08:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[117/300][790/1251] eta 0:17:03 lr 0.000670 time 2.9963 (2.2198) loss 3.9819 (3.6624) grad_norm 1.2282 (1.3391) [2022-01-21 08:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][800/1251] eta 0:16:40 lr 0.000670 time 1.6068 (2.2191) loss 4.0087 (3.6636) grad_norm 1.1644 (1.3380) [2022-01-21 08:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][810/1251] eta 0:16:18 lr 0.000670 time 1.8885 (2.2186) loss 3.4719 (3.6610) grad_norm 1.4309 (1.3378) [2022-01-21 08:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][820/1251] eta 0:15:55 lr 0.000669 time 1.7522 (2.2161) loss 4.1712 (3.6619) grad_norm 1.2783 (1.3396) [2022-01-21 08:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][830/1251] eta 0:15:32 lr 0.000669 time 3.0092 (2.2154) loss 3.9147 (3.6610) grad_norm 1.3587 (1.3405) [2022-01-21 08:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][840/1251] eta 0:15:09 lr 0.000669 time 1.9022 (2.2135) loss 3.1980 (3.6622) grad_norm 1.2065 (1.3403) [2022-01-21 08:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][850/1251] eta 0:14:48 lr 0.000669 time 2.0933 (2.2146) loss 3.9913 (3.6629) grad_norm 1.1291 (1.3400) [2022-01-21 08:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][860/1251] eta 0:14:25 lr 0.000669 time 1.9666 (2.2135) loss 2.8984 (3.6658) grad_norm 1.3461 (1.3399) [2022-01-21 08:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][870/1251] eta 0:14:03 lr 0.000669 time 2.9696 (2.2143) loss 2.5601 (3.6663) grad_norm 1.5077 (1.3392) [2022-01-21 08:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][880/1251] eta 0:13:41 lr 0.000669 time 1.8159 (2.2132) loss 2.7385 (3.6630) grad_norm 1.3092 (1.3398) [2022-01-21 08:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][890/1251] eta 0:13:18 lr 0.000669 time 2.3357 (2.2125) loss 3.9508 (3.6601) grad_norm 
1.2547 (1.3401) [2022-01-21 08:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][900/1251] eta 0:12:56 lr 0.000669 time 2.5180 (2.2115) loss 4.0199 (3.6582) grad_norm 1.6516 (1.3399) [2022-01-21 08:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][910/1251] eta 0:12:34 lr 0.000669 time 2.4623 (2.2124) loss 4.1614 (3.6572) grad_norm 1.4860 (1.3399) [2022-01-21 08:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][920/1251] eta 0:12:13 lr 0.000669 time 1.8117 (2.2149) loss 4.4980 (3.6584) grad_norm 1.4577 (1.3394) [2022-01-21 08:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][930/1251] eta 0:11:51 lr 0.000669 time 1.6557 (2.2156) loss 4.6575 (3.6574) grad_norm 1.2576 (1.3394) [2022-01-21 08:23:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][940/1251] eta 0:11:28 lr 0.000669 time 2.7536 (2.2147) loss 2.2762 (3.6538) grad_norm 1.5084 (1.3396) [2022-01-21 08:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][950/1251] eta 0:11:05 lr 0.000669 time 1.9351 (2.2123) loss 3.8296 (3.6559) grad_norm 1.2192 (1.3399) [2022-01-21 08:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][960/1251] eta 0:10:43 lr 0.000669 time 1.7727 (2.2102) loss 4.5286 (3.6568) grad_norm 1.1100 (1.3385) [2022-01-21 08:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][970/1251] eta 0:10:20 lr 0.000669 time 2.2435 (2.2088) loss 3.9378 (3.6587) grad_norm 1.2949 (1.3383) [2022-01-21 08:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][980/1251] eta 0:09:58 lr 0.000669 time 2.5659 (2.2093) loss 3.1110 (3.6605) grad_norm 1.2672 (1.3379) [2022-01-21 08:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][990/1251] eta 0:09:37 lr 0.000669 time 4.0283 (2.2120) loss 4.3537 (3.6618) grad_norm 1.5929 (1.3386) [2022-01-21 08:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[117/300][1000/1251] eta 0:09:15 lr 0.000669 time 1.6844 (2.2118) loss 3.0823 (3.6624) grad_norm 1.2472 (1.3394) [2022-01-21 08:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1010/1251] eta 0:08:53 lr 0.000669 time 1.9129 (2.2117) loss 2.8025 (3.6621) grad_norm 1.2096 (1.3392) [2022-01-21 08:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1020/1251] eta 0:08:30 lr 0.000669 time 1.8877 (2.2102) loss 2.8542 (3.6624) grad_norm 1.1898 (1.3391) [2022-01-21 08:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1030/1251] eta 0:08:08 lr 0.000669 time 3.7054 (2.2102) loss 4.1553 (3.6625) grad_norm 1.2676 (1.3384) [2022-01-21 08:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1040/1251] eta 0:07:46 lr 0.000669 time 1.9417 (2.2096) loss 3.6110 (3.6617) grad_norm 1.4597 (1.3388) [2022-01-21 08:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1050/1251] eta 0:07:23 lr 0.000669 time 1.7655 (2.2089) loss 3.6750 (3.6607) grad_norm 1.2267 (1.3381) [2022-01-21 08:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1060/1251] eta 0:07:01 lr 0.000669 time 2.5588 (2.2080) loss 2.8580 (3.6588) grad_norm 1.2394 (1.3383) [2022-01-21 08:28:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1070/1251] eta 0:06:39 lr 0.000668 time 3.0851 (2.2081) loss 3.1728 (3.6577) grad_norm 1.3900 (1.3379) [2022-01-21 08:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1080/1251] eta 0:06:17 lr 0.000668 time 3.1507 (2.2090) loss 2.9660 (3.6579) grad_norm 1.2192 (1.3378) [2022-01-21 08:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1090/1251] eta 0:05:55 lr 0.000668 time 1.6460 (2.2094) loss 3.7629 (3.6604) grad_norm 1.2339 (1.3382) [2022-01-21 08:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1100/1251] eta 0:05:34 lr 0.000668 time 1.8432 (2.2133) loss 4.1203 (3.6629) 
grad_norm 1.3954 (1.3382) [2022-01-21 08:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1110/1251] eta 0:05:11 lr 0.000668 time 2.2033 (2.2121) loss 3.8623 (3.6639) grad_norm 1.2355 (1.3378) [2022-01-21 08:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1120/1251] eta 0:04:49 lr 0.000668 time 3.4840 (2.2122) loss 4.4244 (3.6646) grad_norm 1.3271 (1.3378) [2022-01-21 08:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1130/1251] eta 0:04:27 lr 0.000668 time 1.9099 (2.2111) loss 3.4987 (3.6630) grad_norm 1.4403 (1.3384) [2022-01-21 08:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1140/1251] eta 0:04:05 lr 0.000668 time 1.8199 (2.2101) loss 3.9402 (3.6635) grad_norm 1.3454 (1.3376) [2022-01-21 08:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1150/1251] eta 0:03:43 lr 0.000668 time 2.1231 (2.2097) loss 3.5081 (3.6631) grad_norm 1.4915 (1.3377) [2022-01-21 08:31:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1160/1251] eta 0:03:21 lr 0.000668 time 2.3769 (2.2096) loss 4.1761 (3.6623) grad_norm 1.2205 (1.3374) [2022-01-21 08:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1170/1251] eta 0:02:58 lr 0.000668 time 2.2456 (2.2089) loss 3.8157 (3.6628) grad_norm 1.1072 (1.3370) [2022-01-21 08:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1180/1251] eta 0:02:36 lr 0.000668 time 2.4068 (2.2087) loss 3.3531 (3.6613) grad_norm 1.2061 (1.3376) [2022-01-21 08:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1190/1251] eta 0:02:14 lr 0.000668 time 2.2814 (2.2085) loss 4.0116 (3.6613) grad_norm 1.1776 (1.3380) [2022-01-21 08:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1200/1251] eta 0:01:52 lr 0.000668 time 2.3164 (2.2099) loss 3.9878 (3.6600) grad_norm 1.1463 (1.3377) [2022-01-21 08:33:45 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [117/300][1210/1251] eta 0:01:30 lr 0.000668 time 2.4459 (2.2088) loss 4.0854 (3.6622) grad_norm 1.1112 (1.3371) [2022-01-21 08:34:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1220/1251] eta 0:01:08 lr 0.000668 time 1.8375 (2.2070) loss 4.3618 (3.6657) grad_norm 1.3115 (1.3367) [2022-01-21 08:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1230/1251] eta 0:00:46 lr 0.000668 time 1.6377 (2.2059) loss 3.7754 (3.6665) grad_norm 1.2031 (1.3367) [2022-01-21 08:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1240/1251] eta 0:00:24 lr 0.000668 time 1.9151 (2.2049) loss 3.2868 (3.6660) grad_norm 1.2947 (1.3362) [2022-01-21 08:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [117/300][1250/1251] eta 0:00:02 lr 0.000668 time 1.1566 (2.1992) loss 4.0157 (3.6680) grad_norm 1.1822 (1.3357) [2022-01-21 08:35:02 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 117 training takes 0:45:51 [2022-01-21 08:35:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.568 (18.568) Loss 1.0669 (1.0669) Acc@1 74.805 (74.805) Acc@5 92.969 (92.969) [2022-01-21 08:35:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.458 (3.531) Loss 1.0867 (1.0977) Acc@1 74.707 (74.139) Acc@5 91.895 (92.383) [2022-01-21 08:35:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.916 (2.727) Loss 0.9662 (1.0733) Acc@1 78.027 (74.865) Acc@5 94.336 (92.713) [2022-01-21 08:36:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.621 (2.282) Loss 1.0744 (1.0793) Acc@1 77.148 (74.726) Acc@5 92.480 (92.584) [2022-01-21 08:36:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.690 (2.210) Loss 1.0352 (1.0781) Acc@1 76.758 (74.721) Acc@5 92.090 (92.590) [2022-01-21 08:36:39 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.552 Acc@5 92.506 [2022-01-21 08:36:39 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 74.6% [2022-01-21 08:36:39 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.55% [2022-01-21 08:37:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][0/1251] eta 8:13:23 lr 0.000668 time 23.6636 (23.6636) loss 2.3415 (2.3415) grad_norm 1.3677 (1.3677) [2022-01-21 08:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][10/1251] eta 1:22:21 lr 0.000668 time 1.5308 (3.9821) loss 3.3639 (3.5571) grad_norm 1.5687 (1.2749) [2022-01-21 08:37:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][20/1251] eta 1:06:35 lr 0.000668 time 2.5401 (3.2453) loss 4.4040 (3.5847) grad_norm 1.4529 (1.3336) [2022-01-21 08:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][30/1251] eta 0:57:48 lr 0.000668 time 1.2753 (2.8404) loss 3.5838 (3.6633) grad_norm 1.4206 (1.3546) [2022-01-21 08:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][40/1251] eta 0:55:27 lr 0.000668 time 5.3635 (2.7475) loss 4.3058 (3.6541) grad_norm 1.2145 (1.3862) [2022-01-21 08:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][50/1251] eta 0:52:47 lr 0.000668 time 2.0323 (2.6375) loss 4.0710 (3.7151) grad_norm 1.4563 (1.3765) [2022-01-21 08:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][60/1251] eta 0:51:07 lr 0.000668 time 2.3239 (2.5755) loss 3.9777 (3.7236) grad_norm 1.3016 (1.3630) [2022-01-21 08:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][70/1251] eta 0:49:36 lr 0.000668 time 1.5882 (2.5200) loss 3.1422 (3.6588) grad_norm 1.3571 (1.3506) [2022-01-21 08:40:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][80/1251] eta 0:48:32 lr 0.000667 time 3.6126 (2.4871) loss 4.0637 (3.6606) grad_norm 1.3136 (1.3491) [2022-01-21 08:40:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][90/1251] eta 0:47:00 lr 0.000667 time 
1.5635 (2.4295) loss 4.5623 (3.6530) grad_norm 1.3480 (1.3438) [2022-01-21 08:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][100/1251] eta 0:46:10 lr 0.000667 time 2.0235 (2.4069) loss 3.6109 (3.6558) grad_norm 1.7678 (1.3462) [2022-01-21 08:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][110/1251] eta 0:45:24 lr 0.000667 time 2.1020 (2.3875) loss 3.6279 (3.6805) grad_norm 1.2435 (1.3430) [2022-01-21 08:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][120/1251] eta 0:44:59 lr 0.000667 time 3.6439 (2.3869) loss 4.1762 (3.6758) grad_norm 1.4773 (1.3433) [2022-01-21 08:41:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][130/1251] eta 0:44:22 lr 0.000667 time 2.1335 (2.3754) loss 2.7898 (3.6529) grad_norm 1.2190 (1.3383) [2022-01-21 08:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][140/1251] eta 0:43:34 lr 0.000667 time 1.8923 (2.3534) loss 4.0969 (3.6506) grad_norm 1.2848 (1.3295) [2022-01-21 08:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][150/1251] eta 0:42:43 lr 0.000667 time 1.9809 (2.3282) loss 4.0620 (3.6659) grad_norm 1.1975 (1.3253) [2022-01-21 08:42:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][160/1251] eta 0:42:08 lr 0.000667 time 2.5808 (2.3180) loss 3.8248 (3.6755) grad_norm 1.2175 (1.3266) [2022-01-21 08:43:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][170/1251] eta 0:41:35 lr 0.000667 time 2.9019 (2.3081) loss 3.8769 (3.6904) grad_norm 1.6780 (1.3243) [2022-01-21 08:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][180/1251] eta 0:41:03 lr 0.000667 time 2.1341 (2.2997) loss 3.3717 (3.7036) grad_norm 1.4005 (1.3240) [2022-01-21 08:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][190/1251] eta 0:40:38 lr 0.000667 time 2.2057 (2.2981) loss 4.3628 (3.6980) grad_norm 1.2149 (1.3198) [2022-01-21 08:44:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][200/1251] eta 0:40:24 lr 0.000667 time 3.4182 (2.3065) loss 3.0928 (3.6897) grad_norm 1.2150 (1.3185) [2022-01-21 08:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][210/1251] eta 0:39:52 lr 0.000667 time 1.5502 (2.2986) loss 4.0364 (3.6817) grad_norm 1.2402 (1.3189) [2022-01-21 08:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][220/1251] eta 0:39:21 lr 0.000667 time 1.4834 (2.2902) loss 3.7208 (3.6828) grad_norm 1.6645 (1.3204) [2022-01-21 08:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][230/1251] eta 0:38:51 lr 0.000667 time 2.1834 (2.2834) loss 4.0249 (3.6820) grad_norm 1.3248 (1.3281) [2022-01-21 08:45:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][240/1251] eta 0:38:25 lr 0.000667 time 2.8550 (2.2801) loss 2.6441 (3.6753) grad_norm 1.3445 (1.3329) [2022-01-21 08:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][250/1251] eta 0:37:54 lr 0.000667 time 1.6929 (2.2722) loss 3.0894 (3.6711) grad_norm 1.1625 (1.3337) [2022-01-21 08:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][260/1251] eta 0:37:27 lr 0.000667 time 1.7616 (2.2677) loss 4.3395 (3.6657) grad_norm 1.3612 (1.3363) [2022-01-21 08:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][270/1251] eta 0:36:56 lr 0.000667 time 1.5995 (2.2597) loss 3.5141 (3.6546) grad_norm 1.1624 (1.3361) [2022-01-21 08:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][280/1251] eta 0:36:39 lr 0.000667 time 2.3622 (2.2650) loss 2.9083 (3.6595) grad_norm 1.3415 (1.3358) [2022-01-21 08:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][290/1251] eta 0:36:18 lr 0.000667 time 2.1530 (2.2666) loss 3.8586 (3.6507) grad_norm 1.6125 (1.3378) [2022-01-21 08:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][300/1251] eta 0:35:58 lr 
0.000667 time 1.8953 (2.2692) loss 3.7152 (3.6457) grad_norm 1.1924 (1.3358) [2022-01-21 08:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][310/1251] eta 0:35:30 lr 0.000667 time 1.9647 (2.2646) loss 3.9587 (3.6379) grad_norm 1.5002 (1.3357) [2022-01-21 08:48:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][320/1251] eta 0:35:05 lr 0.000667 time 2.9000 (2.2615) loss 3.7903 (3.6244) grad_norm 1.5680 (1.3351) [2022-01-21 08:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][330/1251] eta 0:34:36 lr 0.000666 time 1.8974 (2.2551) loss 4.1657 (3.6350) grad_norm 1.3844 (1.3344) [2022-01-21 08:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][340/1251] eta 0:34:12 lr 0.000666 time 2.6057 (2.2535) loss 2.7941 (3.6336) grad_norm 1.2856 (1.3361) [2022-01-21 08:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][350/1251] eta 0:33:48 lr 0.000666 time 2.1687 (2.2509) loss 4.2047 (3.6401) grad_norm 1.3531 (1.3359) [2022-01-21 08:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][360/1251] eta 0:33:26 lr 0.000666 time 2.8914 (2.2515) loss 3.2951 (3.6419) grad_norm 1.2194 (1.3352) [2022-01-21 08:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][370/1251] eta 0:32:59 lr 0.000666 time 2.1800 (2.2464) loss 3.9839 (3.6475) grad_norm 1.3049 (1.3376) [2022-01-21 08:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][380/1251] eta 0:32:37 lr 0.000666 time 2.5341 (2.2470) loss 4.1374 (3.6461) grad_norm 1.2435 (1.3398) [2022-01-21 08:51:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][390/1251] eta 0:32:11 lr 0.000666 time 1.8405 (2.2431) loss 3.8880 (3.6474) grad_norm 1.7935 (1.3426) [2022-01-21 08:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][400/1251] eta 0:31:47 lr 0.000666 time 2.1870 (2.2419) loss 3.0037 (3.6535) grad_norm 1.4027 (1.3464) [2022-01-21 08:52:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][410/1251] eta 0:31:24 lr 0.000666 time 2.5116 (2.2408) loss 4.3375 (3.6442) grad_norm 1.5682 (1.3458) [2022-01-21 08:52:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][420/1251] eta 0:31:00 lr 0.000666 time 2.0172 (2.2392) loss 3.6449 (3.6461) grad_norm 1.1821 (1.3452) [2022-01-21 08:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][430/1251] eta 0:30:37 lr 0.000666 time 1.8780 (2.2386) loss 3.7455 (3.6476) grad_norm 1.3724 (1.3437) [2022-01-21 08:53:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][440/1251] eta 0:30:13 lr 0.000666 time 2.5310 (2.2358) loss 3.9822 (3.6511) grad_norm 2.1620 (1.3473) [2022-01-21 08:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][450/1251] eta 0:29:50 lr 0.000666 time 1.8914 (2.2348) loss 2.6405 (3.6477) grad_norm 1.2318 (1.3473) [2022-01-21 08:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][460/1251] eta 0:29:29 lr 0.000666 time 3.1237 (2.2366) loss 4.4085 (3.6396) grad_norm 1.6251 (1.3482) [2022-01-21 08:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][470/1251] eta 0:29:04 lr 0.000666 time 1.8200 (2.2334) loss 2.8115 (3.6390) grad_norm 1.2776 (1.3472) [2022-01-21 08:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][480/1251] eta 0:28:42 lr 0.000666 time 2.1627 (2.2339) loss 3.9315 (3.6422) grad_norm 1.2896 (1.3456) [2022-01-21 08:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][490/1251] eta 0:28:18 lr 0.000666 time 1.5681 (2.2316) loss 4.4451 (3.6477) grad_norm 1.4151 (1.3453) [2022-01-21 08:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][500/1251] eta 0:27:54 lr 0.000666 time 2.2433 (2.2300) loss 3.5880 (3.6500) grad_norm 1.1888 (1.3467) [2022-01-21 08:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][510/1251] eta 0:27:28 lr 
0.000666 time 2.1597 (2.2247) loss 3.5500 (3.6462) grad_norm 1.0781 (1.3468) [2022-01-21 08:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][520/1251] eta 0:27:05 lr 0.000666 time 2.7510 (2.2234) loss 3.5971 (3.6503) grad_norm 1.4745 (1.3461) [2022-01-21 08:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][530/1251] eta 0:26:41 lr 0.000666 time 2.6114 (2.2219) loss 3.9036 (3.6522) grad_norm 1.4681 (1.3452) [2022-01-21 08:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][540/1251] eta 0:26:17 lr 0.000666 time 1.5477 (2.2188) loss 3.9154 (3.6527) grad_norm 1.2708 (1.3467) [2022-01-21 08:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][550/1251] eta 0:25:56 lr 0.000666 time 2.1374 (2.2204) loss 3.8829 (3.6496) grad_norm 1.3300 (1.3468) [2022-01-21 08:57:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][560/1251] eta 0:25:37 lr 0.000666 time 3.5651 (2.2250) loss 4.3847 (3.6503) grad_norm 1.4211 (1.3467) [2022-01-21 08:57:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][570/1251] eta 0:25:16 lr 0.000666 time 3.1537 (2.2272) loss 3.5884 (3.6551) grad_norm 1.6592 (1.3468) [2022-01-21 08:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][580/1251] eta 0:24:52 lr 0.000666 time 2.5075 (2.2250) loss 3.2501 (3.6573) grad_norm 1.4325 (1.3473) [2022-01-21 08:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][590/1251] eta 0:24:28 lr 0.000665 time 2.0687 (2.2222) loss 4.3045 (3.6588) grad_norm 1.4194 (1.3469) [2022-01-21 08:58:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][600/1251] eta 0:24:05 lr 0.000665 time 2.6514 (2.2203) loss 4.3459 (3.6627) grad_norm 1.2924 (1.3469) [2022-01-21 08:59:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][610/1251] eta 0:23:43 lr 0.000665 time 2.5776 (2.2203) loss 2.8326 (3.6574) grad_norm 1.1809 (1.3458) [2022-01-21 08:59:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][620/1251] eta 0:23:20 lr 0.000665 time 2.2124 (2.2198) loss 3.5065 (3.6624) grad_norm 1.1259 (1.3449) [2022-01-21 08:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][630/1251] eta 0:22:57 lr 0.000665 time 3.1028 (2.2186) loss 3.8952 (3.6617) grad_norm 1.6254 (1.3449) [2022-01-21 09:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][640/1251] eta 0:22:34 lr 0.000665 time 1.9734 (2.2166) loss 4.2081 (3.6657) grad_norm 1.2838 (1.3441) [2022-01-21 09:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][650/1251] eta 0:22:11 lr 0.000665 time 1.9035 (2.2151) loss 4.1151 (3.6651) grad_norm 1.2777 (1.3439) [2022-01-21 09:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][660/1251] eta 0:21:48 lr 0.000665 time 1.9220 (2.2141) loss 3.5839 (3.6662) grad_norm 1.1546 (1.3423) [2022-01-21 09:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][670/1251] eta 0:21:26 lr 0.000665 time 2.4685 (2.2141) loss 3.9510 (3.6640) grad_norm 1.2562 (1.3423) [2022-01-21 09:01:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][680/1251] eta 0:21:05 lr 0.000665 time 2.2244 (2.2155) loss 3.8526 (3.6628) grad_norm 1.3774 (1.3428) [2022-01-21 09:02:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][690/1251] eta 0:20:43 lr 0.000665 time 2.5116 (2.2159) loss 3.8795 (3.6618) grad_norm 1.1844 (1.3422) [2022-01-21 09:02:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][700/1251] eta 0:20:21 lr 0.000665 time 2.4698 (2.2161) loss 3.5752 (3.6626) grad_norm 1.5121 (1.3426) [2022-01-21 09:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][710/1251] eta 0:19:57 lr 0.000665 time 1.9334 (2.2134) loss 4.0744 (3.6622) grad_norm 1.2711 (1.3427) [2022-01-21 09:03:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][720/1251] eta 0:19:33 lr 
0.000665 time 1.8018 (2.2107) loss 3.6666 (3.6618) grad_norm 1.3425 (1.3421) [2022-01-21 09:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][730/1251] eta 0:19:11 lr 0.000665 time 1.9076 (2.2092) loss 2.7395 (3.6627) grad_norm 1.3879 (1.3411) [2022-01-21 09:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][740/1251] eta 0:18:48 lr 0.000665 time 2.5182 (2.2092) loss 4.1206 (3.6620) grad_norm 1.1410 (1.3401) [2022-01-21 09:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][750/1251] eta 0:18:27 lr 0.000665 time 2.0917 (2.2099) loss 4.3541 (3.6575) grad_norm 1.2058 (1.3400) [2022-01-21 09:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][760/1251] eta 0:18:04 lr 0.000665 time 1.8712 (2.2092) loss 3.9630 (3.6595) grad_norm 1.4650 (1.3395) [2022-01-21 09:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][770/1251] eta 0:17:43 lr 0.000665 time 2.8203 (2.2109) loss 2.9020 (3.6604) grad_norm 1.5505 (1.3400) [2022-01-21 09:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][780/1251] eta 0:17:20 lr 0.000665 time 1.9572 (2.2090) loss 3.8333 (3.6631) grad_norm 1.3062 (1.3402) [2022-01-21 09:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][790/1251] eta 0:16:58 lr 0.000665 time 2.1227 (2.2096) loss 3.5627 (3.6670) grad_norm 1.0937 (1.3413) [2022-01-21 09:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][800/1251] eta 0:16:36 lr 0.000665 time 2.7275 (2.2096) loss 3.9576 (3.6666) grad_norm 1.2364 (1.3418) [2022-01-21 09:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][810/1251] eta 0:16:14 lr 0.000665 time 2.3559 (2.2099) loss 3.4917 (3.6658) grad_norm 1.5373 (1.3430) [2022-01-21 09:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][820/1251] eta 0:15:52 lr 0.000665 time 2.2218 (2.2092) loss 3.5235 (3.6630) grad_norm 1.6734 (1.3441) [2022-01-21 09:07:15 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][830/1251] eta 0:15:30 lr 0.000665 time 2.1090 (2.2094) loss 3.4898 (3.6616) grad_norm 1.2787 (1.3439) [2022-01-21 09:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][840/1251] eta 0:15:07 lr 0.000664 time 2.5309 (2.2075) loss 3.5162 (3.6641) grad_norm 1.1821 (1.3430) [2022-01-21 09:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][850/1251] eta 0:14:44 lr 0.000664 time 1.9057 (2.2056) loss 2.8212 (3.6646) grad_norm 1.3159 (1.3428) [2022-01-21 09:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][860/1251] eta 0:14:21 lr 0.000664 time 2.4234 (2.2044) loss 3.3163 (3.6647) grad_norm 1.3205 (1.3428) [2022-01-21 09:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][870/1251] eta 0:13:59 lr 0.000664 time 2.0935 (2.2033) loss 4.1837 (3.6644) grad_norm 1.2538 (1.3422) [2022-01-21 09:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][880/1251] eta 0:13:37 lr 0.000664 time 2.8555 (2.2042) loss 2.9849 (3.6617) grad_norm 1.3254 (1.3419) [2022-01-21 09:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][890/1251] eta 0:13:16 lr 0.000664 time 2.9134 (2.2068) loss 3.4970 (3.6594) grad_norm 1.5253 (1.3425) [2022-01-21 09:09:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][900/1251] eta 0:12:55 lr 0.000664 time 2.9450 (2.2089) loss 4.0311 (3.6609) grad_norm 1.3060 (1.3426) [2022-01-21 09:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][910/1251] eta 0:12:32 lr 0.000664 time 1.6938 (2.2061) loss 4.2505 (3.6648) grad_norm 1.2036 (1.3422) [2022-01-21 09:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][920/1251] eta 0:12:09 lr 0.000664 time 2.6557 (2.2042) loss 2.6867 (3.6615) grad_norm 1.3297 (1.3428) [2022-01-21 09:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][930/1251] eta 0:11:47 lr 
0.000664 time 1.9234 (2.2035) loss 3.9959 (3.6615) grad_norm 1.4234 (1.3420) [2022-01-21 09:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][940/1251] eta 0:11:26 lr 0.000664 time 2.2172 (2.2070) loss 3.9211 (3.6604) grad_norm 1.2623 (1.3418) [2022-01-21 09:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][950/1251] eta 0:11:04 lr 0.000664 time 1.5920 (2.2075) loss 3.8300 (3.6602) grad_norm 1.1106 (1.3409) [2022-01-21 09:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][960/1251] eta 0:10:43 lr 0.000664 time 6.6802 (2.2110) loss 4.1330 (3.6588) grad_norm 1.1460 (1.3405) [2022-01-21 09:12:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][970/1251] eta 0:10:20 lr 0.000664 time 1.6992 (2.2083) loss 4.2942 (3.6605) grad_norm 1.3440 (1.3396) [2022-01-21 09:12:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][980/1251] eta 0:09:58 lr 0.000664 time 1.9733 (2.2079) loss 4.2981 (3.6596) grad_norm 1.2785 (1.3393) [2022-01-21 09:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][990/1251] eta 0:09:35 lr 0.000664 time 1.6250 (2.2056) loss 3.7434 (3.6599) grad_norm 1.2896 (1.3389) [2022-01-21 09:13:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1000/1251] eta 0:09:14 lr 0.000664 time 4.9885 (2.2090) loss 4.1230 (3.6595) grad_norm 1.2294 (1.3379) [2022-01-21 09:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1010/1251] eta 0:08:52 lr 0.000664 time 1.8085 (2.2091) loss 3.2907 (3.6590) grad_norm 1.3004 (1.3386) [2022-01-21 09:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1020/1251] eta 0:08:30 lr 0.000664 time 1.5762 (2.2083) loss 2.5391 (3.6612) grad_norm 1.3691 (1.3380) [2022-01-21 09:14:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1030/1251] eta 0:08:07 lr 0.000664 time 1.8801 (2.2066) loss 4.4939 (3.6632) grad_norm 1.2573 (1.3376) [2022-01-21 
09:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1040/1251] eta 0:07:45 lr 0.000664 time 3.1740 (2.2059) loss 3.7526 (3.6625) grad_norm 1.3392 (1.3368) [2022-01-21 09:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1050/1251] eta 0:07:23 lr 0.000664 time 1.9929 (2.2050) loss 3.9125 (3.6626) grad_norm 1.4585 (1.3368) [2022-01-21 09:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1060/1251] eta 0:07:01 lr 0.000664 time 2.4822 (2.2053) loss 4.3842 (3.6638) grad_norm 1.2036 (1.3364) [2022-01-21 09:16:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1070/1251] eta 0:06:38 lr 0.000664 time 1.9535 (2.2041) loss 3.8170 (3.6627) grad_norm 1.0873 (1.3352) [2022-01-21 09:16:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1080/1251] eta 0:06:17 lr 0.000664 time 2.6275 (2.2056) loss 3.3612 (3.6638) grad_norm 1.6073 (1.3355) [2022-01-21 09:16:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1090/1251] eta 0:05:55 lr 0.000664 time 2.2780 (2.2061) loss 3.4909 (3.6609) grad_norm 1.3492 (1.3357) [2022-01-21 09:17:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1100/1251] eta 0:05:33 lr 0.000663 time 2.5401 (2.2067) loss 3.4164 (3.6569) grad_norm 1.2277 (1.3360) [2022-01-21 09:17:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1110/1251] eta 0:05:11 lr 0.000663 time 2.3550 (2.2064) loss 3.3616 (3.6592) grad_norm 1.2237 (1.3363) [2022-01-21 09:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1120/1251] eta 0:04:49 lr 0.000663 time 2.2560 (2.2066) loss 3.7875 (3.6606) grad_norm 1.3739 (1.3359) [2022-01-21 09:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1130/1251] eta 0:04:26 lr 0.000663 time 2.0069 (2.2055) loss 3.9839 (3.6623) grad_norm 1.3546 (1.3357) [2022-01-21 09:18:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1140/1251] 
eta 0:04:04 lr 0.000663 time 1.8304 (2.2048) loss 4.1296 (3.6609) grad_norm 1.4715 (1.3359) [2022-01-21 09:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1150/1251] eta 0:03:42 lr 0.000663 time 1.8083 (2.2030) loss 4.2909 (3.6633) grad_norm 1.2562 (1.3359) [2022-01-21 09:19:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1160/1251] eta 0:03:20 lr 0.000663 time 1.9027 (2.2039) loss 3.6833 (3.6645) grad_norm 1.1476 (1.3361) [2022-01-21 09:19:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1170/1251] eta 0:02:58 lr 0.000663 time 2.2299 (2.2033) loss 3.8733 (3.6625) grad_norm 1.3017 (1.3357) [2022-01-21 09:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1180/1251] eta 0:02:36 lr 0.000663 time 1.4992 (2.2027) loss 4.0083 (3.6612) grad_norm 1.2555 (1.3361) [2022-01-21 09:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1190/1251] eta 0:02:14 lr 0.000663 time 1.9640 (2.2033) loss 2.5709 (3.6591) grad_norm 1.3071 (1.3360) [2022-01-21 09:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1200/1251] eta 0:01:52 lr 0.000663 time 2.4504 (2.2053) loss 4.0204 (3.6590) grad_norm 1.5417 (1.3359) [2022-01-21 09:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1210/1251] eta 0:01:30 lr 0.000663 time 1.9344 (2.2041) loss 3.9411 (3.6609) grad_norm 1.3972 (1.3361) [2022-01-21 09:21:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1220/1251] eta 0:01:08 lr 0.000663 time 1.7956 (2.2021) loss 3.3375 (3.6595) grad_norm 1.5025 (1.3359) [2022-01-21 09:21:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1230/1251] eta 0:00:46 lr 0.000663 time 2.0479 (2.2009) loss 4.3104 (3.6572) grad_norm 1.1515 (1.3352) [2022-01-21 09:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1240/1251] eta 0:00:24 lr 0.000663 time 2.4053 (2.1995) loss 4.5433 (3.6590) grad_norm 1.2718 
(1.3347) [2022-01-21 09:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [118/300][1250/1251] eta 0:00:02 lr 0.000663 time 1.1571 (2.1943) loss 3.8015 (3.6578) grad_norm 1.3625 (1.3342) [2022-01-21 09:22:25 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 118 training takes 0:45:45 [2022-01-21 09:22:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.495 (20.495) Loss 1.1206 (1.1206) Acc@1 73.926 (73.926) Acc@5 91.602 (91.602) [2022-01-21 09:23:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.338 (3.183) Loss 1.0746 (1.1012) Acc@1 74.902 (74.015) Acc@5 92.383 (92.285) [2022-01-21 09:23:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.634 (2.532) Loss 1.1734 (1.0982) Acc@1 74.121 (74.423) Acc@5 91.797 (92.401) [2022-01-21 09:23:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.962 (2.267) Loss 1.0824 (1.1045) Acc@1 72.949 (74.149) Acc@5 92.969 (92.304) [2022-01-21 09:23:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.489 (2.177) Loss 1.0761 (1.1007) Acc@1 76.660 (74.340) Acc@5 93.262 (92.452) [2022-01-21 09:24:01 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.462 Acc@5 92.502 [2022-01-21 09:24:01 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.5% [2022-01-21 09:24:01 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.55% [2022-01-21 09:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][0/1251] eta 7:32:33 lr 0.000663 time 21.7051 (21.7051) loss 4.0848 (4.0848) grad_norm 1.2481 (1.2481) [2022-01-21 09:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][10/1251] eta 1:20:08 lr 0.000663 time 1.6059 (3.8750) loss 2.7303 (3.5836) grad_norm 1.1847 (1.2617) [2022-01-21 09:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][20/1251] eta 1:02:11 lr 0.000663 time 1.5845 (3.0312) loss 
3.8059 (3.5993) grad_norm 1.4569 (1.2857) [2022-01-21 09:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][30/1251] eta 0:56:16 lr 0.000663 time 1.6179 (2.7651) loss 4.0082 (3.5810) grad_norm 1.2235 (1.3021) [2022-01-21 09:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][40/1251] eta 0:53:56 lr 0.000663 time 4.5789 (2.6722) loss 3.6568 (3.6321) grad_norm 1.3770 (1.3142) [2022-01-21 09:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][50/1251] eta 0:52:02 lr 0.000663 time 1.4505 (2.6003) loss 4.1128 (3.6439) grad_norm 1.1455 (1.3094) [2022-01-21 09:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][60/1251] eta 0:49:47 lr 0.000663 time 1.5909 (2.5081) loss 3.1617 (3.5856) grad_norm 1.2446 (1.3054) [2022-01-21 09:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][70/1251] eta 0:48:07 lr 0.000663 time 2.0350 (2.4451) loss 4.0084 (3.5591) grad_norm 1.3700 (1.3240) [2022-01-21 09:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][80/1251] eta 0:47:39 lr 0.000663 time 3.5088 (2.4420) loss 3.6408 (3.5627) grad_norm 1.4792 (1.3246) [2022-01-21 09:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][90/1251] eta 0:47:03 lr 0.000663 time 1.2729 (2.4319) loss 4.2590 (3.5764) grad_norm 1.4991 (1.3234) [2022-01-21 09:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][100/1251] eta 0:46:24 lr 0.000662 time 2.1153 (2.4188) loss 3.6145 (3.5620) grad_norm 1.2625 (1.3177) [2022-01-21 09:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][110/1251] eta 0:45:47 lr 0.000662 time 1.8719 (2.4077) loss 3.5000 (3.5755) grad_norm 1.3418 (1.3140) [2022-01-21 09:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][120/1251] eta 0:45:13 lr 0.000662 time 2.4790 (2.3990) loss 3.9744 (3.5792) grad_norm 1.4044 (1.3249) [2022-01-21 09:29:12 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [119/300][130/1251] eta 0:44:23 lr 0.000662 time 2.2221 (2.3762) loss 3.0197 (3.5607) grad_norm 1.2986 (1.3227) [2022-01-21 09:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][140/1251] eta 0:43:26 lr 0.000662 time 1.9536 (2.3462) loss 4.3671 (3.5644) grad_norm 1.3665 (1.3256) [2022-01-21 09:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][150/1251] eta 0:42:41 lr 0.000662 time 2.5411 (2.3267) loss 3.6189 (3.5698) grad_norm 1.2334 (1.3246) [2022-01-21 09:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][160/1251] eta 0:42:05 lr 0.000662 time 2.4566 (2.3147) loss 3.4679 (3.5701) grad_norm 1.2191 (1.3230) [2022-01-21 09:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][170/1251] eta 0:41:44 lr 0.000662 time 2.4991 (2.3164) loss 3.2153 (3.5771) grad_norm 1.3709 (1.3244) [2022-01-21 09:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][180/1251] eta 0:41:14 lr 0.000662 time 2.2011 (2.3104) loss 4.0316 (3.5763) grad_norm 1.1867 (1.3183) [2022-01-21 09:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][190/1251] eta 0:40:42 lr 0.000662 time 2.0418 (2.3021) loss 3.4339 (3.5728) grad_norm 1.2907 (1.3162) [2022-01-21 09:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][200/1251] eta 0:40:11 lr 0.000662 time 1.7833 (2.2942) loss 4.1951 (3.5793) grad_norm 1.3457 (1.3189) [2022-01-21 09:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][210/1251] eta 0:39:47 lr 0.000662 time 2.1954 (2.2936) loss 4.1532 (3.5922) grad_norm 1.1784 (1.3173) [2022-01-21 09:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][220/1251] eta 0:39:14 lr 0.000662 time 1.8916 (2.2837) loss 4.1412 (3.6034) grad_norm 1.8547 (1.3201) [2022-01-21 09:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][230/1251] eta 0:38:54 lr 0.000662 time 2.6771 (2.2862) loss 3.5470 
(3.6094) grad_norm 1.4492 (1.3227) [2022-01-21 09:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][240/1251] eta 0:38:29 lr 0.000662 time 1.8978 (2.2841) loss 4.0049 (3.6131) grad_norm 1.2523 (1.3257) [2022-01-21 09:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][250/1251] eta 0:38:05 lr 0.000662 time 1.7678 (2.2827) loss 3.6345 (3.6117) grad_norm 1.4747 (1.3235) [2022-01-21 09:33:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][260/1251] eta 0:37:39 lr 0.000662 time 2.2356 (2.2801) loss 3.2040 (3.6074) grad_norm 1.1970 (1.3226) [2022-01-21 09:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][270/1251] eta 0:37:07 lr 0.000662 time 1.6399 (2.2708) loss 3.5248 (3.6065) grad_norm 1.3841 (1.3258) [2022-01-21 09:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][280/1251] eta 0:36:35 lr 0.000662 time 1.8540 (2.2610) loss 4.6831 (3.5941) grad_norm 1.2871 (1.3263) [2022-01-21 09:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][290/1251] eta 0:36:08 lr 0.000662 time 1.9428 (2.2567) loss 3.8065 (3.5926) grad_norm 1.6374 (1.3331) [2022-01-21 09:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][300/1251] eta 0:35:45 lr 0.000662 time 1.7611 (2.2562) loss 3.4481 (3.5879) grad_norm 1.2109 (1.3339) [2022-01-21 09:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][310/1251] eta 0:35:23 lr 0.000662 time 2.1680 (2.2565) loss 3.7074 (3.5951) grad_norm 1.3770 (1.3327) [2022-01-21 09:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][320/1251] eta 0:34:58 lr 0.000662 time 2.1181 (2.2543) loss 3.5799 (3.5961) grad_norm 1.5182 (1.3327) [2022-01-21 09:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][330/1251] eta 0:34:37 lr 0.000662 time 1.9214 (2.2553) loss 3.5206 (3.5925) grad_norm 1.3643 (1.3305) [2022-01-21 09:36:50 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [119/300][340/1251] eta 0:34:14 lr 0.000662 time 1.5314 (2.2550) loss 3.4521 (3.5947) grad_norm 1.2978 (1.3298) [2022-01-21 09:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][350/1251] eta 0:33:47 lr 0.000662 time 2.2659 (2.2503) loss 4.0054 (3.5919) grad_norm 1.2697 (1.3286) [2022-01-21 09:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][360/1251] eta 0:33:21 lr 0.000661 time 1.7532 (2.2465) loss 2.7314 (3.5888) grad_norm 1.1545 (1.3278) [2022-01-21 09:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][370/1251] eta 0:32:57 lr 0.000661 time 1.8379 (2.2449) loss 3.9937 (3.5937) grad_norm 1.8879 (1.3293) [2022-01-21 09:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][380/1251] eta 0:32:35 lr 0.000661 time 1.5146 (2.2451) loss 3.4558 (3.5999) grad_norm 1.3296 (1.3292) [2022-01-21 09:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][390/1251] eta 0:32:12 lr 0.000661 time 1.5851 (2.2443) loss 3.7632 (3.6034) grad_norm 1.7341 (1.3323) [2022-01-21 09:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][400/1251] eta 0:31:49 lr 0.000661 time 1.5677 (2.2442) loss 3.1022 (3.5965) grad_norm 1.1607 (1.3313) [2022-01-21 09:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][410/1251] eta 0:31:25 lr 0.000661 time 1.7894 (2.2424) loss 2.9957 (3.5963) grad_norm 1.2542 (1.3302) [2022-01-21 09:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][420/1251] eta 0:31:03 lr 0.000661 time 2.0611 (2.2429) loss 4.1453 (3.5963) grad_norm 1.3212 (1.3301) [2022-01-21 09:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][430/1251] eta 0:30:41 lr 0.000661 time 1.6865 (2.2435) loss 2.9193 (3.5879) grad_norm 1.2090 (1.3303) [2022-01-21 09:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][440/1251] eta 0:30:17 lr 0.000661 time 1.8507 (2.2404) loss 2.5385 
(3.5895) grad_norm 1.3399 (1.3283) [2022-01-21 09:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][450/1251] eta 0:29:51 lr 0.000661 time 1.6635 (2.2367) loss 3.9475 (3.5911) grad_norm 1.1547 (1.3281) [2022-01-21 09:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][460/1251] eta 0:29:25 lr 0.000661 time 1.6254 (2.2324) loss 3.9324 (3.5940) grad_norm 1.2190 (1.3277) [2022-01-21 09:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][470/1251] eta 0:29:01 lr 0.000661 time 2.3497 (2.2292) loss 3.5277 (3.6004) grad_norm 1.2061 (1.3259) [2022-01-21 09:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][480/1251] eta 0:28:36 lr 0.000661 time 1.6915 (2.2268) loss 3.4503 (3.6016) grad_norm 1.4284 (1.3246) [2022-01-21 09:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][490/1251] eta 0:28:16 lr 0.000661 time 2.5557 (2.2290) loss 4.3506 (3.6025) grad_norm 1.3712 (1.3243) [2022-01-21 09:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][500/1251] eta 0:27:52 lr 0.000661 time 2.2617 (2.2266) loss 3.3024 (3.6047) grad_norm 1.2893 (1.3242) [2022-01-21 09:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][510/1251] eta 0:27:30 lr 0.000661 time 2.3984 (2.2273) loss 4.1243 (3.6017) grad_norm 1.1928 (1.3249) [2022-01-21 09:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][520/1251] eta 0:27:07 lr 0.000661 time 2.1875 (2.2263) loss 3.4074 (3.6048) grad_norm 1.2407 (1.3260) [2022-01-21 09:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][530/1251] eta 0:26:47 lr 0.000661 time 2.4581 (2.2297) loss 3.6967 (3.6042) grad_norm 1.3363 (1.3264) [2022-01-21 09:44:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][540/1251] eta 0:26:27 lr 0.000661 time 1.8751 (2.2334) loss 3.9548 (3.6067) grad_norm 1.2289 (1.3305) [2022-01-21 09:44:32 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [119/300][550/1251] eta 0:26:06 lr 0.000661 time 2.1336 (2.2348) loss 3.5277 (3.6126) grad_norm 1.3781 (1.3303) [2022-01-21 09:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][560/1251] eta 0:25:41 lr 0.000661 time 1.8799 (2.2310) loss 4.0081 (3.6125) grad_norm 1.2659 (1.3299) [2022-01-21 09:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][570/1251] eta 0:25:16 lr 0.000661 time 1.8723 (2.2263) loss 3.9176 (3.6175) grad_norm 1.4092 (1.3290) [2022-01-21 09:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][580/1251] eta 0:24:51 lr 0.000661 time 1.8380 (2.2226) loss 4.1856 (3.6174) grad_norm 1.3114 (1.3294) [2022-01-21 09:45:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][590/1251] eta 0:24:27 lr 0.000661 time 1.9929 (2.2202) loss 4.1309 (3.6194) grad_norm 1.3083 (1.3287) [2022-01-21 09:46:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][600/1251] eta 0:24:04 lr 0.000661 time 2.3688 (2.2194) loss 3.1909 (3.6218) grad_norm 1.5117 (1.3287) [2022-01-21 09:46:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][610/1251] eta 0:23:40 lr 0.000660 time 2.1278 (2.2167) loss 3.7775 (3.6241) grad_norm 1.3653 (1.3284) [2022-01-21 09:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][620/1251] eta 0:23:18 lr 0.000660 time 1.9660 (2.2166) loss 3.7323 (3.6271) grad_norm 1.4979 (1.3290) [2022-01-21 09:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][630/1251] eta 0:22:55 lr 0.000660 time 2.5842 (2.2156) loss 3.8396 (3.6289) grad_norm 1.5001 (1.3294) [2022-01-21 09:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][640/1251] eta 0:22:33 lr 0.000660 time 1.8919 (2.2159) loss 3.7053 (3.6273) grad_norm 1.2666 (1.3289) [2022-01-21 09:48:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][650/1251] eta 0:22:12 lr 0.000660 time 2.4270 (2.2179) loss 4.3109 
(3.6288) grad_norm 1.3433 (1.3292) [2022-01-21 09:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][660/1251] eta 0:21:51 lr 0.000660 time 2.1641 (2.2188) loss 3.9758 (3.6303) grad_norm 1.2397 (1.3289) [2022-01-21 09:48:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][670/1251] eta 0:21:30 lr 0.000660 time 2.2515 (2.2207) loss 3.6343 (3.6332) grad_norm 1.4004 (1.3285) [2022-01-21 09:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][680/1251] eta 0:21:06 lr 0.000660 time 1.8954 (2.2187) loss 4.3818 (3.6329) grad_norm 1.1833 (1.3284) [2022-01-21 09:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][690/1251] eta 0:20:43 lr 0.000660 time 1.5443 (2.2158) loss 4.2067 (3.6344) grad_norm 1.2823 (1.3275) [2022-01-21 09:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][700/1251] eta 0:20:19 lr 0.000660 time 2.2499 (2.2138) loss 2.5761 (3.6354) grad_norm 1.7640 (1.3272) [2022-01-21 09:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][710/1251] eta 0:19:57 lr 0.000660 time 1.9008 (2.2129) loss 3.5798 (3.6338) grad_norm 1.2440 (1.3281) [2022-01-21 09:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][720/1251] eta 0:19:35 lr 0.000660 time 1.9588 (2.2137) loss 3.7728 (3.6306) grad_norm 1.1843 (1.3280) [2022-01-21 09:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][730/1251] eta 0:19:13 lr 0.000660 time 2.1701 (2.2141) loss 3.1087 (3.6283) grad_norm 1.4591 (1.3283) [2022-01-21 09:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][740/1251] eta 0:18:51 lr 0.000660 time 1.8824 (2.2142) loss 3.2167 (3.6268) grad_norm 1.1997 (1.3280) [2022-01-21 09:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][750/1251] eta 0:18:30 lr 0.000660 time 2.1397 (2.2156) loss 3.0027 (3.6244) grad_norm 1.3533 (1.3284) [2022-01-21 09:52:07 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [119/300][760/1251] eta 0:18:07 lr 0.000660 time 2.4746 (2.2155) loss 4.0845 (3.6275) grad_norm 1.2847 (1.3282) [2022-01-21 09:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][770/1251] eta 0:17:44 lr 0.000660 time 2.4962 (2.2137) loss 3.6096 (3.6255) grad_norm 1.1705 (1.3281) [2022-01-21 09:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][780/1251] eta 0:17:21 lr 0.000660 time 1.6109 (2.2119) loss 4.1431 (3.6261) grad_norm 1.4526 (1.3282) [2022-01-21 09:53:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][790/1251] eta 0:17:01 lr 0.000660 time 2.1710 (2.2148) loss 3.4835 (3.6281) grad_norm 1.2987 (1.3294) [2022-01-21 09:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][800/1251] eta 0:16:39 lr 0.000660 time 2.5143 (2.2158) loss 3.8586 (3.6279) grad_norm 1.2643 (1.3291) [2022-01-21 09:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][810/1251] eta 0:16:17 lr 0.000660 time 2.3643 (2.2164) loss 3.8811 (3.6310) grad_norm 1.4107 (1.3294) [2022-01-21 09:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][820/1251] eta 0:15:54 lr 0.000660 time 1.6546 (2.2156) loss 3.4385 (3.6303) grad_norm 1.2373 (1.3292) [2022-01-21 09:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][830/1251] eta 0:15:32 lr 0.000660 time 2.1544 (2.2155) loss 4.1379 (3.6300) grad_norm 1.1379 (1.3286) [2022-01-21 09:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][840/1251] eta 0:15:09 lr 0.000660 time 1.6653 (2.2130) loss 3.6047 (3.6279) grad_norm 1.3636 (1.3289) [2022-01-21 09:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][850/1251] eta 0:14:47 lr 0.000660 time 2.1467 (2.2125) loss 4.1306 (3.6269) grad_norm 1.3786 (1.3299) [2022-01-21 09:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][860/1251] eta 0:14:24 lr 0.000660 time 1.6809 (2.2116) loss 4.0502 
(3.6289) grad_norm 1.2786 (1.3310) [2022-01-21 09:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][870/1251] eta 0:14:04 lr 0.000659 time 2.7291 (2.2160) loss 3.7129 (3.6312) grad_norm 1.4795 (1.3315) [2022-01-21 09:56:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][880/1251] eta 0:13:41 lr 0.000659 time 1.6649 (2.2153) loss 3.3892 (3.6309) grad_norm 1.5276 (1.3322) [2022-01-21 09:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][890/1251] eta 0:13:19 lr 0.000659 time 2.1861 (2.2137) loss 3.4885 (3.6344) grad_norm 1.2373 (1.3318) [2022-01-21 09:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][900/1251] eta 0:12:56 lr 0.000659 time 1.9741 (2.2120) loss 2.7407 (3.6336) grad_norm 1.4969 (1.3321) [2022-01-21 09:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][910/1251] eta 0:12:34 lr 0.000659 time 2.2352 (2.2121) loss 3.8699 (3.6318) grad_norm 1.2185 (1.3317) [2022-01-21 09:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][920/1251] eta 0:12:11 lr 0.000659 time 2.0431 (2.2105) loss 3.7068 (3.6351) grad_norm 1.1087 (1.3311) [2022-01-21 09:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][930/1251] eta 0:11:49 lr 0.000659 time 1.8607 (2.2089) loss 3.5627 (3.6362) grad_norm 1.2877 (1.3310) [2022-01-21 09:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][940/1251] eta 0:11:26 lr 0.000659 time 1.6278 (2.2074) loss 3.2646 (3.6340) grad_norm 1.5727 (1.3336) [2022-01-21 09:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][950/1251] eta 0:11:04 lr 0.000659 time 2.6319 (2.2066) loss 3.1706 (3.6319) grad_norm 1.1664 (1.3344) [2022-01-21 09:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][960/1251] eta 0:10:41 lr 0.000659 time 1.8172 (2.2048) loss 4.1937 (3.6303) grad_norm 1.2620 (1.3333) [2022-01-21 09:59:41 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [119/300][970/1251] eta 0:10:19 lr 0.000659 time 1.5384 (2.2041) loss 3.6327 (3.6291) grad_norm 1.2018 (1.3325) [2022-01-21 10:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][980/1251] eta 0:09:57 lr 0.000659 time 2.2531 (2.2045) loss 3.8930 (3.6300) grad_norm 1.1872 (1.3315) [2022-01-21 10:00:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][990/1251] eta 0:09:36 lr 0.000659 time 2.7698 (2.2069) loss 3.8934 (3.6311) grad_norm 1.7134 (1.3323) [2022-01-21 10:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1000/1251] eta 0:09:13 lr 0.000659 time 2.2010 (2.2066) loss 3.7466 (3.6308) grad_norm 1.3500 (1.3325) [2022-01-21 10:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1010/1251] eta 0:08:52 lr 0.000659 time 1.6823 (2.2079) loss 4.1224 (3.6324) grad_norm 1.2021 (1.3323) [2022-01-21 10:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1020/1251] eta 0:08:30 lr 0.000659 time 2.3805 (2.2089) loss 3.4248 (3.6334) grad_norm 1.2096 (1.3326) [2022-01-21 10:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1030/1251] eta 0:08:08 lr 0.000659 time 3.2965 (2.2103) loss 4.0492 (3.6334) grad_norm 1.2174 (1.3320) [2022-01-21 10:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1040/1251] eta 0:07:46 lr 0.000659 time 2.4565 (2.2106) loss 4.0420 (3.6332) grad_norm 1.4676 (1.3314) [2022-01-21 10:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1050/1251] eta 0:07:24 lr 0.000659 time 1.8894 (2.2098) loss 4.1352 (3.6361) grad_norm 1.3767 (1.3319) [2022-01-21 10:03:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1060/1251] eta 0:07:01 lr 0.000659 time 2.5971 (2.2087) loss 4.2745 (3.6360) grad_norm 1.4739 (1.3324) [2022-01-21 10:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1070/1251] eta 0:06:39 lr 0.000659 time 2.8394 (2.2080) loss 
3.3348 (3.6337) grad_norm 1.3337 (1.3325) [2022-01-21 10:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1080/1251] eta 0:06:17 lr 0.000659 time 1.8950 (2.2081) loss 3.6354 (3.6341) grad_norm 1.6157 (1.3331) [2022-01-21 10:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1090/1251] eta 0:05:55 lr 0.000659 time 2.5849 (2.2085) loss 3.9734 (3.6368) grad_norm 1.4826 (1.3331) [2022-01-21 10:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1100/1251] eta 0:05:33 lr 0.000659 time 2.5372 (2.2080) loss 3.9436 (3.6385) grad_norm 1.2578 (1.3329) [2022-01-21 10:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1110/1251] eta 0:05:11 lr 0.000659 time 2.7511 (2.2074) loss 3.9534 (3.6391) grad_norm 1.3700 (1.3323) [2022-01-21 10:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1120/1251] eta 0:04:49 lr 0.000658 time 1.8977 (2.2068) loss 4.6588 (3.6406) grad_norm 1.2141 (1.3323) [2022-01-21 10:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1130/1251] eta 0:04:26 lr 0.000658 time 1.9107 (2.2049) loss 3.8876 (3.6418) grad_norm 1.2500 (1.3320) [2022-01-21 10:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1140/1251] eta 0:04:04 lr 0.000658 time 1.9596 (2.2048) loss 3.8759 (3.6423) grad_norm 1.3995 (1.3318) [2022-01-21 10:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1150/1251] eta 0:03:42 lr 0.000658 time 2.8218 (2.2055) loss 2.3835 (3.6423) grad_norm 1.2773 (1.3311) [2022-01-21 10:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1160/1251] eta 0:03:20 lr 0.000658 time 1.7942 (2.2055) loss 3.8531 (3.6436) grad_norm 1.2296 (1.3311) [2022-01-21 10:07:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1170/1251] eta 0:02:58 lr 0.000658 time 2.0997 (2.2047) loss 2.9366 (3.6444) grad_norm 1.2011 (1.3313) [2022-01-21 10:07:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1180/1251] eta 0:02:36 lr 0.000658 time 2.6329 (2.2054) loss 4.0639 (3.6448) grad_norm 1.1289 (1.3310) [2022-01-21 10:07:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1190/1251] eta 0:02:14 lr 0.000658 time 2.6786 (2.2066) loss 4.6634 (3.6450) grad_norm 1.4376 (1.3312) [2022-01-21 10:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1200/1251] eta 0:01:52 lr 0.000658 time 2.2501 (2.2060) loss 4.2480 (3.6454) grad_norm 1.4164 (1.3308) [2022-01-21 10:08:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1210/1251] eta 0:01:30 lr 0.000658 time 1.9561 (2.2045) loss 4.1423 (3.6483) grad_norm 1.3879 (1.3307) [2022-01-21 10:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1220/1251] eta 0:01:08 lr 0.000658 time 2.2403 (2.2035) loss 2.9265 (3.6461) grad_norm 1.3236 (1.3304) [2022-01-21 10:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1230/1251] eta 0:00:46 lr 0.000658 time 1.7460 (2.2034) loss 4.3236 (3.6450) grad_norm 1.2665 (1.3300) [2022-01-21 10:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1240/1251] eta 0:00:24 lr 0.000658 time 1.9823 (2.2029) loss 3.8517 (3.6451) grad_norm 1.3899 (1.3297) [2022-01-21 10:09:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [119/300][1250/1251] eta 0:00:02 lr 0.000658 time 1.1969 (2.1974) loss 3.2990 (3.6455) grad_norm 1.3605 (1.3304) [2022-01-21 10:09:50 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 119 training takes 0:45:49 [2022-01-21 10:10:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.587 (18.587) Loss 1.0913 (1.0913) Acc@1 74.414 (74.414) Acc@5 92.676 (92.676) [2022-01-21 10:10:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.921 (3.541) Loss 1.1365 (1.0810) Acc@1 72.461 (74.609) Acc@5 92.090 (92.516) [2022-01-21 10:10:45 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.618 (2.586) Loss 1.0825 (1.0873) Acc@1 75.195 (74.577) Acc@5 92.090 (92.420) [2022-01-21 10:11:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.891 (2.317) Loss 1.1022 (1.0924) Acc@1 74.414 (74.436) Acc@5 92.578 (92.392) [2022-01-21 10:11:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.722 (2.237) Loss 1.0333 (1.0861) Acc@1 75.098 (74.464) Acc@5 94.336 (92.540) [2022-01-21 10:11:29 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.502 Acc@5 92.466 [2022-01-21 10:11:29 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.5% [2022-01-21 10:11:29 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.55% [2022-01-21 10:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][0/1251] eta 7:28:15 lr 0.000658 time 21.4995 (21.4995) loss 3.0650 (3.0650) grad_norm 1.2852 (1.2852) [2022-01-21 10:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][10/1251] eta 1:23:51 lr 0.000658 time 1.5101 (4.0544) loss 3.0806 (3.4787) grad_norm 1.2243 (1.3707) [2022-01-21 10:12:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][20/1251] eta 1:05:16 lr 0.000658 time 2.1386 (3.1816) loss 3.8310 (3.6137) grad_norm 1.6305 (1.3678) [2022-01-21 10:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][30/1251] eta 0:56:46 lr 0.000658 time 1.4792 (2.7903) loss 3.8554 (3.6732) grad_norm 1.3331 (1.3516) [2022-01-21 10:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][40/1251] eta 0:55:02 lr 0.000658 time 3.9319 (2.7269) loss 3.0725 (3.6983) grad_norm 1.3515 (1.3495) [2022-01-21 10:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][50/1251] eta 0:52:29 lr 0.000658 time 1.8823 (2.6221) loss 2.5843 (3.7034) grad_norm 1.4340 (1.3411) [2022-01-21 10:14:04 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [120/300][60/1251] eta 0:50:28 lr 0.000658 time 1.8258 (2.5427) loss 4.3176 (3.6835) grad_norm 1.2876 (1.3479) [2022-01-21 10:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][70/1251] eta 0:49:07 lr 0.000658 time 1.8198 (2.4962) loss 3.6891 (3.6570) grad_norm 1.2795 (1.3536) [2022-01-21 10:14:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][80/1251] eta 0:48:12 lr 0.000658 time 3.8165 (2.4697) loss 3.9696 (3.6343) grad_norm 1.6585 (1.3553) [2022-01-21 10:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][90/1251] eta 0:47:18 lr 0.000658 time 2.5822 (2.4451) loss 3.5207 (3.6377) grad_norm 1.2750 (1.3533) [2022-01-21 10:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][100/1251] eta 0:46:19 lr 0.000658 time 2.2783 (2.4146) loss 3.9835 (3.6456) grad_norm 1.2971 (1.3633) [2022-01-21 10:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][110/1251] eta 0:45:24 lr 0.000658 time 1.6386 (2.3881) loss 3.5693 (3.6629) grad_norm 1.1955 (1.3714) [2022-01-21 10:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][120/1251] eta 0:44:44 lr 0.000657 time 3.4649 (2.3733) loss 4.0382 (3.6415) grad_norm 1.4584 (1.3714) [2022-01-21 10:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][130/1251] eta 0:44:14 lr 0.000657 time 3.1377 (2.3684) loss 3.8232 (3.6431) grad_norm 1.5224 (1.3689) [2022-01-21 10:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][140/1251] eta 0:43:40 lr 0.000657 time 1.5375 (2.3585) loss 3.2697 (3.6292) grad_norm 1.5815 (1.3689) [2022-01-21 10:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][150/1251] eta 0:43:07 lr 0.000657 time 1.8015 (2.3498) loss 3.7350 (3.6249) grad_norm 1.4163 (1.3752) [2022-01-21 10:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][160/1251] eta 0:42:38 lr 0.000657 time 3.5209 (2.3449) loss 3.7751 (3.6330) 
grad_norm 1.1124 (1.3707) [2022-01-21 10:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][170/1251] eta 0:41:59 lr 0.000657 time 2.1560 (2.3310) loss 4.0332 (3.6309) grad_norm 1.2055 (1.3677) [2022-01-21 10:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][180/1251] eta 0:41:12 lr 0.000657 time 1.5548 (2.3082) loss 4.4303 (3.6358) grad_norm 1.3737 (1.3629) [2022-01-21 10:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][190/1251] eta 0:40:38 lr 0.000657 time 2.1313 (2.2983) loss 4.1696 (3.6315) grad_norm 1.4767 (1.3632) [2022-01-21 10:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][200/1251] eta 0:40:17 lr 0.000657 time 3.0234 (2.2999) loss 4.1183 (3.6412) grad_norm 1.2233 (1.3625) [2022-01-21 10:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][210/1251] eta 0:39:45 lr 0.000657 time 1.8217 (2.2920) loss 3.4320 (3.6382) grad_norm 1.1737 (1.3580) [2022-01-21 10:19:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][220/1251] eta 0:39:17 lr 0.000657 time 2.0309 (2.2865) loss 4.2511 (3.6355) grad_norm 1.1761 (1.3570) [2022-01-21 10:20:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][230/1251] eta 0:38:47 lr 0.000657 time 1.6526 (2.2799) loss 4.0083 (3.6435) grad_norm 1.4785 (1.3555) [2022-01-21 10:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][240/1251] eta 0:38:15 lr 0.000657 time 2.4959 (2.2710) loss 4.1550 (3.6408) grad_norm 1.3262 (1.3560) [2022-01-21 10:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][250/1251] eta 0:37:47 lr 0.000657 time 2.4105 (2.2656) loss 3.0759 (3.6443) grad_norm 1.3522 (1.3555) [2022-01-21 10:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][260/1251] eta 0:37:21 lr 0.000657 time 1.8920 (2.2615) loss 3.2496 (3.6470) grad_norm 1.2159 (1.3548) [2022-01-21 10:21:42 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [120/300][270/1251] eta 0:36:56 lr 0.000657 time 2.0729 (2.2597) loss 2.9563 (3.6545) grad_norm 1.3311 (1.3548) [2022-01-21 10:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][280/1251] eta 0:36:33 lr 0.000657 time 3.1048 (2.2590) loss 2.8992 (3.6563) grad_norm 1.2134 (1.3534) [2022-01-21 10:22:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][290/1251] eta 0:36:07 lr 0.000657 time 1.7263 (2.2555) loss 3.9203 (3.6572) grad_norm 1.7367 (1.3533) [2022-01-21 10:22:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][300/1251] eta 0:35:43 lr 0.000657 time 1.8447 (2.2539) loss 2.9060 (3.6590) grad_norm 1.3240 (1.3531) [2022-01-21 10:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][310/1251] eta 0:35:18 lr 0.000657 time 2.2021 (2.2518) loss 4.0146 (3.6625) grad_norm 1.6640 (1.3539) [2022-01-21 10:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][320/1251] eta 0:34:59 lr 0.000657 time 2.6592 (2.2550) loss 4.0218 (3.6615) grad_norm 1.7309 (1.3545) [2022-01-21 10:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][330/1251] eta 0:34:35 lr 0.000657 time 2.8263 (2.2531) loss 4.1097 (3.6649) grad_norm 1.3003 (1.3563) [2022-01-21 10:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][340/1251] eta 0:34:11 lr 0.000657 time 1.9272 (2.2524) loss 2.7884 (3.6627) grad_norm 1.2722 (1.3556) [2022-01-21 10:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][350/1251] eta 0:33:48 lr 0.000657 time 1.7286 (2.2511) loss 4.1939 (3.6638) grad_norm 1.1865 (1.3551) [2022-01-21 10:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][360/1251] eta 0:33:21 lr 0.000657 time 2.1587 (2.2468) loss 4.1087 (3.6619) grad_norm 1.4219 (1.3557) [2022-01-21 10:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][370/1251] eta 0:32:54 lr 0.000657 time 2.2305 (2.2409) loss 3.1246 (3.6608) 
grad_norm 1.3559 (1.3568) [2022-01-21 10:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][380/1251] eta 0:32:27 lr 0.000656 time 1.8254 (2.2358) loss 4.0465 (3.6691) grad_norm 1.1847 (1.3564) [2022-01-21 10:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][390/1251] eta 0:32:02 lr 0.000656 time 2.6079 (2.2325) loss 2.6327 (3.6654) grad_norm 1.3351 (1.3547) [2022-01-21 10:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][400/1251] eta 0:31:37 lr 0.000656 time 2.1091 (2.2295) loss 2.3843 (3.6685) grad_norm 1.4652 (1.3554) [2022-01-21 10:26:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][410/1251] eta 0:31:15 lr 0.000656 time 2.6428 (2.2295) loss 3.1625 (3.6642) grad_norm 1.8143 (1.3571) [2022-01-21 10:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][420/1251] eta 0:30:55 lr 0.000656 time 2.1541 (2.2331) loss 4.4166 (3.6648) grad_norm 1.1507 (1.3583) [2022-01-21 10:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][430/1251] eta 0:30:36 lr 0.000656 time 2.5495 (2.2375) loss 3.6897 (3.6658) grad_norm 1.3581 (1.3576) [2022-01-21 10:27:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][440/1251] eta 0:30:12 lr 0.000656 time 2.0245 (2.2354) loss 3.9085 (3.6679) grad_norm 1.4243 (1.3568) [2022-01-21 10:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][450/1251] eta 0:29:49 lr 0.000656 time 1.6225 (2.2345) loss 3.2879 (3.6703) grad_norm 1.3837 (1.3542) [2022-01-21 10:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][460/1251] eta 0:29:26 lr 0.000656 time 1.8993 (2.2336) loss 3.3169 (3.6671) grad_norm 1.4024 (1.3531) [2022-01-21 10:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][470/1251] eta 0:29:01 lr 0.000656 time 1.7528 (2.2294) loss 3.5400 (3.6661) grad_norm 1.2321 (1.3524) [2022-01-21 10:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [120/300][480/1251] eta 0:28:36 lr 0.000656 time 2.2965 (2.2259) loss 4.5245 (3.6676) grad_norm 1.3373 (1.3527) [2022-01-21 10:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][490/1251] eta 0:28:12 lr 0.000656 time 1.8398 (2.2240) loss 4.1142 (3.6661) grad_norm 1.4537 (1.3518) [2022-01-21 10:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][500/1251] eta 0:27:53 lr 0.000656 time 2.1496 (2.2280) loss 3.2593 (3.6663) grad_norm 1.0770 (1.3521) [2022-01-21 10:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][510/1251] eta 0:27:29 lr 0.000656 time 1.5707 (2.2266) loss 3.9388 (3.6669) grad_norm 1.2214 (1.3517) [2022-01-21 10:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][520/1251] eta 0:27:06 lr 0.000656 time 2.1997 (2.2254) loss 3.8870 (3.6660) grad_norm 1.2519 (1.3513) [2022-01-21 10:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][530/1251] eta 0:26:45 lr 0.000656 time 2.4206 (2.2273) loss 2.7927 (3.6634) grad_norm 1.3571 (1.3519) [2022-01-21 10:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][540/1251] eta 0:26:24 lr 0.000656 time 1.8776 (2.2285) loss 4.1238 (3.6669) grad_norm 1.4002 (1.3534) [2022-01-21 10:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][550/1251] eta 0:25:58 lr 0.000656 time 2.1868 (2.2227) loss 3.7255 (3.6693) grad_norm 1.2925 (1.3526) [2022-01-21 10:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][560/1251] eta 0:25:33 lr 0.000656 time 2.1299 (2.2192) loss 2.9921 (3.6667) grad_norm 1.3755 (1.3527) [2022-01-21 10:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][570/1251] eta 0:25:13 lr 0.000656 time 2.5854 (2.2227) loss 3.8241 (3.6693) grad_norm 1.5395 (1.3522) [2022-01-21 10:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][580/1251] eta 0:24:51 lr 0.000656 time 2.1098 (2.2230) loss 3.4349 (3.6707) 
grad_norm 1.2539 (1.3517) [2022-01-21 10:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][590/1251] eta 0:24:27 lr 0.000656 time 2.1339 (2.2200) loss 2.7839 (3.6661) grad_norm 1.2079 (1.3506) [2022-01-21 10:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][600/1251] eta 0:24:01 lr 0.000656 time 2.1481 (2.2145) loss 3.7872 (3.6674) grad_norm 1.2190 (1.3510) [2022-01-21 10:34:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][610/1251] eta 0:23:38 lr 0.000656 time 2.0744 (2.2123) loss 3.9665 (3.6680) grad_norm 1.1925 (1.3523) [2022-01-21 10:34:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][620/1251] eta 0:23:17 lr 0.000656 time 1.8285 (2.2143) loss 3.6448 (3.6674) grad_norm 1.4467 (1.3535) [2022-01-21 10:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][630/1251] eta 0:22:54 lr 0.000655 time 2.1804 (2.2130) loss 3.0244 (3.6684) grad_norm 1.2250 (1.3541) [2022-01-21 10:35:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][640/1251] eta 0:22:34 lr 0.000655 time 2.8329 (2.2162) loss 3.6230 (3.6649) grad_norm 1.1411 (1.3533) [2022-01-21 10:35:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][650/1251] eta 0:22:13 lr 0.000655 time 3.3594 (2.2182) loss 4.1825 (3.6613) grad_norm 1.5674 (1.3533) [2022-01-21 10:35:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][660/1251] eta 0:21:51 lr 0.000655 time 1.6691 (2.2197) loss 4.0961 (3.6611) grad_norm 1.2062 (1.3540) [2022-01-21 10:36:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][670/1251] eta 0:21:28 lr 0.000655 time 1.8929 (2.2183) loss 4.4877 (3.6606) grad_norm 1.4790 (1.3533) [2022-01-21 10:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][680/1251] eta 0:21:04 lr 0.000655 time 2.0135 (2.2148) loss 3.7626 (3.6619) grad_norm 1.2589 (1.3528) [2022-01-21 10:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [120/300][690/1251] eta 0:20:40 lr 0.000655 time 1.7903 (2.2119) loss 4.3685 (3.6630) grad_norm 1.2914 (1.3516) [2022-01-21 10:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][700/1251] eta 0:20:17 lr 0.000655 time 1.7704 (2.2102) loss 2.7232 (3.6631) grad_norm 1.3061 (1.3511) [2022-01-21 10:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][710/1251] eta 0:19:55 lr 0.000655 time 2.0818 (2.2097) loss 3.6555 (3.6638) grad_norm 1.3691 (1.3501) [2022-01-21 10:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][720/1251] eta 0:19:33 lr 0.000655 time 2.1575 (2.2094) loss 3.3169 (3.6624) grad_norm 1.4758 (1.3496) [2022-01-21 10:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][730/1251] eta 0:19:11 lr 0.000655 time 1.9953 (2.2095) loss 4.2296 (3.6649) grad_norm 1.4394 (1.3490) [2022-01-21 10:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][740/1251] eta 0:18:48 lr 0.000655 time 1.8635 (2.2087) loss 3.8012 (3.6607) grad_norm 1.2995 (1.3492) [2022-01-21 10:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][750/1251] eta 0:18:25 lr 0.000655 time 2.2927 (2.2074) loss 4.5236 (3.6630) grad_norm 1.3843 (1.3496) [2022-01-21 10:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][760/1251] eta 0:18:03 lr 0.000655 time 2.1368 (2.2067) loss 3.0866 (3.6618) grad_norm 1.2845 (1.3497) [2022-01-21 10:39:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][770/1251] eta 0:17:41 lr 0.000655 time 1.9642 (2.2068) loss 3.7161 (3.6636) grad_norm 1.3647 (1.3500) [2022-01-21 10:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][780/1251] eta 0:17:19 lr 0.000655 time 1.7333 (2.2063) loss 3.8897 (3.6638) grad_norm 1.4599 (1.3503) [2022-01-21 10:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][790/1251] eta 0:16:57 lr 0.000655 time 1.9720 (2.2072) loss 2.6364 (3.6619) 
grad_norm 1.3914 (1.3513) [2022-01-21 10:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][800/1251] eta 0:16:36 lr 0.000655 time 1.9816 (2.2097) loss 3.0033 (3.6572) grad_norm 1.2467 (1.3518) [2022-01-21 10:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][810/1251] eta 0:16:13 lr 0.000655 time 1.6967 (2.2085) loss 3.8731 (3.6562) grad_norm 1.2160 (1.3529) [2022-01-21 10:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][820/1251] eta 0:15:52 lr 0.000655 time 1.9521 (2.2092) loss 4.0838 (3.6588) grad_norm 1.3600 (1.3537) [2022-01-21 10:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][830/1251] eta 0:15:29 lr 0.000655 time 1.5988 (2.2072) loss 4.4233 (3.6646) grad_norm 1.2879 (1.3531) [2022-01-21 10:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][840/1251] eta 0:15:06 lr 0.000655 time 1.6350 (2.2060) loss 2.9417 (3.6625) grad_norm 1.1938 (1.3524) [2022-01-21 10:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][850/1251] eta 0:14:43 lr 0.000655 time 2.2332 (2.2034) loss 2.8965 (3.6623) grad_norm 1.3524 (1.3523) [2022-01-21 10:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][860/1251] eta 0:14:21 lr 0.000655 time 1.8809 (2.2038) loss 4.1421 (3.6641) grad_norm 1.3228 (1.3524) [2022-01-21 10:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][870/1251] eta 0:13:59 lr 0.000655 time 1.5574 (2.2026) loss 2.9455 (3.6640) grad_norm 1.2945 (1.3521) [2022-01-21 10:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][880/1251] eta 0:13:36 lr 0.000654 time 1.9636 (2.2016) loss 3.5119 (3.6643) grad_norm 1.2879 (1.3525) [2022-01-21 10:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][890/1251] eta 0:13:15 lr 0.000654 time 2.0828 (2.2027) loss 2.6579 (3.6644) grad_norm 1.4209 (1.3523) [2022-01-21 10:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [120/300][900/1251] eta 0:12:55 lr 0.000654 time 1.8115 (2.2085) loss 3.5397 (3.6636) grad_norm 1.2202 (1.3519) [2022-01-21 10:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][910/1251] eta 0:12:33 lr 0.000654 time 1.9062 (2.2093) loss 4.1573 (3.6647) grad_norm 1.3886 (1.3517) [2022-01-21 10:45:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][920/1251] eta 0:12:11 lr 0.000654 time 2.0031 (2.2086) loss 2.7032 (3.6628) grad_norm 1.3602 (1.3519) [2022-01-21 10:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][930/1251] eta 0:11:48 lr 0.000654 time 1.5513 (2.2066) loss 3.8974 (3.6612) grad_norm 1.4697 (1.3515) [2022-01-21 10:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][940/1251] eta 0:11:25 lr 0.000654 time 1.9830 (2.2045) loss 3.8137 (3.6634) grad_norm 1.2648 (1.3506) [2022-01-21 10:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][950/1251] eta 0:11:03 lr 0.000654 time 2.2912 (2.2040) loss 3.2885 (3.6649) grad_norm 1.2727 (1.3515) [2022-01-21 10:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][960/1251] eta 0:10:41 lr 0.000654 time 2.1877 (2.2042) loss 2.5692 (3.6648) grad_norm 1.2388 (1.3516) [2022-01-21 10:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][970/1251] eta 0:10:19 lr 0.000654 time 1.8569 (2.2050) loss 3.8485 (3.6648) grad_norm 1.3364 (1.3521) [2022-01-21 10:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][980/1251] eta 0:09:57 lr 0.000654 time 2.2103 (2.2047) loss 3.7765 (3.6663) grad_norm 1.1952 (1.3513) [2022-01-21 10:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][990/1251] eta 0:09:35 lr 0.000654 time 2.1456 (2.2055) loss 4.4855 (3.6671) grad_norm 1.4854 (1.3524) [2022-01-21 10:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1000/1251] eta 0:09:13 lr 0.000654 time 2.2494 (2.2059) loss 3.1855 (3.6679) 
grad_norm 1.6649 (1.3525) [2022-01-21 10:48:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1010/1251] eta 0:08:51 lr 0.000654 time 1.5579 (2.2054) loss 3.4836 (3.6661) grad_norm 1.1420 (1.3526) [2022-01-21 10:49:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1020/1251] eta 0:08:29 lr 0.000654 time 2.1804 (2.2043) loss 3.9272 (3.6635) grad_norm 1.3804 (1.3526) [2022-01-21 10:49:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1030/1251] eta 0:08:06 lr 0.000654 time 1.9002 (2.2028) loss 4.1479 (3.6653) grad_norm 1.2937 (1.3519) [2022-01-21 10:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1040/1251] eta 0:07:44 lr 0.000654 time 2.4230 (2.2024) loss 3.7179 (3.6628) grad_norm 1.4023 (1.3519) [2022-01-21 10:50:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1050/1251] eta 0:07:22 lr 0.000654 time 1.8609 (2.2016) loss 3.5960 (3.6641) grad_norm 1.3041 (1.3517) [2022-01-21 10:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1060/1251] eta 0:07:00 lr 0.000654 time 2.2795 (2.2016) loss 2.9646 (3.6660) grad_norm 1.6877 (1.3524) [2022-01-21 10:50:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1070/1251] eta 0:06:38 lr 0.000654 time 1.5644 (2.2027) loss 4.0529 (3.6667) grad_norm 1.2402 (1.3520) [2022-01-21 10:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1080/1251] eta 0:06:16 lr 0.000654 time 2.7892 (2.2033) loss 3.5394 (3.6646) grad_norm 1.2534 (1.3518) [2022-01-21 10:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1090/1251] eta 0:05:54 lr 0.000654 time 1.9650 (2.2026) loss 4.0875 (3.6652) grad_norm 1.3653 (1.3513) [2022-01-21 10:51:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1100/1251] eta 0:05:32 lr 0.000654 time 1.9780 (2.2023) loss 3.9758 (3.6656) grad_norm 1.2958 (1.3502) [2022-01-21 10:52:16 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [120/300][1110/1251] eta 0:05:10 lr 0.000654 time 1.7866 (2.2023) loss 3.9959 (3.6644) grad_norm 1.1985 (1.3496) [2022-01-21 10:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1120/1251] eta 0:04:48 lr 0.000654 time 2.4757 (2.2019) loss 3.2132 (3.6618) grad_norm 1.1793 (1.3493) [2022-01-21 10:52:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1130/1251] eta 0:04:26 lr 0.000654 time 2.2032 (2.2003) loss 3.4295 (3.6624) grad_norm 1.2700 (1.3491) [2022-01-21 10:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1140/1251] eta 0:04:04 lr 0.000653 time 1.6658 (2.1993) loss 3.5592 (3.6624) grad_norm 1.2767 (1.3483) [2022-01-21 10:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1150/1251] eta 0:03:42 lr 0.000653 time 1.9221 (2.2003) loss 3.5846 (3.6623) grad_norm 1.2278 (1.3478) [2022-01-21 10:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1160/1251] eta 0:03:20 lr 0.000653 time 2.4210 (2.2021) loss 3.0362 (3.6619) grad_norm 1.4228 (1.3475) [2022-01-21 10:54:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1170/1251] eta 0:02:58 lr 0.000653 time 3.1459 (2.2040) loss 3.8438 (3.6618) grad_norm 1.1438 (1.3469) [2022-01-21 10:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1180/1251] eta 0:02:36 lr 0.000653 time 1.7979 (2.2039) loss 2.5427 (3.6613) grad_norm 1.3099 (1.3473) [2022-01-21 10:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1190/1251] eta 0:02:14 lr 0.000653 time 1.7833 (2.2033) loss 4.3048 (3.6583) grad_norm 1.1762 (1.3474) [2022-01-21 10:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1200/1251] eta 0:01:52 lr 0.000653 time 1.5745 (2.2014) loss 4.3436 (3.6592) grad_norm 1.2756 (1.3474) [2022-01-21 10:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1210/1251] eta 0:01:30 lr 0.000653 time 1.6338 (2.1987) loss 
3.5671 (3.6612) grad_norm 1.3803 (1.3477) [2022-01-21 10:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1220/1251] eta 0:01:08 lr 0.000653 time 1.8598 (2.1971) loss 4.0191 (3.6614) grad_norm 1.2621 (1.3474) [2022-01-21 10:56:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1230/1251] eta 0:00:46 lr 0.000653 time 2.3457 (2.1961) loss 4.1682 (3.6632) grad_norm 1.2068 (1.3477) [2022-01-21 10:56:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1240/1251] eta 0:00:24 lr 0.000653 time 1.5427 (2.1977) loss 4.3632 (3.6628) grad_norm 1.4153 (1.3478) [2022-01-21 10:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [120/300][1250/1251] eta 0:00:02 lr 0.000653 time 1.1840 (2.1922) loss 2.3908 (3.6619) grad_norm 1.2903 (1.3476) [2022-01-21 10:57:12 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 120 training takes 0:45:42 [2022-01-21 10:57:12 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_120 saving...... [2022-01-21 10:57:23 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_120 saved !!! 
[2022-01-21 10:57:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.723 (16.723) Loss 1.0924 (1.0924) Acc@1 74.316 (74.316) Acc@5 93.164 (93.164) [2022-01-21 10:57:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.400 (2.892) Loss 1.0867 (1.0646) Acc@1 74.512 (74.769) Acc@5 92.383 (92.694) [2022-01-21 10:58:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.649 (2.397) Loss 1.0385 (1.0747) Acc@1 75.391 (74.595) Acc@5 93.848 (92.680) [2022-01-21 10:58:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.925 (2.073) Loss 1.0164 (1.0663) Acc@1 75.977 (74.767) Acc@5 93.164 (92.736) [2022-01-21 10:58:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.315 (1.932) Loss 1.0824 (1.0658) Acc@1 73.145 (74.869) Acc@5 91.699 (92.635) [2022-01-21 10:58:50 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.952 Acc@5 92.696 [2022-01-21 10:58:50 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-01-21 10:58:50 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.95% [2022-01-21 10:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][0/1251] eta 7:39:54 lr 0.000653 time 22.0581 (22.0581) loss 3.1582 (3.1582) grad_norm 1.2712 (1.2712) [2022-01-21 10:59:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][10/1251] eta 1:22:33 lr 0.000653 time 2.1256 (3.9914) loss 3.9637 (3.7630) grad_norm 1.2706 (1.3200) [2022-01-21 10:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][20/1251] eta 1:06:02 lr 0.000653 time 2.7280 (3.2193) loss 3.9857 (3.7534) grad_norm 1.1258 (1.3259) [2022-01-21 11:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][30/1251] eta 0:59:42 lr 0.000653 time 1.8942 (2.9343) loss 3.9605 (3.8079) grad_norm 1.7471 (1.3605) [2022-01-21 11:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[121/300][40/1251] eta 0:56:12 lr 0.000653 time 3.2709 (2.7848) loss 3.8423 (3.7345) grad_norm 1.3544 (1.3652) [2022-01-21 11:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][50/1251] eta 0:53:37 lr 0.000653 time 1.9580 (2.6791) loss 2.8108 (3.6865) grad_norm 1.5400 (1.3565) [2022-01-21 11:01:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][60/1251] eta 0:50:58 lr 0.000653 time 1.6884 (2.5679) loss 3.3440 (3.6553) grad_norm 1.2285 (1.3542) [2022-01-21 11:01:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][70/1251] eta 0:49:17 lr 0.000653 time 2.0721 (2.5046) loss 2.7684 (3.6542) grad_norm 1.6829 (1.3596) [2022-01-21 11:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][80/1251] eta 0:47:55 lr 0.000653 time 2.8399 (2.4554) loss 2.7728 (3.6478) grad_norm 1.3937 (1.3576) [2022-01-21 11:02:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][90/1251] eta 0:46:49 lr 0.000653 time 1.9513 (2.4201) loss 3.9256 (3.6333) grad_norm 1.5137 (1.3620) [2022-01-21 11:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][100/1251] eta 0:46:02 lr 0.000653 time 1.9388 (2.3998) loss 3.1597 (3.6217) grad_norm 1.2900 (1.3635) [2022-01-21 11:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][110/1251] eta 0:45:13 lr 0.000653 time 1.8586 (2.3781) loss 3.6976 (3.6011) grad_norm 1.3930 (1.3670) [2022-01-21 11:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][120/1251] eta 0:44:52 lr 0.000653 time 3.1171 (2.3808) loss 4.2592 (3.6009) grad_norm 1.5008 (1.3695) [2022-01-21 11:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][130/1251] eta 0:44:12 lr 0.000653 time 2.4216 (2.3661) loss 3.7849 (3.5952) grad_norm 1.3888 (1.3664) [2022-01-21 11:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][140/1251] eta 0:43:30 lr 0.000652 time 1.6882 (2.3500) loss 3.6574 (3.5912) grad_norm 1.1947 
(1.3646) [2022-01-21 11:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][150/1251] eta 0:42:55 lr 0.000652 time 1.9227 (2.3396) loss 4.0500 (3.6060) grad_norm 1.3610 (1.3586) [2022-01-21 11:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][160/1251] eta 0:42:23 lr 0.000652 time 3.1196 (2.3310) loss 2.7156 (3.5839) grad_norm 1.3499 (1.3614) [2022-01-21 11:05:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][170/1251] eta 0:41:51 lr 0.000652 time 2.3883 (2.3238) loss 3.0411 (3.5863) grad_norm 1.3156 (1.3604) [2022-01-21 11:05:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][180/1251] eta 0:41:23 lr 0.000652 time 2.1407 (2.3188) loss 3.4002 (3.5837) grad_norm 1.3708 (1.3589) [2022-01-21 11:06:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][190/1251] eta 0:40:57 lr 0.000652 time 2.1282 (2.3165) loss 2.5486 (3.5860) grad_norm 1.2105 (1.3616) [2022-01-21 11:06:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][200/1251] eta 0:40:26 lr 0.000652 time 2.7419 (2.3084) loss 3.5428 (3.5955) grad_norm 1.2021 (1.3646) [2022-01-21 11:06:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][210/1251] eta 0:39:52 lr 0.000652 time 1.8748 (2.2985) loss 3.2193 (3.5832) grad_norm 1.2156 (1.3586) [2022-01-21 11:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][220/1251] eta 0:39:20 lr 0.000652 time 1.7995 (2.2898) loss 4.3268 (3.5881) grad_norm 1.4755 (1.3569) [2022-01-21 11:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][230/1251] eta 0:38:47 lr 0.000652 time 1.6897 (2.2799) loss 4.1406 (3.6020) grad_norm 1.3664 (1.3572) [2022-01-21 11:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][240/1251] eta 0:38:18 lr 0.000652 time 2.8400 (2.2739) loss 3.9903 (3.6054) grad_norm 1.0642 (1.3598) [2022-01-21 11:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[121/300][250/1251] eta 0:37:48 lr 0.000652 time 2.3279 (2.2666) loss 4.2847 (3.6046) grad_norm 1.3669 (1.3591) [2022-01-21 11:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][260/1251] eta 0:37:19 lr 0.000652 time 1.6229 (2.2595) loss 4.3565 (3.6078) grad_norm 1.3838 (1.3569) [2022-01-21 11:09:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][270/1251] eta 0:36:55 lr 0.000652 time 2.2276 (2.2588) loss 3.7586 (3.6124) grad_norm 1.3884 (1.3535) [2022-01-21 11:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][280/1251] eta 0:36:39 lr 0.000652 time 3.1579 (2.2650) loss 4.1712 (3.6086) grad_norm 1.2986 (1.3528) [2022-01-21 11:09:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][290/1251] eta 0:36:17 lr 0.000652 time 2.3966 (2.2655) loss 4.0540 (3.5968) grad_norm 1.4571 (1.3536) [2022-01-21 11:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][300/1251] eta 0:35:54 lr 0.000652 time 2.2767 (2.2653) loss 3.4197 (3.5985) grad_norm 1.2028 (1.3532) [2022-01-21 11:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][310/1251] eta 0:35:29 lr 0.000652 time 1.9838 (2.2632) loss 4.4353 (3.5978) grad_norm 1.4498 (1.3511) [2022-01-21 11:10:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][320/1251] eta 0:35:03 lr 0.000652 time 3.2519 (2.2590) loss 3.5702 (3.5990) grad_norm 1.3552 (1.3499) [2022-01-21 11:11:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][330/1251] eta 0:34:32 lr 0.000652 time 1.8513 (2.2503) loss 3.4768 (3.5949) grad_norm 1.3604 (1.3507) [2022-01-21 11:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][340/1251] eta 0:34:06 lr 0.000652 time 1.9203 (2.2460) loss 2.9343 (3.5940) grad_norm 1.1521 (1.3496) [2022-01-21 11:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][350/1251] eta 0:33:43 lr 0.000652 time 2.5163 (2.2459) loss 2.8041 (3.5947) grad_norm 
1.2505 (1.3492) [2022-01-21 11:12:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][360/1251] eta 0:33:24 lr 0.000652 time 2.8147 (2.2496) loss 3.9897 (3.5972) grad_norm 1.1480 (1.3475) [2022-01-21 11:12:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][370/1251] eta 0:33:05 lr 0.000652 time 3.0589 (2.2536) loss 4.2509 (3.6005) grad_norm 1.2463 (1.3481) [2022-01-21 11:13:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][380/1251] eta 0:32:39 lr 0.000652 time 2.0431 (2.2494) loss 3.8098 (3.6059) grad_norm 1.1759 (1.3467) [2022-01-21 11:13:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][390/1251] eta 0:32:15 lr 0.000651 time 3.0958 (2.2475) loss 3.0284 (3.6094) grad_norm 1.5005 (1.3485) [2022-01-21 11:13:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][400/1251] eta 0:31:48 lr 0.000651 time 1.8608 (2.2431) loss 3.5499 (3.6040) grad_norm 1.1970 (1.3520) [2022-01-21 11:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][410/1251] eta 0:31:23 lr 0.000651 time 2.2550 (2.2399) loss 3.9436 (3.6103) grad_norm 1.4867 (1.3508) [2022-01-21 11:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][420/1251] eta 0:31:00 lr 0.000651 time 1.7107 (2.2389) loss 3.4617 (3.6164) grad_norm 1.2092 (1.3499) [2022-01-21 11:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][430/1251] eta 0:30:39 lr 0.000651 time 2.6792 (2.2403) loss 3.9282 (3.6192) grad_norm 1.3918 (1.3479) [2022-01-21 11:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][440/1251] eta 0:30:13 lr 0.000651 time 1.9029 (2.2359) loss 3.3882 (3.6168) grad_norm 1.2552 (1.3469) [2022-01-21 11:15:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][450/1251] eta 0:29:48 lr 0.000651 time 2.5293 (2.2330) loss 4.3048 (3.6202) grad_norm 1.2596 (1.3471) [2022-01-21 11:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[121/300][460/1251] eta 0:29:25 lr 0.000651 time 2.4957 (2.2317) loss 3.5135 (3.6185) grad_norm 1.5215 (1.3481) [2022-01-21 11:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][470/1251] eta 0:29:01 lr 0.000651 time 2.1484 (2.2304) loss 4.3043 (3.6223) grad_norm 1.2211 (1.3469) [2022-01-21 11:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][480/1251] eta 0:28:39 lr 0.000651 time 1.8830 (2.2299) loss 4.0005 (3.6250) grad_norm 1.3905 (1.3457) [2022-01-21 11:17:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][490/1251] eta 0:28:17 lr 0.000651 time 2.5517 (2.2307) loss 3.9922 (3.6274) grad_norm 1.0470 (1.3450) [2022-01-21 11:17:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][500/1251] eta 0:27:55 lr 0.000651 time 1.8415 (2.2305) loss 4.0137 (3.6260) grad_norm 1.2922 (1.3446) [2022-01-21 11:17:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][510/1251] eta 0:27:33 lr 0.000651 time 1.8944 (2.2315) loss 4.2499 (3.6249) grad_norm 1.2929 (1.3431) [2022-01-21 11:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][520/1251] eta 0:27:12 lr 0.000651 time 2.2529 (2.2328) loss 3.6924 (3.6267) grad_norm 1.2346 (1.3424) [2022-01-21 11:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][530/1251] eta 0:26:50 lr 0.000651 time 3.1525 (2.2336) loss 3.8724 (3.6245) grad_norm 1.1668 (1.3420) [2022-01-21 11:18:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][540/1251] eta 0:26:25 lr 0.000651 time 1.9029 (2.2298) loss 3.9202 (3.6259) grad_norm 1.1646 (1.3425) [2022-01-21 11:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][550/1251] eta 0:26:00 lr 0.000651 time 2.1450 (2.2264) loss 2.5694 (3.6285) grad_norm 1.5223 (1.3420) [2022-01-21 11:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][560/1251] eta 0:25:37 lr 0.000651 time 2.1312 (2.2247) loss 3.9132 (3.6287) grad_norm 
1.3587 (1.3415) [2022-01-21 11:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][570/1251] eta 0:25:13 lr 0.000651 time 2.5130 (2.2227) loss 4.0020 (3.6281) grad_norm 1.6248 (1.3417) [2022-01-21 11:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][580/1251] eta 0:24:49 lr 0.000651 time 2.1476 (2.2195) loss 3.7984 (3.6260) grad_norm 1.4051 (1.3416) [2022-01-21 11:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][590/1251] eta 0:24:26 lr 0.000651 time 2.6797 (2.2191) loss 4.0429 (3.6253) grad_norm 1.1365 (1.3435) [2022-01-21 11:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][600/1251] eta 0:24:02 lr 0.000651 time 1.5962 (2.2163) loss 3.4995 (3.6261) grad_norm 1.2123 (1.3429) [2022-01-21 11:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][610/1251] eta 0:23:40 lr 0.000651 time 2.1987 (2.2167) loss 4.2213 (3.6303) grad_norm 1.2425 (1.3420) [2022-01-21 11:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][620/1251] eta 0:23:18 lr 0.000651 time 1.8874 (2.2159) loss 3.1689 (3.6325) grad_norm 1.3610 (1.3429) [2022-01-21 11:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][630/1251] eta 0:22:56 lr 0.000651 time 2.3776 (2.2171) loss 3.6279 (3.6303) grad_norm 1.4368 (1.3435) [2022-01-21 11:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][640/1251] eta 0:22:34 lr 0.000650 time 2.7908 (2.2176) loss 3.1302 (3.6283) grad_norm 1.3686 (1.3429) [2022-01-21 11:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][650/1251] eta 0:22:12 lr 0.000650 time 2.1649 (2.2167) loss 3.8723 (3.6325) grad_norm 1.2999 (1.3416) [2022-01-21 11:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][660/1251] eta 0:21:49 lr 0.000650 time 1.6562 (2.2160) loss 4.0816 (3.6334) grad_norm 1.3498 (1.3420) [2022-01-21 11:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[121/300][670/1251] eta 0:21:28 lr 0.000650 time 2.2717 (2.2181) loss 3.8168 (3.6313) grad_norm 1.7827 (1.3428) [2022-01-21 11:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][680/1251] eta 0:21:06 lr 0.000650 time 1.7481 (2.2174) loss 3.0476 (3.6335) grad_norm 1.3972 (1.3434) [2022-01-21 11:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][690/1251] eta 0:20:42 lr 0.000650 time 1.5777 (2.2156) loss 3.1053 (3.6320) grad_norm 1.1735 (1.3442) [2022-01-21 11:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][700/1251] eta 0:20:20 lr 0.000650 time 2.0350 (2.2152) loss 3.9471 (3.6327) grad_norm 1.3655 (1.3455) [2022-01-21 11:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][710/1251] eta 0:19:58 lr 0.000650 time 1.8937 (2.2154) loss 4.2266 (3.6346) grad_norm 1.4776 (1.3467) [2022-01-21 11:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][720/1251] eta 0:19:35 lr 0.000650 time 1.5631 (2.2134) loss 2.7138 (3.6385) grad_norm 1.3802 (1.3463) [2022-01-21 11:25:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][730/1251] eta 0:19:11 lr 0.000650 time 1.9407 (2.2105) loss 3.6999 (3.6370) grad_norm 1.3183 (1.3467) [2022-01-21 11:26:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][740/1251] eta 0:18:48 lr 0.000650 time 1.8921 (2.2089) loss 2.5422 (3.6396) grad_norm 1.2309 (1.3475) [2022-01-21 11:26:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][750/1251] eta 0:18:27 lr 0.000650 time 2.1187 (2.2100) loss 3.8557 (3.6409) grad_norm 1.3880 (1.3489) [2022-01-21 11:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][760/1251] eta 0:18:04 lr 0.000650 time 2.1477 (2.2093) loss 3.4162 (3.6408) grad_norm 1.3543 (1.3499) [2022-01-21 11:27:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][770/1251] eta 0:17:42 lr 0.000650 time 2.2652 (2.2088) loss 3.0608 (3.6407) grad_norm 
1.3498 (1.3501) [2022-01-21 11:27:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][780/1251] eta 0:17:20 lr 0.000650 time 2.1849 (2.2099) loss 4.2426 (3.6447) grad_norm 1.3936 (1.3501) [2022-01-21 11:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][790/1251] eta 0:16:59 lr 0.000650 time 1.9076 (2.2108) loss 3.1891 (3.6414) grad_norm 1.4557 (1.3513) [2022-01-21 11:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][800/1251] eta 0:16:35 lr 0.000650 time 1.9330 (2.2083) loss 3.1365 (3.6422) grad_norm 1.3191 (1.3523) [2022-01-21 11:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][810/1251] eta 0:16:14 lr 0.000650 time 1.7860 (2.2092) loss 4.0014 (3.6416) grad_norm 1.3733 (1.3524) [2022-01-21 11:29:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][820/1251] eta 0:15:52 lr 0.000650 time 2.9347 (2.2098) loss 2.5210 (3.6387) grad_norm 1.4219 (1.3528) [2022-01-21 11:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][830/1251] eta 0:15:29 lr 0.000650 time 1.5763 (2.2088) loss 3.6640 (3.6388) grad_norm 1.3717 (1.3522) [2022-01-21 11:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][840/1251] eta 0:15:07 lr 0.000650 time 1.6846 (2.2075) loss 4.1317 (3.6371) grad_norm 1.3236 (1.3518) [2022-01-21 11:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][850/1251] eta 0:14:44 lr 0.000650 time 1.5843 (2.2058) loss 3.3178 (3.6382) grad_norm 1.2397 (1.3512) [2022-01-21 11:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][860/1251] eta 0:14:22 lr 0.000650 time 2.2753 (2.2050) loss 4.1301 (3.6386) grad_norm 1.2792 (1.3512) [2022-01-21 11:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][870/1251] eta 0:14:00 lr 0.000650 time 2.3181 (2.2065) loss 3.4177 (3.6377) grad_norm 1.4402 (1.3514) [2022-01-21 11:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[121/300][880/1251] eta 0:13:38 lr 0.000650 time 1.9782 (2.2061) loss 3.5812 (3.6390) grad_norm 1.7505 (1.3522) [2022-01-21 11:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][890/1251] eta 0:13:16 lr 0.000650 time 1.5993 (2.2059) loss 2.9837 (3.6406) grad_norm 1.5026 (1.3548) [2022-01-21 11:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][900/1251] eta 0:12:54 lr 0.000649 time 2.5917 (2.2067) loss 4.1987 (3.6405) grad_norm 1.1807 (1.3551) [2022-01-21 11:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][910/1251] eta 0:12:32 lr 0.000649 time 2.2133 (2.2066) loss 3.9393 (3.6408) grad_norm 1.3642 (1.3545) [2022-01-21 11:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][920/1251] eta 0:12:10 lr 0.000649 time 2.0512 (2.2072) loss 3.8127 (3.6398) grad_norm 1.3412 (1.3544) [2022-01-21 11:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][930/1251] eta 0:11:48 lr 0.000649 time 2.1713 (2.2071) loss 3.5037 (3.6401) grad_norm 1.2513 (1.3539) [2022-01-21 11:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][940/1251] eta 0:11:26 lr 0.000649 time 1.6170 (2.2073) loss 4.0824 (3.6381) grad_norm 1.3920 (1.3534) [2022-01-21 11:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][950/1251] eta 0:11:03 lr 0.000649 time 1.9177 (2.2058) loss 3.5668 (3.6369) grad_norm 1.3332 (1.3530) [2022-01-21 11:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][960/1251] eta 0:10:41 lr 0.000649 time 1.8722 (2.2054) loss 2.8055 (3.6375) grad_norm 1.2772 (1.3523) [2022-01-21 11:34:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][970/1251] eta 0:10:19 lr 0.000649 time 1.9345 (2.2047) loss 4.0045 (3.6387) grad_norm 1.5627 (1.3526) [2022-01-21 11:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][980/1251] eta 0:09:57 lr 0.000649 time 2.2823 (2.2034) loss 3.7764 (3.6413) grad_norm 
1.1251 (1.3528) [2022-01-21 11:35:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][990/1251] eta 0:09:34 lr 0.000649 time 2.4089 (2.2024) loss 4.0326 (3.6415) grad_norm 1.1890 (1.3537) [2022-01-21 11:35:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1000/1251] eta 0:09:12 lr 0.000649 time 1.7779 (2.2008) loss 3.4862 (3.6390) grad_norm 1.3954 (1.3526) [2022-01-21 11:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1010/1251] eta 0:08:49 lr 0.000649 time 2.0048 (2.1990) loss 4.1699 (3.6380) grad_norm 1.4365 (1.3519) [2022-01-21 11:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1020/1251] eta 0:08:28 lr 0.000649 time 1.9935 (2.1995) loss 3.3054 (3.6366) grad_norm 1.3859 (1.3521) [2022-01-21 11:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1030/1251] eta 0:08:06 lr 0.000649 time 2.8899 (2.2003) loss 3.4653 (3.6399) grad_norm 1.3018 (1.3522) [2022-01-21 11:37:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1040/1251] eta 0:07:44 lr 0.000649 time 1.4816 (2.2021) loss 4.0546 (3.6418) grad_norm 1.2866 (1.3518) [2022-01-21 11:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1050/1251] eta 0:07:23 lr 0.000649 time 1.7082 (2.2042) loss 4.2784 (3.6413) grad_norm 1.3246 (1.3517) [2022-01-21 11:37:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1060/1251] eta 0:07:01 lr 0.000649 time 2.2822 (2.2053) loss 3.5864 (3.6403) grad_norm 1.3733 (1.3518) [2022-01-21 11:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1070/1251] eta 0:06:39 lr 0.000649 time 2.4802 (2.2046) loss 3.7125 (3.6409) grad_norm 1.1589 (1.3511) [2022-01-21 11:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1080/1251] eta 0:06:16 lr 0.000649 time 1.8724 (2.2025) loss 3.1273 (3.6378) grad_norm 1.2297 (1.3509) [2022-01-21 11:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [121/300][1090/1251] eta 0:05:54 lr 0.000649 time 1.9010 (2.2002) loss 3.9996 (3.6371) grad_norm 1.2662 (1.3508) [2022-01-21 11:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1100/1251] eta 0:05:32 lr 0.000649 time 2.6297 (2.2007) loss 3.3494 (3.6373) grad_norm 1.4488 (1.3503) [2022-01-21 11:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1110/1251] eta 0:05:10 lr 0.000649 time 3.4712 (2.2024) loss 2.7254 (3.6342) grad_norm 1.3129 (1.3502) [2022-01-21 11:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1120/1251] eta 0:04:48 lr 0.000649 time 2.2152 (2.2033) loss 2.5657 (3.6314) grad_norm 1.3527 (1.3509) [2022-01-21 11:40:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1130/1251] eta 0:04:26 lr 0.000649 time 2.5638 (2.2036) loss 4.1265 (3.6325) grad_norm 1.4070 (1.3510) [2022-01-21 11:40:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1140/1251] eta 0:04:04 lr 0.000649 time 1.8554 (2.2016) loss 3.9052 (3.6332) grad_norm 1.4496 (1.3514) [2022-01-21 11:41:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1150/1251] eta 0:03:42 lr 0.000648 time 2.2183 (2.1997) loss 3.9407 (3.6313) grad_norm 1.2847 (1.3513) [2022-01-21 11:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1160/1251] eta 0:03:20 lr 0.000648 time 2.7414 (2.1984) loss 3.9735 (3.6312) grad_norm 1.3804 (1.3515) [2022-01-21 11:41:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1170/1251] eta 0:02:58 lr 0.000648 time 2.4594 (2.1987) loss 4.5380 (3.6295) grad_norm 1.2500 (1.3509) [2022-01-21 11:42:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1180/1251] eta 0:02:36 lr 0.000648 time 2.2425 (2.1981) loss 4.0559 (3.6328) grad_norm 1.3404 (1.3510) [2022-01-21 11:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1190/1251] eta 0:02:14 lr 0.000648 time 1.8546 (2.1978) loss 3.2280 
(3.6319) grad_norm 1.3848 (1.3509) [2022-01-21 11:42:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1200/1251] eta 0:01:52 lr 0.000648 time 1.8809 (2.1973) loss 2.9188 (3.6330) grad_norm 1.6810 (1.3520) [2022-01-21 11:43:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1210/1251] eta 0:01:30 lr 0.000648 time 2.8712 (2.1972) loss 3.3985 (3.6328) grad_norm 1.2255 (1.3520) [2022-01-21 11:43:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1220/1251] eta 0:01:08 lr 0.000648 time 2.1459 (2.1975) loss 4.2362 (3.6355) grad_norm 1.3558 (1.3519) [2022-01-21 11:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1230/1251] eta 0:00:46 lr 0.000648 time 1.5436 (2.1989) loss 4.1756 (3.6386) grad_norm 1.2009 (1.3514) [2022-01-21 11:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1240/1251] eta 0:00:24 lr 0.000648 time 1.2178 (2.1976) loss 2.5291 (3.6385) grad_norm 1.5313 (1.3518) [2022-01-21 11:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [121/300][1250/1251] eta 0:00:02 lr 0.000648 time 1.2968 (2.1930) loss 3.0079 (3.6367) grad_norm 1.1520 (1.3516) [2022-01-21 11:44:34 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 121 training takes 0:45:43 [2022-01-21 11:44:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.409 (18.409) Loss 1.1320 (1.1320) Acc@1 73.242 (73.242) Acc@5 92.578 (92.578) [2022-01-21 11:45:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.159 (3.308) Loss 1.1135 (1.0766) Acc@1 72.852 (74.885) Acc@5 92.871 (92.489) [2022-01-21 11:45:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.954 (2.604) Loss 1.0083 (1.0750) Acc@1 75.781 (74.879) Acc@5 93.359 (92.639) [2022-01-21 11:45:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.615 (2.301) Loss 1.0041 (1.0781) Acc@1 76.367 (74.814) Acc@5 92.871 (92.594) [2022-01-21 11:46:05 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.902 (2.206) Loss 1.1103 (1.0763) Acc@1 73.438 (74.829) Acc@5 92.285 (92.652) [2022-01-21 11:46:12 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.844 Acc@5 92.696 [2022-01-21 11:46:12 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-01-21 11:46:12 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.95% [2022-01-21 11:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][0/1251] eta 8:32:29 lr 0.000648 time 24.5803 (24.5803) loss 3.5255 (3.5255) grad_norm 1.3080 (1.3080) [2022-01-21 11:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][10/1251] eta 1:33:04 lr 0.000648 time 3.3594 (4.4996) loss 4.0601 (3.4600) grad_norm 1.5652 (1.3676) [2022-01-21 11:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][20/1251] eta 1:09:14 lr 0.000648 time 1.5000 (3.3752) loss 3.7995 (3.6466) grad_norm 1.3032 (1.3727) [2022-01-21 11:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][30/1251] eta 1:01:06 lr 0.000648 time 1.8367 (3.0031) loss 2.5439 (3.6671) grad_norm 1.7548 (1.3850) [2022-01-21 11:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][40/1251] eta 0:57:30 lr 0.000648 time 3.1742 (2.8494) loss 3.1922 (3.6746) grad_norm 1.3289 (1.4134) [2022-01-21 11:48:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][50/1251] eta 0:54:38 lr 0.000648 time 2.4045 (2.7295) loss 2.7125 (3.6598) grad_norm 1.4493 (1.4129) [2022-01-21 11:48:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][60/1251] eta 0:52:06 lr 0.000648 time 1.9065 (2.6247) loss 3.6071 (3.6372) grad_norm 1.1375 (1.3936) [2022-01-21 11:49:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][70/1251] eta 0:50:16 lr 0.000648 time 2.2020 (2.5544) loss 3.0625 (3.5790) grad_norm 1.2975 (1.3915) [2022-01-21 11:49:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][80/1251] eta 0:48:55 lr 0.000648 time 2.5836 (2.5069) loss 4.4720 (3.5840) grad_norm 1.5258 (1.3879) [2022-01-21 11:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][90/1251] eta 0:47:59 lr 0.000648 time 2.3130 (2.4803) loss 3.5817 (3.5821) grad_norm 1.5185 (1.3855) [2022-01-21 11:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][100/1251] eta 0:46:48 lr 0.000648 time 2.0572 (2.4399) loss 2.7864 (3.5573) grad_norm 1.3603 (1.3813) [2022-01-21 11:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][110/1251] eta 0:45:43 lr 0.000648 time 1.9932 (2.4045) loss 4.3009 (3.5518) grad_norm 1.3797 (1.3843) [2022-01-21 11:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][120/1251] eta 0:44:59 lr 0.000648 time 2.4987 (2.3865) loss 3.2149 (3.5519) grad_norm 1.4955 (1.3866) [2022-01-21 11:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][130/1251] eta 0:44:11 lr 0.000648 time 2.2431 (2.3653) loss 4.2109 (3.5648) grad_norm 1.2063 (1.3834) [2022-01-21 11:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][140/1251] eta 0:43:30 lr 0.000648 time 1.7480 (2.3499) loss 3.2971 (3.5622) grad_norm 1.4580 (1.3809) [2022-01-21 11:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][150/1251] eta 0:42:58 lr 0.000647 time 2.0653 (2.3421) loss 3.3083 (3.5671) grad_norm 1.4855 (1.3823) [2022-01-21 11:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][160/1251] eta 0:42:26 lr 0.000647 time 2.7610 (2.3345) loss 3.8492 (3.5628) grad_norm 1.1730 (1.3851) [2022-01-21 11:52:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][170/1251] eta 0:42:10 lr 0.000647 time 2.9608 (2.3413) loss 4.6766 (3.5600) grad_norm 1.5461 (1.3828) [2022-01-21 11:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][180/1251] eta 0:41:42 lr 0.000647 
time 2.2608 (2.3362) loss 3.6408 (3.5565) grad_norm 1.2964 (1.3782) [2022-01-21 11:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][190/1251] eta 0:41:11 lr 0.000647 time 1.8697 (2.3290) loss 3.6487 (3.5561) grad_norm 1.2484 (1.3723) [2022-01-21 11:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][200/1251] eta 0:40:36 lr 0.000647 time 1.8351 (2.3181) loss 3.4420 (3.5566) grad_norm 1.0993 (1.3650) [2022-01-21 11:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][210/1251] eta 0:40:02 lr 0.000647 time 2.3417 (2.3077) loss 4.0320 (3.5607) grad_norm 1.3232 (1.3626) [2022-01-21 11:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][220/1251] eta 0:39:22 lr 0.000647 time 1.9196 (2.2912) loss 3.1853 (3.5643) grad_norm 1.2500 (1.3626) [2022-01-21 11:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][230/1251] eta 0:38:52 lr 0.000647 time 2.3679 (2.2842) loss 4.0546 (3.5701) grad_norm 1.8530 (1.3650) [2022-01-21 11:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][240/1251] eta 0:38:29 lr 0.000647 time 1.9517 (2.2841) loss 4.3297 (3.5718) grad_norm 1.2693 (1.3663) [2022-01-21 11:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][250/1251] eta 0:38:06 lr 0.000647 time 2.1390 (2.2845) loss 4.1633 (3.5662) grad_norm 1.3125 (1.3634) [2022-01-21 11:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][260/1251] eta 0:37:44 lr 0.000647 time 2.0367 (2.2855) loss 2.7429 (3.5630) grad_norm 1.2311 (1.3653) [2022-01-21 11:56:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][270/1251] eta 0:37:39 lr 0.000647 time 3.1688 (2.3037) loss 3.9313 (3.5685) grad_norm 1.1776 (1.3648) [2022-01-21 11:57:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][280/1251] eta 0:37:18 lr 0.000647 time 2.3064 (2.3052) loss 3.7384 (3.5688) grad_norm 1.2656 (1.3618) [2022-01-21 11:57:20 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][290/1251] eta 0:36:46 lr 0.000647 time 1.9048 (2.2964) loss 4.2160 (3.5812) grad_norm 1.2647 (1.3626) [2022-01-21 11:57:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][300/1251] eta 0:36:10 lr 0.000647 time 1.9702 (2.2820) loss 3.3882 (3.5711) grad_norm 1.2846 (1.3603) [2022-01-21 11:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][310/1251] eta 0:35:37 lr 0.000647 time 2.1028 (2.2712) loss 2.6232 (3.5621) grad_norm 1.4651 (1.3580) [2022-01-21 11:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][320/1251] eta 0:35:09 lr 0.000647 time 2.2091 (2.2656) loss 4.4371 (3.5722) grad_norm 1.3882 (1.3582) [2022-01-21 11:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][330/1251] eta 0:34:48 lr 0.000647 time 3.1092 (2.2673) loss 3.4088 (3.5672) grad_norm 1.2708 (1.3566) [2022-01-21 11:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][340/1251] eta 0:34:21 lr 0.000647 time 2.1880 (2.2629) loss 3.7321 (3.5723) grad_norm 1.2730 (1.3574) [2022-01-21 11:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][350/1251] eta 0:33:55 lr 0.000647 time 2.2070 (2.2593) loss 3.4181 (3.5739) grad_norm 1.1662 (1.3563) [2022-01-21 11:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][360/1251] eta 0:33:31 lr 0.000647 time 2.3810 (2.2580) loss 3.9858 (3.5765) grad_norm 1.3252 (1.3574) [2022-01-21 12:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][370/1251] eta 0:33:11 lr 0.000647 time 2.4151 (2.2604) loss 3.8108 (3.5735) grad_norm 1.2835 (1.3557) [2022-01-21 12:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][380/1251] eta 0:32:54 lr 0.000647 time 3.5537 (2.2666) loss 4.0449 (3.5692) grad_norm 1.3116 (1.3557) [2022-01-21 12:00:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][390/1251] eta 0:32:32 lr 
0.000647 time 2.7690 (2.2681) loss 4.3854 (3.5719) grad_norm 1.4782 (1.3554) [2022-01-21 12:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][400/1251] eta 0:32:09 lr 0.000646 time 1.9391 (2.2677) loss 3.9833 (3.5659) grad_norm 1.2602 (1.3535) [2022-01-21 12:01:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][410/1251] eta 0:31:42 lr 0.000646 time 1.8688 (2.2621) loss 3.8063 (3.5688) grad_norm 1.4362 (1.3530) [2022-01-21 12:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][420/1251] eta 0:31:15 lr 0.000646 time 3.1333 (2.2575) loss 4.1770 (3.5743) grad_norm 1.1613 (1.3524) [2022-01-21 12:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][430/1251] eta 0:30:47 lr 0.000646 time 2.0687 (2.2503) loss 2.4127 (3.5759) grad_norm 1.3449 (1.3527) [2022-01-21 12:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][440/1251] eta 0:30:23 lr 0.000646 time 2.4445 (2.2490) loss 3.0932 (3.5749) grad_norm 1.2118 (1.3552) [2022-01-21 12:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][450/1251] eta 0:29:59 lr 0.000646 time 1.9403 (2.2465) loss 3.0288 (3.5756) grad_norm 1.3703 (1.3555) [2022-01-21 12:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][460/1251] eta 0:29:38 lr 0.000646 time 3.1535 (2.2481) loss 3.9141 (3.5720) grad_norm 1.5435 (1.3558) [2022-01-21 12:03:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][470/1251] eta 0:29:16 lr 0.000646 time 1.9603 (2.2494) loss 3.9701 (3.5723) grad_norm 1.5480 (1.3554) [2022-01-21 12:04:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][480/1251] eta 0:28:53 lr 0.000646 time 1.7974 (2.2482) loss 3.8590 (3.5731) grad_norm 1.4523 (1.3561) [2022-01-21 12:04:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][490/1251] eta 0:28:28 lr 0.000646 time 1.8646 (2.2453) loss 4.1091 (3.5734) grad_norm 1.5083 (1.3556) [2022-01-21 12:04:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][500/1251] eta 0:28:03 lr 0.000646 time 2.2805 (2.2423) loss 4.1499 (3.5750) grad_norm 1.2818 (1.3549) [2022-01-21 12:05:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][510/1251] eta 0:27:38 lr 0.000646 time 1.8730 (2.2388) loss 4.5742 (3.5784) grad_norm 1.2426 (1.3547) [2022-01-21 12:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][520/1251] eta 0:27:16 lr 0.000646 time 2.1782 (2.2392) loss 4.1096 (3.5778) grad_norm 1.1213 (1.3546) [2022-01-21 12:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][530/1251] eta 0:26:55 lr 0.000646 time 1.8864 (2.2400) loss 3.9429 (3.5791) grad_norm 1.3174 (1.3547) [2022-01-21 12:06:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][540/1251] eta 0:26:33 lr 0.000646 time 2.3825 (2.2416) loss 3.2481 (3.5800) grad_norm 1.3527 (1.3552) [2022-01-21 12:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][550/1251] eta 0:26:09 lr 0.000646 time 1.9434 (2.2384) loss 3.2867 (3.5814) grad_norm 1.3916 (1.3553) [2022-01-21 12:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][560/1251] eta 0:25:45 lr 0.000646 time 1.9037 (2.2366) loss 3.4198 (3.5811) grad_norm 1.4194 (1.3584) [2022-01-21 12:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][570/1251] eta 0:25:21 lr 0.000646 time 2.0077 (2.2337) loss 3.5494 (3.5799) grad_norm 1.6927 (1.3615) [2022-01-21 12:07:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][580/1251] eta 0:24:57 lr 0.000646 time 2.1720 (2.2323) loss 2.5822 (3.5762) grad_norm 1.3064 (1.3617) [2022-01-21 12:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][590/1251] eta 0:24:33 lr 0.000646 time 1.8308 (2.2288) loss 4.0006 (3.5738) grad_norm 1.2340 (1.3603) [2022-01-21 12:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][600/1251] eta 0:24:09 lr 
0.000646 time 2.4662 (2.2264) loss 2.8496 (3.5763) grad_norm 1.1941 (1.3602) [2022-01-21 12:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][610/1251] eta 0:23:45 lr 0.000646 time 2.5090 (2.2239) loss 4.3497 (3.5739) grad_norm 1.4171 (1.3605) [2022-01-21 12:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][620/1251] eta 0:23:24 lr 0.000646 time 2.2150 (2.2262) loss 3.7743 (3.5769) grad_norm 1.3008 (1.3593) [2022-01-21 12:09:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][630/1251] eta 0:23:03 lr 0.000646 time 1.5754 (2.2273) loss 3.9723 (3.5782) grad_norm 1.2202 (1.3581) [2022-01-21 12:10:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][640/1251] eta 0:22:41 lr 0.000646 time 2.1358 (2.2282) loss 4.1058 (3.5813) grad_norm 1.2921 (1.3568) [2022-01-21 12:10:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][650/1251] eta 0:22:19 lr 0.000645 time 1.9852 (2.2281) loss 4.2727 (3.5877) grad_norm 1.3239 (1.3571) [2022-01-21 12:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][660/1251] eta 0:21:59 lr 0.000645 time 2.7894 (2.2322) loss 3.5685 (3.5865) grad_norm 1.2015 (1.3568) [2022-01-21 12:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][670/1251] eta 0:21:35 lr 0.000645 time 1.6296 (2.2301) loss 3.2745 (3.5893) grad_norm 1.1630 (1.3560) [2022-01-21 12:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][680/1251] eta 0:21:10 lr 0.000645 time 1.5380 (2.2259) loss 4.0141 (3.5918) grad_norm 1.4156 (1.3560) [2022-01-21 12:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][690/1251] eta 0:20:47 lr 0.000645 time 2.0639 (2.2229) loss 3.4279 (3.5949) grad_norm 1.3601 (1.3563) [2022-01-21 12:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][700/1251] eta 0:20:23 lr 0.000645 time 2.2052 (2.2212) loss 3.9721 (3.5993) grad_norm 1.2470 (1.3564) [2022-01-21 12:12:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][710/1251] eta 0:20:02 lr 0.000645 time 1.6426 (2.2223) loss 3.7206 (3.6014) grad_norm 1.4516 (1.3554) [2022-01-21 12:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][720/1251] eta 0:19:39 lr 0.000645 time 1.8689 (2.2216) loss 3.9297 (3.6018) grad_norm 1.3369 (1.3551) [2022-01-21 12:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][730/1251] eta 0:19:17 lr 0.000645 time 2.5751 (2.2221) loss 4.1368 (3.6057) grad_norm 1.2899 (1.3551) [2022-01-21 12:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][740/1251] eta 0:18:56 lr 0.000645 time 2.6137 (2.2239) loss 4.1641 (3.6071) grad_norm 1.4870 (1.3556) [2022-01-21 12:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][750/1251] eta 0:18:34 lr 0.000645 time 1.5669 (2.2237) loss 3.6684 (3.6057) grad_norm 1.2413 (1.3555) [2022-01-21 12:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][760/1251] eta 0:18:11 lr 0.000645 time 1.6967 (2.2224) loss 3.9290 (3.6104) grad_norm 1.3018 (1.3558) [2022-01-21 12:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][770/1251] eta 0:17:48 lr 0.000645 time 1.9257 (2.2222) loss 4.4600 (3.6137) grad_norm 1.4680 (1.3554) [2022-01-21 12:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][780/1251] eta 0:17:26 lr 0.000645 time 2.7165 (2.2217) loss 3.7349 (3.6108) grad_norm 1.3908 (1.3553) [2022-01-21 12:15:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][790/1251] eta 0:17:03 lr 0.000645 time 1.9033 (2.2210) loss 3.6585 (3.6104) grad_norm 1.2373 (1.3557) [2022-01-21 12:15:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][800/1251] eta 0:16:41 lr 0.000645 time 1.9175 (2.2200) loss 3.6651 (3.6129) grad_norm 1.2932 (1.3550) [2022-01-21 12:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][810/1251] eta 0:16:18 lr 
0.000645 time 1.8283 (2.2185) loss 4.0823 (3.6171) grad_norm 1.2717 (1.3550) [2022-01-21 12:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][820/1251] eta 0:15:56 lr 0.000645 time 2.5105 (2.2198) loss 3.9088 (3.6187) grad_norm 1.9598 (1.3551) [2022-01-21 12:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][830/1251] eta 0:15:34 lr 0.000645 time 2.4089 (2.2190) loss 3.2146 (3.6195) grad_norm 1.4958 (1.3552) [2022-01-21 12:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][840/1251] eta 0:15:11 lr 0.000645 time 2.2826 (2.2182) loss 4.0036 (3.6193) grad_norm 1.2888 (1.3546) [2022-01-21 12:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][850/1251] eta 0:14:48 lr 0.000645 time 2.0119 (2.2168) loss 3.6002 (3.6174) grad_norm 1.3528 (1.3547) [2022-01-21 12:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][860/1251] eta 0:14:27 lr 0.000645 time 2.1434 (2.2174) loss 3.3869 (3.6170) grad_norm 1.2406 (1.3547) [2022-01-21 12:18:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][870/1251] eta 0:14:04 lr 0.000645 time 2.0823 (2.2177) loss 4.6234 (3.6146) grad_norm 1.5382 (1.3544) [2022-01-21 12:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][880/1251] eta 0:13:42 lr 0.000645 time 2.3702 (2.2174) loss 4.2510 (3.6178) grad_norm 1.3143 (1.3547) [2022-01-21 12:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][890/1251] eta 0:13:20 lr 0.000645 time 2.2676 (2.2178) loss 3.9948 (3.6205) grad_norm 1.1656 (1.3543) [2022-01-21 12:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][900/1251] eta 0:12:58 lr 0.000644 time 2.1700 (2.2172) loss 3.9774 (3.6176) grad_norm 1.2864 (1.3555) [2022-01-21 12:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][910/1251] eta 0:12:35 lr 0.000644 time 2.4734 (2.2150) loss 3.8829 (3.6212) grad_norm 1.1858 (1.3547) [2022-01-21 12:20:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][920/1251] eta 0:12:12 lr 0.000644 time 1.9862 (2.2124) loss 3.8548 (3.6198) grad_norm 1.1461 (1.3546) [2022-01-21 12:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][930/1251] eta 0:11:49 lr 0.000644 time 1.8888 (2.2097) loss 3.4922 (3.6186) grad_norm 1.6004 (1.3544) [2022-01-21 12:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][940/1251] eta 0:11:27 lr 0.000644 time 2.3220 (2.2104) loss 4.1580 (3.6171) grad_norm 1.4014 (1.3552) [2022-01-21 12:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][950/1251] eta 0:11:05 lr 0.000644 time 2.8480 (2.2105) loss 3.3752 (3.6163) grad_norm 1.2581 (1.3546) [2022-01-21 12:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][960/1251] eta 0:10:43 lr 0.000644 time 1.8947 (2.2110) loss 4.2824 (3.6153) grad_norm 1.4220 (1.3541) [2022-01-21 12:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][970/1251] eta 0:10:21 lr 0.000644 time 2.1855 (2.2106) loss 3.6517 (3.6187) grad_norm 1.4101 (1.3546) [2022-01-21 12:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][980/1251] eta 0:09:59 lr 0.000644 time 1.7179 (2.2118) loss 4.1944 (3.6195) grad_norm 1.2825 (1.3548) [2022-01-21 12:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][990/1251] eta 0:09:37 lr 0.000644 time 2.8679 (2.2119) loss 3.2145 (3.6159) grad_norm 1.7112 (1.3552) [2022-01-21 12:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1000/1251] eta 0:09:15 lr 0.000644 time 1.6491 (2.2112) loss 3.8936 (3.6164) grad_norm 1.3864 (1.3548) [2022-01-21 12:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1010/1251] eta 0:08:53 lr 0.000644 time 2.0705 (2.2122) loss 3.8481 (3.6164) grad_norm 1.1708 (1.3546) [2022-01-21 12:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1020/1251] eta 0:08:31 lr 
0.000644 time 1.8995 (2.2122) loss 3.4763 (3.6141) grad_norm 1.5296 (1.3551) [2022-01-21 12:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1030/1251] eta 0:08:08 lr 0.000644 time 2.0151 (2.2105) loss 4.1309 (3.6151) grad_norm 1.6356 (1.3554) [2022-01-21 12:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1040/1251] eta 0:07:46 lr 0.000644 time 1.9484 (2.2104) loss 4.0046 (3.6167) grad_norm 1.5303 (1.3554) [2022-01-21 12:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1050/1251] eta 0:07:24 lr 0.000644 time 2.5682 (2.2109) loss 3.3931 (3.6165) grad_norm 1.5471 (1.3561) [2022-01-21 12:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1060/1251] eta 0:07:02 lr 0.000644 time 2.3437 (2.2120) loss 3.8624 (3.6176) grad_norm 1.2606 (1.3554) [2022-01-21 12:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1070/1251] eta 0:06:39 lr 0.000644 time 1.7696 (2.2099) loss 2.7350 (3.6181) grad_norm 1.3381 (1.3550) [2022-01-21 12:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1080/1251] eta 0:06:17 lr 0.000644 time 1.7798 (2.2090) loss 3.0839 (3.6141) grad_norm 1.5094 (1.3558) [2022-01-21 12:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1090/1251] eta 0:05:55 lr 0.000644 time 1.8352 (2.2076) loss 4.4054 (3.6142) grad_norm 1.3216 (1.3561) [2022-01-21 12:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1100/1251] eta 0:05:33 lr 0.000644 time 2.2663 (2.2076) loss 3.8159 (3.6142) grad_norm 1.1999 (1.3559) [2022-01-21 12:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1110/1251] eta 0:05:11 lr 0.000644 time 1.8471 (2.2066) loss 2.7203 (3.6133) grad_norm 1.2049 (1.3557) [2022-01-21 12:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1120/1251] eta 0:04:49 lr 0.000644 time 2.5648 (2.2072) loss 3.9238 (3.6148) grad_norm 1.3119 (1.3549) [2022-01-21 
12:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1130/1251] eta 0:04:27 lr 0.000644 time 1.9179 (2.2070) loss 4.1466 (3.6163) grad_norm 1.1770 (1.3541) [2022-01-21 12:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1140/1251] eta 0:04:05 lr 0.000644 time 2.4773 (2.2075) loss 3.5570 (3.6150) grad_norm 1.5836 (1.3539) [2022-01-21 12:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1150/1251] eta 0:03:42 lr 0.000644 time 2.5948 (2.2074) loss 4.2159 (3.6153) grad_norm 1.1828 (1.3535) [2022-01-21 12:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1160/1251] eta 0:03:20 lr 0.000643 time 2.5594 (2.2087) loss 3.7606 (3.6153) grad_norm 1.1847 (1.3539) [2022-01-21 12:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1170/1251] eta 0:02:58 lr 0.000643 time 1.7473 (2.2071) loss 2.7133 (3.6147) grad_norm 1.2170 (1.3532) [2022-01-21 12:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1180/1251] eta 0:02:36 lr 0.000643 time 1.7296 (2.2051) loss 3.3018 (3.6134) grad_norm 1.4053 (1.3539) [2022-01-21 12:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1190/1251] eta 0:02:14 lr 0.000643 time 2.0405 (2.2044) loss 4.1121 (3.6150) grad_norm 1.4697 (1.3541) [2022-01-21 12:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1200/1251] eta 0:01:52 lr 0.000643 time 2.3530 (2.2060) loss 3.8384 (3.6145) grad_norm 1.3138 (1.3536) [2022-01-21 12:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1210/1251] eta 0:01:30 lr 0.000643 time 2.0809 (2.2048) loss 4.0221 (3.6162) grad_norm 1.3501 (1.3532) [2022-01-21 12:31:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1220/1251] eta 0:01:08 lr 0.000643 time 1.8291 (2.2044) loss 2.7515 (3.6150) grad_norm 1.1918 (1.3526) [2022-01-21 12:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1230/1251] 
eta 0:00:46 lr 0.000643 time 1.7564 (2.2054) loss 4.1425 (3.6154) grad_norm 1.3492 (1.3526) [2022-01-21 12:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1240/1251] eta 0:00:24 lr 0.000643 time 1.5307 (2.2045) loss 3.7807 (3.6158) grad_norm 1.0970 (1.3520) [2022-01-21 12:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [122/300][1250/1251] eta 0:00:02 lr 0.000643 time 1.3275 (2.1995) loss 3.5946 (3.6166) grad_norm 1.1712 (1.3515) [2022-01-21 12:32:04 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 122 training takes 0:45:51 [2022-01-21 12:32:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.772 (18.772) Loss 1.0526 (1.0526) Acc@1 75.488 (75.488) Acc@5 92.773 (92.773) [2022-01-21 12:32:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.550 (3.262) Loss 1.1283 (1.0841) Acc@1 73.047 (73.961) Acc@5 92.383 (92.498) [2022-01-21 12:32:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.896 (2.612) Loss 1.0748 (1.0743) Acc@1 74.414 (74.419) Acc@5 92.676 (92.690) [2022-01-21 12:33:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.899 (2.338) Loss 1.0617 (1.0705) Acc@1 74.512 (74.641) Acc@5 93.164 (92.657) [2022-01-21 12:33:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.732 (2.200) Loss 1.1380 (1.0784) Acc@1 73.633 (74.533) Acc@5 91.895 (92.554) [2022-01-21 12:33:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.618 Acc@5 92.560 [2022-01-21 12:33:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.6% [2022-01-21 12:33:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.95% [2022-01-21 12:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][0/1251] eta 7:32:17 lr 0.000643 time 21.6925 (21.6925) loss 4.1370 (4.1370) grad_norm 1.6038 (1.6038) [2022-01-21 12:34:25 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [123/300][10/1251] eta 1:22:42 lr 0.000643 time 2.2078 (3.9991) loss 3.0796 (3.5957) grad_norm 1.1625 (1.3572) [2022-01-21 12:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][20/1251] eta 1:04:19 lr 0.000643 time 1.3763 (3.1354) loss 3.5372 (3.6558) grad_norm 1.1802 (1.3425) [2022-01-21 12:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][30/1251] eta 0:57:36 lr 0.000643 time 2.2483 (2.8307) loss 3.2836 (3.5797) grad_norm 1.3682 (1.3458) [2022-01-21 12:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][40/1251] eta 0:55:42 lr 0.000643 time 3.8884 (2.7605) loss 3.1278 (3.5398) grad_norm 1.2602 (1.3515) [2022-01-21 12:35:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][50/1251] eta 0:53:53 lr 0.000643 time 3.0641 (2.6922) loss 4.4054 (3.5601) grad_norm 1.1926 (1.3400) [2022-01-21 12:36:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][60/1251] eta 0:51:49 lr 0.000643 time 1.7665 (2.6107) loss 2.4516 (3.5386) grad_norm 1.3507 (1.3346) [2022-01-21 12:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][70/1251] eta 0:50:04 lr 0.000643 time 2.2241 (2.5437) loss 3.4089 (3.5387) grad_norm 1.5757 (1.3528) [2022-01-21 12:37:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][80/1251] eta 0:48:37 lr 0.000643 time 3.3480 (2.4912) loss 3.7418 (3.5605) grad_norm 1.6123 (1.3555) [2022-01-21 12:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][90/1251] eta 0:47:46 lr 0.000643 time 3.1566 (2.4687) loss 2.6453 (3.5754) grad_norm 1.2491 (1.3544) [2022-01-21 12:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][100/1251] eta 0:46:47 lr 0.000643 time 1.8692 (2.4394) loss 3.6070 (3.5472) grad_norm 1.2037 (1.3564) [2022-01-21 12:38:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][110/1251] eta 0:46:01 lr 0.000643 time 2.2833 (2.4199) loss 4.2680 (3.5736) grad_norm 
1.1286 (1.3500) [2022-01-21 12:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][120/1251] eta 0:45:09 lr 0.000643 time 2.4427 (2.3958) loss 3.9133 (3.5791) grad_norm 1.6941 (1.3544) [2022-01-21 12:38:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][130/1251] eta 0:44:28 lr 0.000643 time 3.1952 (2.3808) loss 4.0293 (3.5844) grad_norm 1.5102 (1.3533) [2022-01-21 12:39:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][140/1251] eta 0:43:49 lr 0.000643 time 2.9133 (2.3670) loss 4.1648 (3.5924) grad_norm 1.4410 (1.3536) [2022-01-21 12:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][150/1251] eta 0:43:11 lr 0.000643 time 1.7821 (2.3538) loss 3.5227 (3.6050) grad_norm 1.2826 (1.3613) [2022-01-21 12:39:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][160/1251] eta 0:42:23 lr 0.000642 time 1.9826 (2.3312) loss 3.4946 (3.6102) grad_norm 1.2203 (1.3611) [2022-01-21 12:40:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][170/1251] eta 0:41:47 lr 0.000642 time 2.5654 (2.3200) loss 3.4679 (3.6294) grad_norm 1.2036 (1.3578) [2022-01-21 12:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][180/1251] eta 0:41:13 lr 0.000642 time 1.5321 (2.3091) loss 2.9343 (3.6422) grad_norm 1.7630 (1.3590) [2022-01-21 12:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][190/1251] eta 0:40:41 lr 0.000642 time 2.0780 (2.3014) loss 3.8385 (3.6489) grad_norm 1.3776 (1.3573) [2022-01-21 12:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][200/1251] eta 0:40:15 lr 0.000642 time 2.5600 (2.2980) loss 3.7818 (3.6462) grad_norm 1.3518 (1.3541) [2022-01-21 12:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][210/1251] eta 0:39:42 lr 0.000642 time 1.9470 (2.2888) loss 3.3823 (3.6422) grad_norm 1.3912 (1.3523) [2022-01-21 12:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[123/300][220/1251] eta 0:39:20 lr 0.000642 time 2.1166 (2.2895) loss 3.5885 (3.6488) grad_norm 1.4458 (1.3513) [2022-01-21 12:42:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][230/1251] eta 0:38:56 lr 0.000642 time 2.4759 (2.2884) loss 2.6706 (3.6396) grad_norm 1.4763 (1.3515) [2022-01-21 12:42:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][240/1251] eta 0:38:24 lr 0.000642 time 2.3194 (2.2789) loss 2.8514 (3.6361) grad_norm 1.2228 (1.3523) [2022-01-21 12:43:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][250/1251] eta 0:38:02 lr 0.000642 time 2.3308 (2.2797) loss 3.5244 (3.6228) grad_norm 1.3628 (1.3532) [2022-01-21 12:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][260/1251] eta 0:37:42 lr 0.000642 time 2.2065 (2.2831) loss 3.7626 (3.6174) grad_norm 1.2249 (1.3565) [2022-01-21 12:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][270/1251] eta 0:37:12 lr 0.000642 time 2.9195 (2.2759) loss 4.0670 (3.6157) grad_norm 1.3262 (1.3568) [2022-01-21 12:44:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][280/1251] eta 0:36:45 lr 0.000642 time 1.8920 (2.2709) loss 3.9994 (3.6092) grad_norm 1.2847 (1.3561) [2022-01-21 12:44:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][290/1251] eta 0:36:16 lr 0.000642 time 2.2574 (2.2652) loss 3.7466 (3.6072) grad_norm 1.2803 (1.3547) [2022-01-21 12:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][300/1251] eta 0:35:50 lr 0.000642 time 2.1800 (2.2609) loss 3.8494 (3.5961) grad_norm 1.3687 (1.3550) [2022-01-21 12:45:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][310/1251] eta 0:35:23 lr 0.000642 time 2.5220 (2.2563) loss 4.0253 (3.6079) grad_norm 1.5247 (1.3561) [2022-01-21 12:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][320/1251] eta 0:34:55 lr 0.000642 time 1.9529 (2.2503) loss 3.0370 (3.6077) grad_norm 
1.2976 (1.3539) [2022-01-21 12:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][330/1251] eta 0:34:29 lr 0.000642 time 2.3493 (2.2475) loss 2.7809 (3.6146) grad_norm 1.2801 (1.3540) [2022-01-21 12:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][340/1251] eta 0:34:09 lr 0.000642 time 2.2050 (2.2495) loss 4.1945 (3.6249) grad_norm 1.4092 (1.3524) [2022-01-21 12:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][350/1251] eta 0:33:47 lr 0.000642 time 3.0759 (2.2502) loss 3.8630 (3.6251) grad_norm 1.4866 (1.3531) [2022-01-21 12:47:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][360/1251] eta 0:33:26 lr 0.000642 time 1.9259 (2.2514) loss 3.5625 (3.6203) grad_norm 1.3171 (1.3526) [2022-01-21 12:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][370/1251] eta 0:33:03 lr 0.000642 time 1.9289 (2.2514) loss 3.9339 (3.6252) grad_norm 1.4343 (1.3529) [2022-01-21 12:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][380/1251] eta 0:32:42 lr 0.000642 time 1.9369 (2.2531) loss 3.9038 (3.6261) grad_norm 1.4304 (1.3531) [2022-01-21 12:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][390/1251] eta 0:32:18 lr 0.000642 time 2.7827 (2.2510) loss 2.8858 (3.6167) grad_norm 1.7409 (1.3557) [2022-01-21 12:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][400/1251] eta 0:31:51 lr 0.000642 time 1.9237 (2.2460) loss 4.0163 (3.6212) grad_norm 1.1873 (1.3568) [2022-01-21 12:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][410/1251] eta 0:31:23 lr 0.000641 time 1.6217 (2.2392) loss 3.5353 (3.6223) grad_norm 1.3253 (1.3558) [2022-01-21 12:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][420/1251] eta 0:31:00 lr 0.000641 time 1.8825 (2.2384) loss 2.9868 (3.6262) grad_norm 1.3829 (1.3564) [2022-01-21 12:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[123/300][430/1251] eta 0:30:37 lr 0.000641 time 3.1501 (2.2385) loss 4.1250 (3.6240) grad_norm 1.4301 (1.3560) [2022-01-21 12:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][440/1251] eta 0:30:15 lr 0.000641 time 2.5504 (2.2380) loss 4.1640 (3.6219) grad_norm 1.5164 (1.3574) [2022-01-21 12:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][450/1251] eta 0:29:51 lr 0.000641 time 2.0991 (2.2372) loss 4.0402 (3.6233) grad_norm 1.5418 (1.3581) [2022-01-21 12:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][460/1251] eta 0:29:30 lr 0.000641 time 1.9197 (2.2379) loss 2.7522 (3.6235) grad_norm 1.4042 (1.3595) [2022-01-21 12:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][470/1251] eta 0:29:08 lr 0.000641 time 3.0159 (2.2383) loss 4.3685 (3.6254) grad_norm 1.4926 (1.3603) [2022-01-21 12:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][480/1251] eta 0:28:43 lr 0.000641 time 2.2530 (2.2357) loss 2.6948 (3.6214) grad_norm 1.2945 (1.3600) [2022-01-21 12:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][490/1251] eta 0:28:18 lr 0.000641 time 1.8563 (2.2313) loss 3.6082 (3.6186) grad_norm 1.2903 (1.3601) [2022-01-21 12:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][500/1251] eta 0:27:54 lr 0.000641 time 2.2127 (2.2291) loss 3.9380 (3.6212) grad_norm 1.2633 (1.3591) [2022-01-21 12:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][510/1251] eta 0:27:27 lr 0.000641 time 1.9119 (2.2236) loss 4.2825 (3.6248) grad_norm 1.2139 (1.3575) [2022-01-21 12:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][520/1251] eta 0:27:03 lr 0.000641 time 2.4659 (2.2214) loss 3.6087 (3.6223) grad_norm 1.6514 (1.3576) [2022-01-21 12:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][530/1251] eta 0:26:38 lr 0.000641 time 1.4535 (2.2177) loss 3.7084 (3.6247) grad_norm 
1.1981 (1.3564) [2022-01-21 12:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][540/1251] eta 0:26:18 lr 0.000641 time 2.5073 (2.2194) loss 3.4980 (3.6265) grad_norm 1.2270 (1.3562) [2022-01-21 12:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][550/1251] eta 0:25:56 lr 0.000641 time 2.2342 (2.2204) loss 3.2644 (3.6311) grad_norm 1.1844 (1.3564) [2022-01-21 12:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][560/1251] eta 0:25:36 lr 0.000641 time 2.3217 (2.2238) loss 3.7926 (3.6333) grad_norm 1.2891 (1.3565) [2022-01-21 12:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][570/1251] eta 0:25:15 lr 0.000641 time 2.0862 (2.2252) loss 3.3758 (3.6366) grad_norm 1.2327 (1.3563) [2022-01-21 12:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][580/1251] eta 0:24:53 lr 0.000641 time 2.9200 (2.2258) loss 3.5552 (3.6419) grad_norm 1.2954 (1.3568) [2022-01-21 12:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][590/1251] eta 0:24:29 lr 0.000641 time 1.6713 (2.2233) loss 3.9397 (3.6393) grad_norm 1.3216 (1.3568) [2022-01-21 12:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][600/1251] eta 0:24:07 lr 0.000641 time 2.9441 (2.2239) loss 4.1906 (3.6415) grad_norm 1.1867 (1.3568) [2022-01-21 12:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][610/1251] eta 0:23:45 lr 0.000641 time 1.5874 (2.2245) loss 3.2695 (3.6387) grad_norm 1.1423 (1.3556) [2022-01-21 12:56:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][620/1251] eta 0:23:23 lr 0.000641 time 2.6511 (2.2241) loss 4.2660 (3.6424) grad_norm 1.2655 (1.3542) [2022-01-21 12:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][630/1251] eta 0:23:00 lr 0.000641 time 1.8810 (2.2224) loss 3.0829 (3.6421) grad_norm 1.2650 (1.3544) [2022-01-21 12:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[123/300][640/1251] eta 0:22:37 lr 0.000641 time 2.2919 (2.2214) loss 4.0528 (3.6440) grad_norm 1.2733 (1.3542) [2022-01-21 12:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][650/1251] eta 0:22:15 lr 0.000641 time 2.0369 (2.2222) loss 2.6138 (3.6465) grad_norm 1.4194 (1.3533) [2022-01-21 12:58:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][660/1251] eta 0:21:53 lr 0.000640 time 3.1142 (2.2226) loss 4.1524 (3.6468) grad_norm 1.2524 (1.3540) [2022-01-21 12:58:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][670/1251] eta 0:21:30 lr 0.000640 time 2.2929 (2.2216) loss 4.1223 (3.6455) grad_norm 1.5257 (1.3547) [2022-01-21 12:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][680/1251] eta 0:21:06 lr 0.000640 time 1.6187 (2.2188) loss 3.6500 (3.6460) grad_norm 1.5121 (1.3553) [2022-01-21 12:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][690/1251] eta 0:20:44 lr 0.000640 time 2.4759 (2.2184) loss 3.9889 (3.6432) grad_norm 1.2081 (1.3551) [2022-01-21 12:59:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][700/1251] eta 0:20:20 lr 0.000640 time 1.9287 (2.2154) loss 3.3592 (3.6429) grad_norm 1.3393 (1.3553) [2022-01-21 12:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][710/1251] eta 0:19:57 lr 0.000640 time 2.6400 (2.2140) loss 4.1321 (3.6427) grad_norm 1.1671 (1.3552) [2022-01-21 13:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][720/1251] eta 0:19:34 lr 0.000640 time 1.5881 (2.2124) loss 3.5898 (3.6421) grad_norm 1.6319 (1.3555) [2022-01-21 13:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][730/1251] eta 0:19:13 lr 0.000640 time 2.7210 (2.2132) loss 3.6510 (3.6438) grad_norm 1.3228 (1.3562) [2022-01-21 13:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][740/1251] eta 0:18:51 lr 0.000640 time 1.7444 (2.2145) loss 4.0220 (3.6422) grad_norm 
1.2910 (1.3554) [2022-01-21 13:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][750/1251] eta 0:18:30 lr 0.000640 time 1.9800 (2.2164) loss 4.3631 (3.6453) grad_norm 1.3560 (1.3549) [2022-01-21 13:01:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][760/1251] eta 0:18:08 lr 0.000640 time 2.7044 (2.2177) loss 3.4946 (3.6470) grad_norm 1.2874 (1.3541) [2022-01-21 13:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][770/1251] eta 0:17:46 lr 0.000640 time 2.6039 (2.2176) loss 3.4121 (3.6466) grad_norm 1.3468 (1.3535) [2022-01-21 13:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][780/1251] eta 0:17:23 lr 0.000640 time 2.0767 (2.2160) loss 3.3758 (3.6474) grad_norm 1.6928 (1.3542) [2022-01-21 13:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][790/1251] eta 0:17:00 lr 0.000640 time 1.9459 (2.2133) loss 3.9457 (3.6474) grad_norm 1.4277 (1.3544) [2022-01-21 13:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][800/1251] eta 0:16:38 lr 0.000640 time 2.3284 (2.2129) loss 3.7624 (3.6473) grad_norm 1.2252 (1.3538) [2022-01-21 13:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][810/1251] eta 0:16:15 lr 0.000640 time 2.3716 (2.2116) loss 2.7705 (3.6467) grad_norm 1.4275 (1.3548) [2022-01-21 13:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][820/1251] eta 0:15:52 lr 0.000640 time 2.0215 (2.2101) loss 3.8330 (3.6479) grad_norm 1.4321 (1.3556) [2022-01-21 13:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][830/1251] eta 0:15:30 lr 0.000640 time 2.8619 (2.2106) loss 3.6735 (3.6476) grad_norm 1.2462 (1.3546) [2022-01-21 13:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][840/1251] eta 0:15:08 lr 0.000640 time 2.6460 (2.2107) loss 3.9073 (3.6482) grad_norm 1.2716 (1.3552) [2022-01-21 13:05:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[123/300][850/1251] eta 0:14:46 lr 0.000640 time 1.8559 (2.2096) loss 3.9109 (3.6492) grad_norm 1.5756 (1.3563) [2022-01-21 13:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][860/1251] eta 0:14:24 lr 0.000640 time 2.1542 (2.2103) loss 2.7906 (3.6488) grad_norm 1.3160 (1.3561) [2022-01-21 13:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][870/1251] eta 0:14:01 lr 0.000640 time 2.2109 (2.2096) loss 4.1518 (3.6490) grad_norm 1.4755 (1.3552) [2022-01-21 13:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][880/1251] eta 0:13:39 lr 0.000640 time 2.2737 (2.2081) loss 3.6489 (3.6470) grad_norm 1.3496 (1.3552) [2022-01-21 13:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][890/1251] eta 0:13:17 lr 0.000640 time 2.4082 (2.2082) loss 4.2547 (3.6460) grad_norm 1.2172 (1.3547) [2022-01-21 13:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][900/1251] eta 0:12:55 lr 0.000640 time 2.0527 (2.2088) loss 3.9052 (3.6448) grad_norm 1.1389 (1.3550) [2022-01-21 13:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][910/1251] eta 0:12:33 lr 0.000639 time 2.9897 (2.2108) loss 4.0459 (3.6452) grad_norm 1.3337 (1.3542) [2022-01-21 13:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][920/1251] eta 0:12:11 lr 0.000639 time 1.7105 (2.2098) loss 4.0553 (3.6427) grad_norm 1.5369 (1.3548) [2022-01-21 13:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][930/1251] eta 0:11:48 lr 0.000639 time 2.1263 (2.2085) loss 3.8747 (3.6417) grad_norm 1.3322 (1.3555) [2022-01-21 13:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][940/1251] eta 0:11:26 lr 0.000639 time 1.7415 (2.2066) loss 4.1909 (3.6419) grad_norm 1.2119 (1.3557) [2022-01-21 13:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][950/1251] eta 0:11:03 lr 0.000639 time 2.4810 (2.2052) loss 3.8087 (3.6440) grad_norm 
1.3328 (1.3567) [2022-01-21 13:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][960/1251] eta 0:10:41 lr 0.000639 time 2.5970 (2.2044) loss 4.0410 (3.6446) grad_norm 1.1961 (1.3573) [2022-01-21 13:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][970/1251] eta 0:10:19 lr 0.000639 time 1.8236 (2.2041) loss 3.7905 (3.6463) grad_norm 1.2252 (1.3569) [2022-01-21 13:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][980/1251] eta 0:09:57 lr 0.000639 time 1.9593 (2.2046) loss 4.4715 (3.6463) grad_norm 1.3256 (1.3564) [2022-01-21 13:10:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][990/1251] eta 0:09:36 lr 0.000639 time 3.0186 (2.2069) loss 3.9648 (3.6448) grad_norm 1.3874 (1.3567) [2022-01-21 13:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1000/1251] eta 0:09:13 lr 0.000639 time 2.0199 (2.2071) loss 3.3767 (3.6443) grad_norm 1.5450 (1.3568) [2022-01-21 13:10:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1010/1251] eta 0:08:51 lr 0.000639 time 2.3029 (2.2071) loss 2.4402 (3.6426) grad_norm 1.2969 (1.3568) [2022-01-21 13:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1020/1251] eta 0:08:29 lr 0.000639 time 1.9335 (2.2063) loss 3.9768 (3.6431) grad_norm 1.3973 (1.3568) [2022-01-21 13:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1030/1251] eta 0:08:07 lr 0.000639 time 2.1595 (2.2060) loss 4.2544 (3.6432) grad_norm 1.2161 (1.3562) [2022-01-21 13:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1040/1251] eta 0:07:45 lr 0.000639 time 1.9726 (2.2058) loss 3.6421 (3.6430) grad_norm 1.5892 (1.3559) [2022-01-21 13:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1050/1251] eta 0:07:23 lr 0.000639 time 2.5734 (2.2064) loss 3.7754 (3.6433) grad_norm 1.3432 (1.3552) [2022-01-21 13:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[123/300][1060/1251] eta 0:07:01 lr 0.000639 time 1.8592 (2.2057) loss 4.2294 (3.6458) grad_norm 1.2690 (1.3544) [2022-01-21 13:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1070/1251] eta 0:06:39 lr 0.000639 time 2.2025 (2.2064) loss 2.4514 (3.6446) grad_norm 1.2150 (1.3535) [2022-01-21 13:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1080/1251] eta 0:06:17 lr 0.000639 time 2.2002 (2.2057) loss 2.8202 (3.6469) grad_norm 1.4050 (1.3539) [2022-01-21 13:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1090/1251] eta 0:05:55 lr 0.000639 time 1.8809 (2.2051) loss 3.9924 (3.6446) grad_norm 1.3208 (1.3535) [2022-01-21 13:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1100/1251] eta 0:05:32 lr 0.000639 time 1.8970 (2.2035) loss 4.0363 (3.6436) grad_norm 1.4150 (1.3534) [2022-01-21 13:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1110/1251] eta 0:05:10 lr 0.000639 time 2.4887 (2.2040) loss 4.1373 (3.6405) grad_norm 1.3355 (1.3534) [2022-01-21 13:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1120/1251] eta 0:04:48 lr 0.000639 time 3.3922 (2.2060) loss 3.8935 (3.6399) grad_norm 1.2815 (1.3535) [2022-01-21 13:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1130/1251] eta 0:04:26 lr 0.000639 time 1.7275 (2.2053) loss 3.9018 (3.6406) grad_norm 1.3271 (1.3536) [2022-01-21 13:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1140/1251] eta 0:04:04 lr 0.000639 time 2.1844 (2.2039) loss 3.9732 (3.6410) grad_norm 1.1900 (1.3528) [2022-01-21 13:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1150/1251] eta 0:03:42 lr 0.000639 time 2.2982 (2.2028) loss 3.9234 (3.6414) grad_norm 1.4359 (1.3524) [2022-01-21 13:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1160/1251] eta 0:03:20 lr 0.000638 time 3.0634 (2.2039) loss 3.9934 (3.6424) 
grad_norm 1.2160 (1.3526) [2022-01-21 13:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1170/1251] eta 0:02:58 lr 0.000638 time 1.9402 (2.2034) loss 3.6350 (3.6421) grad_norm 1.0687 (1.3521) [2022-01-21 13:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1180/1251] eta 0:02:36 lr 0.000638 time 2.7119 (2.2035) loss 2.8661 (3.6427) grad_norm 1.3698 (1.3518) [2022-01-21 13:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1190/1251] eta 0:02:14 lr 0.000638 time 1.6144 (2.2028) loss 4.2811 (3.6435) grad_norm 1.2271 (1.3509) [2022-01-21 13:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1200/1251] eta 0:01:52 lr 0.000638 time 2.7705 (2.2025) loss 4.1601 (3.6433) grad_norm 1.1908 (1.3509) [2022-01-21 13:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1210/1251] eta 0:01:30 lr 0.000638 time 1.9154 (2.2037) loss 3.0104 (3.6402) grad_norm 1.2235 (1.3508) [2022-01-21 13:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1220/1251] eta 0:01:08 lr 0.000638 time 3.2277 (2.2044) loss 2.5704 (3.6393) grad_norm 1.2875 (1.3505) [2022-01-21 13:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1230/1251] eta 0:00:46 lr 0.000638 time 1.5563 (2.2036) loss 3.2915 (3.6381) grad_norm 1.3771 (1.3506) [2022-01-21 13:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1240/1251] eta 0:00:24 lr 0.000638 time 1.5277 (2.2008) loss 4.2372 (3.6377) grad_norm 1.3613 (1.3507) [2022-01-21 13:19:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [123/300][1250/1251] eta 0:00:02 lr 0.000638 time 1.1771 (2.1941) loss 4.1961 (3.6392) grad_norm 1.4004 (1.3510) [2022-01-21 13:19:26 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 123 training takes 0:45:45 [2022-01-21 13:19:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.487 (18.487) Loss 1.1270 (1.1270) Acc@1 73.145 (73.145) 
Acc@5 91.602 (91.602) [2022-01-21 13:20:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.866 (3.254) Loss 1.0384 (1.0893) Acc@1 76.562 (74.840) Acc@5 92.871 (92.436) [2022-01-21 13:20:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.318 (2.446) Loss 1.1157 (1.0871) Acc@1 74.707 (74.642) Acc@5 91.797 (92.518) [2022-01-21 13:20:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.656 (2.169) Loss 1.0990 (1.0782) Acc@1 74.707 (74.921) Acc@5 92.480 (92.654) [2022-01-21 13:20:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.606 (2.171) Loss 1.1601 (1.0776) Acc@1 73.926 (74.898) Acc@5 91.895 (92.712) [2022-01-21 13:21:03 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.774 Acc@5 92.658 [2022-01-21 13:21:03 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-01-21 13:21:03 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.95% [2022-01-21 13:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][0/1251] eta 7:27:00 lr 0.000638 time 21.4391 (21.4391) loss 3.7610 (3.7610) grad_norm 1.4058 (1.4058) [2022-01-21 13:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][10/1251] eta 1:25:19 lr 0.000638 time 2.0932 (4.1254) loss 2.9035 (3.4939) grad_norm 1.2937 (1.3029) [2022-01-21 13:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][20/1251] eta 1:04:08 lr 0.000638 time 1.5138 (3.1262) loss 3.2045 (3.4982) grad_norm 1.2634 (1.3310) [2022-01-21 13:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][30/1251] eta 0:56:43 lr 0.000638 time 1.3930 (2.7873) loss 2.7639 (3.5630) grad_norm 1.2240 (1.3391) [2022-01-21 13:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][40/1251] eta 0:54:12 lr 0.000638 time 3.9259 (2.6860) loss 4.1586 (3.5973) grad_norm 1.1845 (1.3437) [2022-01-21 13:23:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][50/1251] eta 0:52:28 lr 0.000638 time 2.7680 (2.6215) loss 3.6597 (3.5902) grad_norm 1.4001 (1.3633) [2022-01-21 13:23:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][60/1251] eta 0:50:29 lr 0.000638 time 1.8800 (2.5433) loss 4.0315 (3.6123) grad_norm 1.3430 (1.3563) [2022-01-21 13:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][70/1251] eta 0:48:45 lr 0.000638 time 1.8128 (2.4774) loss 2.9380 (3.6194) grad_norm 1.4345 (1.3667) [2022-01-21 13:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][80/1251] eta 0:48:10 lr 0.000638 time 3.8316 (2.4681) loss 3.0206 (3.5993) grad_norm 1.3838 (1.3701) [2022-01-21 13:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][90/1251] eta 0:47:37 lr 0.000638 time 3.4282 (2.4610) loss 4.4442 (3.5601) grad_norm 1.4459 (1.3725) [2022-01-21 13:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][100/1251] eta 0:46:40 lr 0.000638 time 1.5172 (2.4327) loss 4.2142 (3.5728) grad_norm 1.2949 (1.3651) [2022-01-21 13:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][110/1251] eta 0:45:52 lr 0.000638 time 1.7175 (2.4122) loss 4.3674 (3.5867) grad_norm 1.2989 (1.3653) [2022-01-21 13:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][120/1251] eta 0:45:05 lr 0.000638 time 2.4013 (2.3918) loss 2.8725 (3.6013) grad_norm 1.7286 (1.3655) [2022-01-21 13:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][130/1251] eta 0:44:11 lr 0.000638 time 1.9695 (2.3651) loss 4.0424 (3.6092) grad_norm 1.3179 (1.3610) [2022-01-21 13:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][140/1251] eta 0:43:23 lr 0.000638 time 1.8849 (2.3437) loss 4.1169 (3.5827) grad_norm 1.5655 (1.3692) [2022-01-21 13:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][150/1251] eta 0:42:53 lr 0.000638 
time 1.9419 (2.3377) loss 3.9885 (3.5832) grad_norm 1.3165 (1.3724) [2022-01-21 13:27:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][160/1251] eta 0:42:17 lr 0.000637 time 2.3027 (2.3255) loss 3.7450 (3.5864) grad_norm 1.2439 (1.3725) [2022-01-21 13:27:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][170/1251] eta 0:41:46 lr 0.000637 time 2.5436 (2.3190) loss 3.9121 (3.5846) grad_norm 1.3028 (1.3725) [2022-01-21 13:28:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][180/1251] eta 0:41:17 lr 0.000637 time 2.0409 (2.3135) loss 3.2705 (3.5861) grad_norm 1.5260 (1.3721) [2022-01-21 13:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][190/1251] eta 0:40:46 lr 0.000637 time 1.9651 (2.3055) loss 3.2827 (3.5910) grad_norm 1.2714 (1.3747) [2022-01-21 13:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][200/1251] eta 0:40:23 lr 0.000637 time 3.0195 (2.3060) loss 4.3174 (3.5962) grad_norm 1.7215 (1.3753) [2022-01-21 13:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][210/1251] eta 0:39:53 lr 0.000637 time 2.2271 (2.2992) loss 2.5708 (3.6038) grad_norm 1.2566 (1.3742) [2022-01-21 13:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][220/1251] eta 0:39:16 lr 0.000637 time 1.9344 (2.2860) loss 3.2158 (3.6035) grad_norm 1.4897 (1.3723) [2022-01-21 13:29:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][230/1251] eta 0:38:42 lr 0.000637 time 1.8959 (2.2747) loss 2.7253 (3.6051) grad_norm 1.3953 (1.3738) [2022-01-21 13:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][240/1251] eta 0:38:16 lr 0.000637 time 2.8629 (2.2715) loss 3.3988 (3.6238) grad_norm 1.1661 (1.3726) [2022-01-21 13:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][250/1251] eta 0:37:47 lr 0.000637 time 2.0880 (2.2655) loss 3.7517 (3.6226) grad_norm 1.6422 (1.3760) [2022-01-21 13:30:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][260/1251] eta 0:37:24 lr 0.000637 time 3.0974 (2.2650) loss 3.1628 (3.6175) grad_norm 1.4552 (1.3778) [2022-01-21 13:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][270/1251] eta 0:37:00 lr 0.000637 time 1.4318 (2.2637) loss 3.5000 (3.6232) grad_norm 1.4144 (1.3756) [2022-01-21 13:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][280/1251] eta 0:36:37 lr 0.000637 time 1.8479 (2.2635) loss 3.0814 (3.6178) grad_norm 1.3369 (1.3761) [2022-01-21 13:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][290/1251] eta 0:36:13 lr 0.000637 time 2.0978 (2.2622) loss 2.5034 (3.6136) grad_norm 1.3304 (1.3766) [2022-01-21 13:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][300/1251] eta 0:35:44 lr 0.000637 time 2.4084 (2.2552) loss 3.3113 (3.6164) grad_norm 1.2621 (1.3770) [2022-01-21 13:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][310/1251] eta 0:35:19 lr 0.000637 time 2.8766 (2.2520) loss 3.1986 (3.6136) grad_norm 1.4085 (1.3755) [2022-01-21 13:33:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][320/1251] eta 0:34:52 lr 0.000637 time 1.6945 (2.2481) loss 3.6954 (3.6149) grad_norm 1.1686 (1.3738) [2022-01-21 13:33:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][330/1251] eta 0:34:33 lr 0.000637 time 2.3042 (2.2519) loss 3.9946 (3.6181) grad_norm 1.3118 (1.3738) [2022-01-21 13:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][340/1251] eta 0:34:08 lr 0.000637 time 1.8864 (2.2489) loss 4.1349 (3.6238) grad_norm 1.4000 (1.3737) [2022-01-21 13:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][350/1251] eta 0:33:44 lr 0.000637 time 2.4194 (2.2474) loss 2.6848 (3.6307) grad_norm 1.3383 (1.3747) [2022-01-21 13:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][360/1251] eta 0:33:17 lr 
0.000637 time 1.5493 (2.2422) loss 3.7598 (3.6288) grad_norm 1.4312 (1.3758) [2022-01-21 13:34:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][370/1251] eta 0:32:52 lr 0.000637 time 2.3175 (2.2385) loss 4.1321 (3.6323) grad_norm 1.5148 (1.3773) [2022-01-21 13:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][380/1251] eta 0:32:29 lr 0.000637 time 2.1530 (2.2384) loss 4.4093 (3.6359) grad_norm 1.2192 (1.3786) [2022-01-21 13:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][390/1251] eta 0:32:12 lr 0.000637 time 3.7645 (2.2444) loss 4.0826 (3.6415) grad_norm 1.4036 (1.3837) [2022-01-21 13:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][400/1251] eta 0:31:53 lr 0.000637 time 2.9408 (2.2491) loss 3.7934 (3.6394) grad_norm 1.2255 (1.3823) [2022-01-21 13:36:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][410/1251] eta 0:31:32 lr 0.000636 time 2.1568 (2.2498) loss 4.0350 (3.6384) grad_norm 1.5737 (1.3838) [2022-01-21 13:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][420/1251] eta 0:31:06 lr 0.000636 time 1.8855 (2.2464) loss 3.0869 (3.6384) grad_norm 1.5562 (1.3853) [2022-01-21 13:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][430/1251] eta 0:30:38 lr 0.000636 time 1.8705 (2.2398) loss 4.0439 (3.6407) grad_norm 1.4579 (1.3862) [2022-01-21 13:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][440/1251] eta 0:30:10 lr 0.000636 time 2.2063 (2.2329) loss 3.5375 (3.6437) grad_norm 1.2754 (1.3874) [2022-01-21 13:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][450/1251] eta 0:29:45 lr 0.000636 time 1.9278 (2.2288) loss 2.7238 (3.6412) grad_norm 1.2084 (1.3882) [2022-01-21 13:38:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][460/1251] eta 0:29:23 lr 0.000636 time 2.2457 (2.2292) loss 3.2903 (3.6435) grad_norm 1.5035 (1.3875) [2022-01-21 13:38:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][470/1251] eta 0:28:59 lr 0.000636 time 2.2299 (2.2275) loss 3.3807 (3.6454) grad_norm 1.4193 (1.3855) [2022-01-21 13:38:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][480/1251] eta 0:28:36 lr 0.000636 time 2.4990 (2.2258) loss 3.1331 (3.6421) grad_norm 1.5225 (1.3851) [2022-01-21 13:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][490/1251] eta 0:28:16 lr 0.000636 time 1.5669 (2.2297) loss 3.5845 (3.6454) grad_norm 1.3203 (1.3844) [2022-01-21 13:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][500/1251] eta 0:27:56 lr 0.000636 time 1.8660 (2.2326) loss 3.4134 (3.6441) grad_norm 1.3224 (1.3841) [2022-01-21 13:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][510/1251] eta 0:27:36 lr 0.000636 time 3.2122 (2.2350) loss 3.2771 (3.6441) grad_norm 1.5350 (1.3829) [2022-01-21 13:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][520/1251] eta 0:27:14 lr 0.000636 time 2.9094 (2.2360) loss 4.4673 (3.6437) grad_norm 1.3201 (1.3819) [2022-01-21 13:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][530/1251] eta 0:26:51 lr 0.000636 time 1.8445 (2.2349) loss 3.9041 (3.6430) grad_norm 1.1934 (1.3803) [2022-01-21 13:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][540/1251] eta 0:26:26 lr 0.000636 time 1.9142 (2.2320) loss 3.6048 (3.6453) grad_norm 1.2207 (1.3808) [2022-01-21 13:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][550/1251] eta 0:26:01 lr 0.000636 time 2.5570 (2.2275) loss 3.4473 (3.6399) grad_norm 1.2670 (1.3799) [2022-01-21 13:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][560/1251] eta 0:25:37 lr 0.000636 time 2.0159 (2.2255) loss 3.1656 (3.6403) grad_norm 1.6409 (1.3792) [2022-01-21 13:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][570/1251] eta 0:25:15 lr 
0.000636 time 1.8445 (2.2247) loss 4.0210 (3.6444) grad_norm 1.5042 (1.3792) [2022-01-21 13:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][580/1251] eta 0:24:52 lr 0.000636 time 2.2541 (2.2236) loss 3.1851 (3.6400) grad_norm 1.4358 (1.3780) [2022-01-21 13:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][590/1251] eta 0:24:31 lr 0.000636 time 3.2802 (2.2264) loss 4.0378 (3.6376) grad_norm 1.2091 (1.3767) [2022-01-21 13:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][600/1251] eta 0:24:09 lr 0.000636 time 2.7164 (2.2264) loss 4.0776 (3.6398) grad_norm 1.2857 (1.3767) [2022-01-21 13:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][610/1251] eta 0:23:46 lr 0.000636 time 1.8855 (2.2257) loss 3.2878 (3.6415) grad_norm 1.3610 (1.3770) [2022-01-21 13:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][620/1251] eta 0:23:23 lr 0.000636 time 2.1851 (2.2248) loss 3.9194 (3.6417) grad_norm 1.2992 (1.3760) [2022-01-21 13:44:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][630/1251] eta 0:23:02 lr 0.000636 time 3.1624 (2.2255) loss 3.8853 (3.6411) grad_norm 1.8536 (1.3757) [2022-01-21 13:44:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][640/1251] eta 0:22:37 lr 0.000636 time 1.9606 (2.2214) loss 3.5826 (3.6398) grad_norm 1.2625 (1.3740) [2022-01-21 13:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][650/1251] eta 0:22:13 lr 0.000636 time 1.9249 (2.2181) loss 3.1496 (3.6400) grad_norm 1.2515 (1.3723) [2022-01-21 13:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][660/1251] eta 0:21:49 lr 0.000635 time 2.1717 (2.2150) loss 2.8850 (3.6388) grad_norm 1.5786 (1.3720) [2022-01-21 13:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][670/1251] eta 0:21:26 lr 0.000635 time 2.6324 (2.2140) loss 2.4957 (3.6370) grad_norm 1.4849 (1.3720) [2022-01-21 13:46:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][680/1251] eta 0:21:03 lr 0.000635 time 1.8959 (2.2129) loss 4.0237 (3.6403) grad_norm 1.3900 (1.3721) [2022-01-21 13:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][690/1251] eta 0:20:39 lr 0.000635 time 1.8554 (2.2100) loss 3.0108 (3.6368) grad_norm 1.4291 (1.3705) [2022-01-21 13:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][700/1251] eta 0:20:17 lr 0.000635 time 1.8257 (2.2100) loss 3.7325 (3.6343) grad_norm 1.3014 (1.3706) [2022-01-21 13:47:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][710/1251] eta 0:19:55 lr 0.000635 time 2.8251 (2.2107) loss 3.9760 (3.6397) grad_norm 1.4804 (1.3707) [2022-01-21 13:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][720/1251] eta 0:19:35 lr 0.000635 time 2.8067 (2.2144) loss 4.4394 (3.6387) grad_norm 1.4208 (1.3703) [2022-01-21 13:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][730/1251] eta 0:19:14 lr 0.000635 time 2.0864 (2.2154) loss 3.0039 (3.6376) grad_norm 1.3789 (1.3700) [2022-01-21 13:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][740/1251] eta 0:18:52 lr 0.000635 time 2.2592 (2.2169) loss 4.3793 (3.6392) grad_norm 1.2684 (1.3710) [2022-01-21 13:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][750/1251] eta 0:18:30 lr 0.000635 time 2.3740 (2.2166) loss 3.3960 (3.6437) grad_norm 1.1561 (1.3721) [2022-01-21 13:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][760/1251] eta 0:18:07 lr 0.000635 time 1.9533 (2.2149) loss 2.9413 (3.6453) grad_norm 1.3963 (1.3715) [2022-01-21 13:49:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][770/1251] eta 0:17:44 lr 0.000635 time 2.2828 (2.2132) loss 3.9917 (3.6433) grad_norm 1.1912 (1.3713) [2022-01-21 13:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][780/1251] eta 0:17:22 lr 
0.000635 time 2.1257 (2.2128) loss 4.2780 (3.6402) grad_norm 1.2368 (1.3718) [2022-01-21 13:50:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][790/1251] eta 0:16:59 lr 0.000635 time 2.3164 (2.2123) loss 4.2618 (3.6395) grad_norm 1.2187 (1.3702) [2022-01-21 13:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][800/1251] eta 0:16:37 lr 0.000635 time 1.8033 (2.2110) loss 4.2949 (3.6397) grad_norm 1.2132 (1.3705) [2022-01-21 13:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][810/1251] eta 0:16:14 lr 0.000635 time 1.8894 (2.2107) loss 3.8461 (3.6413) grad_norm 1.2365 (1.3706) [2022-01-21 13:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][820/1251] eta 0:15:53 lr 0.000635 time 2.0978 (2.2120) loss 3.1696 (3.6407) grad_norm 1.2486 (1.3704) [2022-01-21 13:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][830/1251] eta 0:15:32 lr 0.000635 time 3.3671 (2.2145) loss 3.1968 (3.6403) grad_norm 1.5256 (1.3698) [2022-01-21 13:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][840/1251] eta 0:15:09 lr 0.000635 time 1.7834 (2.2136) loss 4.1958 (3.6419) grad_norm 1.2933 (1.3704) [2022-01-21 13:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][850/1251] eta 0:14:47 lr 0.000635 time 1.5300 (2.2120) loss 2.3395 (3.6360) grad_norm 1.4354 (1.3706) [2022-01-21 13:52:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][860/1251] eta 0:14:23 lr 0.000635 time 1.8784 (2.2093) loss 3.6035 (3.6342) grad_norm 1.4348 (1.3701) [2022-01-21 13:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][870/1251] eta 0:14:01 lr 0.000635 time 2.4681 (2.2087) loss 2.5704 (3.6314) grad_norm 1.1186 (1.3686) [2022-01-21 13:53:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][880/1251] eta 0:13:39 lr 0.000635 time 2.1047 (2.2076) loss 3.9742 (3.6343) grad_norm 1.5016 (1.3684) [2022-01-21 13:53:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][890/1251] eta 0:13:16 lr 0.000635 time 2.3751 (2.2076) loss 3.7905 (3.6355) grad_norm 1.2334 (1.3683) [2022-01-21 13:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][900/1251] eta 0:12:54 lr 0.000635 time 2.1706 (2.2063) loss 4.0404 (3.6358) grad_norm 1.2060 (1.3676) [2022-01-21 13:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][910/1251] eta 0:12:32 lr 0.000634 time 1.8354 (2.2059) loss 3.4822 (3.6384) grad_norm 1.2804 (1.3676) [2022-01-21 13:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][920/1251] eta 0:12:11 lr 0.000634 time 3.4509 (2.2089) loss 3.2778 (3.6381) grad_norm 1.3350 (1.3680) [2022-01-21 13:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][930/1251] eta 0:11:49 lr 0.000634 time 2.4144 (2.2105) loss 3.8782 (3.6373) grad_norm 1.2125 (1.3681) [2022-01-21 13:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][940/1251] eta 0:11:27 lr 0.000634 time 1.8627 (2.2104) loss 4.0987 (3.6398) grad_norm 1.5713 (1.3677) [2022-01-21 13:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][950/1251] eta 0:11:04 lr 0.000634 time 1.8158 (2.2079) loss 3.9281 (3.6407) grad_norm 1.3224 (1.3675) [2022-01-21 13:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][960/1251] eta 0:10:42 lr 0.000634 time 2.5174 (2.2067) loss 2.7565 (3.6411) grad_norm 1.3487 (1.3669) [2022-01-21 13:56:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][970/1251] eta 0:10:19 lr 0.000634 time 2.2488 (2.2058) loss 4.1635 (3.6411) grad_norm 1.2447 (1.3659) [2022-01-21 13:57:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][980/1251] eta 0:09:57 lr 0.000634 time 1.9370 (2.2052) loss 3.0899 (3.6400) grad_norm 1.3433 (1.3665) [2022-01-21 13:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][990/1251] eta 0:09:35 lr 
0.000634 time 2.1281 (2.2047) loss 3.3226 (3.6403) grad_norm 2.0166 (1.3668) [2022-01-21 13:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1000/1251] eta 0:09:13 lr 0.000634 time 2.9304 (2.2050) loss 3.9796 (3.6412) grad_norm 1.5383 (1.3671) [2022-01-21 13:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1010/1251] eta 0:08:51 lr 0.000634 time 3.5552 (2.2055) loss 3.9129 (3.6405) grad_norm 1.2591 (1.3676) [2022-01-21 13:58:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1020/1251] eta 0:08:29 lr 0.000634 time 1.6368 (2.2053) loss 4.3584 (3.6410) grad_norm 1.5100 (1.3680) [2022-01-21 13:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1030/1251] eta 0:08:07 lr 0.000634 time 2.8885 (2.2060) loss 2.5552 (3.6405) grad_norm 1.3249 (1.3684) [2022-01-21 13:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1040/1251] eta 0:07:45 lr 0.000634 time 2.5514 (2.2057) loss 3.5210 (3.6414) grad_norm 1.3278 (1.3688) [2022-01-21 13:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1050/1251] eta 0:07:23 lr 0.000634 time 2.8299 (2.2063) loss 3.9908 (3.6399) grad_norm 1.3898 (1.3688) [2022-01-21 14:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1060/1251] eta 0:07:01 lr 0.000634 time 2.2111 (2.2052) loss 3.0928 (3.6417) grad_norm 1.3676 (1.3689) [2022-01-21 14:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1070/1251] eta 0:06:39 lr 0.000634 time 1.9425 (2.2044) loss 3.8503 (3.6420) grad_norm 1.2349 (1.3682) [2022-01-21 14:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1080/1251] eta 0:06:17 lr 0.000634 time 2.4252 (2.2051) loss 4.0753 (3.6453) grad_norm 1.2026 (1.3673) [2022-01-21 14:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1090/1251] eta 0:05:55 lr 0.000634 time 2.8614 (2.2053) loss 3.9142 (3.6458) grad_norm 1.3696 (1.3673) [2022-01-21 
14:01:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1100/1251] eta 0:05:32 lr 0.000634 time 1.8815 (2.2035) loss 3.6909 (3.6459) grad_norm 1.4349 (1.3679) [2022-01-21 14:01:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1110/1251] eta 0:05:10 lr 0.000634 time 2.2034 (2.2022) loss 3.9804 (3.6451) grad_norm 1.2601 (1.3678) [2022-01-21 14:02:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1120/1251] eta 0:04:48 lr 0.000634 time 2.2527 (2.2012) loss 4.2088 (3.6457) grad_norm 1.2630 (1.3681) [2022-01-21 14:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1130/1251] eta 0:04:26 lr 0.000634 time 1.7859 (2.2008) loss 3.2570 (3.6454) grad_norm 1.5715 (1.3687) [2022-01-21 14:02:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1140/1251] eta 0:04:04 lr 0.000634 time 2.1908 (2.2013) loss 4.3236 (3.6469) grad_norm 1.3714 (1.3684) [2022-01-21 14:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1150/1251] eta 0:03:42 lr 0.000634 time 2.7125 (2.2022) loss 3.1886 (3.6490) grad_norm 1.6199 (1.3684) [2022-01-21 14:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1160/1251] eta 0:03:20 lr 0.000633 time 2.2878 (2.2022) loss 4.0561 (3.6490) grad_norm 1.1779 (1.3679) [2022-01-21 14:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1170/1251] eta 0:02:58 lr 0.000633 time 2.1248 (2.2016) loss 3.2924 (3.6507) grad_norm 1.2553 (1.3678) [2022-01-21 14:04:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1180/1251] eta 0:02:36 lr 0.000633 time 2.4079 (2.2017) loss 4.2039 (3.6547) grad_norm 1.3330 (1.3680) [2022-01-21 14:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1190/1251] eta 0:02:14 lr 0.000633 time 2.5457 (2.2013) loss 4.3032 (3.6533) grad_norm 1.4946 (1.3687) [2022-01-21 14:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1200/1251] 
eta 0:01:52 lr 0.000633 time 1.9372 (2.2007) loss 2.7915 (3.6528) grad_norm 1.5403 (1.3687) [2022-01-21 14:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1210/1251] eta 0:01:30 lr 0.000633 time 1.8530 (2.2014) loss 4.3928 (3.6553) grad_norm 1.2660 (1.3686) [2022-01-21 14:05:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1220/1251] eta 0:01:08 lr 0.000633 time 2.8605 (2.2011) loss 3.5005 (3.6538) grad_norm 1.2187 (1.3686) [2022-01-21 14:06:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1230/1251] eta 0:00:46 lr 0.000633 time 2.2497 (2.2004) loss 2.8668 (3.6532) grad_norm 1.2822 (1.3687) [2022-01-21 14:06:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1240/1251] eta 0:00:24 lr 0.000633 time 1.5248 (2.1989) loss 3.9299 (3.6527) grad_norm 1.4717 (1.3687) [2022-01-21 14:06:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [124/300][1250/1251] eta 0:00:02 lr 0.000633 time 1.3146 (2.1932) loss 2.8472 (3.6523) grad_norm 1.4105 (1.3686) [2022-01-21 14:06:47 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 124 training takes 0:45:44 [2022-01-21 14:07:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.616 (18.616) Loss 1.0201 (1.0201) Acc@1 74.609 (74.609) Acc@5 93.750 (93.750) [2022-01-21 14:07:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.562 (3.533) Loss 1.0365 (1.0525) Acc@1 76.074 (75.071) Acc@5 92.871 (92.773) [2022-01-21 14:07:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.602 (2.571) Loss 1.0615 (1.0659) Acc@1 75.684 (74.893) Acc@5 91.895 (92.690) [2022-01-21 14:07:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.302 (2.230) Loss 1.0019 (1.0626) Acc@1 76.465 (74.915) Acc@5 93.555 (92.704) [2022-01-21 14:08:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.631 (2.151) Loss 1.0858 (1.0695) Acc@1 74.609 (74.721) Acc@5 92.188 (92.595) 
[2022-01-21 14:08:22 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.818 Acc@5 92.600 [2022-01-21 14:08:22 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-01-21 14:08:22 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.95% [2022-01-21 14:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][0/1251] eta 7:24:42 lr 0.000633 time 21.3286 (21.3286) loss 3.3534 (3.3534) grad_norm 1.4086 (1.4086) [2022-01-21 14:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][10/1251] eta 1:21:14 lr 0.000633 time 1.5827 (3.9282) loss 2.4392 (3.2766) grad_norm 1.4138 (1.3314) [2022-01-21 14:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][20/1251] eta 1:04:11 lr 0.000633 time 2.0673 (3.1291) loss 4.3028 (3.5917) grad_norm 1.3013 (1.3431) [2022-01-21 14:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][30/1251] eta 0:56:29 lr 0.000633 time 1.8891 (2.7763) loss 2.5952 (3.5197) grad_norm 1.4078 (1.3413) [2022-01-21 14:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][40/1251] eta 0:55:23 lr 0.000633 time 7.8021 (2.7447) loss 3.6079 (3.4689) grad_norm 1.2665 (1.3489) [2022-01-21 14:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][50/1251] eta 0:52:42 lr 0.000633 time 1.2933 (2.6329) loss 3.8386 (3.5077) grad_norm 1.6919 (1.3531) [2022-01-21 14:10:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][60/1251] eta 0:50:28 lr 0.000633 time 1.4638 (2.5431) loss 3.9889 (3.5560) grad_norm 1.3403 (1.3472) [2022-01-21 14:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][70/1251] eta 0:48:51 lr 0.000633 time 1.8148 (2.4826) loss 3.9040 (3.5624) grad_norm 1.2722 (1.3546) [2022-01-21 14:11:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][80/1251] eta 0:48:03 lr 0.000633 time 3.6565 (2.4627) loss 3.5246 (3.5514) 
grad_norm 1.6402 (1.3570) [2022-01-21 14:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][90/1251] eta 0:47:12 lr 0.000633 time 1.6654 (2.4398) loss 3.8719 (3.5693) grad_norm 1.2399 (1.3600) [2022-01-21 14:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][100/1251] eta 0:46:32 lr 0.000633 time 2.4564 (2.4263) loss 3.7493 (3.5662) grad_norm 1.4294 (1.3612) [2022-01-21 14:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][110/1251] eta 0:45:38 lr 0.000633 time 1.7649 (2.3999) loss 3.5002 (3.5786) grad_norm 1.2853 (1.3602) [2022-01-21 14:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][120/1251] eta 0:44:50 lr 0.000633 time 2.9047 (2.3786) loss 3.7552 (3.6103) grad_norm 1.6667 (1.3647) [2022-01-21 14:13:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][130/1251] eta 0:44:04 lr 0.000633 time 1.8957 (2.3589) loss 3.6426 (3.6117) grad_norm 1.3869 (1.3632) [2022-01-21 14:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][140/1251] eta 0:43:13 lr 0.000633 time 2.2510 (2.3346) loss 4.0379 (3.6127) grad_norm 1.2850 (1.3629) [2022-01-21 14:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][150/1251] eta 0:42:30 lr 0.000633 time 1.8103 (2.3163) loss 3.9280 (3.6234) grad_norm 1.4621 (1.3620) [2022-01-21 14:14:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][160/1251] eta 0:41:58 lr 0.000632 time 2.9368 (2.3088) loss 4.1302 (3.6141) grad_norm 1.3985 (1.3642) [2022-01-21 14:14:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][170/1251] eta 0:41:35 lr 0.000632 time 2.3316 (2.3082) loss 3.3224 (3.6195) grad_norm 1.5269 (1.3632) [2022-01-21 14:15:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][180/1251] eta 0:41:04 lr 0.000632 time 2.1640 (2.3008) loss 3.9855 (3.6059) grad_norm 1.4759 (1.3610) [2022-01-21 14:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [125/300][190/1251] eta 0:40:38 lr 0.000632 time 2.3724 (2.2988) loss 3.6208 (3.6091) grad_norm 1.3244 (1.3649) [2022-01-21 14:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][200/1251] eta 0:40:19 lr 0.000632 time 3.6006 (2.3024) loss 4.0929 (3.6059) grad_norm 1.2404 (1.3672) [2022-01-21 14:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][210/1251] eta 0:39:54 lr 0.000632 time 1.8596 (2.3001) loss 3.9152 (3.6194) grad_norm 1.2442 (1.3634) [2022-01-21 14:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][220/1251] eta 0:39:27 lr 0.000632 time 1.9887 (2.2962) loss 3.4758 (3.6183) grad_norm 1.5707 (1.3620) [2022-01-21 14:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][230/1251] eta 0:38:50 lr 0.000632 time 1.8673 (2.2830) loss 4.2086 (3.6195) grad_norm 1.4934 (1.3617) [2022-01-21 14:17:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][240/1251] eta 0:38:23 lr 0.000632 time 2.8395 (2.2782) loss 3.4963 (3.6219) grad_norm 1.3129 (1.3610) [2022-01-21 14:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][250/1251] eta 0:37:58 lr 0.000632 time 2.4612 (2.2759) loss 3.2005 (3.6200) grad_norm 1.3824 (1.3604) [2022-01-21 14:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][260/1251] eta 0:37:30 lr 0.000632 time 1.8465 (2.2711) loss 4.1837 (3.6303) grad_norm 1.3564 (1.3599) [2022-01-21 14:18:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][270/1251] eta 0:37:08 lr 0.000632 time 2.8476 (2.2714) loss 3.9766 (3.6426) grad_norm 1.2032 (1.3584) [2022-01-21 14:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][280/1251] eta 0:36:42 lr 0.000632 time 2.4883 (2.2688) loss 3.8498 (3.6411) grad_norm 1.2592 (1.3538) [2022-01-21 14:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][290/1251] eta 0:36:20 lr 0.000632 time 2.3827 (2.2685) loss 3.6286 (3.6433) 
grad_norm 1.3780 (1.3529) [2022-01-21 14:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][300/1251] eta 0:35:52 lr 0.000632 time 1.7832 (2.2639) loss 3.2186 (3.6457) grad_norm 1.5038 (1.3546) [2022-01-21 14:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][310/1251] eta 0:35:24 lr 0.000632 time 2.2666 (2.2579) loss 2.8248 (3.6370) grad_norm 1.3985 (1.3571) [2022-01-21 14:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][320/1251] eta 0:34:56 lr 0.000632 time 1.5767 (2.2520) loss 2.9682 (3.6326) grad_norm 1.2903 (1.3564) [2022-01-21 14:20:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][330/1251] eta 0:34:29 lr 0.000632 time 2.1815 (2.2469) loss 4.5973 (3.6327) grad_norm 1.2349 (1.3566) [2022-01-21 14:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][340/1251] eta 0:34:07 lr 0.000632 time 2.2607 (2.2471) loss 3.9496 (3.6343) grad_norm 1.3540 (1.3577) [2022-01-21 14:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][350/1251] eta 0:33:44 lr 0.000632 time 2.1112 (2.2473) loss 2.7253 (3.6331) grad_norm 1.2952 (1.3562) [2022-01-21 14:21:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][360/1251] eta 0:33:27 lr 0.000632 time 2.1571 (2.2530) loss 4.1721 (3.6250) grad_norm 1.3845 (1.3592) [2022-01-21 14:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][370/1251] eta 0:33:02 lr 0.000632 time 1.7946 (2.2500) loss 3.3800 (3.6189) grad_norm 1.2226 (1.3610) [2022-01-21 14:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][380/1251] eta 0:32:36 lr 0.000632 time 1.9364 (2.2466) loss 3.8350 (3.6205) grad_norm 1.2600 (1.3608) [2022-01-21 14:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][390/1251] eta 0:32:12 lr 0.000632 time 1.9241 (2.2440) loss 3.1024 (3.6242) grad_norm 1.8855 (1.3638) [2022-01-21 14:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [125/300][400/1251] eta 0:31:49 lr 0.000632 time 2.1839 (2.2435) loss 4.1136 (3.6250) grad_norm 1.2475 (1.3651) [2022-01-21 14:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][410/1251] eta 0:31:25 lr 0.000631 time 2.3396 (2.2415) loss 2.8483 (3.6270) grad_norm 1.4884 (1.3662) [2022-01-21 14:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][420/1251] eta 0:30:58 lr 0.000631 time 2.0766 (2.2370) loss 2.7896 (3.6236) grad_norm 1.2122 (1.3667) [2022-01-21 14:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][430/1251] eta 0:30:35 lr 0.000631 time 1.6140 (2.2353) loss 3.4989 (3.6240) grad_norm 1.5537 (1.3672) [2022-01-21 14:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][440/1251] eta 0:30:11 lr 0.000631 time 1.7341 (2.2342) loss 3.3808 (3.6159) grad_norm 1.5416 (1.3675) [2022-01-21 14:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][450/1251] eta 0:29:47 lr 0.000631 time 2.5406 (2.2316) loss 3.1901 (3.6192) grad_norm 1.3364 (1.3696) [2022-01-21 14:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][460/1251] eta 0:29:23 lr 0.000631 time 1.8407 (2.2300) loss 2.4389 (3.6159) grad_norm 1.3380 (1.3683) [2022-01-21 14:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][470/1251] eta 0:29:00 lr 0.000631 time 2.3291 (2.2280) loss 2.5164 (3.6113) grad_norm 1.2873 (1.3689) [2022-01-21 14:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][480/1251] eta 0:28:37 lr 0.000631 time 1.8774 (2.2274) loss 3.5258 (3.6088) grad_norm 1.2384 (1.3674) [2022-01-21 14:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][490/1251] eta 0:28:16 lr 0.000631 time 2.8343 (2.2288) loss 3.9336 (3.6076) grad_norm 1.1879 (1.3664) [2022-01-21 14:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][500/1251] eta 0:27:53 lr 0.000631 time 2.3346 (2.2289) loss 4.2584 (3.6071) 
grad_norm 1.3930 (1.3676) [2022-01-21 14:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][510/1251] eta 0:27:30 lr 0.000631 time 2.2014 (2.2280) loss 3.8578 (3.6063) grad_norm 1.3933 (1.3669) [2022-01-21 14:27:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][520/1251] eta 0:27:08 lr 0.000631 time 2.1453 (2.2277) loss 4.0503 (3.6137) grad_norm 1.3064 (1.3676) [2022-01-21 14:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][530/1251] eta 0:26:44 lr 0.000631 time 2.2898 (2.2259) loss 3.6873 (3.6136) grad_norm 1.2656 (1.3663) [2022-01-21 14:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][540/1251] eta 0:26:21 lr 0.000631 time 2.2389 (2.2238) loss 2.6662 (3.6116) grad_norm 1.3670 (1.3666) [2022-01-21 14:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][550/1251] eta 0:25:56 lr 0.000631 time 2.0890 (2.2205) loss 3.5397 (3.6128) grad_norm 2.0561 (1.3697) [2022-01-21 14:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][560/1251] eta 0:25:34 lr 0.000631 time 3.0688 (2.2203) loss 4.5632 (3.6177) grad_norm 1.4102 (1.3693) [2022-01-21 14:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][570/1251] eta 0:25:13 lr 0.000631 time 1.9610 (2.2225) loss 3.0572 (3.6195) grad_norm 1.2775 (1.3697) [2022-01-21 14:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][580/1251] eta 0:24:50 lr 0.000631 time 2.2281 (2.2217) loss 2.6891 (3.6164) grad_norm 1.2557 (1.3681) [2022-01-21 14:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][590/1251] eta 0:24:27 lr 0.000631 time 2.4404 (2.2205) loss 3.9419 (3.6182) grad_norm 1.8576 (1.3680) [2022-01-21 14:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][600/1251] eta 0:24:04 lr 0.000631 time 3.5676 (2.2190) loss 3.1910 (3.6197) grad_norm 1.6357 (1.3682) [2022-01-21 14:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [125/300][610/1251] eta 0:23:42 lr 0.000631 time 2.0127 (2.2200) loss 3.9598 (3.6168) grad_norm 1.4422 (1.3677) [2022-01-21 14:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][620/1251] eta 0:23:18 lr 0.000631 time 1.6151 (2.2167) loss 4.4986 (3.6229) grad_norm 1.3130 (1.3671) [2022-01-21 14:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][630/1251] eta 0:22:55 lr 0.000631 time 2.4989 (2.2151) loss 3.6027 (3.6186) grad_norm 1.5696 (1.3668) [2022-01-21 14:32:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][640/1251] eta 0:22:33 lr 0.000631 time 3.0783 (2.2152) loss 3.3215 (3.6186) grad_norm 1.3324 (1.3666) [2022-01-21 14:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][650/1251] eta 0:22:10 lr 0.000631 time 1.7346 (2.2132) loss 2.7897 (3.6145) grad_norm 1.4472 (1.3668) [2022-01-21 14:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][660/1251] eta 0:21:48 lr 0.000630 time 1.8879 (2.2136) loss 3.8057 (3.6117) grad_norm 1.2429 (1.3662) [2022-01-21 14:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][670/1251] eta 0:21:27 lr 0.000630 time 2.1051 (2.2152) loss 2.8887 (3.6097) grad_norm 1.4756 (1.3657) [2022-01-21 14:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][680/1251] eta 0:21:05 lr 0.000630 time 3.3085 (2.2163) loss 3.9329 (3.6124) grad_norm 1.2134 (1.3662) [2022-01-21 14:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][690/1251] eta 0:20:41 lr 0.000630 time 1.9175 (2.2135) loss 2.5974 (3.6115) grad_norm 1.2388 (1.3660) [2022-01-21 14:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][700/1251] eta 0:20:19 lr 0.000630 time 1.8678 (2.2132) loss 3.7668 (3.6099) grad_norm 1.4270 (1.3647) [2022-01-21 14:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][710/1251] eta 0:19:56 lr 0.000630 time 2.0202 (2.2121) loss 4.0778 (3.6114) 
grad_norm 1.2249 (1.3644) [2022-01-21 14:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][720/1251] eta 0:19:35 lr 0.000630 time 3.8392 (2.2139) loss 3.0126 (3.6104) grad_norm 1.5745 (1.3634) [2022-01-21 14:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][730/1251] eta 0:19:11 lr 0.000630 time 1.8586 (2.2107) loss 2.7249 (3.6121) grad_norm 1.5321 (1.3639) [2022-01-21 14:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][740/1251] eta 0:18:48 lr 0.000630 time 2.1823 (2.2090) loss 4.0918 (3.6129) grad_norm 1.4588 (1.3640) [2022-01-21 14:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][750/1251] eta 0:18:26 lr 0.000630 time 1.9390 (2.2082) loss 2.9566 (3.6107) grad_norm 1.2539 (1.3641) [2022-01-21 14:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][760/1251] eta 0:18:04 lr 0.000630 time 2.5427 (2.2081) loss 3.9866 (3.6081) grad_norm 1.4244 (1.3652) [2022-01-21 14:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][770/1251] eta 0:17:42 lr 0.000630 time 2.6799 (2.2090) loss 3.8701 (3.6102) grad_norm 1.2744 (1.3654) [2022-01-21 14:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][780/1251] eta 0:17:20 lr 0.000630 time 2.5351 (2.2101) loss 4.2405 (3.6124) grad_norm 1.3463 (1.3657) [2022-01-21 14:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][790/1251] eta 0:16:59 lr 0.000630 time 2.2530 (2.2119) loss 3.4344 (3.6100) grad_norm 1.1639 (1.3647) [2022-01-21 14:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][800/1251] eta 0:16:37 lr 0.000630 time 2.0263 (2.2114) loss 3.5304 (3.6107) grad_norm 1.2890 (1.3639) [2022-01-21 14:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][810/1251] eta 0:16:15 lr 0.000630 time 3.1357 (2.2123) loss 3.2599 (3.6078) grad_norm 1.2869 (1.3645) [2022-01-21 14:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [125/300][820/1251] eta 0:15:52 lr 0.000630 time 2.2604 (2.2096) loss 3.3646 (3.6103) grad_norm 1.4131 (1.3646) [2022-01-21 14:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][830/1251] eta 0:15:29 lr 0.000630 time 2.2335 (2.2069) loss 3.1636 (3.6080) grad_norm 1.5200 (1.3648) [2022-01-21 14:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][840/1251] eta 0:15:05 lr 0.000630 time 1.9491 (2.2038) loss 4.0409 (3.6065) grad_norm 1.2346 (1.3651) [2022-01-21 14:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][850/1251] eta 0:14:43 lr 0.000630 time 2.8741 (2.2035) loss 4.3714 (3.6093) grad_norm 1.1508 (1.3645) [2022-01-21 14:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][860/1251] eta 0:14:21 lr 0.000630 time 2.7314 (2.2028) loss 3.8571 (3.6106) grad_norm 1.2172 (1.3639) [2022-01-21 14:40:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][870/1251] eta 0:14:00 lr 0.000630 time 2.7391 (2.2048) loss 3.0339 (3.6095) grad_norm 1.4033 (1.3633) [2022-01-21 14:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][880/1251] eta 0:13:39 lr 0.000630 time 2.2029 (2.2080) loss 4.2005 (3.6093) grad_norm 1.3036 (1.3650) [2022-01-21 14:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][890/1251] eta 0:13:17 lr 0.000630 time 2.5461 (2.2095) loss 3.7803 (3.6104) grad_norm 1.6150 (1.3652) [2022-01-21 14:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][900/1251] eta 0:12:54 lr 0.000630 time 1.6359 (2.2076) loss 3.9187 (3.6148) grad_norm 1.2371 (1.3652) [2022-01-21 14:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][910/1251] eta 0:12:32 lr 0.000629 time 2.9161 (2.2076) loss 2.8190 (3.6139) grad_norm 1.3574 (1.3651) [2022-01-21 14:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][920/1251] eta 0:12:10 lr 0.000629 time 1.8484 (2.2063) loss 2.4930 (3.6127) 
grad_norm 1.2640 (1.3651) [2022-01-21 14:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][930/1251] eta 0:11:48 lr 0.000629 time 1.8071 (2.2058) loss 4.1803 (3.6117) grad_norm 1.4996 (1.3647) [2022-01-21 14:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][940/1251] eta 0:11:25 lr 0.000629 time 1.6764 (2.2052) loss 4.1694 (3.6126) grad_norm 1.4546 (1.3652) [2022-01-21 14:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][950/1251] eta 0:11:04 lr 0.000629 time 3.2246 (2.2079) loss 3.2014 (3.6124) grad_norm 1.2688 (1.3647) [2022-01-21 14:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][960/1251] eta 0:10:41 lr 0.000629 time 1.8349 (2.2061) loss 4.0729 (3.6102) grad_norm 1.2967 (1.3638) [2022-01-21 14:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][970/1251] eta 0:10:19 lr 0.000629 time 1.8950 (2.2044) loss 2.5871 (3.6104) grad_norm 1.3446 (1.3631) [2022-01-21 14:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][980/1251] eta 0:09:56 lr 0.000629 time 1.8285 (2.2028) loss 3.5673 (3.6116) grad_norm 1.2197 (1.3629) [2022-01-21 14:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][990/1251] eta 0:09:35 lr 0.000629 time 1.9783 (2.2034) loss 4.0963 (3.6117) grad_norm 1.5940 (1.3630) [2022-01-21 14:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1000/1251] eta 0:09:13 lr 0.000629 time 2.5278 (2.2035) loss 3.4044 (3.6143) grad_norm 1.1882 (1.3623) [2022-01-21 14:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1010/1251] eta 0:08:50 lr 0.000629 time 1.6623 (2.2025) loss 3.7481 (3.6167) grad_norm 1.3825 (1.3629) [2022-01-21 14:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1020/1251] eta 0:08:28 lr 0.000629 time 2.3636 (2.2025) loss 3.9413 (3.6173) grad_norm 1.2264 (1.3635) [2022-01-21 14:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [125/300][1030/1251] eta 0:08:07 lr 0.000629 time 1.6902 (2.2038) loss 3.1357 (3.6148) grad_norm 1.3938 (1.3641) [2022-01-21 14:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1040/1251] eta 0:07:45 lr 0.000629 time 3.2023 (2.2051) loss 3.6432 (3.6140) grad_norm 1.5734 (1.3645) [2022-01-21 14:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1050/1251] eta 0:07:23 lr 0.000629 time 1.6129 (2.2053) loss 3.6848 (3.6121) grad_norm 1.3118 (1.3646) [2022-01-21 14:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1060/1251] eta 0:07:00 lr 0.000629 time 1.6338 (2.2041) loss 2.9203 (3.6115) grad_norm 1.2574 (1.3642) [2022-01-21 14:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1070/1251] eta 0:06:39 lr 0.000629 time 2.0933 (2.2044) loss 3.7325 (3.6116) grad_norm 1.2959 (1.3646) [2022-01-21 14:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1080/1251] eta 0:06:17 lr 0.000629 time 3.6972 (2.2054) loss 2.6395 (3.6139) grad_norm 1.6480 (1.3657) [2022-01-21 14:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1090/1251] eta 0:05:55 lr 0.000629 time 1.5723 (2.2055) loss 3.9231 (3.6119) grad_norm 1.3969 (1.3658) [2022-01-21 14:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1100/1251] eta 0:05:32 lr 0.000629 time 1.8348 (2.2040) loss 4.1265 (3.6125) grad_norm 1.3615 (1.3658) [2022-01-21 14:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1110/1251] eta 0:05:10 lr 0.000629 time 2.3001 (2.2038) loss 4.3687 (3.6131) grad_norm 1.2370 (1.3653) [2022-01-21 14:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1120/1251] eta 0:04:48 lr 0.000629 time 3.5363 (2.2049) loss 3.7381 (3.6160) grad_norm 1.7526 (1.3661) [2022-01-21 14:49:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1130/1251] eta 0:04:26 lr 0.000629 time 1.6057 (2.2056) loss 3.9197 
(3.6175) grad_norm 1.3312 (1.3666) [2022-01-21 14:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1140/1251] eta 0:04:04 lr 0.000629 time 1.8961 (2.2049) loss 2.9067 (3.6170) grad_norm 1.2287 (1.3670) [2022-01-21 14:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1150/1251] eta 0:03:42 lr 0.000629 time 1.6166 (2.2032) loss 4.0974 (3.6161) grad_norm 1.5766 (1.3667) [2022-01-21 14:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1160/1251] eta 0:03:20 lr 0.000628 time 2.1052 (2.2019) loss 3.2608 (3.6164) grad_norm 1.3672 (1.3666) [2022-01-21 14:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1170/1251] eta 0:02:58 lr 0.000628 time 1.6312 (2.2004) loss 3.7317 (3.6191) grad_norm 1.4393 (1.3663) [2022-01-21 14:51:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1180/1251] eta 0:02:36 lr 0.000628 time 2.1281 (2.1995) loss 3.8404 (3.6196) grad_norm 1.2619 (1.3657) [2022-01-21 14:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1190/1251] eta 0:02:14 lr 0.000628 time 1.6131 (2.1989) loss 3.8359 (3.6200) grad_norm 1.2941 (1.3655) [2022-01-21 14:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1200/1251] eta 0:01:52 lr 0.000628 time 1.6952 (2.2008) loss 3.3823 (3.6198) grad_norm 1.3565 (1.3648) [2022-01-21 14:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1210/1251] eta 0:01:30 lr 0.000628 time 1.8736 (2.2011) loss 4.2648 (3.6200) grad_norm 1.2141 (1.3643) [2022-01-21 14:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1220/1251] eta 0:01:08 lr 0.000628 time 2.8261 (2.2019) loss 4.1656 (3.6203) grad_norm 1.3343 (1.3636) [2022-01-21 14:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1230/1251] eta 0:00:46 lr 0.000628 time 1.5427 (2.2014) loss 4.1477 (3.6220) grad_norm 1.3649 (1.3640) [2022-01-21 14:53:52 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [125/300][1240/1251] eta 0:00:24 lr 0.000628 time 1.7134 (2.1996) loss 2.5492 (3.6228) grad_norm 1.2571 (1.3636) [2022-01-21 14:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [125/300][1250/1251] eta 0:00:02 lr 0.000628 time 1.2739 (2.1942) loss 3.7022 (3.6234) grad_norm 1.4770 (1.3635) [2022-01-21 14:54:08 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 125 training takes 0:45:45 [2022-01-21 14:54:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.625 (18.625) Loss 1.0342 (1.0342) Acc@1 76.465 (76.465) Acc@5 92.871 (92.871) [2022-01-21 14:54:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.951 (3.304) Loss 1.1137 (1.0609) Acc@1 73.730 (74.929) Acc@5 91.992 (93.084) [2022-01-21 14:54:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.598 (2.445) Loss 1.0388 (1.0681) Acc@1 75.488 (74.856) Acc@5 93.359 (92.871) [2022-01-21 14:55:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.627 (2.230) Loss 1.0894 (1.0686) Acc@1 74.414 (75.120) Acc@5 91.602 (92.751) [2022-01-21 14:55:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.342 (2.172) Loss 1.0739 (1.0704) Acc@1 74.902 (74.955) Acc@5 92.871 (92.714) [2022-01-21 14:55:44 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.982 Acc@5 92.726 [2022-01-21 14:55:44 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-01-21 14:55:44 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 74.98% [2022-01-21 14:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][0/1251] eta 7:26:50 lr 0.000628 time 21.4310 (21.4310) loss 3.9685 (3.9685) grad_norm 1.5053 (1.5053) [2022-01-21 14:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][10/1251] eta 1:25:41 lr 0.000628 time 1.4024 (4.1432) loss 3.4631 (3.7172) grad_norm 1.3460 (1.3484) [2022-01-21 14:56:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][20/1251] eta 1:08:20 lr 0.000628 time 2.0883 (3.3313) loss 4.4613 (3.7062) grad_norm 1.5673 (1.3884) [2022-01-21 14:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][30/1251] eta 0:59:46 lr 0.000628 time 1.5062 (2.9372) loss 4.3585 (3.7059) grad_norm 1.2746 (1.3821) [2022-01-21 14:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][40/1251] eta 0:57:06 lr 0.000628 time 3.9067 (2.8291) loss 4.1028 (3.7081) grad_norm 1.2984 (1.3793) [2022-01-21 14:58:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][50/1251] eta 0:54:16 lr 0.000628 time 1.4867 (2.7116) loss 3.4439 (3.7057) grad_norm 1.5685 (1.3733) [2022-01-21 14:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][60/1251] eta 0:51:20 lr 0.000628 time 1.9715 (2.5866) loss 2.9264 (3.6889) grad_norm 1.3272 (1.3622) [2022-01-21 14:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][70/1251] eta 0:49:19 lr 0.000628 time 1.9276 (2.5055) loss 3.2627 (3.6963) grad_norm 1.4163 (1.3677) [2022-01-21 14:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][80/1251] eta 0:48:14 lr 0.000628 time 3.1238 (2.4716) loss 3.7951 (3.6608) grad_norm 1.2950 (1.3638) [2022-01-21 14:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][90/1251] eta 0:47:42 lr 0.000628 time 2.8411 (2.4660) loss 2.7618 (3.6541) grad_norm 1.3583 (1.3685) [2022-01-21 14:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][100/1251] eta 0:46:53 lr 0.000628 time 1.8504 (2.4445) loss 4.1503 (3.6672) grad_norm 1.3483 (1.3705) [2022-01-21 15:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][110/1251] eta 0:46:08 lr 0.000628 time 2.6159 (2.4263) loss 2.6260 (3.6600) grad_norm 1.2640 (1.3587) [2022-01-21 15:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][120/1251] eta 0:45:26 lr 0.000628 time 
2.2779 (2.4103) loss 3.6232 (3.6497) grad_norm 1.4495 (1.3570) [2022-01-21 15:00:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][130/1251] eta 0:44:16 lr 0.000628 time 1.9482 (2.3698) loss 4.3850 (3.6474) grad_norm 1.4423 (1.3569) [2022-01-21 15:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][140/1251] eta 0:43:24 lr 0.000628 time 1.8838 (2.3446) loss 3.0990 (3.6396) grad_norm 1.5493 (1.3592) [2022-01-21 15:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][150/1251] eta 0:43:14 lr 0.000627 time 3.6650 (2.3563) loss 3.0774 (3.6338) grad_norm 1.5287 (1.3578) [2022-01-21 15:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][160/1251] eta 0:42:47 lr 0.000627 time 1.9646 (2.3534) loss 4.0479 (3.6382) grad_norm 1.2667 (1.3566) [2022-01-21 15:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][170/1251] eta 0:42:19 lr 0.000627 time 2.4841 (2.3495) loss 4.1330 (3.6420) grad_norm 1.7307 (1.3558) [2022-01-21 15:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][180/1251] eta 0:41:40 lr 0.000627 time 1.6193 (2.3351) loss 3.8977 (3.6316) grad_norm 1.3082 (1.3563) [2022-01-21 15:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][190/1251] eta 0:40:58 lr 0.000627 time 2.5377 (2.3167) loss 4.0703 (3.6314) grad_norm 1.5239 (1.3621) [2022-01-21 15:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][200/1251] eta 0:40:16 lr 0.000627 time 1.7595 (2.2993) loss 4.6070 (3.6373) grad_norm 1.2702 (1.3617) [2022-01-21 15:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][210/1251] eta 0:39:39 lr 0.000627 time 2.2304 (2.2860) loss 3.1219 (3.6348) grad_norm 1.3435 (1.3612) [2022-01-21 15:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][220/1251] eta 0:39:07 lr 0.000627 time 2.0806 (2.2766) loss 3.7906 (3.6440) grad_norm 1.2611 (1.3639) [2022-01-21 15:04:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][230/1251] eta 0:38:44 lr 0.000627 time 1.8929 (2.2763) loss 2.6890 (3.6335) grad_norm 1.8220 (1.3647) [2022-01-21 15:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][240/1251] eta 0:38:15 lr 0.000627 time 1.5894 (2.2706) loss 4.1262 (3.6345) grad_norm 1.2111 (1.3649) [2022-01-21 15:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][250/1251] eta 0:37:56 lr 0.000627 time 2.1028 (2.2747) loss 3.7617 (3.6394) grad_norm 1.3252 (1.3664) [2022-01-21 15:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][260/1251] eta 0:37:32 lr 0.000627 time 2.1579 (2.2733) loss 3.9508 (3.6381) grad_norm 1.4707 (1.3691) [2022-01-21 15:06:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][270/1251] eta 0:37:11 lr 0.000627 time 1.8749 (2.2742) loss 3.4786 (3.6430) grad_norm 1.1957 (1.3702) [2022-01-21 15:06:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][280/1251] eta 0:36:55 lr 0.000627 time 2.6403 (2.2812) loss 4.1212 (3.6569) grad_norm 1.3499 (1.3692) [2022-01-21 15:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][290/1251] eta 0:36:31 lr 0.000627 time 1.9000 (2.2804) loss 4.2070 (3.6634) grad_norm 1.3837 (1.3728) [2022-01-21 15:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][300/1251] eta 0:35:59 lr 0.000627 time 1.7152 (2.2711) loss 2.8222 (3.6515) grad_norm 1.2244 (1.3704) [2022-01-21 15:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][310/1251] eta 0:35:26 lr 0.000627 time 1.9687 (2.2603) loss 4.1619 (3.6484) grad_norm 1.3844 (1.3687) [2022-01-21 15:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][320/1251] eta 0:35:01 lr 0.000627 time 2.2248 (2.2567) loss 3.8277 (3.6505) grad_norm 1.5050 (1.3672) [2022-01-21 15:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][330/1251] eta 0:34:37 lr 
0.000627 time 2.5096 (2.2559) loss 4.0759 (3.6383) grad_norm 1.4387 (1.3652) [2022-01-21 15:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][340/1251] eta 0:34:15 lr 0.000627 time 1.9135 (2.2560) loss 3.3245 (3.6395) grad_norm 1.2236 (1.3642) [2022-01-21 15:08:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][350/1251] eta 0:33:53 lr 0.000627 time 2.0177 (2.2564) loss 4.5437 (3.6411) grad_norm 1.4471 (1.3637) [2022-01-21 15:09:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][360/1251] eta 0:33:27 lr 0.000627 time 1.8948 (2.2532) loss 4.4724 (3.6458) grad_norm 1.3852 (1.3647) [2022-01-21 15:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][370/1251] eta 0:33:03 lr 0.000627 time 2.1623 (2.2514) loss 3.7530 (3.6450) grad_norm 1.4448 (1.3686) [2022-01-21 15:10:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][380/1251] eta 0:32:36 lr 0.000627 time 2.2664 (2.2468) loss 3.1893 (3.6439) grad_norm 1.3795 (1.3694) [2022-01-21 15:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][390/1251] eta 0:32:13 lr 0.000627 time 2.2309 (2.2458) loss 3.7727 (3.6486) grad_norm 1.3285 (1.3708) [2022-01-21 15:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][400/1251] eta 0:31:49 lr 0.000626 time 2.0958 (2.2434) loss 3.6350 (3.6477) grad_norm 1.1951 (1.3745) [2022-01-21 15:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][410/1251] eta 0:31:25 lr 0.000626 time 1.9606 (2.2424) loss 3.5578 (3.6476) grad_norm 1.4290 (1.3752) [2022-01-21 15:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][420/1251] eta 0:31:00 lr 0.000626 time 2.8163 (2.2388) loss 4.1643 (3.6503) grad_norm 1.2341 (1.3748) [2022-01-21 15:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][430/1251] eta 0:30:37 lr 0.000626 time 1.9866 (2.2380) loss 3.0685 (3.6482) grad_norm 1.4792 (1.3739) [2022-01-21 15:12:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][440/1251] eta 0:30:16 lr 0.000626 time 1.8370 (2.2396) loss 3.9112 (3.6479) grad_norm 1.3116 (1.3743) [2022-01-21 15:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][450/1251] eta 0:29:54 lr 0.000626 time 2.4572 (2.2408) loss 4.3191 (3.6460) grad_norm 1.2997 (1.3737) [2022-01-21 15:12:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][460/1251] eta 0:29:29 lr 0.000626 time 1.6140 (2.2370) loss 4.2944 (3.6414) grad_norm 1.4471 (1.3728) [2022-01-21 15:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][470/1251] eta 0:29:04 lr 0.000626 time 1.6450 (2.2335) loss 3.3018 (3.6481) grad_norm 1.3586 (1.3724) [2022-01-21 15:13:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][480/1251] eta 0:28:41 lr 0.000626 time 2.2787 (2.2324) loss 4.3961 (3.6481) grad_norm 1.6007 (1.3731) [2022-01-21 15:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][490/1251] eta 0:28:21 lr 0.000626 time 3.0326 (2.2361) loss 4.0156 (3.6479) grad_norm 1.3591 (1.3737) [2022-01-21 15:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][500/1251] eta 0:28:02 lr 0.000626 time 2.1155 (2.2408) loss 3.0977 (3.6438) grad_norm 1.2151 (1.3736) [2022-01-21 15:14:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][510/1251] eta 0:27:41 lr 0.000626 time 2.0087 (2.2429) loss 3.3336 (3.6388) grad_norm 1.2804 (1.3728) [2022-01-21 15:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][520/1251] eta 0:27:19 lr 0.000626 time 2.3885 (2.2432) loss 4.0937 (3.6403) grad_norm 1.3543 (1.3718) [2022-01-21 15:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][530/1251] eta 0:26:55 lr 0.000626 time 2.2358 (2.2407) loss 4.4039 (3.6366) grad_norm 1.3893 (1.3714) [2022-01-21 15:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][540/1251] eta 0:26:28 lr 
0.000626 time 1.9771 (2.2341) loss 3.9369 (3.6385) grad_norm 1.1583 (1.3701) [2022-01-21 15:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][550/1251] eta 0:26:03 lr 0.000626 time 2.6332 (2.2301) loss 3.8939 (3.6379) grad_norm 1.1595 (1.3691) [2022-01-21 15:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][560/1251] eta 0:25:41 lr 0.000626 time 2.7802 (2.2302) loss 2.6233 (3.6349) grad_norm 1.3058 (1.3702) [2022-01-21 15:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][570/1251] eta 0:25:18 lr 0.000626 time 2.7098 (2.2296) loss 3.9210 (3.6321) grad_norm 1.5984 (1.3712) [2022-01-21 15:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][580/1251] eta 0:24:55 lr 0.000626 time 1.9179 (2.2294) loss 4.2617 (3.6323) grad_norm 1.4510 (1.3714) [2022-01-21 15:17:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][590/1251] eta 0:24:34 lr 0.000626 time 2.8794 (2.2300) loss 4.2612 (3.6375) grad_norm 1.3616 (1.3713) [2022-01-21 15:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][600/1251] eta 0:24:11 lr 0.000626 time 2.2711 (2.2296) loss 3.2951 (3.6389) grad_norm 1.1805 (1.3720) [2022-01-21 15:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][610/1251] eta 0:23:48 lr 0.000626 time 2.2342 (2.2289) loss 3.9279 (3.6400) grad_norm 1.3190 (1.3721) [2022-01-21 15:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][620/1251] eta 0:23:25 lr 0.000626 time 1.9320 (2.2282) loss 2.9903 (3.6401) grad_norm 1.5073 (1.3727) [2022-01-21 15:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][630/1251] eta 0:23:03 lr 0.000626 time 2.6729 (2.2286) loss 3.9273 (3.6404) grad_norm 1.9370 (1.3725) [2022-01-21 15:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][640/1251] eta 0:22:42 lr 0.000626 time 2.4329 (2.2292) loss 3.9733 (3.6400) grad_norm 1.3056 (1.3715) [2022-01-21 15:19:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][650/1251] eta 0:22:17 lr 0.000625 time 1.6543 (2.2257) loss 3.7948 (3.6408) grad_norm 1.4327 (1.3719) [2022-01-21 15:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][660/1251] eta 0:21:54 lr 0.000625 time 1.9539 (2.2236) loss 3.6736 (3.6354) grad_norm 1.3008 (1.3734) [2022-01-21 15:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][670/1251] eta 0:21:31 lr 0.000625 time 2.8843 (2.2224) loss 3.9459 (3.6310) grad_norm 1.3189 (1.3746) [2022-01-21 15:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][680/1251] eta 0:21:07 lr 0.000625 time 1.6233 (2.2200) loss 3.8358 (3.6321) grad_norm 1.4830 (1.3768) [2022-01-21 15:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][690/1251] eta 0:20:44 lr 0.000625 time 2.1794 (2.2187) loss 3.5222 (3.6357) grad_norm 1.2372 (1.3759) [2022-01-21 15:21:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][700/1251] eta 0:20:21 lr 0.000625 time 2.2167 (2.2174) loss 3.5650 (3.6414) grad_norm 1.3794 (1.3760) [2022-01-21 15:22:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][710/1251] eta 0:19:59 lr 0.000625 time 2.3225 (2.2173) loss 3.2764 (3.6396) grad_norm 1.3412 (1.3768) [2022-01-21 15:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][720/1251] eta 0:19:38 lr 0.000625 time 2.1450 (2.2195) loss 3.6673 (3.6386) grad_norm 1.2402 (1.3757) [2022-01-21 15:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][730/1251] eta 0:19:16 lr 0.000625 time 1.5793 (2.2195) loss 3.1217 (3.6369) grad_norm 1.3223 (1.3775) [2022-01-21 15:23:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][740/1251] eta 0:18:54 lr 0.000625 time 1.9681 (2.2198) loss 4.0376 (3.6356) grad_norm 1.5043 (1.3769) [2022-01-21 15:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][750/1251] eta 0:18:31 lr 
0.000625 time 2.4016 (2.2193) loss 3.3365 (3.6356) grad_norm 1.3594 (1.3772) [2022-01-21 15:23:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][760/1251] eta 0:18:09 lr 0.000625 time 2.2235 (2.2179) loss 2.4967 (3.6323) grad_norm 1.2991 (1.3773) [2022-01-21 15:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][770/1251] eta 0:17:46 lr 0.000625 time 2.4219 (2.2169) loss 4.2839 (3.6359) grad_norm 1.2305 (1.3763) [2022-01-21 15:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][780/1251] eta 0:17:24 lr 0.000625 time 2.4500 (2.2167) loss 2.8339 (3.6353) grad_norm 1.3210 (1.3762) [2022-01-21 15:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][790/1251] eta 0:17:01 lr 0.000625 time 2.2066 (2.2151) loss 2.5462 (3.6350) grad_norm 1.2856 (1.3753) [2022-01-21 15:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][800/1251] eta 0:16:37 lr 0.000625 time 1.9736 (2.2127) loss 3.8812 (3.6358) grad_norm 1.2656 (1.3755) [2022-01-21 15:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][810/1251] eta 0:16:16 lr 0.000625 time 2.9149 (2.2138) loss 3.9765 (3.6377) grad_norm 1.3984 (1.3759) [2022-01-21 15:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][820/1251] eta 0:15:54 lr 0.000625 time 2.5092 (2.2138) loss 3.2158 (3.6374) grad_norm 1.7247 (1.3763) [2022-01-21 15:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][830/1251] eta 0:15:32 lr 0.000625 time 1.7931 (2.2139) loss 4.0213 (3.6389) grad_norm 1.2378 (1.3762) [2022-01-21 15:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][840/1251] eta 0:15:09 lr 0.000625 time 1.9032 (2.2133) loss 3.7449 (3.6405) grad_norm 1.1756 (1.3755) [2022-01-21 15:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][850/1251] eta 0:14:46 lr 0.000625 time 2.2447 (2.2118) loss 3.5279 (3.6432) grad_norm 1.2105 (1.3747) [2022-01-21 15:27:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][860/1251] eta 0:14:25 lr 0.000625 time 1.9218 (2.2130) loss 4.3686 (3.6409) grad_norm 1.2941 (1.3753) [2022-01-21 15:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][870/1251] eta 0:14:03 lr 0.000625 time 1.7875 (2.2130) loss 3.6426 (3.6428) grad_norm 1.6085 (1.3761) [2022-01-21 15:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][880/1251] eta 0:13:40 lr 0.000625 time 1.5637 (2.2126) loss 3.4175 (3.6453) grad_norm 1.3799 (1.3759) [2022-01-21 15:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][890/1251] eta 0:13:18 lr 0.000625 time 1.5798 (2.2112) loss 2.3763 (3.6448) grad_norm 1.2122 (1.3747) [2022-01-21 15:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][900/1251] eta 0:12:55 lr 0.000624 time 1.8717 (2.2106) loss 3.8950 (3.6401) grad_norm 1.1277 (1.3733) [2022-01-21 15:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][910/1251] eta 0:12:33 lr 0.000624 time 1.9245 (2.2096) loss 4.2404 (3.6371) grad_norm 1.4389 (1.3720) [2022-01-21 15:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][920/1251] eta 0:12:10 lr 0.000624 time 2.0408 (2.2083) loss 3.2568 (3.6358) grad_norm 1.3653 (1.3712) [2022-01-21 15:30:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][930/1251] eta 0:11:48 lr 0.000624 time 2.0600 (2.2083) loss 3.5112 (3.6351) grad_norm 1.3261 (1.3702) [2022-01-21 15:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][940/1251] eta 0:11:26 lr 0.000624 time 2.3173 (2.2080) loss 3.7002 (3.6351) grad_norm 1.3072 (1.3698) [2022-01-21 15:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][950/1251] eta 0:11:05 lr 0.000624 time 2.4953 (2.2093) loss 2.5277 (3.6328) grad_norm 1.3787 (1.3703) [2022-01-21 15:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][960/1251] eta 0:10:42 lr 
0.000624 time 2.1553 (2.2087) loss 3.6833 (3.6318) grad_norm 1.2316 (1.3713) [2022-01-21 15:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][970/1251] eta 0:10:20 lr 0.000624 time 2.1575 (2.2099) loss 4.1528 (3.6330) grad_norm 1.2705 (1.3711) [2022-01-21 15:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][980/1251] eta 0:09:58 lr 0.000624 time 1.8862 (2.2096) loss 3.0781 (3.6316) grad_norm 1.3149 (1.3705) [2022-01-21 15:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][990/1251] eta 0:09:37 lr 0.000624 time 2.1993 (2.2114) loss 4.0175 (3.6335) grad_norm 1.3946 (1.3701) [2022-01-21 15:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1000/1251] eta 0:09:14 lr 0.000624 time 1.6065 (2.2091) loss 3.7852 (3.6340) grad_norm 1.3944 (1.3697) [2022-01-21 15:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1010/1251] eta 0:08:52 lr 0.000624 time 2.2216 (2.2079) loss 3.5365 (3.6349) grad_norm 1.2748 (1.3703) [2022-01-21 15:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1020/1251] eta 0:08:29 lr 0.000624 time 1.9865 (2.2072) loss 3.9972 (3.6369) grad_norm 1.2049 (1.3701) [2022-01-21 15:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1030/1251] eta 0:08:07 lr 0.000624 time 2.2559 (2.2080) loss 3.6376 (3.6381) grad_norm 1.5149 (1.3721) [2022-01-21 15:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1040/1251] eta 0:07:45 lr 0.000624 time 1.8189 (2.2076) loss 3.6550 (3.6352) grad_norm 1.4150 (1.3719) [2022-01-21 15:34:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1050/1251] eta 0:07:23 lr 0.000624 time 2.2113 (2.2074) loss 3.3461 (3.6347) grad_norm 1.2791 (1.3713) [2022-01-21 15:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1060/1251] eta 0:07:01 lr 0.000624 time 1.9142 (2.2064) loss 3.8073 (3.6348) grad_norm 1.4282 (1.3711) [2022-01-21 
15:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1070/1251] eta 0:06:39 lr 0.000624 time 1.6425 (2.2050) loss 4.2500 (3.6347) grad_norm 1.3973 (1.3702) [2022-01-21 15:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1080/1251] eta 0:06:16 lr 0.000624 time 2.1882 (2.2040) loss 3.2068 (3.6352) grad_norm 1.6682 (1.3702) [2022-01-21 15:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1090/1251] eta 0:05:54 lr 0.000624 time 1.7985 (2.2027) loss 3.8439 (3.6337) grad_norm 1.3389 (1.3695) [2022-01-21 15:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1100/1251] eta 0:05:32 lr 0.000624 time 2.1745 (2.2013) loss 3.1249 (3.6331) grad_norm 1.5262 (1.3702) [2022-01-21 15:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1110/1251] eta 0:05:10 lr 0.000624 time 1.4743 (2.2005) loss 4.1870 (3.6337) grad_norm 1.3320 (1.3712) [2022-01-21 15:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1120/1251] eta 0:04:48 lr 0.000624 time 2.5339 (2.2006) loss 4.1205 (3.6330) grad_norm 1.2413 (1.3712) [2022-01-21 15:37:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1130/1251] eta 0:04:26 lr 0.000624 time 1.5524 (2.2014) loss 4.3333 (3.6344) grad_norm 1.2472 (1.3712) [2022-01-21 15:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1140/1251] eta 0:04:04 lr 0.000624 time 2.7645 (2.2020) loss 3.5223 (3.6341) grad_norm 1.1835 (1.3707) [2022-01-21 15:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1150/1251] eta 0:03:42 lr 0.000623 time 1.6904 (2.2019) loss 3.5655 (3.6341) grad_norm 1.5654 (1.3706) [2022-01-21 15:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1160/1251] eta 0:03:20 lr 0.000623 time 2.3619 (2.2030) loss 3.7033 (3.6321) grad_norm 1.2858 (1.3700) [2022-01-21 15:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1170/1251] 
eta 0:02:58 lr 0.000623 time 2.4780 (2.2032) loss 4.4980 (3.6317) grad_norm 1.2889 (1.3692) [2022-01-21 15:39:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1180/1251] eta 0:02:36 lr 0.000623 time 2.3593 (2.2031) loss 3.6920 (3.6303) grad_norm 1.3736 (1.3691) [2022-01-21 15:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1190/1251] eta 0:02:14 lr 0.000623 time 1.7327 (2.2024) loss 3.7818 (3.6301) grad_norm 1.3278 (1.3687) [2022-01-21 15:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1200/1251] eta 0:01:52 lr 0.000623 time 2.8475 (2.2018) loss 4.0991 (3.6288) grad_norm 1.2834 (1.3688) [2022-01-21 15:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1210/1251] eta 0:01:30 lr 0.000623 time 2.1671 (2.2020) loss 4.0039 (3.6308) grad_norm 1.3792 (1.3687) [2022-01-21 15:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1220/1251] eta 0:01:08 lr 0.000623 time 1.6042 (2.2008) loss 4.1104 (3.6320) grad_norm 1.3745 (1.3684) [2022-01-21 15:40:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1230/1251] eta 0:00:46 lr 0.000623 time 1.8619 (2.2004) loss 3.8842 (3.6330) grad_norm 1.2899 (1.3677) [2022-01-21 15:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1240/1251] eta 0:00:24 lr 0.000623 time 2.0128 (2.1997) loss 3.1247 (3.6336) grad_norm 1.7836 (1.3693) [2022-01-21 15:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [126/300][1250/1251] eta 0:00:02 lr 0.000623 time 1.1569 (2.1940) loss 4.2848 (3.6354) grad_norm 1.3423 (1.3686) [2022-01-21 15:41:29 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 126 training takes 0:45:45 [2022-01-21 15:41:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.574 (18.574) Loss 1.1716 (1.1716) Acc@1 73.047 (73.047) Acc@5 91.992 (91.992) [2022-01-21 15:42:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.581 (3.399) 
Loss 1.0162 (1.0686) Acc@1 74.512 (74.911) Acc@5 94.238 (92.738) [2022-01-21 15:42:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.953 (2.532) Loss 1.0942 (1.0634) Acc@1 74.902 (74.926) Acc@5 91.895 (92.787) [2022-01-21 15:42:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.580 (2.231) Loss 1.0333 (1.0548) Acc@1 76.367 (75.079) Acc@5 93.457 (92.852) [2022-01-21 15:42:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.463 (2.154) Loss 1.0363 (1.0555) Acc@1 76.172 (75.143) Acc@5 92.676 (92.838) [2022-01-21 15:43:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.068 Acc@5 92.816 [2022-01-21 15:43:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.1% [2022-01-21 15:43:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.07% [2022-01-21 15:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][0/1251] eta 7:34:10 lr 0.000623 time 21.7829 (21.7829) loss 4.3345 (4.3345) grad_norm 1.3434 (1.3434) [2022-01-21 15:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][10/1251] eta 1:24:02 lr 0.000623 time 1.4471 (4.0631) loss 4.1204 (3.8208) grad_norm 1.3903 (1.3388) [2022-01-21 15:44:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][20/1251] eta 1:04:14 lr 0.000623 time 1.5443 (3.1311) loss 3.9614 (3.6942) grad_norm 1.3732 (1.3173) [2022-01-21 15:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][30/1251] eta 0:59:14 lr 0.000623 time 1.6132 (2.9111) loss 4.3887 (3.6418) grad_norm 1.4152 (1.3270) [2022-01-21 15:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][40/1251] eta 0:56:07 lr 0.000623 time 3.5598 (2.7804) loss 3.1610 (3.6565) grad_norm 1.3061 (1.3655) [2022-01-21 15:45:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][50/1251] eta 0:53:34 lr 0.000623 time 2.4818 (2.6767) loss 2.4234 (3.6115) 
grad_norm 1.2942 (1.3558) [2022-01-21 15:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][60/1251] eta 0:51:03 lr 0.000623 time 2.3948 (2.5719) loss 3.3940 (3.6159) grad_norm 1.2675 (1.3453) [2022-01-21 15:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][70/1251] eta 0:49:42 lr 0.000623 time 2.5153 (2.5258) loss 2.4368 (3.6011) grad_norm 1.3707 (1.3394) [2022-01-21 15:46:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][80/1251] eta 0:48:34 lr 0.000623 time 3.1786 (2.4888) loss 4.0354 (3.5718) grad_norm 1.3256 (1.3641) [2022-01-21 15:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][90/1251] eta 0:47:16 lr 0.000623 time 2.1633 (2.4435) loss 3.7830 (3.5679) grad_norm 1.1310 (1.3760) [2022-01-21 15:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][100/1251] eta 0:46:14 lr 0.000623 time 2.0337 (2.4101) loss 3.5567 (3.5776) grad_norm 1.4498 (1.3740) [2022-01-21 15:47:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][110/1251] eta 0:45:27 lr 0.000623 time 2.3585 (2.3904) loss 3.2579 (3.5824) grad_norm 1.3518 (1.3751) [2022-01-21 15:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][120/1251] eta 0:44:43 lr 0.000623 time 3.0184 (2.3731) loss 4.0652 (3.5897) grad_norm 1.4527 (1.3778) [2022-01-21 15:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][130/1251] eta 0:44:07 lr 0.000623 time 2.5331 (2.3620) loss 4.0587 (3.5748) grad_norm 1.4506 (1.3778) [2022-01-21 15:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][140/1251] eta 0:43:27 lr 0.000623 time 2.4632 (2.3471) loss 4.2367 (3.5744) grad_norm 1.3086 (1.3725) [2022-01-21 15:48:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][150/1251] eta 0:42:39 lr 0.000622 time 1.8619 (2.3246) loss 3.4257 (3.5851) grad_norm 1.3785 (1.3723) [2022-01-21 15:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[127/300][160/1251] eta 0:42:05 lr 0.000622 time 2.3680 (2.3153) loss 2.3746 (3.5767) grad_norm 1.1653 (1.3751) [2022-01-21 15:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][170/1251] eta 0:41:37 lr 0.000622 time 2.0332 (2.3106) loss 3.6560 (3.5906) grad_norm 1.2953 (1.3727) [2022-01-21 15:50:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][180/1251] eta 0:41:12 lr 0.000622 time 2.4177 (2.3083) loss 3.5947 (3.6070) grad_norm 1.4383 (1.3708) [2022-01-21 15:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][190/1251] eta 0:40:47 lr 0.000622 time 1.8082 (2.3070) loss 4.4698 (3.6277) grad_norm 1.4310 (1.3683) [2022-01-21 15:50:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][200/1251] eta 0:40:18 lr 0.000622 time 2.8205 (2.3010) loss 3.9838 (3.6330) grad_norm 1.2991 (1.3652) [2022-01-21 15:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][210/1251] eta 0:39:41 lr 0.000622 time 1.9752 (2.2874) loss 3.8433 (3.6159) grad_norm 1.3075 (1.3617) [2022-01-21 15:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][220/1251] eta 0:39:06 lr 0.000622 time 1.8618 (2.2756) loss 3.8978 (3.6262) grad_norm 1.1933 (1.3633) [2022-01-21 15:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][230/1251] eta 0:38:38 lr 0.000622 time 1.8363 (2.2707) loss 3.6581 (3.6225) grad_norm 1.7803 (1.3677) [2022-01-21 15:52:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][240/1251] eta 0:38:21 lr 0.000622 time 3.0973 (2.2767) loss 3.8358 (3.6271) grad_norm 1.2658 (1.3695) [2022-01-21 15:52:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][250/1251] eta 0:37:57 lr 0.000622 time 2.5446 (2.2753) loss 4.4450 (3.6347) grad_norm 1.3960 (1.3681) [2022-01-21 15:52:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][260/1251] eta 0:37:36 lr 0.000622 time 2.7295 (2.2768) loss 3.3933 (3.6315) grad_norm 
1.3379 (1.3701) [2022-01-21 15:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][270/1251] eta 0:37:13 lr 0.000622 time 1.8593 (2.2770) loss 4.0851 (3.6295) grad_norm 1.5023 (1.3700) [2022-01-21 15:53:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][280/1251] eta 0:36:39 lr 0.000622 time 1.6754 (2.2657) loss 3.8085 (3.6252) grad_norm 1.2734 (1.3672) [2022-01-21 15:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][290/1251] eta 0:36:03 lr 0.000622 time 1.9199 (2.2517) loss 3.1570 (3.6237) grad_norm 1.4707 (1.3672) [2022-01-21 15:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][300/1251] eta 0:35:36 lr 0.000622 time 2.4417 (2.2465) loss 3.8482 (3.6151) grad_norm 1.3037 (1.3663) [2022-01-21 15:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][310/1251] eta 0:35:08 lr 0.000622 time 2.7037 (2.2404) loss 3.0351 (3.6112) grad_norm 1.4333 (1.3667) [2022-01-21 15:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][320/1251] eta 0:34:45 lr 0.000622 time 1.8234 (2.2402) loss 3.6838 (3.6115) grad_norm 1.3367 (1.3663) [2022-01-21 15:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][330/1251] eta 0:34:24 lr 0.000622 time 2.3126 (2.2412) loss 3.5070 (3.6190) grad_norm 1.3341 (1.3657) [2022-01-21 15:55:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][340/1251] eta 0:34:00 lr 0.000622 time 2.5728 (2.2396) loss 2.9877 (3.6152) grad_norm 1.4795 (1.3649) [2022-01-21 15:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][350/1251] eta 0:33:37 lr 0.000622 time 2.2624 (2.2388) loss 3.9366 (3.6145) grad_norm 1.5771 (1.3667) [2022-01-21 15:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][360/1251] eta 0:33:14 lr 0.000622 time 2.1807 (2.2387) loss 3.9321 (3.6168) grad_norm 1.3177 (1.3668) [2022-01-21 15:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[127/300][370/1251] eta 0:32:50 lr 0.000622 time 2.1591 (2.2365) loss 3.7603 (3.6192) grad_norm 1.2854 (1.3668) [2022-01-21 15:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][380/1251] eta 0:32:27 lr 0.000622 time 2.4290 (2.2362) loss 4.3714 (3.6205) grad_norm 1.5917 (1.3675) [2022-01-21 15:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][390/1251] eta 0:32:05 lr 0.000622 time 2.9480 (2.2366) loss 3.7266 (3.6265) grad_norm 1.6148 (1.3706) [2022-01-21 15:58:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][400/1251] eta 0:31:45 lr 0.000621 time 2.1453 (2.2390) loss 3.5067 (3.6247) grad_norm 1.3464 (1.3705) [2022-01-21 15:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][410/1251] eta 0:31:19 lr 0.000621 time 2.2277 (2.2353) loss 4.3128 (3.6220) grad_norm 1.3902 (1.3711) [2022-01-21 15:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][420/1251] eta 0:30:55 lr 0.000621 time 2.1773 (2.2334) loss 3.8195 (3.6214) grad_norm 1.4963 (1.3724) [2022-01-21 15:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][430/1251] eta 0:30:32 lr 0.000621 time 2.1795 (2.2324) loss 3.5008 (3.6192) grad_norm 1.3832 (1.3719) [2022-01-21 15:59:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][440/1251] eta 0:30:07 lr 0.000621 time 1.8518 (2.2289) loss 3.5351 (3.6154) grad_norm 1.2718 (1.3720) [2022-01-21 15:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][450/1251] eta 0:29:41 lr 0.000621 time 1.8162 (2.2246) loss 3.3439 (3.6186) grad_norm 1.1827 (1.3740) [2022-01-21 16:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][460/1251] eta 0:29:19 lr 0.000621 time 1.5747 (2.2250) loss 2.5656 (3.6184) grad_norm 1.4275 (1.3752) [2022-01-21 16:00:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][470/1251] eta 0:28:57 lr 0.000621 time 1.9149 (2.2247) loss 3.4543 (3.6195) grad_norm 
1.8318 (1.3788) [2022-01-21 16:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][480/1251] eta 0:28:37 lr 0.000621 time 2.1249 (2.2281) loss 4.0265 (3.6186) grad_norm 1.5513 (1.3799) [2022-01-21 16:01:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][490/1251] eta 0:28:16 lr 0.000621 time 2.0980 (2.2288) loss 3.6424 (3.6175) grad_norm 1.4839 (1.3793) [2022-01-21 16:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][500/1251] eta 0:27:53 lr 0.000621 time 2.2782 (2.2277) loss 3.4259 (3.6160) grad_norm 1.1767 (1.3789) [2022-01-21 16:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][510/1251] eta 0:27:29 lr 0.000621 time 2.5571 (2.2266) loss 3.7164 (3.6104) grad_norm 1.5393 (1.3795) [2022-01-21 16:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][520/1251] eta 0:27:06 lr 0.000621 time 1.7365 (2.2249) loss 3.9925 (3.6094) grad_norm 1.5592 (1.3807) [2022-01-21 16:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][530/1251] eta 0:26:44 lr 0.000621 time 2.5175 (2.2249) loss 4.2372 (3.6119) grad_norm 1.3519 (1.3799) [2022-01-21 16:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][540/1251] eta 0:26:22 lr 0.000621 time 2.5491 (2.2262) loss 2.6249 (3.6039) grad_norm 1.2096 (1.3829) [2022-01-21 16:03:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][550/1251] eta 0:26:01 lr 0.000621 time 2.3567 (2.2271) loss 4.2594 (3.6085) grad_norm 1.5462 (1.3844) [2022-01-21 16:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][560/1251] eta 0:25:37 lr 0.000621 time 1.7963 (2.2257) loss 3.9363 (3.6118) grad_norm 1.4069 (1.3848) [2022-01-21 16:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][570/1251] eta 0:25:11 lr 0.000621 time 1.9654 (2.2197) loss 2.4764 (3.6107) grad_norm 1.6943 (1.3870) [2022-01-21 16:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[127/300][580/1251] eta 0:24:48 lr 0.000621 time 1.8595 (2.2176) loss 3.1365 (3.6065) grad_norm 1.4595 (1.3866) [2022-01-21 16:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][590/1251] eta 0:24:23 lr 0.000621 time 2.1014 (2.2140) loss 4.1451 (3.6092) grad_norm 1.5302 (1.3860) [2022-01-21 16:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][600/1251] eta 0:24:00 lr 0.000621 time 2.2556 (2.2134) loss 3.8333 (3.6092) grad_norm 1.3677 (1.3861) [2022-01-21 16:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][610/1251] eta 0:23:38 lr 0.000621 time 2.7856 (2.2122) loss 4.0512 (3.6072) grad_norm 1.2850 (1.3868) [2022-01-21 16:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][620/1251] eta 0:23:14 lr 0.000621 time 1.7264 (2.2105) loss 4.2302 (3.6126) grad_norm 1.6343 (1.3875) [2022-01-21 16:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][630/1251] eta 0:22:52 lr 0.000621 time 1.8788 (2.2106) loss 3.1819 (3.6099) grad_norm 1.3711 (1.3872) [2022-01-21 16:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][640/1251] eta 0:22:31 lr 0.000620 time 3.0179 (2.2116) loss 3.9096 (3.6115) grad_norm 1.2348 (1.3870) [2022-01-21 16:07:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][650/1251] eta 0:22:09 lr 0.000620 time 2.0389 (2.2125) loss 3.9130 (3.6141) grad_norm 1.3673 (1.3862) [2022-01-21 16:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][660/1251] eta 0:21:48 lr 0.000620 time 1.9003 (2.2146) loss 3.4659 (3.6140) grad_norm 1.3744 (1.3870) [2022-01-21 16:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][670/1251] eta 0:21:28 lr 0.000620 time 3.1882 (2.2175) loss 2.7718 (3.6138) grad_norm 1.3928 (1.3865) [2022-01-21 16:08:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][680/1251] eta 0:21:05 lr 0.000620 time 1.5893 (2.2164) loss 3.8113 (3.6112) grad_norm 
1.5157 (1.3858) [2022-01-21 16:08:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][690/1251] eta 0:20:42 lr 0.000620 time 1.9562 (2.2148) loss 3.0260 (3.6113) grad_norm 1.3219 (1.3853) [2022-01-21 16:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][700/1251] eta 0:20:19 lr 0.000620 time 1.9197 (2.2125) loss 4.2106 (3.6156) grad_norm 1.5333 (1.3856) [2022-01-21 16:09:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][710/1251] eta 0:19:57 lr 0.000620 time 2.1066 (2.2127) loss 4.4617 (3.6133) grad_norm 1.3458 (1.3857) [2022-01-21 16:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][720/1251] eta 0:19:34 lr 0.000620 time 1.9781 (2.2119) loss 4.2193 (3.6113) grad_norm 1.2924 (1.3849) [2022-01-21 16:10:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][730/1251] eta 0:19:11 lr 0.000620 time 1.7057 (2.2109) loss 4.1034 (3.6129) grad_norm 1.3190 (1.3852) [2022-01-21 16:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][740/1251] eta 0:18:49 lr 0.000620 time 1.9312 (2.2095) loss 3.3885 (3.6163) grad_norm 1.2802 (1.3852) [2022-01-21 16:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][750/1251] eta 0:18:27 lr 0.000620 time 2.0940 (2.2109) loss 3.6227 (3.6180) grad_norm 1.2269 (1.3842) [2022-01-21 16:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][760/1251] eta 0:18:05 lr 0.000620 time 2.1122 (2.2108) loss 3.4882 (3.6155) grad_norm 1.3489 (1.3844) [2022-01-21 16:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][770/1251] eta 0:17:42 lr 0.000620 time 1.8839 (2.2099) loss 4.1550 (3.6184) grad_norm 1.5316 (1.3851) [2022-01-21 16:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][780/1251] eta 0:17:20 lr 0.000620 time 2.5357 (2.2099) loss 3.2336 (3.6111) grad_norm 1.2744 (1.3846) [2022-01-21 16:12:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[127/300][790/1251] eta 0:16:58 lr 0.000620 time 2.1745 (2.2088) loss 2.7898 (3.6096) grad_norm 1.4310 (1.3842) [2022-01-21 16:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][800/1251] eta 0:16:35 lr 0.000620 time 1.6138 (2.2083) loss 2.8725 (3.6090) grad_norm 1.3352 (1.3851) [2022-01-21 16:12:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][810/1251] eta 0:16:13 lr 0.000620 time 1.9494 (2.2080) loss 3.5566 (3.6097) grad_norm 1.2434 (1.3844) [2022-01-21 16:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][820/1251] eta 0:15:50 lr 0.000620 time 2.1503 (2.2061) loss 3.5706 (3.6100) grad_norm 1.3141 (1.3841) [2022-01-21 16:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][830/1251] eta 0:15:28 lr 0.000620 time 2.2511 (2.2045) loss 4.0221 (3.6091) grad_norm 1.4580 (1.3833) [2022-01-21 16:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][840/1251] eta 0:15:05 lr 0.000620 time 2.3983 (2.2040) loss 4.3701 (3.6092) grad_norm 1.4067 (1.3831) [2022-01-21 16:14:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][850/1251] eta 0:14:43 lr 0.000620 time 1.9659 (2.2044) loss 3.4241 (3.6101) grad_norm 1.4224 (1.3843) [2022-01-21 16:14:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][860/1251] eta 0:14:22 lr 0.000620 time 2.8803 (2.2071) loss 2.8997 (3.6088) grad_norm 1.5415 (1.3845) [2022-01-21 16:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][870/1251] eta 0:14:01 lr 0.000620 time 2.2454 (2.2092) loss 2.7692 (3.6067) grad_norm 1.1564 (1.3834) [2022-01-21 16:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][880/1251] eta 0:13:39 lr 0.000620 time 1.8548 (2.2086) loss 2.9457 (3.6066) grad_norm 1.3238 (1.3837) [2022-01-21 16:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][890/1251] eta 0:13:16 lr 0.000619 time 1.9111 (2.2055) loss 4.5005 (3.6037) grad_norm 
2.1146 (1.3846) [2022-01-21 16:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][900/1251] eta 0:12:53 lr 0.000619 time 1.8191 (2.2036) loss 3.6544 (3.6061) grad_norm 1.3484 (1.3848) [2022-01-21 16:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][910/1251] eta 0:12:30 lr 0.000619 time 2.1855 (2.2021) loss 3.4676 (3.6078) grad_norm 1.4538 (1.3851) [2022-01-21 16:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][920/1251] eta 0:12:08 lr 0.000619 time 2.0995 (2.2016) loss 4.1114 (3.6114) grad_norm 1.3827 (1.3847) [2022-01-21 16:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][930/1251] eta 0:11:46 lr 0.000619 time 2.1709 (2.2015) loss 4.0813 (3.6101) grad_norm 1.5107 (1.3836) [2022-01-21 16:17:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][940/1251] eta 0:11:25 lr 0.000619 time 2.7697 (2.2041) loss 4.2370 (3.6104) grad_norm 1.5746 (1.3849) [2022-01-21 16:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][950/1251] eta 0:11:04 lr 0.000619 time 3.4759 (2.2071) loss 4.2786 (3.6112) grad_norm 1.3179 (1.3841) [2022-01-21 16:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][960/1251] eta 0:10:42 lr 0.000619 time 1.6597 (2.2070) loss 4.2017 (3.6119) grad_norm 1.3582 (1.3840) [2022-01-21 16:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][970/1251] eta 0:10:19 lr 0.000619 time 1.6926 (2.2044) loss 3.6213 (3.6121) grad_norm 1.2836 (1.3839) [2022-01-21 16:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][980/1251] eta 0:09:56 lr 0.000619 time 2.1748 (2.2028) loss 3.7406 (3.6143) grad_norm 1.0847 (1.3833) [2022-01-21 16:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][990/1251] eta 0:09:34 lr 0.000619 time 2.2454 (2.2012) loss 2.8520 (3.6137) grad_norm 1.4868 (1.3826) [2022-01-21 16:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[127/300][1000/1251] eta 0:09:12 lr 0.000619 time 2.4912 (2.2005) loss 3.6833 (3.6137) grad_norm 1.1457 (1.3827) [2022-01-21 16:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1010/1251] eta 0:08:50 lr 0.000619 time 2.7011 (2.2023) loss 4.0036 (3.6139) grad_norm 1.5360 (1.3823) [2022-01-21 16:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1020/1251] eta 0:08:28 lr 0.000619 time 2.5565 (2.2031) loss 2.3870 (3.6147) grad_norm 1.2705 (1.3818) [2022-01-21 16:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1030/1251] eta 0:08:06 lr 0.000619 time 1.8529 (2.2021) loss 4.2431 (3.6131) grad_norm 1.3201 (1.3822) [2022-01-21 16:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1040/1251] eta 0:07:44 lr 0.000619 time 2.1959 (2.2008) loss 2.7631 (3.6111) grad_norm 1.8771 (1.3822) [2022-01-21 16:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1050/1251] eta 0:07:22 lr 0.000619 time 2.2690 (2.2012) loss 2.7388 (3.6121) grad_norm 1.5416 (1.3827) [2022-01-21 16:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1060/1251] eta 0:07:00 lr 0.000619 time 2.2307 (2.2012) loss 3.7309 (3.6131) grad_norm 1.4574 (1.3829) [2022-01-21 16:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1070/1251] eta 0:06:38 lr 0.000619 time 2.1679 (2.2002) loss 3.5840 (3.6122) grad_norm 1.2908 (1.3827) [2022-01-21 16:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1080/1251] eta 0:06:16 lr 0.000619 time 2.4431 (2.2002) loss 3.3850 (3.6106) grad_norm 1.5914 (1.3830) [2022-01-21 16:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1090/1251] eta 0:05:54 lr 0.000619 time 1.7909 (2.2017) loss 4.2504 (3.6107) grad_norm 1.3485 (1.3828) [2022-01-21 16:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1100/1251] eta 0:05:32 lr 0.000619 time 1.9348 (2.2021) loss 3.9140 (3.6110) 
grad_norm 1.2779 (1.3832) [2022-01-21 16:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1110/1251] eta 0:05:10 lr 0.000619 time 2.1942 (2.2019) loss 3.3018 (3.6103) grad_norm 1.3142 (1.3827) [2022-01-21 16:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1120/1251] eta 0:04:48 lr 0.000619 time 1.8930 (2.1996) loss 4.0294 (3.6088) grad_norm 1.3047 (1.3819) [2022-01-21 16:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1130/1251] eta 0:04:26 lr 0.000619 time 2.7095 (2.1987) loss 3.4502 (3.6081) grad_norm 1.5272 (1.3815) [2022-01-21 16:24:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1140/1251] eta 0:04:03 lr 0.000618 time 1.9457 (2.1978) loss 3.8719 (3.6079) grad_norm 1.3318 (1.3811) [2022-01-21 16:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1150/1251] eta 0:03:42 lr 0.000618 time 2.2928 (2.1999) loss 3.9374 (3.6063) grad_norm 1.2923 (1.3800) [2022-01-21 16:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1160/1251] eta 0:03:20 lr 0.000618 time 1.8790 (2.2003) loss 4.1072 (3.6078) grad_norm 1.2683 (1.3795) [2022-01-21 16:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1170/1251] eta 0:02:58 lr 0.000618 time 2.4944 (2.2027) loss 3.5619 (3.6077) grad_norm 1.3455 (1.3788) [2022-01-21 16:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1180/1251] eta 0:02:36 lr 0.000618 time 1.9827 (2.2021) loss 4.1811 (3.6088) grad_norm 1.5743 (1.3797) [2022-01-21 16:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1190/1251] eta 0:02:14 lr 0.000618 time 1.9122 (2.2004) loss 2.8141 (3.6077) grad_norm 1.1316 (1.3805) [2022-01-21 16:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1200/1251] eta 0:01:52 lr 0.000618 time 1.8422 (2.1996) loss 3.0187 (3.6091) grad_norm 1.4318 (1.3801) [2022-01-21 16:27:27 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [127/300][1210/1251] eta 0:01:30 lr 0.000618 time 2.1488 (2.1993) loss 3.8784 (3.6093) grad_norm 1.3200 (1.3793) [2022-01-21 16:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1220/1251] eta 0:01:08 lr 0.000618 time 2.2570 (2.1986) loss 3.8058 (3.6102) grad_norm 1.3992 (1.3789) [2022-01-21 16:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1230/1251] eta 0:00:46 lr 0.000618 time 2.5323 (2.1982) loss 2.4253 (3.6105) grad_norm 1.1792 (1.3790) [2022-01-21 16:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1240/1251] eta 0:00:24 lr 0.000618 time 2.0216 (2.1972) loss 3.9548 (3.6119) grad_norm 1.3418 (1.3785) [2022-01-21 16:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [127/300][1250/1251] eta 0:00:02 lr 0.000618 time 1.1984 (2.1919) loss 4.0786 (3.6101) grad_norm 1.3561 (1.3784) [2022-01-21 16:28:46 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 127 training takes 0:45:42 [2022-01-21 16:29:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.899 (17.899) Loss 1.0080 (1.0080) Acc@1 77.051 (77.051) Acc@5 93.555 (93.555) [2022-01-21 16:29:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.619 (3.372) Loss 1.0769 (1.0850) Acc@1 75.195 (74.334) Acc@5 93.555 (92.658) [2022-01-21 16:29:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.306 (2.432) Loss 1.0470 (1.0773) Acc@1 76.465 (74.619) Acc@5 93.555 (92.773) [2022-01-21 16:29:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.920 (2.305) Loss 1.1202 (1.0767) Acc@1 73.438 (74.820) Acc@5 93.066 (92.814) [2022-01-21 16:30:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.537 (2.170) Loss 1.0615 (1.0765) Acc@1 74.805 (74.786) Acc@5 92.969 (92.778) [2022-01-21 16:30:23 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.874 Acc@5 92.806 [2022-01-21 16:30:23 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 74.9% [2022-01-21 16:30:23 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.07% [2022-01-21 16:30:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][0/1251] eta 8:15:50 lr 0.000618 time 23.7811 (23.7811) loss 3.8330 (3.8330) grad_norm 1.2176 (1.2176) [2022-01-21 16:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][10/1251] eta 1:26:02 lr 0.000618 time 2.4704 (4.1600) loss 3.3607 (3.5037) grad_norm 1.4205 (1.3561) [2022-01-21 16:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][20/1251] eta 1:06:09 lr 0.000618 time 2.5011 (3.2244) loss 3.9524 (3.6475) grad_norm 1.2240 (1.3753) [2022-01-21 16:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][30/1251] eta 0:58:38 lr 0.000618 time 1.9931 (2.8814) loss 2.7981 (3.4895) grad_norm 1.3787 (1.3625) [2022-01-21 16:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][40/1251] eta 0:55:24 lr 0.000618 time 4.5883 (2.7451) loss 3.3833 (3.5280) grad_norm 1.4040 (1.3792) [2022-01-21 16:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][50/1251] eta 0:53:29 lr 0.000618 time 2.4800 (2.6721) loss 3.8244 (3.5674) grad_norm 1.4696 (1.3814) [2022-01-21 16:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][60/1251] eta 0:51:30 lr 0.000618 time 2.0852 (2.5952) loss 3.6847 (3.5925) grad_norm 1.4459 (1.3885) [2022-01-21 16:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][70/1251] eta 0:49:55 lr 0.000618 time 1.8292 (2.5365) loss 3.2533 (3.6167) grad_norm 1.3882 (1.3954) [2022-01-21 16:33:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][80/1251] eta 0:48:53 lr 0.000618 time 3.1976 (2.5047) loss 2.7426 (3.6260) grad_norm 1.4793 (1.3910) [2022-01-21 16:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][90/1251] eta 0:48:09 lr 0.000618 time 
2.0302 (2.4891) loss 3.3337 (3.6472) grad_norm 1.4241 (1.3885) [2022-01-21 16:34:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][100/1251] eta 0:46:52 lr 0.000618 time 1.6981 (2.4432) loss 4.4316 (3.6336) grad_norm 1.5510 (1.3868) [2022-01-21 16:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][110/1251] eta 0:45:49 lr 0.000618 time 1.9465 (2.4095) loss 2.8420 (3.6357) grad_norm 1.2034 (1.3801) [2022-01-21 16:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][120/1251] eta 0:44:53 lr 0.000618 time 2.2361 (2.3812) loss 4.3617 (3.6655) grad_norm 1.5693 (1.3782) [2022-01-21 16:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][130/1251] eta 0:44:34 lr 0.000618 time 2.3873 (2.3860) loss 2.7463 (3.6292) grad_norm 1.4970 (1.3783) [2022-01-21 16:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][140/1251] eta 0:44:00 lr 0.000617 time 2.4180 (2.3766) loss 3.0098 (3.6272) grad_norm 1.3560 (1.3802) [2022-01-21 16:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][150/1251] eta 0:43:20 lr 0.000617 time 2.0453 (2.3619) loss 3.9960 (3.6280) grad_norm 1.5601 (1.3787) [2022-01-21 16:36:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][160/1251] eta 0:42:41 lr 0.000617 time 1.6212 (2.3480) loss 3.5379 (3.6307) grad_norm 1.1898 (1.3789) [2022-01-21 16:37:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][170/1251] eta 0:42:00 lr 0.000617 time 2.2190 (2.3318) loss 3.9441 (3.6327) grad_norm 1.6973 (1.3787) [2022-01-21 16:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][180/1251] eta 0:41:30 lr 0.000617 time 2.8344 (2.3255) loss 3.1910 (3.6298) grad_norm 1.7060 (1.3754) [2022-01-21 16:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][190/1251] eta 0:40:53 lr 0.000617 time 1.9188 (2.3127) loss 3.9181 (3.6382) grad_norm 1.2906 (1.3776) [2022-01-21 16:38:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][200/1251] eta 0:40:24 lr 0.000617 time 2.9101 (2.3072) loss 4.5412 (3.6428) grad_norm 1.5146 (1.3793) [2022-01-21 16:38:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][210/1251] eta 0:39:58 lr 0.000617 time 2.2129 (2.3042) loss 4.3692 (3.6410) grad_norm 1.2972 (1.3808) [2022-01-21 16:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][220/1251] eta 0:39:30 lr 0.000617 time 2.5291 (2.2994) loss 3.3017 (3.6412) grad_norm 1.2490 (1.3849) [2022-01-21 16:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][230/1251] eta 0:39:08 lr 0.000617 time 2.0750 (2.3000) loss 3.8120 (3.6407) grad_norm 1.6889 (1.3862) [2022-01-21 16:39:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][240/1251] eta 0:38:36 lr 0.000617 time 2.0724 (2.2909) loss 3.6438 (3.6417) grad_norm 1.3299 (1.3870) [2022-01-21 16:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][250/1251] eta 0:38:02 lr 0.000617 time 1.8465 (2.2802) loss 4.4792 (3.6352) grad_norm 1.2505 (1.3900) [2022-01-21 16:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][260/1251] eta 0:37:32 lr 0.000617 time 2.4392 (2.2727) loss 3.2041 (3.6284) grad_norm 1.4770 (1.3973) [2022-01-21 16:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][270/1251] eta 0:37:02 lr 0.000617 time 2.2507 (2.2650) loss 3.4747 (3.6296) grad_norm 1.4398 (1.3986) [2022-01-21 16:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][280/1251] eta 0:36:32 lr 0.000617 time 2.9998 (2.2584) loss 3.7305 (3.6205) grad_norm 1.2327 (1.4008) [2022-01-21 16:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][290/1251] eta 0:36:05 lr 0.000617 time 2.5253 (2.2530) loss 3.6165 (3.6213) grad_norm 1.7491 (1.3994) [2022-01-21 16:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][300/1251] eta 0:35:43 lr 
0.000617 time 2.5197 (2.2537) loss 3.5578 (3.6152) grad_norm 1.4523 (1.4004) [2022-01-21 16:42:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][310/1251] eta 0:35:21 lr 0.000617 time 2.1859 (2.2541) loss 4.1092 (3.6106) grad_norm 1.2971 (1.4013) [2022-01-21 16:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][320/1251] eta 0:34:57 lr 0.000617 time 2.6872 (2.2532) loss 3.5306 (3.6084) grad_norm 1.1927 (1.3982) [2022-01-21 16:42:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][330/1251] eta 0:34:33 lr 0.000617 time 2.1605 (2.2511) loss 4.0347 (3.6107) grad_norm 1.2718 (1.3970) [2022-01-21 16:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][340/1251] eta 0:34:10 lr 0.000617 time 2.8193 (2.2509) loss 3.7270 (3.6109) grad_norm 1.2765 (1.3957) [2022-01-21 16:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][350/1251] eta 0:33:48 lr 0.000617 time 2.1184 (2.2510) loss 3.7522 (3.6044) grad_norm 1.5490 (1.3953) [2022-01-21 16:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][360/1251] eta 0:33:27 lr 0.000617 time 1.9038 (2.2529) loss 3.8103 (3.6102) grad_norm 1.2246 (1.3946) [2022-01-21 16:44:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][370/1251] eta 0:33:02 lr 0.000617 time 2.3651 (2.2499) loss 3.4467 (3.6132) grad_norm 1.2780 (1.3956) [2022-01-21 16:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][380/1251] eta 0:32:38 lr 0.000617 time 2.7996 (2.2484) loss 4.0649 (3.6136) grad_norm 1.5684 (1.3969) [2022-01-21 16:45:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][390/1251] eta 0:32:12 lr 0.000616 time 2.2474 (2.2443) loss 3.6436 (3.6114) grad_norm 1.4987 (1.3991) [2022-01-21 16:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][400/1251] eta 0:31:47 lr 0.000616 time 2.2706 (2.2417) loss 3.8773 (3.6107) grad_norm 1.4842 (1.3986) [2022-01-21 16:45:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][410/1251] eta 0:31:24 lr 0.000616 time 1.9667 (2.2408) loss 4.2975 (3.6071) grad_norm 1.4664 (1.3974) [2022-01-21 16:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][420/1251] eta 0:31:01 lr 0.000616 time 3.3774 (2.2406) loss 2.2436 (3.6085) grad_norm 1.1763 (1.3970) [2022-01-21 16:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][430/1251] eta 0:30:38 lr 0.000616 time 2.4957 (2.2394) loss 2.7421 (3.6065) grad_norm 1.3640 (1.3949) [2022-01-21 16:46:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][440/1251] eta 0:30:14 lr 0.000616 time 2.0924 (2.2375) loss 4.3744 (3.6068) grad_norm 1.5582 (1.3956) [2022-01-21 16:47:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][450/1251] eta 0:29:54 lr 0.000616 time 1.7842 (2.2400) loss 2.6826 (3.6012) grad_norm 1.6105 (1.3951) [2022-01-21 16:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][460/1251] eta 0:29:32 lr 0.000616 time 3.1006 (2.2405) loss 3.3188 (3.6046) grad_norm 1.4547 (1.3935) [2022-01-21 16:47:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][470/1251] eta 0:29:06 lr 0.000616 time 2.0913 (2.2358) loss 3.5210 (3.6075) grad_norm 1.3394 (1.3919) [2022-01-21 16:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][480/1251] eta 0:28:40 lr 0.000616 time 1.9210 (2.2310) loss 3.6280 (3.6084) grad_norm 1.3208 (1.3891) [2022-01-21 16:48:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][490/1251] eta 0:28:19 lr 0.000616 time 2.4143 (2.2336) loss 3.9339 (3.6042) grad_norm 1.4304 (1.3879) [2022-01-21 16:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][500/1251] eta 0:27:58 lr 0.000616 time 1.9473 (2.2345) loss 3.9771 (3.6075) grad_norm 1.3029 (1.3859) [2022-01-21 16:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][510/1251] eta 0:27:33 lr 
0.000616 time 1.5829 (2.2314) loss 4.4610 (3.6020) grad_norm 1.3983 (1.3859) [2022-01-21 16:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][520/1251] eta 0:27:10 lr 0.000616 time 1.9966 (2.2305) loss 2.9099 (3.6014) grad_norm 1.4490 (1.3865) [2022-01-21 16:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][530/1251] eta 0:26:49 lr 0.000616 time 3.1249 (2.2327) loss 3.9528 (3.6013) grad_norm 1.4383 (1.3879) [2022-01-21 16:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][540/1251] eta 0:26:25 lr 0.000616 time 2.1075 (2.2304) loss 4.5019 (3.6062) grad_norm 1.2715 (1.3877) [2022-01-21 16:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][550/1251] eta 0:26:01 lr 0.000616 time 2.1972 (2.2282) loss 3.4903 (3.6093) grad_norm 1.4287 (1.3865) [2022-01-21 16:51:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][560/1251] eta 0:25:36 lr 0.000616 time 1.8497 (2.2237) loss 4.0756 (3.6065) grad_norm 1.4941 (1.3884) [2022-01-21 16:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][570/1251] eta 0:25:13 lr 0.000616 time 2.5343 (2.2229) loss 3.8116 (3.6073) grad_norm 1.3572 (1.3880) [2022-01-21 16:51:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][580/1251] eta 0:24:50 lr 0.000616 time 2.3004 (2.2215) loss 4.2867 (3.6050) grad_norm 1.3797 (1.3873) [2022-01-21 16:52:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][590/1251] eta 0:24:27 lr 0.000616 time 2.2156 (2.2197) loss 3.8658 (3.6052) grad_norm 1.5508 (1.3871) [2022-01-21 16:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][600/1251] eta 0:24:06 lr 0.000616 time 2.5054 (2.2214) loss 3.3107 (3.6054) grad_norm 1.3778 (1.3875) [2022-01-21 16:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][610/1251] eta 0:23:46 lr 0.000616 time 2.9264 (2.2254) loss 4.2564 (3.6075) grad_norm 1.3628 (1.3887) [2022-01-21 16:53:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][620/1251] eta 0:23:25 lr 0.000616 time 2.4771 (2.2279) loss 2.8775 (3.6078) grad_norm 1.2862 (1.3880) [2022-01-21 16:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][630/1251] eta 0:23:00 lr 0.000615 time 1.7986 (2.2237) loss 3.5283 (3.6053) grad_norm 1.1876 (1.3862) [2022-01-21 16:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][640/1251] eta 0:22:37 lr 0.000615 time 2.1288 (2.2215) loss 3.7276 (3.6027) grad_norm 1.3466 (1.3852) [2022-01-21 16:54:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][650/1251] eta 0:22:12 lr 0.000615 time 1.9038 (2.2176) loss 4.1182 (3.6020) grad_norm 1.5511 (1.3856) [2022-01-21 16:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][660/1251] eta 0:21:49 lr 0.000615 time 2.3167 (2.2150) loss 4.1189 (3.6023) grad_norm 1.4333 (1.3856) [2022-01-21 16:55:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][670/1251] eta 0:21:26 lr 0.000615 time 2.0116 (2.2137) loss 3.8204 (3.6002) grad_norm 1.4483 (1.3849) [2022-01-21 16:55:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][680/1251] eta 0:21:04 lr 0.000615 time 2.1357 (2.2137) loss 3.3377 (3.5983) grad_norm 1.4761 (1.3860) [2022-01-21 16:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][690/1251] eta 0:20:42 lr 0.000615 time 2.8091 (2.2141) loss 3.8940 (3.6009) grad_norm 1.1632 (1.3856) [2022-01-21 16:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][700/1251] eta 0:20:21 lr 0.000615 time 2.9082 (2.2174) loss 3.7473 (3.6039) grad_norm 1.6005 (1.3860) [2022-01-21 16:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][710/1251] eta 0:20:00 lr 0.000615 time 1.6136 (2.2183) loss 4.3882 (3.6026) grad_norm 1.2865 (1.3856) [2022-01-21 16:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][720/1251] eta 0:19:38 lr 
0.000615 time 1.9422 (2.2195) loss 4.3864 (3.6082) grad_norm 1.3982 (1.3855) [2022-01-21 16:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][730/1251] eta 0:19:16 lr 0.000615 time 1.6373 (2.2193) loss 4.3631 (3.6082) grad_norm 1.3830 (1.3852) [2022-01-21 16:57:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][740/1251] eta 0:18:52 lr 0.000615 time 2.3418 (2.2172) loss 3.5087 (3.6043) grad_norm 1.2759 (1.3853) [2022-01-21 16:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][750/1251] eta 0:18:29 lr 0.000615 time 1.8805 (2.2151) loss 3.6048 (3.6064) grad_norm 1.2695 (1.3843) [2022-01-21 16:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][760/1251] eta 0:18:07 lr 0.000615 time 1.8171 (2.2153) loss 3.6252 (3.6047) grad_norm 1.3132 (1.3835) [2022-01-21 16:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][770/1251] eta 0:17:44 lr 0.000615 time 2.5175 (2.2141) loss 3.8859 (3.6042) grad_norm 1.5376 (1.3842) [2022-01-21 16:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][780/1251] eta 0:17:22 lr 0.000615 time 2.4437 (2.2132) loss 2.7044 (3.5983) grad_norm 1.2453 (1.3855) [2022-01-21 16:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][790/1251] eta 0:16:59 lr 0.000615 time 1.6374 (2.2124) loss 4.1877 (3.6016) grad_norm 1.1798 (1.3851) [2022-01-21 16:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][800/1251] eta 0:16:37 lr 0.000615 time 1.8451 (2.2107) loss 3.1885 (3.6040) grad_norm 1.2206 (1.3858) [2022-01-21 17:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][810/1251] eta 0:16:15 lr 0.000615 time 2.4997 (2.2113) loss 3.6634 (3.6050) grad_norm 1.4234 (1.3858) [2022-01-21 17:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][820/1251] eta 0:15:53 lr 0.000615 time 2.4044 (2.2120) loss 4.2216 (3.6041) grad_norm 1.4903 (1.3857) [2022-01-21 17:01:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][830/1251] eta 0:15:32 lr 0.000615 time 2.2062 (2.2138) loss 4.0384 (3.6067) grad_norm 1.7592 (1.3854) [2022-01-21 17:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][840/1251] eta 0:15:09 lr 0.000615 time 1.7527 (2.2134) loss 4.2691 (3.6054) grad_norm 1.3039 (1.3856) [2022-01-21 17:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][850/1251] eta 0:14:47 lr 0.000615 time 2.1319 (2.2128) loss 3.2596 (3.6057) grad_norm 1.3790 (1.3850) [2022-01-21 17:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][860/1251] eta 0:14:24 lr 0.000615 time 1.7830 (2.2111) loss 3.9085 (3.6088) grad_norm 1.2802 (1.3844) [2022-01-21 17:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][870/1251] eta 0:14:01 lr 0.000615 time 1.6664 (2.2082) loss 2.5775 (3.6074) grad_norm 1.1902 (1.3828) [2022-01-21 17:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][880/1251] eta 0:13:38 lr 0.000614 time 2.3356 (2.2071) loss 3.7158 (3.6089) grad_norm 1.3464 (1.3829) [2022-01-21 17:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][890/1251] eta 0:13:16 lr 0.000614 time 1.9765 (2.2068) loss 2.5260 (3.6063) grad_norm 1.7987 (1.3834) [2022-01-21 17:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][900/1251] eta 0:12:55 lr 0.000614 time 1.6142 (2.2084) loss 3.7208 (3.6043) grad_norm 1.4243 (1.3835) [2022-01-21 17:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][910/1251] eta 0:12:33 lr 0.000614 time 1.8301 (2.2094) loss 2.4417 (3.6068) grad_norm 1.4533 (1.3832) [2022-01-21 17:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][920/1251] eta 0:12:11 lr 0.000614 time 1.8431 (2.2093) loss 2.8206 (3.6033) grad_norm 1.4549 (1.3830) [2022-01-21 17:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][930/1251] eta 0:11:49 lr 
0.000614 time 2.0156 (2.2090) loss 3.8980 (3.6058) grad_norm 1.5743 (1.3835) [2022-01-21 17:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][940/1251] eta 0:11:26 lr 0.000614 time 1.8442 (2.2067) loss 3.6135 (3.6070) grad_norm 1.4897 (1.3837) [2022-01-21 17:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][950/1251] eta 0:11:03 lr 0.000614 time 2.5278 (2.2059) loss 3.8812 (3.6052) grad_norm 1.3415 (1.3841) [2022-01-21 17:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][960/1251] eta 0:10:41 lr 0.000614 time 2.1877 (2.2062) loss 3.1156 (3.6035) grad_norm 1.3956 (1.3837) [2022-01-21 17:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][970/1251] eta 0:10:20 lr 0.000614 time 2.4011 (2.2069) loss 4.2227 (3.6051) grad_norm 1.3739 (1.3830) [2022-01-21 17:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][980/1251] eta 0:09:57 lr 0.000614 time 1.7559 (2.2064) loss 3.5836 (3.6069) grad_norm 1.4231 (1.3829) [2022-01-21 17:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][990/1251] eta 0:09:35 lr 0.000614 time 2.1432 (2.2066) loss 2.9072 (3.6063) grad_norm 1.6781 (1.3836) [2022-01-21 17:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1000/1251] eta 0:09:13 lr 0.000614 time 2.1997 (2.2053) loss 4.0529 (3.6064) grad_norm 1.5726 (1.3847) [2022-01-21 17:07:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1010/1251] eta 0:08:51 lr 0.000614 time 1.8650 (2.2035) loss 3.5577 (3.6077) grad_norm 1.3547 (1.3857) [2022-01-21 17:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1020/1251] eta 0:08:28 lr 0.000614 time 2.1808 (2.2017) loss 3.1571 (3.6083) grad_norm 1.2233 (1.3864) [2022-01-21 17:08:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1030/1251] eta 0:08:06 lr 0.000614 time 1.9432 (2.2000) loss 3.9541 (3.6077) grad_norm 1.3352 (1.3858) [2022-01-21 
17:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1040/1251] eta 0:07:44 lr 0.000614 time 2.4892 (2.1992) loss 3.4952 (3.6085) grad_norm 1.4021 (1.3860) [2022-01-21 17:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1050/1251] eta 0:07:21 lr 0.000614 time 1.5457 (2.1987) loss 3.3557 (3.6083) grad_norm 1.4306 (1.3863) [2022-01-21 17:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1060/1251] eta 0:06:59 lr 0.000614 time 1.8877 (2.1969) loss 4.0886 (3.6072) grad_norm 1.1857 (1.3855) [2022-01-21 17:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1070/1251] eta 0:06:37 lr 0.000614 time 1.8307 (2.1983) loss 3.0011 (3.6089) grad_norm 1.3694 (1.3851) [2022-01-21 17:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1080/1251] eta 0:06:16 lr 0.000614 time 2.4780 (2.2005) loss 2.5303 (3.6054) grad_norm 1.4602 (1.3863) [2022-01-21 17:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1090/1251] eta 0:05:54 lr 0.000614 time 1.6892 (2.2012) loss 4.0022 (3.6057) grad_norm 1.2530 (1.3862) [2022-01-21 17:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1100/1251] eta 0:05:32 lr 0.000614 time 2.5043 (2.2017) loss 3.7128 (3.6079) grad_norm 1.2096 (1.3855) [2022-01-21 17:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1110/1251] eta 0:05:10 lr 0.000614 time 1.4562 (2.2021) loss 4.0140 (3.6086) grad_norm 1.3536 (1.3852) [2022-01-21 17:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1120/1251] eta 0:04:48 lr 0.000614 time 2.7392 (2.2025) loss 3.1526 (3.6062) grad_norm 1.2957 (1.3851) [2022-01-21 17:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1130/1251] eta 0:04:26 lr 0.000613 time 1.5593 (2.2019) loss 4.1802 (3.6065) grad_norm 1.5276 (1.3849) [2022-01-21 17:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1140/1251] 
eta 0:04:04 lr 0.000613 time 1.8514 (2.1999) loss 4.0005 (3.6044) grad_norm 1.3664 (1.3842) [2022-01-21 17:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1150/1251] eta 0:03:42 lr 0.000613 time 1.5625 (2.1986) loss 3.9853 (3.6060) grad_norm 1.2882 (1.3837) [2022-01-21 17:12:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1160/1251] eta 0:03:20 lr 0.000613 time 3.1182 (2.1984) loss 3.8976 (3.6093) grad_norm 1.1951 (1.3835) [2022-01-21 17:13:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1170/1251] eta 0:02:58 lr 0.000613 time 1.7028 (2.1983) loss 2.4566 (3.6091) grad_norm 1.4430 (1.3831) [2022-01-21 17:13:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1180/1251] eta 0:02:36 lr 0.000613 time 2.7708 (2.1975) loss 3.8207 (3.6069) grad_norm 1.1394 (1.3827) [2022-01-21 17:13:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1190/1251] eta 0:02:13 lr 0.000613 time 1.8266 (2.1964) loss 2.9163 (3.6058) grad_norm 1.3070 (1.3823) [2022-01-21 17:14:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1200/1251] eta 0:01:51 lr 0.000613 time 2.8332 (2.1960) loss 3.4189 (3.6060) grad_norm 1.2162 (1.3821) [2022-01-21 17:14:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1210/1251] eta 0:01:30 lr 0.000613 time 2.1116 (2.1973) loss 3.7180 (3.6059) grad_norm 1.2190 (1.3821) [2022-01-21 17:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1220/1251] eta 0:01:08 lr 0.000613 time 1.8582 (2.1972) loss 4.2212 (3.6068) grad_norm 1.2419 (1.3821) [2022-01-21 17:15:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1230/1251] eta 0:00:46 lr 0.000613 time 1.7727 (2.1980) loss 3.8222 (3.6061) grad_norm 1.3434 (1.3822) [2022-01-21 17:15:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1240/1251] eta 0:00:24 lr 0.000613 time 2.2976 (2.1982) loss 3.5632 (3.6040) grad_norm 1.4011 
(1.3822) [2022-01-21 17:16:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [128/300][1250/1251] eta 0:00:02 lr 0.000613 time 1.1414 (2.1927) loss 2.6875 (3.6034) grad_norm 1.2107 (1.3820) [2022-01-21 17:16:06 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 128 training takes 0:45:43 [2022-01-21 17:16:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.891 (18.891) Loss 1.0375 (1.0375) Acc@1 75.977 (75.977) Acc@5 93.066 (93.066) [2022-01-21 17:16:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.962 (3.403) Loss 1.0859 (1.0850) Acc@1 75.098 (74.876) Acc@5 92.285 (92.560) [2022-01-21 17:17:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.367 (2.671) Loss 1.1211 (1.0782) Acc@1 72.461 (74.688) Acc@5 93.457 (92.690) [2022-01-21 17:17:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.594 (2.369) Loss 1.0732 (1.0713) Acc@1 74.023 (74.849) Acc@5 92.285 (92.792) [2022-01-21 17:17:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.293 (2.248) Loss 1.0977 (1.0735) Acc@1 74.414 (74.871) Acc@5 92.676 (92.785) [2022-01-21 17:17:45 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.812 Acc@5 92.760 [2022-01-21 17:17:45 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 74.8% [2022-01-21 17:17:45 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.07% [2022-01-21 17:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][0/1251] eta 7:23:22 lr 0.000613 time 21.2649 (21.2649) loss 3.9198 (3.9198) grad_norm 1.3924 (1.3924) [2022-01-21 17:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][10/1251] eta 1:18:59 lr 0.000613 time 1.7257 (3.8193) loss 3.7813 (3.6290) grad_norm 1.4327 (1.3628) [2022-01-21 17:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][20/1251] eta 1:01:50 lr 0.000613 time 1.4767 (3.0142) loss 
3.8493 (3.7517) grad_norm 1.2039 (1.3755) [2022-01-21 17:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][30/1251] eta 0:55:43 lr 0.000613 time 1.3988 (2.7387) loss 3.8511 (3.7416) grad_norm 1.2150 (1.3617) [2022-01-21 17:19:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][40/1251] eta 0:53:23 lr 0.000613 time 3.7117 (2.6450) loss 3.7077 (3.6798) grad_norm 1.1608 (1.3753) [2022-01-21 17:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][50/1251] eta 0:51:52 lr 0.000613 time 3.0976 (2.5916) loss 3.3183 (3.6283) grad_norm 1.6311 (1.3715) [2022-01-21 17:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][60/1251] eta 0:50:15 lr 0.000613 time 1.5236 (2.5318) loss 4.2033 (3.5689) grad_norm 1.4088 (1.3671) [2022-01-21 17:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][70/1251] eta 0:49:05 lr 0.000613 time 1.5565 (2.4939) loss 2.4762 (3.5582) grad_norm 1.4956 (1.3839) [2022-01-21 17:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][80/1251] eta 0:48:24 lr 0.000613 time 3.5249 (2.4807) loss 3.2397 (3.5442) grad_norm 1.5458 (1.3840) [2022-01-21 17:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][90/1251] eta 0:47:31 lr 0.000613 time 1.4269 (2.4562) loss 2.9655 (3.5484) grad_norm 1.5947 (1.3815) [2022-01-21 17:21:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][100/1251] eta 0:46:19 lr 0.000613 time 1.8305 (2.4144) loss 3.8585 (3.5450) grad_norm 1.8095 (1.3845) [2022-01-21 17:22:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][110/1251] eta 0:45:25 lr 0.000613 time 1.8615 (2.3891) loss 3.9762 (3.5568) grad_norm 1.3139 (1.3799) [2022-01-21 17:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][120/1251] eta 0:44:45 lr 0.000612 time 2.9931 (2.3748) loss 4.2572 (3.5645) grad_norm 1.3487 (1.3772) [2022-01-21 17:22:55 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [129/300][130/1251] eta 0:44:14 lr 0.000612 time 2.4206 (2.3683) loss 3.9521 (3.5628) grad_norm 1.5227 (1.3697) [2022-01-21 17:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][140/1251] eta 0:43:32 lr 0.000612 time 2.1138 (2.3516) loss 4.4154 (3.5723) grad_norm 1.2054 (1.3614) [2022-01-21 17:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][150/1251] eta 0:42:58 lr 0.000612 time 1.9406 (2.3417) loss 4.0828 (3.5763) grad_norm 1.2687 (1.3617) [2022-01-21 17:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][160/1251] eta 0:42:27 lr 0.000612 time 2.6752 (2.3346) loss 2.6430 (3.5705) grad_norm 1.4560 (1.3641) [2022-01-21 17:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][170/1251] eta 0:41:43 lr 0.000612 time 1.8972 (2.3162) loss 3.7613 (3.5713) grad_norm 1.3370 (1.3654) [2022-01-21 17:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][180/1251] eta 0:41:12 lr 0.000612 time 2.3051 (2.3087) loss 3.8511 (3.5795) grad_norm 1.5618 (1.3661) [2022-01-21 17:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][190/1251] eta 0:40:42 lr 0.000612 time 2.1490 (2.3019) loss 3.9928 (3.5886) grad_norm 1.3941 (1.3666) [2022-01-21 17:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][200/1251] eta 0:40:07 lr 0.000612 time 2.2213 (2.2908) loss 3.9686 (3.5849) grad_norm 1.4392 (1.3658) [2022-01-21 17:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][210/1251] eta 0:39:44 lr 0.000612 time 2.1143 (2.2909) loss 3.8083 (3.5925) grad_norm 1.3591 (1.3680) [2022-01-21 17:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][220/1251] eta 0:39:24 lr 0.000612 time 1.6331 (2.2933) loss 3.8527 (3.6065) grad_norm 1.4574 (1.3677) [2022-01-21 17:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][230/1251] eta 0:38:59 lr 0.000612 time 1.9124 (2.2914) loss 3.5964 
(3.6075) grad_norm 1.3889 (1.3722) [2022-01-21 17:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][240/1251] eta 0:38:37 lr 0.000612 time 2.1503 (2.2920) loss 3.9218 (3.6074) grad_norm 1.3859 (1.3752) [2022-01-21 17:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][250/1251] eta 0:38:03 lr 0.000612 time 1.9409 (2.2808) loss 3.5220 (3.6009) grad_norm 1.5811 (1.3749) [2022-01-21 17:27:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][260/1251] eta 0:37:34 lr 0.000612 time 1.5902 (2.2748) loss 3.9972 (3.6041) grad_norm 1.2539 (1.3750) [2022-01-21 17:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][270/1251] eta 0:37:02 lr 0.000612 time 1.9379 (2.2652) loss 3.2509 (3.6093) grad_norm 1.2414 (1.3758) [2022-01-21 17:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][280/1251] eta 0:36:37 lr 0.000612 time 1.9243 (2.2628) loss 3.4118 (3.6085) grad_norm 1.3164 (1.3746) [2022-01-21 17:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][290/1251] eta 0:36:15 lr 0.000612 time 2.8049 (2.2633) loss 3.0522 (3.6021) grad_norm 1.4298 (1.3777) [2022-01-21 17:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][300/1251] eta 0:35:55 lr 0.000612 time 2.3067 (2.2663) loss 3.5184 (3.6055) grad_norm 1.4879 (1.3773) [2022-01-21 17:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][310/1251] eta 0:35:30 lr 0.000612 time 1.9904 (2.2642) loss 3.7615 (3.6009) grad_norm 1.3952 (1.3784) [2022-01-21 17:29:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][320/1251] eta 0:35:05 lr 0.000612 time 2.3568 (2.2611) loss 3.3141 (3.6064) grad_norm 1.6119 (1.3783) [2022-01-21 17:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][330/1251] eta 0:34:32 lr 0.000612 time 2.1704 (2.2505) loss 3.8497 (3.6149) grad_norm 1.2967 (1.3784) [2022-01-21 17:30:30 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [129/300][340/1251] eta 0:34:02 lr 0.000612 time 1.9413 (2.2424) loss 3.2832 (3.6165) grad_norm 1.3963 (1.3777) [2022-01-21 17:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][350/1251] eta 0:33:40 lr 0.000612 time 2.2062 (2.2423) loss 3.7447 (3.6165) grad_norm 1.3460 (1.3794) [2022-01-21 17:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][360/1251] eta 0:33:15 lr 0.000612 time 2.0265 (2.2401) loss 3.8408 (3.6190) grad_norm 1.2578 (1.3788) [2022-01-21 17:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][370/1251] eta 0:32:52 lr 0.000611 time 2.1950 (2.2384) loss 4.1000 (3.6200) grad_norm 1.5598 (1.3794) [2022-01-21 17:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][380/1251] eta 0:32:30 lr 0.000611 time 2.1951 (2.2390) loss 3.0056 (3.6239) grad_norm 1.4427 (1.3807) [2022-01-21 17:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][390/1251] eta 0:32:06 lr 0.000611 time 1.7691 (2.2371) loss 3.3861 (3.6256) grad_norm 1.4457 (1.3825) [2022-01-21 17:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][400/1251] eta 0:31:42 lr 0.000611 time 1.8406 (2.2360) loss 3.3365 (3.6310) grad_norm 1.2557 (1.3810) [2022-01-21 17:33:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][410/1251] eta 0:31:17 lr 0.000611 time 2.0173 (2.2322) loss 3.2439 (3.6289) grad_norm 1.4188 (1.3791) [2022-01-21 17:33:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][420/1251] eta 0:30:56 lr 0.000611 time 2.5290 (2.2339) loss 2.5958 (3.6229) grad_norm 1.1732 (1.3788) [2022-01-21 17:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][430/1251] eta 0:30:35 lr 0.000611 time 2.3951 (2.2352) loss 3.0193 (3.6219) grad_norm 1.5073 (1.3780) [2022-01-21 17:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][440/1251] eta 0:30:10 lr 0.000611 time 1.6589 (2.2328) loss 3.6891 
(3.6292) grad_norm 1.5604 (1.3797) [2022-01-21 17:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][450/1251] eta 0:29:47 lr 0.000611 time 2.3242 (2.2317) loss 2.7623 (3.6192) grad_norm 1.3765 (1.3792) [2022-01-21 17:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][460/1251] eta 0:29:22 lr 0.000611 time 1.8637 (2.2278) loss 3.9615 (3.6212) grad_norm 1.4523 (1.3801) [2022-01-21 17:35:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][470/1251] eta 0:28:58 lr 0.000611 time 2.1035 (2.2262) loss 3.6894 (3.6237) grad_norm 1.4371 (1.3800) [2022-01-21 17:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][480/1251] eta 0:28:35 lr 0.000611 time 2.2211 (2.2254) loss 3.3143 (3.6262) grad_norm 1.2293 (1.3788) [2022-01-21 17:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][490/1251] eta 0:28:13 lr 0.000611 time 2.4547 (2.2253) loss 3.7469 (3.6239) grad_norm 1.4734 (1.3781) [2022-01-21 17:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][500/1251] eta 0:27:49 lr 0.000611 time 2.2885 (2.2232) loss 4.0117 (3.6223) grad_norm 1.2884 (1.3774) [2022-01-21 17:36:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][510/1251] eta 0:27:26 lr 0.000611 time 2.2427 (2.2224) loss 3.9807 (3.6179) grad_norm 1.3506 (1.3779) [2022-01-21 17:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][520/1251] eta 0:27:07 lr 0.000611 time 3.1014 (2.2258) loss 3.0381 (3.6195) grad_norm 1.3784 (1.3793) [2022-01-21 17:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][530/1251] eta 0:26:44 lr 0.000611 time 2.1420 (2.2258) loss 4.2248 (3.6195) grad_norm 1.5008 (1.3788) [2022-01-21 17:37:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][540/1251] eta 0:26:23 lr 0.000611 time 2.1806 (2.2267) loss 3.1042 (3.6193) grad_norm 1.1995 (1.3802) [2022-01-21 17:38:10 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [129/300][550/1251] eta 0:25:58 lr 0.000611 time 1.7363 (2.2238) loss 4.0845 (3.6192) grad_norm 1.3201 (1.3800) [2022-01-21 17:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][560/1251] eta 0:25:34 lr 0.000611 time 2.8328 (2.2211) loss 4.2736 (3.6259) grad_norm 1.3481 (1.3792) [2022-01-21 17:38:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][570/1251] eta 0:25:10 lr 0.000611 time 1.8572 (2.2184) loss 4.4579 (3.6292) grad_norm 1.5059 (1.3799) [2022-01-21 17:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][580/1251] eta 0:24:48 lr 0.000611 time 2.0695 (2.2184) loss 3.3210 (3.6317) grad_norm 1.2671 (1.3789) [2022-01-21 17:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][590/1251] eta 0:24:23 lr 0.000611 time 1.7243 (2.2147) loss 2.7900 (3.6306) grad_norm 1.4129 (1.3787) [2022-01-21 17:39:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][600/1251] eta 0:24:01 lr 0.000611 time 2.7366 (2.2141) loss 3.2997 (3.6323) grad_norm 1.2612 (1.3778) [2022-01-21 17:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][610/1251] eta 0:23:38 lr 0.000611 time 2.0692 (2.2133) loss 2.6587 (3.6310) grad_norm 1.2911 (1.3771) [2022-01-21 17:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][620/1251] eta 0:23:16 lr 0.000610 time 2.2659 (2.2128) loss 3.3378 (3.6312) grad_norm 1.4278 (1.3768) [2022-01-21 17:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][630/1251] eta 0:22:53 lr 0.000610 time 2.1765 (2.2119) loss 4.0065 (3.6307) grad_norm 1.4365 (1.3762) [2022-01-21 17:41:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][640/1251] eta 0:22:30 lr 0.000610 time 2.4755 (2.2107) loss 3.8577 (3.6295) grad_norm 1.3202 (1.3759) [2022-01-21 17:41:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][650/1251] eta 0:22:09 lr 0.000610 time 2.3636 (2.2120) loss 3.7948 
(3.6306) grad_norm 1.3083 (1.3752) [2022-01-21 17:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][660/1251] eta 0:21:46 lr 0.000610 time 2.4038 (2.2114) loss 4.3889 (3.6266) grad_norm 1.2987 (1.3749) [2022-01-21 17:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][670/1251] eta 0:21:24 lr 0.000610 time 2.0174 (2.2110) loss 4.0413 (3.6275) grad_norm 1.3861 (1.3737) [2022-01-21 17:42:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][680/1251] eta 0:21:03 lr 0.000610 time 2.5263 (2.2122) loss 2.7640 (3.6272) grad_norm 1.2613 (1.3737) [2022-01-21 17:43:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][690/1251] eta 0:20:41 lr 0.000610 time 2.7262 (2.2129) loss 3.0408 (3.6264) grad_norm 1.5591 (1.3738) [2022-01-21 17:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][700/1251] eta 0:20:19 lr 0.000610 time 2.2113 (2.2132) loss 3.9230 (3.6257) grad_norm 1.3663 (1.3751) [2022-01-21 17:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][710/1251] eta 0:19:57 lr 0.000610 time 2.5422 (2.2143) loss 4.2549 (3.6235) grad_norm 1.2219 (1.3758) [2022-01-21 17:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][720/1251] eta 0:19:35 lr 0.000610 time 2.8724 (2.2138) loss 3.3399 (3.6235) grad_norm 1.3573 (1.3765) [2022-01-21 17:44:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][730/1251] eta 0:19:11 lr 0.000610 time 1.8870 (2.2101) loss 2.5980 (3.6229) grad_norm 1.3920 (1.3766) [2022-01-21 17:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][740/1251] eta 0:18:48 lr 0.000610 time 2.0336 (2.2089) loss 3.8361 (3.6239) grad_norm 1.2468 (1.3762) [2022-01-21 17:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][750/1251] eta 0:18:26 lr 0.000610 time 2.5754 (2.2086) loss 3.4171 (3.6224) grad_norm 1.3122 (1.3758) [2022-01-21 17:45:46 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [129/300][760/1251] eta 0:18:04 lr 0.000610 time 2.2018 (2.2092) loss 3.9186 (3.6248) grad_norm 1.4639 (1.3758) [2022-01-21 17:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][770/1251] eta 0:17:42 lr 0.000610 time 2.7124 (2.2098) loss 4.0398 (3.6273) grad_norm 1.3163 (1.3761) [2022-01-21 17:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][780/1251] eta 0:17:20 lr 0.000610 time 2.4885 (2.2102) loss 3.5628 (3.6256) grad_norm 1.2074 (1.3777) [2022-01-21 17:46:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][790/1251] eta 0:16:58 lr 0.000610 time 1.7388 (2.2100) loss 4.1585 (3.6279) grad_norm 1.2077 (1.3772) [2022-01-21 17:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][800/1251] eta 0:16:37 lr 0.000610 time 2.5775 (2.2110) loss 4.3886 (3.6271) grad_norm 1.2000 (1.3764) [2022-01-21 17:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][810/1251] eta 0:16:15 lr 0.000610 time 2.8515 (2.2110) loss 3.7562 (3.6257) grad_norm 1.4218 (1.3774) [2022-01-21 17:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][820/1251] eta 0:15:52 lr 0.000610 time 1.8907 (2.2093) loss 4.0747 (3.6264) grad_norm 1.5405 (1.3801) [2022-01-21 17:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][830/1251] eta 0:15:29 lr 0.000610 time 1.7636 (2.2082) loss 4.1941 (3.6281) grad_norm 1.3935 (1.3799) [2022-01-21 17:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][840/1251] eta 0:15:07 lr 0.000610 time 1.9382 (2.2085) loss 3.3656 (3.6283) grad_norm 1.2547 (1.3784) [2022-01-21 17:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][850/1251] eta 0:14:45 lr 0.000610 time 2.8097 (2.2095) loss 3.8058 (3.6267) grad_norm 1.1716 (1.3778) [2022-01-21 17:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][860/1251] eta 0:14:24 lr 0.000610 time 2.2809 (2.2098) loss 4.0438 
(3.6298) grad_norm 1.3033 (1.3785) [2022-01-21 17:49:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][870/1251] eta 0:14:01 lr 0.000609 time 1.8123 (2.2078) loss 2.5102 (3.6281) grad_norm 1.4288 (1.3790) [2022-01-21 17:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][880/1251] eta 0:13:38 lr 0.000609 time 1.9458 (2.2058) loss 2.8438 (3.6274) grad_norm 1.2893 (1.3798) [2022-01-21 17:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][890/1251] eta 0:13:16 lr 0.000609 time 2.6218 (2.2060) loss 3.6357 (3.6292) grad_norm 1.3548 (1.3794) [2022-01-21 17:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][900/1251] eta 0:12:53 lr 0.000609 time 1.9579 (2.2036) loss 3.5720 (3.6328) grad_norm 1.2410 (1.3792) [2022-01-21 17:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][910/1251] eta 0:12:31 lr 0.000609 time 2.1333 (2.2037) loss 3.6042 (3.6341) grad_norm 1.4358 (1.3788) [2022-01-21 17:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][920/1251] eta 0:12:09 lr 0.000609 time 2.2123 (2.2040) loss 3.6903 (3.6334) grad_norm 1.5238 (1.3790) [2022-01-21 17:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][930/1251] eta 0:11:47 lr 0.000609 time 1.8759 (2.2047) loss 4.1644 (3.6356) grad_norm 1.3655 (1.3786) [2022-01-21 17:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][940/1251] eta 0:11:25 lr 0.000609 time 2.2598 (2.2043) loss 3.2560 (3.6362) grad_norm 1.3410 (1.3797) [2022-01-21 17:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][950/1251] eta 0:11:03 lr 0.000609 time 2.4020 (2.2057) loss 2.4459 (3.6342) grad_norm 1.2832 (1.3801) [2022-01-21 17:53:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][960/1251] eta 0:10:41 lr 0.000609 time 1.8712 (2.2059) loss 3.9128 (3.6336) grad_norm 1.1898 (1.3801) [2022-01-21 17:53:26 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [129/300][970/1251] eta 0:10:19 lr 0.000609 time 1.9889 (2.2044) loss 3.8958 (3.6338) grad_norm 1.3603 (1.3802) [2022-01-21 17:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][980/1251] eta 0:09:57 lr 0.000609 time 2.4067 (2.2043) loss 3.4991 (3.6332) grad_norm 1.4427 (1.3808) [2022-01-21 17:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][990/1251] eta 0:09:35 lr 0.000609 time 1.7809 (2.2041) loss 2.2918 (3.6339) grad_norm 1.1667 (1.3805) [2022-01-21 17:54:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1000/1251] eta 0:09:12 lr 0.000609 time 2.2591 (2.2030) loss 3.6620 (3.6330) grad_norm 1.5188 (1.3809) [2022-01-21 17:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1010/1251] eta 0:08:50 lr 0.000609 time 1.6928 (2.2021) loss 4.0564 (3.6346) grad_norm 1.3131 (1.3817) [2022-01-21 17:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1020/1251] eta 0:08:28 lr 0.000609 time 2.0993 (2.2014) loss 4.4260 (3.6382) grad_norm 1.2911 (1.3821) [2022-01-21 17:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1030/1251] eta 0:08:06 lr 0.000609 time 2.1424 (2.2010) loss 3.3479 (3.6404) grad_norm 1.5586 (1.3823) [2022-01-21 17:55:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1040/1251] eta 0:07:44 lr 0.000609 time 1.8672 (2.2001) loss 3.1169 (3.6386) grad_norm 1.4054 (1.3822) [2022-01-21 17:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1050/1251] eta 0:07:22 lr 0.000609 time 2.1639 (2.1993) loss 3.7222 (3.6393) grad_norm 1.3410 (1.3820) [2022-01-21 17:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1060/1251] eta 0:06:59 lr 0.000609 time 2.1509 (2.1989) loss 3.9228 (3.6393) grad_norm 1.2760 (1.3818) [2022-01-21 17:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1070/1251] eta 0:06:38 lr 0.000609 time 1.9163 (2.1995) loss 
4.4499 (3.6426) grad_norm 1.2536 (1.3812) [2022-01-21 17:57:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1080/1251] eta 0:06:16 lr 0.000609 time 1.8464 (2.1989) loss 3.9064 (3.6403) grad_norm 1.3369 (1.3814) [2022-01-21 17:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1090/1251] eta 0:05:53 lr 0.000609 time 2.2172 (2.1984) loss 3.2326 (3.6382) grad_norm 1.4237 (1.3812) [2022-01-21 17:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1100/1251] eta 0:05:32 lr 0.000609 time 2.7724 (2.1991) loss 4.0531 (3.6389) grad_norm 1.6145 (1.3814) [2022-01-21 17:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1110/1251] eta 0:05:10 lr 0.000608 time 1.7085 (2.1993) loss 2.6842 (3.6388) grad_norm 1.5453 (1.3815) [2022-01-21 17:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1120/1251] eta 0:04:47 lr 0.000608 time 1.8367 (2.1983) loss 3.8217 (3.6379) grad_norm 1.2781 (1.3813) [2022-01-21 17:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1130/1251] eta 0:04:25 lr 0.000608 time 1.9089 (2.1979) loss 3.5722 (3.6368) grad_norm 1.6128 (1.3810) [2022-01-21 17:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1140/1251] eta 0:04:03 lr 0.000608 time 2.4786 (2.1972) loss 2.6750 (3.6335) grad_norm 1.4278 (1.3810) [2022-01-21 17:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1150/1251] eta 0:03:41 lr 0.000608 time 1.9745 (2.1965) loss 2.4242 (3.6317) grad_norm 1.9364 (1.3819) [2022-01-21 18:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1160/1251] eta 0:03:19 lr 0.000608 time 1.7570 (2.1976) loss 4.0271 (3.6315) grad_norm 1.4204 (1.3831) [2022-01-21 18:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1170/1251] eta 0:02:58 lr 0.000608 time 1.8729 (2.1979) loss 3.4284 (3.6319) grad_norm 1.7418 (1.3837) [2022-01-21 18:01:03 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1180/1251] eta 0:02:36 lr 0.000608 time 3.4915 (2.1995) loss 3.9705 (3.6316) grad_norm 1.5039 (1.3844) [2022-01-21 18:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1190/1251] eta 0:02:14 lr 0.000608 time 1.7484 (2.2005) loss 3.8616 (3.6343) grad_norm 1.2553 (1.3837) [2022-01-21 18:01:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1200/1251] eta 0:01:52 lr 0.000608 time 1.9212 (2.2009) loss 3.7974 (3.6353) grad_norm 1.3346 (1.3832) [2022-01-21 18:02:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1210/1251] eta 0:01:30 lr 0.000608 time 1.6507 (2.2003) loss 3.5809 (3.6341) grad_norm 1.1297 (1.3830) [2022-01-21 18:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1220/1251] eta 0:01:08 lr 0.000608 time 2.5038 (2.1989) loss 3.8637 (3.6329) grad_norm 1.2435 (1.3836) [2022-01-21 18:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1230/1251] eta 0:00:46 lr 0.000608 time 1.8784 (2.1975) loss 2.4830 (3.6331) grad_norm 1.7089 (1.3840) [2022-01-21 18:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1240/1251] eta 0:00:24 lr 0.000608 time 1.6021 (2.1962) loss 3.7985 (3.6330) grad_norm 1.2907 (1.3846) [2022-01-21 18:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [129/300][1250/1251] eta 0:00:02 lr 0.000608 time 1.1568 (2.1912) loss 3.5060 (3.6340) grad_norm 1.3758 (1.3844) [2022-01-21 18:03:27 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 129 training takes 0:45:41 [2022-01-21 18:03:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.617 (18.617) Loss 1.0534 (1.0534) Acc@1 75.000 (75.000) Acc@5 92.969 (92.969) [2022-01-21 18:04:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.048 (3.605) Loss 1.1095 (1.0427) Acc@1 74.414 (75.062) Acc@5 92.383 (92.889) [2022-01-21 18:04:22 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.961 (2.657) Loss 1.0368 (1.0404) Acc@1 75.391 (75.051) Acc@5 93.848 (92.992) [2022-01-21 18:04:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.599 (2.250) Loss 0.9773 (1.0330) Acc@1 76.953 (75.302) Acc@5 93.359 (92.975) [2022-01-21 18:04:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.600 (2.130) Loss 1.1184 (1.0386) Acc@1 73.340 (75.176) Acc@5 92.188 (92.933) [2022-01-21 18:05:02 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.162 Acc@5 92.884 [2022-01-21 18:05:02 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-01-21 18:05:02 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.16% [2022-01-21 18:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][0/1251] eta 7:36:34 lr 0.000608 time 21.8984 (21.8984) loss 3.7832 (3.7832) grad_norm 1.3110 (1.3110) [2022-01-21 18:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][10/1251] eta 1:25:39 lr 0.000608 time 2.4482 (4.1413) loss 3.8099 (3.5681) grad_norm 1.4619 (1.3442) [2022-01-21 18:06:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][20/1251] eta 1:07:42 lr 0.000608 time 2.1682 (3.3001) loss 3.9611 (3.6740) grad_norm 1.4287 (1.3565) [2022-01-21 18:06:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][30/1251] eta 0:59:56 lr 0.000608 time 1.8371 (2.9454) loss 3.9759 (3.7611) grad_norm 1.3407 (1.3727) [2022-01-21 18:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][40/1251] eta 0:56:10 lr 0.000608 time 2.7720 (2.7832) loss 4.2568 (3.7888) grad_norm 1.2636 (1.4152) [2022-01-21 18:07:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][50/1251] eta 0:53:48 lr 0.000608 time 2.4331 (2.6880) loss 2.2982 (3.7771) grad_norm 1.5283 (1.4273) [2022-01-21 18:07:38 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [130/300][60/1251] eta 0:50:59 lr 0.000608 time 1.9363 (2.5691) loss 3.3798 (3.7789) grad_norm 1.3731 (1.4178) [2022-01-21 18:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][70/1251] eta 0:49:09 lr 0.000608 time 1.9540 (2.4977) loss 3.6409 (3.7472) grad_norm 1.5102 (1.4335) [2022-01-21 18:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][80/1251] eta 0:47:55 lr 0.000608 time 2.1458 (2.4554) loss 3.9379 (3.7213) grad_norm 1.6236 (1.4236) [2022-01-21 18:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][90/1251] eta 0:47:13 lr 0.000608 time 3.3440 (2.4405) loss 3.2887 (3.7290) grad_norm 1.3472 (1.4143) [2022-01-21 18:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][100/1251] eta 0:46:23 lr 0.000608 time 1.8703 (2.4184) loss 3.0975 (3.6628) grad_norm 1.3963 (1.4123) [2022-01-21 18:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][110/1251] eta 0:45:26 lr 0.000607 time 1.8487 (2.3900) loss 4.0242 (3.6626) grad_norm 1.2528 (1.4092) [2022-01-21 18:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][120/1251] eta 0:44:45 lr 0.000607 time 2.0242 (2.3744) loss 3.9128 (3.6624) grad_norm 1.3964 (1.4094) [2022-01-21 18:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][130/1251] eta 0:44:09 lr 0.000607 time 2.8841 (2.3638) loss 4.2233 (3.6648) grad_norm 1.4276 (1.4120) [2022-01-21 18:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][140/1251] eta 0:43:31 lr 0.000607 time 1.8883 (2.3507) loss 3.0342 (3.6434) grad_norm 1.4485 (1.4062) [2022-01-21 18:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][150/1251] eta 0:42:48 lr 0.000607 time 2.1986 (2.3330) loss 3.7956 (3.6622) grad_norm 1.2190 (1.4039) [2022-01-21 18:11:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][160/1251] eta 0:42:09 lr 0.000607 time 1.8016 (2.3186) loss 4.0219 (3.6664) 
grad_norm 1.7045 (1.4041) [2022-01-21 18:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][170/1251] eta 0:41:26 lr 0.000607 time 1.8643 (2.2998) loss 4.4026 (3.6752) grad_norm 1.5110 (1.4048) [2022-01-21 18:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][180/1251] eta 0:40:51 lr 0.000607 time 2.1248 (2.2890) loss 2.8213 (3.6737) grad_norm 1.3150 (1.4028) [2022-01-21 18:12:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][190/1251] eta 0:40:19 lr 0.000607 time 2.4912 (2.2806) loss 4.2695 (3.6834) grad_norm 1.3015 (1.4019) [2022-01-21 18:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][200/1251] eta 0:40:03 lr 0.000607 time 1.8108 (2.2867) loss 4.0103 (3.6737) grad_norm 1.3389 (1.3979) [2022-01-21 18:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][210/1251] eta 0:39:36 lr 0.000607 time 2.2622 (2.2825) loss 2.9143 (3.6691) grad_norm 1.3622 (1.3965) [2022-01-21 18:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][220/1251] eta 0:39:13 lr 0.000607 time 2.7701 (2.2827) loss 3.8802 (3.6832) grad_norm 1.4703 (1.3944) [2022-01-21 18:13:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][230/1251] eta 0:38:38 lr 0.000607 time 1.9089 (2.2713) loss 4.3796 (3.6842) grad_norm 1.2533 (1.3984) [2022-01-21 18:14:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][240/1251] eta 0:38:13 lr 0.000607 time 2.3040 (2.2686) loss 4.0903 (3.6597) grad_norm 1.7691 (1.3990) [2022-01-21 18:14:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][250/1251] eta 0:37:48 lr 0.000607 time 1.6805 (2.2661) loss 2.0902 (3.6446) grad_norm 1.4881 (1.4000) [2022-01-21 18:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][260/1251] eta 0:37:28 lr 0.000607 time 2.1494 (2.2690) loss 2.9870 (3.6396) grad_norm 1.3713 (1.4007) [2022-01-21 18:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [130/300][270/1251] eta 0:37:04 lr 0.000607 time 1.8196 (2.2675) loss 4.2608 (3.6469) grad_norm 1.3145 (1.3980) [2022-01-21 18:15:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][280/1251] eta 0:36:38 lr 0.000607 time 1.6797 (2.2639) loss 4.2852 (3.6565) grad_norm 1.3318 (1.3950) [2022-01-21 18:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][290/1251] eta 0:36:08 lr 0.000607 time 1.9966 (2.2569) loss 4.1332 (3.6650) grad_norm 1.7579 (1.3940) [2022-01-21 18:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][300/1251] eta 0:35:42 lr 0.000607 time 2.1749 (2.2526) loss 4.5085 (3.6710) grad_norm 1.2914 (1.3932) [2022-01-21 18:16:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][310/1251] eta 0:35:12 lr 0.000607 time 1.8915 (2.2452) loss 3.2468 (3.6765) grad_norm 1.2578 (1.3902) [2022-01-21 18:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][320/1251] eta 0:34:46 lr 0.000607 time 2.2703 (2.2407) loss 3.7800 (3.6737) grad_norm 1.1537 (1.3875) [2022-01-21 18:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][330/1251] eta 0:34:19 lr 0.000607 time 1.9213 (2.2364) loss 4.4612 (3.6713) grad_norm 1.3080 (1.3845) [2022-01-21 18:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][340/1251] eta 0:33:56 lr 0.000607 time 1.4292 (2.2350) loss 2.8562 (3.6610) grad_norm 1.3841 (1.3838) [2022-01-21 18:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][350/1251] eta 0:33:34 lr 0.000606 time 1.9489 (2.2356) loss 3.4290 (3.6539) grad_norm 1.5060 (1.3845) [2022-01-21 18:18:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][360/1251] eta 0:33:13 lr 0.000606 time 2.1481 (2.2369) loss 3.6809 (3.6535) grad_norm 1.3353 (1.3854) [2022-01-21 18:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][370/1251] eta 0:32:53 lr 0.000606 time 1.7610 (2.2404) loss 3.8716 (3.6582) 
grad_norm 1.4161 (1.3860) [2022-01-21 18:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][380/1251] eta 0:32:30 lr 0.000606 time 2.0588 (2.2397) loss 4.2430 (3.6588) grad_norm 1.4472 (1.3872) [2022-01-21 18:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][390/1251] eta 0:32:04 lr 0.000606 time 1.8725 (2.2354) loss 4.1128 (3.6574) grad_norm 1.4128 (1.3904) [2022-01-21 18:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][400/1251] eta 0:31:42 lr 0.000606 time 1.9129 (2.2352) loss 3.5389 (3.6580) grad_norm 1.2599 (1.3914) [2022-01-21 18:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][410/1251] eta 0:31:21 lr 0.000606 time 2.1537 (2.2368) loss 4.0578 (3.6549) grad_norm 1.2957 (1.3903) [2022-01-21 18:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][420/1251] eta 0:30:58 lr 0.000606 time 2.1060 (2.2363) loss 3.7794 (3.6538) grad_norm 1.3583 (1.3891) [2022-01-21 18:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][430/1251] eta 0:30:33 lr 0.000606 time 1.5884 (2.2329) loss 2.6002 (3.6589) grad_norm 1.3178 (1.3889) [2022-01-21 18:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][440/1251] eta 0:30:08 lr 0.000606 time 1.5049 (2.2297) loss 4.3283 (3.6575) grad_norm 1.7112 (1.3907) [2022-01-21 18:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][450/1251] eta 0:29:46 lr 0.000606 time 1.9516 (2.2308) loss 3.4318 (3.6594) grad_norm 1.2733 (1.3900) [2022-01-21 18:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][460/1251] eta 0:29:21 lr 0.000606 time 1.5921 (2.2273) loss 2.7385 (3.6554) grad_norm 1.4932 (1.3887) [2022-01-21 18:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][470/1251] eta 0:28:59 lr 0.000606 time 2.3003 (2.2271) loss 3.2641 (3.6577) grad_norm 1.2004 (1.3878) [2022-01-21 18:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [130/300][480/1251] eta 0:28:35 lr 0.000606 time 1.5389 (2.2256) loss 3.3132 (3.6565) grad_norm 1.9380 (1.3872) [2022-01-21 18:23:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][490/1251] eta 0:28:15 lr 0.000606 time 2.1909 (2.2281) loss 3.4552 (3.6582) grad_norm 1.5915 (1.3877) [2022-01-21 18:23:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][500/1251] eta 0:27:53 lr 0.000606 time 1.9190 (2.2278) loss 4.1409 (3.6575) grad_norm 1.1981 (1.3874) [2022-01-21 18:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][510/1251] eta 0:27:28 lr 0.000606 time 2.5309 (2.2249) loss 3.7155 (3.6581) grad_norm 1.3368 (1.3863) [2022-01-21 18:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][520/1251] eta 0:27:04 lr 0.000606 time 1.7315 (2.2218) loss 4.0064 (3.6595) grad_norm 1.4616 (1.3871) [2022-01-21 18:24:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][530/1251] eta 0:26:42 lr 0.000606 time 1.8584 (2.2221) loss 3.7689 (3.6600) grad_norm 1.2832 (1.3866) [2022-01-21 18:25:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][540/1251] eta 0:26:19 lr 0.000606 time 3.0259 (2.2213) loss 4.2681 (3.6546) grad_norm 1.4348 (1.3861) [2022-01-21 18:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][550/1251] eta 0:25:57 lr 0.000606 time 2.4437 (2.2223) loss 3.6637 (3.6574) grad_norm 1.2963 (1.3846) [2022-01-21 18:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][560/1251] eta 0:25:34 lr 0.000606 time 2.4208 (2.2211) loss 3.7918 (3.6581) grad_norm 1.3531 (1.3858) [2022-01-21 18:26:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][570/1251] eta 0:25:11 lr 0.000606 time 2.2448 (2.2194) loss 3.7544 (3.6596) grad_norm 1.7994 (1.3870) [2022-01-21 18:26:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][580/1251] eta 0:24:47 lr 0.000606 time 2.8765 (2.2164) loss 2.4453 (3.6559) 
grad_norm 1.4687 (1.3874) [2022-01-21 18:26:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][590/1251] eta 0:24:24 lr 0.000606 time 2.2597 (2.2158) loss 3.1520 (3.6599) grad_norm 1.2435 (1.3877) [2022-01-21 18:27:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][600/1251] eta 0:24:02 lr 0.000605 time 2.1707 (2.2155) loss 3.5248 (3.6546) grad_norm 1.2769 (1.3879) [2022-01-21 18:27:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][610/1251] eta 0:23:40 lr 0.000605 time 2.1564 (2.2166) loss 2.8528 (3.6535) grad_norm 1.6320 (1.3877) [2022-01-21 18:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][620/1251] eta 0:23:19 lr 0.000605 time 2.6668 (2.2173) loss 3.0802 (3.6473) grad_norm 1.3480 (1.3873) [2022-01-21 18:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][630/1251] eta 0:22:55 lr 0.000605 time 1.7473 (2.2154) loss 4.4091 (3.6484) grad_norm 1.5611 (1.3874) [2022-01-21 18:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][640/1251] eta 0:22:32 lr 0.000605 time 1.9693 (2.2130) loss 4.0515 (3.6470) grad_norm 1.1652 (1.3862) [2022-01-21 18:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][650/1251] eta 0:22:08 lr 0.000605 time 1.8365 (2.2100) loss 3.4926 (3.6449) grad_norm 1.2398 (1.3853) [2022-01-21 18:29:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][660/1251] eta 0:21:46 lr 0.000605 time 2.1849 (2.2106) loss 2.5359 (3.6423) grad_norm 1.2242 (1.3846) [2022-01-21 18:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][670/1251] eta 0:21:23 lr 0.000605 time 1.8807 (2.2098) loss 4.0096 (3.6386) grad_norm 1.3404 (1.3844) [2022-01-21 18:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][680/1251] eta 0:21:01 lr 0.000605 time 2.1165 (2.2096) loss 4.1787 (3.6431) grad_norm 1.5138 (1.3845) [2022-01-21 18:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [130/300][690/1251] eta 0:20:40 lr 0.000605 time 1.9289 (2.2109) loss 3.7102 (3.6420) grad_norm 1.3627 (1.3837) [2022-01-21 18:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][700/1251] eta 0:20:17 lr 0.000605 time 1.6677 (2.2104) loss 3.9326 (3.6402) grad_norm 1.2471 (1.3824) [2022-01-21 18:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][710/1251] eta 0:19:56 lr 0.000605 time 1.7524 (2.2108) loss 3.6495 (3.6411) grad_norm 1.2902 (1.3826) [2022-01-21 18:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][720/1251] eta 0:19:33 lr 0.000605 time 1.6450 (2.2099) loss 3.7881 (3.6418) grad_norm 1.7171 (1.3831) [2022-01-21 18:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][730/1251] eta 0:19:11 lr 0.000605 time 2.0929 (2.2093) loss 3.0531 (3.6436) grad_norm 1.4117 (1.3835) [2022-01-21 18:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][740/1251] eta 0:18:48 lr 0.000605 time 1.8791 (2.2077) loss 4.1409 (3.6443) grad_norm 1.1842 (1.3839) [2022-01-21 18:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][750/1251] eta 0:18:26 lr 0.000605 time 3.1059 (2.2091) loss 4.0114 (3.6450) grad_norm 1.3119 (1.3840) [2022-01-21 18:33:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][760/1251] eta 0:18:05 lr 0.000605 time 1.8413 (2.2100) loss 3.4996 (3.6473) grad_norm 1.7106 (1.3852) [2022-01-21 18:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][770/1251] eta 0:17:42 lr 0.000605 time 1.6696 (2.2082) loss 2.9656 (3.6423) grad_norm 1.4439 (1.3848) [2022-01-21 18:33:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][780/1251] eta 0:17:19 lr 0.000605 time 1.8611 (2.2061) loss 3.0087 (3.6462) grad_norm 1.4003 (1.3853) [2022-01-21 18:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][790/1251] eta 0:16:58 lr 0.000605 time 2.5021 (2.2083) loss 2.7942 (3.6426) 
grad_norm 1.5184 (1.3866) [2022-01-21 18:34:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][800/1251] eta 0:16:36 lr 0.000605 time 1.9439 (2.2085) loss 4.2440 (3.6427) grad_norm 1.4397 (1.3878) [2022-01-21 18:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][810/1251] eta 0:16:13 lr 0.000605 time 1.6146 (2.2066) loss 3.8969 (3.6401) grad_norm 1.4869 (1.3877) [2022-01-21 18:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][820/1251] eta 0:15:50 lr 0.000605 time 2.0516 (2.2043) loss 3.7547 (3.6337) grad_norm 1.4053 (1.3888) [2022-01-21 18:35:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][830/1251] eta 0:15:27 lr 0.000605 time 1.9775 (2.2041) loss 4.2185 (3.6326) grad_norm 1.2984 (1.3884) [2022-01-21 18:35:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][840/1251] eta 0:15:05 lr 0.000605 time 2.0341 (2.2042) loss 3.9053 (3.6369) grad_norm 1.2041 (1.3880) [2022-01-21 18:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][850/1251] eta 0:14:43 lr 0.000604 time 1.8806 (2.2034) loss 2.5866 (3.6365) grad_norm 1.3768 (1.3871) [2022-01-21 18:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][860/1251] eta 0:14:21 lr 0.000604 time 1.8051 (2.2034) loss 4.2135 (3.6374) grad_norm 1.3719 (1.3872) [2022-01-21 18:37:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][870/1251] eta 0:13:59 lr 0.000604 time 2.5253 (2.2038) loss 4.3794 (3.6406) grad_norm 1.5525 (1.3868) [2022-01-21 18:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][880/1251] eta 0:13:37 lr 0.000604 time 2.8422 (2.2039) loss 3.6525 (3.6393) grad_norm 1.4704 (1.3869) [2022-01-21 18:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][890/1251] eta 0:13:15 lr 0.000604 time 1.8416 (2.2030) loss 4.3990 (3.6411) grad_norm 1.3891 (1.3867) [2022-01-21 18:38:06 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [130/300][900/1251] eta 0:12:52 lr 0.000604 time 1.8667 (2.2021) loss 3.4875 (3.6438) grad_norm 1.3580 (1.3870) [2022-01-21 18:38:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][910/1251] eta 0:12:30 lr 0.000604 time 1.9290 (2.2009) loss 4.3924 (3.6477) grad_norm 1.3170 (1.3867) [2022-01-21 18:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][920/1251] eta 0:12:08 lr 0.000604 time 1.9413 (2.2001) loss 4.2257 (3.6478) grad_norm 1.3025 (1.3872) [2022-01-21 18:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][930/1251] eta 0:11:45 lr 0.000604 time 1.6454 (2.1990) loss 3.7969 (3.6472) grad_norm 1.3316 (1.3873) [2022-01-21 18:39:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][940/1251] eta 0:11:23 lr 0.000604 time 1.9507 (2.1962) loss 2.4968 (3.6471) grad_norm 1.6040 (1.3875) [2022-01-21 18:39:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][950/1251] eta 0:11:01 lr 0.000604 time 1.9754 (2.1964) loss 4.3566 (3.6489) grad_norm 1.3455 (1.3882) [2022-01-21 18:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][960/1251] eta 0:10:39 lr 0.000604 time 2.1436 (2.1976) loss 3.4465 (3.6508) grad_norm 1.3812 (1.3876) [2022-01-21 18:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][970/1251] eta 0:10:18 lr 0.000604 time 2.5974 (2.1994) loss 3.5016 (3.6520) grad_norm 1.4452 (1.3879) [2022-01-21 18:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][980/1251] eta 0:09:55 lr 0.000604 time 2.0772 (2.1990) loss 3.8916 (3.6517) grad_norm 1.3304 (1.3874) [2022-01-21 18:41:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][990/1251] eta 0:09:34 lr 0.000604 time 2.2443 (2.2002) loss 3.8896 (3.6513) grad_norm 1.2713 (1.3866) [2022-01-21 18:41:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1000/1251] eta 0:09:12 lr 0.000604 time 2.7966 (2.2025) loss 3.8038 (3.6514) 
grad_norm 1.2320 (1.3863) [2022-01-21 18:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1010/1251] eta 0:08:50 lr 0.000604 time 1.8864 (2.2032) loss 2.9243 (3.6522) grad_norm 1.5281 (1.3870) [2022-01-21 18:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1020/1251] eta 0:08:28 lr 0.000604 time 1.6109 (2.2029) loss 2.5810 (3.6528) grad_norm 1.3671 (1.3872) [2022-01-21 18:42:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1030/1251] eta 0:08:06 lr 0.000604 time 1.9261 (2.2011) loss 3.9186 (3.6499) grad_norm 1.6077 (1.3879) [2022-01-21 18:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1040/1251] eta 0:07:43 lr 0.000604 time 1.9711 (2.1984) loss 2.7396 (3.6490) grad_norm 1.3254 (1.3871) [2022-01-21 18:43:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1050/1251] eta 0:07:21 lr 0.000604 time 1.9992 (2.1985) loss 3.7544 (3.6491) grad_norm 1.6723 (1.3873) [2022-01-21 18:43:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1060/1251] eta 0:07:00 lr 0.000604 time 2.1019 (2.1991) loss 3.8784 (3.6486) grad_norm 1.8234 (1.3880) [2022-01-21 18:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1070/1251] eta 0:06:38 lr 0.000604 time 2.0110 (2.2004) loss 2.5848 (3.6459) grad_norm 1.4192 (1.3880) [2022-01-21 18:44:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1080/1251] eta 0:06:16 lr 0.000604 time 2.2291 (2.2006) loss 3.7837 (3.6471) grad_norm 1.3659 (1.3872) [2022-01-21 18:45:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1090/1251] eta 0:05:54 lr 0.000603 time 2.3204 (2.2011) loss 3.6230 (3.6461) grad_norm 1.3488 (1.3864) [2022-01-21 18:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1100/1251] eta 0:05:32 lr 0.000603 time 1.6682 (2.2001) loss 3.7613 (3.6462) grad_norm 1.5529 (1.3866) [2022-01-21 18:45:47 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [130/300][1110/1251] eta 0:05:10 lr 0.000603 time 2.3754 (2.2005) loss 3.7581 (3.6470) grad_norm 1.3801 (1.3862) [2022-01-21 18:46:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1120/1251] eta 0:04:48 lr 0.000603 time 1.6934 (2.1995) loss 3.9797 (3.6459) grad_norm 1.4824 (1.3867) [2022-01-21 18:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1130/1251] eta 0:04:26 lr 0.000603 time 2.5768 (2.1997) loss 4.3438 (3.6473) grad_norm 1.6483 (1.3866) [2022-01-21 18:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1140/1251] eta 0:04:04 lr 0.000603 time 1.7834 (2.1992) loss 3.9721 (3.6469) grad_norm 1.5050 (1.3866) [2022-01-21 18:47:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1150/1251] eta 0:03:42 lr 0.000603 time 2.4929 (2.1993) loss 4.3512 (3.6458) grad_norm 1.3159 (1.3873) [2022-01-21 18:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1160/1251] eta 0:03:20 lr 0.000603 time 1.9004 (2.1995) loss 4.2172 (3.6460) grad_norm 1.3669 (1.3883) [2022-01-21 18:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1170/1251] eta 0:02:58 lr 0.000603 time 2.7524 (2.2009) loss 4.0313 (3.6468) grad_norm 1.5366 (1.3882) [2022-01-21 18:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1180/1251] eta 0:02:36 lr 0.000603 time 2.1885 (2.2003) loss 3.0777 (3.6467) grad_norm 1.4440 (1.3887) [2022-01-21 18:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1190/1251] eta 0:02:14 lr 0.000603 time 2.5005 (2.2005) loss 3.1199 (3.6458) grad_norm 1.1494 (1.3890) [2022-01-21 18:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1200/1251] eta 0:01:52 lr 0.000603 time 1.8624 (2.1985) loss 4.5213 (3.6493) grad_norm 1.4153 (1.3887) [2022-01-21 18:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1210/1251] eta 0:01:30 lr 0.000603 time 1.8801 (2.1973) loss 
2.4912 (3.6499) grad_norm 1.1373 (1.3886) [2022-01-21 18:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1220/1251] eta 0:01:08 lr 0.000603 time 2.1885 (2.1972) loss 3.3477 (3.6499) grad_norm 1.2916 (1.3885) [2022-01-21 18:50:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1230/1251] eta 0:00:46 lr 0.000603 time 2.1944 (2.1971) loss 4.1939 (3.6498) grad_norm 1.3226 (1.3883) [2022-01-21 18:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1240/1251] eta 0:00:24 lr 0.000603 time 1.6375 (2.1971) loss 2.9955 (3.6495) grad_norm 1.4577 (1.3881) [2022-01-21 18:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [130/300][1250/1251] eta 0:00:02 lr 0.000603 time 1.3117 (2.1927) loss 3.2579 (3.6515) grad_norm 1.6434 (1.3877) [2022-01-21 18:50:45 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 130 training takes 0:45:43 [2022-01-21 18:50:45 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_130 saving...... [2022-01-21 18:50:57 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_130 saved !!! 
[2022-01-21 18:51:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 13.200 (13.200) Loss 1.0646 (1.0646) Acc@1 76.074 (76.074) Acc@5 92.773 (92.773) [2022-01-21 18:51:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.625 (2.867) Loss 1.1011 (1.0576) Acc@1 74.121 (75.027) Acc@5 92.383 (92.889) [2022-01-21 18:51:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.622 (2.164) Loss 1.0794 (1.0550) Acc@1 74.609 (74.949) Acc@5 90.820 (92.885) [2022-01-21 18:52:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.995 (2.064) Loss 1.0468 (1.0540) Acc@1 74.902 (74.852) Acc@5 92.871 (92.893) [2022-01-21 18:52:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.255 (1.930) Loss 1.0173 (1.0566) Acc@1 75.977 (74.793) Acc@5 92.773 (92.871) [2022-01-21 18:52:24 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.964 Acc@5 92.858 [2022-01-21 18:52:24 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-01-21 18:52:24 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.16% [2022-01-21 18:52:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][0/1251] eta 7:21:32 lr 0.000603 time 21.1773 (21.1773) loss 3.5679 (3.5679) grad_norm 1.4463 (1.4463) [2022-01-21 18:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][10/1251] eta 1:24:44 lr 0.000603 time 2.1026 (4.0968) loss 3.9503 (3.7554) grad_norm 1.3785 (1.3749) [2022-01-21 18:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][20/1251] eta 1:05:16 lr 0.000603 time 1.5730 (3.1817) loss 2.8283 (3.5992) grad_norm 1.3026 (1.3718) [2022-01-21 18:53:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][30/1251] eta 0:57:42 lr 0.000603 time 1.4530 (2.8354) loss 2.6060 (3.6044) grad_norm 1.6553 (1.3894) [2022-01-21 18:54:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[131/300][40/1251] eta 0:55:47 lr 0.000603 time 5.8746 (2.7645) loss 4.5210 (3.6800) grad_norm 1.2355 (1.3887) [2022-01-21 18:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][50/1251] eta 0:54:10 lr 0.000603 time 2.4224 (2.7062) loss 4.3613 (3.6780) grad_norm 1.4781 (1.3969) [2022-01-21 18:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][60/1251] eta 0:51:30 lr 0.000603 time 1.7274 (2.5949) loss 4.3414 (3.6721) grad_norm 1.3800 (1.3878) [2022-01-21 18:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][70/1251] eta 0:49:56 lr 0.000603 time 1.7136 (2.5377) loss 3.8459 (3.6075) grad_norm 1.3851 (1.3870) [2022-01-21 18:55:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][80/1251] eta 0:48:39 lr 0.000603 time 2.6171 (2.4933) loss 3.8983 (3.6260) grad_norm 1.2969 (1.3866) [2022-01-21 18:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][90/1251] eta 0:47:19 lr 0.000602 time 1.7612 (2.4460) loss 4.0609 (3.6336) grad_norm 1.1969 (1.3895) [2022-01-21 18:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][100/1251] eta 0:46:02 lr 0.000602 time 1.8295 (2.4004) loss 3.8844 (3.6317) grad_norm 1.6086 (1.3937) [2022-01-21 18:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][110/1251] eta 0:45:05 lr 0.000602 time 2.1675 (2.3715) loss 3.9184 (3.6483) grad_norm 1.3429 (1.3870) [2022-01-21 18:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][120/1251] eta 0:44:22 lr 0.000602 time 1.8706 (2.3540) loss 3.6992 (3.6487) grad_norm 1.5465 (1.3878) [2022-01-21 18:57:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][130/1251] eta 0:43:59 lr 0.000602 time 2.4899 (2.3548) loss 3.8158 (3.6220) grad_norm 1.7635 (1.3888) [2022-01-21 18:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][140/1251] eta 0:43:15 lr 0.000602 time 2.1661 (2.3366) loss 4.4498 (3.6196) grad_norm 1.4096 
(1.3906) [2022-01-21 18:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][150/1251] eta 0:42:54 lr 0.000602 time 2.4815 (2.3384) loss 3.8057 (3.6086) grad_norm 1.2483 (1.3901) [2022-01-21 18:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][160/1251] eta 0:42:21 lr 0.000602 time 2.0652 (2.3297) loss 2.9010 (3.6012) grad_norm 1.2101 (1.3929) [2022-01-21 18:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][170/1251] eta 0:41:46 lr 0.000602 time 1.6643 (2.3185) loss 3.6193 (3.6046) grad_norm 1.3924 (1.3919) [2022-01-21 18:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][180/1251] eta 0:41:03 lr 0.000602 time 1.6120 (2.3002) loss 4.6148 (3.5991) grad_norm 1.4033 (1.3878) [2022-01-21 18:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][190/1251] eta 0:40:33 lr 0.000602 time 2.2857 (2.2934) loss 3.5109 (3.6070) grad_norm 1.3378 (1.3870) [2022-01-21 19:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][200/1251] eta 0:40:04 lr 0.000602 time 1.9638 (2.2882) loss 2.9993 (3.6039) grad_norm 1.2360 (1.3875) [2022-01-21 19:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][210/1251] eta 0:39:38 lr 0.000602 time 1.9244 (2.2850) loss 4.0947 (3.6058) grad_norm 1.3820 (1.3868) [2022-01-21 19:00:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][220/1251] eta 0:39:08 lr 0.000602 time 1.8687 (2.2774) loss 3.7057 (3.5975) grad_norm 1.4862 (1.3854) [2022-01-21 19:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][230/1251] eta 0:38:41 lr 0.000602 time 2.3032 (2.2734) loss 3.7879 (3.5907) grad_norm 1.5064 (1.3859) [2022-01-21 19:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][240/1251] eta 0:38:16 lr 0.000602 time 1.9509 (2.2712) loss 4.2896 (3.6006) grad_norm 1.2984 (1.3857) [2022-01-21 19:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[131/300][250/1251] eta 0:37:53 lr 0.000602 time 2.0252 (2.2715) loss 3.4114 (3.5949) grad_norm 1.4765 (1.3899) [2022-01-21 19:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][260/1251] eta 0:37:27 lr 0.000602 time 1.9474 (2.2678) loss 4.0929 (3.6050) grad_norm 1.1536 (1.3932) [2022-01-21 19:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][270/1251] eta 0:37:05 lr 0.000602 time 2.2208 (2.2683) loss 3.7274 (3.6032) grad_norm 1.1972 (1.3936) [2022-01-21 19:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][280/1251] eta 0:36:47 lr 0.000602 time 2.6701 (2.2735) loss 4.0900 (3.6114) grad_norm 1.4233 (1.3947) [2022-01-21 19:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][290/1251] eta 0:36:21 lr 0.000602 time 2.6213 (2.2699) loss 3.5289 (3.6179) grad_norm 1.6118 (1.3955) [2022-01-21 19:03:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][300/1251] eta 0:35:49 lr 0.000602 time 1.6504 (2.2606) loss 3.5601 (3.6166) grad_norm 1.4097 (1.3944) [2022-01-21 19:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][310/1251] eta 0:35:18 lr 0.000602 time 1.8589 (2.2514) loss 4.3746 (3.6215) grad_norm 1.4499 (1.3950) [2022-01-21 19:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][320/1251] eta 0:34:51 lr 0.000602 time 2.6420 (2.2468) loss 3.4426 (3.6141) grad_norm 1.3202 (1.3950) [2022-01-21 19:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][330/1251] eta 0:34:25 lr 0.000601 time 1.5682 (2.2423) loss 3.5357 (3.6146) grad_norm 1.2729 (1.3946) [2022-01-21 19:05:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][340/1251] eta 0:34:01 lr 0.000601 time 2.0394 (2.2410) loss 3.5303 (3.6069) grad_norm 1.3282 (1.3939) [2022-01-21 19:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][350/1251] eta 0:33:39 lr 0.000601 time 1.5722 (2.2410) loss 4.4660 (3.6092) grad_norm 
1.4760 (1.3929) [2022-01-21 19:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][360/1251] eta 0:33:28 lr 0.000601 time 2.7586 (2.2539) loss 3.6062 (3.6140) grad_norm 1.3401 (1.3915) [2022-01-21 19:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][370/1251] eta 0:33:05 lr 0.000601 time 2.1837 (2.2542) loss 3.8607 (3.6145) grad_norm 1.3638 (1.3912) [2022-01-21 19:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][380/1251] eta 0:32:40 lr 0.000601 time 2.4626 (2.2508) loss 3.4699 (3.6094) grad_norm 1.3678 (1.3925) [2022-01-21 19:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][390/1251] eta 0:32:11 lr 0.000601 time 1.5231 (2.2435) loss 3.6577 (3.6114) grad_norm 1.4592 (1.3957) [2022-01-21 19:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][400/1251] eta 0:31:49 lr 0.000601 time 2.1702 (2.2442) loss 3.2994 (3.6064) grad_norm 1.3223 (1.3969) [2022-01-21 19:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][410/1251] eta 0:31:23 lr 0.000601 time 1.5364 (2.2400) loss 3.5997 (3.6064) grad_norm 1.3800 (1.3969) [2022-01-21 19:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][420/1251] eta 0:31:00 lr 0.000601 time 1.8983 (2.2391) loss 3.9557 (3.6041) grad_norm 1.2698 (1.3954) [2022-01-21 19:08:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][430/1251] eta 0:30:37 lr 0.000601 time 1.9618 (2.2380) loss 3.8178 (3.6049) grad_norm 1.3206 (1.3943) [2022-01-21 19:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][440/1251] eta 0:30:18 lr 0.000601 time 2.2271 (2.2419) loss 3.8412 (3.6060) grad_norm 1.5814 (1.3943) [2022-01-21 19:09:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][450/1251] eta 0:29:54 lr 0.000601 time 1.6094 (2.2399) loss 3.6927 (3.5995) grad_norm 1.3985 (1.3961) [2022-01-21 19:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[131/300][460/1251] eta 0:29:28 lr 0.000601 time 1.8843 (2.2353) loss 3.4792 (3.5953) grad_norm 1.6243 (1.3957) [2022-01-21 19:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][470/1251] eta 0:29:04 lr 0.000601 time 2.1676 (2.2335) loss 3.9814 (3.5955) grad_norm 1.5206 (1.3970) [2022-01-21 19:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][480/1251] eta 0:28:46 lr 0.000601 time 2.3956 (2.2393) loss 3.7377 (3.5889) grad_norm 1.2979 (1.3964) [2022-01-21 19:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][490/1251] eta 0:28:22 lr 0.000601 time 2.1567 (2.2369) loss 2.9570 (3.5850) grad_norm 1.5354 (1.3974) [2022-01-21 19:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][500/1251] eta 0:27:58 lr 0.000601 time 1.9501 (2.2349) loss 4.3183 (3.5851) grad_norm 1.2883 (1.3975) [2022-01-21 19:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][510/1251] eta 0:27:33 lr 0.000601 time 1.8866 (2.2313) loss 4.1783 (3.5858) grad_norm 1.3978 (1.3977) [2022-01-21 19:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][520/1251] eta 0:27:09 lr 0.000601 time 1.8133 (2.2294) loss 3.9455 (3.5870) grad_norm 1.4102 (1.3998) [2022-01-21 19:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][530/1251] eta 0:26:45 lr 0.000601 time 1.9412 (2.2269) loss 4.0040 (3.5841) grad_norm 1.4905 (1.3999) [2022-01-21 19:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][540/1251] eta 0:26:19 lr 0.000601 time 1.9622 (2.2215) loss 4.0470 (3.5855) grad_norm 1.3618 (1.4016) [2022-01-21 19:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][550/1251] eta 0:25:56 lr 0.000601 time 1.9173 (2.2203) loss 4.1669 (3.5888) grad_norm 1.3838 (1.4001) [2022-01-21 19:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][560/1251] eta 0:25:35 lr 0.000601 time 3.0791 (2.2219) loss 2.5947 (3.5862) grad_norm 
1.1064 (1.4005) [2022-01-21 19:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][570/1251] eta 0:25:14 lr 0.000601 time 2.3980 (2.2242) loss 2.4964 (3.5865) grad_norm 1.5720 (1.4013) [2022-01-21 19:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][580/1251] eta 0:24:51 lr 0.000600 time 1.8968 (2.2235) loss 4.2501 (3.5924) grad_norm 1.3964 (1.4002) [2022-01-21 19:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][590/1251] eta 0:24:29 lr 0.000600 time 2.0106 (2.2234) loss 3.7558 (3.5908) grad_norm 1.5338 (1.3994) [2022-01-21 19:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][600/1251] eta 0:24:07 lr 0.000600 time 2.5624 (2.2240) loss 4.0337 (3.5938) grad_norm 1.4428 (1.4001) [2022-01-21 19:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][610/1251] eta 0:23:44 lr 0.000600 time 2.1840 (2.2228) loss 2.4685 (3.5946) grad_norm 1.3621 (1.3998) [2022-01-21 19:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][620/1251] eta 0:23:21 lr 0.000600 time 1.9321 (2.2207) loss 4.2310 (3.5963) grad_norm 1.3165 (1.4005) [2022-01-21 19:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][630/1251] eta 0:22:57 lr 0.000600 time 1.9010 (2.2177) loss 3.8299 (3.5943) grad_norm 1.5016 (1.4013) [2022-01-21 19:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][640/1251] eta 0:22:34 lr 0.000600 time 2.7873 (2.2161) loss 3.1331 (3.5948) grad_norm 1.1863 (1.3997) [2022-01-21 19:16:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][650/1251] eta 0:22:11 lr 0.000600 time 2.1855 (2.2154) loss 3.1834 (3.5929) grad_norm 1.3556 (1.3982) [2022-01-21 19:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][660/1251] eta 0:21:47 lr 0.000600 time 2.2337 (2.2128) loss 3.9026 (3.5944) grad_norm 1.2852 (1.3973) [2022-01-21 19:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[131/300][670/1251] eta 0:21:24 lr 0.000600 time 1.8450 (2.2108) loss 4.0153 (3.5938) grad_norm 1.6619 (1.3973) [2022-01-21 19:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][680/1251] eta 0:21:02 lr 0.000600 time 2.2207 (2.2105) loss 3.6630 (3.5957) grad_norm 1.3073 (1.3968) [2022-01-21 19:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][690/1251] eta 0:20:41 lr 0.000600 time 2.7704 (2.2122) loss 4.3891 (3.5956) grad_norm 1.2183 (1.3969) [2022-01-21 19:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][700/1251] eta 0:20:19 lr 0.000600 time 2.4403 (2.2124) loss 3.5534 (3.5964) grad_norm 1.4195 (1.3960) [2022-01-21 19:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][710/1251] eta 0:19:57 lr 0.000600 time 1.6545 (2.2129) loss 3.8108 (3.6007) grad_norm 1.3779 (1.3955) [2022-01-21 19:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][720/1251] eta 0:19:35 lr 0.000600 time 2.5145 (2.2131) loss 4.7000 (3.6018) grad_norm 1.4318 (1.3949) [2022-01-21 19:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][730/1251] eta 0:19:12 lr 0.000600 time 2.1442 (2.2121) loss 3.1878 (3.5968) grad_norm 1.2066 (1.3945) [2022-01-21 19:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][740/1251] eta 0:18:50 lr 0.000600 time 2.6190 (2.2121) loss 3.3944 (3.5980) grad_norm 1.5621 (1.3949) [2022-01-21 19:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][750/1251] eta 0:18:28 lr 0.000600 time 2.1083 (2.2127) loss 2.8731 (3.6007) grad_norm 1.5272 (1.3965) [2022-01-21 19:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][760/1251] eta 0:18:06 lr 0.000600 time 1.7157 (2.2134) loss 3.5127 (3.5997) grad_norm 2.0551 (1.3997) [2022-01-21 19:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][770/1251] eta 0:17:44 lr 0.000600 time 1.9217 (2.2123) loss 3.3373 (3.5949) grad_norm 
1.6010 (1.4003) [2022-01-21 19:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][780/1251] eta 0:17:21 lr 0.000600 time 1.9463 (2.2111) loss 4.3434 (3.5961) grad_norm 1.3032 (1.4016) [2022-01-21 19:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][790/1251] eta 0:16:58 lr 0.000600 time 2.0817 (2.2098) loss 4.2614 (3.5954) grad_norm 1.5627 (1.4024) [2022-01-21 19:21:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][800/1251] eta 0:16:36 lr 0.000600 time 1.9658 (2.2103) loss 3.7347 (3.5975) grad_norm 1.1881 (1.4013) [2022-01-21 19:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][810/1251] eta 0:16:14 lr 0.000600 time 2.1283 (2.2095) loss 2.6174 (3.5973) grad_norm 1.3929 (1.4011) [2022-01-21 19:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][820/1251] eta 0:15:51 lr 0.000600 time 1.7498 (2.2088) loss 3.4660 (3.5995) grad_norm 1.1704 (1.4014) [2022-01-21 19:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][830/1251] eta 0:15:30 lr 0.000599 time 1.8166 (2.2101) loss 3.7739 (3.6012) grad_norm 1.2749 (1.4009) [2022-01-21 19:23:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][840/1251] eta 0:15:09 lr 0.000599 time 2.8773 (2.2139) loss 3.4043 (3.6016) grad_norm 1.2416 (1.4011) [2022-01-21 19:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][850/1251] eta 0:14:47 lr 0.000599 time 1.8309 (2.2126) loss 4.1310 (3.6029) grad_norm 1.4756 (1.4011) [2022-01-21 19:24:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][860/1251] eta 0:14:24 lr 0.000599 time 2.4636 (2.2121) loss 4.0417 (3.6020) grad_norm 1.3108 (1.4009) [2022-01-21 19:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][870/1251] eta 0:14:01 lr 0.000599 time 1.6205 (2.2098) loss 3.2678 (3.6023) grad_norm 1.2942 (1.3996) [2022-01-21 19:24:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[131/300][880/1251] eta 0:13:39 lr 0.000599 time 1.9580 (2.2077) loss 4.6239 (3.6031) grad_norm 1.4848 (1.3991) [2022-01-21 19:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][890/1251] eta 0:13:16 lr 0.000599 time 1.9019 (2.2060) loss 2.5559 (3.6021) grad_norm 1.2095 (1.3988) [2022-01-21 19:25:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][900/1251] eta 0:12:54 lr 0.000599 time 2.5443 (2.2059) loss 4.2446 (3.6034) grad_norm 1.2637 (1.3986) [2022-01-21 19:25:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][910/1251] eta 0:12:32 lr 0.000599 time 2.7539 (2.2070) loss 3.6686 (3.6036) grad_norm 1.6173 (1.3997) [2022-01-21 19:26:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][920/1251] eta 0:12:10 lr 0.000599 time 2.0496 (2.2066) loss 4.2735 (3.6049) grad_norm 1.3622 (1.4003) [2022-01-21 19:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][930/1251] eta 0:11:47 lr 0.000599 time 1.4961 (2.2045) loss 3.9871 (3.6033) grad_norm 1.3084 (1.4003) [2022-01-21 19:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][940/1251] eta 0:11:25 lr 0.000599 time 2.3543 (2.2045) loss 3.2610 (3.6037) grad_norm 1.4395 (1.4002) [2022-01-21 19:27:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][950/1251] eta 0:11:03 lr 0.000599 time 2.5070 (2.2057) loss 4.0087 (3.6005) grad_norm 1.4258 (1.4003) [2022-01-21 19:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][960/1251] eta 0:10:41 lr 0.000599 time 2.2999 (2.2057) loss 4.4615 (3.6045) grad_norm 1.3650 (1.4000) [2022-01-21 19:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][970/1251] eta 0:10:20 lr 0.000599 time 1.7174 (2.2068) loss 3.6778 (3.6017) grad_norm 1.2785 (1.3992) [2022-01-21 19:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][980/1251] eta 0:09:58 lr 0.000599 time 2.2266 (2.2068) loss 2.7130 (3.6022) grad_norm 
1.4069 (1.3989) [2022-01-21 19:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][990/1251] eta 0:09:35 lr 0.000599 time 1.8428 (2.2050) loss 3.7323 (3.6004) grad_norm 1.5605 (1.3986) [2022-01-21 19:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1000/1251] eta 0:09:13 lr 0.000599 time 2.8307 (2.2035) loss 3.1739 (3.5994) grad_norm 1.3239 (1.3989) [2022-01-21 19:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1010/1251] eta 0:08:50 lr 0.000599 time 1.9093 (2.2013) loss 3.1782 (3.5990) grad_norm 1.3541 (1.3995) [2022-01-21 19:29:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1020/1251] eta 0:08:28 lr 0.000599 time 2.8668 (2.2008) loss 3.8561 (3.6005) grad_norm 1.9749 (1.4004) [2022-01-21 19:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1030/1251] eta 0:08:06 lr 0.000599 time 2.0322 (2.2022) loss 4.1036 (3.6021) grad_norm 1.3004 (1.4006) [2022-01-21 19:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1040/1251] eta 0:07:45 lr 0.000599 time 2.3242 (2.2038) loss 3.6301 (3.6028) grad_norm 1.3374 (1.4013) [2022-01-21 19:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1050/1251] eta 0:07:22 lr 0.000599 time 1.7792 (2.2031) loss 4.0626 (3.6047) grad_norm 1.7570 (1.4011) [2022-01-21 19:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1060/1251] eta 0:07:00 lr 0.000599 time 3.1747 (2.2035) loss 4.1481 (3.6060) grad_norm 1.2646 (1.4003) [2022-01-21 19:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1070/1251] eta 0:06:38 lr 0.000598 time 2.2435 (2.2021) loss 3.9743 (3.6062) grad_norm 1.3411 (1.3998) [2022-01-21 19:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1080/1251] eta 0:06:16 lr 0.000598 time 1.8992 (2.2003) loss 3.6935 (3.6073) grad_norm 1.4646 (1.4000) [2022-01-21 19:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [131/300][1090/1251] eta 0:05:53 lr 0.000598 time 1.7917 (2.1985) loss 3.3335 (3.6061) grad_norm 1.3619 (1.3993) [2022-01-21 19:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1100/1251] eta 0:05:32 lr 0.000598 time 2.5611 (2.1990) loss 3.9988 (3.6093) grad_norm 1.3466 (1.3989) [2022-01-21 19:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1110/1251] eta 0:05:10 lr 0.000598 time 2.5274 (2.1994) loss 3.3117 (3.6110) grad_norm 1.3309 (1.3984) [2022-01-21 19:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1120/1251] eta 0:04:48 lr 0.000598 time 2.5728 (2.1987) loss 3.8047 (3.6127) grad_norm 1.2272 (1.3979) [2022-01-21 19:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1130/1251] eta 0:04:25 lr 0.000598 time 1.5596 (2.1979) loss 2.9409 (3.6098) grad_norm 1.3623 (1.3979) [2022-01-21 19:34:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1140/1251] eta 0:04:04 lr 0.000598 time 3.7666 (2.1983) loss 3.8818 (3.6090) grad_norm 1.3564 (1.3972) [2022-01-21 19:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1150/1251] eta 0:03:42 lr 0.000598 time 2.8709 (2.1989) loss 3.8145 (3.6088) grad_norm 1.3411 (1.3970) [2022-01-21 19:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1160/1251] eta 0:03:20 lr 0.000598 time 2.4509 (2.1998) loss 3.9445 (3.6086) grad_norm 1.3863 (1.3966) [2022-01-21 19:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1170/1251] eta 0:02:58 lr 0.000598 time 1.9291 (2.2004) loss 3.8691 (3.6105) grad_norm 1.4668 (1.3968) [2022-01-21 19:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1180/1251] eta 0:02:36 lr 0.000598 time 2.6995 (2.2000) loss 4.0082 (3.6120) grad_norm 1.7224 (1.3975) [2022-01-21 19:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1190/1251] eta 0:02:14 lr 0.000598 time 2.7081 (2.2007) loss 4.3987 
(3.6135) grad_norm 1.3101 (1.3978) [2022-01-21 19:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1200/1251] eta 0:01:52 lr 0.000598 time 2.7402 (2.2013) loss 2.8878 (3.6137) grad_norm 1.2507 (1.3972) [2022-01-21 19:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1210/1251] eta 0:01:30 lr 0.000598 time 1.8969 (2.2002) loss 4.1295 (3.6140) grad_norm 1.4724 (1.3970) [2022-01-21 19:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1220/1251] eta 0:01:08 lr 0.000598 time 2.1258 (2.1997) loss 4.1731 (3.6142) grad_norm 1.3547 (1.3967) [2022-01-21 19:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1230/1251] eta 0:00:46 lr 0.000598 time 2.0648 (2.1979) loss 3.7602 (3.6139) grad_norm 1.3260 (1.3969) [2022-01-21 19:37:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1240/1251] eta 0:00:24 lr 0.000598 time 1.5090 (2.1964) loss 3.8059 (3.6138) grad_norm 1.4071 (1.3967) [2022-01-21 19:38:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [131/300][1250/1251] eta 0:00:02 lr 0.000598 time 1.3229 (2.1906) loss 2.3602 (3.6129) grad_norm 1.2769 (1.3965) [2022-01-21 19:38:05 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 131 training takes 0:45:40 [2022-01-21 19:38:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.889 (18.889) Loss 1.0969 (1.0969) Acc@1 75.391 (75.391) Acc@5 91.797 (91.797) [2022-01-21 19:38:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.574 (3.364) Loss 1.0691 (1.0767) Acc@1 74.512 (75.133) Acc@5 92.578 (92.525) [2022-01-21 19:38:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.938 (2.524) Loss 1.0159 (1.0735) Acc@1 76.367 (75.195) Acc@5 93.555 (92.569) [2022-01-21 19:39:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.677 (2.257) Loss 1.1487 (1.0754) Acc@1 74.219 (75.284) Acc@5 91.504 (92.550) [2022-01-21 19:39:34 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.176 (2.171) Loss 1.1413 (1.0746) Acc@1 73.535 (75.162) Acc@5 92.090 (92.573) [2022-01-21 19:39:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.150 Acc@5 92.704 [2022-01-21 19:39:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.1% [2022-01-21 19:39:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.16% [2022-01-21 19:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][0/1251] eta 7:32:41 lr 0.000598 time 21.7121 (21.7121) loss 4.2605 (4.2605) grad_norm 1.4450 (1.4450) [2022-01-21 19:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][10/1251] eta 1:24:08 lr 0.000598 time 2.5524 (4.0680) loss 3.5729 (3.2561) grad_norm 1.3653 (1.4173) [2022-01-21 19:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][20/1251] eta 1:05:40 lr 0.000598 time 1.5308 (3.2015) loss 3.1260 (3.4601) grad_norm 1.3469 (1.4131) [2022-01-21 19:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][30/1251] eta 0:57:57 lr 0.000598 time 1.5777 (2.8479) loss 4.0777 (3.4855) grad_norm 1.2612 (1.4031) [2022-01-21 19:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][40/1251] eta 0:55:50 lr 0.000598 time 3.9169 (2.7668) loss 3.7799 (3.5332) grad_norm 1.2096 (1.3879) [2022-01-21 19:41:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][50/1251] eta 0:53:32 lr 0.000598 time 3.0865 (2.6752) loss 4.2410 (3.5400) grad_norm 1.6623 (1.3869) [2022-01-21 19:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][60/1251] eta 0:53:05 lr 0.000598 time 2.9079 (2.6748) loss 3.9335 (3.5550) grad_norm 1.3507 (1.3859) [2022-01-21 19:42:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][70/1251] eta 0:51:51 lr 0.000597 time 1.5001 (2.6344) loss 2.7378 (3.5498) grad_norm 1.5160 (1.4028) [2022-01-21 19:43:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][80/1251] eta 0:50:28 lr 0.000597 time 2.2800 (2.5860) loss 4.2303 (3.5803) grad_norm 1.4786 (1.4073) [2022-01-21 19:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][90/1251] eta 0:49:00 lr 0.000597 time 2.2389 (2.5327) loss 3.2167 (3.5583) grad_norm 1.3573 (1.4054) [2022-01-21 19:43:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][100/1251] eta 0:47:28 lr 0.000597 time 1.7892 (2.4745) loss 2.6876 (3.5437) grad_norm 1.7789 (1.4027) [2022-01-21 19:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][110/1251] eta 0:46:12 lr 0.000597 time 2.2645 (2.4295) loss 2.8930 (3.5375) grad_norm 1.2986 (1.3967) [2022-01-21 19:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][120/1251] eta 0:45:12 lr 0.000597 time 1.9433 (2.3979) loss 3.1986 (3.5377) grad_norm 1.4297 (1.3917) [2022-01-21 19:44:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][130/1251] eta 0:44:19 lr 0.000597 time 1.9821 (2.3724) loss 3.8752 (3.5555) grad_norm 1.5542 (1.3938) [2022-01-21 19:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][140/1251] eta 0:43:39 lr 0.000597 time 1.5558 (2.3580) loss 3.9631 (3.5460) grad_norm 1.4799 (1.3919) [2022-01-21 19:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][150/1251] eta 0:43:06 lr 0.000597 time 1.9293 (2.3491) loss 2.6163 (3.5537) grad_norm 1.3872 (1.3980) [2022-01-21 19:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][160/1251] eta 0:42:29 lr 0.000597 time 1.8199 (2.3372) loss 3.5855 (3.5510) grad_norm 1.3423 (1.3992) [2022-01-21 19:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][170/1251] eta 0:42:15 lr 0.000597 time 3.2001 (2.3451) loss 3.2083 (3.5562) grad_norm 1.4874 (1.3982) [2022-01-21 19:46:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][180/1251] eta 0:41:42 lr 0.000597 
time 1.9186 (2.3365) loss 3.8411 (3.5595) grad_norm 1.4073 (1.3959) [2022-01-21 19:47:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][190/1251] eta 0:41:11 lr 0.000597 time 2.3077 (2.3290) loss 4.0679 (3.5733) grad_norm 1.3573 (1.3961) [2022-01-21 19:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][200/1251] eta 0:40:36 lr 0.000597 time 1.8918 (2.3181) loss 2.7328 (3.5764) grad_norm 1.4485 (1.3956) [2022-01-21 19:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][210/1251] eta 0:40:09 lr 0.000597 time 2.5665 (2.3142) loss 3.3951 (3.5661) grad_norm 1.3892 (1.3963) [2022-01-21 19:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][220/1251] eta 0:39:43 lr 0.000597 time 2.5454 (2.3116) loss 4.1615 (3.5732) grad_norm 1.3698 (1.3926) [2022-01-21 19:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][230/1251] eta 0:39:28 lr 0.000597 time 3.2335 (2.3197) loss 2.7471 (3.5759) grad_norm 1.4546 (1.3937) [2022-01-21 19:48:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][240/1251] eta 0:38:56 lr 0.000597 time 1.5556 (2.3109) loss 3.9786 (3.5833) grad_norm 1.2180 (1.3931) [2022-01-21 19:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][250/1251] eta 0:38:26 lr 0.000597 time 1.8603 (2.3040) loss 3.8340 (3.5813) grad_norm 1.3936 (1.3946) [2022-01-21 19:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][260/1251] eta 0:37:49 lr 0.000597 time 1.9389 (2.2901) loss 3.4245 (3.5794) grad_norm 1.4387 (1.3971) [2022-01-21 19:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][270/1251] eta 0:37:33 lr 0.000597 time 2.4732 (2.2972) loss 4.0453 (3.5849) grad_norm 1.3098 (1.3960) [2022-01-21 19:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][280/1251] eta 0:37:11 lr 0.000597 time 1.6686 (2.2979) loss 4.1339 (3.5947) grad_norm 1.2172 (1.3936) [2022-01-21 19:50:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][290/1251] eta 0:36:42 lr 0.000597 time 1.7082 (2.2922) loss 3.1748 (3.5915) grad_norm 1.2204 (1.3908) [2022-01-21 19:51:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][300/1251] eta 0:36:12 lr 0.000597 time 1.8638 (2.2843) loss 3.2263 (3.5864) grad_norm 1.2439 (1.3877) [2022-01-21 19:51:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][310/1251] eta 0:35:44 lr 0.000596 time 1.9699 (2.2788) loss 2.7239 (3.5801) grad_norm 1.2746 (1.3852) [2022-01-21 19:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][320/1251] eta 0:35:18 lr 0.000596 time 1.5726 (2.2753) loss 4.2606 (3.5835) grad_norm 1.2334 (1.3860) [2022-01-21 19:52:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][330/1251] eta 0:34:56 lr 0.000596 time 1.9743 (2.2759) loss 4.0991 (3.5893) grad_norm 1.2990 (1.3850) [2022-01-21 19:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][340/1251] eta 0:34:33 lr 0.000596 time 1.5398 (2.2759) loss 3.6817 (3.5973) grad_norm 1.2838 (1.3865) [2022-01-21 19:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][350/1251] eta 0:34:07 lr 0.000596 time 1.6540 (2.2727) loss 4.4187 (3.6007) grad_norm 1.4580 (1.3866) [2022-01-21 19:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][360/1251] eta 0:33:38 lr 0.000596 time 1.6172 (2.2654) loss 3.3582 (3.5949) grad_norm 1.4202 (1.3878) [2022-01-21 19:53:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][370/1251] eta 0:33:20 lr 0.000596 time 1.5906 (2.2703) loss 4.0646 (3.5990) grad_norm 1.3452 (1.3872) [2022-01-21 19:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][380/1251] eta 0:32:57 lr 0.000596 time 1.8652 (2.2709) loss 3.0536 (3.5994) grad_norm 1.4408 (1.3893) [2022-01-21 19:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][390/1251] eta 0:32:34 lr 
0.000596 time 1.8765 (2.2702) loss 3.7642 (3.6096) grad_norm 1.6383 (1.3921) [2022-01-21 19:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][400/1251] eta 0:32:06 lr 0.000596 time 1.6849 (2.2632) loss 4.5508 (3.6124) grad_norm 1.3996 (1.3946) [2022-01-21 19:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][410/1251] eta 0:31:42 lr 0.000596 time 2.2145 (2.2627) loss 3.4216 (3.6071) grad_norm 1.3041 (1.3954) [2022-01-21 19:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][420/1251] eta 0:31:20 lr 0.000596 time 2.8337 (2.2628) loss 3.6329 (3.6080) grad_norm 1.5510 (1.3978) [2022-01-21 19:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][430/1251] eta 0:30:57 lr 0.000596 time 1.8018 (2.2629) loss 3.6694 (3.6049) grad_norm 1.3750 (1.3984) [2022-01-21 19:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][440/1251] eta 0:30:32 lr 0.000596 time 1.5848 (2.2601) loss 3.4439 (3.6040) grad_norm 1.7296 (1.4011) [2022-01-21 19:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][450/1251] eta 0:30:09 lr 0.000596 time 1.5615 (2.2592) loss 3.9156 (3.6070) grad_norm 1.3611 (1.4021) [2022-01-21 19:57:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][460/1251] eta 0:29:42 lr 0.000596 time 1.7386 (2.2540) loss 3.5691 (3.6089) grad_norm 1.3643 (1.4038) [2022-01-21 19:57:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][470/1251] eta 0:29:19 lr 0.000596 time 2.0154 (2.2528) loss 3.8886 (3.6099) grad_norm 1.2636 (1.4025) [2022-01-21 19:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][480/1251] eta 0:28:55 lr 0.000596 time 1.8818 (2.2508) loss 4.0725 (3.6099) grad_norm 1.2985 (1.4028) [2022-01-21 19:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][490/1251] eta 0:28:37 lr 0.000596 time 1.9005 (2.2566) loss 3.5330 (3.6082) grad_norm 1.4886 (1.4029) [2022-01-21 19:58:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][500/1251] eta 0:28:13 lr 0.000596 time 1.9912 (2.2554) loss 3.7757 (3.6083) grad_norm 1.1869 (1.4021) [2022-01-21 19:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][510/1251] eta 0:27:46 lr 0.000596 time 1.9175 (2.2487) loss 3.9075 (3.6091) grad_norm 1.3682 (1.4019) [2022-01-21 19:59:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][520/1251] eta 0:27:20 lr 0.000596 time 1.8398 (2.2437) loss 3.5277 (3.6118) grad_norm 1.4156 (1.4012) [2022-01-21 19:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][530/1251] eta 0:27:01 lr 0.000596 time 2.6822 (2.2489) loss 3.7902 (3.6188) grad_norm 1.3139 (1.3994) [2022-01-21 19:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][540/1251] eta 0:26:38 lr 0.000596 time 2.2290 (2.2489) loss 4.3154 (3.6194) grad_norm 1.2924 (1.3987) [2022-01-21 20:00:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][550/1251] eta 0:26:15 lr 0.000596 time 2.1361 (2.2474) loss 3.7091 (3.6212) grad_norm 1.4630 (1.3999) [2022-01-21 20:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][560/1251] eta 0:25:50 lr 0.000595 time 1.8978 (2.2441) loss 3.6428 (3.6236) grad_norm 1.2771 (1.3993) [2022-01-21 20:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][570/1251] eta 0:25:29 lr 0.000595 time 1.7868 (2.2453) loss 4.0569 (3.6266) grad_norm 1.6241 (1.3990) [2022-01-21 20:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][580/1251] eta 0:25:06 lr 0.000595 time 1.9567 (2.2455) loss 2.3940 (3.6215) grad_norm 1.4695 (1.3985) [2022-01-21 20:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][590/1251] eta 0:24:43 lr 0.000595 time 2.2084 (2.2440) loss 4.3246 (3.6228) grad_norm 1.5348 (1.3993) [2022-01-21 20:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][600/1251] eta 0:24:20 lr 
0.000595 time 1.8105 (2.2428) loss 3.8132 (3.6206) grad_norm 1.1972 (1.3993) [2022-01-21 20:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][610/1251] eta 0:23:54 lr 0.000595 time 1.5331 (2.2383) loss 4.3044 (3.6209) grad_norm 1.6163 (1.4012) [2022-01-21 20:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][620/1251] eta 0:23:31 lr 0.000595 time 2.1744 (2.2369) loss 2.9613 (3.6203) grad_norm 1.6346 (1.4013) [2022-01-21 20:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][630/1251] eta 0:23:07 lr 0.000595 time 2.2550 (2.2348) loss 3.1294 (3.6171) grad_norm 1.4989 (1.4028) [2022-01-21 20:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][640/1251] eta 0:22:44 lr 0.000595 time 1.5837 (2.2329) loss 3.7808 (3.6208) grad_norm 1.4277 (1.4025) [2022-01-21 20:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][650/1251] eta 0:22:21 lr 0.000595 time 1.6957 (2.2319) loss 4.4320 (3.6194) grad_norm 1.4910 (1.4036) [2022-01-21 20:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][660/1251] eta 0:21:59 lr 0.000595 time 3.0780 (2.2323) loss 3.6138 (3.6195) grad_norm 1.5138 (1.4058) [2022-01-21 20:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][670/1251] eta 0:21:37 lr 0.000595 time 3.4270 (2.2333) loss 3.2733 (3.6107) grad_norm 1.3668 (1.4052) [2022-01-21 20:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][680/1251] eta 0:21:13 lr 0.000595 time 1.8792 (2.2309) loss 4.2287 (3.6104) grad_norm 1.4933 (1.4059) [2022-01-21 20:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][690/1251] eta 0:20:50 lr 0.000595 time 1.8328 (2.2294) loss 3.5291 (3.6086) grad_norm 1.3748 (1.4054) [2022-01-21 20:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][700/1251] eta 0:20:28 lr 0.000595 time 3.5850 (2.2297) loss 3.6847 (3.6080) grad_norm 1.3608 (1.4051) [2022-01-21 20:06:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][710/1251] eta 0:20:07 lr 0.000595 time 2.7796 (2.2321) loss 3.2366 (3.6083) grad_norm 1.2383 (1.4043) [2022-01-21 20:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][720/1251] eta 0:19:44 lr 0.000595 time 1.8047 (2.2312) loss 4.0342 (3.6082) grad_norm 1.1200 (1.4037) [2022-01-21 20:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][730/1251] eta 0:19:22 lr 0.000595 time 1.9324 (2.2310) loss 3.8592 (3.6097) grad_norm 1.4828 (1.4043) [2022-01-21 20:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][740/1251] eta 0:19:01 lr 0.000595 time 4.0027 (2.2329) loss 4.3792 (3.6112) grad_norm 1.5148 (1.4045) [2022-01-21 20:07:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][750/1251] eta 0:18:38 lr 0.000595 time 1.9119 (2.2327) loss 3.3433 (3.6080) grad_norm 1.2103 (1.4039) [2022-01-21 20:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][760/1251] eta 0:18:14 lr 0.000595 time 1.5788 (2.2295) loss 3.2278 (3.6092) grad_norm 1.5480 (1.4045) [2022-01-21 20:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][770/1251] eta 0:17:51 lr 0.000595 time 2.2057 (2.2272) loss 4.0661 (3.6069) grad_norm 1.3678 (1.4045) [2022-01-21 20:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][780/1251] eta 0:17:28 lr 0.000595 time 2.8536 (2.2262) loss 4.2093 (3.6067) grad_norm 1.5630 (1.4059) [2022-01-21 20:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][790/1251] eta 0:17:05 lr 0.000595 time 2.7285 (2.2252) loss 3.4967 (3.6067) grad_norm 1.3103 (1.4066) [2022-01-21 20:09:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][800/1251] eta 0:16:43 lr 0.000594 time 1.7301 (2.2242) loss 2.5681 (3.6078) grad_norm 1.3487 (1.4069) [2022-01-21 20:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][810/1251] eta 0:16:20 lr 
0.000594 time 2.1292 (2.2237) loss 4.2163 (3.6107) grad_norm 1.2537 (1.4071) [2022-01-21 20:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][820/1251] eta 0:15:59 lr 0.000594 time 3.6134 (2.2270) loss 3.1863 (3.6133) grad_norm 1.3968 (1.4075) [2022-01-21 20:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][830/1251] eta 0:15:38 lr 0.000594 time 2.8001 (2.2284) loss 3.5592 (3.6104) grad_norm 1.4985 (1.4084) [2022-01-21 20:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][840/1251] eta 0:15:15 lr 0.000594 time 1.8530 (2.2275) loss 4.1201 (3.6113) grad_norm 1.3145 (1.4090) [2022-01-21 20:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][850/1251] eta 0:14:51 lr 0.000594 time 1.7161 (2.2242) loss 3.9541 (3.6098) grad_norm 1.6467 (1.4091) [2022-01-21 20:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][860/1251] eta 0:14:28 lr 0.000594 time 2.8106 (2.2219) loss 3.7662 (3.6084) grad_norm 1.3556 (1.4087) [2022-01-21 20:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][870/1251] eta 0:14:05 lr 0.000594 time 2.0787 (2.2204) loss 3.2338 (3.6059) grad_norm 1.3036 (1.4080) [2022-01-21 20:12:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][880/1251] eta 0:13:44 lr 0.000594 time 2.2610 (2.2212) loss 4.2220 (3.6003) grad_norm 1.3473 (1.4074) [2022-01-21 20:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][890/1251] eta 0:13:21 lr 0.000594 time 2.2283 (2.2214) loss 3.2857 (3.6037) grad_norm 1.3239 (1.4068) [2022-01-21 20:13:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][900/1251] eta 0:12:59 lr 0.000594 time 2.2484 (2.2210) loss 3.6717 (3.6043) grad_norm 1.2131 (1.4062) [2022-01-21 20:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][910/1251] eta 0:12:37 lr 0.000594 time 1.7530 (2.2227) loss 3.6240 (3.6051) grad_norm 1.3424 (1.4054) [2022-01-21 20:13:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][920/1251] eta 0:12:15 lr 0.000594 time 1.5804 (2.2224) loss 3.6624 (3.6074) grad_norm 1.7412 (1.4062) [2022-01-21 20:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][930/1251] eta 0:11:53 lr 0.000594 time 1.8595 (2.2227) loss 4.0740 (3.6065) grad_norm 1.3203 (1.4054) [2022-01-21 20:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][940/1251] eta 0:11:31 lr 0.000594 time 2.9307 (2.2231) loss 4.3125 (3.6091) grad_norm 1.5525 (1.4050) [2022-01-21 20:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][950/1251] eta 0:11:08 lr 0.000594 time 1.8170 (2.2217) loss 3.5013 (3.6108) grad_norm 1.5222 (1.4045) [2022-01-21 20:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][960/1251] eta 0:10:46 lr 0.000594 time 2.2637 (2.2211) loss 2.8689 (3.6108) grad_norm 1.1947 (1.4043) [2022-01-21 20:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][970/1251] eta 0:10:23 lr 0.000594 time 1.6092 (2.2191) loss 3.8666 (3.6113) grad_norm 1.4818 (1.4036) [2022-01-21 20:15:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][980/1251] eta 0:10:01 lr 0.000594 time 2.2761 (2.2193) loss 2.5849 (3.6099) grad_norm 1.2597 (1.4038) [2022-01-21 20:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][990/1251] eta 0:09:39 lr 0.000594 time 1.7560 (2.2200) loss 4.0177 (3.6106) grad_norm 1.3512 (1.4041) [2022-01-21 20:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1000/1251] eta 0:09:16 lr 0.000594 time 1.9230 (2.2183) loss 3.6073 (3.6101) grad_norm 1.1434 (1.4032) [2022-01-21 20:17:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1010/1251] eta 0:08:54 lr 0.000594 time 2.5778 (2.2194) loss 2.6616 (3.6090) grad_norm 1.4877 (1.4025) [2022-01-21 20:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1020/1251] eta 0:08:32 lr 
0.000594 time 1.6503 (2.2196) loss 3.1355 (3.6094) grad_norm 1.4370 (1.4022) [2022-01-21 20:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1030/1251] eta 0:08:10 lr 0.000594 time 2.1445 (2.2194) loss 3.5113 (3.6105) grad_norm 1.3965 (1.4029) [2022-01-21 20:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1040/1251] eta 0:07:48 lr 0.000594 time 1.9791 (2.2195) loss 4.3616 (3.6098) grad_norm 1.6194 (1.4025) [2022-01-21 20:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1050/1251] eta 0:07:25 lr 0.000593 time 1.9758 (2.2183) loss 4.3867 (3.6108) grad_norm 1.3236 (1.4025) [2022-01-21 20:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1060/1251] eta 0:07:03 lr 0.000593 time 1.6977 (2.2162) loss 3.7217 (3.6132) grad_norm 1.4030 (1.4026) [2022-01-21 20:19:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1070/1251] eta 0:06:40 lr 0.000593 time 1.8213 (2.2147) loss 4.1521 (3.6146) grad_norm 1.2648 (1.4027) [2022-01-21 20:19:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1080/1251] eta 0:06:18 lr 0.000593 time 1.9287 (2.2136) loss 3.1204 (3.6132) grad_norm 1.7430 (1.4027) [2022-01-21 20:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1090/1251] eta 0:05:56 lr 0.000593 time 2.8061 (2.2152) loss 3.1248 (3.6120) grad_norm 1.2716 (1.4022) [2022-01-21 20:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1100/1251] eta 0:05:34 lr 0.000593 time 2.9044 (2.2156) loss 2.7225 (3.6108) grad_norm 1.1650 (1.4012) [2022-01-21 20:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1110/1251] eta 0:05:12 lr 0.000593 time 1.6798 (2.2158) loss 3.8531 (3.6104) grad_norm 1.3472 (1.4013) [2022-01-21 20:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1120/1251] eta 0:04:50 lr 0.000593 time 1.6140 (2.2151) loss 2.8649 (3.6113) grad_norm 1.6563 (1.4012) [2022-01-21 
20:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1130/1251] eta 0:04:28 lr 0.000593 time 2.2241 (2.2157) loss 3.6838 (3.6107) grad_norm 1.3712 (1.4019) [2022-01-21 20:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1140/1251] eta 0:04:05 lr 0.000593 time 1.6596 (2.2140) loss 3.5546 (3.6119) grad_norm 1.3571 (1.4022) [2022-01-21 20:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1150/1251] eta 0:03:43 lr 0.000593 time 2.5625 (2.2154) loss 3.9997 (3.6130) grad_norm 1.5141 (1.4021) [2022-01-21 20:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1160/1251] eta 0:03:21 lr 0.000593 time 1.8987 (2.2134) loss 3.1667 (3.6140) grad_norm 1.3200 (1.4029) [2022-01-21 20:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1170/1251] eta 0:02:59 lr 0.000593 time 2.1921 (2.2127) loss 3.7429 (3.6172) grad_norm 1.3897 (1.4024) [2022-01-21 20:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1180/1251] eta 0:02:37 lr 0.000593 time 2.2350 (2.2132) loss 3.7401 (3.6189) grad_norm 1.3989 (1.4021) [2022-01-21 20:23:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1190/1251] eta 0:02:14 lr 0.000593 time 2.5265 (2.2129) loss 2.8816 (3.6172) grad_norm 1.3586 (1.4023) [2022-01-21 20:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1200/1251] eta 0:01:52 lr 0.000593 time 2.0950 (2.2132) loss 3.2580 (3.6184) grad_norm 1.2854 (1.4026) [2022-01-21 20:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1210/1251] eta 0:01:30 lr 0.000593 time 2.4583 (2.2136) loss 3.3793 (3.6184) grad_norm 1.1956 (1.4026) [2022-01-21 20:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1220/1251] eta 0:01:08 lr 0.000593 time 2.4067 (2.2134) loss 3.9507 (3.6167) grad_norm 1.5379 (1.4026) [2022-01-21 20:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1230/1251] 
eta 0:00:46 lr 0.000593 time 1.9781 (2.2124) loss 4.0953 (3.6158) grad_norm 1.2987 (1.4022) [2022-01-21 20:25:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1240/1251] eta 0:00:24 lr 0.000593 time 1.7301 (2.2112) loss 3.6411 (3.6150) grad_norm 1.3156 (1.4021) [2022-01-21 20:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [132/300][1250/1251] eta 0:00:02 lr 0.000593 time 1.3276 (2.2063) loss 3.9100 (3.6142) grad_norm 1.5049 (1.4024) [2022-01-21 20:25:41 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 132 training takes 0:46:00 [2022-01-21 20:26:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.473 (18.473) Loss 1.0796 (1.0796) Acc@1 76.270 (76.270) Acc@5 92.285 (92.285) [2022-01-21 20:26:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.316 (3.469) Loss 1.0659 (1.0603) Acc@1 75.098 (75.053) Acc@5 93.457 (92.809) [2022-01-21 20:26:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.556 (2.652) Loss 1.0276 (1.0474) Acc@1 75.000 (74.991) Acc@5 92.676 (93.043) [2022-01-21 20:26:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.277 (2.313) Loss 0.9794 (1.0545) Acc@1 77.637 (74.924) Acc@5 94.043 (92.988) [2022-01-21 20:27:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.276 (2.231) Loss 1.0809 (1.0513) Acc@1 74.316 (75.062) Acc@5 92.871 (93.040) [2022-01-21 20:27:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 74.988 Acc@5 93.020 [2022-01-21 20:27:20 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.0% [2022-01-21 20:27:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.16% [2022-01-21 20:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][0/1251] eta 7:20:09 lr 0.000593 time 21.1107 (21.1107) loss 4.0387 (4.0387) grad_norm 1.5958 (1.5958) [2022-01-21 20:28:06 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [133/300][10/1251] eta 1:25:30 lr 0.000593 time 1.8458 (4.1341) loss 4.1758 (3.7219) grad_norm 1.5215 (1.5284) [2022-01-21 20:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][20/1251] eta 1:05:18 lr 0.000593 time 2.1813 (3.1829) loss 3.1051 (3.5573) grad_norm 1.5283 (1.4835) [2022-01-21 20:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][30/1251] eta 0:56:38 lr 0.000593 time 1.5474 (2.7830) loss 2.4935 (3.4992) grad_norm 1.3982 (1.4585) [2022-01-21 20:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][40/1251] eta 0:53:14 lr 0.000592 time 2.9841 (2.6382) loss 3.8633 (3.5344) grad_norm 1.3729 (1.4524) [2022-01-21 20:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][50/1251] eta 0:51:24 lr 0.000592 time 1.5448 (2.5683) loss 3.9954 (3.5435) grad_norm 1.3007 (1.4306) [2022-01-21 20:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][60/1251] eta 0:49:28 lr 0.000592 time 1.4763 (2.4921) loss 3.6481 (3.5500) grad_norm 1.3614 (1.4174) [2022-01-21 20:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][70/1251] eta 0:48:01 lr 0.000592 time 1.9117 (2.4399) loss 3.9759 (3.5391) grad_norm 1.3075 (1.4105) [2022-01-21 20:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][80/1251] eta 0:47:33 lr 0.000592 time 3.3961 (2.4368) loss 3.9890 (3.5432) grad_norm 1.8054 (1.4109) [2022-01-21 20:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][90/1251] eta 0:46:54 lr 0.000592 time 1.5712 (2.4246) loss 3.3382 (3.5595) grad_norm 1.2797 (1.4145) [2022-01-21 20:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][100/1251] eta 0:46:07 lr 0.000592 time 2.2058 (2.4046) loss 4.5022 (3.5661) grad_norm 1.6016 (1.4132) [2022-01-21 20:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][110/1251] eta 0:45:17 lr 0.000592 time 1.7119 (2.3820) loss 2.9084 (3.5709) grad_norm 
1.5076 (1.4158) [2022-01-21 20:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][120/1251] eta 0:44:37 lr 0.000592 time 3.0421 (2.3671) loss 3.8641 (3.5830) grad_norm 1.3676 (1.4203) [2022-01-21 20:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][130/1251] eta 0:43:55 lr 0.000592 time 1.8457 (2.3509) loss 4.0114 (3.5822) grad_norm 1.3506 (1.4192) [2022-01-21 20:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][140/1251] eta 0:43:17 lr 0.000592 time 2.3894 (2.3384) loss 3.8938 (3.5812) grad_norm 1.2781 (1.4115) [2022-01-21 20:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][150/1251] eta 0:42:45 lr 0.000592 time 2.1887 (2.3302) loss 3.8456 (3.5903) grad_norm 1.2777 (1.4088) [2022-01-21 20:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][160/1251] eta 0:42:20 lr 0.000592 time 3.0585 (2.3290) loss 3.7318 (3.5980) grad_norm 1.3401 (1.4050) [2022-01-21 20:33:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][170/1251] eta 0:41:45 lr 0.000592 time 2.2577 (2.3173) loss 3.9713 (3.6052) grad_norm 1.6719 (1.4065) [2022-01-21 20:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][180/1251] eta 0:41:05 lr 0.000592 time 1.9851 (2.3017) loss 3.4495 (3.6026) grad_norm 1.3701 (1.4054) [2022-01-21 20:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][190/1251] eta 0:40:30 lr 0.000592 time 1.7638 (2.2905) loss 2.7577 (3.6066) grad_norm 1.2728 (1.4034) [2022-01-21 20:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][200/1251] eta 0:40:13 lr 0.000592 time 3.0216 (2.2968) loss 3.7400 (3.6092) grad_norm 1.3004 (1.4009) [2022-01-21 20:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][210/1251] eta 0:39:44 lr 0.000592 time 1.8491 (2.2902) loss 2.4954 (3.6107) grad_norm 1.3372 (1.3962) [2022-01-21 20:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[133/300][220/1251] eta 0:39:12 lr 0.000592 time 2.8035 (2.2821) loss 3.3147 (3.6127) grad_norm 1.4482 (1.3934) [2022-01-21 20:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][230/1251] eta 0:38:51 lr 0.000592 time 1.8584 (2.2836) loss 3.0666 (3.6124) grad_norm 1.7377 (1.3958) [2022-01-21 20:36:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][240/1251] eta 0:38:26 lr 0.000592 time 2.1848 (2.2813) loss 3.6931 (3.6164) grad_norm 1.3749 (1.3954) [2022-01-21 20:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][250/1251] eta 0:37:58 lr 0.000592 time 1.9951 (2.2763) loss 3.2260 (3.6081) grad_norm 1.2121 (1.3925) [2022-01-21 20:37:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][260/1251] eta 0:37:33 lr 0.000592 time 2.3929 (2.2735) loss 4.4370 (3.6043) grad_norm 1.4559 (1.3944) [2022-01-21 20:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][270/1251] eta 0:37:09 lr 0.000592 time 1.9161 (2.2730) loss 3.4916 (3.6011) grad_norm 1.1791 (1.3970) [2022-01-21 20:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][280/1251] eta 0:36:41 lr 0.000592 time 1.7491 (2.2673) loss 3.4262 (3.5929) grad_norm 1.2863 (1.3966) [2022-01-21 20:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][290/1251] eta 0:36:14 lr 0.000591 time 1.6290 (2.2626) loss 3.8242 (3.5936) grad_norm 1.4436 (1.3963) [2022-01-21 20:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][300/1251] eta 0:35:45 lr 0.000591 time 2.6525 (2.2558) loss 2.6666 (3.5880) grad_norm 1.2437 (1.3949) [2022-01-21 20:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][310/1251] eta 0:35:20 lr 0.000591 time 2.1864 (2.2538) loss 3.8729 (3.5920) grad_norm 1.4304 (1.3932) [2022-01-21 20:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][320/1251] eta 0:35:00 lr 0.000591 time 2.2757 (2.2560) loss 4.1786 (3.5974) grad_norm 
1.4599 (1.3935) [2022-01-21 20:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][330/1251] eta 0:34:39 lr 0.000591 time 1.6799 (2.2583) loss 3.9344 (3.5976) grad_norm 1.3564 (1.3918) [2022-01-21 20:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][340/1251] eta 0:34:13 lr 0.000591 time 1.6536 (2.2544) loss 3.3095 (3.5986) grad_norm 1.6090 (1.3932) [2022-01-21 20:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][350/1251] eta 0:33:48 lr 0.000591 time 2.5164 (2.2515) loss 3.2124 (3.6022) grad_norm 1.7184 (1.3943) [2022-01-21 20:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][360/1251] eta 0:33:21 lr 0.000591 time 1.7491 (2.2461) loss 3.6463 (3.6007) grad_norm 1.3744 (1.3930) [2022-01-21 20:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][370/1251] eta 0:32:54 lr 0.000591 time 1.9459 (2.2416) loss 2.8761 (3.5983) grad_norm 1.7936 (1.3966) [2022-01-21 20:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][380/1251] eta 0:32:27 lr 0.000591 time 1.8592 (2.2354) loss 4.5210 (3.5994) grad_norm 1.3241 (1.3984) [2022-01-21 20:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][390/1251] eta 0:32:03 lr 0.000591 time 1.9345 (2.2335) loss 4.1152 (3.6083) grad_norm 1.5237 (1.4009) [2022-01-21 20:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][400/1251] eta 0:31:38 lr 0.000591 time 1.7256 (2.2315) loss 2.7484 (3.6026) grad_norm 1.4491 (1.4016) [2022-01-21 20:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][410/1251] eta 0:31:17 lr 0.000591 time 2.2304 (2.2327) loss 3.2207 (3.6055) grad_norm 1.4121 (1.4017) [2022-01-21 20:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][420/1251] eta 0:30:57 lr 0.000591 time 2.0712 (2.2357) loss 2.8837 (3.5987) grad_norm 1.2042 (1.4015) [2022-01-21 20:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[133/300][430/1251] eta 0:30:37 lr 0.000591 time 2.0199 (2.2385) loss 4.1590 (3.5959) grad_norm 1.9006 (1.4040) [2022-01-21 20:43:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][440/1251] eta 0:30:14 lr 0.000591 time 1.9855 (2.2375) loss 4.0047 (3.5955) grad_norm 1.5138 (1.4054) [2022-01-21 20:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][450/1251] eta 0:29:48 lr 0.000591 time 2.4104 (2.2323) loss 2.7160 (3.5994) grad_norm 1.3669 (1.4059) [2022-01-21 20:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][460/1251] eta 0:29:24 lr 0.000591 time 1.8646 (2.2306) loss 4.0733 (3.6041) grad_norm 1.7117 (1.4076) [2022-01-21 20:44:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][470/1251] eta 0:29:03 lr 0.000591 time 2.4853 (2.2321) loss 3.7422 (3.6080) grad_norm 1.4062 (1.4066) [2022-01-21 20:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][480/1251] eta 0:28:41 lr 0.000591 time 1.5207 (2.2325) loss 2.4032 (3.6002) grad_norm 1.4915 (1.4052) [2022-01-21 20:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][490/1251] eta 0:28:18 lr 0.000591 time 2.1645 (2.2324) loss 4.2329 (3.6033) grad_norm 1.3496 (1.4037) [2022-01-21 20:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][500/1251] eta 0:27:54 lr 0.000591 time 2.4874 (2.2291) loss 3.2692 (3.6063) grad_norm 1.4179 (1.4034) [2022-01-21 20:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][510/1251] eta 0:27:29 lr 0.000591 time 1.9827 (2.2266) loss 4.3568 (3.6082) grad_norm 1.3865 (1.4031) [2022-01-21 20:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][520/1251] eta 0:27:06 lr 0.000591 time 2.2027 (2.2249) loss 3.8516 (3.6112) grad_norm 1.2365 (1.4028) [2022-01-21 20:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][530/1251] eta 0:26:45 lr 0.000590 time 2.2093 (2.2270) loss 4.6519 (3.6143) grad_norm 
1.3567 (1.4016) [2022-01-21 20:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][540/1251] eta 0:26:22 lr 0.000590 time 1.9751 (2.2262) loss 4.6298 (3.6144) grad_norm 1.6883 (1.4050) [2022-01-21 20:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][550/1251] eta 0:25:59 lr 0.000590 time 2.0454 (2.2247) loss 3.8889 (3.6157) grad_norm 1.4355 (1.4059) [2022-01-21 20:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][560/1251] eta 0:25:37 lr 0.000590 time 1.5883 (2.2250) loss 3.9741 (3.6157) grad_norm 1.5803 (1.4087) [2022-01-21 20:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][570/1251] eta 0:25:14 lr 0.000590 time 1.9218 (2.2233) loss 4.5223 (3.6220) grad_norm 1.5324 (1.4112) [2022-01-21 20:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][580/1251] eta 0:24:49 lr 0.000590 time 1.8445 (2.2203) loss 3.6368 (3.6261) grad_norm 1.3621 (1.4116) [2022-01-21 20:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][590/1251] eta 0:24:25 lr 0.000590 time 1.8628 (2.2177) loss 3.9371 (3.6191) grad_norm 1.4175 (1.4110) [2022-01-21 20:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][600/1251] eta 0:24:03 lr 0.000590 time 1.5696 (2.2167) loss 4.2951 (3.6176) grad_norm 1.5441 (1.4106) [2022-01-21 20:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][610/1251] eta 0:23:40 lr 0.000590 time 2.4345 (2.2167) loss 2.5070 (3.6128) grad_norm 1.6326 (1.4101) [2022-01-21 20:50:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][620/1251] eta 0:23:18 lr 0.000590 time 2.7218 (2.2169) loss 4.1168 (3.6103) grad_norm 1.4156 (1.4086) [2022-01-21 20:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][630/1251] eta 0:22:57 lr 0.000590 time 2.1778 (2.2188) loss 2.6065 (3.6079) grad_norm 1.3615 (1.4082) [2022-01-21 20:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[133/300][640/1251] eta 0:22:36 lr 0.000590 time 2.1114 (2.2200) loss 3.3619 (3.6072) grad_norm 1.4301 (1.4075) [2022-01-21 20:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][650/1251] eta 0:22:14 lr 0.000590 time 2.1427 (2.2203) loss 2.5760 (3.6053) grad_norm 1.2541 (1.4063) [2022-01-21 20:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][660/1251] eta 0:21:49 lr 0.000590 time 1.8505 (2.2162) loss 2.6975 (3.6046) grad_norm 1.6407 (1.4075) [2022-01-21 20:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][670/1251] eta 0:21:25 lr 0.000590 time 1.8636 (2.2121) loss 4.2957 (3.6058) grad_norm 1.1655 (1.4074) [2022-01-21 20:52:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][680/1251] eta 0:21:02 lr 0.000590 time 1.8309 (2.2111) loss 3.6739 (3.6013) grad_norm 1.4942 (1.4075) [2022-01-21 20:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][690/1251] eta 0:20:39 lr 0.000590 time 1.9887 (2.2093) loss 2.9891 (3.5976) grad_norm 1.3530 (1.4069) [2022-01-21 20:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][700/1251] eta 0:20:17 lr 0.000590 time 1.9570 (2.2096) loss 4.0814 (3.5928) grad_norm 1.3712 (1.4058) [2022-01-21 20:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][710/1251] eta 0:19:54 lr 0.000590 time 1.6197 (2.2088) loss 4.1605 (3.5938) grad_norm 1.3340 (1.4060) [2022-01-21 20:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][720/1251] eta 0:19:33 lr 0.000590 time 2.1995 (2.2091) loss 3.7858 (3.5965) grad_norm 1.3702 (1.4060) [2022-01-21 20:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][730/1251] eta 0:19:10 lr 0.000590 time 2.5991 (2.2089) loss 3.4570 (3.5980) grad_norm 1.5153 (1.4072) [2022-01-21 20:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][740/1251] eta 0:18:48 lr 0.000590 time 2.5628 (2.2093) loss 3.2381 (3.5977) grad_norm 
1.4254 (1.4070) [2022-01-21 20:54:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][750/1251] eta 0:18:26 lr 0.000590 time 1.5011 (2.2082) loss 3.7908 (3.5990) grad_norm 1.5200 (1.4072) [2022-01-21 20:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][760/1251] eta 0:18:04 lr 0.000590 time 2.5893 (2.2092) loss 4.0383 (3.6012) grad_norm 1.6012 (1.4075) [2022-01-21 20:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][770/1251] eta 0:17:42 lr 0.000590 time 2.2563 (2.2100) loss 3.7012 (3.6001) grad_norm 1.4408 (1.4074) [2022-01-21 20:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][780/1251] eta 0:17:21 lr 0.000589 time 2.1810 (2.2114) loss 4.2391 (3.5987) grad_norm 1.3321 (1.4075) [2022-01-21 20:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][790/1251] eta 0:16:58 lr 0.000589 time 1.7050 (2.2089) loss 3.8099 (3.6000) grad_norm 1.4823 (1.4081) [2022-01-21 20:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][800/1251] eta 0:16:35 lr 0.000589 time 2.4675 (2.2077) loss 4.1540 (3.5987) grad_norm 1.3578 (1.4073) [2022-01-21 20:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][810/1251] eta 0:16:13 lr 0.000589 time 2.1644 (2.2064) loss 4.2082 (3.6003) grad_norm 1.3352 (1.4094) [2022-01-21 20:57:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][820/1251] eta 0:15:50 lr 0.000589 time 1.9318 (2.2061) loss 4.1984 (3.6015) grad_norm 1.6777 (1.4113) [2022-01-21 20:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][830/1251] eta 0:15:29 lr 0.000589 time 2.1599 (2.2071) loss 3.9708 (3.6013) grad_norm 1.4766 (1.4115) [2022-01-21 20:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][840/1251] eta 0:15:06 lr 0.000589 time 2.1686 (2.2061) loss 3.7075 (3.6015) grad_norm 1.2051 (1.4107) [2022-01-21 20:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[133/300][850/1251] eta 0:14:44 lr 0.000589 time 1.8938 (2.2048) loss 4.2183 (3.5998) grad_norm 1.2788 (1.4097) [2022-01-21 20:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][860/1251] eta 0:14:21 lr 0.000589 time 2.3617 (2.2027) loss 3.8881 (3.5993) grad_norm 1.5473 (1.4103) [2022-01-21 20:59:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][870/1251] eta 0:13:59 lr 0.000589 time 2.3552 (2.2024) loss 4.2038 (3.6000) grad_norm 1.4156 (1.4099) [2022-01-21 20:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][880/1251] eta 0:13:37 lr 0.000589 time 2.1229 (2.2034) loss 2.8104 (3.5999) grad_norm 1.3118 (1.4093) [2022-01-21 21:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][890/1251] eta 0:13:15 lr 0.000589 time 2.4799 (2.2044) loss 4.0093 (3.5986) grad_norm 1.2516 (1.4087) [2022-01-21 21:00:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][900/1251] eta 0:12:55 lr 0.000589 time 2.5038 (2.2081) loss 4.0584 (3.5976) grad_norm 1.3616 (1.4084) [2022-01-21 21:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][910/1251] eta 0:12:32 lr 0.000589 time 2.2102 (2.2076) loss 2.7358 (3.5970) grad_norm 1.2577 (1.4083) [2022-01-21 21:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][920/1251] eta 0:12:10 lr 0.000589 time 1.9685 (2.2060) loss 3.9236 (3.5970) grad_norm 1.3859 (1.4081) [2022-01-21 21:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][930/1251] eta 0:11:47 lr 0.000589 time 1.9564 (2.2036) loss 3.8799 (3.5988) grad_norm 1.7145 (1.4085) [2022-01-21 21:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][940/1251] eta 0:11:25 lr 0.000589 time 2.5519 (2.2041) loss 4.4696 (3.5942) grad_norm 1.4624 (1.4089) [2022-01-21 21:02:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][950/1251] eta 0:11:03 lr 0.000589 time 1.6868 (2.2030) loss 3.5212 (3.5970) grad_norm 
1.3119 (1.4089) [2022-01-21 21:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][960/1251] eta 0:10:40 lr 0.000589 time 2.1669 (2.2022) loss 2.8430 (3.5960) grad_norm 1.1860 (1.4082) [2022-01-21 21:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][970/1251] eta 0:10:18 lr 0.000589 time 1.9623 (2.2003) loss 3.5885 (3.5951) grad_norm 1.3412 (1.4081) [2022-01-21 21:03:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][980/1251] eta 0:09:56 lr 0.000589 time 2.1444 (2.1998) loss 4.2233 (3.5979) grad_norm 1.3114 (1.4075) [2022-01-21 21:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][990/1251] eta 0:09:34 lr 0.000589 time 1.9068 (2.1995) loss 2.9589 (3.5977) grad_norm 1.5041 (1.4076) [2022-01-21 21:04:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1000/1251] eta 0:09:12 lr 0.000589 time 2.1568 (2.2001) loss 3.8830 (3.5986) grad_norm 1.2868 (1.4076) [2022-01-21 21:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1010/1251] eta 0:08:50 lr 0.000589 time 1.9133 (2.2010) loss 4.2089 (3.5970) grad_norm 1.2997 (1.4079) [2022-01-21 21:04:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1020/1251] eta 0:08:28 lr 0.000588 time 2.4373 (2.2018) loss 3.6599 (3.5966) grad_norm 1.3568 (1.4076) [2022-01-21 21:05:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1030/1251] eta 0:08:06 lr 0.000588 time 1.9967 (2.2009) loss 3.5980 (3.5975) grad_norm 1.5689 (1.4076) [2022-01-21 21:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1040/1251] eta 0:07:44 lr 0.000588 time 1.9634 (2.2014) loss 3.5332 (3.5993) grad_norm 1.6631 (1.4079) [2022-01-21 21:05:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1050/1251] eta 0:07:22 lr 0.000588 time 2.2483 (2.2026) loss 2.8685 (3.5980) grad_norm 1.5218 (1.4077) [2022-01-21 21:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[133/300][1060/1251] eta 0:07:00 lr 0.000588 time 1.7582 (2.2023) loss 3.1053 (3.5993) grad_norm 1.4263 (1.4075) [2022-01-21 21:06:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1070/1251] eta 0:06:38 lr 0.000588 time 2.2605 (2.2004) loss 4.1927 (3.6002) grad_norm 1.2989 (1.4073) [2022-01-21 21:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1080/1251] eta 0:06:16 lr 0.000588 time 2.1950 (2.2002) loss 3.9546 (3.5996) grad_norm 1.6385 (1.4087) [2022-01-21 21:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1090/1251] eta 0:05:54 lr 0.000588 time 1.8806 (2.1994) loss 3.9516 (3.5987) grad_norm 1.3563 (1.4087) [2022-01-21 21:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1100/1251] eta 0:05:32 lr 0.000588 time 2.5231 (2.1996) loss 4.0799 (3.6008) grad_norm 1.2578 (1.4092) [2022-01-21 21:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1110/1251] eta 0:05:10 lr 0.000588 time 1.8829 (2.2002) loss 3.7433 (3.6014) grad_norm 1.4224 (1.4084) [2022-01-21 21:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1120/1251] eta 0:04:48 lr 0.000588 time 2.3462 (2.2018) loss 4.5130 (3.6027) grad_norm 1.4182 (1.4078) [2022-01-21 21:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1130/1251] eta 0:04:26 lr 0.000588 time 3.1475 (2.2026) loss 3.9889 (3.6029) grad_norm 1.3036 (1.4070) [2022-01-21 21:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1140/1251] eta 0:04:04 lr 0.000588 time 1.8829 (2.2015) loss 3.4833 (3.6018) grad_norm 1.3172 (1.4066) [2022-01-21 21:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1150/1251] eta 0:03:42 lr 0.000588 time 1.8419 (2.2001) loss 3.5345 (3.6029) grad_norm 1.2521 (1.4057) [2022-01-21 21:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1160/1251] eta 0:03:20 lr 0.000588 time 2.0119 (2.1994) loss 4.0828 (3.6042) 
grad_norm 1.2455 (1.4055) [2022-01-21 21:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1170/1251] eta 0:02:58 lr 0.000588 time 2.4731 (2.1989) loss 3.9208 (3.6062) grad_norm 1.5199 (1.4057) [2022-01-21 21:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1180/1251] eta 0:02:36 lr 0.000588 time 1.5196 (2.1977) loss 2.4649 (3.6061) grad_norm 1.3116 (1.4057) [2022-01-21 21:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1190/1251] eta 0:02:14 lr 0.000588 time 1.8607 (2.1985) loss 2.2548 (3.6043) grad_norm 1.4118 (1.4054) [2022-01-21 21:11:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1200/1251] eta 0:01:52 lr 0.000588 time 2.7381 (2.1990) loss 3.3119 (3.6048) grad_norm 1.3440 (1.4055) [2022-01-21 21:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1210/1251] eta 0:01:30 lr 0.000588 time 1.9440 (2.1997) loss 3.1545 (3.6054) grad_norm 1.5780 (1.4049) [2022-01-21 21:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1220/1251] eta 0:01:08 lr 0.000588 time 1.8259 (2.2004) loss 2.8954 (3.6047) grad_norm 1.4821 (1.4051) [2022-01-21 21:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1230/1251] eta 0:00:46 lr 0.000588 time 1.6539 (2.1998) loss 3.4013 (3.6039) grad_norm 1.3128 (1.4051) [2022-01-21 21:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1240/1251] eta 0:00:24 lr 0.000588 time 1.4971 (2.1981) loss 2.8856 (3.6038) grad_norm 1.4500 (1.4055) [2022-01-21 21:13:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [133/300][1250/1251] eta 0:00:02 lr 0.000588 time 1.2328 (2.1919) loss 2.9774 (3.6021) grad_norm 1.5762 (1.4057) [2022-01-21 21:13:03 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 133 training takes 0:45:42 [2022-01-21 21:13:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.505 (18.505) Loss 1.0326 (1.0326) Acc@1 74.707 (74.707) 
Acc@5 93.457 (93.457) [2022-01-21 21:13:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.970 (3.588) Loss 1.0755 (1.0488) Acc@1 74.414 (75.080) Acc@5 92.676 (92.969) [2022-01-21 21:14:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.261 (2.763) Loss 1.0352 (1.0416) Acc@1 75.391 (75.339) Acc@5 93.457 (93.006) [2022-01-21 21:14:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.269 (2.252) Loss 1.0154 (1.0448) Acc@1 76.074 (75.217) Acc@5 92.871 (92.934) [2022-01-21 21:14:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.009 (2.137) Loss 1.1024 (1.0473) Acc@1 73.926 (75.138) Acc@5 92.676 (92.966) [2022-01-21 21:14:38 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.164 Acc@5 92.884 [2022-01-21 21:14:38 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-01-21 21:14:38 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.16% [2022-01-21 21:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][0/1251] eta 7:28:11 lr 0.000588 time 21.4957 (21.4957) loss 3.4602 (3.4602) grad_norm 1.4351 (1.4351) [2022-01-21 21:15:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][10/1251] eta 1:26:58 lr 0.000588 time 2.5268 (4.2049) loss 3.5177 (3.6851) grad_norm 1.4507 (1.4633) [2022-01-21 21:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][20/1251] eta 1:06:27 lr 0.000587 time 1.9080 (3.2391) loss 2.7633 (3.6441) grad_norm 1.2436 (1.4929) [2022-01-21 21:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][30/1251] eta 0:59:03 lr 0.000587 time 1.5876 (2.9023) loss 3.1971 (3.6229) grad_norm 1.2748 (1.4418) [2022-01-21 21:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][40/1251] eta 0:56:03 lr 0.000587 time 3.7675 (2.7778) loss 3.1886 (3.6174) grad_norm 1.3606 (1.4343) [2022-01-21 21:16:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][50/1251] eta 0:53:33 lr 0.000587 time 1.8442 (2.6755) loss 3.8291 (3.6180) grad_norm 1.4942 (1.4419) [2022-01-21 21:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][60/1251] eta 0:51:33 lr 0.000587 time 1.9555 (2.5976) loss 2.3857 (3.5943) grad_norm 1.4890 (1.4447) [2022-01-21 21:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][70/1251] eta 0:49:22 lr 0.000587 time 1.8803 (2.5082) loss 4.1263 (3.5907) grad_norm 1.5245 (1.4513) [2022-01-21 21:17:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][80/1251] eta 0:47:54 lr 0.000587 time 2.5986 (2.4548) loss 3.8948 (3.5949) grad_norm 1.2186 (1.4351) [2022-01-21 21:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][90/1251] eta 0:46:33 lr 0.000587 time 2.2414 (2.4057) loss 2.6191 (3.5832) grad_norm 1.4389 (1.4232) [2022-01-21 21:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][100/1251] eta 0:45:27 lr 0.000587 time 2.1622 (2.3696) loss 2.8321 (3.5714) grad_norm 1.4242 (1.4227) [2022-01-21 21:18:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][110/1251] eta 0:44:29 lr 0.000587 time 1.5724 (2.3399) loss 3.1865 (3.5761) grad_norm 1.1909 (1.4221) [2022-01-21 21:19:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][120/1251] eta 0:44:00 lr 0.000587 time 2.9661 (2.3349) loss 3.9963 (3.5850) grad_norm 1.7109 (1.4192) [2022-01-21 21:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][130/1251] eta 0:43:23 lr 0.000587 time 2.4666 (2.3227) loss 3.8813 (3.5768) grad_norm 1.2979 (1.4120) [2022-01-21 21:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][140/1251] eta 0:42:49 lr 0.000587 time 2.2528 (2.3131) loss 2.9618 (3.5806) grad_norm 1.3755 (1.4075) [2022-01-21 21:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][150/1251] eta 0:42:25 lr 0.000587 
time 2.3148 (2.3122) loss 3.0636 (3.5692) grad_norm 1.3455 (1.4075) [2022-01-21 21:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][160/1251] eta 0:41:58 lr 0.000587 time 2.4196 (2.3081) loss 3.7040 (3.5678) grad_norm 1.4074 (1.4059) [2022-01-21 21:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][170/1251] eta 0:41:38 lr 0.000587 time 2.7613 (2.3115) loss 3.0813 (3.5439) grad_norm 1.6153 (1.4078) [2022-01-21 21:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][180/1251] eta 0:41:11 lr 0.000587 time 2.1350 (2.3072) loss 2.9378 (3.5358) grad_norm 1.5691 (1.4063) [2022-01-21 21:21:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][190/1251] eta 0:40:44 lr 0.000587 time 1.8890 (2.3037) loss 3.8660 (3.5453) grad_norm 1.2594 (1.4074) [2022-01-21 21:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][200/1251] eta 0:40:14 lr 0.000587 time 1.8324 (2.2978) loss 3.7704 (3.5425) grad_norm 1.6074 (1.4081) [2022-01-21 21:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][210/1251] eta 0:39:52 lr 0.000587 time 3.4591 (2.2979) loss 3.3005 (3.5544) grad_norm 1.3023 (1.4075) [2022-01-21 21:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][220/1251] eta 0:39:15 lr 0.000587 time 1.8634 (2.2844) loss 3.7901 (3.5569) grad_norm 1.4918 (1.4141) [2022-01-21 21:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][230/1251] eta 0:38:40 lr 0.000587 time 1.9544 (2.2724) loss 2.5032 (3.5565) grad_norm 1.4689 (1.4162) [2022-01-21 21:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][240/1251] eta 0:38:07 lr 0.000587 time 1.5968 (2.2629) loss 3.0016 (3.5494) grad_norm 1.3604 (1.4143) [2022-01-21 21:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][250/1251] eta 0:37:45 lr 0.000587 time 2.8158 (2.2628) loss 3.1074 (3.5528) grad_norm 1.5490 (1.4130) [2022-01-21 21:24:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][260/1251] eta 0:37:19 lr 0.000586 time 2.0426 (2.2597) loss 4.3195 (3.5637) grad_norm 1.4644 (1.4120) [2022-01-21 21:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][270/1251] eta 0:36:56 lr 0.000586 time 2.1251 (2.2593) loss 3.2953 (3.5647) grad_norm 1.5396 (1.4125) [2022-01-21 21:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][280/1251] eta 0:36:36 lr 0.000586 time 2.4354 (2.2621) loss 4.3857 (3.5773) grad_norm 1.4639 (1.4153) [2022-01-21 21:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][290/1251] eta 0:36:15 lr 0.000586 time 2.1840 (2.2643) loss 3.9208 (3.5900) grad_norm 1.4281 (1.4149) [2022-01-21 21:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][300/1251] eta 0:35:53 lr 0.000586 time 1.5557 (2.2643) loss 3.9204 (3.5924) grad_norm 1.3020 (1.4140) [2022-01-21 21:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][310/1251] eta 0:35:26 lr 0.000586 time 2.4359 (2.2594) loss 3.0950 (3.5850) grad_norm 1.7281 (1.4134) [2022-01-21 21:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][320/1251] eta 0:34:51 lr 0.000586 time 1.8856 (2.2463) loss 4.1089 (3.5851) grad_norm 1.6100 (1.4141) [2022-01-21 21:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][330/1251] eta 0:34:26 lr 0.000586 time 2.5123 (2.2435) loss 3.9049 (3.5739) grad_norm 1.2329 (1.4132) [2022-01-21 21:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][340/1251] eta 0:34:00 lr 0.000586 time 1.8272 (2.2400) loss 3.7197 (3.5734) grad_norm 1.3772 (1.4131) [2022-01-21 21:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][350/1251] eta 0:33:39 lr 0.000586 time 2.4769 (2.2411) loss 2.6494 (3.5715) grad_norm 1.5659 (1.4124) [2022-01-21 21:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][360/1251] eta 0:33:16 lr 
0.000586 time 2.2772 (2.2407) loss 3.6047 (3.5715) grad_norm 1.5160 (1.4133) [2022-01-21 21:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][370/1251] eta 0:32:50 lr 0.000586 time 2.4662 (2.2372) loss 3.2813 (3.5775) grad_norm 1.4373 (1.4142) [2022-01-21 21:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][380/1251] eta 0:32:27 lr 0.000586 time 1.5939 (2.2354) loss 4.4368 (3.5801) grad_norm 1.3657 (1.4166) [2022-01-21 21:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][390/1251] eta 0:32:06 lr 0.000586 time 2.6099 (2.2373) loss 3.9629 (3.5851) grad_norm 1.6596 (1.4184) [2022-01-21 21:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][400/1251] eta 0:31:43 lr 0.000586 time 2.8133 (2.2372) loss 4.1457 (3.5931) grad_norm 1.4798 (1.4190) [2022-01-21 21:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][410/1251] eta 0:31:20 lr 0.000586 time 2.5525 (2.2364) loss 2.8053 (3.5940) grad_norm 1.4665 (1.4187) [2022-01-21 21:30:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][420/1251] eta 0:31:00 lr 0.000586 time 2.5279 (2.2387) loss 2.8673 (3.5861) grad_norm 1.2615 (1.4160) [2022-01-21 21:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][430/1251] eta 0:30:38 lr 0.000586 time 3.1018 (2.2393) loss 3.1385 (3.5855) grad_norm 1.3607 (1.4157) [2022-01-21 21:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][440/1251] eta 0:30:12 lr 0.000586 time 1.9957 (2.2355) loss 3.1488 (3.5766) grad_norm 1.3672 (1.4183) [2022-01-21 21:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][450/1251] eta 0:29:46 lr 0.000586 time 2.5430 (2.2299) loss 3.8334 (3.5687) grad_norm 1.2713 (1.4187) [2022-01-21 21:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][460/1251] eta 0:29:22 lr 0.000586 time 2.1872 (2.2277) loss 3.2757 (3.5704) grad_norm 1.5083 (1.4188) [2022-01-21 21:32:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][470/1251] eta 0:29:02 lr 0.000586 time 3.1120 (2.2305) loss 3.8561 (3.5749) grad_norm 1.2333 (1.4183) [2022-01-21 21:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][480/1251] eta 0:28:39 lr 0.000586 time 1.8465 (2.2308) loss 2.9258 (3.5714) grad_norm 1.5203 (1.4183) [2022-01-21 21:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][490/1251] eta 0:28:16 lr 0.000586 time 2.2281 (2.2292) loss 3.8717 (3.5720) grad_norm 1.3628 (1.4170) [2022-01-21 21:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][500/1251] eta 0:27:54 lr 0.000586 time 1.8377 (2.2292) loss 3.9343 (3.5741) grad_norm 1.2219 (1.4160) [2022-01-21 21:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][510/1251] eta 0:27:30 lr 0.000585 time 2.2853 (2.2271) loss 3.8024 (3.5771) grad_norm 1.4418 (1.4150) [2022-01-21 21:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][520/1251] eta 0:27:06 lr 0.000585 time 2.4975 (2.2245) loss 3.2079 (3.5748) grad_norm 1.3272 (1.4155) [2022-01-21 21:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][530/1251] eta 0:26:42 lr 0.000585 time 1.8699 (2.2219) loss 3.5638 (3.5756) grad_norm 1.4328 (1.4152) [2022-01-21 21:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][540/1251] eta 0:26:18 lr 0.000585 time 2.2075 (2.2204) loss 3.8183 (3.5766) grad_norm 1.6877 (1.4168) [2022-01-21 21:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][550/1251] eta 0:25:57 lr 0.000585 time 2.6740 (2.2215) loss 3.6980 (3.5803) grad_norm 1.3534 (1.4170) [2022-01-21 21:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][560/1251] eta 0:25:34 lr 0.000585 time 1.8042 (2.2203) loss 4.0786 (3.5796) grad_norm 1.4999 (1.4169) [2022-01-21 21:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][570/1251] eta 0:25:12 lr 
0.000585 time 2.7865 (2.2212) loss 3.7459 (3.5838) grad_norm 1.4864 (1.4176) [2022-01-21 21:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][580/1251] eta 0:24:51 lr 0.000585 time 2.7639 (2.2225) loss 3.4503 (3.5868) grad_norm 1.4017 (1.4181) [2022-01-21 21:36:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][590/1251] eta 0:24:28 lr 0.000585 time 1.5874 (2.2210) loss 2.3590 (3.5884) grad_norm 1.3378 (1.4181) [2022-01-21 21:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][600/1251] eta 0:24:05 lr 0.000585 time 1.8280 (2.2200) loss 3.1484 (3.5841) grad_norm 1.4556 (1.4201) [2022-01-21 21:37:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][610/1251] eta 0:23:41 lr 0.000585 time 2.3128 (2.2179) loss 3.9060 (3.5840) grad_norm 1.8234 (1.4227) [2022-01-21 21:37:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][620/1251] eta 0:23:19 lr 0.000585 time 2.3470 (2.2178) loss 4.2359 (3.5893) grad_norm 1.4403 (1.4221) [2022-01-21 21:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][630/1251] eta 0:22:57 lr 0.000585 time 1.9381 (2.2174) loss 4.2794 (3.5904) grad_norm 1.6384 (1.4222) [2022-01-21 21:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][640/1251] eta 0:22:34 lr 0.000585 time 1.9348 (2.2168) loss 3.9832 (3.5951) grad_norm 1.3884 (1.4220) [2022-01-21 21:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][650/1251] eta 0:22:12 lr 0.000585 time 2.5746 (2.2171) loss 3.3381 (3.5994) grad_norm 1.1980 (1.4212) [2022-01-21 21:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][660/1251] eta 0:21:49 lr 0.000585 time 1.8342 (2.2165) loss 3.5592 (3.5986) grad_norm 1.5343 (1.4213) [2022-01-21 21:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][670/1251] eta 0:21:28 lr 0.000585 time 3.1288 (2.2182) loss 3.9197 (3.5982) grad_norm 1.1785 (1.4214) [2022-01-21 21:39:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][680/1251] eta 0:21:07 lr 0.000585 time 1.8541 (2.2189) loss 3.6281 (3.5979) grad_norm 1.4582 (1.4212) [2022-01-21 21:40:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][690/1251] eta 0:20:44 lr 0.000585 time 1.8867 (2.2190) loss 3.0855 (3.5966) grad_norm 1.2543 (1.4208) [2022-01-21 21:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][700/1251] eta 0:20:22 lr 0.000585 time 2.4972 (2.2190) loss 3.0815 (3.5901) grad_norm 1.4779 (1.4210) [2022-01-21 21:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][710/1251] eta 0:19:59 lr 0.000585 time 1.9338 (2.2167) loss 3.3397 (3.5890) grad_norm 1.4126 (1.4210) [2022-01-21 21:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][720/1251] eta 0:19:34 lr 0.000585 time 2.0298 (2.2128) loss 4.2777 (3.5920) grad_norm 1.4278 (1.4205) [2022-01-21 21:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][730/1251] eta 0:19:11 lr 0.000585 time 2.1614 (2.2103) loss 2.7404 (3.5931) grad_norm 1.3423 (1.4199) [2022-01-21 21:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][740/1251] eta 0:18:48 lr 0.000585 time 2.2456 (2.2091) loss 3.7093 (3.5930) grad_norm 1.3903 (1.4212) [2022-01-21 21:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][750/1251] eta 0:18:26 lr 0.000584 time 2.0779 (2.2085) loss 2.7552 (3.5951) grad_norm 1.5624 (1.4219) [2022-01-21 21:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][760/1251] eta 0:18:04 lr 0.000584 time 1.9650 (2.2087) loss 3.9501 (3.5926) grad_norm 1.4067 (1.4218) [2022-01-21 21:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][770/1251] eta 0:17:43 lr 0.000584 time 1.5551 (2.2112) loss 3.9826 (3.5951) grad_norm 1.3443 (1.4219) [2022-01-21 21:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][780/1251] eta 0:17:21 lr 
0.000584 time 1.9978 (2.2111) loss 3.8210 (3.5997) grad_norm 1.4562 (1.4220) [2022-01-21 21:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][790/1251] eta 0:16:59 lr 0.000584 time 2.0427 (2.2107) loss 3.1625 (3.6002) grad_norm 1.3677 (1.4218) [2022-01-21 21:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][800/1251] eta 0:16:36 lr 0.000584 time 2.6380 (2.2103) loss 2.5291 (3.5998) grad_norm 1.2005 (1.4202) [2022-01-21 21:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][810/1251] eta 0:16:15 lr 0.000584 time 1.8104 (2.2114) loss 3.9327 (3.6028) grad_norm 1.1650 (1.4199) [2022-01-21 21:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][820/1251] eta 0:15:53 lr 0.000584 time 2.4942 (2.2124) loss 3.4008 (3.6018) grad_norm 1.1964 (1.4207) [2022-01-21 21:45:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][830/1251] eta 0:15:31 lr 0.000584 time 1.9359 (2.2117) loss 4.3072 (3.6036) grad_norm 1.4871 (1.4203) [2022-01-21 21:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][840/1251] eta 0:15:08 lr 0.000584 time 2.5544 (2.2096) loss 3.7143 (3.6041) grad_norm 1.3367 (1.4204) [2022-01-21 21:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][850/1251] eta 0:14:44 lr 0.000584 time 1.9757 (2.2059) loss 3.6273 (3.6034) grad_norm 1.2266 (1.4194) [2022-01-21 21:46:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][860/1251] eta 0:14:22 lr 0.000584 time 2.1823 (2.2052) loss 4.0359 (3.6019) grad_norm 1.3069 (1.4190) [2022-01-21 21:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][870/1251] eta 0:14:00 lr 0.000584 time 1.6224 (2.2048) loss 4.1509 (3.6020) grad_norm 1.2643 (1.4188) [2022-01-21 21:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][880/1251] eta 0:13:38 lr 0.000584 time 1.5562 (2.2050) loss 4.3965 (3.6020) grad_norm 1.6280 (1.4189) [2022-01-21 21:47:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][890/1251] eta 0:13:15 lr 0.000584 time 1.7900 (2.2043) loss 3.6665 (3.6031) grad_norm 1.3311 (1.4191) [2022-01-21 21:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][900/1251] eta 0:12:54 lr 0.000584 time 2.5064 (2.2079) loss 4.0296 (3.6056) grad_norm 1.1826 (1.4191) [2022-01-21 21:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][910/1251] eta 0:12:32 lr 0.000584 time 1.5622 (2.2078) loss 3.6221 (3.6055) grad_norm 1.2139 (1.4181) [2022-01-21 21:48:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][920/1251] eta 0:12:11 lr 0.000584 time 1.4705 (2.2089) loss 3.4497 (3.6052) grad_norm 1.6625 (1.4183) [2022-01-21 21:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][930/1251] eta 0:11:49 lr 0.000584 time 1.5564 (2.2115) loss 3.8050 (3.6032) grad_norm 1.4381 (1.4180) [2022-01-21 21:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][940/1251] eta 0:11:27 lr 0.000584 time 1.8810 (2.2118) loss 3.1876 (3.6019) grad_norm 1.4966 (1.4179) [2022-01-21 21:49:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][950/1251] eta 0:11:05 lr 0.000584 time 1.9320 (2.2106) loss 3.5652 (3.6002) grad_norm 1.3662 (1.4181) [2022-01-21 21:50:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][960/1251] eta 0:10:42 lr 0.000584 time 2.1996 (2.2096) loss 3.2816 (3.5992) grad_norm 1.3426 (1.4174) [2022-01-21 21:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][970/1251] eta 0:10:20 lr 0.000584 time 1.6365 (2.2082) loss 3.6991 (3.6026) grad_norm 1.3204 (1.4170) [2022-01-21 21:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][980/1251] eta 0:09:58 lr 0.000584 time 2.1653 (2.2090) loss 3.4663 (3.6016) grad_norm 1.5912 (1.4168) [2022-01-21 21:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][990/1251] eta 0:09:36 lr 
0.000584 time 2.1979 (2.2098) loss 4.0084 (3.5977) grad_norm 1.6466 (1.4180) [2022-01-21 21:51:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1000/1251] eta 0:09:14 lr 0.000583 time 2.2555 (2.2086) loss 4.2401 (3.5981) grad_norm 1.5321 (1.4177) [2022-01-21 21:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1010/1251] eta 0:08:51 lr 0.000583 time 1.7166 (2.2067) loss 3.6850 (3.6008) grad_norm 1.5671 (1.4173) [2022-01-21 21:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1020/1251] eta 0:08:29 lr 0.000583 time 1.8570 (2.2071) loss 3.7924 (3.6003) grad_norm 1.3934 (1.4168) [2022-01-21 21:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1030/1251] eta 0:08:07 lr 0.000583 time 2.2497 (2.2064) loss 2.4318 (3.5994) grad_norm 1.5389 (1.4166) [2022-01-21 21:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1040/1251] eta 0:07:45 lr 0.000583 time 1.8782 (2.2048) loss 3.8895 (3.6006) grad_norm 1.3628 (1.4161) [2022-01-21 21:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1050/1251] eta 0:07:23 lr 0.000583 time 1.7868 (2.2045) loss 3.2948 (3.6010) grad_norm 1.4578 (1.4170) [2022-01-21 21:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1060/1251] eta 0:07:01 lr 0.000583 time 2.6839 (2.2054) loss 2.6738 (3.6019) grad_norm 1.4987 (1.4168) [2022-01-21 21:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1070/1251] eta 0:06:39 lr 0.000583 time 2.0723 (2.2051) loss 4.2713 (3.6021) grad_norm 1.3823 (1.4171) [2022-01-21 21:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1080/1251] eta 0:06:16 lr 0.000583 time 2.2204 (2.2045) loss 3.5508 (3.6005) grad_norm 1.4318 (1.4170) [2022-01-21 21:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1090/1251] eta 0:05:54 lr 0.000583 time 1.6133 (2.2045) loss 3.3498 (3.5998) grad_norm 1.3216 (1.4170) [2022-01-21 
21:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1100/1251] eta 0:05:32 lr 0.000583 time 2.1223 (2.2051) loss 3.7630 (3.6005) grad_norm 1.4551 (1.4170) [2022-01-21 21:55:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1110/1251] eta 0:05:11 lr 0.000583 time 2.7623 (2.2060) loss 3.0322 (3.6012) grad_norm 1.4213 (1.4170) [2022-01-21 21:55:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1120/1251] eta 0:04:48 lr 0.000583 time 1.8838 (2.2041) loss 3.5845 (3.6002) grad_norm 1.6130 (1.4174) [2022-01-21 21:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1130/1251] eta 0:04:26 lr 0.000583 time 2.1582 (2.2027) loss 2.6300 (3.6007) grad_norm 1.5093 (1.4172) [2022-01-21 21:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1140/1251] eta 0:04:04 lr 0.000583 time 2.3811 (2.2036) loss 4.2061 (3.6013) grad_norm 1.3492 (1.4165) [2022-01-21 21:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1150/1251] eta 0:03:42 lr 0.000583 time 2.2363 (2.2042) loss 3.7662 (3.6038) grad_norm 1.6289 (1.4171) [2022-01-21 21:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1160/1251] eta 0:03:20 lr 0.000583 time 1.7072 (2.2038) loss 3.5230 (3.6033) grad_norm 1.2667 (1.4172) [2022-01-21 21:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1170/1251] eta 0:02:58 lr 0.000583 time 2.1970 (2.2026) loss 2.2949 (3.6027) grad_norm 1.2937 (1.4170) [2022-01-21 21:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1180/1251] eta 0:02:36 lr 0.000583 time 2.4722 (2.2015) loss 4.0729 (3.6036) grad_norm 1.3812 (1.4166) [2022-01-21 21:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1190/1251] eta 0:02:14 lr 0.000583 time 2.1786 (2.2019) loss 3.1713 (3.6006) grad_norm 1.4404 (1.4162) [2022-01-21 21:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1200/1251] 
eta 0:01:52 lr 0.000583 time 1.9140 (2.2022) loss 3.2912 (3.6007) grad_norm 1.1795 (1.4156) [2022-01-21 21:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1210/1251] eta 0:01:30 lr 0.000583 time 2.2047 (2.2018) loss 3.7788 (3.6014) grad_norm 1.2379 (1.4145) [2022-01-21 21:59:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1220/1251] eta 0:01:08 lr 0.000583 time 3.0860 (2.2022) loss 3.4036 (3.6014) grad_norm 1.3148 (1.4144) [2022-01-21 21:59:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1230/1251] eta 0:00:46 lr 0.000583 time 1.9758 (2.2023) loss 3.3451 (3.6025) grad_norm 1.2158 (1.4139) [2022-01-21 22:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1240/1251] eta 0:00:24 lr 0.000582 time 1.1964 (2.2015) loss 3.2240 (3.6006) grad_norm 1.3592 (1.4133) [2022-01-21 22:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [134/300][1250/1251] eta 0:00:02 lr 0.000582 time 1.1934 (2.1969) loss 3.6332 (3.6000) grad_norm 1.4180 (1.4130) [2022-01-21 22:00:26 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 134 training takes 0:45:48 [2022-01-21 22:00:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.834 (20.834) Loss 0.9621 (0.9621) Acc@1 76.172 (76.172) Acc@5 93.848 (93.848) [2022-01-21 22:01:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.630 (3.172) Loss 1.0216 (1.0313) Acc@1 76.270 (75.328) Acc@5 92.480 (92.978) [2022-01-21 22:01:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 2.360 (2.555) Loss 0.9491 (1.0201) Acc@1 77.344 (75.535) Acc@5 93.457 (93.127) [2022-01-21 22:01:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.625 (2.303) Loss 1.0674 (1.0381) Acc@1 73.730 (75.176) Acc@5 92.578 (92.921) [2022-01-21 22:01:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.215 (2.201) Loss 1.0455 (1.0370) Acc@1 76.465 (75.243) Acc@5 92.676 (92.976) 
[2022-01-21 22:02:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.248 Acc@5 93.006 [2022-01-21 22:02:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.2% [2022-01-21 22:02:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.25% [2022-01-21 22:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][0/1251] eta 7:26:13 lr 0.000582 time 21.4014 (21.4014) loss 4.2354 (4.2354) grad_norm 1.3848 (1.3848) [2022-01-21 22:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][10/1251] eta 1:24:36 lr 0.000582 time 2.0518 (4.0903) loss 3.3111 (3.4470) grad_norm 1.3100 (1.3541) [2022-01-21 22:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][20/1251] eta 1:04:48 lr 0.000582 time 1.5133 (3.1584) loss 4.0421 (3.5446) grad_norm 1.4564 (1.4171) [2022-01-21 22:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][30/1251] eta 0:57:43 lr 0.000582 time 1.5132 (2.8363) loss 3.3703 (3.5069) grad_norm 1.3873 (1.3940) [2022-01-21 22:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][40/1251] eta 0:55:09 lr 0.000582 time 3.9799 (2.7326) loss 3.2343 (3.5373) grad_norm 1.2349 (1.4180) [2022-01-21 22:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][50/1251] eta 0:52:45 lr 0.000582 time 2.1831 (2.6358) loss 4.2719 (3.5470) grad_norm 1.2742 (1.4323) [2022-01-21 22:04:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][60/1251] eta 0:51:15 lr 0.000582 time 2.3793 (2.5823) loss 3.9145 (3.5770) grad_norm 1.3211 (1.4190) [2022-01-21 22:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][70/1251] eta 0:49:52 lr 0.000582 time 1.8946 (2.5342) loss 3.1509 (3.5381) grad_norm 1.4319 (1.4251) [2022-01-21 22:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][80/1251] eta 0:48:42 lr 0.000582 time 2.6083 (2.4956) loss 3.7914 (3.5566) 
grad_norm 1.5694 (1.4304) [2022-01-21 22:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][90/1251] eta 0:47:17 lr 0.000582 time 2.8670 (2.4441) loss 3.5176 (3.5981) grad_norm 1.1926 (1.4305) [2022-01-21 22:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][100/1251] eta 0:46:06 lr 0.000582 time 2.1480 (2.4035) loss 3.9076 (3.5924) grad_norm 1.4636 (1.4243) [2022-01-21 22:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][110/1251] eta 0:45:02 lr 0.000582 time 1.9113 (2.3689) loss 4.4067 (3.6058) grad_norm 1.7003 (1.4256) [2022-01-21 22:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][120/1251] eta 0:44:08 lr 0.000582 time 1.9885 (2.3420) loss 2.9468 (3.6123) grad_norm 1.5853 (1.4290) [2022-01-21 22:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][130/1251] eta 0:43:35 lr 0.000582 time 1.9555 (2.3332) loss 3.8177 (3.6206) grad_norm 1.5253 (1.4236) [2022-01-21 22:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][140/1251] eta 0:43:13 lr 0.000582 time 2.3975 (2.3346) loss 3.0656 (3.6261) grad_norm 1.5656 (1.4197) [2022-01-21 22:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][150/1251] eta 0:42:44 lr 0.000582 time 1.8897 (2.3293) loss 4.0116 (3.6355) grad_norm 1.4630 (1.4204) [2022-01-21 22:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][160/1251] eta 0:42:11 lr 0.000582 time 2.1968 (2.3200) loss 3.3488 (3.6434) grad_norm 1.2248 (1.4140) [2022-01-21 22:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][170/1251] eta 0:41:50 lr 0.000582 time 3.4494 (2.3223) loss 3.9726 (3.6340) grad_norm 1.4119 (1.4157) [2022-01-21 22:09:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][180/1251] eta 0:41:13 lr 0.000582 time 1.7430 (2.3092) loss 3.8817 (3.6441) grad_norm 1.5791 (1.4155) [2022-01-21 22:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [135/300][190/1251] eta 0:40:52 lr 0.000582 time 1.8855 (2.3119) loss 4.2446 (3.6530) grad_norm 1.4605 (1.4159) [2022-01-21 22:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][200/1251] eta 0:40:20 lr 0.000582 time 1.6384 (2.3029) loss 4.1795 (3.6589) grad_norm 1.5792 (1.4145) [2022-01-21 22:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][210/1251] eta 0:40:03 lr 0.000582 time 3.2743 (2.3089) loss 3.8016 (3.6519) grad_norm 1.2426 (1.4146) [2022-01-21 22:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][220/1251] eta 0:39:26 lr 0.000582 time 1.8981 (2.2949) loss 3.1121 (3.6470) grad_norm 1.2642 (1.4123) [2022-01-21 22:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][230/1251] eta 0:38:54 lr 0.000581 time 1.9685 (2.2867) loss 2.9767 (3.6404) grad_norm 1.4563 (1.4109) [2022-01-21 22:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][240/1251] eta 0:38:24 lr 0.000581 time 1.6339 (2.2799) loss 3.5375 (3.6414) grad_norm 1.3686 (1.4102) [2022-01-21 22:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][250/1251] eta 0:38:07 lr 0.000581 time 3.0391 (2.2856) loss 3.4007 (3.6407) grad_norm 1.2414 (1.4080) [2022-01-21 22:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][260/1251] eta 0:37:39 lr 0.000581 time 1.8907 (2.2797) loss 2.5444 (3.6398) grad_norm 1.2142 (1.4060) [2022-01-21 22:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][270/1251] eta 0:37:05 lr 0.000581 time 1.6216 (2.2688) loss 3.7025 (3.6290) grad_norm 1.2944 (1.4060) [2022-01-21 22:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][280/1251] eta 0:36:38 lr 0.000581 time 2.0260 (2.2638) loss 4.2872 (3.6331) grad_norm 1.2807 (1.4079) [2022-01-21 22:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][290/1251] eta 0:36:13 lr 0.000581 time 2.2534 (2.2621) loss 3.5424 (3.6320) 
grad_norm 1.3671 (1.4070) [2022-01-21 22:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][300/1251] eta 0:35:50 lr 0.000581 time 2.4758 (2.2611) loss 3.4192 (3.6217) grad_norm 1.3197 (1.4038) [2022-01-21 22:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][310/1251] eta 0:35:27 lr 0.000581 time 1.9388 (2.2610) loss 3.5869 (3.6288) grad_norm 1.2686 (1.4018) [2022-01-21 22:14:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][320/1251] eta 0:35:00 lr 0.000581 time 1.6343 (2.2564) loss 4.0677 (3.6340) grad_norm 1.5182 (1.4034) [2022-01-21 22:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][330/1251] eta 0:34:31 lr 0.000581 time 1.8990 (2.2491) loss 3.6206 (3.6375) grad_norm 1.4166 (1.4049) [2022-01-21 22:14:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][340/1251] eta 0:34:02 lr 0.000581 time 2.0645 (2.2419) loss 3.2162 (3.6253) grad_norm 1.2690 (1.4044) [2022-01-21 22:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][350/1251] eta 0:33:38 lr 0.000581 time 2.5007 (2.2398) loss 3.9158 (3.6254) grad_norm 1.3948 (1.4041) [2022-01-21 22:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][360/1251] eta 0:33:14 lr 0.000581 time 2.1242 (2.2381) loss 4.3097 (3.6220) grad_norm 1.2069 (1.4035) [2022-01-21 22:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][370/1251] eta 0:32:52 lr 0.000581 time 1.8848 (2.2390) loss 4.2390 (3.6253) grad_norm 1.4009 (1.4043) [2022-01-21 22:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][380/1251] eta 0:32:30 lr 0.000581 time 2.2364 (2.2388) loss 3.3021 (3.6258) grad_norm 1.3692 (1.4028) [2022-01-21 22:16:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][390/1251] eta 0:32:09 lr 0.000581 time 3.1805 (2.2408) loss 2.5329 (3.6264) grad_norm 1.7546 (1.4028) [2022-01-21 22:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [135/300][400/1251] eta 0:31:47 lr 0.000581 time 2.8381 (2.2418) loss 3.5889 (3.6250) grad_norm 1.3790 (1.4034) [2022-01-21 22:17:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][410/1251] eta 0:31:26 lr 0.000581 time 1.9664 (2.2431) loss 2.6817 (3.6245) grad_norm 1.4623 (1.4041) [2022-01-21 22:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][420/1251] eta 0:31:04 lr 0.000581 time 2.6483 (2.2431) loss 2.5429 (3.6195) grad_norm 1.3673 (1.4033) [2022-01-21 22:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][430/1251] eta 0:30:40 lr 0.000581 time 1.7927 (2.2415) loss 3.6755 (3.6189) grad_norm 1.6128 (1.4025) [2022-01-21 22:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][440/1251] eta 0:30:16 lr 0.000581 time 1.7864 (2.2400) loss 3.8617 (3.6229) grad_norm 1.3348 (1.4023) [2022-01-21 22:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][450/1251] eta 0:29:47 lr 0.000581 time 1.8887 (2.2319) loss 2.6835 (3.6248) grad_norm 1.3682 (1.4020) [2022-01-21 22:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][460/1251] eta 0:29:20 lr 0.000581 time 1.8246 (2.2263) loss 3.9868 (3.6260) grad_norm 1.4036 (1.4019) [2022-01-21 22:19:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][470/1251] eta 0:28:57 lr 0.000581 time 2.5585 (2.2246) loss 3.6165 (3.6283) grad_norm 1.3824 (1.4021) [2022-01-21 22:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][480/1251] eta 0:28:32 lr 0.000580 time 1.7145 (2.2218) loss 3.2221 (3.6334) grad_norm 1.4114 (1.4023) [2022-01-21 22:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][490/1251] eta 0:28:09 lr 0.000580 time 2.8151 (2.2195) loss 3.9783 (3.6365) grad_norm 1.4849 (1.4020) [2022-01-21 22:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][500/1251] eta 0:27:45 lr 0.000580 time 2.1835 (2.2176) loss 3.5872 (3.6301) 
grad_norm 1.2703 (1.4029) [2022-01-21 22:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][510/1251] eta 0:27:25 lr 0.000580 time 2.7595 (2.2204) loss 3.3969 (3.6309) grad_norm 1.4690 (1.4035) [2022-01-21 22:21:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][520/1251] eta 0:27:03 lr 0.000580 time 1.7390 (2.2211) loss 3.0991 (3.6348) grad_norm 1.6463 (1.4039) [2022-01-21 22:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][530/1251] eta 0:26:42 lr 0.000580 time 1.9357 (2.2226) loss 3.9168 (3.6307) grad_norm 1.5149 (1.4048) [2022-01-21 22:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][540/1251] eta 0:26:19 lr 0.000580 time 1.7845 (2.2215) loss 4.5201 (3.6344) grad_norm 1.3532 (1.4062) [2022-01-21 22:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][550/1251] eta 0:25:55 lr 0.000580 time 2.2600 (2.2184) loss 4.1229 (3.6367) grad_norm 1.2036 (1.4049) [2022-01-21 22:22:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][560/1251] eta 0:25:31 lr 0.000580 time 2.1051 (2.2160) loss 2.6018 (3.6335) grad_norm 1.1495 (1.4040) [2022-01-21 22:23:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][570/1251] eta 0:25:08 lr 0.000580 time 2.6990 (2.2156) loss 3.5479 (3.6323) grad_norm 1.2553 (1.4041) [2022-01-21 22:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][580/1251] eta 0:24:47 lr 0.000580 time 2.1613 (2.2164) loss 3.9655 (3.6331) grad_norm 1.1241 (1.4028) [2022-01-21 22:23:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][590/1251] eta 0:24:23 lr 0.000580 time 2.2730 (2.2142) loss 3.8066 (3.6333) grad_norm 1.3306 (1.4035) [2022-01-21 22:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][600/1251] eta 0:24:00 lr 0.000580 time 2.2829 (2.2133) loss 4.0118 (3.6358) grad_norm 1.3727 (1.4025) [2022-01-21 22:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [135/300][610/1251] eta 0:23:38 lr 0.000580 time 2.7848 (2.2130) loss 3.5577 (3.6389) grad_norm 1.4187 (1.4033) [2022-01-21 22:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][620/1251] eta 0:23:14 lr 0.000580 time 1.8798 (2.2096) loss 4.4957 (3.6365) grad_norm 1.5317 (1.4029) [2022-01-21 22:25:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][630/1251] eta 0:22:51 lr 0.000580 time 2.5470 (2.2083) loss 3.9462 (3.6312) grad_norm 1.6401 (1.4040) [2022-01-21 22:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][640/1251] eta 0:22:29 lr 0.000580 time 2.0391 (2.2087) loss 3.9486 (3.6296) grad_norm 1.6263 (1.4037) [2022-01-21 22:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][650/1251] eta 0:22:08 lr 0.000580 time 2.8846 (2.2112) loss 3.3531 (3.6290) grad_norm 1.3933 (1.4033) [2022-01-21 22:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][660/1251] eta 0:21:46 lr 0.000580 time 2.2169 (2.2105) loss 3.6474 (3.6263) grad_norm 1.2965 (1.4029) [2022-01-21 22:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][670/1251] eta 0:21:23 lr 0.000580 time 2.8484 (2.2100) loss 3.7931 (3.6275) grad_norm 1.9214 (1.4036) [2022-01-21 22:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][680/1251] eta 0:21:02 lr 0.000580 time 1.9544 (2.2105) loss 3.2855 (3.6271) grad_norm 1.8192 (1.4044) [2022-01-21 22:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][690/1251] eta 0:20:39 lr 0.000580 time 2.5156 (2.2088) loss 4.2956 (3.6281) grad_norm 1.4892 (1.4047) [2022-01-21 22:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][700/1251] eta 0:20:17 lr 0.000580 time 1.9051 (2.2088) loss 4.3382 (3.6249) grad_norm 1.3216 (1.4049) [2022-01-21 22:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][710/1251] eta 0:19:54 lr 0.000580 time 1.9787 (2.2079) loss 3.5759 (3.6236) 
grad_norm 1.3780 (1.4050) [2022-01-21 22:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][720/1251] eta 0:19:33 lr 0.000579 time 1.7432 (2.2104) loss 4.4247 (3.6276) grad_norm 1.4787 (1.4054) [2022-01-21 22:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][730/1251] eta 0:19:12 lr 0.000579 time 1.8836 (2.2121) loss 2.9390 (3.6253) grad_norm 1.3648 (1.4065) [2022-01-21 22:29:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][740/1251] eta 0:18:51 lr 0.000579 time 2.3032 (2.2137) loss 3.5352 (3.6233) grad_norm 1.2470 (1.4055) [2022-01-21 22:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][750/1251] eta 0:18:28 lr 0.000579 time 1.9638 (2.2125) loss 4.2877 (3.6273) grad_norm 1.4182 (1.4065) [2022-01-21 22:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][760/1251] eta 0:18:06 lr 0.000579 time 1.7655 (2.2124) loss 2.9416 (3.6273) grad_norm 1.5021 (1.4101) [2022-01-21 22:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][770/1251] eta 0:17:43 lr 0.000579 time 1.8800 (2.2108) loss 4.2094 (3.6281) grad_norm 1.3353 (1.4102) [2022-01-21 22:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][780/1251] eta 0:17:20 lr 0.000579 time 1.5134 (2.2085) loss 2.8862 (3.6277) grad_norm 1.3208 (1.4111) [2022-01-21 22:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][790/1251] eta 0:16:58 lr 0.000579 time 1.8505 (2.2096) loss 4.0329 (3.6257) grad_norm 1.1700 (1.4113) [2022-01-21 22:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][800/1251] eta 0:16:37 lr 0.000579 time 2.0118 (2.2108) loss 3.9529 (3.6269) grad_norm 1.2544 (1.4109) [2022-01-21 22:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][810/1251] eta 0:16:14 lr 0.000579 time 1.6079 (2.2100) loss 3.8413 (3.6297) grad_norm 1.3589 (1.4100) [2022-01-21 22:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [135/300][820/1251] eta 0:15:51 lr 0.000579 time 1.7844 (2.2087) loss 2.9728 (3.6277) grad_norm 1.2702 (1.4106) [2022-01-21 22:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][830/1251] eta 0:15:28 lr 0.000579 time 2.2072 (2.2065) loss 3.8859 (3.6257) grad_norm 1.3400 (1.4110) [2022-01-21 22:32:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][840/1251] eta 0:15:06 lr 0.000579 time 2.0526 (2.2051) loss 2.6932 (3.6228) grad_norm 1.3132 (1.4113) [2022-01-21 22:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][850/1251] eta 0:14:44 lr 0.000579 time 2.1009 (2.2054) loss 3.7107 (3.6249) grad_norm 1.5202 (1.4130) [2022-01-21 22:33:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][860/1251] eta 0:14:22 lr 0.000579 time 2.0860 (2.2049) loss 3.1896 (3.6237) grad_norm 1.3438 (1.4142) [2022-01-21 22:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][870/1251] eta 0:14:01 lr 0.000579 time 2.1812 (2.2075) loss 4.1293 (3.6233) grad_norm 1.4001 (1.4149) [2022-01-21 22:34:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][880/1251] eta 0:13:39 lr 0.000579 time 2.1087 (2.2079) loss 4.0121 (3.6238) grad_norm 1.2511 (1.4152) [2022-01-21 22:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][890/1251] eta 0:13:17 lr 0.000579 time 1.8800 (2.2078) loss 4.0402 (3.6247) grad_norm 1.3181 (1.4147) [2022-01-21 22:35:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][900/1251] eta 0:12:54 lr 0.000579 time 2.0393 (2.2069) loss 3.0773 (3.6222) grad_norm 1.3168 (1.4142) [2022-01-21 22:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][910/1251] eta 0:12:32 lr 0.000579 time 2.6358 (2.2077) loss 4.1016 (3.6230) grad_norm 1.4149 (1.4136) [2022-01-21 22:35:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][920/1251] eta 0:12:10 lr 0.000579 time 1.8883 (2.2057) loss 4.0229 (3.6218) 
grad_norm 1.3560 (1.4126) [2022-01-21 22:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][930/1251] eta 0:11:47 lr 0.000579 time 1.8334 (2.2041) loss 3.9870 (3.6186) grad_norm 1.2354 (1.4121) [2022-01-21 22:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][940/1251] eta 0:11:25 lr 0.000579 time 1.9445 (2.2029) loss 3.7102 (3.6190) grad_norm 1.5522 (1.4120) [2022-01-21 22:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][950/1251] eta 0:11:02 lr 0.000579 time 2.4285 (2.2025) loss 4.0165 (3.6192) grad_norm 1.2944 (1.4121) [2022-01-21 22:37:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][960/1251] eta 0:10:41 lr 0.000579 time 2.1940 (2.2038) loss 4.0853 (3.6192) grad_norm 1.1261 (1.4122) [2022-01-21 22:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][970/1251] eta 0:10:19 lr 0.000578 time 2.4623 (2.2039) loss 3.5720 (3.6153) grad_norm 1.5360 (1.4135) [2022-01-21 22:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][980/1251] eta 0:09:57 lr 0.000578 time 2.5447 (2.2051) loss 4.2100 (3.6168) grad_norm 1.3883 (1.4134) [2022-01-21 22:38:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][990/1251] eta 0:09:35 lr 0.000578 time 1.9851 (2.2042) loss 4.2890 (3.6179) grad_norm 1.3441 (1.4133) [2022-01-21 22:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1000/1251] eta 0:09:13 lr 0.000578 time 1.8683 (2.2050) loss 2.8065 (3.6169) grad_norm 1.3147 (1.4127) [2022-01-21 22:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1010/1251] eta 0:08:51 lr 0.000578 time 1.9026 (2.2046) loss 3.1171 (3.6149) grad_norm 1.4837 (1.4129) [2022-01-21 22:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1020/1251] eta 0:08:29 lr 0.000578 time 2.7391 (2.2036) loss 3.8409 (3.6158) grad_norm 1.2153 (1.4134) [2022-01-21 22:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [135/300][1030/1251] eta 0:08:06 lr 0.000578 time 1.5849 (2.2023) loss 2.4363 (3.6119) grad_norm 1.2945 (1.4128) [2022-01-21 22:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1040/1251] eta 0:07:44 lr 0.000578 time 2.0795 (2.2018) loss 4.1819 (3.6110) grad_norm 2.0922 (1.4128) [2022-01-21 22:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1050/1251] eta 0:07:22 lr 0.000578 time 1.7390 (2.2021) loss 4.0906 (3.6120) grad_norm 1.6189 (1.4133) [2022-01-21 22:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1060/1251] eta 0:07:00 lr 0.000578 time 2.2911 (2.2022) loss 3.5240 (3.6120) grad_norm 1.6088 (1.4137) [2022-01-21 22:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1070/1251] eta 0:06:38 lr 0.000578 time 2.4828 (2.2029) loss 3.4877 (3.6131) grad_norm 1.4098 (1.4143) [2022-01-21 22:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1080/1251] eta 0:06:16 lr 0.000578 time 2.1908 (2.2018) loss 2.7569 (3.6123) grad_norm 1.8534 (1.4149) [2022-01-21 22:42:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1090/1251] eta 0:05:54 lr 0.000578 time 1.6427 (2.2015) loss 3.0451 (3.6104) grad_norm 1.3627 (1.4154) [2022-01-21 22:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1100/1251] eta 0:05:32 lr 0.000578 time 2.2740 (2.2000) loss 3.8759 (3.6126) grad_norm 1.5735 (1.4156) [2022-01-21 22:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1110/1251] eta 0:05:10 lr 0.000578 time 2.2568 (2.1987) loss 3.4129 (3.6105) grad_norm 1.3956 (1.4157) [2022-01-21 22:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1120/1251] eta 0:04:47 lr 0.000578 time 1.8217 (2.1978) loss 3.9565 (3.6090) grad_norm 1.3854 (1.4163) [2022-01-21 22:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1130/1251] eta 0:04:25 lr 0.000578 time 1.7504 (2.1959) loss 4.4012 
(3.6087) grad_norm 1.5254 (1.4161) [2022-01-21 22:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1140/1251] eta 0:04:03 lr 0.000578 time 2.7186 (2.1955) loss 3.9116 (3.6092) grad_norm 1.4228 (1.4159) [2022-01-21 22:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1150/1251] eta 0:03:41 lr 0.000578 time 2.1510 (2.1954) loss 4.3847 (3.6103) grad_norm 1.5325 (1.4156) [2022-01-21 22:44:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1160/1251] eta 0:03:19 lr 0.000578 time 1.9366 (2.1977) loss 4.0493 (3.6110) grad_norm 1.1854 (1.4156) [2022-01-21 22:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1170/1251] eta 0:02:58 lr 0.000578 time 2.1671 (2.2003) loss 2.9618 (3.6102) grad_norm 1.3046 (1.4148) [2022-01-21 22:45:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1180/1251] eta 0:02:36 lr 0.000578 time 3.0737 (2.2024) loss 3.4432 (3.6096) grad_norm 1.3291 (1.4150) [2022-01-21 22:45:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1190/1251] eta 0:02:14 lr 0.000578 time 2.4212 (2.2007) loss 2.8028 (3.6100) grad_norm 1.3133 (1.4146) [2022-01-21 22:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1200/1251] eta 0:01:52 lr 0.000578 time 1.8270 (2.1988) loss 2.9954 (3.6092) grad_norm 1.3303 (1.4139) [2022-01-21 22:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1210/1251] eta 0:01:30 lr 0.000577 time 1.6575 (2.1969) loss 3.6517 (3.6066) grad_norm 1.5811 (1.4137) [2022-01-21 22:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1220/1251] eta 0:01:08 lr 0.000577 time 2.5627 (2.1963) loss 3.8406 (3.6063) grad_norm 1.4801 (1.4135) [2022-01-21 22:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1230/1251] eta 0:00:46 lr 0.000577 time 2.1070 (2.1956) loss 3.0812 (3.6079) grad_norm 1.3777 (1.4129) [2022-01-21 22:47:30 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [135/300][1240/1251] eta 0:00:24 lr 0.000577 time 1.9725 (2.1963) loss 3.7124 (3.6079) grad_norm 1.6221 (1.4137)
[2022-01-21 22:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [135/300][1250/1251] eta 0:00:02 lr 0.000577 time 1.1786 (2.1912) loss 2.8158 (3.6089) grad_norm 1.4139 (1.4135)
[2022-01-21 22:47:46 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 135 training takes 0:45:41
[2022-01-21 22:48:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.696 (18.696) Loss 1.0181 (1.0181) Acc@1 74.707 (74.707) Acc@5 93.652 (93.652)
[2022-01-21 22:48:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.957 (3.477) Loss 1.0533 (1.0400) Acc@1 74.609 (75.266) Acc@5 93.359 (93.235)
[2022-01-21 22:48:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.619 (2.480) Loss 1.0675 (1.0461) Acc@1 75.098 (75.191) Acc@5 93.457 (93.006)
[2022-01-21 22:48:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.880 (2.209) Loss 1.1093 (1.0447) Acc@1 74.316 (75.359) Acc@5 92.383 (93.003)
[2022-01-21 22:49:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.559 (2.144) Loss 0.9409 (1.0492) Acc@1 75.586 (75.248) Acc@5 94.531 (92.926)
[2022-01-21 22:49:21 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.232 Acc@5 92.938
[2022-01-21 22:49:21 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.2%
[2022-01-21 22:49:21 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.25%
[2022-01-21 22:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][0/1251] eta 7:29:27 lr 0.000577 time 21.5564 (21.5564) loss 3.3264 (3.3264) grad_norm 1.3960 (1.3960)
[2022-01-21 22:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][10/1251] eta 1:23:04 lr 0.000577 time 1.7930 (4.0163) loss 3.4800 (3.4534) grad_norm 1.6349 (1.3589)
[2022-01-21 22:50:26
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][20/1251] eta 1:03:04 lr 0.000577 time 1.9164 (3.0744) loss 2.6960 (3.4498) grad_norm 1.4113 (1.3761) [2022-01-21 22:50:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][30/1251] eta 0:56:40 lr 0.000577 time 1.8201 (2.7851) loss 3.4598 (3.5445) grad_norm 1.7104 (1.4214) [2022-01-21 22:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][40/1251] eta 0:54:05 lr 0.000577 time 4.0461 (2.6800) loss 3.7970 (3.5945) grad_norm 1.1099 (1.4345) [2022-01-21 22:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][50/1251] eta 0:53:07 lr 0.000577 time 2.1615 (2.6538) loss 3.2240 (3.5479) grad_norm 1.3060 (1.4277) [2022-01-21 22:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][60/1251] eta 0:51:09 lr 0.000577 time 1.9539 (2.5776) loss 3.2626 (3.5402) grad_norm 1.5106 (1.4183) [2022-01-21 22:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][70/1251] eta 0:49:45 lr 0.000577 time 1.8484 (2.5276) loss 2.9636 (3.5127) grad_norm 1.5573 (1.4166) [2022-01-21 22:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][80/1251] eta 0:48:47 lr 0.000577 time 3.2632 (2.5002) loss 4.3104 (3.5126) grad_norm 1.6512 (1.4158) [2022-01-21 22:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][90/1251] eta 0:47:27 lr 0.000577 time 2.3280 (2.4523) loss 3.8358 (3.5145) grad_norm 1.5023 (1.4236) [2022-01-21 22:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][100/1251] eta 0:46:09 lr 0.000577 time 1.6041 (2.4060) loss 4.0071 (3.5144) grad_norm 1.4632 (1.4207) [2022-01-21 22:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][110/1251] eta 0:45:05 lr 0.000577 time 1.7275 (2.3710) loss 2.9925 (3.5202) grad_norm 1.3489 (1.4245) [2022-01-21 22:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][120/1251] eta 0:44:20 lr 0.000577 time 
2.4782 (2.3523) loss 3.6668 (3.5571) grad_norm 1.3924 (1.4232) [2022-01-21 22:54:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][130/1251] eta 0:43:51 lr 0.000577 time 2.3627 (2.3476) loss 3.8710 (3.5611) grad_norm 1.4585 (1.4155) [2022-01-21 22:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][140/1251] eta 0:43:13 lr 0.000577 time 2.4058 (2.3340) loss 3.7719 (3.5677) grad_norm 1.9322 (1.4159) [2022-01-21 22:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][150/1251] eta 0:42:39 lr 0.000577 time 2.0197 (2.3251) loss 3.9858 (3.5953) grad_norm 1.4619 (1.4199) [2022-01-21 22:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][160/1251] eta 0:42:08 lr 0.000577 time 2.6263 (2.3176) loss 3.7138 (3.6046) grad_norm 1.2767 (1.4187) [2022-01-21 22:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][170/1251] eta 0:41:37 lr 0.000577 time 2.5094 (2.3108) loss 3.8759 (3.6284) grad_norm 1.2144 (1.4127) [2022-01-21 22:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][180/1251] eta 0:41:22 lr 0.000577 time 2.7258 (2.3176) loss 3.7204 (3.6286) grad_norm 1.4527 (1.4109) [2022-01-21 22:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][190/1251] eta 0:40:42 lr 0.000577 time 1.5697 (2.3023) loss 4.4399 (3.6308) grad_norm 1.7413 (1.4128) [2022-01-21 22:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][200/1251] eta 0:40:06 lr 0.000576 time 2.2334 (2.2897) loss 4.1602 (3.6326) grad_norm 1.4596 (1.4167) [2022-01-21 22:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][210/1251] eta 0:39:40 lr 0.000576 time 2.3936 (2.2864) loss 4.0852 (3.6371) grad_norm 1.3029 (1.4156) [2022-01-21 22:57:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][220/1251] eta 0:39:09 lr 0.000576 time 2.4159 (2.2792) loss 3.7902 (3.6303) grad_norm 1.3658 (1.4155) [2022-01-21 22:58:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][230/1251] eta 0:38:43 lr 0.000576 time 2.4119 (2.2760) loss 4.2557 (3.6324) grad_norm 1.3749 (1.4140) [2022-01-21 22:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][240/1251] eta 0:38:18 lr 0.000576 time 2.6801 (2.2738) loss 2.9652 (3.6255) grad_norm 1.1822 (1.4151) [2022-01-21 22:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][250/1251] eta 0:37:53 lr 0.000576 time 2.1570 (2.2714) loss 4.1278 (3.6248) grad_norm 1.6244 (1.4162) [2022-01-21 22:59:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][260/1251] eta 0:37:25 lr 0.000576 time 2.1142 (2.2661) loss 3.5019 (3.6213) grad_norm 1.4926 (1.4142) [2022-01-21 22:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][270/1251] eta 0:37:04 lr 0.000576 time 2.4675 (2.2672) loss 2.4910 (3.6259) grad_norm 1.4118 (1.4118) [2022-01-21 22:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][280/1251] eta 0:36:39 lr 0.000576 time 2.2972 (2.2652) loss 3.2989 (3.6205) grad_norm 1.1380 (1.4095) [2022-01-21 23:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][290/1251] eta 0:36:17 lr 0.000576 time 2.4810 (2.2656) loss 3.4413 (3.6249) grad_norm 1.4690 (1.4089) [2022-01-21 23:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][300/1251] eta 0:35:50 lr 0.000576 time 1.7144 (2.2610) loss 3.7139 (3.6243) grad_norm 1.3447 (1.4087) [2022-01-21 23:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][310/1251] eta 0:35:21 lr 0.000576 time 2.6273 (2.2547) loss 4.1630 (3.6297) grad_norm 1.4551 (1.4121) [2022-01-21 23:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][320/1251] eta 0:34:57 lr 0.000576 time 2.5351 (2.2533) loss 4.0620 (3.6298) grad_norm 1.3071 (1.4141) [2022-01-21 23:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][330/1251] eta 0:34:33 lr 
0.000576 time 2.1828 (2.2510) loss 3.5701 (3.6234) grad_norm 1.5985 (1.4141) [2022-01-21 23:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][340/1251] eta 0:34:04 lr 0.000576 time 1.9118 (2.2437) loss 3.5541 (3.6263) grad_norm 1.5975 (1.4139) [2022-01-21 23:02:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][350/1251] eta 0:33:38 lr 0.000576 time 2.1885 (2.2401) loss 3.7993 (3.6242) grad_norm 1.4040 (1.4147) [2022-01-21 23:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][360/1251] eta 0:33:17 lr 0.000576 time 2.5185 (2.2420) loss 4.1162 (3.6290) grad_norm 1.3111 (1.4155) [2022-01-21 23:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][370/1251] eta 0:32:54 lr 0.000576 time 2.1984 (2.2411) loss 3.6994 (3.6265) grad_norm 1.5404 (1.4158) [2022-01-21 23:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][380/1251] eta 0:32:31 lr 0.000576 time 1.9109 (2.2400) loss 3.5661 (3.6249) grad_norm 1.5810 (1.4192) [2022-01-21 23:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][390/1251] eta 0:32:08 lr 0.000576 time 2.2302 (2.2395) loss 3.6495 (3.6248) grad_norm 1.5664 (1.4202) [2022-01-21 23:04:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][400/1251] eta 0:31:47 lr 0.000576 time 3.1552 (2.2409) loss 2.5611 (3.6175) grad_norm 1.6527 (1.4211) [2022-01-21 23:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][410/1251] eta 0:31:22 lr 0.000576 time 1.8922 (2.2389) loss 3.6366 (3.6134) grad_norm 1.4978 (1.4208) [2022-01-21 23:05:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][420/1251] eta 0:30:56 lr 0.000576 time 1.9421 (2.2344) loss 4.2721 (3.6179) grad_norm 1.3159 (1.4210) [2022-01-21 23:05:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][430/1251] eta 0:30:28 lr 0.000576 time 1.5147 (2.2278) loss 2.9257 (3.6178) grad_norm 1.4812 (1.4208) [2022-01-21 23:05:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][440/1251] eta 0:30:02 lr 0.000576 time 1.9061 (2.2222) loss 4.6539 (3.6151) grad_norm 1.4101 (1.4199) [2022-01-21 23:06:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][450/1251] eta 0:29:38 lr 0.000575 time 2.1055 (2.2205) loss 2.5275 (3.6132) grad_norm 1.4912 (1.4203) [2022-01-21 23:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][460/1251] eta 0:29:16 lr 0.000575 time 1.7908 (2.2201) loss 4.5446 (3.6145) grad_norm 1.4085 (1.4199) [2022-01-21 23:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][470/1251] eta 0:28:53 lr 0.000575 time 1.4619 (2.2199) loss 4.0924 (3.6197) grad_norm 1.3426 (1.4180) [2022-01-21 23:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][480/1251] eta 0:28:32 lr 0.000575 time 2.5062 (2.2210) loss 2.4241 (3.6124) grad_norm 1.1840 (1.4172) [2022-01-21 23:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][490/1251] eta 0:28:11 lr 0.000575 time 3.0493 (2.2231) loss 3.5357 (3.6097) grad_norm 1.4914 (1.4173) [2022-01-21 23:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][500/1251] eta 0:27:48 lr 0.000575 time 1.7952 (2.2222) loss 4.2025 (3.6138) grad_norm 1.4493 (1.4185) [2022-01-21 23:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][510/1251] eta 0:27:28 lr 0.000575 time 1.6543 (2.2243) loss 3.5884 (3.6130) grad_norm 1.3527 (1.4195) [2022-01-21 23:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][520/1251] eta 0:27:04 lr 0.000575 time 1.7972 (2.2218) loss 2.6310 (3.6123) grad_norm 1.5147 (1.4215) [2022-01-21 23:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][530/1251] eta 0:26:41 lr 0.000575 time 2.1199 (2.2210) loss 4.3247 (3.6140) grad_norm 1.3413 (1.4215) [2022-01-21 23:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][540/1251] eta 0:26:17 lr 
0.000575 time 2.1182 (2.2188) loss 4.3250 (3.6159) grad_norm 1.3449 (1.4244) [2022-01-21 23:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][550/1251] eta 0:25:53 lr 0.000575 time 1.8055 (2.2165) loss 2.9990 (3.6204) grad_norm 1.3255 (1.4239) [2022-01-21 23:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][560/1251] eta 0:25:33 lr 0.000575 time 2.8419 (2.2186) loss 2.7846 (3.6231) grad_norm 1.4022 (1.4239) [2022-01-21 23:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][570/1251] eta 0:25:11 lr 0.000575 time 1.8791 (2.2188) loss 2.9716 (3.6223) grad_norm 1.5446 (1.4247) [2022-01-21 23:10:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][580/1251] eta 0:24:48 lr 0.000575 time 2.5763 (2.2188) loss 3.3459 (3.6223) grad_norm 1.2719 (1.4231) [2022-01-21 23:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][590/1251] eta 0:24:26 lr 0.000575 time 2.8963 (2.2182) loss 3.7563 (3.6224) grad_norm 1.4403 (1.4232) [2022-01-21 23:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][600/1251] eta 0:24:03 lr 0.000575 time 2.7307 (2.2166) loss 3.8095 (3.6226) grad_norm 1.3927 (1.4231) [2022-01-21 23:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][610/1251] eta 0:23:39 lr 0.000575 time 1.8905 (2.2151) loss 3.6178 (3.6208) grad_norm 1.5614 (1.4231) [2022-01-21 23:12:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][620/1251] eta 0:23:17 lr 0.000575 time 2.2813 (2.2140) loss 2.3035 (3.6168) grad_norm 1.5769 (1.4229) [2022-01-21 23:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][630/1251] eta 0:22:53 lr 0.000575 time 2.8337 (2.2121) loss 2.9951 (3.6204) grad_norm 1.4471 (1.4220) [2022-01-21 23:13:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][640/1251] eta 0:22:32 lr 0.000575 time 2.1849 (2.2134) loss 3.9344 (3.6214) grad_norm 1.5785 (1.4214) [2022-01-21 23:13:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][650/1251] eta 0:22:10 lr 0.000575 time 1.9688 (2.2137) loss 2.4187 (3.6220) grad_norm 1.4760 (1.4208) [2022-01-21 23:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][660/1251] eta 0:21:47 lr 0.000575 time 2.0018 (2.2119) loss 4.0918 (3.6212) grad_norm 1.3044 (1.4227) [2022-01-21 23:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][670/1251] eta 0:21:24 lr 0.000575 time 2.5674 (2.2110) loss 3.5743 (3.6188) grad_norm 1.2984 (1.4231) [2022-01-21 23:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][680/1251] eta 0:21:01 lr 0.000575 time 1.7067 (2.2090) loss 4.2702 (3.6218) grad_norm 1.6257 (1.4243) [2022-01-21 23:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][690/1251] eta 0:20:39 lr 0.000574 time 2.2288 (2.2087) loss 4.1522 (3.6226) grad_norm 1.2279 (1.4230) [2022-01-21 23:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][700/1251] eta 0:20:16 lr 0.000574 time 1.8030 (2.2078) loss 4.3655 (3.6219) grad_norm 1.4587 (1.4220) [2022-01-21 23:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][710/1251] eta 0:19:55 lr 0.000574 time 3.6984 (2.2097) loss 3.9765 (3.6204) grad_norm 1.2157 (1.4226) [2022-01-21 23:15:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][720/1251] eta 0:19:32 lr 0.000574 time 1.5740 (2.2082) loss 3.8957 (3.6226) grad_norm 1.5162 (1.4226) [2022-01-21 23:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][730/1251] eta 0:19:09 lr 0.000574 time 2.3339 (2.2073) loss 3.7883 (3.6209) grad_norm 1.6294 (1.4242) [2022-01-21 23:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][740/1251] eta 0:18:47 lr 0.000574 time 1.9234 (2.2057) loss 3.5289 (3.6221) grad_norm 1.5047 (1.4251) [2022-01-21 23:16:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][750/1251] eta 0:18:25 lr 
0.000574 time 3.3908 (2.2069) loss 4.0226 (3.6215) grad_norm 1.4794 (1.4256) [2022-01-21 23:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][760/1251] eta 0:18:02 lr 0.000574 time 2.4615 (2.2049) loss 3.9202 (3.6236) grad_norm 1.2836 (1.4258) [2022-01-21 23:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][770/1251] eta 0:17:40 lr 0.000574 time 2.4545 (2.2045) loss 3.7729 (3.6227) grad_norm 1.5691 (1.4267) [2022-01-21 23:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][780/1251] eta 0:17:18 lr 0.000574 time 2.6551 (2.2054) loss 3.8014 (3.6233) grad_norm 1.5809 (1.4265) [2022-01-21 23:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][790/1251] eta 0:16:57 lr 0.000574 time 2.9677 (2.2062) loss 3.8850 (3.6220) grad_norm 1.1906 (1.4259) [2022-01-21 23:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][800/1251] eta 0:16:35 lr 0.000574 time 1.8773 (2.2064) loss 4.0386 (3.6221) grad_norm 1.5645 (1.4250) [2022-01-21 23:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][810/1251] eta 0:16:13 lr 0.000574 time 2.7730 (2.2068) loss 3.7537 (3.6186) grad_norm 1.3219 (1.4250) [2022-01-21 23:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][820/1251] eta 0:15:51 lr 0.000574 time 2.5232 (2.2067) loss 4.4600 (3.6196) grad_norm 1.6981 (1.4263) [2022-01-21 23:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][830/1251] eta 0:15:28 lr 0.000574 time 2.6378 (2.2060) loss 4.5379 (3.6227) grad_norm 1.3767 (1.4266) [2022-01-21 23:20:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][840/1251] eta 0:15:05 lr 0.000574 time 1.5702 (2.2043) loss 4.3181 (3.6244) grad_norm 1.3616 (1.4263) [2022-01-21 23:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][850/1251] eta 0:14:44 lr 0.000574 time 3.4826 (2.2046) loss 2.4488 (3.6228) grad_norm 1.2436 (1.4258) [2022-01-21 23:20:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][860/1251] eta 0:14:21 lr 0.000574 time 1.7357 (2.2032) loss 4.1008 (3.6220) grad_norm 1.0906 (1.4254) [2022-01-21 23:21:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][870/1251] eta 0:13:59 lr 0.000574 time 2.0864 (2.2028) loss 2.9786 (3.6197) grad_norm 1.2845 (1.4244) [2022-01-21 23:21:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][880/1251] eta 0:13:37 lr 0.000574 time 1.8537 (2.2035) loss 3.9207 (3.6180) grad_norm 1.4789 (1.4241) [2022-01-21 23:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][890/1251] eta 0:13:15 lr 0.000574 time 3.3703 (2.2046) loss 4.2312 (3.6196) grad_norm 1.6052 (1.4245) [2022-01-21 23:22:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][900/1251] eta 0:12:53 lr 0.000574 time 1.8120 (2.2034) loss 3.8400 (3.6172) grad_norm 1.3353 (1.4250) [2022-01-21 23:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][910/1251] eta 0:12:31 lr 0.000574 time 2.8546 (2.2045) loss 2.5143 (3.6150) grad_norm 1.2585 (1.4240) [2022-01-21 23:23:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][920/1251] eta 0:12:09 lr 0.000574 time 2.0087 (2.2025) loss 3.7395 (3.6138) grad_norm 1.9194 (1.4242) [2022-01-21 23:23:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][930/1251] eta 0:11:46 lr 0.000573 time 2.8300 (2.2021) loss 3.6330 (3.6134) grad_norm 1.5552 (1.4249) [2022-01-21 23:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][940/1251] eta 0:11:24 lr 0.000573 time 2.1524 (2.2001) loss 3.6512 (3.6113) grad_norm 1.4787 (1.4267) [2022-01-21 23:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][950/1251] eta 0:11:02 lr 0.000573 time 2.4900 (2.2008) loss 2.4426 (3.6096) grad_norm 1.2614 (1.4267) [2022-01-21 23:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][960/1251] eta 0:10:40 lr 
0.000573 time 2.1326 (2.2018) loss 2.7896 (3.6096) grad_norm 1.2223 (1.4260) [2022-01-21 23:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][970/1251] eta 0:10:19 lr 0.000573 time 2.9469 (2.2033) loss 4.4469 (3.6117) grad_norm 1.3349 (1.4252) [2022-01-21 23:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][980/1251] eta 0:09:57 lr 0.000573 time 2.1776 (2.2036) loss 3.4336 (3.6100) grad_norm 1.2013 (1.4253) [2022-01-21 23:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][990/1251] eta 0:09:35 lr 0.000573 time 2.7764 (2.2040) loss 3.8017 (3.6085) grad_norm 1.2939 (1.4248) [2022-01-21 23:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1000/1251] eta 0:09:12 lr 0.000573 time 1.7782 (2.2026) loss 3.8199 (3.6083) grad_norm 1.5433 (1.4244) [2022-01-21 23:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1010/1251] eta 0:08:50 lr 0.000573 time 2.5383 (2.2016) loss 3.8530 (3.6072) grad_norm 1.3005 (1.4246) [2022-01-21 23:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1020/1251] eta 0:08:28 lr 0.000573 time 2.0241 (2.2015) loss 3.0635 (3.6067) grad_norm 1.4284 (1.4249) [2022-01-21 23:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1030/1251] eta 0:08:06 lr 0.000573 time 2.4720 (2.2023) loss 4.4522 (3.6080) grad_norm 1.6484 (1.4242) [2022-01-21 23:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1040/1251] eta 0:07:44 lr 0.000573 time 2.8887 (2.2020) loss 3.3267 (3.6095) grad_norm 2.0444 (1.4243) [2022-01-21 23:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1050/1251] eta 0:07:22 lr 0.000573 time 2.0241 (2.2009) loss 2.9246 (3.6069) grad_norm 1.4637 (1.4244) [2022-01-21 23:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1060/1251] eta 0:07:00 lr 0.000573 time 2.2838 (2.2001) loss 4.4448 (3.6101) grad_norm 1.5802 (1.4250) [2022-01-21 
23:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1070/1251] eta 0:06:38 lr 0.000573 time 1.8860 (2.1993) loss 3.1760 (3.6066) grad_norm 1.3618 (1.4245) [2022-01-21 23:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1080/1251] eta 0:06:16 lr 0.000573 time 2.8006 (2.1990) loss 3.1301 (3.6077) grad_norm 1.6620 (1.4247) [2022-01-21 23:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1090/1251] eta 0:05:54 lr 0.000573 time 2.4110 (2.1992) loss 2.7284 (3.6069) grad_norm 1.5627 (1.4246) [2022-01-21 23:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1100/1251] eta 0:05:31 lr 0.000573 time 2.1681 (2.1984) loss 3.6986 (3.6058) grad_norm 1.3141 (1.4246) [2022-01-21 23:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1110/1251] eta 0:05:10 lr 0.000573 time 1.9154 (2.1997) loss 3.9516 (3.6048) grad_norm 1.1876 (1.4241) [2022-01-21 23:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1120/1251] eta 0:04:48 lr 0.000573 time 3.5739 (2.2018) loss 3.8034 (3.6061) grad_norm 1.6827 (1.4238) [2022-01-21 23:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1130/1251] eta 0:04:26 lr 0.000573 time 1.6852 (2.2016) loss 3.6982 (3.6071) grad_norm 1.1818 (1.4241) [2022-01-21 23:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1140/1251] eta 0:04:04 lr 0.000573 time 1.8755 (2.2010) loss 3.7773 (3.6070) grad_norm 1.3432 (1.4237) [2022-01-21 23:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1150/1251] eta 0:03:42 lr 0.000573 time 1.7640 (2.1994) loss 3.5904 (3.6088) grad_norm 1.3775 (1.4236) [2022-01-21 23:31:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1160/1251] eta 0:03:20 lr 0.000573 time 3.1492 (2.1982) loss 3.7492 (3.6084) grad_norm 1.4796 (1.4241) [2022-01-21 23:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1170/1251] 
eta 0:02:58 lr 0.000573 time 1.8840 (2.1981) loss 3.9354 (3.6094) grad_norm 1.1552 (1.4236) [2022-01-21 23:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1180/1251] eta 0:02:36 lr 0.000572 time 2.2249 (2.1986) loss 3.8159 (3.6092) grad_norm 1.2431 (1.4232) [2022-01-21 23:32:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1190/1251] eta 0:02:14 lr 0.000572 time 2.2318 (2.1985) loss 3.9712 (3.6080) grad_norm 1.3529 (1.4229) [2022-01-21 23:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1200/1251] eta 0:01:52 lr 0.000572 time 2.0444 (2.1970) loss 3.7732 (3.6072) grad_norm 1.3975 (1.4224) [2022-01-21 23:33:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1210/1251] eta 0:01:30 lr 0.000572 time 1.9954 (2.1966) loss 3.1919 (3.6062) grad_norm 1.4014 (1.4220) [2022-01-21 23:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1220/1251] eta 0:01:08 lr 0.000572 time 1.9168 (2.1959) loss 3.4261 (3.6052) grad_norm 1.2748 (1.4219) [2022-01-21 23:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1230/1251] eta 0:00:46 lr 0.000572 time 2.2762 (2.1964) loss 4.0619 (3.6074) grad_norm 1.4103 (1.4217) [2022-01-21 23:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1240/1251] eta 0:00:24 lr 0.000572 time 1.7504 (2.1958) loss 4.0968 (3.6095) grad_norm 1.7808 (1.4224) [2022-01-21 23:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [136/300][1250/1251] eta 0:00:02 lr 0.000572 time 1.1292 (2.1905) loss 4.1155 (3.6092) grad_norm 1.6111 (1.4225) [2022-01-21 23:35:02 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 136 training takes 0:45:40 [2022-01-21 23:35:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.336 (18.336) Loss 1.1019 (1.1019) Acc@1 74.902 (74.902) Acc@5 92.480 (92.480) [2022-01-21 23:35:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.646 (3.193) 
Loss 1.1174 (1.0962) Acc@1 75.391 (74.476) Acc@5 91.504 (92.702) [2022-01-21 23:35:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.256 (2.429) Loss 1.0695 (1.0820) Acc@1 74.707 (75.056) Acc@5 92.969 (92.829) [2022-01-21 23:36:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.355 (2.296) Loss 1.0509 (1.0762) Acc@1 77.246 (75.406) Acc@5 93.652 (92.852) [2022-01-21 23:36:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.566 (2.168) Loss 1.0818 (1.0761) Acc@1 74.805 (75.310) Acc@5 93.066 (92.878) [2022-01-21 23:36:37 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.432 Acc@5 92.958 [2022-01-21 23:36:37 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-01-21 23:36:37 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.43% [2022-01-21 23:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][0/1251] eta 7:24:25 lr 0.000572 time 21.3151 (21.3151) loss 3.3318 (3.3318) grad_norm 1.4423 (1.4423) [2022-01-21 23:37:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][10/1251] eta 1:24:19 lr 0.000572 time 3.0540 (4.0772) loss 3.2084 (3.7473) grad_norm 1.4305 (1.3864) [2022-01-21 23:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][20/1251] eta 1:05:41 lr 0.000572 time 1.5437 (3.2019) loss 4.3546 (3.6094) grad_norm 1.3212 (1.4192) [2022-01-21 23:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][30/1251] eta 0:58:26 lr 0.000572 time 1.5109 (2.8715) loss 3.4531 (3.5610) grad_norm 1.5014 (1.4200) [2022-01-21 23:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][40/1251] eta 0:55:37 lr 0.000572 time 3.9455 (2.7562) loss 3.9742 (3.5801) grad_norm 1.4260 (1.4178) [2022-01-21 23:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][50/1251] eta 0:53:30 lr 0.000572 time 2.0066 (2.6731) loss 3.4929 (3.5766) 
grad_norm 1.3190 (1.4121) [2022-01-21 23:39:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][60/1251] eta 0:51:10 lr 0.000572 time 1.4289 (2.5782) loss 2.6218 (3.5500) grad_norm 1.3567 (1.4080) [2022-01-21 23:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][70/1251] eta 0:49:28 lr 0.000572 time 1.6799 (2.5136) loss 2.8879 (3.5475) grad_norm 1.2382 (1.4109) [2022-01-21 23:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][80/1251] eta 0:48:25 lr 0.000572 time 3.2553 (2.4814) loss 3.8909 (3.5586) grad_norm 1.3685 (1.4051) [2022-01-21 23:40:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][90/1251] eta 0:47:25 lr 0.000572 time 2.2619 (2.4511) loss 3.9443 (3.5790) grad_norm 1.4098 (1.4148) [2022-01-21 23:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][100/1251] eta 0:46:25 lr 0.000572 time 1.5575 (2.4201) loss 2.5714 (3.5783) grad_norm 1.6782 (1.4151) [2022-01-21 23:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][110/1251] eta 0:45:37 lr 0.000572 time 1.9060 (2.3989) loss 3.7661 (3.5711) grad_norm 1.3951 (1.4214) [2022-01-21 23:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][120/1251] eta 0:44:47 lr 0.000572 time 2.2767 (2.3760) loss 3.9650 (3.5858) grad_norm 1.8629 (1.4244) [2022-01-21 23:41:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][130/1251] eta 0:44:33 lr 0.000572 time 3.5546 (2.3853) loss 3.8473 (3.5780) grad_norm 1.5219 (1.4191) [2022-01-21 23:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][140/1251] eta 0:43:55 lr 0.000572 time 2.0428 (2.3718) loss 4.2832 (3.5855) grad_norm 1.5821 (1.4158) [2022-01-21 23:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][150/1251] eta 0:43:23 lr 0.000572 time 2.3514 (2.3651) loss 4.1885 (3.5785) grad_norm 1.4923 (1.4218) [2022-01-21 23:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[137/300][160/1251] eta 0:42:50 lr 0.000572 time 2.7310 (2.3563) loss 3.3852 (3.5717) grad_norm 1.4074 (1.4201) [2022-01-21 23:43:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][170/1251] eta 0:42:07 lr 0.000571 time 1.8599 (2.3383) loss 3.7682 (3.5725) grad_norm 1.4111 (1.4139) [2022-01-21 23:43:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][180/1251] eta 0:41:26 lr 0.000571 time 2.2107 (2.3217) loss 3.5604 (3.5712) grad_norm 1.2629 (1.4091) [2022-01-21 23:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][190/1251] eta 0:40:51 lr 0.000571 time 1.8606 (2.3105) loss 3.8294 (3.5843) grad_norm 1.3192 (1.4130) [2022-01-21 23:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][200/1251] eta 0:40:31 lr 0.000571 time 2.8121 (2.3135) loss 4.2353 (3.5916) grad_norm 1.5120 (1.4114) [2022-01-21 23:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][210/1251] eta 0:40:00 lr 0.000571 time 1.6409 (2.3061) loss 4.1250 (3.5926) grad_norm 1.2476 (1.4080) [2022-01-21 23:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][220/1251] eta 0:39:28 lr 0.000571 time 2.0934 (2.2970) loss 4.3781 (3.5863) grad_norm 1.8234 (1.4125) [2022-01-21 23:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][230/1251] eta 0:38:58 lr 0.000571 time 1.9209 (2.2906) loss 3.8952 (3.5815) grad_norm 1.3492 (1.4127) [2022-01-21 23:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][240/1251] eta 0:38:36 lr 0.000571 time 1.9948 (2.2909) loss 4.0346 (3.5715) grad_norm 1.3063 (1.4100) [2022-01-21 23:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][250/1251] eta 0:38:12 lr 0.000571 time 2.0864 (2.2899) loss 3.2095 (3.5691) grad_norm 1.5021 (1.4094) [2022-01-21 23:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][260/1251] eta 0:37:44 lr 0.000571 time 2.5811 (2.2848) loss 4.3534 (3.5827) grad_norm 
1.3910 (1.4099) [2022-01-21 23:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][270/1251] eta 0:37:12 lr 0.000571 time 1.6804 (2.2757) loss 4.1971 (3.5807) grad_norm 1.2535 (1.4119) [2022-01-21 23:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][280/1251] eta 0:36:47 lr 0.000571 time 2.1501 (2.2739) loss 4.1540 (3.5810) grad_norm 1.3472 (1.4078) [2022-01-21 23:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][290/1251] eta 0:36:19 lr 0.000571 time 2.5007 (2.2678) loss 3.9012 (3.5881) grad_norm 1.3403 (1.4093) [2022-01-21 23:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][300/1251] eta 0:35:56 lr 0.000571 time 3.0289 (2.2672) loss 3.1449 (3.5783) grad_norm 1.2442 (1.4097) [2022-01-21 23:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][310/1251] eta 0:35:26 lr 0.000571 time 1.9679 (2.2593) loss 3.2485 (3.5755) grad_norm 1.3986 (1.4100) [2022-01-21 23:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][320/1251] eta 0:35:01 lr 0.000571 time 2.6511 (2.2570) loss 3.6509 (3.5673) grad_norm 1.5493 (1.4128) [2022-01-21 23:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][330/1251] eta 0:34:39 lr 0.000571 time 2.2186 (2.2578) loss 2.5675 (3.5646) grad_norm 1.2817 (1.4139) [2022-01-21 23:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][340/1251] eta 0:34:15 lr 0.000571 time 2.4546 (2.2559) loss 3.5532 (3.5604) grad_norm 1.3156 (1.4111) [2022-01-21 23:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][350/1251] eta 0:33:50 lr 0.000571 time 1.8812 (2.2539) loss 2.8443 (3.5644) grad_norm 1.5768 (1.4102) [2022-01-21 23:50:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][360/1251] eta 0:33:24 lr 0.000571 time 2.1179 (2.2494) loss 3.3251 (3.5629) grad_norm 1.3375 (1.4093) [2022-01-21 23:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[137/300][370/1251] eta 0:32:59 lr 0.000571 time 1.7957 (2.2464) loss 3.6586 (3.5629) grad_norm 1.4825 (1.4102) [2022-01-21 23:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][380/1251] eta 0:32:46 lr 0.000571 time 5.4474 (2.2580) loss 4.2468 (3.5634) grad_norm 1.3910 (1.4112) [2022-01-21 23:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][390/1251] eta 0:32:22 lr 0.000571 time 2.1331 (2.2566) loss 2.7680 (3.5592) grad_norm 1.6384 (1.4138) [2022-01-21 23:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][400/1251] eta 0:31:57 lr 0.000571 time 1.8536 (2.2529) loss 4.1219 (3.5696) grad_norm 1.4430 (1.4141) [2022-01-21 23:52:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][410/1251] eta 0:31:31 lr 0.000570 time 1.7490 (2.2485) loss 3.7560 (3.5666) grad_norm 1.5347 (1.4130) [2022-01-21 23:52:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][420/1251] eta 0:31:06 lr 0.000570 time 3.4003 (2.2463) loss 3.4139 (3.5661) grad_norm 1.4102 (1.4131) [2022-01-21 23:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][430/1251] eta 0:30:41 lr 0.000570 time 2.0198 (2.2431) loss 2.8834 (3.5642) grad_norm 1.5005 (1.4132) [2022-01-21 23:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][440/1251] eta 0:30:18 lr 0.000570 time 2.2024 (2.2420) loss 2.7996 (3.5596) grad_norm 1.3620 (1.4140) [2022-01-21 23:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][450/1251] eta 0:29:56 lr 0.000570 time 2.2224 (2.2424) loss 2.7900 (3.5592) grad_norm 1.2463 (1.4138) [2022-01-21 23:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][460/1251] eta 0:29:34 lr 0.000570 time 2.7571 (2.2429) loss 3.0731 (3.5568) grad_norm 1.4447 (1.4145) [2022-01-21 23:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][470/1251] eta 0:29:08 lr 0.000570 time 1.9457 (2.2388) loss 3.9132 (3.5578) grad_norm 
1.6709 (1.4153) [2022-01-21 23:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][480/1251] eta 0:28:43 lr 0.000570 time 1.9175 (2.2358) loss 3.5860 (3.5546) grad_norm 1.3919 (1.4151) [2022-01-21 23:54:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][490/1251] eta 0:28:19 lr 0.000570 time 2.0038 (2.2334) loss 3.9289 (3.5614) grad_norm 1.7867 (1.4156) [2022-01-21 23:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][500/1251] eta 0:28:00 lr 0.000570 time 3.8144 (2.2378) loss 2.8718 (3.5630) grad_norm 1.2250 (1.4159) [2022-01-21 23:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][510/1251] eta 0:27:37 lr 0.000570 time 1.8828 (2.2368) loss 2.4119 (3.5629) grad_norm 1.3161 (1.4152) [2022-01-21 23:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][520/1251] eta 0:27:13 lr 0.000570 time 2.4274 (2.2344) loss 3.5855 (3.5570) grad_norm 1.4403 (1.4145) [2022-01-21 23:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][530/1251] eta 0:26:50 lr 0.000570 time 1.5922 (2.2332) loss 2.9066 (3.5571) grad_norm 1.3246 (1.4129) [2022-01-21 23:56:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][540/1251] eta 0:26:27 lr 0.000570 time 3.6476 (2.2328) loss 2.5119 (3.5541) grad_norm 1.3652 (1.4142) [2022-01-21 23:57:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][550/1251] eta 0:26:03 lr 0.000570 time 2.4929 (2.2303) loss 4.1026 (3.5527) grad_norm 1.7730 (1.4183) [2022-01-21 23:57:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][560/1251] eta 0:25:40 lr 0.000570 time 1.8883 (2.2299) loss 4.1044 (3.5543) grad_norm 1.6248 (1.4203) [2022-01-21 23:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][570/1251] eta 0:25:18 lr 0.000570 time 2.2823 (2.2291) loss 3.3955 (3.5485) grad_norm 1.6851 (1.4212) [2022-01-21 23:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[137/300][580/1251] eta 0:24:56 lr 0.000570 time 2.6096 (2.2302) loss 3.2386 (3.5505) grad_norm 1.3234 (1.4215) [2022-01-21 23:58:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][590/1251] eta 0:24:33 lr 0.000570 time 2.4201 (2.2288) loss 4.1045 (3.5513) grad_norm 1.3297 (1.4214) [2022-01-21 23:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][600/1251] eta 0:24:09 lr 0.000570 time 2.1187 (2.2272) loss 3.8113 (3.5514) grad_norm 1.2843 (1.4234) [2022-01-21 23:59:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][610/1251] eta 0:23:46 lr 0.000570 time 2.2904 (2.2259) loss 4.3350 (3.5580) grad_norm 1.5147 (1.4240) [2022-01-21 23:59:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][620/1251] eta 0:23:24 lr 0.000570 time 2.5590 (2.2252) loss 3.9388 (3.5616) grad_norm 1.4676 (1.4241) [2022-01-22 00:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][630/1251] eta 0:23:02 lr 0.000570 time 2.7678 (2.2258) loss 2.4261 (3.5605) grad_norm 1.2980 (1.4227) [2022-01-22 00:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][640/1251] eta 0:22:38 lr 0.000570 time 2.1628 (2.2236) loss 3.3156 (3.5589) grad_norm 1.5539 (1.4226) [2022-01-22 00:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][650/1251] eta 0:22:16 lr 0.000570 time 2.4526 (2.2238) loss 3.3924 (3.5606) grad_norm 1.7067 (1.4231) [2022-01-22 00:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][660/1251] eta 0:21:54 lr 0.000569 time 2.9362 (2.2247) loss 3.6013 (3.5608) grad_norm 1.2844 (1.4223) [2022-01-22 00:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][670/1251] eta 0:21:32 lr 0.000569 time 2.7809 (2.2243) loss 2.8023 (3.5623) grad_norm 1.1437 (1.4223) [2022-01-22 00:01:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][680/1251] eta 0:21:10 lr 0.000569 time 2.8413 (2.2249) loss 3.9763 (3.5611) grad_norm 
1.5095 (1.4224) [2022-01-22 00:02:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][690/1251] eta 0:20:47 lr 0.000569 time 1.7728 (2.2243) loss 2.9482 (3.5605) grad_norm 1.3257 (1.4220) [2022-01-22 00:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][700/1251] eta 0:20:26 lr 0.000569 time 3.5160 (2.2260) loss 2.8594 (3.5613) grad_norm 1.4774 (1.4230) [2022-01-22 00:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][710/1251] eta 0:20:02 lr 0.000569 time 1.8434 (2.2228) loss 3.0186 (3.5564) grad_norm 1.5066 (1.4243) [2022-01-22 00:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][720/1251] eta 0:19:38 lr 0.000569 time 1.9347 (2.2186) loss 3.8473 (3.5560) grad_norm 1.5913 (1.4235) [2022-01-22 00:03:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][730/1251] eta 0:19:14 lr 0.000569 time 1.8823 (2.2157) loss 2.6471 (3.5561) grad_norm 1.3980 (1.4237) [2022-01-22 00:03:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][740/1251] eta 0:18:51 lr 0.000569 time 2.5809 (2.2142) loss 3.1676 (3.5554) grad_norm 1.3136 (1.4229) [2022-01-22 00:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][750/1251] eta 0:18:28 lr 0.000569 time 1.7801 (2.2120) loss 2.8513 (3.5545) grad_norm 1.4564 (1.4250) [2022-01-22 00:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][760/1251] eta 0:18:05 lr 0.000569 time 2.6075 (2.2108) loss 2.8811 (3.5551) grad_norm 1.3731 (1.4255) [2022-01-22 00:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][770/1251] eta 0:17:43 lr 0.000569 time 2.3398 (2.2115) loss 2.8563 (3.5539) grad_norm 1.4315 (1.4250) [2022-01-22 00:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][780/1251] eta 0:17:21 lr 0.000569 time 2.1606 (2.2115) loss 3.9364 (3.5528) grad_norm 1.4161 (1.4249) [2022-01-22 00:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[137/300][790/1251] eta 0:17:00 lr 0.000569 time 2.5435 (2.2130) loss 2.3961 (3.5531) grad_norm 1.5229 (1.4254) [2022-01-22 00:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][800/1251] eta 0:16:38 lr 0.000569 time 2.5233 (2.2150) loss 3.4907 (3.5574) grad_norm 1.3242 (1.4253) [2022-01-22 00:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][810/1251] eta 0:16:17 lr 0.000569 time 2.9128 (2.2172) loss 2.7046 (3.5576) grad_norm 1.4462 (1.4261) [2022-01-22 00:06:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][820/1251] eta 0:15:55 lr 0.000569 time 2.0460 (2.2168) loss 4.0586 (3.5574) grad_norm 1.3423 (1.4267) [2022-01-22 00:07:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][830/1251] eta 0:15:32 lr 0.000569 time 1.9857 (2.2149) loss 2.8679 (3.5528) grad_norm 1.3774 (1.4268) [2022-01-22 00:07:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][840/1251] eta 0:15:09 lr 0.000569 time 1.9272 (2.2120) loss 4.1579 (3.5543) grad_norm 1.3418 (1.4277) [2022-01-22 00:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][850/1251] eta 0:14:45 lr 0.000569 time 1.8853 (2.2095) loss 3.8852 (3.5553) grad_norm 1.3320 (1.4280) [2022-01-22 00:08:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][860/1251] eta 0:14:24 lr 0.000569 time 2.6077 (2.2101) loss 2.3459 (3.5575) grad_norm 1.5908 (1.4280) [2022-01-22 00:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][870/1251] eta 0:14:02 lr 0.000569 time 1.8642 (2.2115) loss 3.5899 (3.5569) grad_norm 1.3809 (1.4265) [2022-01-22 00:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][880/1251] eta 0:13:40 lr 0.000569 time 2.2214 (2.2116) loss 3.9436 (3.5599) grad_norm 1.3777 (1.4260) [2022-01-22 00:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][890/1251] eta 0:13:18 lr 0.000569 time 1.8892 (2.2119) loss 3.3307 (3.5610) grad_norm 
1.2335 (1.4251) [2022-01-22 00:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][900/1251] eta 0:12:56 lr 0.000568 time 2.4654 (2.2124) loss 3.6788 (3.5596) grad_norm 1.2209 (1.4245) [2022-01-22 00:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][910/1251] eta 0:12:33 lr 0.000568 time 2.2316 (2.2110) loss 3.7753 (3.5635) grad_norm 1.3034 (1.4240) [2022-01-22 00:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][920/1251] eta 0:12:11 lr 0.000568 time 2.0539 (2.2113) loss 4.3474 (3.5627) grad_norm 1.3049 (1.4242) [2022-01-22 00:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][930/1251] eta 0:11:49 lr 0.000568 time 1.9747 (2.2102) loss 4.0812 (3.5636) grad_norm 1.5464 (1.4235) [2022-01-22 00:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][940/1251] eta 0:11:27 lr 0.000568 time 2.5660 (2.2091) loss 3.6874 (3.5631) grad_norm 1.4916 (1.4236) [2022-01-22 00:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][950/1251] eta 0:11:04 lr 0.000568 time 1.9886 (2.2085) loss 3.4331 (3.5641) grad_norm 1.5086 (1.4235) [2022-01-22 00:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][960/1251] eta 0:10:42 lr 0.000568 time 1.5523 (2.2077) loss 4.0762 (3.5660) grad_norm 1.2617 (1.4234) [2022-01-22 00:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][970/1251] eta 0:10:20 lr 0.000568 time 2.4531 (2.2075) loss 2.5985 (3.5648) grad_norm 1.3838 (1.4231) [2022-01-22 00:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][980/1251] eta 0:09:58 lr 0.000568 time 4.0363 (2.2081) loss 4.0032 (3.5696) grad_norm 1.2931 (1.4229) [2022-01-22 00:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][990/1251] eta 0:09:36 lr 0.000568 time 2.2460 (2.2087) loss 3.4695 (3.5706) grad_norm 1.7205 (1.4230) [2022-01-22 00:13:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[137/300][1000/1251] eta 0:09:14 lr 0.000568 time 1.9469 (2.2085) loss 3.2785 (3.5696) grad_norm 1.2041 (1.4223) [2022-01-22 00:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1010/1251] eta 0:08:52 lr 0.000568 time 2.3820 (2.2089) loss 2.9507 (3.5673) grad_norm 1.5598 (1.4223) [2022-01-22 00:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1020/1251] eta 0:08:29 lr 0.000568 time 1.9696 (2.2077) loss 4.0513 (3.5707) grad_norm 1.3749 (1.4222) [2022-01-22 00:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1030/1251] eta 0:08:07 lr 0.000568 time 2.2408 (2.2064) loss 3.6677 (3.5724) grad_norm 1.5411 (1.4216) [2022-01-22 00:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1040/1251] eta 0:07:45 lr 0.000568 time 2.5111 (2.2052) loss 3.5148 (3.5730) grad_norm 1.4751 (1.4225) [2022-01-22 00:15:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1050/1251] eta 0:07:23 lr 0.000568 time 2.5294 (2.2042) loss 3.9673 (3.5733) grad_norm 1.6518 (1.4239) [2022-01-22 00:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1060/1251] eta 0:07:01 lr 0.000568 time 3.0731 (2.2048) loss 4.1424 (3.5755) grad_norm 1.4294 (1.4239) [2022-01-22 00:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1070/1251] eta 0:06:39 lr 0.000568 time 1.9856 (2.2052) loss 3.9927 (3.5766) grad_norm 1.3148 (1.4236) [2022-01-22 00:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1080/1251] eta 0:06:17 lr 0.000568 time 2.6762 (2.2061) loss 2.7459 (3.5748) grad_norm 1.8365 (1.4241) [2022-01-22 00:16:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1090/1251] eta 0:05:55 lr 0.000568 time 1.7838 (2.2060) loss 3.7418 (3.5783) grad_norm 1.3598 (1.4251) [2022-01-22 00:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1100/1251] eta 0:05:33 lr 0.000568 time 2.6639 (2.2074) loss 3.2069 (3.5789) 
grad_norm 1.4681 (1.4248) [2022-01-22 00:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1110/1251] eta 0:05:11 lr 0.000568 time 2.5222 (2.2076) loss 4.0279 (3.5804) grad_norm 1.4157 (1.4252) [2022-01-22 00:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1120/1251] eta 0:04:49 lr 0.000568 time 2.1016 (2.2077) loss 4.0436 (3.5822) grad_norm 1.6968 (1.4261) [2022-01-22 00:18:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1130/1251] eta 0:04:26 lr 0.000568 time 1.9336 (2.2061) loss 3.8544 (3.5819) grad_norm 1.6305 (1.4264) [2022-01-22 00:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1140/1251] eta 0:04:04 lr 0.000567 time 1.8922 (2.2031) loss 4.0339 (3.5800) grad_norm 1.2553 (1.4259) [2022-01-22 00:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1150/1251] eta 0:03:42 lr 0.000567 time 1.9525 (2.2018) loss 4.1395 (3.5797) grad_norm 1.3463 (1.4261) [2022-01-22 00:19:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1160/1251] eta 0:03:20 lr 0.000567 time 2.5251 (2.2017) loss 3.9029 (3.5786) grad_norm 1.3003 (1.4258) [2022-01-22 00:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1170/1251] eta 0:02:58 lr 0.000567 time 1.8912 (2.2008) loss 2.9525 (3.5793) grad_norm 1.5380 (1.4255) [2022-01-22 00:19:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1180/1251] eta 0:02:36 lr 0.000567 time 2.1750 (2.2007) loss 2.4511 (3.5778) grad_norm 1.4943 (1.4262) [2022-01-22 00:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1190/1251] eta 0:02:14 lr 0.000567 time 3.1401 (2.2024) loss 3.5592 (3.5761) grad_norm 1.3357 (1.4269) [2022-01-22 00:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1200/1251] eta 0:01:52 lr 0.000567 time 2.1072 (2.2035) loss 4.3342 (3.5768) grad_norm 1.3706 (1.4267) [2022-01-22 00:21:08 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [137/300][1210/1251] eta 0:01:30 lr 0.000567 time 2.1539 (2.2050) loss 3.6075 (3.5758) grad_norm 1.3925 (1.4264) [2022-01-22 00:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1220/1251] eta 0:01:08 lr 0.000567 time 1.7439 (2.2062) loss 4.1785 (3.5772) grad_norm 1.3636 (1.4263) [2022-01-22 00:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1230/1251] eta 0:00:46 lr 0.000567 time 2.6076 (2.2059) loss 4.1064 (3.5787) grad_norm 1.4925 (1.4268) [2022-01-22 00:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1240/1251] eta 0:00:24 lr 0.000567 time 1.5426 (2.2028) loss 3.8611 (3.5775) grad_norm 1.5089 (1.4267) [2022-01-22 00:22:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [137/300][1250/1251] eta 0:00:02 lr 0.000567 time 1.1997 (2.1970) loss 3.1402 (3.5761) grad_norm 1.8720 (1.4268) [2022-01-22 00:22:26 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 137 training takes 0:45:48 [2022-01-22 00:22:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.670 (18.670) Loss 1.0270 (1.0270) Acc@1 75.195 (75.195) Acc@5 93.848 (93.848) [2022-01-22 00:23:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.975 (3.359) Loss 1.0444 (1.0283) Acc@1 75.879 (75.479) Acc@5 92.188 (93.164) [2022-01-22 00:23:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.601 (2.541) Loss 1.0604 (1.0282) Acc@1 74.609 (75.632) Acc@5 92.969 (93.206) [2022-01-22 00:23:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.849 (2.254) Loss 1.0596 (1.0425) Acc@1 75.391 (75.337) Acc@5 91.602 (92.956) [2022-01-22 00:23:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.946 (2.184) Loss 1.0013 (1.0393) Acc@1 75.488 (75.403) Acc@5 93.164 (92.993) [2022-01-22 00:24:03 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.318 Acc@5 93.048 [2022-01-22 00:24:03 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 75.3% [2022-01-22 00:24:03 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.43% [2022-01-22 00:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][0/1251] eta 7:32:02 lr 0.000567 time 21.6808 (21.6808) loss 3.8658 (3.8658) grad_norm 1.5098 (1.5098) [2022-01-22 00:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][10/1251] eta 1:29:10 lr 0.000567 time 1.7908 (4.3116) loss 3.1264 (3.5643) grad_norm 1.3959 (1.5015) [2022-01-22 00:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][20/1251] eta 1:08:27 lr 0.000567 time 2.0767 (3.3370) loss 4.1305 (3.6212) grad_norm 1.5362 (1.4739) [2022-01-22 00:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][30/1251] eta 0:59:53 lr 0.000567 time 1.6335 (2.9428) loss 2.7547 (3.5089) grad_norm 1.2943 (1.4424) [2022-01-22 00:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][40/1251] eta 0:56:29 lr 0.000567 time 3.7836 (2.7988) loss 4.0358 (3.5215) grad_norm 1.3335 (1.4458) [2022-01-22 00:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][50/1251] eta 0:53:58 lr 0.000567 time 1.8150 (2.6961) loss 3.3894 (3.5172) grad_norm 1.2893 (1.4282) [2022-01-22 00:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][60/1251] eta 0:51:51 lr 0.000567 time 1.6523 (2.6126) loss 3.0747 (3.5560) grad_norm 1.3866 (1.4220) [2022-01-22 00:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][70/1251] eta 0:49:39 lr 0.000567 time 1.5861 (2.5225) loss 4.2493 (3.5797) grad_norm 1.6909 (1.4323) [2022-01-22 00:27:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][80/1251] eta 0:48:06 lr 0.000567 time 2.2177 (2.4649) loss 3.8569 (3.5858) grad_norm 1.4083 (1.4242) [2022-01-22 00:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][90/1251] eta 0:47:11 lr 0.000567 time 
2.4518 (2.4390) loss 3.8352 (3.5740) grad_norm 1.5840 (1.4375) [2022-01-22 00:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][100/1251] eta 0:46:00 lr 0.000567 time 1.8574 (2.3981) loss 2.4825 (3.5917) grad_norm 1.3286 (1.4375) [2022-01-22 00:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][110/1251] eta 0:45:22 lr 0.000567 time 1.8250 (2.3860) loss 4.0619 (3.6153) grad_norm 1.4395 (1.4340) [2022-01-22 00:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][120/1251] eta 0:44:33 lr 0.000567 time 2.0049 (2.3635) loss 2.8013 (3.6007) grad_norm 1.4978 (1.4403) [2022-01-22 00:29:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][130/1251] eta 0:43:59 lr 0.000567 time 2.4928 (2.3543) loss 3.4396 (3.6105) grad_norm 1.5415 (1.4387) [2022-01-22 00:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][140/1251] eta 0:43:13 lr 0.000566 time 1.8715 (2.3340) loss 4.1166 (3.6144) grad_norm 1.8152 (1.4426) [2022-01-22 00:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][150/1251] eta 0:42:42 lr 0.000566 time 1.8834 (2.3274) loss 3.9046 (3.6197) grad_norm 1.3827 (1.4479) [2022-01-22 00:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][160/1251] eta 0:42:17 lr 0.000566 time 2.3132 (2.3256) loss 3.8693 (3.6337) grad_norm 1.4014 (1.4471) [2022-01-22 00:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][170/1251] eta 0:41:49 lr 0.000566 time 2.2627 (2.3218) loss 2.9953 (3.6279) grad_norm 1.5766 (1.4455) [2022-01-22 00:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][180/1251] eta 0:41:02 lr 0.000566 time 1.8815 (2.2996) loss 4.4972 (3.6276) grad_norm 1.7330 (1.4424) [2022-01-22 00:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][190/1251] eta 0:40:17 lr 0.000566 time 1.9414 (2.2785) loss 3.9048 (3.6261) grad_norm 1.3872 (1.4424) [2022-01-22 00:31:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][200/1251] eta 0:39:39 lr 0.000566 time 1.9034 (2.2644) loss 3.7627 (3.6272) grad_norm 1.3937 (1.4431) [2022-01-22 00:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][210/1251] eta 0:39:11 lr 0.000566 time 2.0821 (2.2584) loss 3.9103 (3.6337) grad_norm 1.6221 (1.4404) [2022-01-22 00:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][220/1251] eta 0:38:38 lr 0.000566 time 2.2449 (2.2491) loss 2.3922 (3.6365) grad_norm 1.5504 (1.4414) [2022-01-22 00:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][230/1251] eta 0:38:15 lr 0.000566 time 1.9511 (2.2481) loss 3.5565 (3.6279) grad_norm 1.3662 (1.4451) [2022-01-22 00:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][240/1251] eta 0:38:07 lr 0.000566 time 2.2178 (2.2628) loss 3.3682 (3.6278) grad_norm 1.3168 (1.4446) [2022-01-22 00:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][250/1251] eta 0:37:48 lr 0.000566 time 1.8070 (2.2659) loss 3.0596 (3.6139) grad_norm 1.3755 (1.4428) [2022-01-22 00:33:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][260/1251] eta 0:37:28 lr 0.000566 time 2.1583 (2.2687) loss 3.9027 (3.6221) grad_norm 1.2268 (1.4393) [2022-01-22 00:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][270/1251] eta 0:37:04 lr 0.000566 time 2.1565 (2.2676) loss 3.6935 (3.6291) grad_norm 1.2283 (1.4348) [2022-01-22 00:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][280/1251] eta 0:36:36 lr 0.000566 time 2.6743 (2.2626) loss 4.1790 (3.6270) grad_norm 1.3614 (1.4330) [2022-01-22 00:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][290/1251] eta 0:36:10 lr 0.000566 time 1.6668 (2.2581) loss 2.8679 (3.6277) grad_norm 1.5282 (1.4374) [2022-01-22 00:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][300/1251] eta 0:35:39 lr 
0.000566 time 2.6168 (2.2496) loss 2.8622 (3.6185) grad_norm 1.3326 (1.4364) [2022-01-22 00:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][310/1251] eta 0:35:15 lr 0.000566 time 2.5529 (2.2482) loss 3.0003 (3.6153) grad_norm 1.2995 (1.4385) [2022-01-22 00:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][320/1251] eta 0:34:50 lr 0.000566 time 2.2168 (2.2457) loss 3.1321 (3.6131) grad_norm 1.6077 (1.4367) [2022-01-22 00:36:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][330/1251] eta 0:34:29 lr 0.000566 time 2.1156 (2.2465) loss 3.5613 (3.6012) grad_norm 1.5770 (1.4394) [2022-01-22 00:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][340/1251] eta 0:34:04 lr 0.000566 time 2.1998 (2.2447) loss 2.5412 (3.5990) grad_norm 1.3217 (1.4382) [2022-01-22 00:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][350/1251] eta 0:33:39 lr 0.000566 time 1.9164 (2.2414) loss 3.7388 (3.6009) grad_norm 1.4357 (1.4371) [2022-01-22 00:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][360/1251] eta 0:33:15 lr 0.000566 time 2.6879 (2.2391) loss 2.7456 (3.6005) grad_norm 1.3117 (1.4371) [2022-01-22 00:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][370/1251] eta 0:32:53 lr 0.000566 time 1.9605 (2.2397) loss 3.8940 (3.5989) grad_norm 1.5170 (1.4360) [2022-01-22 00:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][380/1251] eta 0:32:26 lr 0.000565 time 2.1619 (2.2353) loss 2.8043 (3.5962) grad_norm 1.2811 (1.4361) [2022-01-22 00:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][390/1251] eta 0:32:02 lr 0.000565 time 2.6654 (2.2332) loss 3.0456 (3.5906) grad_norm 1.5006 (1.4398) [2022-01-22 00:38:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][400/1251] eta 0:31:39 lr 0.000565 time 3.0591 (2.2324) loss 2.6934 (3.5904) grad_norm 1.4238 (1.4404) [2022-01-22 00:39:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][410/1251] eta 0:31:15 lr 0.000565 time 2.1766 (2.2297) loss 2.5929 (3.5841) grad_norm 1.4662 (1.4420) [2022-01-22 00:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][420/1251] eta 0:30:51 lr 0.000565 time 1.4833 (2.2276) loss 3.4785 (3.5844) grad_norm 1.2940 (1.4399) [2022-01-22 00:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][430/1251] eta 0:30:28 lr 0.000565 time 2.2208 (2.2268) loss 3.5749 (3.5865) grad_norm 1.4722 (1.4386) [2022-01-22 00:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][440/1251] eta 0:30:06 lr 0.000565 time 2.6474 (2.2278) loss 4.0772 (3.5857) grad_norm 1.4538 (1.4377) [2022-01-22 00:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][450/1251] eta 0:29:44 lr 0.000565 time 1.9577 (2.2275) loss 3.8850 (3.5814) grad_norm 1.3269 (1.4366) [2022-01-22 00:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][460/1251] eta 0:29:21 lr 0.000565 time 2.2060 (2.2269) loss 3.1403 (3.5784) grad_norm 1.3441 (1.4357) [2022-01-22 00:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][470/1251] eta 0:28:56 lr 0.000565 time 2.1815 (2.2232) loss 3.6210 (3.5770) grad_norm 1.4415 (1.4341) [2022-01-22 00:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][480/1251] eta 0:28:31 lr 0.000565 time 2.4467 (2.2204) loss 3.7213 (3.5787) grad_norm 1.4174 (1.4338) [2022-01-22 00:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][490/1251] eta 0:28:10 lr 0.000565 time 2.1402 (2.2213) loss 3.5764 (3.5783) grad_norm 1.6997 (1.4335) [2022-01-22 00:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][500/1251] eta 0:27:48 lr 0.000565 time 2.6081 (2.2215) loss 3.9946 (3.5782) grad_norm 1.4083 (1.4359) [2022-01-22 00:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][510/1251] eta 0:27:27 lr 
0.000565 time 2.2304 (2.2234) loss 3.2463 (3.5821) grad_norm 1.8698 (1.4368) [2022-01-22 00:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][520/1251] eta 0:27:04 lr 0.000565 time 2.2033 (2.2221) loss 3.2501 (3.5837) grad_norm 1.5465 (1.4362) [2022-01-22 00:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][530/1251] eta 0:26:41 lr 0.000565 time 1.9125 (2.2213) loss 3.6132 (3.5854) grad_norm 1.2206 (1.4355) [2022-01-22 00:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][540/1251] eta 0:26:17 lr 0.000565 time 1.8834 (2.2181) loss 3.4530 (3.5907) grad_norm 1.4177 (1.4362) [2022-01-22 00:44:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][550/1251] eta 0:25:52 lr 0.000565 time 2.2872 (2.2153) loss 2.3370 (3.5832) grad_norm 1.3800 (1.4364) [2022-01-22 00:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][560/1251] eta 0:25:27 lr 0.000565 time 1.9223 (2.2110) loss 2.9440 (3.5754) grad_norm 1.3590 (1.4366) [2022-01-22 00:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][570/1251] eta 0:25:04 lr 0.000565 time 1.9131 (2.2088) loss 2.3042 (3.5743) grad_norm 1.5618 (1.4374) [2022-01-22 00:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][580/1251] eta 0:24:42 lr 0.000565 time 2.3008 (2.2087) loss 2.4654 (3.5681) grad_norm 1.4613 (1.4376) [2022-01-22 00:45:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][590/1251] eta 0:24:20 lr 0.000565 time 2.6581 (2.2099) loss 4.0466 (3.5603) grad_norm 1.2944 (1.4361) [2022-01-22 00:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][600/1251] eta 0:23:58 lr 0.000565 time 2.1188 (2.2092) loss 4.0785 (3.5626) grad_norm 1.4106 (1.4361) [2022-01-22 00:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][610/1251] eta 0:23:35 lr 0.000565 time 2.0762 (2.2076) loss 3.9857 (3.5637) grad_norm 1.3847 (1.4375) [2022-01-22 00:46:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][620/1251] eta 0:23:13 lr 0.000564 time 2.4118 (2.2089) loss 4.2234 (3.5658) grad_norm 1.4360 (1.4379) [2022-01-22 00:47:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][630/1251] eta 0:22:53 lr 0.000564 time 2.9594 (2.2110) loss 2.5639 (3.5609) grad_norm 1.4248 (1.4373) [2022-01-22 00:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][640/1251] eta 0:22:31 lr 0.000564 time 2.2390 (2.2119) loss 3.8170 (3.5646) grad_norm 1.4529 (1.4367) [2022-01-22 00:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][650/1251] eta 0:22:09 lr 0.000564 time 1.8939 (2.2128) loss 4.1411 (3.5605) grad_norm 1.2993 (1.4378) [2022-01-22 00:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][660/1251] eta 0:21:47 lr 0.000564 time 1.7594 (2.2119) loss 3.8540 (3.5620) grad_norm 1.4030 (1.4378) [2022-01-22 00:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][670/1251] eta 0:21:25 lr 0.000564 time 2.6887 (2.2133) loss 3.9032 (3.5625) grad_norm 1.4432 (1.4373) [2022-01-22 00:49:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][680/1251] eta 0:21:01 lr 0.000564 time 1.5297 (2.2086) loss 3.1682 (3.5630) grad_norm 1.3530 (1.4367) [2022-01-22 00:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][690/1251] eta 0:20:36 lr 0.000564 time 1.8986 (2.2050) loss 2.8060 (3.5656) grad_norm 1.3011 (1.4358) [2022-01-22 00:49:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][700/1251] eta 0:20:13 lr 0.000564 time 1.8768 (2.2029) loss 3.3029 (3.5671) grad_norm 1.4030 (1.4344) [2022-01-22 00:50:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][710/1251] eta 0:19:52 lr 0.000564 time 2.5392 (2.2036) loss 2.9451 (3.5668) grad_norm 1.2181 (1.4331) [2022-01-22 00:50:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][720/1251] eta 0:19:30 lr 
0.000564 time 2.1304 (2.2044) loss 3.8056 (3.5640) grad_norm 1.5965 (1.4323) [2022-01-22 00:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][730/1251] eta 0:19:09 lr 0.000564 time 2.2354 (2.2064) loss 3.4793 (3.5649) grad_norm 1.2974 (1.4335) [2022-01-22 00:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][740/1251] eta 0:18:47 lr 0.000564 time 1.8465 (2.2073) loss 3.7611 (3.5657) grad_norm 1.4069 (1.4340) [2022-01-22 00:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][750/1251] eta 0:18:26 lr 0.000564 time 1.8815 (2.2084) loss 2.5801 (3.5659) grad_norm 1.4912 (1.4342) [2022-01-22 00:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][760/1251] eta 0:18:05 lr 0.000564 time 3.0479 (2.2101) loss 3.2543 (3.5659) grad_norm 1.2892 (1.4331) [2022-01-22 00:52:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][770/1251] eta 0:17:42 lr 0.000564 time 2.0580 (2.2089) loss 4.1208 (3.5650) grad_norm 1.2685 (1.4320) [2022-01-22 00:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][780/1251] eta 0:17:19 lr 0.000564 time 1.9038 (2.2066) loss 3.5716 (3.5679) grad_norm 1.4305 (1.4325) [2022-01-22 00:53:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][790/1251] eta 0:16:56 lr 0.000564 time 1.9246 (2.2049) loss 4.2823 (3.5683) grad_norm 1.3192 (1.4332) [2022-01-22 00:53:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][800/1251] eta 0:16:33 lr 0.000564 time 3.1744 (2.2038) loss 2.8828 (3.5671) grad_norm 1.3082 (1.4333) [2022-01-22 00:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][810/1251] eta 0:16:11 lr 0.000564 time 2.2558 (2.2033) loss 3.4586 (3.5697) grad_norm 1.3766 (1.4339) [2022-01-22 00:54:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][820/1251] eta 0:15:50 lr 0.000564 time 2.3279 (2.2043) loss 3.1803 (3.5704) grad_norm 1.5421 (1.4353) [2022-01-22 00:54:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][830/1251] eta 0:15:28 lr 0.000564 time 2.4675 (2.2053) loss 3.8374 (3.5704) grad_norm 1.3263 (1.4356) [2022-01-22 00:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][840/1251] eta 0:15:07 lr 0.000564 time 3.6788 (2.2078) loss 3.6209 (3.5710) grad_norm 1.4336 (1.4361) [2022-01-22 00:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][850/1251] eta 0:14:45 lr 0.000564 time 1.9110 (2.2081) loss 4.0924 (3.5733) grad_norm 1.6266 (1.4383) [2022-01-22 00:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][860/1251] eta 0:14:22 lr 0.000564 time 1.8448 (2.2063) loss 4.1146 (3.5733) grad_norm 1.4019 (1.4391) [2022-01-22 00:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][870/1251] eta 0:13:59 lr 0.000563 time 2.1565 (2.2043) loss 4.6094 (3.5735) grad_norm 1.4269 (1.4387) [2022-01-22 00:56:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][880/1251] eta 0:13:37 lr 0.000563 time 3.2152 (2.2046) loss 4.0713 (3.5771) grad_norm 1.7241 (1.4392) [2022-01-22 00:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][890/1251] eta 0:13:15 lr 0.000563 time 2.2293 (2.2040) loss 4.3038 (3.5810) grad_norm 1.5899 (1.4402) [2022-01-22 00:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][900/1251] eta 0:12:53 lr 0.000563 time 2.5130 (2.2045) loss 2.7127 (3.5785) grad_norm 1.3794 (1.4409) [2022-01-22 00:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][910/1251] eta 0:12:31 lr 0.000563 time 2.1586 (2.2051) loss 3.8553 (3.5781) grad_norm 1.2335 (1.4404) [2022-01-22 00:57:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][920/1251] eta 0:12:09 lr 0.000563 time 1.8634 (2.2043) loss 4.1804 (3.5810) grad_norm 1.3492 (1.4395) [2022-01-22 00:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][930/1251] eta 0:11:46 lr 
0.000563 time 1.6469 (2.2020) loss 3.9228 (3.5789) grad_norm 1.7920 (1.4395) [2022-01-22 00:58:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][940/1251] eta 0:11:24 lr 0.000563 time 2.1646 (2.2006) loss 3.8337 (3.5794) grad_norm 1.3365 (1.4385) [2022-01-22 00:58:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][950/1251] eta 0:11:01 lr 0.000563 time 1.9138 (2.1992) loss 3.7396 (3.5814) grad_norm 1.3343 (1.4376) [2022-01-22 00:59:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][960/1251] eta 0:10:40 lr 0.000563 time 2.9078 (2.2011) loss 3.5913 (3.5804) grad_norm 1.2845 (1.4372) [2022-01-22 00:59:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][970/1251] eta 0:10:18 lr 0.000563 time 1.9730 (2.2008) loss 3.3002 (3.5813) grad_norm 1.4176 (1.4376) [2022-01-22 01:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][980/1251] eta 0:09:56 lr 0.000563 time 1.6406 (2.2005) loss 4.1692 (3.5815) grad_norm 1.2170 (1.4369) [2022-01-22 01:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][990/1251] eta 0:09:34 lr 0.000563 time 2.2616 (2.2001) loss 3.5101 (3.5809) grad_norm 1.5428 (1.4370) [2022-01-22 01:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1000/1251] eta 0:09:12 lr 0.000563 time 2.5862 (2.2001) loss 4.0193 (3.5839) grad_norm 1.5614 (1.4365) [2022-01-22 01:01:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1010/1251] eta 0:08:49 lr 0.000563 time 1.8143 (2.1984) loss 3.6295 (3.5857) grad_norm 1.2186 (1.4361) [2022-01-22 01:01:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1020/1251] eta 0:08:27 lr 0.000563 time 1.9321 (2.1982) loss 3.2002 (3.5876) grad_norm 1.3285 (1.4357) [2022-01-22 01:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1030/1251] eta 0:08:06 lr 0.000563 time 2.1557 (2.1997) loss 3.8962 (3.5869) grad_norm 1.5824 (1.4356) [2022-01-22 
01:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1040/1251] eta 0:07:44 lr 0.000563 time 2.5469 (2.1993) loss 3.9622 (3.5870) grad_norm 1.6254 (1.4360) [2022-01-22 01:02:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1050/1251] eta 0:07:21 lr 0.000563 time 2.2507 (2.1987) loss 2.8412 (3.5853) grad_norm 1.7712 (1.4366) [2022-01-22 01:02:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1060/1251] eta 0:06:59 lr 0.000563 time 1.6336 (2.1974) loss 2.9473 (3.5840) grad_norm 1.3607 (1.4363) [2022-01-22 01:03:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1070/1251] eta 0:06:37 lr 0.000563 time 1.9263 (2.1967) loss 4.5736 (3.5845) grad_norm 1.2715 (1.4355) [2022-01-22 01:03:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1080/1251] eta 0:06:15 lr 0.000563 time 2.5284 (2.1967) loss 3.5898 (3.5867) grad_norm 1.5251 (1.4358) [2022-01-22 01:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1090/1251] eta 0:05:53 lr 0.000563 time 2.8135 (2.1968) loss 3.3218 (3.5872) grad_norm 1.3326 (1.4348) [2022-01-22 01:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1100/1251] eta 0:05:31 lr 0.000563 time 2.6205 (2.1968) loss 3.8084 (3.5883) grad_norm 1.3602 (1.4345) [2022-01-22 01:04:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1110/1251] eta 0:05:09 lr 0.000562 time 2.0017 (2.1969) loss 4.3763 (3.5896) grad_norm 1.4504 (1.4340) [2022-01-22 01:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1120/1251] eta 0:04:47 lr 0.000562 time 2.1747 (2.1975) loss 3.4708 (3.5905) grad_norm 1.5020 (1.4346) [2022-01-22 01:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1130/1251] eta 0:04:26 lr 0.000562 time 2.6053 (2.1985) loss 3.6131 (3.5900) grad_norm 1.5072 (1.4342) [2022-01-22 01:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1140/1251] 
eta 0:04:03 lr 0.000562 time 2.3288 (2.1978) loss 3.7764 (3.5882) grad_norm 1.3007 (1.4341) [2022-01-22 01:06:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1150/1251] eta 0:03:41 lr 0.000562 time 1.8515 (2.1960) loss 4.0651 (3.5896) grad_norm 1.3839 (1.4349) [2022-01-22 01:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1160/1251] eta 0:03:19 lr 0.000562 time 2.5780 (2.1948) loss 3.4838 (3.5873) grad_norm 1.4731 (1.4349) [2022-01-22 01:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1170/1251] eta 0:02:57 lr 0.000562 time 2.1125 (2.1942) loss 2.8768 (3.5878) grad_norm 1.2962 (1.4349) [2022-01-22 01:07:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1180/1251] eta 0:02:35 lr 0.000562 time 1.8535 (2.1935) loss 3.4726 (3.5880) grad_norm 1.3491 (1.4356) [2022-01-22 01:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1190/1251] eta 0:02:13 lr 0.000562 time 2.2303 (2.1936) loss 2.7796 (3.5875) grad_norm 1.4197 (1.4357) [2022-01-22 01:08:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1200/1251] eta 0:01:51 lr 0.000562 time 2.9048 (2.1955) loss 3.8756 (3.5861) grad_norm 1.4872 (1.4356) [2022-01-22 01:08:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1210/1251] eta 0:01:30 lr 0.000562 time 1.5565 (2.1957) loss 3.7369 (3.5871) grad_norm 1.4376 (1.4353) [2022-01-22 01:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1220/1251] eta 0:01:08 lr 0.000562 time 2.5373 (2.1969) loss 3.9324 (3.5881) grad_norm 1.3041 (1.4349) [2022-01-22 01:09:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1230/1251] eta 0:00:46 lr 0.000562 time 2.1889 (2.1972) loss 4.0534 (3.5887) grad_norm 1.6780 (1.4350) [2022-01-22 01:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1240/1251] eta 0:00:24 lr 0.000562 time 2.2809 (2.1954) loss 2.8229 (3.5887) grad_norm 1.6629 
(1.4349) [2022-01-22 01:09:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [138/300][1250/1251] eta 0:00:02 lr 0.000562 time 1.2330 (2.1886) loss 3.5512 (3.5900) grad_norm 1.4038 (1.4349) [2022-01-22 01:09:41 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 138 training takes 0:45:38 [2022-01-22 01:10:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.413 (19.413) Loss 1.1198 (1.1198) Acc@1 74.316 (74.316) Acc@5 91.699 (91.699) [2022-01-22 01:10:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.879 (3.305) Loss 1.0301 (1.0275) Acc@1 76.855 (75.533) Acc@5 92.871 (92.880) [2022-01-22 01:10:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.882 (2.671) Loss 1.0679 (1.0136) Acc@1 76.367 (76.046) Acc@5 92.480 (93.048) [2022-01-22 01:10:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.298 (2.343) Loss 1.0548 (1.0204) Acc@1 74.121 (75.904) Acc@5 92.871 (92.997) [2022-01-22 01:11:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.852 (2.168) Loss 0.9623 (1.0244) Acc@1 78.320 (75.715) Acc@5 94.336 (92.983) [2022-01-22 01:11:17 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.630 Acc@5 92.980 [2022-01-22 01:11:17 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-01-22 01:11:17 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.63% [2022-01-22 01:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][0/1251] eta 6:43:54 lr 0.000562 time 19.3723 (19.3723) loss 2.8769 (2.8769) grad_norm 1.5099 (1.5099) [2022-01-22 01:12:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][10/1251] eta 1:21:20 lr 0.000562 time 2.5494 (3.9328) loss 3.4981 (3.2680) grad_norm 1.3137 (1.4792) [2022-01-22 01:12:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][20/1251] eta 1:03:44 lr 0.000562 time 2.5766 (3.1066) loss 
4.0458 (3.4869) grad_norm 1.4770 (1.4411) [2022-01-22 01:12:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][30/1251] eta 0:57:09 lr 0.000562 time 1.8634 (2.8089) loss 2.9395 (3.5280) grad_norm 1.5296 (1.4531) [2022-01-22 01:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][40/1251] eta 0:55:10 lr 0.000562 time 7.1322 (2.7334) loss 3.9345 (3.5473) grad_norm 1.5443 (1.4419) [2022-01-22 01:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][50/1251] eta 0:52:59 lr 0.000562 time 2.3264 (2.6473) loss 3.9714 (3.5221) grad_norm 1.3595 (1.4291) [2022-01-22 01:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][60/1251] eta 0:50:48 lr 0.000562 time 1.5612 (2.5598) loss 4.0547 (3.5597) grad_norm 1.3749 (1.4257) [2022-01-22 01:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][70/1251] eta 0:49:27 lr 0.000562 time 1.9143 (2.5126) loss 2.5628 (3.5115) grad_norm 1.5526 (1.4300) [2022-01-22 01:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][80/1251] eta 0:48:28 lr 0.000562 time 3.7143 (2.4836) loss 2.4957 (3.5149) grad_norm 1.5163 (1.4374) [2022-01-22 01:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][90/1251] eta 0:47:30 lr 0.000562 time 1.8310 (2.4549) loss 3.5955 (3.5169) grad_norm 1.5853 (1.4337) [2022-01-22 01:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][100/1251] eta 0:46:36 lr 0.000561 time 2.2108 (2.4295) loss 3.6094 (3.5139) grad_norm 1.4984 (1.4422) [2022-01-22 01:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][110/1251] eta 0:45:52 lr 0.000561 time 1.8225 (2.4127) loss 2.7883 (3.5210) grad_norm 1.4105 (1.4441) [2022-01-22 01:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][120/1251] eta 0:45:17 lr 0.000561 time 3.0321 (2.4023) loss 4.2749 (3.5533) grad_norm 2.0691 (1.4522) [2022-01-22 01:16:31 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [139/300][130/1251] eta 0:44:41 lr 0.000561 time 1.8871 (2.3917) loss 4.1598 (3.5623) grad_norm 1.4148 (1.4476) [2022-01-22 01:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][140/1251] eta 0:43:49 lr 0.000561 time 2.0456 (2.3670) loss 3.4352 (3.5637) grad_norm 1.7565 (1.4475) [2022-01-22 01:17:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][150/1251] eta 0:42:59 lr 0.000561 time 1.7722 (2.3427) loss 3.9945 (3.5817) grad_norm 1.2568 (1.4442) [2022-01-22 01:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][160/1251] eta 0:42:27 lr 0.000561 time 3.0364 (2.3348) loss 4.3186 (3.5872) grad_norm 1.3500 (1.4451) [2022-01-22 01:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][170/1251] eta 0:41:47 lr 0.000561 time 1.9375 (2.3195) loss 4.1541 (3.5920) grad_norm 1.5955 (1.4442) [2022-01-22 01:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][180/1251] eta 0:41:07 lr 0.000561 time 1.6087 (2.3035) loss 4.0782 (3.5786) grad_norm 1.6330 (1.4443) [2022-01-22 01:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][190/1251] eta 0:40:35 lr 0.000561 time 2.5522 (2.2959) loss 3.9876 (3.5818) grad_norm 1.1940 (1.4393) [2022-01-22 01:18:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][200/1251] eta 0:40:11 lr 0.000561 time 2.2738 (2.2944) loss 2.8955 (3.5826) grad_norm 1.3437 (1.4349) [2022-01-22 01:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][210/1251] eta 0:39:52 lr 0.000561 time 2.2316 (2.2979) loss 3.2953 (3.5920) grad_norm 1.4792 (1.4339) [2022-01-22 01:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][220/1251] eta 0:39:23 lr 0.000561 time 1.5225 (2.2924) loss 3.6723 (3.5824) grad_norm 1.7328 (1.4350) [2022-01-22 01:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][230/1251] eta 0:38:56 lr 0.000561 time 2.6900 (2.2886) loss 3.8203 
(3.5845) grad_norm 1.7418 (1.4358) [2022-01-22 01:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][240/1251] eta 0:38:26 lr 0.000561 time 2.3051 (2.2813) loss 3.6432 (3.5855) grad_norm 1.2465 (1.4385) [2022-01-22 01:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][250/1251] eta 0:37:57 lr 0.000561 time 2.1186 (2.2755) loss 3.6925 (3.5893) grad_norm 1.3836 (1.4367) [2022-01-22 01:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][260/1251] eta 0:37:30 lr 0.000561 time 1.8461 (2.2710) loss 3.0406 (3.5870) grad_norm 1.3631 (1.4411) [2022-01-22 01:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][270/1251] eta 0:37:05 lr 0.000561 time 2.5605 (2.2685) loss 2.9944 (3.5867) grad_norm 1.2980 (1.4415) [2022-01-22 01:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][280/1251] eta 0:36:39 lr 0.000561 time 1.9087 (2.2648) loss 4.0911 (3.5814) grad_norm 1.4625 (1.4439) [2022-01-22 01:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][290/1251] eta 0:36:22 lr 0.000561 time 2.9987 (2.2714) loss 3.5834 (3.5778) grad_norm 1.5270 (1.4465) [2022-01-22 01:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][300/1251] eta 0:36:01 lr 0.000561 time 1.6698 (2.2734) loss 4.3792 (3.5790) grad_norm 1.5417 (1.4452) [2022-01-22 01:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][310/1251] eta 0:35:32 lr 0.000561 time 2.2460 (2.2661) loss 4.0582 (3.5803) grad_norm 1.3904 (1.4430) [2022-01-22 01:23:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][320/1251] eta 0:34:59 lr 0.000561 time 1.8560 (2.2553) loss 3.9657 (3.5846) grad_norm 1.3442 (1.4416) [2022-01-22 01:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][330/1251] eta 0:34:36 lr 0.000561 time 2.4280 (2.2542) loss 2.8663 (3.5791) grad_norm 1.3025 (1.4426) [2022-01-22 01:24:06 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [139/300][340/1251] eta 0:34:13 lr 0.000560 time 2.5366 (2.2541) loss 2.8823 (3.5731) grad_norm 1.5693 (1.4415) [2022-01-22 01:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][350/1251] eta 0:33:51 lr 0.000560 time 2.1298 (2.2550) loss 3.9949 (3.5641) grad_norm 1.6141 (1.4408) [2022-01-22 01:24:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][360/1251] eta 0:33:23 lr 0.000560 time 1.9321 (2.2491) loss 4.1167 (3.5660) grad_norm 1.3482 (1.4403) [2022-01-22 01:25:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][370/1251] eta 0:33:00 lr 0.000560 time 2.5818 (2.2479) loss 4.1323 (3.5727) grad_norm 1.4284 (1.4406) [2022-01-22 01:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][380/1251] eta 0:32:32 lr 0.000560 time 1.8746 (2.2415) loss 3.8779 (3.5802) grad_norm 1.4493 (1.4422) [2022-01-22 01:25:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][390/1251] eta 0:32:10 lr 0.000560 time 2.3931 (2.2418) loss 3.3905 (3.5776) grad_norm 1.5364 (1.4432) [2022-01-22 01:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][400/1251] eta 0:31:45 lr 0.000560 time 1.8887 (2.2395) loss 3.4012 (3.5830) grad_norm 1.5888 (1.4447) [2022-01-22 01:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][410/1251] eta 0:31:24 lr 0.000560 time 3.5087 (2.2405) loss 3.0068 (3.5852) grad_norm 1.4490 (1.4459) [2022-01-22 01:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][420/1251] eta 0:31:00 lr 0.000560 time 2.0222 (2.2389) loss 3.5225 (3.5818) grad_norm 1.4896 (1.4454) [2022-01-22 01:27:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][430/1251] eta 0:30:36 lr 0.000560 time 2.2290 (2.2374) loss 3.8394 (3.5870) grad_norm 1.4798 (1.4455) [2022-01-22 01:27:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][440/1251] eta 0:30:12 lr 0.000560 time 1.7090 (2.2353) loss 2.9301 
(3.5871) grad_norm 1.3929 (1.4440) [2022-01-22 01:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][450/1251] eta 0:29:48 lr 0.000560 time 2.7678 (2.2334) loss 2.8097 (3.5838) grad_norm 1.2495 (1.4453) [2022-01-22 01:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][460/1251] eta 0:29:24 lr 0.000560 time 1.8627 (2.2307) loss 3.5054 (3.5865) grad_norm 1.6035 (1.4449) [2022-01-22 01:28:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][470/1251] eta 0:29:01 lr 0.000560 time 2.2234 (2.2298) loss 2.7297 (3.5855) grad_norm 1.3863 (1.4438) [2022-01-22 01:29:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][480/1251] eta 0:28:40 lr 0.000560 time 1.6409 (2.2315) loss 4.1250 (3.5869) grad_norm 1.6007 (1.4437) [2022-01-22 01:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][490/1251] eta 0:28:20 lr 0.000560 time 3.4626 (2.2351) loss 3.2856 (3.5852) grad_norm 1.2828 (1.4427) [2022-01-22 01:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][500/1251] eta 0:27:58 lr 0.000560 time 1.8035 (2.2347) loss 3.3287 (3.5860) grad_norm 1.5025 (1.4432) [2022-01-22 01:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][510/1251] eta 0:27:33 lr 0.000560 time 1.9947 (2.2313) loss 3.1337 (3.5773) grad_norm 1.7083 (1.4431) [2022-01-22 01:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][520/1251] eta 0:27:08 lr 0.000560 time 1.7626 (2.2274) loss 4.1867 (3.5799) grad_norm 1.3834 (1.4448) [2022-01-22 01:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][530/1251] eta 0:26:42 lr 0.000560 time 1.9334 (2.2222) loss 3.7985 (3.5809) grad_norm 1.2659 (1.4432) [2022-01-22 01:31:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][540/1251] eta 0:26:19 lr 0.000560 time 2.4302 (2.2222) loss 3.2380 (3.5849) grad_norm 1.3152 (1.4426) [2022-01-22 01:31:41 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [139/300][550/1251] eta 0:25:56 lr 0.000560 time 2.5842 (2.2211) loss 3.5477 (3.5837) grad_norm 1.3581 (1.4410) [2022-01-22 01:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][560/1251] eta 0:25:34 lr 0.000560 time 2.0539 (2.2210) loss 3.6184 (3.5835) grad_norm 1.4997 (1.4423) [2022-01-22 01:32:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][570/1251] eta 0:25:11 lr 0.000560 time 2.0386 (2.2189) loss 3.7474 (3.5824) grad_norm 1.3235 (1.4428) [2022-01-22 01:32:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][580/1251] eta 0:24:48 lr 0.000560 time 2.4635 (2.2179) loss 3.8280 (3.5829) grad_norm 1.4026 (1.4417) [2022-01-22 01:33:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][590/1251] eta 0:24:25 lr 0.000559 time 2.2435 (2.2165) loss 3.7115 (3.5813) grad_norm 1.5239 (1.4408) [2022-01-22 01:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][600/1251] eta 0:24:02 lr 0.000559 time 1.8293 (2.2158) loss 4.0057 (3.5851) grad_norm 1.2882 (1.4410) [2022-01-22 01:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][610/1251] eta 0:23:40 lr 0.000559 time 2.4586 (2.2167) loss 4.0800 (3.5865) grad_norm 1.4098 (1.4430) [2022-01-22 01:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][620/1251] eta 0:23:18 lr 0.000559 time 2.4548 (2.2170) loss 3.6756 (3.5860) grad_norm 2.0402 (1.4437) [2022-01-22 01:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][630/1251] eta 0:22:57 lr 0.000559 time 2.6697 (2.2184) loss 3.9700 (3.5810) grad_norm 1.5545 (1.4458) [2022-01-22 01:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][640/1251] eta 0:22:33 lr 0.000559 time 2.1771 (2.2160) loss 4.0074 (3.5838) grad_norm 1.3808 (1.4457) [2022-01-22 01:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][650/1251] eta 0:22:11 lr 0.000559 time 2.4176 (2.2157) loss 2.5985 
(3.5852) grad_norm 1.2810 (1.4454) [2022-01-22 01:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][660/1251] eta 0:21:49 lr 0.000559 time 2.2060 (2.2150) loss 2.3101 (3.5852) grad_norm 1.3450 (1.4445) [2022-01-22 01:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][670/1251] eta 0:21:27 lr 0.000559 time 2.9549 (2.2167) loss 3.8339 (3.5844) grad_norm 1.4979 (1.4440) [2022-01-22 01:36:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][680/1251] eta 0:21:05 lr 0.000559 time 1.7972 (2.2170) loss 2.8889 (3.5854) grad_norm 1.4782 (1.4445) [2022-01-22 01:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][690/1251] eta 0:20:43 lr 0.000559 time 1.5969 (2.2158) loss 2.6406 (3.5838) grad_norm 1.3542 (1.4440) [2022-01-22 01:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][700/1251] eta 0:20:20 lr 0.000559 time 1.8968 (2.2143) loss 4.1629 (3.5860) grad_norm 1.3082 (1.4427) [2022-01-22 01:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][710/1251] eta 0:19:57 lr 0.000559 time 2.8747 (2.2144) loss 3.1988 (3.5842) grad_norm 1.4599 (1.4420) [2022-01-22 01:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][720/1251] eta 0:19:34 lr 0.000559 time 1.8647 (2.2122) loss 3.3380 (3.5795) grad_norm 1.3246 (1.4423) [2022-01-22 01:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][730/1251] eta 0:19:11 lr 0.000559 time 2.0454 (2.2106) loss 3.6358 (3.5809) grad_norm 1.5186 (1.4436) [2022-01-22 01:38:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][740/1251] eta 0:18:49 lr 0.000559 time 1.9186 (2.2104) loss 2.3864 (3.5771) grad_norm 1.3605 (1.4449) [2022-01-22 01:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][750/1251] eta 0:18:27 lr 0.000559 time 2.1726 (2.2102) loss 4.2874 (3.5795) grad_norm 1.3272 (1.4457) [2022-01-22 01:39:20 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [139/300][760/1251] eta 0:18:05 lr 0.000559 time 2.4034 (2.2116) loss 4.0667 (3.5766) grad_norm 1.3055 (1.4459) [2022-01-22 01:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][770/1251] eta 0:17:43 lr 0.000559 time 1.9269 (2.2116) loss 3.8878 (3.5774) grad_norm 1.3781 (1.4456) [2022-01-22 01:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][780/1251] eta 0:17:21 lr 0.000559 time 1.9702 (2.2105) loss 4.0107 (3.5786) grad_norm 1.4304 (1.4451) [2022-01-22 01:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][790/1251] eta 0:16:59 lr 0.000559 time 2.1250 (2.2106) loss 2.8326 (3.5737) grad_norm 1.6024 (1.4449) [2022-01-22 01:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][800/1251] eta 0:16:35 lr 0.000559 time 1.8225 (2.2083) loss 3.3825 (3.5751) grad_norm 1.2545 (1.4439) [2022-01-22 01:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][810/1251] eta 0:16:13 lr 0.000559 time 1.4891 (2.2071) loss 4.0085 (3.5773) grad_norm 1.3756 (1.4448) [2022-01-22 01:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][820/1251] eta 0:15:52 lr 0.000559 time 2.3425 (2.2109) loss 2.3538 (3.5751) grad_norm 1.2582 (1.4448) [2022-01-22 01:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][830/1251] eta 0:15:31 lr 0.000558 time 2.2828 (2.2121) loss 3.9430 (3.5730) grad_norm 1.5657 (1.4451) [2022-01-22 01:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][840/1251] eta 0:15:08 lr 0.000558 time 1.8629 (2.2110) loss 3.3879 (3.5703) grad_norm 1.2391 (1.4447) [2022-01-22 01:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][850/1251] eta 0:14:45 lr 0.000558 time 2.2471 (2.2095) loss 3.7826 (3.5687) grad_norm 1.5400 (1.4448) [2022-01-22 01:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][860/1251] eta 0:14:23 lr 0.000558 time 1.8492 (2.2077) loss 3.3255 
(3.5658) grad_norm 1.5119 (1.4453) [2022-01-22 01:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][870/1251] eta 0:14:00 lr 0.000558 time 2.3980 (2.2073) loss 2.4895 (3.5681) grad_norm 1.3997 (1.4448) [2022-01-22 01:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][880/1251] eta 0:13:38 lr 0.000558 time 2.0902 (2.2062) loss 3.9356 (3.5673) grad_norm 1.5667 (1.4460) [2022-01-22 01:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][890/1251] eta 0:13:16 lr 0.000558 time 1.9240 (2.2068) loss 4.0349 (3.5684) grad_norm 1.2855 (1.4453) [2022-01-22 01:44:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][900/1251] eta 0:12:55 lr 0.000558 time 1.8471 (2.2084) loss 4.3342 (3.5697) grad_norm 1.5068 (1.4451) [2022-01-22 01:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][910/1251] eta 0:12:33 lr 0.000558 time 3.1762 (2.2096) loss 3.4979 (3.5711) grad_norm 1.3751 (1.4450) [2022-01-22 01:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][920/1251] eta 0:12:10 lr 0.000558 time 1.5695 (2.2084) loss 2.6457 (3.5693) grad_norm 1.2589 (1.4461) [2022-01-22 01:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][930/1251] eta 0:11:47 lr 0.000558 time 2.0775 (2.2052) loss 3.6912 (3.5685) grad_norm 1.3480 (1.4464) [2022-01-22 01:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][940/1251] eta 0:11:25 lr 0.000558 time 2.2349 (2.2034) loss 3.6301 (3.5674) grad_norm 1.5274 (1.4480) [2022-01-22 01:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][950/1251] eta 0:11:03 lr 0.000558 time 1.8748 (2.2029) loss 4.0192 (3.5703) grad_norm 1.5380 (1.4483) [2022-01-22 01:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][960/1251] eta 0:10:40 lr 0.000558 time 1.9959 (2.2023) loss 2.8275 (3.5711) grad_norm 1.3215 (1.4483) [2022-01-22 01:46:55 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [139/300][970/1251] eta 0:10:18 lr 0.000558 time 1.6308 (2.2016) loss 3.9889 (3.5677) grad_norm 1.6591 (1.4479) [2022-01-22 01:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][980/1251] eta 0:09:56 lr 0.000558 time 2.1934 (2.2011) loss 4.1499 (3.5687) grad_norm 1.4454 (1.4476) [2022-01-22 01:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][990/1251] eta 0:09:34 lr 0.000558 time 2.0383 (2.2011) loss 2.8077 (3.5656) grad_norm 1.3408 (1.4469) [2022-01-22 01:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1000/1251] eta 0:09:12 lr 0.000558 time 2.1266 (2.2009) loss 3.6977 (3.5666) grad_norm 1.6044 (1.4470) [2022-01-22 01:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1010/1251] eta 0:08:50 lr 0.000558 time 1.9258 (2.2014) loss 2.7601 (3.5650) grad_norm 1.4362 (1.4460) [2022-01-22 01:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1020/1251] eta 0:08:29 lr 0.000558 time 2.3085 (2.2040) loss 3.0631 (3.5649) grad_norm 1.5019 (1.4455) [2022-01-22 01:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1030/1251] eta 0:08:07 lr 0.000558 time 2.0589 (2.2058) loss 3.6054 (3.5666) grad_norm 1.2434 (1.4442) [2022-01-22 01:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1040/1251] eta 0:07:45 lr 0.000558 time 1.5429 (2.2059) loss 3.8955 (3.5683) grad_norm 1.7055 (1.4440) [2022-01-22 01:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1050/1251] eta 0:07:23 lr 0.000558 time 2.3187 (2.2050) loss 3.1050 (3.5649) grad_norm 1.3477 (1.4455) [2022-01-22 01:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1060/1251] eta 0:07:00 lr 0.000558 time 2.0670 (2.2027) loss 3.1429 (3.5650) grad_norm 1.3785 (1.4453) [2022-01-22 01:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1070/1251] eta 0:06:38 lr 0.000557 time 1.9401 (2.2026) loss 
3.6674 (3.5641) grad_norm 1.4257 (1.4449) [2022-01-22 01:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1080/1251] eta 0:06:16 lr 0.000557 time 1.8899 (2.2033) loss 4.0149 (3.5665) grad_norm 1.2623 (1.4446) [2022-01-22 01:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1090/1251] eta 0:05:54 lr 0.000557 time 1.5109 (2.2027) loss 2.4239 (3.5652) grad_norm 1.4551 (1.4436) [2022-01-22 01:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1100/1251] eta 0:05:32 lr 0.000557 time 2.5985 (2.2030) loss 3.6648 (3.5662) grad_norm 1.3445 (1.4432) [2022-01-22 01:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1110/1251] eta 0:05:11 lr 0.000557 time 2.7329 (2.2059) loss 3.6003 (3.5665) grad_norm 1.4351 (1.4442) [2022-01-22 01:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1120/1251] eta 0:04:48 lr 0.000557 time 1.8728 (2.2059) loss 2.4510 (3.5670) grad_norm 1.4812 (1.4452) [2022-01-22 01:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1130/1251] eta 0:04:26 lr 0.000557 time 1.7612 (2.2048) loss 3.7308 (3.5667) grad_norm 1.6024 (1.4464) [2022-01-22 01:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1140/1251] eta 0:04:04 lr 0.000557 time 1.9651 (2.2026) loss 4.1417 (3.5649) grad_norm 1.4390 (1.4471) [2022-01-22 01:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1150/1251] eta 0:03:42 lr 0.000557 time 1.9674 (2.2029) loss 3.4871 (3.5679) grad_norm 1.5392 (1.4469) [2022-01-22 01:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1160/1251] eta 0:03:20 lr 0.000557 time 1.9875 (2.2027) loss 3.7256 (3.5691) grad_norm 1.4861 (1.4470) [2022-01-22 01:54:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1170/1251] eta 0:02:58 lr 0.000557 time 1.9816 (2.2028) loss 3.4500 (3.5684) grad_norm 1.2619 (1.4465) [2022-01-22 01:54:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1180/1251] eta 0:02:36 lr 0.000557 time 1.8291 (2.2032) loss 3.6747 (3.5700) grad_norm 1.2641 (1.4464) [2022-01-22 01:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1190/1251] eta 0:02:14 lr 0.000557 time 1.6233 (2.2028) loss 2.8882 (3.5680) grad_norm 1.2310 (1.4466) [2022-01-22 01:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1200/1251] eta 0:01:52 lr 0.000557 time 1.9450 (2.2026) loss 4.1640 (3.5699) grad_norm 1.2043 (1.4461) [2022-01-22 01:55:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1210/1251] eta 0:01:30 lr 0.000557 time 4.1944 (2.2034) loss 3.6819 (3.5713) grad_norm 1.3216 (1.4449) [2022-01-22 01:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1220/1251] eta 0:01:08 lr 0.000557 time 1.9350 (2.2026) loss 4.0528 (3.5719) grad_norm 1.4335 (1.4447) [2022-01-22 01:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1230/1251] eta 0:00:46 lr 0.000557 time 1.9263 (2.2019) loss 3.2502 (3.5731) grad_norm 1.3548 (1.4447) [2022-01-22 01:56:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1240/1251] eta 0:00:24 lr 0.000557 time 1.4960 (2.2007) loss 3.0818 (3.5744) grad_norm 1.4935 (1.4450) [2022-01-22 01:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [139/300][1250/1251] eta 0:00:02 lr 0.000557 time 1.1728 (2.1957) loss 4.4121 (3.5751) grad_norm 1.4989 (1.4451) [2022-01-22 01:57:05 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 139 training takes 0:45:47 [2022-01-22 01:57:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.236 (18.236) Loss 1.0305 (1.0305) Acc@1 77.051 (77.051) Acc@5 94.043 (94.043) [2022-01-22 01:57:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.188 (3.262) Loss 1.0475 (1.0491) Acc@1 75.000 (75.932) Acc@5 92.188 (92.978) [2022-01-22 01:57:59 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.332 (2.577) Loss 0.9347 (1.0423) Acc@1 78.613 (76.014) Acc@5 93.750 (92.950) [2022-01-22 01:58:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.307 (2.319) Loss 0.9846 (1.0467) Acc@1 78.809 (75.810) Acc@5 93.750 (93.007) [2022-01-22 01:58:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.746 (2.183) Loss 0.9965 (1.0517) Acc@1 75.977 (75.631) Acc@5 94.238 (92.943) [2022-01-22 01:58:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.616 Acc@5 92.942 [2022-01-22 01:58:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-01-22 01:58:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.63% [2022-01-22 01:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][0/1251] eta 7:44:32 lr 0.000557 time 22.2803 (22.2803) loss 4.1124 (4.1124) grad_norm 1.3530 (1.3530) [2022-01-22 01:59:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][10/1251] eta 1:24:14 lr 0.000557 time 2.7060 (4.0727) loss 3.6442 (3.7141) grad_norm 1.2962 (1.3475) [2022-01-22 01:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][20/1251] eta 1:05:11 lr 0.000557 time 1.8180 (3.1773) loss 3.3789 (3.5343) grad_norm 1.3714 (1.3558) [2022-01-22 02:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][30/1251] eta 0:58:49 lr 0.000557 time 1.5547 (2.8906) loss 3.9685 (3.5877) grad_norm 1.4003 (1.3677) [2022-01-22 02:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][40/1251] eta 0:56:07 lr 0.000557 time 3.9999 (2.7806) loss 4.0092 (3.5550) grad_norm 1.3960 (1.4015) [2022-01-22 02:00:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][50/1251] eta 0:53:09 lr 0.000557 time 1.9202 (2.6559) loss 4.1712 (3.5529) grad_norm 1.4172 (1.4121) [2022-01-22 02:01:17 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [140/300][60/1251] eta 0:50:36 lr 0.000556 time 1.6740 (2.5499) loss 3.8268 (3.5797) grad_norm 1.4119 (1.4175) [2022-01-22 02:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][70/1251] eta 0:49:03 lr 0.000556 time 2.0509 (2.4924) loss 4.1448 (3.5577) grad_norm 1.7372 (1.4217) [2022-01-22 02:02:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][80/1251] eta 0:47:59 lr 0.000556 time 3.1891 (2.4586) loss 4.3423 (3.5321) grad_norm 1.3565 (1.4184) [2022-01-22 02:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][90/1251] eta 0:47:07 lr 0.000556 time 2.8247 (2.4351) loss 4.1650 (3.5440) grad_norm 1.4039 (1.4277) [2022-01-22 02:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][100/1251] eta 0:46:33 lr 0.000556 time 1.7432 (2.4274) loss 2.7932 (3.5666) grad_norm 1.4645 (1.4226) [2022-01-22 02:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][110/1251] eta 0:45:35 lr 0.000556 time 2.2376 (2.3975) loss 3.2967 (3.5823) grad_norm 1.6132 (1.4280) [2022-01-22 02:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][120/1251] eta 0:44:49 lr 0.000556 time 2.9755 (2.3778) loss 2.4740 (3.5740) grad_norm 1.5090 (1.4293) [2022-01-22 02:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][130/1251] eta 0:44:03 lr 0.000556 time 2.6209 (2.3584) loss 3.8666 (3.5521) grad_norm 1.3863 (1.4360) [2022-01-22 02:04:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][140/1251] eta 0:43:10 lr 0.000556 time 1.7973 (2.3320) loss 3.1573 (3.5499) grad_norm 1.6033 (1.4399) [2022-01-22 02:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][150/1251] eta 0:42:30 lr 0.000556 time 1.9106 (2.3164) loss 3.8319 (3.5409) grad_norm 1.4830 (1.4394) [2022-01-22 02:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][160/1251] eta 0:42:09 lr 0.000556 time 2.5470 (2.3184) loss 4.0484 (3.5532) 
grad_norm 1.4648 (1.4394) [2022-01-22 02:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][170/1251] eta 0:41:50 lr 0.000556 time 2.0862 (2.3221) loss 3.2555 (3.5598) grad_norm 1.5273 (1.4379) [2022-01-22 02:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][180/1251] eta 0:41:26 lr 0.000556 time 2.9031 (2.3216) loss 3.5100 (3.5707) grad_norm 1.6145 (1.4385) [2022-01-22 02:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][190/1251] eta 0:41:01 lr 0.000556 time 1.8085 (2.3197) loss 3.5674 (3.5619) grad_norm 1.5944 (1.4414) [2022-01-22 02:06:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][200/1251] eta 0:40:26 lr 0.000556 time 1.9701 (2.3092) loss 3.8450 (3.5728) grad_norm 1.5724 (1.4445) [2022-01-22 02:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][210/1251] eta 0:39:45 lr 0.000556 time 1.8572 (2.2914) loss 3.1265 (3.5713) grad_norm 1.2790 (1.4431) [2022-01-22 02:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][220/1251] eta 0:39:08 lr 0.000556 time 2.0182 (2.2778) loss 3.7356 (3.5838) grad_norm 1.3756 (1.4406) [2022-01-22 02:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][230/1251] eta 0:38:36 lr 0.000556 time 2.4323 (2.2691) loss 3.1508 (3.5789) grad_norm 1.5857 (1.4433) [2022-01-22 02:07:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][240/1251] eta 0:38:09 lr 0.000556 time 2.2122 (2.2650) loss 2.3222 (3.5655) grad_norm 1.4958 (1.4433) [2022-01-22 02:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][250/1251] eta 0:37:46 lr 0.000556 time 2.4446 (2.2643) loss 3.4750 (3.5642) grad_norm 1.4187 (1.4428) [2022-01-22 02:08:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][260/1251] eta 0:37:18 lr 0.000556 time 1.9548 (2.2585) loss 3.8770 (3.5632) grad_norm 1.3785 (1.4429) [2022-01-22 02:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [140/300][270/1251] eta 0:36:54 lr 0.000556 time 2.7261 (2.2578) loss 2.8664 (3.5662) grad_norm 1.4374 (1.4445) [2022-01-22 02:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][280/1251] eta 0:36:28 lr 0.000556 time 1.8892 (2.2542) loss 4.1437 (3.5699) grad_norm 1.8044 (1.4477) [2022-01-22 02:09:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][290/1251] eta 0:36:09 lr 0.000556 time 2.3816 (2.2578) loss 3.4907 (3.5645) grad_norm 1.3602 (1.4489) [2022-01-22 02:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][300/1251] eta 0:35:47 lr 0.000556 time 1.8088 (2.2578) loss 3.8121 (3.5662) grad_norm 1.3976 (1.4499) [2022-01-22 02:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][310/1251] eta 0:35:30 lr 0.000555 time 2.7610 (2.2638) loss 2.9727 (3.5620) grad_norm 1.6350 (1.4481) [2022-01-22 02:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][320/1251] eta 0:35:04 lr 0.000555 time 1.5643 (2.2602) loss 4.2334 (3.5703) grad_norm 1.2854 (1.4467) [2022-01-22 02:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][330/1251] eta 0:34:38 lr 0.000555 time 1.5850 (2.2568) loss 2.4115 (3.5670) grad_norm 1.2410 (1.4425) [2022-01-22 02:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][340/1251] eta 0:34:08 lr 0.000555 time 1.9980 (2.2482) loss 3.8004 (3.5570) grad_norm 1.4917 (1.4419) [2022-01-22 02:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][350/1251] eta 0:33:40 lr 0.000555 time 2.1610 (2.2420) loss 3.6291 (3.5626) grad_norm 1.5755 (1.4423) [2022-01-22 02:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][360/1251] eta 0:33:13 lr 0.000555 time 1.7177 (2.2374) loss 3.4500 (3.5663) grad_norm 1.6908 (1.4449) [2022-01-22 02:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][370/1251] eta 0:32:45 lr 0.000555 time 1.8847 (2.2315) loss 3.9082 (3.5693) 
grad_norm 1.3016 (1.4494) [2022-01-22 02:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][380/1251] eta 0:32:21 lr 0.000555 time 1.8012 (2.2293) loss 4.0359 (3.5690) grad_norm 1.2474 (1.4471) [2022-01-22 02:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][390/1251] eta 0:31:58 lr 0.000555 time 2.4747 (2.2285) loss 3.8073 (3.5732) grad_norm 1.3989 (1.4467) [2022-01-22 02:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][400/1251] eta 0:31:35 lr 0.000555 time 1.9575 (2.2275) loss 3.6593 (3.5749) grad_norm 1.2524 (1.4470) [2022-01-22 02:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][410/1251] eta 0:31:11 lr 0.000555 time 1.8820 (2.2257) loss 2.5456 (3.5751) grad_norm 1.4469 (1.4454) [2022-01-22 02:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][420/1251] eta 0:30:51 lr 0.000555 time 2.1986 (2.2277) loss 3.3476 (3.5771) grad_norm 1.3030 (1.4461) [2022-01-22 02:14:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][430/1251] eta 0:30:28 lr 0.000555 time 2.6187 (2.2276) loss 3.6789 (3.5772) grad_norm 1.3316 (1.4451) [2022-01-22 02:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][440/1251] eta 0:30:09 lr 0.000555 time 2.1129 (2.2307) loss 3.3493 (3.5725) grad_norm 1.4077 (1.4449) [2022-01-22 02:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][450/1251] eta 0:29:44 lr 0.000555 time 2.0865 (2.2277) loss 3.0265 (3.5731) grad_norm 1.2946 (1.4441) [2022-01-22 02:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][460/1251] eta 0:29:21 lr 0.000555 time 1.5811 (2.2271) loss 4.2715 (3.5722) grad_norm 1.3335 (1.4445) [2022-01-22 02:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][470/1251] eta 0:28:59 lr 0.000555 time 1.9564 (2.2273) loss 4.0394 (3.5742) grad_norm 1.7236 (1.4436) [2022-01-22 02:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [140/300][480/1251] eta 0:28:38 lr 0.000555 time 1.8671 (2.2291) loss 4.3319 (3.5784) grad_norm 1.8917 (1.4431) [2022-01-22 02:16:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][490/1251] eta 0:28:14 lr 0.000555 time 1.5413 (2.2261) loss 3.6994 (3.5795) grad_norm 1.5674 (1.4462) [2022-01-22 02:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][500/1251] eta 0:27:50 lr 0.000555 time 1.8156 (2.2249) loss 3.5632 (3.5817) grad_norm 1.1139 (1.4479) [2022-01-22 02:17:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][510/1251] eta 0:27:28 lr 0.000555 time 2.8244 (2.2248) loss 3.4654 (3.5830) grad_norm 1.5622 (1.4488) [2022-01-22 02:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][520/1251] eta 0:27:07 lr 0.000555 time 1.8651 (2.2268) loss 3.0393 (3.5839) grad_norm 1.2935 (1.4481) [2022-01-22 02:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][530/1251] eta 0:26:43 lr 0.000555 time 1.6609 (2.2237) loss 4.2626 (3.5859) grad_norm 1.5562 (1.4472) [2022-01-22 02:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][540/1251] eta 0:26:19 lr 0.000555 time 2.2554 (2.2214) loss 2.5114 (3.5837) grad_norm 1.2700 (1.4467) [2022-01-22 02:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][550/1251] eta 0:25:56 lr 0.000554 time 1.8609 (2.2208) loss 4.1718 (3.5860) grad_norm 1.3612 (1.4451) [2022-01-22 02:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][560/1251] eta 0:25:33 lr 0.000554 time 1.9089 (2.2198) loss 3.2410 (3.5823) grad_norm 1.5414 (1.4464) [2022-01-22 02:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][570/1251] eta 0:25:12 lr 0.000554 time 2.0924 (2.2208) loss 3.1417 (3.5816) grad_norm 1.8235 (1.4468) [2022-01-22 02:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][580/1251] eta 0:24:48 lr 0.000554 time 2.2473 (2.2190) loss 3.6946 (3.5841) 
grad_norm 1.2336 (1.4466) [2022-01-22 02:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][590/1251] eta 0:24:26 lr 0.000554 time 2.3846 (2.2180) loss 4.3384 (3.5875) grad_norm 1.5593 (1.4468) [2022-01-22 02:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][600/1251] eta 0:24:02 lr 0.000554 time 2.4513 (2.2163) loss 2.4749 (3.5834) grad_norm 1.5460 (1.4471) [2022-01-22 02:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][610/1251] eta 0:23:38 lr 0.000554 time 2.5557 (2.2130) loss 3.9968 (3.5851) grad_norm 1.5654 (1.4499) [2022-01-22 02:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][620/1251] eta 0:23:15 lr 0.000554 time 1.9429 (2.2113) loss 3.9879 (3.5843) grad_norm 1.5880 (1.4495) [2022-01-22 02:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][630/1251] eta 0:22:53 lr 0.000554 time 2.6214 (2.2113) loss 2.9502 (3.5845) grad_norm 1.5332 (1.4495) [2022-01-22 02:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][640/1251] eta 0:22:29 lr 0.000554 time 1.5999 (2.2090) loss 4.0720 (3.5874) grad_norm 1.3511 (1.4482) [2022-01-22 02:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][650/1251] eta 0:22:07 lr 0.000554 time 2.4544 (2.2088) loss 3.7171 (3.5914) grad_norm 1.2302 (1.4474) [2022-01-22 02:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][660/1251] eta 0:21:44 lr 0.000554 time 1.8158 (2.2072) loss 4.0154 (3.5954) grad_norm 1.3755 (1.4473) [2022-01-22 02:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][670/1251] eta 0:21:26 lr 0.000554 time 2.2454 (2.2135) loss 3.2906 (3.5955) grad_norm 1.2754 (1.4472) [2022-01-22 02:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][680/1251] eta 0:21:05 lr 0.000554 time 1.9163 (2.2165) loss 4.3077 (3.5962) grad_norm 1.4934 (1.4472) [2022-01-22 02:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [140/300][690/1251] eta 0:20:43 lr 0.000554 time 2.2039 (2.2163) loss 3.8408 (3.5928) grad_norm 1.3660 (1.4462) [2022-01-22 02:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][700/1251] eta 0:20:20 lr 0.000554 time 2.1297 (2.2154) loss 3.4520 (3.5927) grad_norm 1.4627 (1.4443) [2022-01-22 02:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][710/1251] eta 0:19:59 lr 0.000554 time 2.4128 (2.2166) loss 3.4320 (3.5939) grad_norm 1.3617 (1.4436) [2022-01-22 02:25:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][720/1251] eta 0:19:35 lr 0.000554 time 1.9468 (2.2137) loss 3.5948 (3.5926) grad_norm 1.6150 (1.4431) [2022-01-22 02:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][730/1251] eta 0:19:12 lr 0.000554 time 2.4584 (2.2128) loss 3.7690 (3.5939) grad_norm 1.4514 (1.4432) [2022-01-22 02:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][740/1251] eta 0:18:49 lr 0.000554 time 2.0899 (2.2108) loss 3.1142 (3.5937) grad_norm 1.3478 (1.4432) [2022-01-22 02:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][750/1251] eta 0:18:28 lr 0.000554 time 1.8898 (2.2129) loss 4.2372 (3.5903) grad_norm 1.3813 (1.4427) [2022-01-22 02:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][760/1251] eta 0:18:06 lr 0.000554 time 1.8454 (2.2128) loss 4.0177 (3.5906) grad_norm 1.5804 (1.4428) [2022-01-22 02:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][770/1251] eta 0:17:44 lr 0.000554 time 2.1856 (2.2128) loss 2.8357 (3.5886) grad_norm 1.3518 (1.4423) [2022-01-22 02:27:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][780/1251] eta 0:17:21 lr 0.000554 time 1.9216 (2.2112) loss 2.8433 (3.5864) grad_norm 1.2917 (1.4417) [2022-01-22 02:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][790/1251] eta 0:17:01 lr 0.000553 time 2.4963 (2.2148) loss 2.9875 (3.5821) 
grad_norm 1.3940 (1.4409) [2022-01-22 02:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][800/1251] eta 0:16:37 lr 0.000553 time 1.6731 (2.2123) loss 4.2694 (3.5838) grad_norm 1.4366 (1.4407) [2022-01-22 02:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][810/1251] eta 0:16:14 lr 0.000553 time 1.8639 (2.2103) loss 3.8933 (3.5865) grad_norm 1.3928 (1.4405) [2022-01-22 02:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][820/1251] eta 0:15:52 lr 0.000553 time 1.7618 (2.2092) loss 4.0850 (3.5879) grad_norm 1.5124 (1.4404) [2022-01-22 02:29:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][830/1251] eta 0:15:30 lr 0.000553 time 2.4619 (2.2104) loss 3.2346 (3.5907) grad_norm 1.2541 (1.4400) [2022-01-22 02:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][840/1251] eta 0:15:08 lr 0.000553 time 1.8117 (2.2102) loss 2.8176 (3.5914) grad_norm 1.4951 (1.4405) [2022-01-22 02:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][850/1251] eta 0:14:46 lr 0.000553 time 1.9418 (2.2096) loss 3.3702 (3.5902) grad_norm 1.8829 (1.4412) [2022-01-22 02:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][860/1251] eta 0:14:23 lr 0.000553 time 1.7208 (2.2090) loss 3.9224 (3.5901) grad_norm 1.2893 (1.4412) [2022-01-22 02:30:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][870/1251] eta 0:14:01 lr 0.000553 time 2.0876 (2.2094) loss 4.4237 (3.5886) grad_norm 1.6382 (1.4405) [2022-01-22 02:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][880/1251] eta 0:13:38 lr 0.000553 time 1.9015 (2.2069) loss 3.6355 (3.5895) grad_norm 1.3457 (1.4396) [2022-01-22 02:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][890/1251] eta 0:13:16 lr 0.000553 time 1.9487 (2.2055) loss 3.6452 (3.5915) grad_norm 1.5496 (1.4389) [2022-01-22 02:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [140/300][900/1251] eta 0:12:53 lr 0.000553 time 2.0792 (2.2047) loss 2.9791 (3.5913) grad_norm 1.5627 (1.4389) [2022-01-22 02:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][910/1251] eta 0:12:32 lr 0.000553 time 2.1696 (2.2065) loss 3.5513 (3.5912) grad_norm 1.3440 (1.4388) [2022-01-22 02:32:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][920/1251] eta 0:12:10 lr 0.000553 time 1.4738 (2.2071) loss 2.5511 (3.5877) grad_norm 1.5289 (1.4384) [2022-01-22 02:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][930/1251] eta 0:11:48 lr 0.000553 time 1.4932 (2.2061) loss 3.7070 (3.5848) grad_norm 1.5926 (1.4378) [2022-01-22 02:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][940/1251] eta 0:11:25 lr 0.000553 time 1.9596 (2.2053) loss 2.6158 (3.5826) grad_norm 1.8665 (1.4387) [2022-01-22 02:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][950/1251] eta 0:11:04 lr 0.000553 time 2.1409 (2.2062) loss 4.3952 (3.5845) grad_norm 1.5786 (1.4392) [2022-01-22 02:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][960/1251] eta 0:10:42 lr 0.000553 time 1.6579 (2.2074) loss 4.4160 (3.5867) grad_norm 1.3445 (1.4397) [2022-01-22 02:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][970/1251] eta 0:10:19 lr 0.000553 time 1.8294 (2.2060) loss 4.0480 (3.5836) grad_norm 1.5664 (1.4400) [2022-01-22 02:34:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][980/1251] eta 0:09:57 lr 0.000553 time 1.8901 (2.2048) loss 2.7690 (3.5842) grad_norm 1.4540 (1.4390) [2022-01-22 02:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][990/1251] eta 0:09:34 lr 0.000553 time 1.6207 (2.2030) loss 3.2159 (3.5844) grad_norm 1.4802 (1.4392) [2022-01-22 02:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1000/1251] eta 0:09:12 lr 0.000553 time 2.0133 (2.2021) loss 3.4704 (3.5843) 
grad_norm 1.4528 (1.4389) [2022-01-22 02:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1010/1251] eta 0:08:50 lr 0.000553 time 1.9289 (2.2022) loss 4.1575 (3.5861) grad_norm 1.2704 (1.4393) [2022-01-22 02:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1020/1251] eta 0:08:28 lr 0.000553 time 2.1810 (2.2017) loss 3.7168 (3.5853) grad_norm 1.4206 (1.4398) [2022-01-22 02:36:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1030/1251] eta 0:08:07 lr 0.000552 time 2.2375 (2.2040) loss 3.0007 (3.5847) grad_norm 1.6255 (1.4396) [2022-01-22 02:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1040/1251] eta 0:07:45 lr 0.000552 time 2.0049 (2.2044) loss 3.9486 (3.5833) grad_norm 2.0919 (1.4406) [2022-01-22 02:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1050/1251] eta 0:07:23 lr 0.000552 time 2.1662 (2.2047) loss 3.6698 (3.5837) grad_norm 1.6947 (1.4412) [2022-01-22 02:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1060/1251] eta 0:07:00 lr 0.000552 time 1.9694 (2.2032) loss 3.2171 (3.5849) grad_norm 1.4668 (1.4406) [2022-01-22 02:38:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1070/1251] eta 0:06:38 lr 0.000552 time 2.5575 (2.2028) loss 4.1487 (3.5877) grad_norm 1.3050 (1.4398) [2022-01-22 02:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1080/1251] eta 0:06:16 lr 0.000552 time 1.5630 (2.2016) loss 3.8148 (3.5867) grad_norm 1.4555 (1.4403) [2022-01-22 02:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1090/1251] eta 0:05:54 lr 0.000552 time 1.8582 (2.2011) loss 2.4675 (3.5851) grad_norm 1.4715 (1.4406) [2022-01-22 02:39:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1100/1251] eta 0:05:32 lr 0.000552 time 2.2404 (2.2017) loss 3.8747 (3.5863) grad_norm 1.3406 (1.4397) [2022-01-22 02:39:27 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [140/300][1110/1251] eta 0:05:10 lr 0.000552 time 2.1808 (2.2012) loss 4.0064 (3.5866) grad_norm 1.3038 (1.4392) [2022-01-22 02:39:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1120/1251] eta 0:04:48 lr 0.000552 time 2.7813 (2.2025) loss 3.0930 (3.5846) grad_norm 1.7958 (1.4400) [2022-01-22 02:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1130/1251] eta 0:04:26 lr 0.000552 time 2.4945 (2.2034) loss 3.9487 (3.5873) grad_norm 1.3148 (1.4402) [2022-01-22 02:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1140/1251] eta 0:04:04 lr 0.000552 time 2.1065 (2.2039) loss 3.9694 (3.5882) grad_norm 1.4372 (1.4404) [2022-01-22 02:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1150/1251] eta 0:03:42 lr 0.000552 time 1.8924 (2.2029) loss 4.0104 (3.5894) grad_norm 1.7326 (1.4407) [2022-01-22 02:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1160/1251] eta 0:03:20 lr 0.000552 time 2.5085 (2.2016) loss 3.9063 (3.5902) grad_norm 1.3944 (1.4409) [2022-01-22 02:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1170/1251] eta 0:02:58 lr 0.000552 time 1.9045 (2.1996) loss 3.8777 (3.5881) grad_norm 1.2959 (1.4405) [2022-01-22 02:41:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1180/1251] eta 0:02:36 lr 0.000552 time 2.1619 (2.1986) loss 4.0725 (3.5886) grad_norm 1.5260 (1.4404) [2022-01-22 02:42:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1190/1251] eta 0:02:14 lr 0.000552 time 2.0574 (2.1984) loss 4.2398 (3.5902) grad_norm 1.4430 (1.4408) [2022-01-22 02:42:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1200/1251] eta 0:01:52 lr 0.000552 time 2.1519 (2.1984) loss 3.8644 (3.5915) grad_norm 1.5414 (1.4414) [2022-01-22 02:43:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1210/1251] eta 0:01:30 lr 0.000552 time 1.9960 (2.1979) loss 
3.3396 (3.5934) grad_norm 1.9549 (1.4428) [2022-01-22 02:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1220/1251] eta 0:01:08 lr 0.000552 time 1.9173 (2.1967) loss 2.7604 (3.5935) grad_norm 1.3302 (1.4432) [2022-01-22 02:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1230/1251] eta 0:00:46 lr 0.000552 time 2.1429 (2.1966) loss 4.0964 (3.5959) grad_norm 1.2297 (1.4430) [2022-01-22 02:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1240/1251] eta 0:00:24 lr 0.000552 time 2.1203 (2.1959) loss 4.1540 (3.5970) grad_norm 1.1873 (1.4428) [2022-01-22 02:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [140/300][1250/1251] eta 0:00:02 lr 0.000552 time 1.1904 (2.1908) loss 3.5406 (3.5962) grad_norm 1.5033 (1.4425) [2022-01-22 02:44:23 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 140 training takes 0:45:41 [2022-01-22 02:44:23 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_140 saving...... [2022-01-22 02:44:34 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_140 saved !!! 
[2022-01-22 02:44:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.043 (15.043) Loss 1.0058 (1.0058) Acc@1 76.562 (76.562) Acc@5 92.871 (92.871) [2022-01-22 02:45:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.600 (2.890) Loss 1.1210 (1.0518) Acc@1 73.730 (75.426) Acc@5 91.602 (92.605) [2022-01-22 02:45:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.699 (2.386) Loss 1.0345 (1.0467) Acc@1 75.293 (75.437) Acc@5 93.750 (92.811) [2022-01-22 02:45:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.028 (2.071) Loss 0.9947 (1.0360) Acc@1 76.465 (75.564) Acc@5 93.359 (93.016) [2022-01-22 02:45:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.840 (1.955) Loss 0.9948 (1.0345) Acc@1 77.344 (75.679) Acc@5 93.262 (93.062) [2022-01-22 02:46:01 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.632 Acc@5 93.040 [2022-01-22 02:46:01 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-01-22 02:46:01 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.63% [2022-01-22 02:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][0/1251] eta 7:34:52 lr 0.000552 time 21.8162 (21.8162) loss 2.9439 (2.9439) grad_norm 1.4289 (1.4289) [2022-01-22 02:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][10/1251] eta 1:24:30 lr 0.000552 time 2.0757 (4.0856) loss 3.6359 (3.5176) grad_norm 1.4123 (1.4190) [2022-01-22 02:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][20/1251] eta 1:04:40 lr 0.000552 time 2.4170 (3.1524) loss 4.0974 (3.6095) grad_norm 1.3533 (1.4665) [2022-01-22 02:47:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][30/1251] eta 0:57:50 lr 0.000551 time 1.3682 (2.8420) loss 2.4714 (3.5609) grad_norm 1.3546 (1.4796) [2022-01-22 02:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[141/300][40/1251] eta 0:55:14 lr 0.000551 time 5.2599 (2.7368) loss 3.6483 (3.5546) grad_norm 1.2895 (1.4750) [2022-01-22 02:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][50/1251] eta 0:53:26 lr 0.000551 time 3.7649 (2.6697) loss 3.7475 (3.5831) grad_norm 1.2044 (1.4635) [2022-01-22 02:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][60/1251] eta 0:51:29 lr 0.000551 time 1.5350 (2.5938) loss 2.7905 (3.5680) grad_norm 1.4110 (1.4539) [2022-01-22 02:49:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][70/1251] eta 0:49:32 lr 0.000551 time 1.9111 (2.5173) loss 3.8842 (3.5685) grad_norm 1.3909 (1.4684) [2022-01-22 02:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][80/1251] eta 0:48:23 lr 0.000551 time 3.4654 (2.4798) loss 4.1449 (3.5371) grad_norm 1.4682 (1.4698) [2022-01-22 02:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][90/1251] eta 0:47:29 lr 0.000551 time 2.7013 (2.4547) loss 2.4644 (3.5507) grad_norm 1.5055 (1.4729) [2022-01-22 02:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][100/1251] eta 0:46:45 lr 0.000551 time 2.1116 (2.4377) loss 3.0635 (3.5523) grad_norm 1.3174 (1.4683) [2022-01-22 02:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][110/1251] eta 0:45:55 lr 0.000551 time 2.3502 (2.4149) loss 3.0402 (3.5206) grad_norm 1.3103 (1.4583) [2022-01-22 02:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][120/1251] eta 0:45:14 lr 0.000551 time 3.3905 (2.4002) loss 3.5694 (3.5183) grad_norm 1.5393 (1.4535) [2022-01-22 02:51:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][130/1251] eta 0:44:16 lr 0.000551 time 1.8036 (2.3698) loss 3.9516 (3.5222) grad_norm 1.6162 (1.4491) [2022-01-22 02:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][140/1251] eta 0:43:29 lr 0.000551 time 2.1524 (2.3487) loss 3.5588 (3.5119) grad_norm 1.4924 
(1.4481) [2022-01-22 02:51:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][150/1251] eta 0:42:51 lr 0.000551 time 2.2136 (2.3355) loss 2.3846 (3.4985) grad_norm 1.5844 (1.4492) [2022-01-22 02:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][160/1251] eta 0:42:27 lr 0.000551 time 3.5379 (2.3353) loss 3.5065 (3.4843) grad_norm 1.5107 (1.4542) [2022-01-22 02:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][170/1251] eta 0:41:56 lr 0.000551 time 2.5912 (2.3275) loss 3.4987 (3.4693) grad_norm 1.5075 (1.4528) [2022-01-22 02:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][180/1251] eta 0:41:11 lr 0.000551 time 1.7782 (2.3078) loss 3.6525 (3.4843) grad_norm 1.5824 (1.4493) [2022-01-22 02:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][190/1251] eta 0:40:43 lr 0.000551 time 2.5232 (2.3034) loss 3.3933 (3.4708) grad_norm 1.4288 (1.4489) [2022-01-22 02:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][200/1251] eta 0:40:20 lr 0.000551 time 2.2998 (2.3027) loss 3.7517 (3.4816) grad_norm 1.4219 (1.4495) [2022-01-22 02:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][210/1251] eta 0:39:57 lr 0.000551 time 2.1364 (2.3031) loss 2.4752 (3.4893) grad_norm 1.6670 (1.4515) [2022-01-22 02:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][220/1251] eta 0:39:29 lr 0.000551 time 2.5271 (2.2980) loss 4.0437 (3.4916) grad_norm 1.4823 (1.4561) [2022-01-22 02:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][230/1251] eta 0:38:54 lr 0.000551 time 1.9056 (2.2861) loss 3.2416 (3.4880) grad_norm 1.6281 (1.4572) [2022-01-22 02:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][240/1251] eta 0:38:19 lr 0.000551 time 2.0110 (2.2747) loss 3.6701 (3.4918) grad_norm 1.5428 (1.4576) [2022-01-22 02:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[141/300][250/1251] eta 0:37:53 lr 0.000551 time 2.4919 (2.2713) loss 3.1732 (3.4956) grad_norm 1.3758 (1.4572) [2022-01-22 02:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][260/1251] eta 0:37:23 lr 0.000551 time 2.1615 (2.2642) loss 3.9356 (3.4956) grad_norm 1.5780 (1.4575) [2022-01-22 02:56:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][270/1251] eta 0:37:01 lr 0.000550 time 2.8423 (2.2645) loss 2.7730 (3.4890) grad_norm 1.4966 (1.4598) [2022-01-22 02:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][280/1251] eta 0:36:30 lr 0.000550 time 1.7606 (2.2557) loss 3.0079 (3.4970) grad_norm 1.2827 (1.4567) [2022-01-22 02:56:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][290/1251] eta 0:36:06 lr 0.000550 time 2.6663 (2.2543) loss 3.3996 (3.4972) grad_norm 1.4931 (1.4560) [2022-01-22 02:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][300/1251] eta 0:35:38 lr 0.000550 time 1.8500 (2.2489) loss 3.0795 (3.5022) grad_norm 1.2147 (1.4546) [2022-01-22 02:57:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][310/1251] eta 0:35:15 lr 0.000550 time 2.2986 (2.2483) loss 3.9654 (3.4960) grad_norm 1.3631 (1.4530) [2022-01-22 02:58:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][320/1251] eta 0:34:51 lr 0.000550 time 2.9358 (2.2467) loss 2.8257 (3.4858) grad_norm 1.2922 (1.4516) [2022-01-22 02:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][330/1251] eta 0:34:30 lr 0.000550 time 2.8167 (2.2480) loss 4.0766 (3.4854) grad_norm 1.4713 (1.4510) [2022-01-22 02:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][340/1251] eta 0:34:06 lr 0.000550 time 1.9275 (2.2469) loss 4.1433 (3.4907) grad_norm 1.3053 (1.4528) [2022-01-22 02:59:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][350/1251] eta 0:33:44 lr 0.000550 time 2.8430 (2.2471) loss 3.8418 (3.5005) grad_norm 
1.6345 (1.4508) [2022-01-22 02:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][360/1251] eta 0:33:16 lr 0.000550 time 1.9833 (2.2409) loss 3.5135 (3.5047) grad_norm 1.4891 (1.4504) [2022-01-22 02:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][370/1251] eta 0:32:55 lr 0.000550 time 2.6682 (2.2428) loss 2.7756 (3.5040) grad_norm 1.5472 (1.4482) [2022-01-22 03:00:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][380/1251] eta 0:32:30 lr 0.000550 time 2.2240 (2.2399) loss 4.1352 (3.5017) grad_norm 1.5134 (1.4558) [2022-01-22 03:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][390/1251] eta 0:32:09 lr 0.000550 time 2.2069 (2.2411) loss 3.7366 (3.5072) grad_norm 1.6818 (1.4562) [2022-01-22 03:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][400/1251] eta 0:31:47 lr 0.000550 time 1.8767 (2.2410) loss 4.1152 (3.5079) grad_norm 1.3148 (1.4539) [2022-01-22 03:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][410/1251] eta 0:31:24 lr 0.000550 time 3.3418 (2.2413) loss 3.7005 (3.5100) grad_norm 1.6609 (1.4547) [2022-01-22 03:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][420/1251] eta 0:30:58 lr 0.000550 time 1.8902 (2.2364) loss 3.4036 (3.5123) grad_norm 1.2956 (1.4547) [2022-01-22 03:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][430/1251] eta 0:30:31 lr 0.000550 time 2.1788 (2.2311) loss 4.0725 (3.5099) grad_norm 1.5825 (1.4554) [2022-01-22 03:02:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][440/1251] eta 0:30:07 lr 0.000550 time 2.2520 (2.2287) loss 3.3851 (3.5043) grad_norm 1.4407 (1.4574) [2022-01-22 03:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][450/1251] eta 0:29:45 lr 0.000550 time 2.4776 (2.2288) loss 4.2303 (3.5062) grad_norm 1.4156 (1.4584) [2022-01-22 03:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[141/300][460/1251] eta 0:29:23 lr 0.000550 time 2.0949 (2.2300) loss 2.8995 (3.5078) grad_norm 1.5909 (1.4585) [2022-01-22 03:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][470/1251] eta 0:29:02 lr 0.000550 time 1.8878 (2.2313) loss 2.9170 (3.5099) grad_norm 1.4441 (1.4578) [2022-01-22 03:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][480/1251] eta 0:28:40 lr 0.000550 time 2.1591 (2.2317) loss 3.6203 (3.5116) grad_norm 1.5051 (1.4571) [2022-01-22 03:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][490/1251] eta 0:28:16 lr 0.000550 time 2.2747 (2.2291) loss 3.6348 (3.5137) grad_norm 1.3377 (1.4564) [2022-01-22 03:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][500/1251] eta 0:27:49 lr 0.000550 time 1.7835 (2.2234) loss 3.0416 (3.5078) grad_norm 1.4500 (1.4583) [2022-01-22 03:04:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][510/1251] eta 0:27:25 lr 0.000549 time 1.7971 (2.2205) loss 2.5851 (3.5121) grad_norm 1.4985 (1.4587) [2022-01-22 03:05:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][520/1251] eta 0:27:01 lr 0.000549 time 1.9684 (2.2188) loss 2.9377 (3.5143) grad_norm 1.3783 (1.4596) [2022-01-22 03:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][530/1251] eta 0:26:40 lr 0.000549 time 3.0363 (2.2198) loss 4.2135 (3.5132) grad_norm 1.3181 (1.4587) [2022-01-22 03:06:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][540/1251] eta 0:26:19 lr 0.000549 time 2.4220 (2.2213) loss 3.2723 (3.5160) grad_norm 1.4445 (1.4601) [2022-01-22 03:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][550/1251] eta 0:25:55 lr 0.000549 time 1.9985 (2.2191) loss 4.0355 (3.5156) grad_norm 1.3952 (1.4599) [2022-01-22 03:06:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][560/1251] eta 0:25:32 lr 0.000549 time 1.6985 (2.2184) loss 4.0976 (3.5200) grad_norm 
1.2448 (1.4603) [2022-01-22 03:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][570/1251] eta 0:25:11 lr 0.000549 time 3.0940 (2.2199) loss 3.9320 (3.5198) grad_norm 1.4528 (1.4599) [2022-01-22 03:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][580/1251] eta 0:24:47 lr 0.000549 time 1.6522 (2.2169) loss 2.9616 (3.5211) grad_norm 1.3795 (1.4595) [2022-01-22 03:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][590/1251] eta 0:24:25 lr 0.000549 time 2.3081 (2.2171) loss 3.8372 (3.5242) grad_norm 1.4329 (1.4597) [2022-01-22 03:08:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][600/1251] eta 0:24:02 lr 0.000549 time 1.9797 (2.2159) loss 4.1164 (3.5227) grad_norm 1.3519 (1.4621) [2022-01-22 03:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][610/1251] eta 0:23:40 lr 0.000549 time 2.8824 (2.2161) loss 2.9448 (3.5254) grad_norm 1.3862 (1.4644) [2022-01-22 03:08:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][620/1251] eta 0:23:16 lr 0.000549 time 1.9553 (2.2136) loss 3.0464 (3.5242) grad_norm 1.5190 (1.4669) [2022-01-22 03:09:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][630/1251] eta 0:22:53 lr 0.000549 time 1.9617 (2.2123) loss 4.0207 (3.5302) grad_norm 1.5953 (1.4661) [2022-01-22 03:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][640/1251] eta 0:22:31 lr 0.000549 time 1.6471 (2.2114) loss 3.8092 (3.5324) grad_norm 1.4689 (1.4660) [2022-01-22 03:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][650/1251] eta 0:22:10 lr 0.000549 time 3.6741 (2.2135) loss 2.9592 (3.5349) grad_norm 1.5143 (1.4658) [2022-01-22 03:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][660/1251] eta 0:21:48 lr 0.000549 time 2.3110 (2.2144) loss 3.7578 (3.5377) grad_norm 1.3448 (1.4661) [2022-01-22 03:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[141/300][670/1251] eta 0:21:27 lr 0.000549 time 1.8793 (2.2164) loss 3.7921 (3.5395) grad_norm 1.5642 (1.4654) [2022-01-22 03:11:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][680/1251] eta 0:21:05 lr 0.000549 time 2.0023 (2.2165) loss 2.8945 (3.5379) grad_norm 1.3387 (1.4648) [2022-01-22 03:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][690/1251] eta 0:20:44 lr 0.000549 time 3.6563 (2.2189) loss 3.7239 (3.5402) grad_norm 1.4323 (1.4634) [2022-01-22 03:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][700/1251] eta 0:20:21 lr 0.000549 time 1.8930 (2.2165) loss 4.1770 (3.5413) grad_norm 1.3503 (1.4622) [2022-01-22 03:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][710/1251] eta 0:19:57 lr 0.000549 time 1.8333 (2.2139) loss 3.8087 (3.5403) grad_norm 1.2814 (1.4623) [2022-01-22 03:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][720/1251] eta 0:19:33 lr 0.000549 time 1.8926 (2.2101) loss 4.1984 (3.5425) grad_norm 1.6587 (1.4623) [2022-01-22 03:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][730/1251] eta 0:19:10 lr 0.000549 time 2.2037 (2.2084) loss 3.1280 (3.5419) grad_norm 1.6301 (1.4626) [2022-01-22 03:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][740/1251] eta 0:18:47 lr 0.000549 time 1.9130 (2.2062) loss 3.7422 (3.5422) grad_norm 1.2793 (1.4627) [2022-01-22 03:13:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][750/1251] eta 0:18:25 lr 0.000548 time 1.9824 (2.2056) loss 2.5358 (3.5433) grad_norm 1.5090 (1.4625) [2022-01-22 03:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][760/1251] eta 0:18:02 lr 0.000548 time 1.7720 (2.2038) loss 4.3945 (3.5465) grad_norm 1.4469 (1.4621) [2022-01-22 03:14:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][770/1251] eta 0:17:40 lr 0.000548 time 2.7667 (2.2045) loss 4.0517 (3.5484) grad_norm 
1.3041 (1.4611) [2022-01-22 03:14:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][780/1251] eta 0:17:19 lr 0.000548 time 1.8329 (2.2063) loss 3.8340 (3.5472) grad_norm 1.4664 (1.4622) [2022-01-22 03:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][790/1251] eta 0:16:57 lr 0.000548 time 1.9855 (2.2073) loss 2.6992 (3.5448) grad_norm 1.3286 (1.4628) [2022-01-22 03:15:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][800/1251] eta 0:16:36 lr 0.000548 time 1.8682 (2.2098) loss 3.2977 (3.5416) grad_norm 1.4784 (1.4626) [2022-01-22 03:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][810/1251] eta 0:16:15 lr 0.000548 time 2.2049 (2.2114) loss 3.8479 (3.5451) grad_norm 1.6612 (1.4650) [2022-01-22 03:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][820/1251] eta 0:15:53 lr 0.000548 time 1.9774 (2.2113) loss 3.6154 (3.5472) grad_norm 1.4025 (1.4654) [2022-01-22 03:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][830/1251] eta 0:15:29 lr 0.000548 time 1.9465 (2.2085) loss 3.5573 (3.5473) grad_norm 1.4593 (1.4651) [2022-01-22 03:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][840/1251] eta 0:15:06 lr 0.000548 time 1.9360 (2.2059) loss 4.1200 (3.5519) grad_norm 1.3818 (1.4647) [2022-01-22 03:17:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][850/1251] eta 0:14:45 lr 0.000548 time 2.8684 (2.2071) loss 2.8293 (3.5549) grad_norm 1.3037 (1.4634) [2022-01-22 03:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][860/1251] eta 0:14:23 lr 0.000548 time 2.4216 (2.2081) loss 2.8600 (3.5549) grad_norm 1.3480 (1.4625) [2022-01-22 03:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][870/1251] eta 0:14:01 lr 0.000548 time 1.8471 (2.2081) loss 4.0517 (3.5540) grad_norm 1.3759 (1.4611) [2022-01-22 03:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[141/300][880/1251] eta 0:13:39 lr 0.000548 time 2.1941 (2.2097) loss 2.6872 (3.5523) grad_norm 1.2961 (1.4603) [2022-01-22 03:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][890/1251] eta 0:13:18 lr 0.000548 time 1.5996 (2.2107) loss 4.1646 (3.5565) grad_norm 1.5559 (1.4599) [2022-01-22 03:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][900/1251] eta 0:12:55 lr 0.000548 time 1.9878 (2.2080) loss 3.1468 (3.5578) grad_norm 1.3247 (1.4608) [2022-01-22 03:19:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][910/1251] eta 0:12:32 lr 0.000548 time 2.0177 (2.2060) loss 3.1077 (3.5558) grad_norm 1.2999 (1.4607) [2022-01-22 03:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][920/1251] eta 0:12:09 lr 0.000548 time 1.6447 (2.2039) loss 4.2910 (3.5562) grad_norm 1.3824 (1.4618) [2022-01-22 03:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][930/1251] eta 0:11:47 lr 0.000548 time 1.8293 (2.2027) loss 4.5163 (3.5594) grad_norm 1.4214 (1.4619) [2022-01-22 03:20:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][940/1251] eta 0:11:25 lr 0.000548 time 2.0881 (2.2028) loss 4.0827 (3.5605) grad_norm 1.4865 (1.4635) [2022-01-22 03:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][950/1251] eta 0:11:03 lr 0.000548 time 2.5500 (2.2043) loss 2.6402 (3.5611) grad_norm 1.3460 (1.4633) [2022-01-22 03:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][960/1251] eta 0:10:42 lr 0.000548 time 2.6984 (2.2064) loss 3.4011 (3.5581) grad_norm 1.3740 (1.4636) [2022-01-22 03:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][970/1251] eta 0:10:20 lr 0.000548 time 2.4571 (2.2093) loss 3.3961 (3.5614) grad_norm 1.3502 (1.4629) [2022-01-22 03:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][980/1251] eta 0:09:58 lr 0.000548 time 1.8469 (2.2090) loss 3.9636 (3.5645) grad_norm 
1.3323 (1.4627) [2022-01-22 03:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][990/1251] eta 0:09:36 lr 0.000547 time 1.9035 (2.2074) loss 4.1621 (3.5674) grad_norm 1.8120 (1.4639) [2022-01-22 03:22:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1000/1251] eta 0:09:13 lr 0.000547 time 1.8427 (2.2045) loss 3.8390 (3.5677) grad_norm 1.5231 (1.4646) [2022-01-22 03:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1010/1251] eta 0:08:50 lr 0.000547 time 1.7753 (2.2026) loss 3.9826 (3.5702) grad_norm 1.3074 (1.4651) [2022-01-22 03:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1020/1251] eta 0:08:29 lr 0.000547 time 1.4838 (2.2052) loss 4.3603 (3.5731) grad_norm 1.3099 (1.4647) [2022-01-22 03:23:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1030/1251] eta 0:08:07 lr 0.000547 time 1.8779 (2.2043) loss 3.9659 (3.5748) grad_norm 1.5266 (1.4648) [2022-01-22 03:24:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1040/1251] eta 0:07:45 lr 0.000547 time 1.9202 (2.2040) loss 2.8233 (3.5736) grad_norm 1.4217 (1.4649) [2022-01-22 03:24:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1050/1251] eta 0:07:23 lr 0.000547 time 1.8389 (2.2046) loss 4.0819 (3.5772) grad_norm 1.8410 (1.4650) [2022-01-22 03:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1060/1251] eta 0:07:00 lr 0.000547 time 1.8303 (2.2039) loss 3.6256 (3.5792) grad_norm 1.3371 (1.4648) [2022-01-22 03:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1070/1251] eta 0:06:38 lr 0.000547 time 1.7244 (2.2036) loss 2.4053 (3.5800) grad_norm 1.4011 (1.4640) [2022-01-22 03:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1080/1251] eta 0:06:16 lr 0.000547 time 1.8011 (2.2021) loss 2.9806 (3.5808) grad_norm 1.4024 (1.4631) [2022-01-22 03:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [141/300][1090/1251] eta 0:05:54 lr 0.000547 time 2.9618 (2.2015) loss 3.2902 (3.5796) grad_norm 1.4501 (1.4626) [2022-01-22 03:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1100/1251] eta 0:05:32 lr 0.000547 time 1.7747 (2.2006) loss 3.0701 (3.5777) grad_norm 1.4281 (1.4625) [2022-01-22 03:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1110/1251] eta 0:05:10 lr 0.000547 time 2.4783 (2.2016) loss 4.4640 (3.5796) grad_norm 1.5078 (1.4625) [2022-01-22 03:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1120/1251] eta 0:04:48 lr 0.000547 time 2.2049 (2.2015) loss 3.6986 (3.5814) grad_norm 1.3923 (1.4628) [2022-01-22 03:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1130/1251] eta 0:04:26 lr 0.000547 time 2.1357 (2.2015) loss 2.6309 (3.5814) grad_norm 1.7149 (1.4630) [2022-01-22 03:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1140/1251] eta 0:04:04 lr 0.000547 time 2.2023 (2.2003) loss 4.0342 (3.5846) grad_norm 1.3350 (1.4626) [2022-01-22 03:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1150/1251] eta 0:03:42 lr 0.000547 time 1.8295 (2.2021) loss 2.8861 (3.5864) grad_norm 1.3991 (1.4622) [2022-01-22 03:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1160/1251] eta 0:03:20 lr 0.000547 time 2.9054 (2.2021) loss 2.8886 (3.5864) grad_norm 1.3208 (1.4617) [2022-01-22 03:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1170/1251] eta 0:02:58 lr 0.000547 time 2.0668 (2.2020) loss 3.8703 (3.5866) grad_norm 1.4802 (1.4617) [2022-01-22 03:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1180/1251] eta 0:02:36 lr 0.000547 time 1.8765 (2.2001) loss 4.1804 (3.5858) grad_norm 1.8020 (1.4626) [2022-01-22 03:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1190/1251] eta 0:02:14 lr 0.000547 time 1.8695 (2.1992) loss 3.4538 
(3.5856) grad_norm 1.4772 (1.4625) [2022-01-22 03:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1200/1251] eta 0:01:52 lr 0.000547 time 1.9874 (2.1983) loss 3.4158 (3.5848) grad_norm 1.4099 (1.4624) [2022-01-22 03:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1210/1251] eta 0:01:30 lr 0.000547 time 1.8089 (2.1980) loss 4.0328 (3.5838) grad_norm 1.3746 (1.4621) [2022-01-22 03:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1220/1251] eta 0:01:08 lr 0.000547 time 1.6753 (2.1975) loss 3.6810 (3.5832) grad_norm 1.2993 (1.4620) [2022-01-22 03:31:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1230/1251] eta 0:00:46 lr 0.000547 time 1.7572 (2.1967) loss 3.7962 (3.5861) grad_norm 2.0059 (1.4625) [2022-01-22 03:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1240/1251] eta 0:00:24 lr 0.000546 time 1.8161 (2.1976) loss 2.8917 (3.5878) grad_norm 1.5131 (1.4630) [2022-01-22 03:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [141/300][1250/1251] eta 0:00:02 lr 0.000546 time 1.1540 (2.1926) loss 3.5816 (3.5878) grad_norm 1.3198 (1.4624) [2022-01-22 03:31:45 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 141 training takes 0:45:43 [2022-01-22 03:32:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.409 (19.409) Loss 1.0407 (1.0407) Acc@1 75.781 (75.781) Acc@5 94.043 (94.043) [2022-01-22 03:32:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.461 (3.573) Loss 1.0651 (1.0611) Acc@1 76.074 (75.648) Acc@5 92.773 (93.226) [2022-01-22 03:32:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.621 (2.638) Loss 1.0216 (1.0558) Acc@1 76.562 (75.581) Acc@5 93.262 (93.299) [2022-01-22 03:32:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.954 (2.331) Loss 1.0798 (1.0640) Acc@1 74.316 (75.545) Acc@5 92.090 (93.013) [2022-01-22 03:33:15 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.238 (2.202) Loss 1.0319 (1.0650) Acc@1 76.562 (75.550) Acc@5 92.480 (92.945) [2022-01-22 03:33:23 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.550 Acc@5 92.968 [2022-01-22 03:33:23 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-01-22 03:33:23 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.63% [2022-01-22 03:33:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][0/1251] eta 7:22:14 lr 0.000546 time 21.2104 (21.2104) loss 2.8355 (2.8355) grad_norm 1.5724 (1.5724) [2022-01-22 03:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][10/1251] eta 1:26:35 lr 0.000546 time 2.9475 (4.1869) loss 3.3418 (3.4564) grad_norm 1.2028 (1.4404) [2022-01-22 03:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][20/1251] eta 1:07:10 lr 0.000546 time 2.5170 (3.2739) loss 3.9441 (3.4612) grad_norm 1.2597 (1.4132) [2022-01-22 03:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][30/1251] eta 0:58:02 lr 0.000546 time 1.4719 (2.8519) loss 2.3949 (3.5043) grad_norm 1.5624 (1.4051) [2022-01-22 03:35:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][40/1251] eta 0:54:50 lr 0.000546 time 3.4985 (2.7174) loss 3.9980 (3.4675) grad_norm 1.4611 (1.4185) [2022-01-22 03:35:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][50/1251] eta 0:53:02 lr 0.000546 time 2.8557 (2.6501) loss 3.5281 (3.5596) grad_norm 1.3102 (1.4184) [2022-01-22 03:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][60/1251] eta 0:51:44 lr 0.000546 time 2.1896 (2.6065) loss 3.5254 (3.5593) grad_norm 1.6403 (1.4412) [2022-01-22 03:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][70/1251] eta 0:49:52 lr 0.000546 time 1.9149 (2.5342) loss 4.3283 (3.5637) grad_norm 1.3694 (1.4395) [2022-01-22 03:36:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][80/1251] eta 0:48:23 lr 0.000546 time 2.6243 (2.4794) loss 4.2666 (3.5420) grad_norm 1.3443 (1.4440) [2022-01-22 03:37:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][90/1251] eta 0:46:44 lr 0.000546 time 1.6613 (2.4153) loss 4.0439 (3.5374) grad_norm 1.4683 (1.4532) [2022-01-22 03:37:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][100/1251] eta 0:45:46 lr 0.000546 time 2.5558 (2.3861) loss 3.6863 (3.5445) grad_norm 1.9943 (1.4669) [2022-01-22 03:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][110/1251] eta 0:44:53 lr 0.000546 time 2.1975 (2.3610) loss 3.4327 (3.5513) grad_norm 1.2818 (1.4658) [2022-01-22 03:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][120/1251] eta 0:44:26 lr 0.000546 time 3.1846 (2.3576) loss 4.4008 (3.5704) grad_norm 1.8945 (1.4742) [2022-01-22 03:38:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][130/1251] eta 0:43:46 lr 0.000546 time 1.4641 (2.3427) loss 3.4336 (3.5422) grad_norm 1.5187 (1.4765) [2022-01-22 03:38:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][140/1251] eta 0:43:18 lr 0.000546 time 1.9807 (2.3388) loss 3.8939 (3.5321) grad_norm 1.5870 (1.4808) [2022-01-22 03:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][150/1251] eta 0:42:43 lr 0.000546 time 1.5479 (2.3287) loss 3.2157 (3.5307) grad_norm 1.3356 (1.4800) [2022-01-22 03:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][160/1251] eta 0:42:22 lr 0.000546 time 3.4821 (2.3304) loss 4.1536 (3.5295) grad_norm 1.5301 (1.4831) [2022-01-22 03:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][170/1251] eta 0:41:42 lr 0.000546 time 2.1807 (2.3150) loss 2.7485 (3.5365) grad_norm 1.3487 (1.4814) [2022-01-22 03:40:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][180/1251] eta 0:41:08 lr 0.000546 
time 1.9707 (2.3048) loss 3.3901 (3.5206) grad_norm 1.4628 (1.4787) [2022-01-22 03:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][190/1251] eta 0:40:32 lr 0.000546 time 1.5564 (2.2930) loss 3.9399 (3.5106) grad_norm 1.6817 (1.4807) [2022-01-22 03:41:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][200/1251] eta 0:40:01 lr 0.000546 time 2.8802 (2.2849) loss 4.0795 (3.5292) grad_norm 1.2370 (1.4776) [2022-01-22 03:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][210/1251] eta 0:39:31 lr 0.000546 time 2.4800 (2.2779) loss 4.1855 (3.5370) grad_norm 1.2107 (1.4752) [2022-01-22 03:41:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][220/1251] eta 0:39:07 lr 0.000546 time 2.5732 (2.2770) loss 3.8310 (3.5419) grad_norm 1.6012 (1.4758) [2022-01-22 03:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][230/1251] eta 0:38:41 lr 0.000545 time 1.8910 (2.2739) loss 3.6890 (3.5393) grad_norm 1.7445 (1.4746) [2022-01-22 03:42:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][240/1251] eta 0:38:26 lr 0.000545 time 3.1925 (2.2812) loss 3.0946 (3.5386) grad_norm 1.2360 (1.4702) [2022-01-22 03:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][250/1251] eta 0:37:56 lr 0.000545 time 1.8519 (2.2739) loss 3.6442 (3.5436) grad_norm 1.4847 (1.4668) [2022-01-22 03:43:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][260/1251] eta 0:37:24 lr 0.000545 time 1.8943 (2.2649) loss 3.7967 (3.5500) grad_norm 1.5779 (1.4684) [2022-01-22 03:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][270/1251] eta 0:36:50 lr 0.000545 time 1.9816 (2.2532) loss 2.7662 (3.5493) grad_norm 1.4571 (1.4715) [2022-01-22 03:43:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][280/1251] eta 0:36:21 lr 0.000545 time 2.1293 (2.2466) loss 2.9279 (3.5462) grad_norm 1.3508 (1.4714) [2022-01-22 03:44:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][290/1251] eta 0:35:56 lr 0.000545 time 1.8620 (2.2439) loss 3.8549 (3.5510) grad_norm 1.5873 (1.4728) [2022-01-22 03:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][300/1251] eta 0:35:35 lr 0.000545 time 2.1738 (2.2457) loss 3.6127 (3.5510) grad_norm 1.3560 (1.4716) [2022-01-22 03:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][310/1251] eta 0:35:16 lr 0.000545 time 3.2484 (2.2492) loss 3.4827 (3.5421) grad_norm 1.5193 (1.4693) [2022-01-22 03:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][320/1251] eta 0:34:52 lr 0.000545 time 2.1376 (2.2472) loss 3.2383 (3.5402) grad_norm 1.4737 (1.4695) [2022-01-22 03:45:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][330/1251] eta 0:34:30 lr 0.000545 time 1.5276 (2.2480) loss 4.2201 (3.5361) grad_norm 1.6438 (1.4713) [2022-01-22 03:46:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][340/1251] eta 0:34:02 lr 0.000545 time 1.5171 (2.2417) loss 3.6303 (3.5367) grad_norm 1.3958 (1.4684) [2022-01-22 03:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][350/1251] eta 0:33:41 lr 0.000545 time 3.2899 (2.2436) loss 2.7812 (3.5369) grad_norm 1.7206 (1.4704) [2022-01-22 03:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][360/1251] eta 0:33:16 lr 0.000545 time 1.9032 (2.2411) loss 3.6242 (3.5383) grad_norm 1.2024 (1.4716) [2022-01-22 03:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][370/1251] eta 0:32:50 lr 0.000545 time 2.1430 (2.2361) loss 2.7276 (3.5380) grad_norm 1.8268 (1.4763) [2022-01-22 03:47:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][380/1251] eta 0:32:23 lr 0.000545 time 1.8708 (2.2312) loss 3.7876 (3.5358) grad_norm 1.4276 (1.4764) [2022-01-22 03:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][390/1251] eta 0:32:00 lr 
0.000545 time 3.4421 (2.2304) loss 3.8186 (3.5384) grad_norm 1.6040 (1.4768) [2022-01-22 03:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][400/1251] eta 0:31:37 lr 0.000545 time 1.7833 (2.2296) loss 3.9044 (3.5373) grad_norm 1.5286 (1.4785) [2022-01-22 03:48:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][410/1251] eta 0:31:14 lr 0.000545 time 2.4718 (2.2290) loss 2.7323 (3.5407) grad_norm 1.5114 (1.4777) [2022-01-22 03:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][420/1251] eta 0:30:50 lr 0.000545 time 2.1689 (2.2274) loss 4.2928 (3.5412) grad_norm 1.3362 (1.4755) [2022-01-22 03:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][430/1251] eta 0:30:33 lr 0.000545 time 3.6656 (2.2328) loss 3.7104 (3.5395) grad_norm 1.4578 (1.4738) [2022-01-22 03:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][440/1251] eta 0:30:12 lr 0.000545 time 1.9504 (2.2352) loss 3.2916 (3.5385) grad_norm 1.3282 (1.4731) [2022-01-22 03:50:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][450/1251] eta 0:29:49 lr 0.000545 time 2.3328 (2.2339) loss 2.9536 (3.5311) grad_norm 1.1671 (1.4725) [2022-01-22 03:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][460/1251] eta 0:29:22 lr 0.000545 time 2.2641 (2.2287) loss 3.6879 (3.5329) grad_norm 1.4319 (1.4706) [2022-01-22 03:50:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][470/1251] eta 0:28:57 lr 0.000544 time 3.0365 (2.2253) loss 4.4666 (3.5381) grad_norm 1.3638 (1.4711) [2022-01-22 03:51:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][480/1251] eta 0:28:33 lr 0.000544 time 2.1337 (2.2221) loss 3.5256 (3.5412) grad_norm 1.4331 (1.4708) [2022-01-22 03:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][490/1251] eta 0:28:09 lr 0.000544 time 1.7207 (2.2197) loss 3.1381 (3.5413) grad_norm 1.4973 (1.4705) [2022-01-22 03:51:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][500/1251] eta 0:27:46 lr 0.000544 time 2.5142 (2.2196) loss 3.3319 (3.5424) grad_norm 1.3786 (1.4699) [2022-01-22 03:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][510/1251] eta 0:27:26 lr 0.000544 time 2.4728 (2.2215) loss 3.8132 (3.5389) grad_norm 1.2882 (1.4684) [2022-01-22 03:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][520/1251] eta 0:27:04 lr 0.000544 time 2.1450 (2.2219) loss 3.6443 (3.5378) grad_norm 1.2959 (1.4679) [2022-01-22 03:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][530/1251] eta 0:26:40 lr 0.000544 time 1.8963 (2.2204) loss 3.7698 (3.5357) grad_norm 1.4465 (1.4682) [2022-01-22 03:53:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][540/1251] eta 0:26:17 lr 0.000544 time 1.9617 (2.2182) loss 3.4724 (3.5309) grad_norm 1.4244 (1.4670) [2022-01-22 03:53:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][550/1251] eta 0:25:52 lr 0.000544 time 1.9877 (2.2148) loss 3.6902 (3.5356) grad_norm 1.2528 (1.4663) [2022-01-22 03:54:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][560/1251] eta 0:25:29 lr 0.000544 time 1.8576 (2.2132) loss 4.1380 (3.5370) grad_norm 1.4811 (1.4666) [2022-01-22 03:54:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][570/1251] eta 0:25:06 lr 0.000544 time 1.8484 (2.2120) loss 3.3508 (3.5376) grad_norm 1.8457 (1.4684) [2022-01-22 03:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][580/1251] eta 0:24:42 lr 0.000544 time 2.3469 (2.2101) loss 3.9306 (3.5449) grad_norm 1.3496 (1.4693) [2022-01-22 03:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][590/1251] eta 0:24:21 lr 0.000544 time 2.0645 (2.2114) loss 2.4884 (3.5420) grad_norm 1.3573 (1.4692) [2022-01-22 03:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][600/1251] eta 0:24:00 lr 
0.000544 time 1.9788 (2.2128) loss 4.1194 (3.5408) grad_norm 1.4538 (1.4699) [2022-01-22 03:55:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][610/1251] eta 0:23:39 lr 0.000544 time 2.3092 (2.2138) loss 3.5160 (3.5436) grad_norm 1.4886 (1.4698) [2022-01-22 03:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][620/1251] eta 0:23:16 lr 0.000544 time 2.2146 (2.2136) loss 3.5677 (3.5443) grad_norm 1.4324 (1.4686) [2022-01-22 03:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][630/1251] eta 0:22:55 lr 0.000544 time 2.7969 (2.2146) loss 2.8600 (3.5436) grad_norm 1.7217 (1.4684) [2022-01-22 03:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][640/1251] eta 0:22:31 lr 0.000544 time 1.8909 (2.2120) loss 2.6120 (3.5454) grad_norm 1.3048 (1.4665) [2022-01-22 03:57:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][650/1251] eta 0:22:08 lr 0.000544 time 1.6815 (2.2106) loss 4.0740 (3.5424) grad_norm 1.3279 (1.4663) [2022-01-22 03:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][660/1251] eta 0:21:45 lr 0.000544 time 2.0356 (2.2094) loss 3.7576 (3.5423) grad_norm 1.4077 (1.4653) [2022-01-22 03:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][670/1251] eta 0:21:25 lr 0.000544 time 3.6417 (2.2133) loss 4.1888 (3.5451) grad_norm 1.4195 (1.4640) [2022-01-22 03:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][680/1251] eta 0:21:03 lr 0.000544 time 2.2040 (2.2136) loss 2.2926 (3.5448) grad_norm 1.2386 (1.4628) [2022-01-22 03:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][690/1251] eta 0:20:41 lr 0.000544 time 1.6996 (2.2129) loss 2.2671 (3.5401) grad_norm 1.3509 (1.4636) [2022-01-22 03:59:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][700/1251] eta 0:20:17 lr 0.000544 time 2.1813 (2.2104) loss 3.3825 (3.5412) grad_norm 1.2439 (1.4636) [2022-01-22 03:59:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][710/1251] eta 0:19:56 lr 0.000543 time 2.5744 (2.2113) loss 4.2277 (3.5419) grad_norm 1.4460 (1.4640) [2022-01-22 03:59:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][720/1251] eta 0:19:33 lr 0.000543 time 2.5546 (2.2101) loss 4.0656 (3.5456) grad_norm 1.3289 (1.4634) [2022-01-22 04:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][730/1251] eta 0:19:11 lr 0.000543 time 2.1331 (2.2095) loss 3.4986 (3.5436) grad_norm 1.5193 (1.4624) [2022-01-22 04:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][740/1251] eta 0:18:48 lr 0.000543 time 1.8043 (2.2085) loss 2.4853 (3.5410) grad_norm 1.6821 (1.4631) [2022-01-22 04:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][750/1251] eta 0:18:27 lr 0.000543 time 3.0349 (2.2105) loss 3.0196 (3.5432) grad_norm 1.1784 (1.4630) [2022-01-22 04:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][760/1251] eta 0:18:04 lr 0.000543 time 1.5753 (2.2094) loss 3.6008 (3.5419) grad_norm 1.2258 (1.4624) [2022-01-22 04:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][770/1251] eta 0:17:42 lr 0.000543 time 2.4942 (2.2088) loss 3.5571 (3.5424) grad_norm 1.5829 (1.4629) [2022-01-22 04:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][780/1251] eta 0:17:19 lr 0.000543 time 1.8219 (2.2071) loss 4.3046 (3.5434) grad_norm 1.3799 (1.4629) [2022-01-22 04:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][790/1251] eta 0:16:56 lr 0.000543 time 2.1815 (2.2058) loss 3.7421 (3.5456) grad_norm 1.2027 (1.4627) [2022-01-22 04:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][800/1251] eta 0:16:33 lr 0.000543 time 1.7626 (2.2039) loss 3.7475 (3.5456) grad_norm 1.3135 (1.4624) [2022-01-22 04:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][810/1251] eta 0:16:12 lr 
0.000543 time 2.4513 (2.2048) loss 3.4938 (3.5428) grad_norm 1.5309 (1.4624) [2022-01-22 04:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][820/1251] eta 0:15:50 lr 0.000543 time 1.6740 (2.2053) loss 3.8453 (3.5422) grad_norm 1.4056 (1.4613) [2022-01-22 04:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][830/1251] eta 0:15:28 lr 0.000543 time 2.5700 (2.2051) loss 3.6003 (3.5423) grad_norm 1.4823 (1.4608) [2022-01-22 04:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][840/1251] eta 0:15:06 lr 0.000543 time 1.9586 (2.2047) loss 3.8533 (3.5429) grad_norm 1.3078 (1.4604) [2022-01-22 04:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][850/1251] eta 0:14:44 lr 0.000543 time 1.9275 (2.2047) loss 3.6956 (3.5449) grad_norm 1.5402 (1.4617) [2022-01-22 04:05:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][860/1251] eta 0:14:22 lr 0.000543 time 1.5662 (2.2058) loss 3.9241 (3.5451) grad_norm 1.3655 (1.4621) [2022-01-22 04:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][870/1251] eta 0:14:00 lr 0.000543 time 2.2368 (2.2055) loss 4.0423 (3.5497) grad_norm 1.4045 (1.4617) [2022-01-22 04:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][880/1251] eta 0:13:37 lr 0.000543 time 2.6728 (2.2042) loss 4.1489 (3.5521) grad_norm 1.3820 (1.4612) [2022-01-22 04:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][890/1251] eta 0:13:15 lr 0.000543 time 2.1006 (2.2045) loss 2.6043 (3.5537) grad_norm 1.4151 (1.4611) [2022-01-22 04:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][900/1251] eta 0:12:53 lr 0.000543 time 1.9114 (2.2044) loss 3.7075 (3.5554) grad_norm 1.4281 (1.4607) [2022-01-22 04:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][910/1251] eta 0:12:31 lr 0.000543 time 1.8103 (2.2044) loss 3.9859 (3.5570) grad_norm 1.5076 (1.4602) [2022-01-22 04:07:15 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][920/1251] eta 0:12:10 lr 0.000543 time 4.3612 (2.2066) loss 3.4229 (3.5590) grad_norm 1.4050 (1.4594) [2022-01-22 04:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][930/1251] eta 0:11:48 lr 0.000543 time 2.3568 (2.2085) loss 3.5990 (3.5607) grad_norm 1.3926 (1.4580) [2022-01-22 04:08:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][940/1251] eta 0:11:26 lr 0.000543 time 1.8173 (2.2076) loss 2.9825 (3.5596) grad_norm 1.6128 (1.4583) [2022-01-22 04:08:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][950/1251] eta 0:11:03 lr 0.000542 time 1.6402 (2.2057) loss 4.1451 (3.5623) grad_norm 1.4064 (1.4578) [2022-01-22 04:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][960/1251] eta 0:10:41 lr 0.000542 time 2.4866 (2.2035) loss 3.4125 (3.5645) grad_norm 1.3178 (1.4582) [2022-01-22 04:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][970/1251] eta 0:10:18 lr 0.000542 time 1.8257 (2.2015) loss 4.3674 (3.5656) grad_norm 1.7683 (1.4595) [2022-01-22 04:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][980/1251] eta 0:09:56 lr 0.000542 time 2.0818 (2.2008) loss 3.5560 (3.5665) grad_norm 1.4592 (1.4601) [2022-01-22 04:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][990/1251] eta 0:09:34 lr 0.000542 time 2.2539 (2.2000) loss 3.9073 (3.5653) grad_norm 1.3777 (1.4592) [2022-01-22 04:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1000/1251] eta 0:09:12 lr 0.000542 time 2.7603 (2.2021) loss 3.5893 (3.5637) grad_norm 1.4780 (1.4592) [2022-01-22 04:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1010/1251] eta 0:08:50 lr 0.000542 time 1.6126 (2.2018) loss 3.7030 (3.5658) grad_norm 1.4666 (1.4587) [2022-01-22 04:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1020/1251] eta 0:08:28 lr 
0.000542 time 2.2060 (2.2017) loss 3.8813 (3.5665) grad_norm 1.3847 (1.4588) [2022-01-22 04:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1030/1251] eta 0:08:06 lr 0.000542 time 1.9500 (2.2027) loss 3.5547 (3.5645) grad_norm 1.6439 (1.4595) [2022-01-22 04:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1040/1251] eta 0:07:44 lr 0.000542 time 2.4874 (2.2019) loss 2.9903 (3.5653) grad_norm 1.3674 (1.4593) [2022-01-22 04:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1050/1251] eta 0:07:22 lr 0.000542 time 2.0176 (2.2021) loss 2.9866 (3.5645) grad_norm 1.5318 (1.4591) [2022-01-22 04:12:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1060/1251] eta 0:07:00 lr 0.000542 time 2.3900 (2.2013) loss 4.0202 (3.5645) grad_norm 1.6985 (1.4597) [2022-01-22 04:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1070/1251] eta 0:06:38 lr 0.000542 time 2.0286 (2.2003) loss 2.7548 (3.5650) grad_norm 1.3814 (1.4594) [2022-01-22 04:13:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1080/1251] eta 0:06:16 lr 0.000542 time 2.8174 (2.2001) loss 2.3542 (3.5643) grad_norm 1.3565 (1.4590) [2022-01-22 04:13:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1090/1251] eta 0:05:54 lr 0.000542 time 2.8800 (2.2010) loss 3.8364 (3.5635) grad_norm 1.3358 (1.4588) [2022-01-22 04:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1100/1251] eta 0:05:32 lr 0.000542 time 2.0966 (2.2016) loss 3.9295 (3.5645) grad_norm 1.4038 (1.4589) [2022-01-22 04:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1110/1251] eta 0:05:10 lr 0.000542 time 2.2318 (2.2024) loss 3.7200 (3.5633) grad_norm 1.5233 (1.4589) [2022-01-22 04:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1120/1251] eta 0:04:48 lr 0.000542 time 1.9329 (2.2027) loss 3.6067 (3.5600) grad_norm 1.5539 (1.4599) [2022-01-22 
04:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1130/1251] eta 0:04:26 lr 0.000542 time 2.7288 (2.2022) loss 2.8787 (3.5592) grad_norm 1.5942 (1.4611) [2022-01-22 04:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1140/1251] eta 0:04:04 lr 0.000542 time 1.7634 (2.1999) loss 3.7327 (3.5589) grad_norm 1.4114 (1.4611) [2022-01-22 04:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1150/1251] eta 0:03:42 lr 0.000542 time 2.5017 (2.1981) loss 3.6384 (3.5594) grad_norm 1.5564 (1.4615) [2022-01-22 04:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1160/1251] eta 0:03:19 lr 0.000542 time 2.0139 (2.1976) loss 3.4406 (3.5593) grad_norm 1.2586 (1.4620) [2022-01-22 04:16:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1170/1251] eta 0:02:58 lr 0.000542 time 2.2697 (2.1994) loss 3.4285 (3.5607) grad_norm 1.4661 (1.4620) [2022-01-22 04:16:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1180/1251] eta 0:02:36 lr 0.000542 time 2.4973 (2.1994) loss 3.1037 (3.5616) grad_norm 1.5663 (1.4623) [2022-01-22 04:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1190/1251] eta 0:02:14 lr 0.000542 time 1.8793 (2.1997) loss 3.8165 (3.5644) grad_norm 1.5626 (1.4625) [2022-01-22 04:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1200/1251] eta 0:01:52 lr 0.000541 time 2.8206 (2.1994) loss 2.2776 (3.5611) grad_norm 1.5851 (1.4623) [2022-01-22 04:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1210/1251] eta 0:01:30 lr 0.000541 time 2.4917 (2.1996) loss 3.3774 (3.5613) grad_norm 1.5246 (1.4629) [2022-01-22 04:18:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1220/1251] eta 0:01:08 lr 0.000541 time 2.8290 (2.2003) loss 3.3074 (3.5614) grad_norm 1.6539 (1.4635) [2022-01-22 04:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1230/1251] 
eta 0:00:46 lr 0.000541 time 1.7518 (2.1999) loss 3.9519 (3.5611) grad_norm 1.2809 (1.4633) [2022-01-22 04:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1240/1251] eta 0:00:24 lr 0.000541 time 2.2441 (2.1985) loss 3.7823 (3.5611) grad_norm 1.6961 (1.4636) [2022-01-22 04:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [142/300][1250/1251] eta 0:00:02 lr 0.000541 time 1.1782 (2.1926) loss 3.9586 (3.5620) grad_norm 1.2772 (1.4629) [2022-01-22 04:19:06 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 142 training takes 0:45:43 [2022-01-22 04:19:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.348 (18.348) Loss 1.0572 (1.0572) Acc@1 75.098 (75.098) Acc@5 92.285 (92.285) [2022-01-22 04:19:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.319 (3.388) Loss 1.0653 (1.0332) Acc@1 74.219 (75.888) Acc@5 92.578 (93.253) [2022-01-22 04:20:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 2.065 (2.610) Loss 1.0221 (1.0379) Acc@1 76.270 (75.702) Acc@5 93.262 (93.150) [2022-01-22 04:20:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.989 (2.270) Loss 1.0328 (1.0391) Acc@1 75.391 (75.545) Acc@5 93.164 (93.088) [2022-01-22 04:20:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.811 (2.136) Loss 1.0452 (1.0374) Acc@1 75.000 (75.593) Acc@5 92.871 (93.054) [2022-01-22 04:20:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.616 Acc@5 93.090 [2022-01-22 04:20:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.6% [2022-01-22 04:20:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.63% [2022-01-22 04:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][0/1251] eta 7:29:00 lr 0.000541 time 21.5349 (21.5349) loss 2.9915 (2.9915) grad_norm 1.3497 (1.3497) [2022-01-22 04:21:26 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [143/300][10/1251] eta 1:25:14 lr 0.000541 time 2.5327 (4.1210) loss 4.1917 (3.6324) grad_norm 1.2954 (1.3757) [2022-01-22 04:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][20/1251] eta 1:05:36 lr 0.000541 time 1.9102 (3.1976) loss 2.9011 (3.6427) grad_norm 1.2475 (1.3900) [2022-01-22 04:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][30/1251] eta 0:58:56 lr 0.000541 time 1.8889 (2.8962) loss 2.3852 (3.6023) grad_norm 1.8333 (1.4036) [2022-01-22 04:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][40/1251] eta 0:55:27 lr 0.000541 time 3.1255 (2.7473) loss 3.7139 (3.5301) grad_norm 1.4442 (1.4218) [2022-01-22 04:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][50/1251] eta 0:52:44 lr 0.000541 time 1.5892 (2.6347) loss 3.6162 (3.5691) grad_norm 1.5659 (1.4234) [2022-01-22 04:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][60/1251] eta 0:50:57 lr 0.000541 time 2.1848 (2.5668) loss 3.8288 (3.5818) grad_norm 1.5557 (1.4445) [2022-01-22 04:23:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][70/1251] eta 0:49:35 lr 0.000541 time 1.5881 (2.5196) loss 3.6656 (3.5811) grad_norm 1.5398 (1.4581) [2022-01-22 04:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][80/1251] eta 0:48:20 lr 0.000541 time 2.9991 (2.4772) loss 3.3189 (3.5899) grad_norm 1.3734 (1.4606) [2022-01-22 04:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][90/1251] eta 0:47:09 lr 0.000541 time 1.9275 (2.4371) loss 4.0428 (3.5823) grad_norm 1.3249 (1.4590) [2022-01-22 04:24:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][100/1251] eta 0:45:53 lr 0.000541 time 1.8654 (2.3926) loss 3.7281 (3.5850) grad_norm 1.4627 (1.4772) [2022-01-22 04:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][110/1251] eta 0:45:04 lr 0.000541 time 2.1730 (2.3703) loss 3.2534 (3.5666) grad_norm 
1.5942 (1.4797) [2022-01-22 04:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][120/1251] eta 0:44:22 lr 0.000541 time 2.3254 (2.3545) loss 3.6794 (3.5751) grad_norm 1.2513 (1.4786) [2022-01-22 04:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][130/1251] eta 0:43:51 lr 0.000541 time 1.5870 (2.3478) loss 3.8879 (3.5644) grad_norm 1.5160 (1.4734) [2022-01-22 04:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][140/1251] eta 0:43:20 lr 0.000541 time 2.3717 (2.3409) loss 2.8884 (3.5520) grad_norm 1.6867 (1.4723) [2022-01-22 04:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][150/1251] eta 0:42:48 lr 0.000541 time 2.8245 (2.3331) loss 3.4495 (3.5547) grad_norm 1.3152 (1.4730) [2022-01-22 04:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][160/1251] eta 0:42:09 lr 0.000541 time 2.1628 (2.3183) loss 4.1718 (3.5696) grad_norm 1.2968 (1.4712) [2022-01-22 04:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][170/1251] eta 0:41:33 lr 0.000541 time 1.8187 (2.3064) loss 3.0893 (3.5656) grad_norm 1.8523 (1.4705) [2022-01-22 04:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][180/1251] eta 0:41:02 lr 0.000541 time 1.9021 (2.2992) loss 3.0172 (3.5726) grad_norm 1.8708 (1.4705) [2022-01-22 04:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][190/1251] eta 0:40:29 lr 0.000540 time 2.2489 (2.2898) loss 4.2393 (3.5845) grad_norm 1.5037 (1.4730) [2022-01-22 04:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][200/1251] eta 0:39:57 lr 0.000540 time 2.5741 (2.2812) loss 2.7695 (3.5798) grad_norm 1.3146 (1.4709) [2022-01-22 04:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][210/1251] eta 0:39:29 lr 0.000540 time 2.0885 (2.2763) loss 3.8114 (3.5767) grad_norm 1.6322 (1.4701) [2022-01-22 04:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[143/300][220/1251] eta 0:38:59 lr 0.000540 time 2.1241 (2.2688) loss 4.2874 (3.5737) grad_norm 2.3679 (1.4722) [2022-01-22 04:29:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][230/1251] eta 0:38:30 lr 0.000540 time 2.2480 (2.2628) loss 3.5937 (3.5751) grad_norm 1.3937 (1.4724) [2022-01-22 04:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][240/1251] eta 0:38:04 lr 0.000540 time 2.3860 (2.2595) loss 2.8496 (3.5705) grad_norm 1.2644 (1.4700) [2022-01-22 04:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][250/1251] eta 0:37:33 lr 0.000540 time 2.0705 (2.2508) loss 3.9996 (3.5784) grad_norm 1.4926 (1.4689) [2022-01-22 04:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][260/1251] eta 0:37:13 lr 0.000540 time 2.4117 (2.2534) loss 2.7624 (3.5674) grad_norm 1.3158 (1.4670) [2022-01-22 04:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][270/1251] eta 0:36:46 lr 0.000540 time 3.0946 (2.2494) loss 3.8619 (3.5614) grad_norm 1.4284 (1.4658) [2022-01-22 04:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][280/1251] eta 0:36:23 lr 0.000540 time 2.5947 (2.2491) loss 2.8669 (3.5565) grad_norm 1.5091 (1.4659) [2022-01-22 04:31:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][290/1251] eta 0:36:08 lr 0.000540 time 2.1529 (2.2565) loss 3.7529 (3.5558) grad_norm 1.7699 (1.4704) [2022-01-22 04:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][300/1251] eta 0:35:44 lr 0.000540 time 2.1677 (2.2546) loss 3.7154 (3.5532) grad_norm 1.5266 (1.4717) [2022-01-22 04:32:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][310/1251] eta 0:35:20 lr 0.000540 time 3.1364 (2.2534) loss 3.4867 (3.5569) grad_norm 1.7970 (1.4691) [2022-01-22 04:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][320/1251] eta 0:34:55 lr 0.000540 time 2.4176 (2.2504) loss 4.3616 (3.5633) grad_norm 
1.3830 (1.4699) [2022-01-22 04:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][330/1251] eta 0:34:30 lr 0.000540 time 2.5221 (2.2476) loss 3.9165 (3.5617) grad_norm 1.4325 (1.4727) [2022-01-22 04:33:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][340/1251] eta 0:34:01 lr 0.000540 time 1.8811 (2.2414) loss 3.5818 (3.5615) grad_norm 1.3547 (1.4717) [2022-01-22 04:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][350/1251] eta 0:33:35 lr 0.000540 time 2.7484 (2.2375) loss 2.4650 (3.5491) grad_norm 1.4434 (1.4720) [2022-01-22 04:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][360/1251] eta 0:33:08 lr 0.000540 time 1.7980 (2.2312) loss 4.1105 (3.5403) grad_norm 1.4726 (1.4724) [2022-01-22 04:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][370/1251] eta 0:32:44 lr 0.000540 time 1.7992 (2.2297) loss 3.7033 (3.5414) grad_norm 1.2967 (1.4715) [2022-01-22 04:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][380/1251] eta 0:32:21 lr 0.000540 time 2.5644 (2.2293) loss 3.5946 (3.5419) grad_norm 1.7016 (1.4747) [2022-01-22 04:35:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][390/1251] eta 0:32:00 lr 0.000540 time 2.3887 (2.2304) loss 3.9520 (3.5409) grad_norm 1.6472 (1.4757) [2022-01-22 04:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][400/1251] eta 0:31:36 lr 0.000540 time 1.5857 (2.2291) loss 4.2808 (3.5456) grad_norm 1.3378 (1.4754) [2022-01-22 04:35:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][410/1251] eta 0:31:14 lr 0.000540 time 2.5867 (2.2287) loss 3.7948 (3.5465) grad_norm 1.6298 (1.4759) [2022-01-22 04:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][420/1251] eta 0:30:48 lr 0.000540 time 1.5984 (2.2247) loss 4.0922 (3.5481) grad_norm 1.4593 (1.4746) [2022-01-22 04:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[143/300][430/1251] eta 0:30:24 lr 0.000539 time 2.0461 (2.2226) loss 3.4983 (3.5417) grad_norm 1.4455 (1.4724) [2022-01-22 04:37:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][440/1251] eta 0:30:02 lr 0.000539 time 1.9265 (2.2230) loss 2.7829 (3.5407) grad_norm 1.5799 (1.4713) [2022-01-22 04:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][450/1251] eta 0:29:40 lr 0.000539 time 2.4567 (2.2231) loss 4.1106 (3.5399) grad_norm 1.6112 (1.4711) [2022-01-22 04:37:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][460/1251] eta 0:29:19 lr 0.000539 time 1.6131 (2.2240) loss 4.3931 (3.5456) grad_norm 1.5923 (1.4706) [2022-01-22 04:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][470/1251] eta 0:28:56 lr 0.000539 time 2.1780 (2.2228) loss 2.4073 (3.5413) grad_norm 1.4433 (1.4713) [2022-01-22 04:38:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][480/1251] eta 0:28:32 lr 0.000539 time 1.9763 (2.2217) loss 3.0449 (3.5437) grad_norm 1.6399 (1.4705) [2022-01-22 04:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][490/1251] eta 0:28:09 lr 0.000539 time 2.2585 (2.2206) loss 3.6685 (3.5377) grad_norm 1.3507 (1.4697) [2022-01-22 04:39:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][500/1251] eta 0:27:50 lr 0.000539 time 1.8915 (2.2246) loss 3.8131 (3.5370) grad_norm 1.5875 (1.4703) [2022-01-22 04:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][510/1251] eta 0:27:28 lr 0.000539 time 2.5301 (2.2245) loss 3.1365 (3.5373) grad_norm 1.2975 (1.4699) [2022-01-22 04:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][520/1251] eta 0:27:05 lr 0.000539 time 1.9326 (2.2231) loss 3.4881 (3.5383) grad_norm 1.4875 (1.4689) [2022-01-22 04:40:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][530/1251] eta 0:26:41 lr 0.000539 time 1.9567 (2.2207) loss 3.7267 (3.5406) grad_norm 
1.4514 (1.4697) [2022-01-22 04:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][540/1251] eta 0:26:18 lr 0.000539 time 1.8826 (2.2205) loss 3.7069 (3.5390) grad_norm 1.3359 (1.4728) [2022-01-22 04:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][550/1251] eta 0:25:54 lr 0.000539 time 2.7837 (2.2178) loss 4.0364 (3.5448) grad_norm 1.2375 (1.4716) [2022-01-22 04:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][560/1251] eta 0:25:31 lr 0.000539 time 2.2488 (2.2164) loss 4.2374 (3.5476) grad_norm 1.7197 (1.4726) [2022-01-22 04:41:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][570/1251] eta 0:25:08 lr 0.000539 time 1.8384 (2.2145) loss 4.0630 (3.5544) grad_norm 1.6449 (1.4731) [2022-01-22 04:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][580/1251] eta 0:24:48 lr 0.000539 time 2.2180 (2.2179) loss 3.9532 (3.5519) grad_norm 1.2092 (1.4731) [2022-01-22 04:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][590/1251] eta 0:24:27 lr 0.000539 time 3.0403 (2.2194) loss 3.6899 (3.5525) grad_norm 1.4593 (1.4737) [2022-01-22 04:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][600/1251] eta 0:24:05 lr 0.000539 time 2.5521 (2.2210) loss 3.8748 (3.5494) grad_norm 1.3035 (1.4737) [2022-01-22 04:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][610/1251] eta 0:23:41 lr 0.000539 time 1.9192 (2.2177) loss 3.8542 (3.5494) grad_norm 1.2872 (1.4726) [2022-01-22 04:43:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][620/1251] eta 0:23:19 lr 0.000539 time 1.5233 (2.2171) loss 2.9694 (3.5463) grad_norm 1.3274 (1.4706) [2022-01-22 04:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][630/1251] eta 0:22:56 lr 0.000539 time 2.8060 (2.2163) loss 3.8450 (3.5433) grad_norm 1.6835 (1.4701) [2022-01-22 04:44:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[143/300][640/1251] eta 0:22:32 lr 0.000539 time 1.6558 (2.2139) loss 2.6766 (3.5410) grad_norm 1.3400 (1.4689) [2022-01-22 04:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][650/1251] eta 0:22:10 lr 0.000539 time 2.1159 (2.2138) loss 4.2041 (3.5386) grad_norm 1.2882 (1.4677) [2022-01-22 04:45:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][660/1251] eta 0:21:47 lr 0.000539 time 2.1785 (2.2124) loss 4.1610 (3.5426) grad_norm 1.3161 (1.4689) [2022-01-22 04:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][670/1251] eta 0:21:25 lr 0.000538 time 3.3432 (2.2128) loss 3.8654 (3.5414) grad_norm 1.4889 (1.4688) [2022-01-22 04:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][680/1251] eta 0:21:03 lr 0.000538 time 1.8350 (2.2127) loss 3.6112 (3.5413) grad_norm 1.3619 (1.4692) [2022-01-22 04:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][690/1251] eta 0:20:42 lr 0.000538 time 2.1618 (2.2141) loss 3.1916 (3.5389) grad_norm 1.4338 (1.4688) [2022-01-22 04:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][700/1251] eta 0:20:21 lr 0.000538 time 2.1971 (2.2161) loss 4.0352 (3.5392) grad_norm 1.5138 (1.4680) [2022-01-22 04:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][710/1251] eta 0:19:58 lr 0.000538 time 3.1817 (2.2162) loss 3.4617 (3.5363) grad_norm 1.4512 (1.4681) [2022-01-22 04:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][720/1251] eta 0:19:34 lr 0.000538 time 1.9453 (2.2124) loss 3.0061 (3.5323) grad_norm 1.3857 (1.4673) [2022-01-22 04:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][730/1251] eta 0:19:10 lr 0.000538 time 1.8144 (2.2087) loss 3.9244 (3.5331) grad_norm 1.3549 (1.4675) [2022-01-22 04:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][740/1251] eta 0:18:47 lr 0.000538 time 1.9333 (2.2057) loss 3.6913 (3.5347) grad_norm 
1.4733 (1.4677) [2022-01-22 04:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][750/1251] eta 0:18:23 lr 0.000538 time 2.4887 (2.2035) loss 2.5369 (3.5324) grad_norm 1.2616 (1.4684) [2022-01-22 04:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][760/1251] eta 0:18:01 lr 0.000538 time 2.2192 (2.2019) loss 4.2541 (3.5321) grad_norm 1.7859 (1.4686) [2022-01-22 04:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][770/1251] eta 0:17:39 lr 0.000538 time 2.4510 (2.2027) loss 3.4908 (3.5343) grad_norm 1.5116 (1.4684) [2022-01-22 04:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][780/1251] eta 0:17:18 lr 0.000538 time 1.8485 (2.2044) loss 2.6822 (3.5300) grad_norm 1.3509 (1.4674) [2022-01-22 04:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][790/1251] eta 0:16:55 lr 0.000538 time 2.0830 (2.2028) loss 3.5788 (3.5286) grad_norm 1.2855 (1.4673) [2022-01-22 04:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][800/1251] eta 0:16:33 lr 0.000538 time 2.8798 (2.2022) loss 4.0441 (3.5303) grad_norm 1.2326 (1.4669) [2022-01-22 04:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][810/1251] eta 0:16:10 lr 0.000538 time 1.7977 (2.2010) loss 3.9702 (3.5321) grad_norm 1.3521 (1.4685) [2022-01-22 04:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][820/1251] eta 0:15:49 lr 0.000538 time 2.4972 (2.2022) loss 4.1306 (3.5341) grad_norm 1.4257 (1.4690) [2022-01-22 04:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][830/1251] eta 0:15:27 lr 0.000538 time 1.9226 (2.2023) loss 3.8554 (3.5368) grad_norm 1.3941 (1.4686) [2022-01-22 04:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][840/1251] eta 0:15:05 lr 0.000538 time 2.0057 (2.2026) loss 3.2853 (3.5369) grad_norm 1.3122 (1.4691) [2022-01-22 04:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[143/300][850/1251] eta 0:14:44 lr 0.000538 time 2.4852 (2.2045) loss 3.2056 (3.5376) grad_norm 1.7632 (1.4699) [2022-01-22 04:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][860/1251] eta 0:14:21 lr 0.000538 time 2.2694 (2.2041) loss 3.7612 (3.5378) grad_norm 1.2651 (1.4699) [2022-01-22 04:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][870/1251] eta 0:13:59 lr 0.000538 time 2.1871 (2.2044) loss 3.2436 (3.5369) grad_norm 1.4192 (1.4698) [2022-01-22 04:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][880/1251] eta 0:13:38 lr 0.000538 time 1.8648 (2.2053) loss 4.2501 (3.5377) grad_norm 1.4871 (1.4700) [2022-01-22 04:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][890/1251] eta 0:13:16 lr 0.000538 time 2.7462 (2.2055) loss 2.9633 (3.5385) grad_norm 1.2826 (1.4697) [2022-01-22 04:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][900/1251] eta 0:12:53 lr 0.000538 time 1.9117 (2.2038) loss 4.1136 (3.5366) grad_norm 1.5635 (1.4697) [2022-01-22 04:54:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][910/1251] eta 0:12:31 lr 0.000537 time 2.6939 (2.2031) loss 4.2872 (3.5375) grad_norm 1.5268 (1.4693) [2022-01-22 04:54:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][920/1251] eta 0:12:08 lr 0.000537 time 1.5610 (2.2012) loss 3.3365 (3.5363) grad_norm 1.4145 (1.4691) [2022-01-22 04:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][930/1251] eta 0:11:46 lr 0.000537 time 1.9240 (2.1997) loss 2.9443 (3.5390) grad_norm 1.5136 (1.4685) [2022-01-22 04:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][940/1251] eta 0:11:24 lr 0.000537 time 2.4896 (2.1999) loss 3.7808 (3.5394) grad_norm 1.3195 (1.4691) [2022-01-22 04:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][950/1251] eta 0:11:02 lr 0.000537 time 2.8533 (2.2000) loss 2.2839 (3.5392) grad_norm 
1.4518 (1.4697) [2022-01-22 04:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][960/1251] eta 0:10:40 lr 0.000537 time 2.3433 (2.2008) loss 4.1900 (3.5409) grad_norm 1.5062 (1.4703) [2022-01-22 04:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][970/1251] eta 0:10:18 lr 0.000537 time 1.8967 (2.2003) loss 3.6122 (3.5377) grad_norm 1.6256 (1.4712) [2022-01-22 04:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][980/1251] eta 0:09:56 lr 0.000537 time 1.8710 (2.2005) loss 2.8417 (3.5373) grad_norm 1.3454 (1.4706) [2022-01-22 04:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][990/1251] eta 0:09:34 lr 0.000537 time 2.4762 (2.2013) loss 3.1785 (3.5364) grad_norm 1.6429 (1.4712) [2022-01-22 04:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1000/1251] eta 0:09:12 lr 0.000537 time 1.5186 (2.2016) loss 2.2231 (3.5352) grad_norm 1.3954 (1.4716) [2022-01-22 04:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1010/1251] eta 0:08:50 lr 0.000537 time 1.8557 (2.2012) loss 4.1229 (3.5350) grad_norm 1.6316 (1.4718) [2022-01-22 04:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1020/1251] eta 0:08:28 lr 0.000537 time 1.9016 (2.2013) loss 3.6041 (3.5349) grad_norm 1.3057 (1.4710) [2022-01-22 04:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1030/1251] eta 0:08:06 lr 0.000537 time 2.5170 (2.2014) loss 4.1746 (3.5362) grad_norm 1.8754 (1.4715) [2022-01-22 04:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1040/1251] eta 0:07:44 lr 0.000537 time 1.7352 (2.2014) loss 3.5624 (3.5329) grad_norm 1.6497 (1.4710) [2022-01-22 04:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1050/1251] eta 0:07:22 lr 0.000537 time 1.5445 (2.2007) loss 2.9238 (3.5306) grad_norm 1.3970 (1.4706) [2022-01-22 04:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[143/300][1060/1251] eta 0:07:00 lr 0.000537 time 1.7702 (2.2013) loss 3.4257 (3.5310) grad_norm 1.6214 (1.4703) [2022-01-22 04:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1070/1251] eta 0:06:38 lr 0.000537 time 1.9356 (2.2006) loss 4.4178 (3.5326) grad_norm 1.4034 (1.4693) [2022-01-22 05:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1080/1251] eta 0:06:16 lr 0.000537 time 1.9051 (2.2011) loss 3.0939 (3.5318) grad_norm 1.4773 (1.4687) [2022-01-22 05:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1090/1251] eta 0:05:54 lr 0.000537 time 1.6610 (2.2006) loss 4.0667 (3.5337) grad_norm 1.3758 (1.4678) [2022-01-22 05:01:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1100/1251] eta 0:05:32 lr 0.000537 time 1.9336 (2.2007) loss 3.9664 (3.5353) grad_norm 1.7235 (1.4678) [2022-01-22 05:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1110/1251] eta 0:05:10 lr 0.000537 time 1.6342 (2.1994) loss 3.8991 (3.5372) grad_norm 1.6682 (1.4684) [2022-01-22 05:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1120/1251] eta 0:04:47 lr 0.000537 time 1.9338 (2.1984) loss 2.8874 (3.5349) grad_norm 1.3991 (1.4682) [2022-01-22 05:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1130/1251] eta 0:04:25 lr 0.000537 time 1.7163 (2.1975) loss 3.7676 (3.5365) grad_norm 1.3339 (1.4683) [2022-01-22 05:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1140/1251] eta 0:04:04 lr 0.000537 time 2.2372 (2.1985) loss 3.5541 (3.5385) grad_norm 1.6089 (1.4680) [2022-01-22 05:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1150/1251] eta 0:03:42 lr 0.000536 time 2.2166 (2.1991) loss 2.9373 (3.5381) grad_norm 1.3781 (1.4682) [2022-01-22 05:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1160/1251] eta 0:03:20 lr 0.000536 time 2.2421 (2.1990) loss 2.8044 (3.5379) 
grad_norm 1.4985 (1.4687) [2022-01-22 05:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1170/1251] eta 0:02:58 lr 0.000536 time 1.9255 (2.1983) loss 4.1306 (3.5395) grad_norm 1.5138 (1.4686) [2022-01-22 05:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1180/1251] eta 0:02:36 lr 0.000536 time 1.9241 (2.1985) loss 3.9221 (3.5407) grad_norm 1.3090 (1.4684) [2022-01-22 05:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1190/1251] eta 0:02:14 lr 0.000536 time 1.8323 (2.1982) loss 3.6169 (3.5419) grad_norm 1.3950 (1.4678) [2022-01-22 05:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1200/1251] eta 0:01:52 lr 0.000536 time 2.2532 (2.1979) loss 4.0480 (3.5430) grad_norm 1.5486 (1.4672) [2022-01-22 05:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1210/1251] eta 0:01:30 lr 0.000536 time 1.6195 (2.1982) loss 3.7281 (3.5424) grad_norm 1.4918 (1.4666) [2022-01-22 05:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1220/1251] eta 0:01:08 lr 0.000536 time 2.4260 (2.1993) loss 3.7732 (3.5444) grad_norm 1.3519 (1.4659) [2022-01-22 05:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1230/1251] eta 0:00:46 lr 0.000536 time 1.8483 (2.1985) loss 4.1010 (3.5440) grad_norm 1.2575 (1.4662) [2022-01-22 05:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1240/1251] eta 0:00:24 lr 0.000536 time 1.4224 (2.1966) loss 3.4031 (3.5453) grad_norm 1.8639 (1.4666) [2022-01-22 05:06:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [143/300][1250/1251] eta 0:00:02 lr 0.000536 time 1.1829 (2.1908) loss 3.2089 (3.5454) grad_norm 1.5416 (1.4667) [2022-01-22 05:06:22 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 143 training takes 0:45:41 [2022-01-22 05:06:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.253 (18.253) Loss 1.0716 (1.0716) Acc@1 73.047 (73.047) 
Acc@5 93.652 (93.652) [2022-01-22 05:07:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.674 (3.418) Loss 1.0801 (1.0509) Acc@1 73.828 (75.320) Acc@5 93.750 (92.907) [2022-01-22 05:07:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.614 (2.466) Loss 1.1317 (1.0426) Acc@1 73.340 (75.335) Acc@5 91.113 (92.927) [2022-01-22 05:07:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.623 (2.184) Loss 1.0447 (1.0475) Acc@1 75.000 (75.324) Acc@5 93.457 (92.915) [2022-01-22 05:07:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.309 (2.133) Loss 0.9858 (1.0437) Acc@1 75.684 (75.291) Acc@5 94.238 (93.069) [2022-01-22 05:07:56 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.360 Acc@5 93.076 [2022-01-22 05:07:56 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.4% [2022-01-22 05:07:56 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.63% [2022-01-22 05:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][0/1251] eta 7:33:14 lr 0.000536 time 21.7383 (21.7383) loss 3.8652 (3.8652) grad_norm 1.5845 (1.5845) [2022-01-22 05:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][10/1251] eta 1:22:51 lr 0.000536 time 2.7570 (4.0060) loss 3.8432 (3.5964) grad_norm 1.1972 (1.3719) [2022-01-22 05:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][20/1251] eta 1:05:56 lr 0.000536 time 1.4088 (3.2137) loss 3.8367 (3.5410) grad_norm 1.2918 (1.3609) [2022-01-22 05:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][30/1251] eta 0:57:40 lr 0.000536 time 1.4998 (2.8342) loss 4.2453 (3.4978) grad_norm 1.5481 (1.3931) [2022-01-22 05:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][40/1251] eta 0:55:14 lr 0.000536 time 3.9413 (2.7369) loss 2.8797 (3.4479) grad_norm 1.3528 (1.4302) [2022-01-22 05:10:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][50/1251] eta 0:53:13 lr 0.000536 time 2.2285 (2.6586) loss 3.7392 (3.4411) grad_norm 1.4113 (1.4578) [2022-01-22 05:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][60/1251] eta 0:51:29 lr 0.000536 time 2.2225 (2.5936) loss 3.3330 (3.5003) grad_norm 1.3907 (1.4551) [2022-01-22 05:10:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][70/1251] eta 0:49:57 lr 0.000536 time 1.5702 (2.5383) loss 3.9258 (3.5363) grad_norm 1.3022 (1.4533) [2022-01-22 05:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][80/1251] eta 0:48:51 lr 0.000536 time 3.2810 (2.5037) loss 2.9063 (3.5286) grad_norm 1.3823 (1.4490) [2022-01-22 05:11:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][90/1251] eta 0:47:50 lr 0.000536 time 1.6008 (2.4724) loss 3.1442 (3.4863) grad_norm 1.6017 (1.4597) [2022-01-22 05:12:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][100/1251] eta 0:46:40 lr 0.000536 time 1.6739 (2.4330) loss 4.1620 (3.5213) grad_norm 1.3784 (1.4527) [2022-01-22 05:12:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][110/1251] eta 0:45:38 lr 0.000536 time 1.9663 (2.4004) loss 4.2377 (3.5233) grad_norm 1.6818 (1.4582) [2022-01-22 05:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][120/1251] eta 0:44:52 lr 0.000536 time 3.0965 (2.3808) loss 3.7302 (3.5197) grad_norm 1.4685 (1.4556) [2022-01-22 05:13:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][130/1251] eta 0:44:29 lr 0.000536 time 2.0944 (2.3816) loss 4.1061 (3.5359) grad_norm 1.3781 (1.4544) [2022-01-22 05:13:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][140/1251] eta 0:43:42 lr 0.000536 time 1.8878 (2.3605) loss 3.7686 (3.5248) grad_norm 1.5957 (1.4515) [2022-01-22 05:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][150/1251] eta 0:43:03 lr 0.000535 
time 1.9172 (2.3469) loss 3.9271 (3.5120) grad_norm 1.3436 (1.4588) [2022-01-22 05:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][160/1251] eta 0:42:30 lr 0.000535 time 2.1917 (2.3381) loss 3.0048 (3.5081) grad_norm 1.6621 (1.4583) [2022-01-22 05:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][170/1251] eta 0:41:58 lr 0.000535 time 2.1553 (2.3294) loss 4.2169 (3.5049) grad_norm 1.6240 (1.4579) [2022-01-22 05:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][180/1251] eta 0:41:20 lr 0.000535 time 1.9918 (2.3165) loss 2.7867 (3.4943) grad_norm 1.4733 (1.4553) [2022-01-22 05:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][190/1251] eta 0:40:43 lr 0.000535 time 2.2484 (2.3028) loss 4.3590 (3.5018) grad_norm 1.2824 (1.4519) [2022-01-22 05:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][200/1251] eta 0:40:18 lr 0.000535 time 2.8311 (2.3012) loss 3.2620 (3.4979) grad_norm 1.4279 (1.4519) [2022-01-22 05:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][210/1251] eta 0:39:42 lr 0.000535 time 2.1555 (2.2886) loss 2.8996 (3.4871) grad_norm 1.4199 (1.4567) [2022-01-22 05:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][220/1251] eta 0:39:13 lr 0.000535 time 1.9907 (2.2827) loss 4.2298 (3.4979) grad_norm 1.5254 (1.4569) [2022-01-22 05:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][230/1251] eta 0:38:44 lr 0.000535 time 2.5143 (2.2768) loss 3.6002 (3.4941) grad_norm 1.4448 (1.4568) [2022-01-22 05:17:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][240/1251] eta 0:38:20 lr 0.000535 time 2.7494 (2.2751) loss 3.1527 (3.4991) grad_norm 1.6788 (1.4602) [2022-01-22 05:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][250/1251] eta 0:37:54 lr 0.000535 time 2.3606 (2.2721) loss 3.9212 (3.5085) grad_norm 1.3614 (1.4587) [2022-01-22 05:17:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][260/1251] eta 0:37:26 lr 0.000535 time 2.4022 (2.2667) loss 3.7311 (3.5095) grad_norm 1.2780 (1.4577) [2022-01-22 05:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][270/1251] eta 0:37:08 lr 0.000535 time 1.5975 (2.2717) loss 3.9700 (3.5113) grad_norm 1.4569 (1.4571) [2022-01-22 05:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][280/1251] eta 0:36:41 lr 0.000535 time 2.2776 (2.2668) loss 4.2417 (3.5067) grad_norm 1.4541 (1.4591) [2022-01-22 05:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][290/1251] eta 0:36:13 lr 0.000535 time 3.0629 (2.2619) loss 4.2013 (3.5045) grad_norm 1.3663 (1.4575) [2022-01-22 05:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][300/1251] eta 0:35:46 lr 0.000535 time 1.8547 (2.2575) loss 3.7545 (3.5095) grad_norm 1.3128 (1.4553) [2022-01-22 05:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][310/1251] eta 0:35:20 lr 0.000535 time 1.8558 (2.2536) loss 3.7360 (3.5120) grad_norm 1.4299 (1.4537) [2022-01-22 05:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][320/1251] eta 0:35:02 lr 0.000535 time 5.6024 (2.2579) loss 3.6626 (3.5095) grad_norm 1.4804 (1.4551) [2022-01-22 05:20:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][330/1251] eta 0:34:39 lr 0.000535 time 1.9002 (2.2581) loss 3.4643 (3.5093) grad_norm 1.5707 (1.4548) [2022-01-22 05:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][340/1251] eta 0:34:11 lr 0.000535 time 1.9996 (2.2522) loss 3.6696 (3.5132) grad_norm 1.2924 (1.4560) [2022-01-22 05:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][350/1251] eta 0:33:42 lr 0.000535 time 2.0130 (2.2452) loss 3.4596 (3.5190) grad_norm 1.5252 (1.4573) [2022-01-22 05:21:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][360/1251] eta 0:33:22 lr 
0.000535 time 3.0934 (2.2475) loss 3.5390 (3.5187) grad_norm 1.4039 (1.4613) [2022-01-22 05:21:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][370/1251] eta 0:32:58 lr 0.000535 time 2.2165 (2.2452) loss 4.4243 (3.5164) grad_norm 2.0113 (1.4626) [2022-01-22 05:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][380/1251] eta 0:32:33 lr 0.000535 time 2.0306 (2.2429) loss 3.4032 (3.5180) grad_norm 1.5397 (1.4640) [2022-01-22 05:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][390/1251] eta 0:32:10 lr 0.000534 time 2.1670 (2.2424) loss 3.7033 (3.5224) grad_norm 1.5757 (1.4686) [2022-01-22 05:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][400/1251] eta 0:31:48 lr 0.000534 time 3.6185 (2.2422) loss 3.9529 (3.5212) grad_norm 1.4190 (1.4696) [2022-01-22 05:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][410/1251] eta 0:31:24 lr 0.000534 time 2.2026 (2.2405) loss 3.8418 (3.5290) grad_norm 1.7376 (1.4710) [2022-01-22 05:23:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][420/1251] eta 0:30:59 lr 0.000534 time 2.1701 (2.2376) loss 3.8024 (3.5273) grad_norm 1.5433 (1.4716) [2022-01-22 05:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][430/1251] eta 0:30:34 lr 0.000534 time 1.7358 (2.2350) loss 3.9478 (3.5282) grad_norm 1.8100 (1.4729) [2022-01-22 05:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][440/1251] eta 0:30:10 lr 0.000534 time 2.7675 (2.2319) loss 3.8274 (3.5274) grad_norm 1.5039 (1.4725) [2022-01-22 05:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][450/1251] eta 0:29:43 lr 0.000534 time 1.6273 (2.2271) loss 4.1029 (3.5287) grad_norm 1.4597 (1.4696) [2022-01-22 05:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][460/1251] eta 0:29:19 lr 0.000534 time 1.9554 (2.2244) loss 4.1056 (3.5257) grad_norm 1.4673 (1.4687) [2022-01-22 05:25:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][470/1251] eta 0:28:55 lr 0.000534 time 2.0144 (2.2219) loss 3.6941 (3.5214) grad_norm 1.3815 (1.4674) [2022-01-22 05:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][480/1251] eta 0:28:33 lr 0.000534 time 2.2232 (2.2228) loss 3.8108 (3.5163) grad_norm 1.5417 (1.4675) [2022-01-22 05:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][490/1251] eta 0:28:08 lr 0.000534 time 1.8276 (2.2190) loss 4.3069 (3.5237) grad_norm 1.4103 (1.4687) [2022-01-22 05:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][500/1251] eta 0:27:44 lr 0.000534 time 2.4310 (2.2166) loss 2.8022 (3.5232) grad_norm 1.4125 (1.4684) [2022-01-22 05:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][510/1251] eta 0:27:22 lr 0.000534 time 1.5991 (2.2163) loss 3.5348 (3.5262) grad_norm 1.2606 (1.4691) [2022-01-22 05:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][520/1251] eta 0:27:00 lr 0.000534 time 2.1854 (2.2163) loss 3.7482 (3.5271) grad_norm 1.4578 (1.4687) [2022-01-22 05:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][530/1251] eta 0:26:37 lr 0.000534 time 2.2701 (2.2161) loss 4.1135 (3.5303) grad_norm 1.7012 (1.4693) [2022-01-22 05:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][540/1251] eta 0:26:19 lr 0.000534 time 3.2044 (2.2212) loss 3.5610 (3.5312) grad_norm 1.3334 (1.4708) [2022-01-22 05:28:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][550/1251] eta 0:25:58 lr 0.000534 time 2.5714 (2.2239) loss 4.0212 (3.5299) grad_norm 1.2255 (1.4686) [2022-01-22 05:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][560/1251] eta 0:25:37 lr 0.000534 time 2.2307 (2.2256) loss 3.4112 (3.5317) grad_norm 1.6134 (1.4683) [2022-01-22 05:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][570/1251] eta 0:25:14 lr 
0.000534 time 1.5005 (2.2241) loss 4.4500 (3.5355) grad_norm 1.4795 (1.4684) [2022-01-22 05:29:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][580/1251] eta 0:24:50 lr 0.000534 time 2.8263 (2.2218) loss 4.0813 (3.5373) grad_norm 1.3606 (1.4684) [2022-01-22 05:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][590/1251] eta 0:24:25 lr 0.000534 time 1.9217 (2.2169) loss 3.8418 (3.5378) grad_norm 1.7325 (1.4685) [2022-01-22 05:30:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][600/1251] eta 0:24:01 lr 0.000534 time 1.9416 (2.2150) loss 3.1922 (3.5381) grad_norm 1.3277 (1.4682) [2022-01-22 05:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][610/1251] eta 0:23:41 lr 0.000534 time 2.0180 (2.2171) loss 3.2766 (3.5323) grad_norm 1.5027 (1.4686) [2022-01-22 05:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][620/1251] eta 0:23:22 lr 0.000534 time 3.5490 (2.2228) loss 2.6981 (3.5315) grad_norm 1.5888 (1.4693) [2022-01-22 05:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][630/1251] eta 0:22:59 lr 0.000533 time 1.9318 (2.2216) loss 4.0374 (3.5366) grad_norm 1.5428 (1.4690) [2022-01-22 05:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][640/1251] eta 0:22:35 lr 0.000533 time 1.7676 (2.2191) loss 3.0047 (3.5331) grad_norm 1.2271 (1.4686) [2022-01-22 05:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][650/1251] eta 0:22:12 lr 0.000533 time 2.5544 (2.2165) loss 4.4255 (3.5345) grad_norm 1.4588 (1.4681) [2022-01-22 05:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][660/1251] eta 0:21:48 lr 0.000533 time 2.8747 (2.2135) loss 4.0162 (3.5335) grad_norm 1.5972 (1.4673) [2022-01-22 05:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][670/1251] eta 0:21:24 lr 0.000533 time 1.8928 (2.2104) loss 4.1406 (3.5340) grad_norm 1.6481 (1.4677) [2022-01-22 05:33:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][680/1251] eta 0:21:01 lr 0.000533 time 1.9138 (2.2085) loss 3.0399 (3.5315) grad_norm 1.4853 (1.4683) [2022-01-22 05:33:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][690/1251] eta 0:20:39 lr 0.000533 time 2.5268 (2.2086) loss 3.7680 (3.5318) grad_norm 1.4501 (1.4691) [2022-01-22 05:33:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][700/1251] eta 0:20:16 lr 0.000533 time 2.6894 (2.2075) loss 4.1130 (3.5303) grad_norm 1.5574 (1.4695) [2022-01-22 05:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][710/1251] eta 0:19:54 lr 0.000533 time 2.4381 (2.2081) loss 3.7842 (3.5301) grad_norm 1.5760 (1.4700) [2022-01-22 05:34:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][720/1251] eta 0:19:33 lr 0.000533 time 1.8877 (2.2106) loss 2.3705 (3.5283) grad_norm 1.5860 (1.4700) [2022-01-22 05:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][730/1251] eta 0:19:13 lr 0.000533 time 2.8148 (2.2131) loss 4.1885 (3.5277) grad_norm 1.3724 (1.4691) [2022-01-22 05:35:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][740/1251] eta 0:18:51 lr 0.000533 time 2.0979 (2.2144) loss 3.8470 (3.5276) grad_norm 1.4722 (1.4693) [2022-01-22 05:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][750/1251] eta 0:18:29 lr 0.000533 time 1.8292 (2.2139) loss 2.5111 (3.5275) grad_norm 1.2820 (1.4715) [2022-01-22 05:35:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][760/1251] eta 0:18:05 lr 0.000533 time 2.1552 (2.2115) loss 3.7567 (3.5300) grad_norm 1.6215 (1.4717) [2022-01-22 05:36:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][770/1251] eta 0:17:41 lr 0.000533 time 1.9035 (2.2075) loss 3.6531 (3.5313) grad_norm 1.4865 (1.4719) [2022-01-22 05:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][780/1251] eta 0:17:19 lr 
0.000533 time 1.8889 (2.2061) loss 2.4735 (3.5297) grad_norm 1.5899 (1.4733) [2022-01-22 05:37:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][790/1251] eta 0:16:56 lr 0.000533 time 1.8467 (2.2049) loss 3.6396 (3.5306) grad_norm 1.4107 (1.4738) [2022-01-22 05:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][800/1251] eta 0:16:34 lr 0.000533 time 2.2217 (2.2059) loss 4.0018 (3.5355) grad_norm 1.4363 (1.4734) [2022-01-22 05:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][810/1251] eta 0:16:12 lr 0.000533 time 1.8104 (2.2058) loss 2.9115 (3.5341) grad_norm 1.3417 (1.4741) [2022-01-22 05:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][820/1251] eta 0:15:51 lr 0.000533 time 1.9272 (2.2079) loss 3.7436 (3.5341) grad_norm 1.4094 (1.4741) [2022-01-22 05:38:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][830/1251] eta 0:15:30 lr 0.000533 time 1.7958 (2.2102) loss 3.1315 (3.5336) grad_norm 1.4017 (1.4736) [2022-01-22 05:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][840/1251] eta 0:15:09 lr 0.000533 time 3.3894 (2.2127) loss 3.0508 (3.5335) grad_norm 1.2612 (1.4723) [2022-01-22 05:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][850/1251] eta 0:14:47 lr 0.000533 time 1.8770 (2.2123) loss 2.5618 (3.5320) grad_norm 1.2925 (1.4718) [2022-01-22 05:39:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][860/1251] eta 0:14:24 lr 0.000533 time 1.9069 (2.2110) loss 3.1175 (3.5311) grad_norm 1.2415 (1.4718) [2022-01-22 05:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][870/1251] eta 0:14:01 lr 0.000532 time 1.8131 (2.2085) loss 2.8774 (3.5318) grad_norm 1.4989 (1.4713) [2022-01-22 05:40:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][880/1251] eta 0:13:40 lr 0.000532 time 3.4732 (2.2108) loss 2.6701 (3.5276) grad_norm 1.4500 (1.4735) [2022-01-22 05:40:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][890/1251] eta 0:13:18 lr 0.000532 time 1.5851 (2.2107) loss 4.2061 (3.5302) grad_norm 1.5721 (1.4733) [2022-01-22 05:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][900/1251] eta 0:12:55 lr 0.000532 time 1.5479 (2.2105) loss 4.1210 (3.5314) grad_norm 1.3939 (1.4733) [2022-01-22 05:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][910/1251] eta 0:12:33 lr 0.000532 time 2.2387 (2.2104) loss 2.6567 (3.5314) grad_norm 1.4847 (1.4728) [2022-01-22 05:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][920/1251] eta 0:12:11 lr 0.000532 time 3.2028 (2.2109) loss 3.8128 (3.5335) grad_norm 1.4478 (1.4722) [2022-01-22 05:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][930/1251] eta 0:11:49 lr 0.000532 time 1.6057 (2.2092) loss 3.0723 (3.5339) grad_norm 1.5554 (1.4724) [2022-01-22 05:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][940/1251] eta 0:11:27 lr 0.000532 time 1.7676 (2.2092) loss 3.5972 (3.5348) grad_norm 1.9131 (1.4749) [2022-01-22 05:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][950/1251] eta 0:11:04 lr 0.000532 time 1.8513 (2.2087) loss 2.9387 (3.5375) grad_norm 1.4067 (1.4752) [2022-01-22 05:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][960/1251] eta 0:10:42 lr 0.000532 time 2.4811 (2.2082) loss 3.1053 (3.5347) grad_norm 1.1779 (1.4742) [2022-01-22 05:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][970/1251] eta 0:10:20 lr 0.000532 time 1.8457 (2.2073) loss 3.4140 (3.5335) grad_norm 1.1829 (1.4735) [2022-01-22 05:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][980/1251] eta 0:09:58 lr 0.000532 time 2.3071 (2.2075) loss 2.8504 (3.5324) grad_norm 1.2923 (1.4730) [2022-01-22 05:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][990/1251] eta 0:09:35 lr 
0.000532 time 2.2371 (2.2064) loss 3.9979 (3.5320) grad_norm 1.5454 (1.4733) [2022-01-22 05:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1000/1251] eta 0:09:13 lr 0.000532 time 2.1844 (2.2055) loss 3.6955 (3.5319) grad_norm 1.2395 (1.4729) [2022-01-22 05:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1010/1251] eta 0:08:51 lr 0.000532 time 1.7063 (2.2043) loss 3.5540 (3.5317) grad_norm 1.8087 (1.4724) [2022-01-22 05:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1020/1251] eta 0:08:29 lr 0.000532 time 2.7578 (2.2062) loss 3.0136 (3.5291) grad_norm 1.4201 (1.4727) [2022-01-22 05:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1030/1251] eta 0:08:07 lr 0.000532 time 2.8055 (2.2058) loss 3.5375 (3.5308) grad_norm 1.3713 (1.4725) [2022-01-22 05:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1040/1251] eta 0:07:45 lr 0.000532 time 1.7099 (2.2047) loss 4.0250 (3.5293) grad_norm 2.0712 (1.4735) [2022-01-22 05:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1050/1251] eta 0:07:23 lr 0.000532 time 1.6152 (2.2053) loss 3.9220 (3.5318) grad_norm 1.5516 (1.4730) [2022-01-22 05:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1060/1251] eta 0:07:01 lr 0.000532 time 2.5244 (2.2068) loss 4.5413 (3.5346) grad_norm 1.6952 (1.4729) [2022-01-22 05:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1070/1251] eta 0:06:39 lr 0.000532 time 2.4106 (2.2059) loss 4.3940 (3.5366) grad_norm 1.6315 (1.4736) [2022-01-22 05:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1080/1251] eta 0:06:16 lr 0.000532 time 1.6161 (2.2036) loss 2.8116 (3.5387) grad_norm 1.6932 (1.4739) [2022-01-22 05:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1090/1251] eta 0:05:54 lr 0.000532 time 2.0888 (2.2027) loss 3.7373 (3.5393) grad_norm 1.4647 (1.4730) [2022-01-22 
05:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1100/1251] eta 0:05:32 lr 0.000532 time 2.4424 (2.2024) loss 3.3115 (3.5376) grad_norm 1.3064 (1.4725) [2022-01-22 05:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1110/1251] eta 0:05:10 lr 0.000531 time 1.8911 (2.2030) loss 3.5614 (3.5376) grad_norm 1.6150 (1.4725) [2022-01-22 05:49:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1120/1251] eta 0:04:48 lr 0.000531 time 2.1910 (2.2038) loss 3.9375 (3.5358) grad_norm 1.3312 (1.4725) [2022-01-22 05:49:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1130/1251] eta 0:04:26 lr 0.000531 time 1.9066 (2.2040) loss 3.5497 (3.5359) grad_norm 1.3182 (1.4719) [2022-01-22 05:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1140/1251] eta 0:04:04 lr 0.000531 time 1.9030 (2.2043) loss 2.6820 (3.5368) grad_norm 1.2871 (1.4715) [2022-01-22 05:50:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1150/1251] eta 0:03:42 lr 0.000531 time 1.6347 (2.2043) loss 4.0633 (3.5353) grad_norm 1.3547 (1.4706) [2022-01-22 05:50:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1160/1251] eta 0:03:20 lr 0.000531 time 1.5061 (2.2038) loss 3.7804 (3.5368) grad_norm 1.4234 (1.4703) [2022-01-22 05:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1170/1251] eta 0:02:58 lr 0.000531 time 1.7447 (2.2030) loss 2.7818 (3.5369) grad_norm 1.4131 (1.4702) [2022-01-22 05:51:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1180/1251] eta 0:02:36 lr 0.000531 time 2.0205 (2.2025) loss 3.4747 (3.5375) grad_norm 1.5384 (1.4704) [2022-01-22 05:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1190/1251] eta 0:02:14 lr 0.000531 time 2.1910 (2.2012) loss 2.7413 (3.5399) grad_norm 1.3876 (1.4701) [2022-01-22 05:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1200/1251] 
eta 0:01:52 lr 0.000531 time 1.9465 (2.1995) loss 4.1190 (3.5393) grad_norm 1.5182 (1.4694) [2022-01-22 05:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1210/1251] eta 0:01:30 lr 0.000531 time 1.8161 (2.1992) loss 4.2299 (3.5398) grad_norm 1.1956 (1.4683) [2022-01-22 05:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1220/1251] eta 0:01:08 lr 0.000531 time 2.1621 (2.2010) loss 4.0101 (3.5395) grad_norm 1.5727 (1.4680) [2022-01-22 05:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1230/1251] eta 0:00:46 lr 0.000531 time 1.7559 (2.2025) loss 3.2421 (3.5384) grad_norm 1.3489 (1.4675) [2022-01-22 05:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1240/1251] eta 0:00:24 lr 0.000531 time 1.9086 (2.2020) loss 3.4753 (3.5380) grad_norm 1.5519 (1.4678) [2022-01-22 05:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [144/300][1250/1251] eta 0:00:02 lr 0.000531 time 1.1628 (2.1965) loss 2.8300 (3.5376) grad_norm 1.6806 (1.4679) [2022-01-22 05:53:44 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 144 training takes 0:45:48 [2022-01-22 05:54:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.614 (18.614) Loss 0.9720 (0.9720) Acc@1 77.051 (77.051) Acc@5 94.531 (94.531) [2022-01-22 05:54:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.965 (3.405) Loss 0.9818 (1.0385) Acc@1 75.586 (75.142) Acc@5 93.945 (93.510) [2022-01-22 05:54:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.293 (2.543) Loss 0.9838 (1.0392) Acc@1 77.051 (75.242) Acc@5 94.043 (93.411) [2022-01-22 05:54:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.282 (2.279) Loss 1.0335 (1.0320) Acc@1 75.781 (75.447) Acc@5 93.164 (93.300) [2022-01-22 05:55:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.127 (2.225) Loss 1.0184 (1.0327) Acc@1 75.781 (75.460) Acc@5 92.969 (93.155) 
[2022-01-22 05:55:23 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.458 Acc@5 93.202 [2022-01-22 05:55:23 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.5% [2022-01-22 05:55:23 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.63% [2022-01-22 05:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][0/1251] eta 7:28:56 lr 0.000531 time 21.5319 (21.5319) loss 2.6017 (2.6017) grad_norm 1.4743 (1.4743) [2022-01-22 05:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][10/1251] eta 1:21:57 lr 0.000531 time 1.8808 (3.9624) loss 3.2876 (3.4396) grad_norm 1.3315 (1.4380) [2022-01-22 05:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][20/1251] eta 1:03:26 lr 0.000531 time 1.4774 (3.0926) loss 4.0415 (3.4996) grad_norm 1.4550 (1.4602) [2022-01-22 05:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][30/1251] eta 0:57:54 lr 0.000531 time 1.6021 (2.8456) loss 3.3850 (3.4284) grad_norm 1.5743 (1.4735) [2022-01-22 05:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][40/1251] eta 0:55:18 lr 0.000531 time 4.9029 (2.7399) loss 3.8008 (3.4346) grad_norm 1.3609 (1.4733) [2022-01-22 05:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][50/1251] eta 0:51:46 lr 0.000531 time 1.6367 (2.5864) loss 3.4247 (3.4383) grad_norm 1.4929 (1.4739) [2022-01-22 05:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][60/1251] eta 0:50:45 lr 0.000531 time 2.3770 (2.5569) loss 4.5282 (3.4233) grad_norm 1.9341 (1.4792) [2022-01-22 05:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][70/1251] eta 0:49:00 lr 0.000531 time 1.6954 (2.4896) loss 2.5592 (3.4689) grad_norm 1.3946 (1.4837) [2022-01-22 05:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][80/1251] eta 0:47:54 lr 0.000531 time 3.4323 (2.4551) loss 2.3464 (3.4828) 
grad_norm 1.6183 (1.4786) [2022-01-22 05:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][90/1251] eta 0:46:55 lr 0.000531 time 2.0596 (2.4254) loss 3.7220 (3.5028) grad_norm 1.5994 (1.4795) [2022-01-22 05:59:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][100/1251] eta 0:46:08 lr 0.000530 time 1.5754 (2.4053) loss 2.7854 (3.5105) grad_norm 1.3826 (1.4826) [2022-01-22 05:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][110/1251] eta 0:45:17 lr 0.000530 time 1.9042 (2.3815) loss 3.2273 (3.5135) grad_norm 1.5133 (1.4786) [2022-01-22 06:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][120/1251] eta 0:44:45 lr 0.000530 time 3.4335 (2.3747) loss 4.3706 (3.5303) grad_norm 1.5029 (1.4754) [2022-01-22 06:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][130/1251] eta 0:44:16 lr 0.000530 time 2.5790 (2.3698) loss 3.9852 (3.5403) grad_norm 2.2719 (1.4792) [2022-01-22 06:00:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][140/1251] eta 0:43:30 lr 0.000530 time 1.7864 (2.3499) loss 3.9777 (3.5410) grad_norm 1.6872 (1.4791) [2022-01-22 06:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][150/1251] eta 0:42:54 lr 0.000530 time 2.1730 (2.3379) loss 3.3143 (3.5683) grad_norm 1.6560 (1.4828) [2022-01-22 06:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][160/1251] eta 0:42:24 lr 0.000530 time 2.5296 (2.3318) loss 3.9808 (3.5706) grad_norm 1.4326 (1.4854) [2022-01-22 06:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][170/1251] eta 0:41:51 lr 0.000530 time 2.5398 (2.3233) loss 3.7751 (3.5770) grad_norm 1.5700 (1.4784) [2022-01-22 06:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][180/1251] eta 0:41:10 lr 0.000530 time 1.8375 (2.3067) loss 2.4383 (3.5710) grad_norm 1.5397 (1.4773) [2022-01-22 06:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [145/300][190/1251] eta 0:40:49 lr 0.000530 time 2.4783 (2.3087) loss 2.5090 (3.5531) grad_norm 1.4000 (1.4738) [2022-01-22 06:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][200/1251] eta 0:40:20 lr 0.000530 time 2.7532 (2.3033) loss 3.0897 (3.5533) grad_norm 1.3755 (1.4776) [2022-01-22 06:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][210/1251] eta 0:39:47 lr 0.000530 time 2.3518 (2.2934) loss 3.8319 (3.5675) grad_norm 1.5647 (1.4778) [2022-01-22 06:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][220/1251] eta 0:39:13 lr 0.000530 time 1.9368 (2.2831) loss 3.2893 (3.5612) grad_norm 1.4223 (1.4768) [2022-01-22 06:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][230/1251] eta 0:38:43 lr 0.000530 time 2.1310 (2.2757) loss 3.7297 (3.5714) grad_norm 1.5070 (1.4764) [2022-01-22 06:04:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][240/1251] eta 0:38:15 lr 0.000530 time 1.8098 (2.2702) loss 2.5473 (3.5684) grad_norm 1.2473 (1.4770) [2022-01-22 06:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][250/1251] eta 0:37:49 lr 0.000530 time 2.8499 (2.2673) loss 2.8026 (3.5607) grad_norm 1.4391 (1.4742) [2022-01-22 06:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][260/1251] eta 0:37:23 lr 0.000530 time 2.0546 (2.2639) loss 4.1767 (3.5599) grad_norm 1.5343 (1.4764) [2022-01-22 06:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][270/1251] eta 0:37:04 lr 0.000530 time 3.6172 (2.2678) loss 3.4506 (3.5624) grad_norm 1.2635 (1.4767) [2022-01-22 06:06:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][280/1251] eta 0:36:40 lr 0.000530 time 1.8109 (2.2661) loss 3.4611 (3.5678) grad_norm 1.6923 (1.4758) [2022-01-22 06:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][290/1251] eta 0:36:17 lr 0.000530 time 2.8290 (2.2659) loss 4.0037 (3.5668) 
grad_norm 1.6062 (1.4767) [2022-01-22 06:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][300/1251] eta 0:35:47 lr 0.000530 time 1.5767 (2.2582) loss 4.3964 (3.5797) grad_norm 1.2835 (1.4778) [2022-01-22 06:07:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][310/1251] eta 0:35:22 lr 0.000530 time 2.9410 (2.2551) loss 4.4057 (3.5730) grad_norm 1.4433 (1.4763) [2022-01-22 06:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][320/1251] eta 0:34:54 lr 0.000530 time 1.7109 (2.2499) loss 3.0268 (3.5717) grad_norm 1.6149 (1.4752) [2022-01-22 06:07:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][330/1251] eta 0:34:29 lr 0.000530 time 2.2009 (2.2470) loss 3.8135 (3.5734) grad_norm 1.2731 (1.4763) [2022-01-22 06:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][340/1251] eta 0:34:04 lr 0.000529 time 1.7834 (2.2442) loss 2.6783 (3.5712) grad_norm 1.3864 (1.4781) [2022-01-22 06:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][350/1251] eta 0:33:41 lr 0.000529 time 3.0657 (2.2433) loss 3.8150 (3.5719) grad_norm 1.3751 (1.4779) [2022-01-22 06:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][360/1251] eta 0:33:24 lr 0.000529 time 2.9884 (2.2498) loss 3.1137 (3.5676) grad_norm 1.2191 (1.4754) [2022-01-22 06:09:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][370/1251] eta 0:33:01 lr 0.000529 time 1.9937 (2.2491) loss 3.8253 (3.5724) grad_norm 1.6679 (1.4747) [2022-01-22 06:09:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][380/1251] eta 0:32:35 lr 0.000529 time 2.0083 (2.2452) loss 3.3315 (3.5754) grad_norm 1.3945 (1.4748) [2022-01-22 06:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][390/1251] eta 0:32:07 lr 0.000529 time 1.8049 (2.2383) loss 3.9947 (3.5804) grad_norm 1.5233 (1.4757) [2022-01-22 06:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [145/300][400/1251] eta 0:31:40 lr 0.000529 time 1.8723 (2.2334) loss 3.9514 (3.5796) grad_norm 1.4437 (1.4767) [2022-01-22 06:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][410/1251] eta 0:31:15 lr 0.000529 time 2.0912 (2.2299) loss 4.0891 (3.5769) grad_norm 1.4043 (1.4764) [2022-01-22 06:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][420/1251] eta 0:30:51 lr 0.000529 time 1.6407 (2.2278) loss 3.8453 (3.5768) grad_norm 1.3803 (1.4762) [2022-01-22 06:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][430/1251] eta 0:30:29 lr 0.000529 time 2.3710 (2.2281) loss 3.8518 (3.5764) grad_norm 1.4682 (1.4769) [2022-01-22 06:11:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][440/1251] eta 0:30:10 lr 0.000529 time 1.5366 (2.2319) loss 2.7723 (3.5713) grad_norm 1.7123 (1.4789) [2022-01-22 06:12:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][450/1251] eta 0:29:49 lr 0.000529 time 2.3750 (2.2344) loss 2.8461 (3.5687) grad_norm 1.4622 (1.4777) [2022-01-22 06:12:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][460/1251] eta 0:29:25 lr 0.000529 time 2.3692 (2.2316) loss 2.6888 (3.5678) grad_norm 1.4565 (1.4760) [2022-01-22 06:12:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][470/1251] eta 0:29:00 lr 0.000529 time 2.2811 (2.2288) loss 4.1117 (3.5685) grad_norm 1.2126 (1.4749) [2022-01-22 06:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][480/1251] eta 0:28:35 lr 0.000529 time 1.6513 (2.2249) loss 3.1657 (3.5658) grad_norm 1.9920 (1.4749) [2022-01-22 06:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][490/1251] eta 0:28:11 lr 0.000529 time 2.3369 (2.2232) loss 4.1007 (3.5636) grad_norm 1.3934 (1.4742) [2022-01-22 06:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][500/1251] eta 0:27:49 lr 0.000529 time 2.9165 (2.2236) loss 2.9966 (3.5612) 
grad_norm 1.4889 (1.4733) [2022-01-22 06:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][510/1251] eta 0:27:27 lr 0.000529 time 2.7851 (2.2237) loss 3.8737 (3.5605) grad_norm 1.5186 (1.4730) [2022-01-22 06:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][520/1251] eta 0:27:04 lr 0.000529 time 1.9250 (2.2229) loss 3.7409 (3.5671) grad_norm 1.3810 (1.4724) [2022-01-22 06:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][530/1251] eta 0:26:43 lr 0.000529 time 3.1464 (2.2239) loss 4.2310 (3.5757) grad_norm 1.4140 (1.4720) [2022-01-22 06:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][540/1251] eta 0:26:19 lr 0.000529 time 2.5690 (2.2220) loss 3.6423 (3.5759) grad_norm 1.3367 (1.4719) [2022-01-22 06:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][550/1251] eta 0:25:58 lr 0.000529 time 3.3619 (2.2231) loss 3.7701 (3.5756) grad_norm 1.4145 (1.4713) [2022-01-22 06:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][560/1251] eta 0:25:34 lr 0.000529 time 2.0061 (2.2201) loss 2.7156 (3.5707) grad_norm 1.4804 (1.4701) [2022-01-22 06:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][570/1251] eta 0:25:09 lr 0.000529 time 2.1888 (2.2169) loss 4.0036 (3.5718) grad_norm 1.3599 (1.4687) [2022-01-22 06:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][580/1251] eta 0:24:46 lr 0.000529 time 2.3281 (2.2158) loss 3.5247 (3.5690) grad_norm 1.3527 (1.4674) [2022-01-22 06:17:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][590/1251] eta 0:24:25 lr 0.000528 time 2.5477 (2.2174) loss 2.8318 (3.5681) grad_norm 1.6876 (1.4688) [2022-01-22 06:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][600/1251] eta 0:24:05 lr 0.000528 time 2.7711 (2.2206) loss 4.1038 (3.5712) grad_norm 1.4203 (1.4683) [2022-01-22 06:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [145/300][610/1251] eta 0:23:43 lr 0.000528 time 2.1766 (2.2204) loss 3.6741 (3.5714) grad_norm 1.3327 (1.4690) [2022-01-22 06:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][620/1251] eta 0:23:19 lr 0.000528 time 2.2672 (2.2186) loss 2.4225 (3.5693) grad_norm 1.4890 (1.4679) [2022-01-22 06:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][630/1251] eta 0:22:55 lr 0.000528 time 1.9911 (2.2153) loss 3.7308 (3.5698) grad_norm 1.5849 (1.4677) [2022-01-22 06:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][640/1251] eta 0:22:32 lr 0.000528 time 2.1814 (2.2129) loss 4.4316 (3.5697) grad_norm 1.5068 (1.4666) [2022-01-22 06:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][650/1251] eta 0:22:10 lr 0.000528 time 2.2390 (2.2136) loss 3.9204 (3.5697) grad_norm 1.3285 (1.4651) [2022-01-22 06:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][660/1251] eta 0:21:48 lr 0.000528 time 1.9495 (2.2138) loss 4.1571 (3.5704) grad_norm 1.6214 (1.4650) [2022-01-22 06:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][670/1251] eta 0:21:26 lr 0.000528 time 2.5785 (2.2151) loss 2.5213 (3.5672) grad_norm 1.5044 (1.4652) [2022-01-22 06:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][680/1251] eta 0:21:03 lr 0.000528 time 1.9330 (2.2136) loss 3.7088 (3.5663) grad_norm 1.8617 (1.4667) [2022-01-22 06:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][690/1251] eta 0:20:40 lr 0.000528 time 1.7356 (2.2119) loss 4.0067 (3.5685) grad_norm 1.4784 (1.4658) [2022-01-22 06:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][700/1251] eta 0:20:18 lr 0.000528 time 1.5125 (2.2114) loss 3.6728 (3.5665) grad_norm 1.4380 (1.4649) [2022-01-22 06:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][710/1251] eta 0:19:55 lr 0.000528 time 1.7023 (2.2099) loss 2.5632 (3.5646) 
grad_norm 1.4194 (1.4649) [2022-01-22 06:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][720/1251] eta 0:19:33 lr 0.000528 time 2.1515 (2.2101) loss 2.5855 (3.5628) grad_norm 1.5538 (1.4653) [2022-01-22 06:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][730/1251] eta 0:19:12 lr 0.000528 time 2.7766 (2.2112) loss 3.2267 (3.5630) grad_norm 1.3347 (1.4659) [2022-01-22 06:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][740/1251] eta 0:18:50 lr 0.000528 time 1.8806 (2.2119) loss 3.7895 (3.5612) grad_norm 1.3216 (1.4655) [2022-01-22 06:23:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][750/1251] eta 0:18:27 lr 0.000528 time 1.8970 (2.2114) loss 3.9767 (3.5646) grad_norm 1.6613 (1.4654) [2022-01-22 06:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][760/1251] eta 0:18:04 lr 0.000528 time 1.8125 (2.2092) loss 3.1037 (3.5625) grad_norm 1.4910 (1.4663) [2022-01-22 06:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][770/1251] eta 0:17:41 lr 0.000528 time 1.5599 (2.2069) loss 3.8323 (3.5628) grad_norm 1.5511 (1.4655) [2022-01-22 06:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][780/1251] eta 0:17:18 lr 0.000528 time 2.1380 (2.2058) loss 4.1418 (3.5681) grad_norm 1.7460 (1.4662) [2022-01-22 06:24:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][790/1251] eta 0:16:56 lr 0.000528 time 2.1110 (2.2048) loss 3.5857 (3.5658) grad_norm 1.6235 (1.4680) [2022-01-22 06:24:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][800/1251] eta 0:16:34 lr 0.000528 time 1.8737 (2.2051) loss 3.0579 (3.5673) grad_norm 1.2239 (1.4674) [2022-01-22 06:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][810/1251] eta 0:16:12 lr 0.000528 time 2.5217 (2.2056) loss 3.6248 (3.5675) grad_norm 1.2702 (1.4682) [2022-01-22 06:25:35 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [145/300][820/1251] eta 0:15:51 lr 0.000528 time 1.9162 (2.2075) loss 2.4618 (3.5646) grad_norm 1.6569 (1.4694) [2022-01-22 06:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][830/1251] eta 0:15:29 lr 0.000527 time 2.3101 (2.2089) loss 3.1889 (3.5641) grad_norm 1.4465 (1.4693) [2022-01-22 06:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][840/1251] eta 0:15:07 lr 0.000527 time 1.9222 (2.2071) loss 2.4450 (3.5649) grad_norm 1.4799 (1.4694) [2022-01-22 06:26:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][850/1251] eta 0:14:44 lr 0.000527 time 1.9048 (2.2062) loss 3.1627 (3.5637) grad_norm 1.4326 (1.4703) [2022-01-22 06:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][860/1251] eta 0:14:22 lr 0.000527 time 1.8640 (2.2062) loss 3.3137 (3.5617) grad_norm 1.5733 (1.4711) [2022-01-22 06:27:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][870/1251] eta 0:14:00 lr 0.000527 time 1.8977 (2.2059) loss 2.7400 (3.5579) grad_norm 1.6804 (1.4716) [2022-01-22 06:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][880/1251] eta 0:13:37 lr 0.000527 time 1.8309 (2.2047) loss 3.2667 (3.5588) grad_norm 1.4003 (1.4718) [2022-01-22 06:28:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][890/1251] eta 0:13:15 lr 0.000527 time 2.1314 (2.2038) loss 2.4550 (3.5587) grad_norm 1.6494 (1.4719) [2022-01-22 06:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][900/1251] eta 0:12:53 lr 0.000527 time 1.8765 (2.2041) loss 3.8604 (3.5600) grad_norm 1.2306 (1.4709) [2022-01-22 06:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][910/1251] eta 0:12:31 lr 0.000527 time 1.9419 (2.2038) loss 3.8822 (3.5602) grad_norm 2.3213 (1.4722) [2022-01-22 06:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][920/1251] eta 0:12:09 lr 0.000527 time 1.9249 (2.2036) loss 3.7414 (3.5615) 
grad_norm 1.2872 (1.4739) [2022-01-22 06:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][930/1251] eta 0:11:47 lr 0.000527 time 2.2656 (2.2038) loss 3.1277 (3.5615) grad_norm 1.6204 (1.4748) [2022-01-22 06:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][940/1251] eta 0:11:25 lr 0.000527 time 1.9259 (2.2032) loss 4.2535 (3.5606) grad_norm 1.4791 (1.4750) [2022-01-22 06:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][950/1251] eta 0:11:03 lr 0.000527 time 1.8637 (2.2034) loss 3.5523 (3.5585) grad_norm 1.4931 (1.4756) [2022-01-22 06:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][960/1251] eta 0:10:40 lr 0.000527 time 1.5395 (2.2021) loss 3.3458 (3.5581) grad_norm 1.4299 (1.4765) [2022-01-22 06:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][970/1251] eta 0:10:19 lr 0.000527 time 3.4590 (2.2033) loss 3.7713 (3.5568) grad_norm 1.6803 (1.4761) [2022-01-22 06:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][980/1251] eta 0:09:57 lr 0.000527 time 1.7155 (2.2035) loss 2.8104 (3.5574) grad_norm 1.4766 (1.4760) [2022-01-22 06:31:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][990/1251] eta 0:09:35 lr 0.000527 time 1.8810 (2.2033) loss 3.8684 (3.5567) grad_norm 1.5339 (1.4771) [2022-01-22 06:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1000/1251] eta 0:09:12 lr 0.000527 time 1.8930 (2.2027) loss 2.6986 (3.5560) grad_norm 1.3078 (1.4772) [2022-01-22 06:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1010/1251] eta 0:08:50 lr 0.000527 time 3.2619 (2.2018) loss 3.6646 (3.5554) grad_norm 1.4945 (1.4763) [2022-01-22 06:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1020/1251] eta 0:08:28 lr 0.000527 time 1.8704 (2.2008) loss 2.8195 (3.5531) grad_norm 1.5580 (1.4761) [2022-01-22 06:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [145/300][1030/1251] eta 0:08:06 lr 0.000527 time 2.0230 (2.2000) loss 3.6594 (3.5522) grad_norm 1.4736 (1.4748) [2022-01-22 06:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1040/1251] eta 0:07:44 lr 0.000527 time 2.0706 (2.2009) loss 3.6595 (3.5532) grad_norm 1.3798 (1.4747) [2022-01-22 06:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1050/1251] eta 0:07:22 lr 0.000527 time 2.8319 (2.2019) loss 4.0611 (3.5507) grad_norm 1.6571 (1.4746) [2022-01-22 06:34:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1060/1251] eta 0:07:00 lr 0.000527 time 2.1188 (2.2006) loss 3.9116 (3.5517) grad_norm 1.2623 (1.4736) [2022-01-22 06:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1070/1251] eta 0:06:38 lr 0.000526 time 1.8190 (2.2001) loss 3.1455 (3.5520) grad_norm 1.2551 (1.4730) [2022-01-22 06:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1080/1251] eta 0:06:16 lr 0.000526 time 2.0484 (2.1993) loss 3.7789 (3.5517) grad_norm 1.4528 (1.4729) [2022-01-22 06:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1090/1251] eta 0:05:54 lr 0.000526 time 2.8701 (2.1994) loss 2.3171 (3.5513) grad_norm 1.6038 (1.4734) [2022-01-22 06:35:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1100/1251] eta 0:05:32 lr 0.000526 time 2.4918 (2.1995) loss 3.7136 (3.5524) grad_norm 1.4928 (1.4734) [2022-01-22 06:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1110/1251] eta 0:05:10 lr 0.000526 time 2.1282 (2.2005) loss 3.9302 (3.5543) grad_norm 1.3123 (1.4733) [2022-01-22 06:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1120/1251] eta 0:04:48 lr 0.000526 time 2.1905 (2.2003) loss 2.8163 (3.5539) grad_norm 1.6587 (1.4725) [2022-01-22 06:36:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1130/1251] eta 0:04:26 lr 0.000526 time 3.1091 (2.2020) loss 3.7906 
(3.5540) grad_norm 1.4034 (1.4722) [2022-01-22 06:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1140/1251] eta 0:04:04 lr 0.000526 time 1.6623 (2.2007) loss 2.4081 (3.5529) grad_norm 1.3717 (1.4720) [2022-01-22 06:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1150/1251] eta 0:03:42 lr 0.000526 time 1.6794 (2.1988) loss 3.7649 (3.5508) grad_norm 1.4535 (1.4718) [2022-01-22 06:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1160/1251] eta 0:03:19 lr 0.000526 time 1.9280 (2.1978) loss 3.6510 (3.5509) grad_norm 1.4286 (1.4718) [2022-01-22 06:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1170/1251] eta 0:02:58 lr 0.000526 time 2.5038 (2.1993) loss 3.5317 (3.5519) grad_norm 1.2893 (1.4719) [2022-01-22 06:38:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1180/1251] eta 0:02:36 lr 0.000526 time 1.8676 (2.1993) loss 4.0440 (3.5539) grad_norm 1.3581 (1.4715) [2022-01-22 06:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1190/1251] eta 0:02:14 lr 0.000526 time 1.9373 (2.1989) loss 3.3588 (3.5540) grad_norm 1.5066 (1.4713) [2022-01-22 06:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1200/1251] eta 0:01:52 lr 0.000526 time 1.7353 (2.1978) loss 3.7881 (3.5556) grad_norm 1.4316 (1.4706) [2022-01-22 06:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1210/1251] eta 0:01:30 lr 0.000526 time 1.8083 (2.1973) loss 3.5996 (3.5555) grad_norm 1.5745 (1.4702) [2022-01-22 06:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1220/1251] eta 0:01:08 lr 0.000526 time 2.0717 (2.1969) loss 4.2783 (3.5552) grad_norm 1.3146 (1.4700) [2022-01-22 06:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1230/1251] eta 0:00:46 lr 0.000526 time 2.7480 (2.1972) loss 4.0952 (3.5567) grad_norm 1.5796 (1.4705) [2022-01-22 06:40:48 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [145/300][1240/1251] eta 0:00:24 lr 0.000526 time 1.5346 (2.1959) loss 3.5618 (3.5580) grad_norm 1.3729 (1.4704) [2022-01-22 06:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [145/300][1250/1251] eta 0:00:02 lr 0.000526 time 1.2877 (2.1907) loss 3.6659 (3.5566) grad_norm 1.3058 (1.4706) [2022-01-22 06:41:04 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 145 training takes 0:45:41 [2022-01-22 06:41:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.031 (19.031) Loss 0.9948 (0.9948) Acc@1 77.930 (77.930) Acc@5 94.043 (94.043) [2022-01-22 06:41:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.951 (3.429) Loss 1.0948 (1.0154) Acc@1 73.926 (76.003) Acc@5 93.555 (93.510) [2022-01-22 06:41:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.297 (2.473) Loss 1.0425 (1.0250) Acc@1 74.805 (75.944) Acc@5 92.676 (93.331) [2022-01-22 06:42:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.654 (2.212) Loss 0.9925 (1.0290) Acc@1 77.344 (75.876) Acc@5 94.336 (93.363) [2022-01-22 06:42:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.283 (2.137) Loss 1.0362 (1.0334) Acc@1 74.121 (75.653) Acc@5 93.555 (93.297) [2022-01-22 06:42:39 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.706 Acc@5 93.252 [2022-01-22 06:42:39 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.7% [2022-01-22 06:42:39 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.71% [2022-01-22 06:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][0/1251] eta 7:36:50 lr 0.000526 time 21.9106 (21.9106) loss 4.1716 (4.1716) grad_norm 1.5978 (1.5978) [2022-01-22 06:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][10/1251] eta 1:25:05 lr 0.000526 time 2.5875 (4.1141) loss 3.8748 (3.6219) grad_norm 1.2156 (1.4401) [2022-01-22 06:43:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][20/1251] eta 1:04:19 lr 0.000526 time 1.7921 (3.1350) loss 2.6983 (3.5197) grad_norm 1.4228 (1.5056) [2022-01-22 06:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][30/1251] eta 0:57:18 lr 0.000526 time 1.6854 (2.8163) loss 4.0033 (3.5454) grad_norm 1.3814 (1.4921) [2022-01-22 06:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][40/1251] eta 0:54:37 lr 0.000526 time 3.7529 (2.7065) loss 3.4075 (3.5043) grad_norm 1.4379 (1.4892) [2022-01-22 06:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][50/1251] eta 0:53:20 lr 0.000526 time 3.3629 (2.6647) loss 4.3507 (3.4937) grad_norm 1.8470 (1.4760) [2022-01-22 06:45:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][60/1251] eta 0:52:23 lr 0.000525 time 2.5617 (2.6395) loss 2.7194 (3.4448) grad_norm 1.4342 (1.4758) [2022-01-22 06:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][70/1251] eta 0:50:31 lr 0.000525 time 1.6987 (2.5667) loss 3.7964 (3.4513) grad_norm 1.5139 (1.4809) [2022-01-22 06:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][80/1251] eta 0:48:42 lr 0.000525 time 1.9118 (2.4954) loss 3.8542 (3.4374) grad_norm 1.5722 (1.4794) [2022-01-22 06:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][90/1251] eta 0:46:48 lr 0.000525 time 1.6578 (2.4194) loss 4.2321 (3.4417) grad_norm 1.6077 (1.4966) [2022-01-22 06:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][100/1251] eta 0:46:00 lr 0.000525 time 2.6072 (2.3981) loss 2.3410 (3.4801) grad_norm 1.4480 (1.5039) [2022-01-22 06:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][110/1251] eta 0:45:09 lr 0.000525 time 1.8676 (2.3749) loss 3.9605 (3.4762) grad_norm 1.4966 (1.5029) [2022-01-22 06:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][120/1251] eta 0:44:36 lr 0.000525 time 
2.1136 (2.3661) loss 3.0910 (3.4701) grad_norm 1.9097 (1.5130) [2022-01-22 06:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][130/1251] eta 0:44:02 lr 0.000525 time 1.5497 (2.3576) loss 3.2443 (3.4631) grad_norm 1.5257 (1.5085) [2022-01-22 06:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][140/1251] eta 0:43:49 lr 0.000525 time 3.0828 (2.3667) loss 4.0804 (3.4828) grad_norm 1.5772 (1.5051) [2022-01-22 06:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][150/1251] eta 0:43:19 lr 0.000525 time 2.6719 (2.3613) loss 3.2699 (3.4860) grad_norm 1.6870 (1.5068) [2022-01-22 06:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][160/1251] eta 0:42:35 lr 0.000525 time 1.6310 (2.3421) loss 3.2511 (3.4963) grad_norm 1.3993 (1.5039) [2022-01-22 06:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][170/1251] eta 0:41:49 lr 0.000525 time 1.7392 (2.3212) loss 3.6742 (3.4924) grad_norm 1.6695 (1.4981) [2022-01-22 06:49:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][180/1251] eta 0:41:09 lr 0.000525 time 2.2697 (2.3061) loss 3.7097 (3.5007) grad_norm 1.3514 (1.4918) [2022-01-22 06:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][190/1251] eta 0:40:38 lr 0.000525 time 2.5481 (2.2984) loss 3.0928 (3.5021) grad_norm 1.2474 (1.4869) [2022-01-22 06:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][200/1251] eta 0:40:17 lr 0.000525 time 2.6047 (2.3003) loss 2.6020 (3.4987) grad_norm 1.5141 (1.4854) [2022-01-22 06:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][210/1251] eta 0:39:51 lr 0.000525 time 1.8472 (2.2973) loss 3.8803 (3.5077) grad_norm 1.4394 (1.4842) [2022-01-22 06:51:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][220/1251] eta 0:39:28 lr 0.000525 time 2.4276 (2.2972) loss 3.9466 (3.4988) grad_norm 1.7492 (1.4832) [2022-01-22 06:51:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][230/1251] eta 0:38:58 lr 0.000525 time 2.4512 (2.2904) loss 3.9745 (3.5094) grad_norm 1.4234 (1.4817) [2022-01-22 06:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][240/1251] eta 0:38:29 lr 0.000525 time 1.5532 (2.2847) loss 3.7340 (3.5080) grad_norm 1.3880 (1.4795) [2022-01-22 06:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][250/1251] eta 0:37:56 lr 0.000525 time 1.8778 (2.2743) loss 3.4311 (3.5082) grad_norm 1.4778 (1.4770) [2022-01-22 06:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][260/1251] eta 0:37:31 lr 0.000525 time 2.2803 (2.2723) loss 2.7178 (3.5129) grad_norm 1.3040 (1.4787) [2022-01-22 06:52:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][270/1251] eta 0:37:01 lr 0.000525 time 2.7457 (2.2641) loss 3.3128 (3.5142) grad_norm 1.4020 (1.4822) [2022-01-22 06:53:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][280/1251] eta 0:36:32 lr 0.000525 time 1.9795 (2.2577) loss 2.8609 (3.5054) grad_norm 1.5182 (1.4832) [2022-01-22 06:53:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][290/1251] eta 0:36:09 lr 0.000525 time 2.2990 (2.2576) loss 3.8928 (3.5065) grad_norm 1.8767 (1.4875) [2022-01-22 06:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][300/1251] eta 0:35:47 lr 0.000524 time 2.7429 (2.2579) loss 3.0997 (3.5050) grad_norm 1.6622 (1.4865) [2022-01-22 06:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][310/1251] eta 0:35:32 lr 0.000524 time 3.1641 (2.2665) loss 3.8270 (3.5083) grad_norm 1.3894 (1.4859) [2022-01-22 06:54:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][320/1251] eta 0:35:09 lr 0.000524 time 2.1096 (2.2656) loss 4.1217 (3.5096) grad_norm 1.4179 (1.4860) [2022-01-22 06:55:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][330/1251] eta 0:34:41 lr 
0.000524 time 1.6620 (2.2605) loss 3.6795 (3.5123) grad_norm 1.5073 (1.4857) [2022-01-22 06:55:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][340/1251] eta 0:34:16 lr 0.000524 time 1.8363 (2.2574) loss 2.3213 (3.5104) grad_norm 1.3180 (1.4828) [2022-01-22 06:55:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][350/1251] eta 0:33:49 lr 0.000524 time 2.2506 (2.2526) loss 2.6322 (3.5126) grad_norm 1.2709 (1.4798) [2022-01-22 06:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][360/1251] eta 0:33:20 lr 0.000524 time 1.6955 (2.2452) loss 3.3821 (3.5132) grad_norm 1.5248 (1.4785) [2022-01-22 06:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][370/1251] eta 0:32:55 lr 0.000524 time 2.2111 (2.2424) loss 3.8323 (3.5191) grad_norm 1.6105 (1.4802) [2022-01-22 06:56:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][380/1251] eta 0:32:30 lr 0.000524 time 1.5903 (2.2396) loss 3.7210 (3.5221) grad_norm 1.6873 (1.4836) [2022-01-22 06:57:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][390/1251] eta 0:32:06 lr 0.000524 time 2.4473 (2.2372) loss 3.6351 (3.5156) grad_norm 1.8879 (1.4888) [2022-01-22 06:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][400/1251] eta 0:31:43 lr 0.000524 time 2.2446 (2.2368) loss 4.2549 (3.5178) grad_norm 1.3733 (1.4896) [2022-01-22 06:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][410/1251] eta 0:31:24 lr 0.000524 time 2.5086 (2.2412) loss 2.7384 (3.5072) grad_norm 1.4512 (1.4888) [2022-01-22 06:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][420/1251] eta 0:31:03 lr 0.000524 time 2.1195 (2.2430) loss 4.0309 (3.5176) grad_norm 1.4189 (1.4870) [2022-01-22 06:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][430/1251] eta 0:30:41 lr 0.000524 time 2.5012 (2.2425) loss 3.2311 (3.5131) grad_norm 1.5058 (1.4860) [2022-01-22 06:59:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][440/1251] eta 0:30:16 lr 0.000524 time 1.9013 (2.2400) loss 3.9267 (3.5164) grad_norm 1.7528 (1.4868) [2022-01-22 06:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][450/1251] eta 0:29:53 lr 0.000524 time 2.5679 (2.2391) loss 3.2470 (3.5205) grad_norm 1.3470 (1.4863) [2022-01-22 06:59:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][460/1251] eta 0:29:28 lr 0.000524 time 1.9323 (2.2352) loss 3.8080 (3.5167) grad_norm 1.6052 (1.4854) [2022-01-22 07:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][470/1251] eta 0:29:02 lr 0.000524 time 2.0108 (2.2310) loss 3.7606 (3.5205) grad_norm 1.4771 (1.4854) [2022-01-22 07:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][480/1251] eta 0:28:39 lr 0.000524 time 1.9037 (2.2300) loss 3.3123 (3.5245) grad_norm 1.4568 (1.4862) [2022-01-22 07:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][490/1251] eta 0:28:16 lr 0.000524 time 2.7393 (2.2290) loss 3.4750 (3.5273) grad_norm 1.3545 (1.4842) [2022-01-22 07:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][500/1251] eta 0:27:54 lr 0.000524 time 2.3680 (2.2302) loss 2.4406 (3.5308) grad_norm 1.2761 (1.4826) [2022-01-22 07:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][510/1251] eta 0:27:32 lr 0.000524 time 2.1625 (2.2305) loss 2.5578 (3.5312) grad_norm 1.3395 (1.4829) [2022-01-22 07:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][520/1251] eta 0:27:09 lr 0.000524 time 1.9516 (2.2293) loss 3.3171 (3.5305) grad_norm 1.3114 (1.4812) [2022-01-22 07:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][530/1251] eta 0:26:44 lr 0.000524 time 1.6167 (2.2256) loss 4.1123 (3.5311) grad_norm 1.3894 (1.4799) [2022-01-22 07:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][540/1251] eta 0:26:23 lr 
0.000523 time 2.4523 (2.2268) loss 3.5871 (3.5376) grad_norm 1.3490 (1.4800) [2022-01-22 07:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][550/1251] eta 0:25:58 lr 0.000523 time 1.8310 (2.2227) loss 3.2745 (3.5375) grad_norm 1.3903 (1.4778) [2022-01-22 07:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][560/1251] eta 0:25:33 lr 0.000523 time 2.3756 (2.2199) loss 3.4781 (3.5368) grad_norm 1.4075 (1.4775) [2022-01-22 07:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][570/1251] eta 0:25:09 lr 0.000523 time 2.0893 (2.2167) loss 3.6965 (3.5358) grad_norm 1.8630 (1.4793) [2022-01-22 07:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][580/1251] eta 0:24:47 lr 0.000523 time 2.2406 (2.2163) loss 2.4310 (3.5359) grad_norm 1.3175 (1.4791) [2022-01-22 07:04:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][590/1251] eta 0:24:25 lr 0.000523 time 2.4788 (2.2175) loss 2.8521 (3.5362) grad_norm 1.5863 (1.4792) [2022-01-22 07:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][600/1251] eta 0:24:03 lr 0.000523 time 2.0659 (2.2169) loss 4.1710 (3.5392) grad_norm 1.5576 (1.4809) [2022-01-22 07:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][610/1251] eta 0:23:40 lr 0.000523 time 2.4527 (2.2163) loss 3.7466 (3.5347) grad_norm 1.5014 (1.4803) [2022-01-22 07:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][620/1251] eta 0:23:20 lr 0.000523 time 2.7957 (2.2201) loss 3.5334 (3.5337) grad_norm 1.4664 (1.4796) [2022-01-22 07:06:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][630/1251] eta 0:23:01 lr 0.000523 time 3.6436 (2.2243) loss 4.3444 (3.5335) grad_norm 1.7279 (1.4783) [2022-01-22 07:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][640/1251] eta 0:22:39 lr 0.000523 time 2.2004 (2.2245) loss 3.6809 (3.5366) grad_norm 1.3811 (1.4774) [2022-01-22 07:06:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][650/1251] eta 0:22:16 lr 0.000523 time 2.2427 (2.2238) loss 3.8223 (3.5414) grad_norm 1.6788 (1.4786) [2022-01-22 07:07:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][660/1251] eta 0:21:52 lr 0.000523 time 1.8977 (2.2203) loss 3.6321 (3.5440) grad_norm 1.3310 (1.4791) [2022-01-22 07:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][670/1251] eta 0:21:29 lr 0.000523 time 1.8348 (2.2190) loss 2.8109 (3.5445) grad_norm 1.3733 (1.4794) [2022-01-22 07:07:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][680/1251] eta 0:21:06 lr 0.000523 time 1.9825 (2.2176) loss 3.8743 (3.5392) grad_norm 1.3057 (1.4792) [2022-01-22 07:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][690/1251] eta 0:20:43 lr 0.000523 time 1.9298 (2.2161) loss 2.9813 (3.5367) grad_norm 1.5505 (1.4793) [2022-01-22 07:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][700/1251] eta 0:20:21 lr 0.000523 time 2.3593 (2.2163) loss 2.4205 (3.5379) grad_norm 1.7715 (1.4793) [2022-01-22 07:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][710/1251] eta 0:20:01 lr 0.000523 time 2.6720 (2.2201) loss 3.9086 (3.5366) grad_norm 1.4485 (1.4798) [2022-01-22 07:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][720/1251] eta 0:19:39 lr 0.000523 time 1.9005 (2.2205) loss 3.4724 (3.5341) grad_norm 1.4711 (1.4797) [2022-01-22 07:09:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][730/1251] eta 0:19:16 lr 0.000523 time 1.9483 (2.2191) loss 3.7368 (3.5342) grad_norm 1.4432 (1.4803) [2022-01-22 07:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][740/1251] eta 0:18:53 lr 0.000523 time 2.1805 (2.2176) loss 4.3155 (3.5372) grad_norm 1.3723 (1.4807) [2022-01-22 07:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][750/1251] eta 0:18:31 lr 
0.000523 time 3.1880 (2.2183) loss 3.7478 (3.5366) grad_norm 1.5096 (1.4812) [2022-01-22 07:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][760/1251] eta 0:18:07 lr 0.000523 time 1.7108 (2.2151) loss 3.8518 (3.5360) grad_norm 1.7423 (1.4828) [2022-01-22 07:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][770/1251] eta 0:17:45 lr 0.000523 time 2.1991 (2.2150) loss 2.5939 (3.5382) grad_norm 1.7165 (1.4824) [2022-01-22 07:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][780/1251] eta 0:17:23 lr 0.000522 time 2.3497 (2.2161) loss 3.2325 (3.5378) grad_norm 1.4768 (1.4819) [2022-01-22 07:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][790/1251] eta 0:17:01 lr 0.000522 time 2.1324 (2.2149) loss 3.6277 (3.5366) grad_norm 1.4242 (1.4816) [2022-01-22 07:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][800/1251] eta 0:16:38 lr 0.000522 time 1.6538 (2.2139) loss 3.5916 (3.5371) grad_norm 1.5177 (1.4817) [2022-01-22 07:12:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][810/1251] eta 0:16:15 lr 0.000522 time 1.6840 (2.2119) loss 3.9407 (3.5356) grad_norm 1.3691 (1.4837) [2022-01-22 07:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][820/1251] eta 0:15:52 lr 0.000522 time 1.9381 (2.2092) loss 3.4747 (3.5359) grad_norm 1.6254 (1.4834) [2022-01-22 07:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][830/1251] eta 0:15:29 lr 0.000522 time 1.7192 (2.2080) loss 3.2965 (3.5347) grad_norm 1.5148 (1.4834) [2022-01-22 07:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][840/1251] eta 0:15:07 lr 0.000522 time 2.0890 (2.2072) loss 3.6400 (3.5326) grad_norm 1.5563 (1.4828) [2022-01-22 07:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][850/1251] eta 0:14:45 lr 0.000522 time 2.5193 (2.2070) loss 3.2400 (3.5329) grad_norm 1.4853 (1.4822) [2022-01-22 07:14:20 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][860/1251] eta 0:14:23 lr 0.000522 time 1.7818 (2.2088) loss 4.0382 (3.5342) grad_norm 1.5604 (1.4821) [2022-01-22 07:14:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][870/1251] eta 0:14:02 lr 0.000522 time 2.2839 (2.2101) loss 3.9219 (3.5362) grad_norm 1.3423 (1.4819) [2022-01-22 07:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][880/1251] eta 0:13:41 lr 0.000522 time 3.4489 (2.2130) loss 3.5611 (3.5399) grad_norm 1.5222 (1.4821) [2022-01-22 07:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][890/1251] eta 0:13:18 lr 0.000522 time 2.0187 (2.2130) loss 3.6219 (3.5402) grad_norm 1.3576 (1.4815) [2022-01-22 07:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][900/1251] eta 0:12:56 lr 0.000522 time 2.2425 (2.2127) loss 3.0725 (3.5398) grad_norm 1.5760 (1.4821) [2022-01-22 07:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][910/1251] eta 0:12:33 lr 0.000522 time 2.7161 (2.2107) loss 2.7178 (3.5426) grad_norm 1.3886 (1.4821) [2022-01-22 07:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][920/1251] eta 0:12:11 lr 0.000522 time 2.3307 (2.2099) loss 3.8528 (3.5435) grad_norm 1.4436 (1.4819) [2022-01-22 07:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][930/1251] eta 0:11:49 lr 0.000522 time 1.8473 (2.2091) loss 3.6384 (3.5420) grad_norm 1.3546 (1.4814) [2022-01-22 07:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][940/1251] eta 0:11:27 lr 0.000522 time 2.1827 (2.2099) loss 3.3773 (3.5425) grad_norm 1.8613 (1.4816) [2022-01-22 07:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][950/1251] eta 0:11:05 lr 0.000522 time 3.2021 (2.2104) loss 2.4498 (3.5420) grad_norm 1.3067 (1.4809) [2022-01-22 07:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][960/1251] eta 0:10:43 lr 
0.000522 time 2.4292 (2.2115) loss 2.7800 (3.5415) grad_norm 1.3917 (1.4815) [2022-01-22 07:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][970/1251] eta 0:10:21 lr 0.000522 time 1.8592 (2.2106) loss 3.0890 (3.5416) grad_norm 1.6114 (1.4814) [2022-01-22 07:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][980/1251] eta 0:09:58 lr 0.000522 time 1.9579 (2.2086) loss 4.2107 (3.5421) grad_norm 1.6531 (1.4821) [2022-01-22 07:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][990/1251] eta 0:09:35 lr 0.000522 time 2.1526 (2.2063) loss 2.5469 (3.5423) grad_norm 1.3800 (1.4817) [2022-01-22 07:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1000/1251] eta 0:09:13 lr 0.000522 time 2.5603 (2.2064) loss 3.9945 (3.5432) grad_norm 1.4483 (1.4816) [2022-01-22 07:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1010/1251] eta 0:08:51 lr 0.000522 time 2.5200 (2.2068) loss 2.5093 (3.5424) grad_norm 1.4855 (1.4815) [2022-01-22 07:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1020/1251] eta 0:08:30 lr 0.000522 time 2.2365 (2.2078) loss 4.1515 (3.5429) grad_norm 1.4223 (1.4818) [2022-01-22 07:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1030/1251] eta 0:08:08 lr 0.000521 time 1.8400 (2.2094) loss 3.9353 (3.5431) grad_norm 1.5389 (1.4812) [2022-01-22 07:20:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1040/1251] eta 0:07:45 lr 0.000521 time 2.1472 (2.2076) loss 2.6457 (3.5420) grad_norm 1.8524 (1.4818) [2022-01-22 07:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1050/1251] eta 0:07:23 lr 0.000521 time 1.8294 (2.2059) loss 4.0047 (3.5433) grad_norm 1.4888 (1.4818) [2022-01-22 07:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1060/1251] eta 0:07:01 lr 0.000521 time 1.9910 (2.2043) loss 4.0445 (3.5447) grad_norm 1.5964 (1.4815) [2022-01-22 
07:22:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1070/1251] eta 0:06:39 lr 0.000521 time 2.4994 (2.2051) loss 3.2997 (3.5411) grad_norm 1.9248 (1.4815) [2022-01-22 07:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1080/1251] eta 0:06:17 lr 0.000521 time 2.5264 (2.2054) loss 4.0992 (3.5400) grad_norm 1.5776 (1.4818) [2022-01-22 07:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1090/1251] eta 0:05:54 lr 0.000521 time 2.5386 (2.2045) loss 4.1396 (3.5431) grad_norm 1.4270 (1.4816) [2022-01-22 07:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1100/1251] eta 0:05:32 lr 0.000521 time 2.2787 (2.2047) loss 2.3326 (3.5420) grad_norm 1.3138 (1.4815) [2022-01-22 07:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1110/1251] eta 0:05:11 lr 0.000521 time 3.0446 (2.2059) loss 3.7664 (3.5433) grad_norm 1.3212 (1.4805) [2022-01-22 07:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1120/1251] eta 0:04:48 lr 0.000521 time 1.8862 (2.2055) loss 2.8627 (3.5417) grad_norm 1.5583 (1.4806) [2022-01-22 07:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1130/1251] eta 0:04:26 lr 0.000521 time 2.5203 (2.2057) loss 3.4040 (3.5425) grad_norm 1.5344 (1.4803) [2022-01-22 07:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1140/1251] eta 0:04:04 lr 0.000521 time 1.7580 (2.2045) loss 4.2172 (3.5429) grad_norm 1.5229 (1.4805) [2022-01-22 07:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1150/1251] eta 0:03:42 lr 0.000521 time 1.9769 (2.2050) loss 3.5260 (3.5425) grad_norm 1.4145 (1.4799) [2022-01-22 07:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1160/1251] eta 0:03:20 lr 0.000521 time 1.8305 (2.2050) loss 3.2841 (3.5422) grad_norm 1.5776 (1.4796) [2022-01-22 07:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1170/1251] 
eta 0:02:58 lr 0.000521 time 1.8231 (2.2044) loss 4.2703 (3.5439) grad_norm 1.3997 (1.4794) [2022-01-22 07:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1180/1251] eta 0:02:36 lr 0.000521 time 1.7103 (2.2029) loss 3.6437 (3.5445) grad_norm 1.4241 (1.4798) [2022-01-22 07:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1190/1251] eta 0:02:14 lr 0.000521 time 1.8916 (2.2044) loss 3.7892 (3.5448) grad_norm 1.4413 (1.4810) [2022-01-22 07:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1200/1251] eta 0:01:52 lr 0.000521 time 1.7839 (2.2034) loss 4.3441 (3.5433) grad_norm 1.7553 (1.4814) [2022-01-22 07:27:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1210/1251] eta 0:01:30 lr 0.000521 time 1.9228 (2.2015) loss 3.6518 (3.5432) grad_norm 1.4746 (1.4819) [2022-01-22 07:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1220/1251] eta 0:01:08 lr 0.000521 time 2.1143 (2.2006) loss 4.1238 (3.5442) grad_norm 1.3935 (1.4824) [2022-01-22 07:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1230/1251] eta 0:00:46 lr 0.000521 time 2.3948 (2.1998) loss 4.1461 (3.5443) grad_norm 1.4017 (1.4823) [2022-01-22 07:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1240/1251] eta 0:00:24 lr 0.000521 time 1.2991 (2.1986) loss 2.5875 (3.5445) grad_norm 1.6919 (1.4829) [2022-01-22 07:28:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [146/300][1250/1251] eta 0:00:02 lr 0.000521 time 1.1934 (2.1930) loss 3.6267 (3.5437) grad_norm 1.3022 (1.4825) [2022-01-22 07:28:22 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 146 training takes 0:45:43 [2022-01-22 07:28:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.275 (18.275) Loss 1.0373 (1.0373) Acc@1 76.367 (76.367) Acc@5 93.555 (93.555) [2022-01-22 07:29:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.975 (3.543) 
Loss 0.9938 (1.0380) Acc@1 77.539 (76.021) Acc@5 94.141 (93.510) [2022-01-22 07:29:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.314 (2.673) Loss 1.0863 (1.0593) Acc@1 73.926 (75.521) Acc@5 93.262 (93.206) [2022-01-22 07:29:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.961 (2.273) Loss 1.1008 (1.0579) Acc@1 74.902 (75.621) Acc@5 93.164 (93.177) [2022-01-22 07:29:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.764 (2.127) Loss 1.0840 (1.0615) Acc@1 75.879 (75.603) Acc@5 92.090 (93.074) [2022-01-22 07:29:57 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.706 Acc@5 93.098 [2022-01-22 07:29:57 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.7% [2022-01-22 07:29:57 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 75.71% [2022-01-22 07:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][0/1251] eta 7:33:03 lr 0.000521 time 21.7298 (21.7298) loss 3.5780 (3.5780) grad_norm 1.4563 (1.4563) [2022-01-22 07:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][10/1251] eta 1:27:19 lr 0.000521 time 2.8702 (4.2223) loss 3.0455 (3.2807) grad_norm 1.3920 (1.4099) [2022-01-22 07:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][20/1251] eta 1:07:48 lr 0.000520 time 2.1782 (3.3047) loss 3.4598 (3.4096) grad_norm 1.4720 (1.4475) [2022-01-22 07:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][30/1251] eta 1:00:41 lr 0.000520 time 1.8350 (2.9821) loss 3.3781 (3.4690) grad_norm 1.3694 (1.4488) [2022-01-22 07:31:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][40/1251] eta 0:57:19 lr 0.000520 time 3.4979 (2.8405) loss 3.9468 (3.4448) grad_norm 1.3388 (1.4589) [2022-01-22 07:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][50/1251] eta 0:54:08 lr 0.000520 time 1.9109 (2.7047) loss 2.9368 (3.4407) 
grad_norm 1.6441 (1.4462) [2022-01-22 07:32:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][60/1251] eta 0:51:17 lr 0.000520 time 1.6390 (2.5842) loss 2.6941 (3.3957) grad_norm 1.3943 (1.4537) [2022-01-22 07:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][70/1251] eta 0:49:18 lr 0.000520 time 1.8666 (2.5054) loss 4.5168 (3.4644) grad_norm 1.4740 (1.4762) [2022-01-22 07:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][80/1251] eta 0:48:13 lr 0.000520 time 2.3237 (2.4710) loss 3.8192 (3.4856) grad_norm 1.5979 (1.4716) [2022-01-22 07:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][90/1251] eta 0:47:24 lr 0.000520 time 2.3469 (2.4499) loss 3.8239 (3.4774) grad_norm 1.3568 (1.4870) [2022-01-22 07:34:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][100/1251] eta 0:46:21 lr 0.000520 time 1.5151 (2.4168) loss 3.8489 (3.5089) grad_norm 1.6212 (1.4884) [2022-01-22 07:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][110/1251] eta 0:45:28 lr 0.000520 time 1.7296 (2.3909) loss 3.3235 (3.5021) grad_norm 1.2884 (1.4876) [2022-01-22 07:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][120/1251] eta 0:44:59 lr 0.000520 time 2.3053 (2.3870) loss 2.9310 (3.4850) grad_norm 1.4852 (1.4918) [2022-01-22 07:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][130/1251] eta 0:44:16 lr 0.000520 time 2.9414 (2.3702) loss 3.1527 (3.4950) grad_norm 1.3749 (1.4839) [2022-01-22 07:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][140/1251] eta 0:43:27 lr 0.000520 time 1.5660 (2.3474) loss 3.8516 (3.4840) grad_norm 1.4158 (1.4842) [2022-01-22 07:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][150/1251] eta 0:42:56 lr 0.000520 time 2.0066 (2.3398) loss 3.3722 (3.4749) grad_norm 1.3999 (1.4866) [2022-01-22 07:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[147/300][160/1251] eta 0:42:24 lr 0.000520 time 2.1841 (2.3327) loss 2.9482 (3.4889) grad_norm 1.4248 (1.4877) [2022-01-22 07:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][170/1251] eta 0:41:59 lr 0.000520 time 3.7656 (2.3306) loss 3.5292 (3.4920) grad_norm 1.4483 (1.4885) [2022-01-22 07:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][180/1251] eta 0:41:27 lr 0.000520 time 1.9587 (2.3222) loss 4.1349 (3.5035) grad_norm 1.3403 (1.4838) [2022-01-22 07:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][190/1251] eta 0:40:57 lr 0.000520 time 1.8648 (2.3163) loss 3.7060 (3.5001) grad_norm 1.4667 (1.4869) [2022-01-22 07:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][200/1251] eta 0:40:23 lr 0.000520 time 1.9619 (2.3057) loss 3.1272 (3.4938) grad_norm 1.6242 (1.4915) [2022-01-22 07:38:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][210/1251] eta 0:39:51 lr 0.000520 time 2.9265 (2.2969) loss 3.7806 (3.4909) grad_norm 1.5697 (1.4929) [2022-01-22 07:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][220/1251] eta 0:39:16 lr 0.000520 time 1.9451 (2.2860) loss 3.3759 (3.4808) grad_norm 1.5239 (1.4936) [2022-01-22 07:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][230/1251] eta 0:38:47 lr 0.000520 time 2.1258 (2.2795) loss 3.4272 (3.4895) grad_norm 1.4887 (1.4939) [2022-01-22 07:39:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][240/1251] eta 0:38:22 lr 0.000520 time 1.9524 (2.2773) loss 3.2515 (3.4925) grad_norm 1.5218 (1.4935) [2022-01-22 07:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][250/1251] eta 0:37:54 lr 0.000520 time 2.5072 (2.2724) loss 3.2358 (3.4946) grad_norm 1.5812 (1.4978) [2022-01-22 07:39:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][260/1251] eta 0:37:28 lr 0.000519 time 2.4683 (2.2689) loss 3.5724 (3.4863) grad_norm 
1.5069 (1.5000) [2022-01-22 07:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][270/1251] eta 0:37:06 lr 0.000519 time 2.7713 (2.2698) loss 3.2940 (3.4893) grad_norm 1.3765 (1.5027) [2022-01-22 07:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][280/1251] eta 0:36:44 lr 0.000519 time 2.2449 (2.2704) loss 3.1232 (3.4757) grad_norm 1.6606 (1.5012) [2022-01-22 07:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][290/1251] eta 0:36:21 lr 0.000519 time 2.9066 (2.2703) loss 2.2715 (3.4772) grad_norm 1.3468 (1.5007) [2022-01-22 07:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][300/1251] eta 0:35:58 lr 0.000519 time 2.5942 (2.2695) loss 3.4199 (3.4863) grad_norm 1.3810 (1.4979) [2022-01-22 07:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][310/1251] eta 0:35:32 lr 0.000519 time 2.0487 (2.2658) loss 4.1420 (3.4923) grad_norm 1.4631 (1.4959) [2022-01-22 07:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][320/1251] eta 0:35:05 lr 0.000519 time 2.5195 (2.2618) loss 4.0676 (3.4880) grad_norm 1.4247 (1.4942) [2022-01-22 07:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][330/1251] eta 0:34:43 lr 0.000519 time 2.6088 (2.2625) loss 3.2901 (3.4833) grad_norm 1.3776 (1.4916) [2022-01-22 07:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][340/1251] eta 0:34:15 lr 0.000519 time 1.6126 (2.2559) loss 2.6721 (3.4840) grad_norm 1.4785 (1.4910) [2022-01-22 07:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][350/1251] eta 0:33:50 lr 0.000519 time 1.7814 (2.2532) loss 3.2190 (3.4837) grad_norm 1.3109 (1.4884) [2022-01-22 07:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][360/1251] eta 0:33:26 lr 0.000519 time 2.1692 (2.2521) loss 4.0373 (3.4876) grad_norm 1.4110 (1.4875) [2022-01-22 07:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[147/300][370/1251] eta 0:33:03 lr 0.000519 time 2.2822 (2.2509) loss 3.7855 (3.4804) grad_norm 1.4009 (1.4867) [2022-01-22 07:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][380/1251] eta 0:32:34 lr 0.000519 time 1.6250 (2.2445) loss 3.7411 (3.4843) grad_norm 1.4904 (1.4863) [2022-01-22 07:44:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][390/1251] eta 0:32:10 lr 0.000519 time 2.1714 (2.2417) loss 3.6966 (3.4797) grad_norm 1.4957 (1.4876) [2022-01-22 07:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][400/1251] eta 0:31:45 lr 0.000519 time 2.7751 (2.2387) loss 2.5933 (3.4804) grad_norm 1.7048 (1.4879) [2022-01-22 07:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][410/1251] eta 0:31:22 lr 0.000519 time 2.5840 (2.2383) loss 4.3918 (3.4859) grad_norm 1.8082 (1.4874) [2022-01-22 07:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][420/1251] eta 0:30:56 lr 0.000519 time 1.8932 (2.2341) loss 3.7668 (3.4932) grad_norm 1.4339 (1.4859) [2022-01-22 07:46:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][430/1251] eta 0:30:38 lr 0.000519 time 2.2981 (2.2394) loss 3.7630 (3.4996) grad_norm 1.5049 (1.4864) [2022-01-22 07:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][440/1251] eta 0:30:25 lr 0.000519 time 4.8734 (2.2505) loss 3.8622 (3.5056) grad_norm 1.5234 (1.4861) [2022-01-22 07:46:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][450/1251] eta 0:30:00 lr 0.000519 time 2.1881 (2.2476) loss 3.9980 (3.5067) grad_norm 1.6297 (1.4859) [2022-01-22 07:47:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][460/1251] eta 0:29:34 lr 0.000519 time 1.6366 (2.2433) loss 3.6928 (3.5093) grad_norm 1.5623 (1.4856) [2022-01-22 07:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][470/1251] eta 0:29:08 lr 0.000519 time 1.9030 (2.2392) loss 2.5781 (3.5054) grad_norm 
1.5351 (1.4855) [2022-01-22 07:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][480/1251] eta 0:28:47 lr 0.000519 time 4.1354 (2.2406) loss 2.6143 (3.5011) grad_norm 1.5980 (1.4857) [2022-01-22 07:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][490/1251] eta 0:28:28 lr 0.000519 time 2.4585 (2.2445) loss 3.8111 (3.5022) grad_norm 1.3998 (1.4883) [2022-01-22 07:48:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][500/1251] eta 0:28:04 lr 0.000518 time 1.6255 (2.2436) loss 3.7210 (3.4994) grad_norm 1.3456 (1.4887) [2022-01-22 07:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][510/1251] eta 0:27:40 lr 0.000518 time 2.2745 (2.2402) loss 3.8644 (3.5037) grad_norm 1.6441 (1.4894) [2022-01-22 07:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][520/1251] eta 0:27:14 lr 0.000518 time 2.8389 (2.2357) loss 4.2239 (3.5032) grad_norm 1.4904 (1.4893) [2022-01-22 07:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][530/1251] eta 0:26:50 lr 0.000518 time 1.9612 (2.2332) loss 4.0479 (3.5021) grad_norm 2.0196 (1.4909) [2022-01-22 07:50:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][540/1251] eta 0:26:25 lr 0.000518 time 1.9303 (2.2293) loss 3.9156 (3.5033) grad_norm 1.2535 (1.4937) [2022-01-22 07:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][550/1251] eta 0:26:02 lr 0.000518 time 3.0812 (2.2285) loss 3.7977 (3.5072) grad_norm 1.5564 (1.4932) [2022-01-22 07:50:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][560/1251] eta 0:25:39 lr 0.000518 time 1.5488 (2.2284) loss 3.9709 (3.5118) grad_norm 1.3633 (1.4957) [2022-01-22 07:51:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][570/1251] eta 0:25:16 lr 0.000518 time 2.2891 (2.2274) loss 4.3641 (3.5130) grad_norm 1.4187 (1.4954) [2022-01-22 07:51:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[147/300][580/1251] eta 0:24:53 lr 0.000518 time 1.8776 (2.2260) loss 3.7496 (3.5106) grad_norm 1.3502 (1.4942) [2022-01-22 07:51:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][590/1251] eta 0:24:32 lr 0.000518 time 3.5510 (2.2275) loss 3.5999 (3.5129) grad_norm 1.4300 (1.4944) [2022-01-22 07:52:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][600/1251] eta 0:24:07 lr 0.000518 time 1.3467 (2.2239) loss 2.8239 (3.5126) grad_norm 1.2893 (1.4940) [2022-01-22 07:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][610/1251] eta 0:23:46 lr 0.000518 time 3.9776 (2.2260) loss 2.5758 (3.5143) grad_norm 1.4111 (1.4957) [2022-01-22 07:53:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][620/1251] eta 0:23:25 lr 0.000518 time 1.5562 (2.2267) loss 3.2725 (3.5187) grad_norm 1.5499 (1.4943) [2022-01-22 07:53:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][630/1251] eta 0:23:04 lr 0.000518 time 3.1805 (2.2289) loss 4.0437 (3.5211) grad_norm 1.4789 (1.4942) [2022-01-22 07:53:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][640/1251] eta 0:22:41 lr 0.000518 time 1.5135 (2.2281) loss 3.5467 (3.5225) grad_norm 1.3410 (1.4941) [2022-01-22 07:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][650/1251] eta 0:22:20 lr 0.000518 time 3.3529 (2.2311) loss 2.5265 (3.5181) grad_norm 1.4378 (1.4939) [2022-01-22 07:54:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][660/1251] eta 0:21:58 lr 0.000518 time 1.8573 (2.2309) loss 3.5020 (3.5186) grad_norm 1.5382 (1.4943) [2022-01-22 07:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][670/1251] eta 0:21:35 lr 0.000518 time 2.8566 (2.2303) loss 4.2115 (3.5198) grad_norm 1.5935 (1.4948) [2022-01-22 07:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][680/1251] eta 0:21:11 lr 0.000518 time 2.2771 (2.2262) loss 2.8750 (3.5165) grad_norm 
1.9367 (1.4955) [2022-01-22 07:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][690/1251] eta 0:20:46 lr 0.000518 time 1.7778 (2.2215) loss 3.6843 (3.5126) grad_norm 1.6914 (1.4959) [2022-01-22 07:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][700/1251] eta 0:20:22 lr 0.000518 time 2.1697 (2.2184) loss 4.1011 (3.5116) grad_norm 1.7336 (1.4957) [2022-01-22 07:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][710/1251] eta 0:19:59 lr 0.000518 time 1.9785 (2.2172) loss 2.5212 (3.5140) grad_norm 1.7832 (1.4953) [2022-01-22 07:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][720/1251] eta 0:19:37 lr 0.000518 time 2.8666 (2.2170) loss 3.6666 (3.5179) grad_norm 1.6919 (1.4967) [2022-01-22 07:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][730/1251] eta 0:19:16 lr 0.000518 time 1.8988 (2.2193) loss 3.8102 (3.5194) grad_norm 1.3246 (1.4966) [2022-01-22 07:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][740/1251] eta 0:18:53 lr 0.000517 time 2.4883 (2.2183) loss 4.3264 (3.5229) grad_norm 1.6275 (1.4985) [2022-01-22 07:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][750/1251] eta 0:18:31 lr 0.000517 time 2.2042 (2.2192) loss 3.2867 (3.5282) grad_norm 1.4420 (1.5001) [2022-01-22 07:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][760/1251] eta 0:18:10 lr 0.000517 time 2.4938 (2.2219) loss 3.9357 (3.5313) grad_norm 1.4334 (1.5003) [2022-01-22 07:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][770/1251] eta 0:17:49 lr 0.000517 time 2.2241 (2.2228) loss 3.9373 (3.5317) grad_norm 1.8070 (1.5008) [2022-01-22 07:58:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][780/1251] eta 0:17:27 lr 0.000517 time 2.5250 (2.2235) loss 2.3921 (3.5312) grad_norm 1.3380 (1.5011) [2022-01-22 07:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[147/300][790/1251] eta 0:17:04 lr 0.000517 time 1.9142 (2.2218) loss 2.9142 (3.5322) grad_norm 1.3632 (1.5014) [2022-01-22 07:59:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][800/1251] eta 0:16:40 lr 0.000517 time 2.1912 (2.2187) loss 3.9258 (3.5318) grad_norm 1.4963 (1.5010) [2022-01-22 07:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][810/1251] eta 0:16:17 lr 0.000517 time 1.9151 (2.2155) loss 3.4182 (3.5330) grad_norm 1.4282 (1.5020) [2022-01-22 08:00:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][820/1251] eta 0:15:54 lr 0.000517 time 2.2725 (2.2145) loss 4.6176 (3.5384) grad_norm 1.7980 (1.5024) [2022-01-22 08:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][830/1251] eta 0:15:32 lr 0.000517 time 2.8476 (2.2159) loss 3.7794 (3.5360) grad_norm 1.5476 (1.5032) [2022-01-22 08:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][840/1251] eta 0:15:10 lr 0.000517 time 2.1668 (2.2152) loss 4.0406 (3.5367) grad_norm 1.4622 (1.5026) [2022-01-22 08:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][850/1251] eta 0:14:49 lr 0.000517 time 2.8002 (2.2183) loss 3.7503 (3.5376) grad_norm 1.4874 (1.5020) [2022-01-22 08:01:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][860/1251] eta 0:14:27 lr 0.000517 time 1.9787 (2.2194) loss 3.4194 (3.5382) grad_norm 1.3248 (1.5019) [2022-01-22 08:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][870/1251] eta 0:14:04 lr 0.000517 time 1.9030 (2.2177) loss 3.5628 (3.5378) grad_norm 1.3150 (1.5005) [2022-01-22 08:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][880/1251] eta 0:13:42 lr 0.000517 time 2.3159 (2.2156) loss 4.2530 (3.5405) grad_norm 1.5928 (1.5003) [2022-01-22 08:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][890/1251] eta 0:13:18 lr 0.000517 time 1.9380 (2.2133) loss 3.2471 (3.5404) grad_norm 
1.4107 (1.4997) [2022-01-22 08:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][900/1251] eta 0:12:56 lr 0.000517 time 1.9525 (2.2131) loss 3.7958 (3.5399) grad_norm 1.6230 (1.4987) [2022-01-22 08:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][910/1251] eta 0:12:34 lr 0.000517 time 2.2327 (2.2128) loss 3.8450 (3.5396) grad_norm 1.3957 (1.4972) [2022-01-22 08:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][920/1251] eta 0:12:12 lr 0.000517 time 1.8582 (2.2140) loss 3.6352 (3.5387) grad_norm 1.3801 (1.4967) [2022-01-22 08:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][930/1251] eta 0:11:50 lr 0.000517 time 1.7126 (2.2127) loss 2.9930 (3.5373) grad_norm 1.6528 (1.4959) [2022-01-22 08:04:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][940/1251] eta 0:11:27 lr 0.000517 time 1.8180 (2.2122) loss 3.8313 (3.5368) grad_norm 1.5728 (1.4958) [2022-01-22 08:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][950/1251] eta 0:11:06 lr 0.000517 time 3.4932 (2.2127) loss 2.6650 (3.5361) grad_norm 1.5252 (1.4959) [2022-01-22 08:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][960/1251] eta 0:10:44 lr 0.000517 time 2.4663 (2.2135) loss 3.5610 (3.5308) grad_norm 1.4811 (1.4949) [2022-01-22 08:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][970/1251] eta 0:10:21 lr 0.000517 time 1.8237 (2.2133) loss 3.8057 (3.5300) grad_norm 1.4440 (1.4945) [2022-01-22 08:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][980/1251] eta 0:10:00 lr 0.000516 time 2.9555 (2.2146) loss 3.8418 (3.5297) grad_norm 1.5140 (1.4938) [2022-01-22 08:06:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][990/1251] eta 0:09:38 lr 0.000516 time 3.2016 (2.2152) loss 2.4634 (3.5316) grad_norm 1.6031 (1.4934) [2022-01-22 08:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[147/300][1000/1251] eta 0:09:15 lr 0.000516 time 1.6707 (2.2139) loss 3.1460 (3.5316) grad_norm 1.3252 (1.4927) [2022-01-22 08:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1010/1251] eta 0:08:53 lr 0.000516 time 1.9250 (2.2128) loss 2.8459 (3.5330) grad_norm 1.4363 (1.4937) [2022-01-22 08:07:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1020/1251] eta 0:08:30 lr 0.000516 time 2.1009 (2.2111) loss 4.2943 (3.5335) grad_norm 1.4842 (1.4947) [2022-01-22 08:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1030/1251] eta 0:08:08 lr 0.000516 time 3.7202 (2.2117) loss 3.9774 (3.5334) grad_norm 1.5253 (1.4947) [2022-01-22 08:08:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1040/1251] eta 0:07:46 lr 0.000516 time 2.1852 (2.2126) loss 4.2974 (3.5328) grad_norm 1.3859 (1.4941) [2022-01-22 08:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1050/1251] eta 0:07:25 lr 0.000516 time 2.0616 (2.2141) loss 3.7841 (3.5339) grad_norm 1.7234 (1.4942) [2022-01-22 08:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1060/1251] eta 0:07:02 lr 0.000516 time 1.9955 (2.2126) loss 3.3303 (3.5313) grad_norm 1.6582 (1.4938) [2022-01-22 08:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1070/1251] eta 0:06:40 lr 0.000516 time 3.6538 (2.2131) loss 3.7710 (3.5303) grad_norm 1.2787 (1.4930) [2022-01-22 08:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1080/1251] eta 0:06:18 lr 0.000516 time 2.0134 (2.2121) loss 3.2359 (3.5305) grad_norm 1.3864 (1.4934) [2022-01-22 08:10:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1090/1251] eta 0:05:56 lr 0.000516 time 2.5333 (2.2149) loss 3.0159 (3.5307) grad_norm 1.4597 (1.4929) [2022-01-22 08:10:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1100/1251] eta 0:05:34 lr 0.000516 time 2.0012 (2.2146) loss 3.6753 (3.5311) 
grad_norm 1.2945 (1.4917) [2022-01-22 08:10:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1110/1251] eta 0:05:12 lr 0.000516 time 2.2016 (2.2140) loss 4.1464 (3.5312) grad_norm 1.3978 (1.4908) [2022-01-22 08:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1120/1251] eta 0:04:49 lr 0.000516 time 1.6383 (2.2126) loss 2.9060 (3.5303) grad_norm 1.4767 (1.4905) [2022-01-22 08:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1130/1251] eta 0:04:27 lr 0.000516 time 2.5030 (2.2124) loss 4.1459 (3.5298) grad_norm 1.2690 (1.4899) [2022-01-22 08:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1140/1251] eta 0:04:05 lr 0.000516 time 2.0995 (2.2110) loss 4.0957 (3.5295) grad_norm 1.4139 (1.4890) [2022-01-22 08:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1150/1251] eta 0:03:43 lr 0.000516 time 2.5783 (2.2106) loss 3.7413 (3.5309) grad_norm 1.4766 (1.4891) [2022-01-22 08:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1160/1251] eta 0:03:21 lr 0.000516 time 1.5644 (2.2088) loss 3.6083 (3.5321) grad_norm 1.5994 (1.4893) [2022-01-22 08:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1170/1251] eta 0:02:58 lr 0.000516 time 1.9630 (2.2091) loss 4.4895 (3.5343) grad_norm 1.5809 (1.4892) [2022-01-22 08:13:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1180/1251] eta 0:02:36 lr 0.000516 time 1.8872 (2.2099) loss 4.1096 (3.5351) grad_norm 1.4238 (1.4886) [2022-01-22 08:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1190/1251] eta 0:02:14 lr 0.000516 time 2.8398 (2.2101) loss 3.9588 (3.5345) grad_norm 1.4699 (1.4888) [2022-01-22 08:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1200/1251] eta 0:01:52 lr 0.000516 time 2.0520 (2.2089) loss 3.4141 (3.5340) grad_norm 1.5619 (1.4895) [2022-01-22 08:14:32 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [147/300][1210/1251] eta 0:01:30 lr 0.000516 time 1.9021 (2.2094) loss 2.8216 (3.5344) grad_norm 1.4194 (1.4893) [2022-01-22 08:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1220/1251] eta 0:01:08 lr 0.000515 time 1.9020 (2.2086) loss 3.3482 (3.5352) grad_norm 1.5058 (1.4902) [2022-01-22 08:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1230/1251] eta 0:00:46 lr 0.000515 time 2.2382 (2.2081) loss 2.8273 (3.5334) grad_norm 1.3632 (1.4900) [2022-01-22 08:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1240/1251] eta 0:00:24 lr 0.000515 time 1.5406 (2.2071) loss 3.0760 (3.5337) grad_norm 1.4372 (1.4894) [2022-01-22 08:15:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [147/300][1250/1251] eta 0:00:02 lr 0.000515 time 1.1880 (2.2010) loss 3.9091 (3.5357) grad_norm 1.3738 (1.4888) [2022-01-22 08:15:51 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 147 training takes 0:45:53 [2022-01-22 08:16:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.885 (18.885) Loss 1.0742 (1.0742) Acc@1 75.195 (75.195) Acc@5 92.383 (92.383) [2022-01-22 08:16:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.976 (3.544) Loss 1.0150 (1.0199) Acc@1 76.758 (75.977) Acc@5 93.262 (93.510) [2022-01-22 08:16:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.244 (2.602) Loss 1.0977 (1.0269) Acc@1 73.828 (75.902) Acc@5 91.406 (93.308) [2022-01-22 08:17:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.203 (2.307) Loss 1.0628 (1.0347) Acc@1 77.246 (75.973) Acc@5 92.383 (93.126) [2022-01-22 08:17:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.341 (2.141) Loss 1.0619 (1.0387) Acc@1 75.195 (75.912) Acc@5 92.676 (93.107) [2022-01-22 08:17:26 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.996 Acc@5 93.148 [2022-01-22 08:17:26 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 76.0% [2022-01-22 08:17:26 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.00% [2022-01-22 08:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][0/1251] eta 7:42:27 lr 0.000515 time 22.1804 (22.1804) loss 3.3837 (3.3837) grad_norm 1.5331 (1.5331) [2022-01-22 08:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][10/1251] eta 1:24:41 lr 0.000515 time 1.8639 (4.0947) loss 3.6210 (3.6955) grad_norm 1.4473 (1.4957) [2022-01-22 08:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][20/1251] eta 1:03:28 lr 0.000515 time 1.2651 (3.0934) loss 4.1688 (3.7112) grad_norm 1.5645 (1.5185) [2022-01-22 08:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][30/1251] eta 0:57:07 lr 0.000515 time 1.8379 (2.8068) loss 3.9763 (3.6666) grad_norm 1.5796 (1.5134) [2022-01-22 08:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][40/1251] eta 0:53:38 lr 0.000515 time 3.1026 (2.6580) loss 4.0495 (3.6216) grad_norm 1.3830 (1.5307) [2022-01-22 08:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][50/1251] eta 0:51:59 lr 0.000515 time 1.9608 (2.5976) loss 3.5271 (3.6142) grad_norm 1.5323 (1.5079) [2022-01-22 08:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][60/1251] eta 0:49:40 lr 0.000515 time 1.1514 (2.5021) loss 3.0609 (3.5990) grad_norm 1.5044 (1.5019) [2022-01-22 08:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][70/1251] eta 0:48:44 lr 0.000515 time 1.8472 (2.4759) loss 4.0884 (3.5833) grad_norm 1.3950 (1.5079) [2022-01-22 08:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][80/1251] eta 0:47:58 lr 0.000515 time 3.4580 (2.4585) loss 4.0415 (3.5815) grad_norm 1.5095 (1.5195) [2022-01-22 08:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][90/1251] eta 0:47:15 lr 0.000515 time 
2.1500 (2.4424) loss 3.8182 (3.5974) grad_norm 1.6614 (1.5144) [2022-01-22 08:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][100/1251] eta 0:46:15 lr 0.000515 time 1.5110 (2.4118) loss 2.4061 (3.5657) grad_norm 1.6355 (1.5122) [2022-01-22 08:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][110/1251] eta 0:45:33 lr 0.000515 time 1.8942 (2.3961) loss 3.9251 (3.5549) grad_norm 1.2940 (1.5055) [2022-01-22 08:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][120/1251] eta 0:44:47 lr 0.000515 time 2.4451 (2.3759) loss 3.4490 (3.5547) grad_norm 1.6482 (1.5015) [2022-01-22 08:22:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][130/1251] eta 0:44:21 lr 0.000515 time 3.0810 (2.3740) loss 3.5801 (3.5653) grad_norm 1.5919 (1.5013) [2022-01-22 08:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][140/1251] eta 0:43:34 lr 0.000515 time 1.7526 (2.3537) loss 3.3452 (3.5646) grad_norm 1.6130 (1.4946) [2022-01-22 08:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][150/1251] eta 0:42:56 lr 0.000515 time 1.8573 (2.3399) loss 2.3510 (3.5466) grad_norm 1.2340 (1.4878) [2022-01-22 08:23:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][160/1251] eta 0:42:15 lr 0.000515 time 2.0789 (2.3242) loss 2.7701 (3.5265) grad_norm 1.4163 (1.4846) [2022-01-22 08:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][170/1251] eta 0:41:39 lr 0.000515 time 2.1843 (2.3125) loss 3.1778 (3.5210) grad_norm 1.4382 (1.4840) [2022-01-22 08:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][180/1251] eta 0:41:04 lr 0.000515 time 1.8968 (2.3015) loss 3.7121 (3.5286) grad_norm 1.5262 (1.4801) [2022-01-22 08:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][190/1251] eta 0:40:34 lr 0.000515 time 2.3679 (2.2949) loss 3.6281 (3.5280) grad_norm 1.5188 (1.4806) [2022-01-22 08:25:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][200/1251] eta 0:40:02 lr 0.000515 time 2.4252 (2.2858) loss 3.9516 (3.5289) grad_norm 1.5696 (1.4794) [2022-01-22 08:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][210/1251] eta 0:39:40 lr 0.000514 time 1.6125 (2.2870) loss 3.9745 (3.5277) grad_norm 1.4504 (1.4821) [2022-01-22 08:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][220/1251] eta 0:39:20 lr 0.000514 time 2.2323 (2.2898) loss 3.7299 (3.5314) grad_norm 1.3212 (1.4812) [2022-01-22 08:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][230/1251] eta 0:38:56 lr 0.000514 time 1.5440 (2.2883) loss 3.6741 (3.5262) grad_norm 1.8405 (1.4845) [2022-01-22 08:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][240/1251] eta 0:38:25 lr 0.000514 time 1.6202 (2.2804) loss 3.6168 (3.5131) grad_norm 1.3997 (1.4841) [2022-01-22 08:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][250/1251] eta 0:37:54 lr 0.000514 time 1.9306 (2.2718) loss 4.1508 (3.5108) grad_norm 1.5494 (1.4860) [2022-01-22 08:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][260/1251] eta 0:37:22 lr 0.000514 time 1.9453 (2.2631) loss 3.9293 (3.5236) grad_norm 1.8053 (1.4899) [2022-01-22 08:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][270/1251] eta 0:36:54 lr 0.000514 time 1.9499 (2.2570) loss 3.9329 (3.5251) grad_norm 1.4471 (1.4912) [2022-01-22 08:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][280/1251] eta 0:36:23 lr 0.000514 time 1.9037 (2.2486) loss 3.8154 (3.5185) grad_norm 1.3835 (1.4919) [2022-01-22 08:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][290/1251] eta 0:35:58 lr 0.000514 time 2.4742 (2.2462) loss 3.8021 (3.5212) grad_norm 1.3116 (1.4907) [2022-01-22 08:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][300/1251] eta 0:35:34 lr 
0.000514 time 2.3980 (2.2445) loss 4.0487 (3.5114) grad_norm 1.5254 (1.4905) [2022-01-22 08:29:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][310/1251] eta 0:35:12 lr 0.000514 time 1.6085 (2.2448) loss 3.9587 (3.5158) grad_norm 1.6046 (1.4917) [2022-01-22 08:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][320/1251] eta 0:34:49 lr 0.000514 time 2.3309 (2.2439) loss 4.1172 (3.5210) grad_norm 1.5672 (1.4916) [2022-01-22 08:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][330/1251] eta 0:34:27 lr 0.000514 time 2.4949 (2.2449) loss 4.2061 (3.5294) grad_norm 1.5554 (1.4929) [2022-01-22 08:30:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][340/1251] eta 0:34:06 lr 0.000514 time 1.6620 (2.2467) loss 3.8793 (3.5217) grad_norm 1.7174 (1.4925) [2022-01-22 08:30:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][350/1251] eta 0:33:45 lr 0.000514 time 2.2191 (2.2484) loss 3.9730 (3.5308) grad_norm 1.2954 (1.4900) [2022-01-22 08:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][360/1251] eta 0:33:21 lr 0.000514 time 2.5241 (2.2464) loss 4.0545 (3.5252) grad_norm 1.3008 (1.4883) [2022-01-22 08:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][370/1251] eta 0:32:58 lr 0.000514 time 3.3792 (2.2462) loss 3.8336 (3.5188) grad_norm 1.4609 (1.4901) [2022-01-22 08:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][380/1251] eta 0:32:33 lr 0.000514 time 1.8546 (2.2426) loss 3.5863 (3.5273) grad_norm 1.4358 (1.4895) [2022-01-22 08:32:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][390/1251] eta 0:32:09 lr 0.000514 time 2.0296 (2.2407) loss 4.1386 (3.5281) grad_norm 2.0329 (1.4907) [2022-01-22 08:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][400/1251] eta 0:31:44 lr 0.000514 time 1.9796 (2.2379) loss 4.3735 (3.5314) grad_norm 1.7206 (1.4909) [2022-01-22 08:32:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][410/1251] eta 0:31:20 lr 0.000514 time 2.1882 (2.2363) loss 3.9675 (3.5269) grad_norm 1.5757 (1.4910) [2022-01-22 08:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][420/1251] eta 0:31:00 lr 0.000514 time 2.3311 (2.2390) loss 2.8691 (3.5250) grad_norm 1.5148 (1.4897) [2022-01-22 08:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][430/1251] eta 0:30:36 lr 0.000514 time 1.4629 (2.2372) loss 3.0739 (3.5241) grad_norm 1.6958 (1.4897) [2022-01-22 08:33:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][440/1251] eta 0:30:12 lr 0.000514 time 1.5582 (2.2347) loss 3.4119 (3.5237) grad_norm 1.8780 (1.4924) [2022-01-22 08:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][450/1251] eta 0:29:45 lr 0.000514 time 2.0185 (2.2293) loss 4.1447 (3.5191) grad_norm 1.6762 (1.4954) [2022-01-22 08:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][460/1251] eta 0:29:20 lr 0.000513 time 1.6659 (2.2259) loss 4.0186 (3.5235) grad_norm 1.4103 (1.4968) [2022-01-22 08:34:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][470/1251] eta 0:28:57 lr 0.000513 time 1.6056 (2.2243) loss 4.1411 (3.5274) grad_norm 1.5638 (1.4956) [2022-01-22 08:35:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][480/1251] eta 0:28:37 lr 0.000513 time 2.3192 (2.2276) loss 3.9730 (3.5286) grad_norm 1.3795 (1.4972) [2022-01-22 08:35:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][490/1251] eta 0:28:13 lr 0.000513 time 2.1286 (2.2251) loss 3.3877 (3.5286) grad_norm 1.3462 (1.4946) [2022-01-22 08:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][500/1251] eta 0:27:51 lr 0.000513 time 2.2378 (2.2252) loss 3.5462 (3.5315) grad_norm 1.3310 (1.4937) [2022-01-22 08:36:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][510/1251] eta 0:27:26 lr 
0.000513 time 1.5305 (2.2221) loss 3.8850 (3.5329) grad_norm 1.3184 (1.4940) [2022-01-22 08:36:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][520/1251] eta 0:27:01 lr 0.000513 time 1.8495 (2.2177) loss 4.2585 (3.5358) grad_norm 1.3806 (1.4945) [2022-01-22 08:37:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][530/1251] eta 0:26:38 lr 0.000513 time 2.1568 (2.2172) loss 4.0248 (3.5348) grad_norm 1.4433 (1.4926) [2022-01-22 08:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][540/1251] eta 0:26:17 lr 0.000513 time 2.5647 (2.2185) loss 3.5986 (3.5357) grad_norm 1.5241 (1.4949) [2022-01-22 08:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][550/1251] eta 0:25:54 lr 0.000513 time 2.0621 (2.2176) loss 2.5033 (3.5326) grad_norm 1.4426 (1.4943) [2022-01-22 08:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][560/1251] eta 0:25:31 lr 0.000513 time 1.9403 (2.2160) loss 3.5814 (3.5368) grad_norm 1.6245 (1.4945) [2022-01-22 08:38:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][570/1251] eta 0:25:08 lr 0.000513 time 1.9072 (2.2144) loss 4.1246 (3.5397) grad_norm 1.8157 (1.4960) [2022-01-22 08:38:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][580/1251] eta 0:24:45 lr 0.000513 time 2.5779 (2.2137) loss 3.3966 (3.5474) grad_norm 1.2502 (1.4950) [2022-01-22 08:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][590/1251] eta 0:24:23 lr 0.000513 time 1.6120 (2.2139) loss 3.9072 (3.5449) grad_norm 1.3862 (1.4947) [2022-01-22 08:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][600/1251] eta 0:24:01 lr 0.000513 time 1.8795 (2.2137) loss 4.3558 (3.5457) grad_norm 1.5301 (1.4946) [2022-01-22 08:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][610/1251] eta 0:23:39 lr 0.000513 time 2.4172 (2.2153) loss 4.1404 (3.5460) grad_norm 1.4324 (1.4944) [2022-01-22 08:40:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][620/1251] eta 0:23:19 lr 0.000513 time 3.1507 (2.2179) loss 3.8900 (3.5415) grad_norm 1.6941 (1.4949) [2022-01-22 08:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][630/1251] eta 0:22:56 lr 0.000513 time 2.2765 (2.2165) loss 3.8822 (3.5386) grad_norm 1.7524 (1.4956) [2022-01-22 08:41:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][640/1251] eta 0:22:33 lr 0.000513 time 1.6202 (2.2151) loss 3.9931 (3.5378) grad_norm 1.3696 (1.4977) [2022-01-22 08:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][650/1251] eta 0:22:10 lr 0.000513 time 1.9485 (2.2145) loss 4.3853 (3.5450) grad_norm 1.4255 (1.4983) [2022-01-22 08:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][660/1251] eta 0:21:48 lr 0.000513 time 2.4600 (2.2140) loss 2.9981 (3.5412) grad_norm 1.3972 (1.4993) [2022-01-22 08:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][670/1251] eta 0:21:26 lr 0.000513 time 2.1759 (2.2137) loss 3.8564 (3.5393) grad_norm 1.4239 (1.5003) [2022-01-22 08:42:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][680/1251] eta 0:21:04 lr 0.000513 time 1.6122 (2.2139) loss 3.1756 (3.5435) grad_norm 1.5132 (1.5016) [2022-01-22 08:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][690/1251] eta 0:20:40 lr 0.000513 time 2.2397 (2.2115) loss 3.6792 (3.5413) grad_norm 1.5269 (1.5016) [2022-01-22 08:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][700/1251] eta 0:20:18 lr 0.000512 time 2.2160 (2.2113) loss 3.6527 (3.5426) grad_norm 1.5875 (1.5012) [2022-01-22 08:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][710/1251] eta 0:19:56 lr 0.000512 time 2.4911 (2.2121) loss 3.6738 (3.5433) grad_norm 1.4351 (1.5014) [2022-01-22 08:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][720/1251] eta 0:19:34 lr 
0.000512 time 1.8772 (2.2124) loss 3.8086 (3.5445) grad_norm 1.3883 (1.5011) [2022-01-22 08:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][730/1251] eta 0:19:11 lr 0.000512 time 2.8378 (2.2109) loss 3.1469 (3.5440) grad_norm 1.5143 (1.5011) [2022-01-22 08:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][740/1251] eta 0:18:48 lr 0.000512 time 1.9628 (2.2094) loss 2.6127 (3.5409) grad_norm 1.5403 (1.5006) [2022-01-22 08:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][750/1251] eta 0:18:25 lr 0.000512 time 1.8829 (2.2075) loss 3.8723 (3.5403) grad_norm 1.3726 (1.5002) [2022-01-22 08:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][760/1251] eta 0:18:03 lr 0.000512 time 2.1531 (2.2073) loss 4.1599 (3.5391) grad_norm 1.5227 (1.4996) [2022-01-22 08:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][770/1251] eta 0:17:40 lr 0.000512 time 1.5650 (2.2052) loss 4.4022 (3.5388) grad_norm 1.5262 (1.5002) [2022-01-22 08:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][780/1251] eta 0:17:19 lr 0.000512 time 2.4617 (2.2063) loss 3.3258 (3.5365) grad_norm 1.6592 (1.5016) [2022-01-22 08:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][790/1251] eta 0:16:56 lr 0.000512 time 1.4949 (2.2058) loss 3.5715 (3.5390) grad_norm 1.2745 (1.5009) [2022-01-22 08:46:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][800/1251] eta 0:16:35 lr 0.000512 time 2.4244 (2.2068) loss 3.9318 (3.5378) grad_norm 1.5133 (1.5018) [2022-01-22 08:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][810/1251] eta 0:16:12 lr 0.000512 time 1.9790 (2.2063) loss 4.2265 (3.5392) grad_norm 1.4681 (1.5017) [2022-01-22 08:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][820/1251] eta 0:15:51 lr 0.000512 time 2.5954 (2.2079) loss 4.0925 (3.5414) grad_norm 1.6113 (1.5027) [2022-01-22 08:47:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][830/1251] eta 0:15:28 lr 0.000512 time 1.5443 (2.2064) loss 3.6546 (3.5435) grad_norm 1.3770 (1.5017) [2022-01-22 08:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][840/1251] eta 0:15:06 lr 0.000512 time 2.2558 (2.2057) loss 3.0928 (3.5434) grad_norm 1.2762 (1.5007) [2022-01-22 08:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][850/1251] eta 0:14:43 lr 0.000512 time 2.1080 (2.2041) loss 3.8578 (3.5437) grad_norm 1.4176 (1.5001) [2022-01-22 08:49:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][860/1251] eta 0:14:21 lr 0.000512 time 2.4799 (2.2030) loss 3.0702 (3.5467) grad_norm 1.4015 (1.4994) [2022-01-22 08:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][870/1251] eta 0:13:59 lr 0.000512 time 1.6408 (2.2028) loss 3.6301 (3.5453) grad_norm 1.4277 (1.4991) [2022-01-22 08:49:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][880/1251] eta 0:13:37 lr 0.000512 time 2.5967 (2.2041) loss 3.5531 (3.5455) grad_norm 1.5573 (1.4988) [2022-01-22 08:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][890/1251] eta 0:13:15 lr 0.000512 time 1.9862 (2.2027) loss 3.9872 (3.5468) grad_norm 1.3578 (1.4987) [2022-01-22 08:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][900/1251] eta 0:12:52 lr 0.000512 time 2.6261 (2.2016) loss 3.1683 (3.5445) grad_norm 1.4162 (1.4991) [2022-01-22 08:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][910/1251] eta 0:12:30 lr 0.000512 time 1.8845 (2.2006) loss 4.1233 (3.5457) grad_norm 1.3244 (1.4990) [2022-01-22 08:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][920/1251] eta 0:12:08 lr 0.000512 time 1.7162 (2.2006) loss 3.7077 (3.5412) grad_norm 1.4440 (1.4985) [2022-01-22 08:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][930/1251] eta 0:11:46 lr 
0.000512 time 2.5152 (2.2017) loss 3.2962 (3.5422) grad_norm 1.4868 (1.4987) [2022-01-22 08:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][940/1251] eta 0:11:24 lr 0.000511 time 1.9007 (2.2010) loss 4.2514 (3.5449) grad_norm 1.5929 (1.4984) [2022-01-22 08:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][950/1251] eta 0:11:03 lr 0.000511 time 2.4805 (2.2030) loss 3.8293 (3.5433) grad_norm 1.4102 (1.4977) [2022-01-22 08:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][960/1251] eta 0:10:41 lr 0.000511 time 2.5188 (2.2045) loss 4.1378 (3.5448) grad_norm 1.5608 (1.4979) [2022-01-22 08:53:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][970/1251] eta 0:10:19 lr 0.000511 time 1.8421 (2.2049) loss 3.4392 (3.5469) grad_norm 1.4312 (1.4979) [2022-01-22 08:53:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][980/1251] eta 0:09:57 lr 0.000511 time 2.1446 (2.2039) loss 3.9708 (3.5488) grad_norm 1.5858 (1.4971) [2022-01-22 08:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][990/1251] eta 0:09:35 lr 0.000511 time 2.2061 (2.2040) loss 3.0033 (3.5490) grad_norm 1.6703 (1.4973) [2022-01-22 08:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1000/1251] eta 0:09:12 lr 0.000511 time 2.1949 (2.2019) loss 2.1961 (3.5498) grad_norm 1.3464 (1.4966) [2022-01-22 08:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1010/1251] eta 0:08:50 lr 0.000511 time 2.2861 (2.2022) loss 3.8241 (3.5501) grad_norm 1.5572 (1.4967) [2022-01-22 08:54:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1020/1251] eta 0:08:28 lr 0.000511 time 1.5290 (2.2016) loss 3.7595 (3.5515) grad_norm 1.8559 (1.4969) [2022-01-22 08:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1030/1251] eta 0:08:07 lr 0.000511 time 2.4508 (2.2043) loss 4.3672 (3.5537) grad_norm 1.4779 (1.4964) [2022-01-22 
08:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1040/1251] eta 0:07:44 lr 0.000511 time 2.6140 (2.2027) loss 3.0777 (3.5552) grad_norm 1.4669 (1.4962) [2022-01-22 08:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1050/1251] eta 0:07:22 lr 0.000511 time 2.2361 (2.2000) loss 3.6256 (3.5541) grad_norm 1.4955 (1.4965) [2022-01-22 08:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1060/1251] eta 0:06:59 lr 0.000511 time 1.8584 (2.1976) loss 3.1135 (3.5544) grad_norm 1.6867 (1.4975) [2022-01-22 08:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1070/1251] eta 0:06:37 lr 0.000511 time 1.5790 (2.1963) loss 2.4505 (3.5531) grad_norm 1.3477 (1.4965) [2022-01-22 08:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1080/1251] eta 0:06:15 lr 0.000511 time 1.8502 (2.1952) loss 4.2701 (3.5547) grad_norm 1.6436 (1.4964) [2022-01-22 08:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1090/1251] eta 0:05:53 lr 0.000511 time 2.7624 (2.1971) loss 3.9726 (3.5554) grad_norm 1.4179 (1.4957) [2022-01-22 08:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1100/1251] eta 0:05:31 lr 0.000511 time 2.2267 (2.1982) loss 3.8841 (3.5527) grad_norm 1.5671 (1.4960) [2022-01-22 08:58:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1110/1251] eta 0:05:10 lr 0.000511 time 1.8332 (2.2005) loss 4.0449 (3.5541) grad_norm 1.3709 (1.4965) [2022-01-22 08:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1120/1251] eta 0:04:48 lr 0.000511 time 1.8899 (2.2012) loss 3.5229 (3.5540) grad_norm 1.5452 (1.4966) [2022-01-22 08:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1130/1251] eta 0:04:26 lr 0.000511 time 2.1919 (2.2015) loss 3.3908 (3.5545) grad_norm 1.6421 (1.4962) [2022-01-22 08:59:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1140/1251] 
eta 0:04:04 lr 0.000511 time 2.2962 (2.1996) loss 3.4413 (3.5535) grad_norm 1.3997 (1.4963) [2022-01-22 08:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1150/1251] eta 0:03:42 lr 0.000511 time 1.4960 (2.1983) loss 3.3481 (3.5497) grad_norm 1.4026 (1.4961) [2022-01-22 08:59:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1160/1251] eta 0:03:19 lr 0.000511 time 1.9263 (2.1969) loss 3.8576 (3.5493) grad_norm 1.2399 (1.4954) [2022-01-22 09:00:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1170/1251] eta 0:02:57 lr 0.000511 time 1.8987 (2.1971) loss 3.7725 (3.5490) grad_norm 1.4112 (1.4952) [2022-01-22 09:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1180/1251] eta 0:02:36 lr 0.000510 time 2.6190 (2.1973) loss 3.2409 (3.5497) grad_norm 1.4121 (1.4956) [2022-01-22 09:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1190/1251] eta 0:02:13 lr 0.000510 time 1.5097 (2.1966) loss 3.9138 (3.5508) grad_norm 1.4425 (1.4951) [2022-01-22 09:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1200/1251] eta 0:01:52 lr 0.000510 time 2.2482 (2.1974) loss 3.3996 (3.5506) grad_norm 1.3358 (1.4946) [2022-01-22 09:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1210/1251] eta 0:01:30 lr 0.000510 time 2.2522 (2.1972) loss 4.0948 (3.5504) grad_norm 1.5684 (1.4944) [2022-01-22 09:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1220/1251] eta 0:01:08 lr 0.000510 time 2.4636 (2.1968) loss 3.5953 (3.5510) grad_norm 1.4646 (1.4938) [2022-01-22 09:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1230/1251] eta 0:00:46 lr 0.000510 time 2.4649 (2.1985) loss 4.0744 (3.5524) grad_norm 1.4450 (1.4945) [2022-01-22 09:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1240/1251] eta 0:00:24 lr 0.000510 time 1.1740 (2.1974) loss 3.2956 (3.5509) grad_norm 1.6108 
(1.4955) [2022-01-22 09:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [148/300][1250/1251] eta 0:00:02 lr 0.000510 time 1.2005 (2.1925) loss 3.9429 (3.5515) grad_norm 2.0309 (1.4962) [2022-01-22 09:03:09 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 148 training takes 0:45:43 [2022-01-22 09:03:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.517 (18.517) Loss 1.0111 (1.0111) Acc@1 76.953 (76.953) Acc@5 93.262 (93.262) [2022-01-22 09:03:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.233 (3.546) Loss 1.0178 (1.0200) Acc@1 76.465 (76.349) Acc@5 94.043 (93.439) [2022-01-22 09:04:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.629 (2.504) Loss 1.0634 (1.0269) Acc@1 75.977 (76.121) Acc@5 93.164 (93.359) [2022-01-22 09:04:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.484 (2.259) Loss 1.0048 (1.0253) Acc@1 75.781 (76.112) Acc@5 93.457 (93.344) [2022-01-22 09:04:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.949 (2.178) Loss 1.0463 (1.0246) Acc@1 74.512 (76.093) Acc@5 92.188 (93.274) [2022-01-22 09:04:45 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.908 Acc@5 93.174 [2022-01-22 09:04:45 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.9% [2022-01-22 09:04:45 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.00% [2022-01-22 09:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][0/1251] eta 7:29:57 lr 0.000510 time 21.5810 (21.5810) loss 3.6845 (3.6845) grad_norm 1.6137 (1.6137) [2022-01-22 09:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][10/1251] eta 1:21:22 lr 0.000510 time 1.5552 (3.9341) loss 3.8808 (3.3887) grad_norm 1.3402 (1.5054) [2022-01-22 09:05:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][20/1251] eta 1:01:58 lr 0.000510 time 1.7959 (3.0209) loss 
2.7246 (3.4461) grad_norm 1.5888 (1.4754) [2022-01-22 09:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][30/1251] eta 0:54:56 lr 0.000510 time 1.6264 (2.6997) loss 4.0956 (3.4774) grad_norm 1.3661 (1.4589) [2022-01-22 09:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][40/1251] eta 0:54:23 lr 0.000510 time 5.4630 (2.6949) loss 3.8493 (3.4894) grad_norm 1.4994 (1.4591) [2022-01-22 09:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][50/1251] eta 0:52:23 lr 0.000510 time 2.5297 (2.6176) loss 3.5906 (3.4842) grad_norm 1.4085 (1.4711) [2022-01-22 09:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][60/1251] eta 0:50:17 lr 0.000510 time 1.4739 (2.5340) loss 3.5086 (3.5003) grad_norm 1.4766 (1.4793) [2022-01-22 09:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][70/1251] eta 0:49:09 lr 0.000510 time 2.8138 (2.4975) loss 3.7419 (3.5089) grad_norm 1.2526 (1.4827) [2022-01-22 09:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][80/1251] eta 0:48:33 lr 0.000510 time 3.5453 (2.4880) loss 4.0991 (3.5168) grad_norm 1.5117 (1.4778) [2022-01-22 09:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][90/1251] eta 0:47:46 lr 0.000510 time 2.9106 (2.4692) loss 2.3970 (3.5253) grad_norm 1.4373 (1.4861) [2022-01-22 09:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][100/1251] eta 0:46:40 lr 0.000510 time 1.5686 (2.4335) loss 3.6458 (3.5382) grad_norm 1.6515 (1.4877) [2022-01-22 09:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][110/1251] eta 0:45:27 lr 0.000510 time 1.7167 (2.3907) loss 2.9478 (3.5425) grad_norm 1.4702 (1.4912) [2022-01-22 09:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][120/1251] eta 0:44:42 lr 0.000510 time 3.1890 (2.3720) loss 4.3239 (3.5567) grad_norm 1.5978 (1.4996) [2022-01-22 09:09:54 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [149/300][130/1251] eta 0:43:57 lr 0.000510 time 1.9240 (2.3531) loss 3.8644 (3.5289) grad_norm 1.7122 (1.4964) [2022-01-22 09:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][140/1251] eta 0:43:26 lr 0.000510 time 1.8716 (2.3459) loss 3.9006 (3.5434) grad_norm 1.3601 (1.4895) [2022-01-22 09:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][150/1251] eta 0:43:06 lr 0.000510 time 2.1147 (2.3492) loss 3.0760 (3.5297) grad_norm 1.6234 (1.4998) [2022-01-22 09:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][160/1251] eta 0:42:43 lr 0.000510 time 3.6717 (2.3493) loss 3.8276 (3.5442) grad_norm 1.5572 (1.5054) [2022-01-22 09:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][170/1251] eta 0:42:01 lr 0.000509 time 1.8551 (2.3323) loss 4.2051 (3.5494) grad_norm 1.7068 (1.5069) [2022-01-22 09:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][180/1251] eta 0:41:15 lr 0.000509 time 1.9025 (2.3112) loss 2.7311 (3.5366) grad_norm 1.5667 (1.5020) [2022-01-22 09:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][190/1251] eta 0:40:37 lr 0.000509 time 1.6492 (2.2976) loss 3.4119 (3.5442) grad_norm 1.4886 (1.5009) [2022-01-22 09:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][200/1251] eta 0:40:12 lr 0.000509 time 3.1303 (2.2956) loss 3.4886 (3.5417) grad_norm 1.4761 (1.5021) [2022-01-22 09:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][210/1251] eta 0:39:47 lr 0.000509 time 2.2502 (2.2933) loss 3.8220 (3.5359) grad_norm 1.5054 (1.5019) [2022-01-22 09:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][220/1251] eta 0:39:20 lr 0.000509 time 1.7081 (2.2897) loss 3.5784 (3.5207) grad_norm 1.7660 (1.5053) [2022-01-22 09:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][230/1251] eta 0:38:52 lr 0.000509 time 2.1863 (2.2844) loss 2.7742 
(3.5208) grad_norm 1.7923 (1.5064) [2022-01-22 09:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][240/1251] eta 0:38:22 lr 0.000509 time 2.1983 (2.2776) loss 3.7895 (3.5412) grad_norm 1.6139 (1.5153) [2022-01-22 09:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][250/1251] eta 0:37:55 lr 0.000509 time 1.8387 (2.2735) loss 3.9459 (3.5555) grad_norm 1.4280 (1.5159) [2022-01-22 09:14:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][260/1251] eta 0:37:31 lr 0.000509 time 1.9597 (2.2717) loss 2.5980 (3.5532) grad_norm 1.6868 (1.5147) [2022-01-22 09:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][270/1251] eta 0:37:02 lr 0.000509 time 2.1661 (2.2660) loss 3.5436 (3.5594) grad_norm 1.4675 (1.5137) [2022-01-22 09:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][280/1251] eta 0:36:32 lr 0.000509 time 2.0042 (2.2585) loss 2.5405 (3.5499) grad_norm 1.4687 (1.5127) [2022-01-22 09:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][290/1251] eta 0:36:03 lr 0.000509 time 1.8505 (2.2514) loss 4.1061 (3.5523) grad_norm 1.5020 (1.5122) [2022-01-22 09:16:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][300/1251] eta 0:35:34 lr 0.000509 time 2.3527 (2.2440) loss 3.1585 (3.5553) grad_norm 1.2919 (1.5152) [2022-01-22 09:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][310/1251] eta 0:35:05 lr 0.000509 time 2.0929 (2.2376) loss 3.8166 (3.5511) grad_norm 1.5770 (1.5161) [2022-01-22 09:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][320/1251] eta 0:34:40 lr 0.000509 time 2.0588 (2.2344) loss 2.5831 (3.5444) grad_norm 1.3998 (1.5183) [2022-01-22 09:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][330/1251] eta 0:34:14 lr 0.000509 time 2.2070 (2.2311) loss 3.5418 (3.5458) grad_norm 1.5265 (1.5157) [2022-01-22 09:17:26 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [149/300][340/1251] eta 0:33:52 lr 0.000509 time 2.4120 (2.2311) loss 3.5100 (3.5455) grad_norm 1.3578 (1.5124) [2022-01-22 09:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][350/1251] eta 0:33:31 lr 0.000509 time 2.3080 (2.2329) loss 4.0344 (3.5471) grad_norm 1.4323 (1.5126) [2022-01-22 09:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][360/1251] eta 0:33:15 lr 0.000509 time 2.6712 (2.2396) loss 3.7670 (3.5471) grad_norm 1.5923 (1.5136) [2022-01-22 09:18:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][370/1251] eta 0:32:57 lr 0.000509 time 2.5761 (2.2444) loss 3.2903 (3.5536) grad_norm 2.6407 (1.5152) [2022-01-22 09:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][380/1251] eta 0:32:34 lr 0.000509 time 1.8821 (2.2435) loss 3.5931 (3.5553) grad_norm 1.6789 (1.5205) [2022-01-22 09:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][390/1251] eta 0:32:09 lr 0.000509 time 1.7581 (2.2413) loss 3.8149 (3.5548) grad_norm 1.7278 (1.5230) [2022-01-22 09:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][400/1251] eta 0:31:47 lr 0.000509 time 3.1694 (2.2421) loss 4.2492 (3.5580) grad_norm 1.6427 (1.5246) [2022-01-22 09:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][410/1251] eta 0:31:22 lr 0.000508 time 1.9605 (2.2380) loss 3.3426 (3.5600) grad_norm 1.6441 (1.5231) [2022-01-22 09:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][420/1251] eta 0:30:58 lr 0.000508 time 1.7171 (2.2362) loss 2.6554 (3.5546) grad_norm 1.2844 (1.5211) [2022-01-22 09:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][430/1251] eta 0:30:34 lr 0.000508 time 1.7011 (2.2349) loss 4.1110 (3.5599) grad_norm 1.5951 (1.5207) [2022-01-22 09:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][440/1251] eta 0:30:11 lr 0.000508 time 2.9348 (2.2342) loss 3.6108 
(3.5624) grad_norm 1.4250 (1.5182) [2022-01-22 09:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][450/1251] eta 0:29:49 lr 0.000508 time 2.5749 (2.2336) loss 2.6314 (3.5586) grad_norm 1.4352 (1.5162) [2022-01-22 09:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][460/1251] eta 0:29:24 lr 0.000508 time 1.9735 (2.2313) loss 2.8951 (3.5576) grad_norm 1.3492 (1.5159) [2022-01-22 09:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][470/1251] eta 0:28:59 lr 0.000508 time 1.9072 (2.2272) loss 2.9002 (3.5577) grad_norm 1.3600 (1.5141) [2022-01-22 09:22:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][480/1251] eta 0:28:37 lr 0.000508 time 3.2167 (2.2273) loss 3.5860 (3.5592) grad_norm 1.5034 (1.5120) [2022-01-22 09:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][490/1251] eta 0:28:12 lr 0.000508 time 2.1733 (2.2234) loss 3.5721 (3.5556) grad_norm 1.3989 (1.5120) [2022-01-22 09:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][500/1251] eta 0:27:49 lr 0.000508 time 2.1743 (2.2233) loss 4.1240 (3.5582) grad_norm 1.3773 (1.5102) [2022-01-22 09:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][510/1251] eta 0:27:26 lr 0.000508 time 3.1294 (2.2227) loss 4.0997 (3.5565) grad_norm 1.3484 (1.5095) [2022-01-22 09:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][520/1251] eta 0:27:05 lr 0.000508 time 3.2093 (2.2235) loss 3.8477 (3.5541) grad_norm 1.6545 (1.5090) [2022-01-22 09:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][530/1251] eta 0:26:41 lr 0.000508 time 1.8948 (2.2214) loss 2.7330 (3.5498) grad_norm 1.4285 (1.5083) [2022-01-22 09:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][540/1251] eta 0:26:19 lr 0.000508 time 2.4687 (2.2218) loss 4.3293 (3.5532) grad_norm 1.4043 (1.5086) [2022-01-22 09:25:10 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [149/300][550/1251] eta 0:25:57 lr 0.000508 time 2.4358 (2.2224) loss 4.1614 (3.5537) grad_norm 1.3855 (1.5090) [2022-01-22 09:25:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][560/1251] eta 0:25:34 lr 0.000508 time 2.5837 (2.2211) loss 2.9246 (3.5563) grad_norm 1.4282 (1.5088) [2022-01-22 09:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][570/1251] eta 0:25:10 lr 0.000508 time 1.8711 (2.2188) loss 3.2746 (3.5550) grad_norm 1.6048 (1.5080) [2022-01-22 09:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][580/1251] eta 0:24:47 lr 0.000508 time 2.0301 (2.2175) loss 3.4692 (3.5533) grad_norm 1.4144 (1.5078) [2022-01-22 09:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][590/1251] eta 0:24:25 lr 0.000508 time 2.2575 (2.2178) loss 3.6706 (3.5535) grad_norm 1.5639 (1.5079) [2022-01-22 09:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][600/1251] eta 0:24:05 lr 0.000508 time 2.7694 (2.2202) loss 3.6105 (3.5537) grad_norm 1.4380 (1.5074) [2022-01-22 09:27:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][610/1251] eta 0:23:43 lr 0.000508 time 1.5845 (2.2200) loss 2.7835 (3.5506) grad_norm 1.2944 (1.5066) [2022-01-22 09:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][620/1251] eta 0:23:21 lr 0.000508 time 2.2234 (2.2212) loss 3.7681 (3.5524) grad_norm 1.9218 (1.5071) [2022-01-22 09:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][630/1251] eta 0:22:57 lr 0.000508 time 1.7989 (2.2187) loss 3.2048 (3.5534) grad_norm 1.6276 (1.5071) [2022-01-22 09:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][640/1251] eta 0:22:35 lr 0.000508 time 2.8779 (2.2186) loss 4.0565 (3.5522) grad_norm 1.3947 (1.5064) [2022-01-22 09:28:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][650/1251] eta 0:22:11 lr 0.000507 time 1.9441 (2.2158) loss 2.3680 
(3.5492) grad_norm 1.4287 (1.5059) [2022-01-22 09:29:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][660/1251] eta 0:21:48 lr 0.000507 time 1.8802 (2.2135) loss 3.8922 (3.5470) grad_norm 1.5301 (1.5057) [2022-01-22 09:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][670/1251] eta 0:21:25 lr 0.000507 time 2.2371 (2.2121) loss 4.2292 (3.5485) grad_norm 1.4334 (1.5062) [2022-01-22 09:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][680/1251] eta 0:21:03 lr 0.000507 time 2.5222 (2.2136) loss 3.7006 (3.5486) grad_norm 1.6497 (1.5071) [2022-01-22 09:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][690/1251] eta 0:20:42 lr 0.000507 time 2.2078 (2.2148) loss 2.9800 (3.5466) grad_norm 1.5573 (1.5079) [2022-01-22 09:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][700/1251] eta 0:20:20 lr 0.000507 time 1.9416 (2.2152) loss 2.7066 (3.5440) grad_norm 1.5390 (1.5083) [2022-01-22 09:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][710/1251] eta 0:19:58 lr 0.000507 time 2.4671 (2.2155) loss 4.1885 (3.5414) grad_norm 1.4392 (1.5087) [2022-01-22 09:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][720/1251] eta 0:19:36 lr 0.000507 time 2.7412 (2.2152) loss 3.2927 (3.5416) grad_norm 1.8740 (1.5097) [2022-01-22 09:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][730/1251] eta 0:19:12 lr 0.000507 time 1.9036 (2.2128) loss 2.7324 (3.5427) grad_norm 1.4054 (1.5094) [2022-01-22 09:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][740/1251] eta 0:18:50 lr 0.000507 time 2.2059 (2.2128) loss 3.2276 (3.5432) grad_norm 1.3686 (1.5094) [2022-01-22 09:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][750/1251] eta 0:18:27 lr 0.000507 time 1.7201 (2.2113) loss 4.1548 (3.5410) grad_norm 2.0188 (1.5097) [2022-01-22 09:32:49 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [149/300][760/1251] eta 0:18:06 lr 0.000507 time 1.9261 (2.2125) loss 3.0705 (3.5402) grad_norm 1.5056 (1.5096) [2022-01-22 09:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][770/1251] eta 0:17:43 lr 0.000507 time 1.8793 (2.2116) loss 3.9283 (3.5375) grad_norm 1.4137 (1.5089) [2022-01-22 09:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][780/1251] eta 0:17:20 lr 0.000507 time 1.9571 (2.2100) loss 2.8650 (3.5364) grad_norm 1.5367 (1.5094) [2022-01-22 09:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][790/1251] eta 0:16:57 lr 0.000507 time 1.8833 (2.2077) loss 2.3636 (3.5357) grad_norm 1.4267 (1.5097) [2022-01-22 09:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][800/1251] eta 0:16:35 lr 0.000507 time 2.7537 (2.2081) loss 3.4148 (3.5329) grad_norm 1.4064 (1.5097) [2022-01-22 09:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][810/1251] eta 0:16:13 lr 0.000507 time 2.3473 (2.2079) loss 4.4395 (3.5324) grad_norm 1.7728 (1.5099) [2022-01-22 09:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][820/1251] eta 0:15:52 lr 0.000507 time 2.6512 (2.2094) loss 4.0564 (3.5321) grad_norm 1.6650 (1.5109) [2022-01-22 09:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][830/1251] eta 0:15:30 lr 0.000507 time 2.3500 (2.2102) loss 3.5539 (3.5317) grad_norm 1.3753 (1.5107) [2022-01-22 09:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][840/1251] eta 0:15:08 lr 0.000507 time 2.1461 (2.2097) loss 3.9486 (3.5318) grad_norm 1.5070 (1.5098) [2022-01-22 09:36:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][850/1251] eta 0:14:44 lr 0.000507 time 1.9202 (2.2061) loss 2.5270 (3.5307) grad_norm 1.5269 (1.5094) [2022-01-22 09:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][860/1251] eta 0:14:21 lr 0.000507 time 1.9255 (2.2040) loss 2.5067 
(3.5255) grad_norm 1.3979 (1.5087) [2022-01-22 09:36:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][870/1251] eta 0:13:59 lr 0.000507 time 2.6840 (2.2027) loss 3.4933 (3.5235) grad_norm 1.2614 (1.5083) [2022-01-22 09:37:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][880/1251] eta 0:13:37 lr 0.000507 time 2.2782 (2.2029) loss 2.6120 (3.5218) grad_norm 1.3961 (1.5086) [2022-01-22 09:37:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][890/1251] eta 0:13:14 lr 0.000506 time 1.9997 (2.2022) loss 4.4976 (3.5253) grad_norm 1.4484 (1.5084) [2022-01-22 09:37:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][900/1251] eta 0:12:53 lr 0.000506 time 1.4848 (2.2028) loss 4.1450 (3.5283) grad_norm 1.6363 (1.5084) [2022-01-22 09:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][910/1251] eta 0:12:31 lr 0.000506 time 2.7079 (2.2047) loss 3.9774 (3.5318) grad_norm 1.4027 (1.5084) [2022-01-22 09:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][920/1251] eta 0:12:10 lr 0.000506 time 2.1408 (2.2055) loss 3.9534 (3.5309) grad_norm 1.5167 (1.5078) [2022-01-22 09:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][930/1251] eta 0:11:48 lr 0.000506 time 2.4556 (2.2063) loss 3.8512 (3.5289) grad_norm 1.4599 (1.5077) [2022-01-22 09:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][940/1251] eta 0:11:26 lr 0.000506 time 1.8189 (2.2064) loss 4.3701 (3.5301) grad_norm 1.5786 (1.5078) [2022-01-22 09:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][950/1251] eta 0:11:03 lr 0.000506 time 1.8067 (2.2043) loss 3.5683 (3.5319) grad_norm 1.5548 (1.5082) [2022-01-22 09:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][960/1251] eta 0:10:40 lr 0.000506 time 1.8349 (2.2021) loss 3.3567 (3.5328) grad_norm 1.3156 (1.5092) [2022-01-22 09:40:23 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [149/300][970/1251] eta 0:10:18 lr 0.000506 time 1.8541 (2.2009) loss 2.6239 (3.5320) grad_norm 1.5225 (1.5101) [2022-01-22 09:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][980/1251] eta 0:09:56 lr 0.000506 time 2.3633 (2.2002) loss 3.0682 (3.5334) grad_norm 1.3701 (1.5098) [2022-01-22 09:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][990/1251] eta 0:09:33 lr 0.000506 time 2.5281 (2.1984) loss 2.4417 (3.5339) grad_norm 1.5795 (1.5098) [2022-01-22 09:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1000/1251] eta 0:09:12 lr 0.000506 time 2.1191 (2.1992) loss 2.9440 (3.5331) grad_norm 1.3412 (1.5092) [2022-01-22 09:41:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1010/1251] eta 0:08:50 lr 0.000506 time 2.1178 (2.2002) loss 3.8217 (3.5324) grad_norm 1.5563 (1.5093) [2022-01-22 09:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1020/1251] eta 0:08:28 lr 0.000506 time 3.0444 (2.2028) loss 3.7660 (3.5349) grad_norm 1.5181 (1.5095) [2022-01-22 09:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1030/1251] eta 0:08:07 lr 0.000506 time 2.9713 (2.2041) loss 3.5979 (3.5365) grad_norm 1.7663 (1.5092) [2022-01-22 09:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1040/1251] eta 0:07:45 lr 0.000506 time 1.8999 (2.2049) loss 4.2740 (3.5367) grad_norm 1.4541 (1.5092) [2022-01-22 09:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1050/1251] eta 0:07:23 lr 0.000506 time 1.6442 (2.2040) loss 3.9675 (3.5340) grad_norm 1.2653 (1.5084) [2022-01-22 09:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1060/1251] eta 0:07:00 lr 0.000506 time 2.5324 (2.2035) loss 3.9260 (3.5342) grad_norm 1.5053 (1.5076) [2022-01-22 09:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1070/1251] eta 0:06:38 lr 0.000506 time 1.9135 (2.2006) loss 
2.3167 (3.5326) grad_norm 1.3835 (1.5068) [2022-01-22 09:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1080/1251] eta 0:06:15 lr 0.000506 time 1.8447 (2.1982) loss 3.3209 (3.5311) grad_norm 1.7125 (1.5077) [2022-01-22 09:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1090/1251] eta 0:05:53 lr 0.000506 time 2.1398 (2.1972) loss 3.3543 (3.5325) grad_norm 1.3244 (1.5079) [2022-01-22 09:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1100/1251] eta 0:05:31 lr 0.000506 time 2.3081 (2.1976) loss 3.6381 (3.5343) grad_norm 1.5985 (1.5085) [2022-01-22 09:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1110/1251] eta 0:05:09 lr 0.000506 time 2.4710 (2.1978) loss 3.9298 (3.5331) grad_norm 1.5241 (1.5084) [2022-01-22 09:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1120/1251] eta 0:04:47 lr 0.000506 time 1.9379 (2.1971) loss 4.2147 (3.5319) grad_norm 1.6577 (1.5097) [2022-01-22 09:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1130/1251] eta 0:04:25 lr 0.000506 time 2.5954 (2.1977) loss 3.8180 (3.5326) grad_norm 1.4222 (1.5091) [2022-01-22 09:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1140/1251] eta 0:04:04 lr 0.000505 time 2.8139 (2.2003) loss 3.6171 (3.5328) grad_norm 1.5650 (1.5092) [2022-01-22 09:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1150/1251] eta 0:03:42 lr 0.000505 time 1.7525 (2.2032) loss 4.0573 (3.5349) grad_norm 1.6681 (1.5092) [2022-01-22 09:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1160/1251] eta 0:03:20 lr 0.000505 time 2.1789 (2.2042) loss 4.0809 (3.5364) grad_norm 1.4014 (1.5090) [2022-01-22 09:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1170/1251] eta 0:02:58 lr 0.000505 time 1.9128 (2.2032) loss 3.7373 (3.5370) grad_norm 1.5055 (1.5090) [2022-01-22 09:48:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1180/1251] eta 0:02:36 lr 0.000505 time 2.3011 (2.2005) loss 3.9907 (3.5332) grad_norm 1.5020 (1.5085) [2022-01-22 09:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1190/1251] eta 0:02:14 lr 0.000505 time 1.8714 (2.1983) loss 3.7938 (3.5330) grad_norm 1.3286 (1.5083) [2022-01-22 09:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1200/1251] eta 0:01:52 lr 0.000505 time 2.0999 (2.1965) loss 3.5645 (3.5324) grad_norm 1.5242 (1.5086) [2022-01-22 09:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1210/1251] eta 0:01:30 lr 0.000505 time 2.8621 (2.1972) loss 3.8790 (3.5325) grad_norm 1.7826 (1.5086) [2022-01-22 09:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1220/1251] eta 0:01:08 lr 0.000505 time 2.7669 (2.1970) loss 3.4665 (3.5339) grad_norm 1.4077 (1.5083) [2022-01-22 09:49:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1230/1251] eta 0:00:46 lr 0.000505 time 2.4092 (2.1983) loss 4.2997 (3.5340) grad_norm 1.5569 (1.5079) [2022-01-22 09:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1240/1251] eta 0:00:24 lr 0.000505 time 1.8907 (2.1968) loss 2.0916 (3.5313) grad_norm 1.4567 (1.5072) [2022-01-22 09:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [149/300][1250/1251] eta 0:00:02 lr 0.000505 time 1.1804 (2.1913) loss 3.6038 (3.5318) grad_norm 1.4343 (1.5069) [2022-01-22 09:50:27 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 149 training takes 0:45:41 [2022-01-22 09:50:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.353 (18.353) Loss 1.0754 (1.0754) Acc@1 75.977 (75.977) Acc@5 92.773 (92.773) [2022-01-22 09:51:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.565 (3.397) Loss 0.9786 (1.0183) Acc@1 76.758 (76.545) Acc@5 93.750 (93.031) [2022-01-22 09:51:24 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.809 (2.689) Loss 1.0336 (1.0336) Acc@1 75.195 (75.893) Acc@5 93.262 (92.866) [2022-01-22 09:51:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.261 (2.282) Loss 1.0051 (1.0211) Acc@1 76.074 (76.213) Acc@5 95.117 (93.101) [2022-01-22 09:51:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.396 (2.169) Loss 1.0025 (1.0172) Acc@1 76.953 (76.243) Acc@5 93.652 (93.200) [2022-01-22 09:52:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.300 Acc@5 93.212 [2022-01-22 09:52:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-01-22 09:52:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.30% [2022-01-22 09:52:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][0/1251] eta 7:29:31 lr 0.000505 time 21.5599 (21.5599) loss 4.0410 (4.0410) grad_norm 1.3609 (1.3609) [2022-01-22 09:52:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][10/1251] eta 1:25:51 lr 0.000505 time 1.7425 (4.1512) loss 2.5807 (3.5343) grad_norm 1.4347 (1.5160) [2022-01-22 09:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][20/1251] eta 1:06:13 lr 0.000505 time 2.5557 (3.2281) loss 3.9113 (3.5654) grad_norm 1.4713 (1.5289) [2022-01-22 09:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][30/1251] eta 0:58:45 lr 0.000505 time 1.5927 (2.8878) loss 2.3415 (3.5231) grad_norm 1.6767 (1.5002) [2022-01-22 09:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][40/1251] eta 0:55:55 lr 0.000505 time 3.8272 (2.7710) loss 4.1133 (3.5687) grad_norm 1.5600 (1.4974) [2022-01-22 09:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][50/1251] eta 0:53:21 lr 0.000505 time 2.3709 (2.6661) loss 4.1202 (3.6107) grad_norm 1.5455 (1.4947) [2022-01-22 09:54:41 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [150/300][60/1251] eta 0:51:13 lr 0.000505 time 2.2345 (2.5809) loss 4.2800 (3.6199) grad_norm 1.4536 (1.5053) [2022-01-22 09:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][70/1251] eta 0:49:25 lr 0.000505 time 1.6488 (2.5109) loss 3.0505 (3.6172) grad_norm 1.7754 (1.5140) [2022-01-22 09:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][80/1251] eta 0:48:23 lr 0.000505 time 3.4320 (2.4794) loss 3.8162 (3.5773) grad_norm 1.5592 (1.5125) [2022-01-22 09:55:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][90/1251] eta 0:47:17 lr 0.000505 time 2.5203 (2.4439) loss 3.0534 (3.5884) grad_norm 1.4120 (1.5105) [2022-01-22 09:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][100/1251] eta 0:46:04 lr 0.000505 time 2.1869 (2.4021) loss 2.8933 (3.5619) grad_norm 1.5623 (1.5125) [2022-01-22 09:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][110/1251] eta 0:45:12 lr 0.000505 time 1.8456 (2.3772) loss 3.2270 (3.5553) grad_norm 1.3566 (1.5054) [2022-01-22 09:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][120/1251] eta 0:44:29 lr 0.000505 time 2.2281 (2.3604) loss 3.2125 (3.5400) grad_norm 1.5770 (1.5042) [2022-01-22 09:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][130/1251] eta 0:44:22 lr 0.000504 time 3.2259 (2.3748) loss 2.7127 (3.5068) grad_norm 1.5725 (1.5093) [2022-01-22 09:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][140/1251] eta 0:43:51 lr 0.000504 time 1.9689 (2.3689) loss 4.2582 (3.5100) grad_norm 1.5763 (1.5075) [2022-01-22 09:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][150/1251] eta 0:43:13 lr 0.000504 time 2.2115 (2.3556) loss 3.5678 (3.5089) grad_norm 1.3338 (1.5096) [2022-01-22 09:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][160/1251] eta 0:42:29 lr 0.000504 time 1.8399 (2.3370) loss 2.3308 (3.5060) 
grad_norm 1.5271 (1.5074) [2022-01-22 09:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][170/1251] eta 0:42:03 lr 0.000504 time 3.1044 (2.3341) loss 4.0937 (3.5126) grad_norm 1.5264 (1.5063) [2022-01-22 09:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][180/1251] eta 0:41:26 lr 0.000504 time 1.9790 (2.3221) loss 2.8308 (3.5202) grad_norm 1.5796 (1.5074) [2022-01-22 09:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][190/1251] eta 0:40:45 lr 0.000504 time 1.6162 (2.3054) loss 2.4347 (3.5185) grad_norm 1.4322 (1.5091) [2022-01-22 09:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][200/1251] eta 0:40:05 lr 0.000504 time 1.8159 (2.2887) loss 3.3370 (3.5169) grad_norm 1.4722 (1.5139) [2022-01-22 10:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][210/1251] eta 0:39:36 lr 0.000504 time 1.8748 (2.2831) loss 3.3876 (3.5248) grad_norm 1.3564 (1.5143) [2022-01-22 10:00:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][220/1251] eta 0:39:07 lr 0.000504 time 2.6978 (2.2769) loss 3.6031 (3.5285) grad_norm 1.5132 (1.5152) [2022-01-22 10:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][230/1251] eta 0:38:46 lr 0.000504 time 2.8709 (2.2787) loss 4.1187 (3.5257) grad_norm 1.5060 (1.5181) [2022-01-22 10:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][240/1251] eta 0:38:22 lr 0.000504 time 2.0832 (2.2776) loss 3.1915 (3.5095) grad_norm 1.2602 (1.5166) [2022-01-22 10:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][250/1251] eta 0:38:00 lr 0.000504 time 2.7712 (2.2780) loss 3.0738 (3.5053) grad_norm 1.2276 (1.5135) [2022-01-22 10:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][260/1251] eta 0:37:34 lr 0.000504 time 2.8352 (2.2754) loss 4.1700 (3.5094) grad_norm 1.5626 (1.5212) [2022-01-22 10:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [150/300][270/1251] eta 0:37:05 lr 0.000504 time 2.1708 (2.2687) loss 2.5932 (3.5135) grad_norm 1.5192 (1.5215) [2022-01-22 10:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][280/1251] eta 0:36:34 lr 0.000504 time 1.9176 (2.2601) loss 3.9250 (3.5143) grad_norm 1.4966 (1.5246) [2022-01-22 10:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][290/1251] eta 0:36:06 lr 0.000504 time 2.2082 (2.2545) loss 3.0141 (3.5102) grad_norm 1.6690 (1.5253) [2022-01-22 10:03:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][300/1251] eta 0:35:38 lr 0.000504 time 2.3006 (2.2488) loss 3.9628 (3.5031) grad_norm 1.3531 (1.5214) [2022-01-22 10:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][310/1251] eta 0:35:15 lr 0.000504 time 2.9824 (2.2484) loss 4.0574 (3.5030) grad_norm 1.5825 (1.5211) [2022-01-22 10:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][320/1251] eta 0:34:52 lr 0.000504 time 1.6132 (2.2473) loss 2.4248 (3.5087) grad_norm 1.5528 (1.5205) [2022-01-22 10:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][330/1251] eta 0:34:25 lr 0.000504 time 2.0849 (2.2426) loss 3.8429 (3.5146) grad_norm 1.7210 (1.5216) [2022-01-22 10:04:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][340/1251] eta 0:34:03 lr 0.000504 time 2.1875 (2.2427) loss 4.2252 (3.5133) grad_norm 1.3301 (1.5196) [2022-01-22 10:05:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][350/1251] eta 0:33:41 lr 0.000504 time 2.6705 (2.2437) loss 3.8888 (3.5181) grad_norm 1.5348 (1.5199) [2022-01-22 10:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][360/1251] eta 0:33:16 lr 0.000504 time 1.6857 (2.2410) loss 2.3494 (3.5113) grad_norm 1.3420 (1.5201) [2022-01-22 10:05:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][370/1251] eta 0:32:53 lr 0.000503 time 1.6190 (2.2396) loss 3.4699 (3.5105) 
grad_norm 1.3471 (1.5217) [2022-01-22 10:06:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][380/1251] eta 0:32:28 lr 0.000503 time 1.7692 (2.2368) loss 3.4315 (3.4998) grad_norm 1.5080 (1.5221) [2022-01-22 10:06:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][390/1251] eta 0:32:01 lr 0.000503 time 2.4943 (2.2320) loss 3.5562 (3.5025) grad_norm 1.8437 (1.5233) [2022-01-22 10:06:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][400/1251] eta 0:31:35 lr 0.000503 time 1.7030 (2.2272) loss 2.7856 (3.5010) grad_norm 1.5137 (1.5230) [2022-01-22 10:07:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][410/1251] eta 0:31:11 lr 0.000503 time 2.2271 (2.2248) loss 3.6495 (3.5042) grad_norm 1.7192 (1.5244) [2022-01-22 10:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][420/1251] eta 0:30:46 lr 0.000503 time 1.9985 (2.2217) loss 2.8002 (3.4950) grad_norm 1.6645 (1.5256) [2022-01-22 10:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][430/1251] eta 0:30:23 lr 0.000503 time 2.6506 (2.2212) loss 3.6646 (3.4946) grad_norm 1.3973 (1.5264) [2022-01-22 10:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][440/1251] eta 0:29:59 lr 0.000503 time 2.1280 (2.2194) loss 2.7109 (3.4950) grad_norm 1.4594 (1.5256) [2022-01-22 10:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][450/1251] eta 0:29:38 lr 0.000503 time 3.1558 (2.2206) loss 3.4355 (3.4955) grad_norm 1.4666 (1.5265) [2022-01-22 10:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][460/1251] eta 0:29:18 lr 0.000503 time 1.7667 (2.2236) loss 2.3822 (3.4869) grad_norm 1.4448 (1.5256) [2022-01-22 10:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][470/1251] eta 0:28:58 lr 0.000503 time 2.3558 (2.2257) loss 3.3715 (3.4832) grad_norm 1.4546 (1.5261) [2022-01-22 10:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [150/300][480/1251] eta 0:28:36 lr 0.000503 time 1.9003 (2.2257) loss 3.8146 (3.4814) grad_norm 1.6267 (1.5272) [2022-01-22 10:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][490/1251] eta 0:28:13 lr 0.000503 time 2.4938 (2.2248) loss 3.5131 (3.4806) grad_norm 1.8259 (1.5283) [2022-01-22 10:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][500/1251] eta 0:27:47 lr 0.000503 time 1.6479 (2.2206) loss 4.2790 (3.4781) grad_norm 1.3897 (1.5276) [2022-01-22 10:10:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][510/1251] eta 0:27:23 lr 0.000503 time 2.1532 (2.2175) loss 2.9677 (3.4777) grad_norm 1.7346 (1.5276) [2022-01-22 10:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][520/1251] eta 0:27:00 lr 0.000503 time 2.8452 (2.2162) loss 3.7193 (3.4754) grad_norm 1.7538 (1.5273) [2022-01-22 10:11:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][530/1251] eta 0:26:41 lr 0.000503 time 2.1200 (2.2209) loss 4.3724 (3.4760) grad_norm 1.4302 (1.5263) [2022-01-22 10:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][540/1251] eta 0:26:21 lr 0.000503 time 1.8033 (2.2246) loss 4.0099 (3.4800) grad_norm 1.5188 (1.5255) [2022-01-22 10:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][550/1251] eta 0:25:59 lr 0.000503 time 2.2170 (2.2244) loss 3.5208 (3.4811) grad_norm 1.9328 (1.5248) [2022-01-22 10:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][560/1251] eta 0:25:36 lr 0.000503 time 3.6462 (2.2241) loss 3.8113 (3.4847) grad_norm 1.5217 (1.5259) [2022-01-22 10:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][570/1251] eta 0:25:11 lr 0.000503 time 1.8609 (2.2198) loss 4.3257 (3.4881) grad_norm 1.3362 (1.5251) [2022-01-22 10:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][580/1251] eta 0:24:48 lr 0.000503 time 2.1241 (2.2181) loss 3.3718 (3.4917) 
grad_norm 1.5070 (1.5245) [2022-01-22 10:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][590/1251] eta 0:24:25 lr 0.000503 time 2.2297 (2.2178) loss 3.1475 (3.4850) grad_norm 1.7883 (1.5238) [2022-01-22 10:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][600/1251] eta 0:24:05 lr 0.000503 time 3.7631 (2.2199) loss 2.8668 (3.4868) grad_norm 1.5219 (1.5254) [2022-01-22 10:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][610/1251] eta 0:23:42 lr 0.000502 time 2.0853 (2.2190) loss 3.7721 (3.4884) grad_norm 1.5223 (1.5258) [2022-01-22 10:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][620/1251] eta 0:23:19 lr 0.000502 time 2.5308 (2.2179) loss 3.4085 (3.4879) grad_norm 1.3796 (1.5251) [2022-01-22 10:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][630/1251] eta 0:22:56 lr 0.000502 time 1.8864 (2.2162) loss 2.9474 (3.4889) grad_norm 1.5699 (1.5243) [2022-01-22 10:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][640/1251] eta 0:22:33 lr 0.000502 time 2.6132 (2.2156) loss 3.5501 (3.4903) grad_norm 1.4469 (1.5232) [2022-01-22 10:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][650/1251] eta 0:22:10 lr 0.000502 time 1.8330 (2.2137) loss 4.0118 (3.4934) grad_norm 1.4582 (1.5213) [2022-01-22 10:16:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][660/1251] eta 0:21:46 lr 0.000502 time 2.4494 (2.2112) loss 3.9719 (3.4942) grad_norm 1.5876 (1.5211) [2022-01-22 10:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][670/1251] eta 0:21:24 lr 0.000502 time 2.0171 (2.2104) loss 2.9007 (3.4969) grad_norm 1.3953 (1.5202) [2022-01-22 10:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][680/1251] eta 0:21:01 lr 0.000502 time 1.9251 (2.2088) loss 3.2680 (3.5013) grad_norm 1.4491 (1.5198) [2022-01-22 10:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [150/300][690/1251] eta 0:20:38 lr 0.000502 time 1.5377 (2.2074) loss 4.3050 (3.4992) grad_norm 1.4591 (1.5187) [2022-01-22 10:17:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][700/1251] eta 0:20:16 lr 0.000502 time 2.8162 (2.2073) loss 2.8998 (3.4998) grad_norm 1.7649 (1.5183) [2022-01-22 10:18:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][710/1251] eta 0:19:53 lr 0.000502 time 2.2112 (2.2064) loss 3.4411 (3.5034) grad_norm 1.4384 (1.5188) [2022-01-22 10:18:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][720/1251] eta 0:19:31 lr 0.000502 time 2.5272 (2.2069) loss 3.8112 (3.5028) grad_norm 1.9402 (1.5178) [2022-01-22 10:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][730/1251] eta 0:19:09 lr 0.000502 time 1.9531 (2.2054) loss 3.7689 (3.5028) grad_norm 1.3512 (1.5169) [2022-01-22 10:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][740/1251] eta 0:18:47 lr 0.000502 time 2.4379 (2.2070) loss 3.7369 (3.5040) grad_norm 1.3227 (1.5154) [2022-01-22 10:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][750/1251] eta 0:18:26 lr 0.000502 time 2.4281 (2.2089) loss 3.5560 (3.5042) grad_norm 1.4076 (1.5155) [2022-01-22 10:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][760/1251] eta 0:18:04 lr 0.000502 time 2.1234 (2.2085) loss 2.7418 (3.5057) grad_norm 1.3549 (1.5151) [2022-01-22 10:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][770/1251] eta 0:17:41 lr 0.000502 time 1.5370 (2.2072) loss 2.6739 (3.5044) grad_norm 1.8169 (1.5142) [2022-01-22 10:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][780/1251] eta 0:17:19 lr 0.000502 time 2.5391 (2.2066) loss 4.2662 (3.5043) grad_norm 1.5101 (1.5144) [2022-01-22 10:21:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][790/1251] eta 0:16:55 lr 0.000502 time 1.7416 (2.2034) loss 4.0939 (3.5031) 
grad_norm 1.7014 (1.5154) [2022-01-22 10:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][800/1251] eta 0:16:33 lr 0.000502 time 2.2487 (2.2032) loss 3.2116 (3.5048) grad_norm 1.3827 (1.5165) [2022-01-22 10:21:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][810/1251] eta 0:16:11 lr 0.000502 time 2.2251 (2.2036) loss 4.0034 (3.5072) grad_norm 1.6832 (1.5168) [2022-01-22 10:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][820/1251] eta 0:15:50 lr 0.000502 time 2.5343 (2.2043) loss 3.9043 (3.5107) grad_norm 1.9582 (1.5172) [2022-01-22 10:22:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][830/1251] eta 0:15:28 lr 0.000502 time 2.1108 (2.2060) loss 3.7811 (3.5068) grad_norm 1.3628 (1.5161) [2022-01-22 10:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][840/1251] eta 0:15:07 lr 0.000502 time 2.5508 (2.2080) loss 3.6946 (3.5072) grad_norm 1.4045 (1.5158) [2022-01-22 10:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][850/1251] eta 0:14:45 lr 0.000501 time 2.1583 (2.2083) loss 3.1337 (3.5074) grad_norm 1.6302 (1.5157) [2022-01-22 10:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][860/1251] eta 0:14:22 lr 0.000501 time 2.1691 (2.2059) loss 4.1809 (3.5086) grad_norm 1.3237 (1.5159) [2022-01-22 10:24:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][870/1251] eta 0:13:59 lr 0.000501 time 1.8209 (2.2029) loss 3.9745 (3.5078) grad_norm 1.4088 (1.5150) [2022-01-22 10:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][880/1251] eta 0:13:37 lr 0.000501 time 2.2535 (2.2031) loss 3.9032 (3.5069) grad_norm 1.3750 (1.5149) [2022-01-22 10:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][890/1251] eta 0:13:15 lr 0.000501 time 2.3640 (2.2024) loss 3.8202 (3.5088) grad_norm 1.4445 (1.5152) [2022-01-22 10:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [150/300][900/1251] eta 0:12:53 lr 0.000501 time 2.1263 (2.2031) loss 3.5802 (3.5123) grad_norm 1.4562 (1.5154) [2022-01-22 10:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][910/1251] eta 0:12:31 lr 0.000501 time 2.1313 (2.2025) loss 2.5997 (3.5096) grad_norm 1.4682 (1.5141) [2022-01-22 10:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][920/1251] eta 0:12:08 lr 0.000501 time 1.7909 (2.2016) loss 3.0395 (3.5074) grad_norm 1.6702 (1.5152) [2022-01-22 10:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][930/1251] eta 0:11:46 lr 0.000501 time 2.3977 (2.2014) loss 3.9373 (3.5083) grad_norm 1.5658 (1.5149) [2022-01-22 10:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][940/1251] eta 0:11:24 lr 0.000501 time 1.7601 (2.2008) loss 4.0405 (3.5107) grad_norm 1.3449 (1.5155) [2022-01-22 10:26:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][950/1251] eta 0:11:02 lr 0.000501 time 2.4352 (2.2018) loss 2.9346 (3.5102) grad_norm 1.3843 (1.5150) [2022-01-22 10:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][960/1251] eta 0:10:40 lr 0.000501 time 2.2368 (2.2016) loss 3.7523 (3.5107) grad_norm 1.4902 (1.5142) [2022-01-22 10:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][970/1251] eta 0:10:18 lr 0.000501 time 2.1554 (2.2006) loss 4.0886 (3.5106) grad_norm 1.4463 (1.5151) [2022-01-22 10:28:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][980/1251] eta 0:09:55 lr 0.000501 time 1.9157 (2.1991) loss 4.2378 (3.5113) grad_norm 1.5581 (1.5155) [2022-01-22 10:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][990/1251] eta 0:09:33 lr 0.000501 time 2.6727 (2.1989) loss 3.5430 (3.5103) grad_norm 1.6229 (1.5161) [2022-01-22 10:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1000/1251] eta 0:09:11 lr 0.000501 time 1.8732 (2.1986) loss 3.8159 (3.5111) 
grad_norm 1.5227 (1.5165) [2022-01-22 10:29:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1010/1251] eta 0:08:50 lr 0.000501 time 2.8839 (2.2006) loss 3.8258 (3.5118) grad_norm 1.4788 (1.5165) [2022-01-22 10:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1020/1251] eta 0:08:28 lr 0.000501 time 1.5309 (2.1996) loss 3.1961 (3.5113) grad_norm 1.5304 (1.5164) [2022-01-22 10:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1030/1251] eta 0:08:06 lr 0.000501 time 2.7273 (2.2009) loss 3.5954 (3.5136) grad_norm 1.4044 (1.5162) [2022-01-22 10:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1040/1251] eta 0:07:44 lr 0.000501 time 2.2868 (2.2000) loss 3.2062 (3.5110) grad_norm 1.8144 (1.5158) [2022-01-22 10:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1050/1251] eta 0:07:22 lr 0.000501 time 2.1374 (2.1998) loss 2.6591 (3.5119) grad_norm 1.6631 (1.5155) [2022-01-22 10:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1060/1251] eta 0:06:59 lr 0.000501 time 1.9116 (2.1987) loss 3.8166 (3.5157) grad_norm 1.6809 (1.5157) [2022-01-22 10:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1070/1251] eta 0:06:38 lr 0.000501 time 2.2352 (2.1992) loss 2.7720 (3.5158) grad_norm 1.4264 (1.5161) [2022-01-22 10:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1080/1251] eta 0:06:16 lr 0.000501 time 2.1974 (2.1992) loss 2.7763 (3.5159) grad_norm 1.3686 (1.5160) [2022-01-22 10:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1090/1251] eta 0:05:53 lr 0.000500 time 1.9098 (2.1987) loss 3.7850 (3.5160) grad_norm 1.4934 (1.5156) [2022-01-22 10:32:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1100/1251] eta 0:05:31 lr 0.000500 time 1.8031 (2.1976) loss 3.7009 (3.5178) grad_norm 1.3988 (1.5153) [2022-01-22 10:32:47 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [150/300][1110/1251] eta 0:05:10 lr 0.000500 time 1.8950 (2.1989) loss 2.6942 (3.5126) grad_norm 1.3373 (1.5151) [2022-01-22 10:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1120/1251] eta 0:04:48 lr 0.000500 time 2.1233 (2.1994) loss 2.7666 (3.5125) grad_norm 1.2840 (1.5145) [2022-01-22 10:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1130/1251] eta 0:04:26 lr 0.000500 time 1.9893 (2.1985) loss 3.4970 (3.5116) grad_norm 1.4763 (1.5144) [2022-01-22 10:33:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1140/1251] eta 0:04:03 lr 0.000500 time 1.8709 (2.1971) loss 3.5085 (3.5103) grad_norm 1.5592 (1.5134) [2022-01-22 10:34:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1150/1251] eta 0:03:41 lr 0.000500 time 1.9461 (2.1962) loss 3.1166 (3.5104) grad_norm 1.8768 (1.5135) [2022-01-22 10:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1160/1251] eta 0:03:19 lr 0.000500 time 2.4596 (2.1970) loss 3.8695 (3.5096) grad_norm 1.4931 (1.5128) [2022-01-22 10:34:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1170/1251] eta 0:02:57 lr 0.000500 time 1.8207 (2.1968) loss 3.6370 (3.5099) grad_norm 1.3321 (1.5123) [2022-01-22 10:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1180/1251] eta 0:02:35 lr 0.000500 time 1.5404 (2.1962) loss 2.5270 (3.5122) grad_norm 1.2681 (1.5118) [2022-01-22 10:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1190/1251] eta 0:02:13 lr 0.000500 time 2.1736 (2.1964) loss 3.2469 (3.5131) grad_norm 1.3080 (1.5111) [2022-01-22 10:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1200/1251] eta 0:01:52 lr 0.000500 time 2.0393 (2.1970) loss 3.7627 (3.5137) grad_norm 1.5891 (1.5113) [2022-01-22 10:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1210/1251] eta 0:01:30 lr 0.000500 time 1.6790 (2.1960) loss 
3.2299 (3.5136) grad_norm 1.7789 (1.5116) [2022-01-22 10:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1220/1251] eta 0:01:08 lr 0.000500 time 1.6271 (2.1955) loss 3.3501 (3.5153) grad_norm 1.4165 (1.5118) [2022-01-22 10:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1230/1251] eta 0:00:46 lr 0.000500 time 1.8897 (2.1955) loss 3.5399 (3.5144) grad_norm 1.3475 (1.5110) [2022-01-22 10:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1240/1251] eta 0:00:24 lr 0.000500 time 2.2178 (2.1964) loss 4.1070 (3.5173) grad_norm 1.6090 (1.5100) [2022-01-22 10:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [150/300][1250/1251] eta 0:00:02 lr 0.000500 time 1.1670 (2.1902) loss 3.8450 (3.5143) grad_norm 1.4922 (1.5098) [2022-01-22 10:37:44 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 150 training takes 0:45:40 [2022-01-22 10:37:44 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_150 saving...... [2022-01-22 10:37:56 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_150 saved !!! 
[2022-01-22 10:38:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.683 (15.683) Loss 1.0901 (1.0901) Acc@1 74.902 (74.902) Acc@5 92.578 (92.578) [2022-01-22 10:38:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.966 (2.897) Loss 0.9911 (1.0238) Acc@1 75.293 (75.968) Acc@5 94.238 (93.279) [2022-01-22 10:38:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.326 (2.378) Loss 1.0493 (1.0254) Acc@1 76.660 (76.018) Acc@5 93.262 (93.257) [2022-01-22 10:38:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.919 (2.047) Loss 1.1309 (1.0258) Acc@1 73.926 (75.970) Acc@5 90.723 (93.214) [2022-01-22 10:39:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.728 (2.008) Loss 0.9769 (1.0275) Acc@1 78.223 (75.972) Acc@5 93.066 (93.216) [2022-01-22 10:39:26 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 75.890 Acc@5 93.238 [2022-01-22 10:39:26 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 75.9% [2022-01-22 10:39:26 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.30% [2022-01-22 10:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][0/1251] eta 7:33:21 lr 0.000500 time 21.7439 (21.7439) loss 3.8690 (3.8690) grad_norm 1.6999 (1.6999) [2022-01-22 10:40:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][10/1251] eta 1:25:12 lr 0.000500 time 2.3541 (4.1199) loss 3.6261 (3.6631) grad_norm 1.4919 (1.5682) [2022-01-22 10:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][20/1251] eta 1:06:12 lr 0.000500 time 1.3533 (3.2271) loss 3.7242 (3.5721) grad_norm 1.5494 (1.5615) [2022-01-22 10:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][30/1251] eta 0:58:02 lr 0.000500 time 1.4223 (2.8520) loss 4.0085 (3.5784) grad_norm 1.6250 (1.5727) [2022-01-22 10:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[151/300][40/1251] eta 0:54:48 lr 0.000500 time 3.8406 (2.7152) loss 3.7124 (3.5632) grad_norm 1.5365 (1.5587) [2022-01-22 10:41:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][50/1251] eta 0:52:31 lr 0.000500 time 2.7443 (2.6245) loss 3.9043 (3.5416) grad_norm 1.6147 (1.5269) [2022-01-22 10:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][60/1251] eta 0:51:08 lr 0.000500 time 1.8842 (2.5767) loss 4.2421 (3.5824) grad_norm 1.6007 (1.5078) [2022-01-22 10:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][70/1251] eta 0:49:25 lr 0.000500 time 1.8634 (2.5113) loss 3.7071 (3.5525) grad_norm 1.4085 (1.4975) [2022-01-22 10:42:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][80/1251] eta 0:48:38 lr 0.000499 time 3.4409 (2.4921) loss 3.8999 (3.5508) grad_norm 1.8032 (1.5089) [2022-01-22 10:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][90/1251] eta 0:47:33 lr 0.000499 time 2.5686 (2.4578) loss 3.0813 (3.5605) grad_norm 1.6417 (1.5023) [2022-01-22 10:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][100/1251] eta 0:46:29 lr 0.000499 time 1.7898 (2.4235) loss 3.0196 (3.5594) grad_norm 1.5842 (1.5040) [2022-01-22 10:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][110/1251] eta 0:45:42 lr 0.000499 time 1.8389 (2.4040) loss 3.8570 (3.5565) grad_norm 1.3346 (1.4986) [2022-01-22 10:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][120/1251] eta 0:45:01 lr 0.000499 time 2.8707 (2.3886) loss 3.8758 (3.5761) grad_norm 1.5446 (1.5011) [2022-01-22 10:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][130/1251] eta 0:44:19 lr 0.000499 time 2.5442 (2.3723) loss 2.5032 (3.5917) grad_norm 1.6093 (1.5006) [2022-01-22 10:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][140/1251] eta 0:43:39 lr 0.000499 time 1.9768 (2.3574) loss 3.8370 (3.6001) grad_norm 1.3998 
(1.4973) [2022-01-22 10:45:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][150/1251] eta 0:43:24 lr 0.000499 time 3.1244 (2.3660) loss 3.6623 (3.5857) grad_norm 1.4511 (1.4979) [2022-01-22 10:45:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][160/1251] eta 0:43:02 lr 0.000499 time 2.4027 (2.3674) loss 3.7988 (3.5919) grad_norm 1.4861 (1.4967) [2022-01-22 10:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][170/1251] eta 0:42:27 lr 0.000499 time 1.9591 (2.3569) loss 2.7940 (3.5869) grad_norm 1.4431 (1.4944) [2022-01-22 10:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][180/1251] eta 0:41:55 lr 0.000499 time 1.9125 (2.3483) loss 3.8988 (3.5770) grad_norm 1.3509 (1.4944) [2022-01-22 10:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][190/1251] eta 0:41:11 lr 0.000499 time 2.1444 (2.3298) loss 3.8304 (3.5638) grad_norm 1.5851 (1.4921) [2022-01-22 10:47:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][200/1251] eta 0:40:30 lr 0.000499 time 1.6420 (2.3126) loss 3.1753 (3.5646) grad_norm 1.3294 (1.4908) [2022-01-22 10:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][210/1251] eta 0:39:58 lr 0.000499 time 2.4407 (2.3041) loss 3.5746 (3.5723) grad_norm 1.6738 (1.4911) [2022-01-22 10:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][220/1251] eta 0:39:27 lr 0.000499 time 2.4982 (2.2961) loss 3.9434 (3.5722) grad_norm 1.5698 (1.4903) [2022-01-22 10:48:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][230/1251] eta 0:39:04 lr 0.000499 time 2.7862 (2.2964) loss 4.4810 (3.5781) grad_norm 1.9170 (1.4942) [2022-01-22 10:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][240/1251] eta 0:38:42 lr 0.000499 time 2.4595 (2.2968) loss 3.7077 (3.5787) grad_norm 1.6249 (1.4986) [2022-01-22 10:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[151/300][250/1251] eta 0:38:18 lr 0.000499 time 2.4140 (2.2960) loss 4.0032 (3.5809) grad_norm 1.4939 (1.4990) [2022-01-22 10:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][260/1251] eta 0:37:52 lr 0.000499 time 1.9656 (2.2932) loss 3.3465 (3.5889) grad_norm 1.4872 (1.4988) [2022-01-22 10:49:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][270/1251] eta 0:37:28 lr 0.000499 time 2.8337 (2.2924) loss 2.7346 (3.5803) grad_norm 1.3281 (1.4973) [2022-01-22 10:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][280/1251] eta 0:36:59 lr 0.000499 time 2.2601 (2.2860) loss 3.0610 (3.5681) grad_norm 1.3737 (1.4973) [2022-01-22 10:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][290/1251] eta 0:36:32 lr 0.000499 time 1.9072 (2.2815) loss 2.8286 (3.5626) grad_norm 1.6558 (1.4968) [2022-01-22 10:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][300/1251] eta 0:36:06 lr 0.000499 time 2.0263 (2.2779) loss 3.8691 (3.5551) grad_norm 1.4352 (1.4967) [2022-01-22 10:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][310/1251] eta 0:35:39 lr 0.000499 time 2.5059 (2.2741) loss 2.3859 (3.5502) grad_norm 1.5724 (1.4964) [2022-01-22 10:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][320/1251] eta 0:35:13 lr 0.000498 time 1.9845 (2.2698) loss 3.3937 (3.5531) grad_norm 1.5352 (1.4963) [2022-01-22 10:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][330/1251] eta 0:34:51 lr 0.000498 time 2.1653 (2.2708) loss 3.0647 (3.5509) grad_norm 1.6706 (1.4987) [2022-01-22 10:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][340/1251] eta 0:34:27 lr 0.000498 time 1.8094 (2.2692) loss 3.6652 (3.5540) grad_norm 1.5353 (1.4993) [2022-01-22 10:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][350/1251] eta 0:34:07 lr 0.000498 time 3.2683 (2.2725) loss 3.7003 (3.5526) grad_norm 
1.4753 (1.5007) [2022-01-22 10:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][360/1251] eta 0:33:38 lr 0.000498 time 1.7759 (2.2651) loss 3.8763 (3.5537) grad_norm 1.5448 (1.5022) [2022-01-22 10:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][370/1251] eta 0:33:10 lr 0.000498 time 2.6037 (2.2598) loss 3.6859 (3.5520) grad_norm 1.5730 (1.5049) [2022-01-22 10:53:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][380/1251] eta 0:32:39 lr 0.000498 time 1.5366 (2.2493) loss 4.1407 (3.5546) grad_norm 1.6618 (1.5056) [2022-01-22 10:54:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][390/1251] eta 0:32:13 lr 0.000498 time 1.9449 (2.2462) loss 3.6882 (3.5568) grad_norm 1.5661 (1.5053) [2022-01-22 10:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][400/1251] eta 0:31:52 lr 0.000498 time 2.8365 (2.2476) loss 2.4424 (3.5585) grad_norm 1.2377 (1.5028) [2022-01-22 10:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][410/1251] eta 0:31:32 lr 0.000498 time 1.6066 (2.2499) loss 3.5332 (3.5615) grad_norm 1.8173 (1.5037) [2022-01-22 10:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][420/1251] eta 0:31:10 lr 0.000498 time 1.9391 (2.2514) loss 3.8583 (3.5594) grad_norm 1.3757 (1.5051) [2022-01-22 10:55:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][430/1251] eta 0:30:49 lr 0.000498 time 1.9489 (2.2523) loss 3.1023 (3.5569) grad_norm 1.6994 (1.5068) [2022-01-22 10:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][440/1251] eta 0:30:25 lr 0.000498 time 2.0055 (2.2513) loss 3.5726 (3.5574) grad_norm 1.6307 (1.5096) [2022-01-22 10:56:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][450/1251] eta 0:30:01 lr 0.000498 time 1.7695 (2.2494) loss 3.5863 (3.5519) grad_norm 1.2758 (1.5090) [2022-01-22 10:56:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[151/300][460/1251] eta 0:29:38 lr 0.000498 time 1.9706 (2.2487) loss 3.2069 (3.5485) grad_norm 1.6958 (1.5094) [2022-01-22 10:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][470/1251] eta 0:29:13 lr 0.000498 time 1.9388 (2.2451) loss 3.3806 (3.5505) grad_norm 1.5058 (1.5081) [2022-01-22 10:57:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][480/1251] eta 0:28:50 lr 0.000498 time 2.0044 (2.2442) loss 3.8613 (3.5493) grad_norm 1.8746 (1.5086) [2022-01-22 10:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][490/1251] eta 0:28:27 lr 0.000498 time 1.9178 (2.2437) loss 3.9570 (3.5451) grad_norm 1.3069 (1.5068) [2022-01-22 10:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][500/1251] eta 0:28:03 lr 0.000498 time 1.8394 (2.2420) loss 3.9664 (3.5511) grad_norm 1.2207 (1.5065) [2022-01-22 10:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][510/1251] eta 0:27:39 lr 0.000498 time 1.7125 (2.2391) loss 2.9483 (3.5529) grad_norm 1.4664 (1.5075) [2022-01-22 10:58:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][520/1251] eta 0:27:16 lr 0.000498 time 2.2882 (2.2388) loss 3.9449 (3.5542) grad_norm 1.5176 (1.5091) [2022-01-22 10:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][530/1251] eta 0:26:53 lr 0.000498 time 1.6326 (2.2377) loss 3.8221 (3.5535) grad_norm 1.2822 (1.5095) [2022-01-22 10:59:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][540/1251] eta 0:26:30 lr 0.000498 time 1.7628 (2.2376) loss 4.2668 (3.5577) grad_norm 1.6881 (1.5109) [2022-01-22 10:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][550/1251] eta 0:26:07 lr 0.000498 time 1.8653 (2.2362) loss 4.4128 (3.5589) grad_norm 1.4787 (1.5103) [2022-01-22 11:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][560/1251] eta 0:25:44 lr 0.000497 time 2.2124 (2.2349) loss 3.4215 (3.5572) grad_norm 
1.4791 (1.5101) [2022-01-22 11:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][570/1251] eta 0:25:20 lr 0.000497 time 1.7293 (2.2330) loss 2.9669 (3.5590) grad_norm 1.4409 (1.5104) [2022-01-22 11:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][580/1251] eta 0:24:56 lr 0.000497 time 1.7820 (2.2307) loss 4.0571 (3.5590) grad_norm 1.7799 (1.5109) [2022-01-22 11:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][590/1251] eta 0:24:32 lr 0.000497 time 2.1277 (2.2284) loss 3.7723 (3.5592) grad_norm 1.6555 (1.5117) [2022-01-22 11:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][600/1251] eta 0:24:09 lr 0.000497 time 2.7822 (2.2267) loss 3.0694 (3.5585) grad_norm 1.4372 (1.5116) [2022-01-22 11:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][610/1251] eta 0:23:48 lr 0.000497 time 1.7008 (2.2293) loss 3.5923 (3.5603) grad_norm 1.4501 (1.5133) [2022-01-22 11:02:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][620/1251] eta 0:23:27 lr 0.000497 time 2.8977 (2.2300) loss 3.4658 (3.5574) grad_norm 1.4801 (1.5128) [2022-01-22 11:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][630/1251] eta 0:23:04 lr 0.000497 time 2.1795 (2.2291) loss 3.8596 (3.5571) grad_norm 1.7749 (1.5136) [2022-01-22 11:03:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][640/1251] eta 0:22:40 lr 0.000497 time 1.8850 (2.2261) loss 3.7861 (3.5549) grad_norm 1.3228 (1.5137) [2022-01-22 11:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][650/1251] eta 0:22:17 lr 0.000497 time 2.2277 (2.2255) loss 4.0789 (3.5575) grad_norm 1.2744 (1.5119) [2022-01-22 11:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][660/1251] eta 0:21:54 lr 0.000497 time 2.5766 (2.2243) loss 3.9134 (3.5576) grad_norm 1.4002 (1.5116) [2022-01-22 11:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[151/300][670/1251] eta 0:21:31 lr 0.000497 time 2.2407 (2.2235) loss 3.8547 (3.5577) grad_norm 1.5601 (1.5113) [2022-01-22 11:04:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][680/1251] eta 0:21:09 lr 0.000497 time 2.7481 (2.2235) loss 4.2006 (3.5570) grad_norm 1.5895 (1.5108) [2022-01-22 11:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][690/1251] eta 0:20:48 lr 0.000497 time 2.3607 (2.2247) loss 2.6308 (3.5571) grad_norm 1.4456 (1.5104) [2022-01-22 11:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][700/1251] eta 0:20:25 lr 0.000497 time 2.1541 (2.2244) loss 2.9674 (3.5534) grad_norm 1.3845 (1.5093) [2022-01-22 11:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][710/1251] eta 0:20:02 lr 0.000497 time 1.9062 (2.2230) loss 2.3935 (3.5520) grad_norm 1.4712 (1.5078) [2022-01-22 11:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][720/1251] eta 0:19:39 lr 0.000497 time 1.9086 (2.2215) loss 4.0668 (3.5526) grad_norm 1.7634 (1.5084) [2022-01-22 11:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][730/1251] eta 0:19:17 lr 0.000497 time 1.7758 (2.2212) loss 3.5278 (3.5539) grad_norm 1.4115 (1.5078) [2022-01-22 11:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][740/1251] eta 0:18:55 lr 0.000497 time 2.5112 (2.2211) loss 3.0194 (3.5561) grad_norm 1.3779 (1.5084) [2022-01-22 11:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][750/1251] eta 0:18:33 lr 0.000497 time 2.0493 (2.2217) loss 4.0546 (3.5556) grad_norm 1.5739 (1.5089) [2022-01-22 11:07:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][760/1251] eta 0:18:11 lr 0.000497 time 2.5377 (2.2229) loss 3.8926 (3.5558) grad_norm 1.4441 (1.5089) [2022-01-22 11:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][770/1251] eta 0:17:48 lr 0.000497 time 1.9057 (2.2209) loss 3.5238 (3.5592) grad_norm 
1.3919 (1.5078) [2022-01-22 11:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][780/1251] eta 0:17:24 lr 0.000497 time 2.2421 (2.2180) loss 3.6409 (3.5591) grad_norm 1.3816 (1.5076) [2022-01-22 11:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][790/1251] eta 0:17:01 lr 0.000497 time 1.6162 (2.2165) loss 2.6190 (3.5569) grad_norm 1.5047 (1.5081) [2022-01-22 11:09:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][800/1251] eta 0:16:39 lr 0.000497 time 2.1968 (2.2166) loss 3.8713 (3.5601) grad_norm 1.4441 (1.5076) [2022-01-22 11:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][810/1251] eta 0:16:17 lr 0.000496 time 2.5676 (2.2167) loss 4.0374 (3.5629) grad_norm 1.8042 (1.5082) [2022-01-22 11:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][820/1251] eta 0:15:56 lr 0.000496 time 3.2934 (2.2183) loss 3.7401 (3.5635) grad_norm 1.7536 (1.5099) [2022-01-22 11:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][830/1251] eta 0:15:34 lr 0.000496 time 2.1632 (2.2189) loss 3.8766 (3.5640) grad_norm 1.4167 (1.5102) [2022-01-22 11:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][840/1251] eta 0:15:12 lr 0.000496 time 2.7019 (2.2210) loss 2.8885 (3.5641) grad_norm 1.3396 (1.5100) [2022-01-22 11:10:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][850/1251] eta 0:14:50 lr 0.000496 time 2.4694 (2.2213) loss 3.5411 (3.5609) grad_norm 1.4915 (1.5094) [2022-01-22 11:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][860/1251] eta 0:14:27 lr 0.000496 time 1.9432 (2.2195) loss 3.4928 (3.5633) grad_norm 1.5811 (1.5104) [2022-01-22 11:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][870/1251] eta 0:14:04 lr 0.000496 time 1.8014 (2.2177) loss 2.9923 (3.5633) grad_norm 1.6964 (1.5105) [2022-01-22 11:12:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[151/300][880/1251] eta 0:13:44 lr 0.000496 time 3.5735 (2.2213) loss 2.8417 (3.5665) grad_norm 1.4954 (1.5112) [2022-01-22 11:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][890/1251] eta 0:13:22 lr 0.000496 time 2.2596 (2.2217) loss 3.2268 (3.5663) grad_norm 1.4625 (1.5112) [2022-01-22 11:12:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][900/1251] eta 0:12:59 lr 0.000496 time 1.9034 (2.2201) loss 4.1854 (3.5661) grad_norm 1.4482 (1.5112) [2022-01-22 11:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][910/1251] eta 0:12:36 lr 0.000496 time 2.0425 (2.2177) loss 3.5455 (3.5654) grad_norm 1.3892 (1.5108) [2022-01-22 11:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][920/1251] eta 0:12:14 lr 0.000496 time 3.4790 (2.2189) loss 3.9182 (3.5670) grad_norm 1.5082 (1.5110) [2022-01-22 11:13:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][930/1251] eta 0:11:51 lr 0.000496 time 2.3278 (2.2174) loss 2.5805 (3.5653) grad_norm 1.6436 (1.5113) [2022-01-22 11:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][940/1251] eta 0:11:29 lr 0.000496 time 2.1344 (2.2183) loss 3.5811 (3.5640) grad_norm 1.5914 (1.5130) [2022-01-22 11:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][950/1251] eta 0:11:07 lr 0.000496 time 1.7193 (2.2183) loss 3.1621 (3.5624) grad_norm 1.2429 (1.5129) [2022-01-22 11:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][960/1251] eta 0:10:45 lr 0.000496 time 3.8343 (2.2183) loss 3.2657 (3.5618) grad_norm 1.5868 (1.5129) [2022-01-22 11:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][970/1251] eta 0:10:23 lr 0.000496 time 1.8468 (2.2186) loss 2.9439 (3.5613) grad_norm 1.4129 (1.5119) [2022-01-22 11:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][980/1251] eta 0:10:01 lr 0.000496 time 2.0324 (2.2199) loss 2.5882 (3.5592) grad_norm 
1.4111 (1.5115) [2022-01-22 11:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][990/1251] eta 0:09:38 lr 0.000496 time 1.6582 (2.2181) loss 3.6326 (3.5594) grad_norm 1.9435 (1.5117) [2022-01-22 11:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1000/1251] eta 0:09:16 lr 0.000496 time 3.8677 (2.2190) loss 3.5526 (3.5600) grad_norm 1.2289 (1.5115) [2022-01-22 11:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1010/1251] eta 0:08:54 lr 0.000496 time 2.2907 (2.2173) loss 3.9304 (3.5645) grad_norm 1.5741 (1.5107) [2022-01-22 11:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1020/1251] eta 0:08:31 lr 0.000496 time 2.2146 (2.2155) loss 3.8438 (3.5652) grad_norm 1.4644 (1.5097) [2022-01-22 11:17:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1030/1251] eta 0:08:09 lr 0.000496 time 2.2642 (2.2136) loss 3.4437 (3.5668) grad_norm 1.3164 (1.5103) [2022-01-22 11:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1040/1251] eta 0:07:46 lr 0.000496 time 1.6427 (2.2123) loss 3.6787 (3.5693) grad_norm 1.5227 (1.5103) [2022-01-22 11:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1050/1251] eta 0:07:24 lr 0.000495 time 2.9379 (2.2118) loss 3.4681 (3.5685) grad_norm 1.5072 (1.5098) [2022-01-22 11:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1060/1251] eta 0:07:02 lr 0.000495 time 2.7004 (2.2121) loss 2.6189 (3.5660) grad_norm 1.4148 (1.5096) [2022-01-22 11:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1070/1251] eta 0:06:40 lr 0.000495 time 3.3798 (2.2132) loss 3.6040 (3.5676) grad_norm 1.5055 (1.5087) [2022-01-22 11:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1080/1251] eta 0:06:18 lr 0.000495 time 1.9960 (2.2136) loss 2.6127 (3.5681) grad_norm 1.4360 (1.5079) [2022-01-22 11:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [151/300][1090/1251] eta 0:05:56 lr 0.000495 time 2.5645 (2.2146) loss 3.5916 (3.5684) grad_norm 1.5498 (1.5069) [2022-01-22 11:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1100/1251] eta 0:05:34 lr 0.000495 time 2.9227 (2.2155) loss 3.4819 (3.5693) grad_norm 1.2355 (1.5062) [2022-01-22 11:20:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1110/1251] eta 0:05:12 lr 0.000495 time 2.5741 (2.2163) loss 3.9681 (3.5699) grad_norm 1.3653 (1.5056) [2022-01-22 11:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1120/1251] eta 0:04:50 lr 0.000495 time 1.7172 (2.2142) loss 3.9532 (3.5717) grad_norm 1.5589 (1.5052) [2022-01-22 11:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1130/1251] eta 0:04:27 lr 0.000495 time 1.9761 (2.2141) loss 3.6727 (3.5682) grad_norm 1.3523 (1.5049) [2022-01-22 11:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1140/1251] eta 0:04:05 lr 0.000495 time 2.1056 (2.2125) loss 3.1793 (3.5668) grad_norm 1.6620 (1.5053) [2022-01-22 11:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1150/1251] eta 0:03:43 lr 0.000495 time 2.2989 (2.2132) loss 3.4509 (3.5656) grad_norm 1.5048 (1.5046) [2022-01-22 11:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1160/1251] eta 0:03:21 lr 0.000495 time 1.4531 (2.2132) loss 3.7652 (3.5660) grad_norm 1.4956 (1.5037) [2022-01-22 11:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1170/1251] eta 0:02:59 lr 0.000495 time 2.8086 (2.2140) loss 3.0273 (3.5668) grad_norm 1.4198 (1.5042) [2022-01-22 11:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1180/1251] eta 0:02:37 lr 0.000495 time 1.6002 (2.2137) loss 3.2564 (3.5693) grad_norm 1.4382 (1.5042) [2022-01-22 11:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1190/1251] eta 0:02:15 lr 0.000495 time 2.3331 (2.2137) loss 3.6075 
(3.5706) grad_norm 1.3761 (1.5049) [2022-01-22 11:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1200/1251] eta 0:01:52 lr 0.000495 time 1.5044 (2.2150) loss 2.4190 (3.5696) grad_norm 1.5229 (1.5053) [2022-01-22 11:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1210/1251] eta 0:01:30 lr 0.000495 time 1.9662 (2.2148) loss 3.0152 (3.5690) grad_norm 1.7768 (1.5062) [2022-01-22 11:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1220/1251] eta 0:01:08 lr 0.000495 time 1.7112 (2.2135) loss 2.9140 (3.5682) grad_norm 1.5503 (1.5065) [2022-01-22 11:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1230/1251] eta 0:00:46 lr 0.000495 time 2.7760 (2.2131) loss 3.6055 (3.5685) grad_norm 1.4280 (1.5071) [2022-01-22 11:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1240/1251] eta 0:00:24 lr 0.000495 time 1.1892 (2.2133) loss 4.2022 (3.5706) grad_norm 1.5378 (1.5072) [2022-01-22 11:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [151/300][1250/1251] eta 0:00:02 lr 0.000495 time 1.2155 (2.2072) loss 3.7926 (3.5721) grad_norm 1.6656 (1.5068) [2022-01-22 11:25:28 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 151 training takes 0:46:01 [2022-01-22 11:25:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.393 (18.393) Loss 1.0198 (1.0198) Acc@1 76.660 (76.660) Acc@5 95.020 (95.020) [2022-01-22 11:26:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.981 (3.090) Loss 1.0534 (1.0304) Acc@1 76.172 (75.994) Acc@5 93.457 (93.652) [2022-01-22 11:26:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 2.532 (2.506) Loss 1.0710 (1.0350) Acc@1 76.074 (75.972) Acc@5 92.383 (93.466) [2022-01-22 11:26:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.418 (2.212) Loss 0.9635 (1.0279) Acc@1 77.930 (76.049) Acc@5 93.555 (93.611) [2022-01-22 11:26:56 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.068 (2.141) Loss 0.9889 (1.0318) Acc@1 78.125 (76.053) Acc@5 93.262 (93.509) [2022-01-22 11:27:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.040 Acc@5 93.436 [2022-01-22 11:27:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.0% [2022-01-22 11:27:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.30% [2022-01-22 11:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][0/1251] eta 7:36:39 lr 0.000495 time 21.9023 (21.9023) loss 4.3440 (4.3440) grad_norm 1.6774 (1.6774) [2022-01-22 11:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][10/1251] eta 1:24:21 lr 0.000495 time 2.1797 (4.0785) loss 3.6726 (3.7272) grad_norm 1.8703 (1.5849) [2022-01-22 11:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][20/1251] eta 1:05:26 lr 0.000495 time 2.2594 (3.1894) loss 2.9167 (3.5663) grad_norm 1.4053 (1.5344) [2022-01-22 11:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][30/1251] eta 0:57:58 lr 0.000495 time 1.5621 (2.8491) loss 3.0309 (3.5263) grad_norm 1.4531 (1.5676) [2022-01-22 11:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][40/1251] eta 0:53:57 lr 0.000494 time 2.8432 (2.6731) loss 2.7909 (3.5237) grad_norm 1.2298 (1.5442) [2022-01-22 11:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][50/1251] eta 0:52:16 lr 0.000494 time 1.8666 (2.6113) loss 3.1652 (3.5549) grad_norm 1.5042 (1.5212) [2022-01-22 11:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][60/1251] eta 0:50:55 lr 0.000494 time 2.4917 (2.5654) loss 2.6745 (3.5472) grad_norm 1.5241 (1.5140) [2022-01-22 11:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][70/1251] eta 0:49:30 lr 0.000494 time 1.8983 (2.5149) loss 4.0100 (3.5610) grad_norm 1.5281 (1.5152) [2022-01-22 11:30:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][80/1251] eta 0:48:44 lr 0.000494 time 3.3360 (2.4973) loss 3.6741 (3.5561) grad_norm 1.3982 (1.5064) [2022-01-22 11:30:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][90/1251] eta 0:47:45 lr 0.000494 time 2.5404 (2.4680) loss 3.0276 (3.5382) grad_norm 2.0775 (1.5226) [2022-01-22 11:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][100/1251] eta 0:46:28 lr 0.000494 time 1.8407 (2.4230) loss 4.0135 (3.5464) grad_norm 1.4861 (1.5252) [2022-01-22 11:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][110/1251] eta 0:45:20 lr 0.000494 time 1.9072 (2.3841) loss 3.8611 (3.5366) grad_norm 1.7021 (1.5224) [2022-01-22 11:31:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][120/1251] eta 0:44:28 lr 0.000494 time 2.5634 (2.3593) loss 3.9489 (3.5340) grad_norm 2.2328 (1.5277) [2022-01-22 11:32:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][130/1251] eta 0:43:48 lr 0.000494 time 2.1308 (2.3446) loss 4.0674 (3.5488) grad_norm 1.3622 (1.5285) [2022-01-22 11:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][140/1251] eta 0:43:36 lr 0.000494 time 2.9956 (2.3550) loss 4.0482 (3.5613) grad_norm 1.6612 (1.5251) [2022-01-22 11:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][150/1251] eta 0:42:53 lr 0.000494 time 1.8838 (2.3373) loss 3.8031 (3.5603) grad_norm 1.5801 (1.5285) [2022-01-22 11:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][160/1251] eta 0:42:20 lr 0.000494 time 2.7387 (2.3290) loss 2.9839 (3.5537) grad_norm 1.4098 (1.5271) [2022-01-22 11:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][170/1251] eta 0:41:43 lr 0.000494 time 1.9480 (2.3163) loss 3.0035 (3.5517) grad_norm 1.6535 (1.5225) [2022-01-22 11:34:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][180/1251] eta 0:41:10 lr 0.000494 
time 3.3197 (2.3069) loss 3.4045 (3.5550) grad_norm 1.4478 (1.5178) [2022-01-22 11:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][190/1251] eta 0:40:36 lr 0.000494 time 2.2099 (2.2965) loss 3.7644 (3.5695) grad_norm 1.3975 (1.5150) [2022-01-22 11:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][200/1251] eta 0:40:14 lr 0.000494 time 2.8521 (2.2970) loss 2.4181 (3.5613) grad_norm 1.6599 (1.5124) [2022-01-22 11:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][210/1251] eta 0:39:45 lr 0.000494 time 2.6679 (2.2919) loss 3.0331 (3.5550) grad_norm 1.4857 (1.5123) [2022-01-22 11:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][220/1251] eta 0:39:12 lr 0.000494 time 2.1698 (2.2821) loss 2.3732 (3.5522) grad_norm 1.7546 (1.5118) [2022-01-22 11:35:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][230/1251] eta 0:38:49 lr 0.000494 time 3.0427 (2.2816) loss 4.0962 (3.5451) grad_norm 1.4020 (1.5111) [2022-01-22 11:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][240/1251] eta 0:38:25 lr 0.000494 time 2.6981 (2.2804) loss 2.8228 (3.5548) grad_norm 1.6650 (1.5135) [2022-01-22 11:36:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][250/1251] eta 0:38:01 lr 0.000494 time 1.5313 (2.2789) loss 3.6376 (3.5511) grad_norm 1.5670 (1.5152) [2022-01-22 11:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][260/1251] eta 0:37:36 lr 0.000494 time 2.1713 (2.2773) loss 3.0818 (3.5478) grad_norm 1.4594 (1.5173) [2022-01-22 11:37:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][270/1251] eta 0:37:16 lr 0.000494 time 3.0854 (2.2803) loss 3.9536 (3.5537) grad_norm 1.3785 (1.5145) [2022-01-22 11:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][280/1251] eta 0:36:44 lr 0.000493 time 2.0468 (2.2707) loss 3.8644 (3.5596) grad_norm 1.5053 (1.5118) [2022-01-22 11:38:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][290/1251] eta 0:36:09 lr 0.000493 time 1.5215 (2.2580) loss 3.6583 (3.5614) grad_norm 1.4305 (1.5105) [2022-01-22 11:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][300/1251] eta 0:35:38 lr 0.000493 time 1.8836 (2.2491) loss 2.4578 (3.5624) grad_norm 1.5468 (1.5102) [2022-01-22 11:38:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][310/1251] eta 0:35:08 lr 0.000493 time 2.1616 (2.2409) loss 3.8751 (3.5646) grad_norm 1.6103 (1.5115) [2022-01-22 11:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][320/1251] eta 0:34:43 lr 0.000493 time 1.9499 (2.2378) loss 2.7688 (3.5643) grad_norm 1.5666 (1.5112) [2022-01-22 11:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][330/1251] eta 0:34:19 lr 0.000493 time 2.1361 (2.2360) loss 3.2545 (3.5603) grad_norm 1.4901 (1.5142) [2022-01-22 11:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][340/1251] eta 0:33:54 lr 0.000493 time 1.9413 (2.2330) loss 3.6522 (3.5576) grad_norm 1.5814 (1.5140) [2022-01-22 11:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][350/1251] eta 0:33:35 lr 0.000493 time 2.7983 (2.2365) loss 3.1376 (3.5589) grad_norm 1.4923 (1.5132) [2022-01-22 11:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][360/1251] eta 0:33:11 lr 0.000493 time 1.9549 (2.2354) loss 3.8126 (3.5522) grad_norm 1.5557 (1.5133) [2022-01-22 11:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][370/1251] eta 0:32:49 lr 0.000493 time 2.4294 (2.2355) loss 2.9554 (3.5500) grad_norm 1.4361 (1.5135) [2022-01-22 11:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][380/1251] eta 0:32:25 lr 0.000493 time 1.9474 (2.2337) loss 3.6427 (3.5437) grad_norm 1.6561 (1.5141) [2022-01-22 11:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][390/1251] eta 0:32:06 lr 
0.000493 time 3.4657 (2.2377) loss 3.3448 (3.5413) grad_norm 1.6105 (1.5135) [2022-01-22 11:42:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][400/1251] eta 0:31:45 lr 0.000493 time 2.4853 (2.2397) loss 3.5768 (3.5439) grad_norm 1.6777 (1.5153) [2022-01-22 11:42:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][410/1251] eta 0:31:21 lr 0.000493 time 2.5326 (2.2377) loss 4.0630 (3.5445) grad_norm 2.7175 (1.5209) [2022-01-22 11:42:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][420/1251] eta 0:30:54 lr 0.000493 time 1.7772 (2.2317) loss 3.4140 (3.5425) grad_norm 1.6245 (1.5249) [2022-01-22 11:43:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][430/1251] eta 0:30:29 lr 0.000493 time 3.5942 (2.2287) loss 4.1426 (3.5383) grad_norm 1.3085 (1.5224) [2022-01-22 11:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][440/1251] eta 0:30:06 lr 0.000493 time 1.9710 (2.2274) loss 3.2564 (3.5307) grad_norm 2.1007 (1.5233) [2022-01-22 11:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][450/1251] eta 0:29:45 lr 0.000493 time 1.8436 (2.2296) loss 3.5859 (3.5288) grad_norm 1.5380 (1.5219) [2022-01-22 11:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][460/1251] eta 0:29:23 lr 0.000493 time 2.0045 (2.2288) loss 2.8924 (3.5291) grad_norm 1.7909 (1.5226) [2022-01-22 11:44:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][470/1251] eta 0:29:00 lr 0.000493 time 3.0744 (2.2284) loss 2.7717 (3.5310) grad_norm 1.5441 (1.5233) [2022-01-22 11:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][480/1251] eta 0:28:37 lr 0.000493 time 2.0425 (2.2271) loss 3.3701 (3.5319) grad_norm 1.3696 (1.5225) [2022-01-22 11:45:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][490/1251] eta 0:28:11 lr 0.000493 time 1.8748 (2.2230) loss 3.7787 (3.5339) grad_norm 1.4787 (1.5222) [2022-01-22 11:45:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][500/1251] eta 0:27:46 lr 0.000493 time 1.9939 (2.2196) loss 3.6564 (3.5371) grad_norm 1.3487 (1.5215) [2022-01-22 11:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][510/1251] eta 0:27:26 lr 0.000493 time 2.5967 (2.2223) loss 3.7587 (3.5347) grad_norm 1.2859 (1.5217) [2022-01-22 11:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][520/1251] eta 0:27:04 lr 0.000492 time 1.9908 (2.2228) loss 3.3909 (3.5293) grad_norm 1.6124 (1.5213) [2022-01-22 11:46:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][530/1251] eta 0:26:42 lr 0.000492 time 2.1902 (2.2222) loss 3.9025 (3.5308) grad_norm 1.4128 (1.5205) [2022-01-22 11:47:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][540/1251] eta 0:26:18 lr 0.000492 time 2.2751 (2.2205) loss 3.9459 (3.5356) grad_norm 1.4761 (1.5219) [2022-01-22 11:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][550/1251] eta 0:25:54 lr 0.000492 time 1.8950 (2.2169) loss 3.6928 (3.5349) grad_norm 1.7499 (1.5220) [2022-01-22 11:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][560/1251] eta 0:25:29 lr 0.000492 time 1.9875 (2.2134) loss 3.2971 (3.5353) grad_norm 1.4984 (1.5246) [2022-01-22 11:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][570/1251] eta 0:25:06 lr 0.000492 time 2.3716 (2.2126) loss 3.8485 (3.5330) grad_norm 1.8964 (1.5258) [2022-01-22 11:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][580/1251] eta 0:24:43 lr 0.000492 time 2.2364 (2.2109) loss 4.0304 (3.5285) grad_norm 1.7054 (1.5259) [2022-01-22 11:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][590/1251] eta 0:24:21 lr 0.000492 time 2.4905 (2.2105) loss 2.6431 (3.5229) grad_norm 1.5713 (1.5266) [2022-01-22 11:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][600/1251] eta 0:23:59 lr 
0.000492 time 2.2185 (2.2107) loss 3.7693 (3.5193) grad_norm 1.5359 (1.5267) [2022-01-22 11:49:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][610/1251] eta 0:23:39 lr 0.000492 time 1.6623 (2.2142) loss 2.9022 (3.5204) grad_norm 1.4045 (1.5265) [2022-01-22 11:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][620/1251] eta 0:23:17 lr 0.000492 time 2.1074 (2.2148) loss 3.8477 (3.5231) grad_norm 1.5994 (1.5257) [2022-01-22 11:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][630/1251] eta 0:22:56 lr 0.000492 time 2.1454 (2.2161) loss 3.4526 (3.5213) grad_norm 1.2981 (1.5257) [2022-01-22 11:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][640/1251] eta 0:22:32 lr 0.000492 time 1.8593 (2.2133) loss 3.6757 (3.5246) grad_norm 1.5983 (1.5258) [2022-01-22 11:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][650/1251] eta 0:22:08 lr 0.000492 time 1.5949 (2.2110) loss 2.9816 (3.5239) grad_norm 1.3011 (1.5242) [2022-01-22 11:51:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][660/1251] eta 0:21:46 lr 0.000492 time 1.9227 (2.2102) loss 3.4295 (3.5221) grad_norm 1.3773 (1.5239) [2022-01-22 11:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][670/1251] eta 0:21:25 lr 0.000492 time 2.4363 (2.2124) loss 4.0509 (3.5240) grad_norm 1.3502 (1.5226) [2022-01-22 11:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][680/1251] eta 0:21:02 lr 0.000492 time 2.2769 (2.2109) loss 2.4472 (3.5272) grad_norm 1.7814 (1.5224) [2022-01-22 11:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][690/1251] eta 0:20:39 lr 0.000492 time 1.8818 (2.2094) loss 3.0260 (3.5265) grad_norm 1.4729 (1.5216) [2022-01-22 11:52:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][700/1251] eta 0:20:17 lr 0.000492 time 1.6468 (2.2092) loss 2.9933 (3.5281) grad_norm 1.5109 (1.5224) [2022-01-22 11:53:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][710/1251] eta 0:19:57 lr 0.000492 time 3.6177 (2.2132) loss 4.2490 (3.5281) grad_norm 1.3600 (1.5214) [2022-01-22 11:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][720/1251] eta 0:19:34 lr 0.000492 time 2.0297 (2.2115) loss 4.1412 (3.5274) grad_norm 1.8229 (1.5212) [2022-01-22 11:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][730/1251] eta 0:19:10 lr 0.000492 time 2.0725 (2.2087) loss 3.2849 (3.5279) grad_norm 1.7860 (1.5227) [2022-01-22 11:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][740/1251] eta 0:18:47 lr 0.000492 time 1.8310 (2.2061) loss 2.9604 (3.5270) grad_norm 1.8497 (1.5266) [2022-01-22 11:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][750/1251] eta 0:18:24 lr 0.000492 time 2.3408 (2.2049) loss 3.4368 (3.5243) grad_norm 1.4844 (1.5273) [2022-01-22 11:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][760/1251] eta 0:18:01 lr 0.000491 time 1.6109 (2.2029) loss 3.4641 (3.5252) grad_norm 1.3221 (1.5268) [2022-01-22 11:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][770/1251] eta 0:17:39 lr 0.000491 time 1.9510 (2.2024) loss 2.9669 (3.5267) grad_norm 1.3399 (1.5265) [2022-01-22 11:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][780/1251] eta 0:17:17 lr 0.000491 time 1.6822 (2.2032) loss 2.3855 (3.5290) grad_norm 1.5158 (1.5268) [2022-01-22 11:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][790/1251] eta 0:16:56 lr 0.000491 time 3.1390 (2.2041) loss 3.6870 (3.5247) grad_norm 1.3310 (1.5270) [2022-01-22 11:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][800/1251] eta 0:16:34 lr 0.000491 time 1.5590 (2.2047) loss 3.6109 (3.5224) grad_norm 1.5917 (1.5268) [2022-01-22 11:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][810/1251] eta 0:16:12 lr 
0.000491 time 1.7045 (2.2062) loss 3.0348 (3.5234) grad_norm 1.4686 (1.5274) [2022-01-22 11:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][820/1251] eta 0:15:50 lr 0.000491 time 2.0859 (2.2062) loss 3.8020 (3.5261) grad_norm 1.4700 (1.5272) [2022-01-22 11:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][830/1251] eta 0:15:29 lr 0.000491 time 3.3448 (2.2073) loss 3.6695 (3.5252) grad_norm 1.5176 (1.5264) [2022-01-22 11:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][840/1251] eta 0:15:06 lr 0.000491 time 2.2421 (2.2060) loss 3.9478 (3.5298) grad_norm 1.3510 (1.5255) [2022-01-22 11:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][850/1251] eta 0:14:44 lr 0.000491 time 1.5615 (2.2053) loss 2.8928 (3.5299) grad_norm 1.4528 (1.5248) [2022-01-22 11:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][860/1251] eta 0:14:21 lr 0.000491 time 1.9259 (2.2043) loss 3.5887 (3.5331) grad_norm 1.5246 (1.5259) [2022-01-22 11:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][870/1251] eta 0:14:00 lr 0.000491 time 3.6347 (2.2066) loss 3.5640 (3.5336) grad_norm 1.4325 (1.5252) [2022-01-22 11:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][880/1251] eta 0:13:38 lr 0.000491 time 2.7451 (2.2069) loss 4.2609 (3.5366) grad_norm 1.6371 (1.5263) [2022-01-22 11:59:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][890/1251] eta 0:13:16 lr 0.000491 time 1.5527 (2.2063) loss 3.8293 (3.5389) grad_norm 1.6884 (1.5278) [2022-01-22 12:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][900/1251] eta 0:12:53 lr 0.000491 time 1.9032 (2.2049) loss 3.6443 (3.5389) grad_norm 1.2923 (1.5283) [2022-01-22 12:00:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][910/1251] eta 0:12:31 lr 0.000491 time 3.1086 (2.2047) loss 4.5296 (3.5378) grad_norm 1.4279 (1.5284) [2022-01-22 12:00:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][920/1251] eta 0:12:09 lr 0.000491 time 2.7517 (2.2045) loss 4.0905 (3.5394) grad_norm 1.7787 (1.5292) [2022-01-22 12:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][930/1251] eta 0:11:47 lr 0.000491 time 2.3267 (2.2030) loss 2.7158 (3.5414) grad_norm 1.5731 (1.5286) [2022-01-22 12:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][940/1251] eta 0:11:25 lr 0.000491 time 1.9469 (2.2028) loss 3.9188 (3.5391) grad_norm 1.6354 (1.5290) [2022-01-22 12:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][950/1251] eta 0:11:03 lr 0.000491 time 2.4717 (2.2037) loss 3.2479 (3.5397) grad_norm 1.5239 (1.5289) [2022-01-22 12:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][960/1251] eta 0:10:41 lr 0.000491 time 2.5135 (2.2052) loss 4.0951 (3.5392) grad_norm 1.3319 (1.5289) [2022-01-22 12:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][970/1251] eta 0:10:19 lr 0.000491 time 1.7680 (2.2033) loss 2.5296 (3.5386) grad_norm 1.5604 (1.5293) [2022-01-22 12:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][980/1251] eta 0:09:56 lr 0.000491 time 1.9674 (2.2015) loss 3.4326 (3.5397) grad_norm 1.3441 (1.5296) [2022-01-22 12:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][990/1251] eta 0:09:34 lr 0.000491 time 3.0835 (2.2006) loss 2.8813 (3.5398) grad_norm 1.4876 (1.5301) [2022-01-22 12:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1000/1251] eta 0:09:12 lr 0.000490 time 2.5294 (2.2013) loss 3.5899 (3.5389) grad_norm 1.4375 (1.5301) [2022-01-22 12:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1010/1251] eta 0:08:50 lr 0.000490 time 2.2798 (2.2010) loss 4.1969 (3.5384) grad_norm 1.4506 (1.5300) [2022-01-22 12:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1020/1251] eta 0:08:28 lr 
0.000490 time 1.8314 (2.2015) loss 3.0893 (3.5347) grad_norm 1.5580 (1.5301) [2022-01-22 12:04:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1030/1251] eta 0:08:07 lr 0.000490 time 3.0483 (2.2041) loss 2.9734 (3.5329) grad_norm 1.5184 (1.5298) [2022-01-22 12:05:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1040/1251] eta 0:07:44 lr 0.000490 time 1.6899 (2.2018) loss 3.9297 (3.5328) grad_norm 1.8821 (1.5313) [2022-01-22 12:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1050/1251] eta 0:07:22 lr 0.000490 time 1.6071 (2.1999) loss 3.8627 (3.5306) grad_norm 1.5363 (1.5322) [2022-01-22 12:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1060/1251] eta 0:07:00 lr 0.000490 time 2.2233 (2.1993) loss 3.1324 (3.5313) grad_norm 1.6431 (1.5333) [2022-01-22 12:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1070/1251] eta 0:06:38 lr 0.000490 time 2.5682 (2.1999) loss 3.5184 (3.5326) grad_norm 1.4680 (1.5330) [2022-01-22 12:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1080/1251] eta 0:06:16 lr 0.000490 time 1.9453 (2.1989) loss 3.9710 (3.5357) grad_norm 1.5866 (1.5332) [2022-01-22 12:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1090/1251] eta 0:05:54 lr 0.000490 time 3.1432 (2.1988) loss 3.2556 (3.5351) grad_norm 1.6208 (1.5331) [2022-01-22 12:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1100/1251] eta 0:05:31 lr 0.000490 time 2.1369 (2.1981) loss 4.0022 (3.5351) grad_norm 1.3733 (1.5325) [2022-01-22 12:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1110/1251] eta 0:05:09 lr 0.000490 time 2.1219 (2.1966) loss 3.8486 (3.5367) grad_norm 1.6628 (1.5319) [2022-01-22 12:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1120/1251] eta 0:04:47 lr 0.000490 time 2.4644 (2.1971) loss 3.3440 (3.5380) grad_norm 1.6532 (1.5311) [2022-01-22 
12:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1130/1251] eta 0:04:26 lr 0.000490 time 2.4691 (2.1988) loss 3.2091 (3.5389) grad_norm 1.3141 (1.5312) [2022-01-22 12:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1140/1251] eta 0:04:03 lr 0.000490 time 2.3980 (2.1981) loss 3.0947 (3.5364) grad_norm 1.4073 (1.5305) [2022-01-22 12:09:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1150/1251] eta 0:03:42 lr 0.000490 time 2.2935 (2.1987) loss 3.4673 (3.5367) grad_norm 1.5891 (1.5302) [2022-01-22 12:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1160/1251] eta 0:03:20 lr 0.000490 time 1.9360 (2.1987) loss 4.2799 (3.5381) grad_norm 2.1374 (1.5314) [2022-01-22 12:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1170/1251] eta 0:02:58 lr 0.000490 time 1.6920 (2.1981) loss 3.8121 (3.5399) grad_norm 1.3818 (1.5321) [2022-01-22 12:10:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1180/1251] eta 0:02:36 lr 0.000490 time 1.5755 (2.1974) loss 3.7656 (3.5410) grad_norm 1.4617 (1.5325) [2022-01-22 12:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1190/1251] eta 0:02:13 lr 0.000490 time 1.9604 (2.1967) loss 2.8568 (3.5398) grad_norm 1.4500 (1.5331) [2022-01-22 12:11:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1200/1251] eta 0:01:52 lr 0.000490 time 2.1243 (2.1976) loss 4.2155 (3.5410) grad_norm 1.6058 (1.5331) [2022-01-22 12:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1210/1251] eta 0:01:30 lr 0.000490 time 2.2915 (2.1986) loss 3.7112 (3.5420) grad_norm 1.4881 (1.5326) [2022-01-22 12:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1220/1251] eta 0:01:08 lr 0.000490 time 1.9756 (2.1992) loss 4.2091 (3.5425) grad_norm 1.4570 (1.5323) [2022-01-22 12:12:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1230/1251] 
eta 0:00:46 lr 0.000490 time 1.8625 (2.1989) loss 2.7227 (3.5419) grad_norm 1.3504 (1.5316) [2022-01-22 12:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1240/1251] eta 0:00:24 lr 0.000489 time 1.5708 (2.1966) loss 3.1255 (3.5427) grad_norm 1.5730 (1.5308) [2022-01-22 12:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [152/300][1250/1251] eta 0:00:02 lr 0.000489 time 1.2111 (2.1899) loss 3.1257 (3.5386) grad_norm 1.3709 (1.5301) [2022-01-22 12:12:43 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 152 training takes 0:45:39 [2022-01-22 12:13:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.382 (18.382) Loss 1.0430 (1.0430) Acc@1 74.219 (74.219) Acc@5 93.457 (93.457) [2022-01-22 12:13:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.107 (3.445) Loss 1.0103 (1.0235) Acc@1 76.855 (75.728) Acc@5 92.676 (93.093) [2022-01-22 12:13:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.569 (2.638) Loss 0.9303 (1.0217) Acc@1 77.344 (75.698) Acc@5 93.848 (93.197) [2022-01-22 12:13:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.603 (2.291) Loss 0.9691 (1.0162) Acc@1 77.051 (75.898) Acc@5 93.945 (93.252) [2022-01-22 12:14:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.261 (2.153) Loss 0.9611 (1.0111) Acc@1 79.199 (76.129) Acc@5 93.457 (93.317) [2022-01-22 12:14:19 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.178 Acc@5 93.272 [2022-01-22 12:14:19 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-01-22 12:14:19 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.30% [2022-01-22 12:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][0/1251] eta 7:41:05 lr 0.000489 time 22.1145 (22.1145) loss 3.3322 (3.3322) grad_norm 1.4981 (1.4981) [2022-01-22 12:15:05 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [153/300][10/1251] eta 1:26:30 lr 0.000489 time 2.0180 (4.1822) loss 3.4864 (3.0185) grad_norm 1.4389 (1.4853) [2022-01-22 12:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][20/1251] eta 1:04:07 lr 0.000489 time 1.2247 (3.1253) loss 3.1512 (3.1401) grad_norm 1.4107 (1.5368) [2022-01-22 12:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][30/1251] eta 0:56:57 lr 0.000489 time 1.9255 (2.7987) loss 2.4099 (3.2206) grad_norm 1.4210 (1.5446) [2022-01-22 12:16:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][40/1251] eta 0:54:40 lr 0.000489 time 4.9678 (2.7090) loss 2.7390 (3.3049) grad_norm 1.2506 (1.5241) [2022-01-22 12:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][50/1251] eta 0:52:57 lr 0.000489 time 2.6069 (2.6459) loss 3.9116 (3.3189) grad_norm 1.5276 (1.5244) [2022-01-22 12:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][60/1251] eta 0:51:24 lr 0.000489 time 1.3910 (2.5899) loss 3.7041 (3.3632) grad_norm 1.4342 (1.5112) [2022-01-22 12:17:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][70/1251] eta 0:50:23 lr 0.000489 time 3.0365 (2.5600) loss 3.4781 (3.3863) grad_norm 1.5324 (1.5233) [2022-01-22 12:17:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][80/1251] eta 0:49:34 lr 0.000489 time 3.6612 (2.5402) loss 3.3979 (3.4091) grad_norm 1.3756 (1.5257) [2022-01-22 12:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][90/1251] eta 0:48:19 lr 0.000489 time 1.9766 (2.4978) loss 3.6701 (3.4167) grad_norm 2.0622 (1.5412) [2022-01-22 12:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][100/1251] eta 0:47:04 lr 0.000489 time 1.8872 (2.4542) loss 3.1126 (3.4141) grad_norm 1.5785 (1.5425) [2022-01-22 12:18:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][110/1251] eta 0:46:13 lr 0.000489 time 3.3228 (2.4303) loss 3.5277 (3.4343) grad_norm 
1.6426 (1.5498) [2022-01-22 12:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][120/1251] eta 0:45:16 lr 0.000489 time 2.6231 (2.4018) loss 3.9783 (3.4419) grad_norm 1.7224 (1.5533) [2022-01-22 12:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][130/1251] eta 0:44:17 lr 0.000489 time 1.8712 (2.3707) loss 3.4930 (3.4290) grad_norm 1.5336 (1.5459) [2022-01-22 12:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][140/1251] eta 0:43:36 lr 0.000489 time 1.7923 (2.3554) loss 3.7043 (3.4354) grad_norm 1.5000 (1.5429) [2022-01-22 12:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][150/1251] eta 0:43:09 lr 0.000489 time 3.3844 (2.3523) loss 3.7778 (3.4620) grad_norm 1.3914 (1.5391) [2022-01-22 12:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][160/1251] eta 0:42:45 lr 0.000489 time 2.6050 (2.3519) loss 4.1280 (3.4594) grad_norm 1.4945 (1.5386) [2022-01-22 12:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][170/1251] eta 0:42:07 lr 0.000489 time 1.7977 (2.3378) loss 3.0238 (3.4554) grad_norm 1.4832 (1.5346) [2022-01-22 12:21:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][180/1251] eta 0:41:30 lr 0.000489 time 2.2617 (2.3258) loss 4.1127 (3.4717) grad_norm 1.3879 (1.5309) [2022-01-22 12:21:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][190/1251] eta 0:40:58 lr 0.000489 time 2.7511 (2.3173) loss 2.8011 (3.4769) grad_norm 1.4598 (1.5263) [2022-01-22 12:22:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][200/1251] eta 0:40:27 lr 0.000489 time 2.5566 (2.3094) loss 3.1031 (3.4639) grad_norm 1.7863 (1.5248) [2022-01-22 12:22:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][210/1251] eta 0:39:59 lr 0.000489 time 2.1790 (2.3048) loss 3.7468 (3.4562) grad_norm 1.5192 (1.5289) [2022-01-22 12:22:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[153/300][220/1251] eta 0:39:28 lr 0.000489 time 2.0335 (2.2976) loss 3.8073 (3.4699) grad_norm 1.7670 (1.5305) [2022-01-22 12:23:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][230/1251] eta 0:39:00 lr 0.000488 time 2.5905 (2.2926) loss 2.6135 (3.4741) grad_norm 1.6122 (1.5271) [2022-01-22 12:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][240/1251] eta 0:38:25 lr 0.000488 time 1.9345 (2.2803) loss 3.0922 (3.4844) grad_norm 1.4990 (1.5264) [2022-01-22 12:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][250/1251] eta 0:37:50 lr 0.000488 time 1.4798 (2.2686) loss 3.4389 (3.4880) grad_norm 1.3326 (1.5268) [2022-01-22 12:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][260/1251] eta 0:37:26 lr 0.000488 time 2.2424 (2.2668) loss 3.6446 (3.4995) grad_norm 1.5407 (1.5276) [2022-01-22 12:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][270/1251] eta 0:37:05 lr 0.000488 time 2.7817 (2.2682) loss 4.0787 (3.5008) grad_norm 1.4038 (1.5274) [2022-01-22 12:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][280/1251] eta 0:36:43 lr 0.000488 time 1.9436 (2.2694) loss 3.2476 (3.5074) grad_norm 1.4936 (1.5275) [2022-01-22 12:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][290/1251] eta 0:36:18 lr 0.000488 time 2.1764 (2.2672) loss 2.3867 (3.4949) grad_norm 1.5304 (1.5286) [2022-01-22 12:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][300/1251] eta 0:35:51 lr 0.000488 time 1.6860 (2.2620) loss 3.7604 (3.5012) grad_norm 1.3999 (1.5278) [2022-01-22 12:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][310/1251] eta 0:35:29 lr 0.000488 time 4.0470 (2.2627) loss 2.7580 (3.4997) grad_norm 1.6697 (1.5268) [2022-01-22 12:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][320/1251] eta 0:35:03 lr 0.000488 time 2.0195 (2.2599) loss 3.9739 (3.4987) grad_norm 
1.6071 (1.5272) [2022-01-22 12:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][330/1251] eta 0:34:40 lr 0.000488 time 2.1666 (2.2585) loss 3.7922 (3.5016) grad_norm 1.4860 (1.5293) [2022-01-22 12:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][340/1251] eta 0:34:15 lr 0.000488 time 1.9008 (2.2558) loss 4.6487 (3.5042) grad_norm 1.4023 (1.5308) [2022-01-22 12:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][350/1251] eta 0:33:52 lr 0.000488 time 3.8973 (2.2554) loss 3.9179 (3.5006) grad_norm 1.6131 (1.5312) [2022-01-22 12:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][360/1251] eta 0:33:26 lr 0.000488 time 2.3984 (2.2519) loss 3.6088 (3.4993) grad_norm 1.4791 (1.5319) [2022-01-22 12:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][370/1251] eta 0:33:02 lr 0.000488 time 2.1642 (2.2500) loss 3.8640 (3.5002) grad_norm 1.5760 (1.5320) [2022-01-22 12:28:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][380/1251] eta 0:32:37 lr 0.000488 time 2.2163 (2.2473) loss 3.4961 (3.4987) grad_norm 1.3716 (1.5289) [2022-01-22 12:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][390/1251] eta 0:32:15 lr 0.000488 time 2.7216 (2.2476) loss 4.3142 (3.5018) grad_norm 1.7118 (1.5305) [2022-01-22 12:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][400/1251] eta 0:31:54 lr 0.000488 time 3.0152 (2.2496) loss 3.4471 (3.5063) grad_norm 1.4130 (1.5313) [2022-01-22 12:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][410/1251] eta 0:31:28 lr 0.000488 time 1.6148 (2.2460) loss 2.8736 (3.5090) grad_norm 1.4907 (1.5295) [2022-01-22 12:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][420/1251] eta 0:31:02 lr 0.000488 time 1.9411 (2.2416) loss 4.0367 (3.5115) grad_norm 1.5746 (1.5285) [2022-01-22 12:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[153/300][430/1251] eta 0:30:35 lr 0.000488 time 1.8831 (2.2352) loss 4.2606 (3.5081) grad_norm 1.6223 (1.5304) [2022-01-22 12:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][440/1251] eta 0:30:09 lr 0.000488 time 2.5542 (2.2309) loss 3.2810 (3.5057) grad_norm 1.5269 (1.5336) [2022-01-22 12:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][450/1251] eta 0:29:43 lr 0.000488 time 1.9352 (2.2267) loss 4.0353 (3.5101) grad_norm 1.5306 (1.5336) [2022-01-22 12:31:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][460/1251] eta 0:29:20 lr 0.000488 time 2.6693 (2.2259) loss 3.5295 (3.5103) grad_norm 1.5398 (1.5343) [2022-01-22 12:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][470/1251] eta 0:28:58 lr 0.000488 time 2.4400 (2.2265) loss 3.2136 (3.5113) grad_norm 1.5440 (1.5326) [2022-01-22 12:32:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][480/1251] eta 0:28:38 lr 0.000487 time 2.1080 (2.2285) loss 3.9194 (3.5082) grad_norm 1.6954 (1.5339) [2022-01-22 12:32:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][490/1251] eta 0:28:15 lr 0.000487 time 2.4530 (2.2281) loss 3.6385 (3.5089) grad_norm 1.3117 (1.5332) [2022-01-22 12:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][500/1251] eta 0:27:52 lr 0.000487 time 2.2266 (2.2274) loss 3.5521 (3.5040) grad_norm 1.2706 (1.5324) [2022-01-22 12:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][510/1251] eta 0:27:29 lr 0.000487 time 1.7868 (2.2258) loss 3.9126 (3.5038) grad_norm 1.3641 (1.5318) [2022-01-22 12:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][520/1251] eta 0:27:08 lr 0.000487 time 2.4363 (2.2282) loss 3.9139 (3.5039) grad_norm 1.3817 (1.5303) [2022-01-22 12:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][530/1251] eta 0:26:46 lr 0.000487 time 2.7409 (2.2276) loss 2.2972 (3.5023) grad_norm 
1.5717 (1.5307) [2022-01-22 12:34:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][540/1251] eta 0:26:23 lr 0.000487 time 2.5033 (2.2274) loss 2.6584 (3.5024) grad_norm 1.4227 (1.5356) [2022-01-22 12:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][550/1251] eta 0:26:02 lr 0.000487 time 1.6577 (2.2283) loss 3.8109 (3.5036) grad_norm 1.5020 (1.5356) [2022-01-22 12:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][560/1251] eta 0:25:39 lr 0.000487 time 2.2351 (2.2277) loss 4.3299 (3.5049) grad_norm 1.6053 (1.5379) [2022-01-22 12:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][570/1251] eta 0:25:15 lr 0.000487 time 2.2924 (2.2247) loss 4.0485 (3.5111) grad_norm 1.8392 (1.5396) [2022-01-22 12:35:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][580/1251] eta 0:24:52 lr 0.000487 time 1.9033 (2.2245) loss 2.1367 (3.5044) grad_norm 1.3946 (1.5398) [2022-01-22 12:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][590/1251] eta 0:24:28 lr 0.000487 time 1.7632 (2.2222) loss 2.8841 (3.5046) grad_norm 1.5428 (1.5384) [2022-01-22 12:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][600/1251] eta 0:24:04 lr 0.000487 time 2.3660 (2.2191) loss 3.0381 (3.5107) grad_norm 1.4225 (1.5386) [2022-01-22 12:36:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][610/1251] eta 0:23:40 lr 0.000487 time 2.2258 (2.2161) loss 2.5678 (3.5107) grad_norm 1.3811 (1.5370) [2022-01-22 12:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][620/1251] eta 0:23:18 lr 0.000487 time 2.9574 (2.2161) loss 3.5762 (3.5131) grad_norm 1.4905 (1.5362) [2022-01-22 12:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][630/1251] eta 0:22:54 lr 0.000487 time 2.1862 (2.2137) loss 4.0341 (3.5137) grad_norm 1.6174 (1.5340) [2022-01-22 12:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[153/300][640/1251] eta 0:22:33 lr 0.000487 time 2.3354 (2.2160) loss 2.9911 (3.5148) grad_norm 1.4962 (1.5336) [2022-01-22 12:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][650/1251] eta 0:22:12 lr 0.000487 time 2.6730 (2.2165) loss 3.5706 (3.5131) grad_norm 1.3158 (1.5312) [2022-01-22 12:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][660/1251] eta 0:21:49 lr 0.000487 time 2.1594 (2.2165) loss 3.1953 (3.5147) grad_norm 1.6202 (1.5320) [2022-01-22 12:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][670/1251] eta 0:21:28 lr 0.000487 time 2.8753 (2.2169) loss 3.1066 (3.5066) grad_norm 1.4786 (1.5317) [2022-01-22 12:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][680/1251] eta 0:21:04 lr 0.000487 time 2.4738 (2.2145) loss 3.5122 (3.5072) grad_norm 1.6802 (1.5326) [2022-01-22 12:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][690/1251] eta 0:20:41 lr 0.000487 time 2.2907 (2.2129) loss 2.6411 (3.5050) grad_norm 1.3931 (1.5325) [2022-01-22 12:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][700/1251] eta 0:20:18 lr 0.000487 time 1.5055 (2.2122) loss 3.6667 (3.5048) grad_norm 1.6032 (1.5319) [2022-01-22 12:40:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][710/1251] eta 0:19:56 lr 0.000487 time 2.3044 (2.2116) loss 3.1494 (3.5019) grad_norm 1.5144 (1.5337) [2022-01-22 12:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][720/1251] eta 0:19:33 lr 0.000486 time 2.5153 (2.2108) loss 3.2445 (3.5038) grad_norm 1.6617 (1.5340) [2022-01-22 12:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][730/1251] eta 0:19:12 lr 0.000486 time 2.7039 (2.2120) loss 3.3491 (3.5064) grad_norm 1.3638 (1.5338) [2022-01-22 12:41:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][740/1251] eta 0:18:50 lr 0.000486 time 2.2034 (2.2118) loss 3.5458 (3.5089) grad_norm 
1.5052 (1.5331) [2022-01-22 12:41:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][750/1251] eta 0:18:27 lr 0.000486 time 1.7246 (2.2105) loss 3.6221 (3.5097) grad_norm 1.4036 (1.5332) [2022-01-22 12:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][760/1251] eta 0:18:05 lr 0.000486 time 2.2142 (2.2103) loss 3.9444 (3.5118) grad_norm 1.5155 (1.5321) [2022-01-22 12:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][770/1251] eta 0:17:43 lr 0.000486 time 2.1562 (2.2111) loss 3.5331 (3.5114) grad_norm 1.3197 (1.5317) [2022-01-22 12:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][780/1251] eta 0:17:20 lr 0.000486 time 1.8504 (2.2099) loss 3.7581 (3.5119) grad_norm 1.5737 (1.5323) [2022-01-22 12:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][790/1251] eta 0:16:57 lr 0.000486 time 1.7122 (2.2074) loss 3.6236 (3.5094) grad_norm 1.6076 (1.5329) [2022-01-22 12:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][800/1251] eta 0:16:34 lr 0.000486 time 1.7890 (2.2048) loss 3.7121 (3.5098) grad_norm 1.4284 (1.5318) [2022-01-22 12:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][810/1251] eta 0:16:12 lr 0.000486 time 1.8850 (2.2043) loss 3.6249 (3.5094) grad_norm 1.5143 (1.5329) [2022-01-22 12:44:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][820/1251] eta 0:15:50 lr 0.000486 time 2.1704 (2.2051) loss 3.6846 (3.5104) grad_norm 1.6791 (1.5326) [2022-01-22 12:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][830/1251] eta 0:15:28 lr 0.000486 time 1.8942 (2.2064) loss 3.0044 (3.5098) grad_norm 1.4242 (1.5318) [2022-01-22 12:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][840/1251] eta 0:15:08 lr 0.000486 time 1.9269 (2.2097) loss 3.5452 (3.5104) grad_norm 1.4824 (1.5321) [2022-01-22 12:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[153/300][850/1251] eta 0:14:46 lr 0.000486 time 2.7396 (2.2118) loss 3.5282 (3.5115) grad_norm 1.5158 (1.5323) [2022-01-22 12:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][860/1251] eta 0:14:25 lr 0.000486 time 2.1511 (2.2129) loss 3.7794 (3.5065) grad_norm 1.5112 (1.5315) [2022-01-22 12:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][870/1251] eta 0:14:02 lr 0.000486 time 1.9243 (2.2102) loss 3.8686 (3.5091) grad_norm 1.4986 (1.5310) [2022-01-22 12:46:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][880/1251] eta 0:13:38 lr 0.000486 time 1.8791 (2.2070) loss 2.9119 (3.5051) grad_norm 1.5447 (1.5317) [2022-01-22 12:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][890/1251] eta 0:13:15 lr 0.000486 time 1.9330 (2.2042) loss 2.6275 (3.5020) grad_norm 1.3221 (1.5314) [2022-01-22 12:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][900/1251] eta 0:12:52 lr 0.000486 time 2.0943 (2.2016) loss 3.7486 (3.5043) grad_norm 1.4460 (1.5323) [2022-01-22 12:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][910/1251] eta 0:12:30 lr 0.000486 time 1.8731 (2.2016) loss 3.6209 (3.5045) grad_norm 1.3196 (1.5323) [2022-01-22 12:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][920/1251] eta 0:12:08 lr 0.000486 time 2.1881 (2.2008) loss 4.2105 (3.5055) grad_norm 1.4995 (1.5334) [2022-01-22 12:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][930/1251] eta 0:11:46 lr 0.000486 time 2.5061 (2.2021) loss 3.7476 (3.5061) grad_norm 1.5351 (1.5340) [2022-01-22 12:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][940/1251] eta 0:11:25 lr 0.000486 time 2.9014 (2.2035) loss 3.5060 (3.5085) grad_norm 1.5680 (1.5344) [2022-01-22 12:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][950/1251] eta 0:11:03 lr 0.000486 time 2.1964 (2.2042) loss 3.9232 (3.5097) grad_norm 
1.4699 (1.5348) [2022-01-22 12:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][960/1251] eta 0:10:41 lr 0.000485 time 2.0841 (2.2058) loss 3.1285 (3.5088) grad_norm 1.6204 (1.5353) [2022-01-22 12:50:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][970/1251] eta 0:10:19 lr 0.000485 time 2.2385 (2.2062) loss 2.4628 (3.5093) grad_norm 1.2980 (1.5362) [2022-01-22 12:50:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][980/1251] eta 0:09:57 lr 0.000485 time 2.8411 (2.2055) loss 3.3532 (3.5114) grad_norm 1.8607 (1.5363) [2022-01-22 12:50:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][990/1251] eta 0:09:35 lr 0.000485 time 1.8899 (2.2044) loss 3.1810 (3.5108) grad_norm 2.2251 (1.5377) [2022-01-22 12:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1000/1251] eta 0:09:12 lr 0.000485 time 1.8825 (2.2023) loss 3.4507 (3.5097) grad_norm 1.4992 (1.5384) [2022-01-22 12:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1010/1251] eta 0:08:50 lr 0.000485 time 1.7293 (2.2006) loss 2.9841 (3.5090) grad_norm 1.5931 (1.5384) [2022-01-22 12:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1020/1251] eta 0:08:27 lr 0.000485 time 1.6860 (2.1985) loss 3.2502 (3.5102) grad_norm 1.4880 (1.5378) [2022-01-22 12:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1030/1251] eta 0:08:06 lr 0.000485 time 2.2426 (2.1992) loss 4.2418 (3.5121) grad_norm 1.4302 (1.5366) [2022-01-22 12:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1040/1251] eta 0:07:43 lr 0.000485 time 1.6592 (2.1988) loss 2.9392 (3.5116) grad_norm 1.5403 (1.5357) [2022-01-22 12:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1050/1251] eta 0:07:22 lr 0.000485 time 1.8896 (2.1992) loss 4.1550 (3.5139) grad_norm 1.7170 (1.5355) [2022-01-22 12:53:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[153/300][1060/1251] eta 0:07:00 lr 0.000485 time 1.8168 (2.2002) loss 3.5169 (3.5147) grad_norm 1.5434 (1.5352) [2022-01-22 12:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1070/1251] eta 0:06:38 lr 0.000485 time 2.1989 (2.2020) loss 3.8567 (3.5155) grad_norm 1.5352 (1.5346) [2022-01-22 12:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1080/1251] eta 0:06:16 lr 0.000485 time 1.6589 (2.2025) loss 2.5409 (3.5113) grad_norm 1.3733 (1.5348) [2022-01-22 12:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1090/1251] eta 0:05:54 lr 0.000485 time 2.4373 (2.2031) loss 3.7914 (3.5103) grad_norm 1.5713 (1.5346) [2022-01-22 12:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1100/1251] eta 0:05:32 lr 0.000485 time 1.6349 (2.2026) loss 3.8931 (3.5101) grad_norm 1.3520 (1.5342) [2022-01-22 12:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1110/1251] eta 0:05:10 lr 0.000485 time 1.9253 (2.2008) loss 3.4760 (3.5094) grad_norm 1.8064 (1.5338) [2022-01-22 12:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1120/1251] eta 0:04:48 lr 0.000485 time 1.8864 (2.1986) loss 2.5839 (3.5098) grad_norm 1.4853 (1.5333) [2022-01-22 12:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1130/1251] eta 0:04:25 lr 0.000485 time 2.3374 (2.1975) loss 3.0553 (3.5099) grad_norm 1.4872 (1.5338) [2022-01-22 12:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1140/1251] eta 0:04:03 lr 0.000485 time 2.3763 (2.1963) loss 3.4563 (3.5087) grad_norm 1.3927 (1.5329) [2022-01-22 12:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1150/1251] eta 0:03:41 lr 0.000485 time 2.3090 (2.1962) loss 4.0641 (3.5085) grad_norm 1.4760 (1.5324) [2022-01-22 12:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1160/1251] eta 0:03:19 lr 0.000485 time 2.8142 (2.1976) loss 3.3684 (3.5055) 
grad_norm 1.2926 (1.5313) [2022-01-22 12:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1170/1251] eta 0:02:58 lr 0.000485 time 2.7661 (2.1998) loss 2.3905 (3.5042) grad_norm 1.6150 (1.5314) [2022-01-22 12:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1180/1251] eta 0:02:36 lr 0.000485 time 2.5182 (2.2025) loss 3.2639 (3.5037) grad_norm 1.3402 (1.5314) [2022-01-22 12:58:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1190/1251] eta 0:02:14 lr 0.000485 time 2.0936 (2.2022) loss 3.8509 (3.5024) grad_norm 1.5384 (1.5314) [2022-01-22 12:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1200/1251] eta 0:01:52 lr 0.000484 time 1.9673 (2.2008) loss 3.1953 (3.5014) grad_norm 1.4262 (1.5309) [2022-01-22 12:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1210/1251] eta 0:01:30 lr 0.000484 time 1.6587 (2.1985) loss 3.5861 (3.5017) grad_norm 1.4602 (1.5308) [2022-01-22 12:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1220/1251] eta 0:01:08 lr 0.000484 time 1.9136 (2.1973) loss 3.4158 (3.5034) grad_norm 1.6348 (1.5313) [2022-01-22 12:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1230/1251] eta 0:00:46 lr 0.000484 time 2.6073 (2.1974) loss 4.1845 (3.5028) grad_norm 1.3972 (1.5313) [2022-01-22 12:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1240/1251] eta 0:00:24 lr 0.000484 time 2.2370 (2.1965) loss 4.1775 (3.5037) grad_norm 1.5244 (1.5307) [2022-01-22 13:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [153/300][1250/1251] eta 0:00:02 lr 0.000484 time 1.1334 (2.1914) loss 3.4485 (3.5040) grad_norm 1.4252 (1.5306) [2022-01-22 13:00:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 153 training takes 0:45:41 [2022-01-22 13:00:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.435 (18.435) Loss 1.0815 (1.0815) Acc@1 72.949 (72.949) 
Acc@5 92.285 (92.285) [2022-01-22 13:00:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.292 (3.313) Loss 0.9397 (1.0146) Acc@1 77.734 (75.648) Acc@5 93.262 (93.306) [2022-01-22 13:00:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.521 (2.537) Loss 0.9484 (1.0040) Acc@1 76.953 (76.190) Acc@5 93.848 (93.406) [2022-01-22 13:01:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.615 (2.257) Loss 1.0388 (0.9991) Acc@1 75.586 (76.307) Acc@5 92.480 (93.426) [2022-01-22 13:01:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 7.159 (2.149) Loss 0.9575 (1.0025) Acc@1 78.027 (76.279) Acc@5 93.262 (93.333) [2022-01-22 13:01:36 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.292 Acc@5 93.390 [2022-01-22 13:01:36 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-01-22 13:01:36 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.30% [2022-01-22 13:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][0/1251] eta 7:31:11 lr 0.000484 time 21.6398 (21.6398) loss 3.1654 (3.1654) grad_norm 1.6550 (1.6550) [2022-01-22 13:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][10/1251] eta 1:26:51 lr 0.000484 time 2.7735 (4.1996) loss 2.4444 (3.3810) grad_norm 1.4539 (1.5282) [2022-01-22 13:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][20/1251] eta 1:07:24 lr 0.000484 time 1.8311 (3.2857) loss 4.1783 (3.3958) grad_norm 1.4760 (1.5236) [2022-01-22 13:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][30/1251] eta 0:58:57 lr 0.000484 time 1.6314 (2.8974) loss 4.0698 (3.3453) grad_norm 1.8520 (1.5652) [2022-01-22 13:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][40/1251] eta 0:55:06 lr 0.000484 time 3.0336 (2.7300) loss 3.2832 (3.3490) grad_norm 1.2744 (1.5456) [2022-01-22 13:03:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][50/1251] eta 0:52:23 lr 0.000484 time 1.5021 (2.6176) loss 3.6680 (3.3553) grad_norm 1.7485 (1.5480) [2022-01-22 13:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][60/1251] eta 0:50:45 lr 0.000484 time 2.1759 (2.5568) loss 3.9662 (3.3978) grad_norm 1.3858 (1.5477) [2022-01-22 13:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][70/1251] eta 0:49:35 lr 0.000484 time 1.9511 (2.5197) loss 2.9336 (3.3990) grad_norm 1.6138 (1.5585) [2022-01-22 13:04:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][80/1251] eta 0:48:46 lr 0.000484 time 2.8433 (2.4991) loss 3.9663 (3.4203) grad_norm 1.8905 (1.5580) [2022-01-22 13:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][90/1251] eta 0:48:06 lr 0.000484 time 2.7599 (2.4862) loss 4.0035 (3.4585) grad_norm 1.3630 (1.5518) [2022-01-22 13:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][100/1251] eta 0:47:30 lr 0.000484 time 1.7614 (2.4767) loss 4.0162 (3.4517) grad_norm 1.6515 (1.5504) [2022-01-22 13:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][110/1251] eta 0:46:20 lr 0.000484 time 1.8428 (2.4370) loss 3.6278 (3.4510) grad_norm 1.6213 (1.5538) [2022-01-22 13:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][120/1251] eta 0:45:19 lr 0.000484 time 2.5419 (2.4049) loss 3.1478 (3.4797) grad_norm 1.7733 (1.5511) [2022-01-22 13:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][130/1251] eta 0:44:25 lr 0.000484 time 1.8030 (2.3777) loss 4.1240 (3.4816) grad_norm 1.4029 (1.5464) [2022-01-22 13:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][140/1251] eta 0:43:37 lr 0.000484 time 2.4817 (2.3558) loss 3.8916 (3.4928) grad_norm 1.6847 (1.5383) [2022-01-22 13:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][150/1251] eta 0:42:56 lr 0.000484 
time 2.1899 (2.3400) loss 3.8684 (3.4967) grad_norm 1.6093 (1.5392) [2022-01-22 13:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][160/1251] eta 0:42:27 lr 0.000484 time 2.1820 (2.3354) loss 4.4343 (3.5068) grad_norm 2.0193 (1.5430) [2022-01-22 13:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][170/1251] eta 0:42:02 lr 0.000484 time 1.8128 (2.3338) loss 3.5502 (3.5120) grad_norm 1.5370 (1.5443) [2022-01-22 13:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][180/1251] eta 0:41:36 lr 0.000484 time 2.5425 (2.3312) loss 4.1346 (3.5051) grad_norm 1.4363 (1.5389) [2022-01-22 13:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][190/1251] eta 0:41:07 lr 0.000483 time 2.3606 (2.3261) loss 3.1426 (3.5049) grad_norm 1.5565 (1.5344) [2022-01-22 13:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][200/1251] eta 0:40:35 lr 0.000483 time 2.1777 (2.3173) loss 3.8226 (3.5186) grad_norm 1.4966 (1.5355) [2022-01-22 13:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][210/1251] eta 0:40:01 lr 0.000483 time 1.8062 (2.3071) loss 3.2337 (3.5144) grad_norm 1.8319 (1.5342) [2022-01-22 13:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][220/1251] eta 0:39:27 lr 0.000483 time 1.8792 (2.2962) loss 3.7907 (3.5203) grad_norm 1.4907 (1.5333) [2022-01-22 13:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][230/1251] eta 0:38:52 lr 0.000483 time 2.1310 (2.2846) loss 3.3120 (3.5159) grad_norm 1.8146 (1.5351) [2022-01-22 13:10:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][240/1251] eta 0:38:24 lr 0.000483 time 2.1541 (2.2796) loss 2.8682 (3.5132) grad_norm 1.6672 (1.5389) [2022-01-22 13:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][250/1251] eta 0:37:58 lr 0.000483 time 2.2263 (2.2763) loss 3.7975 (3.5140) grad_norm 1.3663 (1.5358) [2022-01-22 13:11:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][260/1251] eta 0:37:36 lr 0.000483 time 2.3790 (2.2769) loss 2.7177 (3.5091) grad_norm 1.9563 (1.5404) [2022-01-22 13:11:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][270/1251] eta 0:37:09 lr 0.000483 time 1.9145 (2.2722) loss 3.7398 (3.5107) grad_norm 1.2519 (1.5415) [2022-01-22 13:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][280/1251] eta 0:36:38 lr 0.000483 time 1.8315 (2.2638) loss 2.7395 (3.4984) grad_norm 1.7759 (1.5435) [2022-01-22 13:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][290/1251] eta 0:36:11 lr 0.000483 time 1.9347 (2.2599) loss 3.4161 (3.5006) grad_norm 1.6483 (1.5459) [2022-01-22 13:12:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][300/1251] eta 0:35:45 lr 0.000483 time 1.7654 (2.2561) loss 3.9389 (3.5032) grad_norm 1.5484 (1.5478) [2022-01-22 13:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][310/1251] eta 0:35:25 lr 0.000483 time 2.6199 (2.2585) loss 3.5973 (3.4968) grad_norm 1.6202 (1.5473) [2022-01-22 13:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][320/1251] eta 0:35:05 lr 0.000483 time 2.4360 (2.2616) loss 3.4393 (3.4878) grad_norm 1.6495 (1.5447) [2022-01-22 13:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][330/1251] eta 0:34:45 lr 0.000483 time 2.1333 (2.2641) loss 3.7187 (3.4903) grad_norm 1.2618 (1.5420) [2022-01-22 13:14:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][340/1251] eta 0:34:20 lr 0.000483 time 2.1476 (2.2621) loss 3.7640 (3.4943) grad_norm 1.3395 (1.5406) [2022-01-22 13:14:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][350/1251] eta 0:33:55 lr 0.000483 time 1.9764 (2.2594) loss 3.3959 (3.4888) grad_norm 1.4541 (1.5377) [2022-01-22 13:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][360/1251] eta 0:33:24 lr 
0.000483 time 1.7919 (2.2497) loss 3.9708 (3.4852) grad_norm 1.3967 (1.5366) [2022-01-22 13:15:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][370/1251] eta 0:32:56 lr 0.000483 time 2.0079 (2.2438) loss 2.6529 (3.4864) grad_norm 1.4434 (1.5362) [2022-01-22 13:15:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][380/1251] eta 0:32:31 lr 0.000483 time 2.5180 (2.2402) loss 3.8507 (3.4896) grad_norm 1.5544 (1.5365) [2022-01-22 13:16:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][390/1251] eta 0:32:11 lr 0.000483 time 2.4387 (2.2436) loss 3.7186 (3.4947) grad_norm 1.7525 (1.5398) [2022-01-22 13:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][400/1251] eta 0:31:48 lr 0.000483 time 2.2008 (2.2428) loss 3.9183 (3.4955) grad_norm 1.7623 (1.5421) [2022-01-22 13:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][410/1251] eta 0:31:23 lr 0.000483 time 2.4005 (2.2401) loss 3.7271 (3.4939) grad_norm 2.3421 (1.5442) [2022-01-22 13:17:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][420/1251] eta 0:31:02 lr 0.000483 time 2.1497 (2.2412) loss 2.7662 (3.4934) grad_norm 1.3320 (1.5428) [2022-01-22 13:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][430/1251] eta 0:30:41 lr 0.000482 time 1.8865 (2.2425) loss 2.8130 (3.4974) grad_norm 1.6818 (1.5420) [2022-01-22 13:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][440/1251] eta 0:30:17 lr 0.000482 time 2.2131 (2.2410) loss 3.5797 (3.4973) grad_norm 1.7022 (1.5422) [2022-01-22 13:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][450/1251] eta 0:29:53 lr 0.000482 time 2.1229 (2.2391) loss 3.8097 (3.5024) grad_norm 1.6202 (1.5396) [2022-01-22 13:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][460/1251] eta 0:29:27 lr 0.000482 time 2.3019 (2.2345) loss 3.6089 (3.4951) grad_norm 1.4147 (1.5400) [2022-01-22 13:19:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][470/1251] eta 0:29:02 lr 0.000482 time 2.1261 (2.2310) loss 3.7298 (3.4992) grad_norm 1.4407 (1.5371) [2022-01-22 13:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][480/1251] eta 0:28:37 lr 0.000482 time 2.2937 (2.2271) loss 2.5038 (3.4922) grad_norm 1.5313 (1.5360) [2022-01-22 13:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][490/1251] eta 0:28:15 lr 0.000482 time 2.1886 (2.2276) loss 3.0020 (3.4916) grad_norm 1.5303 (1.5359) [2022-01-22 13:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][500/1251] eta 0:27:50 lr 0.000482 time 1.9049 (2.2240) loss 3.4741 (3.4891) grad_norm 1.3502 (1.5346) [2022-01-22 13:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][510/1251] eta 0:27:28 lr 0.000482 time 2.1235 (2.2243) loss 3.5301 (3.4877) grad_norm 1.6971 (1.5379) [2022-01-22 13:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][520/1251] eta 0:27:07 lr 0.000482 time 3.1773 (2.2260) loss 3.6943 (3.4920) grad_norm 1.4786 (1.5375) [2022-01-22 13:21:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][530/1251] eta 0:26:45 lr 0.000482 time 2.0326 (2.2262) loss 4.3353 (3.4942) grad_norm 1.2938 (1.5357) [2022-01-22 13:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][540/1251] eta 0:26:22 lr 0.000482 time 2.1717 (2.2256) loss 3.5942 (3.4927) grad_norm 1.5164 (1.5365) [2022-01-22 13:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][550/1251] eta 0:25:59 lr 0.000482 time 1.8690 (2.2249) loss 3.5366 (3.4943) grad_norm 1.6260 (1.5369) [2022-01-22 13:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][560/1251] eta 0:25:37 lr 0.000482 time 2.5014 (2.2254) loss 4.1449 (3.4973) grad_norm 1.5863 (1.5370) [2022-01-22 13:22:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][570/1251] eta 0:25:15 lr 
0.000482 time 1.6390 (2.2252) loss 3.6195 (3.4917) grad_norm 1.8477 (1.5384) [2022-01-22 13:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][580/1251] eta 0:24:54 lr 0.000482 time 1.5260 (2.2268) loss 2.3435 (3.4949) grad_norm 1.4753 (1.5381) [2022-01-22 13:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][590/1251] eta 0:24:32 lr 0.000482 time 1.6645 (2.2280) loss 3.0133 (3.4975) grad_norm 1.3875 (1.5380) [2022-01-22 13:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][600/1251] eta 0:24:10 lr 0.000482 time 3.5672 (2.2287) loss 2.7675 (3.4943) grad_norm 1.3423 (1.5408) [2022-01-22 13:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][610/1251] eta 0:23:47 lr 0.000482 time 1.8301 (2.2268) loss 3.6445 (3.4914) grad_norm 1.4535 (1.5418) [2022-01-22 13:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][620/1251] eta 0:23:23 lr 0.000482 time 1.8421 (2.2239) loss 3.4572 (3.4961) grad_norm 1.8747 (1.5419) [2022-01-22 13:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][630/1251] eta 0:23:00 lr 0.000482 time 1.9852 (2.2223) loss 2.4110 (3.4976) grad_norm 1.6271 (1.5424) [2022-01-22 13:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][640/1251] eta 0:22:37 lr 0.000482 time 2.9032 (2.2220) loss 3.5817 (3.4995) grad_norm 1.5340 (1.5418) [2022-01-22 13:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][650/1251] eta 0:22:15 lr 0.000482 time 2.5122 (2.2218) loss 3.1908 (3.5031) grad_norm 1.3955 (1.5419) [2022-01-22 13:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][660/1251] eta 0:21:52 lr 0.000482 time 1.9415 (2.2204) loss 3.8743 (3.5001) grad_norm 1.6196 (1.5421) [2022-01-22 13:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][670/1251] eta 0:21:29 lr 0.000481 time 2.1595 (2.2203) loss 2.6590 (3.4968) grad_norm 1.3001 (1.5405) [2022-01-22 13:26:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][680/1251] eta 0:21:06 lr 0.000481 time 2.7822 (2.2182) loss 3.3604 (3.4982) grad_norm 1.5336 (1.5404) [2022-01-22 13:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][690/1251] eta 0:20:45 lr 0.000481 time 2.8176 (2.2199) loss 4.2709 (3.5047) grad_norm 1.9779 (1.5411) [2022-01-22 13:27:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][700/1251] eta 0:20:22 lr 0.000481 time 2.2071 (2.2194) loss 3.7568 (3.5047) grad_norm 1.7279 (1.5419) [2022-01-22 13:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][710/1251] eta 0:20:00 lr 0.000481 time 2.4039 (2.2198) loss 3.9608 (3.5082) grad_norm 1.5057 (1.5442) [2022-01-22 13:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][720/1251] eta 0:19:38 lr 0.000481 time 2.2271 (2.2188) loss 3.7631 (3.5094) grad_norm 1.4701 (1.5443) [2022-01-22 13:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][730/1251] eta 0:19:14 lr 0.000481 time 2.3350 (2.2160) loss 3.6319 (3.5097) grad_norm 1.5301 (1.5455) [2022-01-22 13:28:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][740/1251] eta 0:18:51 lr 0.000481 time 1.8365 (2.2145) loss 3.0104 (3.5049) grad_norm 1.3095 (1.5453) [2022-01-22 13:29:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][750/1251] eta 0:18:28 lr 0.000481 time 1.8898 (2.2134) loss 3.9218 (3.5059) grad_norm 1.4334 (1.5451) [2022-01-22 13:29:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][760/1251] eta 0:18:06 lr 0.000481 time 1.9144 (2.2118) loss 3.4788 (3.5022) grad_norm 1.6127 (1.5453) [2022-01-22 13:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][770/1251] eta 0:17:44 lr 0.000481 time 2.4100 (2.2134) loss 4.1361 (3.5037) grad_norm 1.5401 (1.5449) [2022-01-22 13:30:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][780/1251] eta 0:17:22 lr 
0.000481 time 1.8524 (2.2137) loss 3.4893 (3.5029) grad_norm 1.5476 (1.5445) [2022-01-22 13:30:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][790/1251] eta 0:17:01 lr 0.000481 time 2.0582 (2.2149) loss 3.1483 (3.5010) grad_norm 1.6733 (1.5460) [2022-01-22 13:31:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][800/1251] eta 0:16:38 lr 0.000481 time 2.0763 (2.2149) loss 3.9595 (3.5039) grad_norm 1.4469 (1.5457) [2022-01-22 13:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][810/1251] eta 0:16:15 lr 0.000481 time 2.0946 (2.2121) loss 3.9034 (3.5041) grad_norm 1.7658 (1.5465) [2022-01-22 13:31:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][820/1251] eta 0:15:51 lr 0.000481 time 1.9709 (2.2087) loss 4.2647 (3.5026) grad_norm 1.4836 (1.5484) [2022-01-22 13:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][830/1251] eta 0:15:28 lr 0.000481 time 1.9095 (2.2060) loss 2.9114 (3.5020) grad_norm 1.5341 (1.5481) [2022-01-22 13:32:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][840/1251] eta 0:15:06 lr 0.000481 time 1.8779 (2.2048) loss 3.0254 (3.4977) grad_norm 1.3332 (1.5464) [2022-01-22 13:32:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][850/1251] eta 0:14:44 lr 0.000481 time 2.4807 (2.2053) loss 3.7834 (3.5001) grad_norm 1.6726 (1.5472) [2022-01-22 13:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][860/1251] eta 0:14:21 lr 0.000481 time 1.8418 (2.2046) loss 3.3591 (3.4964) grad_norm 1.4572 (1.5469) [2022-01-22 13:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][870/1251] eta 0:13:59 lr 0.000481 time 2.3359 (2.2044) loss 3.5663 (3.4947) grad_norm 1.5510 (1.5459) [2022-01-22 13:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][880/1251] eta 0:13:38 lr 0.000481 time 3.1034 (2.2055) loss 3.5739 (3.4969) grad_norm 1.4964 (1.5466) [2022-01-22 13:34:25 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][890/1251] eta 0:13:17 lr 0.000481 time 3.5158 (2.2102) loss 2.8890 (3.4963) grad_norm 1.4676 (1.5466) [2022-01-22 13:34:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][900/1251] eta 0:12:56 lr 0.000481 time 2.4982 (2.2111) loss 3.0637 (3.4935) grad_norm 1.3761 (1.5458) [2022-01-22 13:35:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][910/1251] eta 0:12:33 lr 0.000481 time 2.9343 (2.2107) loss 3.8119 (3.4908) grad_norm 1.3577 (1.5451) [2022-01-22 13:35:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][920/1251] eta 0:12:11 lr 0.000480 time 1.5606 (2.2103) loss 3.5022 (3.4912) grad_norm 1.5433 (1.5450) [2022-01-22 13:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][930/1251] eta 0:11:49 lr 0.000480 time 2.6180 (2.2094) loss 3.6322 (3.4897) grad_norm 1.4039 (1.5438) [2022-01-22 13:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][940/1251] eta 0:11:26 lr 0.000480 time 1.9925 (2.2068) loss 4.2779 (3.4936) grad_norm 1.4325 (1.5435) [2022-01-22 13:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][950/1251] eta 0:11:03 lr 0.000480 time 1.9028 (2.2045) loss 4.2340 (3.4918) grad_norm 1.6893 (1.5456) [2022-01-22 13:36:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][960/1251] eta 0:10:40 lr 0.000480 time 1.6015 (2.2027) loss 3.8493 (3.4920) grad_norm 1.3720 (1.5469) [2022-01-22 13:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][970/1251] eta 0:10:18 lr 0.000480 time 1.7573 (2.2023) loss 4.0033 (3.4906) grad_norm 1.6665 (1.5462) [2022-01-22 13:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][980/1251] eta 0:09:56 lr 0.000480 time 2.4089 (2.2023) loss 3.1322 (3.4915) grad_norm 1.5492 (1.5458) [2022-01-22 13:37:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][990/1251] eta 0:09:34 lr 
0.000480 time 2.5492 (2.2027) loss 3.9165 (3.4882) grad_norm 1.4182 (1.5459) [2022-01-22 13:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1000/1251] eta 0:09:12 lr 0.000480 time 2.5082 (2.2024) loss 4.3400 (3.4888) grad_norm 1.6609 (1.5463) [2022-01-22 13:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1010/1251] eta 0:08:51 lr 0.000480 time 2.2720 (2.2042) loss 3.8551 (3.4879) grad_norm 1.4961 (1.5464) [2022-01-22 13:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1020/1251] eta 0:08:29 lr 0.000480 time 2.1011 (2.2063) loss 2.7130 (3.4901) grad_norm 1.6112 (1.5464) [2022-01-22 13:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1030/1251] eta 0:08:08 lr 0.000480 time 2.8103 (2.2083) loss 2.5663 (3.4906) grad_norm 1.8051 (1.5466) [2022-01-22 13:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1040/1251] eta 0:07:46 lr 0.000480 time 2.4767 (2.2085) loss 3.6072 (3.4926) grad_norm 1.5902 (1.5459) [2022-01-22 13:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1050/1251] eta 0:07:23 lr 0.000480 time 2.4415 (2.2084) loss 3.8596 (3.4928) grad_norm 1.3895 (1.5448) [2022-01-22 13:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1060/1251] eta 0:07:01 lr 0.000480 time 1.8261 (2.2058) loss 4.5446 (3.4952) grad_norm 1.4962 (1.5447) [2022-01-22 13:40:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1070/1251] eta 0:06:38 lr 0.000480 time 1.6171 (2.2039) loss 3.9708 (3.4959) grad_norm 1.8755 (1.5451) [2022-01-22 13:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1080/1251] eta 0:06:16 lr 0.000480 time 2.2262 (2.2043) loss 3.7102 (3.4948) grad_norm 1.4885 (1.5452) [2022-01-22 13:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1090/1251] eta 0:05:54 lr 0.000480 time 2.2864 (2.2048) loss 4.0536 (3.4966) grad_norm 1.8584 (1.5463) [2022-01-22 
13:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1100/1251] eta 0:05:33 lr 0.000480 time 2.3879 (2.2064) loss 4.1218 (3.4962) grad_norm 1.6156 (1.5466) [2022-01-22 13:42:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1110/1251] eta 0:05:11 lr 0.000480 time 1.7937 (2.2063) loss 3.7419 (3.4966) grad_norm 1.4196 (1.5465) [2022-01-22 13:42:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1120/1251] eta 0:04:48 lr 0.000480 time 2.1738 (2.2061) loss 3.5266 (3.4963) grad_norm 1.3322 (1.5469) [2022-01-22 13:43:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1130/1251] eta 0:04:26 lr 0.000480 time 1.6427 (2.2049) loss 2.8746 (3.4960) grad_norm 1.5382 (1.5471) [2022-01-22 13:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1140/1251] eta 0:04:04 lr 0.000480 time 2.2036 (2.2056) loss 3.9981 (3.4980) grad_norm 1.4844 (1.5461) [2022-01-22 13:43:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1150/1251] eta 0:03:42 lr 0.000480 time 2.9452 (2.2063) loss 3.0408 (3.5001) grad_norm 1.6397 (1.5473) [2022-01-22 13:44:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1160/1251] eta 0:03:20 lr 0.000479 time 1.8484 (2.2050) loss 3.3143 (3.5005) grad_norm 1.5905 (1.5471) [2022-01-22 13:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1170/1251] eta 0:02:58 lr 0.000479 time 2.4663 (2.2038) loss 3.2119 (3.5028) grad_norm 1.4883 (1.5471) [2022-01-22 13:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1180/1251] eta 0:02:36 lr 0.000479 time 2.0889 (2.2023) loss 2.7343 (3.5031) grad_norm 1.7068 (1.5476) [2022-01-22 13:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1190/1251] eta 0:02:14 lr 0.000479 time 2.8700 (2.2021) loss 3.9806 (3.5030) grad_norm 1.3668 (1.5476) [2022-01-22 13:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1200/1251] 
eta 0:01:52 lr 0.000479 time 2.8522 (2.2033) loss 3.0453 (3.5027) grad_norm 1.4683 (1.5470) [2022-01-22 13:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1210/1251] eta 0:01:30 lr 0.000479 time 2.2524 (2.2046) loss 3.1503 (3.5008) grad_norm 1.6085 (1.5470) [2022-01-22 13:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1220/1251] eta 0:01:08 lr 0.000479 time 1.4649 (2.2051) loss 3.9122 (3.5001) grad_norm 1.5091 (1.5465) [2022-01-22 13:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1230/1251] eta 0:00:46 lr 0.000479 time 3.7614 (2.2059) loss 3.9961 (3.5017) grad_norm 1.6166 (1.5463) [2022-01-22 13:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1240/1251] eta 0:00:24 lr 0.000479 time 1.2124 (2.2033) loss 4.2808 (3.5034) grad_norm 1.5757 (1.5458) [2022-01-22 13:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [154/300][1250/1251] eta 0:00:02 lr 0.000479 time 1.2041 (2.1973) loss 2.6050 (3.5053) grad_norm 1.4222 (1.5454) [2022-01-22 13:47:25 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 154 training takes 0:45:49 [2022-01-22 13:47:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.140 (18.140) Loss 0.9670 (0.9670) Acc@1 78.418 (78.418) Acc@5 93.750 (93.750) [2022-01-22 13:48:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.262 (3.355) Loss 0.9696 (1.0348) Acc@1 79.785 (76.225) Acc@5 93.945 (93.208) [2022-01-22 13:48:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.891 (2.684) Loss 1.0392 (1.0392) Acc@1 75.488 (76.004) Acc@5 93.848 (93.192) [2022-01-22 13:48:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.593 (2.224) Loss 1.0680 (1.0325) Acc@1 75.098 (76.112) Acc@5 92.383 (93.230) [2022-01-22 13:48:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.134 (2.179) Loss 1.0032 (1.0294) Acc@1 77.441 (76.143) Acc@5 93.848 (93.250) 
[2022-01-22 13:49:02 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.156 Acc@5 93.294 [2022-01-22 13:49:02 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.2% [2022-01-22 13:49:02 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.30% [2022-01-22 13:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][0/1251] eta 7:07:12 lr 0.000479 time 20.4892 (20.4892) loss 3.4387 (3.4387) grad_norm 1.3399 (1.3399) [2022-01-22 13:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][10/1251] eta 1:21:08 lr 0.000479 time 2.9071 (3.9231) loss 3.9518 (3.4837) grad_norm 1.4195 (1.4638) [2022-01-22 13:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][20/1251] eta 1:03:06 lr 0.000479 time 1.5413 (3.0760) loss 3.7387 (3.5320) grad_norm 1.5865 (1.5059) [2022-01-22 13:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][30/1251] eta 0:57:34 lr 0.000479 time 1.8325 (2.8292) loss 3.8308 (3.5359) grad_norm 1.6060 (1.5346) [2022-01-22 13:50:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][40/1251] eta 0:55:18 lr 0.000479 time 3.8446 (2.7403) loss 3.3036 (3.5610) grad_norm 1.4033 (1.5229) [2022-01-22 13:51:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][50/1251] eta 0:53:10 lr 0.000479 time 2.6464 (2.6563) loss 3.9128 (3.5511) grad_norm 1.3659 (1.5140) [2022-01-22 13:51:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][60/1251] eta 0:51:10 lr 0.000479 time 1.7561 (2.5782) loss 4.2899 (3.5635) grad_norm 1.6937 (1.5172) [2022-01-22 13:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][70/1251] eta 0:49:39 lr 0.000479 time 1.9305 (2.5230) loss 3.1118 (3.5265) grad_norm 1.4224 (1.5291) [2022-01-22 13:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][80/1251] eta 0:48:46 lr 0.000479 time 4.2527 (2.4992) loss 3.0963 (3.4975) 
grad_norm 1.3844 (1.5275) [2022-01-22 13:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][90/1251] eta 0:47:35 lr 0.000479 time 2.0450 (2.4595) loss 3.6958 (3.4932) grad_norm 1.6565 (1.5240) [2022-01-22 13:53:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][100/1251] eta 0:46:34 lr 0.000479 time 1.7834 (2.4280) loss 3.7355 (3.4674) grad_norm 1.4451 (1.5170) [2022-01-22 13:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][110/1251] eta 0:45:44 lr 0.000479 time 2.2226 (2.4055) loss 3.4795 (3.4755) grad_norm 1.5143 (1.5188) [2022-01-22 13:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][120/1251] eta 0:45:00 lr 0.000479 time 2.8511 (2.3873) loss 3.5651 (3.4640) grad_norm 1.4233 (1.5194) [2022-01-22 13:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][130/1251] eta 0:44:10 lr 0.000479 time 1.9912 (2.3643) loss 2.5891 (3.4730) grad_norm 1.8572 (1.5207) [2022-01-22 13:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][140/1251] eta 0:43:28 lr 0.000479 time 2.0219 (2.3482) loss 2.9936 (3.4718) grad_norm 1.6286 (1.5203) [2022-01-22 13:54:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][150/1251] eta 0:43:01 lr 0.000478 time 2.3341 (2.3443) loss 3.6675 (3.4893) grad_norm 1.6279 (1.5212) [2022-01-22 13:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][160/1251] eta 0:42:34 lr 0.000478 time 3.4206 (2.3415) loss 4.0139 (3.5005) grad_norm 1.4506 (1.5227) [2022-01-22 13:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][170/1251] eta 0:42:02 lr 0.000478 time 1.6968 (2.3331) loss 3.5899 (3.5099) grad_norm 1.5660 (1.5248) [2022-01-22 13:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][180/1251] eta 0:41:18 lr 0.000478 time 1.8408 (2.3143) loss 2.6209 (3.4955) grad_norm 1.4152 (1.5250) [2022-01-22 13:56:21 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [155/300][190/1251] eta 0:40:39 lr 0.000478 time 2.2192 (2.2994) loss 3.7941 (3.5042) grad_norm 1.6050 (1.5219) [2022-01-22 13:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][200/1251] eta 0:39:59 lr 0.000478 time 2.2336 (2.2831) loss 3.3870 (3.4941) grad_norm 1.4471 (1.5213) [2022-01-22 13:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][210/1251] eta 0:39:29 lr 0.000478 time 1.9924 (2.2765) loss 2.4570 (3.4849) grad_norm 1.6342 (1.5189) [2022-01-22 13:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][220/1251] eta 0:39:08 lr 0.000478 time 2.5547 (2.2775) loss 3.7024 (3.4841) grad_norm 1.2868 (1.5175) [2022-01-22 13:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][230/1251] eta 0:38:46 lr 0.000478 time 2.1701 (2.2786) loss 3.4480 (3.4843) grad_norm 1.5012 (1.5174) [2022-01-22 13:58:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][240/1251] eta 0:38:25 lr 0.000478 time 2.7687 (2.2803) loss 3.2893 (3.4784) grad_norm 1.3126 (1.5170) [2022-01-22 13:58:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][250/1251] eta 0:38:02 lr 0.000478 time 1.9119 (2.2802) loss 3.2052 (3.4859) grad_norm 1.7184 (1.5185) [2022-01-22 13:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][260/1251] eta 0:37:34 lr 0.000478 time 2.2212 (2.2753) loss 3.9744 (3.4848) grad_norm 1.3227 (1.5225) [2022-01-22 13:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][270/1251] eta 0:37:07 lr 0.000478 time 2.1516 (2.2701) loss 3.1731 (3.4899) grad_norm 1.3467 (1.5213) [2022-01-22 13:59:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][280/1251] eta 0:36:41 lr 0.000478 time 2.2482 (2.2668) loss 3.6753 (3.4934) grad_norm 1.2269 (1.5192) [2022-01-22 14:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][290/1251] eta 0:36:15 lr 0.000478 time 2.2096 (2.2639) loss 3.4084 (3.4999) 
grad_norm 1.4633 (1.5184) [2022-01-22 14:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][300/1251] eta 0:35:48 lr 0.000478 time 2.4753 (2.2596) loss 3.3554 (3.5005) grad_norm 1.4001 (1.5182) [2022-01-22 14:00:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][310/1251] eta 0:35:25 lr 0.000478 time 3.4361 (2.2586) loss 4.0857 (3.5044) grad_norm 1.4169 (1.5166) [2022-01-22 14:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][320/1251] eta 0:35:04 lr 0.000478 time 1.8950 (2.2605) loss 3.8680 (3.5057) grad_norm 1.5297 (1.5162) [2022-01-22 14:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][330/1251] eta 0:34:40 lr 0.000478 time 1.5481 (2.2593) loss 3.9149 (3.5053) grad_norm 1.4283 (1.5151) [2022-01-22 14:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][340/1251] eta 0:34:15 lr 0.000478 time 1.5955 (2.2564) loss 4.1223 (3.5067) grad_norm 1.3894 (1.5138) [2022-01-22 14:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][350/1251] eta 0:33:50 lr 0.000478 time 2.5648 (2.2541) loss 3.5970 (3.5126) grad_norm 1.3986 (1.5136) [2022-01-22 14:02:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][360/1251] eta 0:33:22 lr 0.000478 time 2.1621 (2.2471) loss 2.6152 (3.5043) grad_norm 1.3889 (1.5158) [2022-01-22 14:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][370/1251] eta 0:32:53 lr 0.000478 time 1.9267 (2.2403) loss 3.6772 (3.5066) grad_norm 1.5695 (1.5160) [2022-01-22 14:03:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][380/1251] eta 0:32:32 lr 0.000478 time 1.5449 (2.2421) loss 3.7126 (3.5064) grad_norm 1.7882 (1.5194) [2022-01-22 14:03:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][390/1251] eta 0:32:07 lr 0.000477 time 1.9644 (2.2390) loss 3.9821 (3.5080) grad_norm 1.6578 (1.5223) [2022-01-22 14:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [155/300][400/1251] eta 0:31:46 lr 0.000477 time 2.3178 (2.2408) loss 3.7920 (3.5033) grad_norm 1.3226 (1.5230) [2022-01-22 14:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][410/1251] eta 0:31:26 lr 0.000477 time 2.5806 (2.2428) loss 3.3642 (3.5058) grad_norm 1.5795 (1.5257) [2022-01-22 14:04:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][420/1251] eta 0:31:01 lr 0.000477 time 1.6029 (2.2405) loss 3.9932 (3.5048) grad_norm 1.7793 (1.5306) [2022-01-22 14:05:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][430/1251] eta 0:30:43 lr 0.000477 time 3.1996 (2.2448) loss 3.1192 (3.5047) grad_norm 1.5585 (1.5298) [2022-01-22 14:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][440/1251] eta 0:30:21 lr 0.000477 time 2.0918 (2.2456) loss 4.3012 (3.5059) grad_norm 1.5713 (1.5300) [2022-01-22 14:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][450/1251] eta 0:29:54 lr 0.000477 time 1.8782 (2.2408) loss 4.1163 (3.5023) grad_norm 1.5249 (1.5307) [2022-01-22 14:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][460/1251] eta 0:29:26 lr 0.000477 time 1.8628 (2.2338) loss 3.5719 (3.5006) grad_norm 1.6074 (1.5326) [2022-01-22 14:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][470/1251] eta 0:28:59 lr 0.000477 time 2.2683 (2.2277) loss 3.3949 (3.5053) grad_norm 1.4410 (1.5309) [2022-01-22 14:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][480/1251] eta 0:28:37 lr 0.000477 time 2.7306 (2.2270) loss 3.5983 (3.4995) grad_norm 1.4657 (1.5301) [2022-01-22 14:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][490/1251] eta 0:28:16 lr 0.000477 time 2.5533 (2.2296) loss 3.6270 (3.5063) grad_norm 1.5625 (1.5299) [2022-01-22 14:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][500/1251] eta 0:27:55 lr 0.000477 time 1.2154 (2.2307) loss 2.7936 (3.5068) 
grad_norm 1.3124 (1.5296) [2022-01-22 14:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][510/1251] eta 0:27:31 lr 0.000477 time 1.8512 (2.2288) loss 3.3328 (3.5059) grad_norm 1.6011 (1.5308) [2022-01-22 14:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][520/1251] eta 0:27:11 lr 0.000477 time 3.0841 (2.2315) loss 3.8324 (3.5035) grad_norm 1.6505 (1.5330) [2022-01-22 14:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][530/1251] eta 0:26:49 lr 0.000477 time 2.2729 (2.2325) loss 3.0618 (3.4972) grad_norm 1.4796 (1.5319) [2022-01-22 14:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][540/1251] eta 0:26:26 lr 0.000477 time 1.5819 (2.2314) loss 4.1683 (3.5021) grad_norm 1.5457 (1.5306) [2022-01-22 14:09:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][550/1251] eta 0:26:02 lr 0.000477 time 2.2717 (2.2295) loss 3.1876 (3.4991) grad_norm 1.4162 (1.5299) [2022-01-22 14:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][560/1251] eta 0:25:38 lr 0.000477 time 1.8649 (2.2267) loss 2.8889 (3.4988) grad_norm 1.8836 (1.5310) [2022-01-22 14:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][570/1251] eta 0:25:15 lr 0.000477 time 2.0724 (2.2251) loss 2.6460 (3.4951) grad_norm 1.6796 (1.5305) [2022-01-22 14:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][580/1251] eta 0:24:51 lr 0.000477 time 2.1034 (2.2235) loss 3.9476 (3.4974) grad_norm 1.3837 (1.5314) [2022-01-22 14:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][590/1251] eta 0:24:28 lr 0.000477 time 2.1484 (2.2223) loss 2.6363 (3.5006) grad_norm 1.5824 (1.5321) [2022-01-22 14:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][600/1251] eta 0:24:06 lr 0.000477 time 2.3784 (2.2227) loss 3.9826 (3.5045) grad_norm 1.6286 (1.5325) [2022-01-22 14:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [155/300][610/1251] eta 0:23:43 lr 0.000477 time 1.9210 (2.2207) loss 3.4584 (3.5072) grad_norm 1.5307 (1.5312) [2022-01-22 14:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][620/1251] eta 0:23:19 lr 0.000477 time 2.0586 (2.2174) loss 4.2105 (3.5054) grad_norm 1.5315 (1.5309) [2022-01-22 14:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][630/1251] eta 0:22:56 lr 0.000476 time 2.2000 (2.2162) loss 2.3887 (3.5029) grad_norm 1.6447 (1.5334) [2022-01-22 14:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][640/1251] eta 0:22:33 lr 0.000476 time 1.5907 (2.2147) loss 3.9021 (3.5047) grad_norm 1.3965 (1.5331) [2022-01-22 14:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][650/1251] eta 0:22:10 lr 0.000476 time 1.5434 (2.2140) loss 2.9356 (3.5024) grad_norm 1.5002 (1.5336) [2022-01-22 14:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][660/1251] eta 0:21:48 lr 0.000476 time 1.9695 (2.2139) loss 3.5563 (3.4983) grad_norm 1.4205 (1.5351) [2022-01-22 14:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][670/1251] eta 0:21:25 lr 0.000476 time 2.2841 (2.2132) loss 2.7388 (3.5004) grad_norm 1.3806 (1.5348) [2022-01-22 14:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][680/1251] eta 0:21:05 lr 0.000476 time 2.9667 (2.2167) loss 3.8533 (3.5030) grad_norm 1.6182 (1.5355) [2022-01-22 14:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][690/1251] eta 0:20:45 lr 0.000476 time 1.8572 (2.2203) loss 4.2779 (3.5035) grad_norm 1.5346 (1.5355) [2022-01-22 14:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][700/1251] eta 0:20:23 lr 0.000476 time 1.9599 (2.2211) loss 2.8826 (3.5054) grad_norm 1.6069 (1.5348) [2022-01-22 14:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][710/1251] eta 0:20:01 lr 0.000476 time 2.5259 (2.2204) loss 2.9995 (3.5067) 
grad_norm 1.6428 (1.5354) [2022-01-22 14:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][720/1251] eta 0:19:37 lr 0.000476 time 2.1942 (2.2179) loss 3.8306 (3.5101) grad_norm 1.7042 (1.5346) [2022-01-22 14:16:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][730/1251] eta 0:19:13 lr 0.000476 time 1.6162 (2.2132) loss 3.9355 (3.5122) grad_norm 1.3603 (1.5335) [2022-01-22 14:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][740/1251] eta 0:18:49 lr 0.000476 time 2.2575 (2.2099) loss 4.1477 (3.5130) grad_norm 1.4013 (1.5331) [2022-01-22 14:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][750/1251] eta 0:18:27 lr 0.000476 time 1.9451 (2.2102) loss 2.3053 (3.5111) grad_norm 1.2913 (1.5326) [2022-01-22 14:17:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][760/1251] eta 0:18:05 lr 0.000476 time 2.2408 (2.2111) loss 3.8909 (3.5109) grad_norm 1.6111 (1.5330) [2022-01-22 14:17:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][770/1251] eta 0:17:44 lr 0.000476 time 2.4407 (2.2128) loss 4.1775 (3.5080) grad_norm 1.5788 (1.5328) [2022-01-22 14:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][780/1251] eta 0:17:22 lr 0.000476 time 2.5220 (2.2131) loss 3.9276 (3.5068) grad_norm 1.2677 (1.5332) [2022-01-22 14:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][790/1251] eta 0:17:00 lr 0.000476 time 2.0978 (2.2128) loss 3.8308 (3.5072) grad_norm 1.5872 (1.5333) [2022-01-22 14:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][800/1251] eta 0:16:37 lr 0.000476 time 2.5116 (2.2121) loss 3.2328 (3.5103) grad_norm 1.4020 (1.5339) [2022-01-22 14:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][810/1251] eta 0:16:15 lr 0.000476 time 2.1879 (2.2118) loss 3.5229 (3.5095) grad_norm 1.4309 (1.5339) [2022-01-22 14:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [155/300][820/1251] eta 0:15:53 lr 0.000476 time 2.0964 (2.2111) loss 3.5857 (3.5100) grad_norm 1.4469 (1.5355) [2022-01-22 14:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][830/1251] eta 0:15:30 lr 0.000476 time 1.8465 (2.2098) loss 3.3759 (3.5134) grad_norm 1.5202 (1.5346) [2022-01-22 14:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][840/1251] eta 0:15:08 lr 0.000476 time 1.9633 (2.2093) loss 3.5472 (3.5137) grad_norm 1.2519 (1.5347) [2022-01-22 14:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][850/1251] eta 0:14:45 lr 0.000476 time 1.9461 (2.2094) loss 3.3364 (3.5144) grad_norm 1.7953 (1.5350) [2022-01-22 14:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][860/1251] eta 0:14:24 lr 0.000476 time 3.4780 (2.2100) loss 3.7891 (3.5122) grad_norm 1.4706 (1.5358) [2022-01-22 14:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][870/1251] eta 0:14:02 lr 0.000475 time 2.3568 (2.2109) loss 4.1964 (3.5129) grad_norm 1.7764 (1.5362) [2022-01-22 14:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][880/1251] eta 0:13:40 lr 0.000475 time 1.7850 (2.2109) loss 4.0101 (3.5146) grad_norm 1.6019 (1.5363) [2022-01-22 14:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][890/1251] eta 0:13:18 lr 0.000475 time 1.9186 (2.2121) loss 4.0576 (3.5179) grad_norm 1.5790 (1.5373) [2022-01-22 14:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][900/1251] eta 0:12:57 lr 0.000475 time 3.6043 (2.2138) loss 2.5123 (3.5185) grad_norm 1.3955 (1.5378) [2022-01-22 14:22:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][910/1251] eta 0:12:34 lr 0.000475 time 2.1871 (2.2124) loss 2.8066 (3.5165) grad_norm 1.6268 (1.5381) [2022-01-22 14:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][920/1251] eta 0:12:11 lr 0.000475 time 1.8489 (2.2102) loss 3.6142 (3.5148) 
grad_norm 1.4124 (1.5383) [2022-01-22 14:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][930/1251] eta 0:11:49 lr 0.000475 time 1.8927 (2.2090) loss 4.0462 (3.5157) grad_norm 1.5074 (1.5384) [2022-01-22 14:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][940/1251] eta 0:11:27 lr 0.000475 time 3.8815 (2.2107) loss 3.6278 (3.5171) grad_norm 1.5099 (1.5389) [2022-01-22 14:24:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][950/1251] eta 0:11:05 lr 0.000475 time 2.1172 (2.2112) loss 3.6861 (3.5161) grad_norm 1.3880 (1.5387) [2022-01-22 14:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][960/1251] eta 0:10:43 lr 0.000475 time 1.6987 (2.2103) loss 4.0817 (3.5174) grad_norm 1.5669 (1.5385) [2022-01-22 14:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][970/1251] eta 0:10:20 lr 0.000475 time 1.6606 (2.2091) loss 4.0843 (3.5149) grad_norm 1.5952 (1.5395) [2022-01-22 14:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][980/1251] eta 0:09:58 lr 0.000475 time 3.7460 (2.2094) loss 3.2330 (3.5163) grad_norm 1.5946 (1.5402) [2022-01-22 14:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][990/1251] eta 0:09:36 lr 0.000475 time 1.9287 (2.2088) loss 4.1133 (3.5163) grad_norm 1.5781 (1.5399) [2022-01-22 14:25:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1000/1251] eta 0:09:14 lr 0.000475 time 2.1538 (2.2101) loss 3.7963 (3.5187) grad_norm 1.4378 (1.5394) [2022-01-22 14:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1010/1251] eta 0:08:52 lr 0.000475 time 2.1121 (2.2087) loss 2.5592 (3.5171) grad_norm 1.4626 (1.5392) [2022-01-22 14:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1020/1251] eta 0:08:30 lr 0.000475 time 3.7197 (2.2104) loss 2.9659 (3.5179) grad_norm 1.7033 (1.5404) [2022-01-22 14:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [155/300][1030/1251] eta 0:08:08 lr 0.000475 time 1.7076 (2.2091) loss 4.0461 (3.5207) grad_norm 2.0635 (1.5410) [2022-01-22 14:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1040/1251] eta 0:07:45 lr 0.000475 time 1.9927 (2.2081) loss 3.7605 (3.5224) grad_norm 1.9307 (1.5418) [2022-01-22 14:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1050/1251] eta 0:07:23 lr 0.000475 time 1.6979 (2.2071) loss 4.4402 (3.5233) grad_norm 1.8662 (1.5415) [2022-01-22 14:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1060/1251] eta 0:07:01 lr 0.000475 time 3.0542 (2.2069) loss 3.8761 (3.5223) grad_norm 1.4507 (1.5415) [2022-01-22 14:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1070/1251] eta 0:06:39 lr 0.000475 time 1.9014 (2.2054) loss 3.3434 (3.5225) grad_norm 1.5447 (1.5424) [2022-01-22 14:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1080/1251] eta 0:06:17 lr 0.000475 time 2.2027 (2.2053) loss 2.7237 (3.5208) grad_norm 1.5041 (1.5426) [2022-01-22 14:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1090/1251] eta 0:05:55 lr 0.000475 time 2.2425 (2.2057) loss 2.5165 (3.5208) grad_norm 1.4631 (1.5427) [2022-01-22 14:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1100/1251] eta 0:05:32 lr 0.000475 time 2.0297 (2.2046) loss 3.0418 (3.5197) grad_norm 1.5049 (1.5434) [2022-01-22 14:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1110/1251] eta 0:05:10 lr 0.000475 time 1.9906 (2.2035) loss 3.0062 (3.5201) grad_norm 1.5821 (1.5440) [2022-01-22 14:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1120/1251] eta 0:04:48 lr 0.000474 time 2.1795 (2.2028) loss 3.8218 (3.5190) grad_norm 1.3543 (1.5431) [2022-01-22 14:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1130/1251] eta 0:04:26 lr 0.000474 time 1.6170 (2.2023) loss 3.2970 
(3.5186) grad_norm 1.6173 (1.5426) [2022-01-22 14:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1140/1251] eta 0:04:04 lr 0.000474 time 2.8366 (2.2031) loss 3.5763 (3.5198) grad_norm 1.4249 (1.5426) [2022-01-22 14:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1150/1251] eta 0:03:42 lr 0.000474 time 2.5454 (2.2034) loss 3.9286 (3.5191) grad_norm 1.6508 (1.5431) [2022-01-22 14:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1160/1251] eta 0:03:20 lr 0.000474 time 1.8219 (2.2043) loss 3.5302 (3.5196) grad_norm 1.4171 (1.5423) [2022-01-22 14:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1170/1251] eta 0:02:58 lr 0.000474 time 1.8481 (2.2037) loss 3.7262 (3.5190) grad_norm 1.3530 (1.5416) [2022-01-22 14:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1180/1251] eta 0:02:36 lr 0.000474 time 2.1682 (2.2024) loss 2.8944 (3.5190) grad_norm 1.4402 (1.5414) [2022-01-22 14:32:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1190/1251] eta 0:02:14 lr 0.000474 time 1.9848 (2.2013) loss 3.8709 (3.5211) grad_norm 1.6610 (1.5417) [2022-01-22 14:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1200/1251] eta 0:01:52 lr 0.000474 time 2.6563 (2.2010) loss 3.0984 (3.5197) grad_norm 1.8480 (1.5425) [2022-01-22 14:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1210/1251] eta 0:01:30 lr 0.000474 time 2.2404 (2.2009) loss 3.7092 (3.5178) grad_norm 1.6183 (1.5427) [2022-01-22 14:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1220/1251] eta 0:01:08 lr 0.000474 time 2.0572 (2.2002) loss 2.7963 (3.5154) grad_norm 1.3258 (1.5423) [2022-01-22 14:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1230/1251] eta 0:00:46 lr 0.000474 time 2.2743 (2.2008) loss 3.4278 (3.5143) grad_norm 1.3439 (1.5420) [2022-01-22 14:34:32 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [155/300][1240/1251] eta 0:00:24 lr 0.000474 time 2.2662 (2.1996) loss 3.8482 (3.5141) grad_norm 1.4011 (1.5419) [2022-01-22 14:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [155/300][1250/1251] eta 0:00:02 lr 0.000474 time 1.1730 (2.1942) loss 4.0729 (3.5148) grad_norm 1.4609 (1.5421) [2022-01-22 14:34:47 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 155 training takes 0:45:45 [2022-01-22 14:35:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.712 (20.712) Loss 0.9550 (0.9550) Acc@1 77.441 (77.441) Acc@5 93.457 (93.457) [2022-01-22 14:35:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.637 (3.428) Loss 1.0063 (0.9823) Acc@1 77.344 (76.403) Acc@5 93.066 (93.679) [2022-01-22 14:35:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.608 (2.443) Loss 0.9776 (1.0000) Acc@1 77.441 (76.204) Acc@5 93.555 (93.494) [2022-01-22 14:35:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.976 (2.228) Loss 0.9286 (0.9991) Acc@1 77.051 (76.162) Acc@5 94.336 (93.514) [2022-01-22 14:36:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.647 (2.177) Loss 1.0202 (0.9957) Acc@1 76.465 (76.348) Acc@5 92.285 (93.498) [2022-01-22 14:36:23 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.278 Acc@5 93.430 [2022-01-22 14:36:23 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.3% [2022-01-22 14:36:23 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.30% [2022-01-22 14:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][0/1251] eta 7:24:35 lr 0.000474 time 21.3236 (21.3236) loss 4.0382 (4.0382) grad_norm 1.6244 (1.6244) [2022-01-22 14:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][10/1251] eta 1:24:10 lr 0.000474 time 2.1647 (4.0700) loss 3.1890 (3.6292) grad_norm 1.3700 (1.5452) [2022-01-22 14:37:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][20/1251] eta 1:05:32 lr 0.000474 time 1.8439 (3.1944) loss 3.4251 (3.6155) grad_norm 1.5774 (1.5496) [2022-01-22 14:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][30/1251] eta 0:58:52 lr 0.000474 time 1.4658 (2.8931) loss 2.6423 (3.6301) grad_norm 1.5167 (1.5528) [2022-01-22 14:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][40/1251] eta 0:55:50 lr 0.000474 time 3.8447 (2.7669) loss 3.6806 (3.6150) grad_norm 1.4918 (1.5572) [2022-01-22 14:38:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][50/1251] eta 0:53:41 lr 0.000474 time 1.4702 (2.6823) loss 3.7945 (3.6011) grad_norm 1.3918 (1.5559) [2022-01-22 14:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][60/1251] eta 0:51:46 lr 0.000474 time 1.5442 (2.6085) loss 3.8929 (3.5524) grad_norm 1.5415 (1.5535) [2022-01-22 14:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][70/1251] eta 0:50:31 lr 0.000474 time 1.7675 (2.5668) loss 3.2182 (3.5512) grad_norm 1.3840 (1.5529) [2022-01-22 14:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][80/1251] eta 0:48:58 lr 0.000474 time 2.0277 (2.5097) loss 2.4362 (3.5065) grad_norm 1.3676 (1.5432) [2022-01-22 14:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][90/1251] eta 0:47:32 lr 0.000474 time 1.6402 (2.4566) loss 3.9837 (3.5372) grad_norm 1.4326 (1.5397) [2022-01-22 14:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][100/1251] eta 0:46:35 lr 0.000474 time 1.7466 (2.4287) loss 4.1407 (3.5558) grad_norm 2.1871 (1.5495) [2022-01-22 14:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][110/1251] eta 0:45:41 lr 0.000473 time 2.0244 (2.4031) loss 2.6920 (3.5772) grad_norm 1.3986 (1.5411) [2022-01-22 14:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][120/1251] eta 0:45:00 lr 0.000473 time 
2.6458 (2.3876) loss 3.9087 (3.5574) grad_norm 1.6735 (1.5363) [2022-01-22 14:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][130/1251] eta 0:44:18 lr 0.000473 time 1.9994 (2.3718) loss 3.5773 (3.5440) grad_norm 1.6801 (1.5361) [2022-01-22 14:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][140/1251] eta 0:43:43 lr 0.000473 time 1.9005 (2.3610) loss 2.6282 (3.5399) grad_norm 1.4994 (1.5323) [2022-01-22 14:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][150/1251] eta 0:42:56 lr 0.000473 time 2.3241 (2.3404) loss 3.2674 (3.5411) grad_norm 1.4041 (1.5334) [2022-01-22 14:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][160/1251] eta 0:42:14 lr 0.000473 time 2.2195 (2.3227) loss 3.0808 (3.5394) grad_norm 1.6195 (1.5424) [2022-01-22 14:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][170/1251] eta 0:41:36 lr 0.000473 time 2.0217 (2.3096) loss 4.2209 (3.5297) grad_norm 1.6470 (1.5483) [2022-01-22 14:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][180/1251] eta 0:41:17 lr 0.000473 time 2.8208 (2.3135) loss 3.9136 (3.5389) grad_norm 1.6651 (1.5495) [2022-01-22 14:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][190/1251] eta 0:40:54 lr 0.000473 time 2.2466 (2.3132) loss 3.2883 (3.5311) grad_norm 1.3713 (1.5520) [2022-01-22 14:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][200/1251] eta 0:40:23 lr 0.000473 time 1.2365 (2.3057) loss 3.7567 (3.5275) grad_norm 1.5305 (1.5526) [2022-01-22 14:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][210/1251] eta 0:39:56 lr 0.000473 time 1.5665 (2.3025) loss 3.8246 (3.5258) grad_norm 1.6547 (1.5528) [2022-01-22 14:44:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][220/1251] eta 0:39:33 lr 0.000473 time 2.7543 (2.3024) loss 3.6022 (3.5261) grad_norm 1.4643 (1.5545) [2022-01-22 14:45:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][230/1251] eta 0:38:58 lr 0.000473 time 2.2880 (2.2902) loss 2.4180 (3.5119) grad_norm 1.5932 (1.5542) [2022-01-22 14:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][240/1251] eta 0:38:18 lr 0.000473 time 1.8350 (2.2737) loss 3.6444 (3.5122) grad_norm 1.3668 (1.5501) [2022-01-22 14:45:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][250/1251] eta 0:37:46 lr 0.000473 time 2.1995 (2.2646) loss 3.5427 (3.5047) grad_norm 1.4954 (1.5453) [2022-01-22 14:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][260/1251] eta 0:37:21 lr 0.000473 time 2.3005 (2.2621) loss 3.8356 (3.4992) grad_norm 1.3158 (1.5452) [2022-01-22 14:46:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][270/1251] eta 0:36:54 lr 0.000473 time 1.6207 (2.2578) loss 2.8919 (3.5042) grad_norm 1.4329 (1.5456) [2022-01-22 14:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][280/1251] eta 0:36:33 lr 0.000473 time 2.4661 (2.2591) loss 4.3452 (3.5010) grad_norm 1.4694 (1.5486) [2022-01-22 14:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][290/1251] eta 0:36:16 lr 0.000473 time 2.4001 (2.2650) loss 3.5168 (3.4945) grad_norm 1.5561 (1.5481) [2022-01-22 14:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][300/1251] eta 0:35:52 lr 0.000473 time 1.8726 (2.2638) loss 3.7498 (3.5028) grad_norm 1.3276 (1.5452) [2022-01-22 14:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][310/1251] eta 0:35:28 lr 0.000473 time 1.5972 (2.2618) loss 3.9922 (3.5018) grad_norm 1.6522 (1.5463) [2022-01-22 14:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][320/1251] eta 0:35:02 lr 0.000473 time 1.9971 (2.2585) loss 3.6813 (3.5074) grad_norm 1.3853 (1.5468) [2022-01-22 14:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][330/1251] eta 0:34:32 lr 
0.000473 time 2.3135 (2.2503) loss 4.1018 (3.5085) grad_norm 1.5633 (1.5463) [2022-01-22 14:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][340/1251] eta 0:34:07 lr 0.000473 time 1.9227 (2.2471) loss 3.8682 (3.5068) grad_norm 1.4704 (1.5438) [2022-01-22 14:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][350/1251] eta 0:33:42 lr 0.000472 time 2.2119 (2.2447) loss 3.4704 (3.4933) grad_norm 1.5033 (1.5441) [2022-01-22 14:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][360/1251] eta 0:33:20 lr 0.000472 time 1.5963 (2.2450) loss 3.5525 (3.4911) grad_norm 1.4652 (1.5462) [2022-01-22 14:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][370/1251] eta 0:32:56 lr 0.000472 time 2.2031 (2.2434) loss 3.8574 (3.4920) grad_norm 1.4248 (1.5500) [2022-01-22 14:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][380/1251] eta 0:32:33 lr 0.000472 time 2.2457 (2.2424) loss 3.2353 (3.4921) grad_norm 1.5087 (1.5491) [2022-01-22 14:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][390/1251] eta 0:32:11 lr 0.000472 time 2.1272 (2.2439) loss 2.6888 (3.4856) grad_norm 1.7333 (1.5492) [2022-01-22 14:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][400/1251] eta 0:31:47 lr 0.000472 time 1.5530 (2.2417) loss 3.9048 (3.4879) grad_norm 1.6612 (1.5493) [2022-01-22 14:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][410/1251] eta 0:31:23 lr 0.000472 time 1.9074 (2.2394) loss 4.2042 (3.4959) grad_norm 1.5648 (1.5515) [2022-01-22 14:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][420/1251] eta 0:30:58 lr 0.000472 time 1.6634 (2.2362) loss 2.9925 (3.4996) grad_norm 1.2986 (1.5522) [2022-01-22 14:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][430/1251] eta 0:30:38 lr 0.000472 time 2.1284 (2.2397) loss 3.5537 (3.5051) grad_norm 1.5493 (1.5522) [2022-01-22 14:52:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][440/1251] eta 0:30:14 lr 0.000472 time 1.8785 (2.2369) loss 3.8435 (3.5114) grad_norm 1.5793 (1.5515) [2022-01-22 14:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][450/1251] eta 0:29:50 lr 0.000472 time 1.5937 (2.2358) loss 3.8713 (3.5185) grad_norm 1.4722 (1.5541) [2022-01-22 14:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][460/1251] eta 0:29:29 lr 0.000472 time 1.7924 (2.2365) loss 2.4063 (3.5190) grad_norm 1.7779 (1.5544) [2022-01-22 14:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][470/1251] eta 0:29:04 lr 0.000472 time 1.6480 (2.2339) loss 3.8352 (3.5169) grad_norm 1.6145 (1.5555) [2022-01-22 14:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][480/1251] eta 0:28:39 lr 0.000472 time 2.0339 (2.2302) loss 4.3024 (3.5139) grad_norm 1.5214 (1.5569) [2022-01-22 14:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][490/1251] eta 0:28:17 lr 0.000472 time 2.5899 (2.2300) loss 2.4441 (3.5113) grad_norm 1.5921 (1.5569) [2022-01-22 14:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][500/1251] eta 0:27:55 lr 0.000472 time 2.2575 (2.2313) loss 3.5851 (3.5075) grad_norm 1.3155 (1.5584) [2022-01-22 14:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][510/1251] eta 0:27:36 lr 0.000472 time 1.8329 (2.2349) loss 3.5467 (3.5044) grad_norm 1.3643 (1.5580) [2022-01-22 14:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][520/1251] eta 0:27:12 lr 0.000472 time 1.8696 (2.2331) loss 3.5665 (3.5063) grad_norm 1.6034 (1.5580) [2022-01-22 14:56:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][530/1251] eta 0:26:47 lr 0.000472 time 1.6029 (2.2300) loss 3.8327 (3.5051) grad_norm 1.7848 (1.5582) [2022-01-22 14:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][540/1251] eta 0:26:23 lr 
0.000472 time 2.3144 (2.2265) loss 3.4983 (3.5032) grad_norm 1.5712 (1.5584) [2022-01-22 14:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][550/1251] eta 0:25:58 lr 0.000472 time 1.8130 (2.2232) loss 4.2244 (3.5074) grad_norm 1.4293 (1.5577) [2022-01-22 14:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][560/1251] eta 0:25:36 lr 0.000472 time 2.9497 (2.2230) loss 2.3685 (3.5052) grad_norm 1.6067 (1.5577) [2022-01-22 14:57:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][570/1251] eta 0:25:13 lr 0.000472 time 1.9927 (2.2232) loss 3.8313 (3.5060) grad_norm 1.9266 (1.5579) [2022-01-22 14:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][580/1251] eta 0:24:51 lr 0.000472 time 2.9466 (2.2228) loss 4.2537 (3.5093) grad_norm 1.6905 (1.5574) [2022-01-22 14:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][590/1251] eta 0:24:30 lr 0.000471 time 2.1718 (2.2247) loss 2.7582 (3.5104) grad_norm 1.5436 (1.5582) [2022-01-22 14:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][600/1251] eta 0:24:06 lr 0.000471 time 1.7826 (2.2227) loss 3.8049 (3.5117) grad_norm 1.4548 (1.5585) [2022-01-22 14:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][610/1251] eta 0:23:43 lr 0.000471 time 2.3090 (2.2202) loss 3.8375 (3.5110) grad_norm 1.5312 (1.5582) [2022-01-22 14:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][620/1251] eta 0:23:19 lr 0.000471 time 1.9594 (2.2178) loss 3.3575 (3.5114) grad_norm 1.6088 (1.5575) [2022-01-22 14:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][630/1251] eta 0:22:56 lr 0.000471 time 2.1137 (2.2160) loss 3.9267 (3.5120) grad_norm 1.7670 (1.5574) [2022-01-22 15:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][640/1251] eta 0:22:33 lr 0.000471 time 2.3514 (2.2154) loss 3.7682 (3.5055) grad_norm 1.4673 (1.5579) [2022-01-22 15:00:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][650/1251] eta 0:22:12 lr 0.000471 time 2.6840 (2.2166) loss 4.3977 (3.5096) grad_norm 1.6171 (1.5575) [2022-01-22 15:00:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][660/1251] eta 0:21:49 lr 0.000471 time 1.9880 (2.2149) loss 3.8284 (3.5092) grad_norm 1.3959 (1.5569) [2022-01-22 15:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][670/1251] eta 0:21:27 lr 0.000471 time 2.3205 (2.2154) loss 3.8093 (3.5106) grad_norm 1.3333 (1.5574) [2022-01-22 15:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][680/1251] eta 0:21:05 lr 0.000471 time 2.4331 (2.2157) loss 2.7211 (3.5110) grad_norm 1.5023 (1.5566) [2022-01-22 15:01:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][690/1251] eta 0:20:43 lr 0.000471 time 2.4105 (2.2159) loss 4.1080 (3.5164) grad_norm 1.4134 (1.5555) [2022-01-22 15:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][700/1251] eta 0:20:21 lr 0.000471 time 1.9299 (2.2161) loss 3.9292 (3.5169) grad_norm 1.5094 (1.5548) [2022-01-22 15:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][710/1251] eta 0:19:59 lr 0.000471 time 2.7380 (2.2164) loss 3.7136 (3.5176) grad_norm 1.6102 (1.5546) [2022-01-22 15:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][720/1251] eta 0:19:35 lr 0.000471 time 1.6281 (2.2143) loss 3.4178 (3.5172) grad_norm 1.6706 (1.5552) [2022-01-22 15:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][730/1251] eta 0:19:12 lr 0.000471 time 1.8380 (2.2119) loss 3.9175 (3.5170) grad_norm 1.7251 (1.5563) [2022-01-22 15:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][740/1251] eta 0:18:48 lr 0.000471 time 1.8589 (2.2089) loss 3.8905 (3.5182) grad_norm 1.4742 (1.5544) [2022-01-22 15:04:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][750/1251] eta 0:18:26 lr 
0.000471 time 2.7313 (2.2088) loss 3.0639 (3.5200) grad_norm 1.4008 (1.5539) [2022-01-22 15:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][760/1251] eta 0:18:04 lr 0.000471 time 1.7388 (2.2090) loss 3.4516 (3.5208) grad_norm 1.3295 (1.5532) [2022-01-22 15:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][770/1251] eta 0:17:42 lr 0.000471 time 1.8454 (2.2093) loss 2.8538 (3.5153) grad_norm 1.4530 (1.5533) [2022-01-22 15:05:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][780/1251] eta 0:17:20 lr 0.000471 time 1.5504 (2.2082) loss 4.1469 (3.5150) grad_norm 1.6065 (1.5540) [2022-01-22 15:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][790/1251] eta 0:16:59 lr 0.000471 time 2.6301 (2.2113) loss 3.9869 (3.5162) grad_norm 1.5764 (1.5548) [2022-01-22 15:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][800/1251] eta 0:16:38 lr 0.000471 time 1.9461 (2.2135) loss 3.7706 (3.5167) grad_norm 1.3603 (1.5546) [2022-01-22 15:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][810/1251] eta 0:16:16 lr 0.000471 time 1.8893 (2.2139) loss 3.0954 (3.5166) grad_norm 1.4851 (1.5538) [2022-01-22 15:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][820/1251] eta 0:15:53 lr 0.000471 time 2.1833 (2.2123) loss 4.1610 (3.5159) grad_norm 1.7297 (1.5546) [2022-01-22 15:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][830/1251] eta 0:15:30 lr 0.000470 time 1.6143 (2.2092) loss 3.3548 (3.5179) grad_norm 1.5536 (1.5573) [2022-01-22 15:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][840/1251] eta 0:15:07 lr 0.000470 time 2.1647 (2.2076) loss 3.4445 (3.5207) grad_norm 1.4261 (1.5576) [2022-01-22 15:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][850/1251] eta 0:14:45 lr 0.000470 time 1.9610 (2.2081) loss 4.3730 (3.5233) grad_norm 1.4836 (1.5571) [2022-01-22 15:08:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][860/1251] eta 0:14:23 lr 0.000470 time 2.2713 (2.2083) loss 2.7908 (3.5244) grad_norm 1.5472 (1.5566) [2022-01-22 15:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][870/1251] eta 0:14:01 lr 0.000470 time 2.1798 (2.2084) loss 2.8829 (3.5248) grad_norm 1.5656 (1.5565) [2022-01-22 15:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][880/1251] eta 0:13:39 lr 0.000470 time 2.4925 (2.2080) loss 4.1064 (3.5253) grad_norm 1.4807 (1.5567) [2022-01-22 15:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][890/1251] eta 0:13:17 lr 0.000470 time 1.8739 (2.2080) loss 3.7315 (3.5262) grad_norm 1.8157 (1.5567) [2022-01-22 15:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][900/1251] eta 0:12:54 lr 0.000470 time 1.9154 (2.2080) loss 3.8890 (3.5271) grad_norm 1.3473 (1.5558) [2022-01-22 15:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][910/1251] eta 0:12:32 lr 0.000470 time 1.8212 (2.2069) loss 3.9441 (3.5287) grad_norm 1.5235 (1.5554) [2022-01-22 15:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][920/1251] eta 0:12:09 lr 0.000470 time 2.5569 (2.2053) loss 3.3460 (3.5311) grad_norm 1.5263 (1.5548) [2022-01-22 15:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][930/1251] eta 0:11:47 lr 0.000470 time 1.9050 (2.2047) loss 3.9008 (3.5336) grad_norm 1.5333 (1.5548) [2022-01-22 15:10:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][940/1251] eta 0:11:25 lr 0.000470 time 2.0600 (2.2036) loss 2.3541 (3.5325) grad_norm 1.5783 (1.5553) [2022-01-22 15:11:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][950/1251] eta 0:11:03 lr 0.000470 time 1.9682 (2.2047) loss 3.1599 (3.5313) grad_norm 1.5265 (1.5551) [2022-01-22 15:11:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][960/1251] eta 0:10:41 lr 
0.000470 time 2.5772 (2.2056) loss 2.7037 (3.5298) grad_norm 1.3925 (1.5546) [2022-01-22 15:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][970/1251] eta 0:10:19 lr 0.000470 time 2.3028 (2.2053) loss 3.8947 (3.5286) grad_norm 1.6796 (1.5548) [2022-01-22 15:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][980/1251] eta 0:09:57 lr 0.000470 time 1.9646 (2.2057) loss 3.8383 (3.5261) grad_norm 1.4337 (1.5546) [2022-01-22 15:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][990/1251] eta 0:09:35 lr 0.000470 time 1.9795 (2.2040) loss 3.8696 (3.5273) grad_norm 1.5391 (1.5550) [2022-01-22 15:13:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1000/1251] eta 0:09:12 lr 0.000470 time 2.2287 (2.2029) loss 4.1495 (3.5273) grad_norm 1.6358 (1.5545) [2022-01-22 15:13:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1010/1251] eta 0:08:50 lr 0.000470 time 2.7010 (2.2018) loss 2.7430 (3.5275) grad_norm 1.3536 (1.5549) [2022-01-22 15:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1020/1251] eta 0:08:28 lr 0.000470 time 2.8091 (2.2014) loss 3.2452 (3.5275) grad_norm 1.6029 (1.5549) [2022-01-22 15:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1030/1251] eta 0:08:06 lr 0.000470 time 1.8970 (2.2010) loss 3.9546 (3.5275) grad_norm 1.6571 (1.5544) [2022-01-22 15:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1040/1251] eta 0:07:44 lr 0.000470 time 1.5538 (2.2015) loss 3.6685 (3.5296) grad_norm 1.4801 (1.5547) [2022-01-22 15:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1050/1251] eta 0:07:22 lr 0.000470 time 2.8293 (2.2028) loss 4.0712 (3.5305) grad_norm 1.6020 (1.5547) [2022-01-22 15:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1060/1251] eta 0:07:00 lr 0.000470 time 1.8312 (2.2020) loss 2.7009 (3.5306) grad_norm 1.5469 (1.5540) [2022-01-22 
15:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1070/1251] eta 0:06:38 lr 0.000469 time 2.0740 (2.2005) loss 4.0451 (3.5315) grad_norm 1.5811 (1.5536) [2022-01-22 15:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1080/1251] eta 0:06:16 lr 0.000469 time 2.5308 (2.2013) loss 4.1795 (3.5292) grad_norm 1.7287 (1.5528) [2022-01-22 15:16:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1090/1251] eta 0:05:54 lr 0.000469 time 2.6259 (2.2004) loss 2.7513 (3.5278) grad_norm 1.5186 (1.5521) [2022-01-22 15:16:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1100/1251] eta 0:05:32 lr 0.000469 time 1.8679 (2.2001) loss 3.1583 (3.5292) grad_norm 1.5175 (1.5528) [2022-01-22 15:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1110/1251] eta 0:05:10 lr 0.000469 time 2.4816 (2.1999) loss 4.1323 (3.5276) grad_norm 1.5884 (1.5535) [2022-01-22 15:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1120/1251] eta 0:04:48 lr 0.000469 time 1.7956 (2.2004) loss 3.0034 (3.5228) grad_norm 1.8954 (1.5533) [2022-01-22 15:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1130/1251] eta 0:04:26 lr 0.000469 time 2.8731 (2.2009) loss 4.4714 (3.5225) grad_norm 1.6100 (1.5540) [2022-01-22 15:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1140/1251] eta 0:04:04 lr 0.000469 time 2.3841 (2.2020) loss 3.7565 (3.5229) grad_norm 1.5853 (1.5539) [2022-01-22 15:18:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1150/1251] eta 0:03:42 lr 0.000469 time 1.7723 (2.2018) loss 2.5800 (3.5240) grad_norm 1.5418 (1.5539) [2022-01-22 15:18:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1160/1251] eta 0:03:20 lr 0.000469 time 1.8743 (2.2011) loss 4.3565 (3.5247) grad_norm 1.5513 (1.5540) [2022-01-22 15:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1170/1251] 
eta 0:02:58 lr 0.000469 time 2.4230 (2.1998) loss 3.5090 (3.5254) grad_norm 1.5030 (1.5536) [2022-01-22 15:19:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1180/1251] eta 0:02:36 lr 0.000469 time 1.7917 (2.1988) loss 3.3345 (3.5254) grad_norm 1.4450 (1.5535) [2022-01-22 15:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1190/1251] eta 0:02:14 lr 0.000469 time 2.1718 (2.1982) loss 3.8005 (3.5259) grad_norm 1.5231 (1.5542) [2022-01-22 15:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1200/1251] eta 0:01:52 lr 0.000469 time 2.2518 (2.1972) loss 2.5867 (3.5259) grad_norm 1.4789 (1.5536) [2022-01-22 15:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1210/1251] eta 0:01:30 lr 0.000469 time 2.2968 (2.1959) loss 3.8352 (3.5257) grad_norm 1.5168 (1.5541) [2022-01-22 15:21:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1220/1251] eta 0:01:08 lr 0.000469 time 1.8336 (2.1964) loss 3.9806 (3.5278) grad_norm 1.6220 (1.5548) [2022-01-22 15:21:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1230/1251] eta 0:00:46 lr 0.000469 time 1.8093 (2.1974) loss 4.0522 (3.5298) grad_norm 1.5283 (1.5545) [2022-01-22 15:21:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1240/1251] eta 0:00:24 lr 0.000469 time 1.4278 (2.1971) loss 3.3619 (3.5301) grad_norm 1.4179 (1.5542) [2022-01-22 15:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [156/300][1250/1251] eta 0:00:02 lr 0.000469 time 1.3892 (2.1926) loss 3.1820 (3.5299) grad_norm 1.5039 (1.5539) [2022-01-22 15:22:07 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 156 training takes 0:45:43 [2022-01-22 15:22:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.727 (18.727) Loss 0.9997 (0.9997) Acc@1 76.367 (76.367) Acc@5 94.336 (94.336) [2022-01-22 15:22:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.577 (3.505) 
Loss 1.0016 (0.9867) Acc@1 75.977 (76.465) Acc@5 93.359 (93.688) [2022-01-22 15:23:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.273 (2.594) Loss 0.9397 (0.9910) Acc@1 77.637 (76.609) Acc@5 93.750 (93.578) [2022-01-22 15:23:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.630 (2.365) Loss 0.9961 (0.9865) Acc@1 77.344 (76.742) Acc@5 93.750 (93.621) [2022-01-22 15:23:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.778 (2.175) Loss 0.9070 (0.9860) Acc@1 78.125 (76.648) Acc@5 94.434 (93.571) [2022-01-22 15:23:44 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.516 Acc@5 93.514 [2022-01-22 15:23:44 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.5% [2022-01-22 15:23:44 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.52% [2022-01-22 15:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][0/1251] eta 8:07:47 lr 0.000469 time 23.3954 (23.3954) loss 2.3488 (2.3488) grad_norm 1.5571 (1.5571) [2022-01-22 15:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][10/1251] eta 1:25:04 lr 0.000469 time 1.3494 (4.1131) loss 3.8283 (3.4575) grad_norm 1.3287 (1.6275) [2022-01-22 15:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][20/1251] eta 1:05:10 lr 0.000469 time 2.3598 (3.1770) loss 3.9159 (3.5585) grad_norm 1.5965 (1.5916) [2022-01-22 15:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][30/1251] eta 0:58:35 lr 0.000469 time 1.5129 (2.8792) loss 2.6446 (3.5457) grad_norm 1.5213 (1.5881) [2022-01-22 15:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][40/1251] eta 0:55:29 lr 0.000469 time 3.5271 (2.7497) loss 3.3343 (3.5595) grad_norm 1.6181 (1.5829) [2022-01-22 15:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][50/1251] eta 0:53:16 lr 0.000469 time 2.8748 (2.6618) loss 3.6859 (3.5607) 
grad_norm 1.4792 (1.5659) [2022-01-22 15:26:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][60/1251] eta 0:51:19 lr 0.000468 time 2.7617 (2.5860) loss 3.6630 (3.5668) grad_norm 1.5553 (1.5452) [2022-01-22 15:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][70/1251] eta 0:49:42 lr 0.000468 time 1.6311 (2.5251) loss 2.2930 (3.5478) grad_norm 1.5598 (1.5556) [2022-01-22 15:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][80/1251] eta 0:48:15 lr 0.000468 time 2.2533 (2.4728) loss 3.8063 (3.5468) grad_norm 1.6022 (1.5628) [2022-01-22 15:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][90/1251] eta 0:46:58 lr 0.000468 time 2.2425 (2.4274) loss 3.6952 (3.5391) grad_norm 1.6922 (1.5643) [2022-01-22 15:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][100/1251] eta 0:46:04 lr 0.000468 time 2.2830 (2.4016) loss 3.5568 (3.5018) grad_norm 1.5786 (1.5739) [2022-01-22 15:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][110/1251] eta 0:45:06 lr 0.000468 time 1.9028 (2.3719) loss 2.3825 (3.4899) grad_norm 1.4868 (1.5740) [2022-01-22 15:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][120/1251] eta 0:44:30 lr 0.000468 time 2.8759 (2.3613) loss 2.4463 (3.4843) grad_norm 2.1178 (1.5786) [2022-01-22 15:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][130/1251] eta 0:43:57 lr 0.000468 time 2.9494 (2.3526) loss 4.0067 (3.4861) grad_norm 1.4924 (1.5787) [2022-01-22 15:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][140/1251] eta 0:43:22 lr 0.000468 time 2.2937 (2.3423) loss 3.3282 (3.4770) grad_norm 1.7077 (1.5702) [2022-01-22 15:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][150/1251] eta 0:42:50 lr 0.000468 time 1.9208 (2.3344) loss 2.9999 (3.4806) grad_norm 1.3859 (1.5692) [2022-01-22 15:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[157/300][160/1251] eta 0:42:09 lr 0.000468 time 1.8305 (2.3188) loss 2.4747 (3.4793) grad_norm 1.5043 (1.5723) [2022-01-22 15:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][170/1251] eta 0:41:29 lr 0.000468 time 1.8440 (2.3033) loss 3.5927 (3.4825) grad_norm 1.5645 (1.5673) [2022-01-22 15:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][180/1251] eta 0:41:03 lr 0.000468 time 2.5515 (2.2999) loss 3.6369 (3.4797) grad_norm 1.6924 (1.5616) [2022-01-22 15:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][190/1251] eta 0:40:29 lr 0.000468 time 1.8742 (2.2902) loss 3.2471 (3.4758) grad_norm 1.4796 (1.5586) [2022-01-22 15:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][200/1251] eta 0:40:07 lr 0.000468 time 1.8577 (2.2906) loss 4.0475 (3.4913) grad_norm 1.5123 (1.5604) [2022-01-22 15:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][210/1251] eta 0:39:45 lr 0.000468 time 2.0499 (2.2917) loss 3.7496 (3.4974) grad_norm 1.6861 (1.5618) [2022-01-22 15:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][220/1251] eta 0:39:37 lr 0.000468 time 2.0996 (2.3065) loss 3.6596 (3.5143) grad_norm 1.4305 (1.5597) [2022-01-22 15:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][230/1251] eta 0:39:06 lr 0.000468 time 2.2129 (2.2985) loss 4.0891 (3.5090) grad_norm 1.5216 (1.5612) [2022-01-22 15:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][240/1251] eta 0:38:30 lr 0.000468 time 1.7431 (2.2858) loss 4.1985 (3.5167) grad_norm 1.5208 (1.5621) [2022-01-22 15:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][250/1251] eta 0:37:54 lr 0.000468 time 1.6650 (2.2722) loss 4.1713 (3.5230) grad_norm 1.4196 (1.5601) [2022-01-22 15:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][260/1251] eta 0:37:34 lr 0.000468 time 1.9477 (2.2746) loss 4.2060 (3.5191) grad_norm 
1.5200 (1.5602) [2022-01-22 15:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][270/1251] eta 0:37:11 lr 0.000468 time 1.8969 (2.2748) loss 2.7928 (3.5166) grad_norm 1.4863 (1.5642) [2022-01-22 15:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][280/1251] eta 0:36:46 lr 0.000468 time 1.9244 (2.2723) loss 3.5766 (3.5185) grad_norm 1.8069 (1.5680) [2022-01-22 15:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][290/1251] eta 0:36:16 lr 0.000468 time 1.8181 (2.2652) loss 4.0290 (3.5193) grad_norm 1.6021 (1.5677) [2022-01-22 15:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][300/1251] eta 0:35:59 lr 0.000468 time 1.9082 (2.2709) loss 2.3099 (3.5091) grad_norm 1.6620 (1.5685) [2022-01-22 15:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][310/1251] eta 0:35:28 lr 0.000467 time 1.9511 (2.2620) loss 3.2395 (3.5114) grad_norm 1.6385 (1.5680) [2022-01-22 15:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][320/1251] eta 0:35:00 lr 0.000467 time 2.1403 (2.2561) loss 3.7130 (3.5089) grad_norm 1.4589 (1.5679) [2022-01-22 15:36:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][330/1251] eta 0:34:35 lr 0.000467 time 2.4673 (2.2536) loss 3.7243 (3.5079) grad_norm 1.6393 (1.5684) [2022-01-22 15:36:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][340/1251] eta 0:34:09 lr 0.000467 time 2.1150 (2.2495) loss 2.4759 (3.5067) grad_norm 1.6223 (1.5672) [2022-01-22 15:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][350/1251] eta 0:33:44 lr 0.000467 time 1.8489 (2.2468) loss 3.9541 (3.5104) grad_norm 1.5143 (1.5657) [2022-01-22 15:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][360/1251] eta 0:33:24 lr 0.000467 time 2.3006 (2.2502) loss 2.9613 (3.5123) grad_norm 1.4146 (1.5654) [2022-01-22 15:37:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[157/300][370/1251] eta 0:32:59 lr 0.000467 time 2.3217 (2.2472) loss 2.9518 (3.5045) grad_norm 1.7493 (1.5670) [2022-01-22 15:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][380/1251] eta 0:32:33 lr 0.000467 time 2.0328 (2.2429) loss 3.5956 (3.5059) grad_norm 1.4196 (1.5648) [2022-01-22 15:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][390/1251] eta 0:32:11 lr 0.000467 time 1.9804 (2.2430) loss 3.0347 (3.5114) grad_norm 1.5079 (1.5627) [2022-01-22 15:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][400/1251] eta 0:31:47 lr 0.000467 time 1.9145 (2.2414) loss 3.6350 (3.5139) grad_norm 1.7784 (1.5631) [2022-01-22 15:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][410/1251] eta 0:31:23 lr 0.000467 time 2.3554 (2.2399) loss 3.6098 (3.5088) grad_norm 1.5416 (1.5599) [2022-01-22 15:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][420/1251] eta 0:30:58 lr 0.000467 time 2.1959 (2.2361) loss 3.7429 (3.5155) grad_norm 1.4180 (1.5595) [2022-01-22 15:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][430/1251] eta 0:30:33 lr 0.000467 time 2.2639 (2.2326) loss 3.6787 (3.5131) grad_norm 1.6182 (1.5594) [2022-01-22 15:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][440/1251] eta 0:30:08 lr 0.000467 time 1.8142 (2.2294) loss 2.3151 (3.5053) grad_norm 1.4850 (1.5580) [2022-01-22 15:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][450/1251] eta 0:29:45 lr 0.000467 time 2.4886 (2.2296) loss 2.8302 (3.4985) grad_norm 1.4132 (1.5566) [2022-01-22 15:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][460/1251] eta 0:29:22 lr 0.000467 time 1.8140 (2.2288) loss 3.6660 (3.5041) grad_norm 1.7632 (1.5572) [2022-01-22 15:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][470/1251] eta 0:29:01 lr 0.000467 time 2.3485 (2.2298) loss 3.6340 (3.5061) grad_norm 
1.4091 (1.5568) [2022-01-22 15:41:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][480/1251] eta 0:28:39 lr 0.000467 time 1.9268 (2.2298) loss 3.9448 (3.5051) grad_norm 1.4629 (1.5556) [2022-01-22 15:41:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][490/1251] eta 0:28:17 lr 0.000467 time 2.8883 (2.2309) loss 4.2670 (3.5063) grad_norm 1.5482 (1.5537) [2022-01-22 15:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][500/1251] eta 0:27:54 lr 0.000467 time 1.6534 (2.2294) loss 3.7209 (3.5062) grad_norm 1.4904 (1.5538) [2022-01-22 15:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][510/1251] eta 0:27:33 lr 0.000467 time 3.6540 (2.2314) loss 3.7601 (3.5031) grad_norm 1.6332 (1.5537) [2022-01-22 15:43:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][520/1251] eta 0:27:07 lr 0.000467 time 1.8153 (2.2263) loss 3.5722 (3.5040) grad_norm 1.5517 (1.5533) [2022-01-22 15:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][530/1251] eta 0:26:44 lr 0.000467 time 2.5563 (2.2257) loss 3.8004 (3.5002) grad_norm 1.4615 (1.5532) [2022-01-22 15:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][540/1251] eta 0:26:20 lr 0.000467 time 2.2136 (2.2227) loss 3.2780 (3.4984) grad_norm 1.4541 (1.5545) [2022-01-22 15:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][550/1251] eta 0:25:57 lr 0.000466 time 2.4500 (2.2223) loss 3.5725 (3.5008) grad_norm 1.4204 (1.5548) [2022-01-22 15:44:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][560/1251] eta 0:25:35 lr 0.000466 time 2.1911 (2.2215) loss 4.5096 (3.5024) grad_norm 1.4701 (1.5561) [2022-01-22 15:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][570/1251] eta 0:25:13 lr 0.000466 time 2.9050 (2.2226) loss 3.8544 (3.5040) grad_norm 1.5490 (1.5561) [2022-01-22 15:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[157/300][580/1251] eta 0:24:49 lr 0.000466 time 2.4689 (2.2202) loss 3.2647 (3.5024) grad_norm 1.6910 (1.5566) [2022-01-22 15:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][590/1251] eta 0:24:29 lr 0.000466 time 2.5054 (2.2225) loss 3.4538 (3.5025) grad_norm 1.4933 (1.5580) [2022-01-22 15:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][600/1251] eta 0:24:07 lr 0.000466 time 2.1845 (2.2240) loss 3.7925 (3.5037) grad_norm 1.3761 (1.5581) [2022-01-22 15:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][610/1251] eta 0:23:47 lr 0.000466 time 2.3475 (2.2276) loss 4.1846 (3.5049) grad_norm 1.4644 (1.5576) [2022-01-22 15:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][620/1251] eta 0:23:25 lr 0.000466 time 2.9489 (2.2278) loss 4.1499 (3.5043) grad_norm 1.5010 (1.5577) [2022-01-22 15:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][630/1251] eta 0:23:01 lr 0.000466 time 1.8565 (2.2240) loss 3.2447 (3.5086) grad_norm 2.0593 (1.5574) [2022-01-22 15:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][640/1251] eta 0:22:36 lr 0.000466 time 2.0310 (2.2196) loss 3.6815 (3.5061) grad_norm 1.3629 (1.5573) [2022-01-22 15:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][650/1251] eta 0:22:12 lr 0.000466 time 1.9131 (2.2172) loss 4.0650 (3.5052) grad_norm 1.8100 (1.5574) [2022-01-22 15:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][660/1251] eta 0:21:47 lr 0.000466 time 1.6010 (2.2131) loss 3.3966 (3.5074) grad_norm 1.4311 (1.5587) [2022-01-22 15:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][670/1251] eta 0:21:25 lr 0.000466 time 2.4307 (2.2130) loss 2.7935 (3.5071) grad_norm 1.5388 (1.5587) [2022-01-22 15:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][680/1251] eta 0:21:05 lr 0.000466 time 2.2466 (2.2162) loss 3.5796 (3.5079) grad_norm 
1.4082 (1.5609) [2022-01-22 15:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][690/1251] eta 0:20:44 lr 0.000466 time 2.4772 (2.2183) loss 4.3077 (3.5125) grad_norm 1.5830 (1.5604) [2022-01-22 15:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][700/1251] eta 0:20:21 lr 0.000466 time 1.8983 (2.2160) loss 3.5918 (3.5120) grad_norm 1.5064 (1.5593) [2022-01-22 15:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][710/1251] eta 0:19:58 lr 0.000466 time 1.9124 (2.2156) loss 4.3912 (3.5077) grad_norm 1.5095 (1.5587) [2022-01-22 15:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][720/1251] eta 0:19:36 lr 0.000466 time 2.4075 (2.2151) loss 4.0512 (3.5109) grad_norm 1.4880 (1.5578) [2022-01-22 15:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][730/1251] eta 0:19:13 lr 0.000466 time 1.8986 (2.2149) loss 2.7413 (3.5095) grad_norm 1.5177 (1.5589) [2022-01-22 15:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][740/1251] eta 0:18:51 lr 0.000466 time 2.0690 (2.2147) loss 3.4078 (3.5098) grad_norm 1.5237 (1.5586) [2022-01-22 15:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][750/1251] eta 0:18:29 lr 0.000466 time 1.8528 (2.2144) loss 3.8974 (3.5102) grad_norm 1.5900 (1.5595) [2022-01-22 15:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][760/1251] eta 0:18:07 lr 0.000466 time 2.2222 (2.2146) loss 2.8132 (3.5105) grad_norm 1.7577 (1.5602) [2022-01-22 15:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][770/1251] eta 0:17:45 lr 0.000466 time 2.1913 (2.2155) loss 4.1221 (3.5126) grad_norm 1.6310 (1.5611) [2022-01-22 15:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][780/1251] eta 0:17:22 lr 0.000466 time 1.6095 (2.2134) loss 3.1760 (3.5140) grad_norm 1.4116 (1.5622) [2022-01-22 15:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[157/300][790/1251] eta 0:17:00 lr 0.000465 time 2.0829 (2.2142) loss 2.3707 (3.5067) grad_norm 1.7096 (1.5630) [2022-01-22 15:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][800/1251] eta 0:16:38 lr 0.000465 time 2.1832 (2.2129) loss 2.4851 (3.5044) grad_norm 1.3273 (1.5630) [2022-01-22 15:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][810/1251] eta 0:16:15 lr 0.000465 time 2.1129 (2.2111) loss 3.7956 (3.5052) grad_norm 1.6139 (1.5636) [2022-01-22 15:53:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][820/1251] eta 0:15:51 lr 0.000465 time 1.6831 (2.2085) loss 3.9945 (3.5080) grad_norm 1.7483 (1.5640) [2022-01-22 15:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][830/1251] eta 0:15:29 lr 0.000465 time 2.5405 (2.2074) loss 3.3651 (3.5071) grad_norm 1.5641 (1.5639) [2022-01-22 15:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][840/1251] eta 0:15:06 lr 0.000465 time 2.2206 (2.2064) loss 2.6710 (3.5060) grad_norm 1.5381 (1.5647) [2022-01-22 15:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][850/1251] eta 0:14:44 lr 0.000465 time 2.4966 (2.2064) loss 3.8643 (3.5095) grad_norm 1.3896 (1.5654) [2022-01-22 15:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][860/1251] eta 0:14:22 lr 0.000465 time 1.9390 (2.2056) loss 3.1513 (3.5099) grad_norm 1.5631 (1.5652) [2022-01-22 15:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][870/1251] eta 0:14:00 lr 0.000465 time 2.3458 (2.2048) loss 3.7270 (3.5099) grad_norm 1.3941 (1.5649) [2022-01-22 15:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][880/1251] eta 0:13:38 lr 0.000465 time 2.3821 (2.2052) loss 3.6430 (3.5124) grad_norm 1.4219 (1.5648) [2022-01-22 15:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][890/1251] eta 0:13:16 lr 0.000465 time 2.2183 (2.2057) loss 4.1806 (3.5122) grad_norm 
1.5264 (1.5646) [2022-01-22 15:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][900/1251] eta 0:12:55 lr 0.000465 time 1.5398 (2.2093) loss 3.8185 (3.5144) grad_norm 1.4018 (1.5651) [2022-01-22 15:57:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][910/1251] eta 0:12:34 lr 0.000465 time 3.6572 (2.2120) loss 3.2171 (3.5151) grad_norm 1.4531 (1.5636) [2022-01-22 15:57:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][920/1251] eta 0:12:12 lr 0.000465 time 1.8880 (2.2122) loss 3.1562 (3.5150) grad_norm 1.2337 (1.5632) [2022-01-22 15:58:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][930/1251] eta 0:11:49 lr 0.000465 time 1.8662 (2.2105) loss 2.6309 (3.5119) grad_norm 1.3833 (1.5623) [2022-01-22 15:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][940/1251] eta 0:11:26 lr 0.000465 time 1.8281 (2.2087) loss 3.7361 (3.5116) grad_norm 1.6701 (1.5620) [2022-01-22 15:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][950/1251] eta 0:11:04 lr 0.000465 time 3.1305 (2.2079) loss 3.7190 (3.5113) grad_norm 1.4958 (1.5623) [2022-01-22 15:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][960/1251] eta 0:10:41 lr 0.000465 time 1.6559 (2.2060) loss 2.9128 (3.5113) grad_norm 1.3503 (1.5621) [2022-01-22 15:59:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][970/1251] eta 0:10:19 lr 0.000465 time 2.0200 (2.2061) loss 2.7127 (3.5076) grad_norm 1.3345 (1.5613) [2022-01-22 15:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][980/1251] eta 0:09:57 lr 0.000465 time 2.0271 (2.2062) loss 4.2909 (3.5076) grad_norm 1.3704 (1.5607) [2022-01-22 16:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][990/1251] eta 0:09:35 lr 0.000465 time 2.2172 (2.2056) loss 3.2807 (3.5055) grad_norm 1.4949 (1.5609) [2022-01-22 16:00:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[157/300][1000/1251] eta 0:09:13 lr 0.000465 time 2.2080 (2.2042) loss 3.9091 (3.5076) grad_norm 1.3958 (1.5607) [2022-01-22 16:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1010/1251] eta 0:08:51 lr 0.000465 time 2.1186 (2.2038) loss 3.7057 (3.5070) grad_norm 1.4097 (1.5609) [2022-01-22 16:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1020/1251] eta 0:08:28 lr 0.000465 time 1.8576 (2.2020) loss 4.4485 (3.5082) grad_norm 1.4296 (1.5603) [2022-01-22 16:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1030/1251] eta 0:08:06 lr 0.000464 time 2.2112 (2.2015) loss 3.7617 (3.5059) grad_norm 1.6516 (1.5605) [2022-01-22 16:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1040/1251] eta 0:07:44 lr 0.000464 time 3.2464 (2.2028) loss 4.1823 (3.5057) grad_norm 1.5136 (1.5602) [2022-01-22 16:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1050/1251] eta 0:07:22 lr 0.000464 time 2.3323 (2.2035) loss 3.5881 (3.5068) grad_norm 1.6845 (1.5601) [2022-01-22 16:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1060/1251] eta 0:07:01 lr 0.000464 time 2.0106 (2.2042) loss 3.7987 (3.5057) grad_norm 1.5070 (1.5597) [2022-01-22 16:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1070/1251] eta 0:06:39 lr 0.000464 time 2.5377 (2.2046) loss 3.9872 (3.5058) grad_norm 1.4744 (1.5596) [2022-01-22 16:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1080/1251] eta 0:06:16 lr 0.000464 time 2.5999 (2.2046) loss 3.2747 (3.5038) grad_norm 1.4842 (1.5594) [2022-01-22 16:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1090/1251] eta 0:05:54 lr 0.000464 time 1.8680 (2.2040) loss 3.7317 (3.5031) grad_norm 1.6299 (1.5596) [2022-01-22 16:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1100/1251] eta 0:05:32 lr 0.000464 time 1.7710 (2.2027) loss 4.1049 (3.5031) 
grad_norm 1.6938 (1.5599) [2022-01-22 16:04:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1110/1251] eta 0:05:10 lr 0.000464 time 2.2969 (2.2020) loss 3.9456 (3.5003) grad_norm 1.6029 (1.5595) [2022-01-22 16:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1120/1251] eta 0:04:48 lr 0.000464 time 3.1931 (2.2020) loss 3.5111 (3.5000) grad_norm 1.4363 (1.5592) [2022-01-22 16:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1130/1251] eta 0:04:26 lr 0.000464 time 1.9006 (2.2024) loss 3.9579 (3.5013) grad_norm 1.4481 (1.5586) [2022-01-22 16:05:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1140/1251] eta 0:04:04 lr 0.000464 time 1.9231 (2.2033) loss 3.3086 (3.5039) grad_norm 1.3765 (1.5588) [2022-01-22 16:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1150/1251] eta 0:03:42 lr 0.000464 time 1.7659 (2.2026) loss 2.8956 (3.5040) grad_norm 1.3605 (1.5587) [2022-01-22 16:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1160/1251] eta 0:03:20 lr 0.000464 time 2.8450 (2.2012) loss 3.8895 (3.5036) grad_norm 1.5734 (1.5587) [2022-01-22 16:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1170/1251] eta 0:02:58 lr 0.000464 time 2.1554 (2.1992) loss 3.7133 (3.5029) grad_norm 1.3844 (1.5585) [2022-01-22 16:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1180/1251] eta 0:02:36 lr 0.000464 time 2.3939 (2.1976) loss 3.2271 (3.5033) grad_norm 1.4471 (1.5580) [2022-01-22 16:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1190/1251] eta 0:02:14 lr 0.000464 time 2.0965 (2.1972) loss 3.6895 (3.5023) grad_norm 1.4726 (1.5577) [2022-01-22 16:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1200/1251] eta 0:01:52 lr 0.000464 time 2.7581 (2.1971) loss 3.4323 (3.5017) grad_norm 1.3885 (1.5574) [2022-01-22 16:08:05 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [157/300][1210/1251] eta 0:01:30 lr 0.000464 time 1.8854 (2.1978) loss 4.1112 (3.5033) grad_norm 1.6561 (1.5567) [2022-01-22 16:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1220/1251] eta 0:01:08 lr 0.000464 time 2.2145 (2.1983) loss 3.9045 (3.5043) grad_norm 1.5097 (1.5560) [2022-01-22 16:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1230/1251] eta 0:00:46 lr 0.000464 time 2.3771 (2.1979) loss 3.6732 (3.5029) grad_norm 1.6064 (1.5557) [2022-01-22 16:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1240/1251] eta 0:00:24 lr 0.000464 time 1.4784 (2.1968) loss 2.9311 (3.5016) grad_norm 1.5669 (1.5556) [2022-01-22 16:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [157/300][1250/1251] eta 0:00:02 lr 0.000464 time 1.1798 (2.1918) loss 3.1203 (3.5009) grad_norm 1.4110 (1.5554) [2022-01-22 16:09:26 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 157 training takes 0:45:42 [2022-01-22 16:09:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.579 (18.579) Loss 1.0036 (1.0036) Acc@1 75.293 (75.293) Acc@5 93.457 (93.457) [2022-01-22 16:10:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.642 (3.487) Loss 0.9187 (0.9846) Acc@1 78.711 (76.820) Acc@5 94.141 (93.652) [2022-01-22 16:10:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.622 (2.591) Loss 0.9403 (0.9916) Acc@1 77.832 (76.823) Acc@5 93.652 (93.411) [2022-01-22 16:10:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.602 (2.285) Loss 0.9345 (0.9994) Acc@1 76.758 (76.459) Acc@5 94.434 (93.400) [2022-01-22 16:10:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.096 (2.162) Loss 1.0205 (0.9964) Acc@1 75.586 (76.348) Acc@5 93.555 (93.481) [2022-01-22 16:11:02 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.392 Acc@5 93.510 [2022-01-22 16:11:02 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-01-22 16:11:02 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.52% [2022-01-22 16:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][0/1251] eta 7:20:07 lr 0.000464 time 21.1091 (21.1091) loss 2.6215 (2.6215) grad_norm 1.4429 (1.4429) [2022-01-22 16:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][10/1251] eta 1:23:12 lr 0.000464 time 1.8262 (4.0226) loss 3.6053 (3.4581) grad_norm 1.3819 (1.5263) [2022-01-22 16:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][20/1251] eta 1:04:56 lr 0.000463 time 1.4635 (3.1655) loss 2.6477 (3.3797) grad_norm 1.4445 (1.5696) [2022-01-22 16:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][30/1251] eta 0:58:11 lr 0.000463 time 1.4733 (2.8596) loss 3.4976 (3.4533) grad_norm 1.5420 (1.5684) [2022-01-22 16:12:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][40/1251] eta 0:55:44 lr 0.000463 time 4.7301 (2.7614) loss 3.7480 (3.4783) grad_norm 1.5699 (1.5751) [2022-01-22 16:13:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][50/1251] eta 0:52:56 lr 0.000463 time 1.5924 (2.6451) loss 3.7330 (3.4753) grad_norm 1.6257 (1.5871) [2022-01-22 16:13:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][60/1251] eta 0:51:03 lr 0.000463 time 1.5792 (2.5718) loss 3.6339 (3.5027) grad_norm 1.5550 (1.5799) [2022-01-22 16:14:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][70/1251] eta 0:49:24 lr 0.000463 time 1.8945 (2.5098) loss 3.7116 (3.5228) grad_norm 1.7071 (1.5772) [2022-01-22 16:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][80/1251] eta 0:48:33 lr 0.000463 time 3.5605 (2.4876) loss 4.2316 (3.5391) grad_norm 1.7049 (1.5755) [2022-01-22 16:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][90/1251] eta 0:47:23 lr 0.000463 time 
1.7295 (2.4490) loss 3.0003 (3.5328) grad_norm 1.5817 (1.5831) [2022-01-22 16:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][100/1251] eta 0:46:10 lr 0.000463 time 1.9280 (2.4072) loss 4.0760 (3.5576) grad_norm 1.5647 (1.5814) [2022-01-22 16:15:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][110/1251] eta 0:45:34 lr 0.000463 time 2.6498 (2.3964) loss 3.8031 (3.5565) grad_norm 1.5255 (1.5709) [2022-01-22 16:15:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][120/1251] eta 0:45:18 lr 0.000463 time 3.4186 (2.4035) loss 2.7319 (3.5405) grad_norm 1.8851 (1.5712) [2022-01-22 16:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][130/1251] eta 0:44:34 lr 0.000463 time 1.9116 (2.3858) loss 2.6715 (3.5322) grad_norm 1.6541 (1.5709) [2022-01-22 16:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][140/1251] eta 0:43:51 lr 0.000463 time 1.8318 (2.3689) loss 3.4345 (3.5334) grad_norm 1.7312 (1.5693) [2022-01-22 16:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][150/1251] eta 0:42:59 lr 0.000463 time 1.7807 (2.3431) loss 4.1632 (3.5412) grad_norm 1.4525 (1.5629) [2022-01-22 16:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][160/1251] eta 0:42:12 lr 0.000463 time 2.2368 (2.3214) loss 3.8620 (3.5518) grad_norm 1.4606 (1.5651) [2022-01-22 16:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][170/1251] eta 0:41:37 lr 0.000463 time 2.0644 (2.3106) loss 4.1370 (3.5497) grad_norm 1.7863 (1.5658) [2022-01-22 16:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][180/1251] eta 0:41:11 lr 0.000463 time 2.2541 (2.3079) loss 3.8454 (3.5323) grad_norm 1.4327 (1.5650) [2022-01-22 16:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][190/1251] eta 0:40:45 lr 0.000463 time 2.4201 (2.3047) loss 3.9891 (3.5448) grad_norm 1.6140 (1.5690) [2022-01-22 16:18:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][200/1251] eta 0:40:21 lr 0.000463 time 2.7680 (2.3042) loss 3.3856 (3.5331) grad_norm 1.5703 (1.5651) [2022-01-22 16:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][210/1251] eta 0:39:50 lr 0.000463 time 1.6797 (2.2965) loss 4.1743 (3.5444) grad_norm 1.4959 (1.5670) [2022-01-22 16:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][220/1251] eta 0:39:25 lr 0.000463 time 2.1835 (2.2944) loss 2.7617 (3.5337) grad_norm 1.6149 (1.5694) [2022-01-22 16:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][230/1251] eta 0:38:54 lr 0.000463 time 1.6367 (2.2861) loss 4.2494 (3.5249) grad_norm 1.7607 (1.5742) [2022-01-22 16:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][240/1251] eta 0:38:27 lr 0.000463 time 2.4566 (2.2825) loss 4.1750 (3.5179) grad_norm 1.7877 (1.5737) [2022-01-22 16:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][250/1251] eta 0:37:57 lr 0.000463 time 1.5662 (2.2756) loss 3.1334 (3.5197) grad_norm 1.3293 (1.5702) [2022-01-22 16:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][260/1251] eta 0:37:30 lr 0.000463 time 1.7306 (2.2710) loss 3.4439 (3.5177) grad_norm 1.5775 (1.5744) [2022-01-22 16:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][270/1251] eta 0:37:03 lr 0.000462 time 2.1037 (2.2663) loss 3.8684 (3.5140) grad_norm 1.4687 (1.5731) [2022-01-22 16:21:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][280/1251] eta 0:36:39 lr 0.000462 time 2.4737 (2.2653) loss 2.9085 (3.5079) grad_norm 1.4356 (1.5733) [2022-01-22 16:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][290/1251] eta 0:36:18 lr 0.000462 time 2.4294 (2.2666) loss 3.7106 (3.5024) grad_norm 1.6908 (1.5751) [2022-01-22 16:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][300/1251] eta 0:35:51 lr 
0.000462 time 1.6736 (2.2626) loss 3.9603 (3.4931) grad_norm 1.3147 (1.5743) [2022-01-22 16:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][310/1251] eta 0:35:27 lr 0.000462 time 1.7603 (2.2604) loss 3.1728 (3.4937) grad_norm 1.3369 (1.5710) [2022-01-22 16:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][320/1251] eta 0:35:04 lr 0.000462 time 2.7272 (2.2604) loss 3.4067 (3.4883) grad_norm 1.4265 (1.5708) [2022-01-22 16:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][330/1251] eta 0:34:42 lr 0.000462 time 2.5287 (2.2606) loss 4.0862 (3.4928) grad_norm 1.5732 (1.5686) [2022-01-22 16:23:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][340/1251] eta 0:34:16 lr 0.000462 time 2.4987 (2.2572) loss 3.2780 (3.4962) grad_norm 1.4316 (1.5669) [2022-01-22 16:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][350/1251] eta 0:33:48 lr 0.000462 time 1.6770 (2.2517) loss 4.1433 (3.5017) grad_norm 1.5704 (1.5665) [2022-01-22 16:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][360/1251] eta 0:33:21 lr 0.000462 time 1.8587 (2.2468) loss 3.2201 (3.5031) grad_norm 1.3964 (1.5658) [2022-01-22 16:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][370/1251] eta 0:32:57 lr 0.000462 time 1.9404 (2.2446) loss 3.9226 (3.5009) grad_norm 1.6400 (1.5674) [2022-01-22 16:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][380/1251] eta 0:32:32 lr 0.000462 time 2.1936 (2.2420) loss 2.6109 (3.4909) grad_norm 1.3387 (1.5669) [2022-01-22 16:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][390/1251] eta 0:32:08 lr 0.000462 time 2.1925 (2.2399) loss 3.1936 (3.4960) grad_norm 1.4498 (1.5682) [2022-01-22 16:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][400/1251] eta 0:31:46 lr 0.000462 time 2.2596 (2.2409) loss 3.2609 (3.4978) grad_norm 1.6294 (1.5672) [2022-01-22 16:26:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][410/1251] eta 0:31:24 lr 0.000462 time 2.2611 (2.2403) loss 3.7730 (3.4942) grad_norm 1.5987 (1.5655) [2022-01-22 16:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][420/1251] eta 0:30:59 lr 0.000462 time 2.3200 (2.2381) loss 2.9668 (3.4899) grad_norm 1.2771 (1.5627) [2022-01-22 16:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][430/1251] eta 0:30:32 lr 0.000462 time 1.7622 (2.2322) loss 3.5802 (3.4883) grad_norm 1.5131 (1.5614) [2022-01-22 16:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][440/1251] eta 0:30:08 lr 0.000462 time 2.3644 (2.2298) loss 2.9792 (3.4880) grad_norm 1.9055 (1.5625) [2022-01-22 16:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][450/1251] eta 0:29:44 lr 0.000462 time 2.2496 (2.2274) loss 4.1404 (3.4847) grad_norm 1.3895 (1.5616) [2022-01-22 16:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][460/1251] eta 0:29:19 lr 0.000462 time 1.9274 (2.2243) loss 3.6570 (3.4885) grad_norm 1.6416 (1.5615) [2022-01-22 16:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][470/1251] eta 0:28:58 lr 0.000462 time 2.4488 (2.2263) loss 3.8894 (3.4894) grad_norm 1.6811 (1.5625) [2022-01-22 16:28:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][480/1251] eta 0:28:38 lr 0.000462 time 2.8327 (2.2287) loss 3.5423 (3.4964) grad_norm 1.4722 (1.5605) [2022-01-22 16:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][490/1251] eta 0:28:16 lr 0.000462 time 2.3589 (2.2290) loss 3.6124 (3.4987) grad_norm 1.5176 (1.5591) [2022-01-22 16:29:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][500/1251] eta 0:27:54 lr 0.000462 time 2.3315 (2.2296) loss 3.4718 (3.5005) grad_norm 1.3444 (1.5589) [2022-01-22 16:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][510/1251] eta 0:27:31 lr 
0.000461 time 2.2126 (2.2292) loss 3.3070 (3.4999) grad_norm 1.5400 (1.5588) [2022-01-22 16:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][520/1251] eta 0:27:08 lr 0.000461 time 2.2393 (2.2278) loss 2.4303 (3.4977) grad_norm 1.4172 (1.5585) [2022-01-22 16:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][530/1251] eta 0:26:46 lr 0.000461 time 2.1296 (2.2275) loss 3.3463 (3.5000) grad_norm 1.3716 (1.5591) [2022-01-22 16:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][540/1251] eta 0:26:23 lr 0.000461 time 1.6177 (2.2265) loss 3.0652 (3.4994) grad_norm 1.3917 (1.5592) [2022-01-22 16:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][550/1251] eta 0:25:58 lr 0.000461 time 1.8881 (2.2232) loss 3.8433 (3.5041) grad_norm 1.4386 (1.5572) [2022-01-22 16:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][560/1251] eta 0:25:34 lr 0.000461 time 1.9259 (2.2210) loss 2.6929 (3.5033) grad_norm 1.5991 (1.5567) [2022-01-22 16:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][570/1251] eta 0:25:11 lr 0.000461 time 1.8614 (2.2200) loss 3.3441 (3.5044) grad_norm 1.4712 (1.5564) [2022-01-22 16:32:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][580/1251] eta 0:24:51 lr 0.000461 time 2.1828 (2.2226) loss 3.1799 (3.5073) grad_norm 1.4206 (1.5568) [2022-01-22 16:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][590/1251] eta 0:24:29 lr 0.000461 time 2.7844 (2.2234) loss 3.8518 (3.5114) grad_norm 1.5137 (1.5568) [2022-01-22 16:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][600/1251] eta 0:24:07 lr 0.000461 time 2.3930 (2.2233) loss 3.9825 (3.5154) grad_norm 1.3567 (1.5577) [2022-01-22 16:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][610/1251] eta 0:23:43 lr 0.000461 time 1.6094 (2.2200) loss 3.2406 (3.5175) grad_norm 1.4719 (1.5588) [2022-01-22 16:33:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][620/1251] eta 0:23:19 lr 0.000461 time 1.8952 (2.2172) loss 3.3230 (3.5181) grad_norm 1.6069 (1.5591) [2022-01-22 16:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][630/1251] eta 0:22:54 lr 0.000461 time 1.9788 (2.2139) loss 3.5240 (3.5195) grad_norm 1.6712 (1.5589) [2022-01-22 16:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][640/1251] eta 0:22:31 lr 0.000461 time 2.1399 (2.2126) loss 3.4227 (3.5232) grad_norm 1.4036 (1.5578) [2022-01-22 16:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][650/1251] eta 0:22:09 lr 0.000461 time 1.8781 (2.2118) loss 4.0195 (3.5277) grad_norm 1.4368 (1.5570) [2022-01-22 16:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][660/1251] eta 0:21:47 lr 0.000461 time 2.8481 (2.2122) loss 2.4992 (3.5277) grad_norm 1.5431 (1.5570) [2022-01-22 16:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][670/1251] eta 0:21:26 lr 0.000461 time 2.9241 (2.2136) loss 4.0358 (3.5277) grad_norm 1.4191 (1.5574) [2022-01-22 16:36:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][680/1251] eta 0:21:04 lr 0.000461 time 2.7552 (2.2149) loss 3.0613 (3.5252) grad_norm 1.5517 (1.5573) [2022-01-22 16:36:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][690/1251] eta 0:20:41 lr 0.000461 time 2.1883 (2.2126) loss 3.3604 (3.5265) grad_norm 1.5968 (1.5573) [2022-01-22 16:36:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][700/1251] eta 0:20:19 lr 0.000461 time 2.2727 (2.2138) loss 2.9951 (3.5261) grad_norm 1.6394 (1.5572) [2022-01-22 16:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][710/1251] eta 0:19:59 lr 0.000461 time 2.0295 (2.2171) loss 2.7797 (3.5245) grad_norm 1.6284 (1.5570) [2022-01-22 16:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][720/1251] eta 0:19:37 lr 
0.000461 time 1.8724 (2.2169) loss 3.6621 (3.5227) grad_norm 2.2757 (1.5590) [2022-01-22 16:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][730/1251] eta 0:19:14 lr 0.000461 time 2.5790 (2.2158) loss 2.9938 (3.5222) grad_norm 1.7040 (1.5597) [2022-01-22 16:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][740/1251] eta 0:18:50 lr 0.000461 time 1.8651 (2.2126) loss 2.9643 (3.5208) grad_norm 1.3687 (1.5591) [2022-01-22 16:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][750/1251] eta 0:18:27 lr 0.000460 time 2.5409 (2.2116) loss 3.0198 (3.5167) grad_norm 1.4846 (1.5599) [2022-01-22 16:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][760/1251] eta 0:18:04 lr 0.000460 time 2.1757 (2.2091) loss 3.2375 (3.5173) grad_norm 1.5688 (1.5602) [2022-01-22 16:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][770/1251] eta 0:17:41 lr 0.000460 time 2.0857 (2.2072) loss 3.4357 (3.5174) grad_norm 1.4577 (1.5614) [2022-01-22 16:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][780/1251] eta 0:17:19 lr 0.000460 time 2.1692 (2.2071) loss 4.2435 (3.5203) grad_norm 1.6878 (1.5629) [2022-01-22 16:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][790/1251] eta 0:16:56 lr 0.000460 time 2.2061 (2.2060) loss 3.4944 (3.5218) grad_norm 1.3867 (1.5637) [2022-01-22 16:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][800/1251] eta 0:16:35 lr 0.000460 time 2.5158 (2.2075) loss 2.3218 (3.5207) grad_norm 1.5102 (1.5644) [2022-01-22 16:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][810/1251] eta 0:16:13 lr 0.000460 time 2.5019 (2.2081) loss 3.1726 (3.5165) grad_norm 1.5400 (1.5631) [2022-01-22 16:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][820/1251] eta 0:15:52 lr 0.000460 time 2.5317 (2.2094) loss 4.2631 (3.5202) grad_norm 1.5795 (1.5640) [2022-01-22 16:41:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][830/1251] eta 0:15:29 lr 0.000460 time 1.8848 (2.2083) loss 3.7990 (3.5215) grad_norm 1.2943 (1.5638) [2022-01-22 16:41:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][840/1251] eta 0:15:06 lr 0.000460 time 1.8856 (2.2066) loss 3.5864 (3.5232) grad_norm 1.3802 (1.5636) [2022-01-22 16:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][850/1251] eta 0:14:43 lr 0.000460 time 1.8645 (2.2040) loss 4.2228 (3.5226) grad_norm 1.4661 (1.5637) [2022-01-22 16:42:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][860/1251] eta 0:14:21 lr 0.000460 time 1.8222 (2.2041) loss 3.6103 (3.5218) grad_norm 1.4296 (1.5639) [2022-01-22 16:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][870/1251] eta 0:13:59 lr 0.000460 time 1.7268 (2.2044) loss 3.9670 (3.5241) grad_norm 1.5238 (1.5626) [2022-01-22 16:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][880/1251] eta 0:13:38 lr 0.000460 time 2.1481 (2.2063) loss 3.7695 (3.5236) grad_norm 1.3442 (1.5624) [2022-01-22 16:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][890/1251] eta 0:13:16 lr 0.000460 time 2.4678 (2.2066) loss 2.5608 (3.5231) grad_norm 1.5937 (1.5634) [2022-01-22 16:44:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][900/1251] eta 0:12:54 lr 0.000460 time 2.1489 (2.2060) loss 4.2043 (3.5255) grad_norm 1.5616 (1.5634) [2022-01-22 16:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][910/1251] eta 0:12:32 lr 0.000460 time 2.2202 (2.2054) loss 3.2515 (3.5234) grad_norm 1.5418 (1.5640) [2022-01-22 16:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][920/1251] eta 0:12:09 lr 0.000460 time 1.8580 (2.2050) loss 2.3856 (3.5223) grad_norm 1.5368 (1.5640) [2022-01-22 16:45:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][930/1251] eta 0:11:47 lr 
0.000460 time 2.4723 (2.2051) loss 3.7986 (3.5219) grad_norm 1.4227 (1.5641) [2022-01-22 16:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][940/1251] eta 0:11:25 lr 0.000460 time 1.6452 (2.2026) loss 3.6589 (3.5234) grad_norm 1.6110 (1.5649) [2022-01-22 16:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][950/1251] eta 0:11:02 lr 0.000460 time 1.7915 (2.2020) loss 3.6978 (3.5228) grad_norm 1.6669 (1.5645) [2022-01-22 16:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][960/1251] eta 0:10:40 lr 0.000460 time 2.4115 (2.2016) loss 3.5439 (3.5239) grad_norm 1.4987 (1.5644) [2022-01-22 16:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][970/1251] eta 0:10:19 lr 0.000460 time 2.2311 (2.2034) loss 2.5178 (3.5235) grad_norm 1.6465 (1.5644) [2022-01-22 16:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][980/1251] eta 0:09:56 lr 0.000460 time 1.8221 (2.2028) loss 4.1857 (3.5227) grad_norm 1.6703 (1.5645) [2022-01-22 16:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][990/1251] eta 0:09:34 lr 0.000459 time 1.9676 (2.2024) loss 3.3232 (3.5249) grad_norm 1.5100 (1.5642) [2022-01-22 16:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1000/1251] eta 0:09:12 lr 0.000459 time 1.8346 (2.2020) loss 3.7506 (3.5283) grad_norm 1.5141 (1.5637) [2022-01-22 16:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1010/1251] eta 0:08:50 lr 0.000459 time 2.5749 (2.2022) loss 3.7390 (3.5299) grad_norm 1.4088 (1.5635) [2022-01-22 16:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1020/1251] eta 0:08:28 lr 0.000459 time 1.7092 (2.2015) loss 2.9369 (3.5278) grad_norm 1.4129 (1.5627) [2022-01-22 16:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1030/1251] eta 0:08:06 lr 0.000459 time 2.3199 (2.2036) loss 4.1692 (3.5303) grad_norm 1.8897 (1.5628) [2022-01-22 
16:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1040/1251] eta 0:07:45 lr 0.000459 time 2.6056 (2.2048) loss 2.7703 (3.5308) grad_norm 1.4654 (1.5634) [2022-01-22 16:49:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1050/1251] eta 0:07:23 lr 0.000459 time 3.2647 (2.2057) loss 2.8142 (3.5278) grad_norm 1.4585 (1.5621) [2022-01-22 16:50:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1060/1251] eta 0:07:01 lr 0.000459 time 1.8203 (2.2054) loss 3.2272 (3.5289) grad_norm 1.5850 (1.5615) [2022-01-22 16:50:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1070/1251] eta 0:06:39 lr 0.000459 time 1.6362 (2.2049) loss 2.6233 (3.5292) grad_norm 1.4380 (1.5607) [2022-01-22 16:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1080/1251] eta 0:06:16 lr 0.000459 time 2.3135 (2.2026) loss 2.9065 (3.5293) grad_norm 1.5162 (1.5605) [2022-01-22 16:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1090/1251] eta 0:05:54 lr 0.000459 time 2.8155 (2.2024) loss 3.0581 (3.5298) grad_norm 1.6992 (1.5605) [2022-01-22 16:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1100/1251] eta 0:05:32 lr 0.000459 time 2.1587 (2.2018) loss 4.2430 (3.5322) grad_norm 1.6758 (1.5600) [2022-01-22 16:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1110/1251] eta 0:05:10 lr 0.000459 time 2.1633 (2.2016) loss 4.2789 (3.5342) grad_norm 1.5157 (1.5600) [2022-01-22 16:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1120/1251] eta 0:04:48 lr 0.000459 time 3.1821 (2.2013) loss 3.6829 (3.5339) grad_norm 1.4753 (1.5597) [2022-01-22 16:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1130/1251] eta 0:04:26 lr 0.000459 time 1.8142 (2.2003) loss 3.7766 (3.5338) grad_norm 1.5879 (1.5597) [2022-01-22 16:52:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1140/1251] 
eta 0:04:04 lr 0.000459 time 2.1922 (2.2000) loss 3.1994 (3.5297) grad_norm 1.7014 (1.5598) [2022-01-22 16:53:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1150/1251] eta 0:03:42 lr 0.000459 time 2.4417 (2.2004) loss 2.7966 (3.5300) grad_norm 1.5780 (1.5589) [2022-01-22 16:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1160/1251] eta 0:03:20 lr 0.000459 time 2.8804 (2.2013) loss 4.1626 (3.5306) grad_norm 1.5916 (1.5591) [2022-01-22 16:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1170/1251] eta 0:02:58 lr 0.000459 time 1.5432 (2.2009) loss 3.1499 (3.5306) grad_norm 1.5589 (1.5588) [2022-01-22 16:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1180/1251] eta 0:02:36 lr 0.000459 time 2.5395 (2.2004) loss 3.4161 (3.5308) grad_norm 1.3256 (1.5587) [2022-01-22 16:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1190/1251] eta 0:02:14 lr 0.000459 time 2.1120 (2.1997) loss 3.8980 (3.5296) grad_norm 1.4771 (1.5593) [2022-01-22 16:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1200/1251] eta 0:01:52 lr 0.000459 time 2.2428 (2.2006) loss 3.9742 (3.5310) grad_norm 1.3450 (1.5586) [2022-01-22 16:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1210/1251] eta 0:01:30 lr 0.000459 time 2.1170 (2.2005) loss 3.7010 (3.5300) grad_norm 1.6110 (1.5579) [2022-01-22 16:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1220/1251] eta 0:01:08 lr 0.000459 time 2.2004 (2.1994) loss 3.5322 (3.5285) grad_norm 1.5044 (1.5573) [2022-01-22 16:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1230/1251] eta 0:00:46 lr 0.000459 time 3.5853 (2.2001) loss 3.7171 (3.5273) grad_norm 1.3122 (1.5568) [2022-01-22 16:56:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1240/1251] eta 0:00:24 lr 0.000458 time 1.6602 (2.1989) loss 3.5230 (3.5260) grad_norm 1.5385 
(1.5571) [2022-01-22 16:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [158/300][1250/1251] eta 0:00:02 lr 0.000458 time 1.1322 (2.1937) loss 3.7553 (3.5264) grad_norm 1.5456 (1.5571) [2022-01-22 16:56:47 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 158 training takes 0:45:44 [2022-01-22 16:57:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.243 (18.243) Loss 1.0705 (1.0705) Acc@1 74.316 (74.316) Acc@5 92.383 (92.383) [2022-01-22 16:57:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.857 (3.528) Loss 0.9509 (1.0188) Acc@1 77.734 (75.914) Acc@5 93.066 (93.288) [2022-01-22 16:57:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.893 (2.659) Loss 1.0667 (1.0194) Acc@1 74.902 (75.967) Acc@5 92.773 (93.252) [2022-01-22 16:57:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.603 (2.332) Loss 0.9839 (1.0122) Acc@1 77.441 (76.247) Acc@5 93.750 (93.366) [2022-01-22 16:58:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.807 (2.199) Loss 1.0072 (1.0101) Acc@1 78.027 (76.331) Acc@5 93.652 (93.431) [2022-01-22 16:58:25 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.444 Acc@5 93.474 [2022-01-22 16:58:25 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.4% [2022-01-22 16:58:25 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.52% [2022-01-22 16:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][0/1251] eta 7:24:07 lr 0.000458 time 21.3009 (21.3009) loss 3.7544 (3.7544) grad_norm 1.5750 (1.5750) [2022-01-22 16:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][10/1251] eta 1:21:00 lr 0.000458 time 2.4466 (3.9169) loss 3.5647 (3.5384) grad_norm 1.3565 (1.5221) [2022-01-22 16:59:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][20/1251] eta 1:04:28 lr 0.000458 time 1.9926 (3.1426) loss 
3.4535 (3.6154) grad_norm 1.5274 (1.5624) [2022-01-22 16:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][30/1251] eta 0:57:06 lr 0.000458 time 1.2554 (2.8064) loss 2.3206 (3.4999) grad_norm 1.7363 (1.5727) [2022-01-22 17:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][40/1251] eta 0:53:45 lr 0.000458 time 3.2566 (2.6631) loss 3.8171 (3.4932) grad_norm 1.6985 (1.5801) [2022-01-22 17:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][50/1251] eta 0:51:16 lr 0.000458 time 2.5769 (2.5613) loss 3.4501 (3.5372) grad_norm 1.4003 (1.5600) [2022-01-22 17:00:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][60/1251] eta 0:49:33 lr 0.000458 time 2.2421 (2.4970) loss 3.5881 (3.5345) grad_norm 1.6608 (1.5543) [2022-01-22 17:01:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][70/1251] eta 0:48:30 lr 0.000458 time 1.8300 (2.4648) loss 3.7960 (3.5198) grad_norm 1.5493 (1.5515) [2022-01-22 17:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][80/1251] eta 0:47:57 lr 0.000458 time 3.3017 (2.4570) loss 2.7022 (3.4998) grad_norm 1.4190 (1.5453) [2022-01-22 17:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][90/1251] eta 0:46:59 lr 0.000458 time 1.8177 (2.4282) loss 4.2385 (3.5028) grad_norm 1.6072 (1.5572) [2022-01-22 17:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][100/1251] eta 0:46:18 lr 0.000458 time 3.2230 (2.4139) loss 4.2720 (3.5021) grad_norm 1.4121 (1.5603) [2022-01-22 17:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][110/1251] eta 0:45:19 lr 0.000458 time 1.5966 (2.3833) loss 3.8138 (3.5047) grad_norm 2.6092 (1.5698) [2022-01-22 17:03:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][120/1251] eta 0:44:57 lr 0.000458 time 3.4498 (2.3853) loss 3.5821 (3.4906) grad_norm 1.5178 (1.5703) [2022-01-22 17:03:35 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [159/300][130/1251] eta 0:44:15 lr 0.000458 time 1.7323 (2.3692) loss 4.0937 (3.4816) grad_norm 1.8673 (1.5748) [2022-01-22 17:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][140/1251] eta 0:43:32 lr 0.000458 time 2.2850 (2.3518) loss 2.8392 (3.4787) grad_norm 1.5244 (1.5758) [2022-01-22 17:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][150/1251] eta 0:42:50 lr 0.000458 time 1.6336 (2.3347) loss 3.6310 (3.4740) grad_norm 1.6630 (1.5721) [2022-01-22 17:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][160/1251] eta 0:42:07 lr 0.000458 time 2.1599 (2.3164) loss 4.2234 (3.4907) grad_norm 1.4456 (1.5751) [2022-01-22 17:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][170/1251] eta 0:41:37 lr 0.000458 time 1.8922 (2.3106) loss 3.6016 (3.4810) grad_norm 1.5298 (1.5722) [2022-01-22 17:05:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][180/1251] eta 0:41:18 lr 0.000458 time 1.8597 (2.3142) loss 3.4258 (3.4726) grad_norm 1.9941 (1.5673) [2022-01-22 17:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][190/1251] eta 0:40:56 lr 0.000458 time 1.9882 (2.3155) loss 4.0071 (3.4783) grad_norm 1.4475 (1.5658) [2022-01-22 17:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][200/1251] eta 0:40:23 lr 0.000458 time 1.9445 (2.3057) loss 3.5065 (3.4854) grad_norm 1.6437 (1.5656) [2022-01-22 17:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][210/1251] eta 0:39:45 lr 0.000458 time 1.7236 (2.2915) loss 2.3709 (3.4753) grad_norm 1.5348 (1.5657) [2022-01-22 17:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][220/1251] eta 0:39:05 lr 0.000458 time 2.1172 (2.2753) loss 3.9047 (3.4842) grad_norm 1.4240 (1.5649) [2022-01-22 17:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][230/1251] eta 0:38:38 lr 0.000457 time 1.8414 (2.2707) loss 3.8981 
(3.4941) grad_norm 1.9700 (1.5656) [2022-01-22 17:07:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][240/1251] eta 0:38:11 lr 0.000457 time 2.4245 (2.2664) loss 2.8013 (3.4977) grad_norm 1.4754 (1.5627) [2022-01-22 17:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][250/1251] eta 0:37:47 lr 0.000457 time 2.3175 (2.2656) loss 3.3576 (3.4946) grad_norm 1.6653 (1.5613) [2022-01-22 17:08:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][260/1251] eta 0:37:25 lr 0.000457 time 2.6259 (2.2663) loss 3.8615 (3.4787) grad_norm 1.2601 (1.5591) [2022-01-22 17:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][270/1251] eta 0:37:02 lr 0.000457 time 1.5257 (2.2657) loss 3.8899 (3.4844) grad_norm 1.3595 (1.5579) [2022-01-22 17:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][280/1251] eta 0:36:40 lr 0.000457 time 1.5536 (2.2658) loss 3.5211 (3.4860) grad_norm 1.4048 (1.5587) [2022-01-22 17:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][290/1251] eta 0:36:16 lr 0.000457 time 2.0576 (2.2645) loss 3.7130 (3.4970) grad_norm 1.6440 (1.5582) [2022-01-22 17:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][300/1251] eta 0:35:49 lr 0.000457 time 2.7325 (2.2606) loss 4.0441 (3.5079) grad_norm 1.5551 (1.5577) [2022-01-22 17:10:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][310/1251] eta 0:35:17 lr 0.000457 time 1.8388 (2.2508) loss 3.7363 (3.5100) grad_norm 1.4764 (1.5578) [2022-01-22 17:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][320/1251] eta 0:34:53 lr 0.000457 time 1.8899 (2.2483) loss 3.4640 (3.5130) grad_norm 1.3947 (1.5568) [2022-01-22 17:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][330/1251] eta 0:34:29 lr 0.000457 time 2.1931 (2.2469) loss 4.0684 (3.5204) grad_norm 1.3228 (1.5555) [2022-01-22 17:11:13 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [159/300][340/1251] eta 0:34:12 lr 0.000457 time 3.1047 (2.2526) loss 3.6584 (3.5129) grad_norm 1.4294 (1.5566) [2022-01-22 17:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][350/1251] eta 0:33:52 lr 0.000457 time 1.9989 (2.2563) loss 3.9990 (3.5147) grad_norm 1.5061 (1.5551) [2022-01-22 17:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][360/1251] eta 0:33:29 lr 0.000457 time 1.8332 (2.2557) loss 2.2291 (3.5103) grad_norm 1.3596 (1.5585) [2022-01-22 17:12:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][370/1251] eta 0:32:58 lr 0.000457 time 1.6144 (2.2462) loss 3.6103 (3.4971) grad_norm 1.8632 (1.5621) [2022-01-22 17:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][380/1251] eta 0:32:29 lr 0.000457 time 2.2054 (2.2385) loss 2.7790 (3.4949) grad_norm 1.5351 (1.5632) [2022-01-22 17:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][390/1251] eta 0:32:02 lr 0.000457 time 1.8621 (2.2325) loss 4.1397 (3.4972) grad_norm 1.7128 (1.5633) [2022-01-22 17:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][400/1251] eta 0:31:36 lr 0.000457 time 1.8495 (2.2288) loss 2.9626 (3.4987) grad_norm 1.4882 (1.5646) [2022-01-22 17:13:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][410/1251] eta 0:31:10 lr 0.000457 time 2.5900 (2.2241) loss 3.0571 (3.4963) grad_norm 1.5550 (1.5640) [2022-01-22 17:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][420/1251] eta 0:30:47 lr 0.000457 time 2.4456 (2.2235) loss 3.6851 (3.5005) grad_norm 1.3795 (1.5621) [2022-01-22 17:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][430/1251] eta 0:30:27 lr 0.000457 time 2.4965 (2.2265) loss 3.1839 (3.4934) grad_norm 1.5691 (1.5617) [2022-01-22 17:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][440/1251] eta 0:30:06 lr 0.000457 time 2.5905 (2.2279) loss 2.2242 
(3.4912) grad_norm 1.4955 (1.5626) [2022-01-22 17:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][450/1251] eta 0:29:45 lr 0.000457 time 2.7391 (2.2289) loss 3.5904 (3.4925) grad_norm 1.6918 (1.5625) [2022-01-22 17:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][460/1251] eta 0:29:23 lr 0.000457 time 2.2630 (2.2293) loss 3.2039 (3.4943) grad_norm 1.7401 (1.5656) [2022-01-22 17:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][470/1251] eta 0:29:00 lr 0.000456 time 1.7948 (2.2282) loss 3.8779 (3.4923) grad_norm 1.4277 (1.5648) [2022-01-22 17:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][480/1251] eta 0:28:37 lr 0.000456 time 2.3187 (2.2280) loss 4.3703 (3.4890) grad_norm 1.6986 (1.5634) [2022-01-22 17:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][490/1251] eta 0:28:14 lr 0.000456 time 1.8774 (2.2269) loss 3.4669 (3.4908) grad_norm 1.3838 (1.5650) [2022-01-22 17:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][500/1251] eta 0:27:49 lr 0.000456 time 1.8403 (2.2237) loss 2.5265 (3.4832) grad_norm 1.4473 (1.5641) [2022-01-22 17:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][510/1251] eta 0:27:25 lr 0.000456 time 1.9266 (2.2205) loss 2.9918 (3.4785) grad_norm 1.4287 (1.5655) [2022-01-22 17:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][520/1251] eta 0:27:02 lr 0.000456 time 2.1300 (2.2192) loss 3.1214 (3.4722) grad_norm 1.5343 (1.5669) [2022-01-22 17:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][530/1251] eta 0:26:39 lr 0.000456 time 1.8915 (2.2185) loss 3.8174 (3.4753) grad_norm 1.3838 (1.5677) [2022-01-22 17:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][540/1251] eta 0:26:17 lr 0.000456 time 2.1087 (2.2184) loss 3.7409 (3.4798) grad_norm 1.3908 (1.5677) [2022-01-22 17:18:47 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [159/300][550/1251] eta 0:25:55 lr 0.000456 time 2.1914 (2.2195) loss 3.2567 (3.4803) grad_norm 1.2454 (1.5657) [2022-01-22 17:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][560/1251] eta 0:25:35 lr 0.000456 time 3.4502 (2.2225) loss 3.2325 (3.4762) grad_norm 1.4101 (1.5658) [2022-01-22 17:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][570/1251] eta 0:25:13 lr 0.000456 time 1.8322 (2.2219) loss 2.5909 (3.4773) grad_norm 1.7388 (1.5675) [2022-01-22 17:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][580/1251] eta 0:24:48 lr 0.000456 time 1.8653 (2.2186) loss 3.7177 (3.4759) grad_norm 1.3831 (1.5669) [2022-01-22 17:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][590/1251] eta 0:24:24 lr 0.000456 time 1.7195 (2.2156) loss 3.6859 (3.4776) grad_norm 1.7023 (1.5676) [2022-01-22 17:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][600/1251] eta 0:24:02 lr 0.000456 time 3.0539 (2.2153) loss 3.0993 (3.4774) grad_norm 1.7714 (1.5693) [2022-01-22 17:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][610/1251] eta 0:23:39 lr 0.000456 time 2.4083 (2.2151) loss 3.1823 (3.4762) grad_norm 1.4941 (1.5693) [2022-01-22 17:21:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][620/1251] eta 0:23:18 lr 0.000456 time 1.8678 (2.2169) loss 3.0193 (3.4736) grad_norm 1.6111 (1.5674) [2022-01-22 17:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][630/1251] eta 0:22:57 lr 0.000456 time 1.6781 (2.2182) loss 3.5870 (3.4732) grad_norm 1.4253 (1.5673) [2022-01-22 17:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][640/1251] eta 0:22:34 lr 0.000456 time 2.1569 (2.2177) loss 3.9381 (3.4752) grad_norm 1.5940 (1.5663) [2022-01-22 17:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][650/1251] eta 0:22:11 lr 0.000456 time 2.7645 (2.2157) loss 3.2228 
(3.4792) grad_norm 1.3749 (1.5646) [2022-01-22 17:22:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][660/1251] eta 0:21:47 lr 0.000456 time 2.2405 (2.2131) loss 3.3101 (3.4739) grad_norm 1.6297 (1.5647) [2022-01-22 17:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][670/1251] eta 0:21:24 lr 0.000456 time 1.8678 (2.2109) loss 4.0170 (3.4779) grad_norm 1.4689 (1.5641) [2022-01-22 17:23:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][680/1251] eta 0:21:02 lr 0.000456 time 1.5986 (2.2116) loss 4.0702 (3.4798) grad_norm 1.4561 (1.5640) [2022-01-22 17:23:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][690/1251] eta 0:20:40 lr 0.000456 time 2.3822 (2.2118) loss 3.0603 (3.4798) grad_norm 1.6082 (1.5645) [2022-01-22 17:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][700/1251] eta 0:20:17 lr 0.000456 time 2.1525 (2.2101) loss 3.3674 (3.4833) grad_norm 1.5146 (1.5635) [2022-01-22 17:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][710/1251] eta 0:19:56 lr 0.000455 time 2.7687 (2.2121) loss 4.4174 (3.4863) grad_norm 1.4011 (1.5630) [2022-01-22 17:24:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][720/1251] eta 0:19:34 lr 0.000455 time 1.8926 (2.2120) loss 3.9250 (3.4838) grad_norm 1.4102 (1.5615) [2022-01-22 17:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][730/1251] eta 0:19:13 lr 0.000455 time 2.6650 (2.2131) loss 3.4700 (3.4836) grad_norm 1.7851 (1.5617) [2022-01-22 17:25:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][740/1251] eta 0:18:49 lr 0.000455 time 1.9064 (2.2105) loss 4.1031 (3.4867) grad_norm 1.6131 (1.5631) [2022-01-22 17:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][750/1251] eta 0:18:26 lr 0.000455 time 1.9633 (2.2084) loss 3.6819 (3.4898) grad_norm 1.4499 (1.5633) [2022-01-22 17:26:23 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [159/300][760/1251] eta 0:18:03 lr 0.000455 time 1.6391 (2.2060) loss 4.0607 (3.4902) grad_norm 1.5283 (1.5631) [2022-01-22 17:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][770/1251] eta 0:17:41 lr 0.000455 time 2.8124 (2.2075) loss 2.2754 (3.4877) grad_norm 1.6057 (1.5632) [2022-01-22 17:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][780/1251] eta 0:17:19 lr 0.000455 time 1.8286 (2.2077) loss 3.5368 (3.4907) grad_norm 1.7842 (1.5639) [2022-01-22 17:27:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][790/1251] eta 0:16:58 lr 0.000455 time 2.1383 (2.2095) loss 3.8597 (3.4897) grad_norm 1.7645 (1.5651) [2022-01-22 17:27:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][800/1251] eta 0:16:37 lr 0.000455 time 2.3768 (2.2109) loss 3.6649 (3.4906) grad_norm 1.4691 (1.5654) [2022-01-22 17:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][810/1251] eta 0:16:15 lr 0.000455 time 2.1979 (2.2121) loss 3.2370 (3.4906) grad_norm 1.4159 (1.5648) [2022-01-22 17:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][820/1251] eta 0:15:53 lr 0.000455 time 2.1886 (2.2113) loss 3.4183 (3.4877) grad_norm 1.4058 (1.5642) [2022-01-22 17:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][830/1251] eta 0:15:30 lr 0.000455 time 1.7896 (2.2107) loss 3.9413 (3.4879) grad_norm 1.6232 (1.5640) [2022-01-22 17:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][840/1251] eta 0:15:07 lr 0.000455 time 2.1590 (2.2088) loss 3.5402 (3.4895) grad_norm 1.4014 (1.5631) [2022-01-22 17:29:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][850/1251] eta 0:14:45 lr 0.000455 time 1.8611 (2.2075) loss 3.2850 (3.4871) grad_norm 1.4202 (1.5626) [2022-01-22 17:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][860/1251] eta 0:14:23 lr 0.000455 time 2.7565 (2.2073) loss 3.4493 
(3.4881) grad_norm 1.4998 (1.5627) [2022-01-22 17:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][870/1251] eta 0:14:00 lr 0.000455 time 1.7708 (2.2068) loss 3.5911 (3.4879) grad_norm 1.3171 (1.5612) [2022-01-22 17:30:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][880/1251] eta 0:13:38 lr 0.000455 time 1.7978 (2.2059) loss 2.3781 (3.4878) grad_norm 1.4175 (1.5617) [2022-01-22 17:31:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][890/1251] eta 0:13:16 lr 0.000455 time 1.8928 (2.2056) loss 3.5142 (3.4915) grad_norm 1.7633 (1.5624) [2022-01-22 17:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][900/1251] eta 0:12:54 lr 0.000455 time 2.2265 (2.2053) loss 4.1732 (3.4936) grad_norm 1.4486 (1.5625) [2022-01-22 17:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][910/1251] eta 0:12:32 lr 0.000455 time 2.8053 (2.2063) loss 3.4865 (3.4936) grad_norm 1.5237 (1.5619) [2022-01-22 17:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][920/1251] eta 0:12:10 lr 0.000455 time 1.9306 (2.2058) loss 3.8407 (3.4921) grad_norm 1.5943 (1.5609) [2022-01-22 17:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][930/1251] eta 0:11:47 lr 0.000455 time 1.5674 (2.2039) loss 3.8195 (3.4934) grad_norm 1.6479 (1.5607) [2022-01-22 17:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][940/1251] eta 0:11:26 lr 0.000455 time 3.4570 (2.2063) loss 3.5863 (3.4921) grad_norm 1.6108 (1.5609) [2022-01-22 17:33:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][950/1251] eta 0:11:04 lr 0.000454 time 2.8582 (2.2085) loss 4.0650 (3.4897) grad_norm 1.6723 (1.5610) [2022-01-22 17:33:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][960/1251] eta 0:10:42 lr 0.000454 time 2.2994 (2.2082) loss 3.9918 (3.4904) grad_norm 1.4326 (1.5609) [2022-01-22 17:34:06 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [159/300][970/1251] eta 0:10:19 lr 0.000454 time 1.7578 (2.2051) loss 3.2409 (3.4914) grad_norm 1.3398 (1.5613) [2022-01-22 17:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][980/1251] eta 0:09:57 lr 0.000454 time 1.9613 (2.2034) loss 4.1995 (3.4925) grad_norm 1.6807 (1.5622) [2022-01-22 17:34:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][990/1251] eta 0:09:35 lr 0.000454 time 2.2050 (2.2042) loss 3.5398 (3.4930) grad_norm 1.4335 (1.5619) [2022-01-22 17:35:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1000/1251] eta 0:09:13 lr 0.000454 time 3.1099 (2.2057) loss 4.1334 (3.4941) grad_norm 1.7365 (1.5618) [2022-01-22 17:35:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1010/1251] eta 0:08:51 lr 0.000454 time 1.8328 (2.2042) loss 3.6006 (3.4927) grad_norm 1.4536 (1.5620) [2022-01-22 17:35:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1020/1251] eta 0:08:29 lr 0.000454 time 1.5929 (2.2038) loss 3.4275 (3.4922) grad_norm 1.6119 (1.5629) [2022-01-22 17:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1030/1251] eta 0:08:07 lr 0.000454 time 1.8748 (2.2038) loss 3.5400 (3.4921) grad_norm 1.7041 (1.5638) [2022-01-22 17:36:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1040/1251] eta 0:07:45 lr 0.000454 time 2.7184 (2.2050) loss 3.8262 (3.4922) grad_norm 1.7140 (1.5637) [2022-01-22 17:37:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1050/1251] eta 0:07:22 lr 0.000454 time 1.7869 (2.2033) loss 2.6293 (3.4906) grad_norm 1.9628 (1.5632) [2022-01-22 17:37:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1060/1251] eta 0:07:00 lr 0.000454 time 2.2374 (2.2021) loss 3.0887 (3.4909) grad_norm 1.5101 (1.5629) [2022-01-22 17:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1070/1251] eta 0:06:38 lr 0.000454 time 1.8780 (2.2011) loss 
3.8348 (3.4921) grad_norm 1.5487 (1.5618) [2022-01-22 17:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1080/1251] eta 0:06:16 lr 0.000454 time 2.5641 (2.2007) loss 4.2216 (3.4949) grad_norm 1.4516 (1.5614) [2022-01-22 17:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1090/1251] eta 0:05:54 lr 0.000454 time 2.2503 (2.1996) loss 4.2164 (3.4965) grad_norm 1.6607 (1.5609) [2022-01-22 17:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1100/1251] eta 0:05:31 lr 0.000454 time 3.0231 (2.1985) loss 3.8228 (3.4965) grad_norm 1.5478 (1.5607) [2022-01-22 17:39:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1110/1251] eta 0:05:09 lr 0.000454 time 1.6089 (2.1974) loss 3.8888 (3.4957) grad_norm 1.4971 (1.5608) [2022-01-22 17:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1120/1251] eta 0:04:47 lr 0.000454 time 1.8842 (2.1962) loss 3.3757 (3.4955) grad_norm 1.5812 (1.5609) [2022-01-22 17:39:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1130/1251] eta 0:04:25 lr 0.000454 time 2.1066 (2.1968) loss 4.4838 (3.4976) grad_norm 1.6337 (1.5617) [2022-01-22 17:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1140/1251] eta 0:04:03 lr 0.000454 time 2.5494 (2.1979) loss 3.3744 (3.4972) grad_norm 1.4427 (1.5614) [2022-01-22 17:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1150/1251] eta 0:03:41 lr 0.000454 time 2.2354 (2.1971) loss 2.3348 (3.4966) grad_norm 1.4612 (1.5612) [2022-01-22 17:40:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1160/1251] eta 0:03:19 lr 0.000454 time 2.1205 (2.1973) loss 2.5777 (3.4971) grad_norm 1.6342 (1.5614) [2022-01-22 17:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1170/1251] eta 0:02:57 lr 0.000454 time 1.7306 (2.1973) loss 3.7261 (3.4964) grad_norm 1.4869 (1.5626) [2022-01-22 17:41:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1180/1251] eta 0:02:36 lr 0.000454 time 2.8064 (2.1988) loss 2.2628 (3.4964) grad_norm 1.4520 (1.5622) [2022-01-22 17:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1190/1251] eta 0:02:14 lr 0.000454 time 2.3394 (2.2006) loss 2.5934 (3.4959) grad_norm 1.5337 (1.5627) [2022-01-22 17:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1200/1251] eta 0:01:52 lr 0.000453 time 1.8646 (2.1998) loss 3.0355 (3.4948) grad_norm 1.4676 (1.5622) [2022-01-22 17:42:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1210/1251] eta 0:01:30 lr 0.000453 time 2.1525 (2.1993) loss 2.7842 (3.4936) grad_norm 1.7390 (1.5624) [2022-01-22 17:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1220/1251] eta 0:01:08 lr 0.000453 time 1.6430 (2.1984) loss 2.4894 (3.4950) grad_norm 1.4956 (1.5624) [2022-01-22 17:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1230/1251] eta 0:00:46 lr 0.000453 time 1.8047 (2.1987) loss 2.5427 (3.4945) grad_norm 2.3922 (1.5638) [2022-01-22 17:43:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1240/1251] eta 0:00:24 lr 0.000453 time 1.4024 (2.1973) loss 3.7817 (3.4945) grad_norm 1.3251 (1.5637) [2022-01-22 17:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [159/300][1250/1251] eta 0:00:02 lr 0.000453 time 1.1923 (2.1919) loss 2.6675 (3.4936) grad_norm 1.4074 (1.5634) [2022-01-22 17:44:07 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 159 training takes 0:45:42 [2022-01-22 17:44:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.473 (18.473) Loss 0.9845 (0.9845) Acc@1 78.027 (78.027) Acc@5 93.262 (93.262) [2022-01-22 17:44:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.575 (3.457) Loss 1.0708 (1.0087) Acc@1 75.586 (76.527) Acc@5 92.969 (93.457) [2022-01-22 17:45:01 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.576 (2.559) Loss 1.0186 (0.9949) Acc@1 75.879 (76.814) Acc@5 93.262 (93.676) [2022-01-22 17:45:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.963 (2.360) Loss 0.8966 (0.9994) Acc@1 78.906 (76.663) Acc@5 94.922 (93.693) [2022-01-22 17:45:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.748 (2.237) Loss 0.9770 (1.0016) Acc@1 77.930 (76.651) Acc@5 94.043 (93.676) [2022-01-22 17:45:46 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.550 Acc@5 93.530 [2022-01-22 17:45:46 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.5% [2022-01-22 17:45:46 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.55% [2022-01-22 17:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][0/1251] eta 8:29:52 lr 0.000453 time 24.4544 (24.4544) loss 4.2291 (4.2291) grad_norm 1.8724 (1.8724) [2022-01-22 17:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][10/1251] eta 1:37:14 lr 0.000453 time 2.8426 (4.7011) loss 3.8267 (3.3976) grad_norm 1.4842 (1.5757) [2022-01-22 17:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][20/1251] eta 1:12:51 lr 0.000453 time 1.4159 (3.5512) loss 4.0757 (3.6373) grad_norm 1.4305 (1.5816) [2022-01-22 17:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][30/1251] eta 1:03:21 lr 0.000453 time 2.0385 (3.1131) loss 3.2739 (3.6212) grad_norm 1.6829 (1.5712) [2022-01-22 17:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][40/1251] eta 0:58:36 lr 0.000453 time 3.8085 (2.9041) loss 4.1835 (3.6249) grad_norm 1.4326 (1.5597) [2022-01-22 17:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][50/1251] eta 0:55:13 lr 0.000453 time 1.5535 (2.7586) loss 3.2712 (3.5890) grad_norm 1.5288 (1.5577) [2022-01-22 17:48:28 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [160/300][60/1251] eta 0:52:39 lr 0.000453 time 2.0289 (2.6527) loss 4.1920 (3.6281) grad_norm 1.5323 (1.5503) [2022-01-22 17:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][70/1251] eta 0:50:45 lr 0.000453 time 1.9403 (2.5790) loss 2.3824 (3.5583) grad_norm 1.4373 (1.5480) [2022-01-22 17:49:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][80/1251] eta 0:49:45 lr 0.000453 time 3.6096 (2.5494) loss 3.6679 (3.5296) grad_norm 1.5928 (1.5413) [2022-01-22 17:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][90/1251] eta 0:48:13 lr 0.000453 time 1.9093 (2.4920) loss 2.7579 (3.5031) grad_norm 1.5675 (1.5477) [2022-01-22 17:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][100/1251] eta 0:46:52 lr 0.000453 time 1.6763 (2.4435) loss 3.9803 (3.5206) grad_norm 1.6110 (1.5506) [2022-01-22 17:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][110/1251] eta 0:45:50 lr 0.000453 time 2.2363 (2.4107) loss 3.7022 (3.5090) grad_norm 1.4634 (1.5553) [2022-01-22 17:50:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][120/1251] eta 0:45:03 lr 0.000453 time 2.5400 (2.3906) loss 3.5313 (3.4936) grad_norm 2.2937 (1.5648) [2022-01-22 17:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][130/1251] eta 0:44:28 lr 0.000453 time 2.3469 (2.3801) loss 3.6164 (3.4911) grad_norm 1.7546 (1.5631) [2022-01-22 17:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][140/1251] eta 0:43:51 lr 0.000453 time 1.8913 (2.3685) loss 2.9522 (3.4762) grad_norm 1.8677 (1.5717) [2022-01-22 17:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][150/1251] eta 0:43:20 lr 0.000453 time 2.3968 (2.3622) loss 3.9748 (3.4862) grad_norm 1.8053 (1.5744) [2022-01-22 17:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][160/1251] eta 0:42:58 lr 0.000453 time 2.1401 (2.3634) loss 2.9391 (3.4835) 
grad_norm 1.6451 (1.5767) [2022-01-22 17:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][170/1251] eta 0:42:26 lr 0.000453 time 1.8151 (2.3561) loss 4.2070 (3.5093) grad_norm 1.5308 (1.5816) [2022-01-22 17:52:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][180/1251] eta 0:41:49 lr 0.000453 time 1.9082 (2.3427) loss 3.8283 (3.5100) grad_norm 1.9380 (1.5784) [2022-01-22 17:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][190/1251] eta 0:41:01 lr 0.000452 time 1.6552 (2.3201) loss 4.0845 (3.5230) grad_norm 1.6741 (1.5750) [2022-01-22 17:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][200/1251] eta 0:40:20 lr 0.000452 time 1.8866 (2.3028) loss 3.9174 (3.5128) grad_norm 1.5289 (1.5780) [2022-01-22 17:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][210/1251] eta 0:39:50 lr 0.000452 time 2.0090 (2.2967) loss 4.0792 (3.5243) grad_norm 1.8316 (1.5768) [2022-01-22 17:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][220/1251] eta 0:39:20 lr 0.000452 time 2.7239 (2.2900) loss 3.9260 (3.5344) grad_norm 1.4822 (1.5783) [2022-01-22 17:54:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][230/1251] eta 0:38:55 lr 0.000452 time 2.6465 (2.2879) loss 4.0161 (3.5369) grad_norm 1.5668 (1.5817) [2022-01-22 17:54:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][240/1251] eta 0:38:25 lr 0.000452 time 1.6085 (2.2807) loss 2.4703 (3.5332) grad_norm 1.3523 (1.5830) [2022-01-22 17:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][250/1251] eta 0:37:58 lr 0.000452 time 1.6390 (2.2763) loss 3.0167 (3.5249) grad_norm 1.5706 (1.5814) [2022-01-22 17:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][260/1251] eta 0:37:39 lr 0.000452 time 2.6108 (2.2802) loss 2.2346 (3.5125) grad_norm 1.4613 (1.5782) [2022-01-22 17:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [160/300][270/1251] eta 0:37:13 lr 0.000452 time 2.3519 (2.2764) loss 3.3624 (3.4975) grad_norm 1.3302 (1.5770) [2022-01-22 17:56:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][280/1251] eta 0:36:47 lr 0.000452 time 1.8419 (2.2731) loss 3.8381 (3.5066) grad_norm 1.5963 (1.5762) [2022-01-22 17:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][290/1251] eta 0:36:24 lr 0.000452 time 1.8406 (2.2730) loss 3.8308 (3.5068) grad_norm 1.7745 (1.5803) [2022-01-22 17:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][300/1251] eta 0:36:04 lr 0.000452 time 2.8857 (2.2760) loss 2.5897 (3.5025) grad_norm 1.4008 (1.5798) [2022-01-22 17:57:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][310/1251] eta 0:35:33 lr 0.000452 time 1.5967 (2.2668) loss 2.4694 (3.4988) grad_norm 1.5571 (1.5779) [2022-01-22 17:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][320/1251] eta 0:35:05 lr 0.000452 time 1.8485 (2.2620) loss 3.8144 (3.5081) grad_norm 1.4920 (1.5751) [2022-01-22 17:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][330/1251] eta 0:34:38 lr 0.000452 time 1.8275 (2.2565) loss 3.7482 (3.5068) grad_norm 1.5117 (1.5764) [2022-01-22 17:58:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][340/1251] eta 0:34:14 lr 0.000452 time 2.7908 (2.2549) loss 2.8243 (3.5020) grad_norm 1.4659 (1.5764) [2022-01-22 17:58:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][350/1251] eta 0:33:52 lr 0.000452 time 2.2588 (2.2553) loss 3.3848 (3.5018) grad_norm 1.4905 (1.5751) [2022-01-22 17:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][360/1251] eta 0:33:25 lr 0.000452 time 1.5881 (2.2510) loss 3.9416 (3.5010) grad_norm 1.4375 (1.5734) [2022-01-22 17:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][370/1251] eta 0:33:03 lr 0.000452 time 2.5431 (2.2515) loss 3.2109 (3.4990) 
grad_norm 1.4711 (1.5742) [2022-01-22 18:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][380/1251] eta 0:32:41 lr 0.000452 time 2.8114 (2.2519) loss 2.4557 (3.4980) grad_norm 1.2827 (1.5734) [2022-01-22 18:00:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][390/1251] eta 0:32:20 lr 0.000452 time 2.4809 (2.2537) loss 2.4861 (3.4998) grad_norm 1.6852 (1.5735) [2022-01-22 18:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][400/1251] eta 0:31:57 lr 0.000452 time 2.1795 (2.2536) loss 4.1215 (3.4994) grad_norm 1.5959 (1.5734) [2022-01-22 18:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][410/1251] eta 0:31:31 lr 0.000452 time 1.6079 (2.2492) loss 3.1850 (3.5018) grad_norm 1.6315 (1.5725) [2022-01-22 18:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][420/1251] eta 0:31:07 lr 0.000452 time 2.1656 (2.2471) loss 3.7867 (3.5027) grad_norm 1.8198 (1.5752) [2022-01-22 18:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][430/1251] eta 0:30:49 lr 0.000451 time 2.5851 (2.2525) loss 4.0689 (3.5069) grad_norm 1.4823 (1.5752) [2022-01-22 18:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][440/1251] eta 0:30:27 lr 0.000451 time 2.2049 (2.2535) loss 3.3536 (3.5078) grad_norm 1.6326 (1.5764) [2022-01-22 18:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][450/1251] eta 0:29:59 lr 0.000451 time 1.6787 (2.2461) loss 2.7151 (3.4981) grad_norm 1.5446 (1.5764) [2022-01-22 18:03:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][460/1251] eta 0:29:35 lr 0.000451 time 2.0075 (2.2449) loss 3.3235 (3.4939) grad_norm 1.6310 (1.5773) [2022-01-22 18:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][470/1251] eta 0:29:16 lr 0.000451 time 1.6007 (2.2487) loss 3.7862 (3.4903) grad_norm 1.6515 (1.5783) [2022-01-22 18:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [160/300][480/1251] eta 0:28:50 lr 0.000451 time 1.8741 (2.2439) loss 2.3796 (3.4892) grad_norm 2.3139 (1.5789) [2022-01-22 18:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][490/1251] eta 0:28:24 lr 0.000451 time 1.9088 (2.2404) loss 4.1014 (3.4872) grad_norm 2.0259 (1.5805) [2022-01-22 18:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][500/1251] eta 0:28:01 lr 0.000451 time 1.9036 (2.2388) loss 2.8070 (3.4829) grad_norm 1.4316 (1.5795) [2022-01-22 18:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][510/1251] eta 0:27:42 lr 0.000451 time 1.8040 (2.2430) loss 2.8820 (3.4841) grad_norm 1.4845 (1.5781) [2022-01-22 18:05:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][520/1251] eta 0:27:21 lr 0.000451 time 2.9376 (2.2452) loss 3.8284 (3.4823) grad_norm 1.4852 (1.5784) [2022-01-22 18:05:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][530/1251] eta 0:26:57 lr 0.000451 time 1.8636 (2.2438) loss 3.3676 (3.4822) grad_norm 1.9408 (1.5796) [2022-01-22 18:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][540/1251] eta 0:26:32 lr 0.000451 time 1.7653 (2.2398) loss 2.4219 (3.4812) grad_norm 1.3905 (1.5810) [2022-01-22 18:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][550/1251] eta 0:26:09 lr 0.000451 time 2.0444 (2.2395) loss 3.2667 (3.4758) grad_norm 1.6586 (1.5806) [2022-01-22 18:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][560/1251] eta 0:25:45 lr 0.000451 time 1.9454 (2.2371) loss 4.1288 (3.4775) grad_norm 1.5887 (1.5810) [2022-01-22 18:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][570/1251] eta 0:25:22 lr 0.000451 time 1.9668 (2.2350) loss 2.4549 (3.4704) grad_norm 1.3162 (1.5799) [2022-01-22 18:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][580/1251] eta 0:24:58 lr 0.000451 time 2.1352 (2.2337) loss 3.5781 (3.4738) 
grad_norm 1.4827 (1.5786) [2022-01-22 18:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][590/1251] eta 0:24:35 lr 0.000451 time 1.8774 (2.2324) loss 2.9436 (3.4749) grad_norm 2.0841 (1.5793) [2022-01-22 18:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][600/1251] eta 0:24:12 lr 0.000451 time 2.2355 (2.2314) loss 4.2289 (3.4778) grad_norm 1.7560 (1.5803) [2022-01-22 18:08:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][610/1251] eta 0:23:49 lr 0.000451 time 1.8682 (2.2303) loss 3.9714 (3.4770) grad_norm 1.5520 (1.5808) [2022-01-22 18:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][620/1251] eta 0:23:28 lr 0.000451 time 2.8346 (2.2319) loss 3.3391 (3.4782) grad_norm 1.6428 (1.5800) [2022-01-22 18:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][630/1251] eta 0:23:04 lr 0.000451 time 1.8568 (2.2302) loss 4.0715 (3.4747) grad_norm 1.6993 (1.5795) [2022-01-22 18:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][640/1251] eta 0:22:41 lr 0.000451 time 2.4433 (2.2279) loss 4.1089 (3.4749) grad_norm 1.7176 (1.5804) [2022-01-22 18:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][650/1251] eta 0:22:17 lr 0.000451 time 1.7851 (2.2263) loss 2.7085 (3.4774) grad_norm 1.4646 (1.5805) [2022-01-22 18:10:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][660/1251] eta 0:21:56 lr 0.000451 time 3.0733 (2.2280) loss 3.7697 (3.4716) grad_norm 1.6813 (1.5824) [2022-01-22 18:10:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][670/1251] eta 0:21:34 lr 0.000450 time 2.0710 (2.2276) loss 3.6458 (3.4754) grad_norm 1.4077 (1.5829) [2022-01-22 18:11:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][680/1251] eta 0:21:11 lr 0.000450 time 2.9099 (2.2269) loss 3.3285 (3.4771) grad_norm 1.5014 (1.5827) [2022-01-22 18:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [160/300][690/1251] eta 0:20:49 lr 0.000450 time 3.2740 (2.2280) loss 3.1020 (3.4734) grad_norm 1.4354 (1.5814) [2022-01-22 18:11:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][700/1251] eta 0:20:26 lr 0.000450 time 2.2935 (2.2265) loss 2.3403 (3.4744) grad_norm 1.6524 (1.5819) [2022-01-22 18:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][710/1251] eta 0:20:04 lr 0.000450 time 1.7985 (2.2255) loss 2.8241 (3.4708) grad_norm 1.3587 (1.5829) [2022-01-22 18:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][720/1251] eta 0:19:42 lr 0.000450 time 3.0956 (2.2261) loss 2.4549 (3.4728) grad_norm 1.5590 (1.5830) [2022-01-22 18:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][730/1251] eta 0:19:20 lr 0.000450 time 2.2541 (2.2266) loss 3.6132 (3.4712) grad_norm 1.3373 (1.5833) [2022-01-22 18:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][740/1251] eta 0:18:57 lr 0.000450 time 1.8340 (2.2254) loss 4.1381 (3.4700) grad_norm 1.6498 (1.5838) [2022-01-22 18:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][750/1251] eta 0:18:34 lr 0.000450 time 2.1561 (2.2238) loss 3.3033 (3.4681) grad_norm 1.6175 (1.5851) [2022-01-22 18:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][760/1251] eta 0:18:10 lr 0.000450 time 2.7163 (2.2219) loss 3.9186 (3.4710) grad_norm 1.5432 (1.5853) [2022-01-22 18:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][770/1251] eta 0:17:48 lr 0.000450 time 1.8948 (2.2204) loss 3.6751 (3.4726) grad_norm 1.6654 (1.5851) [2022-01-22 18:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][780/1251] eta 0:17:25 lr 0.000450 time 1.9056 (2.2195) loss 3.1749 (3.4726) grad_norm 1.4874 (1.5852) [2022-01-22 18:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][790/1251] eta 0:17:03 lr 0.000450 time 1.7302 (2.2203) loss 2.9451 (3.4692) 
grad_norm 1.4306 (1.5858) [2022-01-22 18:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][800/1251] eta 0:16:42 lr 0.000450 time 3.1097 (2.2224) loss 3.5294 (3.4680) grad_norm 1.5034 (1.5858) [2022-01-22 18:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][810/1251] eta 0:16:19 lr 0.000450 time 1.5528 (2.2215) loss 3.0434 (3.4649) grad_norm 1.3356 (1.5846) [2022-01-22 18:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][820/1251] eta 0:15:56 lr 0.000450 time 1.9138 (2.2197) loss 3.9274 (3.4685) grad_norm 1.3953 (1.5841) [2022-01-22 18:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][830/1251] eta 0:15:34 lr 0.000450 time 1.9323 (2.2188) loss 2.9663 (3.4650) grad_norm 1.4037 (1.5835) [2022-01-22 18:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][840/1251] eta 0:15:10 lr 0.000450 time 1.6367 (2.2164) loss 4.1284 (3.4662) grad_norm 1.6971 (1.5843) [2022-01-22 18:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][850/1251] eta 0:14:48 lr 0.000450 time 1.8399 (2.2160) loss 3.4938 (3.4628) grad_norm 1.4930 (1.5836) [2022-01-22 18:17:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][860/1251] eta 0:14:26 lr 0.000450 time 1.9030 (2.2164) loss 2.8292 (3.4651) grad_norm 1.6242 (1.5836) [2022-01-22 18:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][870/1251] eta 0:14:04 lr 0.000450 time 2.7511 (2.2172) loss 3.9786 (3.4650) grad_norm 1.6116 (1.5829) [2022-01-22 18:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][880/1251] eta 0:13:42 lr 0.000450 time 1.9040 (2.2175) loss 2.4025 (3.4653) grad_norm 1.5358 (1.5818) [2022-01-22 18:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][890/1251] eta 0:13:20 lr 0.000450 time 1.7920 (2.2168) loss 3.1913 (3.4667) grad_norm 1.5224 (1.5818) [2022-01-22 18:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [160/300][900/1251] eta 0:12:57 lr 0.000450 time 2.2098 (2.2157) loss 3.9136 (3.4685) grad_norm 1.4285 (1.5816) [2022-01-22 18:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][910/1251] eta 0:12:35 lr 0.000450 time 2.4778 (2.2158) loss 2.5645 (3.4682) grad_norm 1.4135 (1.5806) [2022-01-22 18:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][920/1251] eta 0:12:13 lr 0.000449 time 1.7731 (2.2149) loss 2.8739 (3.4660) grad_norm 1.3728 (1.5797) [2022-01-22 18:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][930/1251] eta 0:11:50 lr 0.000449 time 1.8643 (2.2128) loss 4.0749 (3.4638) grad_norm 2.1221 (1.5803) [2022-01-22 18:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][940/1251] eta 0:11:27 lr 0.000449 time 1.9320 (2.2114) loss 4.2402 (3.4661) grad_norm 1.8404 (1.5809) [2022-01-22 18:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][950/1251] eta 0:11:05 lr 0.000449 time 2.7264 (2.2110) loss 4.1941 (3.4665) grad_norm 1.5118 (1.5815) [2022-01-22 18:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][960/1251] eta 0:10:43 lr 0.000449 time 2.4029 (2.2129) loss 3.9695 (3.4651) grad_norm 1.5996 (1.5807) [2022-01-22 18:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][970/1251] eta 0:10:21 lr 0.000449 time 1.5781 (2.2129) loss 3.7726 (3.4671) grad_norm 1.8241 (1.5798) [2022-01-22 18:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][980/1251] eta 0:09:59 lr 0.000449 time 2.5277 (2.2124) loss 3.8751 (3.4683) grad_norm 1.4002 (1.5795) [2022-01-22 18:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][990/1251] eta 0:09:36 lr 0.000449 time 1.7296 (2.2100) loss 3.0237 (3.4680) grad_norm 1.5864 (1.5796) [2022-01-22 18:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1000/1251] eta 0:09:15 lr 0.000449 time 2.4360 (2.2112) loss 4.1729 (3.4700) 
grad_norm 1.6484 (1.5796) [2022-01-22 18:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1010/1251] eta 0:08:52 lr 0.000449 time 1.5726 (2.2097) loss 2.8965 (3.4690) grad_norm 1.4707 (1.5795) [2022-01-22 18:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1020/1251] eta 0:08:30 lr 0.000449 time 2.3625 (2.2091) loss 2.4357 (3.4668) grad_norm 1.3270 (1.5810) [2022-01-22 18:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1030/1251] eta 0:08:07 lr 0.000449 time 1.9212 (2.2075) loss 3.5765 (3.4646) grad_norm 1.4006 (1.5802) [2022-01-22 18:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1040/1251] eta 0:07:45 lr 0.000449 time 2.9070 (2.2071) loss 3.5502 (3.4643) grad_norm 1.7314 (1.5804) [2022-01-22 18:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1050/1251] eta 0:07:23 lr 0.000449 time 2.1850 (2.2060) loss 3.7726 (3.4640) grad_norm 1.5314 (1.5801) [2022-01-22 18:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1060/1251] eta 0:07:01 lr 0.000449 time 1.5534 (2.2060) loss 4.1217 (3.4626) grad_norm 1.6714 (1.5806) [2022-01-22 18:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1070/1251] eta 0:06:39 lr 0.000449 time 2.2908 (2.2068) loss 2.4398 (3.4609) grad_norm 1.4937 (1.5799) [2022-01-22 18:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1080/1251] eta 0:06:17 lr 0.000449 time 2.1974 (2.2065) loss 4.0087 (3.4624) grad_norm 1.7640 (1.5800) [2022-01-22 18:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1090/1251] eta 0:05:55 lr 0.000449 time 2.1939 (2.2064) loss 3.9555 (3.4615) grad_norm 1.9006 (1.5812) [2022-01-22 18:26:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1100/1251] eta 0:05:33 lr 0.000449 time 2.1991 (2.2065) loss 4.1356 (3.4620) grad_norm 1.5660 (1.5810) [2022-01-22 18:26:38 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [160/300][1110/1251] eta 0:05:11 lr 0.000449 time 3.1833 (2.2073) loss 3.5279 (3.4625) grad_norm 1.5168 (1.5803) [2022-01-22 18:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1120/1251] eta 0:04:49 lr 0.000449 time 1.6166 (2.2074) loss 3.5992 (3.4632) grad_norm 1.6886 (1.5800) [2022-01-22 18:27:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1130/1251] eta 0:04:27 lr 0.000449 time 2.2563 (2.2066) loss 2.8329 (3.4654) grad_norm 1.5754 (1.5798) [2022-01-22 18:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1140/1251] eta 0:04:04 lr 0.000449 time 1.8987 (2.2064) loss 2.3670 (3.4667) grad_norm 1.4427 (1.5795) [2022-01-22 18:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1150/1251] eta 0:03:42 lr 0.000449 time 1.9698 (2.2050) loss 2.6590 (3.4658) grad_norm 2.5198 (1.5807) [2022-01-22 18:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1160/1251] eta 0:03:20 lr 0.000448 time 2.1164 (2.2039) loss 3.2388 (3.4635) grad_norm 1.4126 (1.5810) [2022-01-22 18:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1170/1251] eta 0:02:58 lr 0.000448 time 1.4547 (2.2038) loss 4.3560 (3.4643) grad_norm 1.4797 (1.5810) [2022-01-22 18:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1180/1251] eta 0:02:36 lr 0.000448 time 2.1869 (2.2035) loss 2.9133 (3.4646) grad_norm 1.5197 (1.5812) [2022-01-22 18:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1190/1251] eta 0:02:14 lr 0.000448 time 2.2015 (2.2036) loss 3.9177 (3.4667) grad_norm 1.3869 (1.5814) [2022-01-22 18:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1200/1251] eta 0:01:52 lr 0.000448 time 1.6663 (2.2039) loss 4.0037 (3.4683) grad_norm 1.6338 (1.5813) [2022-01-22 18:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1210/1251] eta 0:01:30 lr 0.000448 time 2.0531 (2.2019) loss 
4.1141 (3.4676) grad_norm 1.6679 (1.5812) [2022-01-22 18:30:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1220/1251] eta 0:01:08 lr 0.000448 time 1.9738 (2.2015) loss 4.4528 (3.4675) grad_norm 1.4062 (1.5810) [2022-01-22 18:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1230/1251] eta 0:00:46 lr 0.000448 time 1.8868 (2.2016) loss 2.7351 (3.4651) grad_norm 1.5950 (1.5809) [2022-01-22 18:31:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1240/1251] eta 0:00:24 lr 0.000448 time 2.0128 (2.2003) loss 2.9778 (3.4614) grad_norm 1.4141 (1.5801) [2022-01-22 18:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [160/300][1250/1251] eta 0:00:02 lr 0.000448 time 1.1850 (2.1950) loss 3.5581 (3.4595) grad_norm 1.6583 (1.5798) [2022-01-22 18:31:33 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 160 training takes 0:45:46 [2022-01-22 18:31:33 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_160 saving...... [2022-01-22 18:31:44 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_160 saved !!! 
[2022-01-22 18:32:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.884 (15.884) Loss 1.0632 (1.0632) Acc@1 74.902 (74.902) Acc@5 93.945 (93.945) [2022-01-22 18:32:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.988 (2.670) Loss 0.9686 (1.0154) Acc@1 77.539 (76.323) Acc@5 93.945 (93.422) [2022-01-22 18:32:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.893 (2.122) Loss 0.9508 (1.0102) Acc@1 77.246 (76.414) Acc@5 93.652 (93.541) [2022-01-22 18:32:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.243 (2.072) Loss 1.0584 (1.0156) Acc@1 75.684 (76.421) Acc@5 92.871 (93.381) [2022-01-22 18:33:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.305 (2.016) Loss 0.9237 (1.0171) Acc@1 79.102 (76.420) Acc@5 94.043 (93.400) [2022-01-22 18:33:14 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.556 Acc@5 93.406 [2022-01-22 18:33:14 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-01-22 18:33:14 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.56% [2022-01-22 18:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][0/1251] eta 7:24:36 lr 0.000448 time 21.3242 (21.3242) loss 3.6099 (3.6099) grad_norm 1.4698 (1.4698) [2022-01-22 18:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][10/1251] eta 1:24:26 lr 0.000448 time 2.2603 (4.0827) loss 3.7870 (3.2597) grad_norm 1.4812 (1.5005) [2022-01-22 18:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][20/1251] eta 1:04:51 lr 0.000448 time 2.4572 (3.1612) loss 3.7224 (3.4603) grad_norm 1.7454 (1.5250) [2022-01-22 18:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][30/1251] eta 0:57:26 lr 0.000448 time 1.5759 (2.8229) loss 2.2965 (3.4341) grad_norm 1.4931 (1.5214) [2022-01-22 18:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[161/300][40/1251] eta 0:54:42 lr 0.000448 time 3.5224 (2.7102) loss 3.7961 (3.4158) grad_norm 1.4267 (1.5372) [2022-01-22 18:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][50/1251] eta 0:53:26 lr 0.000448 time 3.0876 (2.6700) loss 3.7207 (3.4122) grad_norm 1.8267 (1.5450) [2022-01-22 18:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][60/1251] eta 0:52:03 lr 0.000448 time 2.5350 (2.6224) loss 3.9790 (3.4169) grad_norm 1.4392 (1.5538) [2022-01-22 18:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][70/1251] eta 0:50:21 lr 0.000448 time 1.5898 (2.5588) loss 3.8551 (3.4217) grad_norm 2.4692 (1.5704) [2022-01-22 18:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][80/1251] eta 0:48:54 lr 0.000448 time 2.8346 (2.5057) loss 3.2393 (3.4037) grad_norm 1.5948 (1.5602) [2022-01-22 18:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][90/1251] eta 0:47:22 lr 0.000448 time 2.2927 (2.4483) loss 4.0809 (3.3935) grad_norm 1.4415 (1.5621) [2022-01-22 18:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][100/1251] eta 0:46:14 lr 0.000448 time 1.9245 (2.4107) loss 3.2202 (3.3917) grad_norm 1.3980 (1.5589) [2022-01-22 18:37:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][110/1251] eta 0:45:14 lr 0.000448 time 1.9373 (2.3794) loss 3.5252 (3.4047) grad_norm 1.3508 (1.5616) [2022-01-22 18:38:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][120/1251] eta 0:44:47 lr 0.000448 time 3.2988 (2.3765) loss 3.3153 (3.3920) grad_norm 1.5557 (1.5636) [2022-01-22 18:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][130/1251] eta 0:44:20 lr 0.000448 time 1.6460 (2.3735) loss 3.4859 (3.3953) grad_norm 1.9943 (1.5669) [2022-01-22 18:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][140/1251] eta 0:43:49 lr 0.000448 time 1.3829 (2.3668) loss 3.3380 (3.3781) grad_norm 1.5441 
(1.5682) [2022-01-22 18:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][150/1251] eta 0:43:17 lr 0.000447 time 1.8556 (2.3596) loss 3.5003 (3.3994) grad_norm 1.4126 (1.5646) [2022-01-22 18:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][160/1251] eta 0:42:45 lr 0.000447 time 2.5947 (2.3514) loss 3.0184 (3.4057) grad_norm 1.5356 (1.5603) [2022-01-22 18:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][170/1251] eta 0:42:04 lr 0.000447 time 2.6652 (2.3354) loss 3.6048 (3.4160) grad_norm 1.7471 (1.5569) [2022-01-22 18:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][180/1251] eta 0:41:26 lr 0.000447 time 1.8530 (2.3217) loss 2.6770 (3.4181) grad_norm 1.5237 (1.5581) [2022-01-22 18:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][190/1251] eta 0:40:52 lr 0.000447 time 1.8967 (2.3113) loss 2.6500 (3.4083) grad_norm 1.6746 (1.5578) [2022-01-22 18:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][200/1251] eta 0:40:23 lr 0.000447 time 1.8950 (2.3064) loss 3.5957 (3.4137) grad_norm 1.5071 (1.5592) [2022-01-22 18:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][210/1251] eta 0:39:53 lr 0.000447 time 2.1782 (2.2994) loss 3.9711 (3.4210) grad_norm 1.5294 (1.5553) [2022-01-22 18:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][220/1251] eta 0:39:28 lr 0.000447 time 2.7910 (2.2976) loss 3.0145 (3.4267) grad_norm 1.5354 (1.5565) [2022-01-22 18:42:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][230/1251] eta 0:38:55 lr 0.000447 time 1.9507 (2.2875) loss 3.0823 (3.4302) grad_norm 1.8711 (1.5622) [2022-01-22 18:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][240/1251] eta 0:38:27 lr 0.000447 time 1.9595 (2.2821) loss 3.4462 (3.4397) grad_norm 1.3431 (1.5603) [2022-01-22 18:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[161/300][250/1251] eta 0:38:00 lr 0.000447 time 2.8266 (2.2784) loss 2.9809 (3.4418) grad_norm 1.6091 (1.5642) [2022-01-22 18:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][260/1251] eta 0:37:30 lr 0.000447 time 1.6461 (2.2706) loss 2.9588 (3.4475) grad_norm 1.8122 (1.5684) [2022-01-22 18:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][270/1251] eta 0:37:01 lr 0.000447 time 1.9279 (2.2650) loss 3.6681 (3.4559) grad_norm 1.3302 (1.5684) [2022-01-22 18:43:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][280/1251] eta 0:36:38 lr 0.000447 time 2.2305 (2.2638) loss 3.2883 (3.4492) grad_norm 1.3647 (1.5680) [2022-01-22 18:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][290/1251] eta 0:36:16 lr 0.000447 time 2.9385 (2.2648) loss 3.4975 (3.4431) grad_norm 2.5863 (1.5733) [2022-01-22 18:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][300/1251] eta 0:35:48 lr 0.000447 time 1.8611 (2.2592) loss 4.0026 (3.4419) grad_norm 1.4104 (1.5723) [2022-01-22 18:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][310/1251] eta 0:35:22 lr 0.000447 time 2.5390 (2.2557) loss 4.0219 (3.4484) grad_norm 1.8112 (1.5734) [2022-01-22 18:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][320/1251] eta 0:34:58 lr 0.000447 time 1.5767 (2.2540) loss 3.8359 (3.4437) grad_norm 1.4385 (1.5721) [2022-01-22 18:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][330/1251] eta 0:34:34 lr 0.000447 time 2.5932 (2.2528) loss 3.9161 (3.4465) grad_norm 1.5666 (1.5712) [2022-01-22 18:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][340/1251] eta 0:34:08 lr 0.000447 time 1.9111 (2.2488) loss 3.5669 (3.4480) grad_norm 1.3674 (1.5704) [2022-01-22 18:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][350/1251] eta 0:33:47 lr 0.000447 time 2.2604 (2.2507) loss 3.8910 (3.4499) grad_norm 
1.6016 (1.5689) [2022-01-22 18:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][360/1251] eta 0:33:24 lr 0.000447 time 1.9821 (2.2498) loss 2.8143 (3.4481) grad_norm 1.3541 (1.5734) [2022-01-22 18:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][370/1251] eta 0:33:04 lr 0.000447 time 2.5265 (2.2531) loss 3.0701 (3.4509) grad_norm 1.4465 (1.5769) [2022-01-22 18:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][380/1251] eta 0:32:40 lr 0.000447 time 1.8785 (2.2505) loss 3.0078 (3.4528) grad_norm 1.5290 (1.5748) [2022-01-22 18:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][390/1251] eta 0:32:12 lr 0.000447 time 2.6072 (2.2445) loss 3.9857 (3.4577) grad_norm 1.4602 (1.5746) [2022-01-22 18:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][400/1251] eta 0:31:46 lr 0.000446 time 1.9132 (2.2406) loss 4.3555 (3.4661) grad_norm 1.3784 (1.5748) [2022-01-22 18:48:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][410/1251] eta 0:31:19 lr 0.000446 time 1.6205 (2.2343) loss 4.3890 (3.4734) grad_norm 1.6030 (1.5730) [2022-01-22 18:48:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][420/1251] eta 0:30:52 lr 0.000446 time 2.1715 (2.2294) loss 2.7250 (3.4756) grad_norm 1.4111 (1.5719) [2022-01-22 18:49:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][430/1251] eta 0:30:28 lr 0.000446 time 2.5206 (2.2276) loss 3.5922 (3.4705) grad_norm 2.0201 (1.5733) [2022-01-22 18:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][440/1251] eta 0:30:07 lr 0.000446 time 2.1844 (2.2292) loss 2.8159 (3.4706) grad_norm 1.6446 (1.5748) [2022-01-22 18:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][450/1251] eta 0:29:43 lr 0.000446 time 1.6513 (2.2264) loss 3.9177 (3.4779) grad_norm 1.5231 (1.5741) [2022-01-22 18:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[161/300][460/1251] eta 0:29:21 lr 0.000446 time 2.2975 (2.2272) loss 3.5653 (3.4751) grad_norm 1.5843 (1.5733) [2022-01-22 18:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][470/1251] eta 0:28:58 lr 0.000446 time 2.2600 (2.2265) loss 3.1599 (3.4782) grad_norm 1.5896 (1.5723) [2022-01-22 18:51:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][480/1251] eta 0:28:39 lr 0.000446 time 2.4327 (2.2297) loss 2.4699 (3.4715) grad_norm 1.5770 (1.5753) [2022-01-22 18:51:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][490/1251] eta 0:28:17 lr 0.000446 time 1.8549 (2.2307) loss 3.4290 (3.4728) grad_norm 1.8477 (1.5789) [2022-01-22 18:51:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][500/1251] eta 0:27:55 lr 0.000446 time 2.6610 (2.2312) loss 3.1114 (3.4730) grad_norm 1.4396 (1.5826) [2022-01-22 18:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][510/1251] eta 0:27:29 lr 0.000446 time 1.6463 (2.2265) loss 2.6155 (3.4750) grad_norm 2.0999 (1.5841) [2022-01-22 18:52:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][520/1251] eta 0:27:08 lr 0.000446 time 2.8378 (2.2279) loss 3.4218 (3.4718) grad_norm 1.4984 (1.5835) [2022-01-22 18:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][530/1251] eta 0:26:44 lr 0.000446 time 1.9491 (2.2254) loss 4.3351 (3.4741) grad_norm 1.4041 (1.5814) [2022-01-22 18:53:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][540/1251] eta 0:26:22 lr 0.000446 time 2.8038 (2.2258) loss 4.0104 (3.4741) grad_norm 1.4296 (1.5800) [2022-01-22 18:53:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][550/1251] eta 0:25:59 lr 0.000446 time 1.6175 (2.2247) loss 3.3852 (3.4760) grad_norm 1.5225 (1.5803) [2022-01-22 18:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][560/1251] eta 0:25:37 lr 0.000446 time 2.3645 (2.2253) loss 2.6489 (3.4761) grad_norm 
1.6575 (1.5805) [2022-01-22 18:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][570/1251] eta 0:25:14 lr 0.000446 time 1.9076 (2.2234) loss 2.6349 (3.4742) grad_norm 1.7185 (1.5806) [2022-01-22 18:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][580/1251] eta 0:24:50 lr 0.000446 time 2.6037 (2.2210) loss 3.6624 (3.4739) grad_norm 1.4999 (1.5807) [2022-01-22 18:55:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][590/1251] eta 0:24:27 lr 0.000446 time 2.0352 (2.2198) loss 3.0898 (3.4793) grad_norm 1.4790 (1.5803) [2022-01-22 18:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][600/1251] eta 0:24:03 lr 0.000446 time 2.2057 (2.2178) loss 4.1844 (3.4757) grad_norm 1.5721 (1.5808) [2022-01-22 18:55:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][610/1251] eta 0:23:40 lr 0.000446 time 2.2125 (2.2155) loss 2.8449 (3.4745) grad_norm 1.4951 (1.5795) [2022-01-22 18:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][620/1251] eta 0:23:17 lr 0.000446 time 2.6030 (2.2154) loss 4.4531 (3.4803) grad_norm 1.6027 (1.5788) [2022-01-22 18:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][630/1251] eta 0:22:55 lr 0.000446 time 2.2474 (2.2154) loss 3.7610 (3.4821) grad_norm 1.6746 (1.5780) [2022-01-22 18:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][640/1251] eta 0:22:34 lr 0.000445 time 2.1626 (2.2174) loss 2.7068 (3.4796) grad_norm 1.6039 (1.5773) [2022-01-22 18:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][650/1251] eta 0:22:12 lr 0.000445 time 1.8288 (2.2175) loss 3.8875 (3.4834) grad_norm 1.4426 (1.5783) [2022-01-22 18:57:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][660/1251] eta 0:21:52 lr 0.000445 time 3.3940 (2.2208) loss 3.4762 (3.4823) grad_norm 1.3541 (1.5772) [2022-01-22 18:58:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[161/300][670/1251] eta 0:21:30 lr 0.000445 time 1.6461 (2.2204) loss 3.6855 (3.4823) grad_norm 1.3491 (1.5765) [2022-01-22 18:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][680/1251] eta 0:21:07 lr 0.000445 time 1.5028 (2.2193) loss 4.3018 (3.4797) grad_norm 2.2222 (1.5791) [2022-01-22 18:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][690/1251] eta 0:20:43 lr 0.000445 time 2.6069 (2.2174) loss 4.3612 (3.4801) grad_norm 1.5501 (1.5810) [2022-01-22 18:59:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][700/1251] eta 0:20:20 lr 0.000445 time 2.1343 (2.2157) loss 4.3526 (3.4808) grad_norm 1.7657 (1.5806) [2022-01-22 18:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][710/1251] eta 0:19:57 lr 0.000445 time 1.7797 (2.2140) loss 3.1792 (3.4808) grad_norm 1.6296 (1.5817) [2022-01-22 18:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][720/1251] eta 0:19:36 lr 0.000445 time 2.4605 (2.2148) loss 3.8259 (3.4830) grad_norm 1.6137 (1.5812) [2022-01-22 19:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][730/1251] eta 0:19:14 lr 0.000445 time 2.7381 (2.2153) loss 3.5784 (3.4841) grad_norm 1.4779 (1.5813) [2022-01-22 19:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][740/1251] eta 0:18:51 lr 0.000445 time 2.3181 (2.2140) loss 3.5468 (3.4858) grad_norm 1.5971 (1.5817) [2022-01-22 19:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][750/1251] eta 0:18:29 lr 0.000445 time 2.0420 (2.2137) loss 3.0799 (3.4862) grad_norm 1.5017 (1.5829) [2022-01-22 19:01:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][760/1251] eta 0:18:07 lr 0.000445 time 2.2020 (2.2142) loss 3.9584 (3.4865) grad_norm 1.6816 (1.5832) [2022-01-22 19:01:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][770/1251] eta 0:17:45 lr 0.000445 time 2.7880 (2.2143) loss 4.2829 (3.4863) grad_norm 
1.4374 (1.5839) [2022-01-22 19:02:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][780/1251] eta 0:17:21 lr 0.000445 time 2.1006 (2.2118) loss 3.8614 (3.4884) grad_norm 1.7009 (1.5839) [2022-01-22 19:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][790/1251] eta 0:16:58 lr 0.000445 time 1.8234 (2.2095) loss 2.5877 (3.4906) grad_norm 1.6120 (1.5835) [2022-01-22 19:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][800/1251] eta 0:16:35 lr 0.000445 time 2.2660 (2.2078) loss 3.8214 (3.4900) grad_norm 1.8522 (1.5841) [2022-01-22 19:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][810/1251] eta 0:16:13 lr 0.000445 time 1.6450 (2.2080) loss 3.7910 (3.4903) grad_norm 1.3993 (1.5855) [2022-01-22 19:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][820/1251] eta 0:15:51 lr 0.000445 time 1.7898 (2.2076) loss 3.4567 (3.4902) grad_norm 1.6504 (1.5873) [2022-01-22 19:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][830/1251] eta 0:15:29 lr 0.000445 time 2.1950 (2.2089) loss 2.6794 (3.4873) grad_norm 1.9866 (1.5882) [2022-01-22 19:04:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][840/1251] eta 0:15:08 lr 0.000445 time 2.7750 (2.2111) loss 3.1414 (3.4850) grad_norm 1.7550 (1.5886) [2022-01-22 19:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][850/1251] eta 0:14:47 lr 0.000445 time 2.2050 (2.2131) loss 3.9562 (3.4899) grad_norm 1.6379 (1.5884) [2022-01-22 19:04:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][860/1251] eta 0:14:24 lr 0.000445 time 1.6534 (2.2119) loss 2.1876 (3.4926) grad_norm 1.6794 (1.5886) [2022-01-22 19:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][870/1251] eta 0:14:02 lr 0.000445 time 1.8119 (2.2107) loss 2.5068 (3.4948) grad_norm 1.7443 (1.5883) [2022-01-22 19:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[161/300][880/1251] eta 0:13:39 lr 0.000444 time 2.4735 (2.2091) loss 3.9541 (3.4907) grad_norm 1.5279 (1.5892) [2022-01-22 19:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][890/1251] eta 0:13:16 lr 0.000444 time 1.8241 (2.2072) loss 4.2125 (3.4916) grad_norm 1.5980 (1.5895) [2022-01-22 19:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][900/1251] eta 0:12:54 lr 0.000444 time 2.4214 (2.2070) loss 3.7735 (3.4946) grad_norm 1.5363 (1.5893) [2022-01-22 19:06:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][910/1251] eta 0:12:32 lr 0.000444 time 2.1184 (2.2066) loss 4.3881 (3.4955) grad_norm 1.4906 (1.5892) [2022-01-22 19:07:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][920/1251] eta 0:12:10 lr 0.000444 time 2.2536 (2.2064) loss 4.1375 (3.4957) grad_norm 1.3476 (1.5886) [2022-01-22 19:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][930/1251] eta 0:11:48 lr 0.000444 time 2.2562 (2.2071) loss 3.0583 (3.4940) grad_norm 1.4791 (1.5872) [2022-01-22 19:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][940/1251] eta 0:11:26 lr 0.000444 time 2.8454 (2.2076) loss 3.2261 (3.4926) grad_norm 1.7062 (1.5881) [2022-01-22 19:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][950/1251] eta 0:11:05 lr 0.000444 time 2.7281 (2.2095) loss 3.3627 (3.4932) grad_norm 1.5415 (1.5889) [2022-01-22 19:08:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][960/1251] eta 0:10:42 lr 0.000444 time 2.2039 (2.2087) loss 3.4428 (3.4931) grad_norm 1.5684 (1.5887) [2022-01-22 19:08:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][970/1251] eta 0:10:20 lr 0.000444 time 2.2404 (2.2089) loss 3.6758 (3.4960) grad_norm 1.5723 (1.5887) [2022-01-22 19:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][980/1251] eta 0:09:58 lr 0.000444 time 1.8606 (2.2078) loss 3.5844 (3.4980) grad_norm 
1.3152 (1.5879) [2022-01-22 19:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][990/1251] eta 0:09:35 lr 0.000444 time 1.6997 (2.2063) loss 3.7542 (3.4971) grad_norm 1.4986 (1.5870) [2022-01-22 19:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1000/1251] eta 0:09:13 lr 0.000444 time 1.8980 (2.2048) loss 3.7998 (3.4976) grad_norm 1.3839 (1.5869) [2022-01-22 19:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1010/1251] eta 0:08:51 lr 0.000444 time 2.1981 (2.2041) loss 4.1699 (3.4980) grad_norm 1.5120 (1.5864) [2022-01-22 19:10:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1020/1251] eta 0:08:29 lr 0.000444 time 2.1534 (2.2048) loss 2.7189 (3.4976) grad_norm 1.5409 (1.5862) [2022-01-22 19:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1030/1251] eta 0:08:07 lr 0.000444 time 1.9303 (2.2040) loss 3.8902 (3.4989) grad_norm 1.4871 (1.5853) [2022-01-22 19:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1040/1251] eta 0:07:45 lr 0.000444 time 2.1562 (2.2046) loss 2.6377 (3.4986) grad_norm 1.7921 (1.5851) [2022-01-22 19:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1050/1251] eta 0:07:23 lr 0.000444 time 2.1543 (2.2049) loss 3.9183 (3.4988) grad_norm 1.9502 (1.5848) [2022-01-22 19:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1060/1251] eta 0:07:01 lr 0.000444 time 1.6078 (2.2050) loss 3.5047 (3.4957) grad_norm 1.4986 (1.5845) [2022-01-22 19:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1070/1251] eta 0:06:39 lr 0.000444 time 1.8482 (2.2049) loss 2.5436 (3.4932) grad_norm 1.6692 (1.5850) [2022-01-22 19:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1080/1251] eta 0:06:16 lr 0.000444 time 1.6037 (2.2042) loss 3.4720 (3.4930) grad_norm 1.6349 (1.5844) [2022-01-22 19:13:19 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [161/300][1090/1251] eta 0:05:54 lr 0.000444 time 2.1369 (2.2045) loss 3.6840 (3.4948) grad_norm 1.5475 (1.5838) [2022-01-22 19:13:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1100/1251] eta 0:05:32 lr 0.000444 time 1.8288 (2.2029) loss 4.0140 (3.4963) grad_norm 1.5535 (1.5829) [2022-01-22 19:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1110/1251] eta 0:05:10 lr 0.000444 time 2.0219 (2.2026) loss 2.6293 (3.4944) grad_norm 1.4827 (1.5818) [2022-01-22 19:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1120/1251] eta 0:04:48 lr 0.000443 time 2.1298 (2.2023) loss 3.9329 (3.4968) grad_norm 1.6219 (1.5808) [2022-01-22 19:14:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1130/1251] eta 0:04:26 lr 0.000443 time 1.9067 (2.2020) loss 3.4729 (3.4971) grad_norm 1.6550 (1.5812) [2022-01-22 19:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1140/1251] eta 0:04:04 lr 0.000443 time 1.8575 (2.2011) loss 3.9331 (3.4986) grad_norm 1.6693 (1.5813) [2022-01-22 19:15:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1150/1251] eta 0:03:42 lr 0.000443 time 2.2249 (2.2009) loss 3.8411 (3.4991) grad_norm 1.4785 (1.5811) [2022-01-22 19:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1160/1251] eta 0:03:20 lr 0.000443 time 2.4640 (2.2010) loss 2.2207 (3.4979) grad_norm 1.5047 (1.5812) [2022-01-22 19:16:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1170/1251] eta 0:02:58 lr 0.000443 time 2.8372 (2.2022) loss 3.0825 (3.4958) grad_norm 1.4938 (1.5805) [2022-01-22 19:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1180/1251] eta 0:02:36 lr 0.000443 time 2.2125 (2.2011) loss 2.9490 (3.4948) grad_norm 1.4634 (1.5801) [2022-01-22 19:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1190/1251] eta 0:02:14 lr 0.000443 time 2.2388 (2.2010) loss 2.6333 
(3.4950) grad_norm 1.4863 (1.5801) [2022-01-22 19:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1200/1251] eta 0:01:52 lr 0.000443 time 2.9964 (2.2013) loss 3.5917 (3.4966) grad_norm 1.5438 (1.5795) [2022-01-22 19:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1210/1251] eta 0:01:30 lr 0.000443 time 2.5205 (2.2023) loss 3.6424 (3.4949) grad_norm 1.4242 (1.5786) [2022-01-22 19:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1220/1251] eta 0:01:08 lr 0.000443 time 2.2386 (2.2022) loss 3.6395 (3.4935) grad_norm 1.4559 (1.5782) [2022-01-22 19:18:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1230/1251] eta 0:00:46 lr 0.000443 time 2.2417 (2.2019) loss 4.0373 (3.4952) grad_norm 1.3145 (1.5786) [2022-01-22 19:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1240/1251] eta 0:00:24 lr 0.000443 time 1.6570 (2.2009) loss 4.2297 (3.4979) grad_norm 1.7784 (1.5786) [2022-01-22 19:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [161/300][1250/1251] eta 0:00:02 lr 0.000443 time 1.1695 (2.1957) loss 3.3583 (3.4983) grad_norm 1.5259 (1.5781) [2022-01-22 19:19:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 161 training takes 0:45:47 [2022-01-22 19:19:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.916 (18.916) Loss 0.9207 (0.9207) Acc@1 78.223 (78.223) Acc@5 94.043 (94.043) [2022-01-22 19:19:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.944 (3.501) Loss 0.9997 (0.9780) Acc@1 75.098 (76.926) Acc@5 93.359 (93.581) [2022-01-22 19:19:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.351 (2.612) Loss 1.0061 (0.9860) Acc@1 74.902 (76.767) Acc@5 93.359 (93.508) [2022-01-22 19:20:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.251 (2.369) Loss 0.8647 (0.9840) Acc@1 78.516 (76.714) Acc@5 95.508 (93.586) [2022-01-22 19:20:31 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.713 (2.201) Loss 1.0588 (0.9857) Acc@1 74.805 (76.655) Acc@5 92.578 (93.526) [2022-01-22 19:20:39 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.650 Acc@5 93.540 [2022-01-22 19:20:39 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.7% [2022-01-22 19:20:39 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.65% [2022-01-22 19:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][0/1251] eta 7:38:35 lr 0.000443 time 21.9950 (21.9950) loss 2.8408 (2.8408) grad_norm 1.7174 (1.7174) [2022-01-22 19:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][10/1251] eta 1:25:56 lr 0.000443 time 2.1645 (4.1555) loss 3.8664 (3.4385) grad_norm 1.5473 (1.5272) [2022-01-22 19:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][20/1251] eta 1:05:35 lr 0.000443 time 1.4371 (3.1967) loss 3.4504 (3.5149) grad_norm 1.5643 (1.5599) [2022-01-22 19:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][30/1251] eta 0:58:53 lr 0.000443 time 1.8897 (2.8940) loss 3.8089 (3.4523) grad_norm 1.6051 (1.6000) [2022-01-22 19:22:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][40/1251] eta 0:56:11 lr 0.000443 time 3.9151 (2.7842) loss 4.0170 (3.4603) grad_norm 1.6236 (1.6156) [2022-01-22 19:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][50/1251] eta 0:54:07 lr 0.000443 time 2.9681 (2.7040) loss 3.0663 (3.4574) grad_norm 2.0544 (1.6158) [2022-01-22 19:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][60/1251] eta 0:51:25 lr 0.000443 time 1.5540 (2.5910) loss 3.3223 (3.4332) grad_norm 1.4941 (1.6066) [2022-01-22 19:23:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][70/1251] eta 0:49:28 lr 0.000443 time 1.6815 (2.5138) loss 2.3527 (3.4149) grad_norm 1.6722 (1.6196) [2022-01-22 19:24:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][80/1251] eta 0:48:29 lr 0.000443 time 4.1120 (2.4845) loss 3.7623 (3.3766) grad_norm 1.4948 (1.6049) [2022-01-22 19:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][90/1251] eta 0:47:34 lr 0.000443 time 1.7510 (2.4585) loss 4.1147 (3.3999) grad_norm 1.7758 (1.6009) [2022-01-22 19:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][100/1251] eta 0:46:39 lr 0.000443 time 1.9328 (2.4325) loss 3.4292 (3.4158) grad_norm 1.5156 (1.5936) [2022-01-22 19:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][110/1251] eta 0:45:37 lr 0.000443 time 1.8920 (2.3996) loss 2.5217 (3.4290) grad_norm 1.8987 (1.5942) [2022-01-22 19:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][120/1251] eta 0:44:46 lr 0.000442 time 2.0744 (2.3749) loss 2.8141 (3.4162) grad_norm 1.4969 (1.5920) [2022-01-22 19:25:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][130/1251] eta 0:43:51 lr 0.000442 time 2.1233 (2.3473) loss 3.7883 (3.4336) grad_norm 1.9348 (1.5923) [2022-01-22 19:26:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][140/1251] eta 0:43:22 lr 0.000442 time 2.0374 (2.3427) loss 4.0297 (3.4418) grad_norm 1.5813 (1.5936) [2022-01-22 19:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][150/1251] eta 0:42:47 lr 0.000442 time 2.7296 (2.3323) loss 3.0996 (3.4336) grad_norm 1.4621 (1.5954) [2022-01-22 19:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][160/1251] eta 0:42:17 lr 0.000442 time 2.3624 (2.3255) loss 3.7421 (3.4298) grad_norm 1.4519 (1.5959) [2022-01-22 19:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][170/1251] eta 0:41:45 lr 0.000442 time 2.5699 (2.3175) loss 3.3523 (3.4307) grad_norm 1.6572 (1.5928) [2022-01-22 19:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][180/1251] eta 0:41:12 lr 0.000442 
time 2.3911 (2.3082) loss 2.9938 (3.4276) grad_norm 1.7932 (1.5943) [2022-01-22 19:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][190/1251] eta 0:40:43 lr 0.000442 time 1.9357 (2.3033) loss 2.9086 (3.4307) grad_norm 1.7134 (1.5947) [2022-01-22 19:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][200/1251] eta 0:40:09 lr 0.000442 time 1.5773 (2.2929) loss 3.5541 (3.4310) grad_norm 1.4514 (1.5903) [2022-01-22 19:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][210/1251] eta 0:39:38 lr 0.000442 time 2.2054 (2.2848) loss 3.9102 (3.4204) grad_norm 1.4388 (1.5862) [2022-01-22 19:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][220/1251] eta 0:39:06 lr 0.000442 time 1.9677 (2.2755) loss 3.3295 (3.4189) grad_norm 1.6689 (1.5819) [2022-01-22 19:29:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][230/1251] eta 0:38:32 lr 0.000442 time 1.8732 (2.2650) loss 2.9945 (3.4237) grad_norm 1.4948 (1.5801) [2022-01-22 19:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][240/1251] eta 0:38:05 lr 0.000442 time 1.5874 (2.2604) loss 3.9828 (3.4205) grad_norm 1.4799 (1.5811) [2022-01-22 19:30:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][250/1251] eta 0:37:43 lr 0.000442 time 3.1495 (2.2611) loss 3.2633 (3.4235) grad_norm 1.5707 (1.5804) [2022-01-22 19:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][260/1251] eta 0:37:23 lr 0.000442 time 2.0704 (2.2640) loss 2.8026 (3.4145) grad_norm 1.6617 (1.5805) [2022-01-22 19:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][270/1251] eta 0:37:01 lr 0.000442 time 1.9556 (2.2643) loss 2.8347 (3.4140) grad_norm 1.4567 (1.5801) [2022-01-22 19:31:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][280/1251] eta 0:36:41 lr 0.000442 time 1.8522 (2.2676) loss 3.6392 (3.4132) grad_norm 1.5028 (1.5775) [2022-01-22 19:31:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][290/1251] eta 0:36:22 lr 0.000442 time 3.0467 (2.2709) loss 3.8616 (3.4194) grad_norm 1.5020 (1.5786) [2022-01-22 19:32:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][300/1251] eta 0:35:49 lr 0.000442 time 1.8340 (2.2604) loss 3.7436 (3.4292) grad_norm 1.3916 (1.5785) [2022-01-22 19:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][310/1251] eta 0:35:15 lr 0.000442 time 1.9613 (2.2485) loss 3.5763 (3.4371) grad_norm 1.5453 (1.5796) [2022-01-22 19:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][320/1251] eta 0:34:45 lr 0.000442 time 1.9070 (2.2401) loss 2.9048 (3.4354) grad_norm 1.4263 (1.5813) [2022-01-22 19:32:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][330/1251] eta 0:34:18 lr 0.000442 time 2.2928 (2.2348) loss 4.0840 (3.4378) grad_norm 1.6331 (1.5805) [2022-01-22 19:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][340/1251] eta 0:33:55 lr 0.000442 time 2.7937 (2.2339) loss 4.0292 (3.4337) grad_norm 1.5136 (1.5790) [2022-01-22 19:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][350/1251] eta 0:33:27 lr 0.000442 time 1.8487 (2.2284) loss 3.3812 (3.4374) grad_norm 1.9367 (1.5811) [2022-01-22 19:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][360/1251] eta 0:33:04 lr 0.000441 time 2.1558 (2.2270) loss 2.8532 (3.4367) grad_norm 1.6043 (1.5854) [2022-01-22 19:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][370/1251] eta 0:32:41 lr 0.000441 time 2.6133 (2.2262) loss 3.5821 (3.4404) grad_norm 1.5664 (1.5892) [2022-01-22 19:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][380/1251] eta 0:32:17 lr 0.000441 time 1.6867 (2.2242) loss 3.5826 (3.4482) grad_norm 1.4479 (1.5884) [2022-01-22 19:35:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][390/1251] eta 0:31:57 lr 
0.000441 time 2.1966 (2.2265) loss 3.5648 (3.4445) grad_norm 1.6857 (1.5912) [2022-01-22 19:35:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][400/1251] eta 0:31:35 lr 0.000441 time 2.7185 (2.2273) loss 3.3794 (3.4480) grad_norm 1.4512 (1.5890) [2022-01-22 19:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][410/1251] eta 0:31:11 lr 0.000441 time 2.0957 (2.2251) loss 4.1136 (3.4440) grad_norm 1.4641 (1.5883) [2022-01-22 19:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][420/1251] eta 0:30:56 lr 0.000441 time 3.0082 (2.2345) loss 3.2260 (3.4462) grad_norm 1.3831 (1.5867) [2022-01-22 19:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][430/1251] eta 0:30:40 lr 0.000441 time 2.0975 (2.2418) loss 2.9202 (3.4484) grad_norm 1.5908 (1.5840) [2022-01-22 19:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][440/1251] eta 0:30:19 lr 0.000441 time 2.4160 (2.2439) loss 3.5879 (3.4507) grad_norm 1.5608 (1.5845) [2022-01-22 19:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][450/1251] eta 0:29:54 lr 0.000441 time 1.8562 (2.2399) loss 3.9521 (3.4532) grad_norm 1.5969 (1.5858) [2022-01-22 19:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][460/1251] eta 0:29:25 lr 0.000441 time 1.8006 (2.2324) loss 4.0065 (3.4480) grad_norm 1.9104 (1.5878) [2022-01-22 19:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][470/1251] eta 0:28:58 lr 0.000441 time 2.1874 (2.2265) loss 3.5236 (3.4514) grad_norm 1.5958 (1.5879) [2022-01-22 19:38:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][480/1251] eta 0:28:33 lr 0.000441 time 1.9082 (2.2227) loss 4.0523 (3.4511) grad_norm 2.0683 (1.5912) [2022-01-22 19:38:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][490/1251] eta 0:28:09 lr 0.000441 time 2.2061 (2.2206) loss 2.5903 (3.4540) grad_norm 1.4230 (1.5903) [2022-01-22 19:39:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][500/1251] eta 0:27:44 lr 0.000441 time 1.5020 (2.2165) loss 3.6451 (3.4573) grad_norm 1.4995 (1.5913) [2022-01-22 19:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][510/1251] eta 0:27:24 lr 0.000441 time 3.2012 (2.2193) loss 3.5176 (3.4612) grad_norm 1.8684 (1.5913) [2022-01-22 19:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][520/1251] eta 0:27:05 lr 0.000441 time 2.6184 (2.2232) loss 3.7414 (3.4684) grad_norm 1.3921 (1.5918) [2022-01-22 19:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][530/1251] eta 0:26:44 lr 0.000441 time 2.2706 (2.2260) loss 3.5488 (3.4720) grad_norm 1.3919 (1.5899) [2022-01-22 19:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][540/1251] eta 0:26:22 lr 0.000441 time 2.2475 (2.2260) loss 3.9519 (3.4763) grad_norm 1.4350 (1.5909) [2022-01-22 19:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][550/1251] eta 0:26:01 lr 0.000441 time 2.9382 (2.2274) loss 4.2354 (3.4766) grad_norm 1.4719 (1.5899) [2022-01-22 19:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][560/1251] eta 0:25:37 lr 0.000441 time 1.9252 (2.2252) loss 4.0488 (3.4727) grad_norm 1.7409 (1.5905) [2022-01-22 19:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][570/1251] eta 0:25:13 lr 0.000441 time 1.8165 (2.2222) loss 2.7199 (3.4727) grad_norm 1.7984 (1.5915) [2022-01-22 19:42:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][580/1251] eta 0:24:49 lr 0.000441 time 2.1931 (2.2204) loss 3.6388 (3.4711) grad_norm 1.4679 (1.5906) [2022-01-22 19:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][590/1251] eta 0:24:28 lr 0.000441 time 3.4798 (2.2214) loss 4.0070 (3.4735) grad_norm 1.5180 (1.5898) [2022-01-22 19:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][600/1251] eta 0:24:05 lr 
0.000440 time 1.8192 (2.2203) loss 3.0213 (3.4716) grad_norm 1.5628 (1.5905) [2022-01-22 19:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][610/1251] eta 0:23:42 lr 0.000440 time 2.2407 (2.2196) loss 2.5695 (3.4730) grad_norm 1.6589 (1.5916) [2022-01-22 19:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][620/1251] eta 0:23:22 lr 0.000440 time 2.5095 (2.2222) loss 3.8738 (3.4747) grad_norm 1.4762 (1.5910) [2022-01-22 19:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][630/1251] eta 0:22:59 lr 0.000440 time 2.5579 (2.2221) loss 3.2942 (3.4726) grad_norm 1.6321 (1.5915) [2022-01-22 19:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][640/1251] eta 0:22:37 lr 0.000440 time 1.5496 (2.2213) loss 3.7627 (3.4738) grad_norm 1.3306 (1.5909) [2022-01-22 19:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][650/1251] eta 0:22:14 lr 0.000440 time 1.8871 (2.2201) loss 4.1908 (3.4744) grad_norm 1.5807 (1.5912) [2022-01-22 19:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][660/1251] eta 0:21:50 lr 0.000440 time 1.8473 (2.2167) loss 4.1897 (3.4735) grad_norm 1.7504 (1.5907) [2022-01-22 19:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][670/1251] eta 0:21:27 lr 0.000440 time 2.5213 (2.2160) loss 4.5442 (3.4765) grad_norm 1.5276 (1.5923) [2022-01-22 19:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][680/1251] eta 0:21:04 lr 0.000440 time 2.1492 (2.2144) loss 3.9341 (3.4766) grad_norm 1.6929 (1.5936) [2022-01-22 19:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][690/1251] eta 0:20:41 lr 0.000440 time 2.5312 (2.2137) loss 3.1302 (3.4784) grad_norm 1.3993 (1.5929) [2022-01-22 19:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][700/1251] eta 0:20:19 lr 0.000440 time 2.3615 (2.2127) loss 3.6905 (3.4811) grad_norm 1.7442 (1.5929) [2022-01-22 19:46:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][710/1251] eta 0:19:58 lr 0.000440 time 3.1827 (2.2159) loss 3.6859 (3.4830) grad_norm 1.8584 (1.5940) [2022-01-22 19:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][720/1251] eta 0:19:35 lr 0.000440 time 1.8655 (2.2147) loss 3.6429 (3.4871) grad_norm 2.1128 (1.5945) [2022-01-22 19:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][730/1251] eta 0:19:14 lr 0.000440 time 2.5310 (2.2154) loss 3.2452 (3.4836) grad_norm 1.6062 (1.5930) [2022-01-22 19:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][740/1251] eta 0:18:50 lr 0.000440 time 1.9197 (2.2123) loss 3.0460 (3.4844) grad_norm 1.4219 (1.5918) [2022-01-22 19:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][750/1251] eta 0:18:28 lr 0.000440 time 1.8047 (2.2120) loss 3.1255 (3.4817) grad_norm 1.5368 (1.5916) [2022-01-22 19:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][760/1251] eta 0:18:05 lr 0.000440 time 1.5802 (2.2117) loss 4.1891 (3.4854) grad_norm 1.5589 (1.5910) [2022-01-22 19:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][770/1251] eta 0:17:44 lr 0.000440 time 2.7630 (2.2124) loss 4.1789 (3.4874) grad_norm 1.6987 (1.5894) [2022-01-22 19:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][780/1251] eta 0:17:21 lr 0.000440 time 1.7698 (2.2108) loss 3.3895 (3.4884) grad_norm 1.4220 (1.5896) [2022-01-22 19:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][790/1251] eta 0:16:59 lr 0.000440 time 2.1674 (2.2119) loss 3.8952 (3.4874) grad_norm 1.3924 (1.5891) [2022-01-22 19:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][800/1251] eta 0:16:37 lr 0.000440 time 1.8681 (2.2112) loss 3.6020 (3.4869) grad_norm 1.4449 (1.5877) [2022-01-22 19:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][810/1251] eta 0:16:15 lr 
0.000440 time 2.5808 (2.2115) loss 3.8359 (3.4903) grad_norm 1.4872 (1.5870) [2022-01-22 19:50:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][820/1251] eta 0:15:52 lr 0.000440 time 1.9445 (2.2097) loss 3.5060 (3.4916) grad_norm 1.7001 (1.5875) [2022-01-22 19:51:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][830/1251] eta 0:15:29 lr 0.000440 time 2.6219 (2.2080) loss 3.7164 (3.4895) grad_norm 1.4643 (1.5868) [2022-01-22 19:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][840/1251] eta 0:15:06 lr 0.000440 time 1.9004 (2.2062) loss 3.9986 (3.4883) grad_norm 1.4423 (1.5860) [2022-01-22 19:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][850/1251] eta 0:14:44 lr 0.000439 time 2.0827 (2.2069) loss 4.1112 (3.4870) grad_norm 1.3584 (1.5867) [2022-01-22 19:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][860/1251] eta 0:14:22 lr 0.000439 time 2.5466 (2.2066) loss 3.1753 (3.4903) grad_norm 1.6510 (1.5870) [2022-01-22 19:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][870/1251] eta 0:13:59 lr 0.000439 time 2.4563 (2.2046) loss 4.2182 (3.4915) grad_norm 1.4939 (1.5860) [2022-01-22 19:53:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][880/1251] eta 0:13:37 lr 0.000439 time 1.9145 (2.2035) loss 3.0245 (3.4904) grad_norm 1.5141 (1.5859) [2022-01-22 19:53:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][890/1251] eta 0:13:15 lr 0.000439 time 2.1537 (2.2028) loss 3.1776 (3.4905) grad_norm 1.5658 (1.5852) [2022-01-22 19:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][900/1251] eta 0:12:52 lr 0.000439 time 1.5865 (2.2020) loss 3.4804 (3.4892) grad_norm 1.3670 (1.5856) [2022-01-22 19:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][910/1251] eta 0:12:31 lr 0.000439 time 3.3192 (2.2041) loss 3.8028 (3.4906) grad_norm 1.6037 (1.5854) [2022-01-22 19:54:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][920/1251] eta 0:12:10 lr 0.000439 time 1.4586 (2.2055) loss 3.8846 (3.4898) grad_norm 1.8615 (1.5846) [2022-01-22 19:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][930/1251] eta 0:11:47 lr 0.000439 time 2.3932 (2.2054) loss 3.8544 (3.4920) grad_norm 1.4747 (1.5837) [2022-01-22 19:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][940/1251] eta 0:11:25 lr 0.000439 time 1.7571 (2.2048) loss 3.1210 (3.4932) grad_norm 1.7913 (1.5839) [2022-01-22 19:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][950/1251] eta 0:11:03 lr 0.000439 time 1.9243 (2.2033) loss 3.2013 (3.4917) grad_norm 1.4718 (1.5840) [2022-01-22 19:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][960/1251] eta 0:10:40 lr 0.000439 time 1.9072 (2.2020) loss 3.7768 (3.4908) grad_norm 1.3049 (1.5838) [2022-01-22 19:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][970/1251] eta 0:10:18 lr 0.000439 time 1.6129 (2.2010) loss 3.1619 (3.4912) grad_norm 1.5587 (1.5835) [2022-01-22 19:56:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][980/1251] eta 0:09:56 lr 0.000439 time 2.2199 (2.2010) loss 4.1236 (3.4872) grad_norm 2.0756 (1.5834) [2022-01-22 19:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][990/1251] eta 0:09:34 lr 0.000439 time 2.1649 (2.1996) loss 2.6573 (3.4875) grad_norm 1.5195 (1.5831) [2022-01-22 19:57:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1000/1251] eta 0:09:12 lr 0.000439 time 2.6931 (2.1996) loss 2.3684 (3.4864) grad_norm 1.6477 (1.5832) [2022-01-22 19:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1010/1251] eta 0:08:50 lr 0.000439 time 1.8844 (2.2004) loss 3.8031 (3.4887) grad_norm 1.4794 (1.5826) [2022-01-22 19:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1020/1251] eta 0:08:28 lr 
0.000439 time 2.1293 (2.2003) loss 3.7994 (3.4907) grad_norm 1.2989 (1.5828) [2022-01-22 19:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1030/1251] eta 0:08:06 lr 0.000439 time 2.2622 (2.2005) loss 3.7371 (3.4916) grad_norm 1.5611 (1.5826) [2022-01-22 19:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1040/1251] eta 0:07:44 lr 0.000439 time 2.7836 (2.2010) loss 3.7392 (3.4924) grad_norm 1.4634 (1.5826) [2022-01-22 19:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1050/1251] eta 0:07:22 lr 0.000439 time 1.5836 (2.2012) loss 2.6520 (3.4916) grad_norm 1.6988 (1.5823) [2022-01-22 19:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1060/1251] eta 0:07:00 lr 0.000439 time 3.0201 (2.2024) loss 3.9064 (3.4913) grad_norm 1.4792 (1.5819) [2022-01-22 19:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1070/1251] eta 0:06:38 lr 0.000439 time 2.0774 (2.2018) loss 3.4729 (3.4922) grad_norm 1.7126 (1.5818) [2022-01-22 20:00:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1080/1251] eta 0:06:16 lr 0.000439 time 2.3820 (2.2014) loss 3.2751 (3.4921) grad_norm 1.7839 (1.5814) [2022-01-22 20:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1090/1251] eta 0:05:54 lr 0.000438 time 1.9255 (2.1995) loss 2.5441 (3.4914) grad_norm 1.6004 (1.5811) [2022-01-22 20:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1100/1251] eta 0:05:32 lr 0.000438 time 2.2526 (2.1991) loss 4.1741 (3.4908) grad_norm 1.5076 (1.5810) [2022-01-22 20:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1110/1251] eta 0:05:09 lr 0.000438 time 2.6055 (2.1983) loss 3.2903 (3.4874) grad_norm 1.8340 (1.5818) [2022-01-22 20:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1120/1251] eta 0:04:47 lr 0.000438 time 2.5190 (2.1982) loss 3.8498 (3.4889) grad_norm 1.5174 (1.5819) [2022-01-22 
20:02:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1130/1251] eta 0:04:25 lr 0.000438 time 1.6884 (2.1975) loss 3.0751 (3.4878) grad_norm 1.4295 (1.5816) [2022-01-22 20:02:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1140/1251] eta 0:04:03 lr 0.000438 time 2.5102 (2.1976) loss 4.1632 (3.4906) grad_norm 1.4149 (1.5814) [2022-01-22 20:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1150/1251] eta 0:03:42 lr 0.000438 time 2.8475 (2.1986) loss 3.9571 (3.4917) grad_norm 1.4434 (1.5810) [2022-01-22 20:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1160/1251] eta 0:03:20 lr 0.000438 time 2.7284 (2.2003) loss 3.7907 (3.4939) grad_norm 1.4874 (1.5805) [2022-01-22 20:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1170/1251] eta 0:02:58 lr 0.000438 time 1.6868 (2.1999) loss 3.9017 (3.4965) grad_norm 1.3895 (1.5810) [2022-01-22 20:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1180/1251] eta 0:02:36 lr 0.000438 time 2.0007 (2.1987) loss 3.4836 (3.4979) grad_norm 1.5710 (1.5808) [2022-01-22 20:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1190/1251] eta 0:02:14 lr 0.000438 time 1.9038 (2.1971) loss 3.1028 (3.4999) grad_norm 1.4056 (1.5802) [2022-01-22 20:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1200/1251] eta 0:01:51 lr 0.000438 time 2.2408 (2.1957) loss 4.2082 (3.5007) grad_norm 1.5013 (1.5801) [2022-01-22 20:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1210/1251] eta 0:01:29 lr 0.000438 time 2.3002 (2.1948) loss 3.1629 (3.4995) grad_norm 1.4865 (1.5810) [2022-01-22 20:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1220/1251] eta 0:01:08 lr 0.000438 time 2.1492 (2.1948) loss 3.8234 (3.4972) grad_norm 1.6903 (1.5829) [2022-01-22 20:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1230/1251] 
eta 0:00:46 lr 0.000438 time 2.4277 (2.1956) loss 3.5003 (3.4957) grad_norm 1.7018 (1.5832) [2022-01-22 20:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1240/1251] eta 0:00:24 lr 0.000438 time 1.7128 (2.1956) loss 3.8124 (3.4960) grad_norm 1.4609 (1.5829) [2022-01-22 20:06:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [162/300][1250/1251] eta 0:00:02 lr 0.000438 time 1.1733 (2.1913) loss 3.3769 (3.4991) grad_norm 1.3210 (1.5826) [2022-01-22 20:06:21 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 162 training takes 0:45:41 [2022-01-22 20:06:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.031 (16.031) Loss 0.9319 (0.9319) Acc@1 78.027 (78.027) Acc@5 94.922 (94.922) [2022-01-22 20:06:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.278 (3.296) Loss 1.0291 (0.9643) Acc@1 75.098 (76.634) Acc@5 92.480 (93.910) [2022-01-22 20:07:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.632 (2.567) Loss 1.0100 (0.9864) Acc@1 78.320 (76.469) Acc@5 92.773 (93.708) [2022-01-22 20:07:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.326 (2.234) Loss 0.9690 (0.9897) Acc@1 77.051 (76.418) Acc@5 94.141 (93.709) [2022-01-22 20:07:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.442 (2.140) Loss 0.9764 (0.9930) Acc@1 76.660 (76.398) Acc@5 94.238 (93.707) [2022-01-22 20:07:57 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.468 Acc@5 93.676 [2022-01-22 20:07:57 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.5% [2022-01-22 20:07:57 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.65% [2022-01-22 20:08:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][0/1251] eta 7:30:20 lr 0.000438 time 21.5995 (21.5995) loss 3.8873 (3.8873) grad_norm 1.3607 (1.3607) [2022-01-22 20:08:43 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [163/300][10/1251] eta 1:26:30 lr 0.000438 time 1.5704 (4.1825) loss 3.7828 (3.4536) grad_norm 1.6039 (1.5654) [2022-01-22 20:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][20/1251] eta 1:07:38 lr 0.000438 time 1.8309 (3.2970) loss 3.8017 (3.5242) grad_norm 1.6543 (1.5616) [2022-01-22 20:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][30/1251] eta 0:59:04 lr 0.000438 time 1.5463 (2.9027) loss 4.2577 (3.5110) grad_norm 1.5866 (1.5796) [2022-01-22 20:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][40/1251] eta 0:56:25 lr 0.000438 time 3.8762 (2.7953) loss 3.4650 (3.5113) grad_norm 1.4792 (1.5809) [2022-01-22 20:10:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][50/1251] eta 0:53:41 lr 0.000438 time 1.4944 (2.6821) loss 4.2141 (3.5392) grad_norm 1.6358 (1.5848) [2022-01-22 20:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][60/1251] eta 0:52:04 lr 0.000438 time 1.7520 (2.6237) loss 3.3839 (3.5255) grad_norm 1.5008 (1.5768) [2022-01-22 20:10:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][70/1251] eta 0:50:00 lr 0.000438 time 1.6736 (2.5408) loss 3.3871 (3.5406) grad_norm 1.4245 (1.5821) [2022-01-22 20:11:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][80/1251] eta 0:49:00 lr 0.000437 time 3.7353 (2.5108) loss 4.1145 (3.5160) grad_norm 1.5014 (1.5842) [2022-01-22 20:11:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][90/1251] eta 0:47:49 lr 0.000437 time 1.7687 (2.4715) loss 3.5036 (3.5013) grad_norm 1.4489 (1.5775) [2022-01-22 20:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][100/1251] eta 0:46:59 lr 0.000437 time 2.0301 (2.4498) loss 2.3134 (3.4943) grad_norm 1.9179 (1.5780) [2022-01-22 20:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][110/1251] eta 0:46:11 lr 0.000437 time 2.8783 (2.4291) loss 3.8468 (3.4990) grad_norm 
1.6251 (1.5760) [2022-01-22 20:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][120/1251] eta 0:45:27 lr 0.000437 time 3.3752 (2.4116) loss 3.7813 (3.4930) grad_norm 2.0014 (1.5809) [2022-01-22 20:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][130/1251] eta 0:44:39 lr 0.000437 time 1.6004 (2.3905) loss 3.7545 (3.4987) grad_norm 1.5379 (1.5805) [2022-01-22 20:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][140/1251] eta 0:44:22 lr 0.000437 time 2.4430 (2.3966) loss 3.6973 (3.4981) grad_norm 1.6245 (1.5811) [2022-01-22 20:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][150/1251] eta 0:43:37 lr 0.000437 time 1.6536 (2.3776) loss 3.7326 (3.4988) grad_norm 1.5614 (1.5843) [2022-01-22 20:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][160/1251] eta 0:42:57 lr 0.000437 time 1.7964 (2.3628) loss 2.8604 (3.4826) grad_norm 1.3878 (1.5772) [2022-01-22 20:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][170/1251] eta 0:42:10 lr 0.000437 time 1.8557 (2.3407) loss 3.0173 (3.4679) grad_norm 1.6303 (1.5716) [2022-01-22 20:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][180/1251] eta 0:41:39 lr 0.000437 time 2.2771 (2.3336) loss 3.5699 (3.4709) grad_norm 1.6344 (1.5707) [2022-01-22 20:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][190/1251] eta 0:41:03 lr 0.000437 time 1.5883 (2.3218) loss 4.1961 (3.4651) grad_norm 1.4539 (1.5694) [2022-01-22 20:15:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][200/1251] eta 0:40:36 lr 0.000437 time 2.5772 (2.3184) loss 3.9479 (3.4816) grad_norm 1.4839 (1.5684) [2022-01-22 20:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][210/1251] eta 0:39:58 lr 0.000437 time 1.9079 (2.3045) loss 3.2708 (3.4875) grad_norm 1.3692 (1.5714) [2022-01-22 20:16:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[163/300][220/1251] eta 0:39:24 lr 0.000437 time 1.9228 (2.2935) loss 4.2497 (3.4856) grad_norm 1.5420 (1.5690) [2022-01-22 20:16:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][230/1251] eta 0:38:56 lr 0.000437 time 1.5538 (2.2886) loss 3.3368 (3.4836) grad_norm 1.6987 (1.5691) [2022-01-22 20:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][240/1251] eta 0:38:30 lr 0.000437 time 2.3728 (2.2856) loss 3.1099 (3.4888) grad_norm 1.5454 (1.5717) [2022-01-22 20:17:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][250/1251] eta 0:38:11 lr 0.000437 time 2.3237 (2.2893) loss 3.8681 (3.4960) grad_norm 1.7669 (1.5718) [2022-01-22 20:17:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][260/1251] eta 0:37:49 lr 0.000437 time 1.6630 (2.2903) loss 3.3544 (3.4907) grad_norm 1.7607 (1.5718) [2022-01-22 20:18:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][270/1251] eta 0:37:27 lr 0.000437 time 1.7079 (2.2907) loss 3.2395 (3.4915) grad_norm 1.4513 (1.5754) [2022-01-22 20:18:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][280/1251] eta 0:36:56 lr 0.000437 time 2.4288 (2.2826) loss 3.5281 (3.4876) grad_norm 1.4782 (1.5770) [2022-01-22 20:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][290/1251] eta 0:36:23 lr 0.000437 time 1.8885 (2.2721) loss 3.6356 (3.4726) grad_norm 1.8966 (1.5766) [2022-01-22 20:19:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][300/1251] eta 0:35:52 lr 0.000437 time 1.9373 (2.2639) loss 2.2581 (3.4665) grad_norm 1.4800 (1.5777) [2022-01-22 20:19:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][310/1251] eta 0:35:24 lr 0.000437 time 1.9063 (2.2580) loss 4.0654 (3.4762) grad_norm 1.6006 (1.5779) [2022-01-22 20:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][320/1251] eta 0:35:00 lr 0.000437 time 2.6922 (2.2559) loss 3.8389 (3.4861) grad_norm 
1.9465 (1.5796) [2022-01-22 20:20:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][330/1251] eta 0:34:39 lr 0.000436 time 2.3634 (2.2575) loss 3.7245 (3.4905) grad_norm 1.5592 (1.5814) [2022-01-22 20:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][340/1251] eta 0:34:16 lr 0.000436 time 1.8111 (2.2574) loss 4.0361 (3.4845) grad_norm 1.5078 (1.5800) [2022-01-22 20:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][350/1251] eta 0:33:53 lr 0.000436 time 1.6507 (2.2567) loss 3.8493 (3.4816) grad_norm 1.4983 (1.5789) [2022-01-22 20:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][360/1251] eta 0:33:33 lr 0.000436 time 3.4095 (2.2599) loss 3.0872 (3.4850) grad_norm 1.6248 (1.5832) [2022-01-22 20:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][370/1251] eta 0:33:14 lr 0.000436 time 1.6235 (2.2642) loss 3.1586 (3.4871) grad_norm 1.7424 (1.5899) [2022-01-22 20:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][380/1251] eta 0:32:46 lr 0.000436 time 1.5682 (2.2581) loss 2.6631 (3.4889) grad_norm 1.6407 (1.5900) [2022-01-22 20:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][390/1251] eta 0:32:19 lr 0.000436 time 1.6919 (2.2529) loss 2.6023 (3.4860) grad_norm 1.3390 (1.5931) [2022-01-22 20:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][400/1251] eta 0:31:57 lr 0.000436 time 3.7068 (2.2533) loss 3.4689 (3.4890) grad_norm 1.4810 (1.5948) [2022-01-22 20:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][410/1251] eta 0:31:33 lr 0.000436 time 2.1825 (2.2517) loss 2.9098 (3.4853) grad_norm 1.8198 (1.5964) [2022-01-22 20:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][420/1251] eta 0:31:09 lr 0.000436 time 1.8933 (2.2502) loss 3.8911 (3.4837) grad_norm 1.4854 (1.5966) [2022-01-22 20:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[163/300][430/1251] eta 0:30:48 lr 0.000436 time 2.1717 (2.2510) loss 4.2501 (3.4828) grad_norm 1.5632 (1.5959) [2022-01-22 20:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][440/1251] eta 0:30:25 lr 0.000436 time 3.1565 (2.2512) loss 3.8589 (3.4789) grad_norm 1.9356 (1.5969) [2022-01-22 20:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][450/1251] eta 0:30:00 lr 0.000436 time 2.7576 (2.2482) loss 3.1021 (3.4825) grad_norm 1.3822 (1.5960) [2022-01-22 20:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][460/1251] eta 0:29:37 lr 0.000436 time 1.5277 (2.2475) loss 3.7615 (3.4869) grad_norm 1.7693 (1.5990) [2022-01-22 20:25:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][470/1251] eta 0:29:13 lr 0.000436 time 1.9172 (2.2457) loss 2.6395 (3.4769) grad_norm 1.6207 (1.5984) [2022-01-22 20:25:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][480/1251] eta 0:28:47 lr 0.000436 time 2.0976 (2.2405) loss 3.4719 (3.4787) grad_norm 1.4542 (1.5968) [2022-01-22 20:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][490/1251] eta 0:28:22 lr 0.000436 time 3.2242 (2.2374) loss 3.8307 (3.4767) grad_norm 1.4095 (1.5946) [2022-01-22 20:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][500/1251] eta 0:27:57 lr 0.000436 time 2.0002 (2.2343) loss 2.1364 (3.4756) grad_norm 1.3998 (1.5950) [2022-01-22 20:26:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][510/1251] eta 0:27:34 lr 0.000436 time 2.5970 (2.2328) loss 3.5052 (3.4767) grad_norm 1.9636 (1.5960) [2022-01-22 20:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][520/1251] eta 0:27:11 lr 0.000436 time 2.1977 (2.2315) loss 3.1028 (3.4742) grad_norm 1.6612 (1.5947) [2022-01-22 20:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][530/1251] eta 0:26:48 lr 0.000436 time 2.0138 (2.2308) loss 3.7828 (3.4808) grad_norm 
1.4032 (1.5936) [2022-01-22 20:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][540/1251] eta 0:26:27 lr 0.000436 time 2.9707 (2.2330) loss 3.9477 (3.4840) grad_norm 1.3873 (1.5933) [2022-01-22 20:28:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][550/1251] eta 0:26:04 lr 0.000436 time 2.4848 (2.2315) loss 3.4295 (3.4857) grad_norm 1.4129 (1.5920) [2022-01-22 20:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][560/1251] eta 0:25:41 lr 0.000436 time 2.2876 (2.2312) loss 3.4620 (3.4858) grad_norm 1.4384 (1.5909) [2022-01-22 20:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][570/1251] eta 0:25:21 lr 0.000435 time 2.2605 (2.2344) loss 2.4709 (3.4859) grad_norm 1.7520 (1.5953) [2022-01-22 20:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][580/1251] eta 0:24:59 lr 0.000435 time 2.2531 (2.2345) loss 2.8469 (3.4847) grad_norm 1.4720 (1.5954) [2022-01-22 20:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][590/1251] eta 0:24:35 lr 0.000435 time 2.1849 (2.2328) loss 2.9548 (3.4821) grad_norm 1.4206 (1.5945) [2022-01-22 20:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][600/1251] eta 0:24:10 lr 0.000435 time 1.9089 (2.2282) loss 3.3475 (3.4851) grad_norm 1.3665 (1.5937) [2022-01-22 20:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][610/1251] eta 0:23:47 lr 0.000435 time 2.8648 (2.2271) loss 3.9806 (3.4864) grad_norm 1.4809 (1.5938) [2022-01-22 20:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][620/1251] eta 0:23:26 lr 0.000435 time 2.7938 (2.2283) loss 3.3881 (3.4817) grad_norm 1.7027 (1.5935) [2022-01-22 20:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][630/1251] eta 0:23:02 lr 0.000435 time 1.8326 (2.2264) loss 4.2210 (3.4833) grad_norm 1.8467 (1.5930) [2022-01-22 20:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[163/300][640/1251] eta 0:22:39 lr 0.000435 time 2.7447 (2.2254) loss 3.3232 (3.4871) grad_norm 1.4759 (1.5916) [2022-01-22 20:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][650/1251] eta 0:22:15 lr 0.000435 time 2.2954 (2.2224) loss 3.2288 (3.4865) grad_norm 1.4752 (1.5910) [2022-01-22 20:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][660/1251] eta 0:21:53 lr 0.000435 time 2.2831 (2.2218) loss 3.9348 (3.4874) grad_norm 1.3994 (1.5919) [2022-01-22 20:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][670/1251] eta 0:21:30 lr 0.000435 time 1.9106 (2.2205) loss 2.5401 (3.4875) grad_norm 1.4797 (1.5917) [2022-01-22 20:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][680/1251] eta 0:21:08 lr 0.000435 time 3.2083 (2.2216) loss 3.4843 (3.4883) grad_norm 1.6157 (1.5914) [2022-01-22 20:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][690/1251] eta 0:20:46 lr 0.000435 time 2.5589 (2.2211) loss 3.2747 (3.4853) grad_norm 1.7908 (1.5913) [2022-01-22 20:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][700/1251] eta 0:20:22 lr 0.000435 time 2.6295 (2.2191) loss 4.1479 (3.4859) grad_norm 1.7234 (1.5923) [2022-01-22 20:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][710/1251] eta 0:20:00 lr 0.000435 time 2.8486 (2.2188) loss 3.5039 (3.4805) grad_norm 1.5718 (1.5914) [2022-01-22 20:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][720/1251] eta 0:19:38 lr 0.000435 time 2.5442 (2.2186) loss 3.2338 (3.4816) grad_norm 1.6199 (1.5918) [2022-01-22 20:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][730/1251] eta 0:19:16 lr 0.000435 time 2.3575 (2.2193) loss 4.2423 (3.4827) grad_norm 1.6414 (1.5912) [2022-01-22 20:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][740/1251] eta 0:18:53 lr 0.000435 time 2.2643 (2.2189) loss 3.1520 (3.4819) grad_norm 
1.5926 (1.5911) [2022-01-22 20:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][750/1251] eta 0:18:30 lr 0.000435 time 1.9941 (2.2168) loss 3.4835 (3.4831) grad_norm 1.7126 (1.5902) [2022-01-22 20:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][760/1251] eta 0:18:08 lr 0.000435 time 2.6760 (2.2167) loss 3.8907 (3.4858) grad_norm 1.6504 (1.5903) [2022-01-22 20:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][770/1251] eta 0:17:45 lr 0.000435 time 2.1940 (2.2154) loss 3.8579 (3.4857) grad_norm 1.6047 (1.5906) [2022-01-22 20:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][780/1251] eta 0:17:23 lr 0.000435 time 2.2482 (2.2148) loss 3.6197 (3.4861) grad_norm 1.6408 (1.5914) [2022-01-22 20:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][790/1251] eta 0:17:00 lr 0.000435 time 2.2427 (2.2136) loss 3.4733 (3.4828) grad_norm 1.6616 (1.5912) [2022-01-22 20:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][800/1251] eta 0:16:41 lr 0.000435 time 4.0181 (2.2207) loss 3.6023 (3.4809) grad_norm 1.5195 (1.5914) [2022-01-22 20:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][810/1251] eta 0:16:19 lr 0.000434 time 2.2497 (2.2204) loss 3.6935 (3.4807) grad_norm 1.4878 (1.5911) [2022-01-22 20:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][820/1251] eta 0:15:56 lr 0.000434 time 1.9947 (2.2185) loss 3.5551 (3.4832) grad_norm 1.5448 (1.5921) [2022-01-22 20:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][830/1251] eta 0:15:32 lr 0.000434 time 1.6106 (2.2149) loss 4.2666 (3.4858) grad_norm 1.5128 (1.5919) [2022-01-22 20:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][840/1251] eta 0:15:10 lr 0.000434 time 2.8206 (2.2147) loss 4.2251 (3.4865) grad_norm 1.4148 (1.5924) [2022-01-22 20:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[163/300][850/1251] eta 0:14:47 lr 0.000434 time 2.3131 (2.2135) loss 3.7376 (3.4890) grad_norm 2.1505 (1.5929) [2022-01-22 20:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][860/1251] eta 0:14:24 lr 0.000434 time 1.8439 (2.2118) loss 3.7707 (3.4925) grad_norm 1.4698 (1.5934) [2022-01-22 20:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][870/1251] eta 0:14:03 lr 0.000434 time 1.9415 (2.2137) loss 4.0768 (3.4941) grad_norm 1.5214 (1.5931) [2022-01-22 20:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][880/1251] eta 0:13:41 lr 0.000434 time 2.1572 (2.2156) loss 2.5992 (3.4924) grad_norm 1.8530 (1.5948) [2022-01-22 20:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][890/1251] eta 0:13:19 lr 0.000434 time 2.4597 (2.2147) loss 3.9060 (3.4917) grad_norm 1.3252 (1.5958) [2022-01-22 20:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][900/1251] eta 0:12:57 lr 0.000434 time 2.0153 (2.2137) loss 4.1176 (3.4893) grad_norm 1.4586 (1.5952) [2022-01-22 20:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][910/1251] eta 0:12:34 lr 0.000434 time 2.2596 (2.2136) loss 2.8503 (3.4888) grad_norm 1.5540 (1.5939) [2022-01-22 20:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][920/1251] eta 0:12:12 lr 0.000434 time 2.4888 (2.2136) loss 4.0952 (3.4903) grad_norm 1.5846 (1.5940) [2022-01-22 20:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][930/1251] eta 0:11:50 lr 0.000434 time 2.5711 (2.2129) loss 3.9662 (3.4898) grad_norm 1.9215 (1.5938) [2022-01-22 20:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][940/1251] eta 0:11:27 lr 0.000434 time 1.6286 (2.2111) loss 3.7885 (3.4893) grad_norm 1.3630 (1.5943) [2022-01-22 20:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][950/1251] eta 0:11:05 lr 0.000434 time 2.1177 (2.2104) loss 3.8517 (3.4894) grad_norm 
1.4869 (1.5928) [2022-01-22 20:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][960/1251] eta 0:10:42 lr 0.000434 time 1.9054 (2.2096) loss 3.5979 (3.4894) grad_norm 1.4998 (1.5935) [2022-01-22 20:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][970/1251] eta 0:10:20 lr 0.000434 time 2.4543 (2.2097) loss 3.5729 (3.4904) grad_norm 1.4134 (1.5937) [2022-01-22 20:44:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][980/1251] eta 0:09:58 lr 0.000434 time 3.0606 (2.2085) loss 3.6549 (3.4928) grad_norm 1.5662 (1.5939) [2022-01-22 20:44:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][990/1251] eta 0:09:36 lr 0.000434 time 2.0110 (2.2069) loss 3.1495 (3.4925) grad_norm 2.0899 (1.5946) [2022-01-22 20:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1000/1251] eta 0:09:13 lr 0.000434 time 2.1377 (2.2057) loss 2.3236 (3.4903) grad_norm 1.9648 (1.5960) [2022-01-22 20:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1010/1251] eta 0:08:51 lr 0.000434 time 2.2325 (2.2061) loss 3.8891 (3.4876) grad_norm 1.4133 (1.5956) [2022-01-22 20:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1020/1251] eta 0:08:30 lr 0.000434 time 2.4763 (2.2085) loss 3.5050 (3.4866) grad_norm 1.4247 (1.5952) [2022-01-22 20:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1030/1251] eta 0:08:08 lr 0.000434 time 1.9787 (2.2097) loss 3.5136 (3.4873) grad_norm 1.8962 (1.5954) [2022-01-22 20:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1040/1251] eta 0:07:46 lr 0.000434 time 1.5804 (2.2097) loss 4.3791 (3.4887) grad_norm 1.5821 (1.5952) [2022-01-22 20:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1050/1251] eta 0:07:24 lr 0.000434 time 3.0304 (2.2096) loss 3.8448 (3.4885) grad_norm 1.5106 (1.5942) [2022-01-22 20:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[163/300][1060/1251] eta 0:07:01 lr 0.000433 time 2.1548 (2.2090) loss 3.4284 (3.4893) grad_norm 1.4351 (1.5937) [2022-01-22 20:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1070/1251] eta 0:06:39 lr 0.000433 time 1.6362 (2.2076) loss 3.8504 (3.4877) grad_norm 1.3790 (1.5930) [2022-01-22 20:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1080/1251] eta 0:06:17 lr 0.000433 time 2.2310 (2.2071) loss 3.5247 (3.4861) grad_norm 1.5891 (1.5926) [2022-01-22 20:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1090/1251] eta 0:05:55 lr 0.000433 time 3.0326 (2.2066) loss 3.1774 (3.4843) grad_norm 1.5652 (1.5916) [2022-01-22 20:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1100/1251] eta 0:05:33 lr 0.000433 time 1.9454 (2.2057) loss 4.2022 (3.4880) grad_norm 1.5336 (1.5909) [2022-01-22 20:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1110/1251] eta 0:05:11 lr 0.000433 time 2.8066 (2.2058) loss 3.4729 (3.4902) grad_norm 1.4532 (1.5905) [2022-01-22 20:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1120/1251] eta 0:04:48 lr 0.000433 time 2.0816 (2.2052) loss 3.2453 (3.4895) grad_norm 1.7769 (1.5902) [2022-01-22 20:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1130/1251] eta 0:04:26 lr 0.000433 time 1.9213 (2.2039) loss 3.3775 (3.4899) grad_norm 1.8204 (1.5900) [2022-01-22 20:49:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1140/1251] eta 0:04:04 lr 0.000433 time 2.0058 (2.2043) loss 2.7087 (3.4888) grad_norm 1.7436 (1.5900) [2022-01-22 20:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1150/1251] eta 0:03:42 lr 0.000433 time 2.4746 (2.2058) loss 3.9533 (3.4884) grad_norm 1.6617 (1.5913) [2022-01-22 20:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1160/1251] eta 0:03:20 lr 0.000433 time 1.8456 (2.2061) loss 3.2012 (3.4868) 
grad_norm 1.4782 (1.5916) [2022-01-22 20:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1170/1251] eta 0:02:58 lr 0.000433 time 2.6484 (2.2075) loss 3.9782 (3.4885) grad_norm 1.7319 (1.5917) [2022-01-22 20:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1180/1251] eta 0:02:36 lr 0.000433 time 2.2581 (2.2075) loss 3.4897 (3.4869) grad_norm 1.6160 (1.5920) [2022-01-22 20:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1190/1251] eta 0:02:14 lr 0.000433 time 1.5892 (2.2060) loss 3.4313 (3.4863) grad_norm 1.6631 (1.5919) [2022-01-22 20:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1200/1251] eta 0:01:52 lr 0.000433 time 1.8530 (2.2043) loss 3.5540 (3.4866) grad_norm 1.5144 (1.5914) [2022-01-22 20:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1210/1251] eta 0:01:30 lr 0.000433 time 2.0017 (2.2034) loss 4.1640 (3.4872) grad_norm 1.4358 (1.5912) [2022-01-22 20:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1220/1251] eta 0:01:08 lr 0.000433 time 1.9792 (2.2030) loss 3.7976 (3.4880) grad_norm 1.6956 (1.5921) [2022-01-22 20:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1230/1251] eta 0:00:46 lr 0.000433 time 2.2461 (2.2043) loss 3.9100 (3.4868) grad_norm 1.7799 (1.5926) [2022-01-22 20:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1240/1251] eta 0:00:24 lr 0.000433 time 1.1981 (2.2026) loss 3.2138 (3.4855) grad_norm 1.7019 (1.5927) [2022-01-22 20:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [163/300][1250/1251] eta 0:00:02 lr 0.000433 time 1.1813 (2.1975) loss 2.8267 (3.4863) grad_norm 1.3745 (1.5922) [2022-01-22 20:53:46 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 163 training takes 0:45:49 [2022-01-22 20:54:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.257 (18.257) Loss 0.9939 (0.9939) Acc@1 76.074 (76.074) 
Acc@5 93.848 (93.848) [2022-01-22 20:54:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.246 (3.197) Loss 0.9357 (0.9752) Acc@1 76.465 (76.740) Acc@5 94.824 (94.087) [2022-01-22 20:54:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.284 (2.486) Loss 1.0160 (0.9874) Acc@1 74.805 (76.483) Acc@5 93.652 (93.829) [2022-01-22 20:54:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.608 (2.268) Loss 1.0332 (0.9900) Acc@1 76.465 (76.537) Acc@5 93.359 (93.637) [2022-01-22 20:55:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.775 (2.163) Loss 1.0039 (0.9852) Acc@1 77.051 (76.686) Acc@5 93.457 (93.633) [2022-01-22 20:55:22 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.650 Acc@5 93.552 [2022-01-22 20:55:22 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-01-22 20:55:22 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.65% [2022-01-22 20:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][0/1251] eta 6:47:56 lr 0.000433 time 19.5656 (19.5656) loss 3.9414 (3.9414) grad_norm 1.4998 (1.4998) [2022-01-22 20:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][10/1251] eta 1:22:07 lr 0.000433 time 1.8413 (3.9707) loss 2.9159 (3.6456) grad_norm 1.4646 (1.5723) [2022-01-22 20:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][20/1251] eta 1:03:11 lr 0.000433 time 1.5649 (3.0800) loss 3.3758 (3.7073) grad_norm 1.4088 (1.6169) [2022-01-22 20:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][30/1251] eta 0:57:37 lr 0.000433 time 1.5617 (2.8315) loss 3.7629 (3.6615) grad_norm 1.7265 (1.6019) [2022-01-22 20:57:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][40/1251] eta 0:54:46 lr 0.000433 time 3.7700 (2.7140) loss 2.8103 (3.5743) grad_norm 1.4532 (1.6050) [2022-01-22 20:57:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][50/1251] eta 0:52:39 lr 0.000432 time 1.7975 (2.6307) loss 3.9251 (3.5810) grad_norm 1.5410 (1.5932) [2022-01-22 20:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][60/1251] eta 0:50:50 lr 0.000432 time 1.5802 (2.5616) loss 3.6413 (3.5445) grad_norm 1.4332 (1.5867) [2022-01-22 20:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][70/1251] eta 0:49:23 lr 0.000432 time 1.3055 (2.5096) loss 4.0971 (3.5621) grad_norm 1.6811 (1.5848) [2022-01-22 20:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][80/1251] eta 0:48:35 lr 0.000432 time 3.2171 (2.4897) loss 2.3709 (3.5638) grad_norm 1.5197 (1.5748) [2022-01-22 20:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][90/1251] eta 0:47:57 lr 0.000432 time 1.6850 (2.4781) loss 3.2640 (3.5707) grad_norm 1.3686 (1.5609) [2022-01-22 20:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][100/1251] eta 0:46:50 lr 0.000432 time 1.9466 (2.4418) loss 2.7477 (3.5735) grad_norm 1.4111 (1.5601) [2022-01-22 20:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][110/1251] eta 0:45:50 lr 0.000432 time 1.5963 (2.4107) loss 3.7912 (3.5644) grad_norm 1.8671 (1.5573) [2022-01-22 21:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][120/1251] eta 0:44:53 lr 0.000432 time 2.5790 (2.3813) loss 2.4926 (3.5534) grad_norm 1.6780 (1.5593) [2022-01-22 21:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][130/1251] eta 0:44:17 lr 0.000432 time 1.6454 (2.3711) loss 2.8210 (3.5450) grad_norm 1.5335 (1.5588) [2022-01-22 21:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][140/1251] eta 0:43:47 lr 0.000432 time 2.3027 (2.3649) loss 3.1645 (3.5483) grad_norm 1.6646 (1.5587) [2022-01-22 21:01:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][150/1251] eta 0:43:03 lr 0.000432 
time 1.8940 (2.3467) loss 4.3106 (3.5535) grad_norm 1.4082 (1.5605) [2022-01-22 21:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][160/1251] eta 0:42:24 lr 0.000432 time 2.2498 (2.3321) loss 2.6827 (3.5609) grad_norm 1.4583 (1.5602) [2022-01-22 21:02:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][170/1251] eta 0:41:58 lr 0.000432 time 2.1556 (2.3301) loss 3.5214 (3.5561) grad_norm 1.7045 (1.5644) [2022-01-22 21:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][180/1251] eta 0:41:28 lr 0.000432 time 1.5197 (2.3234) loss 4.0456 (3.5576) grad_norm 1.9132 (1.5701) [2022-01-22 21:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][190/1251] eta 0:40:59 lr 0.000432 time 2.2794 (2.3177) loss 3.8620 (3.5565) grad_norm 1.4932 (1.5707) [2022-01-22 21:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][200/1251] eta 0:40:23 lr 0.000432 time 2.1242 (2.3062) loss 3.5377 (3.5528) grad_norm 1.4940 (1.5755) [2022-01-22 21:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][210/1251] eta 0:39:49 lr 0.000432 time 2.5843 (2.2954) loss 4.0383 (3.5650) grad_norm 1.5822 (1.5792) [2022-01-22 21:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][220/1251] eta 0:39:21 lr 0.000432 time 1.9424 (2.2909) loss 2.7315 (3.5600) grad_norm 1.5601 (1.5811) [2022-01-22 21:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][230/1251] eta 0:38:56 lr 0.000432 time 2.0411 (2.2888) loss 3.9234 (3.5597) grad_norm 1.4944 (1.5789) [2022-01-22 21:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][240/1251] eta 0:38:30 lr 0.000432 time 2.8760 (2.2853) loss 3.9394 (3.5507) grad_norm 1.6260 (1.5807) [2022-01-22 21:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][250/1251] eta 0:38:02 lr 0.000432 time 2.7850 (2.2807) loss 3.8875 (3.5472) grad_norm 1.5814 (1.5825) [2022-01-22 21:05:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][260/1251] eta 0:37:33 lr 0.000432 time 1.5745 (2.2739) loss 3.4495 (3.5416) grad_norm 1.8855 (1.5847) [2022-01-22 21:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][270/1251] eta 0:37:11 lr 0.000432 time 2.2831 (2.2751) loss 2.8822 (3.5291) grad_norm 1.4455 (1.5821) [2022-01-22 21:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][280/1251] eta 0:36:47 lr 0.000432 time 1.9347 (2.2733) loss 2.7465 (3.5280) grad_norm 1.5601 (1.5812) [2022-01-22 21:06:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][290/1251] eta 0:36:22 lr 0.000432 time 3.0871 (2.2714) loss 3.3794 (3.5209) grad_norm 1.7394 (1.5838) [2022-01-22 21:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][300/1251] eta 0:35:51 lr 0.000431 time 1.7747 (2.2624) loss 2.9708 (3.5182) grad_norm 1.6247 (1.5868) [2022-01-22 21:07:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][310/1251] eta 0:35:29 lr 0.000431 time 1.6718 (2.2634) loss 3.3978 (3.5158) grad_norm 1.5302 (1.5857) [2022-01-22 21:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][320/1251] eta 0:35:07 lr 0.000431 time 1.7885 (2.2636) loss 3.5104 (3.5082) grad_norm 1.6180 (1.5857) [2022-01-22 21:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][330/1251] eta 0:34:43 lr 0.000431 time 2.8335 (2.2618) loss 3.7672 (3.5149) grad_norm 1.7861 (1.5882) [2022-01-22 21:08:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][340/1251] eta 0:34:14 lr 0.000431 time 1.8614 (2.2547) loss 2.9058 (3.5055) grad_norm 1.4530 (1.5860) [2022-01-22 21:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][350/1251] eta 0:33:47 lr 0.000431 time 1.9025 (2.2506) loss 4.0296 (3.5129) grad_norm 1.3988 (1.5856) [2022-01-22 21:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][360/1251] eta 0:33:26 lr 
0.000431 time 2.1206 (2.2521) loss 3.0386 (3.5043) grad_norm 1.4313 (1.5865) [2022-01-22 21:09:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][370/1251] eta 0:33:00 lr 0.000431 time 1.9791 (2.2477) loss 4.1560 (3.5037) grad_norm 1.5253 (1.5858) [2022-01-22 21:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][380/1251] eta 0:32:30 lr 0.000431 time 1.6237 (2.2393) loss 2.4973 (3.5039) grad_norm 1.6603 (1.5861) [2022-01-22 21:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][390/1251] eta 0:32:05 lr 0.000431 time 1.7821 (2.2360) loss 2.4888 (3.5016) grad_norm 1.8372 (1.5900) [2022-01-22 21:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][400/1251] eta 0:31:40 lr 0.000431 time 1.5663 (2.2335) loss 3.8879 (3.5076) grad_norm 1.6883 (1.5901) [2022-01-22 21:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][410/1251] eta 0:31:18 lr 0.000431 time 2.2343 (2.2333) loss 3.7873 (3.5087) grad_norm 1.5813 (1.5907) [2022-01-22 21:11:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][420/1251] eta 0:30:54 lr 0.000431 time 2.2520 (2.2319) loss 2.7413 (3.5111) grad_norm 1.5507 (1.5921) [2022-01-22 21:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][430/1251] eta 0:30:35 lr 0.000431 time 3.1521 (2.2359) loss 4.0285 (3.5111) grad_norm 1.3141 (1.5899) [2022-01-22 21:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][440/1251] eta 0:30:14 lr 0.000431 time 2.2322 (2.2374) loss 3.9433 (3.5120) grad_norm 1.6160 (1.5909) [2022-01-22 21:12:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][450/1251] eta 0:29:51 lr 0.000431 time 2.3930 (2.2368) loss 2.6294 (3.5131) grad_norm 1.5336 (1.5886) [2022-01-22 21:12:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][460/1251] eta 0:29:26 lr 0.000431 time 2.2624 (2.2336) loss 3.4416 (3.5193) grad_norm 1.5464 (1.5874) [2022-01-22 21:12:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][470/1251] eta 0:29:02 lr 0.000431 time 2.6240 (2.2312) loss 2.3064 (3.5161) grad_norm 1.5639 (1.5865) [2022-01-22 21:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][480/1251] eta 0:28:38 lr 0.000431 time 2.4747 (2.2285) loss 3.6327 (3.5133) grad_norm 1.5263 (1.5850) [2022-01-22 21:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][490/1251] eta 0:28:15 lr 0.000431 time 2.1187 (2.2276) loss 2.3081 (3.5080) grad_norm 1.4369 (1.5839) [2022-01-22 21:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][500/1251] eta 0:27:50 lr 0.000431 time 2.3471 (2.2245) loss 3.0609 (3.5075) grad_norm 1.4332 (1.5829) [2022-01-22 21:14:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][510/1251] eta 0:27:29 lr 0.000431 time 2.3695 (2.2259) loss 3.7501 (3.5032) grad_norm 1.4708 (1.5846) [2022-01-22 21:14:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][520/1251] eta 0:27:07 lr 0.000431 time 2.3732 (2.2265) loss 3.7308 (3.5024) grad_norm 1.5596 (1.5857) [2022-01-22 21:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][530/1251] eta 0:26:44 lr 0.000431 time 1.8642 (2.2261) loss 2.6389 (3.4994) grad_norm 1.5419 (1.5849) [2022-01-22 21:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][540/1251] eta 0:26:20 lr 0.000430 time 2.1316 (2.2235) loss 3.3406 (3.5000) grad_norm 1.6163 (1.5854) [2022-01-22 21:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][550/1251] eta 0:25:57 lr 0.000430 time 1.9161 (2.2220) loss 4.1218 (3.5037) grad_norm 1.5457 (1.5860) [2022-01-22 21:16:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][560/1251] eta 0:25:36 lr 0.000430 time 2.5187 (2.2242) loss 3.3412 (3.5033) grad_norm 1.5735 (1.5845) [2022-01-22 21:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][570/1251] eta 0:25:14 lr 
0.000430 time 2.0039 (2.2235) loss 2.7120 (3.5003) grad_norm 1.8724 (1.5849) [2022-01-22 21:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][580/1251] eta 0:24:49 lr 0.000430 time 2.0984 (2.2194) loss 4.1057 (3.5013) grad_norm 1.5286 (1.5849) [2022-01-22 21:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][590/1251] eta 0:24:25 lr 0.000430 time 1.9957 (2.2169) loss 3.5485 (3.5011) grad_norm 1.5931 (1.5853) [2022-01-22 21:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][600/1251] eta 0:24:01 lr 0.000430 time 1.5867 (2.2136) loss 4.3631 (3.5040) grad_norm 1.6634 (1.5846) [2022-01-22 21:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][610/1251] eta 0:23:38 lr 0.000430 time 2.5059 (2.2126) loss 2.8340 (3.5002) grad_norm 1.6847 (1.5845) [2022-01-22 21:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][620/1251] eta 0:23:16 lr 0.000430 time 1.8768 (2.2124) loss 3.7054 (3.5015) grad_norm 1.5126 (1.5837) [2022-01-22 21:18:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][630/1251] eta 0:22:55 lr 0.000430 time 2.4717 (2.2154) loss 3.0203 (3.5002) grad_norm 1.7442 (1.5846) [2022-01-22 21:19:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][640/1251] eta 0:22:35 lr 0.000430 time 2.3794 (2.2180) loss 4.0245 (3.5026) grad_norm 1.9149 (1.5845) [2022-01-22 21:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][650/1251] eta 0:22:15 lr 0.000430 time 2.8117 (2.2214) loss 3.6852 (3.5058) grad_norm 1.4865 (1.5846) [2022-01-22 21:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][660/1251] eta 0:21:52 lr 0.000430 time 1.8212 (2.2212) loss 4.0507 (3.5031) grad_norm 1.4785 (1.5851) [2022-01-22 21:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][670/1251] eta 0:21:30 lr 0.000430 time 2.1701 (2.2207) loss 3.7353 (3.5056) grad_norm 1.7144 (1.5860) [2022-01-22 21:20:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][680/1251] eta 0:21:05 lr 0.000430 time 1.9517 (2.2161) loss 3.1863 (3.5013) grad_norm 1.4501 (1.5857) [2022-01-22 21:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][690/1251] eta 0:20:42 lr 0.000430 time 2.2073 (2.2140) loss 2.2481 (3.5023) grad_norm 1.5006 (1.5849) [2022-01-22 21:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][700/1251] eta 0:20:20 lr 0.000430 time 1.9398 (2.2145) loss 3.8444 (3.5020) grad_norm 1.5463 (1.5846) [2022-01-22 21:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][710/1251] eta 0:19:59 lr 0.000430 time 2.5089 (2.2177) loss 3.3853 (3.5036) grad_norm 1.7610 (1.5844) [2022-01-22 21:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][720/1251] eta 0:19:37 lr 0.000430 time 1.8974 (2.2178) loss 3.7054 (3.5029) grad_norm 1.4509 (1.5843) [2022-01-22 21:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][730/1251] eta 0:19:14 lr 0.000430 time 2.2307 (2.2163) loss 4.3474 (3.5068) grad_norm 1.5000 (1.5837) [2022-01-22 21:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][740/1251] eta 0:18:50 lr 0.000430 time 2.0081 (2.2125) loss 3.6869 (3.5080) grad_norm 1.7097 (1.5840) [2022-01-22 21:23:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][750/1251] eta 0:18:28 lr 0.000430 time 2.2226 (2.2126) loss 3.3761 (3.5124) grad_norm 1.4980 (1.5849) [2022-01-22 21:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][760/1251] eta 0:18:07 lr 0.000430 time 2.4825 (2.2139) loss 4.0072 (3.5107) grad_norm 1.7610 (1.5854) [2022-01-22 21:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][770/1251] eta 0:17:44 lr 0.000430 time 1.9033 (2.2141) loss 3.2192 (3.5105) grad_norm 1.6071 (1.5849) [2022-01-22 21:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][780/1251] eta 0:17:22 lr 
0.000429 time 1.6363 (2.2123) loss 2.1704 (3.5097) grad_norm 1.4653 (1.5847) [2022-01-22 21:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][790/1251] eta 0:16:59 lr 0.000429 time 3.0498 (2.2124) loss 3.7830 (3.5077) grad_norm 1.6182 (1.5846) [2022-01-22 21:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][800/1251] eta 0:16:36 lr 0.000429 time 1.9161 (2.2104) loss 3.1746 (3.5083) grad_norm 1.3801 (1.5854) [2022-01-22 21:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][810/1251] eta 0:16:14 lr 0.000429 time 2.1483 (2.2101) loss 3.1649 (3.5076) grad_norm 1.4056 (1.5859) [2022-01-22 21:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][820/1251] eta 0:15:53 lr 0.000429 time 1.7456 (2.2114) loss 3.9692 (3.5035) grad_norm 1.6151 (1.5864) [2022-01-22 21:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][830/1251] eta 0:15:31 lr 0.000429 time 3.1854 (2.2131) loss 2.3467 (3.5033) grad_norm 1.6867 (1.5870) [2022-01-22 21:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][840/1251] eta 0:15:09 lr 0.000429 time 2.3662 (2.2137) loss 3.8864 (3.5050) grad_norm 1.3762 (1.5881) [2022-01-22 21:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][850/1251] eta 0:14:47 lr 0.000429 time 1.9004 (2.2129) loss 3.5774 (3.5046) grad_norm 1.4901 (1.5879) [2022-01-22 21:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][860/1251] eta 0:14:24 lr 0.000429 time 2.1797 (2.2116) loss 3.0377 (3.5051) grad_norm 1.4052 (1.5876) [2022-01-22 21:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][870/1251] eta 0:14:01 lr 0.000429 time 1.9747 (2.2083) loss 3.8195 (3.5054) grad_norm 1.4688 (1.5872) [2022-01-22 21:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][880/1251] eta 0:13:38 lr 0.000429 time 2.3037 (2.2066) loss 3.4839 (3.5070) grad_norm 1.8332 (1.5877) [2022-01-22 21:28:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][890/1251] eta 0:13:15 lr 0.000429 time 1.8995 (2.2046) loss 3.0809 (3.5088) grad_norm 1.6817 (1.5876) [2022-01-22 21:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][900/1251] eta 0:12:53 lr 0.000429 time 2.2384 (2.2030) loss 3.9638 (3.5117) grad_norm 1.3184 (1.5870) [2022-01-22 21:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][910/1251] eta 0:12:31 lr 0.000429 time 2.3122 (2.2030) loss 3.4419 (3.5094) grad_norm 1.5945 (1.5865) [2022-01-22 21:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][920/1251] eta 0:12:09 lr 0.000429 time 2.5779 (2.2038) loss 3.9458 (3.5097) grad_norm 1.7062 (1.5864) [2022-01-22 21:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][930/1251] eta 0:11:47 lr 0.000429 time 2.1910 (2.2048) loss 2.9180 (3.5105) grad_norm 1.5866 (1.5873) [2022-01-22 21:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][940/1251] eta 0:11:26 lr 0.000429 time 2.4335 (2.2069) loss 2.5954 (3.5122) grad_norm 1.7992 (1.5878) [2022-01-22 21:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][950/1251] eta 0:11:04 lr 0.000429 time 2.2991 (2.2074) loss 2.3807 (3.5113) grad_norm 1.5082 (1.5884) [2022-01-22 21:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][960/1251] eta 0:10:42 lr 0.000429 time 2.5523 (2.2088) loss 3.9215 (3.5089) grad_norm 1.6020 (1.5879) [2022-01-22 21:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][970/1251] eta 0:10:20 lr 0.000429 time 2.1157 (2.2082) loss 3.5570 (3.5108) grad_norm 1.8392 (1.5886) [2022-01-22 21:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][980/1251] eta 0:09:58 lr 0.000429 time 1.6235 (2.2082) loss 3.4003 (3.5086) grad_norm 1.8102 (1.5900) [2022-01-22 21:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][990/1251] eta 0:09:36 lr 
0.000429 time 1.8218 (2.2076) loss 3.1765 (3.5073) grad_norm 1.7666 (1.5900) [2022-01-22 21:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1000/1251] eta 0:09:14 lr 0.000429 time 1.9581 (2.2079) loss 2.3508 (3.5034) grad_norm 1.7062 (1.5909) [2022-01-22 21:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1010/1251] eta 0:08:51 lr 0.000429 time 1.8922 (2.2057) loss 2.2489 (3.5006) grad_norm 1.4176 (1.5916) [2022-01-22 21:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1020/1251] eta 0:08:29 lr 0.000429 time 1.9753 (2.2051) loss 3.6637 (3.5034) grad_norm 1.4710 (1.5915) [2022-01-22 21:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1030/1251] eta 0:08:07 lr 0.000428 time 1.5622 (2.2037) loss 3.7854 (3.5029) grad_norm 1.6380 (1.5914) [2022-01-22 21:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1040/1251] eta 0:07:45 lr 0.000428 time 2.1565 (2.2045) loss 4.0242 (3.5032) grad_norm 1.8558 (1.5913) [2022-01-22 21:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1050/1251] eta 0:07:23 lr 0.000428 time 2.0135 (2.2053) loss 2.5437 (3.5028) grad_norm 1.4834 (1.5911) [2022-01-22 21:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1060/1251] eta 0:07:01 lr 0.000428 time 2.4371 (2.2053) loss 3.7667 (3.5042) grad_norm 1.7456 (1.5913) [2022-01-22 21:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1070/1251] eta 0:06:38 lr 0.000428 time 1.6009 (2.2039) loss 3.0455 (3.5043) grad_norm 1.6403 (1.5908) [2022-01-22 21:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1080/1251] eta 0:06:17 lr 0.000428 time 1.9708 (2.2049) loss 3.8740 (3.5023) grad_norm 1.4449 (1.5906) [2022-01-22 21:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1090/1251] eta 0:05:54 lr 0.000428 time 2.2033 (2.2040) loss 2.4400 (3.4997) grad_norm 1.6055 (1.5906) [2022-01-22 
21:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1100/1251] eta 0:05:32 lr 0.000428 time 2.2826 (2.2044) loss 4.0490 (3.4994) grad_norm 1.5854 (1.5908) [2022-01-22 21:36:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1110/1251] eta 0:05:10 lr 0.000428 time 1.8977 (2.2034) loss 3.5371 (3.5005) grad_norm 1.6399 (1.5904) [2022-01-22 21:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1120/1251] eta 0:04:48 lr 0.000428 time 1.9915 (2.2033) loss 3.6141 (3.5020) grad_norm 1.5846 (1.5905) [2022-01-22 21:36:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1130/1251] eta 0:04:26 lr 0.000428 time 2.3189 (2.2020) loss 3.7417 (3.5047) grad_norm 1.8417 (1.5915) [2022-01-22 21:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1140/1251] eta 0:04:04 lr 0.000428 time 1.9482 (2.2015) loss 2.9441 (3.5033) grad_norm 1.7040 (1.5915) [2022-01-22 21:37:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1150/1251] eta 0:03:42 lr 0.000428 time 1.7995 (2.2009) loss 3.0471 (3.5051) grad_norm 1.5972 (1.5913) [2022-01-22 21:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1160/1251] eta 0:03:20 lr 0.000428 time 2.2856 (2.2003) loss 3.0416 (3.5035) grad_norm 1.6989 (1.5919) [2022-01-22 21:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1170/1251] eta 0:02:58 lr 0.000428 time 1.9016 (2.2005) loss 2.8764 (3.5028) grad_norm 1.6369 (1.5919) [2022-01-22 21:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1180/1251] eta 0:02:36 lr 0.000428 time 2.0136 (2.2005) loss 3.6396 (3.5033) grad_norm 1.6875 (1.5919) [2022-01-22 21:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1190/1251] eta 0:02:14 lr 0.000428 time 1.9097 (2.2016) loss 3.5123 (3.5021) grad_norm 1.3918 (1.5914) [2022-01-22 21:39:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1200/1251] 
eta 0:01:52 lr 0.000428 time 1.9910 (2.2020) loss 3.2605 (3.5013) grad_norm 1.5757 (1.5908) [2022-01-22 21:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1210/1251] eta 0:01:30 lr 0.000428 time 2.1931 (2.2017) loss 3.5906 (3.5035) grad_norm 1.4494 (1.5915) [2022-01-22 21:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1220/1251] eta 0:01:08 lr 0.000428 time 1.8764 (2.2006) loss 3.2474 (3.5038) grad_norm 1.4866 (1.5922) [2022-01-22 21:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1230/1251] eta 0:00:46 lr 0.000428 time 1.9376 (2.1992) loss 4.1510 (3.5073) grad_norm 1.5397 (1.5926) [2022-01-22 21:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1240/1251] eta 0:00:24 lr 0.000428 time 1.7984 (2.1985) loss 4.0654 (3.5077) grad_norm 1.5442 (1.5926) [2022-01-22 21:41:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [164/300][1250/1251] eta 0:00:02 lr 0.000428 time 1.2945 (2.1931) loss 2.6523 (3.5075) grad_norm 1.4129 (1.5925) [2022-01-22 21:41:06 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 164 training takes 0:45:44 [2022-01-22 21:41:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.099 (18.099) Loss 0.9847 (0.9847) Acc@1 76.367 (76.367) Acc@5 93.262 (93.262) [2022-01-22 21:41:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.018 (3.178) Loss 1.0550 (0.9989) Acc@1 76.465 (76.420) Acc@5 91.797 (93.164) [2022-01-22 21:42:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.594 (2.641) Loss 1.0388 (0.9913) Acc@1 75.488 (76.567) Acc@5 92.676 (93.355) [2022-01-22 21:42:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.305 (2.363) Loss 0.9714 (0.9934) Acc@1 76.465 (76.654) Acc@5 94.043 (93.466) [2022-01-22 21:42:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.756 (2.185) Loss 0.9198 (0.9927) Acc@1 77.930 (76.620) Acc@5 94.238 (93.526) 
[2022-01-22 21:42:43 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.644 Acc@5 93.596 [2022-01-22 21:42:43 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.6% [2022-01-22 21:42:43 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.65% [2022-01-22 21:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][0/1251] eta 7:37:55 lr 0.000428 time 21.9625 (21.9625) loss 3.2232 (3.2232) grad_norm 1.6375 (1.6375) [2022-01-22 21:43:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][10/1251] eta 1:22:53 lr 0.000428 time 2.8980 (4.0075) loss 3.9464 (3.4119) grad_norm 1.4874 (1.5617) [2022-01-22 21:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][20/1251] eta 1:03:35 lr 0.000427 time 2.1838 (3.0997) loss 2.9105 (3.3778) grad_norm 1.5503 (1.5340) [2022-01-22 21:44:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][30/1251] eta 0:56:24 lr 0.000427 time 1.6565 (2.7722) loss 3.7441 (3.4682) grad_norm 1.7703 (1.5407) [2022-01-22 21:44:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][40/1251] eta 0:56:35 lr 0.000427 time 7.1801 (2.8036) loss 3.6211 (3.4480) grad_norm 1.7532 (1.5884) [2022-01-22 21:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][50/1251] eta 0:54:16 lr 0.000427 time 1.9961 (2.7112) loss 3.0820 (3.4542) grad_norm 1.8952 (1.6335) [2022-01-22 21:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][60/1251] eta 0:52:16 lr 0.000427 time 2.1747 (2.6338) loss 3.1007 (3.4737) grad_norm 1.5862 (1.6310) [2022-01-22 21:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][70/1251] eta 0:50:04 lr 0.000427 time 1.8060 (2.5442) loss 3.2743 (3.4950) grad_norm 1.5903 (1.6350) [2022-01-22 21:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][80/1251] eta 0:48:13 lr 0.000427 time 2.2339 (2.4707) loss 3.1681 (3.5138) 
grad_norm 1.7607 (1.6314) [2022-01-22 21:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][90/1251] eta 0:47:07 lr 0.000427 time 1.6170 (2.4350) loss 3.5925 (3.5092) grad_norm 1.5669 (1.6299) [2022-01-22 21:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][100/1251] eta 0:46:22 lr 0.000427 time 1.6135 (2.4171) loss 3.4751 (3.5225) grad_norm 1.8708 (1.6346) [2022-01-22 21:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][110/1251] eta 0:45:28 lr 0.000427 time 2.2025 (2.3914) loss 4.0200 (3.4973) grad_norm 1.5494 (1.6288) [2022-01-22 21:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][120/1251] eta 0:44:47 lr 0.000427 time 1.9725 (2.3759) loss 3.4093 (3.4826) grad_norm 1.4407 (1.6227) [2022-01-22 21:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][130/1251] eta 0:43:50 lr 0.000427 time 1.9042 (2.3468) loss 3.7442 (3.4704) grad_norm 1.8369 (1.6275) [2022-01-22 21:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][140/1251] eta 0:43:24 lr 0.000427 time 3.0258 (2.3442) loss 3.7559 (3.4637) grad_norm 1.9075 (1.6242) [2022-01-22 21:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][150/1251] eta 0:42:56 lr 0.000427 time 2.1484 (2.3400) loss 3.0708 (3.4620) grad_norm 1.4554 (1.6190) [2022-01-22 21:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][160/1251] eta 0:42:24 lr 0.000427 time 1.5268 (2.3322) loss 4.0883 (3.4691) grad_norm 1.5048 (1.6183) [2022-01-22 21:49:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][170/1251] eta 0:41:56 lr 0.000427 time 2.1559 (2.3280) loss 3.3585 (3.4804) grad_norm 1.4228 (1.6161) [2022-01-22 21:49:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][180/1251] eta 0:41:30 lr 0.000427 time 2.9927 (2.3258) loss 3.5608 (3.4813) grad_norm 2.9488 (1.6186) [2022-01-22 21:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [165/300][190/1251] eta 0:40:54 lr 0.000427 time 1.8807 (2.3136) loss 4.0357 (3.4808) grad_norm 1.5718 (1.6195) [2022-01-22 21:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][200/1251] eta 0:40:21 lr 0.000427 time 1.8479 (2.3042) loss 3.3133 (3.4910) grad_norm 1.7646 (1.6197) [2022-01-22 21:50:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][210/1251] eta 0:39:45 lr 0.000427 time 1.8413 (2.2913) loss 3.5317 (3.4985) grad_norm 1.8349 (1.6215) [2022-01-22 21:51:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][220/1251] eta 0:39:25 lr 0.000427 time 4.2432 (2.2941) loss 3.5558 (3.4902) grad_norm 1.5217 (1.6174) [2022-01-22 21:51:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][230/1251] eta 0:39:05 lr 0.000427 time 2.8938 (2.2975) loss 3.6322 (3.5013) grad_norm 1.8192 (1.6167) [2022-01-22 21:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][240/1251] eta 0:38:39 lr 0.000427 time 2.1080 (2.2944) loss 3.8055 (3.4935) grad_norm 1.3870 (1.6154) [2022-01-22 21:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][250/1251] eta 0:38:09 lr 0.000427 time 1.9730 (2.2872) loss 3.7426 (3.4963) grad_norm 2.0224 (1.6131) [2022-01-22 21:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][260/1251] eta 0:37:46 lr 0.000427 time 3.1630 (2.2874) loss 3.1649 (3.5067) grad_norm 1.4953 (1.6106) [2022-01-22 21:53:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][270/1251] eta 0:37:12 lr 0.000426 time 1.7752 (2.2758) loss 2.1497 (3.5022) grad_norm 1.3861 (1.6124) [2022-01-22 21:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][280/1251] eta 0:36:43 lr 0.000426 time 2.1140 (2.2689) loss 3.0335 (3.4996) grad_norm 1.4101 (1.6092) [2022-01-22 21:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][290/1251] eta 0:36:14 lr 0.000426 time 1.8523 (2.2629) loss 3.7608 (3.5020) 
grad_norm 1.6468 (1.6080) [2022-01-22 21:54:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][300/1251] eta 0:35:51 lr 0.000426 time 2.5201 (2.2623) loss 2.5891 (3.4937) grad_norm 1.4692 (1.6062) [2022-01-22 21:54:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][310/1251] eta 0:35:24 lr 0.000426 time 2.0906 (2.2579) loss 3.9004 (3.4979) grad_norm 1.8997 (1.6061) [2022-01-22 21:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][320/1251] eta 0:35:01 lr 0.000426 time 2.7155 (2.2571) loss 3.8925 (3.4990) grad_norm 1.5367 (1.6084) [2022-01-22 21:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][330/1251] eta 0:34:36 lr 0.000426 time 2.5402 (2.2549) loss 3.8428 (3.4930) grad_norm 1.8179 (1.6121) [2022-01-22 21:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][340/1251] eta 0:34:12 lr 0.000426 time 2.1572 (2.2526) loss 3.4681 (3.4977) grad_norm 1.5929 (1.6114) [2022-01-22 21:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][350/1251] eta 0:33:44 lr 0.000426 time 1.4968 (2.2472) loss 3.6074 (3.4946) grad_norm 1.7420 (1.6117) [2022-01-22 21:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][360/1251] eta 0:33:20 lr 0.000426 time 2.2965 (2.2452) loss 3.8568 (3.4921) grad_norm 1.5317 (1.6128) [2022-01-22 21:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][370/1251] eta 0:32:55 lr 0.000426 time 2.2497 (2.2419) loss 2.5457 (3.4841) grad_norm 1.9673 (1.6147) [2022-01-22 21:56:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][380/1251] eta 0:32:31 lr 0.000426 time 2.5265 (2.2408) loss 3.6118 (3.4851) grad_norm 1.4736 (1.6136) [2022-01-22 21:57:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][390/1251] eta 0:32:08 lr 0.000426 time 1.5778 (2.2397) loss 4.2114 (3.4964) grad_norm 1.9355 (1.6140) [2022-01-22 21:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [165/300][400/1251] eta 0:31:48 lr 0.000426 time 2.8704 (2.2425) loss 2.8277 (3.4924) grad_norm 1.6189 (1.6145) [2022-01-22 21:58:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][410/1251] eta 0:31:26 lr 0.000426 time 3.6968 (2.2437) loss 3.7009 (3.4938) grad_norm 1.9770 (1.6163) [2022-01-22 21:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][420/1251] eta 0:31:04 lr 0.000426 time 2.5753 (2.2436) loss 3.8100 (3.4960) grad_norm 1.7554 (1.6166) [2022-01-22 21:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][430/1251] eta 0:30:39 lr 0.000426 time 1.5666 (2.2406) loss 4.2635 (3.4985) grad_norm 1.5581 (1.6141) [2022-01-22 21:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][440/1251] eta 0:30:13 lr 0.000426 time 1.6845 (2.2361) loss 3.4159 (3.4928) grad_norm 1.5057 (1.6132) [2022-01-22 21:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][450/1251] eta 0:29:47 lr 0.000426 time 2.8526 (2.2321) loss 2.6322 (3.4958) grad_norm 1.5005 (1.6130) [2022-01-22 21:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][460/1251] eta 0:29:21 lr 0.000426 time 2.0076 (2.2265) loss 3.4965 (3.4971) grad_norm 1.5309 (1.6136) [2022-01-22 22:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][470/1251] eta 0:29:00 lr 0.000426 time 2.4996 (2.2280) loss 2.9937 (3.4979) grad_norm 1.6511 (1.6132) [2022-01-22 22:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][480/1251] eta 0:28:39 lr 0.000426 time 1.9146 (2.2296) loss 2.7958 (3.4967) grad_norm 1.9221 (1.6135) [2022-01-22 22:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][490/1251] eta 0:28:19 lr 0.000426 time 2.6095 (2.2330) loss 3.2093 (3.5011) grad_norm 1.5613 (1.6122) [2022-01-22 22:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][500/1251] eta 0:28:00 lr 0.000426 time 2.2153 (2.2383) loss 3.9931 (3.5008) 
grad_norm 1.5910 (1.6117) [2022-01-22 22:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][510/1251] eta 0:27:38 lr 0.000425 time 2.0117 (2.2382) loss 3.5667 (3.4981) grad_norm 1.3869 (1.6105) [2022-01-22 22:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][520/1251] eta 0:27:12 lr 0.000425 time 1.7217 (2.2331) loss 3.9614 (3.5015) grad_norm 1.5483 (1.6086) [2022-01-22 22:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][530/1251] eta 0:26:45 lr 0.000425 time 1.7783 (2.2270) loss 2.3391 (3.4975) grad_norm 1.5827 (1.6079) [2022-01-22 22:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][540/1251] eta 0:26:20 lr 0.000425 time 2.0521 (2.2231) loss 3.5851 (3.4982) grad_norm 1.4718 (1.6075) [2022-01-22 22:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][550/1251] eta 0:25:55 lr 0.000425 time 1.7774 (2.2191) loss 4.0501 (3.4968) grad_norm 1.6389 (1.6065) [2022-01-22 22:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][560/1251] eta 0:25:33 lr 0.000425 time 2.4741 (2.2190) loss 4.1970 (3.4944) grad_norm 1.7943 (1.6063) [2022-01-22 22:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][570/1251] eta 0:25:10 lr 0.000425 time 2.0661 (2.2179) loss 3.5057 (3.4982) grad_norm 1.8219 (1.6076) [2022-01-22 22:04:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][580/1251] eta 0:24:49 lr 0.000425 time 2.1689 (2.2199) loss 2.6657 (3.4953) grad_norm 1.5150 (1.6064) [2022-01-22 22:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][590/1251] eta 0:24:29 lr 0.000425 time 3.1660 (2.2228) loss 3.7069 (3.4971) grad_norm 1.6779 (1.6075) [2022-01-22 22:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][600/1251] eta 0:24:08 lr 0.000425 time 3.0025 (2.2251) loss 3.6879 (3.4988) grad_norm 1.3811 (1.6065) [2022-01-22 22:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [165/300][610/1251] eta 0:23:47 lr 0.000425 time 2.5806 (2.2273) loss 3.1557 (3.4980) grad_norm 1.7183 (1.6064) [2022-01-22 22:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][620/1251] eta 0:23:23 lr 0.000425 time 1.6462 (2.2242) loss 2.3194 (3.4943) grad_norm 1.8358 (1.6070) [2022-01-22 22:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][630/1251] eta 0:23:00 lr 0.000425 time 1.8200 (2.2227) loss 4.2771 (3.4961) grad_norm 1.7079 (1.6074) [2022-01-22 22:06:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][640/1251] eta 0:22:36 lr 0.000425 time 1.6378 (2.2195) loss 4.0647 (3.4964) grad_norm 1.5575 (1.6076) [2022-01-22 22:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][650/1251] eta 0:22:15 lr 0.000425 time 3.3090 (2.2217) loss 2.5133 (3.4972) grad_norm 1.4421 (1.6064) [2022-01-22 22:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][660/1251] eta 0:21:53 lr 0.000425 time 1.8461 (2.2223) loss 3.4364 (3.4998) grad_norm 1.4588 (1.6060) [2022-01-22 22:07:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][670/1251] eta 0:21:31 lr 0.000425 time 2.3437 (2.2223) loss 3.9474 (3.5034) grad_norm 1.7567 (1.6073) [2022-01-22 22:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][680/1251] eta 0:21:07 lr 0.000425 time 2.2143 (2.2199) loss 3.1113 (3.5052) grad_norm 1.7078 (1.6081) [2022-01-22 22:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][690/1251] eta 0:20:45 lr 0.000425 time 2.8615 (2.2200) loss 4.0089 (3.5037) grad_norm 1.5179 (1.6079) [2022-01-22 22:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][700/1251] eta 0:20:23 lr 0.000425 time 1.9341 (2.2199) loss 3.8565 (3.5065) grad_norm 1.6061 (1.6083) [2022-01-22 22:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][710/1251] eta 0:20:00 lr 0.000425 time 2.1764 (2.2193) loss 4.0472 (3.5078) 
grad_norm 1.6073 (1.6099) [2022-01-22 22:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][720/1251] eta 0:19:38 lr 0.000425 time 2.1297 (2.2196) loss 3.9370 (3.5057) grad_norm 1.5723 (1.6096) [2022-01-22 22:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][730/1251] eta 0:19:17 lr 0.000425 time 1.8543 (2.2213) loss 2.9825 (3.5056) grad_norm 1.5957 (1.6098) [2022-01-22 22:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][740/1251] eta 0:18:53 lr 0.000425 time 1.6197 (2.2178) loss 3.7267 (3.5047) grad_norm 1.5527 (1.6093) [2022-01-22 22:10:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][750/1251] eta 0:18:29 lr 0.000424 time 1.7946 (2.2150) loss 3.2821 (3.5058) grad_norm 1.5212 (1.6105) [2022-01-22 22:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][760/1251] eta 0:18:07 lr 0.000424 time 2.1382 (2.2144) loss 3.5097 (3.5045) grad_norm 1.4558 (1.6104) [2022-01-22 22:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][770/1251] eta 0:17:46 lr 0.000424 time 2.6273 (2.2166) loss 3.7739 (3.5048) grad_norm 1.5559 (1.6107) [2022-01-22 22:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][780/1251] eta 0:17:23 lr 0.000424 time 2.1862 (2.2148) loss 4.4693 (3.5054) grad_norm 1.7976 (1.6116) [2022-01-22 22:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][790/1251] eta 0:17:00 lr 0.000424 time 1.8652 (2.2139) loss 3.5090 (3.5065) grad_norm 1.8980 (1.6129) [2022-01-22 22:12:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][800/1251] eta 0:16:38 lr 0.000424 time 1.6670 (2.2130) loss 3.2771 (3.5061) grad_norm 1.5838 (1.6128) [2022-01-22 22:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][810/1251] eta 0:16:16 lr 0.000424 time 2.7760 (2.2139) loss 3.5509 (3.5063) grad_norm 1.5308 (1.6138) [2022-01-22 22:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [165/300][820/1251] eta 0:15:53 lr 0.000424 time 1.6300 (2.2117) loss 2.6795 (3.5069) grad_norm 1.6386 (1.6139) [2022-01-22 22:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][830/1251] eta 0:15:31 lr 0.000424 time 1.9845 (2.2117) loss 3.4117 (3.5061) grad_norm 1.4655 (1.6130) [2022-01-22 22:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][840/1251] eta 0:15:09 lr 0.000424 time 1.5524 (2.2122) loss 2.6677 (3.5029) grad_norm 1.4864 (1.6120) [2022-01-22 22:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][850/1251] eta 0:14:47 lr 0.000424 time 2.8340 (2.2137) loss 3.3745 (3.5027) grad_norm 1.7956 (1.6119) [2022-01-22 22:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][860/1251] eta 0:14:25 lr 0.000424 time 2.2480 (2.2145) loss 2.3178 (3.5014) grad_norm 1.4122 (1.6123) [2022-01-22 22:14:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][870/1251] eta 0:14:02 lr 0.000424 time 1.8873 (2.2123) loss 4.4696 (3.5007) grad_norm 1.7948 (1.6122) [2022-01-22 22:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][880/1251] eta 0:13:40 lr 0.000424 time 1.6884 (2.2110) loss 4.0634 (3.5026) grad_norm 1.6374 (1.6125) [2022-01-22 22:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][890/1251] eta 0:13:17 lr 0.000424 time 1.5880 (2.2100) loss 2.4407 (3.4989) grad_norm 1.4010 (1.6130) [2022-01-22 22:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][900/1251] eta 0:12:55 lr 0.000424 time 2.8467 (2.2093) loss 3.1547 (3.4986) grad_norm 1.5502 (1.6137) [2022-01-22 22:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][910/1251] eta 0:12:33 lr 0.000424 time 2.1790 (2.2104) loss 2.6545 (3.4959) grad_norm 1.4672 (1.6127) [2022-01-22 22:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][920/1251] eta 0:12:13 lr 0.000424 time 1.8633 (2.2148) loss 3.6854 (3.4964) 
grad_norm 1.5749 (1.6117) [2022-01-22 22:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][930/1251] eta 0:11:51 lr 0.000424 time 3.1723 (2.2173) loss 3.6537 (3.4963) grad_norm 1.5539 (1.6116) [2022-01-22 22:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][940/1251] eta 0:11:29 lr 0.000424 time 2.2292 (2.2175) loss 3.3869 (3.4923) grad_norm 1.7015 (1.6122) [2022-01-22 22:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][950/1251] eta 0:11:06 lr 0.000424 time 1.9103 (2.2156) loss 3.5954 (3.4925) grad_norm 1.5448 (1.6119) [2022-01-22 22:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][960/1251] eta 0:10:44 lr 0.000424 time 1.9171 (2.2136) loss 3.3083 (3.4924) grad_norm 1.4937 (1.6117) [2022-01-22 22:18:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][970/1251] eta 0:10:21 lr 0.000424 time 1.9672 (2.2107) loss 3.8165 (3.4917) grad_norm 1.4746 (1.6119) [2022-01-22 22:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][980/1251] eta 0:09:58 lr 0.000424 time 1.8281 (2.2096) loss 3.7825 (3.4884) grad_norm 1.4954 (1.6115) [2022-01-22 22:19:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][990/1251] eta 0:09:37 lr 0.000424 time 2.4636 (2.2108) loss 3.6547 (3.4887) grad_norm 1.5155 (1.6111) [2022-01-22 22:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1000/1251] eta 0:09:15 lr 0.000423 time 2.5714 (2.2115) loss 3.7403 (3.4889) grad_norm 1.4910 (1.6105) [2022-01-22 22:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1010/1251] eta 0:08:53 lr 0.000423 time 2.1808 (2.2122) loss 3.7376 (3.4907) grad_norm 1.4595 (1.6103) [2022-01-22 22:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1020/1251] eta 0:08:31 lr 0.000423 time 1.9987 (2.2124) loss 3.4386 (3.4932) grad_norm 1.5381 (1.6097) [2022-01-22 22:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [165/300][1030/1251] eta 0:08:09 lr 0.000423 time 1.6930 (2.2128) loss 3.4540 (3.4924) grad_norm 1.5343 (1.6093) [2022-01-22 22:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1040/1251] eta 0:07:47 lr 0.000423 time 2.2206 (2.2147) loss 2.8580 (3.4896) grad_norm 1.6608 (1.6098) [2022-01-22 22:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1050/1251] eta 0:07:25 lr 0.000423 time 2.2378 (2.2144) loss 3.5718 (3.4907) grad_norm 1.6742 (1.6095) [2022-01-22 22:21:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1060/1251] eta 0:07:02 lr 0.000423 time 2.0121 (2.2127) loss 3.9144 (3.4869) grad_norm 1.7706 (1.6094) [2022-01-22 22:22:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1070/1251] eta 0:06:39 lr 0.000423 time 1.7434 (2.2097) loss 2.1374 (3.4854) grad_norm 1.7355 (1.6101) [2022-01-22 22:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1080/1251] eta 0:06:17 lr 0.000423 time 2.2303 (2.2093) loss 3.8059 (3.4836) grad_norm 1.9325 (1.6108) [2022-01-22 22:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1090/1251] eta 0:05:55 lr 0.000423 time 1.8750 (2.2083) loss 3.6875 (3.4824) grad_norm 1.6903 (1.6115) [2022-01-22 22:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1100/1251] eta 0:05:33 lr 0.000423 time 2.3019 (2.2082) loss 3.6761 (3.4832) grad_norm 1.4193 (1.6111) [2022-01-22 22:23:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1110/1251] eta 0:05:11 lr 0.000423 time 2.4934 (2.2084) loss 2.6713 (3.4836) grad_norm 1.6250 (1.6116) [2022-01-22 22:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1120/1251] eta 0:04:49 lr 0.000423 time 2.1494 (2.2110) loss 3.9496 (3.4857) grad_norm 1.7401 (1.6122) [2022-01-22 22:24:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1130/1251] eta 0:04:27 lr 0.000423 time 1.7786 (2.2110) loss 3.7691 
(3.4844) grad_norm 1.9443 (1.6126) [2022-01-22 22:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1140/1251] eta 0:04:05 lr 0.000423 time 2.1180 (2.2107) loss 3.1367 (3.4852) grad_norm 1.6503 (1.6129) [2022-01-22 22:25:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1150/1251] eta 0:03:43 lr 0.000423 time 1.8725 (2.2102) loss 3.8198 (3.4855) grad_norm 1.5396 (1.6130) [2022-01-22 22:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1160/1251] eta 0:03:20 lr 0.000423 time 1.9569 (2.2082) loss 3.5381 (3.4847) grad_norm 1.3368 (1.6127) [2022-01-22 22:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1170/1251] eta 0:02:58 lr 0.000423 time 1.7209 (2.2081) loss 2.8839 (3.4842) grad_norm 1.4116 (1.6129) [2022-01-22 22:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1180/1251] eta 0:02:36 lr 0.000423 time 2.0856 (2.2081) loss 3.6455 (3.4853) grad_norm 1.4740 (1.6122) [2022-01-22 22:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1190/1251] eta 0:02:14 lr 0.000423 time 2.3502 (2.2085) loss 4.1416 (3.4874) grad_norm 1.7985 (1.6118) [2022-01-22 22:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1200/1251] eta 0:01:52 lr 0.000423 time 2.9784 (2.2077) loss 3.9733 (3.4904) grad_norm 1.5460 (1.6123) [2022-01-22 22:27:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1210/1251] eta 0:01:30 lr 0.000423 time 1.8199 (2.2081) loss 3.9831 (3.4904) grad_norm 1.5214 (1.6119) [2022-01-22 22:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1220/1251] eta 0:01:08 lr 0.000423 time 1.9445 (2.2069) loss 3.3203 (3.4892) grad_norm 1.5310 (1.6118) [2022-01-22 22:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1230/1251] eta 0:00:46 lr 0.000423 time 2.5440 (2.2064) loss 3.8161 (3.4863) grad_norm 1.6040 (1.6117) [2022-01-22 22:28:20 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [165/300][1240/1251] eta 0:00:24 lr 0.000422 time 2.0451 (2.2051) loss 3.5194 (3.4823) grad_norm 1.4677 (1.6119) [2022-01-22 22:28:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [165/300][1250/1251] eta 0:00:02 lr 0.000422 time 1.1663 (2.1996) loss 4.0310 (3.4826) grad_norm 1.4571 (1.6122) [2022-01-22 22:28:35 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 165 training takes 0:45:52 [2022-01-22 22:28:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.150 (18.150) Loss 1.0073 (1.0073) Acc@1 75.293 (75.293) Acc@5 94.141 (94.141) [2022-01-22 22:29:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.691 (3.374) Loss 1.0108 (1.0001) Acc@1 75.098 (76.474) Acc@5 93.555 (93.466) [2022-01-22 22:29:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.229 (2.660) Loss 0.9316 (0.9911) Acc@1 77.832 (76.758) Acc@5 94.434 (93.690) [2022-01-22 22:29:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.270 (2.355) Loss 0.8833 (0.9788) Acc@1 79.688 (77.007) Acc@5 95.312 (93.782) [2022-01-22 22:30:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.206 (2.211) Loss 1.0590 (0.9870) Acc@1 75.098 (76.829) Acc@5 92.773 (93.581) [2022-01-22 22:30:14 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.816 Acc@5 93.570 [2022-01-22 22:30:14 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-01-22 22:30:14 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.82% [2022-01-22 22:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][0/1251] eta 7:37:39 lr 0.000422 time 21.9504 (21.9504) loss 3.9292 (3.9292) grad_norm 1.9007 (1.9007) [2022-01-22 22:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][10/1251] eta 1:24:16 lr 0.000422 time 1.9639 (4.0747) loss 3.0035 (3.3708) grad_norm 1.5633 (1.6475) [2022-01-22 22:31:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][20/1251] eta 1:05:34 lr 0.000422 time 1.5062 (3.1959) loss 3.3466 (3.3749) grad_norm 1.7248 (1.6225) [2022-01-22 22:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][30/1251] eta 0:58:44 lr 0.000422 time 2.2093 (2.8866) loss 3.3506 (3.3214) grad_norm 1.8187 (1.6103) [2022-01-22 22:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][40/1251] eta 0:55:57 lr 0.000422 time 3.8851 (2.7728) loss 3.0263 (3.3909) grad_norm 1.6390 (1.6139) [2022-01-22 22:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][50/1251] eta 0:53:43 lr 0.000422 time 3.5146 (2.6841) loss 3.6734 (3.3945) grad_norm 1.4708 (1.5983) [2022-01-22 22:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][60/1251] eta 0:51:18 lr 0.000422 time 2.0269 (2.5846) loss 2.7253 (3.4122) grad_norm 1.3873 (1.5786) [2022-01-22 22:33:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][70/1251] eta 0:49:35 lr 0.000422 time 1.9872 (2.5192) loss 3.4548 (3.4445) grad_norm 1.9123 (1.5894) [2022-01-22 22:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][80/1251] eta 0:48:38 lr 0.000422 time 3.1099 (2.4921) loss 3.6035 (3.4797) grad_norm 1.8264 (1.5930) [2022-01-22 22:33:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][90/1251] eta 0:47:35 lr 0.000422 time 3.0156 (2.4595) loss 3.8177 (3.4955) grad_norm 1.8195 (1.6002) [2022-01-22 22:34:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][100/1251] eta 0:46:23 lr 0.000422 time 2.0526 (2.4184) loss 3.9651 (3.4942) grad_norm 1.4497 (1.6067) [2022-01-22 22:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][110/1251] eta 0:45:15 lr 0.000422 time 1.5831 (2.3799) loss 3.5941 (3.4719) grad_norm 1.7772 (1.6164) [2022-01-22 22:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][120/1251] eta 0:44:22 lr 0.000422 time 
2.3061 (2.3538) loss 4.1571 (3.4761) grad_norm 2.0347 (1.6176) [2022-01-22 22:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][130/1251] eta 0:43:40 lr 0.000422 time 2.6081 (2.3378) loss 3.2777 (3.4536) grad_norm 1.9746 (1.6175) [2022-01-22 22:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][140/1251] eta 0:43:04 lr 0.000422 time 1.8401 (2.3262) loss 3.8726 (3.4671) grad_norm 1.6844 (1.6141) [2022-01-22 22:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][150/1251] eta 0:42:32 lr 0.000422 time 2.1984 (2.3185) loss 2.8953 (3.4607) grad_norm 1.5166 (1.6143) [2022-01-22 22:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][160/1251] eta 0:41:56 lr 0.000422 time 2.2006 (2.3069) loss 3.8382 (3.4483) grad_norm 1.5984 (1.6208) [2022-01-22 22:36:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][170/1251] eta 0:41:39 lr 0.000422 time 3.4342 (2.3119) loss 2.4014 (3.4546) grad_norm 1.5954 (1.6154) [2022-01-22 22:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][180/1251] eta 0:41:16 lr 0.000422 time 2.4079 (2.3122) loss 3.4535 (3.4312) grad_norm 1.5451 (1.6115) [2022-01-22 22:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][190/1251] eta 0:40:43 lr 0.000422 time 1.7564 (2.3028) loss 3.7920 (3.4299) grad_norm 1.6025 (1.6096) [2022-01-22 22:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][200/1251] eta 0:40:16 lr 0.000422 time 1.9863 (2.2995) loss 3.7322 (3.4421) grad_norm 1.4664 (1.6082) [2022-01-22 22:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][210/1251] eta 0:39:49 lr 0.000422 time 2.4922 (2.2957) loss 3.5182 (3.4311) grad_norm 1.5888 (1.6079) [2022-01-22 22:38:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][220/1251] eta 0:39:18 lr 0.000422 time 2.2741 (2.2881) loss 3.2131 (3.4234) grad_norm 1.5397 (1.6116) [2022-01-22 22:39:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][230/1251] eta 0:38:49 lr 0.000422 time 1.8805 (2.2816) loss 3.7909 (3.4225) grad_norm 1.7846 (1.6154) [2022-01-22 22:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][240/1251] eta 0:38:20 lr 0.000421 time 2.1934 (2.2758) loss 3.9626 (3.4229) grad_norm 1.4715 (1.6190) [2022-01-22 22:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][250/1251] eta 0:37:58 lr 0.000421 time 3.1317 (2.2762) loss 4.1395 (3.4198) grad_norm 1.8441 (1.6192) [2022-01-22 22:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][260/1251] eta 0:37:25 lr 0.000421 time 2.0235 (2.2660) loss 3.9309 (3.4305) grad_norm 1.5274 (1.6187) [2022-01-22 22:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][270/1251] eta 0:36:55 lr 0.000421 time 1.9398 (2.2588) loss 2.5294 (3.4348) grad_norm 1.3312 (1.6157) [2022-01-22 22:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][280/1251] eta 0:36:31 lr 0.000421 time 2.2060 (2.2566) loss 3.8375 (3.4412) grad_norm 1.7570 (1.6152) [2022-01-22 22:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][290/1251] eta 0:36:02 lr 0.000421 time 1.6297 (2.2503) loss 3.3442 (3.4409) grad_norm 1.9816 (1.6150) [2022-01-22 22:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][300/1251] eta 0:35:43 lr 0.000421 time 1.7770 (2.2535) loss 3.8823 (3.4471) grad_norm 1.4091 (1.6114) [2022-01-22 22:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][310/1251] eta 0:35:21 lr 0.000421 time 2.4055 (2.2540) loss 3.9818 (3.4537) grad_norm 1.3764 (1.6103) [2022-01-22 22:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][320/1251] eta 0:34:59 lr 0.000421 time 2.1822 (2.2546) loss 2.5115 (3.4459) grad_norm 1.6709 (1.6090) [2022-01-22 22:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][330/1251] eta 0:34:38 lr 
0.000421 time 2.1993 (2.2570) loss 3.5797 (3.4548) grad_norm 1.5155 (1.6134) [2022-01-22 22:43:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][340/1251] eta 0:34:16 lr 0.000421 time 1.9192 (2.2570) loss 3.9949 (3.4555) grad_norm 1.6609 (1.6140) [2022-01-22 22:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][350/1251] eta 0:33:53 lr 0.000421 time 2.8605 (2.2571) loss 3.8436 (3.4604) grad_norm 1.4143 (1.6129) [2022-01-22 22:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][360/1251] eta 0:33:24 lr 0.000421 time 1.6948 (2.2502) loss 3.8490 (3.4566) grad_norm 1.6942 (1.6136) [2022-01-22 22:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][370/1251] eta 0:32:57 lr 0.000421 time 1.9203 (2.2444) loss 3.7117 (3.4622) grad_norm 1.6844 (1.6192) [2022-01-22 22:44:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][380/1251] eta 0:32:35 lr 0.000421 time 2.3662 (2.2450) loss 3.5413 (3.4603) grad_norm 1.6174 (1.6219) [2022-01-22 22:44:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][390/1251] eta 0:32:12 lr 0.000421 time 2.5374 (2.2446) loss 4.1771 (3.4613) grad_norm 1.7907 (1.6238) [2022-01-22 22:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][400/1251] eta 0:31:50 lr 0.000421 time 2.1754 (2.2448) loss 3.1796 (3.4578) grad_norm 1.6510 (1.6237) [2022-01-22 22:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][410/1251] eta 0:31:25 lr 0.000421 time 1.8669 (2.2423) loss 3.0863 (3.4542) grad_norm 1.3804 (1.6232) [2022-01-22 22:45:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][420/1251] eta 0:31:03 lr 0.000421 time 1.8751 (2.2427) loss 2.7883 (3.4571) grad_norm 1.5179 (1.6227) [2022-01-22 22:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][430/1251] eta 0:30:39 lr 0.000421 time 1.8956 (2.2400) loss 3.7304 (3.4571) grad_norm 1.6755 (1.6233) [2022-01-22 22:46:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][440/1251] eta 0:30:12 lr 0.000421 time 2.0510 (2.2343) loss 4.1243 (3.4570) grad_norm 1.5690 (1.6233) [2022-01-22 22:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][450/1251] eta 0:29:47 lr 0.000421 time 2.1243 (2.2314) loss 4.0075 (3.4522) grad_norm 1.6455 (1.6226) [2022-01-22 22:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][460/1251] eta 0:29:22 lr 0.000421 time 2.1930 (2.2285) loss 3.9399 (3.4563) grad_norm 1.5283 (1.6225) [2022-01-22 22:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][470/1251] eta 0:28:58 lr 0.000421 time 2.1799 (2.2264) loss 2.9986 (3.4575) grad_norm 1.5563 (1.6232) [2022-01-22 22:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][480/1251] eta 0:28:32 lr 0.000420 time 1.9448 (2.2217) loss 3.7774 (3.4554) grad_norm 1.5834 (1.6225) [2022-01-22 22:48:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][490/1251] eta 0:28:06 lr 0.000420 time 2.2896 (2.2166) loss 3.6716 (3.4586) grad_norm 1.3678 (1.6222) [2022-01-22 22:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][500/1251] eta 0:27:44 lr 0.000420 time 2.2559 (2.2163) loss 4.0634 (3.4575) grad_norm 1.5492 (1.6233) [2022-01-22 22:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][510/1251] eta 0:27:21 lr 0.000420 time 1.8571 (2.2154) loss 2.3937 (3.4602) grad_norm 1.5831 (1.6229) [2022-01-22 22:49:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][520/1251] eta 0:27:00 lr 0.000420 time 2.1229 (2.2165) loss 3.9368 (3.4579) grad_norm 1.7346 (1.6218) [2022-01-22 22:49:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][530/1251] eta 0:26:38 lr 0.000420 time 2.4961 (2.2177) loss 3.7208 (3.4614) grad_norm 1.7512 (1.6214) [2022-01-22 22:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][540/1251] eta 0:26:17 lr 
0.000420 time 2.4319 (2.2193) loss 3.2882 (3.4580) grad_norm 1.6294 (1.6210) [2022-01-22 22:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][550/1251] eta 0:25:56 lr 0.000420 time 1.9124 (2.2201) loss 2.7249 (3.4543) grad_norm 1.5476 (1.6215) [2022-01-22 22:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][560/1251] eta 0:25:36 lr 0.000420 time 3.0536 (2.2230) loss 3.7637 (3.4573) grad_norm 1.8963 (1.6202) [2022-01-22 22:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][570/1251] eta 0:25:13 lr 0.000420 time 2.4895 (2.2229) loss 2.9538 (3.4586) grad_norm 1.7276 (1.6219) [2022-01-22 22:51:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][580/1251] eta 0:24:52 lr 0.000420 time 2.2883 (2.2241) loss 3.4172 (3.4581) grad_norm 1.7265 (1.6215) [2022-01-22 22:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][590/1251] eta 0:24:27 lr 0.000420 time 1.6255 (2.2198) loss 3.6763 (3.4591) grad_norm 1.5578 (1.6206) [2022-01-22 22:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][600/1251] eta 0:24:05 lr 0.000420 time 2.2492 (2.2197) loss 3.6558 (3.4604) grad_norm 1.4299 (1.6203) [2022-01-22 22:52:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][610/1251] eta 0:23:42 lr 0.000420 time 2.5945 (2.2188) loss 3.5174 (3.4602) grad_norm 1.6068 (1.6206) [2022-01-22 22:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][620/1251] eta 0:23:18 lr 0.000420 time 2.1599 (2.2168) loss 3.1985 (3.4626) grad_norm 1.7117 (1.6216) [2022-01-22 22:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][630/1251] eta 0:22:56 lr 0.000420 time 2.4659 (2.2166) loss 3.7669 (3.4632) grad_norm 1.6171 (1.6225) [2022-01-22 22:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][640/1251] eta 0:22:34 lr 0.000420 time 2.0910 (2.2175) loss 3.4708 (3.4637) grad_norm 1.7314 (1.6243) [2022-01-22 22:54:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][650/1251] eta 0:22:13 lr 0.000420 time 3.0137 (2.2193) loss 4.3939 (3.4642) grad_norm 1.4349 (1.6222) [2022-01-22 22:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][660/1251] eta 0:21:49 lr 0.000420 time 1.6612 (2.2163) loss 4.0402 (3.4631) grad_norm 1.7026 (1.6239) [2022-01-22 22:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][670/1251] eta 0:21:26 lr 0.000420 time 1.8507 (2.2135) loss 3.5652 (3.4643) grad_norm 1.3378 (1.6229) [2022-01-22 22:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][680/1251] eta 0:21:02 lr 0.000420 time 2.2829 (2.2111) loss 3.7044 (3.4633) grad_norm 1.8531 (1.6241) [2022-01-22 22:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][690/1251] eta 0:20:38 lr 0.000420 time 1.9281 (2.2077) loss 3.8222 (3.4656) grad_norm 1.4532 (1.6229) [2022-01-22 22:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][700/1251] eta 0:20:16 lr 0.000420 time 1.9412 (2.2071) loss 3.8323 (3.4679) grad_norm 1.7797 (1.6235) [2022-01-22 22:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][710/1251] eta 0:19:54 lr 0.000420 time 2.2077 (2.2081) loss 4.0179 (3.4642) grad_norm 1.7404 (1.6246) [2022-01-22 22:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][720/1251] eta 0:19:33 lr 0.000420 time 2.5375 (2.2096) loss 3.7787 (3.4679) grad_norm 1.5385 (1.6258) [2022-01-22 22:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][730/1251] eta 0:19:12 lr 0.000419 time 2.6400 (2.2113) loss 2.6358 (3.4674) grad_norm 1.7061 (1.6263) [2022-01-22 22:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][740/1251] eta 0:18:51 lr 0.000419 time 2.3973 (2.2134) loss 3.0684 (3.4690) grad_norm 1.4516 (1.6267) [2022-01-22 22:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][750/1251] eta 0:18:30 lr 
0.000419 time 2.6680 (2.2169) loss 3.7698 (3.4684) grad_norm 1.5216 (1.6265) [2022-01-22 22:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][760/1251] eta 0:18:07 lr 0.000419 time 1.9440 (2.2152) loss 3.7719 (3.4700) grad_norm 1.4776 (1.6257) [2022-01-22 22:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][770/1251] eta 0:17:44 lr 0.000419 time 1.9129 (2.2121) loss 3.4914 (3.4664) grad_norm 1.7067 (1.6248) [2022-01-22 22:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][780/1251] eta 0:17:20 lr 0.000419 time 1.8084 (2.2089) loss 3.7338 (3.4669) grad_norm 1.9064 (1.6255) [2022-01-22 22:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][790/1251] eta 0:16:58 lr 0.000419 time 2.3889 (2.2092) loss 3.2743 (3.4648) grad_norm 1.5880 (1.6257) [2022-01-22 22:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][800/1251] eta 0:16:36 lr 0.000419 time 2.5159 (2.2086) loss 2.5348 (3.4635) grad_norm 1.4999 (1.6255) [2022-01-22 23:00:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][810/1251] eta 0:16:13 lr 0.000419 time 2.0598 (2.2064) loss 3.6435 (3.4609) grad_norm 1.3445 (1.6246) [2022-01-22 23:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][820/1251] eta 0:15:50 lr 0.000419 time 1.6728 (2.2053) loss 2.8063 (3.4579) grad_norm 1.7829 (1.6249) [2022-01-22 23:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][830/1251] eta 0:15:27 lr 0.000419 time 2.1702 (2.2035) loss 3.8574 (3.4608) grad_norm 1.7651 (1.6245) [2022-01-22 23:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][840/1251] eta 0:15:05 lr 0.000419 time 1.8355 (2.2033) loss 3.3940 (3.4608) grad_norm 1.4589 (1.6238) [2022-01-22 23:01:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][850/1251] eta 0:14:43 lr 0.000419 time 2.8669 (2.2039) loss 4.2561 (3.4608) grad_norm 1.5908 (1.6237) [2022-01-22 23:01:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][860/1251] eta 0:14:22 lr 0.000419 time 2.4625 (2.2056) loss 3.7811 (3.4628) grad_norm 1.5419 (1.6242) [2022-01-22 23:02:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][870/1251] eta 0:14:00 lr 0.000419 time 1.9222 (2.2059) loss 3.6787 (3.4624) grad_norm 1.5275 (1.6232) [2022-01-22 23:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][880/1251] eta 0:13:38 lr 0.000419 time 1.7971 (2.2058) loss 3.7743 (3.4624) grad_norm 1.6750 (1.6243) [2022-01-22 23:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][890/1251] eta 0:13:16 lr 0.000419 time 2.2449 (2.2052) loss 3.1517 (3.4612) grad_norm 1.4797 (1.6231) [2022-01-22 23:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][900/1251] eta 0:12:53 lr 0.000419 time 1.5895 (2.2043) loss 3.7246 (3.4607) grad_norm 1.6921 (1.6216) [2022-01-22 23:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][910/1251] eta 0:12:31 lr 0.000419 time 1.8507 (2.2038) loss 3.6242 (3.4617) grad_norm 1.7351 (1.6220) [2022-01-22 23:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][920/1251] eta 0:12:09 lr 0.000419 time 2.7893 (2.2047) loss 2.4859 (3.4600) grad_norm 1.4317 (1.6229) [2022-01-22 23:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][930/1251] eta 0:11:47 lr 0.000419 time 2.1943 (2.2056) loss 3.6890 (3.4628) grad_norm 1.6525 (1.6222) [2022-01-22 23:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][940/1251] eta 0:11:26 lr 0.000419 time 2.2142 (2.2071) loss 3.8295 (3.4637) grad_norm 1.6906 (1.6225) [2022-01-22 23:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][950/1251] eta 0:11:03 lr 0.000419 time 1.9415 (2.2059) loss 3.8658 (3.4671) grad_norm 1.7657 (1.6231) [2022-01-22 23:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][960/1251] eta 0:10:41 lr 
0.000419 time 3.1740 (2.2052) loss 3.1583 (3.4672) grad_norm 1.4138 (1.6234) [2022-01-22 23:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][970/1251] eta 0:10:19 lr 0.000418 time 1.9708 (2.2039) loss 3.9521 (3.4665) grad_norm 1.5642 (1.6228) [2022-01-22 23:06:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][980/1251] eta 0:09:56 lr 0.000418 time 1.9327 (2.2029) loss 2.2745 (3.4655) grad_norm 1.7998 (1.6227) [2022-01-22 23:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][990/1251] eta 0:09:34 lr 0.000418 time 1.5636 (2.2021) loss 2.9267 (3.4639) grad_norm 1.6605 (1.6228) [2022-01-22 23:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1000/1251] eta 0:09:12 lr 0.000418 time 3.4913 (2.2032) loss 3.8463 (3.4675) grad_norm 1.6017 (1.6221) [2022-01-22 23:07:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1010/1251] eta 0:08:51 lr 0.000418 time 1.2835 (2.2048) loss 3.1150 (3.4664) grad_norm 1.5484 (1.6214) [2022-01-22 23:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1020/1251] eta 0:08:29 lr 0.000418 time 1.5901 (2.2052) loss 3.5167 (3.4642) grad_norm 1.4868 (1.6222) [2022-01-22 23:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1030/1251] eta 0:08:07 lr 0.000418 time 1.7144 (2.2052) loss 3.5508 (3.4646) grad_norm 1.5739 (1.6220) [2022-01-22 23:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1040/1251] eta 0:07:45 lr 0.000418 time 2.5000 (2.2056) loss 3.1363 (3.4641) grad_norm 1.7101 (1.6215) [2022-01-22 23:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1050/1251] eta 0:07:23 lr 0.000418 time 1.6134 (2.2055) loss 3.7266 (3.4622) grad_norm 1.5809 (1.6216) [2022-01-22 23:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1060/1251] eta 0:07:00 lr 0.000418 time 1.5830 (2.2029) loss 3.3450 (3.4649) grad_norm 2.3049 (1.6225) [2022-01-22 
23:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1070/1251] eta 0:06:38 lr 0.000418 time 2.5255 (2.2024) loss 4.2808 (3.4661) grad_norm 1.6710 (1.6224) [2022-01-22 23:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1080/1251] eta 0:06:16 lr 0.000418 time 1.7814 (2.2019) loss 3.8703 (3.4668) grad_norm 1.6645 (1.6231) [2022-01-22 23:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1090/1251] eta 0:05:54 lr 0.000418 time 2.1841 (2.2029) loss 4.0245 (3.4657) grad_norm 1.4830 (1.6241) [2022-01-22 23:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1100/1251] eta 0:05:32 lr 0.000418 time 1.5617 (2.2015) loss 3.7612 (3.4666) grad_norm 1.4605 (1.6235) [2022-01-22 23:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1110/1251] eta 0:05:10 lr 0.000418 time 2.5604 (2.2023) loss 2.9799 (3.4677) grad_norm 1.4280 (1.6236) [2022-01-22 23:11:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1120/1251] eta 0:04:48 lr 0.000418 time 2.1851 (2.2007) loss 3.4643 (3.4668) grad_norm 1.5136 (1.6245) [2022-01-22 23:11:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1130/1251] eta 0:04:26 lr 0.000418 time 1.8802 (2.2011) loss 3.6917 (3.4678) grad_norm 1.4104 (1.6254) [2022-01-22 23:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1140/1251] eta 0:04:04 lr 0.000418 time 1.7260 (2.2023) loss 2.7446 (3.4688) grad_norm 1.7074 (1.6257) [2022-01-22 23:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1150/1251] eta 0:03:42 lr 0.000418 time 3.0975 (2.2036) loss 3.3269 (3.4685) grad_norm 1.8318 (1.6259) [2022-01-22 23:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1160/1251] eta 0:03:20 lr 0.000418 time 1.5726 (2.2035) loss 3.6919 (3.4692) grad_norm 1.4106 (1.6262) [2022-01-22 23:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1170/1251] 
eta 0:02:58 lr 0.000418 time 1.5483 (2.2029) loss 3.7305 (3.4708) grad_norm 1.8007 (1.6260) [2022-01-22 23:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1180/1251] eta 0:02:36 lr 0.000418 time 1.7061 (2.2021) loss 4.2882 (3.4738) grad_norm 1.6293 (1.6261) [2022-01-22 23:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1190/1251] eta 0:02:14 lr 0.000418 time 2.8655 (2.2017) loss 4.1602 (3.4740) grad_norm 1.5241 (1.6263) [2022-01-22 23:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1200/1251] eta 0:01:52 lr 0.000418 time 1.8155 (2.2011) loss 3.7275 (3.4746) grad_norm 1.5489 (1.6263) [2022-01-22 23:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1210/1251] eta 0:01:30 lr 0.000418 time 1.5096 (2.2007) loss 3.9117 (3.4754) grad_norm 1.8008 (1.6267) [2022-01-22 23:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1220/1251] eta 0:01:08 lr 0.000417 time 2.9717 (2.2026) loss 2.7125 (3.4756) grad_norm 1.3508 (1.6266) [2022-01-22 23:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1230/1251] eta 0:00:46 lr 0.000417 time 2.4849 (2.2033) loss 3.6705 (3.4789) grad_norm 1.9338 (1.6274) [2022-01-22 23:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1240/1251] eta 0:00:24 lr 0.000417 time 1.4075 (2.2018) loss 3.2955 (3.4787) grad_norm 1.4604 (1.6275) [2022-01-22 23:16:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [166/300][1250/1251] eta 0:00:02 lr 0.000417 time 1.1895 (2.1957) loss 3.9323 (3.4770) grad_norm 1.4766 (1.6276) [2022-01-22 23:16:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 166 training takes 0:45:47 [2022-01-22 23:16:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.968 (17.968) Loss 0.9161 (0.9161) Acc@1 78.516 (78.516) Acc@5 93.555 (93.555) [2022-01-22 23:16:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.566 (3.462) 
Loss 0.9657 (0.9866) Acc@1 77.148 (76.607) Acc@5 95.117 (93.697) [2022-01-22 23:16:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.888 (2.643) Loss 0.9313 (0.9748) Acc@1 77.539 (77.018) Acc@5 94.238 (93.806) [2022-01-22 23:17:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.648 (2.367) Loss 0.9851 (0.9819) Acc@1 76.172 (76.868) Acc@5 94.434 (93.655) [2022-01-22 23:17:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.844 (2.196) Loss 0.9148 (0.9793) Acc@1 79.199 (76.839) Acc@5 93.750 (93.698) [2022-01-22 23:17:39 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.722 Acc@5 93.706 [2022-01-22 23:17:39 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.7% [2022-01-22 23:17:39 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.82% [2022-01-22 23:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][0/1251] eta 7:38:15 lr 0.000417 time 21.9787 (21.9787) loss 3.7360 (3.7360) grad_norm 2.1806 (2.1806) [2022-01-22 23:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][10/1251] eta 1:22:02 lr 0.000417 time 2.0352 (3.9667) loss 2.8183 (3.4397) grad_norm 1.4875 (1.6955) [2022-01-22 23:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][20/1251] eta 1:03:35 lr 0.000417 time 1.8927 (3.0997) loss 3.5132 (3.3759) grad_norm 1.3387 (1.6372) [2022-01-22 23:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][30/1251] eta 0:57:16 lr 0.000417 time 1.8037 (2.8144) loss 2.7864 (3.3815) grad_norm 1.8678 (1.6451) [2022-01-22 23:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][40/1251] eta 0:54:20 lr 0.000417 time 3.6187 (2.6921) loss 3.7936 (3.4191) grad_norm 1.6263 (1.6581) [2022-01-22 23:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][50/1251] eta 0:52:36 lr 0.000417 time 1.7630 (2.6282) loss 3.7363 (3.4719) 
grad_norm 1.6398 (1.6522) [2022-01-22 23:20:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][60/1251] eta 0:51:18 lr 0.000417 time 2.4023 (2.5852) loss 3.3322 (3.4816) grad_norm 1.8692 (1.6582) [2022-01-22 23:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][70/1251] eta 0:49:20 lr 0.000417 time 1.8738 (2.5067) loss 3.9245 (3.4688) grad_norm 1.3630 (1.6654) [2022-01-22 23:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][80/1251] eta 0:47:55 lr 0.000417 time 2.6923 (2.4555) loss 3.0676 (3.4589) grad_norm 1.5162 (1.6595) [2022-01-22 23:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][90/1251] eta 0:46:42 lr 0.000417 time 1.8131 (2.4141) loss 3.6730 (3.4631) grad_norm 1.5586 (1.6487) [2022-01-22 23:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][100/1251] eta 0:45:54 lr 0.000417 time 2.1870 (2.3935) loss 3.7332 (3.4842) grad_norm 1.5711 (1.6451) [2022-01-22 23:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][110/1251] eta 0:45:19 lr 0.000417 time 1.5908 (2.3830) loss 3.8927 (3.4758) grad_norm 1.6530 (1.6385) [2022-01-22 23:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][120/1251] eta 0:44:47 lr 0.000417 time 2.8533 (2.3761) loss 3.1527 (3.4935) grad_norm 1.5628 (1.6369) [2022-01-22 23:22:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][130/1251] eta 0:44:02 lr 0.000417 time 2.1447 (2.3572) loss 3.7504 (3.4915) grad_norm 1.6032 (1.6289) [2022-01-22 23:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][140/1251] eta 0:43:21 lr 0.000417 time 2.0761 (2.3420) loss 3.2108 (3.4976) grad_norm 1.4222 (1.6241) [2022-01-22 23:23:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][150/1251] eta 0:42:39 lr 0.000417 time 2.2625 (2.3251) loss 2.4734 (3.4965) grad_norm 1.5422 (1.6232) [2022-01-22 23:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[167/300][160/1251] eta 0:41:58 lr 0.000417 time 1.8913 (2.3089) loss 3.4098 (3.5093) grad_norm 1.4834 (1.6172) [2022-01-22 23:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][170/1251] eta 0:41:26 lr 0.000417 time 2.0346 (2.3003) loss 3.8453 (3.5219) grad_norm 1.7497 (1.6210) [2022-01-22 23:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][180/1251] eta 0:41:07 lr 0.000417 time 2.5527 (2.3038) loss 3.0843 (3.5180) grad_norm 1.6832 (1.6191) [2022-01-22 23:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][190/1251] eta 0:40:29 lr 0.000417 time 1.8266 (2.2896) loss 2.4330 (3.5284) grad_norm 1.4751 (1.6203) [2022-01-22 23:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][200/1251] eta 0:40:02 lr 0.000417 time 2.8582 (2.2856) loss 3.9131 (3.5344) grad_norm 1.5139 (1.6188) [2022-01-22 23:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][210/1251] eta 0:39:38 lr 0.000416 time 1.8802 (2.2850) loss 2.7861 (3.5361) grad_norm 1.6564 (1.6216) [2022-01-22 23:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][220/1251] eta 0:39:20 lr 0.000416 time 2.5951 (2.2900) loss 3.7229 (3.5408) grad_norm 1.8762 (1.6269) [2022-01-22 23:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][230/1251] eta 0:38:53 lr 0.000416 time 1.6658 (2.2852) loss 3.8517 (3.5456) grad_norm 1.6937 (1.6285) [2022-01-22 23:26:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][240/1251] eta 0:38:30 lr 0.000416 time 2.7883 (2.2850) loss 3.8017 (3.5491) grad_norm 1.5222 (1.6278) [2022-01-22 23:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][250/1251] eta 0:37:51 lr 0.000416 time 1.6345 (2.2696) loss 2.7193 (3.5355) grad_norm 1.6681 (1.6259) [2022-01-22 23:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][260/1251] eta 0:37:25 lr 0.000416 time 1.8922 (2.2655) loss 3.7876 (3.5340) grad_norm 
1.3809 (1.6249) [2022-01-22 23:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][270/1251] eta 0:36:59 lr 0.000416 time 2.0328 (2.2626) loss 2.6793 (3.5419) grad_norm 1.4763 (1.6243) [2022-01-22 23:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][280/1251] eta 0:36:36 lr 0.000416 time 1.8429 (2.2618) loss 3.6733 (3.5404) grad_norm 1.6019 (1.6260) [2022-01-22 23:28:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][290/1251] eta 0:36:10 lr 0.000416 time 1.9228 (2.2581) loss 4.1444 (3.5378) grad_norm 1.6619 (1.6266) [2022-01-22 23:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][300/1251] eta 0:35:45 lr 0.000416 time 1.9322 (2.2556) loss 4.0444 (3.5360) grad_norm 1.7531 (1.6282) [2022-01-22 23:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][310/1251] eta 0:35:20 lr 0.000416 time 1.8519 (2.2535) loss 2.3782 (3.5251) grad_norm 1.6687 (1.6285) [2022-01-22 23:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][320/1251] eta 0:34:55 lr 0.000416 time 1.9313 (2.2505) loss 3.2835 (3.5101) grad_norm 1.6725 (1.6286) [2022-01-22 23:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][330/1251] eta 0:34:32 lr 0.000416 time 1.9197 (2.2508) loss 3.6152 (3.5061) grad_norm 1.5927 (1.6289) [2022-01-22 23:30:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][340/1251] eta 0:34:09 lr 0.000416 time 2.1160 (2.2493) loss 3.3060 (3.5075) grad_norm 1.4213 (1.6295) [2022-01-22 23:30:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][350/1251] eta 0:33:43 lr 0.000416 time 2.4182 (2.2464) loss 3.3361 (3.5098) grad_norm 1.4311 (1.6272) [2022-01-22 23:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][360/1251] eta 0:33:18 lr 0.000416 time 2.3128 (2.2433) loss 2.7453 (3.5111) grad_norm 1.6649 (1.6279) [2022-01-22 23:31:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[167/300][370/1251] eta 0:32:54 lr 0.000416 time 2.1454 (2.2408) loss 2.4692 (3.5070) grad_norm 1.5899 (1.6282) [2022-01-22 23:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][380/1251] eta 0:32:29 lr 0.000416 time 2.2497 (2.2380) loss 3.4296 (3.5052) grad_norm 1.6071 (1.6285) [2022-01-22 23:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][390/1251] eta 0:32:10 lr 0.000416 time 2.2160 (2.2425) loss 2.6326 (3.4999) grad_norm 1.6930 (1.6301) [2022-01-22 23:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][400/1251] eta 0:31:46 lr 0.000416 time 1.9130 (2.2404) loss 4.0097 (3.5006) grad_norm 1.9705 (1.6321) [2022-01-22 23:32:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][410/1251] eta 0:31:21 lr 0.000416 time 1.9366 (2.2375) loss 3.7857 (3.5025) grad_norm 1.7755 (1.6330) [2022-01-22 23:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][420/1251] eta 0:30:55 lr 0.000416 time 2.0492 (2.2327) loss 3.5137 (3.4999) grad_norm 1.5645 (1.6336) [2022-01-22 23:33:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][430/1251] eta 0:30:27 lr 0.000416 time 2.1566 (2.2265) loss 2.4572 (3.4907) grad_norm 1.4303 (1.6327) [2022-01-22 23:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][440/1251] eta 0:30:02 lr 0.000416 time 2.0328 (2.2230) loss 3.5494 (3.4934) grad_norm 1.4705 (1.6328) [2022-01-22 23:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][450/1251] eta 0:29:40 lr 0.000416 time 2.1684 (2.2230) loss 3.6816 (3.4988) grad_norm 1.6291 (1.6346) [2022-01-22 23:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][460/1251] eta 0:29:15 lr 0.000415 time 2.1656 (2.2191) loss 3.4543 (3.5018) grad_norm 1.7288 (1.6360) [2022-01-22 23:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][470/1251] eta 0:28:55 lr 0.000415 time 2.8083 (2.2226) loss 3.7195 (3.4970) grad_norm 
1.5181 (1.6360) [2022-01-22 23:35:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][480/1251] eta 0:28:38 lr 0.000415 time 2.9193 (2.2289) loss 3.5824 (3.5008) grad_norm 1.4866 (1.6354) [2022-01-22 23:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][490/1251] eta 0:28:22 lr 0.000415 time 2.5321 (2.2378) loss 3.9001 (3.5001) grad_norm 1.7741 (1.6346) [2022-01-22 23:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][500/1251] eta 0:28:00 lr 0.000415 time 1.8149 (2.2374) loss 4.1651 (3.4998) grad_norm 1.5766 (1.6341) [2022-01-22 23:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][510/1251] eta 0:27:36 lr 0.000415 time 2.1866 (2.2354) loss 3.0172 (3.5007) grad_norm 1.4590 (1.6331) [2022-01-22 23:37:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][520/1251] eta 0:27:10 lr 0.000415 time 1.9717 (2.2303) loss 3.5940 (3.5045) grad_norm 1.6916 (1.6346) [2022-01-22 23:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][530/1251] eta 0:26:42 lr 0.000415 time 1.7560 (2.2230) loss 3.1995 (3.5020) grad_norm 1.9404 (1.6355) [2022-01-22 23:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][540/1251] eta 0:26:16 lr 0.000415 time 2.2015 (2.2172) loss 3.7279 (3.4998) grad_norm 1.4562 (1.6359) [2022-01-22 23:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][550/1251] eta 0:25:53 lr 0.000415 time 2.4197 (2.2159) loss 3.9334 (3.4996) grad_norm 1.7980 (1.6370) [2022-01-22 23:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][560/1251] eta 0:25:29 lr 0.000415 time 1.9409 (2.2140) loss 3.0635 (3.4988) grad_norm 1.5824 (1.6378) [2022-01-22 23:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][570/1251] eta 0:25:11 lr 0.000415 time 2.8643 (2.2190) loss 3.7072 (3.4964) grad_norm 1.7795 (1.6386) [2022-01-22 23:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[167/300][580/1251] eta 0:24:55 lr 0.000415 time 2.8032 (2.2283) loss 2.4893 (3.4984) grad_norm 1.5858 (1.6371) [2022-01-22 23:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][590/1251] eta 0:24:34 lr 0.000415 time 2.1561 (2.2314) loss 4.2270 (3.4983) grad_norm 1.8264 (1.6377) [2022-01-22 23:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][600/1251] eta 0:24:11 lr 0.000415 time 1.5683 (2.2297) loss 4.0526 (3.4976) grad_norm 1.5258 (1.6389) [2022-01-22 23:40:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][610/1251] eta 0:23:46 lr 0.000415 time 2.1842 (2.2262) loss 3.9978 (3.5001) grad_norm 1.5923 (1.6404) [2022-01-22 23:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][620/1251] eta 0:23:21 lr 0.000415 time 2.2843 (2.2210) loss 3.6299 (3.4991) grad_norm 1.6645 (1.6392) [2022-01-22 23:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][630/1251] eta 0:22:57 lr 0.000415 time 2.5656 (2.2177) loss 3.3931 (3.5007) grad_norm 2.0169 (1.6388) [2022-01-22 23:41:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][640/1251] eta 0:22:34 lr 0.000415 time 1.8831 (2.2171) loss 4.0098 (3.4995) grad_norm 1.5871 (1.6388) [2022-01-22 23:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][650/1251] eta 0:22:12 lr 0.000415 time 2.9785 (2.2168) loss 3.8016 (3.5029) grad_norm 1.5726 (1.6385) [2022-01-22 23:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][660/1251] eta 0:21:50 lr 0.000415 time 2.4722 (2.2178) loss 3.5947 (3.4998) grad_norm 1.6843 (1.6373) [2022-01-22 23:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][670/1251] eta 0:21:28 lr 0.000415 time 2.7797 (2.2183) loss 2.9569 (3.5012) grad_norm 1.6111 (1.6368) [2022-01-22 23:42:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][680/1251] eta 0:21:08 lr 0.000415 time 1.9916 (2.2217) loss 3.0522 (3.5009) grad_norm 
1.4551 (1.6361) [2022-01-22 23:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][690/1251] eta 0:20:49 lr 0.000415 time 3.4815 (2.2270) loss 3.7871 (3.5051) grad_norm 1.5518 (1.6349) [2022-01-22 23:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][700/1251] eta 0:20:26 lr 0.000414 time 1.8869 (2.2251) loss 4.5414 (3.5041) grad_norm 1.7653 (1.6354) [2022-01-22 23:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][710/1251] eta 0:20:01 lr 0.000414 time 1.6333 (2.2203) loss 3.6974 (3.4983) grad_norm 1.6396 (1.6349) [2022-01-22 23:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][720/1251] eta 0:19:37 lr 0.000414 time 2.2621 (2.2176) loss 4.0349 (3.4974) grad_norm 2.1412 (1.6350) [2022-01-22 23:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][730/1251] eta 0:19:14 lr 0.000414 time 2.2997 (2.2156) loss 3.4776 (3.4937) grad_norm 1.8898 (1.6347) [2022-01-22 23:45:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][740/1251] eta 0:18:51 lr 0.000414 time 2.0457 (2.2146) loss 3.7589 (3.4922) grad_norm 1.6254 (1.6341) [2022-01-22 23:45:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][750/1251] eta 0:18:29 lr 0.000414 time 2.0967 (2.2155) loss 3.3611 (3.4937) grad_norm 1.6103 (1.6333) [2022-01-22 23:45:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][760/1251] eta 0:18:07 lr 0.000414 time 2.2179 (2.2150) loss 3.4231 (3.4952) grad_norm 1.6008 (1.6335) [2022-01-22 23:46:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][770/1251] eta 0:17:45 lr 0.000414 time 1.8325 (2.2146) loss 3.9027 (3.4982) grad_norm 1.6918 (1.6340) [2022-01-22 23:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][780/1251] eta 0:17:23 lr 0.000414 time 2.4355 (2.2148) loss 4.1516 (3.5000) grad_norm 1.7539 (1.6350) [2022-01-22 23:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[167/300][790/1251] eta 0:17:01 lr 0.000414 time 1.6667 (2.2148) loss 2.9682 (3.4996) grad_norm 1.4784 (1.6355) [2022-01-22 23:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][800/1251] eta 0:16:40 lr 0.000414 time 2.8585 (2.2176) loss 3.3491 (3.5009) grad_norm 1.4708 (1.6352) [2022-01-22 23:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][810/1251] eta 0:16:18 lr 0.000414 time 2.1673 (2.2190) loss 3.2747 (3.5004) grad_norm 1.7297 (1.6344) [2022-01-22 23:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][820/1251] eta 0:15:56 lr 0.000414 time 2.5171 (2.2184) loss 3.4173 (3.5015) grad_norm 1.5453 (1.6355) [2022-01-22 23:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][830/1251] eta 0:15:32 lr 0.000414 time 2.2393 (2.2149) loss 3.3179 (3.4981) grad_norm 1.7362 (1.6354) [2022-01-22 23:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][840/1251] eta 0:15:09 lr 0.000414 time 2.5157 (2.2123) loss 3.5327 (3.4983) grad_norm 1.3289 (1.6346) [2022-01-22 23:49:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][850/1251] eta 0:14:46 lr 0.000414 time 1.9804 (2.2103) loss 3.4911 (3.4993) grad_norm 1.4768 (1.6341) [2022-01-22 23:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][860/1251] eta 0:14:23 lr 0.000414 time 1.8631 (2.2096) loss 3.5145 (3.4998) grad_norm 1.6664 (1.6345) [2022-01-22 23:49:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][870/1251] eta 0:14:02 lr 0.000414 time 1.8729 (2.2100) loss 4.1421 (3.5000) grad_norm 1.6025 (1.6358) [2022-01-22 23:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][880/1251] eta 0:13:40 lr 0.000414 time 2.5964 (2.2112) loss 3.5108 (3.4990) grad_norm 1.8474 (1.6374) [2022-01-22 23:50:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][890/1251] eta 0:13:19 lr 0.000414 time 2.9878 (2.2144) loss 2.3672 (3.4966) grad_norm 
1.7382 (1.6379) [2022-01-22 23:50:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][900/1251] eta 0:12:57 lr 0.000414 time 2.1611 (2.2164) loss 2.4716 (3.4946) grad_norm 1.5749 (1.6385) [2022-01-22 23:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][910/1251] eta 0:12:35 lr 0.000414 time 1.8923 (2.2154) loss 2.2618 (3.4971) grad_norm 1.5639 (1.6400) [2022-01-22 23:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][920/1251] eta 0:12:12 lr 0.000414 time 1.9814 (2.2127) loss 3.8354 (3.4971) grad_norm 1.4811 (1.6398) [2022-01-22 23:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][930/1251] eta 0:11:49 lr 0.000414 time 1.8200 (2.2102) loss 3.4984 (3.4953) grad_norm 1.3975 (1.6401) [2022-01-22 23:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][940/1251] eta 0:11:27 lr 0.000414 time 2.4110 (2.2093) loss 3.3163 (3.4951) grad_norm 1.5933 (1.6399) [2022-01-22 23:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][950/1251] eta 0:11:04 lr 0.000413 time 2.1991 (2.2083) loss 2.3560 (3.4929) grad_norm 1.3862 (1.6405) [2022-01-22 23:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][960/1251] eta 0:10:43 lr 0.000413 time 3.3427 (2.2103) loss 3.8691 (3.4919) grad_norm 1.8226 (1.6405) [2022-01-22 23:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][970/1251] eta 0:10:21 lr 0.000413 time 1.6225 (2.2109) loss 4.0391 (3.4946) grad_norm 1.8933 (1.6408) [2022-01-22 23:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][980/1251] eta 0:09:59 lr 0.000413 time 2.8637 (2.2134) loss 2.6060 (3.4935) grad_norm 1.6340 (1.6402) [2022-01-22 23:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][990/1251] eta 0:09:38 lr 0.000413 time 1.7499 (2.2150) loss 3.6168 (3.4937) grad_norm 1.5938 (1.6403) [2022-01-22 23:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[167/300][1000/1251] eta 0:09:15 lr 0.000413 time 2.4767 (2.2145) loss 2.2528 (3.4921) grad_norm 1.4580 (1.6395) [2022-01-22 23:54:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1010/1251] eta 0:08:53 lr 0.000413 time 1.6327 (2.2123) loss 3.3215 (3.4914) grad_norm 1.3819 (1.6391) [2022-01-22 23:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1020/1251] eta 0:08:30 lr 0.000413 time 2.0076 (2.2095) loss 3.3942 (3.4887) grad_norm 1.7394 (1.6388) [2022-01-22 23:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1030/1251] eta 0:08:07 lr 0.000413 time 2.2230 (2.2078) loss 2.4754 (3.4885) grad_norm 1.7876 (1.6383) [2022-01-22 23:55:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1040/1251] eta 0:07:45 lr 0.000413 time 2.8709 (2.2070) loss 4.1436 (3.4897) grad_norm 1.7310 (1.6380) [2022-01-22 23:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1050/1251] eta 0:07:23 lr 0.000413 time 1.9007 (2.2066) loss 3.0706 (3.4892) grad_norm 1.8163 (1.6386) [2022-01-22 23:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1060/1251] eta 0:07:01 lr 0.000413 time 2.1906 (2.2072) loss 2.5253 (3.4876) grad_norm 2.0112 (1.6391) [2022-01-22 23:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1070/1251] eta 0:06:39 lr 0.000413 time 2.4746 (2.2083) loss 2.5344 (3.4882) grad_norm 1.6219 (1.6387) [2022-01-22 23:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1080/1251] eta 0:06:17 lr 0.000413 time 2.5100 (2.2084) loss 4.0787 (3.4901) grad_norm 2.0043 (1.6392) [2022-01-22 23:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1090/1251] eta 0:05:55 lr 0.000413 time 1.7866 (2.2080) loss 3.4720 (3.4885) grad_norm 1.6467 (1.6387) [2022-01-22 23:58:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1100/1251] eta 0:05:33 lr 0.000413 time 1.9778 (2.2076) loss 3.5874 (3.4891) 
grad_norm 1.3802 (1.6382) [2022-01-22 23:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1110/1251] eta 0:05:11 lr 0.000413 time 1.9199 (2.2067) loss 2.7258 (3.4892) grad_norm 1.5770 (1.6377) [2022-01-22 23:58:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1120/1251] eta 0:04:49 lr 0.000413 time 3.1338 (2.2082) loss 3.9518 (3.4884) grad_norm 1.4515 (1.6370) [2022-01-22 23:59:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1130/1251] eta 0:04:27 lr 0.000413 time 2.1300 (2.2089) loss 4.2423 (3.4883) grad_norm 1.5878 (1.6368) [2022-01-22 23:59:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1140/1251] eta 0:04:05 lr 0.000413 time 1.6262 (2.2090) loss 3.7280 (3.4908) grad_norm 1.5149 (1.6369) [2022-01-23 00:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1150/1251] eta 0:03:43 lr 0.000413 time 1.6988 (2.2080) loss 2.7252 (3.4900) grad_norm 1.6332 (1.6372) [2022-01-23 00:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1160/1251] eta 0:03:20 lr 0.000413 time 2.2595 (2.2068) loss 2.3146 (3.4886) grad_norm 1.5270 (1.6368) [2022-01-23 00:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1170/1251] eta 0:02:58 lr 0.000413 time 1.8218 (2.2053) loss 3.3412 (3.4865) grad_norm 1.6083 (1.6367) [2022-01-23 00:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1180/1251] eta 0:02:36 lr 0.000413 time 1.8395 (2.2049) loss 3.8541 (3.4869) grad_norm 1.4668 (1.6362) [2022-01-23 00:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1190/1251] eta 0:02:14 lr 0.000412 time 3.1797 (2.2055) loss 4.0389 (3.4851) grad_norm 1.7569 (1.6362) [2022-01-23 00:01:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1200/1251] eta 0:01:52 lr 0.000412 time 1.4921 (2.2062) loss 4.3045 (3.4856) grad_norm 1.8135 (1.6367) [2022-01-23 00:02:11 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [167/300][1210/1251] eta 0:01:30 lr 0.000412 time 1.5526 (2.2064) loss 2.9501 (3.4826) grad_norm 1.5640 (1.6369) [2022-01-23 00:02:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1220/1251] eta 0:01:08 lr 0.000412 time 2.2057 (2.2072) loss 3.6388 (3.4829) grad_norm 1.5417 (1.6361) [2022-01-23 00:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1230/1251] eta 0:00:46 lr 0.000412 time 3.1180 (2.2076) loss 4.3766 (3.4830) grad_norm 1.8909 (1.6360) [2022-01-23 00:03:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1240/1251] eta 0:00:24 lr 0.000412 time 1.5080 (2.2047) loss 3.8379 (3.4842) grad_norm 1.6302 (1.6364) [2022-01-23 00:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [167/300][1250/1251] eta 0:00:02 lr 0.000412 time 1.1916 (2.1988) loss 2.9706 (3.4811) grad_norm 1.7987 (1.6372) [2022-01-23 00:03:31 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 167 training takes 0:45:51 [2022-01-23 00:03:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 21.030 (21.030) Loss 1.1271 (1.1271) Acc@1 75.000 (75.000) Acc@5 91.211 (91.211) [2022-01-23 00:04:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.264 (3.395) Loss 0.9979 (1.0282) Acc@1 75.391 (76.385) Acc@5 94.434 (93.333) [2022-01-23 00:04:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.608 (2.745) Loss 1.0318 (1.0046) Acc@1 76.172 (76.618) Acc@5 93.555 (93.620) [2022-01-23 00:04:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.294 (2.300) Loss 0.9273 (0.9915) Acc@1 78.418 (76.830) Acc@5 94.434 (93.775) [2022-01-23 00:05:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.786 (2.250) Loss 0.9440 (0.9928) Acc@1 76.855 (76.836) Acc@5 94.629 (93.724) [2022-01-23 00:05:11 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.770 Acc@5 93.672 [2022-01-23 00:05:11 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-01-23 00:05:11 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.82% [2022-01-23 00:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][0/1251] eta 7:26:40 lr 0.000412 time 21.4237 (21.4237) loss 3.8565 (3.8565) grad_norm 1.7680 (1.7680) [2022-01-23 00:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][10/1251] eta 1:22:12 lr 0.000412 time 2.2137 (3.9745) loss 3.7844 (3.4879) grad_norm 1.8817 (1.6538) [2022-01-23 00:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][20/1251] eta 1:04:56 lr 0.000412 time 2.1761 (3.1653) loss 3.5050 (3.4315) grad_norm 1.7188 (1.6597) [2022-01-23 00:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][30/1251] eta 0:58:07 lr 0.000412 time 2.1698 (2.8564) loss 3.7308 (3.4620) grad_norm 1.6619 (1.6233) [2022-01-23 00:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][40/1251] eta 0:54:33 lr 0.000412 time 2.7864 (2.7034) loss 3.5755 (3.4822) grad_norm 1.4763 (1.6381) [2022-01-23 00:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][50/1251] eta 0:53:08 lr 0.000412 time 3.3146 (2.6547) loss 3.9015 (3.5015) grad_norm 1.7223 (1.6434) [2022-01-23 00:07:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][60/1251] eta 0:51:30 lr 0.000412 time 1.8175 (2.5951) loss 3.8317 (3.4879) grad_norm 1.5712 (1.6245) [2022-01-23 00:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][70/1251] eta 0:50:20 lr 0.000412 time 2.3315 (2.5572) loss 4.0382 (3.5025) grad_norm 1.9739 (1.6448) [2022-01-23 00:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][80/1251] eta 0:49:07 lr 0.000412 time 2.4686 (2.5172) loss 3.3984 (3.4923) grad_norm 1.4959 (1.6341) [2022-01-23 00:08:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][90/1251] eta 0:47:30 lr 0.000412 time 
1.8203 (2.4549) loss 3.9972 (3.4942) grad_norm 1.6059 (1.6273) [2022-01-23 00:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][100/1251] eta 0:45:59 lr 0.000412 time 1.9159 (2.3974) loss 3.8907 (3.5195) grad_norm 1.5274 (1.6298) [2022-01-23 00:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][110/1251] eta 0:44:55 lr 0.000412 time 2.5940 (2.3625) loss 3.5740 (3.5019) grad_norm 1.7222 (1.6251) [2022-01-23 00:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][120/1251] eta 0:44:16 lr 0.000412 time 2.1357 (2.3486) loss 2.7983 (3.4836) grad_norm 2.2917 (1.6339) [2022-01-23 00:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][130/1251] eta 0:43:40 lr 0.000412 time 1.6699 (2.3377) loss 3.8115 (3.4802) grad_norm 1.7887 (1.6317) [2022-01-23 00:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][140/1251] eta 0:43:13 lr 0.000412 time 1.9707 (2.3340) loss 2.9562 (3.4872) grad_norm 1.4762 (1.6276) [2022-01-23 00:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][150/1251] eta 0:42:56 lr 0.000412 time 3.2604 (2.3401) loss 3.4064 (3.4668) grad_norm 1.5541 (1.6291) [2022-01-23 00:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][160/1251] eta 0:42:25 lr 0.000412 time 1.9632 (2.3335) loss 2.8209 (3.4606) grad_norm 1.7293 (1.6290) [2022-01-23 00:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][170/1251] eta 0:41:59 lr 0.000412 time 1.6339 (2.3303) loss 3.2646 (3.4640) grad_norm 1.7028 (1.6343) [2022-01-23 00:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][180/1251] eta 0:41:14 lr 0.000412 time 1.6333 (2.3107) loss 3.6234 (3.4759) grad_norm 1.7927 (1.6359) [2022-01-23 00:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][190/1251] eta 0:40:39 lr 0.000411 time 1.9625 (2.2997) loss 3.2349 (3.4451) grad_norm 1.5904 (1.6342) [2022-01-23 00:12:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][200/1251] eta 0:40:07 lr 0.000411 time 1.7920 (2.2903) loss 3.1190 (3.4483) grad_norm 1.5746 (1.6327) [2022-01-23 00:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][210/1251] eta 0:39:46 lr 0.000411 time 2.3142 (2.2921) loss 3.3653 (3.4431) grad_norm 1.8911 (1.6326) [2022-01-23 00:13:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][220/1251] eta 0:39:23 lr 0.000411 time 1.5514 (2.2922) loss 2.7594 (3.4411) grad_norm 1.5834 (1.6331) [2022-01-23 00:13:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][230/1251] eta 0:38:55 lr 0.000411 time 1.7885 (2.2876) loss 3.4174 (3.4416) grad_norm 1.9162 (1.6322) [2022-01-23 00:14:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][240/1251] eta 0:38:23 lr 0.000411 time 1.9103 (2.2788) loss 2.7272 (3.4513) grad_norm 1.4240 (1.6344) [2022-01-23 00:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][250/1251] eta 0:37:54 lr 0.000411 time 1.9311 (2.2724) loss 3.4447 (3.4491) grad_norm 1.7113 (1.6349) [2022-01-23 00:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][260/1251] eta 0:37:22 lr 0.000411 time 2.1673 (2.2632) loss 2.5807 (3.4497) grad_norm 1.5945 (1.6350) [2022-01-23 00:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][270/1251] eta 0:36:58 lr 0.000411 time 2.4844 (2.2611) loss 3.8486 (3.4410) grad_norm 1.7861 (1.6338) [2022-01-23 00:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][280/1251] eta 0:36:42 lr 0.000411 time 1.6020 (2.2682) loss 3.7426 (3.4398) grad_norm 1.7967 (1.6343) [2022-01-23 00:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][290/1251] eta 0:36:19 lr 0.000411 time 1.6917 (2.2683) loss 3.8245 (3.4428) grad_norm 1.9276 (1.6377) [2022-01-23 00:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][300/1251] eta 0:35:57 lr 
0.000411 time 2.0226 (2.2689) loss 3.1180 (3.4349) grad_norm 1.3751 (1.6383) [2022-01-23 00:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][310/1251] eta 0:35:35 lr 0.000411 time 2.2120 (2.2698) loss 3.5845 (3.4461) grad_norm 1.4408 (1.6399) [2022-01-23 00:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][320/1251] eta 0:35:10 lr 0.000411 time 1.9776 (2.2665) loss 3.8637 (3.4426) grad_norm 2.0068 (1.6386) [2022-01-23 00:17:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][330/1251] eta 0:34:38 lr 0.000411 time 1.8682 (2.2568) loss 4.0951 (3.4487) grad_norm 1.6778 (1.6377) [2022-01-23 00:17:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][340/1251] eta 0:34:11 lr 0.000411 time 1.9100 (2.2523) loss 3.8149 (3.4508) grad_norm 1.4855 (1.6343) [2022-01-23 00:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][350/1251] eta 0:33:46 lr 0.000411 time 1.9334 (2.2494) loss 3.1129 (3.4527) grad_norm 1.5164 (1.6336) [2022-01-23 00:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][360/1251] eta 0:33:20 lr 0.000411 time 2.0957 (2.2457) loss 3.1204 (3.4443) grad_norm 1.5358 (1.6313) [2022-01-23 00:19:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][370/1251] eta 0:32:57 lr 0.000411 time 1.9324 (2.2451) loss 3.6041 (3.4443) grad_norm 1.7176 (1.6305) [2022-01-23 00:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][380/1251] eta 0:32:36 lr 0.000411 time 2.5451 (2.2466) loss 3.9798 (3.4465) grad_norm 1.6455 (1.6308) [2022-01-23 00:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][390/1251] eta 0:32:19 lr 0.000411 time 1.8531 (2.2521) loss 3.3590 (3.4495) grad_norm 1.7402 (1.6330) [2022-01-23 00:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][400/1251] eta 0:31:54 lr 0.000411 time 2.3405 (2.2502) loss 3.5102 (3.4535) grad_norm 1.5534 (1.6357) [2022-01-23 00:20:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][410/1251] eta 0:31:29 lr 0.000411 time 1.6095 (2.2472) loss 4.1181 (3.4594) grad_norm 1.5569 (1.6369) [2022-01-23 00:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][420/1251] eta 0:31:03 lr 0.000411 time 1.6076 (2.2424) loss 3.3748 (3.4548) grad_norm 1.5271 (1.6360) [2022-01-23 00:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][430/1251] eta 0:30:36 lr 0.000410 time 1.9823 (2.2370) loss 2.5325 (3.4495) grad_norm 1.5070 (1.6357) [2022-01-23 00:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][440/1251] eta 0:30:10 lr 0.000410 time 2.2073 (2.2328) loss 2.4497 (3.4455) grad_norm 1.9794 (1.6374) [2022-01-23 00:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][450/1251] eta 0:29:46 lr 0.000410 time 1.8330 (2.2304) loss 3.8044 (3.4469) grad_norm 1.8536 (1.6393) [2022-01-23 00:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][460/1251] eta 0:29:24 lr 0.000410 time 2.5983 (2.2312) loss 3.9622 (3.4492) grad_norm 1.6624 (1.6422) [2022-01-23 00:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][470/1251] eta 0:29:03 lr 0.000410 time 1.9512 (2.2329) loss 3.6572 (3.4521) grad_norm 1.4728 (1.6414) [2022-01-23 00:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][480/1251] eta 0:28:43 lr 0.000410 time 2.8178 (2.2354) loss 3.0474 (3.4495) grad_norm 1.5834 (1.6420) [2022-01-23 00:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][490/1251] eta 0:28:21 lr 0.000410 time 1.6397 (2.2355) loss 2.6839 (3.4478) grad_norm 1.5346 (1.6405) [2022-01-23 00:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][500/1251] eta 0:27:59 lr 0.000410 time 1.8192 (2.2365) loss 3.1906 (3.4458) grad_norm 1.6148 (1.6404) [2022-01-23 00:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][510/1251] eta 0:27:38 lr 
0.000410 time 1.9175 (2.2383) loss 3.9834 (3.4465) grad_norm 1.7464 (1.6411) [2022-01-23 00:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][520/1251] eta 0:27:12 lr 0.000410 time 2.2725 (2.2328) loss 3.6888 (3.4379) grad_norm 1.9217 (1.6447) [2022-01-23 00:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][530/1251] eta 0:26:46 lr 0.000410 time 2.0092 (2.2276) loss 4.2276 (3.4349) grad_norm 1.6345 (1.6440) [2022-01-23 00:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][540/1251] eta 0:26:21 lr 0.000410 time 2.1254 (2.2238) loss 3.4097 (3.4287) grad_norm 1.3402 (1.6447) [2022-01-23 00:25:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][550/1251] eta 0:25:57 lr 0.000410 time 1.8373 (2.2215) loss 4.3600 (3.4326) grad_norm 1.5813 (1.6437) [2022-01-23 00:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][560/1251] eta 0:25:36 lr 0.000410 time 2.6176 (2.2229) loss 3.3198 (3.4373) grad_norm 1.7091 (1.6441) [2022-01-23 00:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][570/1251] eta 0:25:13 lr 0.000410 time 2.1645 (2.2230) loss 3.6298 (3.4366) grad_norm 2.0023 (1.6465) [2022-01-23 00:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][580/1251] eta 0:24:52 lr 0.000410 time 2.8076 (2.2244) loss 3.7253 (3.4403) grad_norm 1.5436 (1.6451) [2022-01-23 00:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][590/1251] eta 0:24:29 lr 0.000410 time 1.6802 (2.2227) loss 3.5355 (3.4408) grad_norm 1.6335 (1.6459) [2022-01-23 00:27:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][600/1251] eta 0:24:07 lr 0.000410 time 2.0625 (2.2237) loss 3.6791 (3.4412) grad_norm 1.6455 (1.6455) [2022-01-23 00:27:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][610/1251] eta 0:23:45 lr 0.000410 time 2.2674 (2.2239) loss 3.6354 (3.4407) grad_norm 1.5650 (1.6460) [2022-01-23 00:28:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][620/1251] eta 0:23:23 lr 0.000410 time 2.7235 (2.2242) loss 3.9571 (3.4406) grad_norm 1.4785 (1.6460) [2022-01-23 00:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][630/1251] eta 0:23:01 lr 0.000410 time 1.5464 (2.2239) loss 4.1336 (3.4390) grad_norm 1.5633 (1.6462) [2022-01-23 00:28:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][640/1251] eta 0:22:39 lr 0.000410 time 2.1663 (2.2245) loss 2.8119 (3.4391) grad_norm 1.4668 (1.6444) [2022-01-23 00:29:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][650/1251] eta 0:22:16 lr 0.000410 time 1.8915 (2.2242) loss 3.9345 (3.4377) grad_norm 1.7755 (1.6520) [2022-01-23 00:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][660/1251] eta 0:21:55 lr 0.000410 time 1.8857 (2.2257) loss 3.7713 (3.4412) grad_norm 1.4773 (1.6544) [2022-01-23 00:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][670/1251] eta 0:21:31 lr 0.000410 time 1.9346 (2.2236) loss 3.5452 (3.4405) grad_norm 1.6555 (1.6556) [2022-01-23 00:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][680/1251] eta 0:21:08 lr 0.000409 time 1.9528 (2.2216) loss 3.7265 (3.4410) grad_norm 1.8640 (1.6552) [2022-01-23 00:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][690/1251] eta 0:20:45 lr 0.000409 time 1.8578 (2.2199) loss 4.1191 (3.4410) grad_norm 1.7060 (1.6550) [2022-01-23 00:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][700/1251] eta 0:20:23 lr 0.000409 time 2.4314 (2.2214) loss 2.8927 (3.4370) grad_norm 1.6833 (1.6558) [2022-01-23 00:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][710/1251] eta 0:20:01 lr 0.000409 time 2.2684 (2.2209) loss 3.8297 (3.4375) grad_norm 1.5622 (1.6540) [2022-01-23 00:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][720/1251] eta 0:19:38 lr 
0.000409 time 1.5789 (2.2202) loss 3.4122 (3.4403) grad_norm 1.8216 (1.6541) [2022-01-23 00:32:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][730/1251] eta 0:19:16 lr 0.000409 time 1.7200 (2.2189) loss 3.9870 (3.4439) grad_norm 1.4691 (1.6545) [2022-01-23 00:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][740/1251] eta 0:18:54 lr 0.000409 time 1.6636 (2.2193) loss 3.3693 (3.4446) grad_norm 1.7097 (1.6544) [2022-01-23 00:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][750/1251] eta 0:18:30 lr 0.000409 time 1.7494 (2.2167) loss 3.0402 (3.4425) grad_norm 1.6359 (1.6557) [2022-01-23 00:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][760/1251] eta 0:18:06 lr 0.000409 time 1.8399 (2.2138) loss 3.1534 (3.4389) grad_norm 1.5498 (1.6559) [2022-01-23 00:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][770/1251] eta 0:17:44 lr 0.000409 time 2.1836 (2.2126) loss 2.7239 (3.4410) grad_norm 1.6945 (1.6561) [2022-01-23 00:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][780/1251] eta 0:17:22 lr 0.000409 time 2.3233 (2.2130) loss 4.0564 (3.4432) grad_norm 1.4646 (1.6570) [2022-01-23 00:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][790/1251] eta 0:16:59 lr 0.000409 time 1.7896 (2.2116) loss 4.0058 (3.4432) grad_norm 1.5655 (1.6583) [2022-01-23 00:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][800/1251] eta 0:16:37 lr 0.000409 time 2.5574 (2.2118) loss 2.4530 (3.4442) grad_norm 1.2767 (1.6566) [2022-01-23 00:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][810/1251] eta 0:16:15 lr 0.000409 time 1.7801 (2.2111) loss 3.3982 (3.4480) grad_norm 1.4744 (1.6562) [2022-01-23 00:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][820/1251] eta 0:15:53 lr 0.000409 time 1.9303 (2.2121) loss 3.1691 (3.4510) grad_norm 1.5072 (1.6559) [2022-01-23 00:35:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][830/1251] eta 0:15:31 lr 0.000409 time 2.2464 (2.2116) loss 3.7598 (3.4553) grad_norm 1.7783 (1.6555) [2022-01-23 00:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][840/1251] eta 0:15:09 lr 0.000409 time 1.9495 (2.2117) loss 3.0976 (3.4576) grad_norm 1.4955 (1.6554) [2022-01-23 00:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][850/1251] eta 0:14:47 lr 0.000409 time 2.0287 (2.2123) loss 3.6929 (3.4574) grad_norm 1.5255 (1.6544) [2022-01-23 00:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][860/1251] eta 0:14:25 lr 0.000409 time 2.5258 (2.2138) loss 3.7125 (3.4586) grad_norm 1.7004 (1.6532) [2022-01-23 00:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][870/1251] eta 0:14:02 lr 0.000409 time 1.9839 (2.2123) loss 3.6464 (3.4588) grad_norm 1.6242 (1.6516) [2022-01-23 00:37:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][880/1251] eta 0:13:39 lr 0.000409 time 1.8731 (2.2099) loss 3.1637 (3.4608) grad_norm 1.6259 (1.6515) [2022-01-23 00:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][890/1251] eta 0:13:17 lr 0.000409 time 1.8873 (2.2084) loss 3.4853 (3.4612) grad_norm 1.5631 (1.6523) [2022-01-23 00:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][900/1251] eta 0:12:56 lr 0.000409 time 3.4413 (2.2120) loss 3.6727 (3.4615) grad_norm 1.5623 (1.6522) [2022-01-23 00:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][910/1251] eta 0:12:35 lr 0.000409 time 2.1647 (2.2141) loss 3.4903 (3.4626) grad_norm 1.5960 (1.6520) [2022-01-23 00:39:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][920/1251] eta 0:12:12 lr 0.000409 time 2.2593 (2.2125) loss 4.0131 (3.4625) grad_norm 1.7349 (1.6519) [2022-01-23 00:39:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][930/1251] eta 0:11:49 lr 
0.000408 time 1.9691 (2.2096) loss 3.7002 (3.4604) grad_norm 1.5330 (1.6508) [2022-01-23 00:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][940/1251] eta 0:11:26 lr 0.000408 time 1.9287 (2.2065) loss 3.7240 (3.4572) grad_norm 1.6195 (1.6500) [2022-01-23 00:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][950/1251] eta 0:11:03 lr 0.000408 time 2.4222 (2.2054) loss 3.7505 (3.4580) grad_norm 1.4273 (1.6493) [2022-01-23 00:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][960/1251] eta 0:10:41 lr 0.000408 time 1.9523 (2.2045) loss 3.9953 (3.4603) grad_norm 1.5110 (1.6488) [2022-01-23 00:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][970/1251] eta 0:10:19 lr 0.000408 time 1.8205 (2.2035) loss 3.2534 (3.4595) grad_norm 1.5778 (1.6505) [2022-01-23 00:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][980/1251] eta 0:09:57 lr 0.000408 time 2.0364 (2.2035) loss 4.0242 (3.4624) grad_norm 1.7664 (1.6500) [2022-01-23 00:41:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][990/1251] eta 0:09:35 lr 0.000408 time 2.4784 (2.2037) loss 3.4085 (3.4624) grad_norm 1.7032 (1.6504) [2022-01-23 00:41:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1000/1251] eta 0:09:13 lr 0.000408 time 2.4799 (2.2051) loss 2.5123 (3.4620) grad_norm 1.6995 (1.6507) [2022-01-23 00:42:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1010/1251] eta 0:08:51 lr 0.000408 time 2.4319 (2.2074) loss 2.5134 (3.4592) grad_norm 1.5071 (1.6505) [2022-01-23 00:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1020/1251] eta 0:08:30 lr 0.000408 time 2.6933 (2.2095) loss 2.9198 (3.4605) grad_norm 1.7341 (1.6512) [2022-01-23 00:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1030/1251] eta 0:08:07 lr 0.000408 time 1.8969 (2.2081) loss 4.0415 (3.4604) grad_norm 1.5638 (1.6510) [2022-01-23 
00:43:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1040/1251] eta 0:07:45 lr 0.000408 time 1.7251 (2.2083) loss 3.7158 (3.4610) grad_norm 1.6372 (1.6510) [2022-01-23 00:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1050/1251] eta 0:07:23 lr 0.000408 time 1.8512 (2.2055) loss 3.4347 (3.4647) grad_norm 1.4954 (1.6504) [2022-01-23 00:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1060/1251] eta 0:07:00 lr 0.000408 time 1.9061 (2.2028) loss 4.0108 (3.4683) grad_norm 1.8173 (1.6509) [2022-01-23 00:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1070/1251] eta 0:06:38 lr 0.000408 time 2.1899 (2.2012) loss 3.9201 (3.4688) grad_norm 1.6089 (1.6502) [2022-01-23 00:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1080/1251] eta 0:06:16 lr 0.000408 time 2.2059 (2.2011) loss 2.5840 (3.4702) grad_norm 2.0184 (1.6500) [2022-01-23 00:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1090/1251] eta 0:05:54 lr 0.000408 time 1.8852 (2.2009) loss 3.2110 (3.4690) grad_norm 1.7221 (1.6500) [2022-01-23 00:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1100/1251] eta 0:05:32 lr 0.000408 time 2.5291 (2.2024) loss 3.3222 (3.4635) grad_norm 1.5705 (1.6502) [2022-01-23 00:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1110/1251] eta 0:05:10 lr 0.000408 time 2.2585 (2.2039) loss 2.6824 (3.4587) grad_norm 1.6959 (1.6504) [2022-01-23 00:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1120/1251] eta 0:04:48 lr 0.000408 time 2.1683 (2.2049) loss 3.5392 (3.4584) grad_norm 1.6616 (1.6499) [2022-01-23 00:46:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1130/1251] eta 0:04:26 lr 0.000408 time 2.8224 (2.2048) loss 3.7529 (3.4589) grad_norm 1.6472 (1.6496) [2022-01-23 00:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1140/1251] 
eta 0:04:04 lr 0.000408 time 2.2147 (2.2031) loss 3.3830 (3.4580) grad_norm 1.6865 (1.6491) [2022-01-23 00:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1150/1251] eta 0:03:42 lr 0.000408 time 2.1710 (2.2016) loss 3.1285 (3.4569) grad_norm 1.2937 (1.6486) [2022-01-23 00:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1160/1251] eta 0:03:20 lr 0.000408 time 2.8394 (2.2013) loss 3.5822 (3.4541) grad_norm 1.3791 (1.6480) [2022-01-23 00:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1170/1251] eta 0:02:58 lr 0.000407 time 1.9053 (2.2003) loss 2.4113 (3.4529) grad_norm 1.6087 (1.6475) [2022-01-23 00:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1180/1251] eta 0:02:36 lr 0.000407 time 2.3658 (2.1992) loss 2.7774 (3.4509) grad_norm 1.5951 (1.6480) [2022-01-23 00:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1190/1251] eta 0:02:14 lr 0.000407 time 2.2776 (2.1992) loss 3.7517 (3.4508) grad_norm 1.5280 (1.6489) [2022-01-23 00:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1200/1251] eta 0:01:52 lr 0.000407 time 2.9311 (2.2016) loss 3.3355 (3.4503) grad_norm 1.4692 (1.6485) [2022-01-23 00:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1210/1251] eta 0:01:30 lr 0.000407 time 2.1817 (2.2018) loss 3.5613 (3.4493) grad_norm 1.5741 (1.6484) [2022-01-23 00:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1220/1251] eta 0:01:08 lr 0.000407 time 2.5586 (2.2018) loss 3.6651 (3.4503) grad_norm 1.5263 (1.6480) [2022-01-23 00:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1230/1251] eta 0:00:46 lr 0.000407 time 1.7238 (2.2010) loss 3.7236 (3.4523) grad_norm 1.5916 (1.6478) [2022-01-23 00:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1240/1251] eta 0:00:24 lr 0.000407 time 1.2877 (2.1992) loss 3.2931 (3.4535) grad_norm 1.7453 
(1.6478) [2022-01-23 00:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [168/300][1250/1251] eta 0:00:02 lr 0.000407 time 1.3184 (2.1934) loss 4.0180 (3.4531) grad_norm 2.0515 (1.6476) [2022-01-23 00:50:55 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 168 training takes 0:45:44 [2022-01-23 00:51:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.954 (18.954) Loss 0.9862 (0.9862) Acc@1 77.832 (77.832) Acc@5 93.652 (93.652) [2022-01-23 00:51:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.939 (3.422) Loss 0.8987 (0.9942) Acc@1 79.492 (77.033) Acc@5 94.531 (93.377) [2022-01-23 00:51:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.945 (2.608) Loss 1.0069 (0.9898) Acc@1 75.488 (76.930) Acc@5 93.555 (93.555) [2022-01-23 00:52:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.211 (2.279) Loss 0.9938 (0.9947) Acc@1 76.367 (76.764) Acc@5 93.652 (93.536) [2022-01-23 00:52:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.171 (2.200) Loss 0.9371 (0.9883) Acc@1 76.270 (76.798) Acc@5 94.727 (93.707) [2022-01-23 00:52:32 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.770 Acc@5 93.684 [2022-01-23 00:52:32 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-01-23 00:52:32 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.82% [2022-01-23 00:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][0/1251] eta 7:18:39 lr 0.000407 time 21.0390 (21.0390) loss 3.8790 (3.8790) grad_norm 1.6041 (1.6041) [2022-01-23 00:53:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][10/1251] eta 1:25:14 lr 0.000407 time 2.7289 (4.1209) loss 3.5494 (3.1078) grad_norm 1.5822 (1.5884) [2022-01-23 00:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][20/1251] eta 1:04:21 lr 0.000407 time 1.5447 (3.1369) loss 
3.3560 (3.2049) grad_norm 1.5670 (1.5979) [2022-01-23 00:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][30/1251] eta 0:57:59 lr 0.000407 time 1.8435 (2.8500) loss 2.2442 (3.2467) grad_norm 1.6579 (1.5823) [2022-01-23 00:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][40/1251] eta 0:54:49 lr 0.000407 time 3.9959 (2.7159) loss 3.0375 (3.2195) grad_norm 1.4389 (1.6080) [2022-01-23 00:54:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][50/1251] eta 0:52:36 lr 0.000407 time 1.5441 (2.6279) loss 3.5522 (3.2939) grad_norm 1.5670 (1.5993) [2022-01-23 00:55:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][60/1251] eta 0:50:40 lr 0.000407 time 2.1861 (2.5530) loss 3.8511 (3.2798) grad_norm 1.5877 (1.6132) [2022-01-23 00:55:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][70/1251] eta 0:49:22 lr 0.000407 time 1.9124 (2.5087) loss 4.1700 (3.3198) grad_norm 1.6757 (1.6264) [2022-01-23 00:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][80/1251] eta 0:48:14 lr 0.000407 time 2.8561 (2.4721) loss 3.4924 (3.3281) grad_norm 1.7073 (1.6245) [2022-01-23 00:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][90/1251] eta 0:47:31 lr 0.000407 time 1.9244 (2.4558) loss 4.2369 (3.3662) grad_norm 1.5567 (1.6151) [2022-01-23 00:56:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][100/1251] eta 0:46:27 lr 0.000407 time 2.4489 (2.4214) loss 3.8139 (3.3890) grad_norm 1.8898 (1.6201) [2022-01-23 00:56:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][110/1251] eta 0:45:30 lr 0.000407 time 1.9178 (2.3935) loss 3.8900 (3.3949) grad_norm 1.6009 (1.6299) [2022-01-23 00:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][120/1251] eta 0:44:47 lr 0.000407 time 2.8708 (2.3764) loss 3.4555 (3.3955) grad_norm 1.9751 (1.6358) [2022-01-23 00:57:43 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [169/300][130/1251] eta 0:44:14 lr 0.000407 time 2.6457 (2.3679) loss 3.3607 (3.3947) grad_norm 1.7174 (1.6349) [2022-01-23 00:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][140/1251] eta 0:43:45 lr 0.000407 time 2.7832 (2.3633) loss 4.2286 (3.4080) grad_norm 1.8186 (1.6300) [2022-01-23 00:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][150/1251] eta 0:43:10 lr 0.000407 time 1.8933 (2.3532) loss 3.0441 (3.4056) grad_norm 1.4226 (1.6299) [2022-01-23 00:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][160/1251] eta 0:42:38 lr 0.000407 time 2.5711 (2.3452) loss 3.0191 (3.4040) grad_norm 1.5170 (1.6336) [2022-01-23 00:59:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][170/1251] eta 0:41:53 lr 0.000406 time 1.7850 (2.3254) loss 3.6356 (3.4000) grad_norm 1.7601 (1.6368) [2022-01-23 00:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][180/1251] eta 0:41:10 lr 0.000406 time 2.1676 (2.3063) loss 3.5178 (3.4002) grad_norm 1.7678 (1.6375) [2022-01-23 00:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][190/1251] eta 0:40:30 lr 0.000406 time 2.0645 (2.2912) loss 2.8653 (3.4002) grad_norm 1.4202 (1.6316) [2022-01-23 01:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][200/1251] eta 0:40:00 lr 0.000406 time 2.6033 (2.2839) loss 3.9458 (3.3920) grad_norm 1.5966 (1.6325) [2022-01-23 01:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][210/1251] eta 0:39:32 lr 0.000406 time 2.1966 (2.2795) loss 3.4932 (3.3996) grad_norm 1.4153 (1.6283) [2022-01-23 01:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][220/1251] eta 0:39:09 lr 0.000406 time 2.5751 (2.2784) loss 3.6283 (3.4149) grad_norm 2.0687 (1.6347) [2022-01-23 01:01:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][230/1251] eta 0:38:39 lr 0.000406 time 1.4256 (2.2721) loss 3.2310 
(3.4247) grad_norm 1.7215 (1.6328) [2022-01-23 01:01:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][240/1251] eta 0:38:23 lr 0.000406 time 2.6017 (2.2783) loss 3.0130 (3.4218) grad_norm 1.4109 (1.6294) [2022-01-23 01:02:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][250/1251] eta 0:38:01 lr 0.000406 time 2.4346 (2.2792) loss 3.3092 (3.4250) grad_norm 1.6463 (1.6267) [2022-01-23 01:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][260/1251] eta 0:37:33 lr 0.000406 time 2.0073 (2.2741) loss 3.6045 (3.4141) grad_norm 1.5053 (1.6259) [2022-01-23 01:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][270/1251] eta 0:37:06 lr 0.000406 time 1.5959 (2.2693) loss 3.0639 (3.4144) grad_norm 1.5074 (1.6259) [2022-01-23 01:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][280/1251] eta 0:36:37 lr 0.000406 time 1.8818 (2.2636) loss 3.4627 (3.4170) grad_norm 1.4523 (1.6234) [2022-01-23 01:03:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][290/1251] eta 0:36:16 lr 0.000406 time 2.5962 (2.2644) loss 3.8141 (3.4187) grad_norm 1.7704 (1.6234) [2022-01-23 01:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][300/1251] eta 0:35:49 lr 0.000406 time 2.2335 (2.2601) loss 3.5445 (3.4272) grad_norm 1.4387 (1.6244) [2022-01-23 01:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][310/1251] eta 0:35:25 lr 0.000406 time 1.9473 (2.2592) loss 3.3108 (3.4243) grad_norm 1.6800 (1.6218) [2022-01-23 01:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][320/1251] eta 0:35:00 lr 0.000406 time 2.2148 (2.2557) loss 3.3311 (3.4261) grad_norm 1.5959 (1.6208) [2022-01-23 01:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][330/1251] eta 0:34:37 lr 0.000406 time 2.2491 (2.2557) loss 4.1058 (3.4328) grad_norm 1.6109 (1.6194) [2022-01-23 01:05:19 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [169/300][340/1251] eta 0:34:08 lr 0.000406 time 2.1729 (2.2489) loss 3.7639 (3.4332) grad_norm 1.5410 (1.6238) [2022-01-23 01:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][350/1251] eta 0:33:39 lr 0.000406 time 1.5698 (2.2416) loss 3.1329 (3.4285) grad_norm 1.7868 (1.6283) [2022-01-23 01:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][360/1251] eta 0:33:15 lr 0.000406 time 2.2092 (2.2391) loss 3.8780 (3.4346) grad_norm 1.5552 (1.6286) [2022-01-23 01:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][370/1251] eta 0:32:54 lr 0.000406 time 2.4535 (2.2415) loss 3.3250 (3.4408) grad_norm 2.5197 (1.6311) [2022-01-23 01:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][380/1251] eta 0:32:29 lr 0.000406 time 2.1135 (2.2377) loss 3.5056 (3.4418) grad_norm 1.7916 (1.6320) [2022-01-23 01:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][390/1251] eta 0:32:08 lr 0.000406 time 2.4357 (2.2393) loss 4.0341 (3.4479) grad_norm 1.7032 (1.6361) [2022-01-23 01:07:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][400/1251] eta 0:31:47 lr 0.000406 time 2.0019 (2.2411) loss 3.4838 (3.4340) grad_norm 1.7682 (1.6395) [2022-01-23 01:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][410/1251] eta 0:31:21 lr 0.000405 time 1.9358 (2.2371) loss 3.6176 (3.4382) grad_norm 1.8197 (1.6400) [2022-01-23 01:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][420/1251] eta 0:30:54 lr 0.000405 time 2.2295 (2.2315) loss 3.6394 (3.4360) grad_norm 1.5275 (1.6393) [2022-01-23 01:08:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][430/1251] eta 0:30:26 lr 0.000405 time 1.7086 (2.2253) loss 3.7447 (3.4401) grad_norm 1.7290 (1.6401) [2022-01-23 01:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][440/1251] eta 0:30:01 lr 0.000405 time 2.4531 (2.2217) loss 4.0642 
(3.4476) grad_norm 1.6932 (1.6408) [2022-01-23 01:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][450/1251] eta 0:29:34 lr 0.000405 time 1.7891 (2.2155) loss 2.6633 (3.4470) grad_norm 1.5942 (1.6398) [2022-01-23 01:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][460/1251] eta 0:29:10 lr 0.000405 time 2.4584 (2.2131) loss 3.6660 (3.4427) grad_norm 1.8296 (1.6400) [2022-01-23 01:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][470/1251] eta 0:28:48 lr 0.000405 time 2.0462 (2.2132) loss 3.8634 (3.4482) grad_norm 1.5312 (1.6394) [2022-01-23 01:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][480/1251] eta 0:28:27 lr 0.000405 time 1.8992 (2.2150) loss 3.9848 (3.4516) grad_norm 1.6931 (1.6394) [2022-01-23 01:10:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][490/1251] eta 0:28:07 lr 0.000405 time 2.2320 (2.2178) loss 3.5082 (3.4480) grad_norm 1.6735 (1.6383) [2022-01-23 01:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][500/1251] eta 0:27:46 lr 0.000405 time 1.9350 (2.2186) loss 3.6554 (3.4504) grad_norm 1.4086 (1.6375) [2022-01-23 01:11:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][510/1251] eta 0:27:25 lr 0.000405 time 2.4631 (2.2205) loss 2.8785 (3.4516) grad_norm 1.7554 (1.6368) [2022-01-23 01:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][520/1251] eta 0:27:04 lr 0.000405 time 1.7212 (2.2223) loss 3.7991 (3.4519) grad_norm 1.7954 (1.6371) [2022-01-23 01:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][530/1251] eta 0:26:42 lr 0.000405 time 1.9040 (2.2228) loss 3.7344 (3.4528) grad_norm 1.7847 (1.6412) [2022-01-23 01:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][540/1251] eta 0:26:20 lr 0.000405 time 1.8265 (2.2231) loss 4.0050 (3.4576) grad_norm 1.6809 (1.6437) [2022-01-23 01:12:57 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [169/300][550/1251] eta 0:25:58 lr 0.000405 time 2.1978 (2.2229) loss 3.5640 (3.4574) grad_norm 1.5360 (1.6433) [2022-01-23 01:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][560/1251] eta 0:25:34 lr 0.000405 time 1.9172 (2.2205) loss 3.9209 (3.4541) grad_norm 1.5104 (1.6465) [2022-01-23 01:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][570/1251] eta 0:25:11 lr 0.000405 time 3.4310 (2.2193) loss 3.8256 (3.4625) grad_norm 1.5187 (1.6466) [2022-01-23 01:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][580/1251] eta 0:24:48 lr 0.000405 time 2.0039 (2.2182) loss 3.8347 (3.4656) grad_norm 1.7689 (1.6471) [2022-01-23 01:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][590/1251] eta 0:24:25 lr 0.000405 time 2.8268 (2.2169) loss 2.5785 (3.4672) grad_norm 1.7725 (1.6485) [2022-01-23 01:14:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][600/1251] eta 0:24:02 lr 0.000405 time 2.0881 (2.2154) loss 2.8344 (3.4676) grad_norm 1.4696 (1.6479) [2022-01-23 01:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][610/1251] eta 0:23:38 lr 0.000405 time 1.8918 (2.2137) loss 3.4799 (3.4703) grad_norm 1.8266 (1.6496) [2022-01-23 01:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][620/1251] eta 0:23:14 lr 0.000405 time 1.9468 (2.2101) loss 2.5070 (3.4684) grad_norm 1.5165 (1.6473) [2022-01-23 01:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][630/1251] eta 0:22:52 lr 0.000405 time 2.2091 (2.2098) loss 3.2818 (3.4632) grad_norm 1.5223 (1.6457) [2022-01-23 01:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][640/1251] eta 0:22:30 lr 0.000405 time 2.5155 (2.2098) loss 3.8415 (3.4642) grad_norm 1.7812 (1.6445) [2022-01-23 01:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][650/1251] eta 0:22:08 lr 0.000405 time 1.5647 (2.2105) loss 3.8687 
(3.4661) grad_norm 1.4194 (1.6438) [2022-01-23 01:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][660/1251] eta 0:21:47 lr 0.000404 time 2.3897 (2.2121) loss 3.6785 (3.4627) grad_norm 1.5197 (1.6418) [2022-01-23 01:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][670/1251] eta 0:21:24 lr 0.000404 time 1.9706 (2.2116) loss 3.8637 (3.4635) grad_norm 1.4907 (1.6416) [2022-01-23 01:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][680/1251] eta 0:21:03 lr 0.000404 time 2.8539 (2.2129) loss 3.0913 (3.4639) grad_norm 1.4714 (1.6407) [2022-01-23 01:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][690/1251] eta 0:20:41 lr 0.000404 time 1.6338 (2.2125) loss 2.8507 (3.4631) grad_norm 2.0027 (1.6402) [2022-01-23 01:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][700/1251] eta 0:20:18 lr 0.000404 time 1.9306 (2.2113) loss 4.0166 (3.4655) grad_norm 1.5381 (1.6397) [2022-01-23 01:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][710/1251] eta 0:19:55 lr 0.000404 time 1.9411 (2.2096) loss 3.7194 (3.4665) grad_norm 1.7163 (1.6398) [2022-01-23 01:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][720/1251] eta 0:19:34 lr 0.000404 time 3.1006 (2.2112) loss 2.9273 (3.4659) grad_norm 1.5845 (1.6394) [2022-01-23 01:19:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][730/1251] eta 0:19:10 lr 0.000404 time 1.6122 (2.2075) loss 3.7110 (3.4670) grad_norm 1.7131 (1.6392) [2022-01-23 01:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][740/1251] eta 0:18:47 lr 0.000404 time 2.3559 (2.2056) loss 4.0287 (3.4684) grad_norm 1.8399 (1.6403) [2022-01-23 01:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][750/1251] eta 0:18:23 lr 0.000404 time 1.8491 (2.2033) loss 4.2418 (3.4687) grad_norm 1.5542 (1.6402) [2022-01-23 01:20:30 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [169/300][760/1251] eta 0:18:02 lr 0.000404 time 2.1075 (2.2042) loss 3.7850 (3.4690) grad_norm 1.7529 (1.6408) [2022-01-23 01:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][770/1251] eta 0:17:39 lr 0.000404 time 1.5638 (2.2025) loss 4.0211 (3.4696) grad_norm 1.5452 (1.6401) [2022-01-23 01:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][780/1251] eta 0:17:17 lr 0.000404 time 3.3479 (2.2038) loss 2.5266 (3.4694) grad_norm 1.6155 (1.6411) [2022-01-23 01:21:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][790/1251] eta 0:16:56 lr 0.000404 time 2.2210 (2.2040) loss 3.7841 (3.4674) grad_norm 1.6336 (1.6415) [2022-01-23 01:21:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][800/1251] eta 0:16:34 lr 0.000404 time 2.2039 (2.2045) loss 4.2256 (3.4686) grad_norm 1.7113 (1.6418) [2022-01-23 01:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][810/1251] eta 0:16:12 lr 0.000404 time 1.8632 (2.2045) loss 3.9422 (3.4664) grad_norm 1.5612 (1.6421) [2022-01-23 01:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][820/1251] eta 0:15:51 lr 0.000404 time 2.6733 (2.2068) loss 3.7401 (3.4660) grad_norm 1.7919 (1.6442) [2022-01-23 01:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][830/1251] eta 0:15:28 lr 0.000404 time 1.7936 (2.2051) loss 3.7007 (3.4644) grad_norm 1.5697 (1.6429) [2022-01-23 01:23:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][840/1251] eta 0:15:05 lr 0.000404 time 1.6523 (2.2040) loss 3.5760 (3.4644) grad_norm 1.7669 (1.6445) [2022-01-23 01:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][850/1251] eta 0:14:42 lr 0.000404 time 1.9209 (2.2016) loss 3.5064 (3.4644) grad_norm 1.7295 (1.6450) [2022-01-23 01:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][860/1251] eta 0:14:21 lr 0.000404 time 3.3761 (2.2037) loss 3.6499 
(3.4652) grad_norm 1.5797 (1.6454) [2022-01-23 01:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][870/1251] eta 0:14:00 lr 0.000404 time 2.3843 (2.2049) loss 3.1223 (3.4639) grad_norm 1.5342 (1.6442) [2022-01-23 01:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][880/1251] eta 0:13:38 lr 0.000404 time 1.6293 (2.2049) loss 2.5687 (3.4618) grad_norm 1.8602 (1.6462) [2022-01-23 01:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][890/1251] eta 0:13:16 lr 0.000404 time 2.1706 (2.2054) loss 2.7369 (3.4622) grad_norm 1.5953 (1.6468) [2022-01-23 01:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][900/1251] eta 0:12:53 lr 0.000404 time 2.1904 (2.2038) loss 4.0542 (3.4619) grad_norm 1.6107 (1.6469) [2022-01-23 01:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][910/1251] eta 0:12:31 lr 0.000403 time 2.6685 (2.2025) loss 2.4476 (3.4596) grad_norm 1.5849 (1.6470) [2022-01-23 01:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][920/1251] eta 0:12:08 lr 0.000403 time 2.4868 (2.2020) loss 3.7667 (3.4599) grad_norm 1.7987 (1.6485) [2022-01-23 01:26:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][930/1251] eta 0:11:46 lr 0.000403 time 1.8917 (2.2000) loss 3.4522 (3.4609) grad_norm 1.8884 (1.6492) [2022-01-23 01:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][940/1251] eta 0:11:24 lr 0.000403 time 2.6766 (2.1998) loss 4.2313 (3.4595) grad_norm 1.6552 (1.6492) [2022-01-23 01:27:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][950/1251] eta 0:11:01 lr 0.000403 time 2.2672 (2.1993) loss 3.6519 (3.4593) grad_norm 1.5375 (1.6484) [2022-01-23 01:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][960/1251] eta 0:10:40 lr 0.000403 time 2.3957 (2.2000) loss 3.6306 (3.4602) grad_norm 1.4629 (1.6471) [2022-01-23 01:28:09 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [169/300][970/1251] eta 0:10:18 lr 0.000403 time 1.9540 (2.2005) loss 3.8318 (3.4609) grad_norm 1.6819 (1.6472) [2022-01-23 01:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][980/1251] eta 0:09:56 lr 0.000403 time 2.6120 (2.2007) loss 2.8748 (3.4628) grad_norm 1.7659 (1.6475) [2022-01-23 01:28:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][990/1251] eta 0:09:34 lr 0.000403 time 1.8469 (2.2014) loss 4.0128 (3.4654) grad_norm 1.8227 (1.6478) [2022-01-23 01:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1000/1251] eta 0:09:12 lr 0.000403 time 2.8346 (2.2027) loss 4.0619 (3.4670) grad_norm 1.5720 (1.6468) [2022-01-23 01:29:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1010/1251] eta 0:08:50 lr 0.000403 time 1.7611 (2.2029) loss 3.0635 (3.4661) grad_norm 1.6868 (1.6477) [2022-01-23 01:30:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1020/1251] eta 0:08:28 lr 0.000403 time 1.9763 (2.2015) loss 4.1828 (3.4693) grad_norm 1.7179 (1.6480) [2022-01-23 01:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1030/1251] eta 0:08:06 lr 0.000403 time 1.9446 (2.2002) loss 2.9015 (3.4686) grad_norm 1.6934 (1.6474) [2022-01-23 01:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1040/1251] eta 0:07:44 lr 0.000403 time 2.5841 (2.2002) loss 3.5512 (3.4690) grad_norm 1.7708 (1.6476) [2022-01-23 01:31:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1050/1251] eta 0:07:22 lr 0.000403 time 1.5295 (2.2004) loss 3.4077 (3.4704) grad_norm 1.4149 (1.6461) [2022-01-23 01:31:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1060/1251] eta 0:07:00 lr 0.000403 time 2.0663 (2.1997) loss 2.5224 (3.4678) grad_norm 1.6942 (1.6455) [2022-01-23 01:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1070/1251] eta 0:06:38 lr 0.000403 time 1.6095 (2.1992) loss 
3.6606 (3.4693) grad_norm 1.5809 (1.6446) [2022-01-23 01:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1080/1251] eta 0:06:16 lr 0.000403 time 2.7790 (2.1996) loss 4.0209 (3.4703) grad_norm 2.0531 (1.6447) [2022-01-23 01:32:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1090/1251] eta 0:05:54 lr 0.000403 time 1.5070 (2.2008) loss 2.9405 (3.4695) grad_norm 1.6236 (1.6436) [2022-01-23 01:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1100/1251] eta 0:05:32 lr 0.000403 time 1.5969 (2.2006) loss 3.6002 (3.4676) grad_norm 1.7408 (1.6440) [2022-01-23 01:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1110/1251] eta 0:05:10 lr 0.000403 time 1.8881 (2.2005) loss 3.4197 (3.4668) grad_norm 1.6936 (1.6446) [2022-01-23 01:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1120/1251] eta 0:04:48 lr 0.000403 time 2.2451 (2.1997) loss 4.2278 (3.4669) grad_norm 1.7597 (1.6447) [2022-01-23 01:34:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1130/1251] eta 0:04:26 lr 0.000403 time 1.5493 (2.2000) loss 3.0402 (3.4679) grad_norm 1.4898 (1.6447) [2022-01-23 01:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1140/1251] eta 0:04:04 lr 0.000403 time 1.7680 (2.1984) loss 3.6074 (3.4703) grad_norm 1.6802 (1.6440) [2022-01-23 01:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1150/1251] eta 0:03:42 lr 0.000402 time 2.1137 (2.1981) loss 4.1251 (3.4710) grad_norm 1.4832 (1.6434) [2022-01-23 01:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1160/1251] eta 0:03:20 lr 0.000402 time 3.1060 (2.1984) loss 3.7150 (3.4720) grad_norm 1.5302 (1.6424) [2022-01-23 01:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1170/1251] eta 0:02:58 lr 0.000402 time 3.1399 (2.1992) loss 2.3976 (3.4703) grad_norm 1.5360 (1.6422) [2022-01-23 01:35:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1180/1251] eta 0:02:36 lr 0.000402 time 2.1242 (2.1994) loss 2.6632 (3.4706) grad_norm 1.9157 (1.6420) [2022-01-23 01:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1190/1251] eta 0:02:14 lr 0.000402 time 2.2010 (2.1998) loss 3.3214 (3.4696) grad_norm 1.4941 (1.6418) [2022-01-23 01:36:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1200/1251] eta 0:01:52 lr 0.000402 time 2.5340 (2.1999) loss 2.2893 (3.4673) grad_norm 1.5100 (1.6417) [2022-01-23 01:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1210/1251] eta 0:01:30 lr 0.000402 time 1.9530 (2.1993) loss 2.4239 (3.4637) grad_norm 1.5058 (1.6414) [2022-01-23 01:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1220/1251] eta 0:01:08 lr 0.000402 time 2.0614 (2.1986) loss 4.3608 (3.4648) grad_norm 1.5091 (1.6409) [2022-01-23 01:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1230/1251] eta 0:00:46 lr 0.000402 time 1.9352 (2.1985) loss 3.9314 (3.4666) grad_norm 1.4785 (1.6400) [2022-01-23 01:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1240/1251] eta 0:00:24 lr 0.000402 time 1.4478 (2.1979) loss 3.1306 (3.4654) grad_norm 1.6982 (1.6397) [2022-01-23 01:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [169/300][1250/1251] eta 0:00:02 lr 0.000402 time 1.1680 (2.1932) loss 3.0005 (3.4666) grad_norm 1.4773 (1.6390) [2022-01-23 01:38:16 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 169 training takes 0:45:44 [2022-01-23 01:38:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.769 (17.769) Loss 1.0747 (1.0747) Acc@1 74.805 (74.805) Acc@5 91.699 (91.699) [2022-01-23 01:38:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.241 (3.364) Loss 1.0556 (1.0091) Acc@1 74.023 (76.465) Acc@5 92.383 (93.457) [2022-01-23 01:39:11 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.313 (2.601) Loss 0.9111 (1.0006) Acc@1 78.223 (76.660) Acc@5 94.531 (93.652) [2022-01-23 01:39:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.926 (2.367) Loss 1.0549 (1.0034) Acc@1 76.465 (76.588) Acc@5 93.457 (93.655) [2022-01-23 01:39:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.142 (2.221) Loss 1.0334 (1.0004) Acc@1 75.781 (76.705) Acc@5 93.457 (93.681) [2022-01-23 01:39:55 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.788 Acc@5 93.638 [2022-01-23 01:39:55 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-01-23 01:39:55 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.82% [2022-01-23 01:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][0/1251] eta 7:11:06 lr 0.000402 time 20.6766 (20.6766) loss 2.6924 (2.6924) grad_norm 1.9016 (1.9016) [2022-01-23 01:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][10/1251] eta 1:26:21 lr 0.000402 time 2.8956 (4.1754) loss 3.4584 (3.0901) grad_norm 1.4476 (1.6243) [2022-01-23 01:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][20/1251] eta 1:06:26 lr 0.000402 time 2.2670 (3.2384) loss 2.9825 (3.2157) grad_norm 1.4809 (1.6206) [2022-01-23 01:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][30/1251] eta 0:58:07 lr 0.000402 time 1.5956 (2.8564) loss 3.5302 (3.2948) grad_norm 2.0070 (1.6144) [2022-01-23 01:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][40/1251] eta 0:54:58 lr 0.000402 time 3.6213 (2.7235) loss 2.7223 (3.4001) grad_norm 1.6639 (1.6076) [2022-01-23 01:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][50/1251] eta 0:52:29 lr 0.000402 time 1.9398 (2.6221) loss 3.7080 (3.4013) grad_norm 1.7187 (1.6023) [2022-01-23 01:42:32 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [170/300][60/1251] eta 0:51:06 lr 0.000402 time 1.8607 (2.5743) loss 3.8372 (3.3956) grad_norm 1.4205 (1.5915) [2022-01-23 01:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][70/1251] eta 0:49:36 lr 0.000402 time 1.6518 (2.5204) loss 3.8551 (3.4310) grad_norm 1.8437 (1.6038) [2022-01-23 01:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][80/1251] eta 0:48:52 lr 0.000402 time 3.6249 (2.5044) loss 3.6316 (3.4432) grad_norm 1.6099 (1.6120) [2022-01-23 01:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][90/1251] eta 0:48:00 lr 0.000402 time 1.8869 (2.4808) loss 2.3879 (3.4413) grad_norm 1.7487 (1.6243) [2022-01-23 01:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][100/1251] eta 0:46:40 lr 0.000402 time 1.7240 (2.4327) loss 3.8378 (3.4493) grad_norm 1.6525 (1.6256) [2022-01-23 01:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][110/1251] eta 0:45:45 lr 0.000402 time 2.2712 (2.4060) loss 3.5641 (3.4708) grad_norm 1.7587 (1.6257) [2022-01-23 01:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][120/1251] eta 0:45:13 lr 0.000402 time 2.4967 (2.3994) loss 3.7948 (3.4766) grad_norm 1.8708 (1.6229) [2022-01-23 01:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][130/1251] eta 0:44:30 lr 0.000402 time 2.3899 (2.3822) loss 2.9890 (3.4678) grad_norm 1.7418 (1.6250) [2022-01-23 01:45:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][140/1251] eta 0:43:43 lr 0.000402 time 1.5876 (2.3612) loss 3.5165 (3.4617) grad_norm 1.6428 (1.6249) [2022-01-23 01:45:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][150/1251] eta 0:42:58 lr 0.000401 time 2.1948 (2.3415) loss 3.5488 (3.4838) grad_norm 1.7148 (1.6344) [2022-01-23 01:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][160/1251] eta 0:42:21 lr 0.000401 time 2.2017 (2.3293) loss 2.8890 (3.4749) 
grad_norm 1.9036 (1.6549) [2022-01-23 01:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][170/1251] eta 0:41:57 lr 0.000401 time 3.0142 (2.3287) loss 3.3632 (3.4685) grad_norm 1.5061 (1.6560) [2022-01-23 01:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][180/1251] eta 0:41:19 lr 0.000401 time 2.2521 (2.3147) loss 4.2011 (3.4799) grad_norm 1.8361 (1.6532) [2022-01-23 01:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][190/1251] eta 0:40:46 lr 0.000401 time 2.1986 (2.3055) loss 3.4235 (3.4814) grad_norm 1.3447 (1.6503) [2022-01-23 01:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][200/1251] eta 0:40:13 lr 0.000401 time 2.0450 (2.2968) loss 2.8119 (3.4671) grad_norm 1.6722 (1.6483) [2022-01-23 01:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][210/1251] eta 0:39:44 lr 0.000401 time 2.1780 (2.2903) loss 2.9711 (3.4626) grad_norm 1.8040 (1.6505) [2022-01-23 01:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][220/1251] eta 0:39:10 lr 0.000401 time 2.5888 (2.2798) loss 4.1344 (3.4565) grad_norm 1.5243 (1.6514) [2022-01-23 01:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][230/1251] eta 0:38:46 lr 0.000401 time 2.6165 (2.2785) loss 3.5526 (3.4622) grad_norm 1.5327 (1.6493) [2022-01-23 01:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][240/1251] eta 0:38:19 lr 0.000401 time 2.2310 (2.2744) loss 3.5585 (3.4505) grad_norm 1.5920 (1.6516) [2022-01-23 01:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][250/1251] eta 0:37:57 lr 0.000401 time 2.2930 (2.2752) loss 4.1246 (3.4480) grad_norm 1.5571 (1.6480) [2022-01-23 01:49:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][260/1251] eta 0:37:25 lr 0.000401 time 1.7745 (2.2664) loss 2.6454 (3.4512) grad_norm 1.4935 (1.6483) [2022-01-23 01:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [170/300][270/1251] eta 0:36:57 lr 0.000401 time 1.8877 (2.2608) loss 2.5356 (3.4424) grad_norm 1.5459 (1.6462) [2022-01-23 01:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][280/1251] eta 0:36:29 lr 0.000401 time 2.2445 (2.2547) loss 3.8858 (3.4524) grad_norm 1.3507 (1.6433) [2022-01-23 01:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][290/1251] eta 0:36:02 lr 0.000401 time 2.1137 (2.2502) loss 3.2891 (3.4518) grad_norm 1.5232 (1.6442) [2022-01-23 01:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][300/1251] eta 0:35:35 lr 0.000401 time 1.8913 (2.2454) loss 3.2857 (3.4574) grad_norm 1.7912 (1.6445) [2022-01-23 01:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][310/1251] eta 0:35:10 lr 0.000401 time 1.8668 (2.2424) loss 3.2596 (3.4545) grad_norm 1.6567 (1.6431) [2022-01-23 01:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][320/1251] eta 0:34:48 lr 0.000401 time 2.7473 (2.2430) loss 3.9167 (3.4570) grad_norm 1.6160 (1.6412) [2022-01-23 01:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][330/1251] eta 0:34:28 lr 0.000401 time 2.3198 (2.2462) loss 2.5571 (3.4552) grad_norm 1.5833 (1.6397) [2022-01-23 01:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][340/1251] eta 0:34:07 lr 0.000401 time 2.5767 (2.2473) loss 3.7884 (3.4548) grad_norm 1.6686 (1.6387) [2022-01-23 01:53:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][350/1251] eta 0:33:42 lr 0.000401 time 1.9681 (2.2445) loss 3.2327 (3.4592) grad_norm 1.7173 (1.6385) [2022-01-23 01:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][360/1251] eta 0:33:18 lr 0.000401 time 1.8343 (2.2427) loss 3.9705 (3.4581) grad_norm 1.8093 (1.6426) [2022-01-23 01:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][370/1251] eta 0:32:56 lr 0.000401 time 2.6827 (2.2440) loss 3.4791 (3.4566) 
grad_norm 1.8631 (1.6430) [2022-01-23 01:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][380/1251] eta 0:32:33 lr 0.000401 time 2.3506 (2.2434) loss 4.0871 (3.4559) grad_norm 1.6522 (1.6440) [2022-01-23 01:54:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][390/1251] eta 0:32:08 lr 0.000401 time 1.9700 (2.2403) loss 2.9564 (3.4560) grad_norm 1.8436 (1.6451) [2022-01-23 01:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][400/1251] eta 0:31:41 lr 0.000400 time 1.7268 (2.2348) loss 4.1816 (3.4562) grad_norm 1.8086 (1.6456) [2022-01-23 01:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][410/1251] eta 0:31:20 lr 0.000400 time 2.4364 (2.2356) loss 3.7650 (3.4620) grad_norm 1.7664 (1.6464) [2022-01-23 01:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][420/1251] eta 0:30:54 lr 0.000400 time 1.6949 (2.2311) loss 2.9711 (3.4639) grad_norm 1.7708 (1.6453) [2022-01-23 01:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][430/1251] eta 0:30:28 lr 0.000400 time 1.9635 (2.2276) loss 2.9752 (3.4628) grad_norm 1.7961 (1.6466) [2022-01-23 01:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][440/1251] eta 0:30:06 lr 0.000400 time 1.6022 (2.2278) loss 3.5970 (3.4590) grad_norm 1.7225 (1.6452) [2022-01-23 01:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][450/1251] eta 0:29:46 lr 0.000400 time 3.3720 (2.2306) loss 2.8742 (3.4498) grad_norm 1.5373 (1.6444) [2022-01-23 01:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][460/1251] eta 0:29:24 lr 0.000400 time 1.8089 (2.2311) loss 3.7459 (3.4505) grad_norm 1.6829 (1.6443) [2022-01-23 01:57:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][470/1251] eta 0:29:02 lr 0.000400 time 2.0730 (2.2308) loss 3.8854 (3.4517) grad_norm 1.6695 (1.6444) [2022-01-23 01:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [170/300][480/1251] eta 0:28:38 lr 0.000400 time 2.0093 (2.2289) loss 3.1554 (3.4524) grad_norm 1.4467 (1.6438) [2022-01-23 01:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][490/1251] eta 0:28:15 lr 0.000400 time 2.7265 (2.2275) loss 3.0980 (3.4486) grad_norm 1.8587 (1.6447) [2022-01-23 01:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][500/1251] eta 0:27:49 lr 0.000400 time 2.2424 (2.2234) loss 3.6957 (3.4553) grad_norm 1.4215 (1.6471) [2022-01-23 01:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][510/1251] eta 0:27:23 lr 0.000400 time 1.5634 (2.2176) loss 3.1740 (3.4512) grad_norm 2.0473 (1.6476) [2022-01-23 01:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][520/1251] eta 0:26:59 lr 0.000400 time 1.7856 (2.2148) loss 3.5517 (3.4519) grad_norm 1.7546 (1.6481) [2022-01-23 01:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][530/1251] eta 0:26:37 lr 0.000400 time 2.7539 (2.2159) loss 2.8560 (3.4540) grad_norm 1.6085 (1.6491) [2022-01-23 01:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][540/1251] eta 0:26:15 lr 0.000400 time 1.8047 (2.2158) loss 3.1199 (3.4518) grad_norm 1.6642 (1.6523) [2022-01-23 02:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][550/1251] eta 0:25:53 lr 0.000400 time 2.2059 (2.2160) loss 3.0974 (3.4565) grad_norm 1.4701 (1.6535) [2022-01-23 02:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][560/1251] eta 0:25:32 lr 0.000400 time 2.6303 (2.2177) loss 3.3432 (3.4564) grad_norm 1.7472 (1.6543) [2022-01-23 02:01:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][570/1251] eta 0:25:11 lr 0.000400 time 1.4792 (2.2201) loss 4.5789 (3.4561) grad_norm 2.0840 (1.6542) [2022-01-23 02:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][580/1251] eta 0:24:48 lr 0.000400 time 1.8931 (2.2186) loss 3.0758 (3.4603) 
grad_norm 1.6772 (1.6536) [2022-01-23 02:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][590/1251] eta 0:24:25 lr 0.000400 time 1.7582 (2.2173) loss 3.5692 (3.4581) grad_norm 1.9158 (1.6538) [2022-01-23 02:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][600/1251] eta 0:24:03 lr 0.000400 time 2.2013 (2.2179) loss 3.6324 (3.4569) grad_norm 1.6798 (1.6547) [2022-01-23 02:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][610/1251] eta 0:23:40 lr 0.000400 time 1.8588 (2.2164) loss 3.1490 (3.4551) grad_norm 1.6022 (1.6555) [2022-01-23 02:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][620/1251] eta 0:23:17 lr 0.000400 time 2.1633 (2.2142) loss 4.2093 (3.4551) grad_norm 1.7801 (1.6550) [2022-01-23 02:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][630/1251] eta 0:22:54 lr 0.000400 time 1.9361 (2.2140) loss 3.4980 (3.4524) grad_norm 1.6793 (1.6559) [2022-01-23 02:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][640/1251] eta 0:22:32 lr 0.000399 time 1.8996 (2.2133) loss 3.3986 (3.4536) grad_norm 1.8139 (1.6570) [2022-01-23 02:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][650/1251] eta 0:22:08 lr 0.000399 time 1.9728 (2.2112) loss 4.0367 (3.4538) grad_norm 1.5282 (1.6556) [2022-01-23 02:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][660/1251] eta 0:21:46 lr 0.000399 time 1.8441 (2.2113) loss 3.3338 (3.4535) grad_norm 1.6824 (1.6550) [2022-01-23 02:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][670/1251] eta 0:21:26 lr 0.000399 time 3.0985 (2.2147) loss 3.2336 (3.4520) grad_norm 1.7188 (1.6555) [2022-01-23 02:05:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][680/1251] eta 0:21:06 lr 0.000399 time 1.9192 (2.2172) loss 3.5008 (3.4552) grad_norm 2.0147 (1.6536) [2022-01-23 02:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [170/300][690/1251] eta 0:20:44 lr 0.000399 time 2.7944 (2.2187) loss 3.5130 (3.4571) grad_norm 1.6805 (1.6528) [2022-01-23 02:05:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][700/1251] eta 0:20:21 lr 0.000399 time 1.8053 (2.2167) loss 3.8734 (3.4566) grad_norm 2.0249 (1.6542) [2022-01-23 02:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][710/1251] eta 0:19:56 lr 0.000399 time 1.7008 (2.2124) loss 3.5493 (3.4557) grad_norm 1.4888 (1.6562) [2022-01-23 02:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][720/1251] eta 0:19:32 lr 0.000399 time 1.6898 (2.2089) loss 3.4217 (3.4535) grad_norm 1.7447 (1.6551) [2022-01-23 02:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][730/1251] eta 0:19:09 lr 0.000399 time 2.5060 (2.2060) loss 2.5314 (3.4552) grad_norm 1.6089 (1.6557) [2022-01-23 02:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][740/1251] eta 0:18:47 lr 0.000399 time 2.5440 (2.2061) loss 4.0238 (3.4586) grad_norm 1.4679 (1.6540) [2022-01-23 02:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][750/1251] eta 0:18:25 lr 0.000399 time 2.7754 (2.2065) loss 2.1903 (3.4607) grad_norm 1.6522 (1.6559) [2022-01-23 02:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][760/1251] eta 0:18:03 lr 0.000399 time 1.9469 (2.2069) loss 4.3682 (3.4643) grad_norm 1.6931 (1.6560) [2022-01-23 02:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][770/1251] eta 0:17:41 lr 0.000399 time 2.2942 (2.2070) loss 2.5605 (3.4617) grad_norm 1.7157 (1.6557) [2022-01-23 02:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][780/1251] eta 0:17:20 lr 0.000399 time 1.9542 (2.2097) loss 3.1513 (3.4613) grad_norm 1.4153 (1.6559) [2022-01-23 02:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][790/1251] eta 0:16:59 lr 0.000399 time 2.2237 (2.2115) loss 4.1208 (3.4582) 
grad_norm 1.6021 (1.6552) [2022-01-23 02:09:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][800/1251] eta 0:16:38 lr 0.000399 time 1.8411 (2.2141) loss 3.7136 (3.4609) grad_norm 1.6400 (1.6556) [2022-01-23 02:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][810/1251] eta 0:16:15 lr 0.000399 time 1.6894 (2.2121) loss 4.1986 (3.4645) grad_norm 1.7293 (1.6566) [2022-01-23 02:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][820/1251] eta 0:15:53 lr 0.000399 time 1.9981 (2.2119) loss 4.2899 (3.4687) grad_norm 1.9920 (1.6588) [2022-01-23 02:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][830/1251] eta 0:15:30 lr 0.000399 time 1.8713 (2.2094) loss 3.5516 (3.4700) grad_norm 1.6005 (1.6587) [2022-01-23 02:10:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][840/1251] eta 0:15:07 lr 0.000399 time 2.1817 (2.2084) loss 3.6297 (3.4719) grad_norm 1.5495 (1.6576) [2022-01-23 02:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][850/1251] eta 0:14:45 lr 0.000399 time 1.9873 (2.2077) loss 3.5542 (3.4724) grad_norm 1.6308 (1.6574) [2022-01-23 02:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][860/1251] eta 0:14:22 lr 0.000399 time 2.2277 (2.2061) loss 3.3767 (3.4726) grad_norm 1.4816 (1.6575) [2022-01-23 02:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][870/1251] eta 0:13:59 lr 0.000399 time 1.9427 (2.2040) loss 3.9370 (3.4742) grad_norm 1.4919 (1.6566) [2022-01-23 02:12:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][880/1251] eta 0:13:37 lr 0.000399 time 1.9601 (2.2029) loss 4.0865 (3.4714) grad_norm 1.5556 (1.6566) [2022-01-23 02:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][890/1251] eta 0:13:15 lr 0.000398 time 2.1870 (2.2037) loss 4.2937 (3.4710) grad_norm 1.7553 (1.6561) [2022-01-23 02:13:02 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [170/300][900/1251] eta 0:12:53 lr 0.000398 time 2.4467 (2.2045) loss 2.7704 (3.4723) grad_norm 1.7972 (1.6564) [2022-01-23 02:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][910/1251] eta 0:12:32 lr 0.000398 time 2.4022 (2.2069) loss 2.9655 (3.4733) grad_norm 1.6507 (1.6564) [2022-01-23 02:13:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][920/1251] eta 0:12:10 lr 0.000398 time 1.8870 (2.2067) loss 4.0846 (3.4732) grad_norm 1.4440 (1.6561) [2022-01-23 02:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][930/1251] eta 0:11:48 lr 0.000398 time 2.3975 (2.2076) loss 3.4479 (3.4721) grad_norm 1.6044 (1.6555) [2022-01-23 02:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][940/1251] eta 0:11:26 lr 0.000398 time 1.7904 (2.2066) loss 4.0102 (3.4723) grad_norm 1.5786 (1.6554) [2022-01-23 02:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][950/1251] eta 0:11:03 lr 0.000398 time 2.2141 (2.2050) loss 3.8488 (3.4728) grad_norm 1.5524 (1.6556) [2022-01-23 02:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][960/1251] eta 0:10:41 lr 0.000398 time 2.1112 (2.2038) loss 3.9889 (3.4733) grad_norm 1.6516 (1.6566) [2022-01-23 02:15:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][970/1251] eta 0:10:19 lr 0.000398 time 1.9816 (2.2030) loss 2.4425 (3.4732) grad_norm 1.6718 (1.6567) [2022-01-23 02:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][980/1251] eta 0:09:56 lr 0.000398 time 1.9823 (2.2017) loss 3.5031 (3.4743) grad_norm 1.7528 (1.6566) [2022-01-23 02:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][990/1251] eta 0:09:34 lr 0.000398 time 2.5667 (2.2015) loss 2.7256 (3.4709) grad_norm 1.9588 (1.6573) [2022-01-23 02:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1000/1251] eta 0:09:12 lr 0.000398 time 2.2808 (2.2018) loss 3.8164 (3.4694) 
grad_norm 1.5332 (1.6572) [2022-01-23 02:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1010/1251] eta 0:08:51 lr 0.000398 time 2.8327 (2.2035) loss 3.5079 (3.4696) grad_norm 1.3717 (1.6562) [2022-01-23 02:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1020/1251] eta 0:08:28 lr 0.000398 time 1.9084 (2.2034) loss 3.7192 (3.4702) grad_norm 1.6563 (1.6563) [2022-01-23 02:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1030/1251] eta 0:08:06 lr 0.000398 time 1.9726 (2.2035) loss 3.6018 (3.4681) grad_norm 1.5736 (1.6569) [2022-01-23 02:18:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1040/1251] eta 0:07:44 lr 0.000398 time 1.9439 (2.2035) loss 4.1014 (3.4683) grad_norm 1.7864 (1.6572) [2022-01-23 02:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1050/1251] eta 0:07:22 lr 0.000398 time 2.3755 (2.2036) loss 2.4357 (3.4662) grad_norm 1.5783 (1.6568) [2022-01-23 02:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1060/1251] eta 0:07:00 lr 0.000398 time 2.2108 (2.2032) loss 3.7950 (3.4647) grad_norm 1.6088 (1.6569) [2022-01-23 02:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1070/1251] eta 0:06:38 lr 0.000398 time 1.6639 (2.2030) loss 2.5471 (3.4627) grad_norm 1.5884 (1.6575) [2022-01-23 02:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1080/1251] eta 0:06:16 lr 0.000398 time 2.5330 (2.2045) loss 2.6838 (3.4613) grad_norm 1.5079 (1.6576) [2022-01-23 02:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1090/1251] eta 0:05:54 lr 0.000398 time 1.7967 (2.2042) loss 3.9848 (3.4605) grad_norm 1.6332 (1.6579) [2022-01-23 02:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1100/1251] eta 0:05:32 lr 0.000398 time 1.7300 (2.2019) loss 3.9160 (3.4605) grad_norm 1.6313 (1.6579) [2022-01-23 02:20:39 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [170/300][1110/1251] eta 0:05:10 lr 0.000398 time 1.7044 (2.1998) loss 3.1228 (3.4605) grad_norm 1.7869 (1.6576) [2022-01-23 02:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1120/1251] eta 0:04:48 lr 0.000398 time 2.2206 (2.1999) loss 3.3561 (3.4615) grad_norm 1.6456 (1.6572) [2022-01-23 02:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1130/1251] eta 0:04:26 lr 0.000398 time 2.7464 (2.2000) loss 3.6420 (3.4615) grad_norm 1.5552 (1.6567) [2022-01-23 02:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1140/1251] eta 0:04:04 lr 0.000397 time 1.8628 (2.2001) loss 3.5184 (3.4599) grad_norm 1.5372 (1.6565) [2022-01-23 02:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1150/1251] eta 0:03:42 lr 0.000397 time 1.8986 (2.2006) loss 3.8624 (3.4595) grad_norm 1.6640 (1.6564) [2022-01-23 02:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1160/1251] eta 0:03:20 lr 0.000397 time 2.4813 (2.2021) loss 3.4310 (3.4606) grad_norm 1.6223 (1.6562) [2022-01-23 02:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1170/1251] eta 0:02:58 lr 0.000397 time 2.1958 (2.2032) loss 3.7316 (3.4613) grad_norm 1.6791 (1.6569) [2022-01-23 02:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1180/1251] eta 0:02:36 lr 0.000397 time 2.3924 (2.2039) loss 3.2980 (3.4618) grad_norm 1.4484 (1.6564) [2022-01-23 02:23:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1190/1251] eta 0:02:14 lr 0.000397 time 1.6194 (2.2035) loss 3.8921 (3.4620) grad_norm 1.6484 (1.6564) [2022-01-23 02:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1200/1251] eta 0:01:52 lr 0.000397 time 2.2130 (2.2013) loss 3.8672 (3.4634) grad_norm 1.5919 (1.6561) [2022-01-23 02:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1210/1251] eta 0:01:30 lr 0.000397 time 1.6402 (2.1993) loss 
2.9517 (3.4625) grad_norm 1.6070 (1.6561) [2022-01-23 02:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1220/1251] eta 0:01:08 lr 0.000397 time 2.4353 (2.1989) loss 3.4249 (3.4616) grad_norm 1.7355 (1.6555) [2022-01-23 02:25:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1230/1251] eta 0:00:46 lr 0.000397 time 2.3366 (2.1992) loss 2.5489 (3.4608) grad_norm 1.5942 (1.6552) [2022-01-23 02:25:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1240/1251] eta 0:00:24 lr 0.000397 time 2.3877 (2.1990) loss 3.9452 (3.4614) grad_norm 1.5677 (1.6549) [2022-01-23 02:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [170/300][1250/1251] eta 0:00:02 lr 0.000397 time 1.1535 (2.1938) loss 3.3544 (3.4613) grad_norm 1.5343 (1.6545) [2022-01-23 02:25:40 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 170 training takes 0:45:44 [2022-01-23 02:25:40 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_170 saving...... [2022-01-23 02:25:51 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_170 saved !!! 
[2022-01-23 02:26:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 13.797 (13.797) Loss 1.0238 (1.0238) Acc@1 76.660 (76.660) Acc@5 92.285 (92.285) [2022-01-23 02:26:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.702 (2.693) Loss 0.9360 (0.9709) Acc@1 77.734 (76.971) Acc@5 93.262 (93.288) [2022-01-23 02:26:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.017 (2.178) Loss 1.0112 (0.9623) Acc@1 76.465 (77.214) Acc@5 93.848 (93.569) [2022-01-23 02:26:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.327 (2.133) Loss 0.9291 (0.9597) Acc@1 76.758 (77.205) Acc@5 94.336 (93.621) [2022-01-23 02:27:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 0.809 (1.981) Loss 0.9691 (0.9669) Acc@1 75.781 (76.925) Acc@5 93.262 (93.593) [2022-01-23 02:27:21 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.842 Acc@5 93.642 [2022-01-23 02:27:21 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.8% [2022-01-23 02:27:21 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.84% [2022-01-23 02:27:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][0/1251] eta 7:38:10 lr 0.000397 time 21.9750 (21.9750) loss 4.4041 (4.4041) grad_norm 2.0500 (2.0500) [2022-01-23 02:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][10/1251] eta 1:24:20 lr 0.000397 time 2.1080 (4.0774) loss 2.2593 (3.2920) grad_norm 1.5752 (1.6375) [2022-01-23 02:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][20/1251] eta 1:03:10 lr 0.000397 time 2.1989 (3.0795) loss 3.7557 (3.3668) grad_norm 1.6852 (1.6219) [2022-01-23 02:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][30/1251] eta 0:56:28 lr 0.000397 time 1.5101 (2.7753) loss 3.5393 (3.3819) grad_norm 1.7638 (1.6074) [2022-01-23 02:29:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[171/300][40/1251] eta 0:53:24 lr 0.000397 time 3.2581 (2.6466) loss 3.2047 (3.3073) grad_norm 1.3662 (1.6420) [2022-01-23 02:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][50/1251] eta 0:52:22 lr 0.000397 time 2.8570 (2.6164) loss 2.5644 (3.3587) grad_norm 1.7695 (1.6586) [2022-01-23 02:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][60/1251] eta 0:50:46 lr 0.000397 time 2.4108 (2.5579) loss 3.1533 (3.3418) grad_norm 1.5959 (1.6463) [2022-01-23 02:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][70/1251] eta 0:49:07 lr 0.000397 time 1.7754 (2.4962) loss 3.0414 (3.3344) grad_norm 1.6654 (1.6543) [2022-01-23 02:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][80/1251] eta 0:47:37 lr 0.000397 time 2.7811 (2.4403) loss 3.2972 (3.3394) grad_norm 1.8627 (1.6505) [2022-01-23 02:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][90/1251] eta 0:46:27 lr 0.000397 time 2.3602 (2.4011) loss 3.8553 (3.3565) grad_norm 1.8440 (1.6495) [2022-01-23 02:31:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][100/1251] eta 0:45:34 lr 0.000397 time 2.2527 (2.3754) loss 2.5557 (3.3782) grad_norm 1.8934 (1.6519) [2022-01-23 02:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][110/1251] eta 0:44:45 lr 0.000397 time 1.6728 (2.3539) loss 4.3756 (3.3840) grad_norm 1.7906 (1.6540) [2022-01-23 02:32:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][120/1251] eta 0:44:27 lr 0.000397 time 2.5974 (2.3585) loss 3.7932 (3.4008) grad_norm 1.8346 (1.6597) [2022-01-23 02:32:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][130/1251] eta 0:44:10 lr 0.000396 time 2.7222 (2.3647) loss 3.8340 (3.3987) grad_norm 1.7012 (1.6587) [2022-01-23 02:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][140/1251] eta 0:43:44 lr 0.000396 time 2.8851 (2.3623) loss 3.5355 (3.4192) grad_norm 1.6517 
(1.6582) [2022-01-23 02:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][150/1251] eta 0:42:59 lr 0.000396 time 1.6404 (2.3432) loss 3.4931 (3.4262) grad_norm 1.7803 (1.6603) [2022-01-23 02:33:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][160/1251] eta 0:42:05 lr 0.000396 time 1.8278 (2.3151) loss 2.2494 (3.4240) grad_norm 1.3693 (1.6570) [2022-01-23 02:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][170/1251] eta 0:41:29 lr 0.000396 time 2.1293 (2.3030) loss 3.7580 (3.4115) grad_norm 1.5120 (1.6514) [2022-01-23 02:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][180/1251] eta 0:40:52 lr 0.000396 time 2.4638 (2.2899) loss 2.9029 (3.4070) grad_norm 1.6895 (1.6556) [2022-01-23 02:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][190/1251] eta 0:40:33 lr 0.000396 time 2.8444 (2.2931) loss 3.5945 (3.4162) grad_norm 1.5710 (1.6505) [2022-01-23 02:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][200/1251] eta 0:40:05 lr 0.000396 time 2.1869 (2.2885) loss 3.2359 (3.4166) grad_norm 1.5149 (1.6478) [2022-01-23 02:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][210/1251] eta 0:39:43 lr 0.000396 time 2.2314 (2.2896) loss 3.7842 (3.4105) grad_norm 2.0229 (1.6554) [2022-01-23 02:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][220/1251] eta 0:39:26 lr 0.000396 time 2.9116 (2.2958) loss 3.3275 (3.4231) grad_norm 1.5330 (1.6568) [2022-01-23 02:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][230/1251] eta 0:38:57 lr 0.000396 time 2.1754 (2.2896) loss 3.3955 (3.4180) grad_norm 1.8356 (1.6544) [2022-01-23 02:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][240/1251] eta 0:38:22 lr 0.000396 time 1.8814 (2.2777) loss 3.9978 (3.4219) grad_norm 1.7212 (1.6548) [2022-01-23 02:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[171/300][250/1251] eta 0:37:50 lr 0.000396 time 2.2614 (2.2678) loss 3.9692 (3.4250) grad_norm 1.4458 (1.6545) [2022-01-23 02:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][260/1251] eta 0:37:19 lr 0.000396 time 2.4801 (2.2598) loss 3.6692 (3.4268) grad_norm 1.5853 (1.6540) [2022-01-23 02:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][270/1251] eta 0:36:50 lr 0.000396 time 1.9028 (2.2533) loss 3.7397 (3.4290) grad_norm 1.6759 (1.6567) [2022-01-23 02:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][280/1251] eta 0:36:31 lr 0.000396 time 2.5360 (2.2571) loss 3.4815 (3.4211) grad_norm 1.4071 (1.6555) [2022-01-23 02:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][290/1251] eta 0:36:06 lr 0.000396 time 2.1760 (2.2546) loss 3.6295 (3.4140) grad_norm 1.8214 (1.6605) [2022-01-23 02:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][300/1251] eta 0:35:43 lr 0.000396 time 2.7414 (2.2540) loss 3.7523 (3.4055) grad_norm 1.4891 (1.6577) [2022-01-23 02:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][310/1251] eta 0:35:19 lr 0.000396 time 2.2148 (2.2525) loss 3.8007 (3.4083) grad_norm 1.6059 (1.6538) [2022-01-23 02:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][320/1251] eta 0:34:52 lr 0.000396 time 2.4995 (2.2481) loss 3.6102 (3.4050) grad_norm 1.6494 (1.6541) [2022-01-23 02:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][330/1251] eta 0:34:28 lr 0.000396 time 3.0064 (2.2458) loss 4.0583 (3.4160) grad_norm 1.5938 (1.6547) [2022-01-23 02:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][340/1251] eta 0:34:00 lr 0.000396 time 1.7792 (2.2394) loss 4.0186 (3.4121) grad_norm 1.6369 (1.6527) [2022-01-23 02:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][350/1251] eta 0:33:36 lr 0.000396 time 2.3685 (2.2376) loss 3.5423 (3.4147) grad_norm 
1.5636 (1.6502) [2022-01-23 02:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][360/1251] eta 0:33:10 lr 0.000396 time 2.2726 (2.2338) loss 3.8857 (3.4148) grad_norm 1.5515 (1.6491) [2022-01-23 02:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][370/1251] eta 0:32:47 lr 0.000396 time 2.5154 (2.2336) loss 3.8058 (3.4152) grad_norm 1.7951 (1.6514) [2022-01-23 02:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][380/1251] eta 0:32:25 lr 0.000395 time 2.4369 (2.2339) loss 4.0297 (3.4215) grad_norm 1.6308 (1.6536) [2022-01-23 02:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][390/1251] eta 0:32:02 lr 0.000395 time 1.8615 (2.2329) loss 3.2494 (3.4250) grad_norm 1.7891 (1.6548) [2022-01-23 02:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][400/1251] eta 0:31:38 lr 0.000395 time 2.7159 (2.2314) loss 3.8403 (3.4283) grad_norm 1.6829 (1.6605) [2022-01-23 02:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][410/1251] eta 0:31:20 lr 0.000395 time 3.7334 (2.2358) loss 3.9936 (3.4263) grad_norm 1.9468 (1.6603) [2022-01-23 02:43:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][420/1251] eta 0:31:01 lr 0.000395 time 2.4979 (2.2396) loss 3.2560 (3.4283) grad_norm 1.4592 (1.6583) [2022-01-23 02:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][430/1251] eta 0:30:37 lr 0.000395 time 1.8181 (2.2381) loss 3.5599 (3.4232) grad_norm 1.4659 (1.6568) [2022-01-23 02:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][440/1251] eta 0:30:11 lr 0.000395 time 1.7987 (2.2331) loss 3.7936 (3.4179) grad_norm 2.0110 (1.6557) [2022-01-23 02:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][450/1251] eta 0:29:45 lr 0.000395 time 2.9634 (2.2291) loss 3.6070 (3.4207) grad_norm 1.6627 (1.6575) [2022-01-23 02:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[171/300][460/1251] eta 0:29:19 lr 0.000395 time 1.9012 (2.2250) loss 4.1035 (3.4219) grad_norm 1.7820 (1.6582) [2022-01-23 02:44:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][470/1251] eta 0:28:55 lr 0.000395 time 1.5400 (2.2216) loss 3.0647 (3.4198) grad_norm 1.6559 (1.6593) [2022-01-23 02:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][480/1251] eta 0:28:33 lr 0.000395 time 2.2739 (2.2221) loss 3.7586 (3.4256) grad_norm 1.7859 (1.6589) [2022-01-23 02:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][490/1251] eta 0:28:11 lr 0.000395 time 2.7813 (2.2224) loss 3.5513 (3.4314) grad_norm 1.5391 (1.6589) [2022-01-23 02:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][500/1251] eta 0:27:48 lr 0.000395 time 2.5854 (2.2222) loss 3.6865 (3.4269) grad_norm 1.8827 (1.6596) [2022-01-23 02:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][510/1251] eta 0:27:27 lr 0.000395 time 2.4809 (2.2235) loss 3.4782 (3.4272) grad_norm 1.6872 (1.6586) [2022-01-23 02:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][520/1251] eta 0:27:05 lr 0.000395 time 2.6837 (2.2241) loss 3.4806 (3.4328) grad_norm 1.4850 (1.6565) [2022-01-23 02:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][530/1251] eta 0:26:42 lr 0.000395 time 2.4858 (2.2223) loss 3.0609 (3.4393) grad_norm 1.8250 (1.6569) [2022-01-23 02:47:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][540/1251] eta 0:26:18 lr 0.000395 time 1.9134 (2.2204) loss 2.7646 (3.4352) grad_norm 1.7690 (1.6602) [2022-01-23 02:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][550/1251] eta 0:25:55 lr 0.000395 time 2.0909 (2.2184) loss 3.1956 (3.4385) grad_norm 1.7744 (1.6603) [2022-01-23 02:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][560/1251] eta 0:25:31 lr 0.000395 time 2.5679 (2.2158) loss 3.7522 (3.4387) grad_norm 
1.5771 (1.6603) [2022-01-23 02:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][570/1251] eta 0:25:08 lr 0.000395 time 3.3897 (2.2153) loss 3.9768 (3.4380) grad_norm 2.1279 (1.6606) [2022-01-23 02:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][580/1251] eta 0:24:46 lr 0.000395 time 2.5618 (2.2146) loss 2.5409 (3.4384) grad_norm 1.5016 (1.6587) [2022-01-23 02:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][590/1251] eta 0:24:22 lr 0.000395 time 2.2559 (2.2124) loss 3.5002 (3.4434) grad_norm 1.7342 (1.6582) [2022-01-23 02:49:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][600/1251] eta 0:23:59 lr 0.000395 time 2.4423 (2.2109) loss 2.2989 (3.4428) grad_norm 1.5307 (1.6586) [2022-01-23 02:49:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][610/1251] eta 0:23:38 lr 0.000395 time 3.1801 (2.2125) loss 3.6492 (3.4405) grad_norm 1.6541 (1.6598) [2022-01-23 02:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][620/1251] eta 0:23:19 lr 0.000395 time 2.9491 (2.2176) loss 3.4750 (3.4403) grad_norm 1.7951 (1.6606) [2022-01-23 02:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][630/1251] eta 0:22:58 lr 0.000394 time 2.4853 (2.2197) loss 3.6319 (3.4421) grad_norm 1.5620 (1.6600) [2022-01-23 02:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][640/1251] eta 0:22:36 lr 0.000394 time 2.3163 (2.2202) loss 3.2051 (3.4392) grad_norm 1.6875 (1.6584) [2022-01-23 02:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][650/1251] eta 0:22:12 lr 0.000394 time 1.9043 (2.2176) loss 3.8930 (3.4417) grad_norm 1.4742 (1.6567) [2022-01-23 02:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][660/1251] eta 0:21:48 lr 0.000394 time 1.7823 (2.2140) loss 3.6008 (3.4440) grad_norm 1.8136 (1.6579) [2022-01-23 02:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[171/300][670/1251] eta 0:21:24 lr 0.000394 time 1.9777 (2.2101) loss 3.9631 (3.4474) grad_norm 1.7629 (1.6598) [2022-01-23 02:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][680/1251] eta 0:21:00 lr 0.000394 time 1.9001 (2.2076) loss 3.7038 (3.4520) grad_norm 1.6556 (1.6609) [2022-01-23 02:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][690/1251] eta 0:20:38 lr 0.000394 time 2.6523 (2.2076) loss 3.7125 (3.4533) grad_norm 1.7572 (1.6598) [2022-01-23 02:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][700/1251] eta 0:20:16 lr 0.000394 time 2.3898 (2.2084) loss 3.7770 (3.4530) grad_norm 1.7272 (1.6597) [2022-01-23 02:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][710/1251] eta 0:19:56 lr 0.000394 time 2.1136 (2.2124) loss 3.0500 (3.4527) grad_norm 1.7330 (1.6600) [2022-01-23 02:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][720/1251] eta 0:19:34 lr 0.000394 time 1.9130 (2.2121) loss 4.3181 (3.4543) grad_norm 1.7892 (1.6601) [2022-01-23 02:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][730/1251] eta 0:19:11 lr 0.000394 time 2.0855 (2.2105) loss 3.0972 (3.4520) grad_norm 1.8061 (1.6597) [2022-01-23 02:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][740/1251] eta 0:18:50 lr 0.000394 time 1.8870 (2.2129) loss 3.5766 (3.4530) grad_norm 1.5840 (1.6618) [2022-01-23 02:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][750/1251] eta 0:18:29 lr 0.000394 time 1.8242 (2.2149) loss 3.1381 (3.4545) grad_norm 1.7850 (1.6622) [2022-01-23 02:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][760/1251] eta 0:18:08 lr 0.000394 time 2.1859 (2.2167) loss 3.1231 (3.4530) grad_norm 2.0432 (1.6623) [2022-01-23 02:55:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][770/1251] eta 0:17:45 lr 0.000394 time 2.2652 (2.2161) loss 3.9444 (3.4541) grad_norm 
1.4544 (1.6621) [2022-01-23 02:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][780/1251] eta 0:17:22 lr 0.000394 time 1.8859 (2.2137) loss 3.3309 (3.4527) grad_norm 1.4413 (1.6645) [2022-01-23 02:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][790/1251] eta 0:16:59 lr 0.000394 time 1.9997 (2.2113) loss 4.2523 (3.4523) grad_norm 1.5860 (1.6643) [2022-01-23 02:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][800/1251] eta 0:16:36 lr 0.000394 time 2.6369 (2.2102) loss 3.8378 (3.4538) grad_norm 1.9107 (1.6638) [2022-01-23 02:57:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][810/1251] eta 0:16:14 lr 0.000394 time 1.9629 (2.2098) loss 2.5253 (3.4513) grad_norm 1.4322 (1.6626) [2022-01-23 02:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][820/1251] eta 0:15:53 lr 0.000394 time 2.4653 (2.2113) loss 3.5653 (3.4532) grad_norm 1.5210 (1.6626) [2022-01-23 02:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][830/1251] eta 0:15:31 lr 0.000394 time 2.4569 (2.2123) loss 3.6632 (3.4546) grad_norm 1.6307 (1.6623) [2022-01-23 02:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][840/1251] eta 0:15:09 lr 0.000394 time 2.5699 (2.2129) loss 3.6376 (3.4553) grad_norm 1.4020 (1.6634) [2022-01-23 02:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][850/1251] eta 0:14:46 lr 0.000394 time 1.5567 (2.2106) loss 2.7477 (3.4571) grad_norm 1.8905 (1.6638) [2022-01-23 02:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][860/1251] eta 0:14:24 lr 0.000394 time 2.8023 (2.2106) loss 2.3927 (3.4546) grad_norm 1.6182 (1.6641) [2022-01-23 02:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][870/1251] eta 0:14:01 lr 0.000394 time 2.2126 (2.2096) loss 3.5337 (3.4510) grad_norm 1.7088 (1.6643) [2022-01-23 02:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[171/300][880/1251] eta 0:13:39 lr 0.000393 time 1.6828 (2.2085) loss 2.9948 (3.4516) grad_norm 1.6382 (1.6640) [2022-01-23 03:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][890/1251] eta 0:13:17 lr 0.000393 time 1.9781 (2.2090) loss 3.4358 (3.4537) grad_norm 1.5803 (1.6639) [2022-01-23 03:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][900/1251] eta 0:12:55 lr 0.000393 time 3.4241 (2.2095) loss 3.8563 (3.4559) grad_norm 1.4709 (1.6636) [2022-01-23 03:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][910/1251] eta 0:12:33 lr 0.000393 time 2.4267 (2.2083) loss 2.5228 (3.4522) grad_norm 1.8008 (1.6636) [2022-01-23 03:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][920/1251] eta 0:12:10 lr 0.000393 time 1.5636 (2.2070) loss 3.4839 (3.4528) grad_norm 1.5350 (1.6624) [2022-01-23 03:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][930/1251] eta 0:11:47 lr 0.000393 time 1.8989 (2.2056) loss 2.7982 (3.4489) grad_norm 1.7143 (1.6630) [2022-01-23 03:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][940/1251] eta 0:11:26 lr 0.000393 time 3.7354 (2.2068) loss 3.9708 (3.4524) grad_norm 2.0604 (1.6645) [2022-01-23 03:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][950/1251] eta 0:11:03 lr 0.000393 time 1.5645 (2.2049) loss 3.9349 (3.4546) grad_norm 1.7091 (1.6654) [2022-01-23 03:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][960/1251] eta 0:10:41 lr 0.000393 time 1.8651 (2.2043) loss 3.8724 (3.4539) grad_norm 1.7469 (1.6661) [2022-01-23 03:03:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][970/1251] eta 0:10:20 lr 0.000393 time 1.9721 (2.2075) loss 3.5138 (3.4526) grad_norm 1.8436 (1.6654) [2022-01-23 03:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][980/1251] eta 0:09:58 lr 0.000393 time 3.4160 (2.2092) loss 3.3644 (3.4519) grad_norm 
1.8133 (1.6655) [2022-01-23 03:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][990/1251] eta 0:09:36 lr 0.000393 time 2.8006 (2.2097) loss 3.2947 (3.4527) grad_norm 1.8155 (1.6653) [2022-01-23 03:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1000/1251] eta 0:09:14 lr 0.000393 time 1.8725 (2.2097) loss 3.1200 (3.4496) grad_norm 1.6506 (1.6649) [2022-01-23 03:04:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1010/1251] eta 0:08:52 lr 0.000393 time 1.9636 (2.2087) loss 3.5090 (3.4506) grad_norm 1.5641 (1.6651) [2022-01-23 03:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1020/1251] eta 0:08:29 lr 0.000393 time 1.9814 (2.2062) loss 4.1351 (3.4522) grad_norm 1.9395 (1.6652) [2022-01-23 03:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1030/1251] eta 0:08:07 lr 0.000393 time 2.5775 (2.2049) loss 2.1473 (3.4485) grad_norm 1.6132 (1.6641) [2022-01-23 03:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1040/1251] eta 0:07:45 lr 0.000393 time 2.2832 (2.2039) loss 4.0471 (3.4506) grad_norm 1.7361 (1.6639) [2022-01-23 03:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1050/1251] eta 0:07:22 lr 0.000393 time 1.9360 (2.2027) loss 3.5871 (3.4505) grad_norm 1.8211 (1.6641) [2022-01-23 03:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1060/1251] eta 0:07:01 lr 0.000393 time 2.4175 (2.2042) loss 3.8303 (3.4518) grad_norm 1.4424 (1.6638) [2022-01-23 03:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1070/1251] eta 0:06:38 lr 0.000393 time 2.0290 (2.2042) loss 2.3731 (3.4486) grad_norm 1.5395 (1.6642) [2022-01-23 03:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1080/1251] eta 0:06:17 lr 0.000393 time 2.5074 (2.2059) loss 3.7454 (3.4482) grad_norm 1.5421 (1.6637) [2022-01-23 03:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [171/300][1090/1251] eta 0:05:55 lr 0.000393 time 1.9414 (2.2055) loss 3.5246 (3.4507) grad_norm 1.5022 (1.6629) [2022-01-23 03:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1100/1251] eta 0:05:32 lr 0.000393 time 1.9089 (2.2051) loss 2.9483 (3.4495) grad_norm 1.7061 (1.6629) [2022-01-23 03:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1110/1251] eta 0:05:10 lr 0.000393 time 1.8919 (2.2034) loss 4.3796 (3.4509) grad_norm 1.7300 (1.6624) [2022-01-23 03:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1120/1251] eta 0:04:48 lr 0.000392 time 2.7458 (2.2026) loss 3.6142 (3.4513) grad_norm 1.4992 (1.6629) [2022-01-23 03:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1130/1251] eta 0:04:26 lr 0.000392 time 2.1426 (2.2018) loss 2.9805 (3.4504) grad_norm 1.9142 (1.6639) [2022-01-23 03:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1140/1251] eta 0:04:04 lr 0.000392 time 2.2108 (2.2018) loss 3.9645 (3.4516) grad_norm 1.5368 (1.6646) [2022-01-23 03:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1150/1251] eta 0:03:42 lr 0.000392 time 2.5308 (2.2015) loss 4.0837 (3.4552) grad_norm 1.6111 (1.6644) [2022-01-23 03:09:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1160/1251] eta 0:03:20 lr 0.000392 time 2.2291 (2.2022) loss 3.8389 (3.4570) grad_norm 1.5601 (1.6642) [2022-01-23 03:10:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1170/1251] eta 0:02:58 lr 0.000392 time 1.7105 (2.2025) loss 3.7889 (3.4554) grad_norm 1.8409 (1.6641) [2022-01-23 03:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1180/1251] eta 0:02:36 lr 0.000392 time 1.9636 (2.2033) loss 2.4824 (3.4535) grad_norm 1.9411 (1.6657) [2022-01-23 03:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1190/1251] eta 0:02:14 lr 0.000392 time 3.1704 (2.2038) loss 3.9380 
(3.4536) grad_norm 1.7988 (1.6659) [2022-01-23 03:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1200/1251] eta 0:01:52 lr 0.000392 time 1.9933 (2.2026) loss 4.3369 (3.4540) grad_norm 1.5136 (1.6662) [2022-01-23 03:11:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1210/1251] eta 0:01:30 lr 0.000392 time 2.2104 (2.2016) loss 3.1333 (3.4539) grad_norm 1.5587 (1.6659) [2022-01-23 03:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1220/1251] eta 0:01:08 lr 0.000392 time 2.2383 (2.2005) loss 2.2388 (3.4548) grad_norm 1.4851 (1.6655) [2022-01-23 03:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1230/1251] eta 0:00:46 lr 0.000392 time 1.6086 (2.1996) loss 2.8972 (3.4555) grad_norm 1.4422 (1.6661) [2022-01-23 03:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1240/1251] eta 0:00:24 lr 0.000392 time 1.6510 (2.1981) loss 3.9535 (3.4559) grad_norm 1.9332 (1.6661) [2022-01-23 03:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [171/300][1250/1251] eta 0:00:02 lr 0.000392 time 1.1834 (2.1934) loss 3.6602 (3.4554) grad_norm 1.7493 (1.6666) [2022-01-23 03:13:05 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 171 training takes 0:45:44 [2022-01-23 03:13:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.524 (18.524) Loss 0.9536 (0.9536) Acc@1 76.465 (76.465) Acc@5 94.434 (94.434) [2022-01-23 03:13:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.589 (3.323) Loss 0.9519 (0.9634) Acc@1 76.465 (76.456) Acc@5 94.043 (93.928) [2022-01-23 03:13:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.614 (2.592) Loss 1.0109 (0.9631) Acc@1 76.562 (76.683) Acc@5 93.066 (93.880) [2022-01-23 03:14:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.306 (2.308) Loss 0.9541 (0.9627) Acc@1 77.051 (76.896) Acc@5 94.336 (93.832) [2022-01-23 03:14:33 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.882 (2.142) Loss 0.9674 (0.9666) Acc@1 76.953 (76.832) Acc@5 93.848 (93.748) [2022-01-23 03:14:40 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.876 Acc@5 93.720 [2022-01-23 03:14:40 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-01-23 03:14:40 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 76.88% [2022-01-23 03:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][0/1251] eta 7:34:59 lr 0.000392 time 21.8221 (21.8221) loss 2.5257 (2.5257) grad_norm 1.7780 (1.7780) [2022-01-23 03:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][10/1251] eta 1:25:11 lr 0.000392 time 2.9521 (4.1189) loss 3.5892 (3.6593) grad_norm 1.4641 (1.6731) [2022-01-23 03:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][20/1251] eta 1:04:50 lr 0.000392 time 2.1469 (3.1600) loss 3.8870 (3.5605) grad_norm 1.8740 (1.6852) [2022-01-23 03:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][30/1251] eta 0:58:14 lr 0.000392 time 1.8668 (2.8624) loss 2.2058 (3.4238) grad_norm 1.5121 (1.6810) [2022-01-23 03:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][40/1251] eta 0:54:53 lr 0.000392 time 2.8752 (2.7198) loss 3.2859 (3.3967) grad_norm 1.5770 (1.6854) [2022-01-23 03:16:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][50/1251] eta 0:52:39 lr 0.000392 time 2.2032 (2.6310) loss 2.2829 (3.3977) grad_norm 1.6636 (1.6766) [2022-01-23 03:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][60/1251] eta 0:50:31 lr 0.000392 time 2.2421 (2.5450) loss 4.1084 (3.3873) grad_norm 1.7508 (1.6642) [2022-01-23 03:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][70/1251] eta 0:48:50 lr 0.000392 time 1.8421 (2.4810) loss 3.8655 (3.4178) grad_norm 1.6598 (1.6649) [2022-01-23 03:17:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][80/1251] eta 0:47:57 lr 0.000392 time 3.3955 (2.4572) loss 3.4797 (3.4060) grad_norm 1.8779 (1.6703) [2022-01-23 03:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][90/1251] eta 0:47:14 lr 0.000392 time 3.2078 (2.4416) loss 2.9334 (3.3968) grad_norm 2.0162 (1.6889) [2022-01-23 03:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][100/1251] eta 0:46:31 lr 0.000392 time 2.9173 (2.4256) loss 3.5273 (3.3773) grad_norm 1.5014 (1.7037) [2022-01-23 03:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][110/1251] eta 0:45:37 lr 0.000392 time 1.9254 (2.3992) loss 2.5213 (3.3752) grad_norm 1.6754 (1.7068) [2022-01-23 03:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][120/1251] eta 0:44:47 lr 0.000391 time 2.7339 (2.3763) loss 3.7741 (3.3898) grad_norm 1.8195 (1.7030) [2022-01-23 03:19:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][130/1251] eta 0:44:05 lr 0.000391 time 1.9603 (2.3598) loss 3.0128 (3.3985) grad_norm 2.4247 (1.7037) [2022-01-23 03:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][140/1251] eta 0:43:42 lr 0.000391 time 3.7149 (2.3601) loss 4.2267 (3.4006) grad_norm 1.7221 (1.6989) [2022-01-23 03:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][150/1251] eta 0:43:06 lr 0.000391 time 2.0065 (2.3495) loss 3.2855 (3.3843) grad_norm 1.4657 (1.6943) [2022-01-23 03:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][160/1251] eta 0:42:27 lr 0.000391 time 1.8722 (2.3349) loss 2.8224 (3.3875) grad_norm 1.5729 (1.6909) [2022-01-23 03:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][170/1251] eta 0:41:53 lr 0.000391 time 2.1205 (2.3249) loss 3.9631 (3.3919) grad_norm 1.5553 (1.6875) [2022-01-23 03:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][180/1251] eta 0:41:18 lr 0.000391 
time 2.8421 (2.3144) loss 2.9454 (3.3804) grad_norm 1.6636 (1.6814) [2022-01-23 03:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][190/1251] eta 0:40:39 lr 0.000391 time 1.5851 (2.2994) loss 3.0637 (3.3771) grad_norm 1.6276 (1.6783) [2022-01-23 03:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][200/1251] eta 0:40:07 lr 0.000391 time 1.8567 (2.2907) loss 3.1040 (3.3716) grad_norm 1.7513 (1.6741) [2022-01-23 03:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][210/1251] eta 0:39:35 lr 0.000391 time 2.1039 (2.2817) loss 2.5760 (3.3761) grad_norm 1.6935 (1.6864) [2022-01-23 03:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][220/1251] eta 0:39:08 lr 0.000391 time 2.7733 (2.2777) loss 3.6081 (3.3710) grad_norm 2.1345 (1.6900) [2022-01-23 03:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][230/1251] eta 0:38:40 lr 0.000391 time 1.8238 (2.2726) loss 2.6028 (3.3751) grad_norm 1.5498 (1.6882) [2022-01-23 03:23:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][240/1251] eta 0:38:19 lr 0.000391 time 2.3039 (2.2741) loss 3.4756 (3.3807) grad_norm 1.6326 (1.6901) [2022-01-23 03:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][250/1251] eta 0:37:52 lr 0.000391 time 1.8688 (2.2703) loss 2.3457 (3.3732) grad_norm 1.6349 (1.6900) [2022-01-23 03:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][260/1251] eta 0:37:26 lr 0.000391 time 2.5789 (2.2672) loss 2.8781 (3.3792) grad_norm 1.4968 (1.6877) [2022-01-23 03:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][270/1251] eta 0:36:59 lr 0.000391 time 2.7862 (2.2620) loss 3.6293 (3.3844) grad_norm 1.6795 (1.6859) [2022-01-23 03:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][280/1251] eta 0:36:33 lr 0.000391 time 2.1193 (2.2590) loss 3.0086 (3.3745) grad_norm 1.6588 (1.6838) [2022-01-23 03:25:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][290/1251] eta 0:36:05 lr 0.000391 time 1.8628 (2.2529) loss 3.8029 (3.3790) grad_norm 1.6690 (1.6828) [2022-01-23 03:25:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][300/1251] eta 0:35:36 lr 0.000391 time 1.8076 (2.2463) loss 4.0501 (3.3853) grad_norm 1.5483 (1.6805) [2022-01-23 03:26:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][310/1251] eta 0:35:12 lr 0.000391 time 2.2212 (2.2452) loss 3.8811 (3.3931) grad_norm 1.5236 (1.6788) [2022-01-23 03:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][320/1251] eta 0:34:46 lr 0.000391 time 1.8575 (2.2412) loss 3.8134 (3.3966) grad_norm 1.5952 (1.6779) [2022-01-23 03:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][330/1251] eta 0:34:20 lr 0.000391 time 1.9049 (2.2376) loss 3.7124 (3.3961) grad_norm 2.1611 (1.6757) [2022-01-23 03:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][340/1251] eta 0:34:03 lr 0.000391 time 3.2108 (2.2431) loss 3.6335 (3.3927) grad_norm 1.8069 (1.6751) [2022-01-23 03:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][350/1251] eta 0:33:40 lr 0.000391 time 2.2855 (2.2430) loss 3.8572 (3.3887) grad_norm 1.7982 (1.6769) [2022-01-23 03:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][360/1251] eta 0:33:14 lr 0.000391 time 1.7540 (2.2390) loss 3.8195 (3.3915) grad_norm 1.6879 (1.6800) [2022-01-23 03:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][370/1251] eta 0:32:50 lr 0.000390 time 1.9436 (2.2366) loss 3.9001 (3.3906) grad_norm 1.8507 (1.6806) [2022-01-23 03:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][380/1251] eta 0:32:27 lr 0.000390 time 3.3627 (2.2362) loss 2.8159 (3.3822) grad_norm 1.5949 (1.6817) [2022-01-23 03:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][390/1251] eta 0:32:03 lr 
0.000390 time 1.6737 (2.2344) loss 2.7833 (3.3807) grad_norm 2.2265 (1.6843) [2022-01-23 03:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][400/1251] eta 0:31:37 lr 0.000390 time 2.1130 (2.2300) loss 3.8270 (3.3851) grad_norm 1.4411 (1.6839) [2022-01-23 03:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][410/1251] eta 0:31:12 lr 0.000390 time 1.7974 (2.2269) loss 3.7105 (3.3929) grad_norm 1.7828 (1.6844) [2022-01-23 03:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][420/1251] eta 0:30:49 lr 0.000390 time 3.0743 (2.2262) loss 3.3608 (3.3971) grad_norm 1.6071 (1.6841) [2022-01-23 03:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][430/1251] eta 0:30:25 lr 0.000390 time 1.5806 (2.2239) loss 3.0112 (3.3954) grad_norm 1.7402 (1.6833) [2022-01-23 03:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][440/1251] eta 0:30:02 lr 0.000390 time 1.9002 (2.2226) loss 4.1017 (3.3936) grad_norm 1.5078 (1.6834) [2022-01-23 03:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][450/1251] eta 0:29:40 lr 0.000390 time 1.8846 (2.2227) loss 2.4395 (3.3951) grad_norm 1.4409 (1.6847) [2022-01-23 03:31:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][460/1251] eta 0:29:20 lr 0.000390 time 2.2477 (2.2262) loss 2.8591 (3.3900) grad_norm 1.8597 (1.6852) [2022-01-23 03:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][470/1251] eta 0:28:58 lr 0.000390 time 1.7544 (2.2257) loss 3.7026 (3.3882) grad_norm 1.7746 (1.6851) [2022-01-23 03:32:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][480/1251] eta 0:28:35 lr 0.000390 time 1.9892 (2.2244) loss 3.7305 (3.3922) grad_norm 1.7901 (1.6859) [2022-01-23 03:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][490/1251] eta 0:28:12 lr 0.000390 time 1.9756 (2.2237) loss 2.4373 (3.3945) grad_norm 1.5384 (1.6868) [2022-01-23 03:33:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][500/1251] eta 0:27:47 lr 0.000390 time 1.9543 (2.2202) loss 3.4690 (3.3984) grad_norm 1.4221 (1.6853) [2022-01-23 03:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][510/1251] eta 0:27:23 lr 0.000390 time 2.0362 (2.2184) loss 2.9360 (3.4025) grad_norm 1.7959 (1.6837) [2022-01-23 03:33:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][520/1251] eta 0:27:01 lr 0.000390 time 1.9718 (2.2176) loss 3.7124 (3.3996) grad_norm 1.7443 (1.6837) [2022-01-23 03:34:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][530/1251] eta 0:26:40 lr 0.000390 time 1.9017 (2.2193) loss 3.6504 (3.3988) grad_norm 1.5659 (1.6826) [2022-01-23 03:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][540/1251] eta 0:26:16 lr 0.000390 time 2.1543 (2.2178) loss 2.5090 (3.4000) grad_norm 1.5127 (1.6834) [2022-01-23 03:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][550/1251] eta 0:25:53 lr 0.000390 time 1.5632 (2.2164) loss 4.1505 (3.4042) grad_norm 1.4275 (1.6813) [2022-01-23 03:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][560/1251] eta 0:25:29 lr 0.000390 time 1.5217 (2.2140) loss 3.9619 (3.4016) grad_norm 1.7115 (1.6813) [2022-01-23 03:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][570/1251] eta 0:25:10 lr 0.000390 time 2.2925 (2.2180) loss 4.1491 (3.4059) grad_norm 1.9649 (1.6821) [2022-01-23 03:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][580/1251] eta 0:24:48 lr 0.000390 time 1.8466 (2.2180) loss 4.1842 (3.4099) grad_norm 1.7746 (1.6815) [2022-01-23 03:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][590/1251] eta 0:24:27 lr 0.000390 time 2.2804 (2.2199) loss 3.1860 (3.4102) grad_norm 1.7033 (1.6833) [2022-01-23 03:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][600/1251] eta 0:24:02 lr 
0.000390 time 1.5434 (2.2165) loss 3.5984 (3.4125) grad_norm 1.7992 (1.6847) [2022-01-23 03:37:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][610/1251] eta 0:23:39 lr 0.000390 time 2.7133 (2.2149) loss 3.8765 (3.4128) grad_norm 1.5668 (1.6842) [2022-01-23 03:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][620/1251] eta 0:23:16 lr 0.000389 time 2.2675 (2.2128) loss 3.1809 (3.4131) grad_norm 1.5407 (1.6829) [2022-01-23 03:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][630/1251] eta 0:22:53 lr 0.000389 time 1.7614 (2.2123) loss 4.4659 (3.4158) grad_norm 1.5461 (1.6817) [2022-01-23 03:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][640/1251] eta 0:22:30 lr 0.000389 time 2.1769 (2.2098) loss 2.5707 (3.4152) grad_norm 1.4924 (1.6800) [2022-01-23 03:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][650/1251] eta 0:22:07 lr 0.000389 time 2.6051 (2.2086) loss 3.4166 (3.4185) grad_norm 1.4013 (1.6783) [2022-01-23 03:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][660/1251] eta 0:21:45 lr 0.000389 time 1.4962 (2.2088) loss 2.3717 (3.4164) grad_norm 1.6377 (1.6773) [2022-01-23 03:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][670/1251] eta 0:21:23 lr 0.000389 time 2.2979 (2.2096) loss 3.8127 (3.4186) grad_norm 1.5924 (1.6761) [2022-01-23 03:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][680/1251] eta 0:21:00 lr 0.000389 time 1.6073 (2.2083) loss 4.0421 (3.4151) grad_norm 1.9752 (1.6761) [2022-01-23 03:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][690/1251] eta 0:20:39 lr 0.000389 time 2.1639 (2.2088) loss 2.9123 (3.4160) grad_norm 1.4715 (1.6751) [2022-01-23 03:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][700/1251] eta 0:20:16 lr 0.000389 time 1.8390 (2.2084) loss 4.1066 (3.4160) grad_norm 1.8294 (1.6749) [2022-01-23 03:40:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][710/1251] eta 0:19:55 lr 0.000389 time 1.9266 (2.2100) loss 2.4219 (3.4140) grad_norm 1.6182 (1.6746) [2022-01-23 03:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][720/1251] eta 0:19:35 lr 0.000389 time 1.8641 (2.2130) loss 4.1352 (3.4167) grad_norm 1.6283 (1.6738) [2022-01-23 03:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][730/1251] eta 0:19:14 lr 0.000389 time 2.1435 (2.2153) loss 3.5764 (3.4183) grad_norm 1.6323 (1.6733) [2022-01-23 03:42:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][740/1251] eta 0:18:51 lr 0.000389 time 1.9324 (2.2137) loss 3.1762 (3.4176) grad_norm 1.6585 (1.6737) [2022-01-23 03:42:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][750/1251] eta 0:18:27 lr 0.000389 time 1.8293 (2.2101) loss 3.8720 (3.4221) grad_norm 1.6767 (1.6733) [2022-01-23 03:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][760/1251] eta 0:18:04 lr 0.000389 time 1.8719 (2.2094) loss 4.1487 (3.4208) grad_norm 1.7470 (1.6731) [2022-01-23 03:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][770/1251] eta 0:17:41 lr 0.000389 time 1.8475 (2.2075) loss 3.7725 (3.4251) grad_norm 1.5326 (1.6737) [2022-01-23 03:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][780/1251] eta 0:17:20 lr 0.000389 time 2.0006 (2.2088) loss 4.1701 (3.4270) grad_norm 1.8095 (1.6747) [2022-01-23 03:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][790/1251] eta 0:16:58 lr 0.000389 time 1.9420 (2.2097) loss 3.8120 (3.4234) grad_norm 1.5760 (1.6742) [2022-01-23 03:44:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][800/1251] eta 0:16:36 lr 0.000389 time 2.5919 (2.2105) loss 3.8817 (3.4227) grad_norm 1.4880 (1.6743) [2022-01-23 03:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][810/1251] eta 0:16:14 lr 
0.000389 time 2.1048 (2.2088) loss 3.7691 (3.4224) grad_norm 1.6332 (1.6731) [2022-01-23 03:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][820/1251] eta 0:15:51 lr 0.000389 time 1.9921 (2.2082) loss 3.0630 (3.4218) grad_norm 1.6849 (1.6748) [2022-01-23 03:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][830/1251] eta 0:15:28 lr 0.000389 time 1.9106 (2.2057) loss 3.0445 (3.4213) grad_norm 1.6045 (1.6759) [2022-01-23 03:45:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][840/1251] eta 0:15:05 lr 0.000389 time 2.1758 (2.2044) loss 3.3902 (3.4194) grad_norm 1.5316 (1.6746) [2022-01-23 03:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][850/1251] eta 0:14:44 lr 0.000389 time 2.8607 (2.2055) loss 4.1464 (3.4197) grad_norm 1.8406 (1.6734) [2022-01-23 03:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][860/1251] eta 0:14:22 lr 0.000388 time 3.0876 (2.2071) loss 3.6762 (3.4209) grad_norm 1.4683 (1.6744) [2022-01-23 03:46:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][870/1251] eta 0:14:00 lr 0.000388 time 1.5673 (2.2062) loss 3.4172 (3.4221) grad_norm 1.5085 (1.6747) [2022-01-23 03:47:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][880/1251] eta 0:13:39 lr 0.000388 time 2.3346 (2.2076) loss 3.8595 (3.4254) grad_norm 1.8833 (1.6747) [2022-01-23 03:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][890/1251] eta 0:13:16 lr 0.000388 time 1.8849 (2.2064) loss 3.4284 (3.4224) grad_norm 1.6230 (1.6751) [2022-01-23 03:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][900/1251] eta 0:12:54 lr 0.000388 time 3.5854 (2.2074) loss 3.7973 (3.4244) grad_norm 1.7225 (1.6756) [2022-01-23 03:48:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][910/1251] eta 0:12:32 lr 0.000388 time 2.2379 (2.2062) loss 3.6693 (3.4253) grad_norm 1.5217 (1.6742) [2022-01-23 03:48:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][920/1251] eta 0:12:09 lr 0.000388 time 1.9995 (2.2045) loss 2.1174 (3.4217) grad_norm 1.6167 (1.6743) [2022-01-23 03:48:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][930/1251] eta 0:11:47 lr 0.000388 time 2.2322 (2.2036) loss 3.3583 (3.4231) grad_norm 1.8307 (1.6747) [2022-01-23 03:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][940/1251] eta 0:11:25 lr 0.000388 time 3.1157 (2.2055) loss 3.4933 (3.4233) grad_norm 1.7013 (1.6745) [2022-01-23 03:49:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][950/1251] eta 0:11:03 lr 0.000388 time 1.8360 (2.2040) loss 2.6389 (3.4230) grad_norm 1.5028 (1.6751) [2022-01-23 03:49:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][960/1251] eta 0:10:40 lr 0.000388 time 1.9446 (2.2027) loss 3.5780 (3.4227) grad_norm 1.5979 (1.6746) [2022-01-23 03:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][970/1251] eta 0:10:18 lr 0.000388 time 1.8666 (2.2023) loss 4.2497 (3.4251) grad_norm 1.6124 (1.6750) [2022-01-23 03:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][980/1251] eta 0:09:56 lr 0.000388 time 2.7704 (2.2020) loss 3.4118 (3.4272) grad_norm 1.6430 (1.6743) [2022-01-23 03:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][990/1251] eta 0:09:34 lr 0.000388 time 2.2952 (2.2009) loss 3.8659 (3.4291) grad_norm 1.6354 (1.6748) [2022-01-23 03:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1000/1251] eta 0:09:12 lr 0.000388 time 2.8813 (2.2005) loss 3.6416 (3.4285) grad_norm 1.7565 (1.6743) [2022-01-23 03:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1010/1251] eta 0:08:50 lr 0.000388 time 2.5106 (2.2003) loss 4.1495 (3.4305) grad_norm 1.6325 (1.6746) [2022-01-23 03:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1020/1251] eta 0:08:28 lr 
0.000388 time 3.3900 (2.2013) loss 3.0476 (3.4285) grad_norm 1.8295 (1.6756) [2022-01-23 03:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1030/1251] eta 0:08:06 lr 0.000388 time 2.5210 (2.2009) loss 3.6246 (3.4299) grad_norm 1.6528 (1.6761) [2022-01-23 03:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1040/1251] eta 0:07:44 lr 0.000388 time 2.0943 (2.2012) loss 3.6857 (3.4296) grad_norm 1.6512 (1.6764) [2022-01-23 03:53:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1050/1251] eta 0:07:22 lr 0.000388 time 2.1883 (2.2022) loss 3.7677 (3.4312) grad_norm 1.8154 (1.6759) [2022-01-23 03:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1060/1251] eta 0:07:00 lr 0.000388 time 3.3772 (2.2038) loss 3.2230 (3.4306) grad_norm 1.6265 (1.6760) [2022-01-23 03:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1070/1251] eta 0:06:38 lr 0.000388 time 1.9446 (2.2041) loss 3.2280 (3.4301) grad_norm 1.4491 (1.6753) [2022-01-23 03:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1080/1251] eta 0:06:16 lr 0.000388 time 1.8325 (2.2025) loss 3.5985 (3.4302) grad_norm 1.8003 (1.6753) [2022-01-23 03:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1090/1251] eta 0:05:54 lr 0.000388 time 2.0060 (2.2006) loss 3.4268 (3.4301) grad_norm 1.7115 (1.6749) [2022-01-23 03:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1100/1251] eta 0:05:32 lr 0.000388 time 1.8287 (2.2000) loss 3.8059 (3.4311) grad_norm 1.6340 (1.6743) [2022-01-23 03:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1110/1251] eta 0:05:10 lr 0.000387 time 2.6657 (2.1989) loss 3.9962 (3.4305) grad_norm 1.9502 (1.6738) [2022-01-23 03:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1120/1251] eta 0:04:48 lr 0.000387 time 2.7898 (2.1988) loss 3.9295 (3.4290) grad_norm 1.8262 (1.6761) [2022-01-23 
03:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1130/1251] eta 0:04:26 lr 0.000387 time 1.9043 (2.1987) loss 4.2150 (3.4281) grad_norm 1.7289 (1.6761) [2022-01-23 03:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1140/1251] eta 0:04:04 lr 0.000387 time 1.9534 (2.1984) loss 3.7659 (3.4284) grad_norm 1.4727 (1.6758) [2022-01-23 03:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1150/1251] eta 0:03:42 lr 0.000387 time 1.8729 (2.1987) loss 2.9576 (3.4275) grad_norm 1.6413 (1.6762) [2022-01-23 03:57:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1160/1251] eta 0:03:20 lr 0.000387 time 2.5553 (2.1993) loss 3.7292 (3.4295) grad_norm 1.5773 (1.6758) [2022-01-23 03:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1170/1251] eta 0:02:58 lr 0.000387 time 1.9690 (2.1985) loss 2.5133 (3.4296) grad_norm 1.5876 (1.6754) [2022-01-23 03:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1180/1251] eta 0:02:35 lr 0.000387 time 2.1795 (2.1970) loss 3.3674 (3.4274) grad_norm 1.4799 (1.6758) [2022-01-23 03:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1190/1251] eta 0:02:14 lr 0.000387 time 2.4528 (2.1971) loss 2.9418 (3.4278) grad_norm 1.6177 (1.6752) [2022-01-23 03:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1200/1251] eta 0:01:52 lr 0.000387 time 2.0203 (2.1977) loss 4.1836 (3.4286) grad_norm 1.6045 (1.6758) [2022-01-23 03:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1210/1251] eta 0:01:30 lr 0.000387 time 2.5124 (2.1979) loss 2.7821 (3.4274) grad_norm 1.9333 (1.6753) [2022-01-23 03:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1220/1251] eta 0:01:08 lr 0.000387 time 1.9839 (2.1975) loss 3.5983 (3.4285) grad_norm 1.6507 (1.6750) [2022-01-23 03:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1230/1251] 
eta 0:00:46 lr 0.000387 time 3.2223 (2.1980) loss 3.9504 (3.4285) grad_norm 1.6825 (1.6745) [2022-01-23 04:00:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1240/1251] eta 0:00:24 lr 0.000387 time 2.0971 (2.1986) loss 3.2784 (3.4272) grad_norm 1.6040 (1.6742) [2022-01-23 04:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [172/300][1250/1251] eta 0:00:02 lr 0.000387 time 1.1953 (2.1932) loss 3.1964 (3.4253) grad_norm 1.6749 (1.6745) [2022-01-23 04:00:24 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 172 training takes 0:45:44 [2022-01-23 04:00:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 12.223 (12.223) Loss 0.9118 (0.9118) Acc@1 77.344 (77.344) Acc@5 94.629 (94.629) [2022-01-23 04:01:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.854 (3.327) Loss 0.9055 (0.9476) Acc@1 78.418 (77.069) Acc@5 93.555 (93.857) [2022-01-23 04:01:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.323 (2.460) Loss 0.9795 (0.9495) Acc@1 76.660 (77.083) Acc@5 93.359 (93.866) [2022-01-23 04:01:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.315 (2.189) Loss 0.9349 (0.9499) Acc@1 78.223 (77.152) Acc@5 94.434 (93.813) [2022-01-23 04:01:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.840 (2.112) Loss 0.9675 (0.9534) Acc@1 75.977 (77.065) Acc@5 94.434 (93.855) [2022-01-23 04:01:59 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.014 Acc@5 93.790 [2022-01-23 04:01:59 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-01-23 04:01:59 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.01% [2022-01-23 04:02:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][0/1251] eta 7:36:05 lr 0.000387 time 21.8748 (21.8748) loss 3.8356 (3.8356) grad_norm 1.8094 (1.8094) [2022-01-23 04:02:43 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [173/300][10/1251] eta 1:22:18 lr 0.000387 time 1.3460 (3.9797) loss 3.1840 (3.4761) grad_norm 1.5807 (1.5886) [2022-01-23 04:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][20/1251] eta 1:04:55 lr 0.000387 time 2.1521 (3.1647) loss 2.9472 (3.4111) grad_norm 1.7679 (1.6014) [2022-01-23 04:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][30/1251] eta 0:57:42 lr 0.000387 time 1.8571 (2.8356) loss 4.1239 (3.4634) grad_norm 1.7852 (1.6216) [2022-01-23 04:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][40/1251] eta 0:56:08 lr 0.000387 time 3.8943 (2.7818) loss 3.5768 (3.4252) grad_norm 1.5720 (1.6539) [2022-01-23 04:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][50/1251] eta 0:53:42 lr 0.000387 time 1.6607 (2.6828) loss 2.8432 (3.4082) grad_norm 1.6447 (1.6666) [2022-01-23 04:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][60/1251] eta 0:51:30 lr 0.000387 time 1.8170 (2.5949) loss 3.6716 (3.3962) grad_norm 2.0164 (1.7033) [2022-01-23 04:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][70/1251] eta 0:49:27 lr 0.000387 time 1.8856 (2.5124) loss 3.7007 (3.4301) grad_norm 1.8426 (1.7190) [2022-01-23 04:05:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][80/1251] eta 0:47:51 lr 0.000387 time 2.5354 (2.4519) loss 3.6755 (3.4458) grad_norm 1.6495 (1.7082) [2022-01-23 04:05:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][90/1251] eta 0:46:41 lr 0.000387 time 1.8565 (2.4127) loss 3.8195 (3.4625) grad_norm 1.9361 (1.7048) [2022-01-23 04:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][100/1251] eta 0:45:58 lr 0.000387 time 2.2331 (2.3962) loss 3.1457 (3.4143) grad_norm 1.7104 (1.7035) [2022-01-23 04:06:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][110/1251] eta 0:45:20 lr 0.000386 time 2.4598 (2.3844) loss 4.0621 (3.4274) grad_norm 
1.6053 (1.6903) [2022-01-23 04:06:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][120/1251] eta 0:44:46 lr 0.000386 time 2.8803 (2.3756) loss 3.8973 (3.4163) grad_norm 1.7218 (1.6895) [2022-01-23 04:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][130/1251] eta 0:44:04 lr 0.000386 time 1.7955 (2.3594) loss 3.3578 (3.4345) grad_norm 1.7187 (1.6826) [2022-01-23 04:07:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][140/1251] eta 0:43:26 lr 0.000386 time 1.6879 (2.3460) loss 3.8915 (3.4423) grad_norm 1.5284 (1.6731) [2022-01-23 04:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][150/1251] eta 0:42:56 lr 0.000386 time 1.9359 (2.3399) loss 3.8412 (3.4383) grad_norm 1.7448 (1.6745) [2022-01-23 04:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][160/1251] eta 0:42:29 lr 0.000386 time 2.8194 (2.3368) loss 2.5854 (3.4373) grad_norm 1.6800 (1.6737) [2022-01-23 04:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][170/1251] eta 0:42:01 lr 0.000386 time 1.7589 (2.3323) loss 3.9577 (3.4354) grad_norm 1.7591 (1.6723) [2022-01-23 04:08:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][180/1251] eta 0:41:25 lr 0.000386 time 1.5415 (2.3205) loss 3.5496 (3.4456) grad_norm 1.7985 (1.6717) [2022-01-23 04:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][190/1251] eta 0:40:54 lr 0.000386 time 1.8012 (2.3135) loss 3.5774 (3.4504) grad_norm 1.7070 (1.6697) [2022-01-23 04:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][200/1251] eta 0:40:27 lr 0.000386 time 2.1672 (2.3097) loss 2.8799 (3.4565) grad_norm 1.6244 (1.6683) [2022-01-23 04:10:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][210/1251] eta 0:39:52 lr 0.000386 time 1.7028 (2.2986) loss 3.6926 (3.4713) grad_norm 1.6816 (1.6678) [2022-01-23 04:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[173/300][220/1251] eta 0:39:21 lr 0.000386 time 1.9297 (2.2906) loss 3.1614 (3.4657) grad_norm 1.7392 (1.6707) [2022-01-23 04:10:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][230/1251] eta 0:38:52 lr 0.000386 time 1.7099 (2.2844) loss 3.8604 (3.4728) grad_norm 1.5723 (1.6707) [2022-01-23 04:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][240/1251] eta 0:38:26 lr 0.000386 time 2.4271 (2.2813) loss 3.7796 (3.4658) grad_norm 1.9795 (1.6697) [2022-01-23 04:11:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][250/1251] eta 0:38:00 lr 0.000386 time 2.0030 (2.2784) loss 3.7533 (3.4785) grad_norm 1.8122 (1.6708) [2022-01-23 04:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][260/1251] eta 0:37:38 lr 0.000386 time 2.5787 (2.2785) loss 2.4920 (3.4754) grad_norm 1.3977 (1.6698) [2022-01-23 04:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][270/1251] eta 0:37:11 lr 0.000386 time 2.1628 (2.2750) loss 3.7859 (3.4788) grad_norm 1.5086 (1.6701) [2022-01-23 04:12:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][280/1251] eta 0:36:42 lr 0.000386 time 1.8432 (2.2688) loss 3.5453 (3.4716) grad_norm 1.6142 (1.6705) [2022-01-23 04:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][290/1251] eta 0:36:15 lr 0.000386 time 2.0002 (2.2634) loss 2.6191 (3.4591) grad_norm 1.7321 (1.6695) [2022-01-23 04:13:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][300/1251] eta 0:35:50 lr 0.000386 time 2.2336 (2.2615) loss 3.4682 (3.4630) grad_norm 1.5084 (1.6681) [2022-01-23 04:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][310/1251] eta 0:35:24 lr 0.000386 time 2.2447 (2.2576) loss 2.7914 (3.4649) grad_norm 1.7532 (1.6665) [2022-01-23 04:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][320/1251] eta 0:34:57 lr 0.000386 time 2.3547 (2.2531) loss 2.6999 (3.4696) grad_norm 
1.6757 (1.6669) [2022-01-23 04:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][330/1251] eta 0:34:30 lr 0.000386 time 1.9325 (2.2480) loss 2.7979 (3.4678) grad_norm 1.9564 (1.6674) [2022-01-23 04:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][340/1251] eta 0:34:07 lr 0.000386 time 2.5149 (2.2476) loss 4.4452 (3.4683) grad_norm 1.7450 (1.6681) [2022-01-23 04:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][350/1251] eta 0:33:48 lr 0.000386 time 2.2450 (2.2515) loss 3.1618 (3.4666) grad_norm 1.5704 (1.6665) [2022-01-23 04:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][360/1251] eta 0:33:26 lr 0.000385 time 2.7013 (2.2516) loss 2.8858 (3.4549) grad_norm 1.4923 (1.6644) [2022-01-23 04:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][370/1251] eta 0:33:02 lr 0.000385 time 2.2012 (2.2506) loss 3.6658 (3.4532) grad_norm 2.0312 (1.6639) [2022-01-23 04:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][380/1251] eta 0:32:40 lr 0.000385 time 1.8385 (2.2504) loss 3.9412 (3.4454) grad_norm 1.6872 (1.6653) [2022-01-23 04:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][390/1251] eta 0:32:15 lr 0.000385 time 2.5371 (2.2485) loss 3.0898 (3.4458) grad_norm 1.8252 (1.6701) [2022-01-23 04:16:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][400/1251] eta 0:31:49 lr 0.000385 time 1.8566 (2.2437) loss 3.7541 (3.4468) grad_norm 1.7793 (1.6744) [2022-01-23 04:17:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][410/1251] eta 0:31:24 lr 0.000385 time 1.9434 (2.2409) loss 2.8709 (3.4501) grad_norm 2.2465 (1.6771) [2022-01-23 04:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][420/1251] eta 0:30:59 lr 0.000385 time 2.0338 (2.2378) loss 2.5843 (3.4519) grad_norm 1.5633 (1.6770) [2022-01-23 04:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[173/300][430/1251] eta 0:30:35 lr 0.000385 time 2.5709 (2.2352) loss 3.0408 (3.4516) grad_norm 1.6431 (1.6770) [2022-01-23 04:18:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][440/1251] eta 0:30:11 lr 0.000385 time 1.8873 (2.2336) loss 3.9568 (3.4498) grad_norm 1.5625 (1.6787) [2022-01-23 04:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][450/1251] eta 0:29:49 lr 0.000385 time 2.9272 (2.2341) loss 3.2008 (3.4513) grad_norm 1.6038 (1.6791) [2022-01-23 04:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][460/1251] eta 0:29:25 lr 0.000385 time 1.8330 (2.2321) loss 3.0103 (3.4474) grad_norm 1.5011 (1.6808) [2022-01-23 04:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][470/1251] eta 0:29:01 lr 0.000385 time 2.7986 (2.2304) loss 3.7578 (3.4463) grad_norm 1.5333 (1.6814) [2022-01-23 04:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][480/1251] eta 0:28:42 lr 0.000385 time 2.3850 (2.2343) loss 3.4952 (3.4421) grad_norm 1.5677 (1.6803) [2022-01-23 04:20:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][490/1251] eta 0:28:21 lr 0.000385 time 2.5682 (2.2362) loss 3.2679 (3.4432) grad_norm 1.6770 (1.6799) [2022-01-23 04:20:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][500/1251] eta 0:27:59 lr 0.000385 time 1.8982 (2.2362) loss 3.7601 (3.4469) grad_norm 1.5085 (1.6807) [2022-01-23 04:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][510/1251] eta 0:27:33 lr 0.000385 time 1.8666 (2.2320) loss 4.2995 (3.4483) grad_norm 1.7669 (1.6814) [2022-01-23 04:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][520/1251] eta 0:27:08 lr 0.000385 time 1.9207 (2.2274) loss 2.6988 (3.4483) grad_norm 1.6059 (1.6816) [2022-01-23 04:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][530/1251] eta 0:26:43 lr 0.000385 time 1.6045 (2.2247) loss 3.9079 (3.4467) grad_norm 
1.4039 (1.6816) [2022-01-23 04:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][540/1251] eta 0:26:20 lr 0.000385 time 1.8050 (2.2231) loss 3.5828 (3.4449) grad_norm 1.5547 (1.6813) [2022-01-23 04:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][550/1251] eta 0:25:59 lr 0.000385 time 2.0082 (2.2247) loss 3.5709 (3.4476) grad_norm 1.5037 (1.6798) [2022-01-23 04:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][560/1251] eta 0:25:35 lr 0.000385 time 1.6811 (2.2227) loss 3.2628 (3.4477) grad_norm 1.7374 (1.6820) [2022-01-23 04:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][570/1251] eta 0:25:13 lr 0.000385 time 3.0177 (2.2231) loss 3.9044 (3.4500) grad_norm 1.5746 (1.6846) [2022-01-23 04:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][580/1251] eta 0:24:53 lr 0.000385 time 2.1208 (2.2255) loss 4.0641 (3.4543) grad_norm 1.7179 (1.6851) [2022-01-23 04:23:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][590/1251] eta 0:24:31 lr 0.000385 time 2.5056 (2.2262) loss 3.7166 (3.4562) grad_norm 1.8383 (1.6858) [2022-01-23 04:24:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][600/1251] eta 0:24:08 lr 0.000385 time 2.1403 (2.2253) loss 3.3340 (3.4541) grad_norm 1.4625 (1.6846) [2022-01-23 04:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][610/1251] eta 0:23:47 lr 0.000384 time 3.1361 (2.2264) loss 3.6883 (3.4561) grad_norm 1.5198 (1.6840) [2022-01-23 04:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][620/1251] eta 0:23:24 lr 0.000384 time 1.8804 (2.2264) loss 3.7270 (3.4585) grad_norm 1.5458 (1.6843) [2022-01-23 04:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][630/1251] eta 0:22:59 lr 0.000384 time 1.8077 (2.2222) loss 2.4767 (3.4603) grad_norm 1.6951 (1.6848) [2022-01-23 04:25:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[173/300][640/1251] eta 0:22:38 lr 0.000384 time 1.9253 (2.2229) loss 3.7439 (3.4622) grad_norm 1.5705 (1.6841) [2022-01-23 04:26:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][650/1251] eta 0:22:15 lr 0.000384 time 2.2088 (2.2221) loss 3.6005 (3.4623) grad_norm 1.7016 (1.6835) [2022-01-23 04:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][660/1251] eta 0:21:52 lr 0.000384 time 1.8182 (2.2212) loss 3.6414 (3.4607) grad_norm 1.9230 (1.6845) [2022-01-23 04:26:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][670/1251] eta 0:21:29 lr 0.000384 time 1.9618 (2.2190) loss 4.1914 (3.4639) grad_norm 1.6844 (1.6836) [2022-01-23 04:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][680/1251] eta 0:21:05 lr 0.000384 time 1.8924 (2.2162) loss 3.7517 (3.4634) grad_norm 1.4144 (1.6830) [2022-01-23 04:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][690/1251] eta 0:20:42 lr 0.000384 time 2.5468 (2.2150) loss 3.1693 (3.4642) grad_norm 1.4626 (1.6817) [2022-01-23 04:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][700/1251] eta 0:20:20 lr 0.000384 time 2.4061 (2.2154) loss 2.7523 (3.4657) grad_norm 1.6906 (1.6816) [2022-01-23 04:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][710/1251] eta 0:20:00 lr 0.000384 time 1.7566 (2.2192) loss 2.7219 (3.4626) grad_norm 1.4500 (1.6807) [2022-01-23 04:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][720/1251] eta 0:19:38 lr 0.000384 time 2.9153 (2.2199) loss 3.4122 (3.4630) grad_norm 1.5142 (1.6811) [2022-01-23 04:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][730/1251] eta 0:19:16 lr 0.000384 time 1.8896 (2.2195) loss 3.0670 (3.4581) grad_norm 1.4469 (1.6810) [2022-01-23 04:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][740/1251] eta 0:18:52 lr 0.000384 time 2.2603 (2.2160) loss 3.5497 (3.4583) grad_norm 
1.5632 (1.6808) [2022-01-23 04:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][750/1251] eta 0:18:30 lr 0.000384 time 2.2448 (2.2168) loss 2.7764 (3.4592) grad_norm 1.5642 (1.6806) [2022-01-23 04:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][760/1251] eta 0:18:08 lr 0.000384 time 2.3659 (2.2159) loss 3.4142 (3.4560) grad_norm 1.7683 (1.6810) [2022-01-23 04:30:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][770/1251] eta 0:17:45 lr 0.000384 time 2.2340 (2.2147) loss 2.4687 (3.4569) grad_norm 1.5221 (1.6797) [2022-01-23 04:30:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][780/1251] eta 0:17:22 lr 0.000384 time 1.9277 (2.2129) loss 4.1666 (3.4597) grad_norm 2.3002 (1.6815) [2022-01-23 04:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][790/1251] eta 0:16:59 lr 0.000384 time 2.0851 (2.2112) loss 3.8067 (3.4570) grad_norm 1.5061 (1.6828) [2022-01-23 04:31:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][800/1251] eta 0:16:37 lr 0.000384 time 1.8461 (2.2120) loss 3.7660 (3.4578) grad_norm 1.3777 (1.6824) [2022-01-23 04:31:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][810/1251] eta 0:16:16 lr 0.000384 time 1.9136 (2.2145) loss 3.6818 (3.4566) grad_norm 1.7495 (1.6823) [2022-01-23 04:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][820/1251] eta 0:15:54 lr 0.000384 time 2.8259 (2.2151) loss 3.5728 (3.4587) grad_norm 1.7862 (1.6824) [2022-01-23 04:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][830/1251] eta 0:15:31 lr 0.000384 time 1.5905 (2.2129) loss 3.6411 (3.4610) grad_norm 1.9005 (1.6822) [2022-01-23 04:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][840/1251] eta 0:15:08 lr 0.000384 time 1.9311 (2.2105) loss 3.8488 (3.4629) grad_norm 1.6336 (1.6818) [2022-01-23 04:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[173/300][850/1251] eta 0:14:46 lr 0.000384 time 2.4967 (2.2100) loss 4.3020 (3.4659) grad_norm 1.6378 (1.6814) [2022-01-23 04:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][860/1251] eta 0:14:24 lr 0.000383 time 1.8761 (2.2103) loss 3.9309 (3.4638) grad_norm 1.4743 (1.6818) [2022-01-23 04:34:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][870/1251] eta 0:14:02 lr 0.000383 time 2.4392 (2.2111) loss 3.4674 (3.4625) grad_norm 1.7979 (1.6811) [2022-01-23 04:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][880/1251] eta 0:13:40 lr 0.000383 time 2.2771 (2.2128) loss 3.3084 (3.4613) grad_norm 1.7258 (1.6814) [2022-01-23 04:34:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][890/1251] eta 0:13:18 lr 0.000383 time 2.2251 (2.2111) loss 3.4618 (3.4625) grad_norm 1.5189 (1.6811) [2022-01-23 04:35:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][900/1251] eta 0:12:55 lr 0.000383 time 1.7973 (2.2096) loss 2.4311 (3.4633) grad_norm 1.8178 (1.6807) [2022-01-23 04:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][910/1251] eta 0:12:32 lr 0.000383 time 2.4805 (2.2081) loss 3.9249 (3.4633) grad_norm 1.5383 (1.6807) [2022-01-23 04:35:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][920/1251] eta 0:12:10 lr 0.000383 time 2.2471 (2.2070) loss 3.3838 (3.4625) grad_norm 1.6236 (1.6797) [2022-01-23 04:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][930/1251] eta 0:11:48 lr 0.000383 time 2.1517 (2.2069) loss 3.9180 (3.4641) grad_norm 1.7318 (1.6796) [2022-01-23 04:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][940/1251] eta 0:11:26 lr 0.000383 time 1.5273 (2.2065) loss 3.5789 (3.4679) grad_norm 1.9219 (1.6796) [2022-01-23 04:36:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][950/1251] eta 0:11:04 lr 0.000383 time 2.4916 (2.2068) loss 3.7213 (3.4693) grad_norm 
1.7295 (1.6809) [2022-01-23 04:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][960/1251] eta 0:10:42 lr 0.000383 time 2.5126 (2.2066) loss 4.0397 (3.4680) grad_norm 1.7220 (1.6832) [2022-01-23 04:37:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][970/1251] eta 0:10:19 lr 0.000383 time 2.2314 (2.2059) loss 3.9879 (3.4678) grad_norm 2.1046 (1.6839) [2022-01-23 04:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][980/1251] eta 0:09:57 lr 0.000383 time 1.9438 (2.2059) loss 3.9437 (3.4673) grad_norm 1.5547 (1.6832) [2022-01-23 04:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][990/1251] eta 0:09:35 lr 0.000383 time 2.2927 (2.2057) loss 3.2680 (3.4672) grad_norm 1.9649 (1.6831) [2022-01-23 04:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1000/1251] eta 0:09:13 lr 0.000383 time 2.5622 (2.2062) loss 3.9676 (3.4700) grad_norm 1.4422 (1.6825) [2022-01-23 04:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1010/1251] eta 0:08:52 lr 0.000383 time 2.2386 (2.2079) loss 2.5067 (3.4701) grad_norm 1.5795 (1.6822) [2022-01-23 04:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1020/1251] eta 0:08:30 lr 0.000383 time 2.5633 (2.2085) loss 2.4900 (3.4716) grad_norm 1.6022 (1.6818) [2022-01-23 04:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1030/1251] eta 0:08:07 lr 0.000383 time 2.0413 (2.2074) loss 3.3321 (3.4713) grad_norm 1.6772 (1.6810) [2022-01-23 04:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1040/1251] eta 0:07:45 lr 0.000383 time 2.0144 (2.2046) loss 3.1932 (3.4712) grad_norm 2.4131 (1.6826) [2022-01-23 04:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1050/1251] eta 0:07:22 lr 0.000383 time 2.2818 (2.2032) loss 3.9025 (3.4716) grad_norm 1.5028 (1.6825) [2022-01-23 04:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[173/300][1060/1251] eta 0:07:00 lr 0.000383 time 1.8787 (2.2022) loss 2.5433 (3.4704) grad_norm 1.8211 (1.6831) [2022-01-23 04:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1070/1251] eta 0:06:38 lr 0.000383 time 2.8468 (2.2018) loss 2.8736 (3.4706) grad_norm 1.7573 (1.6831) [2022-01-23 04:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1080/1251] eta 0:06:16 lr 0.000383 time 1.9180 (2.2019) loss 3.8219 (3.4701) grad_norm 1.5896 (1.6835) [2022-01-23 04:42:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1090/1251] eta 0:05:54 lr 0.000383 time 1.9151 (2.2025) loss 4.0343 (3.4706) grad_norm 1.9534 (1.6834) [2022-01-23 04:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1100/1251] eta 0:05:32 lr 0.000383 time 2.2928 (2.2027) loss 2.8538 (3.4685) grad_norm 1.7355 (1.6841) [2022-01-23 04:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1110/1251] eta 0:05:10 lr 0.000382 time 2.5812 (2.2023) loss 3.6745 (3.4677) grad_norm 1.6530 (1.6838) [2022-01-23 04:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1120/1251] eta 0:04:48 lr 0.000382 time 2.2241 (2.2026) loss 3.1967 (3.4649) grad_norm 1.6827 (1.6833) [2022-01-23 04:43:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1130/1251] eta 0:04:26 lr 0.000382 time 1.5755 (2.2021) loss 3.6709 (3.4630) grad_norm 1.6323 (1.6835) [2022-01-23 04:43:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1140/1251] eta 0:04:04 lr 0.000382 time 1.8953 (2.2012) loss 3.2066 (3.4637) grad_norm 1.7218 (1.6840) [2022-01-23 04:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1150/1251] eta 0:03:42 lr 0.000382 time 2.1611 (2.2024) loss 3.7486 (3.4646) grad_norm 1.5425 (1.6831) [2022-01-23 04:44:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1160/1251] eta 0:03:20 lr 0.000382 time 1.5940 (2.2025) loss 3.2447 (3.4621) 
grad_norm 1.6139 (1.6839) [2022-01-23 04:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1170/1251] eta 0:02:58 lr 0.000382 time 2.2781 (2.2019) loss 3.7862 (3.4626) grad_norm 1.5697 (1.6838) [2022-01-23 04:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1180/1251] eta 0:02:36 lr 0.000382 time 2.2173 (2.2015) loss 3.5311 (3.4615) grad_norm 1.4899 (1.6840) [2022-01-23 04:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1190/1251] eta 0:02:14 lr 0.000382 time 2.2470 (2.2024) loss 3.9797 (3.4650) grad_norm 1.7765 (1.6844) [2022-01-23 04:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1200/1251] eta 0:01:52 lr 0.000382 time 1.6389 (2.2020) loss 3.5636 (3.4652) grad_norm 1.5216 (1.6842) [2022-01-23 04:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1210/1251] eta 0:01:30 lr 0.000382 time 1.8875 (2.2008) loss 2.7534 (3.4643) grad_norm 1.4864 (1.6838) [2022-01-23 04:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1220/1251] eta 0:01:08 lr 0.000382 time 3.1590 (2.2008) loss 2.6312 (3.4652) grad_norm 1.6885 (1.6837) [2022-01-23 04:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1230/1251] eta 0:00:46 lr 0.000382 time 2.1232 (2.2012) loss 3.4551 (3.4663) grad_norm 1.6070 (1.6835) [2022-01-23 04:47:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1240/1251] eta 0:00:24 lr 0.000382 time 1.2233 (2.1995) loss 3.8302 (3.4670) grad_norm 1.4020 (1.6838) [2022-01-23 04:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [173/300][1250/1251] eta 0:00:02 lr 0.000382 time 1.1893 (2.1941) loss 3.8783 (3.4682) grad_norm 1.5371 (1.6833) [2022-01-23 04:47:44 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 173 training takes 0:45:45 [2022-01-23 04:48:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.728 (19.728) Loss 0.9694 (0.9694) Acc@1 78.516 (78.516) 
Acc@5 94.434 (94.434) [2022-01-23 04:48:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.738 (3.429) Loss 0.9110 (0.9613) Acc@1 79.492 (77.282) Acc@5 94.141 (94.087) [2022-01-23 04:48:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.364 (2.605) Loss 1.0542 (0.9731) Acc@1 74.609 (77.065) Acc@5 92.773 (93.955) [2022-01-23 04:48:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.206 (2.286) Loss 0.9575 (0.9862) Acc@1 78.320 (76.972) Acc@5 94.434 (93.725) [2022-01-23 04:49:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.890 (2.170) Loss 1.0123 (0.9858) Acc@1 76.660 (76.944) Acc@5 93.848 (93.736) [2022-01-23 04:49:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 76.910 Acc@5 93.786 [2022-01-23 04:49:20 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 76.9% [2022-01-23 04:49:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.01% [2022-01-23 04:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][0/1251] eta 7:28:55 lr 0.000382 time 21.5309 (21.5309) loss 2.9704 (2.9704) grad_norm 1.5682 (1.5682) [2022-01-23 04:50:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][10/1251] eta 1:20:46 lr 0.000382 time 1.5624 (3.9056) loss 2.9898 (3.4771) grad_norm 1.5506 (1.6656) [2022-01-23 04:50:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][20/1251] eta 1:02:31 lr 0.000382 time 1.5512 (3.0473) loss 2.5172 (3.3075) grad_norm 1.4729 (1.6478) [2022-01-23 04:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][30/1251] eta 0:55:52 lr 0.000382 time 1.4013 (2.7461) loss 3.1554 (3.3695) grad_norm 1.7703 (1.6524) [2022-01-23 04:51:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][40/1251] eta 0:54:17 lr 0.000382 time 3.8125 (2.6898) loss 3.5889 (3.4361) grad_norm 1.4887 (1.6506) [2022-01-23 04:51:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][50/1251] eta 0:53:23 lr 0.000382 time 2.6125 (2.6671) loss 3.1503 (3.4637) grad_norm 1.8061 (1.6476) [2022-01-23 04:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][60/1251] eta 0:51:15 lr 0.000382 time 1.5519 (2.5823) loss 2.9755 (3.3943) grad_norm 1.5071 (1.6411) [2022-01-23 04:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][70/1251] eta 0:49:42 lr 0.000382 time 1.5609 (2.5252) loss 3.4142 (3.3686) grad_norm 1.6423 (1.6341) [2022-01-23 04:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][80/1251] eta 0:48:39 lr 0.000382 time 3.2433 (2.4932) loss 3.7466 (3.3923) grad_norm 1.5643 (1.6426) [2022-01-23 04:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][90/1251] eta 0:47:57 lr 0.000382 time 3.0884 (2.4783) loss 3.6353 (3.4098) grad_norm 1.4104 (1.6263) [2022-01-23 04:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][100/1251] eta 0:46:30 lr 0.000381 time 1.6732 (2.4244) loss 3.1061 (3.4250) grad_norm 1.6404 (1.6257) [2022-01-23 04:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][110/1251] eta 0:45:30 lr 0.000381 time 2.2583 (2.3927) loss 3.7782 (3.4286) grad_norm 1.5806 (1.6303) [2022-01-23 04:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][120/1251] eta 0:44:40 lr 0.000381 time 2.2799 (2.3700) loss 2.1670 (3.4192) grad_norm 1.6715 (1.6306) [2022-01-23 04:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][130/1251] eta 0:43:50 lr 0.000381 time 2.4489 (2.3463) loss 3.3559 (3.4177) grad_norm 1.7555 (1.6289) [2022-01-23 04:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][140/1251] eta 0:43:02 lr 0.000381 time 2.2985 (2.3248) loss 3.3840 (3.4200) grad_norm 1.8606 (1.6308) [2022-01-23 04:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][150/1251] eta 0:42:28 lr 0.000381 
time 2.0559 (2.3145) loss 4.1225 (3.4219) grad_norm 1.6509 (1.6356) [2022-01-23 04:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][160/1251] eta 0:41:59 lr 0.000381 time 1.8720 (2.3090) loss 3.4206 (3.4405) grad_norm 2.0039 (1.6404) [2022-01-23 04:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][170/1251] eta 0:41:25 lr 0.000381 time 1.5458 (2.2990) loss 3.5826 (3.4538) grad_norm 1.9164 (1.6527) [2022-01-23 04:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][180/1251] eta 0:41:09 lr 0.000381 time 3.1310 (2.3057) loss 3.4777 (3.4508) grad_norm 3.2199 (1.7307) [2022-01-23 04:56:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][190/1251] eta 0:40:53 lr 0.000381 time 2.6255 (2.3128) loss 2.9175 (3.4433) grad_norm 1.7423 (1.7296) [2022-01-23 04:57:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][200/1251] eta 0:40:33 lr 0.000381 time 2.1745 (2.3158) loss 4.1459 (3.4521) grad_norm 1.6472 (1.7264) [2022-01-23 04:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][210/1251] eta 0:40:09 lr 0.000381 time 1.8630 (2.3150) loss 3.3582 (3.4649) grad_norm 1.8248 (1.7249) [2022-01-23 04:57:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][220/1251] eta 0:39:41 lr 0.000381 time 2.4000 (2.3103) loss 3.8075 (3.4658) grad_norm 1.9128 (1.7223) [2022-01-23 04:58:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][230/1251] eta 0:39:06 lr 0.000381 time 1.8681 (2.2985) loss 2.1892 (3.4444) grad_norm 1.8006 (1.7184) [2022-01-23 04:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][240/1251] eta 0:38:25 lr 0.000381 time 1.8864 (2.2801) loss 3.8179 (3.4510) grad_norm 1.3964 (1.7127) [2022-01-23 04:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][250/1251] eta 0:37:51 lr 0.000381 time 1.9474 (2.2694) loss 3.6068 (3.4550) grad_norm 1.8754 (1.7088) [2022-01-23 04:59:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][260/1251] eta 0:37:21 lr 0.000381 time 2.3643 (2.2622) loss 3.6781 (3.4503) grad_norm 1.6931 (1.7059) [2022-01-23 04:59:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][270/1251] eta 0:37:01 lr 0.000381 time 2.5246 (2.2645) loss 3.7990 (3.4481) grad_norm 1.6162 (1.7040) [2022-01-23 04:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][280/1251] eta 0:36:40 lr 0.000381 time 2.2349 (2.2667) loss 3.8670 (3.4469) grad_norm 1.7224 (1.7063) [2022-01-23 05:00:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][290/1251] eta 0:36:17 lr 0.000381 time 2.7867 (2.2658) loss 2.9358 (3.4440) grad_norm 1.6798 (1.7063) [2022-01-23 05:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][300/1251] eta 0:35:56 lr 0.000381 time 1.9149 (2.2672) loss 2.5734 (3.4405) grad_norm 1.4707 (1.7060) [2022-01-23 05:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][310/1251] eta 0:35:32 lr 0.000381 time 2.1389 (2.2659) loss 2.9028 (3.4406) grad_norm 1.8276 (1.7087) [2022-01-23 05:01:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][320/1251] eta 0:35:08 lr 0.000381 time 2.1763 (2.2647) loss 3.5930 (3.4375) grad_norm 2.0408 (1.7097) [2022-01-23 05:01:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][330/1251] eta 0:34:44 lr 0.000381 time 1.8327 (2.2628) loss 3.8552 (3.4449) grad_norm 1.8187 (1.7094) [2022-01-23 05:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][340/1251] eta 0:34:19 lr 0.000381 time 2.1761 (2.2610) loss 3.7614 (3.4439) grad_norm 1.4847 (1.7087) [2022-01-23 05:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][350/1251] eta 0:33:52 lr 0.000380 time 1.8060 (2.2556) loss 3.5578 (3.4423) grad_norm 1.6737 (1.7066) [2022-01-23 05:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][360/1251] eta 0:33:24 lr 
0.000380 time 1.8327 (2.2498) loss 3.9686 (3.4464) grad_norm 2.0657 (1.7063) [2022-01-23 05:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][370/1251] eta 0:32:59 lr 0.000380 time 1.7772 (2.2466) loss 4.1975 (3.4465) grad_norm 1.6905 (1.7077) [2022-01-23 05:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][380/1251] eta 0:32:32 lr 0.000380 time 1.8235 (2.2416) loss 3.5628 (3.4474) grad_norm 1.7961 (1.7083) [2022-01-23 05:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][390/1251] eta 0:32:06 lr 0.000380 time 2.2630 (2.2374) loss 4.2495 (3.4518) grad_norm 1.6113 (1.7056) [2022-01-23 05:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][400/1251] eta 0:31:41 lr 0.000380 time 2.1706 (2.2341) loss 3.8314 (3.4550) grad_norm 1.5141 (1.7027) [2022-01-23 05:04:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][410/1251] eta 0:31:17 lr 0.000380 time 1.8808 (2.2322) loss 2.1646 (3.4566) grad_norm 1.8402 (1.7021) [2022-01-23 05:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][420/1251] eta 0:30:57 lr 0.000380 time 2.6242 (2.2352) loss 3.2452 (3.4526) grad_norm 1.6775 (1.7013) [2022-01-23 05:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][430/1251] eta 0:30:32 lr 0.000380 time 1.8500 (2.2321) loss 3.9744 (3.4568) grad_norm 1.7421 (1.7000) [2022-01-23 05:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][440/1251] eta 0:30:06 lr 0.000380 time 1.7447 (2.2281) loss 2.6190 (3.4585) grad_norm 1.7851 (1.7011) [2022-01-23 05:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][450/1251] eta 0:29:43 lr 0.000380 time 1.9995 (2.2271) loss 3.8012 (3.4602) grad_norm 1.6036 (1.7007) [2022-01-23 05:06:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][460/1251] eta 0:29:20 lr 0.000380 time 1.6362 (2.2252) loss 3.6453 (3.4658) grad_norm 1.5261 (1.7017) [2022-01-23 05:06:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][470/1251] eta 0:28:57 lr 0.000380 time 1.8767 (2.2250) loss 3.7433 (3.4675) grad_norm 1.6025 (1.7010) [2022-01-23 05:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][480/1251] eta 0:28:35 lr 0.000380 time 1.4198 (2.2248) loss 3.3038 (3.4704) grad_norm 1.8325 (1.7006) [2022-01-23 05:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][490/1251] eta 0:28:12 lr 0.000380 time 1.8852 (2.2236) loss 3.1070 (3.4688) grad_norm 1.6662 (1.6993) [2022-01-23 05:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][500/1251] eta 0:27:49 lr 0.000380 time 2.2124 (2.2231) loss 3.4236 (3.4663) grad_norm 1.5467 (1.6985) [2022-01-23 05:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][510/1251] eta 0:27:28 lr 0.000380 time 2.2454 (2.2242) loss 2.5898 (3.4642) grad_norm 1.7056 (1.6956) [2022-01-23 05:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][520/1251] eta 0:27:04 lr 0.000380 time 1.6693 (2.2219) loss 2.6374 (3.4621) grad_norm 1.7514 (1.6946) [2022-01-23 05:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][530/1251] eta 0:26:44 lr 0.000380 time 1.9329 (2.2249) loss 2.4368 (3.4574) grad_norm 1.9478 (1.6941) [2022-01-23 05:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][540/1251] eta 0:26:24 lr 0.000380 time 2.4149 (2.2288) loss 3.7801 (3.4599) grad_norm 1.6396 (1.6961) [2022-01-23 05:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][550/1251] eta 0:26:01 lr 0.000380 time 1.8266 (2.2271) loss 3.1945 (3.4579) grad_norm 1.4913 (1.6963) [2022-01-23 05:10:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][560/1251] eta 0:25:36 lr 0.000380 time 1.8678 (2.2241) loss 2.6228 (3.4580) grad_norm 1.7768 (1.6984) [2022-01-23 05:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][570/1251] eta 0:25:13 lr 
0.000380 time 1.9588 (2.2222) loss 3.9511 (3.4619) grad_norm 1.9742 (1.6972) [2022-01-23 05:10:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][580/1251] eta 0:24:49 lr 0.000380 time 1.8980 (2.2194) loss 2.5360 (3.4604) grad_norm 1.5214 (1.6965) [2022-01-23 05:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][590/1251] eta 0:24:25 lr 0.000380 time 1.9337 (2.2172) loss 3.8531 (3.4615) grad_norm 1.7338 (1.6970) [2022-01-23 05:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][600/1251] eta 0:24:03 lr 0.000379 time 2.7748 (2.2173) loss 3.9517 (3.4619) grad_norm 1.5553 (1.6968) [2022-01-23 05:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][610/1251] eta 0:23:41 lr 0.000379 time 1.6614 (2.2181) loss 4.0706 (3.4557) grad_norm 1.8008 (1.6965) [2022-01-23 05:12:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][620/1251] eta 0:23:19 lr 0.000379 time 1.9124 (2.2180) loss 3.2079 (3.4563) grad_norm 1.6271 (1.6949) [2022-01-23 05:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][630/1251] eta 0:22:58 lr 0.000379 time 2.0498 (2.2206) loss 2.7984 (3.4567) grad_norm 1.6322 (1.6956) [2022-01-23 05:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][640/1251] eta 0:22:37 lr 0.000379 time 2.8685 (2.2215) loss 3.3327 (3.4532) grad_norm 1.5317 (1.6945) [2022-01-23 05:13:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][650/1251] eta 0:22:13 lr 0.000379 time 1.8654 (2.2181) loss 3.4346 (3.4504) grad_norm 1.5931 (1.6944) [2022-01-23 05:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][660/1251] eta 0:21:48 lr 0.000379 time 1.9503 (2.2143) loss 3.9137 (3.4492) grad_norm 1.8102 (1.6920) [2022-01-23 05:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][670/1251] eta 0:21:25 lr 0.000379 time 2.2570 (2.2121) loss 3.9668 (3.4515) grad_norm 1.7394 (1.6910) [2022-01-23 05:14:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][680/1251] eta 0:21:03 lr 0.000379 time 2.2611 (2.2126) loss 3.1971 (3.4517) grad_norm 1.8090 (1.6918) [2022-01-23 05:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][690/1251] eta 0:20:40 lr 0.000379 time 1.5637 (2.2114) loss 3.6740 (3.4513) grad_norm 1.4802 (1.6912) [2022-01-23 05:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][700/1251] eta 0:20:19 lr 0.000379 time 2.0350 (2.2125) loss 2.4531 (3.4473) grad_norm 1.7280 (1.6911) [2022-01-23 05:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][710/1251] eta 0:19:56 lr 0.000379 time 1.8801 (2.2123) loss 3.5146 (3.4465) grad_norm 1.7272 (1.6898) [2022-01-23 05:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][720/1251] eta 0:19:33 lr 0.000379 time 1.3258 (2.2106) loss 3.9341 (3.4431) grad_norm 1.5550 (1.6891) [2022-01-23 05:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][730/1251] eta 0:19:11 lr 0.000379 time 1.7899 (2.2107) loss 4.1045 (3.4427) grad_norm 1.7562 (1.6891) [2022-01-23 05:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][740/1251] eta 0:18:49 lr 0.000379 time 2.0782 (2.2106) loss 3.4087 (3.4420) grad_norm 1.5238 (1.6899) [2022-01-23 05:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][750/1251] eta 0:18:27 lr 0.000379 time 1.8000 (2.2109) loss 4.1519 (3.4450) grad_norm 1.5779 (1.6907) [2022-01-23 05:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][760/1251] eta 0:18:06 lr 0.000379 time 1.9161 (2.2132) loss 3.6490 (3.4448) grad_norm 1.8548 (1.6910) [2022-01-23 05:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][770/1251] eta 0:17:44 lr 0.000379 time 1.5360 (2.2129) loss 2.1722 (3.4462) grad_norm 1.5086 (1.6898) [2022-01-23 05:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][780/1251] eta 0:17:23 lr 
0.000379 time 3.0248 (2.2150) loss 3.9907 (3.4450) grad_norm 1.4293 (1.6905) [2022-01-23 05:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][790/1251] eta 0:17:00 lr 0.000379 time 1.8485 (2.2145) loss 3.9561 (3.4472) grad_norm 1.4294 (1.6900) [2022-01-23 05:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][800/1251] eta 0:16:38 lr 0.000379 time 1.8382 (2.2133) loss 2.8114 (3.4469) grad_norm 1.3679 (1.6915) [2022-01-23 05:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][810/1251] eta 0:16:14 lr 0.000379 time 1.9488 (2.2101) loss 3.5492 (3.4482) grad_norm 1.6567 (1.6933) [2022-01-23 05:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][820/1251] eta 0:15:51 lr 0.000379 time 2.5121 (2.2087) loss 3.7401 (3.4493) grad_norm 1.6435 (1.6936) [2022-01-23 05:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][830/1251] eta 0:15:29 lr 0.000379 time 1.8746 (2.2067) loss 3.9237 (3.4502) grad_norm 1.5255 (1.6946) [2022-01-23 05:20:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][840/1251] eta 0:15:06 lr 0.000379 time 1.8819 (2.2063) loss 2.8900 (3.4502) grad_norm 1.8026 (1.6938) [2022-01-23 05:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][850/1251] eta 0:14:44 lr 0.000378 time 2.1413 (2.2066) loss 2.2469 (3.4481) grad_norm 1.7263 (1.6937) [2022-01-23 05:21:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][860/1251] eta 0:14:22 lr 0.000378 time 2.3294 (2.2064) loss 3.7148 (3.4480) grad_norm 1.5587 (1.6935) [2022-01-23 05:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][870/1251] eta 0:14:00 lr 0.000378 time 2.1029 (2.2069) loss 2.8924 (3.4448) grad_norm 1.6278 (1.6932) [2022-01-23 05:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][880/1251] eta 0:13:39 lr 0.000378 time 1.8976 (2.2086) loss 3.9461 (3.4476) grad_norm 1.9697 (1.6926) [2022-01-23 05:22:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][890/1251] eta 0:13:18 lr 0.000378 time 2.1898 (2.2109) loss 3.6947 (3.4502) grad_norm 1.5935 (1.6922) [2022-01-23 05:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][900/1251] eta 0:12:55 lr 0.000378 time 2.0135 (2.2097) loss 3.8879 (3.4502) grad_norm 1.4831 (1.6913) [2022-01-23 05:22:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][910/1251] eta 0:12:32 lr 0.000378 time 1.7964 (2.2071) loss 3.7655 (3.4498) grad_norm 1.5359 (1.6903) [2022-01-23 05:23:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][920/1251] eta 0:12:10 lr 0.000378 time 1.9910 (2.2056) loss 3.9422 (3.4524) grad_norm 1.5879 (1.6900) [2022-01-23 05:23:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][930/1251] eta 0:11:48 lr 0.000378 time 1.9103 (2.2065) loss 3.7590 (3.4505) grad_norm 1.4757 (1.6885) [2022-01-23 05:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][940/1251] eta 0:11:26 lr 0.000378 time 2.5644 (2.2064) loss 2.9812 (3.4501) grad_norm 1.5475 (1.6889) [2022-01-23 05:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][950/1251] eta 0:11:04 lr 0.000378 time 1.4542 (2.2064) loss 3.6814 (3.4513) grad_norm 1.3513 (1.6892) [2022-01-23 05:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][960/1251] eta 0:10:41 lr 0.000378 time 1.6956 (2.2045) loss 3.6650 (3.4540) grad_norm 1.5665 (1.6890) [2022-01-23 05:24:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][970/1251] eta 0:10:19 lr 0.000378 time 1.8532 (2.2029) loss 3.2933 (3.4530) grad_norm 1.5979 (1.6893) [2022-01-23 05:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][980/1251] eta 0:09:57 lr 0.000378 time 1.8916 (2.2030) loss 3.6329 (3.4518) grad_norm 1.7193 (1.6889) [2022-01-23 05:25:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][990/1251] eta 0:09:34 lr 
0.000378 time 1.7721 (2.2030) loss 4.1696 (3.4524) grad_norm 1.6540 (1.6887) [2022-01-23 05:26:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1000/1251] eta 0:09:13 lr 0.000378 time 1.6212 (2.2032) loss 3.2957 (3.4472) grad_norm 1.9031 (1.6889) [2022-01-23 05:26:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1010/1251] eta 0:08:51 lr 0.000378 time 1.6684 (2.2036) loss 4.0365 (3.4491) grad_norm 1.6240 (1.6899) [2022-01-23 05:26:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1020/1251] eta 0:08:29 lr 0.000378 time 1.6799 (2.2041) loss 3.4039 (3.4471) grad_norm 2.0764 (1.6903) [2022-01-23 05:27:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1030/1251] eta 0:08:06 lr 0.000378 time 1.8533 (2.2033) loss 3.7922 (3.4485) grad_norm 1.5461 (1.6901) [2022-01-23 05:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1040/1251] eta 0:07:45 lr 0.000378 time 2.4317 (2.2041) loss 3.7145 (3.4481) grad_norm 2.0442 (1.6902) [2022-01-23 05:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1050/1251] eta 0:07:23 lr 0.000378 time 1.9528 (2.2042) loss 3.0840 (3.4480) grad_norm 1.4800 (1.6909) [2022-01-23 05:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1060/1251] eta 0:07:01 lr 0.000378 time 1.9065 (2.2053) loss 3.5967 (3.4497) grad_norm 2.1472 (1.6917) [2022-01-23 05:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1070/1251] eta 0:06:38 lr 0.000378 time 1.8898 (2.2043) loss 4.0426 (3.4491) grad_norm 1.6436 (1.6917) [2022-01-23 05:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1080/1251] eta 0:06:16 lr 0.000378 time 1.9915 (2.2028) loss 2.5132 (3.4495) grad_norm 1.8605 (1.6915) [2022-01-23 05:29:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1090/1251] eta 0:05:54 lr 0.000378 time 2.0612 (2.2023) loss 3.3326 (3.4503) grad_norm 1.6924 (1.6914) [2022-01-23 
05:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1100/1251] eta 0:05:33 lr 0.000377 time 2.5164 (2.2061) loss 2.7218 (3.4491) grad_norm 1.6451 (1.6916) [2022-01-23 05:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1110/1251] eta 0:05:10 lr 0.000377 time 1.9442 (2.2053) loss 3.4186 (3.4492) grad_norm 1.8273 (1.6912) [2022-01-23 05:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1120/1251] eta 0:04:48 lr 0.000377 time 1.7150 (2.2032) loss 4.3813 (3.4517) grad_norm 1.5684 (1.6905) [2022-01-23 05:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1130/1251] eta 0:04:26 lr 0.000377 time 2.1475 (2.2023) loss 3.3726 (3.4527) grad_norm 1.6719 (1.6899) [2022-01-23 05:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1140/1251] eta 0:04:04 lr 0.000377 time 2.2208 (2.2026) loss 3.6445 (3.4525) grad_norm 1.5316 (1.6906) [2022-01-23 05:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1150/1251] eta 0:03:42 lr 0.000377 time 1.9620 (2.2016) loss 3.3434 (3.4529) grad_norm 1.4934 (1.6906) [2022-01-23 05:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1160/1251] eta 0:03:20 lr 0.000377 time 2.3389 (2.2025) loss 3.6593 (3.4547) grad_norm 1.8797 (1.6910) [2022-01-23 05:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1170/1251] eta 0:02:58 lr 0.000377 time 1.4864 (2.2019) loss 3.9643 (3.4539) grad_norm 1.6850 (1.6909) [2022-01-23 05:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1180/1251] eta 0:02:36 lr 0.000377 time 1.7106 (2.2035) loss 2.1500 (3.4520) grad_norm 1.7588 (1.6905) [2022-01-23 05:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1190/1251] eta 0:02:14 lr 0.000377 time 2.3928 (2.2037) loss 2.7226 (3.4510) grad_norm 1.5559 (1.6903) [2022-01-23 05:33:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1200/1251] 
eta 0:01:52 lr 0.000377 time 2.4627 (2.2044) loss 3.2501 (3.4506) grad_norm 1.7139 (1.6894) [2022-01-23 05:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1210/1251] eta 0:01:30 lr 0.000377 time 1.7080 (2.2044) loss 1.9800 (3.4511) grad_norm 1.5134 (1.6891) [2022-01-23 05:34:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1220/1251] eta 0:01:08 lr 0.000377 time 1.7813 (2.2049) loss 3.8494 (3.4513) grad_norm 1.6211 (1.6887) [2022-01-23 05:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1230/1251] eta 0:00:46 lr 0.000377 time 3.3662 (2.2051) loss 3.6295 (3.4521) grad_norm 1.6202 (1.6885) [2022-01-23 05:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1240/1251] eta 0:00:24 lr 0.000377 time 1.4845 (2.2032) loss 4.0699 (3.4504) grad_norm 1.5975 (1.6879) [2022-01-23 05:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [174/300][1250/1251] eta 0:00:02 lr 0.000377 time 1.1852 (2.1972) loss 3.3342 (3.4513) grad_norm 1.5302 (1.6882) [2022-01-23 05:35:09 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 174 training takes 0:45:49 [2022-01-23 05:35:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.889 (19.889) Loss 0.9903 (0.9903) Acc@1 77.344 (77.344) Acc@5 92.383 (92.383) [2022-01-23 05:35:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.985 (3.689) Loss 0.9520 (0.9462) Acc@1 77.930 (77.628) Acc@5 94.141 (94.070) [2022-01-23 05:36:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.949 (2.704) Loss 0.9065 (0.9527) Acc@1 78.809 (77.344) Acc@5 94.434 (94.043) [2022-01-23 05:36:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.957 (2.285) Loss 0.9695 (0.9661) Acc@1 77.930 (77.278) Acc@5 92.871 (93.775) [2022-01-23 05:36:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.594 (2.184) Loss 0.9792 (0.9700) Acc@1 76.855 (77.153) Acc@5 93.945 (93.738) 
[2022-01-23 05:36:46 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.154 Acc@5 93.768 [2022-01-23 05:36:46 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.2% [2022-01-23 05:36:46 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.15% [2022-01-23 05:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][0/1251] eta 7:13:51 lr 0.000377 time 20.8086 (20.8086) loss 4.0396 (4.0396) grad_norm 1.4355 (1.4355) [2022-01-23 05:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][10/1251] eta 1:23:50 lr 0.000377 time 2.5642 (4.0537) loss 3.5395 (3.5558) grad_norm 1.5863 (1.5502) [2022-01-23 05:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][20/1251] eta 1:05:01 lr 0.000377 time 2.1646 (3.1693) loss 3.2992 (3.4505) grad_norm 1.5323 (1.6056) [2022-01-23 05:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][30/1251] eta 0:57:17 lr 0.000377 time 1.7425 (2.8156) loss 2.5764 (3.4662) grad_norm 1.7542 (1.6475) [2022-01-23 05:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][40/1251] eta 0:55:00 lr 0.000377 time 5.3205 (2.7258) loss 2.6060 (3.4667) grad_norm 1.5299 (1.6428) [2022-01-23 05:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][50/1251] eta 0:52:31 lr 0.000377 time 1.5929 (2.6240) loss 3.4946 (3.4686) grad_norm 2.0163 (1.6500) [2022-01-23 05:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][60/1251] eta 0:51:01 lr 0.000377 time 1.7495 (2.5708) loss 3.0064 (3.4859) grad_norm 1.7417 (1.6497) [2022-01-23 05:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][70/1251] eta 0:49:41 lr 0.000377 time 1.7785 (2.5244) loss 4.1694 (3.4969) grad_norm 1.7752 (1.6602) [2022-01-23 05:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][80/1251] eta 0:48:52 lr 0.000377 time 4.6190 (2.5039) loss 3.5541 (3.4500) 
grad_norm 2.0146 (1.6713) [2022-01-23 05:40:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][90/1251] eta 0:48:06 lr 0.000377 time 1.5823 (2.4864) loss 3.7361 (3.4887) grad_norm 1.5480 (1.6761) [2022-01-23 05:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][100/1251] eta 0:47:12 lr 0.000376 time 1.8244 (2.4611) loss 3.8073 (3.4932) grad_norm 1.9155 (1.6892) [2022-01-23 05:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][110/1251] eta 0:46:12 lr 0.000376 time 1.8546 (2.4301) loss 4.1612 (3.4796) grad_norm 1.5650 (1.6934) [2022-01-23 05:41:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][120/1251] eta 0:45:28 lr 0.000376 time 2.9594 (2.4128) loss 3.6922 (3.4736) grad_norm 2.1339 (1.6991) [2022-01-23 05:41:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][130/1251] eta 0:44:24 lr 0.000376 time 2.2602 (2.3770) loss 2.8422 (3.4677) grad_norm 1.7112 (1.7039) [2022-01-23 05:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][140/1251] eta 0:43:33 lr 0.000376 time 2.0128 (2.3525) loss 2.9737 (3.4499) grad_norm 1.6853 (1.6978) [2022-01-23 05:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][150/1251] eta 0:42:52 lr 0.000376 time 2.2558 (2.3363) loss 3.4397 (3.4541) grad_norm 1.4827 (1.6910) [2022-01-23 05:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][160/1251] eta 0:42:24 lr 0.000376 time 3.3689 (2.3325) loss 3.8155 (3.4634) grad_norm 1.6663 (1.6930) [2022-01-23 05:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][170/1251] eta 0:42:05 lr 0.000376 time 2.2436 (2.3364) loss 3.5998 (3.4566) grad_norm 1.9962 (1.6968) [2022-01-23 05:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][180/1251] eta 0:41:34 lr 0.000376 time 2.0664 (2.3296) loss 3.8201 (3.4629) grad_norm 1.9141 (1.7002) [2022-01-23 05:44:10 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [175/300][190/1251] eta 0:41:07 lr 0.000376 time 2.2590 (2.3259) loss 2.8755 (3.4441) grad_norm 1.8667 (1.7011) [2022-01-23 05:44:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][200/1251] eta 0:40:42 lr 0.000376 time 2.6210 (2.3237) loss 3.9510 (3.4494) grad_norm 1.5978 (1.7004) [2022-01-23 05:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][210/1251] eta 0:40:03 lr 0.000376 time 1.6260 (2.3093) loss 3.6914 (3.4524) grad_norm 2.1788 (1.7037) [2022-01-23 05:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][220/1251] eta 0:39:30 lr 0.000376 time 2.0392 (2.2991) loss 3.5746 (3.4438) grad_norm 1.5391 (1.6996) [2022-01-23 05:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][230/1251] eta 0:39:03 lr 0.000376 time 2.3175 (2.2957) loss 3.1858 (3.4336) grad_norm 2.0038 (1.6991) [2022-01-23 05:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][240/1251] eta 0:38:47 lr 0.000376 time 3.0029 (2.3026) loss 3.4066 (3.4307) grad_norm 1.9525 (1.7016) [2022-01-23 05:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][250/1251] eta 0:38:22 lr 0.000376 time 2.4836 (2.3004) loss 2.6594 (3.4293) grad_norm 1.7536 (1.6998) [2022-01-23 05:46:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][260/1251] eta 0:37:49 lr 0.000376 time 1.8371 (2.2900) loss 4.0082 (3.4241) grad_norm 1.6502 (1.7011) [2022-01-23 05:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][270/1251] eta 0:37:17 lr 0.000376 time 1.9372 (2.2809) loss 3.5907 (3.4281) grad_norm 1.5940 (1.6998) [2022-01-23 05:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][280/1251] eta 0:36:46 lr 0.000376 time 1.9669 (2.2726) loss 3.7052 (3.4390) grad_norm 1.8006 (1.6966) [2022-01-23 05:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][290/1251] eta 0:36:20 lr 0.000376 time 2.1608 (2.2689) loss 2.9661 (3.4284) 
grad_norm 1.6906 (1.6987) [2022-01-23 05:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][300/1251] eta 0:35:51 lr 0.000376 time 1.8672 (2.2628) loss 3.3985 (3.4356) grad_norm 1.6633 (1.6966) [2022-01-23 05:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][310/1251] eta 0:35:26 lr 0.000376 time 1.8886 (2.2603) loss 3.9193 (3.4393) grad_norm 1.6924 (1.6946) [2022-01-23 05:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][320/1251] eta 0:34:57 lr 0.000376 time 1.9844 (2.2525) loss 3.3037 (3.4406) grad_norm 1.4553 (1.6932) [2022-01-23 05:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][330/1251] eta 0:34:32 lr 0.000376 time 2.4218 (2.2508) loss 2.4821 (3.4412) grad_norm 1.5474 (1.6937) [2022-01-23 05:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][340/1251] eta 0:34:09 lr 0.000376 time 2.7536 (2.2498) loss 4.0356 (3.4379) grad_norm 1.4458 (1.6936) [2022-01-23 05:49:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][350/1251] eta 0:33:48 lr 0.000375 time 2.5907 (2.2512) loss 3.8014 (3.4409) grad_norm 1.5749 (1.6916) [2022-01-23 05:50:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][360/1251] eta 0:33:25 lr 0.000375 time 2.1255 (2.2504) loss 2.7397 (3.4334) grad_norm 1.5967 (1.6918) [2022-01-23 05:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][370/1251] eta 0:33:03 lr 0.000375 time 2.5397 (2.2518) loss 3.2616 (3.4309) grad_norm 1.7142 (1.6913) [2022-01-23 05:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][380/1251] eta 0:32:41 lr 0.000375 time 2.1958 (2.2519) loss 2.7283 (3.4252) grad_norm 1.4581 (1.6887) [2022-01-23 05:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][390/1251] eta 0:32:19 lr 0.000375 time 2.5566 (2.2527) loss 3.0566 (3.4304) grad_norm 1.6753 (1.6894) [2022-01-23 05:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [175/300][400/1251] eta 0:31:53 lr 0.000375 time 1.8956 (2.2487) loss 2.5783 (3.4290) grad_norm 1.8316 (1.6917) [2022-01-23 05:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][410/1251] eta 0:31:33 lr 0.000375 time 2.1603 (2.2518) loss 4.0234 (3.4263) grad_norm 2.1218 (1.6945) [2022-01-23 05:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][420/1251] eta 0:31:07 lr 0.000375 time 1.6307 (2.2474) loss 4.0647 (3.4329) grad_norm 1.7000 (1.6976) [2022-01-23 05:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][430/1251] eta 0:30:47 lr 0.000375 time 2.3324 (2.2497) loss 4.0758 (3.4352) grad_norm 1.7913 (1.6972) [2022-01-23 05:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][440/1251] eta 0:30:20 lr 0.000375 time 1.5431 (2.2448) loss 3.0754 (3.4292) grad_norm 2.6964 (1.7000) [2022-01-23 05:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][450/1251] eta 0:29:57 lr 0.000375 time 2.2440 (2.2445) loss 2.8341 (3.4320) grad_norm 1.6961 (1.6998) [2022-01-23 05:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][460/1251] eta 0:29:34 lr 0.000375 time 1.5124 (2.2430) loss 3.5997 (3.4367) grad_norm 1.6149 (1.7030) [2022-01-23 05:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][470/1251] eta 0:29:12 lr 0.000375 time 2.2679 (2.2444) loss 4.0366 (3.4405) grad_norm 1.4993 (1.7021) [2022-01-23 05:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][480/1251] eta 0:28:46 lr 0.000375 time 2.1839 (2.2394) loss 2.8508 (3.4449) grad_norm 1.6506 (1.7017) [2022-01-23 05:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][490/1251] eta 0:28:20 lr 0.000375 time 1.7034 (2.2350) loss 3.7943 (3.4332) grad_norm 1.6392 (1.7008) [2022-01-23 05:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][500/1251] eta 0:27:58 lr 0.000375 time 2.2521 (2.2350) loss 3.6298 (3.4381) 
grad_norm 1.6221 (1.7011) [2022-01-23 05:55:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][510/1251] eta 0:27:37 lr 0.000375 time 2.7360 (2.2371) loss 2.4098 (3.4314) grad_norm 1.7005 (1.6991) [2022-01-23 05:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][520/1251] eta 0:27:16 lr 0.000375 time 2.7454 (2.2388) loss 2.4846 (3.4317) grad_norm 1.6754 (1.6984) [2022-01-23 05:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][530/1251] eta 0:26:53 lr 0.000375 time 1.5445 (2.2380) loss 3.1608 (3.4348) grad_norm 1.7717 (1.6977) [2022-01-23 05:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][540/1251] eta 0:26:28 lr 0.000375 time 1.5927 (2.2345) loss 2.9365 (3.4394) grad_norm 1.4699 (1.6984) [2022-01-23 05:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][550/1251] eta 0:26:05 lr 0.000375 time 1.6633 (2.2333) loss 3.6570 (3.4433) grad_norm 1.5042 (1.6966) [2022-01-23 05:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][560/1251] eta 0:25:40 lr 0.000375 time 2.2655 (2.2290) loss 4.2439 (3.4417) grad_norm 1.6480 (1.6959) [2022-01-23 05:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][570/1251] eta 0:25:15 lr 0.000375 time 1.8543 (2.2261) loss 2.5505 (3.4422) grad_norm 1.5427 (1.6950) [2022-01-23 05:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][580/1251] eta 0:24:53 lr 0.000375 time 2.1356 (2.2257) loss 3.7049 (3.4458) grad_norm 1.5314 (1.6951) [2022-01-23 05:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][590/1251] eta 0:24:33 lr 0.000375 time 2.3540 (2.2292) loss 3.5922 (3.4516) grad_norm 1.5018 (1.6944) [2022-01-23 05:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][600/1251] eta 0:24:11 lr 0.000374 time 1.9898 (2.2299) loss 3.0941 (3.4509) grad_norm 1.4473 (1.6935) [2022-01-23 05:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [175/300][610/1251] eta 0:23:50 lr 0.000374 time 2.1282 (2.2310) loss 3.3083 (3.4502) grad_norm 1.7901 (1.6935) [2022-01-23 05:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][620/1251] eta 0:23:25 lr 0.000374 time 1.7205 (2.2279) loss 3.4472 (3.4525) grad_norm 1.6601 (1.6933) [2022-01-23 06:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][630/1251] eta 0:23:04 lr 0.000374 time 2.2578 (2.2297) loss 2.4213 (3.4454) grad_norm 1.5299 (1.6924) [2022-01-23 06:00:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][640/1251] eta 0:22:40 lr 0.000374 time 1.5167 (2.2271) loss 3.6347 (3.4455) grad_norm 1.5329 (1.6919) [2022-01-23 06:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][650/1251] eta 0:22:18 lr 0.000374 time 1.8727 (2.2264) loss 3.6789 (3.4432) grad_norm 1.4334 (1.6912) [2022-01-23 06:01:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][660/1251] eta 0:21:55 lr 0.000374 time 2.0652 (2.2261) loss 3.3460 (3.4425) grad_norm 1.6981 (1.6903) [2022-01-23 06:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][670/1251] eta 0:21:33 lr 0.000374 time 2.7653 (2.2259) loss 3.6622 (3.4407) grad_norm 1.6009 (1.6909) [2022-01-23 06:02:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][680/1251] eta 0:21:10 lr 0.000374 time 1.9934 (2.2243) loss 3.6943 (3.4418) grad_norm 1.8572 (1.6900) [2022-01-23 06:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][690/1251] eta 0:20:47 lr 0.000374 time 2.0653 (2.2236) loss 3.3628 (3.4420) grad_norm 2.0545 (1.6901) [2022-01-23 06:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][700/1251] eta 0:20:24 lr 0.000374 time 1.9561 (2.2227) loss 3.5265 (3.4400) grad_norm 2.0177 (1.6906) [2022-01-23 06:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][710/1251] eta 0:20:02 lr 0.000374 time 2.2501 (2.2228) loss 3.4670 (3.4360) 
grad_norm 1.5481 (1.6893) [2022-01-23 06:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][720/1251] eta 0:19:39 lr 0.000374 time 2.4741 (2.2208) loss 2.4951 (3.4360) grad_norm 1.6066 (1.6890) [2022-01-23 06:03:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][730/1251] eta 0:19:16 lr 0.000374 time 2.1992 (2.2199) loss 2.4917 (3.4340) grad_norm 1.5403 (1.6894) [2022-01-23 06:04:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][740/1251] eta 0:18:53 lr 0.000374 time 2.0333 (2.2186) loss 2.9997 (3.4342) grad_norm 1.7593 (1.6901) [2022-01-23 06:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][750/1251] eta 0:18:31 lr 0.000374 time 2.4672 (2.2195) loss 3.6446 (3.4361) grad_norm 1.6772 (1.6902) [2022-01-23 06:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][760/1251] eta 0:18:09 lr 0.000374 time 1.4958 (2.2192) loss 3.3176 (3.4361) grad_norm 2.0769 (1.6918) [2022-01-23 06:05:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][770/1251] eta 0:17:47 lr 0.000374 time 1.8434 (2.2194) loss 3.2614 (3.4328) grad_norm 1.5246 (1.6928) [2022-01-23 06:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][780/1251] eta 0:17:25 lr 0.000374 time 2.2888 (2.2190) loss 3.5172 (3.4332) grad_norm 1.7300 (1.6944) [2022-01-23 06:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][790/1251] eta 0:17:02 lr 0.000374 time 2.2196 (2.2181) loss 3.3148 (3.4327) grad_norm 1.6172 (1.6959) [2022-01-23 06:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][800/1251] eta 0:16:41 lr 0.000374 time 3.0505 (2.2196) loss 3.7604 (3.4338) grad_norm 1.6177 (1.6948) [2022-01-23 06:06:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][810/1251] eta 0:16:17 lr 0.000374 time 1.4921 (2.2173) loss 3.7899 (3.4318) grad_norm 1.6707 (1.6946) [2022-01-23 06:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [175/300][820/1251] eta 0:15:56 lr 0.000374 time 2.4653 (2.2182) loss 3.6590 (3.4329) grad_norm 1.4150 (1.6955) [2022-01-23 06:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][830/1251] eta 0:15:33 lr 0.000374 time 1.8376 (2.2172) loss 4.2358 (3.4358) grad_norm 2.1359 (1.6954) [2022-01-23 06:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][840/1251] eta 0:15:11 lr 0.000374 time 2.7585 (2.2180) loss 2.5869 (3.4310) grad_norm 1.8489 (1.6958) [2022-01-23 06:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][850/1251] eta 0:14:48 lr 0.000373 time 1.8672 (2.2161) loss 3.7973 (3.4312) grad_norm 1.5442 (1.6953) [2022-01-23 06:08:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][860/1251] eta 0:14:26 lr 0.000373 time 2.7872 (2.2163) loss 4.2242 (3.4351) grad_norm 1.8639 (1.6961) [2022-01-23 06:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][870/1251] eta 0:14:03 lr 0.000373 time 1.6158 (2.2147) loss 3.9445 (3.4352) grad_norm 1.8542 (1.6962) [2022-01-23 06:09:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][880/1251] eta 0:13:41 lr 0.000373 time 1.8308 (2.2143) loss 3.8800 (3.4364) grad_norm 1.9913 (1.6971) [2022-01-23 06:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][890/1251] eta 0:13:19 lr 0.000373 time 1.7134 (2.2136) loss 3.4550 (3.4351) grad_norm 1.7316 (1.6971) [2022-01-23 06:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][900/1251] eta 0:12:57 lr 0.000373 time 2.6447 (2.2144) loss 3.9594 (3.4339) grad_norm 1.6373 (1.6970) [2022-01-23 06:10:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][910/1251] eta 0:12:35 lr 0.000373 time 2.6616 (2.2141) loss 3.3159 (3.4328) grad_norm 1.4995 (1.6969) [2022-01-23 06:10:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][920/1251] eta 0:12:13 lr 0.000373 time 1.8587 (2.2146) loss 3.4678 (3.4329) 
grad_norm 1.6763 (1.6968) [2022-01-23 06:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][930/1251] eta 0:11:50 lr 0.000373 time 1.9003 (2.2136) loss 4.0912 (3.4333) grad_norm 1.6766 (1.6971) [2022-01-23 06:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][940/1251] eta 0:11:28 lr 0.000373 time 1.7362 (2.2131) loss 3.3457 (3.4328) grad_norm 1.8496 (1.6986) [2022-01-23 06:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][950/1251] eta 0:11:05 lr 0.000373 time 1.8052 (2.2120) loss 3.5540 (3.4312) grad_norm 1.4613 (1.6990) [2022-01-23 06:12:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][960/1251] eta 0:10:43 lr 0.000373 time 1.8579 (2.2105) loss 3.0253 (3.4291) grad_norm 1.4932 (1.6992) [2022-01-23 06:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][970/1251] eta 0:10:20 lr 0.000373 time 2.1125 (2.2086) loss 2.9259 (3.4290) grad_norm 1.6741 (1.6996) [2022-01-23 06:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][980/1251] eta 0:09:58 lr 0.000373 time 1.5941 (2.2094) loss 3.1454 (3.4292) grad_norm 1.5978 (1.6994) [2022-01-23 06:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][990/1251] eta 0:09:36 lr 0.000373 time 1.5800 (2.2079) loss 2.4734 (3.4275) grad_norm 2.2299 (1.6999) [2022-01-23 06:13:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1000/1251] eta 0:09:14 lr 0.000373 time 2.5020 (2.2095) loss 3.7648 (3.4266) grad_norm 1.5615 (1.6999) [2022-01-23 06:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1010/1251] eta 0:08:52 lr 0.000373 time 2.4617 (2.2105) loss 4.1517 (3.4275) grad_norm 1.4086 (1.6994) [2022-01-23 06:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1020/1251] eta 0:08:30 lr 0.000373 time 1.8036 (2.2102) loss 4.0904 (3.4307) grad_norm 1.7249 (1.6990) [2022-01-23 06:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [175/300][1030/1251] eta 0:08:08 lr 0.000373 time 2.2297 (2.2102) loss 2.8263 (3.4306) grad_norm 1.5163 (1.6983) [2022-01-23 06:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1040/1251] eta 0:07:46 lr 0.000373 time 2.3845 (2.2099) loss 2.5336 (3.4285) grad_norm 1.6241 (1.6981) [2022-01-23 06:15:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1050/1251] eta 0:07:23 lr 0.000373 time 2.2398 (2.2085) loss 2.3099 (3.4278) grad_norm 1.8465 (1.6979) [2022-01-23 06:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1060/1251] eta 0:07:01 lr 0.000373 time 1.8525 (2.2084) loss 4.1088 (3.4277) grad_norm 1.7264 (1.6977) [2022-01-23 06:16:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1070/1251] eta 0:06:40 lr 0.000373 time 3.3267 (2.2103) loss 3.8399 (3.4282) grad_norm 1.4474 (1.6967) [2022-01-23 06:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1080/1251] eta 0:06:17 lr 0.000373 time 1.8931 (2.2093) loss 3.0178 (3.4279) grad_norm 1.6313 (1.6957) [2022-01-23 06:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1090/1251] eta 0:05:55 lr 0.000373 time 2.0002 (2.2078) loss 3.6515 (3.4298) grad_norm 1.7630 (1.6966) [2022-01-23 06:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1100/1251] eta 0:05:33 lr 0.000372 time 1.7706 (2.2070) loss 3.7274 (3.4311) grad_norm 1.4942 (1.6966) [2022-01-23 06:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1110/1251] eta 0:05:11 lr 0.000372 time 3.0313 (2.2090) loss 3.9198 (3.4318) grad_norm 1.6045 (1.6960) [2022-01-23 06:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1120/1251] eta 0:04:49 lr 0.000372 time 2.0885 (2.2093) loss 3.7353 (3.4318) grad_norm 1.7192 (1.6960) [2022-01-23 06:18:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1130/1251] eta 0:04:27 lr 0.000372 time 1.7631 (2.2087) loss 3.9363 
(3.4336) grad_norm 1.7074 (1.6952) [2022-01-23 06:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1140/1251] eta 0:04:05 lr 0.000372 time 1.8465 (2.2082) loss 3.9483 (3.4328) grad_norm 1.7622 (1.6946) [2022-01-23 06:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1150/1251] eta 0:03:43 lr 0.000372 time 3.5190 (2.2094) loss 4.1001 (3.4323) grad_norm 1.8032 (1.6938) [2022-01-23 06:19:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1160/1251] eta 0:03:21 lr 0.000372 time 1.8280 (2.2091) loss 2.8687 (3.4312) grad_norm 1.5572 (1.6935) [2022-01-23 06:19:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1170/1251] eta 0:02:58 lr 0.000372 time 1.8582 (2.2082) loss 3.9217 (3.4297) grad_norm 1.4643 (1.6926) [2022-01-23 06:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1180/1251] eta 0:02:36 lr 0.000372 time 1.5485 (2.2076) loss 3.9976 (3.4295) grad_norm 1.6795 (1.6918) [2022-01-23 06:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1190/1251] eta 0:02:14 lr 0.000372 time 3.2279 (2.2084) loss 2.8508 (3.4281) grad_norm 1.5560 (1.6921) [2022-01-23 06:20:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1200/1251] eta 0:01:52 lr 0.000372 time 1.9290 (2.2070) loss 3.7462 (3.4282) grad_norm 1.8142 (1.6921) [2022-01-23 06:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1210/1251] eta 0:01:30 lr 0.000372 time 2.4712 (2.2073) loss 3.6579 (3.4256) grad_norm 1.7305 (1.6922) [2022-01-23 06:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1220/1251] eta 0:01:08 lr 0.000372 time 1.6426 (2.2068) loss 2.3069 (3.4254) grad_norm 1.5709 (1.6919) [2022-01-23 06:22:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1230/1251] eta 0:00:46 lr 0.000372 time 2.8152 (2.2074) loss 3.9739 (3.4267) grad_norm 1.7021 (1.6918) [2022-01-23 06:22:23 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [175/300][1240/1251] eta 0:00:24 lr 0.000372 time 1.7406 (2.2055) loss 3.8946 (3.4258) grad_norm 1.5015 (1.6914) [2022-01-23 06:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [175/300][1250/1251] eta 0:00:02 lr 0.000372 time 1.2525 (2.1998) loss 2.9574 (3.4269) grad_norm 2.0807 (1.6923) [2022-01-23 06:22:39 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 175 training takes 0:45:52 [2022-01-23 06:22:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.991 (18.991) Loss 0.9883 (0.9883) Acc@1 78.320 (78.320) Acc@5 92.969 (92.969) [2022-01-23 06:23:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.922 (3.278) Loss 0.9238 (0.9842) Acc@1 77.539 (77.317) Acc@5 93.848 (93.333) [2022-01-23 06:23:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.570 (2.460) Loss 1.0506 (0.9894) Acc@1 74.902 (76.939) Acc@5 92.676 (93.592) [2022-01-23 06:23:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.477 (2.221) Loss 1.0203 (0.9756) Acc@1 75.098 (77.142) Acc@5 94.531 (93.800) [2022-01-23 06:24:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.052 (2.138) Loss 0.9514 (0.9812) Acc@1 76.953 (76.967) Acc@5 93.848 (93.721) [2022-01-23 06:24:14 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.028 Acc@5 93.800 [2022-01-23 06:24:14 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.0% [2022-01-23 06:24:14 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.15% [2022-01-23 06:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][0/1251] eta 7:31:24 lr 0.000372 time 21.6501 (21.6501) loss 3.4045 (3.4045) grad_norm 1.6495 (1.6495) [2022-01-23 06:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][10/1251] eta 1:22:33 lr 0.000372 time 1.5781 (3.9918) loss 2.8869 (3.3661) grad_norm 1.5310 (1.6524) [2022-01-23 06:25:20 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][20/1251] eta 1:04:43 lr 0.000372 time 1.3120 (3.1549) loss 2.9993 (3.4521) grad_norm 1.6903 (1.6548) [2022-01-23 06:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][30/1251] eta 0:57:26 lr 0.000372 time 1.4730 (2.8227) loss 3.5737 (3.3949) grad_norm 1.5353 (1.6540) [2022-01-23 06:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][40/1251] eta 0:53:56 lr 0.000372 time 3.5683 (2.6729) loss 3.6430 (3.4073) grad_norm 1.7453 (1.6661) [2022-01-23 06:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][50/1251] eta 0:52:22 lr 0.000372 time 1.6005 (2.6164) loss 3.3074 (3.4141) grad_norm 1.5081 (1.6637) [2022-01-23 06:26:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][60/1251] eta 0:50:44 lr 0.000372 time 1.5714 (2.5560) loss 3.3382 (3.4218) grad_norm 1.5186 (1.6582) [2022-01-23 06:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][70/1251] eta 0:49:08 lr 0.000372 time 1.8119 (2.4970) loss 3.2072 (3.4394) grad_norm 1.5374 (1.6538) [2022-01-23 06:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][80/1251] eta 0:48:01 lr 0.000372 time 2.8391 (2.4609) loss 3.2898 (3.4402) grad_norm 1.9096 (1.6671) [2022-01-23 06:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][90/1251] eta 0:47:32 lr 0.000372 time 2.3920 (2.4572) loss 3.3807 (3.4368) grad_norm 1.7321 (1.6733) [2022-01-23 06:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][100/1251] eta 0:46:35 lr 0.000371 time 1.6461 (2.4288) loss 3.8518 (3.4576) grad_norm 1.7414 (1.6829) [2022-01-23 06:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][110/1251] eta 0:45:43 lr 0.000371 time 1.5119 (2.4046) loss 2.4845 (3.4425) grad_norm 1.8577 (1.6935) [2022-01-23 06:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][120/1251] eta 0:44:52 lr 0.000371 time 
2.2712 (2.3807) loss 2.9629 (3.4303) grad_norm 1.7233 (1.6947) [2022-01-23 06:29:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][130/1251] eta 0:44:18 lr 0.000371 time 2.1923 (2.3717) loss 4.1748 (3.4410) grad_norm 1.8866 (1.6891) [2022-01-23 06:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][140/1251] eta 0:43:33 lr 0.000371 time 2.1957 (2.3528) loss 3.8763 (3.4432) grad_norm 1.9526 (1.6917) [2022-01-23 06:30:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][150/1251] eta 0:42:55 lr 0.000371 time 1.8992 (2.3389) loss 3.2979 (3.4291) grad_norm 1.9933 (1.7028) [2022-01-23 06:30:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][160/1251] eta 0:42:22 lr 0.000371 time 1.7669 (2.3304) loss 4.0353 (3.4228) grad_norm 1.3726 (1.7057) [2022-01-23 06:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][170/1251] eta 0:42:00 lr 0.000371 time 2.5416 (2.3319) loss 3.9635 (3.4204) grad_norm 1.6621 (1.7040) [2022-01-23 06:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][180/1251] eta 0:41:39 lr 0.000371 time 2.9855 (2.3335) loss 3.6492 (3.4277) grad_norm 1.4479 (1.6962) [2022-01-23 06:31:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][190/1251] eta 0:40:58 lr 0.000371 time 1.6136 (2.3172) loss 3.1910 (3.4134) grad_norm 1.6164 (1.6974) [2022-01-23 06:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][200/1251] eta 0:40:23 lr 0.000371 time 1.8984 (2.3064) loss 3.8297 (3.4098) grad_norm 1.5801 (1.6989) [2022-01-23 06:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][210/1251] eta 0:39:58 lr 0.000371 time 1.8330 (2.3036) loss 3.9521 (3.4097) grad_norm 1.7375 (1.7003) [2022-01-23 06:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][220/1251] eta 0:39:25 lr 0.000371 time 1.9854 (2.2940) loss 2.9558 (3.4047) grad_norm 1.7538 (1.7065) [2022-01-23 06:33:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][230/1251] eta 0:38:54 lr 0.000371 time 2.3000 (2.2867) loss 2.8995 (3.4144) grad_norm 1.5622 (1.7074) [2022-01-23 06:33:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][240/1251] eta 0:38:20 lr 0.000371 time 1.9856 (2.2757) loss 3.2999 (3.4075) grad_norm 1.4698 (1.7116) [2022-01-23 06:33:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][250/1251] eta 0:37:56 lr 0.000371 time 1.9510 (2.2738) loss 3.2692 (3.4045) grad_norm 2.8173 (1.7123) [2022-01-23 06:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][260/1251] eta 0:37:32 lr 0.000371 time 2.6558 (2.2731) loss 4.1274 (3.4165) grad_norm 1.6397 (1.7167) [2022-01-23 06:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][270/1251] eta 0:37:01 lr 0.000371 time 2.0753 (2.2649) loss 4.0466 (3.4286) grad_norm 1.6871 (1.7214) [2022-01-23 06:34:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][280/1251] eta 0:36:32 lr 0.000371 time 2.2992 (2.2582) loss 3.7181 (3.4312) grad_norm 1.6611 (1.7180) [2022-01-23 06:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][290/1251] eta 0:36:03 lr 0.000371 time 1.9265 (2.2508) loss 3.5228 (3.4290) grad_norm 2.3922 (1.7201) [2022-01-23 06:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][300/1251] eta 0:35:39 lr 0.000371 time 2.8718 (2.2492) loss 3.5015 (3.4199) grad_norm 1.5466 (1.7207) [2022-01-23 06:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][310/1251] eta 0:35:14 lr 0.000371 time 1.6020 (2.2467) loss 3.4066 (3.4195) grad_norm 1.5242 (1.7205) [2022-01-23 06:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][320/1251] eta 0:34:55 lr 0.000371 time 2.4824 (2.2505) loss 3.9904 (3.4303) grad_norm 1.6940 (1.7197) [2022-01-23 06:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][330/1251] eta 0:34:31 lr 
0.000371 time 1.8589 (2.2491) loss 4.0657 (3.4313) grad_norm 1.6574 (1.7217) [2022-01-23 06:37:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][340/1251] eta 0:34:05 lr 0.000371 time 2.7224 (2.2458) loss 4.2698 (3.4414) grad_norm 1.5769 (1.7226) [2022-01-23 06:37:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][350/1251] eta 0:33:40 lr 0.000370 time 1.6142 (2.2429) loss 3.7886 (3.4452) grad_norm 1.5916 (1.7236) [2022-01-23 06:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][360/1251] eta 0:33:19 lr 0.000370 time 2.4179 (2.2446) loss 3.6875 (3.4481) grad_norm 1.6397 (1.7225) [2022-01-23 06:38:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][370/1251] eta 0:32:55 lr 0.000370 time 1.5856 (2.2423) loss 3.9558 (3.4534) grad_norm 1.9353 (1.7239) [2022-01-23 06:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][380/1251] eta 0:32:32 lr 0.000370 time 2.9830 (2.2421) loss 3.8102 (3.4491) grad_norm 1.8534 (1.7237) [2022-01-23 06:38:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][390/1251] eta 0:32:07 lr 0.000370 time 2.0230 (2.2392) loss 2.7224 (3.4531) grad_norm 1.9190 (1.7252) [2022-01-23 06:39:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][400/1251] eta 0:31:45 lr 0.000370 time 2.7320 (2.2394) loss 2.9651 (3.4523) grad_norm 1.7071 (1.7239) [2022-01-23 06:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][410/1251] eta 0:31:20 lr 0.000370 time 1.8425 (2.2362) loss 2.9554 (3.4519) grad_norm 1.6481 (1.7215) [2022-01-23 06:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][420/1251] eta 0:30:54 lr 0.000370 time 2.3942 (2.2319) loss 3.7189 (3.4539) grad_norm 1.7276 (1.7224) [2022-01-23 06:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][430/1251] eta 0:30:28 lr 0.000370 time 2.3686 (2.2267) loss 2.8344 (3.4532) grad_norm 1.8810 (1.7228) [2022-01-23 06:40:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][440/1251] eta 0:30:05 lr 0.000370 time 2.5121 (2.2265) loss 3.8032 (3.4556) grad_norm 1.8109 (1.7219) [2022-01-23 06:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][450/1251] eta 0:29:44 lr 0.000370 time 2.5037 (2.2277) loss 3.8086 (3.4627) grad_norm 1.5191 (1.7220) [2022-01-23 06:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][460/1251] eta 0:29:20 lr 0.000370 time 1.8206 (2.2261) loss 3.1284 (3.4601) grad_norm 1.7728 (1.7212) [2022-01-23 06:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][470/1251] eta 0:29:00 lr 0.000370 time 2.8704 (2.2286) loss 3.8056 (3.4637) grad_norm 1.8095 (1.7192) [2022-01-23 06:42:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][480/1251] eta 0:28:38 lr 0.000370 time 2.4549 (2.2286) loss 2.8589 (3.4623) grad_norm 1.7426 (1.7179) [2022-01-23 06:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][490/1251] eta 0:28:12 lr 0.000370 time 1.9053 (2.2245) loss 2.5664 (3.4668) grad_norm 1.7554 (1.7185) [2022-01-23 06:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][500/1251] eta 0:27:47 lr 0.000370 time 2.2197 (2.2209) loss 3.1185 (3.4676) grad_norm 1.7217 (1.7176) [2022-01-23 06:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][510/1251] eta 0:27:22 lr 0.000370 time 2.2770 (2.2173) loss 2.5128 (3.4579) grad_norm 1.4561 (1.7165) [2022-01-23 06:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][520/1251] eta 0:27:02 lr 0.000370 time 2.1858 (2.2193) loss 2.2411 (3.4539) grad_norm 1.6651 (1.7184) [2022-01-23 06:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][530/1251] eta 0:26:41 lr 0.000370 time 2.4644 (2.2210) loss 3.6045 (3.4551) grad_norm 1.7997 (1.7203) [2022-01-23 06:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][540/1251] eta 0:26:18 lr 
0.000370 time 2.4290 (2.2204) loss 3.7994 (3.4620) grad_norm 1.5460 (1.7202) [2022-01-23 06:44:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][550/1251] eta 0:25:55 lr 0.000370 time 1.5837 (2.2185) loss 3.9488 (3.4652) grad_norm 1.6426 (1.7197) [2022-01-23 06:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][560/1251] eta 0:25:31 lr 0.000370 time 2.3115 (2.2165) loss 2.1424 (3.4591) grad_norm 1.7932 (1.7176) [2022-01-23 06:45:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][570/1251] eta 0:25:10 lr 0.000370 time 1.9409 (2.2174) loss 3.6477 (3.4594) grad_norm 2.0167 (1.7174) [2022-01-23 06:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][580/1251] eta 0:24:47 lr 0.000370 time 2.1092 (2.2165) loss 2.6906 (3.4580) grad_norm 1.7140 (1.7158) [2022-01-23 06:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][590/1251] eta 0:24:23 lr 0.000370 time 1.5780 (2.2145) loss 2.9708 (3.4582) grad_norm 1.8964 (1.7164) [2022-01-23 06:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][600/1251] eta 0:24:00 lr 0.000369 time 2.1140 (2.2129) loss 3.2069 (3.4591) grad_norm 1.7304 (1.7148) [2022-01-23 06:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][610/1251] eta 0:23:37 lr 0.000369 time 2.4580 (2.2119) loss 3.7325 (3.4557) grad_norm 1.8308 (1.7135) [2022-01-23 06:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][620/1251] eta 0:23:15 lr 0.000369 time 2.2893 (2.2112) loss 3.4704 (3.4523) grad_norm 1.7937 (1.7121) [2022-01-23 06:47:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][630/1251] eta 0:22:52 lr 0.000369 time 2.1941 (2.2104) loss 3.3939 (3.4506) grad_norm 1.5655 (1.7112) [2022-01-23 06:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][640/1251] eta 0:22:28 lr 0.000369 time 2.1003 (2.2077) loss 2.1879 (3.4480) grad_norm 1.8823 (1.7099) [2022-01-23 06:48:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][650/1251] eta 0:22:07 lr 0.000369 time 1.8407 (2.2094) loss 2.7706 (3.4522) grad_norm 1.5360 (1.7083) [2022-01-23 06:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][660/1251] eta 0:21:46 lr 0.000369 time 2.4959 (2.2099) loss 3.9790 (3.4527) grad_norm 1.8953 (1.7096) [2022-01-23 06:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][670/1251] eta 0:21:24 lr 0.000369 time 2.0917 (2.2103) loss 2.3947 (3.4513) grad_norm 1.6066 (1.7094) [2022-01-23 06:49:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][680/1251] eta 0:21:03 lr 0.000369 time 2.7258 (2.2119) loss 2.8663 (3.4505) grad_norm 1.7184 (1.7106) [2022-01-23 06:49:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][690/1251] eta 0:20:40 lr 0.000369 time 1.5214 (2.2106) loss 4.2640 (3.4523) grad_norm 1.5803 (1.7116) [2022-01-23 06:50:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][700/1251] eta 0:20:17 lr 0.000369 time 2.2327 (2.2089) loss 2.4843 (3.4496) grad_norm 1.7560 (1.7116) [2022-01-23 06:50:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][710/1251] eta 0:19:54 lr 0.000369 time 2.2642 (2.2081) loss 2.4225 (3.4490) grad_norm 1.6303 (1.7112) [2022-01-23 06:50:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][720/1251] eta 0:19:33 lr 0.000369 time 2.8369 (2.2092) loss 3.9321 (3.4502) grad_norm 2.1965 (1.7119) [2022-01-23 06:51:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][730/1251] eta 0:19:11 lr 0.000369 time 2.1274 (2.2104) loss 2.7633 (3.4489) grad_norm 1.7117 (1.7117) [2022-01-23 06:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][740/1251] eta 0:18:49 lr 0.000369 time 2.8733 (2.2099) loss 3.9215 (3.4483) grad_norm 1.5660 (1.7108) [2022-01-23 06:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][750/1251] eta 0:18:25 lr 
0.000369 time 1.8887 (2.2070) loss 3.6929 (3.4482) grad_norm 1.6667 (1.7103) [2022-01-23 06:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][760/1251] eta 0:18:02 lr 0.000369 time 2.2181 (2.2056) loss 3.5220 (3.4485) grad_norm 1.9087 (1.7096) [2022-01-23 06:52:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][770/1251] eta 0:17:40 lr 0.000369 time 1.7743 (2.2053) loss 2.9695 (3.4460) grad_norm 1.5604 (1.7081) [2022-01-23 06:52:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][780/1251] eta 0:17:19 lr 0.000369 time 2.7246 (2.2063) loss 3.3106 (3.4449) grad_norm 1.6441 (1.7086) [2022-01-23 06:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][790/1251] eta 0:16:57 lr 0.000369 time 1.9156 (2.2064) loss 2.1886 (3.4425) grad_norm 1.8228 (1.7086) [2022-01-23 06:53:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][800/1251] eta 0:16:35 lr 0.000369 time 1.5220 (2.2064) loss 2.3501 (3.4452) grad_norm 1.7397 (1.7093) [2022-01-23 06:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][810/1251] eta 0:16:14 lr 0.000369 time 1.9492 (2.2090) loss 3.3843 (3.4418) grad_norm 1.6440 (1.7083) [2022-01-23 06:54:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][820/1251] eta 0:15:52 lr 0.000369 time 2.1505 (2.2091) loss 2.5683 (3.4420) grad_norm 1.9285 (1.7080) [2022-01-23 06:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][830/1251] eta 0:15:29 lr 0.000369 time 1.9373 (2.2090) loss 3.3461 (3.4439) grad_norm 1.6889 (1.7084) [2022-01-23 06:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][840/1251] eta 0:15:06 lr 0.000369 time 1.7157 (2.2064) loss 2.3651 (3.4406) grad_norm 1.8393 (1.7084) [2022-01-23 06:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][850/1251] eta 0:14:44 lr 0.000368 time 2.0525 (2.2059) loss 3.7886 (3.4362) grad_norm 1.7212 (1.7070) [2022-01-23 06:55:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][860/1251] eta 0:14:22 lr 0.000368 time 2.3033 (2.2047) loss 3.6973 (3.4396) grad_norm 1.8832 (1.7085) [2022-01-23 06:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][870/1251] eta 0:13:59 lr 0.000368 time 1.8635 (2.2040) loss 3.4918 (3.4377) grad_norm 1.5754 (1.7077) [2022-01-23 06:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][880/1251] eta 0:13:37 lr 0.000368 time 2.0889 (2.2035) loss 2.8499 (3.4388) grad_norm 2.0130 (1.7083) [2022-01-23 06:56:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][890/1251] eta 0:13:15 lr 0.000368 time 2.1464 (2.2043) loss 2.5454 (3.4388) grad_norm 2.5137 (1.7116) [2022-01-23 06:57:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][900/1251] eta 0:12:53 lr 0.000368 time 2.2766 (2.2035) loss 3.2038 (3.4358) grad_norm 1.5740 (1.7113) [2022-01-23 06:57:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][910/1251] eta 0:12:31 lr 0.000368 time 2.1734 (2.2035) loss 2.2682 (3.4344) grad_norm 1.5984 (1.7099) [2022-01-23 06:58:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][920/1251] eta 0:12:08 lr 0.000368 time 1.8540 (2.2023) loss 3.5029 (3.4369) grad_norm 1.4817 (1.7085) [2022-01-23 06:58:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][930/1251] eta 0:11:46 lr 0.000368 time 1.7188 (2.2015) loss 3.5960 (3.4360) grad_norm 1.7812 (1.7091) [2022-01-23 06:58:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][940/1251] eta 0:11:25 lr 0.000368 time 2.5614 (2.2027) loss 3.8873 (3.4380) grad_norm 1.7955 (1.7096) [2022-01-23 06:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][950/1251] eta 0:11:02 lr 0.000368 time 1.6966 (2.2017) loss 3.3945 (3.4388) grad_norm 1.8035 (1.7106) [2022-01-23 06:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][960/1251] eta 0:10:40 lr 
0.000368 time 2.4912 (2.2016) loss 3.6999 (3.4377) grad_norm 1.5672 (1.7121) [2022-01-23 06:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][970/1251] eta 0:10:18 lr 0.000368 time 1.7416 (2.2002) loss 3.8199 (3.4390) grad_norm 1.9689 (1.7124) [2022-01-23 07:00:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][980/1251] eta 0:09:56 lr 0.000368 time 2.2350 (2.1994) loss 2.5256 (3.4375) grad_norm 1.5893 (1.7121) [2022-01-23 07:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][990/1251] eta 0:09:33 lr 0.000368 time 1.9021 (2.1972) loss 2.5954 (3.4375) grad_norm 1.6863 (1.7125) [2022-01-23 07:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1000/1251] eta 0:09:11 lr 0.000368 time 2.1982 (2.1970) loss 3.7899 (3.4378) grad_norm 1.8721 (1.7123) [2022-01-23 07:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1010/1251] eta 0:08:49 lr 0.000368 time 1.8693 (2.1967) loss 3.8948 (3.4384) grad_norm 1.6587 (1.7120) [2022-01-23 07:01:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1020/1251] eta 0:08:27 lr 0.000368 time 1.6999 (2.1970) loss 3.0707 (3.4381) grad_norm 1.7157 (1.7121) [2022-01-23 07:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1030/1251] eta 0:08:05 lr 0.000368 time 2.0926 (2.1979) loss 3.8218 (3.4391) grad_norm 1.9733 (1.7119) [2022-01-23 07:02:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1040/1251] eta 0:07:44 lr 0.000368 time 2.7334 (2.1997) loss 3.8167 (3.4384) grad_norm 2.0807 (1.7125) [2022-01-23 07:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1050/1251] eta 0:07:22 lr 0.000368 time 1.5996 (2.2003) loss 4.0993 (3.4392) grad_norm 1.7870 (1.7131) [2022-01-23 07:03:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1060/1251] eta 0:07:00 lr 0.000368 time 1.5597 (2.2014) loss 3.7649 (3.4402) grad_norm 1.4855 (1.7134) [2022-01-23 
07:03:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1070/1251] eta 0:06:38 lr 0.000368 time 1.9283 (2.2007) loss 3.5156 (3.4391) grad_norm 1.9106 (1.7132) [2022-01-23 07:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1080/1251] eta 0:06:16 lr 0.000368 time 2.7988 (2.2005) loss 3.7346 (3.4399) grad_norm 1.7873 (1.7131) [2022-01-23 07:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1090/1251] eta 0:05:54 lr 0.000368 time 1.8506 (2.1998) loss 3.7383 (3.4397) grad_norm 1.5436 (1.7128) [2022-01-23 07:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1100/1251] eta 0:05:32 lr 0.000368 time 2.2196 (2.2001) loss 3.2164 (3.4379) grad_norm 1.7830 (1.7129) [2022-01-23 07:04:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1110/1251] eta 0:05:10 lr 0.000367 time 1.8099 (2.2000) loss 3.9619 (3.4388) grad_norm 1.6708 (1.7130) [2022-01-23 07:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1120/1251] eta 0:04:48 lr 0.000367 time 2.1814 (2.1999) loss 4.2662 (3.4370) grad_norm 1.9479 (1.7129) [2022-01-23 07:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1130/1251] eta 0:04:26 lr 0.000367 time 2.2195 (2.1983) loss 3.7033 (3.4386) grad_norm 1.7159 (1.7129) [2022-01-23 07:06:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1140/1251] eta 0:04:04 lr 0.000367 time 1.8888 (2.1983) loss 3.6600 (3.4397) grad_norm 2.1659 (1.7130) [2022-01-23 07:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1150/1251] eta 0:03:42 lr 0.000367 time 2.0301 (2.1980) loss 3.0496 (3.4381) grad_norm 1.6961 (1.7123) [2022-01-23 07:06:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1160/1251] eta 0:03:20 lr 0.000367 time 2.1403 (2.1984) loss 2.6243 (3.4361) grad_norm 1.6349 (1.7121) [2022-01-23 07:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1170/1251] 
eta 0:02:58 lr 0.000367 time 2.0769 (2.1994) loss 3.5127 (3.4380) grad_norm 1.8873 (1.7121) [2022-01-23 07:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1180/1251] eta 0:02:36 lr 0.000367 time 1.9053 (2.2008) loss 3.1817 (3.4378) grad_norm 1.5956 (1.7127) [2022-01-23 07:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1190/1251] eta 0:02:14 lr 0.000367 time 1.9648 (2.1992) loss 3.6113 (3.4351) grad_norm 1.9026 (1.7133) [2022-01-23 07:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1200/1251] eta 0:01:52 lr 0.000367 time 2.4946 (2.1987) loss 2.5985 (3.4343) grad_norm 1.5497 (1.7128) [2022-01-23 07:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1210/1251] eta 0:01:30 lr 0.000367 time 2.2432 (2.1977) loss 2.9561 (3.4334) grad_norm 1.4768 (1.7121) [2022-01-23 07:08:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1220/1251] eta 0:01:08 lr 0.000367 time 2.0809 (2.1979) loss 3.9374 (3.4339) grad_norm 1.6123 (1.7117) [2022-01-23 07:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1230/1251] eta 0:00:46 lr 0.000367 time 1.6916 (2.1985) loss 2.7253 (3.4348) grad_norm 1.4801 (1.7118) [2022-01-23 07:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1240/1251] eta 0:00:24 lr 0.000367 time 1.4595 (2.1968) loss 3.7357 (3.4337) grad_norm 1.8397 (1.7123) [2022-01-23 07:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [176/300][1250/1251] eta 0:00:02 lr 0.000367 time 1.3171 (2.1911) loss 2.5178 (3.4337) grad_norm 1.5529 (1.7133) [2022-01-23 07:09:55 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 176 training takes 0:45:41 [2022-01-23 07:10:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.401 (18.401) Loss 0.9359 (0.9359) Acc@1 77.539 (77.539) Acc@5 93.945 (93.945) [2022-01-23 07:10:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.217 (3.372) 
Loss 0.9308 (0.9478) Acc@1 77.930 (77.583) Acc@5 93.945 (94.114) [2022-01-23 07:10:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.959 (2.625) Loss 0.9905 (0.9575) Acc@1 76.270 (77.534) Acc@5 92.285 (93.936) [2022-01-23 07:11:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.629 (2.236) Loss 0.9919 (0.9696) Acc@1 76.172 (77.293) Acc@5 94.336 (93.791) [2022-01-23 07:11:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.760 (2.198) Loss 0.9321 (0.9703) Acc@1 78.320 (77.272) Acc@5 94.531 (93.848) [2022-01-23 07:11:33 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.234 Acc@5 93.844 [2022-01-23 07:11:33 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.2% [2022-01-23 07:11:33 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.23% [2022-01-23 07:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][0/1251] eta 8:29:47 lr 0.000367 time 24.4504 (24.4504) loss 4.1515 (4.1515) grad_norm 2.0056 (2.0056) [2022-01-23 07:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][10/1251] eta 1:28:17 lr 0.000367 time 2.1887 (4.2688) loss 2.9427 (3.4564) grad_norm 1.4442 (1.7108) [2022-01-23 07:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][20/1251] eta 1:08:25 lr 0.000367 time 1.8854 (3.3349) loss 3.8548 (3.4855) grad_norm 1.6971 (1.6813) [2022-01-23 07:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][30/1251] eta 0:59:48 lr 0.000367 time 1.5845 (2.9392) loss 2.4987 (3.3978) grad_norm 1.7357 (1.6769) [2022-01-23 07:13:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][40/1251] eta 0:56:26 lr 0.000367 time 3.8707 (2.7967) loss 3.1863 (3.4092) grad_norm 1.8119 (1.7290) [2022-01-23 07:13:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][50/1251] eta 0:53:45 lr 0.000367 time 2.1889 (2.6854) loss 2.2134 (3.3752) 
grad_norm 1.9295 (1.7142) [2022-01-23 07:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][60/1251] eta 0:51:32 lr 0.000367 time 2.5416 (2.5962) loss 2.6185 (3.4135) grad_norm 1.5763 (1.7070) [2022-01-23 07:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][70/1251] eta 0:49:45 lr 0.000367 time 1.8973 (2.5282) loss 3.7171 (3.4121) grad_norm 1.9330 (1.7216) [2022-01-23 07:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][80/1251] eta 0:48:17 lr 0.000367 time 1.8221 (2.4742) loss 3.0137 (3.4310) grad_norm 1.6628 (1.7148) [2022-01-23 07:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][90/1251] eta 0:47:30 lr 0.000367 time 2.2021 (2.4550) loss 4.0711 (3.4510) grad_norm 1.6920 (1.7144) [2022-01-23 07:15:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][100/1251] eta 0:46:39 lr 0.000367 time 2.4826 (2.4321) loss 2.9462 (3.4087) grad_norm 1.8777 (1.7149) [2022-01-23 07:16:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][110/1251] eta 0:45:50 lr 0.000366 time 1.9056 (2.4107) loss 4.1492 (3.4062) grad_norm 1.6320 (1.7079) [2022-01-23 07:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][120/1251] eta 0:45:02 lr 0.000366 time 1.9000 (2.3890) loss 2.2806 (3.4037) grad_norm 1.6999 (1.7052) [2022-01-23 07:16:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][130/1251] eta 0:44:26 lr 0.000366 time 2.0721 (2.3787) loss 3.3408 (3.4100) grad_norm 1.9084 (1.7090) [2022-01-23 07:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][140/1251] eta 0:43:53 lr 0.000366 time 3.0407 (2.3699) loss 4.2182 (3.4011) grad_norm 1.7846 (1.7166) [2022-01-23 07:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][150/1251] eta 0:43:20 lr 0.000366 time 1.9340 (2.3623) loss 4.0825 (3.4034) grad_norm 1.8065 (1.7165) [2022-01-23 07:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[177/300][160/1251] eta 0:42:39 lr 0.000366 time 1.5547 (2.3463) loss 2.6195 (3.4041) grad_norm 1.7496 (1.7266) [2022-01-23 07:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][170/1251] eta 0:41:58 lr 0.000366 time 1.8077 (2.3302) loss 3.9927 (3.4127) grad_norm 1.5473 (1.7236) [2022-01-23 07:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][180/1251] eta 0:41:23 lr 0.000366 time 2.7183 (2.3193) loss 3.8364 (3.4237) grad_norm 2.0024 (1.7249) [2022-01-23 07:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][190/1251] eta 0:40:57 lr 0.000366 time 1.9152 (2.3167) loss 3.1856 (3.4241) grad_norm 1.5124 (1.7228) [2022-01-23 07:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][200/1251] eta 0:40:24 lr 0.000366 time 1.8835 (2.3069) loss 3.5774 (3.4261) grad_norm 1.6273 (1.7226) [2022-01-23 07:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][210/1251] eta 0:39:55 lr 0.000366 time 2.2729 (2.3014) loss 3.3074 (3.4366) grad_norm 1.5400 (1.7216) [2022-01-23 07:19:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][220/1251] eta 0:39:22 lr 0.000366 time 2.3153 (2.2918) loss 4.4247 (3.4515) grad_norm 1.8654 (1.7184) [2022-01-23 07:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][230/1251] eta 0:38:51 lr 0.000366 time 2.0601 (2.2839) loss 3.9355 (3.4571) grad_norm 1.7519 (1.7174) [2022-01-23 07:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][240/1251] eta 0:38:21 lr 0.000366 time 1.5078 (2.2769) loss 3.5130 (3.4625) grad_norm 1.7453 (1.7194) [2022-01-23 07:21:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][250/1251] eta 0:37:53 lr 0.000366 time 1.9226 (2.2709) loss 2.6381 (3.4526) grad_norm 1.6958 (1.7213) [2022-01-23 07:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][260/1251] eta 0:37:24 lr 0.000366 time 2.1311 (2.2647) loss 3.9428 (3.4555) grad_norm 
1.7800 (1.7198) [2022-01-23 07:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][270/1251] eta 0:36:58 lr 0.000366 time 2.2463 (2.2618) loss 3.5148 (3.4606) grad_norm 1.5362 (1.7184) [2022-01-23 07:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][280/1251] eta 0:36:38 lr 0.000366 time 1.8598 (2.2644) loss 3.5739 (3.4508) grad_norm 1.5224 (1.7153) [2022-01-23 07:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][290/1251] eta 0:36:14 lr 0.000366 time 1.8432 (2.2629) loss 3.7779 (3.4521) grad_norm 1.6534 (1.7116) [2022-01-23 07:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][300/1251] eta 0:35:50 lr 0.000366 time 1.5578 (2.2614) loss 3.9307 (3.4557) grad_norm 1.4593 (1.7096) [2022-01-23 07:23:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][310/1251] eta 0:35:28 lr 0.000366 time 2.6040 (2.2617) loss 2.5430 (3.4375) grad_norm 1.7031 (1.7059) [2022-01-23 07:23:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][320/1251] eta 0:35:08 lr 0.000366 time 1.8315 (2.2652) loss 2.8362 (3.4342) grad_norm 1.6472 (1.7031) [2022-01-23 07:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][330/1251] eta 0:34:45 lr 0.000366 time 2.3653 (2.2641) loss 3.8703 (3.4254) grad_norm 1.6356 (1.7033) [2022-01-23 07:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][340/1251] eta 0:34:18 lr 0.000366 time 1.6610 (2.2596) loss 2.7996 (3.4244) grad_norm 1.5527 (1.7034) [2022-01-23 07:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][350/1251] eta 0:33:49 lr 0.000366 time 1.9278 (2.2527) loss 3.9208 (3.4157) grad_norm 1.8000 (1.7011) [2022-01-23 07:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][360/1251] eta 0:33:27 lr 0.000365 time 2.1604 (2.2533) loss 3.5321 (3.4114) grad_norm 1.7248 (1.7046) [2022-01-23 07:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[177/300][370/1251] eta 0:33:04 lr 0.000365 time 2.3677 (2.2520) loss 4.0150 (3.4114) grad_norm 2.0426 (1.7073) [2022-01-23 07:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][380/1251] eta 0:32:42 lr 0.000365 time 2.1818 (2.2532) loss 3.6254 (3.4076) grad_norm 1.8041 (1.7051) [2022-01-23 07:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][390/1251] eta 0:32:18 lr 0.000365 time 2.2071 (2.2512) loss 3.6088 (3.4117) grad_norm 1.6521 (1.7044) [2022-01-23 07:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][400/1251] eta 0:31:53 lr 0.000365 time 2.6862 (2.2482) loss 3.7737 (3.4091) grad_norm 2.2353 (1.7082) [2022-01-23 07:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][410/1251] eta 0:31:24 lr 0.000365 time 1.7398 (2.2413) loss 3.8536 (3.4101) grad_norm 1.8839 (1.7100) [2022-01-23 07:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][420/1251] eta 0:31:02 lr 0.000365 time 2.5604 (2.2414) loss 2.9995 (3.4106) grad_norm 1.7127 (1.7126) [2022-01-23 07:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][430/1251] eta 0:30:36 lr 0.000365 time 2.2759 (2.2371) loss 3.6390 (3.4139) grad_norm 1.5630 (1.7121) [2022-01-23 07:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][440/1251] eta 0:30:10 lr 0.000365 time 2.2253 (2.2323) loss 3.4872 (3.4161) grad_norm 1.6783 (1.7100) [2022-01-23 07:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][450/1251] eta 0:29:46 lr 0.000365 time 2.5774 (2.2306) loss 3.4195 (3.4166) grad_norm 1.7287 (1.7114) [2022-01-23 07:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][460/1251] eta 0:29:24 lr 0.000365 time 2.6253 (2.2307) loss 3.6812 (3.4112) grad_norm 1.6587 (1.7096) [2022-01-23 07:29:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][470/1251] eta 0:29:03 lr 0.000365 time 2.5487 (2.2330) loss 3.1272 (3.4097) grad_norm 
1.8492 (1.7081) [2022-01-23 07:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][480/1251] eta 0:28:40 lr 0.000365 time 2.1160 (2.2322) loss 2.2419 (3.4078) grad_norm 1.7435 (1.7079) [2022-01-23 07:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][490/1251] eta 0:28:19 lr 0.000365 time 2.1159 (2.2326) loss 3.1667 (3.4059) grad_norm 1.7807 (1.7088) [2022-01-23 07:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][500/1251] eta 0:27:56 lr 0.000365 time 2.5240 (2.2319) loss 3.7274 (3.4071) grad_norm 1.4898 (1.7086) [2022-01-23 07:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][510/1251] eta 0:27:32 lr 0.000365 time 2.3412 (2.2305) loss 3.7942 (3.4099) grad_norm 1.7329 (1.7085) [2022-01-23 07:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][520/1251] eta 0:27:09 lr 0.000365 time 2.0665 (2.2287) loss 4.0356 (3.4166) grad_norm 1.9756 (1.7116) [2022-01-23 07:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][530/1251] eta 0:26:44 lr 0.000365 time 2.6304 (2.2250) loss 3.5082 (3.4163) grad_norm 1.6998 (1.7120) [2022-01-23 07:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][540/1251] eta 0:26:20 lr 0.000365 time 2.9570 (2.2228) loss 2.7828 (3.4172) grad_norm 1.5380 (1.7123) [2022-01-23 07:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][550/1251] eta 0:25:56 lr 0.000365 time 2.7099 (2.2210) loss 2.7382 (3.4217) grad_norm 1.8767 (1.7127) [2022-01-23 07:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][560/1251] eta 0:25:34 lr 0.000365 time 1.5484 (2.2202) loss 3.3894 (3.4248) grad_norm 1.8881 (1.7119) [2022-01-23 07:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][570/1251] eta 0:25:13 lr 0.000365 time 2.4795 (2.2230) loss 3.9188 (3.4276) grad_norm 1.7389 (1.7114) [2022-01-23 07:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[177/300][580/1251] eta 0:24:53 lr 0.000365 time 2.8161 (2.2251) loss 3.4836 (3.4329) grad_norm 1.6891 (1.7106) [2022-01-23 07:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][590/1251] eta 0:24:30 lr 0.000365 time 2.5832 (2.2241) loss 3.3273 (3.4330) grad_norm 1.7122 (1.7106) [2022-01-23 07:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][600/1251] eta 0:24:07 lr 0.000365 time 2.1646 (2.2231) loss 4.0226 (3.4343) grad_norm 1.4393 (1.7102) [2022-01-23 07:34:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][610/1251] eta 0:23:42 lr 0.000364 time 1.9616 (2.2188) loss 3.4119 (3.4303) grad_norm 1.7938 (1.7115) [2022-01-23 07:34:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][620/1251] eta 0:23:18 lr 0.000364 time 2.7586 (2.2161) loss 3.7910 (3.4328) grad_norm 1.7940 (1.7126) [2022-01-23 07:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][630/1251] eta 0:22:55 lr 0.000364 time 2.4073 (2.2153) loss 4.1325 (3.4347) grad_norm 1.7021 (1.7142) [2022-01-23 07:35:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][640/1251] eta 0:22:33 lr 0.000364 time 2.5853 (2.2159) loss 3.7300 (3.4404) grad_norm 1.5281 (1.7136) [2022-01-23 07:35:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][650/1251] eta 0:22:12 lr 0.000364 time 2.4784 (2.2172) loss 4.0695 (3.4432) grad_norm 1.5731 (1.7148) [2022-01-23 07:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][660/1251] eta 0:21:49 lr 0.000364 time 2.2738 (2.2163) loss 3.1850 (3.4413) grad_norm 1.6461 (1.7146) [2022-01-23 07:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][670/1251] eta 0:21:26 lr 0.000364 time 1.8374 (2.2149) loss 3.5413 (3.4414) grad_norm 1.5738 (1.7137) [2022-01-23 07:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][680/1251] eta 0:21:05 lr 0.000364 time 2.7962 (2.2171) loss 2.7181 (3.4393) grad_norm 
2.0763 (1.7176) [2022-01-23 07:37:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][690/1251] eta 0:20:44 lr 0.000364 time 1.8750 (2.2186) loss 3.3633 (3.4389) grad_norm 1.8398 (1.7172) [2022-01-23 07:37:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][700/1251] eta 0:20:22 lr 0.000364 time 1.9937 (2.2188) loss 2.3749 (3.4391) grad_norm 1.6588 (1.7183) [2022-01-23 07:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][710/1251] eta 0:19:59 lr 0.000364 time 1.7857 (2.2164) loss 3.8165 (3.4425) grad_norm 1.7605 (1.7170) [2022-01-23 07:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][720/1251] eta 0:19:35 lr 0.000364 time 1.9092 (2.2135) loss 3.6570 (3.4434) grad_norm 1.6808 (1.7170) [2022-01-23 07:38:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][730/1251] eta 0:19:12 lr 0.000364 time 2.1546 (2.2120) loss 2.6233 (3.4408) grad_norm 1.7443 (1.7176) [2022-01-23 07:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][740/1251] eta 0:18:50 lr 0.000364 time 1.5999 (2.2117) loss 3.5983 (3.4418) grad_norm 1.3793 (1.7164) [2022-01-23 07:39:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][750/1251] eta 0:18:29 lr 0.000364 time 1.5629 (2.2143) loss 3.7706 (3.4395) grad_norm 1.6283 (1.7160) [2022-01-23 07:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][760/1251] eta 0:18:07 lr 0.000364 time 2.2472 (2.2153) loss 3.7856 (3.4379) grad_norm 1.8220 (1.7164) [2022-01-23 07:40:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][770/1251] eta 0:17:45 lr 0.000364 time 2.7813 (2.2159) loss 3.4322 (3.4375) grad_norm 1.6850 (1.7160) [2022-01-23 07:40:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][780/1251] eta 0:17:23 lr 0.000364 time 1.7872 (2.2158) loss 2.8584 (3.4355) grad_norm 1.7857 (1.7169) [2022-01-23 07:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[177/300][790/1251] eta 0:17:00 lr 0.000364 time 1.5982 (2.2138) loss 2.6266 (3.4375) grad_norm 1.5669 (1.7171) [2022-01-23 07:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][800/1251] eta 0:16:37 lr 0.000364 time 1.8964 (2.2121) loss 3.6584 (3.4388) grad_norm 1.5971 (1.7163) [2022-01-23 07:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][810/1251] eta 0:16:14 lr 0.000364 time 2.4763 (2.2103) loss 2.3630 (3.4398) grad_norm 1.5230 (1.7171) [2022-01-23 07:41:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][820/1251] eta 0:15:51 lr 0.000364 time 1.5785 (2.2083) loss 2.7823 (3.4388) grad_norm 1.7941 (1.7164) [2022-01-23 07:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][830/1251] eta 0:15:29 lr 0.000364 time 2.1866 (2.2078) loss 3.5045 (3.4373) grad_norm 1.5516 (1.7159) [2022-01-23 07:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][840/1251] eta 0:15:07 lr 0.000364 time 1.9071 (2.2077) loss 3.8971 (3.4339) grad_norm 1.6148 (1.7147) [2022-01-23 07:42:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][850/1251] eta 0:14:45 lr 0.000364 time 2.4702 (2.2073) loss 3.3401 (3.4337) grad_norm 1.7623 (1.7142) [2022-01-23 07:43:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][860/1251] eta 0:14:22 lr 0.000363 time 2.1503 (2.2071) loss 2.8919 (3.4320) grad_norm 1.9428 (1.7139) [2022-01-23 07:43:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][870/1251] eta 0:14:00 lr 0.000363 time 2.1772 (2.2073) loss 3.9356 (3.4323) grad_norm 1.8418 (1.7136) [2022-01-23 07:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][880/1251] eta 0:13:38 lr 0.000363 time 2.5526 (2.2067) loss 4.1956 (3.4342) grad_norm 1.6142 (1.7146) [2022-01-23 07:44:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][890/1251] eta 0:13:16 lr 0.000363 time 2.4893 (2.2073) loss 3.7721 (3.4355) grad_norm 
1.6877 (1.7145) [2022-01-23 07:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][900/1251] eta 0:12:55 lr 0.000363 time 2.1110 (2.2088) loss 3.8653 (3.4317) grad_norm 1.6947 (1.7154) [2022-01-23 07:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][910/1251] eta 0:12:32 lr 0.000363 time 2.0006 (2.2079) loss 4.0134 (3.4352) grad_norm 1.6403 (1.7150) [2022-01-23 07:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][920/1251] eta 0:12:11 lr 0.000363 time 2.8185 (2.2087) loss 3.6250 (3.4351) grad_norm 1.4373 (1.7142) [2022-01-23 07:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][930/1251] eta 0:11:49 lr 0.000363 time 2.5896 (2.2106) loss 3.6847 (3.4368) grad_norm 1.6014 (1.7135) [2022-01-23 07:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][940/1251] eta 0:11:27 lr 0.000363 time 1.5733 (2.2095) loss 2.7321 (3.4372) grad_norm 1.7695 (1.7142) [2022-01-23 07:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][950/1251] eta 0:11:04 lr 0.000363 time 1.7807 (2.2085) loss 2.8044 (3.4347) grad_norm 1.6635 (1.7139) [2022-01-23 07:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][960/1251] eta 0:10:42 lr 0.000363 time 3.6649 (2.2090) loss 2.7083 (3.4340) grad_norm 1.5279 (1.7139) [2022-01-23 07:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][970/1251] eta 0:10:20 lr 0.000363 time 1.9552 (2.2086) loss 3.3876 (3.4339) grad_norm 1.7203 (1.7136) [2022-01-23 07:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][980/1251] eta 0:09:58 lr 0.000363 time 1.7049 (2.2068) loss 3.0917 (3.4337) grad_norm 1.6960 (1.7135) [2022-01-23 07:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][990/1251] eta 0:09:35 lr 0.000363 time 2.1645 (2.2059) loss 3.9072 (3.4350) grad_norm 1.7081 (1.7139) [2022-01-23 07:48:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[177/300][1000/1251] eta 0:09:13 lr 0.000363 time 3.3968 (2.2069) loss 4.0984 (3.4335) grad_norm 1.6283 (1.7132) [2022-01-23 07:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1010/1251] eta 0:08:51 lr 0.000363 time 1.6637 (2.2059) loss 3.9565 (3.4330) grad_norm 1.6262 (1.7135) [2022-01-23 07:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1020/1251] eta 0:08:29 lr 0.000363 time 2.0819 (2.2051) loss 2.5187 (3.4316) grad_norm 1.6573 (1.7134) [2022-01-23 07:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1030/1251] eta 0:08:07 lr 0.000363 time 1.8714 (2.2041) loss 4.0809 (3.4307) grad_norm 1.9605 (1.7143) [2022-01-23 07:49:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1040/1251] eta 0:07:45 lr 0.000363 time 2.3756 (2.2045) loss 3.7038 (3.4316) grad_norm 2.0366 (1.7144) [2022-01-23 07:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1050/1251] eta 0:07:22 lr 0.000363 time 1.6994 (2.2037) loss 3.1657 (3.4327) grad_norm 1.7923 (1.7146) [2022-01-23 07:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1060/1251] eta 0:07:00 lr 0.000363 time 1.8276 (2.2026) loss 3.7923 (3.4344) grad_norm 1.9935 (1.7155) [2022-01-23 07:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1070/1251] eta 0:06:38 lr 0.000363 time 1.8530 (2.2012) loss 3.5795 (3.4356) grad_norm 1.9802 (1.7162) [2022-01-23 07:51:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1080/1251] eta 0:06:16 lr 0.000363 time 2.2051 (2.2011) loss 3.6280 (3.4346) grad_norm 1.7736 (1.7166) [2022-01-23 07:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1090/1251] eta 0:05:54 lr 0.000363 time 3.3831 (2.2020) loss 3.6458 (3.4345) grad_norm 1.6513 (1.7170) [2022-01-23 07:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1100/1251] eta 0:05:33 lr 0.000363 time 1.9426 (2.2059) loss 2.6271 (3.4338) 
grad_norm 1.8409 (1.7174) [2022-01-23 07:52:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1110/1251] eta 0:05:10 lr 0.000362 time 2.0949 (2.2052) loss 3.7715 (3.4350) grad_norm 1.4511 (1.7177) [2022-01-23 07:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1120/1251] eta 0:04:48 lr 0.000362 time 1.8624 (2.2029) loss 2.2724 (3.4341) grad_norm 1.5924 (1.7172) [2022-01-23 07:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1130/1251] eta 0:04:26 lr 0.000362 time 1.9845 (2.2029) loss 4.2880 (3.4335) grad_norm 2.0698 (1.7180) [2022-01-23 07:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1140/1251] eta 0:04:04 lr 0.000362 time 1.8099 (2.2027) loss 3.6601 (3.4336) grad_norm 1.7164 (1.7168) [2022-01-23 07:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1150/1251] eta 0:03:42 lr 0.000362 time 2.1713 (2.2030) loss 3.4172 (3.4322) grad_norm 1.9697 (1.7168) [2022-01-23 07:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1160/1251] eta 0:03:20 lr 0.000362 time 1.9069 (2.2035) loss 3.9904 (3.4321) grad_norm 1.8076 (1.7165) [2022-01-23 07:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1170/1251] eta 0:02:58 lr 0.000362 time 4.3218 (2.2063) loss 2.8216 (3.4312) grad_norm 1.6057 (1.7159) [2022-01-23 07:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1180/1251] eta 0:02:36 lr 0.000362 time 2.1123 (2.2053) loss 3.5179 (3.4310) grad_norm 1.6249 (1.7149) [2022-01-23 07:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1190/1251] eta 0:02:14 lr 0.000362 time 1.7840 (2.2045) loss 3.2009 (3.4302) grad_norm 1.8997 (1.7150) [2022-01-23 07:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1200/1251] eta 0:01:52 lr 0.000362 time 1.9032 (2.2035) loss 3.3235 (3.4297) grad_norm 1.5536 (1.7145) [2022-01-23 07:56:00 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [177/300][1210/1251] eta 0:01:30 lr 0.000362 time 2.3720 (2.2029) loss 3.2329 (3.4274) grad_norm 1.5031 (1.7141) [2022-01-23 07:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1220/1251] eta 0:01:08 lr 0.000362 time 1.9427 (2.2009) loss 3.2522 (3.4295) grad_norm 1.6003 (1.7136) [2022-01-23 07:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1230/1251] eta 0:00:46 lr 0.000362 time 2.2404 (2.2006) loss 3.6603 (3.4287) grad_norm 1.9884 (1.7137) [2022-01-23 07:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1240/1251] eta 0:00:24 lr 0.000362 time 2.2279 (2.2000) loss 3.2180 (3.4288) grad_norm 1.8892 (1.7150) [2022-01-23 07:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [177/300][1250/1251] eta 0:00:02 lr 0.000362 time 1.1923 (2.1949) loss 2.9246 (3.4280) grad_norm 1.9256 (1.7150) [2022-01-23 07:57:19 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 177 training takes 0:45:46 [2022-01-23 07:57:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.676 (18.676) Loss 0.9426 (0.9426) Acc@1 79.199 (79.199) Acc@5 93.945 (93.945) [2022-01-23 07:57:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.274 (3.292) Loss 1.0278 (0.9627) Acc@1 75.195 (77.228) Acc@5 93.262 (94.034) [2022-01-23 07:58:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.191 (2.493) Loss 0.9216 (0.9516) Acc@1 77.344 (77.562) Acc@5 94.629 (93.964) [2022-01-23 07:58:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.586 (2.308) Loss 0.9952 (0.9586) Acc@1 76.660 (77.341) Acc@5 93.945 (93.936) [2022-01-23 07:58:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.881 (2.112) Loss 0.9367 (0.9578) Acc@1 78.711 (77.458) Acc@5 94.727 (93.914) [2022-01-23 07:58:53 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.444 Acc@5 93.966 [2022-01-23 07:58:53 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-01-23 07:58:53 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.44% [2022-01-23 07:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][0/1251] eta 7:32:02 lr 0.000362 time 21.6807 (21.6807) loss 2.4149 (2.4149) grad_norm 1.8394 (1.8394) [2022-01-23 07:59:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][10/1251] eta 1:22:47 lr 0.000362 time 1.8708 (4.0025) loss 3.6357 (3.3666) grad_norm 1.5689 (1.7511) [2022-01-23 07:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][20/1251] eta 1:04:31 lr 0.000362 time 1.3492 (3.1448) loss 3.6555 (3.4421) grad_norm 1.6739 (1.7254) [2022-01-23 08:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][30/1251] eta 0:55:11 lr 0.000362 time 1.6957 (2.7124) loss 3.4785 (3.4235) grad_norm 1.6811 (1.7264) [2022-01-23 08:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][40/1251] eta 0:55:05 lr 0.000362 time 4.4494 (2.7297) loss 3.5156 (3.4036) grad_norm 1.7025 (1.7655) [2022-01-23 08:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][50/1251] eta 0:52:43 lr 0.000362 time 2.5527 (2.6340) loss 3.7844 (3.4213) grad_norm 1.6558 (1.7861) [2022-01-23 08:01:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][60/1251] eta 0:50:01 lr 0.000362 time 1.5690 (2.5200) loss 3.9229 (3.4107) grad_norm 1.7842 (1.7644) [2022-01-23 08:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][70/1251] eta 0:48:51 lr 0.000362 time 1.9573 (2.4820) loss 4.1149 (3.4439) grad_norm 1.6290 (1.7629) [2022-01-23 08:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][80/1251] eta 0:48:54 lr 0.000362 time 6.5294 (2.5060) loss 3.3831 (3.4391) grad_norm 1.6812 (1.7493) [2022-01-23 08:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][90/1251] eta 0:47:21 lr 0.000362 time 
1.9060 (2.4472) loss 3.3015 (3.4445) grad_norm 1.9337 (1.7469) [2022-01-23 08:02:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][100/1251] eta 0:46:06 lr 0.000362 time 1.9040 (2.4036) loss 4.3323 (3.4399) grad_norm 1.7842 (1.7437) [2022-01-23 08:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][110/1251] eta 0:45:06 lr 0.000361 time 1.6278 (2.3720) loss 2.9165 (3.4275) grad_norm 1.7315 (1.7362) [2022-01-23 08:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][120/1251] eta 0:44:33 lr 0.000361 time 3.6411 (2.3639) loss 3.8705 (3.4394) grad_norm 1.8455 (1.7306) [2022-01-23 08:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][130/1251] eta 0:43:49 lr 0.000361 time 2.4004 (2.3457) loss 3.9300 (3.4553) grad_norm 1.9664 (1.7374) [2022-01-23 08:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][140/1251] eta 0:43:23 lr 0.000361 time 2.2278 (2.3437) loss 4.4587 (3.4581) grad_norm 1.7658 (1.7305) [2022-01-23 08:04:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][150/1251] eta 0:42:45 lr 0.000361 time 1.5888 (2.3305) loss 3.6814 (3.4591) grad_norm 1.9004 (1.7326) [2022-01-23 08:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][160/1251] eta 0:42:14 lr 0.000361 time 2.9341 (2.3231) loss 3.0676 (3.4564) grad_norm 1.5196 (1.7311) [2022-01-23 08:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][170/1251] eta 0:41:40 lr 0.000361 time 2.4658 (2.3129) loss 2.5218 (3.4573) grad_norm 1.9591 (1.7281) [2022-01-23 08:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][180/1251] eta 0:41:12 lr 0.000361 time 1.8496 (2.3085) loss 4.1755 (3.4669) grad_norm 2.0025 (1.7251) [2022-01-23 08:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][190/1251] eta 0:40:44 lr 0.000361 time 2.2428 (2.3043) loss 3.9827 (3.4601) grad_norm 1.7005 (1.7252) [2022-01-23 08:06:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][200/1251] eta 0:40:16 lr 0.000361 time 2.7437 (2.2988) loss 3.8530 (3.4641) grad_norm 1.6665 (1.7261) [2022-01-23 08:06:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][210/1251] eta 0:39:44 lr 0.000361 time 2.1753 (2.2908) loss 3.5517 (3.4595) grad_norm 1.6595 (1.7250) [2022-01-23 08:07:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][220/1251] eta 0:39:18 lr 0.000361 time 1.9654 (2.2875) loss 3.8109 (3.4482) grad_norm 1.9763 (1.7251) [2022-01-23 08:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][230/1251] eta 0:38:48 lr 0.000361 time 1.9586 (2.2801) loss 3.9644 (3.4402) grad_norm 1.7226 (1.7236) [2022-01-23 08:08:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][240/1251] eta 0:38:23 lr 0.000361 time 2.8081 (2.2782) loss 3.7015 (3.4330) grad_norm 1.7477 (1.7219) [2022-01-23 08:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][250/1251] eta 0:37:55 lr 0.000361 time 1.7489 (2.2736) loss 2.3942 (3.4228) grad_norm 1.6081 (1.7200) [2022-01-23 08:08:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][260/1251] eta 0:37:30 lr 0.000361 time 1.9589 (2.2708) loss 3.7639 (3.4257) grad_norm 1.7600 (1.7206) [2022-01-23 08:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][270/1251] eta 0:36:58 lr 0.000361 time 1.8694 (2.2616) loss 3.7648 (3.4338) grad_norm 1.5104 (1.7210) [2022-01-23 08:09:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][280/1251] eta 0:36:38 lr 0.000361 time 3.7037 (2.2637) loss 2.3291 (3.4311) grad_norm 1.7895 (1.7181) [2022-01-23 08:09:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][290/1251] eta 0:36:08 lr 0.000361 time 2.0618 (2.2566) loss 3.8699 (3.4319) grad_norm 1.9622 (1.7192) [2022-01-23 08:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][300/1251] eta 0:35:45 lr 
0.000361 time 2.7181 (2.2557) loss 3.7566 (3.4362) grad_norm 1.4580 (1.7177) [2022-01-23 08:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][310/1251] eta 0:35:26 lr 0.000361 time 2.8734 (2.2600) loss 3.9389 (3.4398) grad_norm 1.7151 (1.7170) [2022-01-23 08:10:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][320/1251] eta 0:35:00 lr 0.000361 time 1.6038 (2.2566) loss 4.1375 (3.4438) grad_norm 1.6854 (1.7158) [2022-01-23 08:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][330/1251] eta 0:34:33 lr 0.000361 time 1.8889 (2.2509) loss 4.0795 (3.4491) grad_norm 1.5096 (1.7149) [2022-01-23 08:11:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][340/1251] eta 0:34:08 lr 0.000361 time 2.0938 (2.2483) loss 3.9427 (3.4535) grad_norm 1.6865 (1.7156) [2022-01-23 08:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][350/1251] eta 0:33:39 lr 0.000361 time 1.5872 (2.2416) loss 3.5274 (3.4481) grad_norm 1.8029 (1.7164) [2022-01-23 08:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][360/1251] eta 0:33:13 lr 0.000361 time 2.3102 (2.2379) loss 3.0580 (3.4473) grad_norm 1.7241 (1.7170) [2022-01-23 08:12:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][370/1251] eta 0:32:54 lr 0.000360 time 2.4668 (2.2414) loss 3.6992 (3.4399) grad_norm 2.0927 (1.7184) [2022-01-23 08:13:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][380/1251] eta 0:32:32 lr 0.000360 time 2.3091 (2.2411) loss 4.4418 (3.4419) grad_norm 1.5317 (1.7174) [2022-01-23 08:13:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][390/1251] eta 0:32:06 lr 0.000360 time 2.1572 (2.2377) loss 2.2936 (3.4318) grad_norm 2.3135 (1.7206) [2022-01-23 08:13:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][400/1251] eta 0:31:43 lr 0.000360 time 2.6314 (2.2367) loss 3.2594 (3.4299) grad_norm 1.5457 (1.7194) [2022-01-23 08:14:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][410/1251] eta 0:31:21 lr 0.000360 time 2.6964 (2.2372) loss 3.4481 (3.4373) grad_norm 1.7855 (1.7189) [2022-01-23 08:14:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][420/1251] eta 0:30:57 lr 0.000360 time 2.5094 (2.2350) loss 3.7516 (3.4411) grad_norm 1.7283 (1.7189) [2022-01-23 08:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][430/1251] eta 0:30:38 lr 0.000360 time 2.1262 (2.2395) loss 2.2481 (3.4326) grad_norm 2.1852 (1.7218) [2022-01-23 08:15:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][440/1251] eta 0:30:13 lr 0.000360 time 1.6319 (2.2357) loss 2.6783 (3.4317) grad_norm 1.7499 (1.7266) [2022-01-23 08:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][450/1251] eta 0:29:51 lr 0.000360 time 1.8121 (2.2364) loss 3.2800 (3.4344) grad_norm 1.6364 (1.7274) [2022-01-23 08:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][460/1251] eta 0:29:25 lr 0.000360 time 1.7384 (2.2321) loss 3.5568 (3.4399) grad_norm 1.9377 (1.7272) [2022-01-23 08:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][470/1251] eta 0:29:03 lr 0.000360 time 1.9347 (2.2322) loss 3.4703 (3.4388) grad_norm 1.6646 (1.7263) [2022-01-23 08:16:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][480/1251] eta 0:28:38 lr 0.000360 time 2.2010 (2.2288) loss 2.4220 (3.4330) grad_norm 2.1228 (1.7268) [2022-01-23 08:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][490/1251] eta 0:28:14 lr 0.000360 time 2.0026 (2.2270) loss 2.4492 (3.4275) grad_norm 2.0289 (1.7268) [2022-01-23 08:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][500/1251] eta 0:27:51 lr 0.000360 time 2.6805 (2.2258) loss 2.4869 (3.4286) grad_norm 1.4373 (1.7281) [2022-01-23 08:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][510/1251] eta 0:27:28 lr 
0.000360 time 2.6693 (2.2241) loss 3.5837 (3.4287) grad_norm 1.7710 (1.7271) [2022-01-23 08:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][520/1251] eta 0:27:03 lr 0.000360 time 1.8504 (2.2203) loss 3.8699 (3.4256) grad_norm 1.7866 (1.7281) [2022-01-23 08:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][530/1251] eta 0:26:40 lr 0.000360 time 2.1447 (2.2197) loss 3.7256 (3.4276) grad_norm 1.6852 (1.7284) [2022-01-23 08:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][540/1251] eta 0:26:19 lr 0.000360 time 2.4238 (2.2215) loss 3.1991 (3.4262) grad_norm 1.4108 (1.7314) [2022-01-23 08:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][550/1251] eta 0:25:55 lr 0.000360 time 1.9539 (2.2190) loss 2.4892 (3.4263) grad_norm 1.7008 (1.7322) [2022-01-23 08:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][560/1251] eta 0:25:32 lr 0.000360 time 1.8562 (2.2177) loss 4.1300 (3.4239) grad_norm 1.5434 (1.7309) [2022-01-23 08:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][570/1251] eta 0:25:10 lr 0.000360 time 1.8116 (2.2177) loss 3.6134 (3.4229) grad_norm 1.7094 (1.7321) [2022-01-23 08:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][580/1251] eta 0:24:47 lr 0.000360 time 2.8603 (2.2165) loss 2.9313 (3.4231) grad_norm 1.6596 (1.7305) [2022-01-23 08:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][590/1251] eta 0:24:23 lr 0.000360 time 1.9711 (2.2136) loss 2.4223 (3.4245) grad_norm 1.8902 (1.7307) [2022-01-23 08:21:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][600/1251] eta 0:24:00 lr 0.000360 time 1.7985 (2.2120) loss 3.7744 (3.4227) grad_norm 1.5725 (1.7312) [2022-01-23 08:21:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][610/1251] eta 0:23:38 lr 0.000360 time 2.0489 (2.2136) loss 3.3788 (3.4261) grad_norm 1.6352 (1.7318) [2022-01-23 08:21:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][620/1251] eta 0:23:16 lr 0.000359 time 2.1496 (2.2137) loss 3.4939 (3.4253) grad_norm 1.7317 (1.7298) [2022-01-23 08:22:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][630/1251] eta 0:22:54 lr 0.000359 time 1.9684 (2.2127) loss 2.9958 (3.4254) grad_norm 1.6341 (1.7282) [2022-01-23 08:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][640/1251] eta 0:22:31 lr 0.000359 time 1.5780 (2.2115) loss 3.2337 (3.4268) grad_norm 1.4085 (1.7284) [2022-01-23 08:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][650/1251] eta 0:22:08 lr 0.000359 time 2.4263 (2.2103) loss 3.5688 (3.4268) grad_norm 1.6463 (1.7279) [2022-01-23 08:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][660/1251] eta 0:21:45 lr 0.000359 time 2.6094 (2.2095) loss 3.2335 (3.4248) grad_norm 1.7584 (1.7269) [2022-01-23 08:23:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][670/1251] eta 0:21:24 lr 0.000359 time 3.0434 (2.2113) loss 3.7370 (3.4269) grad_norm 1.7181 (1.7290) [2022-01-23 08:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][680/1251] eta 0:21:02 lr 0.000359 time 2.3022 (2.2119) loss 3.8589 (3.4295) grad_norm 2.1106 (1.7303) [2022-01-23 08:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][690/1251] eta 0:20:40 lr 0.000359 time 1.9587 (2.2113) loss 3.1988 (3.4285) grad_norm 1.7680 (1.7298) [2022-01-23 08:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][700/1251] eta 0:20:18 lr 0.000359 time 1.9125 (2.2109) loss 3.0798 (3.4286) grad_norm 2.1181 (1.7302) [2022-01-23 08:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][710/1251] eta 0:19:56 lr 0.000359 time 2.6238 (2.2117) loss 3.7418 (3.4315) grad_norm 1.8018 (1.7301) [2022-01-23 08:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][720/1251] eta 0:19:34 lr 
0.000359 time 2.7970 (2.2123) loss 3.1217 (3.4325) grad_norm 1.7378 (1.7302) [2022-01-23 08:25:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][730/1251] eta 0:19:12 lr 0.000359 time 1.9319 (2.2120) loss 3.3392 (3.4328) grad_norm 1.6865 (1.7293) [2022-01-23 08:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][740/1251] eta 0:18:49 lr 0.000359 time 2.1804 (2.2102) loss 3.6657 (3.4334) grad_norm 1.5759 (1.7284) [2022-01-23 08:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][750/1251] eta 0:18:27 lr 0.000359 time 2.5107 (2.2099) loss 3.9657 (3.4332) grad_norm 1.6188 (1.7284) [2022-01-23 08:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][760/1251] eta 0:18:04 lr 0.000359 time 2.8723 (2.2086) loss 3.6661 (3.4341) grad_norm 1.6209 (1.7304) [2022-01-23 08:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][770/1251] eta 0:17:41 lr 0.000359 time 2.0695 (2.2076) loss 4.0791 (3.4354) grad_norm 1.5689 (1.7296) [2022-01-23 08:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][780/1251] eta 0:17:19 lr 0.000359 time 2.5920 (2.2069) loss 3.1104 (3.4370) grad_norm 1.6483 (1.7303) [2022-01-23 08:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][790/1251] eta 0:16:56 lr 0.000359 time 2.2401 (2.2058) loss 4.2748 (3.4376) grad_norm 1.7310 (1.7322) [2022-01-23 08:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][800/1251] eta 0:16:34 lr 0.000359 time 1.8055 (2.2057) loss 3.1201 (3.4404) grad_norm 1.7149 (1.7320) [2022-01-23 08:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][810/1251] eta 0:16:12 lr 0.000359 time 1.6892 (2.2042) loss 2.5567 (3.4366) grad_norm 1.8325 (1.7326) [2022-01-23 08:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][820/1251] eta 0:15:49 lr 0.000359 time 1.5791 (2.2020) loss 4.1242 (3.4378) grad_norm 1.6689 (1.7321) [2022-01-23 08:29:24 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][830/1251] eta 0:15:27 lr 0.000359 time 1.8061 (2.2023) loss 3.3733 (3.4397) grad_norm 1.7806 (1.7322) [2022-01-23 08:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][840/1251] eta 0:15:05 lr 0.000359 time 2.4338 (2.2030) loss 3.8574 (3.4412) grad_norm 1.5746 (1.7316) [2022-01-23 08:30:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][850/1251] eta 0:14:43 lr 0.000359 time 1.6442 (2.2021) loss 3.7182 (3.4413) grad_norm 1.6616 (1.7323) [2022-01-23 08:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][860/1251] eta 0:14:20 lr 0.000359 time 1.9423 (2.2007) loss 4.1078 (3.4420) grad_norm 1.8777 (1.7324) [2022-01-23 08:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][870/1251] eta 0:13:58 lr 0.000358 time 2.0851 (2.2001) loss 2.9569 (3.4408) grad_norm 1.9899 (1.7331) [2022-01-23 08:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][880/1251] eta 0:13:36 lr 0.000358 time 2.2112 (2.2007) loss 3.5231 (3.4445) grad_norm 1.8436 (1.7353) [2022-01-23 08:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][890/1251] eta 0:13:14 lr 0.000358 time 1.5892 (2.2018) loss 3.7833 (3.4451) grad_norm 1.5652 (1.7345) [2022-01-23 08:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][900/1251] eta 0:12:52 lr 0.000358 time 1.8325 (2.2021) loss 3.4598 (3.4454) grad_norm 1.7400 (1.7336) [2022-01-23 08:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][910/1251] eta 0:12:31 lr 0.000358 time 2.4629 (2.2029) loss 3.2625 (3.4426) grad_norm 1.6743 (1.7327) [2022-01-23 08:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][920/1251] eta 0:12:09 lr 0.000358 time 2.8613 (2.2029) loss 3.0257 (3.4388) grad_norm 1.9177 (1.7332) [2022-01-23 08:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][930/1251] eta 0:11:47 lr 
0.000358 time 1.8242 (2.2035) loss 2.6664 (3.4354) grad_norm 1.7099 (1.7337) [2022-01-23 08:33:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][940/1251] eta 0:11:25 lr 0.000358 time 1.5360 (2.2028) loss 3.2586 (3.4370) grad_norm 1.7793 (1.7353) [2022-01-23 08:33:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][950/1251] eta 0:11:02 lr 0.000358 time 2.2060 (2.2018) loss 3.7297 (3.4390) grad_norm 1.8087 (1.7358) [2022-01-23 08:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][960/1251] eta 0:10:40 lr 0.000358 time 2.5808 (2.2017) loss 3.7560 (3.4412) grad_norm 1.7252 (1.7357) [2022-01-23 08:34:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][970/1251] eta 0:10:18 lr 0.000358 time 2.7491 (2.2010) loss 4.1906 (3.4392) grad_norm 1.8591 (1.7357) [2022-01-23 08:34:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][980/1251] eta 0:09:56 lr 0.000358 time 2.0348 (2.2018) loss 2.3399 (3.4368) grad_norm 1.6474 (1.7373) [2022-01-23 08:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][990/1251] eta 0:09:34 lr 0.000358 time 2.2121 (2.2017) loss 3.7120 (3.4344) grad_norm 1.8940 (1.7372) [2022-01-23 08:35:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1000/1251] eta 0:09:12 lr 0.000358 time 1.6203 (2.2002) loss 2.8559 (3.4352) grad_norm 2.1829 (1.7375) [2022-01-23 08:35:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1010/1251] eta 0:08:49 lr 0.000358 time 2.1035 (2.1987) loss 3.4800 (3.4356) grad_norm 1.5059 (1.7380) [2022-01-23 08:36:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1020/1251] eta 0:08:27 lr 0.000358 time 2.3069 (2.1987) loss 3.0074 (3.4339) grad_norm 1.6917 (1.7378) [2022-01-23 08:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1030/1251] eta 0:08:06 lr 0.000358 time 2.0997 (2.2000) loss 3.6275 (3.4353) grad_norm 1.7024 (1.7368) [2022-01-23 
08:37:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1040/1251] eta 0:07:44 lr 0.000358 time 1.8249 (2.2007) loss 3.9947 (3.4374) grad_norm 2.0272 (1.7369) [2022-01-23 08:37:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1050/1251] eta 0:07:22 lr 0.000358 time 1.8855 (2.1995) loss 3.6558 (3.4388) grad_norm 1.6871 (1.7361) [2022-01-23 08:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1060/1251] eta 0:06:59 lr 0.000358 time 1.7176 (2.1975) loss 3.5701 (3.4388) grad_norm 1.6600 (1.7355) [2022-01-23 08:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1070/1251] eta 0:06:37 lr 0.000358 time 2.6969 (2.1972) loss 3.4822 (3.4376) grad_norm 1.7285 (1.7350) [2022-01-23 08:38:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1080/1251] eta 0:06:15 lr 0.000358 time 2.4386 (2.1976) loss 3.5997 (3.4392) grad_norm 1.6512 (1.7352) [2022-01-23 08:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1090/1251] eta 0:05:53 lr 0.000358 time 2.3745 (2.1975) loss 3.7659 (3.4389) grad_norm 1.7013 (1.7343) [2022-01-23 08:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1100/1251] eta 0:05:31 lr 0.000358 time 1.9051 (2.1986) loss 3.7877 (3.4407) grad_norm 1.6685 (1.7341) [2022-01-23 08:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1110/1251] eta 0:05:10 lr 0.000358 time 3.0793 (2.2000) loss 3.4867 (3.4401) grad_norm 1.5635 (1.7333) [2022-01-23 08:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1120/1251] eta 0:04:48 lr 0.000357 time 2.7551 (2.2000) loss 3.3919 (3.4375) grad_norm 1.7393 (1.7333) [2022-01-23 08:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1130/1251] eta 0:04:25 lr 0.000357 time 1.7850 (2.1981) loss 2.7818 (3.4378) grad_norm 1.8775 (1.7343) [2022-01-23 08:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1140/1251] 
eta 0:04:03 lr 0.000357 time 1.9238 (2.1965) loss 2.7380 (3.4386) grad_norm 1.5698 (1.7348) [2022-01-23 08:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1150/1251] eta 0:03:41 lr 0.000357 time 1.9127 (2.1945) loss 3.6324 (3.4394) grad_norm 1.5871 (1.7340) [2022-01-23 08:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1160/1251] eta 0:03:19 lr 0.000357 time 2.0428 (2.1938) loss 2.3147 (3.4398) grad_norm 1.7856 (1.7337) [2022-01-23 08:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1170/1251] eta 0:02:58 lr 0.000357 time 2.3425 (2.1987) loss 2.1006 (3.4380) grad_norm 1.6486 (1.7332) [2022-01-23 08:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1180/1251] eta 0:02:36 lr 0.000357 time 2.1308 (2.2014) loss 3.6979 (3.4382) grad_norm 1.6552 (1.7337) [2022-01-23 08:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1190/1251] eta 0:02:14 lr 0.000357 time 1.9109 (2.2012) loss 4.0287 (3.4388) grad_norm 1.7143 (1.7342) [2022-01-23 08:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1200/1251] eta 0:01:52 lr 0.000357 time 1.9485 (2.1993) loss 3.9768 (3.4393) grad_norm 1.7318 (1.7346) [2022-01-23 08:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1210/1251] eta 0:01:30 lr 0.000357 time 1.9591 (2.1984) loss 2.2237 (3.4379) grad_norm 1.5548 (1.7340) [2022-01-23 08:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1220/1251] eta 0:01:08 lr 0.000357 time 1.5877 (2.1973) loss 3.2846 (3.4391) grad_norm 1.5198 (1.7333) [2022-01-23 08:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1230/1251] eta 0:00:46 lr 0.000357 time 1.8318 (2.1969) loss 2.2966 (3.4376) grad_norm 1.5065 (1.7334) [2022-01-23 08:44:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1240/1251] eta 0:00:24 lr 0.000357 time 1.8082 (2.1959) loss 3.7710 (3.4383) grad_norm 1.7131 
(1.7326) [2022-01-23 08:44:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [178/300][1250/1251] eta 0:00:02 lr 0.000357 time 1.1811 (2.1912) loss 3.1934 (3.4373) grad_norm 1.6923 (1.7327) [2022-01-23 08:44:35 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 178 training takes 0:45:41 [2022-01-23 08:44:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.151 (18.151) Loss 0.9777 (0.9777) Acc@1 77.637 (77.637) Acc@5 93.164 (93.164) [2022-01-23 08:45:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.671 (3.316) Loss 0.9855 (0.9598) Acc@1 76.465 (77.495) Acc@5 93.750 (93.768) [2022-01-23 08:45:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.230 (2.579) Loss 0.9861 (0.9591) Acc@1 77.051 (77.483) Acc@5 93.652 (93.810) [2022-01-23 08:45:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.616 (2.221) Loss 0.9710 (0.9546) Acc@1 75.879 (77.526) Acc@5 94.043 (93.889) [2022-01-23 08:46:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.933 (2.162) Loss 0.9199 (0.9606) Acc@1 77.930 (77.289) Acc@5 94.043 (93.838) [2022-01-23 08:46:11 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.236 Acc@5 93.854 [2022-01-23 08:46:11 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.2% [2022-01-23 08:46:11 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.44% [2022-01-23 08:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][0/1251] eta 7:31:44 lr 0.000357 time 21.6665 (21.6665) loss 3.7030 (3.7030) grad_norm 1.9334 (1.9334) [2022-01-23 08:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][10/1251] eta 1:23:23 lr 0.000357 time 1.3738 (4.0315) loss 3.4684 (3.6100) grad_norm 1.5123 (1.7532) [2022-01-23 08:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][20/1251] eta 1:04:22 lr 0.000357 time 1.4260 (3.1379) loss 
3.4677 (3.5034) grad_norm 1.7990 (1.7552) [2022-01-23 08:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][30/1251] eta 0:57:50 lr 0.000357 time 1.4743 (2.8427) loss 3.5871 (3.5441) grad_norm 1.7278 (1.7438) [2022-01-23 08:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][40/1251] eta 0:54:09 lr 0.000357 time 3.9441 (2.6834) loss 4.2377 (3.5630) grad_norm 1.7190 (1.7421) [2022-01-23 08:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][50/1251] eta 0:52:30 lr 0.000357 time 1.6873 (2.6235) loss 2.4058 (3.4993) grad_norm 1.8477 (1.7617) [2022-01-23 08:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][60/1251] eta 0:50:25 lr 0.000357 time 1.4152 (2.5405) loss 3.6380 (3.4836) grad_norm 1.9122 (1.7591) [2022-01-23 08:49:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][70/1251] eta 0:48:49 lr 0.000357 time 1.7003 (2.4802) loss 4.0943 (3.4941) grad_norm 1.7565 (1.7903) [2022-01-23 08:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][80/1251] eta 0:48:12 lr 0.000357 time 3.7234 (2.4701) loss 2.6078 (3.4697) grad_norm 1.7610 (1.7819) [2022-01-23 08:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][90/1251] eta 0:47:22 lr 0.000357 time 1.5792 (2.4480) loss 3.7103 (3.4620) grad_norm 1.6804 (1.7883) [2022-01-23 08:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][100/1251] eta 0:46:26 lr 0.000357 time 1.6842 (2.4209) loss 2.8578 (3.4243) grad_norm 1.5073 (1.7823) [2022-01-23 08:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][110/1251] eta 0:45:23 lr 0.000357 time 1.6443 (2.3866) loss 3.8361 (3.4276) grad_norm 1.7535 (1.7718) [2022-01-23 08:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][120/1251] eta 0:44:52 lr 0.000357 time 3.5698 (2.3810) loss 3.8133 (3.4265) grad_norm 1.8144 (1.7693) [2022-01-23 08:51:21 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [179/300][130/1251] eta 0:44:11 lr 0.000356 time 1.5496 (2.3656) loss 3.7373 (3.4181) grad_norm 1.5520 (1.7571) [2022-01-23 08:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][140/1251] eta 0:43:29 lr 0.000356 time 1.7361 (2.3484) loss 3.0073 (3.4298) grad_norm 2.0268 (1.7511) [2022-01-23 08:52:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][150/1251] eta 0:42:41 lr 0.000356 time 1.9034 (2.3267) loss 3.7617 (3.4452) grad_norm 1.5798 (1.7490) [2022-01-23 08:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][160/1251] eta 0:42:15 lr 0.000356 time 3.6691 (2.3244) loss 3.8212 (3.4545) grad_norm 1.5606 (1.7497) [2022-01-23 08:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][170/1251] eta 0:41:40 lr 0.000356 time 1.7462 (2.3135) loss 3.4625 (3.4525) grad_norm 1.6787 (1.7467) [2022-01-23 08:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][180/1251] eta 0:41:11 lr 0.000356 time 2.1390 (2.3078) loss 3.2539 (3.4477) grad_norm 1.9078 (1.7368) [2022-01-23 08:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][190/1251] eta 0:40:45 lr 0.000356 time 1.9647 (2.3052) loss 3.7685 (3.4309) grad_norm 1.6905 (1.7340) [2022-01-23 08:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][200/1251] eta 0:40:28 lr 0.000356 time 3.3768 (2.3104) loss 2.7268 (3.4348) grad_norm 1.9598 (1.7353) [2022-01-23 08:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][210/1251] eta 0:39:56 lr 0.000356 time 1.5320 (2.3022) loss 3.6846 (3.4375) grad_norm 1.7174 (1.7401) [2022-01-23 08:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][220/1251] eta 0:39:22 lr 0.000356 time 1.6371 (2.2912) loss 3.1536 (3.4465) grad_norm 1.9863 (1.7416) [2022-01-23 08:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][230/1251] eta 0:38:47 lr 0.000356 time 1.9302 (2.2799) loss 3.5190 
(3.4466) grad_norm 2.1578 (1.7411) [2022-01-23 08:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][240/1251] eta 0:38:18 lr 0.000356 time 2.1905 (2.2735) loss 2.1717 (3.4367) grad_norm 1.8663 (1.7408) [2022-01-23 08:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][250/1251] eta 0:37:49 lr 0.000356 time 1.8389 (2.2669) loss 3.6441 (3.4362) grad_norm 1.7949 (1.7401) [2022-01-23 08:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][260/1251] eta 0:37:26 lr 0.000356 time 2.8773 (2.2667) loss 3.3372 (3.4364) grad_norm 2.0723 (1.7484) [2022-01-23 08:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][270/1251] eta 0:37:01 lr 0.000356 time 2.1092 (2.2645) loss 3.2014 (3.4355) grad_norm 1.7078 (1.7478) [2022-01-23 08:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][280/1251] eta 0:36:41 lr 0.000356 time 1.5550 (2.2674) loss 4.0865 (3.4230) grad_norm 1.5850 (1.7474) [2022-01-23 08:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][290/1251] eta 0:36:18 lr 0.000356 time 2.2451 (2.2673) loss 3.8278 (3.4317) grad_norm 1.8537 (1.7524) [2022-01-23 08:57:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][300/1251] eta 0:35:50 lr 0.000356 time 2.2705 (2.2616) loss 2.7521 (3.4250) grad_norm 1.5700 (1.7519) [2022-01-23 08:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][310/1251] eta 0:35:22 lr 0.000356 time 1.6869 (2.2554) loss 2.8472 (3.4225) grad_norm 1.6705 (1.7519) [2022-01-23 08:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][320/1251] eta 0:34:55 lr 0.000356 time 1.8899 (2.2508) loss 3.6379 (3.4238) grad_norm 1.8843 (1.7506) [2022-01-23 08:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][330/1251] eta 0:34:32 lr 0.000356 time 2.5475 (2.2501) loss 3.6653 (3.4237) grad_norm 1.7204 (1.7525) [2022-01-23 08:58:58 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [179/300][340/1251] eta 0:34:09 lr 0.000356 time 2.4714 (2.2495) loss 4.1304 (3.4281) grad_norm 1.5937 (1.7509) [2022-01-23 08:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][350/1251] eta 0:33:45 lr 0.000356 time 2.2797 (2.2479) loss 3.8789 (3.4232) grad_norm 1.9256 (1.7506) [2022-01-23 08:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][360/1251] eta 0:33:26 lr 0.000356 time 2.2655 (2.2515) loss 2.5700 (3.4256) grad_norm 1.5654 (1.7480) [2022-01-23 09:00:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][370/1251] eta 0:33:08 lr 0.000356 time 3.0174 (2.2567) loss 2.7881 (3.4241) grad_norm 2.1294 (1.7549) [2022-01-23 09:00:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][380/1251] eta 0:32:43 lr 0.000355 time 2.9284 (2.2549) loss 3.1182 (3.4167) grad_norm 1.6380 (1.7536) [2022-01-23 09:00:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][390/1251] eta 0:32:14 lr 0.000355 time 1.7441 (2.2473) loss 3.3202 (3.4192) grad_norm 2.1048 (1.7543) [2022-01-23 09:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][400/1251] eta 0:31:46 lr 0.000355 time 1.8648 (2.2400) loss 2.8439 (3.4151) grad_norm 1.7633 (1.7546) [2022-01-23 09:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][410/1251] eta 0:31:21 lr 0.000355 time 1.9051 (2.2378) loss 2.9020 (3.4147) grad_norm 1.5413 (1.7549) [2022-01-23 09:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][420/1251] eta 0:31:05 lr 0.000355 time 3.3361 (2.2444) loss 3.7778 (3.4137) grad_norm 1.7944 (1.7545) [2022-01-23 09:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][430/1251] eta 0:30:43 lr 0.000355 time 2.4890 (2.2460) loss 3.5267 (3.4124) grad_norm 1.7828 (1.7545) [2022-01-23 09:02:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][440/1251] eta 0:30:21 lr 0.000355 time 1.9073 (2.2459) loss 3.6023 
(3.4118) grad_norm 1.5837 (1.7526) [2022-01-23 09:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][450/1251] eta 0:29:53 lr 0.000355 time 1.8665 (2.2388) loss 3.6868 (3.4111) grad_norm 1.7657 (1.7513) [2022-01-23 09:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][460/1251] eta 0:29:26 lr 0.000355 time 2.3028 (2.2331) loss 3.8500 (3.4122) grad_norm 2.0429 (1.7526) [2022-01-23 09:03:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][470/1251] eta 0:29:01 lr 0.000355 time 1.4893 (2.2300) loss 3.9079 (3.4113) grad_norm 1.7299 (1.7529) [2022-01-23 09:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][480/1251] eta 0:28:41 lr 0.000355 time 1.5463 (2.2324) loss 2.3311 (3.4084) grad_norm 1.8432 (1.7517) [2022-01-23 09:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][490/1251] eta 0:28:19 lr 0.000355 time 2.4391 (2.2337) loss 3.7225 (3.4087) grad_norm 1.8917 (1.7528) [2022-01-23 09:04:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][500/1251] eta 0:27:55 lr 0.000355 time 2.5752 (2.2311) loss 3.6098 (3.4087) grad_norm 1.5623 (1.7515) [2022-01-23 09:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][510/1251] eta 0:27:31 lr 0.000355 time 2.0044 (2.2285) loss 3.8345 (3.4108) grad_norm 1.8574 (1.7497) [2022-01-23 09:05:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][520/1251] eta 0:27:06 lr 0.000355 time 1.5479 (2.2251) loss 3.7336 (3.4120) grad_norm 1.9808 (1.7488) [2022-01-23 09:05:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][530/1251] eta 0:26:43 lr 0.000355 time 1.2578 (2.2237) loss 2.2941 (3.4098) grad_norm 1.6212 (1.7484) [2022-01-23 09:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][540/1251] eta 0:26:19 lr 0.000355 time 1.8215 (2.2211) loss 4.0770 (3.4112) grad_norm 1.9949 (1.7504) [2022-01-23 09:06:35 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [179/300][550/1251] eta 0:25:57 lr 0.000355 time 1.9167 (2.2221) loss 3.3282 (3.4162) grad_norm 1.6949 (1.7508) [2022-01-23 09:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][560/1251] eta 0:25:37 lr 0.000355 time 2.5070 (2.2250) loss 3.7449 (3.4156) grad_norm 1.7568 (1.7528) [2022-01-23 09:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][570/1251] eta 0:25:15 lr 0.000355 time 1.8939 (2.2253) loss 3.7256 (3.4169) grad_norm 2.1844 (1.7542) [2022-01-23 09:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][580/1251] eta 0:24:54 lr 0.000355 time 2.2374 (2.2279) loss 3.6441 (3.4184) grad_norm 1.6232 (1.7529) [2022-01-23 09:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][590/1251] eta 0:24:31 lr 0.000355 time 1.8586 (2.2259) loss 2.9646 (3.4181) grad_norm 1.6964 (1.7519) [2022-01-23 09:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][600/1251] eta 0:24:07 lr 0.000355 time 1.8781 (2.2240) loss 3.5892 (3.4174) grad_norm 1.5060 (1.7516) [2022-01-23 09:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][610/1251] eta 0:23:44 lr 0.000355 time 1.9964 (2.2229) loss 3.5297 (3.4218) grad_norm 1.9549 (1.7511) [2022-01-23 09:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][620/1251] eta 0:23:20 lr 0.000355 time 2.2777 (2.2199) loss 3.0023 (3.4204) grad_norm 2.0071 (1.7516) [2022-01-23 09:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][630/1251] eta 0:22:58 lr 0.000354 time 1.9889 (2.2192) loss 3.3078 (3.4199) grad_norm 2.2578 (1.7540) [2022-01-23 09:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][640/1251] eta 0:22:35 lr 0.000354 time 2.3371 (2.2185) loss 2.8910 (3.4184) grad_norm 1.9302 (1.7536) [2022-01-23 09:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][650/1251] eta 0:22:13 lr 0.000354 time 1.5910 (2.2181) loss 4.2649 
(3.4210) grad_norm 1.5961 (1.7527) [2022-01-23 09:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][660/1251] eta 0:21:49 lr 0.000354 time 1.6685 (2.2161) loss 2.1750 (3.4194) grad_norm 2.0242 (1.7527) [2022-01-23 09:10:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][670/1251] eta 0:21:27 lr 0.000354 time 1.9791 (2.2167) loss 4.1744 (3.4207) grad_norm 1.7117 (1.7525) [2022-01-23 09:11:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][680/1251] eta 0:21:05 lr 0.000354 time 2.3378 (2.2160) loss 3.0389 (3.4218) grad_norm 1.8409 (1.7522) [2022-01-23 09:11:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][690/1251] eta 0:20:44 lr 0.000354 time 2.3569 (2.2181) loss 2.7637 (3.4207) grad_norm 1.5802 (1.7517) [2022-01-23 09:12:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][700/1251] eta 0:20:22 lr 0.000354 time 1.8968 (2.2183) loss 3.3152 (3.4197) grad_norm 1.7459 (1.7504) [2022-01-23 09:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][710/1251] eta 0:20:00 lr 0.000354 time 2.1979 (2.2187) loss 2.4880 (3.4172) grad_norm 1.6526 (1.7489) [2022-01-23 09:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][720/1251] eta 0:19:37 lr 0.000354 time 1.9721 (2.2170) loss 3.8710 (3.4161) grad_norm 1.6306 (1.7466) [2022-01-23 09:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][730/1251] eta 0:19:14 lr 0.000354 time 1.8467 (2.2156) loss 3.6719 (3.4184) grad_norm 1.6353 (1.7446) [2022-01-23 09:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][740/1251] eta 0:18:51 lr 0.000354 time 2.4691 (2.2149) loss 2.8582 (3.4167) grad_norm 1.4932 (1.7436) [2022-01-23 09:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][750/1251] eta 0:18:30 lr 0.000354 time 2.2667 (2.2160) loss 4.0880 (3.4181) grad_norm 1.8400 (1.7438) [2022-01-23 09:14:15 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [179/300][760/1251] eta 0:18:06 lr 0.000354 time 1.9488 (2.2131) loss 3.2531 (3.4201) grad_norm 1.8212 (1.7448) [2022-01-23 09:14:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][770/1251] eta 0:17:42 lr 0.000354 time 2.1772 (2.2096) loss 2.4146 (3.4183) grad_norm 1.5655 (1.7436) [2022-01-23 09:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][780/1251] eta 0:17:20 lr 0.000354 time 2.1105 (2.2086) loss 2.3812 (3.4194) grad_norm 1.5898 (1.7446) [2022-01-23 09:15:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][790/1251] eta 0:16:58 lr 0.000354 time 2.4351 (2.2095) loss 2.4014 (3.4137) grad_norm 1.6182 (1.7456) [2022-01-23 09:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][800/1251] eta 0:16:37 lr 0.000354 time 1.9769 (2.2111) loss 3.7889 (3.4156) grad_norm 1.5618 (1.7455) [2022-01-23 09:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][810/1251] eta 0:16:15 lr 0.000354 time 2.2973 (2.2110) loss 3.3328 (3.4136) grad_norm 2.0483 (1.7447) [2022-01-23 09:16:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][820/1251] eta 0:15:53 lr 0.000354 time 2.5343 (2.2115) loss 3.5660 (3.4132) grad_norm 1.6513 (1.7441) [2022-01-23 09:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][830/1251] eta 0:15:31 lr 0.000354 time 2.2320 (2.2133) loss 3.0706 (3.4149) grad_norm 1.7428 (1.7433) [2022-01-23 09:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][840/1251] eta 0:15:09 lr 0.000354 time 1.7315 (2.2126) loss 3.5881 (3.4144) grad_norm 1.5680 (1.7441) [2022-01-23 09:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][850/1251] eta 0:14:46 lr 0.000354 time 1.5396 (2.2114) loss 2.9408 (3.4119) grad_norm 1.6491 (1.7438) [2022-01-23 09:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][860/1251] eta 0:14:23 lr 0.000354 time 1.7420 (2.2084) loss 2.9191 
(3.4110) grad_norm 1.8118 (1.7428) [2022-01-23 09:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][870/1251] eta 0:14:01 lr 0.000354 time 1.9441 (2.2079) loss 2.3469 (3.4076) grad_norm 1.6729 (1.7420) [2022-01-23 09:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][880/1251] eta 0:13:39 lr 0.000353 time 1.9249 (2.2077) loss 4.2842 (3.4091) grad_norm 1.8284 (1.7425) [2022-01-23 09:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][890/1251] eta 0:13:17 lr 0.000353 time 1.9736 (2.2079) loss 3.0266 (3.4095) grad_norm 1.8415 (1.7421) [2022-01-23 09:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][900/1251] eta 0:12:55 lr 0.000353 time 1.9970 (2.2095) loss 3.7584 (3.4072) grad_norm 1.7169 (1.7419) [2022-01-23 09:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][910/1251] eta 0:12:33 lr 0.000353 time 2.1888 (2.2097) loss 2.7593 (3.4068) grad_norm 1.6860 (1.7410) [2022-01-23 09:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][920/1251] eta 0:12:10 lr 0.000353 time 2.2034 (2.2083) loss 2.4896 (3.4049) grad_norm 1.8490 (1.7402) [2022-01-23 09:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][930/1251] eta 0:11:48 lr 0.000353 time 1.8714 (2.2068) loss 3.6596 (3.4046) grad_norm 1.5890 (1.7399) [2022-01-23 09:20:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][940/1251] eta 0:11:25 lr 0.000353 time 1.9140 (2.2057) loss 3.3244 (3.4020) grad_norm 1.9858 (1.7403) [2022-01-23 09:21:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][950/1251] eta 0:11:03 lr 0.000353 time 2.1308 (2.2046) loss 3.2841 (3.4022) grad_norm 1.8265 (1.7411) [2022-01-23 09:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][960/1251] eta 0:10:42 lr 0.000353 time 2.9505 (2.2068) loss 2.7422 (3.4028) grad_norm 1.5109 (1.7406) [2022-01-23 09:21:53 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [179/300][970/1251] eta 0:10:19 lr 0.000353 time 1.9251 (2.2063) loss 3.0319 (3.4032) grad_norm 1.7475 (1.7406) [2022-01-23 09:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][980/1251] eta 0:09:58 lr 0.000353 time 2.7504 (2.2076) loss 3.7074 (3.4068) grad_norm 1.6789 (1.7399) [2022-01-23 09:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][990/1251] eta 0:09:36 lr 0.000353 time 2.0267 (2.2076) loss 3.6557 (3.4096) grad_norm 1.6925 (1.7388) [2022-01-23 09:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1000/1251] eta 0:09:13 lr 0.000353 time 2.1856 (2.2069) loss 3.6596 (3.4085) grad_norm 1.5517 (1.7383) [2022-01-23 09:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1010/1251] eta 0:08:51 lr 0.000353 time 1.8310 (2.2042) loss 2.6220 (3.4052) grad_norm 1.5385 (1.7374) [2022-01-23 09:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1020/1251] eta 0:08:28 lr 0.000353 time 1.9363 (2.2023) loss 1.9905 (3.4041) grad_norm 1.7006 (1.7379) [2022-01-23 09:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1030/1251] eta 0:08:06 lr 0.000353 time 1.9314 (2.2004) loss 2.4591 (3.4051) grad_norm 2.1751 (1.7381) [2022-01-23 09:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1040/1251] eta 0:07:44 lr 0.000353 time 2.8392 (2.2009) loss 2.8635 (3.4008) grad_norm 1.7144 (1.7383) [2022-01-23 09:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1050/1251] eta 0:07:22 lr 0.000353 time 1.5663 (2.2015) loss 3.9373 (3.4026) grad_norm 1.5648 (1.7376) [2022-01-23 09:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1060/1251] eta 0:07:00 lr 0.000353 time 2.3788 (2.2026) loss 3.7144 (3.4033) grad_norm 1.5836 (1.7372) [2022-01-23 09:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1070/1251] eta 0:06:38 lr 0.000353 time 1.9211 (2.2042) loss 
3.4642 (3.4039) grad_norm 1.5165 (1.7369) [2022-01-23 09:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1080/1251] eta 0:06:17 lr 0.000353 time 2.6090 (2.2075) loss 2.6389 (3.4040) grad_norm 1.6226 (1.7361) [2022-01-23 09:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1090/1251] eta 0:05:55 lr 0.000353 time 2.0956 (2.2074) loss 3.3080 (3.4041) grad_norm 1.8432 (1.7370) [2022-01-23 09:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1100/1251] eta 0:05:33 lr 0.000353 time 1.8736 (2.2059) loss 4.1272 (3.4056) grad_norm 1.7385 (1.7367) [2022-01-23 09:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1110/1251] eta 0:05:10 lr 0.000353 time 1.9330 (2.2037) loss 4.0480 (3.4072) grad_norm 1.8250 (1.7372) [2022-01-23 09:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1120/1251] eta 0:04:48 lr 0.000353 time 1.9136 (2.2021) loss 4.1420 (3.4068) grad_norm 1.8222 (1.7372) [2022-01-23 09:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1130/1251] eta 0:04:26 lr 0.000353 time 1.8342 (2.2005) loss 3.6375 (3.4056) grad_norm 1.7422 (1.7365) [2022-01-23 09:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1140/1251] eta 0:04:04 lr 0.000352 time 2.2753 (2.1992) loss 3.8807 (3.4057) grad_norm 1.5881 (1.7358) [2022-01-23 09:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1150/1251] eta 0:03:42 lr 0.000352 time 2.1466 (2.1986) loss 3.7086 (3.4066) grad_norm 1.4858 (1.7357) [2022-01-23 09:28:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1160/1251] eta 0:03:20 lr 0.000352 time 5.2908 (2.2028) loss 3.3762 (3.4058) grad_norm 1.6397 (1.7352) [2022-01-23 09:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1170/1251] eta 0:02:58 lr 0.000352 time 4.1055 (2.2074) loss 3.8337 (3.4042) grad_norm 1.9408 (1.7352) [2022-01-23 09:29:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1180/1251] eta 0:02:36 lr 0.000352 time 1.9569 (2.2067) loss 2.9872 (3.4042) grad_norm 1.5363 (1.7350) [2022-01-23 09:29:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1190/1251] eta 0:02:14 lr 0.000352 time 1.8437 (2.2057) loss 3.4741 (3.4039) grad_norm 1.5235 (1.7348) [2022-01-23 09:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1200/1251] eta 0:01:52 lr 0.000352 time 2.1869 (2.2048) loss 3.5000 (3.4053) grad_norm 1.4603 (1.7342) [2022-01-23 09:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1210/1251] eta 0:01:30 lr 0.000352 time 2.8307 (2.2032) loss 2.8239 (3.4039) grad_norm 1.5221 (1.7337) [2022-01-23 09:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1220/1251] eta 0:01:08 lr 0.000352 time 1.8809 (2.2027) loss 3.8029 (3.4060) grad_norm 1.5932 (1.7334) [2022-01-23 09:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1230/1251] eta 0:00:46 lr 0.000352 time 2.2347 (2.2027) loss 2.5902 (3.4032) grad_norm 1.5426 (1.7329) [2022-01-23 09:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1240/1251] eta 0:00:24 lr 0.000352 time 1.6770 (2.2032) loss 2.3716 (3.4031) grad_norm 1.6804 (1.7330) [2022-01-23 09:32:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [179/300][1250/1251] eta 0:00:02 lr 0.000352 time 1.1565 (2.1979) loss 3.4294 (3.4040) grad_norm 1.7610 (1.7337) [2022-01-23 09:32:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 179 training takes 0:45:50 [2022-01-23 09:32:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.510 (18.510) Loss 1.0461 (1.0461) Acc@1 75.391 (75.391) Acc@5 92.773 (92.773) [2022-01-23 09:32:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.041 (3.596) Loss 0.9965 (0.9772) Acc@1 76.660 (77.069) Acc@5 92.969 (93.608) [2022-01-23 09:32:55 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.593 (2.565) Loss 0.9610 (0.9725) Acc@1 76.465 (77.162) Acc@5 94.434 (93.871) [2022-01-23 09:33:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.918 (2.303) Loss 0.9397 (0.9602) Acc@1 78.320 (77.482) Acc@5 94.531 (93.989) [2022-01-23 09:33:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.321 (2.200) Loss 1.0851 (0.9678) Acc@1 73.340 (77.203) Acc@5 92.383 (93.936) [2022-01-23 09:33:38 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.260 Acc@5 93.968 [2022-01-23 09:33:38 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-01-23 09:33:38 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.44% [2022-01-23 09:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][0/1251] eta 7:29:42 lr 0.000352 time 21.5687 (21.5687) loss 3.4673 (3.4673) grad_norm 1.7734 (1.7734) [2022-01-23 09:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][10/1251] eta 1:27:19 lr 0.000352 time 2.8226 (4.2219) loss 3.7473 (3.3331) grad_norm 1.7377 (1.8143) [2022-01-23 09:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][20/1251] eta 1:05:47 lr 0.000352 time 1.3064 (3.2071) loss 2.4458 (3.3474) grad_norm 1.9886 (1.8293) [2022-01-23 09:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][30/1251] eta 0:58:36 lr 0.000352 time 1.5082 (2.8801) loss 3.3241 (3.4317) grad_norm 1.9506 (1.7971) [2022-01-23 09:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][40/1251] eta 0:55:11 lr 0.000352 time 3.7520 (2.7347) loss 3.5394 (3.4629) grad_norm 1.4454 (1.7773) [2022-01-23 09:35:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][50/1251] eta 0:52:31 lr 0.000352 time 1.5260 (2.6242) loss 2.7637 (3.4166) grad_norm 1.6016 (1.7656) [2022-01-23 09:36:15 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [180/300][60/1251] eta 0:51:04 lr 0.000352 time 1.8587 (2.5732) loss 4.1014 (3.3868) grad_norm 1.5369 (1.7498) [2022-01-23 09:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][70/1251] eta 0:49:54 lr 0.000352 time 1.5206 (2.5359) loss 3.6688 (3.4126) grad_norm 2.0350 (1.7489) [2022-01-23 09:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][80/1251] eta 0:49:00 lr 0.000352 time 3.1018 (2.5108) loss 3.8034 (3.4131) grad_norm 1.8195 (1.7399) [2022-01-23 09:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][90/1251] eta 0:47:45 lr 0.000352 time 1.8701 (2.4680) loss 3.1762 (3.3925) grad_norm 2.0285 (1.7378) [2022-01-23 09:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][100/1251] eta 0:46:38 lr 0.000352 time 1.9223 (2.4314) loss 2.8741 (3.4004) grad_norm 1.5466 (1.7368) [2022-01-23 09:38:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][110/1251] eta 0:45:39 lr 0.000352 time 1.9612 (2.4007) loss 3.9843 (3.3952) grad_norm 1.8982 (1.7413) [2022-01-23 09:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][120/1251] eta 0:44:47 lr 0.000352 time 2.7939 (2.3760) loss 2.9960 (3.3745) grad_norm 1.9316 (1.7434) [2022-01-23 09:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][130/1251] eta 0:43:51 lr 0.000352 time 1.8879 (2.3477) loss 2.9981 (3.3689) grad_norm 1.9462 (1.7455) [2022-01-23 09:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][140/1251] eta 0:43:30 lr 0.000351 time 3.0646 (2.3498) loss 4.1375 (3.3692) grad_norm 1.4700 (1.7403) [2022-01-23 09:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][150/1251] eta 0:42:52 lr 0.000351 time 2.2541 (2.3365) loss 2.7197 (3.3707) grad_norm 1.5907 (1.7399) [2022-01-23 09:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][160/1251] eta 0:42:27 lr 0.000351 time 1.5447 (2.3354) loss 3.5158 (3.3632) 
grad_norm 1.8553 (1.7386) [2022-01-23 09:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][170/1251] eta 0:41:54 lr 0.000351 time 1.5842 (2.3258) loss 3.6517 (3.3634) grad_norm 1.6597 (1.7397) [2022-01-23 09:40:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][180/1251] eta 0:41:26 lr 0.000351 time 2.4888 (2.3218) loss 3.8451 (3.3716) grad_norm 2.2677 (1.7431) [2022-01-23 09:41:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][190/1251] eta 0:40:53 lr 0.000351 time 2.2206 (2.3128) loss 3.5014 (3.3732) grad_norm 1.9933 (1.7495) [2022-01-23 09:41:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][200/1251] eta 0:40:25 lr 0.000351 time 1.5144 (2.3080) loss 3.4083 (3.3897) grad_norm 2.0461 (1.7513) [2022-01-23 09:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][210/1251] eta 0:39:58 lr 0.000351 time 1.8149 (2.3039) loss 3.7317 (3.3896) grad_norm 1.5143 (1.7516) [2022-01-23 09:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][220/1251] eta 0:39:35 lr 0.000351 time 2.9899 (2.3038) loss 4.0556 (3.3972) grad_norm 1.9039 (1.7511) [2022-01-23 09:42:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][230/1251] eta 0:38:58 lr 0.000351 time 1.9374 (2.2901) loss 3.7182 (3.3933) grad_norm 2.0998 (1.7501) [2022-01-23 09:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][240/1251] eta 0:38:19 lr 0.000351 time 1.9457 (2.2749) loss 2.3734 (3.3786) grad_norm 1.6034 (1.7489) [2022-01-23 09:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][250/1251] eta 0:37:47 lr 0.000351 time 2.3240 (2.2655) loss 4.2074 (3.3832) grad_norm 1.6191 (1.7472) [2022-01-23 09:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][260/1251] eta 0:37:19 lr 0.000351 time 2.5217 (2.2599) loss 2.9013 (3.3818) grad_norm 1.5764 (1.7521) [2022-01-23 09:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [180/300][270/1251] eta 0:36:50 lr 0.000351 time 2.1467 (2.2534) loss 3.6546 (3.3805) grad_norm 1.8102 (1.7519) [2022-01-23 09:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][280/1251] eta 0:36:31 lr 0.000351 time 2.3348 (2.2564) loss 4.0740 (3.3816) grad_norm 1.6164 (1.7523) [2022-01-23 09:44:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][290/1251] eta 0:36:08 lr 0.000351 time 1.9858 (2.2560) loss 4.1143 (3.3749) grad_norm 1.9213 (1.7567) [2022-01-23 09:44:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][300/1251] eta 0:35:50 lr 0.000351 time 3.4188 (2.2617) loss 3.5858 (3.3796) grad_norm 1.5060 (1.7594) [2022-01-23 09:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][310/1251] eta 0:35:28 lr 0.000351 time 2.2226 (2.2616) loss 3.8372 (3.3766) grad_norm 1.7911 (1.7600) [2022-01-23 09:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][320/1251] eta 0:35:00 lr 0.000351 time 1.8743 (2.2562) loss 2.4790 (3.3642) grad_norm 1.8440 (1.7596) [2022-01-23 09:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][330/1251] eta 0:34:34 lr 0.000351 time 2.2981 (2.2527) loss 3.0048 (3.3774) grad_norm 1.5439 (1.7604) [2022-01-23 09:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][340/1251] eta 0:34:09 lr 0.000351 time 2.9288 (2.2493) loss 3.7591 (3.3784) grad_norm 1.7559 (1.7592) [2022-01-23 09:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][350/1251] eta 0:33:45 lr 0.000351 time 2.2479 (2.2486) loss 3.0985 (3.3828) grad_norm 1.8219 (1.7601) [2022-01-23 09:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][360/1251] eta 0:33:24 lr 0.000351 time 2.4321 (2.2499) loss 3.4256 (3.3910) grad_norm 1.6767 (1.7587) [2022-01-23 09:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][370/1251] eta 0:33:00 lr 0.000351 time 2.5312 (2.2485) loss 3.6960 (3.3913) 
grad_norm 1.7621 (1.7596) [2022-01-23 09:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][380/1251] eta 0:32:36 lr 0.000351 time 3.4055 (2.2463) loss 3.5995 (3.3893) grad_norm 1.7189 (1.7645) [2022-01-23 09:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][390/1251] eta 0:32:10 lr 0.000351 time 2.4636 (2.2420) loss 3.9660 (3.3941) grad_norm 1.9208 (1.7661) [2022-01-23 09:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][400/1251] eta 0:31:45 lr 0.000350 time 1.6954 (2.2397) loss 3.6168 (3.3977) grad_norm 1.5786 (1.7662) [2022-01-23 09:48:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][410/1251] eta 0:31:21 lr 0.000350 time 1.5800 (2.2369) loss 3.5545 (3.3954) grad_norm 1.8255 (1.7639) [2022-01-23 09:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][420/1251] eta 0:31:03 lr 0.000350 time 4.9781 (2.2426) loss 3.9392 (3.3978) grad_norm 1.5171 (1.7623) [2022-01-23 09:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][430/1251] eta 0:30:42 lr 0.000350 time 2.6903 (2.2446) loss 3.9369 (3.4029) grad_norm 2.1497 (1.7630) [2022-01-23 09:50:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][440/1251] eta 0:30:16 lr 0.000350 time 1.6156 (2.2394) loss 3.8097 (3.4003) grad_norm 1.9431 (1.7626) [2022-01-23 09:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][450/1251] eta 0:29:51 lr 0.000350 time 1.8393 (2.2362) loss 2.9949 (3.4052) grad_norm 1.7430 (1.7612) [2022-01-23 09:50:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][460/1251] eta 0:29:27 lr 0.000350 time 2.6576 (2.2340) loss 3.8797 (3.4015) grad_norm 1.7572 (1.7609) [2022-01-23 09:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][470/1251] eta 0:29:01 lr 0.000350 time 2.2627 (2.2295) loss 3.6413 (3.3973) grad_norm 1.6784 (1.7604) [2022-01-23 09:51:31 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [180/300][480/1251] eta 0:28:39 lr 0.000350 time 2.1890 (2.2302) loss 2.4593 (3.3996) grad_norm 1.6864 (1.7604) [2022-01-23 09:51:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][490/1251] eta 0:28:16 lr 0.000350 time 1.8520 (2.2292) loss 2.7970 (3.3989) grad_norm 1.5890 (1.7598) [2022-01-23 09:52:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][500/1251] eta 0:27:54 lr 0.000350 time 2.4492 (2.2294) loss 2.1975 (3.3960) grad_norm 1.6192 (1.7604) [2022-01-23 09:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][510/1251] eta 0:27:30 lr 0.000350 time 2.2039 (2.2278) loss 3.5544 (3.3984) grad_norm 1.9667 (1.7619) [2022-01-23 09:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][520/1251] eta 0:27:09 lr 0.000350 time 2.7837 (2.2285) loss 2.7923 (3.3976) grad_norm 1.6255 (1.7623) [2022-01-23 09:53:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][530/1251] eta 0:26:47 lr 0.000350 time 2.5126 (2.2300) loss 3.4108 (3.3980) grad_norm 1.5443 (1.7593) [2022-01-23 09:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][540/1251] eta 0:26:24 lr 0.000350 time 2.7949 (2.2285) loss 3.5769 (3.3962) grad_norm 1.7958 (1.7601) [2022-01-23 09:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][550/1251] eta 0:26:01 lr 0.000350 time 2.1528 (2.2275) loss 3.5777 (3.4004) grad_norm 1.5222 (1.7576) [2022-01-23 09:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][560/1251] eta 0:25:37 lr 0.000350 time 1.9666 (2.2258) loss 3.6607 (3.3992) grad_norm 1.8856 (1.7588) [2022-01-23 09:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][570/1251] eta 0:25:16 lr 0.000350 time 2.8631 (2.2276) loss 3.5375 (3.4014) grad_norm 1.7907 (1.7589) [2022-01-23 09:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][580/1251] eta 0:24:54 lr 0.000350 time 2.2144 (2.2269) loss 2.7036 (3.4027) 
grad_norm 1.7773 (1.7604) [2022-01-23 09:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][590/1251] eta 0:24:29 lr 0.000350 time 1.9736 (2.2227) loss 2.6730 (3.4036) grad_norm 1.9728 (1.7630) [2022-01-23 09:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][600/1251] eta 0:24:04 lr 0.000350 time 2.1972 (2.2194) loss 3.1601 (3.4055) grad_norm 1.6697 (1.7646) [2022-01-23 09:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][610/1251] eta 0:23:40 lr 0.000350 time 1.6185 (2.2157) loss 4.2405 (3.4051) grad_norm 1.8332 (1.7672) [2022-01-23 09:56:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][620/1251] eta 0:23:18 lr 0.000350 time 2.7585 (2.2155) loss 2.8320 (3.4031) grad_norm 1.6028 (1.7663) [2022-01-23 09:56:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][630/1251] eta 0:22:57 lr 0.000350 time 1.6125 (2.2185) loss 2.2704 (3.4028) grad_norm 1.6576 (1.7663) [2022-01-23 09:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][640/1251] eta 0:22:39 lr 0.000350 time 2.4136 (2.2253) loss 2.5906 (3.4014) grad_norm 1.5734 (1.7649) [2022-01-23 09:57:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][650/1251] eta 0:22:17 lr 0.000349 time 1.8869 (2.2260) loss 3.2511 (3.4043) grad_norm 1.4865 (1.7643) [2022-01-23 09:58:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][660/1251] eta 0:21:56 lr 0.000349 time 3.6477 (2.2273) loss 2.8443 (3.4045) grad_norm 1.5205 (1.7630) [2022-01-23 09:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][670/1251] eta 0:21:31 lr 0.000349 time 1.8878 (2.2236) loss 3.2828 (3.4052) grad_norm 1.5331 (1.7611) [2022-01-23 09:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][680/1251] eta 0:21:07 lr 0.000349 time 2.0423 (2.2190) loss 3.6125 (3.4075) grad_norm 1.8255 (1.7611) [2022-01-23 09:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [180/300][690/1251] eta 0:20:43 lr 0.000349 time 1.4595 (2.2159) loss 3.2571 (3.4095) grad_norm 1.6401 (1.7616) [2022-01-23 09:59:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][700/1251] eta 0:20:20 lr 0.000349 time 2.5968 (2.2154) loss 3.9006 (3.4108) grad_norm 1.7651 (1.7607) [2022-01-23 09:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][710/1251] eta 0:19:57 lr 0.000349 time 1.5715 (2.2143) loss 3.8936 (3.4101) grad_norm 1.6806 (1.7605) [2022-01-23 10:00:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][720/1251] eta 0:19:35 lr 0.000349 time 2.8437 (2.2145) loss 4.3448 (3.4137) grad_norm 1.8371 (1.7595) [2022-01-23 10:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][730/1251] eta 0:19:15 lr 0.000349 time 1.6228 (2.2173) loss 3.8933 (3.4130) grad_norm 1.5676 (1.7597) [2022-01-23 10:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][740/1251] eta 0:18:53 lr 0.000349 time 2.7929 (2.2178) loss 4.1004 (3.4134) grad_norm 1.5912 (1.7600) [2022-01-23 10:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][750/1251] eta 0:18:30 lr 0.000349 time 1.8587 (2.2164) loss 3.2847 (3.4135) grad_norm 1.6124 (1.7601) [2022-01-23 10:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][760/1251] eta 0:18:08 lr 0.000349 time 1.6887 (2.2171) loss 3.8239 (3.4124) grad_norm 1.6909 (1.7600) [2022-01-23 10:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][770/1251] eta 0:17:46 lr 0.000349 time 1.5324 (2.2162) loss 4.0624 (3.4178) grad_norm 1.7022 (1.7590) [2022-01-23 10:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][780/1251] eta 0:17:23 lr 0.000349 time 2.1068 (2.2163) loss 3.9685 (3.4204) grad_norm 2.0209 (1.7591) [2022-01-23 10:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][790/1251] eta 0:17:01 lr 0.000349 time 2.6177 (2.2161) loss 3.0482 (3.4194) 
grad_norm 1.6420 (1.7599) [2022-01-23 10:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][800/1251] eta 0:16:38 lr 0.000349 time 2.2904 (2.2140) loss 3.2022 (3.4168) grad_norm 1.6127 (1.7597) [2022-01-23 10:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][810/1251] eta 0:16:15 lr 0.000349 time 1.8868 (2.2122) loss 3.8341 (3.4170) grad_norm 1.7755 (1.7596) [2022-01-23 10:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][820/1251] eta 0:15:53 lr 0.000349 time 2.3604 (2.2128) loss 3.5694 (3.4181) grad_norm 1.9024 (1.7589) [2022-01-23 10:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][830/1251] eta 0:15:31 lr 0.000349 time 2.4611 (2.2125) loss 2.6660 (3.4184) grad_norm 1.8396 (1.7575) [2022-01-23 10:04:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][840/1251] eta 0:15:08 lr 0.000349 time 1.8526 (2.2113) loss 3.4987 (3.4159) grad_norm 1.7366 (1.7570) [2022-01-23 10:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][850/1251] eta 0:14:47 lr 0.000349 time 2.2942 (2.2125) loss 3.1530 (3.4187) grad_norm 1.9262 (1.7574) [2022-01-23 10:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][860/1251] eta 0:14:25 lr 0.000349 time 3.1370 (2.2135) loss 2.7718 (3.4142) grad_norm 1.5679 (1.7573) [2022-01-23 10:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][870/1251] eta 0:14:03 lr 0.000349 time 2.4575 (2.2131) loss 3.5739 (3.4137) grad_norm 1.7720 (1.7573) [2022-01-23 10:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][880/1251] eta 0:13:40 lr 0.000349 time 1.6327 (2.2117) loss 3.7280 (3.4180) grad_norm 2.4147 (1.7581) [2022-01-23 10:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][890/1251] eta 0:13:18 lr 0.000349 time 1.9581 (2.2110) loss 2.6599 (3.4194) grad_norm 1.5890 (1.7575) [2022-01-23 10:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [180/300][900/1251] eta 0:12:56 lr 0.000348 time 2.8203 (2.2118) loss 4.0868 (3.4202) grad_norm 1.7642 (1.7576) [2022-01-23 10:07:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][910/1251] eta 0:12:34 lr 0.000348 time 2.2783 (2.2122) loss 3.5060 (3.4210) grad_norm 1.7556 (1.7592) [2022-01-23 10:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][920/1251] eta 0:12:11 lr 0.000348 time 2.0575 (2.2099) loss 3.9970 (3.4211) grad_norm 1.6034 (1.7589) [2022-01-23 10:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][930/1251] eta 0:11:48 lr 0.000348 time 1.9575 (2.2078) loss 2.3684 (3.4206) grad_norm 1.5780 (1.7582) [2022-01-23 10:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][940/1251] eta 0:11:26 lr 0.000348 time 2.2071 (2.2064) loss 2.9381 (3.4213) grad_norm 1.7016 (1.7590) [2022-01-23 10:08:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][950/1251] eta 0:11:04 lr 0.000348 time 2.1994 (2.2066) loss 3.2616 (3.4159) grad_norm 1.4908 (1.7593) [2022-01-23 10:08:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][960/1251] eta 0:10:42 lr 0.000348 time 2.1433 (2.2063) loss 2.5333 (3.4169) grad_norm 1.6573 (1.7600) [2022-01-23 10:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][970/1251] eta 0:10:20 lr 0.000348 time 2.8032 (2.2068) loss 2.4865 (3.4149) grad_norm 1.8126 (1.7612) [2022-01-23 10:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][980/1251] eta 0:09:58 lr 0.000348 time 2.7820 (2.2072) loss 3.3941 (3.4172) grad_norm 1.6838 (1.7607) [2022-01-23 10:10:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][990/1251] eta 0:09:35 lr 0.000348 time 2.5979 (2.2068) loss 3.7358 (3.4170) grad_norm 1.8371 (1.7607) [2022-01-23 10:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1000/1251] eta 0:09:14 lr 0.000348 time 1.6878 (2.2081) loss 3.4910 (3.4146) 
grad_norm 1.6058 (1.7604) [2022-01-23 10:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1010/1251] eta 0:08:52 lr 0.000348 time 2.3565 (2.2086) loss 3.2337 (3.4160) grad_norm 1.6581 (1.7617) [2022-01-23 10:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1020/1251] eta 0:08:30 lr 0.000348 time 3.7184 (2.2094) loss 3.2428 (3.4154) grad_norm 1.8261 (1.7622) [2022-01-23 10:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1030/1251] eta 0:08:08 lr 0.000348 time 2.1980 (2.2088) loss 3.4394 (3.4148) grad_norm 1.8434 (1.7622) [2022-01-23 10:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1040/1251] eta 0:07:45 lr 0.000348 time 2.0332 (2.2070) loss 3.7085 (3.4147) grad_norm 1.9234 (1.7618) [2022-01-23 10:12:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1050/1251] eta 0:07:23 lr 0.000348 time 2.2059 (2.2051) loss 4.1460 (3.4151) grad_norm 1.6544 (1.7618) [2022-01-23 10:12:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1060/1251] eta 0:07:01 lr 0.000348 time 2.7340 (2.2050) loss 3.1672 (3.4159) grad_norm 1.6135 (1.7610) [2022-01-23 10:13:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1070/1251] eta 0:06:39 lr 0.000348 time 2.0971 (2.2050) loss 3.4807 (3.4159) grad_norm 1.9755 (1.7614) [2022-01-23 10:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1080/1251] eta 0:06:16 lr 0.000348 time 1.6613 (2.2037) loss 2.8107 (3.4157) grad_norm 1.7002 (1.7613) [2022-01-23 10:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1090/1251] eta 0:05:54 lr 0.000348 time 1.5312 (2.2040) loss 3.6695 (3.4167) grad_norm 1.8349 (1.7608) [2022-01-23 10:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1100/1251] eta 0:05:32 lr 0.000348 time 2.3817 (2.2041) loss 3.8845 (3.4177) grad_norm 1.6724 (1.7607) [2022-01-23 10:14:28 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [180/300][1110/1251] eta 0:05:10 lr 0.000348 time 2.4822 (2.2050) loss 3.5481 (3.4184) grad_norm 1.7518 (1.7600) [2022-01-23 10:14:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1120/1251] eta 0:04:48 lr 0.000348 time 1.8025 (2.2054) loss 2.6900 (3.4171) grad_norm 1.8001 (1.7605) [2022-01-23 10:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1130/1251] eta 0:04:26 lr 0.000348 time 1.7409 (2.2046) loss 3.7839 (3.4190) grad_norm 1.4429 (1.7601) [2022-01-23 10:15:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1140/1251] eta 0:04:04 lr 0.000348 time 2.1509 (2.2054) loss 2.8478 (3.4195) grad_norm 1.9298 (1.7598) [2022-01-23 10:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1150/1251] eta 0:03:42 lr 0.000348 time 2.0940 (2.2074) loss 3.5417 (3.4190) grad_norm 1.8435 (1.7603) [2022-01-23 10:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1160/1251] eta 0:03:20 lr 0.000347 time 2.1659 (2.2065) loss 3.2932 (3.4178) grad_norm 1.7923 (1.7596) [2022-01-23 10:16:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1170/1251] eta 0:02:58 lr 0.000347 time 1.9012 (2.2050) loss 3.5404 (3.4179) grad_norm 1.7089 (1.7594) [2022-01-23 10:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1180/1251] eta 0:02:36 lr 0.000347 time 1.9015 (2.2042) loss 3.8465 (3.4185) grad_norm 1.6721 (1.7594) [2022-01-23 10:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1190/1251] eta 0:02:14 lr 0.000347 time 1.9831 (2.2032) loss 3.7754 (3.4176) grad_norm 1.8464 (1.7605) [2022-01-23 10:17:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1200/1251] eta 0:01:52 lr 0.000347 time 2.2185 (2.2014) loss 4.1573 (3.4191) grad_norm 1.8855 (1.7604) [2022-01-23 10:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1210/1251] eta 0:01:30 lr 0.000347 time 1.8511 (2.2001) loss 
4.0422 (3.4192) grad_norm 1.5182 (1.7600) [2022-01-23 10:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1220/1251] eta 0:01:08 lr 0.000347 time 2.4923 (2.2004) loss 3.9322 (3.4216) grad_norm 1.6229 (1.7593) [2022-01-23 10:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1230/1251] eta 0:00:46 lr 0.000347 time 1.9214 (2.2002) loss 4.0858 (3.4232) grad_norm 1.6944 (1.7590) [2022-01-23 10:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1240/1251] eta 0:00:24 lr 0.000347 time 2.0296 (2.2001) loss 3.7419 (3.4242) grad_norm 1.8062 (1.7591) [2022-01-23 10:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [180/300][1250/1251] eta 0:00:02 lr 0.000347 time 1.1361 (2.1948) loss 3.4826 (3.4257) grad_norm 1.5118 (1.7588) [2022-01-23 10:19:24 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 180 training takes 0:45:46 [2022-01-23 10:19:24 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_180 saving...... [2022-01-23 10:19:36 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_180 saved !!! 
[2022-01-23 10:19:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 11.712 (11.712) Loss 0.9382 (0.9382) Acc@1 78.809 (78.809) Acc@5 94.336 (94.336) [2022-01-23 10:20:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.311 (2.888) Loss 0.9914 (0.9723) Acc@1 76.270 (77.486) Acc@5 93.457 (93.830) [2022-01-23 10:20:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.294 (2.144) Loss 1.0055 (0.9740) Acc@1 77.148 (77.125) Acc@5 93.555 (93.750) [2022-01-23 10:20:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 5.273 (2.037) Loss 0.9319 (0.9675) Acc@1 77.832 (77.315) Acc@5 95.117 (93.841) [2022-01-23 10:20:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.192 (1.908) Loss 0.9668 (0.9621) Acc@1 78.320 (77.525) Acc@5 93.945 (93.891) [2022-01-23 10:21:02 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.510 Acc@5 93.960 [2022-01-23 10:21:02 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.5% [2022-01-23 10:21:02 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.51% [2022-01-23 10:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][0/1251] eta 7:31:40 lr 0.000347 time 21.6634 (21.6634) loss 3.7377 (3.7377) grad_norm 1.8894 (1.8894) [2022-01-23 10:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][10/1251] eta 1:23:13 lr 0.000347 time 1.7662 (4.0237) loss 3.4957 (3.5600) grad_norm 1.5371 (1.7804) [2022-01-23 10:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][20/1251] eta 1:02:36 lr 0.000347 time 1.5367 (3.0519) loss 3.7261 (3.6071) grad_norm 1.9850 (1.8702) [2022-01-23 10:22:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][30/1251] eta 0:55:19 lr 0.000347 time 1.7839 (2.7189) loss 3.9238 (3.6155) grad_norm 1.7460 (1.8838) [2022-01-23 10:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[181/300][40/1251] eta 0:54:39 lr 0.000347 time 6.8835 (2.7082) loss 2.0187 (3.5521) grad_norm 1.5398 (1.8918) [2022-01-23 10:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][50/1251] eta 0:52:12 lr 0.000347 time 1.8670 (2.6078) loss 3.6244 (3.5747) grad_norm 1.5871 (1.8565) [2022-01-23 10:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][60/1251] eta 0:50:09 lr 0.000347 time 1.6251 (2.5271) loss 4.1327 (3.5672) grad_norm 1.7109 (1.8276) [2022-01-23 10:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][70/1251] eta 0:48:46 lr 0.000347 time 1.5544 (2.4780) loss 2.8920 (3.5677) grad_norm 1.6606 (1.8381) [2022-01-23 10:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][80/1251] eta 0:48:06 lr 0.000347 time 3.5596 (2.4653) loss 4.4239 (3.5173) grad_norm 1.5961 (1.8285) [2022-01-23 10:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][90/1251] eta 0:47:22 lr 0.000347 time 2.0736 (2.4486) loss 3.7337 (3.5271) grad_norm 1.5812 (1.8083) [2022-01-23 10:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][100/1251] eta 0:46:17 lr 0.000347 time 1.9056 (2.4130) loss 2.5317 (3.5213) grad_norm 1.7483 (1.8018) [2022-01-23 10:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][110/1251] eta 0:45:18 lr 0.000347 time 1.5231 (2.3822) loss 3.8525 (3.5016) grad_norm 1.5609 (1.7911) [2022-01-23 10:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][120/1251] eta 0:44:41 lr 0.000347 time 2.8179 (2.3706) loss 3.5106 (3.4971) grad_norm 1.6119 (1.7813) [2022-01-23 10:26:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][130/1251] eta 0:44:01 lr 0.000347 time 2.0404 (2.3563) loss 3.8676 (3.4792) grad_norm 2.0205 (1.7797) [2022-01-23 10:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][140/1251] eta 0:43:16 lr 0.000347 time 2.1225 (2.3374) loss 3.7990 (3.4816) grad_norm 1.7944 
(1.7842) [2022-01-23 10:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][150/1251] eta 0:42:54 lr 0.000347 time 1.8582 (2.3383) loss 2.9779 (3.4843) grad_norm 1.6638 (1.7805) [2022-01-23 10:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][160/1251] eta 0:42:29 lr 0.000346 time 3.0694 (2.3365) loss 3.7911 (3.4912) grad_norm 1.4617 (1.7724) [2022-01-23 10:27:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][170/1251] eta 0:41:53 lr 0.000346 time 1.8856 (2.3251) loss 3.4604 (3.4923) grad_norm 1.6158 (1.7705) [2022-01-23 10:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][180/1251] eta 0:41:15 lr 0.000346 time 1.8606 (2.3110) loss 3.4092 (3.4942) grad_norm 1.8796 (1.7722) [2022-01-23 10:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][190/1251] eta 0:40:39 lr 0.000346 time 2.2617 (2.2992) loss 3.6752 (3.4827) grad_norm 1.7319 (1.7717) [2022-01-23 10:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][200/1251] eta 0:40:05 lr 0.000346 time 1.9769 (2.2884) loss 4.0838 (3.4913) grad_norm 1.8751 (1.7750) [2022-01-23 10:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][210/1251] eta 0:39:29 lr 0.000346 time 1.8129 (2.2757) loss 3.0691 (3.4936) grad_norm 1.4443 (1.7704) [2022-01-23 10:29:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][220/1251] eta 0:39:02 lr 0.000346 time 2.1344 (2.2719) loss 2.3650 (3.4950) grad_norm 1.8854 (1.7691) [2022-01-23 10:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][230/1251] eta 0:38:37 lr 0.000346 time 2.1930 (2.2696) loss 4.0855 (3.4753) grad_norm 1.9612 (1.7679) [2022-01-23 10:30:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][240/1251] eta 0:38:15 lr 0.000346 time 2.3717 (2.2709) loss 2.5162 (3.4770) grad_norm 1.7611 (1.7672) [2022-01-23 10:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[181/300][250/1251] eta 0:37:53 lr 0.000346 time 2.1314 (2.2709) loss 2.2578 (3.4754) grad_norm 1.8024 (1.7638) [2022-01-23 10:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][260/1251] eta 0:37:30 lr 0.000346 time 2.0101 (2.2712) loss 3.7503 (3.4655) grad_norm 1.5792 (1.7695) [2022-01-23 10:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][270/1251] eta 0:37:02 lr 0.000346 time 1.9350 (2.2657) loss 3.3601 (3.4601) grad_norm 1.5270 (1.7717) [2022-01-23 10:31:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][280/1251] eta 0:36:37 lr 0.000346 time 2.3196 (2.2634) loss 3.7587 (3.4587) grad_norm 1.8944 (1.7692) [2022-01-23 10:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][290/1251] eta 0:36:09 lr 0.000346 time 1.5299 (2.2571) loss 3.7296 (3.4632) grad_norm 1.9420 (1.7718) [2022-01-23 10:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][300/1251] eta 0:35:42 lr 0.000346 time 1.9378 (2.2532) loss 2.4105 (3.4589) grad_norm 1.7923 (1.7756) [2022-01-23 10:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][310/1251] eta 0:35:20 lr 0.000346 time 1.9195 (2.2532) loss 3.6463 (3.4546) grad_norm 1.7246 (1.7746) [2022-01-23 10:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][320/1251] eta 0:34:58 lr 0.000346 time 1.7619 (2.2535) loss 3.7119 (3.4500) grad_norm 1.7165 (1.7720) [2022-01-23 10:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][330/1251] eta 0:34:33 lr 0.000346 time 1.5772 (2.2514) loss 3.5816 (3.4476) grad_norm 1.8481 (1.7699) [2022-01-23 10:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][340/1251] eta 0:34:11 lr 0.000346 time 2.4274 (2.2517) loss 3.2706 (3.4422) grad_norm 1.5129 (1.7664) [2022-01-23 10:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][350/1251] eta 0:33:43 lr 0.000346 time 1.9197 (2.2461) loss 2.7162 (3.4404) grad_norm 
1.7991 (1.7657) [2022-01-23 10:34:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][360/1251] eta 0:33:16 lr 0.000346 time 1.5959 (2.2411) loss 3.7690 (3.4320) grad_norm 1.6502 (1.7662) [2022-01-23 10:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][370/1251] eta 0:32:51 lr 0.000346 time 1.7377 (2.2379) loss 3.9911 (3.4333) grad_norm 1.6517 (1.7673) [2022-01-23 10:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][380/1251] eta 0:32:30 lr 0.000346 time 2.5827 (2.2390) loss 2.9454 (3.4323) grad_norm 1.7835 (1.7669) [2022-01-23 10:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][390/1251] eta 0:32:04 lr 0.000346 time 1.6009 (2.2351) loss 3.7236 (3.4296) grad_norm 2.6411 (1.7715) [2022-01-23 10:35:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][400/1251] eta 0:31:38 lr 0.000346 time 1.8697 (2.2311) loss 3.1175 (3.4276) grad_norm 1.7256 (1.7731) [2022-01-23 10:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][410/1251] eta 0:31:18 lr 0.000346 time 1.8894 (2.2337) loss 2.6352 (3.4318) grad_norm 1.9821 (1.7726) [2022-01-23 10:36:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][420/1251] eta 0:30:57 lr 0.000345 time 2.1421 (2.2357) loss 3.8937 (3.4309) grad_norm 1.7834 (1.7701) [2022-01-23 10:37:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][430/1251] eta 0:30:34 lr 0.000345 time 1.8947 (2.2340) loss 3.4872 (3.4281) grad_norm 1.6724 (1.7672) [2022-01-23 10:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][440/1251] eta 0:30:10 lr 0.000345 time 1.9026 (2.2322) loss 3.7372 (3.4314) grad_norm 1.7231 (1.7641) [2022-01-23 10:37:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][450/1251] eta 0:29:46 lr 0.000345 time 2.0224 (2.2302) loss 2.7826 (3.4333) grad_norm 1.6310 (1.7628) [2022-01-23 10:38:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[181/300][460/1251] eta 0:29:19 lr 0.000345 time 1.7115 (2.2246) loss 4.2214 (3.4354) grad_norm 1.9245 (1.7627) [2022-01-23 10:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][470/1251] eta 0:28:55 lr 0.000345 time 1.9929 (2.2223) loss 3.6478 (3.4378) grad_norm 1.7679 (1.7627) [2022-01-23 10:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][480/1251] eta 0:28:34 lr 0.000345 time 2.6516 (2.2234) loss 4.3277 (3.4410) grad_norm 1.6128 (1.7626) [2022-01-23 10:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][490/1251] eta 0:28:11 lr 0.000345 time 1.7920 (2.2231) loss 3.9224 (3.4449) grad_norm 1.7987 (1.7623) [2022-01-23 10:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][500/1251] eta 0:27:47 lr 0.000345 time 1.7357 (2.2202) loss 3.6054 (3.4466) grad_norm 1.6077 (1.7630) [2022-01-23 10:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][510/1251] eta 0:27:23 lr 0.000345 time 2.5542 (2.2183) loss 3.7993 (3.4450) grad_norm 1.7576 (1.7634) [2022-01-23 10:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][520/1251] eta 0:27:00 lr 0.000345 time 1.8561 (2.2168) loss 3.9246 (3.4440) grad_norm 2.0307 (1.7656) [2022-01-23 10:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][530/1251] eta 0:26:38 lr 0.000345 time 1.6184 (2.2165) loss 3.9834 (3.4496) grad_norm 1.5892 (1.7654) [2022-01-23 10:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][540/1251] eta 0:26:16 lr 0.000345 time 2.5248 (2.2166) loss 3.6431 (3.4535) grad_norm 1.5592 (1.7680) [2022-01-23 10:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][550/1251] eta 0:25:54 lr 0.000345 time 1.8017 (2.2178) loss 2.6392 (3.4496) grad_norm 1.6253 (1.7661) [2022-01-23 10:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][560/1251] eta 0:25:35 lr 0.000345 time 2.7140 (2.2215) loss 3.5406 (3.4481) grad_norm 
1.8812 (1.7672) [2022-01-23 10:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][570/1251] eta 0:25:10 lr 0.000345 time 1.8407 (2.2183) loss 2.5645 (3.4478) grad_norm 1.6110 (1.7709) [2022-01-23 10:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][580/1251] eta 0:24:47 lr 0.000345 time 1.5915 (2.2164) loss 2.6357 (3.4446) grad_norm 1.5772 (1.7710) [2022-01-23 10:42:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][590/1251] eta 0:24:22 lr 0.000345 time 1.9206 (2.2133) loss 3.1982 (3.4438) grad_norm 1.7189 (1.7718) [2022-01-23 10:43:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][600/1251] eta 0:24:01 lr 0.000345 time 2.9961 (2.2147) loss 3.8875 (3.4454) grad_norm 1.6378 (1.7740) [2022-01-23 10:43:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][610/1251] eta 0:23:38 lr 0.000345 time 1.5005 (2.2136) loss 2.8358 (3.4450) grad_norm 1.9389 (1.7738) [2022-01-23 10:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][620/1251] eta 0:23:16 lr 0.000345 time 2.5042 (2.2129) loss 3.4350 (3.4465) grad_norm 1.9327 (1.7733) [2022-01-23 10:44:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][630/1251] eta 0:22:53 lr 0.000345 time 2.2337 (2.2113) loss 3.7440 (3.4478) grad_norm 2.3119 (1.7737) [2022-01-23 10:44:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][640/1251] eta 0:22:30 lr 0.000345 time 1.9802 (2.2101) loss 3.3913 (3.4444) grad_norm 1.7207 (1.7723) [2022-01-23 10:44:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][650/1251] eta 0:22:06 lr 0.000345 time 1.6036 (2.2077) loss 3.5353 (3.4427) grad_norm 1.6600 (1.7721) [2022-01-23 10:45:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][660/1251] eta 0:21:43 lr 0.000345 time 2.2186 (2.2062) loss 2.9680 (3.4446) grad_norm 1.8773 (1.7734) [2022-01-23 10:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[181/300][670/1251] eta 0:21:21 lr 0.000344 time 2.3056 (2.2055) loss 3.1404 (3.4462) grad_norm 1.7725 (1.7740) [2022-01-23 10:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][680/1251] eta 0:20:58 lr 0.000344 time 2.1873 (2.2049) loss 2.8522 (3.4491) grad_norm 1.8873 (1.7747) [2022-01-23 10:46:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][690/1251] eta 0:20:37 lr 0.000344 time 1.8954 (2.2064) loss 3.5572 (3.4467) grad_norm 1.6634 (1.7731) [2022-01-23 10:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][700/1251] eta 0:20:15 lr 0.000344 time 2.4161 (2.2066) loss 3.7696 (3.4447) grad_norm 2.0116 (1.7716) [2022-01-23 10:47:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][710/1251] eta 0:19:53 lr 0.000344 time 3.3112 (2.2068) loss 4.1201 (3.4453) grad_norm 1.6307 (1.7709) [2022-01-23 10:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][720/1251] eta 0:19:31 lr 0.000344 time 2.1509 (2.2063) loss 3.1922 (3.4410) grad_norm 1.6913 (1.7710) [2022-01-23 10:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][730/1251] eta 0:19:09 lr 0.000344 time 1.9763 (2.2070) loss 2.6127 (3.4402) grad_norm 1.4733 (1.7715) [2022-01-23 10:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][740/1251] eta 0:18:49 lr 0.000344 time 2.5140 (2.2100) loss 3.6026 (3.4384) grad_norm 1.7417 (1.7717) [2022-01-23 10:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][750/1251] eta 0:18:28 lr 0.000344 time 2.8691 (2.2122) loss 3.7005 (3.4366) grad_norm 1.8195 (1.7727) [2022-01-23 10:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][760/1251] eta 0:18:05 lr 0.000344 time 1.6314 (2.2116) loss 2.4331 (3.4339) grad_norm 1.8327 (1.7742) [2022-01-23 10:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][770/1251] eta 0:17:42 lr 0.000344 time 1.8069 (2.2095) loss 4.0379 (3.4351) grad_norm 
1.6618 (1.7749) [2022-01-23 10:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][780/1251] eta 0:17:19 lr 0.000344 time 1.7016 (2.2070) loss 3.8394 (3.4352) grad_norm 1.7714 (1.7756) [2022-01-23 10:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][790/1251] eta 0:16:57 lr 0.000344 time 3.1731 (2.2066) loss 3.7942 (3.4353) grad_norm 1.4297 (1.7770) [2022-01-23 10:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][800/1251] eta 0:16:34 lr 0.000344 time 1.6729 (2.2042) loss 3.6364 (3.4352) grad_norm 1.7267 (1.7764) [2022-01-23 10:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][810/1251] eta 0:16:12 lr 0.000344 time 1.9046 (2.2044) loss 3.5442 (3.4344) grad_norm 1.5622 (1.7769) [2022-01-23 10:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][820/1251] eta 0:15:50 lr 0.000344 time 2.6182 (2.2042) loss 3.8171 (3.4339) grad_norm 1.6586 (1.7769) [2022-01-23 10:51:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][830/1251] eta 0:15:28 lr 0.000344 time 2.5587 (2.2046) loss 3.4360 (3.4322) grad_norm 1.6829 (1.7754) [2022-01-23 10:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][840/1251] eta 0:15:06 lr 0.000344 time 1.8686 (2.2054) loss 3.6648 (3.4328) grad_norm 1.6235 (1.7743) [2022-01-23 10:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][850/1251] eta 0:14:44 lr 0.000344 time 2.8480 (2.2063) loss 4.0681 (3.4329) grad_norm 1.5817 (1.7731) [2022-01-23 10:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][860/1251] eta 0:14:22 lr 0.000344 time 2.8129 (2.2069) loss 3.2886 (3.4319) grad_norm 1.7674 (1.7732) [2022-01-23 10:53:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][870/1251] eta 0:14:00 lr 0.000344 time 1.8622 (2.2058) loss 3.8142 (3.4339) grad_norm 1.7657 (1.7725) [2022-01-23 10:53:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[181/300][880/1251] eta 0:13:37 lr 0.000344 time 1.8608 (2.2041) loss 3.4152 (3.4337) grad_norm 1.8582 (1.7726) [2022-01-23 10:53:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][890/1251] eta 0:13:15 lr 0.000344 time 3.0689 (2.2039) loss 3.1194 (3.4320) grad_norm 1.8375 (1.7716) [2022-01-23 10:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][900/1251] eta 0:12:53 lr 0.000344 time 3.0081 (2.2031) loss 3.7938 (3.4289) grad_norm 1.7503 (1.7711) [2022-01-23 10:54:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][910/1251] eta 0:12:31 lr 0.000344 time 1.9170 (2.2028) loss 3.2615 (3.4267) grad_norm 1.5284 (1.7693) [2022-01-23 10:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][920/1251] eta 0:12:08 lr 0.000344 time 1.7077 (2.2013) loss 3.5303 (3.4276) grad_norm 1.4117 (1.7682) [2022-01-23 10:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][930/1251] eta 0:11:46 lr 0.000343 time 3.3167 (2.2022) loss 2.9007 (3.4247) grad_norm 1.8626 (1.7682) [2022-01-23 10:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][940/1251] eta 0:11:25 lr 0.000343 time 2.6265 (2.2027) loss 3.6262 (3.4243) grad_norm 1.7677 (1.7684) [2022-01-23 10:55:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][950/1251] eta 0:11:03 lr 0.000343 time 1.7814 (2.2031) loss 3.8644 (3.4266) grad_norm 1.8516 (1.7679) [2022-01-23 10:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][960/1251] eta 0:10:41 lr 0.000343 time 1.7575 (2.2032) loss 3.6172 (3.4272) grad_norm 1.6499 (1.7679) [2022-01-23 10:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][970/1251] eta 0:10:18 lr 0.000343 time 3.3291 (2.2028) loss 3.3074 (3.4283) grad_norm 1.6857 (1.7666) [2022-01-23 10:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][980/1251] eta 0:09:56 lr 0.000343 time 2.3353 (2.2018) loss 3.6060 (3.4267) grad_norm 
1.8675 (1.7666) [2022-01-23 10:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][990/1251] eta 0:09:35 lr 0.000343 time 2.2702 (2.2032) loss 3.7674 (3.4264) grad_norm 1.7084 (1.7665) [2022-01-23 10:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1000/1251] eta 0:09:12 lr 0.000343 time 1.9997 (2.2025) loss 2.9561 (3.4241) grad_norm 2.0377 (1.7662) [2022-01-23 10:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1010/1251] eta 0:08:50 lr 0.000343 time 2.7622 (2.2016) loss 2.7534 (3.4219) grad_norm 1.5919 (1.7648) [2022-01-23 10:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1020/1251] eta 0:08:28 lr 0.000343 time 1.8060 (2.2021) loss 3.3930 (3.4224) grad_norm 1.9289 (1.7641) [2022-01-23 10:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1030/1251] eta 0:08:06 lr 0.000343 time 1.9088 (2.2007) loss 3.3302 (3.4219) grad_norm 1.8337 (1.7631) [2022-01-23 10:59:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1040/1251] eta 0:07:43 lr 0.000343 time 2.0122 (2.1983) loss 3.9495 (3.4234) grad_norm 1.8689 (1.7626) [2022-01-23 10:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1050/1251] eta 0:07:21 lr 0.000343 time 2.2124 (2.1986) loss 3.9659 (3.4252) grad_norm 1.6635 (1.7632) [2022-01-23 10:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1060/1251] eta 0:07:00 lr 0.000343 time 2.1568 (2.1990) loss 3.7224 (3.4271) grad_norm 1.7198 (1.7637) [2022-01-23 11:00:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1070/1251] eta 0:06:38 lr 0.000343 time 2.4505 (2.1991) loss 3.7118 (3.4260) grad_norm 1.5155 (1.7624) [2022-01-23 11:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1080/1251] eta 0:06:16 lr 0.000343 time 2.3265 (2.1992) loss 2.3531 (3.4256) grad_norm 1.8845 (1.7616) [2022-01-23 11:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [181/300][1090/1251] eta 0:05:54 lr 0.000343 time 2.2217 (2.1988) loss 3.8401 (3.4255) grad_norm 1.5002 (1.7614) [2022-01-23 11:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1100/1251] eta 0:05:32 lr 0.000343 time 1.5778 (2.1989) loss 2.8386 (3.4271) grad_norm 1.6048 (1.7609) [2022-01-23 11:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1110/1251] eta 0:05:10 lr 0.000343 time 2.3627 (2.1989) loss 4.0689 (3.4261) grad_norm 1.8620 (1.7605) [2022-01-23 11:02:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1120/1251] eta 0:04:47 lr 0.000343 time 1.9823 (2.1975) loss 3.1926 (3.4280) grad_norm 1.5652 (1.7604) [2022-01-23 11:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1130/1251] eta 0:04:25 lr 0.000343 time 1.6869 (2.1967) loss 2.7755 (3.4259) grad_norm 1.7096 (1.7596) [2022-01-23 11:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1140/1251] eta 0:04:03 lr 0.000343 time 2.1420 (2.1976) loss 3.8762 (3.4264) grad_norm 1.6533 (1.7587) [2022-01-23 11:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1150/1251] eta 0:03:42 lr 0.000343 time 1.9363 (2.1982) loss 3.5663 (3.4256) grad_norm 1.7477 (1.7585) [2022-01-23 11:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1160/1251] eta 0:03:20 lr 0.000343 time 2.1381 (2.1991) loss 3.4793 (3.4256) grad_norm 1.6224 (1.7582) [2022-01-23 11:03:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1170/1251] eta 0:02:58 lr 0.000343 time 1.5824 (2.1987) loss 2.6566 (3.4267) grad_norm 1.6878 (1.7576) [2022-01-23 11:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1180/1251] eta 0:02:36 lr 0.000342 time 1.7151 (2.1973) loss 3.5469 (3.4270) grad_norm 1.7751 (1.7591) [2022-01-23 11:04:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1190/1251] eta 0:02:13 lr 0.000342 time 2.2453 (2.1966) loss 3.5887 
(3.4264) grad_norm 1.8003 (1.7604) [2022-01-23 11:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1200/1251] eta 0:01:52 lr 0.000342 time 2.2236 (2.1962) loss 2.6467 (3.4259) grad_norm 2.0327 (1.7612) [2022-01-23 11:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1210/1251] eta 0:01:30 lr 0.000342 time 1.6294 (2.1955) loss 2.3796 (3.4229) grad_norm 1.7981 (1.7616) [2022-01-23 11:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1220/1251] eta 0:01:08 lr 0.000342 time 1.9626 (2.1957) loss 2.6269 (3.4239) grad_norm 1.7779 (1.7612) [2022-01-23 11:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1230/1251] eta 0:00:46 lr 0.000342 time 1.7415 (2.1958) loss 3.4445 (3.4236) grad_norm 2.6147 (1.7614) [2022-01-23 11:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1240/1251] eta 0:00:24 lr 0.000342 time 1.8583 (2.1962) loss 4.0528 (3.4243) grad_norm 1.7865 (1.7623) [2022-01-23 11:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [181/300][1250/1251] eta 0:00:02 lr 0.000342 time 1.3580 (2.1913) loss 3.4475 (3.4242) grad_norm 2.1420 (1.7623) [2022-01-23 11:06:43 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 181 training takes 0:45:41 [2022-01-23 11:07:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.310 (19.310) Loss 0.9862 (0.9862) Acc@1 76.270 (76.270) Acc@5 93.164 (93.164) [2022-01-23 11:07:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.246 (3.534) Loss 0.9373 (0.9518) Acc@1 78.516 (77.646) Acc@5 94.434 (93.857) [2022-01-23 11:07:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.900 (2.767) Loss 0.9177 (0.9621) Acc@1 77.637 (77.251) Acc@5 95.703 (93.983) [2022-01-23 11:07:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.276 (2.262) Loss 1.0735 (0.9688) Acc@1 76.074 (77.268) Acc@5 92.578 (93.898) [2022-01-23 11:08:13 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.480 (2.197) Loss 0.9384 (0.9682) Acc@1 77.539 (77.303) Acc@5 94.238 (93.862) [2022-01-23 11:08:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.326 Acc@5 93.876 [2022-01-23 11:08:20 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.3% [2022-01-23 11:08:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.51% [2022-01-23 11:08:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][0/1251] eta 7:25:11 lr 0.000342 time 21.3519 (21.3519) loss 2.3975 (2.3975) grad_norm 1.8036 (1.8036) [2022-01-23 11:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][10/1251] eta 1:25:50 lr 0.000342 time 2.7696 (4.1500) loss 2.4092 (2.9888) grad_norm 1.6279 (1.7147) [2022-01-23 11:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][20/1251] eta 1:04:34 lr 0.000342 time 1.2047 (3.1475) loss 3.6178 (3.3087) grad_norm 1.5906 (1.7055) [2022-01-23 11:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][30/1251] eta 0:58:06 lr 0.000342 time 1.8537 (2.8551) loss 3.6531 (3.2930) grad_norm 1.6950 (1.7262) [2022-01-23 11:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][40/1251] eta 0:54:59 lr 0.000342 time 3.7400 (2.7246) loss 3.3237 (3.3049) grad_norm 1.6035 (1.7387) [2022-01-23 11:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][50/1251] eta 0:53:28 lr 0.000342 time 2.7479 (2.6712) loss 3.7628 (3.3408) grad_norm 1.5720 (1.7490) [2022-01-23 11:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][60/1251] eta 0:52:00 lr 0.000342 time 2.3507 (2.6201) loss 3.4402 (3.3413) grad_norm 1.7433 (1.7376) [2022-01-23 11:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][70/1251] eta 0:50:13 lr 0.000342 time 1.5766 (2.5513) loss 3.6270 (3.3317) grad_norm 1.6061 (1.7553) [2022-01-23 11:11:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][80/1251] eta 0:48:54 lr 0.000342 time 2.5827 (2.5057) loss 3.7029 (3.3520) grad_norm 1.8632 (1.7612) [2022-01-23 11:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][90/1251] eta 0:47:42 lr 0.000342 time 1.8727 (2.4659) loss 3.8704 (3.3488) grad_norm 1.9099 (1.7623) [2022-01-23 11:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][100/1251] eta 0:46:38 lr 0.000342 time 3.0914 (2.4312) loss 3.6791 (3.3702) grad_norm 1.8027 (1.7859) [2022-01-23 11:12:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][110/1251] eta 0:45:31 lr 0.000342 time 1.6382 (2.3937) loss 3.8878 (3.3741) grad_norm 1.5100 (1.7851) [2022-01-23 11:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][120/1251] eta 0:44:33 lr 0.000342 time 1.9855 (2.3642) loss 2.2872 (3.3754) grad_norm 2.1627 (1.7822) [2022-01-23 11:13:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][130/1251] eta 0:43:41 lr 0.000342 time 1.9117 (2.3388) loss 4.1176 (3.3742) grad_norm 2.6539 (1.7954) [2022-01-23 11:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][140/1251] eta 0:43:10 lr 0.000342 time 2.3251 (2.3313) loss 3.6922 (3.3772) grad_norm 1.9176 (1.7941) [2022-01-23 11:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][150/1251] eta 0:42:29 lr 0.000342 time 2.2113 (2.3153) loss 2.7019 (3.3937) grad_norm 1.6268 (1.7945) [2022-01-23 11:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][160/1251] eta 0:42:01 lr 0.000342 time 2.1005 (2.3109) loss 3.5060 (3.3803) grad_norm 1.5442 (1.7896) [2022-01-23 11:14:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][170/1251] eta 0:41:35 lr 0.000342 time 2.2143 (2.3086) loss 3.8982 (3.3776) grad_norm 1.7314 (1.7893) [2022-01-23 11:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][180/1251] eta 0:41:07 lr 0.000342 
time 2.1803 (2.3036) loss 2.9959 (3.3725) grad_norm 1.6393 (1.7855) [2022-01-23 11:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][190/1251] eta 0:40:42 lr 0.000341 time 2.4553 (2.3023) loss 4.0271 (3.3905) grad_norm 1.9400 (1.7817) [2022-01-23 11:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][200/1251] eta 0:40:19 lr 0.000341 time 1.8296 (2.3021) loss 3.6093 (3.3829) grad_norm 1.5517 (1.7765) [2022-01-23 11:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][210/1251] eta 0:39:52 lr 0.000341 time 1.8788 (2.2982) loss 2.5276 (3.3791) grad_norm 1.5963 (1.7756) [2022-01-23 11:16:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][220/1251] eta 0:39:19 lr 0.000341 time 1.8522 (2.2882) loss 3.5781 (3.3934) grad_norm 1.5081 (1.7748) [2022-01-23 11:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][230/1251] eta 0:38:52 lr 0.000341 time 2.6109 (2.2848) loss 3.6366 (3.3990) grad_norm 1.8746 (1.7786) [2022-01-23 11:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][240/1251] eta 0:38:22 lr 0.000341 time 2.4816 (2.2773) loss 3.8949 (3.3895) grad_norm 1.9061 (1.7856) [2022-01-23 11:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][250/1251] eta 0:37:50 lr 0.000341 time 2.4452 (2.2678) loss 3.6269 (3.3992) grad_norm 1.6870 (1.7876) [2022-01-23 11:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][260/1251] eta 0:37:19 lr 0.000341 time 2.1116 (2.2600) loss 3.3101 (3.4073) grad_norm 1.8589 (1.7888) [2022-01-23 11:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][270/1251] eta 0:37:01 lr 0.000341 time 3.0966 (2.2643) loss 2.9177 (3.4002) grad_norm 1.6237 (1.7880) [2022-01-23 11:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][280/1251] eta 0:36:35 lr 0.000341 time 2.5302 (2.2606) loss 3.3661 (3.3895) grad_norm 1.6988 (1.7903) [2022-01-23 11:19:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][290/1251] eta 0:36:09 lr 0.000341 time 1.8007 (2.2576) loss 2.9856 (3.3968) grad_norm 1.7738 (1.7913) [2022-01-23 11:19:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][300/1251] eta 0:35:48 lr 0.000341 time 2.6516 (2.2591) loss 4.1107 (3.4016) grad_norm 1.7474 (1.7941) [2022-01-23 11:20:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][310/1251] eta 0:35:25 lr 0.000341 time 2.4936 (2.2589) loss 2.9549 (3.4084) grad_norm 1.9186 (1.7920) [2022-01-23 11:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][320/1251] eta 0:35:02 lr 0.000341 time 2.3132 (2.2586) loss 3.7164 (3.4119) grad_norm 1.5392 (1.7902) [2022-01-23 11:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][330/1251] eta 0:34:37 lr 0.000341 time 1.8809 (2.2555) loss 3.5487 (3.4200) grad_norm 1.7804 (1.7873) [2022-01-23 11:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][340/1251] eta 0:34:11 lr 0.000341 time 2.5035 (2.2520) loss 3.8934 (3.4264) grad_norm 2.1207 (1.7899) [2022-01-23 11:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][350/1251] eta 0:33:43 lr 0.000341 time 1.6176 (2.2460) loss 4.2934 (3.4368) grad_norm 2.0996 (1.7905) [2022-01-23 11:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][360/1251] eta 0:33:23 lr 0.000341 time 2.5204 (2.2483) loss 3.3108 (3.4393) grad_norm 1.9020 (1.7898) [2022-01-23 11:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][370/1251] eta 0:32:56 lr 0.000341 time 2.0907 (2.2430) loss 3.3684 (3.4402) grad_norm 1.5318 (1.7878) [2022-01-23 11:22:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][380/1251] eta 0:32:31 lr 0.000341 time 1.8550 (2.2407) loss 3.8846 (3.4445) grad_norm 1.8887 (1.7886) [2022-01-23 11:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][390/1251] eta 0:32:07 lr 
0.000341 time 1.8926 (2.2390) loss 3.5934 (3.4448) grad_norm 1.7768 (1.7875) [2022-01-23 11:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][400/1251] eta 0:31:41 lr 0.000341 time 1.9163 (2.2348) loss 3.9276 (3.4427) grad_norm 1.8749 (1.7893) [2022-01-23 11:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][410/1251] eta 0:31:20 lr 0.000341 time 2.7499 (2.2360) loss 3.8852 (3.4448) grad_norm 1.8507 (1.7872) [2022-01-23 11:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][420/1251] eta 0:30:59 lr 0.000341 time 2.7577 (2.2375) loss 3.4561 (3.4495) grad_norm 1.6330 (1.7848) [2022-01-23 11:24:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][430/1251] eta 0:30:35 lr 0.000341 time 1.5217 (2.2353) loss 2.6517 (3.4352) grad_norm 1.7387 (1.7831) [2022-01-23 11:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][440/1251] eta 0:30:13 lr 0.000340 time 1.9203 (2.2356) loss 3.4149 (3.4359) grad_norm 1.8551 (1.7822) [2022-01-23 11:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][450/1251] eta 0:29:50 lr 0.000340 time 2.4930 (2.2349) loss 3.9406 (3.4337) grad_norm 1.9648 (1.7828) [2022-01-23 11:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][460/1251] eta 0:29:27 lr 0.000340 time 2.3217 (2.2348) loss 3.5223 (3.4305) grad_norm 1.9075 (1.7805) [2022-01-23 11:25:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][470/1251] eta 0:29:00 lr 0.000340 time 1.6617 (2.2290) loss 3.5183 (3.4353) grad_norm 1.7814 (1.7810) [2022-01-23 11:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][480/1251] eta 0:28:35 lr 0.000340 time 1.9158 (2.2253) loss 2.5642 (3.4315) grad_norm 1.8071 (1.7797) [2022-01-23 11:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][490/1251] eta 0:28:13 lr 0.000340 time 2.1449 (2.2257) loss 3.7103 (3.4282) grad_norm 1.7781 (1.7785) [2022-01-23 11:26:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][500/1251] eta 0:27:51 lr 0.000340 time 2.2100 (2.2252) loss 3.6140 (3.4328) grad_norm 2.1854 (1.7825) [2022-01-23 11:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][510/1251] eta 0:27:29 lr 0.000340 time 2.1947 (2.2264) loss 4.0371 (3.4360) grad_norm 2.0314 (1.7834) [2022-01-23 11:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][520/1251] eta 0:27:12 lr 0.000340 time 2.3943 (2.2333) loss 2.5408 (3.4343) grad_norm 1.6440 (1.7830) [2022-01-23 11:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][530/1251] eta 0:26:47 lr 0.000340 time 1.9261 (2.2299) loss 3.7111 (3.4382) grad_norm 1.7362 (1.7806) [2022-01-23 11:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][540/1251] eta 0:26:23 lr 0.000340 time 1.9141 (2.2268) loss 2.8128 (3.4362) grad_norm 1.6095 (1.7803) [2022-01-23 11:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][550/1251] eta 0:25:56 lr 0.000340 time 1.7777 (2.2204) loss 3.1505 (3.4366) grad_norm 1.6630 (1.7783) [2022-01-23 11:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][560/1251] eta 0:25:30 lr 0.000340 time 1.5898 (2.2152) loss 4.0489 (3.4393) grad_norm 1.5864 (1.7778) [2022-01-23 11:29:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][570/1251] eta 0:25:06 lr 0.000340 time 1.8133 (2.2123) loss 4.0655 (3.4412) grad_norm 1.9350 (1.7787) [2022-01-23 11:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][580/1251] eta 0:24:45 lr 0.000340 time 1.9490 (2.2135) loss 2.7193 (3.4417) grad_norm 1.4242 (1.7768) [2022-01-23 11:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][590/1251] eta 0:24:25 lr 0.000340 time 2.1639 (2.2164) loss 3.5026 (3.4386) grad_norm 2.0662 (1.7760) [2022-01-23 11:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][600/1251] eta 0:24:03 lr 
0.000340 time 2.0170 (2.2174) loss 3.8062 (3.4360) grad_norm 1.8060 (1.7767) [2022-01-23 11:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][610/1251] eta 0:23:40 lr 0.000340 time 1.9893 (2.2159) loss 3.7014 (3.4388) grad_norm 1.7062 (1.7764) [2022-01-23 11:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][620/1251] eta 0:23:18 lr 0.000340 time 2.1752 (2.2156) loss 2.7121 (3.4361) grad_norm 2.9273 (1.7791) [2022-01-23 11:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][630/1251] eta 0:22:57 lr 0.000340 time 2.4499 (2.2174) loss 2.4465 (3.4342) grad_norm 1.9467 (1.7801) [2022-01-23 11:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][640/1251] eta 0:22:33 lr 0.000340 time 1.5568 (2.2159) loss 3.6832 (3.4346) grad_norm 1.5673 (1.7800) [2022-01-23 11:32:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][650/1251] eta 0:22:10 lr 0.000340 time 1.9152 (2.2140) loss 3.2073 (3.4342) grad_norm 1.5727 (1.7792) [2022-01-23 11:32:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][660/1251] eta 0:21:48 lr 0.000340 time 1.8516 (2.2142) loss 3.7604 (3.4356) grad_norm 1.8550 (1.7788) [2022-01-23 11:33:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][670/1251] eta 0:21:26 lr 0.000340 time 2.5724 (2.2148) loss 3.9215 (3.4384) grad_norm 1.4653 (1.7785) [2022-01-23 11:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][680/1251] eta 0:21:05 lr 0.000340 time 2.4771 (2.2171) loss 3.1360 (3.4365) grad_norm 1.7225 (1.7770) [2022-01-23 11:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][690/1251] eta 0:20:44 lr 0.000340 time 1.8769 (2.2178) loss 3.4235 (3.4326) grad_norm 1.9500 (1.7780) [2022-01-23 11:34:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][700/1251] eta 0:20:20 lr 0.000339 time 1.7723 (2.2151) loss 3.5797 (3.4337) grad_norm 1.9146 (1.7795) [2022-01-23 11:34:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][710/1251] eta 0:19:57 lr 0.000339 time 1.9643 (2.2127) loss 2.7382 (3.4321) grad_norm 1.5606 (1.7783) [2022-01-23 11:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][720/1251] eta 0:19:33 lr 0.000339 time 2.3725 (2.2101) loss 4.1716 (3.4319) grad_norm 1.9464 (1.7780) [2022-01-23 11:35:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][730/1251] eta 0:19:11 lr 0.000339 time 2.1176 (2.2108) loss 2.4067 (3.4297) grad_norm 1.8464 (1.7789) [2022-01-23 11:35:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][740/1251] eta 0:18:51 lr 0.000339 time 2.4746 (2.2136) loss 4.0755 (3.4277) grad_norm 1.6781 (1.7794) [2022-01-23 11:36:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][750/1251] eta 0:18:29 lr 0.000339 time 1.9287 (2.2141) loss 2.6367 (3.4278) grad_norm 2.1203 (1.7799) [2022-01-23 11:36:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][760/1251] eta 0:18:07 lr 0.000339 time 2.6003 (2.2158) loss 3.0390 (3.4257) grad_norm 1.6806 (1.7780) [2022-01-23 11:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][770/1251] eta 0:17:44 lr 0.000339 time 1.6889 (2.2128) loss 2.5644 (3.4224) grad_norm 1.6086 (1.7772) [2022-01-23 11:37:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][780/1251] eta 0:17:20 lr 0.000339 time 1.6944 (2.2100) loss 3.8444 (3.4213) grad_norm 1.8070 (1.7775) [2022-01-23 11:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][790/1251] eta 0:16:57 lr 0.000339 time 2.2506 (2.2071) loss 3.7214 (3.4175) grad_norm 1.6494 (1.7766) [2022-01-23 11:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][800/1251] eta 0:16:35 lr 0.000339 time 2.0374 (2.2081) loss 2.4304 (3.4163) grad_norm 1.6301 (1.7764) [2022-01-23 11:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][810/1251] eta 0:16:14 lr 
0.000339 time 1.9291 (2.2089) loss 3.4344 (3.4160) grad_norm 1.8267 (1.7771) [2022-01-23 11:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][820/1251] eta 0:15:53 lr 0.000339 time 1.9882 (2.2124) loss 3.8136 (3.4163) grad_norm 1.7602 (1.7776) [2022-01-23 11:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][830/1251] eta 0:15:31 lr 0.000339 time 1.5896 (2.2121) loss 3.3683 (3.4162) grad_norm 1.8540 (1.7783) [2022-01-23 11:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][840/1251] eta 0:15:09 lr 0.000339 time 1.9363 (2.2134) loss 4.0792 (3.4196) grad_norm 1.7830 (1.7780) [2022-01-23 11:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][850/1251] eta 0:14:46 lr 0.000339 time 1.8968 (2.2111) loss 3.8830 (3.4154) grad_norm 2.0508 (1.7775) [2022-01-23 11:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][860/1251] eta 0:14:24 lr 0.000339 time 2.2249 (2.2101) loss 2.7735 (3.4154) grad_norm 1.8867 (1.7788) [2022-01-23 11:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][870/1251] eta 0:14:01 lr 0.000339 time 1.9480 (2.2094) loss 3.7740 (3.4156) grad_norm 1.7617 (1.7783) [2022-01-23 11:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][880/1251] eta 0:13:40 lr 0.000339 time 2.2772 (2.2123) loss 3.4732 (3.4148) grad_norm 2.0650 (1.7797) [2022-01-23 11:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][890/1251] eta 0:13:18 lr 0.000339 time 1.5503 (2.2113) loss 4.0818 (3.4167) grad_norm 1.6058 (1.7795) [2022-01-23 11:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][900/1251] eta 0:12:55 lr 0.000339 time 2.1762 (2.2098) loss 3.3423 (3.4151) grad_norm 1.7497 (1.7792) [2022-01-23 11:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][910/1251] eta 0:12:32 lr 0.000339 time 1.9443 (2.2077) loss 3.9868 (3.4183) grad_norm 1.6031 (1.7788) [2022-01-23 11:42:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][920/1251] eta 0:12:10 lr 0.000339 time 1.9171 (2.2082) loss 3.3309 (3.4162) grad_norm 1.9045 (1.7786) [2022-01-23 11:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][930/1251] eta 0:11:48 lr 0.000339 time 2.1176 (2.2070) loss 3.0501 (3.4188) grad_norm 1.9419 (1.7783) [2022-01-23 11:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][940/1251] eta 0:11:27 lr 0.000339 time 1.8314 (2.2092) loss 3.5687 (3.4206) grad_norm 1.9052 (1.7779) [2022-01-23 11:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][950/1251] eta 0:11:04 lr 0.000338 time 2.5229 (2.2086) loss 2.7889 (3.4184) grad_norm 1.4634 (1.7769) [2022-01-23 11:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][960/1251] eta 0:10:42 lr 0.000338 time 1.6437 (2.2087) loss 3.4068 (3.4176) grad_norm 1.6854 (1.7765) [2022-01-23 11:44:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][970/1251] eta 0:10:20 lr 0.000338 time 2.5292 (2.2084) loss 3.5817 (3.4186) grad_norm 1.8569 (1.7776) [2022-01-23 11:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][980/1251] eta 0:09:58 lr 0.000338 time 1.7124 (2.2073) loss 3.5110 (3.4191) grad_norm 1.5843 (1.7767) [2022-01-23 11:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][990/1251] eta 0:09:35 lr 0.000338 time 1.9236 (2.2059) loss 3.9700 (3.4184) grad_norm 1.6976 (1.7767) [2022-01-23 11:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1000/1251] eta 0:09:13 lr 0.000338 time 1.9345 (2.2058) loss 3.5646 (3.4191) grad_norm 1.4579 (1.7760) [2022-01-23 11:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1010/1251] eta 0:08:51 lr 0.000338 time 2.1803 (2.2070) loss 3.7883 (3.4200) grad_norm 1.8298 (1.7758) [2022-01-23 11:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1020/1251] eta 0:08:29 lr 
0.000338 time 1.5627 (2.2072) loss 3.0520 (3.4181) grad_norm 2.0393 (1.7752) [2022-01-23 11:46:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1030/1251] eta 0:08:07 lr 0.000338 time 1.8338 (2.2069) loss 4.1673 (3.4179) grad_norm 2.2514 (1.7761) [2022-01-23 11:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1040/1251] eta 0:07:45 lr 0.000338 time 2.2478 (2.2072) loss 3.8332 (3.4176) grad_norm 1.9296 (1.7761) [2022-01-23 11:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1050/1251] eta 0:07:23 lr 0.000338 time 1.9189 (2.2052) loss 3.8749 (3.4195) grad_norm 1.8245 (1.7764) [2022-01-23 11:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1060/1251] eta 0:07:00 lr 0.000338 time 1.9407 (2.2037) loss 3.0720 (3.4207) grad_norm 1.8251 (1.7772) [2022-01-23 11:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1070/1251] eta 0:06:38 lr 0.000338 time 1.7782 (2.2038) loss 4.0251 (3.4227) grad_norm 1.7378 (1.7769) [2022-01-23 11:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1080/1251] eta 0:06:17 lr 0.000338 time 2.2369 (2.2053) loss 3.4722 (3.4238) grad_norm 1.8473 (1.7766) [2022-01-23 11:48:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1090/1251] eta 0:05:55 lr 0.000338 time 2.0654 (2.2057) loss 3.7910 (3.4236) grad_norm 1.9708 (1.7776) [2022-01-23 11:48:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1100/1251] eta 0:05:33 lr 0.000338 time 2.0535 (2.2072) loss 3.7023 (3.4254) grad_norm 1.7970 (1.7772) [2022-01-23 11:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1110/1251] eta 0:05:11 lr 0.000338 time 1.6904 (2.2060) loss 2.7054 (3.4233) grad_norm 1.8025 (1.7771) [2022-01-23 11:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1120/1251] eta 0:04:49 lr 0.000338 time 1.7886 (2.2064) loss 3.0900 (3.4245) grad_norm 1.7125 (1.7771) [2022-01-23 
11:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1130/1251] eta 0:04:26 lr 0.000338 time 1.8735 (2.2050) loss 3.2000 (3.4235) grad_norm 1.6514 (1.7773) [2022-01-23 11:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1140/1251] eta 0:04:04 lr 0.000338 time 1.9632 (2.2051) loss 2.8176 (3.4227) grad_norm 1.7648 (1.7784) [2022-01-23 11:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1150/1251] eta 0:03:42 lr 0.000338 time 1.8505 (2.2046) loss 3.9290 (3.4237) grad_norm 1.7783 (1.7775) [2022-01-23 11:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1160/1251] eta 0:03:20 lr 0.000338 time 1.9365 (2.2058) loss 2.6735 (3.4226) grad_norm 1.6756 (1.7763) [2022-01-23 11:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1170/1251] eta 0:02:58 lr 0.000338 time 2.1346 (2.2046) loss 3.5847 (3.4240) grad_norm 1.5653 (1.7747) [2022-01-23 11:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1180/1251] eta 0:02:36 lr 0.000338 time 2.2096 (2.2037) loss 2.7863 (3.4242) grad_norm 1.7992 (1.7749) [2022-01-23 11:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1190/1251] eta 0:02:14 lr 0.000338 time 2.5215 (2.2043) loss 3.7699 (3.4233) grad_norm 1.8188 (1.7753) [2022-01-23 11:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1200/1251] eta 0:01:52 lr 0.000338 time 1.8971 (2.2045) loss 3.5444 (3.4212) grad_norm 1.9080 (1.7752) [2022-01-23 11:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1210/1251] eta 0:01:30 lr 0.000337 time 1.7062 (2.2033) loss 3.6043 (3.4191) grad_norm 1.7600 (1.7747) [2022-01-23 11:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1220/1251] eta 0:01:08 lr 0.000337 time 2.0242 (2.2030) loss 2.8053 (3.4192) grad_norm 1.7204 (1.7742) [2022-01-23 11:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1230/1251] 
eta 0:00:46 lr 0.000337 time 2.0487 (2.2034) loss 4.0198 (3.4201) grad_norm 1.7734 (1.7736) [2022-01-23 11:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1240/1251] eta 0:00:24 lr 0.000337 time 1.4422 (2.2038) loss 3.0422 (3.4200) grad_norm 1.7023 (1.7737) [2022-01-23 11:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [182/300][1250/1251] eta 0:00:02 lr 0.000337 time 1.1953 (2.1989) loss 2.3009 (3.4183) grad_norm 1.9093 (1.7735) [2022-01-23 11:54:12 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 182 training takes 0:45:51 [2022-01-23 11:54:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.455 (18.455) Loss 0.9626 (0.9626) Acc@1 78.320 (78.320) Acc@5 93.262 (93.262) [2022-01-23 11:54:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.229 (3.504) Loss 0.9355 (0.9712) Acc@1 78.125 (77.512) Acc@5 95.508 (93.590) [2022-01-23 11:55:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.915 (2.726) Loss 0.8868 (0.9655) Acc@1 78.906 (77.307) Acc@5 94.727 (93.862) [2022-01-23 11:55:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.591 (2.339) Loss 1.0185 (0.9601) Acc@1 76.562 (77.416) Acc@5 92.773 (93.819) [2022-01-23 11:55:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.885 (2.200) Loss 0.9607 (0.9615) Acc@1 78.711 (77.375) Acc@5 93.359 (93.814) [2022-01-23 11:55:49 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.428 Acc@5 93.872 [2022-01-23 11:55:49 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-01-23 11:55:49 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.51% [2022-01-23 11:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][0/1251] eta 7:40:20 lr 0.000337 time 22.0791 (22.0791) loss 3.0205 (3.0205) grad_norm 1.7365 (1.7365) [2022-01-23 11:56:34 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [183/300][10/1251] eta 1:25:04 lr 0.000337 time 1.8730 (4.1135) loss 3.5030 (3.1462) grad_norm 1.7571 (1.7439) [2022-01-23 11:56:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][20/1251] eta 1:07:21 lr 0.000337 time 1.4990 (3.2832) loss 4.0316 (3.3016) grad_norm 1.6978 (1.7727) [2022-01-23 11:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][30/1251] eta 0:59:31 lr 0.000337 time 1.5025 (2.9250) loss 3.6228 (3.3320) grad_norm 1.7599 (1.7514) [2022-01-23 11:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][40/1251] eta 0:56:09 lr 0.000337 time 3.2265 (2.7825) loss 3.8084 (3.3544) grad_norm 1.5553 (1.7360) [2022-01-23 11:58:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][50/1251] eta 0:53:22 lr 0.000337 time 1.9178 (2.6662) loss 3.4885 (3.3554) grad_norm 1.7324 (1.7153) [2022-01-23 11:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][60/1251] eta 0:52:00 lr 0.000337 time 2.1816 (2.6204) loss 3.6768 (3.3448) grad_norm 1.7331 (1.7196) [2022-01-23 11:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][70/1251] eta 0:50:11 lr 0.000337 time 1.7365 (2.5498) loss 3.5355 (3.3482) grad_norm 1.7059 (1.7331) [2022-01-23 11:59:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][80/1251] eta 0:48:49 lr 0.000337 time 2.1732 (2.5020) loss 2.3966 (3.3343) grad_norm 1.8336 (1.7484) [2022-01-23 11:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][90/1251] eta 0:47:26 lr 0.000337 time 1.8975 (2.4521) loss 3.2054 (3.3417) grad_norm 1.7498 (1.7528) [2022-01-23 11:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][100/1251] eta 0:46:22 lr 0.000337 time 1.8780 (2.4172) loss 2.9813 (3.3418) grad_norm 1.9415 (1.7553) [2022-01-23 12:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][110/1251] eta 0:45:21 lr 0.000337 time 1.8848 (2.3855) loss 3.5128 (3.3558) grad_norm 
1.6243 (1.7586) [2022-01-23 12:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][120/1251] eta 0:44:28 lr 0.000337 time 2.5937 (2.3591) loss 3.7254 (3.3859) grad_norm 1.5612 (1.7600) [2022-01-23 12:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][130/1251] eta 0:43:43 lr 0.000337 time 2.1045 (2.3401) loss 3.2080 (3.3923) grad_norm 1.7378 (1.7541) [2022-01-23 12:01:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][140/1251] eta 0:43:05 lr 0.000337 time 1.7706 (2.3269) loss 3.6934 (3.4095) grad_norm 1.7053 (1.7505) [2022-01-23 12:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][150/1251] eta 0:42:34 lr 0.000337 time 2.0635 (2.3204) loss 3.7648 (3.4303) grad_norm 1.6349 (1.7496) [2022-01-23 12:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][160/1251] eta 0:42:11 lr 0.000337 time 3.2066 (2.3201) loss 3.8474 (3.4393) grad_norm 1.6421 (1.7506) [2022-01-23 12:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][170/1251] eta 0:41:47 lr 0.000337 time 2.1309 (2.3200) loss 3.7088 (3.4443) grad_norm 1.9712 (1.7512) [2022-01-23 12:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][180/1251] eta 0:41:22 lr 0.000337 time 1.9090 (2.3181) loss 3.5161 (3.4357) grad_norm 1.8518 (1.7483) [2022-01-23 12:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][190/1251] eta 0:40:55 lr 0.000337 time 1.7893 (2.3146) loss 3.8803 (3.4427) grad_norm 1.9103 (1.7527) [2022-01-23 12:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][200/1251] eta 0:40:20 lr 0.000337 time 2.0869 (2.3033) loss 3.7260 (3.4433) grad_norm 1.7633 (1.7535) [2022-01-23 12:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][210/1251] eta 0:39:48 lr 0.000337 time 1.9120 (2.2942) loss 3.0970 (3.4394) grad_norm 1.5513 (1.7532) [2022-01-23 12:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[183/300][220/1251] eta 0:39:14 lr 0.000336 time 2.3863 (2.2838) loss 3.1981 (3.4365) grad_norm 1.8812 (1.7578) [2022-01-23 12:04:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][230/1251] eta 0:38:39 lr 0.000336 time 1.6478 (2.2720) loss 3.6545 (3.4485) grad_norm 1.8663 (1.7575) [2022-01-23 12:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][240/1251] eta 0:38:09 lr 0.000336 time 2.5298 (2.2647) loss 3.4442 (3.4543) grad_norm 1.6146 (1.7553) [2022-01-23 12:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][250/1251] eta 0:37:33 lr 0.000336 time 1.9307 (2.2516) loss 3.9419 (3.4547) grad_norm 1.5667 (1.7553) [2022-01-23 12:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][260/1251] eta 0:37:17 lr 0.000336 time 5.3299 (2.2581) loss 3.4755 (3.4533) grad_norm 2.2293 (1.7592) [2022-01-23 12:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][270/1251] eta 0:36:55 lr 0.000336 time 2.2898 (2.2579) loss 3.6521 (3.4522) grad_norm 1.7433 (1.7636) [2022-01-23 12:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][280/1251] eta 0:36:32 lr 0.000336 time 1.6046 (2.2575) loss 2.8289 (3.4467) grad_norm 1.7054 (1.7612) [2022-01-23 12:06:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][290/1251] eta 0:36:11 lr 0.000336 time 2.4068 (2.2599) loss 2.7808 (3.4430) grad_norm 1.8248 (1.7610) [2022-01-23 12:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][300/1251] eta 0:35:56 lr 0.000336 time 3.9449 (2.2675) loss 3.2007 (3.4412) grad_norm 1.4837 (1.7596) [2022-01-23 12:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][310/1251] eta 0:35:28 lr 0.000336 time 1.8351 (2.2617) loss 3.4057 (3.4387) grad_norm 1.5724 (1.7575) [2022-01-23 12:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][320/1251] eta 0:35:00 lr 0.000336 time 1.6675 (2.2567) loss 2.5520 (3.4367) grad_norm 
1.6155 (1.7554) [2022-01-23 12:08:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][330/1251] eta 0:34:37 lr 0.000336 time 2.2842 (2.2554) loss 3.4218 (3.4361) grad_norm 1.6662 (1.7560) [2022-01-23 12:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][340/1251] eta 0:34:15 lr 0.000336 time 4.1728 (2.2563) loss 3.8793 (3.4360) grad_norm 1.6993 (1.7564) [2022-01-23 12:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][350/1251] eta 0:33:51 lr 0.000336 time 2.1567 (2.2545) loss 2.6854 (3.4268) grad_norm 1.6938 (1.7585) [2022-01-23 12:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][360/1251] eta 0:33:22 lr 0.000336 time 1.5638 (2.2478) loss 3.5501 (3.4306) grad_norm 1.8727 (1.7581) [2022-01-23 12:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][370/1251] eta 0:33:00 lr 0.000336 time 1.9148 (2.2482) loss 3.3775 (3.4289) grad_norm 1.7125 (1.7593) [2022-01-23 12:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][380/1251] eta 0:32:39 lr 0.000336 time 3.2099 (2.2495) loss 3.1224 (3.4355) grad_norm 1.6316 (1.7594) [2022-01-23 12:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][390/1251] eta 0:32:14 lr 0.000336 time 1.9934 (2.2465) loss 4.3222 (3.4308) grad_norm 2.0026 (1.7596) [2022-01-23 12:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][400/1251] eta 0:31:48 lr 0.000336 time 1.9437 (2.2431) loss 3.9177 (3.4284) grad_norm 1.9837 (1.7617) [2022-01-23 12:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][410/1251] eta 0:31:27 lr 0.000336 time 1.8444 (2.2443) loss 3.8191 (3.4248) grad_norm 1.6281 (1.7624) [2022-01-23 12:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][420/1251] eta 0:31:04 lr 0.000336 time 2.4839 (2.2442) loss 3.5699 (3.4217) grad_norm 1.4987 (1.7610) [2022-01-23 12:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[183/300][430/1251] eta 0:30:41 lr 0.000336 time 2.9507 (2.2429) loss 2.9338 (3.4157) grad_norm 1.6978 (1.7609) [2022-01-23 12:12:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][440/1251] eta 0:30:15 lr 0.000336 time 2.0100 (2.2387) loss 3.8490 (3.4113) grad_norm 2.4774 (1.7650) [2022-01-23 12:12:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][450/1251] eta 0:29:52 lr 0.000336 time 2.6449 (2.2376) loss 2.9007 (3.4093) grad_norm 1.7391 (1.7653) [2022-01-23 12:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][460/1251] eta 0:29:26 lr 0.000336 time 2.0284 (2.2335) loss 3.7450 (3.4069) grad_norm 1.8458 (1.7659) [2022-01-23 12:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][470/1251] eta 0:29:06 lr 0.000335 time 4.0312 (2.2358) loss 3.6272 (3.4091) grad_norm 1.5875 (1.7665) [2022-01-23 12:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][480/1251] eta 0:28:43 lr 0.000335 time 2.1704 (2.2351) loss 2.4974 (3.4057) grad_norm 1.7859 (1.7666) [2022-01-23 12:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][490/1251] eta 0:28:21 lr 0.000335 time 2.5530 (2.2365) loss 3.1190 (3.4069) grad_norm 1.9944 (1.7695) [2022-01-23 12:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][500/1251] eta 0:27:57 lr 0.000335 time 1.8536 (2.2341) loss 3.8864 (3.4100) grad_norm 1.6149 (1.7715) [2022-01-23 12:14:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][510/1251] eta 0:27:35 lr 0.000335 time 3.6671 (2.2344) loss 4.0657 (3.4142) grad_norm 1.6803 (1.7723) [2022-01-23 12:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][520/1251] eta 0:27:12 lr 0.000335 time 1.8837 (2.2337) loss 3.7617 (3.4162) grad_norm 2.2073 (1.7748) [2022-01-23 12:15:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][530/1251] eta 0:26:49 lr 0.000335 time 2.0657 (2.2316) loss 2.7498 (3.4109) grad_norm 
1.7616 (1.7764) [2022-01-23 12:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][540/1251] eta 0:26:24 lr 0.000335 time 1.8478 (2.2279) loss 3.5771 (3.4131) grad_norm 1.8486 (1.7781) [2022-01-23 12:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][550/1251] eta 0:26:01 lr 0.000335 time 3.1724 (2.2280) loss 3.2868 (3.4167) grad_norm 2.1411 (1.7784) [2022-01-23 12:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][560/1251] eta 0:25:38 lr 0.000335 time 2.2011 (2.2262) loss 3.8527 (3.4137) grad_norm 1.9485 (1.7802) [2022-01-23 12:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][570/1251] eta 0:25:14 lr 0.000335 time 1.6020 (2.2243) loss 3.9027 (3.4187) grad_norm 1.7989 (1.7797) [2022-01-23 12:17:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][580/1251] eta 0:24:53 lr 0.000335 time 2.4697 (2.2262) loss 3.3743 (3.4176) grad_norm 1.7991 (1.7794) [2022-01-23 12:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][590/1251] eta 0:24:35 lr 0.000335 time 3.3742 (2.2318) loss 2.3628 (3.4175) grad_norm 2.0544 (1.7798) [2022-01-23 12:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][600/1251] eta 0:24:12 lr 0.000335 time 1.9597 (2.2315) loss 3.7418 (3.4151) grad_norm 1.6942 (1.7822) [2022-01-23 12:18:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][610/1251] eta 0:23:47 lr 0.000335 time 1.9620 (2.2274) loss 3.6496 (3.4171) grad_norm 2.0487 (1.7845) [2022-01-23 12:18:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][620/1251] eta 0:23:21 lr 0.000335 time 1.7542 (2.2212) loss 2.3965 (3.4158) grad_norm 1.8857 (1.7835) [2022-01-23 12:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][630/1251] eta 0:22:57 lr 0.000335 time 1.9767 (2.2188) loss 4.1401 (3.4173) grad_norm 2.0028 (1.7830) [2022-01-23 12:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[183/300][640/1251] eta 0:22:34 lr 0.000335 time 2.2446 (2.2167) loss 3.1350 (3.4148) grad_norm 1.6686 (1.7822) [2022-01-23 12:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][650/1251] eta 0:22:12 lr 0.000335 time 2.2061 (2.2175) loss 3.0324 (3.4131) grad_norm 1.6671 (1.7819) [2022-01-23 12:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][660/1251] eta 0:21:49 lr 0.000335 time 1.9791 (2.2153) loss 3.5548 (3.4159) grad_norm 1.6840 (1.7812) [2022-01-23 12:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][670/1251] eta 0:21:27 lr 0.000335 time 3.1967 (2.2158) loss 4.0213 (3.4171) grad_norm 1.6560 (1.7804) [2022-01-23 12:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][680/1251] eta 0:21:06 lr 0.000335 time 2.8069 (2.2175) loss 3.4313 (3.4188) grad_norm 1.6455 (1.7809) [2022-01-23 12:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][690/1251] eta 0:20:45 lr 0.000335 time 2.7786 (2.2194) loss 4.1707 (3.4197) grad_norm 1.6900 (1.7799) [2022-01-23 12:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][700/1251] eta 0:20:21 lr 0.000335 time 1.8108 (2.2169) loss 3.7225 (3.4228) grad_norm 1.8032 (1.7789) [2022-01-23 12:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][710/1251] eta 0:19:58 lr 0.000335 time 2.2329 (2.2156) loss 3.6336 (3.4217) grad_norm 1.8693 (1.7788) [2022-01-23 12:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][720/1251] eta 0:19:36 lr 0.000335 time 2.1234 (2.2160) loss 2.4459 (3.4206) grad_norm 1.8015 (1.7799) [2022-01-23 12:22:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][730/1251] eta 0:19:15 lr 0.000334 time 2.4429 (2.2183) loss 3.7758 (3.4199) grad_norm 1.7485 (1.7789) [2022-01-23 12:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][740/1251] eta 0:18:52 lr 0.000334 time 1.6389 (2.2168) loss 3.5535 (3.4254) grad_norm 
1.6397 (1.7787) [2022-01-23 12:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][750/1251] eta 0:18:29 lr 0.000334 time 1.9046 (2.2151) loss 3.6418 (3.4232) grad_norm 1.7757 (1.7777) [2022-01-23 12:23:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][760/1251] eta 0:18:06 lr 0.000334 time 2.6422 (2.2137) loss 3.6405 (3.4227) grad_norm 1.8383 (1.7774) [2022-01-23 12:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][770/1251] eta 0:17:45 lr 0.000334 time 2.4837 (2.2147) loss 4.0453 (3.4205) grad_norm 1.7206 (1.7773) [2022-01-23 12:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][780/1251] eta 0:17:23 lr 0.000334 time 2.2080 (2.2151) loss 3.4203 (3.4168) grad_norm 1.6865 (1.7774) [2022-01-23 12:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][790/1251] eta 0:17:00 lr 0.000334 time 1.9253 (2.2147) loss 2.7635 (3.4191) grad_norm 1.7838 (1.7784) [2022-01-23 12:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][800/1251] eta 0:16:38 lr 0.000334 time 2.6282 (2.2130) loss 3.8454 (3.4180) grad_norm 1.7521 (1.7789) [2022-01-23 12:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][810/1251] eta 0:16:16 lr 0.000334 time 1.9074 (2.2140) loss 3.0714 (3.4203) grad_norm 1.8630 (1.7799) [2022-01-23 12:26:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][820/1251] eta 0:15:53 lr 0.000334 time 1.9138 (2.2121) loss 4.1105 (3.4202) grad_norm 1.8704 (1.7791) [2022-01-23 12:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][830/1251] eta 0:15:30 lr 0.000334 time 2.1511 (2.2102) loss 3.6652 (3.4219) grad_norm 1.8037 (1.7792) [2022-01-23 12:26:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][840/1251] eta 0:15:08 lr 0.000334 time 1.9268 (2.2105) loss 3.9983 (3.4221) grad_norm 1.9280 (1.7795) [2022-01-23 12:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[183/300][850/1251] eta 0:14:46 lr 0.000334 time 2.2865 (2.2098) loss 3.0048 (3.4213) grad_norm 2.5195 (1.7809) [2022-01-23 12:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][860/1251] eta 0:14:23 lr 0.000334 time 2.2140 (2.2073) loss 3.4390 (3.4186) grad_norm 1.3967 (1.7817) [2022-01-23 12:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][870/1251] eta 0:14:01 lr 0.000334 time 2.4829 (2.2075) loss 3.8361 (3.4164) grad_norm 1.6255 (1.7805) [2022-01-23 12:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][880/1251] eta 0:13:38 lr 0.000334 time 1.9890 (2.2062) loss 3.6715 (3.4153) grad_norm 1.6922 (1.7807) [2022-01-23 12:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][890/1251] eta 0:13:16 lr 0.000334 time 2.3677 (2.2051) loss 3.9820 (3.4143) grad_norm 1.6902 (1.7800) [2022-01-23 12:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][900/1251] eta 0:12:53 lr 0.000334 time 1.8969 (2.2039) loss 3.1711 (3.4151) grad_norm 1.5585 (1.7793) [2022-01-23 12:29:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][910/1251] eta 0:12:32 lr 0.000334 time 2.5401 (2.2056) loss 3.4196 (3.4120) grad_norm 1.7989 (1.7779) [2022-01-23 12:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][920/1251] eta 0:12:10 lr 0.000334 time 1.8918 (2.2060) loss 3.6173 (3.4128) grad_norm 2.1957 (1.7774) [2022-01-23 12:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][930/1251] eta 0:11:48 lr 0.000334 time 1.8794 (2.2069) loss 3.4021 (3.4136) grad_norm 1.7169 (1.7772) [2022-01-23 12:30:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][940/1251] eta 0:11:26 lr 0.000334 time 2.1646 (2.2074) loss 3.4786 (3.4136) grad_norm 1.8717 (1.7776) [2022-01-23 12:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][950/1251] eta 0:11:04 lr 0.000334 time 2.6156 (2.2092) loss 4.0253 (3.4128) grad_norm 
1.6265 (1.7796) [2022-01-23 12:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][960/1251] eta 0:10:43 lr 0.000334 time 3.1854 (2.2101) loss 3.9544 (3.4135) grad_norm 1.8572 (1.7808) [2022-01-23 12:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][970/1251] eta 0:10:20 lr 0.000334 time 2.2053 (2.2087) loss 3.5818 (3.4149) grad_norm 1.9181 (1.7815) [2022-01-23 12:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][980/1251] eta 0:09:58 lr 0.000334 time 1.6212 (2.2072) loss 3.7891 (3.4155) grad_norm 1.5636 (1.7807) [2022-01-23 12:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][990/1251] eta 0:09:35 lr 0.000333 time 1.6053 (2.2054) loss 3.6559 (3.4180) grad_norm 1.6875 (1.7806) [2022-01-23 12:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1000/1251] eta 0:09:13 lr 0.000333 time 2.7042 (2.2046) loss 2.4372 (3.4168) grad_norm 1.9980 (1.7815) [2022-01-23 12:32:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1010/1251] eta 0:08:51 lr 0.000333 time 2.4879 (2.2037) loss 2.6332 (3.4167) grad_norm 1.8134 (1.7811) [2022-01-23 12:33:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1020/1251] eta 0:08:29 lr 0.000333 time 2.5336 (2.2046) loss 3.6233 (3.4155) grad_norm 1.8229 (1.7811) [2022-01-23 12:33:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1030/1251] eta 0:08:07 lr 0.000333 time 2.1366 (2.2056) loss 3.8038 (3.4151) grad_norm 1.7946 (1.7803) [2022-01-23 12:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1040/1251] eta 0:07:45 lr 0.000333 time 2.5606 (2.2059) loss 3.6603 (3.4160) grad_norm 2.5022 (1.7812) [2022-01-23 12:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1050/1251] eta 0:07:23 lr 0.000333 time 1.7808 (2.2040) loss 3.7419 (3.4184) grad_norm 1.7888 (1.7808) [2022-01-23 12:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[183/300][1060/1251] eta 0:07:00 lr 0.000333 time 2.1793 (2.2022) loss 3.5806 (3.4173) grad_norm 1.9526 (1.7814) [2022-01-23 12:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1070/1251] eta 0:06:38 lr 0.000333 time 2.2556 (2.2012) loss 3.0354 (3.4157) grad_norm 2.0407 (1.7809) [2022-01-23 12:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1080/1251] eta 0:06:16 lr 0.000333 time 2.8494 (2.2004) loss 3.5811 (3.4168) grad_norm 2.0318 (1.7812) [2022-01-23 12:35:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1090/1251] eta 0:05:54 lr 0.000333 time 2.1757 (2.2011) loss 3.5324 (3.4163) grad_norm 1.5649 (1.7810) [2022-01-23 12:36:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1100/1251] eta 0:05:32 lr 0.000333 time 2.2610 (2.2023) loss 3.3261 (3.4166) grad_norm 2.0219 (1.7813) [2022-01-23 12:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1110/1251] eta 0:05:10 lr 0.000333 time 1.8344 (2.2015) loss 3.1508 (3.4137) grad_norm 1.7721 (1.7806) [2022-01-23 12:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1120/1251] eta 0:04:48 lr 0.000333 time 2.4206 (2.2019) loss 3.5401 (3.4120) grad_norm 1.5113 (1.7802) [2022-01-23 12:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1130/1251] eta 0:04:26 lr 0.000333 time 2.1640 (2.2020) loss 3.7630 (3.4122) grad_norm 1.4461 (1.7801) [2022-01-23 12:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1140/1251] eta 0:04:04 lr 0.000333 time 1.7794 (2.2022) loss 2.5370 (3.4106) grad_norm 1.7544 (1.7796) [2022-01-23 12:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1150/1251] eta 0:03:42 lr 0.000333 time 1.9131 (2.2012) loss 3.6581 (3.4071) grad_norm 1.5580 (1.7791) [2022-01-23 12:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1160/1251] eta 0:03:20 lr 0.000333 time 1.8645 (2.2018) loss 3.1512 (3.4064) 
grad_norm 1.6927 (1.7785) [2022-01-23 12:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1170/1251] eta 0:02:58 lr 0.000333 time 1.9030 (2.2007) loss 3.6156 (3.4070) grad_norm 1.6480 (1.7786) [2022-01-23 12:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1180/1251] eta 0:02:36 lr 0.000333 time 1.5711 (2.2014) loss 3.5899 (3.4057) grad_norm 1.8615 (1.7781) [2022-01-23 12:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1190/1251] eta 0:02:14 lr 0.000333 time 1.9017 (2.2014) loss 3.8898 (3.4069) grad_norm 1.7024 (1.7776) [2022-01-23 12:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1200/1251] eta 0:01:52 lr 0.000333 time 2.3700 (2.2022) loss 3.5251 (3.4050) grad_norm 2.2108 (1.7776) [2022-01-23 12:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1210/1251] eta 0:01:30 lr 0.000333 time 1.9087 (2.2018) loss 3.1580 (3.4037) grad_norm 1.6408 (1.7777) [2022-01-23 12:40:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1220/1251] eta 0:01:08 lr 0.000333 time 1.9416 (2.2018) loss 3.7739 (3.4033) grad_norm 1.4871 (1.7764) [2022-01-23 12:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1230/1251] eta 0:00:46 lr 0.000333 time 1.7064 (2.2010) loss 2.8960 (3.4054) grad_norm 1.5796 (1.7758) [2022-01-23 12:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1240/1251] eta 0:00:24 lr 0.000332 time 1.3126 (2.2001) loss 3.1149 (3.4059) grad_norm 1.7732 (1.7755) [2022-01-23 12:41:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [183/300][1250/1251] eta 0:00:02 lr 0.000332 time 1.1783 (2.1949) loss 2.2680 (3.4060) grad_norm 1.7777 (1.7754) [2022-01-23 12:41:35 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 183 training takes 0:45:46 [2022-01-23 12:41:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.759 (18.759) Loss 1.0226 (1.0226) Acc@1 74.805 (74.805) 
Acc@5 93.555 (93.555) [2022-01-23 12:42:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.594 (3.271) Loss 0.9339 (0.9570) Acc@1 78.516 (77.264) Acc@5 93.164 (93.839) [2022-01-23 12:42:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.452 (2.640) Loss 0.9759 (0.9558) Acc@1 78.223 (77.144) Acc@5 93.457 (93.871) [2022-01-23 12:42:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.951 (2.241) Loss 0.9505 (0.9616) Acc@1 76.855 (77.010) Acc@5 94.336 (93.750) [2022-01-23 12:43:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.870 (2.159) Loss 0.9103 (0.9553) Acc@1 78.516 (77.244) Acc@5 94.238 (93.874) [2022-01-23 12:43:11 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.402 Acc@5 93.976 [2022-01-23 12:43:11 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.4% [2022-01-23 12:43:11 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.51% [2022-01-23 12:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][0/1251] eta 7:37:44 lr 0.000332 time 21.9539 (21.9539) loss 2.2831 (2.2831) grad_norm 1.6299 (1.6299) [2022-01-23 12:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][10/1251] eta 1:26:58 lr 0.000332 time 2.4164 (4.2047) loss 3.5328 (3.5076) grad_norm 1.6182 (1.7067) [2022-01-23 12:44:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][20/1251] eta 1:06:10 lr 0.000332 time 1.5829 (3.2258) loss 3.7156 (3.4034) grad_norm 1.7416 (1.7464) [2022-01-23 12:44:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][30/1251] eta 0:59:12 lr 0.000332 time 2.2213 (2.9092) loss 3.5036 (3.3707) grad_norm 2.1230 (1.8124) [2022-01-23 12:45:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][40/1251] eta 0:55:06 lr 0.000332 time 2.9269 (2.7303) loss 3.6260 (3.3531) grad_norm 1.7629 (1.8248) [2022-01-23 12:45:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][50/1251] eta 0:52:58 lr 0.000332 time 1.8931 (2.6463) loss 2.9928 (3.3567) grad_norm 1.9706 (1.8085) [2022-01-23 12:45:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][60/1251] eta 0:50:51 lr 0.000332 time 1.6123 (2.5621) loss 4.1412 (3.3828) grad_norm 1.6201 (1.7963) [2022-01-23 12:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][70/1251] eta 0:49:22 lr 0.000332 time 1.6068 (2.5088) loss 3.3342 (3.3544) grad_norm 1.6964 (1.7970) [2022-01-23 12:46:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][80/1251] eta 0:48:16 lr 0.000332 time 2.9343 (2.4739) loss 3.9935 (3.3726) grad_norm 1.9512 (1.7919) [2022-01-23 12:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][90/1251] eta 0:47:42 lr 0.000332 time 2.2284 (2.4659) loss 3.8984 (3.3700) grad_norm 1.8423 (1.7885) [2022-01-23 12:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][100/1251] eta 0:46:47 lr 0.000332 time 1.5299 (2.4393) loss 2.1773 (3.3566) grad_norm 2.0585 (1.7873) [2022-01-23 12:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][110/1251] eta 0:45:44 lr 0.000332 time 1.7829 (2.4051) loss 3.7697 (3.3571) grad_norm 1.8136 (1.7790) [2022-01-23 12:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][120/1251] eta 0:44:55 lr 0.000332 time 2.9155 (2.3837) loss 3.4790 (3.3605) grad_norm 1.9981 (1.7787) [2022-01-23 12:48:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][130/1251] eta 0:44:18 lr 0.000332 time 2.1894 (2.3714) loss 3.0508 (3.3562) grad_norm 1.7404 (1.7778) [2022-01-23 12:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][140/1251] eta 0:43:38 lr 0.000332 time 2.1335 (2.3565) loss 3.8153 (3.3496) grad_norm 1.7252 (1.7715) [2022-01-23 12:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][150/1251] eta 0:43:03 lr 0.000332 
time 2.6064 (2.3465) loss 3.3454 (3.3529) grad_norm 1.8037 (1.7745) [2022-01-23 12:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][160/1251] eta 0:42:35 lr 0.000332 time 2.2271 (2.3425) loss 2.7263 (3.3525) grad_norm 1.6696 (1.7759) [2022-01-23 12:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][170/1251] eta 0:42:07 lr 0.000332 time 1.8892 (2.3378) loss 3.3432 (3.3571) grad_norm 1.6113 (1.7740) [2022-01-23 12:50:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][180/1251] eta 0:41:34 lr 0.000332 time 1.7766 (2.3294) loss 3.5546 (3.3688) grad_norm 1.6896 (1.7716) [2022-01-23 12:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][190/1251] eta 0:40:53 lr 0.000332 time 1.7837 (2.3122) loss 3.4984 (3.3649) grad_norm 1.7671 (1.7720) [2022-01-23 12:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][200/1251] eta 0:40:16 lr 0.000332 time 2.1819 (2.2996) loss 3.1669 (3.3626) grad_norm 1.7925 (1.7679) [2022-01-23 12:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][210/1251] eta 0:39:47 lr 0.000332 time 2.7920 (2.2932) loss 3.3184 (3.3810) grad_norm 1.7621 (1.7664) [2022-01-23 12:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][220/1251] eta 0:39:17 lr 0.000332 time 1.9018 (2.2865) loss 3.6290 (3.3943) grad_norm 1.8252 (1.7739) [2022-01-23 12:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][230/1251] eta 0:38:57 lr 0.000332 time 2.5182 (2.2893) loss 2.8165 (3.3970) grad_norm 1.9331 (1.7750) [2022-01-23 12:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][240/1251] eta 0:38:42 lr 0.000332 time 2.4594 (2.2968) loss 2.8341 (3.4021) grad_norm 1.6565 (1.7776) [2022-01-23 12:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][250/1251] eta 0:38:12 lr 0.000331 time 2.1796 (2.2900) loss 4.1330 (3.4121) grad_norm 1.8065 (1.7816) [2022-01-23 12:53:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][260/1251] eta 0:37:40 lr 0.000331 time 1.8905 (2.2807) loss 3.7675 (3.4170) grad_norm 2.0971 (1.7839) [2022-01-23 12:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][270/1251] eta 0:37:08 lr 0.000331 time 1.7363 (2.2715) loss 3.7037 (3.4195) grad_norm 1.6810 (1.7823) [2022-01-23 12:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][280/1251] eta 0:36:37 lr 0.000331 time 1.8809 (2.2636) loss 3.9030 (3.4125) grad_norm 1.8373 (1.7802) [2022-01-23 12:54:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][290/1251] eta 0:36:08 lr 0.000331 time 1.8978 (2.2568) loss 3.3520 (3.4107) grad_norm 1.8525 (1.7853) [2022-01-23 12:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][300/1251] eta 0:35:42 lr 0.000331 time 1.8112 (2.2530) loss 3.2166 (3.4117) grad_norm 1.6335 (1.7862) [2022-01-23 12:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][310/1251] eta 0:35:17 lr 0.000331 time 1.5105 (2.2501) loss 3.9510 (3.4077) grad_norm 1.7449 (1.7845) [2022-01-23 12:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][320/1251] eta 0:34:55 lr 0.000331 time 1.8928 (2.2507) loss 3.7449 (3.4045) grad_norm 1.7178 (1.7836) [2022-01-23 12:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][330/1251] eta 0:34:33 lr 0.000331 time 2.8626 (2.2516) loss 3.2364 (3.4056) grad_norm 1.4515 (1.7822) [2022-01-23 12:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][340/1251] eta 0:34:09 lr 0.000331 time 2.8483 (2.2499) loss 3.7958 (3.4071) grad_norm 1.8207 (1.7794) [2022-01-23 12:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][350/1251] eta 0:33:52 lr 0.000331 time 2.5185 (2.2554) loss 3.2707 (3.4054) grad_norm 1.8378 (1.7777) [2022-01-23 12:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][360/1251] eta 0:33:34 lr 
0.000331 time 1.8961 (2.2608) loss 3.8012 (3.4049) grad_norm 1.5655 (1.7754) [2022-01-23 12:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][370/1251] eta 0:33:09 lr 0.000331 time 1.7321 (2.2581) loss 3.2712 (3.4080) grad_norm 1.9392 (1.7752) [2022-01-23 12:57:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][380/1251] eta 0:32:38 lr 0.000331 time 1.6840 (2.2490) loss 3.3031 (3.4124) grad_norm 1.9275 (1.7779) [2022-01-23 12:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][390/1251] eta 0:32:10 lr 0.000331 time 2.1141 (2.2421) loss 3.9642 (3.4181) grad_norm 1.9923 (1.7878) [2022-01-23 12:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][400/1251] eta 0:31:41 lr 0.000331 time 1.8402 (2.2348) loss 3.6251 (3.4204) grad_norm 1.9539 (1.7912) [2022-01-23 12:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][410/1251] eta 0:31:18 lr 0.000331 time 2.1605 (2.2336) loss 3.9286 (3.4221) grad_norm 1.8467 (1.7931) [2022-01-23 12:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][420/1251] eta 0:30:54 lr 0.000331 time 1.8124 (2.2320) loss 4.1568 (3.4214) grad_norm 1.8586 (1.7928) [2022-01-23 12:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][430/1251] eta 0:30:32 lr 0.000331 time 2.6970 (2.2322) loss 2.9389 (3.4154) grad_norm 1.8851 (1.7927) [2022-01-23 12:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][440/1251] eta 0:30:10 lr 0.000331 time 2.3645 (2.2325) loss 4.0495 (3.4184) grad_norm 2.0237 (1.7918) [2022-01-23 12:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][450/1251] eta 0:29:50 lr 0.000331 time 1.6547 (2.2350) loss 3.8247 (3.4205) grad_norm 1.6826 (1.7923) [2022-01-23 13:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][460/1251] eta 0:29:28 lr 0.000331 time 1.8714 (2.2352) loss 3.0997 (3.4204) grad_norm 1.9140 (1.7918) [2022-01-23 13:00:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][470/1251] eta 0:29:08 lr 0.000331 time 2.3315 (2.2391) loss 2.4790 (3.4192) grad_norm 1.8832 (1.7920) [2022-01-23 13:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][480/1251] eta 0:28:54 lr 0.000331 time 3.0773 (2.2496) loss 3.8108 (3.4247) grad_norm 1.6500 (1.7920) [2022-01-23 13:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][490/1251] eta 0:28:33 lr 0.000331 time 2.6210 (2.2514) loss 3.4122 (3.4243) grad_norm 1.6511 (1.7920) [2022-01-23 13:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][500/1251] eta 0:28:08 lr 0.000331 time 1.8574 (2.2490) loss 3.2644 (3.4249) grad_norm 1.5822 (1.7903) [2022-01-23 13:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][510/1251] eta 0:27:41 lr 0.000330 time 1.8630 (2.2422) loss 2.6675 (3.4299) grad_norm 2.0860 (1.7895) [2022-01-23 13:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][520/1251] eta 0:27:13 lr 0.000330 time 1.9074 (2.2341) loss 3.9078 (3.4299) grad_norm 1.8141 (1.7900) [2022-01-23 13:02:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][530/1251] eta 0:26:49 lr 0.000330 time 1.8654 (2.2318) loss 2.9584 (3.4301) grad_norm 1.5923 (1.7887) [2022-01-23 13:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][540/1251] eta 0:26:25 lr 0.000330 time 2.1776 (2.2298) loss 3.0665 (3.4321) grad_norm 1.4496 (1.7875) [2022-01-23 13:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][550/1251] eta 0:26:02 lr 0.000330 time 1.9879 (2.2291) loss 3.0102 (3.4338) grad_norm 1.9241 (1.7880) [2022-01-23 13:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][560/1251] eta 0:25:43 lr 0.000330 time 2.8592 (2.2341) loss 3.6207 (3.4342) grad_norm 1.7970 (1.7893) [2022-01-23 13:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][570/1251] eta 0:25:21 lr 
0.000330 time 2.4878 (2.2349) loss 3.9351 (3.4361) grad_norm 1.7471 (1.7877) [2022-01-23 13:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][580/1251] eta 0:25:01 lr 0.000330 time 2.5876 (2.2371) loss 3.9982 (3.4346) grad_norm 1.7885 (1.7870) [2022-01-23 13:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][590/1251] eta 0:24:39 lr 0.000330 time 2.2893 (2.2376) loss 3.8057 (3.4364) grad_norm 1.8116 (1.7863) [2022-01-23 13:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][600/1251] eta 0:24:15 lr 0.000330 time 1.8978 (2.2360) loss 3.5602 (3.4383) grad_norm 1.4258 (1.7855) [2022-01-23 13:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][610/1251] eta 0:23:52 lr 0.000330 time 2.6929 (2.2343) loss 4.0912 (3.4391) grad_norm 1.7340 (1.7855) [2022-01-23 13:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][620/1251] eta 0:23:28 lr 0.000330 time 2.3261 (2.2324) loss 2.5839 (3.4350) grad_norm 1.8319 (1.7853) [2022-01-23 13:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][630/1251] eta 0:23:05 lr 0.000330 time 2.0227 (2.2308) loss 3.7261 (3.4378) grad_norm 1.7562 (1.7876) [2022-01-23 13:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][640/1251] eta 0:22:43 lr 0.000330 time 2.1855 (2.2312) loss 3.9325 (3.4376) grad_norm 1.7621 (1.7873) [2022-01-23 13:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][650/1251] eta 0:22:21 lr 0.000330 time 1.9703 (2.2323) loss 3.7070 (3.4337) grad_norm 1.6868 (1.7855) [2022-01-23 13:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][660/1251] eta 0:21:58 lr 0.000330 time 2.2646 (2.2317) loss 3.5860 (3.4312) grad_norm 2.5219 (1.7860) [2022-01-23 13:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][670/1251] eta 0:21:35 lr 0.000330 time 1.9205 (2.2295) loss 4.0747 (3.4335) grad_norm 1.6443 (1.7851) [2022-01-23 13:08:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][680/1251] eta 0:21:12 lr 0.000330 time 3.2037 (2.2288) loss 4.0339 (3.4323) grad_norm 1.7620 (1.7859) [2022-01-23 13:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][690/1251] eta 0:20:48 lr 0.000330 time 2.2039 (2.2264) loss 2.9115 (3.4353) grad_norm 1.6912 (1.7856) [2022-01-23 13:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][700/1251] eta 0:20:25 lr 0.000330 time 1.8816 (2.2240) loss 4.0566 (3.4361) grad_norm 1.6513 (1.7858) [2022-01-23 13:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][710/1251] eta 0:20:02 lr 0.000330 time 2.5837 (2.2228) loss 3.7852 (3.4344) grad_norm 1.8445 (1.7852) [2022-01-23 13:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][720/1251] eta 0:19:40 lr 0.000330 time 1.9125 (2.2232) loss 2.6371 (3.4296) grad_norm 1.6017 (1.7836) [2022-01-23 13:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][730/1251] eta 0:19:17 lr 0.000330 time 2.6025 (2.2221) loss 2.8523 (3.4306) grad_norm 1.8679 (1.7839) [2022-01-23 13:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][740/1251] eta 0:18:54 lr 0.000330 time 2.1780 (2.2209) loss 4.1632 (3.4322) grad_norm 1.7469 (1.7854) [2022-01-23 13:10:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][750/1251] eta 0:18:32 lr 0.000330 time 2.4891 (2.2203) loss 4.0551 (3.4321) grad_norm 1.5313 (1.7858) [2022-01-23 13:11:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][760/1251] eta 0:18:10 lr 0.000330 time 1.9167 (2.2210) loss 3.9924 (3.4345) grad_norm 1.9371 (1.7856) [2022-01-23 13:11:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][770/1251] eta 0:17:48 lr 0.000329 time 1.8151 (2.2205) loss 3.3075 (3.4334) grad_norm 1.7424 (1.7852) [2022-01-23 13:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][780/1251] eta 0:17:26 lr 
0.000329 time 2.6381 (2.2229) loss 3.3599 (3.4320) grad_norm 1.8759 (1.7860) [2022-01-23 13:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][790/1251] eta 0:17:04 lr 0.000329 time 2.7972 (2.2227) loss 3.5229 (3.4303) grad_norm 1.7639 (1.7850) [2022-01-23 13:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][800/1251] eta 0:16:41 lr 0.000329 time 1.6468 (2.2200) loss 3.7169 (3.4335) grad_norm 1.4803 (1.7847) [2022-01-23 13:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][810/1251] eta 0:16:18 lr 0.000329 time 2.1524 (2.2179) loss 4.1001 (3.4314) grad_norm 1.6609 (1.7847) [2022-01-23 13:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][820/1251] eta 0:15:55 lr 0.000329 time 1.9141 (2.2175) loss 2.4988 (3.4266) grad_norm 1.6894 (1.7846) [2022-01-23 13:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][830/1251] eta 0:15:32 lr 0.000329 time 2.3507 (2.2161) loss 3.5862 (3.4268) grad_norm 1.8959 (1.7840) [2022-01-23 13:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][840/1251] eta 0:15:10 lr 0.000329 time 2.1543 (2.2157) loss 3.7495 (3.4274) grad_norm 1.6112 (1.7832) [2022-01-23 13:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][850/1251] eta 0:14:48 lr 0.000329 time 2.5654 (2.2154) loss 2.8896 (3.4277) grad_norm 1.4434 (1.7829) [2022-01-23 13:14:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][860/1251] eta 0:14:25 lr 0.000329 time 1.9016 (2.2140) loss 4.1747 (3.4254) grad_norm 1.5753 (1.7847) [2022-01-23 13:15:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][870/1251] eta 0:14:03 lr 0.000329 time 2.8503 (2.2132) loss 3.3947 (3.4255) grad_norm 1.8492 (1.7863) [2022-01-23 13:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][880/1251] eta 0:13:41 lr 0.000329 time 3.0448 (2.2141) loss 3.7786 (3.4276) grad_norm 1.9460 (1.7857) [2022-01-23 13:16:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][890/1251] eta 0:13:19 lr 0.000329 time 2.1960 (2.2148) loss 3.8344 (3.4309) grad_norm 1.9709 (1.7890) [2022-01-23 13:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][900/1251] eta 0:12:58 lr 0.000329 time 2.7510 (2.2169) loss 3.6707 (3.4310) grad_norm 1.4659 (1.7894) [2022-01-23 13:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][910/1251] eta 0:12:36 lr 0.000329 time 1.9202 (2.2177) loss 3.2939 (3.4314) grad_norm 1.7007 (1.7892) [2022-01-23 13:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][920/1251] eta 0:12:14 lr 0.000329 time 2.5087 (2.2186) loss 3.0466 (3.4274) grad_norm 1.6502 (1.7887) [2022-01-23 13:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][930/1251] eta 0:11:51 lr 0.000329 time 1.9735 (2.2156) loss 2.1229 (3.4250) grad_norm 1.7620 (1.7884) [2022-01-23 13:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][940/1251] eta 0:11:28 lr 0.000329 time 2.9280 (2.2136) loss 3.5417 (3.4245) grad_norm 2.2127 (1.7889) [2022-01-23 13:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][950/1251] eta 0:11:05 lr 0.000329 time 2.1845 (2.2116) loss 3.1935 (3.4247) grad_norm 1.8125 (1.7894) [2022-01-23 13:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][960/1251] eta 0:10:43 lr 0.000329 time 2.4542 (2.2111) loss 3.8269 (3.4224) grad_norm 1.5998 (1.7900) [2022-01-23 13:18:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][970/1251] eta 0:10:20 lr 0.000329 time 2.2493 (2.2098) loss 3.6733 (3.4219) grad_norm 1.7730 (1.7903) [2022-01-23 13:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][980/1251] eta 0:09:58 lr 0.000329 time 2.0075 (2.2099) loss 3.2297 (3.4206) grad_norm 1.5264 (1.7907) [2022-01-23 13:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][990/1251] eta 0:09:36 lr 
0.000329 time 2.1505 (2.2104) loss 3.5274 (3.4222) grad_norm 1.5754 (1.7908) [2022-01-23 13:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1000/1251] eta 0:09:15 lr 0.000329 time 1.8381 (2.2119) loss 2.4893 (3.4198) grad_norm 1.6637 (1.7899) [2022-01-23 13:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1010/1251] eta 0:08:53 lr 0.000329 time 2.1636 (2.2134) loss 3.4178 (3.4185) grad_norm 1.5341 (1.7890) [2022-01-23 13:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1020/1251] eta 0:08:31 lr 0.000329 time 2.0908 (2.2123) loss 3.2494 (3.4171) grad_norm 1.7093 (1.7886) [2022-01-23 13:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1030/1251] eta 0:08:08 lr 0.000328 time 2.2555 (2.2109) loss 3.2682 (3.4177) grad_norm 1.6632 (1.7885) [2022-01-23 13:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1040/1251] eta 0:07:46 lr 0.000328 time 1.9375 (2.2091) loss 2.6853 (3.4178) grad_norm 1.8825 (1.7879) [2022-01-23 13:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1050/1251] eta 0:07:23 lr 0.000328 time 2.2202 (2.2087) loss 3.4776 (3.4176) grad_norm 1.9514 (1.7878) [2022-01-23 13:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1060/1251] eta 0:07:01 lr 0.000328 time 1.8521 (2.2081) loss 3.5184 (3.4197) grad_norm 1.9230 (1.7879) [2022-01-23 13:22:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1070/1251] eta 0:06:39 lr 0.000328 time 2.3274 (2.2084) loss 3.6767 (3.4182) grad_norm 1.7669 (1.7877) [2022-01-23 13:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1080/1251] eta 0:06:17 lr 0.000328 time 2.5970 (2.2099) loss 3.4971 (3.4191) grad_norm 1.7313 (1.7874) [2022-01-23 13:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1090/1251] eta 0:05:55 lr 0.000328 time 2.1522 (2.2107) loss 2.4603 (3.4192) grad_norm 1.7766 (1.7869) [2022-01-23 
13:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1100/1251] eta 0:05:33 lr 0.000328 time 1.7235 (2.2093) loss 3.1532 (3.4186) grad_norm 1.6134 (1.7865) [2022-01-23 13:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1110/1251] eta 0:05:11 lr 0.000328 time 1.9460 (2.2082) loss 2.9959 (3.4168) grad_norm 1.6373 (1.7874) [2022-01-23 13:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1120/1251] eta 0:04:49 lr 0.000328 time 2.5199 (2.2070) loss 3.6978 (3.4167) grad_norm 1.6269 (1.7872) [2022-01-23 13:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1130/1251] eta 0:04:27 lr 0.000328 time 2.5627 (2.2068) loss 3.8245 (3.4167) grad_norm 1.5804 (1.7874) [2022-01-23 13:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1140/1251] eta 0:04:04 lr 0.000328 time 2.2275 (2.2063) loss 3.5815 (3.4180) grad_norm 1.5543 (1.7876) [2022-01-23 13:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1150/1251] eta 0:03:42 lr 0.000328 time 1.8541 (2.2064) loss 4.1230 (3.4172) grad_norm 2.0003 (1.7870) [2022-01-23 13:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1160/1251] eta 0:03:20 lr 0.000328 time 2.0205 (2.2068) loss 3.7707 (3.4189) grad_norm 1.6257 (1.7869) [2022-01-23 13:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1170/1251] eta 0:02:58 lr 0.000328 time 2.1975 (2.2062) loss 3.9702 (3.4195) grad_norm 1.8694 (1.7867) [2022-01-23 13:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1180/1251] eta 0:02:36 lr 0.000328 time 3.0414 (2.2072) loss 2.6126 (3.4214) grad_norm 1.6907 (1.7858) [2022-01-23 13:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1190/1251] eta 0:02:14 lr 0.000328 time 2.4005 (2.2078) loss 3.2675 (3.4227) grad_norm 1.6266 (1.7853) [2022-01-23 13:27:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1200/1251] 
eta 0:01:52 lr 0.000328 time 2.5148 (2.2089) loss 3.3400 (3.4244) grad_norm 1.6801 (1.7844) [2022-01-23 13:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1210/1251] eta 0:01:30 lr 0.000328 time 2.2155 (2.2085) loss 3.7317 (3.4249) grad_norm 1.6535 (1.7838) [2022-01-23 13:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1220/1251] eta 0:01:08 lr 0.000328 time 3.3731 (2.2089) loss 3.9513 (3.4253) grad_norm 1.5744 (1.7836) [2022-01-23 13:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1230/1251] eta 0:00:46 lr 0.000328 time 1.6214 (2.2072) loss 2.6348 (3.4240) grad_norm 1.6003 (1.7829) [2022-01-23 13:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1240/1251] eta 0:00:24 lr 0.000328 time 1.2519 (2.2064) loss 2.7469 (3.4235) grad_norm 1.6768 (1.7832) [2022-01-23 13:29:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [184/300][1250/1251] eta 0:00:02 lr 0.000328 time 1.1781 (2.2015) loss 3.6657 (3.4255) grad_norm 1.6277 (1.7829) [2022-01-23 13:29:06 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 184 training takes 0:45:54 [2022-01-23 13:29:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.887 (18.887) Loss 0.9228 (0.9228) Acc@1 79.395 (79.395) Acc@5 94.434 (94.434) [2022-01-23 13:29:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.893 (3.331) Loss 0.9974 (0.9507) Acc@1 76.660 (77.894) Acc@5 94.141 (94.025) [2022-01-23 13:29:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.532 (2.481) Loss 1.0183 (0.9555) Acc@1 76.074 (77.832) Acc@5 94.336 (93.996) [2022-01-23 13:30:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.642 (2.259) Loss 0.9438 (0.9602) Acc@1 78.711 (77.646) Acc@5 94.043 (93.986) [2022-01-23 13:30:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.535 (2.156) Loss 0.9110 (0.9573) Acc@1 76.953 (77.620) Acc@5 95.508 (94.014) 
[2022-01-23 13:30:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.562 Acc@5 93.998 [2022-01-23 13:30:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.6% [2022-01-23 13:30:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.56% [2022-01-23 13:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][0/1251] eta 7:17:49 lr 0.000328 time 20.9988 (20.9988) loss 3.3484 (3.3484) grad_norm 1.6913 (1.6913) [2022-01-23 13:31:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][10/1251] eta 1:25:08 lr 0.000328 time 2.2601 (4.1168) loss 3.1224 (3.3656) grad_norm 1.6351 (1.7762) [2022-01-23 13:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][20/1251] eta 1:07:08 lr 0.000328 time 1.9309 (3.2726) loss 4.2264 (3.3070) grad_norm 1.7625 (1.7560) [2022-01-23 13:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][30/1251] eta 0:59:30 lr 0.000327 time 1.6791 (2.9239) loss 3.8545 (3.3324) grad_norm 1.7595 (1.7638) [2022-01-23 13:32:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][40/1251] eta 0:55:32 lr 0.000327 time 2.8381 (2.7523) loss 2.6540 (3.3438) grad_norm 1.6332 (1.7943) [2022-01-23 13:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][50/1251] eta 0:52:48 lr 0.000327 time 1.5884 (2.6380) loss 2.7342 (3.3060) grad_norm 1.9312 (1.8035) [2022-01-23 13:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][60/1251] eta 0:50:52 lr 0.000327 time 1.8978 (2.5629) loss 3.8436 (3.3040) grad_norm 1.6942 (1.7956) [2022-01-23 13:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][70/1251] eta 0:49:32 lr 0.000327 time 1.8473 (2.5168) loss 2.6960 (3.3273) grad_norm 1.7166 (1.7977) [2022-01-23 13:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][80/1251] eta 0:48:35 lr 0.000327 time 2.6842 (2.4894) loss 3.5279 (3.3536) 
grad_norm 1.8056 (1.7933) [2022-01-23 13:34:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][90/1251] eta 0:47:22 lr 0.000327 time 1.9129 (2.4484) loss 4.0900 (3.3419) grad_norm 1.5924 (1.7868) [2022-01-23 13:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][100/1251] eta 0:46:22 lr 0.000327 time 1.8989 (2.4178) loss 2.3900 (3.3542) grad_norm 1.7857 (1.7821) [2022-01-23 13:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][110/1251] eta 0:45:19 lr 0.000327 time 1.9890 (2.3835) loss 3.9746 (3.3692) grad_norm 1.7675 (1.7825) [2022-01-23 13:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][120/1251] eta 0:44:39 lr 0.000327 time 1.8668 (2.3689) loss 2.9633 (3.3733) grad_norm 1.9797 (1.7934) [2022-01-23 13:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][130/1251] eta 0:43:50 lr 0.000327 time 1.7744 (2.3466) loss 3.2324 (3.3880) grad_norm 1.8589 (1.7947) [2022-01-23 13:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][140/1251] eta 0:43:22 lr 0.000327 time 2.4388 (2.3420) loss 2.5450 (3.3723) grad_norm 1.7451 (1.7890) [2022-01-23 13:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][150/1251] eta 0:42:44 lr 0.000327 time 1.8505 (2.3296) loss 3.6369 (3.3875) grad_norm 1.6875 (1.7929) [2022-01-23 13:36:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][160/1251] eta 0:42:10 lr 0.000327 time 1.8273 (2.3199) loss 3.6777 (3.3747) grad_norm 1.5218 (1.7944) [2022-01-23 13:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][170/1251] eta 0:41:50 lr 0.000327 time 2.4465 (2.3223) loss 3.7759 (3.3835) grad_norm 1.6380 (1.7977) [2022-01-23 13:37:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][180/1251] eta 0:41:22 lr 0.000327 time 1.9749 (2.3182) loss 2.4635 (3.3904) grad_norm 2.0516 (1.7931) [2022-01-23 13:38:01 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [185/300][190/1251] eta 0:40:42 lr 0.000327 time 1.6653 (2.3023) loss 3.0093 (3.3816) grad_norm 1.6213 (1.7911) [2022-01-23 13:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][200/1251] eta 0:40:10 lr 0.000327 time 2.1673 (2.2932) loss 3.3200 (3.3616) grad_norm 1.8019 (1.7892) [2022-01-23 13:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][210/1251] eta 0:39:40 lr 0.000327 time 2.0560 (2.2871) loss 3.1663 (3.3713) grad_norm 1.6057 (1.7845) [2022-01-23 13:39:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][220/1251] eta 0:39:11 lr 0.000327 time 1.5852 (2.2806) loss 2.5845 (3.3676) grad_norm 1.8836 (1.7831) [2022-01-23 13:39:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][230/1251] eta 0:38:40 lr 0.000327 time 1.8506 (2.2731) loss 3.5205 (3.3744) grad_norm 2.3845 (1.7838) [2022-01-23 13:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][240/1251] eta 0:38:15 lr 0.000327 time 1.9761 (2.2703) loss 2.8441 (3.3772) grad_norm 2.1447 (1.7917) [2022-01-23 13:40:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][250/1251] eta 0:37:51 lr 0.000327 time 3.0730 (2.2694) loss 4.0665 (3.3895) grad_norm 1.7437 (1.7873) [2022-01-23 13:40:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][260/1251] eta 0:37:24 lr 0.000327 time 1.5972 (2.2653) loss 2.5155 (3.3872) grad_norm 1.5425 (1.7893) [2022-01-23 13:40:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][270/1251] eta 0:37:04 lr 0.000327 time 1.4918 (2.2676) loss 3.4260 (3.3844) grad_norm 1.7693 (1.7905) [2022-01-23 13:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][280/1251] eta 0:36:45 lr 0.000327 time 1.8823 (2.2718) loss 3.3391 (3.3787) grad_norm 1.7504 (1.7900) [2022-01-23 13:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][290/1251] eta 0:36:26 lr 0.000326 time 3.0963 (2.2757) loss 3.4362 (3.3817) 
grad_norm 1.6588 (1.7886) [2022-01-23 13:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][300/1251] eta 0:35:55 lr 0.000326 time 1.6179 (2.2661) loss 3.2535 (3.3805) grad_norm 1.8431 (1.7886) [2022-01-23 13:42:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][310/1251] eta 0:35:24 lr 0.000326 time 1.9958 (2.2581) loss 3.0568 (3.3793) grad_norm 1.9973 (1.7912) [2022-01-23 13:42:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][320/1251] eta 0:34:53 lr 0.000326 time 1.8869 (2.2490) loss 4.1616 (3.3785) grad_norm 1.8088 (1.7915) [2022-01-23 13:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][330/1251] eta 0:34:29 lr 0.000326 time 2.5766 (2.2471) loss 4.0611 (3.3789) grad_norm 1.9914 (1.7929) [2022-01-23 13:43:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][340/1251] eta 0:34:05 lr 0.000326 time 1.8565 (2.2450) loss 3.3134 (3.3821) grad_norm 1.8105 (1.7894) [2022-01-23 13:43:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][350/1251] eta 0:33:43 lr 0.000326 time 2.5457 (2.2463) loss 3.6201 (3.3770) grad_norm 1.8743 (1.7883) [2022-01-23 13:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][360/1251] eta 0:33:25 lr 0.000326 time 1.7783 (2.2506) loss 3.3773 (3.3745) grad_norm 1.8605 (1.7889) [2022-01-23 13:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][370/1251] eta 0:33:04 lr 0.000326 time 1.9085 (2.2520) loss 3.1181 (3.3773) grad_norm 1.9410 (1.7896) [2022-01-23 13:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][380/1251] eta 0:32:37 lr 0.000326 time 1.8561 (2.2475) loss 4.0611 (3.3825) grad_norm 1.6528 (1.7882) [2022-01-23 13:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][390/1251] eta 0:32:09 lr 0.000326 time 1.8338 (2.2415) loss 3.5520 (3.3801) grad_norm 1.6645 (1.7878) [2022-01-23 13:45:38 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [185/300][400/1251] eta 0:31:43 lr 0.000326 time 2.3039 (2.2369) loss 3.6143 (3.3856) grad_norm 1.9499 (1.7899) [2022-01-23 13:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][410/1251] eta 0:31:15 lr 0.000326 time 1.6400 (2.2296) loss 2.6479 (3.3843) grad_norm 1.9513 (1.7883) [2022-01-23 13:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][420/1251] eta 0:30:51 lr 0.000326 time 1.9355 (2.2275) loss 2.6873 (3.3795) grad_norm 1.8927 (1.7867) [2022-01-23 13:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][430/1251] eta 0:30:27 lr 0.000326 time 1.8875 (2.2265) loss 3.9039 (3.3751) grad_norm 1.9238 (1.7883) [2022-01-23 13:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][440/1251] eta 0:30:06 lr 0.000326 time 2.7450 (2.2274) loss 3.7412 (3.3763) grad_norm 1.7728 (1.7904) [2022-01-23 13:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][450/1251] eta 0:29:45 lr 0.000326 time 3.1064 (2.2287) loss 3.2824 (3.3810) grad_norm 1.7128 (1.7882) [2022-01-23 13:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][460/1251] eta 0:29:20 lr 0.000326 time 1.4629 (2.2253) loss 3.9763 (3.3799) grad_norm 1.6620 (1.7866) [2022-01-23 13:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][470/1251] eta 0:28:58 lr 0.000326 time 2.1775 (2.2254) loss 3.5929 (3.3847) grad_norm 2.1134 (1.7848) [2022-01-23 13:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][480/1251] eta 0:28:34 lr 0.000326 time 2.8499 (2.2233) loss 3.9606 (3.3847) grad_norm 1.6865 (1.7845) [2022-01-23 13:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][490/1251] eta 0:28:13 lr 0.000326 time 2.1424 (2.2255) loss 3.6676 (3.3833) grad_norm 1.8393 (1.7826) [2022-01-23 13:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][500/1251] eta 0:27:51 lr 0.000326 time 1.8958 (2.2259) loss 3.3374 (3.3832) 
grad_norm 1.4998 (1.7817) [2022-01-23 13:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][510/1251] eta 0:27:30 lr 0.000326 time 1.7874 (2.2271) loss 4.0985 (3.3833) grad_norm 1.7644 (1.7808) [2022-01-23 13:50:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][520/1251] eta 0:27:07 lr 0.000326 time 2.8118 (2.2259) loss 3.6243 (3.3790) grad_norm 1.7448 (1.7813) [2022-01-23 13:50:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][530/1251] eta 0:26:44 lr 0.000326 time 1.5702 (2.2252) loss 4.1722 (3.3811) grad_norm 1.7655 (1.7809) [2022-01-23 13:50:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][540/1251] eta 0:26:21 lr 0.000326 time 1.7044 (2.2236) loss 3.7138 (3.3829) grad_norm 1.5246 (1.7801) [2022-01-23 13:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][550/1251] eta 0:25:57 lr 0.000325 time 1.9603 (2.2221) loss 2.8745 (3.3838) grad_norm 1.7362 (1.7802) [2022-01-23 13:51:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][560/1251] eta 0:25:35 lr 0.000325 time 2.9677 (2.2227) loss 3.5592 (3.3847) grad_norm 1.5531 (1.7823) [2022-01-23 13:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][570/1251] eta 0:25:13 lr 0.000325 time 1.8942 (2.2232) loss 3.0678 (3.3856) grad_norm 1.8500 (1.7824) [2022-01-23 13:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][580/1251] eta 0:24:50 lr 0.000325 time 1.8842 (2.2210) loss 3.3053 (3.3852) grad_norm 1.7889 (1.7827) [2022-01-23 13:52:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][590/1251] eta 0:24:27 lr 0.000325 time 1.8137 (2.2205) loss 2.3455 (3.3867) grad_norm 2.2330 (1.7839) [2022-01-23 13:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][600/1251] eta 0:24:05 lr 0.000325 time 2.4225 (2.2197) loss 3.5784 (3.3919) grad_norm 2.1442 (1.7867) [2022-01-23 13:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [185/300][610/1251] eta 0:23:41 lr 0.000325 time 1.9143 (2.2179) loss 3.6328 (3.3908) grad_norm 1.6407 (1.7891) [2022-01-23 13:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][620/1251] eta 0:23:18 lr 0.000325 time 2.2100 (2.2166) loss 3.1238 (3.3886) grad_norm 1.8820 (1.7894) [2022-01-23 13:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][630/1251] eta 0:22:55 lr 0.000325 time 1.9466 (2.2151) loss 3.2157 (3.3875) grad_norm 1.8953 (1.7892) [2022-01-23 13:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][640/1251] eta 0:22:33 lr 0.000325 time 2.6063 (2.2155) loss 3.6360 (3.3892) grad_norm 1.6671 (1.7903) [2022-01-23 13:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][650/1251] eta 0:22:10 lr 0.000325 time 1.8438 (2.2139) loss 3.6370 (3.3920) grad_norm 1.6371 (1.7895) [2022-01-23 13:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][660/1251] eta 0:21:48 lr 0.000325 time 2.2687 (2.2137) loss 3.6324 (3.3931) grad_norm 1.6887 (1.7905) [2022-01-23 13:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][670/1251] eta 0:21:25 lr 0.000325 time 1.8438 (2.2123) loss 3.7605 (3.3951) grad_norm 1.5802 (1.7916) [2022-01-23 13:55:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][680/1251] eta 0:21:03 lr 0.000325 time 1.7951 (2.2128) loss 3.7401 (3.3968) grad_norm 1.5283 (1.7917) [2022-01-23 13:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][690/1251] eta 0:20:41 lr 0.000325 time 1.8493 (2.2123) loss 2.7873 (3.3993) grad_norm 1.7719 (1.7907) [2022-01-23 13:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][700/1251] eta 0:20:19 lr 0.000325 time 2.2527 (2.2124) loss 3.5082 (3.4027) grad_norm 1.5822 (1.7898) [2022-01-23 13:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][710/1251] eta 0:19:56 lr 0.000325 time 2.2718 (2.2116) loss 4.2545 (3.4050) 
grad_norm 1.6560 (1.7897) [2022-01-23 13:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][720/1251] eta 0:19:33 lr 0.000325 time 2.0704 (2.2099) loss 2.6782 (3.4028) grad_norm 1.7653 (1.7897) [2022-01-23 13:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][730/1251] eta 0:19:12 lr 0.000325 time 2.2037 (2.2114) loss 3.7287 (3.4030) grad_norm 1.6316 (1.7874) [2022-01-23 13:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][740/1251] eta 0:18:50 lr 0.000325 time 2.1741 (2.2120) loss 3.3217 (3.4026) grad_norm 1.6887 (1.7888) [2022-01-23 13:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][750/1251] eta 0:18:28 lr 0.000325 time 2.4640 (2.2118) loss 4.2489 (3.4014) grad_norm 2.0542 (1.7911) [2022-01-23 13:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][760/1251] eta 0:18:06 lr 0.000325 time 2.1541 (2.2122) loss 3.3328 (3.4024) grad_norm 2.0852 (1.7921) [2022-01-23 13:59:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][770/1251] eta 0:17:44 lr 0.000325 time 1.7189 (2.2126) loss 3.6176 (3.4042) grad_norm 1.7392 (1.7926) [2022-01-23 13:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][780/1251] eta 0:17:21 lr 0.000325 time 1.9970 (2.2112) loss 2.8616 (3.4052) grad_norm 1.7627 (1.7924) [2022-01-23 13:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][790/1251] eta 0:16:59 lr 0.000325 time 2.3690 (2.2110) loss 3.7733 (3.4062) grad_norm 1.8978 (1.7933) [2022-01-23 14:00:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][800/1251] eta 0:16:36 lr 0.000325 time 2.5830 (2.2106) loss 3.4443 (3.4080) grad_norm 1.5448 (1.7930) [2022-01-23 14:00:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][810/1251] eta 0:16:14 lr 0.000324 time 1.6062 (2.2088) loss 3.6607 (3.4055) grad_norm 1.8554 (1.7936) [2022-01-23 14:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [185/300][820/1251] eta 0:15:50 lr 0.000324 time 2.1789 (2.2064) loss 3.9805 (3.4081) grad_norm 1.9664 (1.7941) [2022-01-23 14:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][830/1251] eta 0:15:28 lr 0.000324 time 2.0644 (2.2049) loss 2.6788 (3.4076) grad_norm 1.8872 (1.7944) [2022-01-23 14:01:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][840/1251] eta 0:15:06 lr 0.000324 time 2.1004 (2.2046) loss 3.9009 (3.4092) grad_norm 1.6874 (1.7941) [2022-01-23 14:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][850/1251] eta 0:14:43 lr 0.000324 time 2.2342 (2.2041) loss 3.5608 (3.4106) grad_norm 1.5490 (1.7929) [2022-01-23 14:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][860/1251] eta 0:14:21 lr 0.000324 time 3.3297 (2.2031) loss 3.3132 (3.4095) grad_norm 1.7787 (1.7928) [2022-01-23 14:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][870/1251] eta 0:13:58 lr 0.000324 time 1.4251 (2.2016) loss 3.0793 (3.4103) grad_norm 1.5428 (1.7926) [2022-01-23 14:03:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][880/1251] eta 0:13:36 lr 0.000324 time 1.8916 (2.2015) loss 3.7430 (3.4070) grad_norm 1.7801 (1.7924) [2022-01-23 14:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][890/1251] eta 0:13:14 lr 0.000324 time 1.6281 (2.2009) loss 4.1882 (3.4059) grad_norm 1.5712 (1.7914) [2022-01-23 14:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][900/1251] eta 0:12:53 lr 0.000324 time 2.5994 (2.2031) loss 3.1697 (3.4053) grad_norm 1.6814 (1.7909) [2022-01-23 14:04:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][910/1251] eta 0:12:31 lr 0.000324 time 1.6484 (2.2027) loss 3.7575 (3.4065) grad_norm 1.7737 (1.7897) [2022-01-23 14:04:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][920/1251] eta 0:12:08 lr 0.000324 time 1.8333 (2.2023) loss 2.6389 (3.4031) 
grad_norm 1.6160 (1.7891) [2022-01-23 14:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][930/1251] eta 0:11:46 lr 0.000324 time 1.8484 (2.2016) loss 3.7802 (3.4061) grad_norm 1.7670 (1.7881) [2022-01-23 14:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][940/1251] eta 0:11:25 lr 0.000324 time 2.8793 (2.2028) loss 3.4261 (3.4054) grad_norm 2.1517 (1.7897) [2022-01-23 14:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][950/1251] eta 0:11:02 lr 0.000324 time 2.1869 (2.2026) loss 3.2748 (3.4046) grad_norm 1.8037 (1.7898) [2022-01-23 14:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][960/1251] eta 0:10:40 lr 0.000324 time 1.6747 (2.2011) loss 3.9063 (3.4050) grad_norm 1.7224 (1.7896) [2022-01-23 14:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][970/1251] eta 0:10:18 lr 0.000324 time 1.8769 (2.2005) loss 3.9463 (3.4041) grad_norm 1.7150 (1.7890) [2022-01-23 14:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][980/1251] eta 0:09:56 lr 0.000324 time 2.7978 (2.2014) loss 3.9510 (3.4063) grad_norm 1.5908 (1.7893) [2022-01-23 14:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][990/1251] eta 0:09:35 lr 0.000324 time 2.1943 (2.2035) loss 3.7162 (3.4094) grad_norm 1.7190 (1.7894) [2022-01-23 14:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1000/1251] eta 0:09:13 lr 0.000324 time 2.3557 (2.2048) loss 3.4163 (3.4087) grad_norm 1.7569 (1.7897) [2022-01-23 14:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1010/1251] eta 0:08:51 lr 0.000324 time 2.1211 (2.2050) loss 4.0436 (3.4104) grad_norm 1.7486 (1.7896) [2022-01-23 14:08:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1020/1251] eta 0:08:28 lr 0.000324 time 2.2962 (2.2032) loss 3.2704 (3.4091) grad_norm 1.8213 (1.7888) [2022-01-23 14:08:29 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [185/300][1030/1251] eta 0:08:06 lr 0.000324 time 1.6380 (2.1998) loss 3.7732 (3.4054) grad_norm 2.1268 (1.7883) [2022-01-23 14:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1040/1251] eta 0:07:43 lr 0.000324 time 2.3048 (2.1982) loss 3.0589 (3.4067) grad_norm 2.1489 (1.7887) [2022-01-23 14:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1050/1251] eta 0:07:21 lr 0.000324 time 1.8341 (2.1977) loss 3.6988 (3.4093) grad_norm 2.1786 (1.7891) [2022-01-23 14:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1060/1251] eta 0:06:59 lr 0.000324 time 2.3439 (2.1990) loss 4.0762 (3.4104) grad_norm 1.7109 (1.7884) [2022-01-23 14:09:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1070/1251] eta 0:06:38 lr 0.000323 time 1.8990 (2.1997) loss 3.6128 (3.4095) grad_norm 1.9125 (1.7880) [2022-01-23 14:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1080/1251] eta 0:06:16 lr 0.000323 time 2.5424 (2.2013) loss 2.8845 (3.4103) grad_norm 2.1912 (1.7883) [2022-01-23 14:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1090/1251] eta 0:05:54 lr 0.000323 time 1.9866 (2.2014) loss 4.1019 (3.4090) grad_norm 1.6973 (1.7878) [2022-01-23 14:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1100/1251] eta 0:05:32 lr 0.000323 time 1.9024 (2.2018) loss 2.3036 (3.4078) grad_norm 1.6144 (1.7878) [2022-01-23 14:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1110/1251] eta 0:05:10 lr 0.000323 time 3.1823 (2.2022) loss 4.2723 (3.4070) grad_norm 2.1778 (1.7884) [2022-01-23 14:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1120/1251] eta 0:04:48 lr 0.000323 time 1.8629 (2.2015) loss 3.8387 (3.4094) grad_norm 1.7165 (1.7884) [2022-01-23 14:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1130/1251] eta 0:04:26 lr 0.000323 time 1.8722 (2.1997) loss 3.9838 
(3.4099) grad_norm 1.7103 (1.7886) [2022-01-23 14:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1140/1251] eta 0:04:04 lr 0.000323 time 2.2390 (2.1987) loss 3.8121 (3.4087) grad_norm 1.9679 (1.7885) [2022-01-23 14:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1150/1251] eta 0:03:41 lr 0.000323 time 2.1868 (2.1978) loss 2.8096 (3.4062) grad_norm 1.6494 (1.7878) [2022-01-23 14:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1160/1251] eta 0:03:19 lr 0.000323 time 2.5358 (2.1967) loss 4.0140 (3.4074) grad_norm 1.6850 (1.7875) [2022-01-23 14:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1170/1251] eta 0:02:57 lr 0.000323 time 2.6285 (2.1974) loss 3.6772 (3.4094) grad_norm 1.7197 (1.7875) [2022-01-23 14:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1180/1251] eta 0:02:35 lr 0.000323 time 1.8927 (2.1971) loss 3.3961 (3.4104) grad_norm 1.5995 (1.7870) [2022-01-23 14:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1190/1251] eta 0:02:14 lr 0.000323 time 2.2501 (2.1973) loss 2.8611 (3.4094) grad_norm 1.6184 (1.7868) [2022-01-23 14:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1200/1251] eta 0:01:52 lr 0.000323 time 2.7665 (2.1980) loss 3.7019 (3.4090) grad_norm 1.5110 (1.7858) [2022-01-23 14:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1210/1251] eta 0:01:30 lr 0.000323 time 2.6454 (2.1997) loss 3.8509 (3.4121) grad_norm 1.7958 (1.7850) [2022-01-23 14:15:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1220/1251] eta 0:01:08 lr 0.000323 time 2.7986 (2.2001) loss 2.9823 (3.4131) grad_norm 1.6784 (1.7849) [2022-01-23 14:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1230/1251] eta 0:00:46 lr 0.000323 time 2.3277 (2.2001) loss 3.5078 (3.4138) grad_norm 1.5254 (1.7853) [2022-01-23 14:16:08 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [185/300][1240/1251] eta 0:00:24 lr 0.000323 time 1.2430 (2.1976) loss 3.2180 (3.4127) grad_norm 1.6532 (1.7854) [2022-01-23 14:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [185/300][1250/1251] eta 0:00:02 lr 0.000323 time 1.2336 (2.1913) loss 3.4395 (3.4115) grad_norm 1.7094 (1.7852) [2022-01-23 14:16:23 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 185 training takes 0:45:41 [2022-01-23 14:16:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.027 (19.027) Loss 0.9558 (0.9558) Acc@1 76.465 (76.465) Acc@5 93.555 (93.555) [2022-01-23 14:17:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.884 (3.566) Loss 0.9275 (0.9609) Acc@1 78.125 (77.024) Acc@5 94.336 (93.661) [2022-01-23 14:17:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.951 (2.717) Loss 0.9120 (0.9465) Acc@1 78.711 (77.348) Acc@5 94.727 (93.848) [2022-01-23 14:17:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.645 (2.340) Loss 0.8797 (0.9354) Acc@1 80.176 (77.520) Acc@5 95.020 (93.993) [2022-01-23 14:17:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.943 (2.214) Loss 0.9533 (0.9392) Acc@1 76.270 (77.496) Acc@5 94.727 (93.995) [2022-01-23 14:18:01 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.504 Acc@5 94.028 [2022-01-23 14:18:01 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.5% [2022-01-23 14:18:01 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.56% [2022-01-23 14:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][0/1251] eta 7:21:23 lr 0.000323 time 21.1701 (21.1701) loss 2.5238 (2.5238) grad_norm 1.5406 (1.5406) [2022-01-23 14:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][10/1251] eta 1:21:54 lr 0.000323 time 1.8690 (3.9600) loss 4.0860 (3.4880) grad_norm 1.5569 (1.8077) [2022-01-23 14:19:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][20/1251] eta 1:04:42 lr 0.000323 time 2.1515 (3.1535) loss 3.9596 (3.4621) grad_norm 1.8184 (1.7715) [2022-01-23 14:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][30/1251] eta 0:57:11 lr 0.000323 time 1.8425 (2.8101) loss 3.6670 (3.3609) grad_norm 2.0103 (1.7729) [2022-01-23 14:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][40/1251] eta 0:54:14 lr 0.000323 time 3.6398 (2.6873) loss 2.8658 (3.3385) grad_norm 1.6090 (1.7675) [2022-01-23 14:20:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][50/1251] eta 0:52:50 lr 0.000323 time 3.0173 (2.6402) loss 2.4219 (3.3103) grad_norm 1.4853 (1.7621) [2022-01-23 14:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][60/1251] eta 0:51:12 lr 0.000323 time 2.1785 (2.5796) loss 3.5420 (3.3520) grad_norm 2.1175 (1.7570) [2022-01-23 14:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][70/1251] eta 0:49:47 lr 0.000323 time 1.8109 (2.5299) loss 3.9510 (3.4022) grad_norm 1.8129 (1.7891) [2022-01-23 14:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][80/1251] eta 0:48:45 lr 0.000322 time 2.3419 (2.4987) loss 2.4266 (3.3876) grad_norm 1.7720 (1.7973) [2022-01-23 14:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][90/1251] eta 0:47:39 lr 0.000322 time 1.8834 (2.4629) loss 4.0756 (3.3958) grad_norm 1.9165 (1.7935) [2022-01-23 14:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][100/1251] eta 0:46:21 lr 0.000322 time 1.8717 (2.4163) loss 2.3371 (3.4008) grad_norm 1.7049 (1.7895) [2022-01-23 14:22:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][110/1251] eta 0:45:23 lr 0.000322 time 2.0405 (2.3873) loss 2.7816 (3.4129) grad_norm 1.7146 (1.8021) [2022-01-23 14:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][120/1251] eta 0:44:59 lr 0.000322 time 
2.5200 (2.3865) loss 4.1608 (3.4189) grad_norm 2.7496 (1.8144) [2022-01-23 14:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][130/1251] eta 0:44:25 lr 0.000322 time 1.5542 (2.3780) loss 3.3584 (3.4193) grad_norm 1.6251 (1.8122) [2022-01-23 14:23:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][140/1251] eta 0:43:46 lr 0.000322 time 1.8685 (2.3641) loss 2.5464 (3.4075) grad_norm 1.8375 (1.8159) [2022-01-23 14:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][150/1251] eta 0:43:07 lr 0.000322 time 2.1364 (2.3503) loss 4.0018 (3.4148) grad_norm 1.8776 (1.8221) [2022-01-23 14:24:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][160/1251] eta 0:42:24 lr 0.000322 time 1.7406 (2.3327) loss 3.4156 (3.4136) grad_norm 1.8449 (1.8230) [2022-01-23 14:24:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][170/1251] eta 0:41:50 lr 0.000322 time 1.7323 (2.3228) loss 3.7230 (3.4031) grad_norm 1.6408 (1.8186) [2022-01-23 14:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][180/1251] eta 0:41:19 lr 0.000322 time 1.9684 (2.3154) loss 3.0416 (3.4084) grad_norm 1.7415 (1.8125) [2022-01-23 14:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][190/1251] eta 0:40:50 lr 0.000322 time 1.6111 (2.3098) loss 3.9639 (3.3988) grad_norm 1.7656 (1.8141) [2022-01-23 14:25:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][200/1251] eta 0:40:17 lr 0.000322 time 1.7458 (2.3006) loss 3.1135 (3.3873) grad_norm 1.6723 (1.8143) [2022-01-23 14:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][210/1251] eta 0:39:42 lr 0.000322 time 1.5653 (2.2890) loss 3.5146 (3.3856) grad_norm 1.7205 (1.8174) [2022-01-23 14:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][220/1251] eta 0:39:16 lr 0.000322 time 2.5182 (2.2860) loss 2.6397 (3.3806) grad_norm 1.9900 (1.8195) [2022-01-23 14:26:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][230/1251] eta 0:38:59 lr 0.000322 time 2.5590 (2.2910) loss 3.3223 (3.3870) grad_norm 1.8508 (1.8210) [2022-01-23 14:27:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][240/1251] eta 0:38:34 lr 0.000322 time 2.2067 (2.2896) loss 3.3992 (3.3788) grad_norm 1.5266 (1.8219) [2022-01-23 14:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][250/1251] eta 0:37:59 lr 0.000322 time 1.5918 (2.2777) loss 3.8521 (3.3925) grad_norm 1.9682 (1.8231) [2022-01-23 14:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][260/1251] eta 0:37:27 lr 0.000322 time 1.7669 (2.2675) loss 3.6259 (3.3979) grad_norm 1.7715 (1.8215) [2022-01-23 14:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][270/1251] eta 0:36:54 lr 0.000322 time 1.9397 (2.2576) loss 3.0650 (3.3844) grad_norm 1.6833 (1.8191) [2022-01-23 14:28:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][280/1251] eta 0:36:29 lr 0.000322 time 2.6449 (2.2551) loss 3.9610 (3.3875) grad_norm 1.6613 (1.8157) [2022-01-23 14:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][290/1251] eta 0:36:11 lr 0.000322 time 2.6868 (2.2593) loss 3.5969 (3.3861) grad_norm 1.9880 (1.8164) [2022-01-23 14:29:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][300/1251] eta 0:35:43 lr 0.000322 time 1.9745 (2.2539) loss 3.6549 (3.3854) grad_norm 1.5374 (1.8163) [2022-01-23 14:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][310/1251] eta 0:35:18 lr 0.000322 time 2.3696 (2.2513) loss 3.7216 (3.3901) grad_norm 1.7909 (1.8147) [2022-01-23 14:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][320/1251] eta 0:34:58 lr 0.000322 time 2.9260 (2.2543) loss 3.6344 (3.3977) grad_norm 1.6097 (1.8114) [2022-01-23 14:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][330/1251] eta 0:34:37 lr 
0.000322 time 2.4424 (2.2559) loss 4.0438 (3.4018) grad_norm 1.6374 (1.8097) [2022-01-23 14:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][340/1251] eta 0:34:13 lr 0.000321 time 2.3084 (2.2540) loss 3.4997 (3.3977) grad_norm 1.8066 (1.8090) [2022-01-23 14:31:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][350/1251] eta 0:33:48 lr 0.000321 time 2.2034 (2.2511) loss 3.4167 (3.4005) grad_norm 1.6757 (1.8077) [2022-01-23 14:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][360/1251] eta 0:33:17 lr 0.000321 time 1.9371 (2.2421) loss 3.2069 (3.4020) grad_norm 1.7202 (1.8038) [2022-01-23 14:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][370/1251] eta 0:32:52 lr 0.000321 time 2.5489 (2.2392) loss 3.4520 (3.4024) grad_norm 1.9174 (1.8041) [2022-01-23 14:32:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][380/1251] eta 0:32:27 lr 0.000321 time 1.9850 (2.2360) loss 3.4470 (3.4073) grad_norm 1.6539 (1.8049) [2022-01-23 14:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][390/1251] eta 0:32:06 lr 0.000321 time 2.1301 (2.2371) loss 3.5498 (3.4069) grad_norm 3.6472 (1.8180) [2022-01-23 14:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][400/1251] eta 0:31:43 lr 0.000321 time 2.4405 (2.2363) loss 3.6896 (3.4002) grad_norm 1.8699 (1.8226) [2022-01-23 14:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][410/1251] eta 0:31:18 lr 0.000321 time 1.9892 (2.2337) loss 3.3029 (3.3967) grad_norm 1.8325 (1.8234) [2022-01-23 14:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][420/1251] eta 0:30:54 lr 0.000321 time 1.9182 (2.2311) loss 3.6635 (3.3969) grad_norm 1.8297 (1.8231) [2022-01-23 14:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][430/1251] eta 0:30:27 lr 0.000321 time 1.6635 (2.2254) loss 4.0492 (3.3961) grad_norm 2.0402 (1.8221) [2022-01-23 14:34:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][440/1251] eta 0:30:02 lr 0.000321 time 1.6297 (2.2222) loss 3.0189 (3.3967) grad_norm 1.7912 (1.8219) [2022-01-23 14:34:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][450/1251] eta 0:29:41 lr 0.000321 time 2.6134 (2.2244) loss 3.8758 (3.3993) grad_norm 1.9420 (1.8192) [2022-01-23 14:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][460/1251] eta 0:29:19 lr 0.000321 time 2.0231 (2.2249) loss 3.5337 (3.3971) grad_norm 1.8206 (1.8186) [2022-01-23 14:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][470/1251] eta 0:28:56 lr 0.000321 time 2.2933 (2.2240) loss 3.4830 (3.3933) grad_norm 1.6713 (1.8171) [2022-01-23 14:35:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][480/1251] eta 0:28:35 lr 0.000321 time 1.5762 (2.2255) loss 3.8822 (3.3935) grad_norm 1.6709 (1.8153) [2022-01-23 14:36:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][490/1251] eta 0:28:14 lr 0.000321 time 2.9041 (2.2272) loss 3.5866 (3.3897) grad_norm 1.7636 (1.8167) [2022-01-23 14:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][500/1251] eta 0:27:50 lr 0.000321 time 1.8561 (2.2240) loss 2.8375 (3.3860) grad_norm 1.5768 (1.8149) [2022-01-23 14:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][510/1251] eta 0:27:26 lr 0.000321 time 2.6873 (2.2222) loss 3.2772 (3.3889) grad_norm 1.9691 (1.8127) [2022-01-23 14:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][520/1251] eta 0:27:03 lr 0.000321 time 1.8501 (2.2214) loss 2.8148 (3.3821) grad_norm 1.6915 (1.8115) [2022-01-23 14:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][530/1251] eta 0:26:40 lr 0.000321 time 1.9099 (2.2192) loss 3.8283 (3.3881) grad_norm 1.8875 (1.8144) [2022-01-23 14:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][540/1251] eta 0:26:15 lr 
0.000321 time 2.4792 (2.2165) loss 2.8280 (3.3861) grad_norm 1.5403 (1.8140) [2022-01-23 14:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][550/1251] eta 0:25:53 lr 0.000321 time 1.9255 (2.2157) loss 2.8270 (3.3881) grad_norm 1.6466 (1.8121) [2022-01-23 14:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][560/1251] eta 0:25:34 lr 0.000321 time 2.3166 (2.2211) loss 3.6643 (3.3801) grad_norm 1.7955 (1.8115) [2022-01-23 14:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][570/1251] eta 0:25:12 lr 0.000321 time 2.1911 (2.2216) loss 3.8492 (3.3820) grad_norm 2.0166 (1.8132) [2022-01-23 14:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][580/1251] eta 0:24:51 lr 0.000321 time 2.6890 (2.2225) loss 3.5901 (3.3844) grad_norm 1.9789 (1.8136) [2022-01-23 14:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][590/1251] eta 0:24:28 lr 0.000321 time 2.2046 (2.2216) loss 3.7544 (3.3822) grad_norm 1.8263 (1.8138) [2022-01-23 14:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][600/1251] eta 0:24:05 lr 0.000320 time 2.2043 (2.2203) loss 3.7622 (3.3855) grad_norm 1.4357 (1.8118) [2022-01-23 14:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][610/1251] eta 0:23:41 lr 0.000320 time 1.6818 (2.2174) loss 3.7280 (3.3804) grad_norm 1.9298 (1.8110) [2022-01-23 14:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][620/1251] eta 0:23:18 lr 0.000320 time 2.3577 (2.2165) loss 3.8526 (3.3790) grad_norm 1.7533 (1.8108) [2022-01-23 14:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][630/1251] eta 0:22:55 lr 0.000320 time 1.8904 (2.2143) loss 3.5206 (3.3777) grad_norm 1.6894 (1.8109) [2022-01-23 14:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][640/1251] eta 0:22:33 lr 0.000320 time 2.8038 (2.2159) loss 3.8882 (3.3767) grad_norm 1.7405 (1.8099) [2022-01-23 14:42:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][650/1251] eta 0:22:10 lr 0.000320 time 1.7620 (2.2135) loss 2.3236 (3.3740) grad_norm 1.7625 (1.8092) [2022-01-23 14:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][660/1251] eta 0:21:47 lr 0.000320 time 2.2089 (2.2130) loss 3.5687 (3.3743) grad_norm 1.6896 (1.8085) [2022-01-23 14:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][670/1251] eta 0:21:26 lr 0.000320 time 2.4031 (2.2137) loss 3.7611 (3.3762) grad_norm 1.8329 (1.8071) [2022-01-23 14:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][680/1251] eta 0:21:04 lr 0.000320 time 3.0462 (2.2152) loss 3.9917 (3.3794) grad_norm 1.7001 (1.8062) [2022-01-23 14:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][690/1251] eta 0:20:42 lr 0.000320 time 1.4936 (2.2148) loss 3.3735 (3.3829) grad_norm 1.9760 (1.8057) [2022-01-23 14:43:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][700/1251] eta 0:20:18 lr 0.000320 time 1.8869 (2.2115) loss 3.8443 (3.3844) grad_norm 2.1784 (1.8075) [2022-01-23 14:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][710/1251] eta 0:19:57 lr 0.000320 time 2.4806 (2.2136) loss 3.8402 (3.3862) grad_norm 2.0177 (1.8090) [2022-01-23 14:44:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][720/1251] eta 0:19:35 lr 0.000320 time 2.1832 (2.2129) loss 3.4267 (3.3853) grad_norm 1.7949 (1.8080) [2022-01-23 14:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][730/1251] eta 0:19:12 lr 0.000320 time 1.8614 (2.2114) loss 3.8296 (3.3887) grad_norm 1.8055 (1.8071) [2022-01-23 14:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][740/1251] eta 0:18:49 lr 0.000320 time 2.5650 (2.2106) loss 4.0545 (3.3901) grad_norm 1.6463 (1.8063) [2022-01-23 14:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][750/1251] eta 0:18:27 lr 
0.000320 time 2.1662 (2.2096) loss 3.6135 (3.3889) grad_norm 1.6157 (1.8061) [2022-01-23 14:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][760/1251] eta 0:18:05 lr 0.000320 time 3.0905 (2.2112) loss 3.2681 (3.3906) grad_norm 1.8941 (1.8051) [2022-01-23 14:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][770/1251] eta 0:17:43 lr 0.000320 time 1.5924 (2.2100) loss 2.6846 (3.3895) grad_norm 1.8453 (1.8046) [2022-01-23 14:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][780/1251] eta 0:17:20 lr 0.000320 time 2.9340 (2.2098) loss 2.7750 (3.3891) grad_norm 1.6748 (1.8049) [2022-01-23 14:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][790/1251] eta 0:16:58 lr 0.000320 time 2.4929 (2.2089) loss 3.9700 (3.3903) grad_norm 1.6644 (1.8049) [2022-01-23 14:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][800/1251] eta 0:16:36 lr 0.000320 time 2.2060 (2.2088) loss 3.9899 (3.3912) grad_norm 1.6064 (1.8045) [2022-01-23 14:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][810/1251] eta 0:16:13 lr 0.000320 time 2.7476 (2.2081) loss 2.6105 (3.3889) grad_norm 2.0190 (1.8054) [2022-01-23 14:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][820/1251] eta 0:15:51 lr 0.000320 time 2.1777 (2.2074) loss 3.9301 (3.3912) grad_norm 1.7018 (1.8056) [2022-01-23 14:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][830/1251] eta 0:15:29 lr 0.000320 time 1.8610 (2.2073) loss 3.7717 (3.3918) grad_norm 1.8088 (1.8046) [2022-01-23 14:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][840/1251] eta 0:15:07 lr 0.000320 time 1.8542 (2.2071) loss 4.0500 (3.3915) grad_norm 1.9392 (1.8039) [2022-01-23 14:49:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][850/1251] eta 0:14:44 lr 0.000320 time 2.8017 (2.2055) loss 2.7831 (3.3911) grad_norm 1.9747 (1.8041) [2022-01-23 14:49:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][860/1251] eta 0:14:21 lr 0.000319 time 2.9665 (2.2045) loss 2.9408 (3.3894) grad_norm 1.6050 (1.8044) [2022-01-23 14:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][870/1251] eta 0:13:59 lr 0.000319 time 1.9083 (2.2025) loss 2.7318 (3.3879) grad_norm 1.8337 (1.8040) [2022-01-23 14:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][880/1251] eta 0:13:37 lr 0.000319 time 2.3256 (2.2026) loss 3.7982 (3.3888) grad_norm 1.5991 (1.8046) [2022-01-23 14:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][890/1251] eta 0:13:15 lr 0.000319 time 2.1313 (2.2026) loss 2.5491 (3.3920) grad_norm 1.9725 (1.8044) [2022-01-23 14:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][900/1251] eta 0:12:53 lr 0.000319 time 2.5886 (2.2043) loss 3.5227 (3.3923) grad_norm 2.0882 (1.8049) [2022-01-23 14:51:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][910/1251] eta 0:12:32 lr 0.000319 time 2.4008 (2.2053) loss 3.1321 (3.3893) grad_norm 1.6401 (1.8047) [2022-01-23 14:51:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][920/1251] eta 0:12:09 lr 0.000319 time 2.1759 (2.2050) loss 3.5626 (3.3880) grad_norm 1.7548 (1.8040) [2022-01-23 14:52:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][930/1251] eta 0:11:48 lr 0.000319 time 2.4919 (2.2058) loss 2.4300 (3.3862) grad_norm 1.7373 (1.8039) [2022-01-23 14:52:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][940/1251] eta 0:11:25 lr 0.000319 time 2.4087 (2.2037) loss 3.2874 (3.3869) grad_norm 2.0877 (1.8036) [2022-01-23 14:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][950/1251] eta 0:11:02 lr 0.000319 time 1.9817 (2.2015) loss 2.6157 (3.3845) grad_norm 1.6403 (1.8034) [2022-01-23 14:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][960/1251] eta 0:10:40 lr 
0.000319 time 2.8228 (2.2024) loss 2.9562 (3.3822) grad_norm 1.8689 (1.8030) [2022-01-23 14:53:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][970/1251] eta 0:10:19 lr 0.000319 time 2.4753 (2.2033) loss 3.2299 (3.3804) grad_norm 1.8577 (1.8036) [2022-01-23 14:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][980/1251] eta 0:09:56 lr 0.000319 time 1.8860 (2.2026) loss 4.0517 (3.3834) grad_norm 1.9286 (1.8043) [2022-01-23 14:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][990/1251] eta 0:09:34 lr 0.000319 time 2.2751 (2.2020) loss 4.2976 (3.3859) grad_norm 1.7011 (1.8045) [2022-01-23 14:54:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1000/1251] eta 0:09:12 lr 0.000319 time 2.4868 (2.2017) loss 2.4760 (3.3847) grad_norm 2.0891 (1.8037) [2022-01-23 14:55:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1010/1251] eta 0:08:50 lr 0.000319 time 2.6779 (2.2013) loss 2.9474 (3.3831) grad_norm 1.4866 (1.8032) [2022-01-23 14:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1020/1251] eta 0:08:28 lr 0.000319 time 2.1737 (2.2004) loss 3.3563 (3.3817) grad_norm 1.8752 (1.8045) [2022-01-23 14:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1030/1251] eta 0:08:06 lr 0.000319 time 1.8276 (2.2016) loss 3.0884 (3.3810) grad_norm 2.3620 (1.8046) [2022-01-23 14:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1040/1251] eta 0:07:44 lr 0.000319 time 1.9758 (2.2025) loss 3.8506 (3.3806) grad_norm 1.6436 (1.8031) [2022-01-23 14:56:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1050/1251] eta 0:07:22 lr 0.000319 time 1.9583 (2.2030) loss 3.5510 (3.3829) grad_norm 1.7632 (1.8021) [2022-01-23 14:56:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1060/1251] eta 0:07:00 lr 0.000319 time 2.4032 (2.2015) loss 2.6103 (3.3809) grad_norm 2.3362 (1.8031) [2022-01-23 
14:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1070/1251] eta 0:06:38 lr 0.000319 time 1.9071 (2.1992) loss 3.2419 (3.3811) grad_norm 1.8968 (1.8020) [2022-01-23 14:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1080/1251] eta 0:06:15 lr 0.000319 time 1.5860 (2.1975) loss 2.5474 (3.3781) grad_norm 1.8786 (1.8015) [2022-01-23 14:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1090/1251] eta 0:05:53 lr 0.000319 time 1.9368 (2.1965) loss 3.4388 (3.3773) grad_norm 1.6781 (1.8003) [2022-01-23 14:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1100/1251] eta 0:05:31 lr 0.000319 time 2.2045 (2.1966) loss 3.8396 (3.3785) grad_norm 1.5465 (1.7991) [2022-01-23 14:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1110/1251] eta 0:05:09 lr 0.000319 time 2.5026 (2.1959) loss 3.2479 (3.3756) grad_norm 1.7680 (1.7982) [2022-01-23 14:59:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1120/1251] eta 0:04:47 lr 0.000318 time 2.1979 (2.1965) loss 3.0693 (3.3750) grad_norm 2.0005 (1.7977) [2022-01-23 14:59:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1130/1251] eta 0:04:25 lr 0.000318 time 2.5400 (2.1975) loss 2.9948 (3.3749) grad_norm 1.8039 (1.7973) [2022-01-23 14:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1140/1251] eta 0:04:04 lr 0.000318 time 2.2637 (2.1988) loss 3.9432 (3.3742) grad_norm 2.0365 (1.7979) [2022-01-23 15:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1150/1251] eta 0:03:42 lr 0.000318 time 2.0690 (2.2011) loss 3.4593 (3.3755) grad_norm 1.9998 (1.7986) [2022-01-23 15:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1160/1251] eta 0:03:20 lr 0.000318 time 2.7657 (2.2028) loss 4.1654 (3.3747) grad_norm 1.7833 (1.8002) [2022-01-23 15:01:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1170/1251] 
eta 0:02:58 lr 0.000318 time 1.6084 (2.2026) loss 3.6632 (3.3749) grad_norm 1.9298 (1.8016) [2022-01-23 15:01:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1180/1251] eta 0:02:36 lr 0.000318 time 1.8985 (2.2003) loss 3.5788 (3.3767) grad_norm 2.0182 (1.8029) [2022-01-23 15:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1190/1251] eta 0:02:14 lr 0.000318 time 1.8315 (2.1994) loss 2.6045 (3.3740) grad_norm 1.7366 (1.8034) [2022-01-23 15:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1200/1251] eta 0:01:52 lr 0.000318 time 2.3851 (2.1987) loss 3.7057 (3.3744) grad_norm 1.8193 (1.8037) [2022-01-23 15:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1210/1251] eta 0:01:30 lr 0.000318 time 2.0093 (2.1977) loss 2.9835 (3.3755) grad_norm 1.8234 (1.8042) [2022-01-23 15:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1220/1251] eta 0:01:08 lr 0.000318 time 2.0901 (2.1975) loss 2.8752 (3.3756) grad_norm 1.7394 (1.8035) [2022-01-23 15:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1230/1251] eta 0:00:46 lr 0.000318 time 1.9856 (2.1966) loss 3.8059 (3.3766) grad_norm 1.6703 (1.8030) [2022-01-23 15:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1240/1251] eta 0:00:24 lr 0.000318 time 2.1658 (2.1961) loss 3.5533 (3.3783) grad_norm 1.6632 (1.8026) [2022-01-23 15:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [186/300][1250/1251] eta 0:00:02 lr 0.000318 time 1.1083 (2.1915) loss 3.2485 (3.3786) grad_norm 1.7595 (1.8025) [2022-01-23 15:03:43 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 186 training takes 0:45:42 [2022-01-23 15:04:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.090 (18.090) Loss 0.9099 (0.9099) Acc@1 78.906 (78.906) Acc@5 93.262 (93.262) [2022-01-23 15:04:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.600 (3.289) 
Loss 0.9554 (0.9420) Acc@1 77.930 (78.072) Acc@5 94.824 (94.283) [2022-01-23 15:04:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.010 (2.497) Loss 0.9787 (0.9554) Acc@1 76.855 (77.758) Acc@5 93.945 (94.113) [2022-01-23 15:04:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.577 (2.243) Loss 0.9929 (0.9656) Acc@1 75.781 (77.589) Acc@5 93.750 (93.980) [2022-01-23 15:05:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.280 (2.178) Loss 0.9487 (0.9607) Acc@1 78.320 (77.653) Acc@5 93.652 (94.012) [2022-01-23 15:05:19 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.760 Acc@5 94.052 [2022-01-23 15:05:19 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.8% [2022-01-23 15:05:19 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.76% [2022-01-23 15:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][0/1251] eta 7:04:36 lr 0.000318 time 20.3646 (20.3646) loss 3.2221 (3.2221) grad_norm 2.0229 (2.0229) [2022-01-23 15:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][10/1251] eta 1:19:36 lr 0.000318 time 1.6178 (3.8488) loss 2.3597 (3.2434) grad_norm 1.8219 (1.8304) [2022-01-23 15:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][20/1251] eta 1:02:09 lr 0.000318 time 1.4879 (3.0298) loss 3.3838 (3.3864) grad_norm 1.6653 (1.7939) [2022-01-23 15:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][30/1251] eta 0:56:40 lr 0.000318 time 1.7724 (2.7854) loss 3.6104 (3.4055) grad_norm 1.8211 (1.7746) [2022-01-23 15:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][40/1251] eta 0:53:45 lr 0.000318 time 3.8919 (2.6638) loss 3.8047 (3.4774) grad_norm 1.6272 (1.7708) [2022-01-23 15:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][50/1251] eta 0:52:30 lr 0.000318 time 1.5047 (2.6229) loss 4.0336 (3.4839) 
grad_norm 1.7535 (1.7696) [2022-01-23 15:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][60/1251] eta 0:51:12 lr 0.000318 time 1.5739 (2.5798) loss 3.3324 (3.4547) grad_norm 1.9610 (1.7842) [2022-01-23 15:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][70/1251] eta 0:49:33 lr 0.000318 time 1.5031 (2.5175) loss 3.7993 (3.4605) grad_norm 2.0415 (1.7903) [2022-01-23 15:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][80/1251] eta 0:48:24 lr 0.000318 time 2.5238 (2.4806) loss 3.6122 (3.4294) grad_norm 3.4070 (1.8145) [2022-01-23 15:09:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][90/1251] eta 0:47:35 lr 0.000318 time 2.2038 (2.4598) loss 2.3514 (3.4280) grad_norm 1.9870 (1.8267) [2022-01-23 15:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][100/1251] eta 0:46:38 lr 0.000318 time 1.7342 (2.4316) loss 3.8258 (3.4110) grad_norm 1.9722 (1.8302) [2022-01-23 15:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][110/1251] eta 0:45:33 lr 0.000318 time 1.8069 (2.3955) loss 2.4974 (3.3921) grad_norm 1.8054 (1.8235) [2022-01-23 15:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][120/1251] eta 0:44:44 lr 0.000318 time 2.1219 (2.3738) loss 2.5030 (3.3907) grad_norm 2.0039 (1.8274) [2022-01-23 15:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][130/1251] eta 0:44:12 lr 0.000317 time 2.4033 (2.3666) loss 2.8168 (3.4003) grad_norm 2.0389 (1.8332) [2022-01-23 15:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][140/1251] eta 0:43:36 lr 0.000317 time 1.8535 (2.3553) loss 3.3898 (3.4028) grad_norm 1.6965 (1.8291) [2022-01-23 15:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][150/1251] eta 0:42:57 lr 0.000317 time 1.8963 (2.3414) loss 3.6929 (3.3997) grad_norm 1.6888 (1.8231) [2022-01-23 15:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[187/300][160/1251] eta 0:42:22 lr 0.000317 time 2.4878 (2.3308) loss 3.8358 (3.4062) grad_norm 1.6553 (1.8188) [2022-01-23 15:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][170/1251] eta 0:41:47 lr 0.000317 time 2.5126 (2.3197) loss 3.9717 (3.4207) grad_norm 1.7556 (1.8145) [2022-01-23 15:12:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][180/1251] eta 0:41:17 lr 0.000317 time 2.1774 (2.3131) loss 3.2355 (3.4164) grad_norm 1.6406 (1.8076) [2022-01-23 15:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][190/1251] eta 0:41:06 lr 0.000317 time 3.5728 (2.3251) loss 3.3497 (3.3950) grad_norm 1.5668 (1.8044) [2022-01-23 15:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][200/1251] eta 0:40:43 lr 0.000317 time 1.9092 (2.3245) loss 4.0045 (3.4058) grad_norm 1.7943 (1.8033) [2022-01-23 15:13:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][210/1251] eta 0:40:12 lr 0.000317 time 1.8924 (2.3170) loss 3.1068 (3.4065) grad_norm 1.7791 (1.8000) [2022-01-23 15:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][220/1251] eta 0:39:31 lr 0.000317 time 1.7092 (2.3000) loss 3.4083 (3.4066) grad_norm 1.8351 (1.7997) [2022-01-23 15:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][230/1251] eta 0:38:57 lr 0.000317 time 2.4725 (2.2891) loss 3.9540 (3.4119) grad_norm 1.9236 (1.7991) [2022-01-23 15:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][240/1251] eta 0:38:30 lr 0.000317 time 1.8264 (2.2855) loss 2.7794 (3.4065) grad_norm 1.8807 (1.8003) [2022-01-23 15:14:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][250/1251] eta 0:38:04 lr 0.000317 time 2.3194 (2.2821) loss 3.5790 (3.4195) grad_norm 1.8237 (1.8041) [2022-01-23 15:15:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][260/1251] eta 0:37:38 lr 0.000317 time 2.2134 (2.2791) loss 3.5119 (3.4180) grad_norm 
1.7873 (1.8056) [2022-01-23 15:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][270/1251] eta 0:37:13 lr 0.000317 time 2.6961 (2.2764) loss 2.7982 (3.4177) grad_norm 1.9677 (1.8105) [2022-01-23 15:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][280/1251] eta 0:36:47 lr 0.000317 time 1.8671 (2.2732) loss 3.7574 (3.4106) grad_norm 1.4671 (1.8092) [2022-01-23 15:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][290/1251] eta 0:36:20 lr 0.000317 time 2.3535 (2.2690) loss 3.7052 (3.4139) grad_norm 1.6489 (1.8068) [2022-01-23 15:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][300/1251] eta 0:35:55 lr 0.000317 time 2.2880 (2.2671) loss 3.7704 (3.4126) grad_norm 1.8002 (1.8070) [2022-01-23 15:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][310/1251] eta 0:35:33 lr 0.000317 time 3.6286 (2.2668) loss 3.9521 (3.4147) grad_norm 1.8729 (1.8075) [2022-01-23 15:17:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][320/1251] eta 0:35:05 lr 0.000317 time 1.8902 (2.2613) loss 2.7383 (3.4086) grad_norm 1.6574 (1.8051) [2022-01-23 15:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][330/1251] eta 0:34:39 lr 0.000317 time 2.0104 (2.2584) loss 3.7757 (3.4127) grad_norm 1.8998 (1.8040) [2022-01-23 15:18:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][340/1251] eta 0:34:18 lr 0.000317 time 2.0703 (2.2594) loss 2.7688 (3.4038) grad_norm 1.8585 (1.8054) [2022-01-23 15:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][350/1251] eta 0:33:55 lr 0.000317 time 3.0372 (2.2593) loss 2.2782 (3.3943) grad_norm 1.9493 (1.8064) [2022-01-23 15:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][360/1251] eta 0:33:26 lr 0.000317 time 1.6308 (2.2516) loss 3.4741 (3.3919) grad_norm 2.0789 (1.8092) [2022-01-23 15:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[187/300][370/1251] eta 0:32:56 lr 0.000317 time 1.7101 (2.2439) loss 3.5539 (3.3838) grad_norm 2.0151 (1.8103) [2022-01-23 15:19:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][380/1251] eta 0:32:28 lr 0.000317 time 2.0468 (2.2374) loss 3.9952 (3.3842) grad_norm 1.8926 (1.8114) [2022-01-23 15:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][390/1251] eta 0:32:04 lr 0.000316 time 2.1043 (2.2349) loss 4.0519 (3.3867) grad_norm 1.9210 (1.8097) [2022-01-23 15:20:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][400/1251] eta 0:31:42 lr 0.000316 time 2.1545 (2.2351) loss 4.2984 (3.3903) grad_norm 1.8060 (1.8124) [2022-01-23 15:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][410/1251] eta 0:31:21 lr 0.000316 time 1.5766 (2.2375) loss 3.5990 (3.3854) grad_norm 1.5133 (1.8126) [2022-01-23 15:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][420/1251] eta 0:31:01 lr 0.000316 time 1.7824 (2.2396) loss 3.5505 (3.3902) grad_norm 1.6613 (1.8119) [2022-01-23 15:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][430/1251] eta 0:30:45 lr 0.000316 time 3.3326 (2.2474) loss 4.0241 (3.3923) grad_norm 2.0688 (1.8115) [2022-01-23 15:21:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][440/1251] eta 0:30:25 lr 0.000316 time 1.8781 (2.2506) loss 2.7870 (3.4006) grad_norm 1.5191 (1.8110) [2022-01-23 15:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][450/1251] eta 0:29:58 lr 0.000316 time 1.5854 (2.2454) loss 4.1311 (3.4080) grad_norm 1.7686 (1.8108) [2022-01-23 15:22:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][460/1251] eta 0:29:29 lr 0.000316 time 1.5510 (2.2377) loss 3.6990 (3.4053) grad_norm 1.9236 (1.8110) [2022-01-23 15:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][470/1251] eta 0:29:05 lr 0.000316 time 2.7030 (2.2353) loss 3.0317 (3.4084) grad_norm 
1.5143 (1.8110) [2022-01-23 15:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][480/1251] eta 0:28:40 lr 0.000316 time 1.9880 (2.2319) loss 3.6438 (3.4088) grad_norm 1.8739 (1.8096) [2022-01-23 15:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][490/1251] eta 0:28:16 lr 0.000316 time 2.2992 (2.2288) loss 2.7669 (3.3958) grad_norm 1.7893 (1.8091) [2022-01-23 15:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][500/1251] eta 0:27:53 lr 0.000316 time 2.3645 (2.2277) loss 2.5117 (3.3923) grad_norm 1.5466 (1.8099) [2022-01-23 15:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][510/1251] eta 0:27:32 lr 0.000316 time 2.6950 (2.2298) loss 3.0415 (3.3915) grad_norm 1.7249 (1.8101) [2022-01-23 15:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][520/1251] eta 0:27:15 lr 0.000316 time 4.9626 (2.2380) loss 3.6155 (3.3906) grad_norm 1.6830 (1.8087) [2022-01-23 15:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][530/1251] eta 0:26:54 lr 0.000316 time 2.3932 (2.2389) loss 3.8374 (3.3951) grad_norm 1.8458 (1.8084) [2022-01-23 15:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][540/1251] eta 0:26:32 lr 0.000316 time 2.7137 (2.2398) loss 3.7241 (3.3953) grad_norm 1.7115 (1.8097) [2022-01-23 15:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][550/1251] eta 0:26:08 lr 0.000316 time 1.9514 (2.2379) loss 2.3490 (3.3983) grad_norm 1.6147 (1.8094) [2022-01-23 15:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][560/1251] eta 0:25:43 lr 0.000316 time 2.2463 (2.2344) loss 3.8778 (3.4037) grad_norm 1.9148 (1.8098) [2022-01-23 15:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][570/1251] eta 0:25:18 lr 0.000316 time 2.1727 (2.2303) loss 3.9269 (3.4054) grad_norm 2.0817 (1.8111) [2022-01-23 15:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[187/300][580/1251] eta 0:24:53 lr 0.000316 time 2.2976 (2.2261) loss 3.8112 (3.4055) grad_norm 1.9472 (1.8122) [2022-01-23 15:27:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][590/1251] eta 0:24:29 lr 0.000316 time 2.6158 (2.2238) loss 3.8356 (3.4064) grad_norm 1.9061 (1.8123) [2022-01-23 15:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][600/1251] eta 0:24:07 lr 0.000316 time 2.2017 (2.2231) loss 4.1482 (3.4105) grad_norm 1.9172 (1.8135) [2022-01-23 15:27:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][610/1251] eta 0:23:44 lr 0.000316 time 2.2792 (2.2221) loss 3.7906 (3.4108) grad_norm 1.7942 (1.8147) [2022-01-23 15:28:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][620/1251] eta 0:23:22 lr 0.000316 time 1.8262 (2.2231) loss 3.5025 (3.4065) grad_norm 2.0129 (1.8148) [2022-01-23 15:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][630/1251] eta 0:23:02 lr 0.000316 time 3.2977 (2.2269) loss 2.8927 (3.3995) grad_norm 2.3338 (1.8158) [2022-01-23 15:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][640/1251] eta 0:22:41 lr 0.000316 time 1.6545 (2.2277) loss 2.9177 (3.3972) grad_norm 1.6369 (1.8154) [2022-01-23 15:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][650/1251] eta 0:22:18 lr 0.000315 time 2.0826 (2.2273) loss 3.9686 (3.3981) grad_norm 1.5550 (1.8148) [2022-01-23 15:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][660/1251] eta 0:21:55 lr 0.000315 time 2.0331 (2.2256) loss 2.6072 (3.3996) grad_norm 1.9664 (1.8148) [2022-01-23 15:30:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][670/1251] eta 0:21:32 lr 0.000315 time 2.8658 (2.2250) loss 2.5463 (3.3977) grad_norm 1.6914 (1.8133) [2022-01-23 15:30:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][680/1251] eta 0:21:10 lr 0.000315 time 2.4988 (2.2246) loss 3.4558 (3.3967) grad_norm 
1.6102 (1.8136) [2022-01-23 15:30:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][690/1251] eta 0:20:47 lr 0.000315 time 2.1294 (2.2237) loss 3.7504 (3.3965) grad_norm 1.7725 (1.8131) [2022-01-23 15:31:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][700/1251] eta 0:20:24 lr 0.000315 time 2.1684 (2.2223) loss 2.9426 (3.3934) grad_norm 1.8371 (1.8125) [2022-01-23 15:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][710/1251] eta 0:20:02 lr 0.000315 time 2.6353 (2.2226) loss 3.7202 (3.3908) grad_norm 1.7557 (1.8124) [2022-01-23 15:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][720/1251] eta 0:19:38 lr 0.000315 time 1.8474 (2.2190) loss 3.1332 (3.3901) grad_norm 1.7341 (1.8113) [2022-01-23 15:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][730/1251] eta 0:19:14 lr 0.000315 time 2.2653 (2.2166) loss 3.5686 (3.3895) grad_norm 1.6016 (1.8109) [2022-01-23 15:32:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][740/1251] eta 0:18:51 lr 0.000315 time 1.9581 (2.2149) loss 3.6480 (3.3914) grad_norm 1.7377 (1.8124) [2022-01-23 15:33:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][750/1251] eta 0:18:30 lr 0.000315 time 2.5090 (2.2175) loss 3.5659 (3.3914) grad_norm 1.5723 (1.8131) [2022-01-23 15:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][760/1251] eta 0:18:09 lr 0.000315 time 2.1867 (2.2189) loss 3.5741 (3.3891) grad_norm 1.4427 (1.8131) [2022-01-23 15:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][770/1251] eta 0:17:46 lr 0.000315 time 2.1460 (2.2169) loss 2.1212 (3.3896) grad_norm 1.6257 (1.8116) [2022-01-23 15:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][780/1251] eta 0:17:23 lr 0.000315 time 2.1365 (2.2157) loss 3.6708 (3.3881) grad_norm 1.5984 (1.8095) [2022-01-23 15:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[187/300][790/1251] eta 0:17:01 lr 0.000315 time 3.1702 (2.2169) loss 2.5810 (3.3853) grad_norm 1.8013 (1.8089) [2022-01-23 15:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][800/1251] eta 0:16:40 lr 0.000315 time 2.3957 (2.2173) loss 2.6311 (3.3850) grad_norm 1.5809 (1.8078) [2022-01-23 15:35:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][810/1251] eta 0:16:17 lr 0.000315 time 1.5154 (2.2167) loss 2.8461 (3.3822) grad_norm 1.8896 (1.8078) [2022-01-23 15:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][820/1251] eta 0:15:55 lr 0.000315 time 2.3505 (2.2166) loss 3.6214 (3.3820) grad_norm 1.6070 (1.8094) [2022-01-23 15:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][830/1251] eta 0:15:31 lr 0.000315 time 1.6360 (2.2138) loss 3.0523 (3.3824) grad_norm 1.8654 (1.8090) [2022-01-23 15:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][840/1251] eta 0:15:09 lr 0.000315 time 2.2528 (2.2135) loss 2.7745 (3.3824) grad_norm 1.7444 (1.8086) [2022-01-23 15:36:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][850/1251] eta 0:14:48 lr 0.000315 time 1.5390 (2.2152) loss 2.7744 (3.3814) grad_norm 1.6582 (1.8086) [2022-01-23 15:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][860/1251] eta 0:14:27 lr 0.000315 time 3.0024 (2.2181) loss 3.7798 (3.3818) grad_norm 1.6269 (1.8086) [2022-01-23 15:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][870/1251] eta 0:14:04 lr 0.000315 time 1.9344 (2.2178) loss 3.5030 (3.3782) grad_norm 1.6215 (1.8073) [2022-01-23 15:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][880/1251] eta 0:13:41 lr 0.000315 time 1.6226 (2.2138) loss 3.1031 (3.3791) grad_norm 1.6988 (1.8073) [2022-01-23 15:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][890/1251] eta 0:13:18 lr 0.000315 time 2.3625 (2.2113) loss 2.4204 (3.3746) grad_norm 
1.7629 (1.8084) [2022-01-23 15:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][900/1251] eta 0:12:56 lr 0.000315 time 1.8838 (2.2126) loss 2.3255 (3.3718) grad_norm 1.9796 (1.8086) [2022-01-23 15:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][910/1251] eta 0:12:34 lr 0.000314 time 2.1642 (2.2119) loss 4.2121 (3.3708) grad_norm 1.6934 (1.8087) [2022-01-23 15:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][920/1251] eta 0:12:12 lr 0.000314 time 2.4486 (2.2117) loss 3.0296 (3.3719) grad_norm 1.7784 (1.8080) [2022-01-23 15:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][930/1251] eta 0:11:49 lr 0.000314 time 1.9132 (2.2109) loss 3.6793 (3.3743) grad_norm 2.0214 (1.8080) [2022-01-23 15:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][940/1251] eta 0:11:27 lr 0.000314 time 1.9008 (2.2116) loss 2.5861 (3.3736) grad_norm 1.7378 (1.8090) [2022-01-23 15:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][950/1251] eta 0:11:05 lr 0.000314 time 2.5724 (2.2122) loss 3.5138 (3.3714) grad_norm 1.7425 (1.8105) [2022-01-23 15:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][960/1251] eta 0:10:43 lr 0.000314 time 2.2098 (2.2115) loss 4.2729 (3.3742) grad_norm 1.7001 (1.8112) [2022-01-23 15:41:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][970/1251] eta 0:10:21 lr 0.000314 time 2.2645 (2.2114) loss 2.2880 (3.3751) grad_norm 1.7396 (1.8119) [2022-01-23 15:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][980/1251] eta 0:09:59 lr 0.000314 time 2.2139 (2.2112) loss 3.6689 (3.3770) grad_norm 1.7697 (1.8124) [2022-01-23 15:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][990/1251] eta 0:09:37 lr 0.000314 time 2.7016 (2.2117) loss 3.0812 (3.3786) grad_norm 1.9066 (1.8119) [2022-01-23 15:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[187/300][1000/1251] eta 0:09:14 lr 0.000314 time 1.9147 (2.2096) loss 3.5345 (3.3812) grad_norm 1.7543 (1.8114) [2022-01-23 15:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1010/1251] eta 0:08:52 lr 0.000314 time 1.5750 (2.2093) loss 3.2774 (3.3808) grad_norm 1.7185 (1.8109) [2022-01-23 15:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1020/1251] eta 0:08:30 lr 0.000314 time 2.0614 (2.2087) loss 3.6370 (3.3815) grad_norm 1.5792 (1.8114) [2022-01-23 15:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1030/1251] eta 0:08:08 lr 0.000314 time 1.8781 (2.2088) loss 3.2228 (3.3830) grad_norm 2.6412 (1.8111) [2022-01-23 15:43:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1040/1251] eta 0:07:45 lr 0.000314 time 1.8670 (2.2085) loss 2.1697 (3.3811) grad_norm 1.9972 (1.8109) [2022-01-23 15:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1050/1251] eta 0:07:23 lr 0.000314 time 2.2715 (2.2078) loss 3.8862 (3.3813) grad_norm 1.7994 (1.8099) [2022-01-23 15:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1060/1251] eta 0:07:01 lr 0.000314 time 1.9114 (2.2081) loss 3.0265 (3.3820) grad_norm 1.6616 (1.8100) [2022-01-23 15:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1070/1251] eta 0:06:39 lr 0.000314 time 1.7771 (2.2070) loss 3.4190 (3.3805) grad_norm 1.7314 (1.8091) [2022-01-23 15:45:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1080/1251] eta 0:06:17 lr 0.000314 time 2.2172 (2.2083) loss 3.4852 (3.3814) grad_norm 1.7664 (1.8090) [2022-01-23 15:45:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1090/1251] eta 0:05:55 lr 0.000314 time 2.3814 (2.2082) loss 2.3916 (3.3821) grad_norm 1.8914 (1.8095) [2022-01-23 15:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1100/1251] eta 0:05:33 lr 0.000314 time 2.2417 (2.2090) loss 3.6823 (3.3819) 
grad_norm 1.8087 (1.8107) [2022-01-23 15:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1110/1251] eta 0:05:11 lr 0.000314 time 1.8560 (2.2083) loss 3.6204 (3.3819) grad_norm 2.1357 (1.8115) [2022-01-23 15:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1120/1251] eta 0:04:49 lr 0.000314 time 1.6760 (2.2070) loss 2.8791 (3.3827) grad_norm 1.8474 (1.8119) [2022-01-23 15:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1130/1251] eta 0:04:26 lr 0.000314 time 3.1825 (2.2061) loss 2.9362 (3.3828) grad_norm 1.8047 (1.8120) [2022-01-23 15:47:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1140/1251] eta 0:04:04 lr 0.000314 time 1.8503 (2.2047) loss 3.4064 (3.3833) grad_norm 1.7283 (1.8111) [2022-01-23 15:47:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1150/1251] eta 0:03:42 lr 0.000314 time 2.3694 (2.2045) loss 2.9943 (3.3816) grad_norm 1.7277 (1.8109) [2022-01-23 15:47:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1160/1251] eta 0:03:20 lr 0.000314 time 1.9865 (2.2044) loss 3.8383 (3.3829) grad_norm 1.7573 (1.8100) [2022-01-23 15:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1170/1251] eta 0:02:58 lr 0.000313 time 2.1729 (2.2044) loss 3.4351 (3.3825) grad_norm 1.7189 (1.8108) [2022-01-23 15:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1180/1251] eta 0:02:36 lr 0.000313 time 1.8821 (2.2044) loss 3.7112 (3.3828) grad_norm 1.8168 (1.8113) [2022-01-23 15:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1190/1251] eta 0:02:14 lr 0.000313 time 2.1766 (2.2041) loss 3.6802 (3.3834) grad_norm 1.7713 (1.8130) [2022-01-23 15:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1200/1251] eta 0:01:52 lr 0.000313 time 1.8061 (2.2052) loss 3.7567 (3.3864) grad_norm 1.6831 (1.8130) [2022-01-23 15:49:48 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [187/300][1210/1251] eta 0:01:30 lr 0.000313 time 1.8820 (2.2046) loss 3.7229 (3.3881) grad_norm 1.7004 (1.8133) [2022-01-23 15:50:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1220/1251] eta 0:01:08 lr 0.000313 time 2.1975 (2.2055) loss 2.3321 (3.3878) grad_norm 1.4319 (1.8125) [2022-01-23 15:50:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1230/1251] eta 0:00:46 lr 0.000313 time 1.6701 (2.2043) loss 3.4291 (3.3863) grad_norm 1.6750 (1.8123) [2022-01-23 15:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1240/1251] eta 0:00:24 lr 0.000313 time 1.6895 (2.2034) loss 3.2920 (3.3834) grad_norm 1.7605 (1.8125) [2022-01-23 15:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [187/300][1250/1251] eta 0:00:02 lr 0.000313 time 1.2109 (2.1975) loss 3.6921 (3.3826) grad_norm 1.7577 (1.8121) [2022-01-23 15:51:08 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 187 training takes 0:45:49 [2022-01-23 15:51:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.717 (18.717) Loss 0.9238 (0.9238) Acc@1 78.027 (78.027) Acc@5 93.848 (93.848) [2022-01-23 15:51:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.953 (3.313) Loss 1.0136 (0.9654) Acc@1 77.051 (77.051) Acc@5 92.480 (93.626) [2022-01-23 15:52:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.933 (2.599) Loss 0.9572 (0.9432) Acc@1 75.488 (77.469) Acc@5 94.531 (93.964) [2022-01-23 15:52:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.560 (2.261) Loss 0.9340 (0.9403) Acc@1 78.223 (77.778) Acc@5 95.508 (94.040) [2022-01-23 15:52:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.030 (2.209) Loss 0.8376 (0.9344) Acc@1 80.273 (77.742) Acc@5 95.020 (94.086) [2022-01-23 15:52:46 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.870 Acc@5 94.134 [2022-01-23 15:52:46 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 77.9% [2022-01-23 15:52:46 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.87% [2022-01-23 15:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][0/1251] eta 7:33:14 lr 0.000313 time 21.7378 (21.7378) loss 2.7614 (2.7614) grad_norm 1.6973 (1.6973) [2022-01-23 15:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][10/1251] eta 1:26:51 lr 0.000313 time 2.1049 (4.1995) loss 3.5284 (3.5170) grad_norm 1.6867 (1.7752) [2022-01-23 15:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][20/1251] eta 1:05:36 lr 0.000313 time 1.2505 (3.1978) loss 3.4051 (3.3890) grad_norm 1.6404 (1.7482) [2022-01-23 15:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][30/1251] eta 0:57:38 lr 0.000313 time 1.5269 (2.8329) loss 3.0468 (3.3934) grad_norm 1.7476 (1.7524) [2022-01-23 15:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][40/1251] eta 0:54:06 lr 0.000313 time 3.3035 (2.6807) loss 2.0542 (3.3751) grad_norm 1.7741 (1.7623) [2022-01-23 15:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][50/1251] eta 0:51:40 lr 0.000313 time 1.7611 (2.5818) loss 3.9238 (3.3896) grad_norm 1.8727 (1.7865) [2022-01-23 15:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][60/1251] eta 0:49:47 lr 0.000313 time 1.3374 (2.5086) loss 2.6723 (3.3907) grad_norm 2.0885 (1.7951) [2022-01-23 15:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][70/1251] eta 0:48:19 lr 0.000313 time 2.0721 (2.4555) loss 3.2275 (3.3835) grad_norm 1.8789 (1.8068) [2022-01-23 15:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][80/1251] eta 0:47:30 lr 0.000313 time 3.2968 (2.4346) loss 4.0683 (3.3639) grad_norm 1.8913 (1.8232) [2022-01-23 15:56:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][90/1251] eta 0:46:32 lr 0.000313 time 
1.3485 (2.4051) loss 2.9099 (3.3535) grad_norm 1.9078 (1.8259) [2022-01-23 15:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][100/1251] eta 0:45:36 lr 0.000313 time 1.5162 (2.3774) loss 3.4832 (3.3639) grad_norm 1.7670 (1.8231) [2022-01-23 15:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][110/1251] eta 0:44:51 lr 0.000313 time 1.7328 (2.3590) loss 3.8827 (3.3839) grad_norm 1.6541 (1.8282) [2022-01-23 15:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][120/1251] eta 0:44:20 lr 0.000313 time 3.2151 (2.3520) loss 2.6678 (3.3729) grad_norm 2.0288 (1.8330) [2022-01-23 15:57:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][130/1251] eta 0:43:46 lr 0.000313 time 2.1610 (2.3427) loss 3.8081 (3.3812) grad_norm 1.9848 (1.8340) [2022-01-23 15:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][140/1251] eta 0:43:15 lr 0.000313 time 1.5154 (2.3365) loss 3.3761 (3.3858) grad_norm 1.7939 (1.8397) [2022-01-23 15:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][150/1251] eta 0:42:44 lr 0.000313 time 1.8687 (2.3293) loss 3.5209 (3.4058) grad_norm 1.6983 (1.8416) [2022-01-23 15:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][160/1251] eta 0:42:22 lr 0.000313 time 3.5539 (2.3302) loss 2.7517 (3.3956) grad_norm 1.7013 (1.8344) [2022-01-23 15:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][170/1251] eta 0:41:46 lr 0.000313 time 1.8304 (2.3182) loss 4.1679 (3.4206) grad_norm 1.8062 (1.8332) [2022-01-23 15:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][180/1251] eta 0:41:18 lr 0.000312 time 1.6437 (2.3141) loss 4.0599 (3.4124) grad_norm 2.5519 (1.8356) [2022-01-23 16:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][190/1251] eta 0:40:45 lr 0.000312 time 2.0336 (2.3047) loss 3.0224 (3.4081) grad_norm 1.8298 (1.8415) [2022-01-23 16:00:29 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][200/1251] eta 0:40:22 lr 0.000312 time 2.7056 (2.3047) loss 3.3824 (3.4063) grad_norm 1.8798 (1.8435) [2022-01-23 16:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][210/1251] eta 0:39:53 lr 0.000312 time 1.5754 (2.2997) loss 3.6964 (3.4064) grad_norm 1.7708 (1.8424) [2022-01-23 16:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][220/1251] eta 0:39:25 lr 0.000312 time 1.5959 (2.2945) loss 2.6179 (3.4091) grad_norm 1.6909 (1.8394) [2022-01-23 16:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][230/1251] eta 0:38:50 lr 0.000312 time 2.0283 (2.2822) loss 3.8489 (3.4030) grad_norm 2.2052 (1.8383) [2022-01-23 16:01:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][240/1251] eta 0:38:24 lr 0.000312 time 2.1318 (2.2796) loss 3.9280 (3.4097) grad_norm 1.7885 (1.8409) [2022-01-23 16:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][250/1251] eta 0:38:03 lr 0.000312 time 1.9197 (2.2817) loss 3.5369 (3.4107) grad_norm 1.7666 (1.8419) [2022-01-23 16:02:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][260/1251] eta 0:37:39 lr 0.000312 time 1.6909 (2.2801) loss 2.5778 (3.4062) grad_norm 2.0532 (1.8491) [2022-01-23 16:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][270/1251] eta 0:37:14 lr 0.000312 time 1.8751 (2.2775) loss 3.1275 (3.3952) grad_norm 1.8646 (1.8506) [2022-01-23 16:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][280/1251] eta 0:36:47 lr 0.000312 time 1.9116 (2.2736) loss 3.8993 (3.3923) grad_norm 1.8777 (1.8534) [2022-01-23 16:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][290/1251] eta 0:36:22 lr 0.000312 time 1.8672 (2.2707) loss 2.7133 (3.3925) grad_norm 1.7611 (1.8554) [2022-01-23 16:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][300/1251] eta 0:35:53 lr 
0.000312 time 2.2209 (2.2647) loss 3.3030 (3.4023) grad_norm 1.9945 (1.8565) [2022-01-23 16:04:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][310/1251] eta 0:35:27 lr 0.000312 time 2.2795 (2.2609) loss 3.6558 (3.3983) grad_norm 2.0742 (1.8564) [2022-01-23 16:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][320/1251] eta 0:35:00 lr 0.000312 time 2.2121 (2.2565) loss 3.3701 (3.3916) grad_norm 1.7311 (1.8559) [2022-01-23 16:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][330/1251] eta 0:34:37 lr 0.000312 time 1.8571 (2.2562) loss 2.7879 (3.3958) grad_norm 1.9232 (1.8552) [2022-01-23 16:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][340/1251] eta 0:34:13 lr 0.000312 time 1.8926 (2.2537) loss 3.2364 (3.3940) grad_norm 1.7601 (1.8528) [2022-01-23 16:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][350/1251] eta 0:33:48 lr 0.000312 time 1.5171 (2.2511) loss 2.5061 (3.3933) grad_norm 1.6514 (1.8512) [2022-01-23 16:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][360/1251] eta 0:33:22 lr 0.000312 time 2.6469 (2.2480) loss 3.2513 (3.3954) grad_norm 1.8449 (1.8498) [2022-01-23 16:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][370/1251] eta 0:32:57 lr 0.000312 time 2.1621 (2.2442) loss 3.7201 (3.4015) grad_norm 2.4941 (1.8511) [2022-01-23 16:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][380/1251] eta 0:32:29 lr 0.000312 time 1.5872 (2.2387) loss 3.6939 (3.4065) grad_norm 1.6105 (1.8521) [2022-01-23 16:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][390/1251] eta 0:32:08 lr 0.000312 time 1.9132 (2.2396) loss 3.7287 (3.4074) grad_norm 1.6649 (1.8547) [2022-01-23 16:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][400/1251] eta 0:31:46 lr 0.000312 time 2.5649 (2.2404) loss 4.1538 (3.4163) grad_norm 2.0156 (1.8543) [2022-01-23 16:08:06 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][410/1251] eta 0:31:22 lr 0.000312 time 1.9819 (2.2389) loss 4.1593 (3.4197) grad_norm 1.9549 (1.8561) [2022-01-23 16:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][420/1251] eta 0:30:59 lr 0.000312 time 1.5337 (2.2377) loss 3.2791 (3.4180) grad_norm 1.6065 (1.8533) [2022-01-23 16:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][430/1251] eta 0:30:36 lr 0.000312 time 2.5979 (2.2370) loss 3.6791 (3.4173) grad_norm 2.1717 (1.8568) [2022-01-23 16:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][440/1251] eta 0:30:14 lr 0.000312 time 3.8637 (2.2373) loss 2.4611 (3.4188) grad_norm 1.7911 (1.8580) [2022-01-23 16:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][450/1251] eta 0:29:50 lr 0.000311 time 1.5817 (2.2358) loss 3.9926 (3.4205) grad_norm 1.6711 (1.8553) [2022-01-23 16:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][460/1251] eta 0:29:25 lr 0.000311 time 1.6371 (2.2321) loss 2.5525 (3.4199) grad_norm 2.1957 (1.8559) [2022-01-23 16:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][470/1251] eta 0:29:02 lr 0.000311 time 2.4970 (2.2311) loss 2.5572 (3.4180) grad_norm 2.1088 (1.8572) [2022-01-23 16:10:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][480/1251] eta 0:28:42 lr 0.000311 time 3.6471 (2.2347) loss 2.8134 (3.4128) grad_norm 1.6922 (1.8554) [2022-01-23 16:11:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][490/1251] eta 0:28:18 lr 0.000311 time 1.9573 (2.2324) loss 3.5445 (3.4171) grad_norm 1.7816 (1.8542) [2022-01-23 16:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][500/1251] eta 0:27:54 lr 0.000311 time 1.9964 (2.2293) loss 2.9467 (3.4149) grad_norm 1.6500 (1.8532) [2022-01-23 16:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][510/1251] eta 0:27:30 lr 
0.000311 time 2.1954 (2.2267) loss 2.5076 (3.4094) grad_norm 1.7159 (1.8509) [2022-01-23 16:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][520/1251] eta 0:27:08 lr 0.000311 time 3.7191 (2.2284) loss 3.4598 (3.4081) grad_norm 1.8338 (1.8532) [2022-01-23 16:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][530/1251] eta 0:26:45 lr 0.000311 time 1.6464 (2.2272) loss 2.5372 (3.4078) grad_norm 1.7680 (1.8523) [2022-01-23 16:12:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][540/1251] eta 0:26:22 lr 0.000311 time 1.6477 (2.2256) loss 3.3304 (3.4051) grad_norm 1.8520 (1.8518) [2022-01-23 16:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][550/1251] eta 0:26:00 lr 0.000311 time 1.8414 (2.2257) loss 3.4087 (3.4013) grad_norm 1.9646 (1.8532) [2022-01-23 16:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][560/1251] eta 0:25:39 lr 0.000311 time 3.6569 (2.2280) loss 3.4746 (3.4019) grad_norm 1.6818 (1.8514) [2022-01-23 16:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][570/1251] eta 0:25:16 lr 0.000311 time 2.1976 (2.2273) loss 2.4543 (3.3966) grad_norm 1.8229 (1.8492) [2022-01-23 16:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][580/1251] eta 0:24:52 lr 0.000311 time 1.6352 (2.2247) loss 3.3984 (3.3981) grad_norm 1.5754 (1.8494) [2022-01-23 16:14:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][590/1251] eta 0:24:27 lr 0.000311 time 1.5930 (2.2205) loss 3.2909 (3.3987) grad_norm 2.2834 (1.8504) [2022-01-23 16:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][600/1251] eta 0:24:03 lr 0.000311 time 2.8798 (2.2180) loss 3.4608 (3.3951) grad_norm 1.6122 (1.8497) [2022-01-23 16:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][610/1251] eta 0:23:40 lr 0.000311 time 1.9874 (2.2158) loss 4.0767 (3.3919) grad_norm 1.9664 (1.8490) [2022-01-23 16:15:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][620/1251] eta 0:23:17 lr 0.000311 time 2.2357 (2.2152) loss 3.8868 (3.3916) grad_norm 2.1507 (1.8488) [2022-01-23 16:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][630/1251] eta 0:22:55 lr 0.000311 time 1.9634 (2.2142) loss 2.2996 (3.3898) grad_norm 2.0579 (1.8478) [2022-01-23 16:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][640/1251] eta 0:22:34 lr 0.000311 time 3.4637 (2.2168) loss 3.6628 (3.3923) grad_norm 1.7322 (1.8475) [2022-01-23 16:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][650/1251] eta 0:22:12 lr 0.000311 time 1.8916 (2.2164) loss 3.5150 (3.3945) grad_norm 1.6136 (1.8475) [2022-01-23 16:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][660/1251] eta 0:21:50 lr 0.000311 time 1.7406 (2.2181) loss 3.2835 (3.3975) grad_norm 1.8271 (1.8475) [2022-01-23 16:17:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][670/1251] eta 0:21:29 lr 0.000311 time 2.2360 (2.2188) loss 3.0916 (3.3960) grad_norm 2.0107 (1.8483) [2022-01-23 16:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][680/1251] eta 0:21:08 lr 0.000311 time 3.1703 (2.2207) loss 2.4980 (3.3947) grad_norm 1.7571 (1.8471) [2022-01-23 16:18:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][690/1251] eta 0:20:45 lr 0.000311 time 1.6988 (2.2195) loss 3.4879 (3.3992) grad_norm 1.7466 (1.8455) [2022-01-23 16:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][700/1251] eta 0:20:21 lr 0.000311 time 1.5868 (2.2162) loss 2.7675 (3.3991) grad_norm 1.7550 (1.8439) [2022-01-23 16:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][710/1251] eta 0:19:57 lr 0.000310 time 1.8225 (2.2138) loss 3.4003 (3.4013) grad_norm 2.0259 (1.8434) [2022-01-23 16:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][720/1251] eta 0:19:34 lr 
0.000310 time 1.9287 (2.2118) loss 3.4353 (3.3999) grad_norm 1.7102 (1.8433) [2022-01-23 16:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][730/1251] eta 0:19:11 lr 0.000310 time 1.8464 (2.2105) loss 4.0735 (3.3988) grad_norm 1.9232 (1.8438) [2022-01-23 16:20:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][740/1251] eta 0:18:49 lr 0.000310 time 2.4176 (2.2098) loss 3.4904 (3.3986) grad_norm 1.5685 (1.8451) [2022-01-23 16:20:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][750/1251] eta 0:18:26 lr 0.000310 time 2.2153 (2.2085) loss 2.9386 (3.4006) grad_norm 1.8087 (1.8458) [2022-01-23 16:20:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][760/1251] eta 0:18:04 lr 0.000310 time 2.3427 (2.2085) loss 4.0724 (3.3997) grad_norm 1.6573 (1.8471) [2022-01-23 16:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][770/1251] eta 0:17:42 lr 0.000310 time 2.2445 (2.2093) loss 3.5519 (3.4009) grad_norm 1.8442 (1.8470) [2022-01-23 16:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][780/1251] eta 0:17:20 lr 0.000310 time 2.2846 (2.2086) loss 3.3274 (3.4020) grad_norm 1.5051 (1.8460) [2022-01-23 16:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][790/1251] eta 0:16:58 lr 0.000310 time 1.7802 (2.2099) loss 2.9658 (3.4037) grad_norm 1.6973 (1.8461) [2022-01-23 16:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][800/1251] eta 0:16:37 lr 0.000310 time 2.1753 (2.2119) loss 3.9014 (3.4037) grad_norm 1.6654 (1.8464) [2022-01-23 16:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][810/1251] eta 0:16:15 lr 0.000310 time 2.2362 (2.2128) loss 3.6199 (3.4046) grad_norm 1.9253 (1.8456) [2022-01-23 16:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][820/1251] eta 0:15:52 lr 0.000310 time 1.7033 (2.2099) loss 2.4436 (3.4034) grad_norm 1.8427 (1.8456) [2022-01-23 16:23:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][830/1251] eta 0:15:30 lr 0.000310 time 2.5546 (2.2109) loss 3.6450 (3.4011) grad_norm 1.7259 (1.8456) [2022-01-23 16:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][840/1251] eta 0:15:08 lr 0.000310 time 2.2594 (2.2100) loss 3.1623 (3.4034) grad_norm 1.6752 (1.8449) [2022-01-23 16:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][850/1251] eta 0:14:45 lr 0.000310 time 1.8978 (2.2070) loss 3.9976 (3.4021) grad_norm 1.9592 (1.8446) [2022-01-23 16:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][860/1251] eta 0:14:21 lr 0.000310 time 1.9458 (2.2040) loss 3.9849 (3.4019) grad_norm 1.8687 (1.8451) [2022-01-23 16:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][870/1251] eta 0:13:59 lr 0.000310 time 2.4020 (2.2039) loss 3.2096 (3.4036) grad_norm 1.9269 (1.8448) [2022-01-23 16:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][880/1251] eta 0:13:38 lr 0.000310 time 2.5869 (2.2053) loss 3.6446 (3.4028) grad_norm 1.6490 (1.8459) [2022-01-23 16:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][890/1251] eta 0:13:16 lr 0.000310 time 2.0506 (2.2056) loss 2.9048 (3.4015) grad_norm 2.0065 (1.8464) [2022-01-23 16:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][900/1251] eta 0:12:54 lr 0.000310 time 1.9522 (2.2054) loss 3.9605 (3.4033) grad_norm 1.8674 (1.8456) [2022-01-23 16:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][910/1251] eta 0:12:32 lr 0.000310 time 2.0152 (2.2076) loss 3.7308 (3.4044) grad_norm 2.0719 (1.8454) [2022-01-23 16:26:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][920/1251] eta 0:12:11 lr 0.000310 time 2.5461 (2.2088) loss 3.9384 (3.4026) grad_norm 1.7178 (1.8444) [2022-01-23 16:27:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][930/1251] eta 0:11:49 lr 
0.000310 time 2.1557 (2.2099) loss 3.9565 (3.4012) grad_norm 1.7455 (1.8430) [2022-01-23 16:27:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][940/1251] eta 0:11:26 lr 0.000310 time 2.2161 (2.2089) loss 3.7552 (3.4023) grad_norm 1.8327 (1.8425) [2022-01-23 16:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][950/1251] eta 0:11:04 lr 0.000310 time 1.8147 (2.2082) loss 3.5507 (3.3994) grad_norm 1.5754 (1.8416) [2022-01-23 16:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][960/1251] eta 0:10:42 lr 0.000310 time 1.5940 (2.2071) loss 3.6999 (3.3987) grad_norm 1.7305 (1.8416) [2022-01-23 16:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][970/1251] eta 0:10:20 lr 0.000309 time 1.5420 (2.2074) loss 2.8782 (3.3966) grad_norm 1.7809 (1.8399) [2022-01-23 16:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][980/1251] eta 0:09:57 lr 0.000309 time 2.1490 (2.2062) loss 3.9002 (3.3935) grad_norm 2.3426 (1.8405) [2022-01-23 16:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][990/1251] eta 0:09:36 lr 0.000309 time 2.2321 (2.2070) loss 2.9488 (3.3924) grad_norm 1.7798 (1.8402) [2022-01-23 16:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1000/1251] eta 0:09:13 lr 0.000309 time 2.1831 (2.2065) loss 2.6993 (3.3937) grad_norm 1.5538 (1.8403) [2022-01-23 16:29:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1010/1251] eta 0:08:52 lr 0.000309 time 2.1101 (2.2079) loss 3.5726 (3.3931) grad_norm 1.4450 (1.8394) [2022-01-23 16:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1020/1251] eta 0:08:29 lr 0.000309 time 1.8210 (2.2057) loss 3.4988 (3.3905) grad_norm 1.7618 (1.8387) [2022-01-23 16:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1030/1251] eta 0:08:07 lr 0.000309 time 1.6259 (2.2046) loss 2.4089 (3.3899) grad_norm 2.1039 (1.8381) [2022-01-23 
16:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1040/1251] eta 0:07:44 lr 0.000309 time 1.6348 (2.2028) loss 3.5257 (3.3883) grad_norm 1.9014 (1.8379) [2022-01-23 16:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1050/1251] eta 0:07:22 lr 0.000309 time 1.9598 (2.2027) loss 3.4843 (3.3876) grad_norm 1.8853 (1.8367) [2022-01-23 16:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1060/1251] eta 0:07:00 lr 0.000309 time 2.1239 (2.2021) loss 3.2686 (3.3862) grad_norm 1.8268 (1.8365) [2022-01-23 16:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1070/1251] eta 0:06:39 lr 0.000309 time 2.8575 (2.2044) loss 3.7620 (3.3869) grad_norm 1.7060 (1.8362) [2022-01-23 16:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1080/1251] eta 0:06:17 lr 0.000309 time 1.9112 (2.2061) loss 3.7324 (3.3876) grad_norm 2.3503 (1.8388) [2022-01-23 16:32:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1090/1251] eta 0:05:55 lr 0.000309 time 1.5615 (2.2066) loss 3.0114 (3.3872) grad_norm 2.0965 (1.8398) [2022-01-23 16:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1100/1251] eta 0:05:33 lr 0.000309 time 1.8651 (2.2054) loss 2.6317 (3.3876) grad_norm 1.8531 (1.8398) [2022-01-23 16:33:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1110/1251] eta 0:05:10 lr 0.000309 time 1.9596 (2.2031) loss 2.6013 (3.3855) grad_norm 1.5652 (1.8396) [2022-01-23 16:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1120/1251] eta 0:04:48 lr 0.000309 time 1.8882 (2.2009) loss 3.9978 (3.3883) grad_norm 1.7698 (1.8384) [2022-01-23 16:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1130/1251] eta 0:04:26 lr 0.000309 time 1.9233 (2.1997) loss 3.6715 (3.3880) grad_norm 1.7782 (1.8380) [2022-01-23 16:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1140/1251] 
eta 0:04:04 lr 0.000309 time 2.1855 (2.1999) loss 2.9631 (3.3871) grad_norm 1.7708 (1.8372) [2022-01-23 16:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1150/1251] eta 0:03:42 lr 0.000309 time 2.4906 (2.2005) loss 3.6141 (3.3871) grad_norm 1.9779 (1.8367) [2022-01-23 16:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1160/1251] eta 0:03:20 lr 0.000309 time 2.8373 (2.2022) loss 3.9254 (3.3879) grad_norm 1.6449 (1.8361) [2022-01-23 16:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1170/1251] eta 0:02:58 lr 0.000309 time 2.8566 (2.2048) loss 2.3029 (3.3873) grad_norm 1.6927 (1.8358) [2022-01-23 16:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1180/1251] eta 0:02:36 lr 0.000309 time 3.1045 (2.2061) loss 2.8465 (3.3886) grad_norm 1.7960 (1.8355) [2022-01-23 16:36:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1190/1251] eta 0:02:14 lr 0.000309 time 2.0164 (2.2039) loss 2.5293 (3.3878) grad_norm 1.5718 (1.8351) [2022-01-23 16:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1200/1251] eta 0:01:52 lr 0.000309 time 2.5232 (2.2023) loss 3.7193 (3.3887) grad_norm 1.7291 (1.8343) [2022-01-23 16:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1210/1251] eta 0:01:30 lr 0.000309 time 3.0559 (2.2032) loss 3.9670 (3.3899) grad_norm 2.2194 (1.8346) [2022-01-23 16:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1220/1251] eta 0:01:08 lr 0.000309 time 2.2691 (2.2030) loss 4.0753 (3.3918) grad_norm 1.9065 (1.8345) [2022-01-23 16:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1230/1251] eta 0:00:46 lr 0.000308 time 2.2257 (2.2033) loss 2.7717 (3.3903) grad_norm 2.0049 (1.8339) [2022-01-23 16:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1240/1251] eta 0:00:24 lr 0.000308 time 1.6631 (2.2023) loss 3.4646 (3.3905) grad_norm 1.8457 
(1.8345) [2022-01-23 16:38:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [188/300][1250/1251] eta 0:00:02 lr 0.000308 time 1.1533 (2.1967) loss 3.3982 (3.3912) grad_norm 2.2809 (1.8342) [2022-01-23 16:38:34 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 188 training takes 0:45:48 [2022-01-23 16:38:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.532 (18.532) Loss 0.9680 (0.9680) Acc@1 77.246 (77.246) Acc@5 94.336 (94.336) [2022-01-23 16:39:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.259 (3.231) Loss 0.9871 (0.9580) Acc@1 77.344 (77.539) Acc@5 94.727 (94.070) [2022-01-23 16:39:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.610 (2.577) Loss 0.9246 (0.9524) Acc@1 76.270 (77.441) Acc@5 94.922 (94.043) [2022-01-23 16:39:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.942 (2.363) Loss 1.0375 (0.9473) Acc@1 75.000 (77.589) Acc@5 93.164 (94.040) [2022-01-23 16:40:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.289 (2.189) Loss 0.8460 (0.9391) Acc@1 80.176 (77.811) Acc@5 95.605 (94.153) [2022-01-23 16:40:11 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.818 Acc@5 94.102 [2022-01-23 16:40:11 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.8% [2022-01-23 16:40:11 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.87% [2022-01-23 16:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][0/1251] eta 6:44:45 lr 0.000308 time 19.4133 (19.4133) loss 3.2774 (3.2774) grad_norm 2.1148 (2.1148) [2022-01-23 16:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][10/1251] eta 1:23:01 lr 0.000308 time 2.6533 (4.0141) loss 3.8573 (3.4814) grad_norm 1.6988 (1.7564) [2022-01-23 16:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][20/1251] eta 1:05:25 lr 0.000308 time 2.1926 (3.1892) loss 
2.3956 (3.4954) grad_norm 1.8835 (1.7790) [2022-01-23 16:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][30/1251] eta 0:57:55 lr 0.000308 time 1.6089 (2.8462) loss 4.0056 (3.3479) grad_norm 1.9661 (1.8253) [2022-01-23 16:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][40/1251] eta 0:55:13 lr 0.000308 time 3.6509 (2.7366) loss 3.7303 (3.2938) grad_norm 1.8895 (1.8270) [2022-01-23 16:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][50/1251] eta 0:52:47 lr 0.000308 time 2.3827 (2.6373) loss 3.4123 (3.2837) grad_norm 1.6917 (1.8154) [2022-01-23 16:42:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][60/1251] eta 0:51:11 lr 0.000308 time 2.5557 (2.5789) loss 2.2912 (3.3070) grad_norm 1.8259 (1.7983) [2022-01-23 16:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][70/1251] eta 0:49:21 lr 0.000308 time 2.1851 (2.5076) loss 3.0232 (3.3064) grad_norm 1.7716 (1.8083) [2022-01-23 16:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][80/1251] eta 0:48:11 lr 0.000308 time 3.3070 (2.4694) loss 3.3455 (3.3123) grad_norm 1.6183 (1.8187) [2022-01-23 16:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][90/1251] eta 0:46:55 lr 0.000308 time 1.6126 (2.4251) loss 2.7393 (3.3360) grad_norm 1.8398 (1.8228) [2022-01-23 16:44:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][100/1251] eta 0:47:12 lr 0.000308 time 1.8233 (2.4610) loss 3.5132 (3.3572) grad_norm 1.9532 (1.8247) [2022-01-23 16:44:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][110/1251] eta 0:46:06 lr 0.000308 time 2.2394 (2.4244) loss 2.3539 (3.3420) grad_norm 1.7208 (1.8287) [2022-01-23 16:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][120/1251] eta 0:44:47 lr 0.000308 time 2.2837 (2.3761) loss 3.4434 (3.3457) grad_norm 1.8272 (1.8350) [2022-01-23 16:45:18 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [189/300][130/1251] eta 0:43:49 lr 0.000308 time 1.9427 (2.3455) loss 3.6366 (3.3452) grad_norm 1.8076 (1.8338) [2022-01-23 16:45:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][140/1251] eta 0:43:06 lr 0.000308 time 1.8653 (2.3279) loss 3.7661 (3.3660) grad_norm 1.9597 (1.8333) [2022-01-23 16:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][150/1251] eta 0:42:27 lr 0.000308 time 2.2233 (2.3135) loss 3.5285 (3.3888) grad_norm 1.8635 (1.8462) [2022-01-23 16:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][160/1251] eta 0:42:00 lr 0.000308 time 2.4555 (2.3102) loss 3.2565 (3.3927) grad_norm 1.5546 (1.8512) [2022-01-23 16:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][170/1251] eta 0:41:31 lr 0.000308 time 1.7418 (2.3047) loss 2.1448 (3.3917) grad_norm 1.9261 (1.8500) [2022-01-23 16:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][180/1251] eta 0:41:06 lr 0.000308 time 2.7409 (2.3026) loss 3.0551 (3.3838) grad_norm 2.0078 (1.8466) [2022-01-23 16:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][190/1251] eta 0:40:38 lr 0.000308 time 2.6070 (2.2981) loss 3.6925 (3.3914) grad_norm 1.6255 (1.8460) [2022-01-23 16:47:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][200/1251] eta 0:40:31 lr 0.000308 time 3.2077 (2.3137) loss 2.8530 (3.4007) grad_norm 1.9792 (1.8479) [2022-01-23 16:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][210/1251] eta 0:40:07 lr 0.000308 time 2.0910 (2.3123) loss 3.5688 (3.4027) grad_norm 1.6752 (1.8473) [2022-01-23 16:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][220/1251] eta 0:39:33 lr 0.000308 time 2.2162 (2.3022) loss 3.6433 (3.4085) grad_norm 2.1727 (1.8470) [2022-01-23 16:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][230/1251] eta 0:38:52 lr 0.000308 time 1.6453 (2.2845) loss 3.2200 
(3.4087) grad_norm 2.4303 (1.8485) [2022-01-23 16:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][240/1251] eta 0:38:18 lr 0.000307 time 1.6240 (2.2734) loss 3.6617 (3.4081) grad_norm 1.7618 (1.8506) [2022-01-23 16:49:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][250/1251] eta 0:37:50 lr 0.000307 time 1.8737 (2.2681) loss 4.0083 (3.4112) grad_norm 2.0947 (1.8509) [2022-01-23 16:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][260/1251] eta 0:37:31 lr 0.000307 time 2.2425 (2.2723) loss 3.7139 (3.4194) grad_norm 1.7386 (1.8495) [2022-01-23 16:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][270/1251] eta 0:37:05 lr 0.000307 time 2.4909 (2.2691) loss 2.6785 (3.4132) grad_norm 1.6276 (1.8472) [2022-01-23 16:50:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][280/1251] eta 0:36:40 lr 0.000307 time 1.6174 (2.2658) loss 3.4007 (3.4040) grad_norm 2.0033 (1.8491) [2022-01-23 16:51:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][290/1251] eta 0:36:17 lr 0.000307 time 2.2841 (2.2661) loss 2.3973 (3.3987) grad_norm 1.9556 (1.8480) [2022-01-23 16:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][300/1251] eta 0:35:54 lr 0.000307 time 2.5518 (2.2653) loss 3.8055 (3.3919) grad_norm 1.8638 (1.8458) [2022-01-23 16:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][310/1251] eta 0:35:32 lr 0.000307 time 1.8289 (2.2657) loss 3.9521 (3.3862) grad_norm 1.6861 (1.8434) [2022-01-23 16:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][320/1251] eta 0:35:09 lr 0.000307 time 1.8539 (2.2657) loss 3.3708 (3.3828) grad_norm 1.6707 (1.8425) [2022-01-23 16:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][330/1251] eta 0:35:29 lr 0.000307 time 2.1507 (2.3122) loss 3.7205 (3.3854) grad_norm 1.7344 (1.8426) [2022-01-23 16:53:16 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [189/300][340/1251] eta 0:34:57 lr 0.000307 time 1.7627 (2.3023) loss 3.9105 (3.3827) grad_norm 1.5863 (1.8405) [2022-01-23 16:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][350/1251] eta 0:35:34 lr 0.000307 time 1.8090 (2.3686) loss 3.0536 (3.3804) grad_norm 1.8334 (1.8394) [2022-01-23 16:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][360/1251] eta 0:34:55 lr 0.000307 time 1.8876 (2.3516) loss 3.6400 (3.3823) grad_norm 1.6115 (1.8389) [2022-01-23 16:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][370/1251] eta 0:34:21 lr 0.000307 time 1.8663 (2.3401) loss 3.6708 (3.3820) grad_norm 2.0158 (1.8416) [2022-01-23 16:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][380/1251] eta 0:33:52 lr 0.000307 time 2.2406 (2.3337) loss 3.8519 (3.3750) grad_norm 1.9640 (1.8379) [2022-01-23 16:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][390/1251] eta 0:33:26 lr 0.000307 time 2.3171 (2.3301) loss 3.5450 (3.3823) grad_norm 1.9563 (1.8408) [2022-01-23 16:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][400/1251] eta 0:32:58 lr 0.000307 time 2.5317 (2.3248) loss 3.9985 (3.3787) grad_norm 1.9455 (1.8382) [2022-01-23 16:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][410/1251] eta 0:32:32 lr 0.000307 time 1.8557 (2.3217) loss 3.5203 (3.3737) grad_norm 1.9338 (1.8391) [2022-01-23 16:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][420/1251] eta 0:32:11 lr 0.000307 time 2.9464 (2.3248) loss 3.0613 (3.3782) grad_norm 2.1159 (1.8374) [2022-01-23 16:57:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][430/1251] eta 0:32:02 lr 0.000307 time 2.5690 (2.3413) loss 3.1470 (3.3731) grad_norm 1.7513 (1.8361) [2022-01-23 16:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][440/1251] eta 0:31:33 lr 0.000307 time 2.5312 (2.3343) loss 3.1036 
(3.3698) grad_norm 2.0127 (1.8348) [2022-01-23 16:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][450/1251] eta 0:31:03 lr 0.000307 time 1.9394 (2.3260) loss 2.5294 (3.3657) grad_norm 1.7846 (1.8331) [2022-01-23 16:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][460/1251] eta 0:30:32 lr 0.000307 time 1.8504 (2.3163) loss 4.2096 (3.3641) grad_norm 1.9959 (1.8330) [2022-01-23 16:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][470/1251] eta 0:30:04 lr 0.000307 time 1.8630 (2.3109) loss 4.0414 (3.3658) grad_norm 1.8908 (1.8340) [2022-01-23 16:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][480/1251] eta 0:29:37 lr 0.000307 time 2.0288 (2.3058) loss 2.6787 (3.3647) grad_norm 1.9322 (1.8338) [2022-01-23 16:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][490/1251] eta 0:29:12 lr 0.000307 time 2.1777 (2.3025) loss 3.5339 (3.3666) grad_norm 1.7210 (1.8338) [2022-01-23 16:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][500/1251] eta 0:28:47 lr 0.000307 time 1.5425 (2.3009) loss 3.5180 (3.3623) grad_norm 1.5949 (1.8329) [2022-01-23 16:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][510/1251] eta 0:28:26 lr 0.000306 time 2.8195 (2.3030) loss 3.0069 (3.3635) grad_norm 1.5879 (1.8306) [2022-01-23 17:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][520/1251] eta 0:28:01 lr 0.000306 time 1.9685 (2.2998) loss 2.4013 (3.3624) grad_norm 1.6783 (1.8297) [2022-01-23 17:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][530/1251] eta 0:27:34 lr 0.000306 time 2.1866 (2.2943) loss 3.3966 (3.3619) grad_norm 1.8375 (1.8280) [2022-01-23 17:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][540/1251] eta 0:27:09 lr 0.000306 time 1.5069 (2.2917) loss 3.0307 (3.3635) grad_norm 1.7168 (1.8272) [2022-01-23 17:01:13 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [189/300][550/1251] eta 0:26:45 lr 0.000306 time 2.6027 (2.2900) loss 3.5823 (3.3619) grad_norm 1.6909 (1.8261) [2022-01-23 17:01:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][560/1251] eta 0:26:21 lr 0.000306 time 2.0148 (2.2884) loss 2.8554 (3.3576) grad_norm 2.0390 (1.8261) [2022-01-23 17:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][570/1251] eta 0:25:57 lr 0.000306 time 2.3068 (2.2873) loss 2.8932 (3.3515) grad_norm 1.9872 (1.8253) [2022-01-23 17:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][580/1251] eta 0:25:32 lr 0.000306 time 2.1863 (2.2843) loss 3.1723 (3.3481) grad_norm 1.7773 (1.8256) [2022-01-23 17:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][590/1251] eta 0:25:06 lr 0.000306 time 1.8870 (2.2794) loss 3.3151 (3.3467) grad_norm 1.8605 (1.8254) [2022-01-23 17:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][600/1251] eta 0:24:42 lr 0.000306 time 2.1798 (2.2772) loss 3.9295 (3.3521) grad_norm 1.7515 (1.8252) [2022-01-23 17:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][610/1251] eta 0:24:17 lr 0.000306 time 2.0579 (2.2732) loss 3.6794 (3.3485) grad_norm 1.6929 (1.8256) [2022-01-23 17:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][620/1251] eta 0:23:53 lr 0.000306 time 2.2049 (2.2719) loss 2.6780 (3.3491) grad_norm 1.7949 (1.8267) [2022-01-23 17:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][630/1251] eta 0:23:30 lr 0.000306 time 2.1824 (2.2719) loss 3.6554 (3.3467) grad_norm 2.1880 (1.8268) [2022-01-23 17:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][640/1251] eta 0:23:08 lr 0.000306 time 1.8800 (2.2724) loss 3.6878 (3.3435) grad_norm 1.6949 (1.8259) [2022-01-23 17:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][650/1251] eta 0:22:46 lr 0.000306 time 2.3575 (2.2730) loss 3.0988 
(3.3467) grad_norm 1.9426 (1.8262) [2022-01-23 17:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][660/1251] eta 0:22:23 lr 0.000306 time 1.8414 (2.2727) loss 2.8389 (3.3444) grad_norm 1.9096 (1.8265) [2022-01-23 17:05:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][670/1251] eta 0:22:01 lr 0.000306 time 1.8649 (2.2751) loss 2.6705 (3.3424) grad_norm 1.7891 (1.8280) [2022-01-23 17:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][680/1251] eta 0:21:37 lr 0.000306 time 1.6216 (2.2728) loss 4.0584 (3.3413) grad_norm 1.9025 (1.8295) [2022-01-23 17:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][690/1251] eta 0:21:13 lr 0.000306 time 1.9361 (2.2693) loss 4.2683 (3.3424) grad_norm 2.1057 (1.8321) [2022-01-23 17:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][700/1251] eta 0:20:51 lr 0.000306 time 1.9197 (2.2717) loss 3.7226 (3.3465) grad_norm 1.7543 (1.8323) [2022-01-23 17:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][710/1251] eta 0:20:28 lr 0.000306 time 1.8500 (2.2701) loss 2.5194 (3.3469) grad_norm 1.7292 (1.8317) [2022-01-23 17:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][720/1251] eta 0:20:03 lr 0.000306 time 1.8658 (2.2668) loss 3.1651 (3.3489) grad_norm 1.8188 (1.8314) [2022-01-23 17:07:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][730/1251] eta 0:19:40 lr 0.000306 time 2.6309 (2.2654) loss 3.5027 (3.3470) grad_norm 1.8205 (1.8317) [2022-01-23 17:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][740/1251] eta 0:19:17 lr 0.000306 time 2.1899 (2.2659) loss 2.4118 (3.3444) grad_norm 1.7063 (1.8313) [2022-01-23 17:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][750/1251] eta 0:18:55 lr 0.000306 time 2.3033 (2.2663) loss 3.5562 (3.3448) grad_norm 1.7399 (1.8304) [2022-01-23 17:08:54 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [189/300][760/1251] eta 0:18:31 lr 0.000306 time 2.2756 (2.2643) loss 3.2866 (3.3466) grad_norm 1.6936 (1.8313) [2022-01-23 17:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][770/1251] eta 0:18:07 lr 0.000305 time 2.3004 (2.2619) loss 3.4831 (3.3480) grad_norm 1.6577 (1.8313) [2022-01-23 17:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][780/1251] eta 0:17:44 lr 0.000305 time 1.8457 (2.2593) loss 3.3071 (3.3507) grad_norm 1.5493 (1.8308) [2022-01-23 17:09:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][790/1251] eta 0:17:21 lr 0.000305 time 2.0381 (2.2600) loss 3.6253 (3.3544) grad_norm 1.6082 (1.8305) [2022-01-23 17:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][800/1251] eta 0:16:59 lr 0.000305 time 1.9512 (2.2603) loss 3.7390 (3.3577) grad_norm 1.6317 (1.8299) [2022-01-23 17:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][810/1251] eta 0:16:36 lr 0.000305 time 1.8713 (2.2593) loss 3.8328 (3.3544) grad_norm 1.7057 (1.8299) [2022-01-23 17:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][820/1251] eta 0:16:12 lr 0.000305 time 1.9457 (2.2573) loss 2.7313 (3.3545) grad_norm 1.8439 (1.8302) [2022-01-23 17:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][830/1251] eta 0:15:51 lr 0.000305 time 1.6783 (2.2590) loss 3.6973 (3.3557) grad_norm 1.6739 (1.8295) [2022-01-23 17:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][840/1251] eta 0:15:27 lr 0.000305 time 1.9123 (2.2562) loss 3.7375 (3.3560) grad_norm 1.5187 (1.8290) [2022-01-23 17:12:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][850/1251] eta 0:15:07 lr 0.000305 time 1.6644 (2.2622) loss 2.5591 (3.3554) grad_norm 1.9845 (1.8299) [2022-01-23 17:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][860/1251] eta 0:14:43 lr 0.000305 time 1.8556 (2.2583) loss 3.9170 
(3.3550) grad_norm 1.5829 (1.8308) [2022-01-23 17:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][870/1251] eta 0:14:23 lr 0.000305 time 2.2169 (2.2651) loss 3.2341 (3.3562) grad_norm 1.9586 (1.8311) [2022-01-23 17:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][880/1251] eta 0:13:58 lr 0.000305 time 1.5819 (2.2601) loss 3.1085 (3.3566) grad_norm 1.8970 (1.8312) [2022-01-23 17:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][890/1251] eta 0:13:34 lr 0.000305 time 2.2324 (2.2566) loss 3.6518 (3.3564) grad_norm 1.7032 (1.8321) [2022-01-23 17:14:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][900/1251] eta 0:13:11 lr 0.000305 time 1.8858 (2.2549) loss 3.5632 (3.3564) grad_norm 1.7168 (1.8312) [2022-01-23 17:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][910/1251] eta 0:12:48 lr 0.000305 time 2.5157 (2.2542) loss 2.1413 (3.3564) grad_norm 1.8500 (1.8303) [2022-01-23 17:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][920/1251] eta 0:12:26 lr 0.000305 time 1.8596 (2.2549) loss 3.7645 (3.3581) grad_norm 1.9369 (1.8304) [2022-01-23 17:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][930/1251] eta 0:12:04 lr 0.000305 time 2.5930 (2.2564) loss 3.5856 (3.3612) grad_norm 1.6090 (1.8297) [2022-01-23 17:15:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][940/1251] eta 0:11:41 lr 0.000305 time 2.1865 (2.2559) loss 3.7050 (3.3645) grad_norm 1.7008 (1.8297) [2022-01-23 17:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][950/1251] eta 0:11:18 lr 0.000305 time 2.5122 (2.2550) loss 2.4148 (3.3621) grad_norm 1.6629 (1.8299) [2022-01-23 17:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][960/1251] eta 0:10:59 lr 0.000305 time 2.4359 (2.2652) loss 3.5454 (3.3610) grad_norm 1.6882 (1.8286) [2022-01-23 17:16:49 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [189/300][970/1251] eta 0:10:36 lr 0.000305 time 2.1807 (2.2634) loss 3.5167 (3.3607) grad_norm 1.8064 (1.8280) [2022-01-23 17:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][980/1251] eta 0:10:12 lr 0.000305 time 1.8363 (2.2599) loss 3.9939 (3.3604) grad_norm 2.5635 (1.8284) [2022-01-23 17:17:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][990/1251] eta 0:09:49 lr 0.000305 time 2.5725 (2.2578) loss 3.8189 (3.3620) grad_norm 2.2090 (1.8284) [2022-01-23 17:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1000/1251] eta 0:09:26 lr 0.000305 time 2.2452 (2.2559) loss 3.1702 (3.3601) grad_norm 1.6577 (1.8272) [2022-01-23 17:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1010/1251] eta 0:09:03 lr 0.000305 time 1.9569 (2.2544) loss 2.6518 (3.3601) grad_norm 1.5773 (1.8270) [2022-01-23 17:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1020/1251] eta 0:08:40 lr 0.000305 time 2.2416 (2.2532) loss 3.3935 (3.3597) grad_norm 1.7924 (1.8260) [2022-01-23 17:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1030/1251] eta 0:08:18 lr 0.000305 time 2.4635 (2.2543) loss 3.7518 (3.3600) grad_norm 1.5847 (1.8258) [2022-01-23 17:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1040/1251] eta 0:08:08 lr 0.000304 time 2.4936 (2.3136) loss 4.2148 (3.3566) grad_norm 1.8802 (1.8250) [2022-01-23 17:20:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1050/1251] eta 0:07:44 lr 0.000304 time 1.8386 (2.3098) loss 3.0073 (3.3573) grad_norm 2.0950 (1.8255) [2022-01-23 17:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1060/1251] eta 0:07:20 lr 0.000304 time 1.9277 (2.3061) loss 3.3888 (3.3558) grad_norm 1.8299 (1.8263) [2022-01-23 17:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1070/1251] eta 0:06:56 lr 0.000304 time 1.8575 (2.3024) loss 
2.5680 (3.3570) grad_norm 1.8555 (1.8265) [2022-01-23 17:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1080/1251] eta 0:06:33 lr 0.000304 time 1.9362 (2.2996) loss 3.4123 (3.3566) grad_norm 1.9246 (1.8268) [2022-01-23 17:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1090/1251] eta 0:06:10 lr 0.000304 time 2.1753 (2.2987) loss 2.3814 (3.3545) grad_norm 1.9788 (1.8260) [2022-01-23 17:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1100/1251] eta 0:05:46 lr 0.000304 time 2.3522 (2.2976) loss 3.5415 (3.3563) grad_norm 1.7020 (1.8252) [2022-01-23 17:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1110/1251] eta 0:05:24 lr 0.000304 time 2.5103 (2.2980) loss 3.8892 (3.3553) grad_norm 1.7255 (1.8248) [2022-01-23 17:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1120/1251] eta 0:05:02 lr 0.000304 time 1.7654 (2.3083) loss 2.1162 (3.3556) grad_norm 2.0250 (1.8250) [2022-01-23 17:23:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1130/1251] eta 0:04:39 lr 0.000304 time 2.4059 (2.3072) loss 2.8696 (3.3568) grad_norm 1.6844 (1.8250) [2022-01-23 17:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1140/1251] eta 0:04:15 lr 0.000304 time 1.9405 (2.3039) loss 2.4369 (3.3588) grad_norm 1.6167 (1.8255) [2022-01-23 17:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1150/1251] eta 0:03:52 lr 0.000304 time 1.8558 (2.3005) loss 2.6102 (3.3596) grad_norm 1.9357 (1.8257) [2022-01-23 17:24:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1160/1251] eta 0:03:29 lr 0.000304 time 1.8923 (2.2975) loss 3.6004 (3.3606) grad_norm 1.7101 (1.8256) [2022-01-23 17:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1170/1251] eta 0:03:06 lr 0.000304 time 2.2563 (2.2969) loss 3.0201 (3.3599) grad_norm 1.5833 (1.8257) [2022-01-23 17:25:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1180/1251] eta 0:02:42 lr 0.000304 time 2.3476 (2.2950) loss 3.0917 (3.3622) grad_norm 1.7322 (1.8254) [2022-01-23 17:25:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1190/1251] eta 0:02:19 lr 0.000304 time 1.8686 (2.2941) loss 3.8783 (3.3634) grad_norm 1.8919 (1.8255) [2022-01-23 17:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1200/1251] eta 0:01:57 lr 0.000304 time 1.5195 (2.2941) loss 3.5716 (3.3631) grad_norm 2.4288 (1.8255) [2022-01-23 17:26:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1210/1251] eta 0:01:34 lr 0.000304 time 2.7552 (2.2950) loss 3.6000 (3.3634) grad_norm 1.6985 (1.8251) [2022-01-23 17:26:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1220/1251] eta 0:01:11 lr 0.000304 time 2.4256 (2.2951) loss 3.9923 (3.3662) grad_norm 1.7035 (1.8244) [2022-01-23 17:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1230/1251] eta 0:00:48 lr 0.000304 time 1.9963 (2.2946) loss 3.1824 (3.3633) grad_norm 1.5773 (1.8235) [2022-01-23 17:27:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1240/1251] eta 0:00:25 lr 0.000304 time 1.4220 (2.2926) loss 3.5095 (3.3655) grad_norm 1.7957 (1.8232) [2022-01-23 17:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [189/300][1250/1251] eta 0:00:02 lr 0.000304 time 1.2084 (2.2864) loss 2.6782 (3.3668) grad_norm 2.0107 (1.8226) [2022-01-23 17:27:52 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 189 training takes 0:47:40 [2022-01-23 17:28:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.826 (18.826) Loss 0.9549 (0.9549) Acc@1 78.516 (78.516) Acc@5 93.164 (93.164) [2022-01-23 17:28:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.824 (3.566) Loss 0.9837 (0.9644) Acc@1 79.199 (77.601) Acc@5 93.066 (93.865) [2022-01-23 17:28:44 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.960 (2.473) Loss 0.9903 (0.9653) Acc@1 75.879 (77.479) Acc@5 94.336 (94.001) [2022-01-23 17:29:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.304 (2.259) Loss 0.9809 (0.9653) Acc@1 76.855 (77.441) Acc@5 94.336 (94.081) [2022-01-23 17:29:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.049 (2.183) Loss 1.0495 (0.9609) Acc@1 76.270 (77.596) Acc@5 92.480 (94.069) [2022-01-23 17:29:28 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.680 Acc@5 94.194 [2022-01-23 17:29:28 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.7% [2022-01-23 17:29:28 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 77.87% [2022-01-23 17:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][0/1251] eta 7:37:33 lr 0.000304 time 21.9454 (21.9454) loss 3.1383 (3.1383) grad_norm 2.1254 (2.1254) [2022-01-23 17:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][10/1251] eta 1:25:18 lr 0.000304 time 1.2990 (4.1248) loss 2.9926 (3.3624) grad_norm 1.5118 (1.8465) [2022-01-23 17:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][20/1251] eta 1:06:33 lr 0.000304 time 1.5150 (3.2444) loss 3.5171 (3.3407) grad_norm 1.7333 (1.8081) [2022-01-23 17:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][30/1251] eta 0:58:27 lr 0.000304 time 1.5988 (2.8723) loss 2.2127 (3.3251) grad_norm 2.1721 (1.8201) [2022-01-23 17:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][40/1251] eta 0:54:43 lr 0.000304 time 3.2918 (2.7111) loss 3.0940 (3.3131) grad_norm 1.6118 (1.7933) [2022-01-23 17:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][50/1251] eta 0:52:54 lr 0.000303 time 1.5079 (2.6432) loss 3.3769 (3.3525) grad_norm 1.9347 (1.7962) [2022-01-23 17:32:06 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [190/300][60/1251] eta 0:51:20 lr 0.000303 time 2.3555 (2.5864) loss 3.1927 (3.3427) grad_norm 1.6062 (1.8108) [2022-01-23 17:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][70/1251] eta 0:49:54 lr 0.000303 time 1.5925 (2.5354) loss 2.4436 (3.3187) grad_norm 1.8199 (1.8320) [2022-01-23 17:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][80/1251] eta 0:48:59 lr 0.000303 time 2.8655 (2.5105) loss 4.0031 (3.3260) grad_norm 2.0446 (1.8287) [2022-01-23 17:33:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][90/1251] eta 0:47:47 lr 0.000303 time 1.9398 (2.4701) loss 3.8857 (3.3449) grad_norm 2.2071 (1.8460) [2022-01-23 17:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][100/1251] eta 0:46:22 lr 0.000303 time 1.9422 (2.4173) loss 3.9257 (3.3468) grad_norm 2.0096 (1.8622) [2022-01-23 17:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][110/1251] eta 0:45:30 lr 0.000303 time 2.2722 (2.3931) loss 3.5387 (3.3582) grad_norm 1.7470 (1.8577) [2022-01-23 17:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][120/1251] eta 0:44:47 lr 0.000303 time 3.1501 (2.3760) loss 2.2341 (3.3570) grad_norm 2.2139 (1.8647) [2022-01-23 17:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][130/1251] eta 0:44:04 lr 0.000303 time 1.9655 (2.3589) loss 3.7387 (3.3753) grad_norm 1.7805 (1.8571) [2022-01-23 17:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][140/1251] eta 0:43:32 lr 0.000303 time 1.4339 (2.3514) loss 3.8897 (3.3842) grad_norm 2.1940 (1.8562) [2022-01-23 17:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][150/1251] eta 0:43:06 lr 0.000303 time 2.1292 (2.3494) loss 2.3453 (3.3795) grad_norm 1.6125 (1.8617) [2022-01-23 17:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][160/1251] eta 0:42:38 lr 0.000303 time 2.3848 (2.3450) loss 3.2749 (3.3737) 
grad_norm 1.7966 (1.8597) [2022-01-23 17:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][170/1251] eta 0:41:59 lr 0.000303 time 1.5867 (2.3305) loss 3.8674 (3.3802) grad_norm 1.9578 (1.8595) [2022-01-23 17:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][180/1251] eta 0:41:21 lr 0.000303 time 1.6198 (2.3174) loss 3.9506 (3.3749) grad_norm 2.0370 (1.8562) [2022-01-23 17:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][190/1251] eta 0:40:56 lr 0.000303 time 1.9094 (2.3152) loss 3.4522 (3.3854) grad_norm 1.7473 (1.8531) [2022-01-23 17:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][200/1251] eta 0:40:38 lr 0.000303 time 2.2442 (2.3202) loss 2.6623 (3.3957) grad_norm 2.0676 (1.8561) [2022-01-23 17:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][210/1251] eta 0:40:05 lr 0.000303 time 1.8407 (2.3110) loss 3.4303 (3.3939) grad_norm 1.6236 (1.8673) [2022-01-23 17:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][220/1251] eta 0:39:28 lr 0.000303 time 1.6698 (2.2977) loss 3.6858 (3.3916) grad_norm 1.7901 (1.8715) [2022-01-23 17:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][230/1251] eta 0:39:05 lr 0.000303 time 1.6879 (2.2972) loss 3.8169 (3.3854) grad_norm 1.9974 (1.8715) [2022-01-23 17:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][240/1251] eta 0:38:36 lr 0.000303 time 2.0474 (2.2917) loss 3.6349 (3.3780) grad_norm 1.8946 (1.8676) [2022-01-23 17:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][250/1251] eta 0:38:04 lr 0.000303 time 1.9128 (2.2819) loss 3.4306 (3.3773) grad_norm 1.7931 (1.8653) [2022-01-23 17:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][260/1251] eta 0:37:29 lr 0.000303 time 2.2378 (2.2702) loss 3.3213 (3.3703) grad_norm 1.8720 (1.8641) [2022-01-23 17:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [190/300][270/1251] eta 0:37:04 lr 0.000303 time 1.6641 (2.2676) loss 3.3552 (3.3686) grad_norm 1.5816 (1.8611) [2022-01-23 17:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][280/1251] eta 0:36:47 lr 0.000303 time 1.5694 (2.2731) loss 3.7996 (3.3687) grad_norm 1.5909 (1.8566) [2022-01-23 17:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][290/1251] eta 0:36:24 lr 0.000303 time 1.9020 (2.2727) loss 4.0575 (3.3620) grad_norm 1.9056 (1.8582) [2022-01-23 17:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][300/1251] eta 0:35:56 lr 0.000303 time 1.7918 (2.2679) loss 3.2885 (3.3669) grad_norm 1.6091 (1.8525) [2022-01-23 17:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][310/1251] eta 0:35:30 lr 0.000302 time 1.8051 (2.2642) loss 3.9237 (3.3656) grad_norm 1.7845 (1.8483) [2022-01-23 17:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][320/1251] eta 0:35:00 lr 0.000302 time 1.8548 (2.2566) loss 3.3167 (3.3657) grad_norm 1.5825 (1.8467) [2022-01-23 17:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][330/1251] eta 0:34:35 lr 0.000302 time 1.9313 (2.2531) loss 3.6655 (3.3647) grad_norm 1.6366 (1.8434) [2022-01-23 17:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][340/1251] eta 0:34:16 lr 0.000302 time 2.6590 (2.2572) loss 3.8517 (3.3685) grad_norm 1.6933 (1.8430) [2022-01-23 17:42:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][350/1251] eta 0:33:53 lr 0.000302 time 2.2435 (2.2567) loss 2.7124 (3.3641) grad_norm 1.7732 (1.8413) [2022-01-23 17:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][360/1251] eta 0:33:27 lr 0.000302 time 2.4731 (2.2528) loss 4.0159 (3.3662) grad_norm 2.0721 (1.8438) [2022-01-23 17:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][370/1251] eta 0:33:01 lr 0.000302 time 1.9267 (2.2496) loss 3.5418 (3.3570) 
grad_norm 1.8131 (1.8492) [2022-01-23 17:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][380/1251] eta 0:32:36 lr 0.000302 time 2.2210 (2.2468) loss 3.3758 (3.3633) grad_norm 1.8752 (1.8514) [2022-01-23 17:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][390/1251] eta 0:32:12 lr 0.000302 time 2.3039 (2.2443) loss 3.4862 (3.3682) grad_norm 1.7510 (1.8505) [2022-01-23 17:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][400/1251] eta 0:31:44 lr 0.000302 time 1.8582 (2.2382) loss 3.2023 (3.3658) grad_norm 1.6934 (1.8480) [2022-01-23 17:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][410/1251] eta 0:31:17 lr 0.000302 time 1.9745 (2.2327) loss 3.0918 (3.3625) grad_norm 1.7392 (1.8482) [2022-01-23 17:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][420/1251] eta 0:30:55 lr 0.000302 time 2.9573 (2.2330) loss 2.7438 (3.3604) grad_norm 1.6526 (1.8475) [2022-01-23 17:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][430/1251] eta 0:30:33 lr 0.000302 time 2.4348 (2.2327) loss 3.4672 (3.3598) grad_norm 1.6609 (1.8476) [2022-01-23 17:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][440/1251] eta 0:30:14 lr 0.000302 time 1.9801 (2.2371) loss 2.7560 (3.3653) grad_norm 1.7985 (1.8484) [2022-01-23 17:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][450/1251] eta 0:29:54 lr 0.000302 time 2.2645 (2.2398) loss 3.0090 (3.3677) grad_norm 1.7815 (1.8472) [2022-01-23 17:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][460/1251] eta 0:29:31 lr 0.000302 time 2.5600 (2.2394) loss 3.2041 (3.3631) grad_norm 1.8937 (1.8459) [2022-01-23 17:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][470/1251] eta 0:29:05 lr 0.000302 time 1.5265 (2.2346) loss 3.8446 (3.3650) grad_norm 1.9919 (1.8460) [2022-01-23 17:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [190/300][480/1251] eta 0:28:38 lr 0.000302 time 1.9882 (2.2290) loss 3.0058 (3.3666) grad_norm 1.7793 (1.8446) [2022-01-23 17:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][490/1251] eta 0:28:13 lr 0.000302 time 1.6419 (2.2259) loss 4.1284 (3.3673) grad_norm 1.9559 (1.8436) [2022-01-23 17:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][500/1251] eta 0:27:49 lr 0.000302 time 2.5496 (2.2227) loss 3.9966 (3.3747) grad_norm 2.0490 (1.8444) [2022-01-23 17:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][510/1251] eta 0:27:25 lr 0.000302 time 1.8827 (2.2212) loss 2.3455 (3.3703) grad_norm 1.8410 (1.8436) [2022-01-23 17:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][520/1251] eta 0:27:04 lr 0.000302 time 2.5789 (2.2221) loss 3.5403 (3.3741) grad_norm 1.8931 (1.8439) [2022-01-23 17:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][530/1251] eta 0:26:42 lr 0.000302 time 2.2264 (2.2224) loss 2.8000 (3.3748) grad_norm 1.8366 (1.8437) [2022-01-23 17:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][540/1251] eta 0:26:21 lr 0.000302 time 2.5824 (2.2241) loss 3.2079 (3.3719) grad_norm 2.0696 (1.8466) [2022-01-23 17:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][550/1251] eta 0:25:58 lr 0.000302 time 2.3316 (2.2230) loss 3.0421 (3.3719) grad_norm 1.9079 (1.8452) [2022-01-23 17:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][560/1251] eta 0:25:35 lr 0.000302 time 1.8156 (2.2226) loss 3.1374 (3.3682) grad_norm 1.8182 (1.8448) [2022-01-23 17:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][570/1251] eta 0:25:13 lr 0.000302 time 2.3043 (2.2224) loss 3.7433 (3.3726) grad_norm 2.4512 (1.8479) [2022-01-23 17:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][580/1251] eta 0:24:51 lr 0.000301 time 2.6999 (2.2222) loss 2.4279 (3.3698) 
grad_norm 1.7691 (1.8478) [2022-01-23 17:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][590/1251] eta 0:24:30 lr 0.000301 time 2.2687 (2.2241) loss 2.9258 (3.3678) grad_norm 2.3056 (1.8475) [2022-01-23 17:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][600/1251] eta 0:24:04 lr 0.000301 time 1.5629 (2.2194) loss 3.3635 (3.3679) grad_norm 1.6731 (1.8510) [2022-01-23 17:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][610/1251] eta 0:23:41 lr 0.000301 time 2.4614 (2.2175) loss 3.7981 (3.3707) grad_norm 1.7874 (1.8505) [2022-01-23 17:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][620/1251] eta 0:23:18 lr 0.000301 time 1.9628 (2.2169) loss 2.6597 (3.3681) grad_norm 1.8757 (1.8506) [2022-01-23 17:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][630/1251] eta 0:22:57 lr 0.000301 time 1.7185 (2.2175) loss 3.6714 (3.3725) grad_norm 2.3903 (1.8517) [2022-01-23 17:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][640/1251] eta 0:22:34 lr 0.000301 time 2.1693 (2.2170) loss 3.0582 (3.3722) grad_norm 1.8538 (1.8512) [2022-01-23 17:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][650/1251] eta 0:22:12 lr 0.000301 time 2.1581 (2.2178) loss 4.0153 (3.3745) grad_norm 2.0230 (1.8506) [2022-01-23 17:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][660/1251] eta 0:21:49 lr 0.000301 time 1.9218 (2.2164) loss 3.4913 (3.3747) grad_norm 1.8249 (1.8508) [2022-01-23 17:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][670/1251] eta 0:21:27 lr 0.000301 time 1.9348 (2.2163) loss 2.9740 (3.3738) grad_norm 1.6718 (1.8522) [2022-01-23 17:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][680/1251] eta 0:21:05 lr 0.000301 time 1.9832 (2.2162) loss 3.9669 (3.3766) grad_norm 1.6790 (1.8516) [2022-01-23 17:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [190/300][690/1251] eta 0:20:43 lr 0.000301 time 2.3866 (2.2160) loss 3.6188 (3.3759) grad_norm 1.7517 (1.8506) [2022-01-23 17:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][700/1251] eta 0:20:20 lr 0.000301 time 2.1828 (2.2143) loss 3.9405 (3.3786) grad_norm 1.7015 (1.8492) [2022-01-23 17:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][710/1251] eta 0:19:56 lr 0.000301 time 1.5621 (2.2117) loss 2.9795 (3.3794) grad_norm 1.7846 (1.8476) [2022-01-23 17:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][720/1251] eta 0:19:33 lr 0.000301 time 1.9173 (2.2101) loss 3.3351 (3.3749) grad_norm 1.6720 (1.8471) [2022-01-23 17:56:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][730/1251] eta 0:19:11 lr 0.000301 time 2.5898 (2.2111) loss 3.5495 (3.3733) grad_norm 1.6110 (1.8452) [2022-01-23 17:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][740/1251] eta 0:18:49 lr 0.000301 time 1.8695 (2.2099) loss 2.8434 (3.3720) grad_norm 2.0961 (1.8443) [2022-01-23 17:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][750/1251] eta 0:18:27 lr 0.000301 time 1.7021 (2.2110) loss 3.7478 (3.3725) grad_norm 1.7857 (1.8435) [2022-01-23 17:57:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][760/1251] eta 0:18:07 lr 0.000301 time 1.9445 (2.2139) loss 2.6289 (3.3730) grad_norm 1.6775 (1.8438) [2022-01-23 17:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][770/1251] eta 0:17:45 lr 0.000301 time 2.5421 (2.2162) loss 4.0977 (3.3739) grad_norm 1.6671 (1.8439) [2022-01-23 17:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][780/1251] eta 0:17:23 lr 0.000301 time 1.8521 (2.2149) loss 3.5333 (3.3738) grad_norm 1.7587 (1.8436) [2022-01-23 17:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][790/1251] eta 0:16:59 lr 0.000301 time 1.9140 (2.2113) loss 3.7099 (3.3740) 
grad_norm 1.6076 (1.8424) [2022-01-23 17:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][800/1251] eta 0:16:35 lr 0.000301 time 2.2206 (2.2081) loss 3.2652 (3.3747) grad_norm 1.8383 (1.8425) [2022-01-23 17:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][810/1251] eta 0:16:12 lr 0.000301 time 2.3940 (2.2060) loss 4.2467 (3.3738) grad_norm 1.7042 (1.8412) [2022-01-23 17:59:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][820/1251] eta 0:15:50 lr 0.000301 time 2.7413 (2.2059) loss 3.7448 (3.3761) grad_norm 1.8182 (1.8411) [2022-01-23 18:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][830/1251] eta 0:15:28 lr 0.000301 time 1.6183 (2.2051) loss 3.9117 (3.3768) grad_norm 1.8460 (1.8402) [2022-01-23 18:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][840/1251] eta 0:15:06 lr 0.000300 time 1.6186 (2.2046) loss 3.5544 (3.3757) grad_norm 1.8914 (1.8397) [2022-01-23 18:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][850/1251] eta 0:14:44 lr 0.000300 time 2.7869 (2.2062) loss 3.5020 (3.3746) grad_norm 1.6702 (1.8387) [2022-01-23 18:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][860/1251] eta 0:14:23 lr 0.000300 time 2.8704 (2.2089) loss 3.7380 (3.3735) grad_norm 1.6063 (1.8369) [2022-01-23 18:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][870/1251] eta 0:14:01 lr 0.000300 time 1.8905 (2.2087) loss 3.4155 (3.3753) grad_norm 2.0178 (1.8358) [2022-01-23 18:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][880/1251] eta 0:13:39 lr 0.000300 time 2.2075 (2.2085) loss 4.0730 (3.3735) grad_norm 1.8866 (1.8349) [2022-01-23 18:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][890/1251] eta 0:13:17 lr 0.000300 time 3.2807 (2.2089) loss 3.4151 (3.3729) grad_norm 2.3383 (1.8354) [2022-01-23 18:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [190/300][900/1251] eta 0:12:54 lr 0.000300 time 1.8186 (2.2069) loss 2.3468 (3.3744) grad_norm 1.7401 (1.8347) [2022-01-23 18:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][910/1251] eta 0:12:32 lr 0.000300 time 1.6610 (2.2064) loss 3.2489 (3.3743) grad_norm 1.9308 (1.8345) [2022-01-23 18:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][920/1251] eta 0:12:10 lr 0.000300 time 2.4554 (2.2062) loss 3.2366 (3.3733) grad_norm 1.7891 (1.8349) [2022-01-23 18:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][930/1251] eta 0:11:48 lr 0.000300 time 2.1501 (2.2058) loss 3.4858 (3.3731) grad_norm 1.7034 (1.8354) [2022-01-23 18:04:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][940/1251] eta 0:11:25 lr 0.000300 time 2.3414 (2.2052) loss 3.5399 (3.3734) grad_norm 1.6646 (1.8349) [2022-01-23 18:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][950/1251] eta 0:11:04 lr 0.000300 time 3.1741 (2.2066) loss 2.7668 (3.3710) grad_norm 1.8461 (1.8356) [2022-01-23 18:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][960/1251] eta 0:10:42 lr 0.000300 time 2.7235 (2.2075) loss 3.6432 (3.3718) grad_norm 2.0325 (1.8360) [2022-01-23 18:05:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][970/1251] eta 0:10:20 lr 0.000300 time 2.4361 (2.2070) loss 2.4492 (3.3725) grad_norm 1.7171 (1.8365) [2022-01-23 18:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][980/1251] eta 0:09:58 lr 0.000300 time 2.2183 (2.2069) loss 3.1374 (3.3727) grad_norm 1.6288 (1.8363) [2022-01-23 18:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][990/1251] eta 0:09:36 lr 0.000300 time 3.0229 (2.2076) loss 3.7245 (3.3754) grad_norm 2.0784 (1.8373) [2022-01-23 18:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1000/1251] eta 0:09:14 lr 0.000300 time 2.0329 (2.2072) loss 3.7223 (3.3736) 
grad_norm 1.9223 (1.8378) [2022-01-23 18:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1010/1251] eta 0:08:51 lr 0.000300 time 2.1553 (2.2059) loss 2.3889 (3.3723) grad_norm 1.8186 (1.8378) [2022-01-23 18:07:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1020/1251] eta 0:08:29 lr 0.000300 time 1.9880 (2.2058) loss 2.2412 (3.3705) grad_norm 1.5972 (1.8373) [2022-01-23 18:07:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1030/1251] eta 0:08:07 lr 0.000300 time 2.6196 (2.2062) loss 3.7927 (3.3704) grad_norm 2.0953 (1.8366) [2022-01-23 18:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1040/1251] eta 0:07:45 lr 0.000300 time 2.2662 (2.2068) loss 2.8182 (3.3711) grad_norm 2.0310 (1.8371) [2022-01-23 18:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1050/1251] eta 0:07:23 lr 0.000300 time 1.8532 (2.2068) loss 4.2689 (3.3735) grad_norm 1.8010 (1.8368) [2022-01-23 18:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1060/1251] eta 0:07:01 lr 0.000300 time 1.8318 (2.2069) loss 3.2977 (3.3758) grad_norm 1.8473 (1.8364) [2022-01-23 18:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1070/1251] eta 0:06:39 lr 0.000300 time 1.6804 (2.2059) loss 3.3132 (3.3759) grad_norm 1.8014 (1.8365) [2022-01-23 18:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1080/1251] eta 0:06:16 lr 0.000300 time 1.6195 (2.2045) loss 3.7189 (3.3757) grad_norm 1.7033 (1.8362) [2022-01-23 18:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1090/1251] eta 0:05:54 lr 0.000300 time 1.8554 (2.2046) loss 2.4892 (3.3770) grad_norm 1.6664 (1.8350) [2022-01-23 18:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1100/1251] eta 0:05:32 lr 0.000300 time 1.9580 (2.2044) loss 3.9329 (3.3786) grad_norm 1.8703 (1.8346) [2022-01-23 18:10:16 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [190/300][1110/1251] eta 0:05:10 lr 0.000299 time 2.2397 (2.2029) loss 3.5164 (3.3801) grad_norm 1.7731 (1.8343) [2022-01-23 18:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1120/1251] eta 0:04:48 lr 0.000299 time 2.4807 (2.2024) loss 3.8269 (3.3792) grad_norm 2.1421 (1.8341) [2022-01-23 18:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1130/1251] eta 0:04:26 lr 0.000299 time 2.4756 (2.2026) loss 4.1917 (3.3765) grad_norm 1.5199 (1.8339) [2022-01-23 18:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1140/1251] eta 0:04:04 lr 0.000299 time 3.8403 (2.2064) loss 3.6590 (3.3754) grad_norm 1.7023 (1.8332) [2022-01-23 18:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1150/1251] eta 0:03:43 lr 0.000299 time 2.2061 (2.2080) loss 3.6325 (3.3746) grad_norm 1.6112 (1.8330) [2022-01-23 18:12:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1160/1251] eta 0:03:20 lr 0.000299 time 2.8492 (2.2076) loss 2.3595 (3.3761) grad_norm 1.6636 (1.8321) [2022-01-23 18:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1170/1251] eta 0:02:58 lr 0.000299 time 1.9263 (2.2069) loss 3.8280 (3.3763) grad_norm 1.6980 (1.8317) [2022-01-23 18:12:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1180/1251] eta 0:02:36 lr 0.000299 time 2.1983 (2.2056) loss 2.4491 (3.3764) grad_norm 1.8279 (1.8308) [2022-01-23 18:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1190/1251] eta 0:02:14 lr 0.000299 time 2.2181 (2.2039) loss 3.2428 (3.3772) grad_norm 1.5479 (1.8313) [2022-01-23 18:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1200/1251] eta 0:01:52 lr 0.000299 time 2.2213 (2.2031) loss 2.2845 (3.3739) grad_norm 1.7633 (1.8317) [2022-01-23 18:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1210/1251] eta 0:01:30 lr 0.000299 time 2.2768 (2.2037) loss 
3.3964 (3.3705) grad_norm 1.7321 (1.8326) [2022-01-23 18:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1220/1251] eta 0:01:08 lr 0.000299 time 1.8830 (2.2036) loss 3.8331 (3.3727) grad_norm 1.9688 (1.8321) [2022-01-23 18:14:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1230/1251] eta 0:00:46 lr 0.000299 time 2.5299 (2.2043) loss 2.2160 (3.3724) grad_norm 1.7140 (1.8310) [2022-01-23 18:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1240/1251] eta 0:00:24 lr 0.000299 time 2.2699 (2.2053) loss 2.8925 (3.3738) grad_norm 1.7368 (1.8306) [2022-01-23 18:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [190/300][1250/1251] eta 0:00:02 lr 0.000299 time 1.2057 (2.1997) loss 2.1907 (3.3722) grad_norm 1.9260 (1.8304) [2022-01-23 18:15:21 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 190 training takes 0:45:52 [2022-01-23 18:15:21 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_190 saving...... [2022-01-23 18:15:33 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_190 saved !!! 
[2022-01-23 18:15:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.534 (16.534) Loss 0.9682 (0.9682) Acc@1 78.516 (78.516) Acc@5 93.652 (93.652) [2022-01-23 18:16:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.313 (2.799) Loss 0.9405 (0.9425) Acc@1 78.027 (78.205) Acc@5 94.434 (94.363) [2022-01-23 18:16:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.030 (2.255) Loss 0.9343 (0.9412) Acc@1 76.270 (78.125) Acc@5 93.945 (94.262) [2022-01-23 18:16:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.412 (2.086) Loss 0.9553 (0.9470) Acc@1 79.395 (78.046) Acc@5 94.141 (94.150) [2022-01-23 18:16:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.086 (2.054) Loss 0.9349 (0.9450) Acc@1 77.734 (78.011) Acc@5 94.629 (94.210) [2022-01-23 18:17:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.020 Acc@5 94.248 [2022-01-23 18:17:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.0% [2022-01-23 18:17:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.02% [2022-01-23 18:17:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][0/1251] eta 7:32:12 lr 0.000299 time 21.6889 (21.6889) loss 3.6249 (3.6249) grad_norm 1.8185 (1.8185) [2022-01-23 18:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][10/1251] eta 1:22:19 lr 0.000299 time 1.7117 (3.9805) loss 3.4358 (3.2552) grad_norm 1.8209 (1.8331) [2022-01-23 18:18:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][20/1251] eta 1:03:12 lr 0.000299 time 1.4741 (3.0808) loss 2.4294 (3.1984) grad_norm 1.5543 (1.7824) [2022-01-23 18:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][30/1251] eta 0:58:16 lr 0.000299 time 1.9582 (2.8638) loss 3.0219 (3.1885) grad_norm 1.8393 (1.7599) [2022-01-23 18:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[191/300][40/1251] eta 0:54:37 lr 0.000299 time 3.6029 (2.7065) loss 3.0138 (3.2045) grad_norm 1.7280 (1.7629) [2022-01-23 18:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][50/1251] eta 0:52:10 lr 0.000299 time 1.8355 (2.6068) loss 3.4303 (3.2880) grad_norm 1.4803 (1.7587) [2022-01-23 18:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][60/1251] eta 0:49:47 lr 0.000299 time 1.5893 (2.5083) loss 3.5533 (3.3340) grad_norm 1.8147 (1.7886) [2022-01-23 18:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][70/1251] eta 0:48:50 lr 0.000299 time 2.2943 (2.4812) loss 3.3774 (3.3437) grad_norm 1.6929 (1.7844) [2022-01-23 18:20:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][80/1251] eta 0:48:11 lr 0.000299 time 3.5694 (2.4692) loss 3.5880 (3.3373) grad_norm 2.1785 (1.7997) [2022-01-23 18:20:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][90/1251] eta 0:47:06 lr 0.000299 time 1.5557 (2.4347) loss 3.2518 (3.3218) grad_norm 1.8852 (1.8048) [2022-01-23 18:21:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][100/1251] eta 0:46:09 lr 0.000299 time 2.2861 (2.4066) loss 3.6018 (3.3381) grad_norm 1.7558 (1.8069) [2022-01-23 18:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][110/1251] eta 0:45:23 lr 0.000299 time 1.9733 (2.3872) loss 3.9036 (3.3506) grad_norm 1.7253 (1.7997) [2022-01-23 18:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][120/1251] eta 0:44:51 lr 0.000298 time 4.0806 (2.3800) loss 3.7137 (3.3439) grad_norm 1.7518 (1.7928) [2022-01-23 18:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][130/1251] eta 0:44:10 lr 0.000298 time 1.5605 (2.3642) loss 3.4294 (3.3437) grad_norm 1.7979 (1.7933) [2022-01-23 18:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][140/1251] eta 0:43:24 lr 0.000298 time 1.8703 (2.3440) loss 3.8730 (3.3435) grad_norm 1.7596 
(1.7948) [2022-01-23 18:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][150/1251] eta 0:42:45 lr 0.000298 time 1.8690 (2.3298) loss 3.9265 (3.3568) grad_norm 1.8100 (1.7988) [2022-01-23 18:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][160/1251] eta 0:42:17 lr 0.000298 time 3.0882 (2.3258) loss 4.0176 (3.3719) grad_norm 1.6393 (1.8006) [2022-01-23 18:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][170/1251] eta 0:41:48 lr 0.000298 time 1.7862 (2.3207) loss 3.8527 (3.3749) grad_norm 2.0639 (1.8059) [2022-01-23 18:24:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][180/1251] eta 0:41:15 lr 0.000298 time 1.5485 (2.3113) loss 3.0341 (3.3702) grad_norm 1.5785 (1.8038) [2022-01-23 18:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][190/1251] eta 0:40:47 lr 0.000298 time 1.8840 (2.3067) loss 3.7983 (3.3648) grad_norm 1.9311 (1.8048) [2022-01-23 18:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][200/1251] eta 0:40:13 lr 0.000298 time 2.7882 (2.2963) loss 3.5414 (3.3691) grad_norm 1.7089 (1.8128) [2022-01-23 18:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][210/1251] eta 0:39:46 lr 0.000298 time 1.8976 (2.2924) loss 3.1502 (3.3780) grad_norm 1.8895 (1.8146) [2022-01-23 18:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][220/1251] eta 0:39:23 lr 0.000298 time 1.8345 (2.2921) loss 3.4158 (3.3733) grad_norm 2.0892 (1.8158) [2022-01-23 18:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][230/1251] eta 0:38:57 lr 0.000298 time 1.9095 (2.2895) loss 3.0577 (3.3770) grad_norm 1.8273 (1.8115) [2022-01-23 18:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][240/1251] eta 0:38:25 lr 0.000298 time 1.9177 (2.2801) loss 2.9435 (3.3693) grad_norm 2.0087 (1.8136) [2022-01-23 18:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[191/300][250/1251] eta 0:37:58 lr 0.000298 time 3.2461 (2.2761) loss 3.9794 (3.3624) grad_norm 1.9182 (1.8254) [2022-01-23 18:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][260/1251] eta 0:37:22 lr 0.000298 time 1.9700 (2.2628) loss 3.3030 (3.3649) grad_norm 1.7463 (1.8244) [2022-01-23 18:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][270/1251] eta 0:36:55 lr 0.000298 time 2.1281 (2.2582) loss 3.0191 (3.3633) grad_norm 1.7014 (1.8264) [2022-01-23 18:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][280/1251] eta 0:36:31 lr 0.000298 time 1.9168 (2.2570) loss 3.9318 (3.3725) grad_norm 1.9593 (1.8321) [2022-01-23 18:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][290/1251] eta 0:36:18 lr 0.000298 time 3.6370 (2.2670) loss 3.7658 (3.3801) grad_norm 2.0059 (1.8302) [2022-01-23 18:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][300/1251] eta 0:35:55 lr 0.000298 time 1.8238 (2.2668) loss 3.1738 (3.3784) grad_norm 1.5817 (1.8285) [2022-01-23 18:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][310/1251] eta 0:35:26 lr 0.000298 time 1.7997 (2.2601) loss 3.1470 (3.3818) grad_norm 1.9456 (1.8308) [2022-01-23 18:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][320/1251] eta 0:34:54 lr 0.000298 time 1.9124 (2.2497) loss 4.0368 (3.3907) grad_norm 1.7303 (1.8315) [2022-01-23 18:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][330/1251] eta 0:34:29 lr 0.000298 time 2.5355 (2.2465) loss 4.0358 (3.3965) grad_norm 1.6508 (1.8318) [2022-01-23 18:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][340/1251] eta 0:34:02 lr 0.000298 time 1.6779 (2.2417) loss 4.1486 (3.3969) grad_norm 1.7781 (1.8324) [2022-01-23 18:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][350/1251] eta 0:33:37 lr 0.000298 time 1.6142 (2.2395) loss 4.0676 (3.3974) grad_norm 
2.1885 (1.8321) [2022-01-23 18:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][360/1251] eta 0:33:14 lr 0.000298 time 2.1773 (2.2389) loss 2.8157 (3.4022) grad_norm 1.7615 (1.8357) [2022-01-23 18:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][370/1251] eta 0:32:55 lr 0.000298 time 3.2261 (2.2423) loss 2.8142 (3.4023) grad_norm 1.8095 (1.8400) [2022-01-23 18:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][380/1251] eta 0:32:34 lr 0.000298 time 2.7060 (2.2440) loss 3.8335 (3.4116) grad_norm 1.8750 (1.8426) [2022-01-23 18:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][390/1251] eta 0:32:09 lr 0.000297 time 1.8572 (2.2410) loss 3.8366 (3.4036) grad_norm 1.7365 (1.8423) [2022-01-23 18:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][400/1251] eta 0:31:46 lr 0.000297 time 2.2211 (2.2403) loss 3.9953 (3.4024) grad_norm 1.7883 (1.8433) [2022-01-23 18:32:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][410/1251] eta 0:31:22 lr 0.000297 time 2.4786 (2.2384) loss 3.1940 (3.4039) grad_norm 1.7929 (1.8425) [2022-01-23 18:32:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][420/1251] eta 0:30:58 lr 0.000297 time 3.0235 (2.2362) loss 2.7151 (3.4018) grad_norm 1.7383 (1.8425) [2022-01-23 18:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][430/1251] eta 0:30:31 lr 0.000297 time 1.9220 (2.2311) loss 3.6993 (3.4037) grad_norm 1.7149 (1.8421) [2022-01-23 18:33:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][440/1251] eta 0:30:08 lr 0.000297 time 1.9446 (2.2297) loss 3.3442 (3.3992) grad_norm 1.6925 (1.8382) [2022-01-23 18:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][450/1251] eta 0:29:43 lr 0.000297 time 2.4858 (2.2272) loss 3.7359 (3.3927) grad_norm 1.7120 (1.8365) [2022-01-23 18:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[191/300][460/1251] eta 0:29:19 lr 0.000297 time 1.9934 (2.2250) loss 3.4824 (3.3899) grad_norm 1.8453 (1.8373) [2022-01-23 18:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][470/1251] eta 0:28:56 lr 0.000297 time 1.9066 (2.2240) loss 3.0104 (3.3934) grad_norm 1.7574 (1.8357) [2022-01-23 18:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][480/1251] eta 0:28:36 lr 0.000297 time 2.8029 (2.2267) loss 3.5068 (3.3888) grad_norm 1.6455 (1.8338) [2022-01-23 18:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][490/1251] eta 0:28:14 lr 0.000297 time 2.7406 (2.2268) loss 3.6135 (3.3907) grad_norm 1.8712 (1.8355) [2022-01-23 18:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][500/1251] eta 0:27:51 lr 0.000297 time 1.9059 (2.2257) loss 3.8278 (3.3925) grad_norm 1.8431 (1.8362) [2022-01-23 18:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][510/1251] eta 0:27:27 lr 0.000297 time 1.8019 (2.2237) loss 3.9739 (3.3953) grad_norm 1.5828 (1.8357) [2022-01-23 18:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][520/1251] eta 0:27:05 lr 0.000297 time 2.6510 (2.2241) loss 4.2123 (3.3957) grad_norm 1.8174 (1.8354) [2022-01-23 18:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][530/1251] eta 0:26:44 lr 0.000297 time 2.2246 (2.2252) loss 3.9560 (3.3962) grad_norm 1.5293 (1.8353) [2022-01-23 18:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][540/1251] eta 0:26:23 lr 0.000297 time 1.8442 (2.2275) loss 3.3012 (3.3977) grad_norm 1.6825 (1.8356) [2022-01-23 18:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][550/1251] eta 0:26:00 lr 0.000297 time 1.7714 (2.2261) loss 3.7772 (3.4017) grad_norm 1.8608 (1.8361) [2022-01-23 18:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][560/1251] eta 0:25:37 lr 0.000297 time 1.8466 (2.2254) loss 2.2796 (3.3982) grad_norm 
1.8727 (1.8358) [2022-01-23 18:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][570/1251] eta 0:25:14 lr 0.000297 time 2.3885 (2.2237) loss 4.1794 (3.3990) grad_norm 2.1894 (1.8348) [2022-01-23 18:38:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][580/1251] eta 0:24:50 lr 0.000297 time 1.6227 (2.2218) loss 3.6550 (3.3981) grad_norm 1.8694 (1.8340) [2022-01-23 18:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][590/1251] eta 0:24:27 lr 0.000297 time 2.2399 (2.2199) loss 4.0065 (3.4006) grad_norm 1.8744 (1.8341) [2022-01-23 18:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][600/1251] eta 0:24:04 lr 0.000297 time 2.2440 (2.2195) loss 3.7626 (3.4039) grad_norm 2.0778 (1.8370) [2022-01-23 18:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][610/1251] eta 0:23:43 lr 0.000297 time 2.6713 (2.2207) loss 3.6821 (3.4022) grad_norm 2.0981 (1.8369) [2022-01-23 18:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][620/1251] eta 0:23:20 lr 0.000297 time 2.1658 (2.2190) loss 3.7443 (3.3995) grad_norm 1.7395 (1.8381) [2022-01-23 18:40:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][630/1251] eta 0:22:56 lr 0.000297 time 1.6421 (2.2164) loss 2.7701 (3.4035) grad_norm 1.6794 (1.8372) [2022-01-23 18:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][640/1251] eta 0:22:36 lr 0.000297 time 3.2662 (2.2204) loss 3.5545 (3.3993) grad_norm 1.8178 (1.8370) [2022-01-23 18:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][650/1251] eta 0:22:14 lr 0.000296 time 1.4604 (2.2199) loss 3.7429 (3.3971) grad_norm 1.6363 (1.8360) [2022-01-23 18:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][660/1251] eta 0:21:53 lr 0.000296 time 1.6345 (2.2224) loss 2.5662 (3.3964) grad_norm 2.0497 (1.8351) [2022-01-23 18:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[191/300][670/1251] eta 0:21:30 lr 0.000296 time 1.6541 (2.2206) loss 3.5257 (3.3955) grad_norm 1.9555 (1.8348) [2022-01-23 18:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][680/1251] eta 0:21:08 lr 0.000296 time 2.9916 (2.2208) loss 3.2892 (3.3953) grad_norm 1.8217 (1.8349) [2022-01-23 18:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][690/1251] eta 0:20:44 lr 0.000296 time 1.9349 (2.2179) loss 3.6803 (3.3945) grad_norm 1.7357 (1.8334) [2022-01-23 18:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][700/1251] eta 0:20:20 lr 0.000296 time 1.7247 (2.2154) loss 2.5223 (3.3955) grad_norm 1.8829 (1.8326) [2022-01-23 18:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][710/1251] eta 0:19:58 lr 0.000296 time 1.8234 (2.2151) loss 2.3900 (3.3938) grad_norm 1.7697 (1.8336) [2022-01-23 18:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][720/1251] eta 0:19:37 lr 0.000296 time 2.7427 (2.2173) loss 2.9697 (3.3931) grad_norm 1.6298 (1.8325) [2022-01-23 18:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][730/1251] eta 0:19:14 lr 0.000296 time 1.8141 (2.2159) loss 3.2450 (3.3941) grad_norm 1.9571 (1.8318) [2022-01-23 18:44:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][740/1251] eta 0:18:52 lr 0.000296 time 2.6550 (2.2170) loss 3.4016 (3.3933) grad_norm 1.8479 (1.8324) [2022-01-23 18:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][750/1251] eta 0:18:30 lr 0.000296 time 1.5067 (2.2161) loss 3.5890 (3.3964) grad_norm 1.7029 (1.8328) [2022-01-23 18:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][760/1251] eta 0:18:09 lr 0.000296 time 2.9997 (2.2180) loss 3.4256 (3.3961) grad_norm 1.7634 (1.8324) [2022-01-23 18:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][770/1251] eta 0:17:45 lr 0.000296 time 1.8129 (2.2153) loss 3.6645 (3.3982) grad_norm 
1.6305 (1.8325) [2022-01-23 18:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][780/1251] eta 0:17:21 lr 0.000296 time 1.8190 (2.2115) loss 3.6932 (3.3978) grad_norm 1.8236 (1.8325) [2022-01-23 18:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][790/1251] eta 0:16:59 lr 0.000296 time 1.8360 (2.2112) loss 3.3848 (3.3976) grad_norm 1.7497 (1.8319) [2022-01-23 18:46:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][800/1251] eta 0:16:36 lr 0.000296 time 2.3411 (2.2104) loss 3.3797 (3.3983) grad_norm 1.5286 (1.8321) [2022-01-23 18:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][810/1251] eta 0:16:15 lr 0.000296 time 2.0842 (2.2114) loss 3.5967 (3.3957) grad_norm 1.6717 (1.8323) [2022-01-23 18:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][820/1251] eta 0:15:53 lr 0.000296 time 2.5305 (2.2115) loss 3.8116 (3.3947) grad_norm 1.7501 (1.8332) [2022-01-23 18:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][830/1251] eta 0:15:30 lr 0.000296 time 2.2218 (2.2102) loss 3.8657 (3.3941) grad_norm 1.8075 (1.8327) [2022-01-23 18:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][840/1251] eta 0:15:08 lr 0.000296 time 2.4930 (2.2109) loss 3.6744 (3.3944) grad_norm 1.7249 (1.8331) [2022-01-23 18:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][850/1251] eta 0:14:46 lr 0.000296 time 1.9247 (2.2107) loss 3.5240 (3.3970) grad_norm 1.5987 (1.8330) [2022-01-23 18:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][860/1251] eta 0:14:24 lr 0.000296 time 1.9310 (2.2109) loss 3.0729 (3.3958) grad_norm 1.5878 (1.8327) [2022-01-23 18:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][870/1251] eta 0:14:02 lr 0.000296 time 2.1823 (2.2109) loss 3.6570 (3.3934) grad_norm 1.8647 (1.8339) [2022-01-23 18:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[191/300][880/1251] eta 0:13:39 lr 0.000296 time 2.7583 (2.2101) loss 3.6422 (3.3953) grad_norm 1.7323 (1.8346) [2022-01-23 18:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][890/1251] eta 0:13:16 lr 0.000296 time 1.9124 (2.2071) loss 2.4235 (3.3947) grad_norm 1.8898 (1.8342) [2022-01-23 18:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][900/1251] eta 0:12:53 lr 0.000296 time 1.9633 (2.2049) loss 3.4646 (3.3921) grad_norm 1.6277 (1.8339) [2022-01-23 18:50:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][910/1251] eta 0:12:31 lr 0.000296 time 1.9217 (2.2040) loss 2.8046 (3.3872) grad_norm 1.8425 (1.8336) [2022-01-23 18:50:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][920/1251] eta 0:12:09 lr 0.000295 time 2.8114 (2.2042) loss 3.3867 (3.3828) grad_norm 2.0207 (1.8334) [2022-01-23 18:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][930/1251] eta 0:11:48 lr 0.000295 time 2.0966 (2.2059) loss 3.0783 (3.3830) grad_norm 1.7454 (1.8325) [2022-01-23 18:51:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][940/1251] eta 0:11:26 lr 0.000295 time 1.5588 (2.2061) loss 3.4201 (3.3821) grad_norm 1.8795 (1.8330) [2022-01-23 18:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][950/1251] eta 0:11:04 lr 0.000295 time 2.2065 (2.2079) loss 3.1351 (3.3817) grad_norm 1.7287 (1.8344) [2022-01-23 18:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][960/1251] eta 0:10:42 lr 0.000295 time 2.0169 (2.2070) loss 3.1357 (3.3808) grad_norm 1.7420 (1.8347) [2022-01-23 18:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][970/1251] eta 0:10:19 lr 0.000295 time 2.2614 (2.2060) loss 3.7594 (3.3800) grad_norm 1.9581 (1.8342) [2022-01-23 18:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][980/1251] eta 0:09:57 lr 0.000295 time 2.0295 (2.2060) loss 3.5655 (3.3805) grad_norm 
1.6728 (1.8344) [2022-01-23 18:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][990/1251] eta 0:09:35 lr 0.000295 time 1.6066 (2.2043) loss 3.4323 (3.3786) grad_norm 1.8785 (1.8338) [2022-01-23 18:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1000/1251] eta 0:09:13 lr 0.000295 time 1.7616 (2.2035) loss 3.6649 (3.3755) grad_norm 2.1458 (1.8329) [2022-01-23 18:54:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1010/1251] eta 0:08:51 lr 0.000295 time 1.8739 (2.2048) loss 4.2212 (3.3772) grad_norm 1.8824 (1.8339) [2022-01-23 18:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1020/1251] eta 0:08:29 lr 0.000295 time 1.6723 (2.2051) loss 1.8628 (3.3761) grad_norm 1.5812 (1.8338) [2022-01-23 18:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1030/1251] eta 0:08:07 lr 0.000295 time 1.4817 (2.2046) loss 3.0688 (3.3744) grad_norm 2.0948 (1.8342) [2022-01-23 18:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1040/1251] eta 0:07:45 lr 0.000295 time 1.5655 (2.2048) loss 2.7463 (3.3753) grad_norm 2.1567 (1.8351) [2022-01-23 18:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1050/1251] eta 0:07:23 lr 0.000295 time 1.6685 (2.2044) loss 3.7452 (3.3746) grad_norm 1.8836 (1.8360) [2022-01-23 18:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1060/1251] eta 0:07:00 lr 0.000295 time 1.9257 (2.2032) loss 3.0628 (3.3748) grad_norm 1.8119 (1.8352) [2022-01-23 18:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1070/1251] eta 0:06:38 lr 0.000295 time 1.7159 (2.2023) loss 2.8482 (3.3748) grad_norm 1.6560 (1.8345) [2022-01-23 18:56:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1080/1251] eta 0:06:16 lr 0.000295 time 2.2028 (2.2022) loss 4.1331 (3.3769) grad_norm 2.0105 (1.8348) [2022-01-23 18:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [191/300][1090/1251] eta 0:05:54 lr 0.000295 time 2.7872 (2.2028) loss 3.4832 (3.3763) grad_norm 2.0901 (1.8342) [2022-01-23 18:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1100/1251] eta 0:05:32 lr 0.000295 time 1.6346 (2.2024) loss 2.8945 (3.3779) grad_norm 1.6518 (1.8337) [2022-01-23 18:57:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1110/1251] eta 0:05:10 lr 0.000295 time 2.1835 (2.2025) loss 3.5659 (3.3753) grad_norm 2.0034 (1.8335) [2022-01-23 18:58:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1120/1251] eta 0:04:48 lr 0.000295 time 1.9320 (2.2031) loss 3.8095 (3.3776) grad_norm 1.9635 (1.8337) [2022-01-23 18:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1130/1251] eta 0:04:26 lr 0.000295 time 2.2421 (2.2037) loss 3.4142 (3.3787) grad_norm 1.9918 (1.8347) [2022-01-23 18:58:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1140/1251] eta 0:04:04 lr 0.000295 time 1.8276 (2.2032) loss 3.6847 (3.3800) grad_norm 2.1217 (1.8347) [2022-01-23 18:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1150/1251] eta 0:03:42 lr 0.000295 time 2.2689 (2.2021) loss 3.3204 (3.3791) grad_norm 1.8846 (1.8353) [2022-01-23 18:59:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1160/1251] eta 0:03:20 lr 0.000295 time 2.5280 (2.2008) loss 2.8887 (3.3793) grad_norm 1.7237 (1.8345) [2022-01-23 19:00:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1170/1251] eta 0:02:58 lr 0.000295 time 2.0958 (2.2000) loss 3.3215 (3.3787) grad_norm 2.0159 (1.8354) [2022-01-23 19:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1180/1251] eta 0:02:36 lr 0.000295 time 2.4488 (2.2002) loss 3.6126 (3.3799) grad_norm 1.8685 (1.8362) [2022-01-23 19:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1190/1251] eta 0:02:14 lr 0.000294 time 2.4120 (2.2010) loss 2.1349 
(3.3796) grad_norm 1.8441 (1.8360) [2022-01-23 19:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1200/1251] eta 0:01:52 lr 0.000294 time 2.2653 (2.2012) loss 1.9975 (3.3785) grad_norm 1.5530 (1.8352) [2022-01-23 19:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1210/1251] eta 0:01:30 lr 0.000294 time 3.3623 (2.2032) loss 2.7064 (3.3773) grad_norm 1.8936 (1.8357) [2022-01-23 19:01:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1220/1251] eta 0:01:08 lr 0.000294 time 2.8908 (2.2037) loss 2.9271 (3.3753) grad_norm 1.8086 (1.8356) [2022-01-23 19:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1230/1251] eta 0:00:46 lr 0.000294 time 1.6481 (2.2029) loss 3.2060 (3.3756) grad_norm 1.6637 (1.8365) [2022-01-23 19:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1240/1251] eta 0:00:24 lr 0.000294 time 1.4797 (2.2007) loss 3.4943 (3.3749) grad_norm 2.0888 (1.8368) [2022-01-23 19:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [191/300][1250/1251] eta 0:00:02 lr 0.000294 time 1.2345 (2.1941) loss 3.7519 (3.3763) grad_norm 1.7069 (1.8364) [2022-01-23 19:02:49 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 191 training takes 0:45:45 [2022-01-23 19:03:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.744 (18.744) Loss 0.9147 (0.9147) Acc@1 78.027 (78.027) Acc@5 94.141 (94.141) [2022-01-23 19:03:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.866 (3.290) Loss 0.9481 (0.9343) Acc@1 77.441 (77.868) Acc@5 94.922 (94.371) [2022-01-23 19:03:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.506 (2.579) Loss 0.9388 (0.9366) Acc@1 77.637 (77.860) Acc@5 93.750 (94.266) [2022-01-23 19:04:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.277 (2.276) Loss 0.9425 (0.9394) Acc@1 76.855 (77.816) Acc@5 94.043 (94.172) [2022-01-23 19:04:19 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.195 (2.181) Loss 0.9793 (0.9380) Acc@1 76.953 (77.911) Acc@5 93.848 (94.167) [2022-01-23 19:04:25 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 77.904 Acc@5 94.210 [2022-01-23 19:04:25 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 77.9% [2022-01-23 19:04:25 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.02% [2022-01-23 19:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][0/1251] eta 7:32:34 lr 0.000294 time 21.7066 (21.7066) loss 2.1244 (2.1244) grad_norm 2.2493 (2.2493) [2022-01-23 19:05:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][10/1251] eta 1:22:59 lr 0.000294 time 1.9739 (4.0123) loss 3.2323 (3.3059) grad_norm 1.9857 (1.8966) [2022-01-23 19:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][20/1251] eta 1:06:37 lr 0.000294 time 2.5196 (3.2470) loss 3.4869 (3.4845) grad_norm 1.5572 (1.8727) [2022-01-23 19:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][30/1251] eta 0:59:39 lr 0.000294 time 1.7798 (2.9319) loss 3.6903 (3.4102) grad_norm 1.7833 (1.8847) [2022-01-23 19:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][40/1251] eta 0:56:06 lr 0.000294 time 3.8113 (2.7797) loss 2.7183 (3.3611) grad_norm 1.6797 (1.8769) [2022-01-23 19:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][50/1251] eta 0:53:48 lr 0.000294 time 2.3124 (2.6884) loss 3.2996 (3.3151) grad_norm 2.0039 (1.8915) [2022-01-23 19:07:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][60/1251] eta 0:51:29 lr 0.000294 time 2.4923 (2.5943) loss 3.6149 (3.3628) grad_norm 1.7155 (1.8818) [2022-01-23 19:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][70/1251] eta 0:48:56 lr 0.000294 time 1.9261 (2.4864) loss 3.4950 (3.3639) grad_norm 1.7129 (1.8757) [2022-01-23 19:07:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][80/1251] eta 0:47:33 lr 0.000294 time 2.6251 (2.4364) loss 3.7967 (3.3729) grad_norm 1.7667 (1.8748) [2022-01-23 19:08:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][90/1251] eta 0:46:40 lr 0.000294 time 1.8413 (2.4124) loss 3.7013 (3.3757) grad_norm 2.1551 (1.8710) [2022-01-23 19:08:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][100/1251] eta 0:45:47 lr 0.000294 time 1.9239 (2.3871) loss 3.9462 (3.3921) grad_norm 1.9749 (1.8775) [2022-01-23 19:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][110/1251] eta 0:45:09 lr 0.000294 time 2.9468 (2.3745) loss 2.5795 (3.3838) grad_norm 1.9776 (1.8896) [2022-01-23 19:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][120/1251] eta 0:44:32 lr 0.000294 time 2.1766 (2.3629) loss 2.6883 (3.3572) grad_norm 2.0904 (1.8845) [2022-01-23 19:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][130/1251] eta 0:44:08 lr 0.000294 time 3.1261 (2.3626) loss 3.4427 (3.3681) grad_norm 2.0230 (1.9010) [2022-01-23 19:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][140/1251] eta 0:43:37 lr 0.000294 time 2.1331 (2.3563) loss 2.7257 (3.3633) grad_norm 2.1850 (1.8975) [2022-01-23 19:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][150/1251] eta 0:43:15 lr 0.000294 time 2.8283 (2.3578) loss 2.2102 (3.3485) grad_norm 1.5468 (1.8954) [2022-01-23 19:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][160/1251] eta 0:42:32 lr 0.000294 time 1.6330 (2.3398) loss 3.7145 (3.3542) grad_norm 1.6584 (1.8926) [2022-01-23 19:11:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][170/1251] eta 0:41:47 lr 0.000294 time 1.7066 (2.3196) loss 3.6954 (3.3581) grad_norm 1.7443 (1.8993) [2022-01-23 19:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][180/1251] eta 0:41:08 lr 0.000294 
time 2.0880 (2.3048) loss 3.1249 (3.3584) grad_norm 1.7507 (1.8907) [2022-01-23 19:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][190/1251] eta 0:40:35 lr 0.000294 time 2.2465 (2.2958) loss 2.5435 (3.3533) grad_norm 1.7248 (1.8873) [2022-01-23 19:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][200/1251] eta 0:40:03 lr 0.000293 time 1.5536 (2.2873) loss 3.7467 (3.3538) grad_norm 1.6117 (1.8809) [2022-01-23 19:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][210/1251] eta 0:39:44 lr 0.000293 time 2.4260 (2.2907) loss 3.8735 (3.3491) grad_norm 1.7203 (1.8784) [2022-01-23 19:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][220/1251] eta 0:39:24 lr 0.000293 time 2.2879 (2.2935) loss 3.1629 (3.3504) grad_norm 1.8086 (1.8758) [2022-01-23 19:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][230/1251] eta 0:39:00 lr 0.000293 time 2.2584 (2.2927) loss 4.0144 (3.3601) grad_norm 1.8486 (1.8745) [2022-01-23 19:13:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][240/1251] eta 0:38:35 lr 0.000293 time 1.9224 (2.2907) loss 3.3852 (3.3627) grad_norm 2.0180 (1.8762) [2022-01-23 19:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][250/1251] eta 0:38:04 lr 0.000293 time 1.6786 (2.2822) loss 3.6071 (3.3686) grad_norm 1.8446 (1.8733) [2022-01-23 19:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][260/1251] eta 0:37:29 lr 0.000293 time 1.9522 (2.2696) loss 4.1580 (3.3633) grad_norm 1.7637 (1.8753) [2022-01-23 19:14:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][270/1251] eta 0:36:58 lr 0.000293 time 2.1878 (2.2610) loss 2.3943 (3.3592) grad_norm 1.6049 (1.8759) [2022-01-23 19:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][280/1251] eta 0:36:33 lr 0.000293 time 2.3279 (2.2589) loss 2.7100 (3.3476) grad_norm 1.6333 (1.8736) [2022-01-23 19:15:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][290/1251] eta 0:36:07 lr 0.000293 time 1.8312 (2.2556) loss 3.1988 (3.3498) grad_norm 1.9073 (1.8746) [2022-01-23 19:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][300/1251] eta 0:35:44 lr 0.000293 time 2.1170 (2.2553) loss 2.6596 (3.3513) grad_norm 1.7479 (1.8756) [2022-01-23 19:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][310/1251] eta 0:35:24 lr 0.000293 time 1.8590 (2.2581) loss 3.2820 (3.3471) grad_norm 1.7619 (1.8742) [2022-01-23 19:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][320/1251] eta 0:35:00 lr 0.000293 time 2.5142 (2.2567) loss 4.1016 (3.3474) grad_norm 1.9785 (1.8740) [2022-01-23 19:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][330/1251] eta 0:34:35 lr 0.000293 time 1.8358 (2.2537) loss 3.6523 (3.3516) grad_norm 2.2640 (1.8757) [2022-01-23 19:17:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][340/1251] eta 0:34:04 lr 0.000293 time 1.6558 (2.2445) loss 3.6164 (3.3537) grad_norm 1.8390 (1.8758) [2022-01-23 19:17:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][350/1251] eta 0:33:40 lr 0.000293 time 2.4513 (2.2422) loss 3.4995 (3.3466) grad_norm 1.6989 (1.8721) [2022-01-23 19:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][360/1251] eta 0:33:16 lr 0.000293 time 2.7287 (2.2413) loss 2.0974 (3.3469) grad_norm 1.7908 (1.8709) [2022-01-23 19:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][370/1251] eta 0:32:53 lr 0.000293 time 2.2093 (2.2397) loss 4.2504 (3.3474) grad_norm 1.9503 (1.8699) [2022-01-23 19:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][380/1251] eta 0:32:31 lr 0.000293 time 1.8826 (2.2406) loss 3.2945 (3.3515) grad_norm 2.1772 (1.8751) [2022-01-23 19:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][390/1251] eta 0:32:08 lr 
0.000293 time 1.9404 (2.2394) loss 3.7116 (3.3516) grad_norm 2.1973 (1.8769) [2022-01-23 19:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][400/1251] eta 0:31:47 lr 0.000293 time 2.6303 (2.2413) loss 3.8373 (3.3558) grad_norm 1.6559 (1.8772) [2022-01-23 19:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][410/1251] eta 0:31:22 lr 0.000293 time 1.7444 (2.2390) loss 3.6742 (3.3522) grad_norm 1.6942 (1.8779) [2022-01-23 19:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][420/1251] eta 0:30:58 lr 0.000293 time 2.1633 (2.2361) loss 2.5106 (3.3557) grad_norm 1.6415 (1.8787) [2022-01-23 19:20:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][430/1251] eta 0:30:33 lr 0.000293 time 2.0351 (2.2333) loss 2.8656 (3.3552) grad_norm 1.7564 (1.8781) [2022-01-23 19:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][440/1251] eta 0:30:06 lr 0.000293 time 1.9012 (2.2274) loss 3.3838 (3.3616) grad_norm 1.9715 (1.8788) [2022-01-23 19:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][450/1251] eta 0:29:44 lr 0.000293 time 2.2857 (2.2276) loss 4.0317 (3.3550) grad_norm 1.8298 (1.8800) [2022-01-23 19:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][460/1251] eta 0:29:22 lr 0.000293 time 1.8645 (2.2276) loss 3.7135 (3.3553) grad_norm 2.3238 (1.8808) [2022-01-23 19:21:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][470/1251] eta 0:29:00 lr 0.000292 time 2.4787 (2.2281) loss 2.3918 (3.3568) grad_norm 2.0823 (1.8810) [2022-01-23 19:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][480/1251] eta 0:28:36 lr 0.000292 time 1.9818 (2.2261) loss 3.5618 (3.3595) grad_norm 1.7879 (1.8824) [2022-01-23 19:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][490/1251] eta 0:28:14 lr 0.000292 time 2.2067 (2.2265) loss 3.9298 (3.3560) grad_norm 2.0049 (1.8832) [2022-01-23 19:23:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][500/1251] eta 0:27:52 lr 0.000292 time 2.5369 (2.2270) loss 3.7621 (3.3568) grad_norm 1.9409 (1.8839) [2022-01-23 19:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][510/1251] eta 0:27:30 lr 0.000292 time 2.9016 (2.2277) loss 3.4890 (3.3551) grad_norm 1.7821 (1.8850) [2022-01-23 19:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][520/1251] eta 0:27:07 lr 0.000292 time 2.0620 (2.2258) loss 3.6595 (3.3594) grad_norm 1.7639 (1.8838) [2022-01-23 19:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][530/1251] eta 0:26:42 lr 0.000292 time 1.6320 (2.2231) loss 3.0063 (3.3582) grad_norm 1.8375 (1.8848) [2022-01-23 19:24:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][540/1251] eta 0:26:18 lr 0.000292 time 2.1221 (2.2204) loss 4.0252 (3.3630) grad_norm 1.6722 (1.8842) [2022-01-23 19:24:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][550/1251] eta 0:25:55 lr 0.000292 time 1.9555 (2.2185) loss 3.7144 (3.3606) grad_norm 1.7908 (1.8849) [2022-01-23 19:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][560/1251] eta 0:25:32 lr 0.000292 time 2.2385 (2.2180) loss 2.3165 (3.3598) grad_norm 1.9159 (1.8844) [2022-01-23 19:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][570/1251] eta 0:25:11 lr 0.000292 time 1.8732 (2.2192) loss 2.5902 (3.3566) grad_norm 2.3087 (1.8840) [2022-01-23 19:25:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][580/1251] eta 0:24:49 lr 0.000292 time 2.5278 (2.2198) loss 2.8764 (3.3562) grad_norm 1.6635 (1.8873) [2022-01-23 19:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][590/1251] eta 0:24:27 lr 0.000292 time 2.1567 (2.2196) loss 3.4247 (3.3560) grad_norm 1.8078 (1.8904) [2022-01-23 19:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][600/1251] eta 0:24:04 lr 
0.000292 time 2.2757 (2.2186) loss 4.0333 (3.3595) grad_norm 1.7154 (1.8941) [2022-01-23 19:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][610/1251] eta 0:23:43 lr 0.000292 time 1.8799 (2.2205) loss 3.7321 (3.3601) grad_norm 2.0051 (1.8937) [2022-01-23 19:27:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][620/1251] eta 0:23:20 lr 0.000292 time 2.2919 (2.2190) loss 3.2662 (3.3570) grad_norm 1.9418 (1.8934) [2022-01-23 19:27:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][630/1251] eta 0:22:55 lr 0.000292 time 1.7852 (2.2155) loss 3.3759 (3.3568) grad_norm 1.9708 (1.8917) [2022-01-23 19:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][640/1251] eta 0:22:33 lr 0.000292 time 2.1686 (2.2150) loss 3.8824 (3.3587) grad_norm 1.7319 (1.8896) [2022-01-23 19:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][650/1251] eta 0:22:12 lr 0.000292 time 1.9850 (2.2164) loss 2.3524 (3.3563) grad_norm 1.8132 (1.8879) [2022-01-23 19:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][660/1251] eta 0:21:49 lr 0.000292 time 2.2495 (2.2154) loss 3.2249 (3.3575) grad_norm 1.7203 (1.8854) [2022-01-23 19:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][670/1251] eta 0:21:27 lr 0.000292 time 1.6070 (2.2153) loss 3.1845 (3.3596) grad_norm 1.7268 (1.8847) [2022-01-23 19:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][680/1251] eta 0:21:04 lr 0.000292 time 1.8790 (2.2148) loss 2.7361 (3.3603) grad_norm 1.8782 (1.8841) [2022-01-23 19:29:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][690/1251] eta 0:20:43 lr 0.000292 time 2.4683 (2.2174) loss 3.3554 (3.3587) grad_norm 1.7060 (1.8820) [2022-01-23 19:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][700/1251] eta 0:20:21 lr 0.000292 time 1.9283 (2.2165) loss 2.8035 (3.3583) grad_norm 2.0417 (1.8815) [2022-01-23 19:30:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][710/1251] eta 0:19:58 lr 0.000292 time 1.6579 (2.2146) loss 3.2976 (3.3609) grad_norm 1.6826 (1.8828) [2022-01-23 19:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][720/1251] eta 0:19:35 lr 0.000292 time 2.1671 (2.2137) loss 2.7922 (3.3580) grad_norm 1.8352 (1.8822) [2022-01-23 19:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][730/1251] eta 0:19:13 lr 0.000292 time 1.9264 (2.2143) loss 2.9778 (3.3586) grad_norm 1.9125 (1.8831) [2022-01-23 19:31:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][740/1251] eta 0:18:51 lr 0.000291 time 2.2873 (2.2139) loss 4.0021 (3.3579) grad_norm 1.7602 (1.8838) [2022-01-23 19:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][750/1251] eta 0:18:28 lr 0.000291 time 1.8134 (2.2124) loss 3.3680 (3.3593) grad_norm 1.5891 (1.8829) [2022-01-23 19:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][760/1251] eta 0:18:05 lr 0.000291 time 2.0938 (2.2110) loss 3.6464 (3.3598) grad_norm 2.0363 (1.8811) [2022-01-23 19:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][770/1251] eta 0:17:43 lr 0.000291 time 2.2347 (2.2100) loss 3.7747 (3.3599) grad_norm 1.6813 (1.8799) [2022-01-23 19:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][780/1251] eta 0:17:19 lr 0.000291 time 2.0760 (2.2073) loss 3.9686 (3.3611) grad_norm 1.7268 (1.8796) [2022-01-23 19:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][790/1251] eta 0:16:56 lr 0.000291 time 2.0081 (2.2055) loss 3.6625 (3.3636) grad_norm 1.5583 (1.8788) [2022-01-23 19:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][800/1251] eta 0:16:34 lr 0.000291 time 2.5596 (2.2052) loss 2.9714 (3.3642) grad_norm 1.8211 (1.8778) [2022-01-23 19:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][810/1251] eta 0:16:12 lr 
0.000291 time 2.4941 (2.2059) loss 3.6934 (3.3669) grad_norm 1.8720 (1.8766) [2022-01-23 19:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][820/1251] eta 0:15:50 lr 0.000291 time 2.1675 (2.2061) loss 3.7860 (3.3676) grad_norm 2.3988 (1.8774) [2022-01-23 19:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][830/1251] eta 0:15:28 lr 0.000291 time 2.1965 (2.2061) loss 4.0155 (3.3731) grad_norm 1.9468 (1.8758) [2022-01-23 19:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][840/1251] eta 0:15:06 lr 0.000291 time 2.2502 (2.2053) loss 3.4203 (3.3757) grad_norm 1.7049 (1.8740) [2022-01-23 19:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][850/1251] eta 0:14:44 lr 0.000291 time 2.7515 (2.2058) loss 3.6723 (3.3768) grad_norm 1.6985 (1.8723) [2022-01-23 19:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][860/1251] eta 0:14:22 lr 0.000291 time 2.0777 (2.2059) loss 3.0221 (3.3765) grad_norm 1.9278 (1.8715) [2022-01-23 19:36:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][870/1251] eta 0:14:00 lr 0.000291 time 1.9262 (2.2062) loss 3.0568 (3.3777) grad_norm 1.7029 (1.8728) [2022-01-23 19:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][880/1251] eta 0:13:38 lr 0.000291 time 2.1234 (2.2072) loss 4.0166 (3.3767) grad_norm 1.9167 (1.8748) [2022-01-23 19:37:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][890/1251] eta 0:13:17 lr 0.000291 time 2.8485 (2.2087) loss 2.1810 (3.3775) grad_norm 1.7196 (1.8748) [2022-01-23 19:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][900/1251] eta 0:12:54 lr 0.000291 time 1.6714 (2.2074) loss 3.8410 (3.3796) grad_norm 1.6947 (1.8763) [2022-01-23 19:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][910/1251] eta 0:12:32 lr 0.000291 time 1.9205 (2.2063) loss 2.7692 (3.3815) grad_norm 1.7510 (1.8756) [2022-01-23 19:38:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][920/1251] eta 0:12:10 lr 0.000291 time 1.6491 (2.2062) loss 4.1021 (3.3827) grad_norm 1.6300 (1.8750) [2022-01-23 19:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][930/1251] eta 0:11:48 lr 0.000291 time 3.3544 (2.2076) loss 3.5513 (3.3840) grad_norm 1.6232 (1.8738) [2022-01-23 19:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][940/1251] eta 0:11:26 lr 0.000291 time 1.7007 (2.2070) loss 3.3770 (3.3825) grad_norm 2.0517 (1.8736) [2022-01-23 19:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][950/1251] eta 0:11:03 lr 0.000291 time 1.8121 (2.2054) loss 3.1174 (3.3816) grad_norm 1.8917 (1.8745) [2022-01-23 19:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][960/1251] eta 0:10:40 lr 0.000291 time 2.1998 (2.2023) loss 3.1096 (3.3804) grad_norm 1.9036 (1.8754) [2022-01-23 19:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][970/1251] eta 0:10:18 lr 0.000291 time 3.1092 (2.2019) loss 3.8210 (3.3787) grad_norm 1.7241 (1.8750) [2022-01-23 19:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][980/1251] eta 0:09:56 lr 0.000291 time 2.1029 (2.2015) loss 3.7812 (3.3797) grad_norm 1.7694 (1.8740) [2022-01-23 19:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][990/1251] eta 0:09:34 lr 0.000291 time 2.6219 (2.2025) loss 3.5691 (3.3784) grad_norm 1.9731 (1.8748) [2022-01-23 19:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1000/1251] eta 0:09:13 lr 0.000290 time 2.0983 (2.2037) loss 3.3879 (3.3772) grad_norm 1.5346 (1.8743) [2022-01-23 19:41:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1010/1251] eta 0:08:51 lr 0.000290 time 3.4147 (2.2057) loss 3.9638 (3.3780) grad_norm 1.9768 (1.8738) [2022-01-23 19:41:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1020/1251] eta 0:08:29 lr 
0.000290 time 2.1035 (2.2053) loss 2.3421 (3.3785) grad_norm 1.7478 (1.8738) [2022-01-23 19:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1030/1251] eta 0:08:07 lr 0.000290 time 1.7531 (2.2038) loss 2.4414 (3.3787) grad_norm 1.9567 (1.8734) [2022-01-23 19:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1040/1251] eta 0:07:44 lr 0.000290 time 1.5303 (2.2016) loss 3.7985 (3.3803) grad_norm 2.3757 (1.8737) [2022-01-23 19:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1050/1251] eta 0:07:22 lr 0.000290 time 3.1612 (2.2021) loss 3.5555 (3.3780) grad_norm 1.6507 (1.8730) [2022-01-23 19:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1060/1251] eta 0:07:00 lr 0.000290 time 1.9530 (2.2018) loss 3.1624 (3.3735) grad_norm 2.1848 (1.8737) [2022-01-23 19:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1070/1251] eta 0:06:38 lr 0.000290 time 2.5760 (2.2025) loss 3.2523 (3.3756) grad_norm 2.0318 (1.8738) [2022-01-23 19:44:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1080/1251] eta 0:06:16 lr 0.000290 time 2.0795 (2.2014) loss 3.4864 (3.3755) grad_norm 2.1400 (1.8735) [2022-01-23 19:44:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1090/1251] eta 0:05:54 lr 0.000290 time 2.3186 (2.2009) loss 3.8093 (3.3757) grad_norm 1.8952 (1.8730) [2022-01-23 19:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1100/1251] eta 0:05:32 lr 0.000290 time 2.4664 (2.2005) loss 3.5715 (3.3757) grad_norm 1.7336 (1.8724) [2022-01-23 19:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1110/1251] eta 0:05:10 lr 0.000290 time 2.2080 (2.2004) loss 3.5410 (3.3769) grad_norm 1.8387 (1.8706) [2022-01-23 19:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1120/1251] eta 0:04:48 lr 0.000290 time 1.5274 (2.2001) loss 2.9346 (3.3776) grad_norm 1.8624 (1.8697) [2022-01-23 
19:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1130/1251] eta 0:04:26 lr 0.000290 time 2.2462 (2.2002) loss 3.7413 (3.3758) grad_norm 1.8013 (1.8699) [2022-01-23 19:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1140/1251] eta 0:04:04 lr 0.000290 time 2.7918 (2.2011) loss 3.4204 (3.3781) grad_norm 1.6432 (1.8694) [2022-01-23 19:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1150/1251] eta 0:03:42 lr 0.000290 time 2.6976 (2.2022) loss 4.2174 (3.3773) grad_norm 1.9478 (1.8701) [2022-01-23 19:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1160/1251] eta 0:03:20 lr 0.000290 time 1.8210 (2.2031) loss 2.7917 (3.3777) grad_norm 2.0153 (1.8700) [2022-01-23 19:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1170/1251] eta 0:02:58 lr 0.000290 time 2.1666 (2.2018) loss 2.9557 (3.3784) grad_norm 1.7722 (1.8707) [2022-01-23 19:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1180/1251] eta 0:02:36 lr 0.000290 time 1.9645 (2.1993) loss 3.6132 (3.3798) grad_norm 1.5924 (1.8703) [2022-01-23 19:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1190/1251] eta 0:02:14 lr 0.000290 time 1.9402 (2.1972) loss 3.2959 (3.3788) grad_norm 1.8404 (1.8697) [2022-01-23 19:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1200/1251] eta 0:01:51 lr 0.000290 time 2.2139 (2.1960) loss 2.3750 (3.3771) grad_norm 1.7937 (1.8697) [2022-01-23 19:48:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1210/1251] eta 0:01:30 lr 0.000290 time 2.6194 (2.1964) loss 3.7694 (3.3776) grad_norm 2.2341 (1.8700) [2022-01-23 19:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1220/1251] eta 0:01:08 lr 0.000290 time 2.4267 (2.1972) loss 3.8258 (3.3784) grad_norm 2.1820 (1.8703) [2022-01-23 19:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1230/1251] 
eta 0:00:46 lr 0.000290 time 1.8620 (2.1978) loss 3.2658 (3.3780) grad_norm 1.8123 (1.8704) [2022-01-23 19:49:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1240/1251] eta 0:00:24 lr 0.000290 time 1.7500 (2.1974) loss 3.3584 (3.3786) grad_norm 2.0705 (1.8703) [2022-01-23 19:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [192/300][1250/1251] eta 0:00:02 lr 0.000290 time 1.1953 (2.1928) loss 3.2575 (3.3778) grad_norm 1.9375 (1.8700) [2022-01-23 19:50:09 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 192 training takes 0:45:43 [2022-01-23 19:50:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.236 (18.236) Loss 0.9165 (0.9165) Acc@1 80.273 (80.273) Acc@5 93.848 (93.848) [2022-01-23 19:50:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.223 (3.441) Loss 0.8985 (0.9165) Acc@1 78.906 (78.587) Acc@5 94.727 (94.478) [2022-01-23 19:51:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.975 (2.503) Loss 0.9402 (0.9258) Acc@1 77.246 (78.199) Acc@5 93.457 (94.303) [2022-01-23 19:51:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.312 (2.227) Loss 0.9180 (0.9218) Acc@1 77.637 (78.317) Acc@5 94.727 (94.330) [2022-01-23 19:51:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.221 (2.163) Loss 0.9344 (0.9231) Acc@1 78.711 (78.332) Acc@5 93.457 (94.281) [2022-01-23 19:51:45 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.214 Acc@5 94.300 [2022-01-23 19:51:45 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-01-23 19:51:45 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.21% [2022-01-23 19:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][0/1251] eta 6:58:03 lr 0.000290 time 20.0511 (20.0511) loss 3.8971 (3.8971) grad_norm 2.0842 (2.0842) [2022-01-23 19:52:29 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [193/300][10/1251] eta 1:22:02 lr 0.000290 time 2.3017 (3.9669) loss 3.6525 (3.3419) grad_norm 1.5482 (1.9188) [2022-01-23 19:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][20/1251] eta 1:04:19 lr 0.000289 time 1.5114 (3.1355) loss 3.3322 (3.3222) grad_norm 1.9105 (1.8818) [2022-01-23 19:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][30/1251] eta 0:59:14 lr 0.000289 time 1.5847 (2.9108) loss 4.1578 (3.3426) grad_norm 2.1350 (1.8779) [2022-01-23 19:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][40/1251] eta 0:54:25 lr 0.000289 time 2.6475 (2.6963) loss 4.0792 (3.4078) grad_norm 1.8222 (1.8634) [2022-01-23 19:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][50/1251] eta 0:53:15 lr 0.000289 time 3.4587 (2.6607) loss 3.7039 (3.4238) grad_norm 1.8194 (1.8611) [2022-01-23 19:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][60/1251] eta 0:51:27 lr 0.000289 time 1.3425 (2.5924) loss 3.2181 (3.4184) grad_norm 1.7235 (1.8463) [2022-01-23 19:54:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][70/1251] eta 0:49:52 lr 0.000289 time 1.6928 (2.5341) loss 3.2714 (3.3662) grad_norm 1.7589 (1.8556) [2022-01-23 19:55:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][80/1251] eta 0:48:38 lr 0.000289 time 2.4816 (2.4925) loss 4.1967 (3.3870) grad_norm 1.9461 (1.8585) [2022-01-23 19:55:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][90/1251] eta 0:47:36 lr 0.000289 time 2.8772 (2.4600) loss 3.2036 (3.3861) grad_norm 1.7693 (1.8633) [2022-01-23 19:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][100/1251] eta 0:46:31 lr 0.000289 time 1.5915 (2.4257) loss 3.1096 (3.3835) grad_norm 1.9559 (1.8578) [2022-01-23 19:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][110/1251] eta 0:45:29 lr 0.000289 time 1.8702 (2.3918) loss 3.2860 (3.3915) grad_norm 
1.6905 (1.8574) [2022-01-23 19:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][120/1251] eta 0:44:44 lr 0.000289 time 1.8282 (2.3736) loss 4.1626 (3.4000) grad_norm 1.8611 (1.8607) [2022-01-23 19:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][130/1251] eta 0:44:10 lr 0.000289 time 2.3946 (2.3643) loss 3.4098 (3.3624) grad_norm 1.8622 (1.8588) [2022-01-23 19:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][140/1251] eta 0:43:24 lr 0.000289 time 2.2898 (2.3445) loss 3.5274 (3.3653) grad_norm 1.8058 (1.8638) [2022-01-23 19:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][150/1251] eta 0:42:32 lr 0.000289 time 2.2441 (2.3180) loss 3.7959 (3.3875) grad_norm 1.7223 (1.8652) [2022-01-23 19:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][160/1251] eta 0:41:51 lr 0.000289 time 2.3500 (2.3017) loss 3.4977 (3.3836) grad_norm 1.6973 (1.8620) [2022-01-23 19:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][170/1251] eta 0:41:19 lr 0.000289 time 2.9228 (2.2938) loss 3.7146 (3.3834) grad_norm 2.3453 (1.8614) [2022-01-23 19:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][180/1251] eta 0:40:44 lr 0.000289 time 1.8680 (2.2822) loss 3.4349 (3.3968) grad_norm 1.6834 (1.8569) [2022-01-23 19:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][190/1251] eta 0:40:21 lr 0.000289 time 2.1318 (2.2821) loss 3.7847 (3.3910) grad_norm 1.8663 (1.8618) [2022-01-23 19:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][200/1251] eta 0:40:00 lr 0.000289 time 2.3561 (2.2841) loss 2.6144 (3.3759) grad_norm 1.8120 (1.8599) [2022-01-23 19:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][210/1251] eta 0:40:03 lr 0.000289 time 3.3408 (2.3085) loss 2.5052 (3.3733) grad_norm 1.6875 (1.8586) [2022-01-23 20:00:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[193/300][220/1251] eta 0:39:37 lr 0.000289 time 1.8943 (2.3056) loss 3.3889 (3.3729) grad_norm 1.8882 (1.8644) [2022-01-23 20:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][230/1251] eta 0:39:02 lr 0.000289 time 1.9139 (2.2947) loss 3.7098 (3.3716) grad_norm 1.6824 (1.8651) [2022-01-23 20:00:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][240/1251] eta 0:38:20 lr 0.000289 time 1.5646 (2.2760) loss 2.4493 (3.3735) grad_norm 1.5182 (1.8642) [2022-01-23 20:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][250/1251] eta 0:37:50 lr 0.000289 time 2.4328 (2.2677) loss 2.7587 (3.3653) grad_norm 2.1884 (1.8647) [2022-01-23 20:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][260/1251] eta 0:37:17 lr 0.000289 time 1.9072 (2.2580) loss 2.8250 (3.3577) grad_norm 2.0305 (1.8660) [2022-01-23 20:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][270/1251] eta 0:36:53 lr 0.000289 time 3.1781 (2.2563) loss 2.3840 (3.3572) grad_norm 1.7989 (1.8636) [2022-01-23 20:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][280/1251] eta 0:36:33 lr 0.000289 time 2.4167 (2.2586) loss 3.6110 (3.3632) grad_norm 2.1674 (1.8635) [2022-01-23 20:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][290/1251] eta 0:36:11 lr 0.000288 time 1.9441 (2.2593) loss 3.7888 (3.3600) grad_norm 1.7364 (1.8626) [2022-01-23 20:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][300/1251] eta 0:35:47 lr 0.000288 time 1.6227 (2.2581) loss 2.2724 (3.3581) grad_norm 1.5500 (1.8655) [2022-01-23 20:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][310/1251] eta 0:35:30 lr 0.000288 time 3.2434 (2.2636) loss 3.5511 (3.3622) grad_norm 1.7765 (1.8640) [2022-01-23 20:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][320/1251] eta 0:35:10 lr 0.000288 time 2.9241 (2.2673) loss 2.8808 (3.3485) grad_norm 
1.8472 (1.8639) [2022-01-23 20:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][330/1251] eta 0:34:44 lr 0.000288 time 1.6007 (2.2636) loss 4.2145 (3.3516) grad_norm 2.0375 (1.8648) [2022-01-23 20:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][340/1251] eta 0:34:16 lr 0.000288 time 1.9858 (2.2574) loss 2.6474 (3.3470) grad_norm 2.0744 (1.8670) [2022-01-23 20:04:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][350/1251] eta 0:33:49 lr 0.000288 time 2.9033 (2.2522) loss 3.4791 (3.3390) grad_norm 1.9187 (1.8653) [2022-01-23 20:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][360/1251] eta 0:33:20 lr 0.000288 time 2.1654 (2.2448) loss 2.5573 (3.3419) grad_norm 1.9099 (1.8660) [2022-01-23 20:05:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][370/1251] eta 0:32:56 lr 0.000288 time 2.5528 (2.2437) loss 3.4139 (3.3373) grad_norm 2.1723 (1.8672) [2022-01-23 20:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][380/1251] eta 0:32:30 lr 0.000288 time 1.8613 (2.2393) loss 3.3594 (3.3373) grad_norm 1.9804 (1.8692) [2022-01-23 20:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][390/1251] eta 0:32:09 lr 0.000288 time 3.1409 (2.2415) loss 2.1727 (3.3323) grad_norm 1.7827 (1.8702) [2022-01-23 20:06:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][400/1251] eta 0:31:48 lr 0.000288 time 2.8134 (2.2425) loss 3.6247 (3.3334) grad_norm 1.8391 (1.8725) [2022-01-23 20:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][410/1251] eta 0:31:25 lr 0.000288 time 2.1301 (2.2422) loss 2.9616 (3.3345) grad_norm 2.1276 (1.8709) [2022-01-23 20:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][420/1251] eta 0:31:01 lr 0.000288 time 2.1460 (2.2402) loss 2.9207 (3.3339) grad_norm 1.7376 (1.8704) [2022-01-23 20:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[193/300][430/1251] eta 0:30:37 lr 0.000288 time 2.6205 (2.2382) loss 4.1363 (3.3304) grad_norm 1.9085 (1.8693) [2022-01-23 20:08:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][440/1251] eta 0:30:13 lr 0.000288 time 2.2744 (2.2355) loss 2.1322 (3.3327) grad_norm 1.8836 (1.8706) [2022-01-23 20:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][450/1251] eta 0:29:49 lr 0.000288 time 2.8617 (2.2338) loss 3.3009 (3.3299) grad_norm 1.6593 (1.8698) [2022-01-23 20:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][460/1251] eta 0:29:26 lr 0.000288 time 2.8787 (2.2336) loss 3.2473 (3.3322) grad_norm 1.7513 (1.8681) [2022-01-23 20:09:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][470/1251] eta 0:29:03 lr 0.000288 time 2.2868 (2.2321) loss 3.6917 (3.3318) grad_norm 1.9449 (1.8670) [2022-01-23 20:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][480/1251] eta 0:28:37 lr 0.000288 time 1.7469 (2.2272) loss 2.6551 (3.3246) grad_norm 1.7752 (1.8655) [2022-01-23 20:09:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][490/1251] eta 0:28:12 lr 0.000288 time 1.9557 (2.2238) loss 2.9192 (3.3236) grad_norm 2.0838 (1.8650) [2022-01-23 20:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][500/1251] eta 0:27:47 lr 0.000288 time 2.1017 (2.2198) loss 3.1226 (3.3291) grad_norm 1.6768 (1.8635) [2022-01-23 20:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][510/1251] eta 0:27:24 lr 0.000288 time 1.8836 (2.2199) loss 3.6936 (3.3289) grad_norm 1.7070 (1.8613) [2022-01-23 20:11:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][520/1251] eta 0:27:03 lr 0.000288 time 2.3069 (2.2209) loss 4.2416 (3.3279) grad_norm 1.7951 (1.8607) [2022-01-23 20:11:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][530/1251] eta 0:26:44 lr 0.000288 time 2.7295 (2.2251) loss 3.7731 (3.3344) grad_norm 
1.8454 (1.8599) [2022-01-23 20:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][540/1251] eta 0:26:21 lr 0.000288 time 1.5152 (2.2239) loss 3.7797 (3.3371) grad_norm 1.6329 (1.8605) [2022-01-23 20:12:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][550/1251] eta 0:25:57 lr 0.000288 time 1.5388 (2.2224) loss 3.4149 (3.3362) grad_norm 1.7748 (1.8595) [2022-01-23 20:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][560/1251] eta 0:25:36 lr 0.000287 time 2.5278 (2.2235) loss 2.7325 (3.3370) grad_norm 1.5912 (1.8594) [2022-01-23 20:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][570/1251] eta 0:25:16 lr 0.000287 time 2.2932 (2.2261) loss 3.6940 (3.3379) grad_norm 1.8720 (1.8599) [2022-01-23 20:13:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][580/1251] eta 0:24:52 lr 0.000287 time 1.5320 (2.2242) loss 3.9140 (3.3378) grad_norm 1.7681 (1.8594) [2022-01-23 20:13:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][590/1251] eta 0:24:29 lr 0.000287 time 1.6813 (2.2226) loss 2.9110 (3.3363) grad_norm 2.1596 (1.8583) [2022-01-23 20:14:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][600/1251] eta 0:24:05 lr 0.000287 time 1.6621 (2.2205) loss 4.0259 (3.3384) grad_norm 1.6957 (1.8595) [2022-01-23 20:14:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][610/1251] eta 0:23:41 lr 0.000287 time 2.2331 (2.2181) loss 2.1359 (3.3408) grad_norm 1.8270 (1.8618) [2022-01-23 20:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][620/1251] eta 0:23:18 lr 0.000287 time 1.9675 (2.2164) loss 3.3053 (3.3385) grad_norm 1.8883 (1.8624) [2022-01-23 20:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][630/1251] eta 0:22:55 lr 0.000287 time 1.8631 (2.2150) loss 3.7678 (3.3356) grad_norm 2.1551 (1.8624) [2022-01-23 20:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[193/300][640/1251] eta 0:22:33 lr 0.000287 time 2.2456 (2.2154) loss 3.5474 (3.3334) grad_norm 1.8228 (1.8610) [2022-01-23 20:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][650/1251] eta 0:22:12 lr 0.000287 time 2.9664 (2.2167) loss 2.4342 (3.3336) grad_norm 1.6375 (1.8600) [2022-01-23 20:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][660/1251] eta 0:21:51 lr 0.000287 time 1.8765 (2.2190) loss 3.3411 (3.3355) grad_norm 1.7971 (1.8577) [2022-01-23 20:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][670/1251] eta 0:21:29 lr 0.000287 time 2.5839 (2.2202) loss 3.4812 (3.3391) grad_norm 1.8999 (1.8571) [2022-01-23 20:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][680/1251] eta 0:21:06 lr 0.000287 time 1.6426 (2.2179) loss 3.6405 (3.3397) grad_norm 1.5773 (1.8558) [2022-01-23 20:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][690/1251] eta 0:20:42 lr 0.000287 time 2.0996 (2.2150) loss 3.7747 (3.3379) grad_norm 1.9166 (1.8552) [2022-01-23 20:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][700/1251] eta 0:20:18 lr 0.000287 time 2.6305 (2.2119) loss 3.9277 (3.3424) grad_norm 1.7336 (1.8544) [2022-01-23 20:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][710/1251] eta 0:19:55 lr 0.000287 time 2.5257 (2.2104) loss 2.5354 (3.3426) grad_norm 1.8902 (1.8588) [2022-01-23 20:18:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][720/1251] eta 0:19:32 lr 0.000287 time 1.9439 (2.2090) loss 2.8014 (3.3390) grad_norm 1.7417 (1.8577) [2022-01-23 20:18:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][730/1251] eta 0:19:11 lr 0.000287 time 2.5438 (2.2093) loss 2.3780 (3.3389) grad_norm 1.9072 (1.8581) [2022-01-23 20:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][740/1251] eta 0:18:49 lr 0.000287 time 2.0987 (2.2098) loss 3.6292 (3.3429) grad_norm 
1.9139 (1.8599) [2022-01-23 20:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][750/1251] eta 0:18:27 lr 0.000287 time 1.8388 (2.2098) loss 3.9151 (3.3448) grad_norm 2.0739 (1.8603) [2022-01-23 20:19:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][760/1251] eta 0:18:05 lr 0.000287 time 2.4824 (2.2110) loss 3.6553 (3.3463) grad_norm 1.6975 (1.8608) [2022-01-23 20:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][770/1251] eta 0:17:44 lr 0.000287 time 2.5601 (2.2124) loss 2.5716 (3.3446) grad_norm 1.8868 (1.8598) [2022-01-23 20:20:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][780/1251] eta 0:17:22 lr 0.000287 time 2.1318 (2.2140) loss 3.1041 (3.3452) grad_norm 1.8925 (1.8593) [2022-01-23 20:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][790/1251] eta 0:17:00 lr 0.000287 time 1.9700 (2.2138) loss 3.3567 (3.3462) grad_norm 1.9980 (1.8591) [2022-01-23 20:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][800/1251] eta 0:16:37 lr 0.000287 time 2.0037 (2.2115) loss 3.5544 (3.3453) grad_norm 1.6164 (1.8590) [2022-01-23 20:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][810/1251] eta 0:16:13 lr 0.000287 time 1.9409 (2.2074) loss 3.7787 (3.3474) grad_norm 1.8332 (1.8598) [2022-01-23 20:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][820/1251] eta 0:15:50 lr 0.000287 time 1.8666 (2.2057) loss 3.9314 (3.3442) grad_norm 1.9287 (1.8605) [2022-01-23 20:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][830/1251] eta 0:15:28 lr 0.000286 time 1.8984 (2.2054) loss 2.0826 (3.3423) grad_norm 1.8227 (1.8607) [2022-01-23 20:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][840/1251] eta 0:15:08 lr 0.000286 time 5.6446 (2.2114) loss 2.2967 (3.3429) grad_norm 1.9039 (1.8608) [2022-01-23 20:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[193/300][850/1251] eta 0:14:47 lr 0.000286 time 2.2243 (2.2144) loss 3.4181 (3.3429) grad_norm 1.9602 (1.8616) [2022-01-23 20:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][860/1251] eta 0:14:25 lr 0.000286 time 1.9344 (2.2144) loss 3.7685 (3.3441) grad_norm 1.8080 (1.8629) [2022-01-23 20:23:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][870/1251] eta 0:14:02 lr 0.000286 time 1.5878 (2.2122) loss 2.9246 (3.3411) grad_norm 1.9025 (1.8631) [2022-01-23 20:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][880/1251] eta 0:13:39 lr 0.000286 time 2.0640 (2.2091) loss 3.4942 (3.3422) grad_norm 1.9391 (1.8647) [2022-01-23 20:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][890/1251] eta 0:13:16 lr 0.000286 time 2.2545 (2.2073) loss 4.1909 (3.3414) grad_norm 2.0508 (1.8642) [2022-01-23 20:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][900/1251] eta 0:12:54 lr 0.000286 time 2.5849 (2.2061) loss 3.7877 (3.3426) grad_norm 1.9927 (1.8641) [2022-01-23 20:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][910/1251] eta 0:12:31 lr 0.000286 time 2.2683 (2.2052) loss 2.4283 (3.3429) grad_norm 1.7126 (1.8630) [2022-01-23 20:25:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][920/1251] eta 0:12:09 lr 0.000286 time 1.8685 (2.2052) loss 2.6798 (3.3415) grad_norm 1.4941 (1.8620) [2022-01-23 20:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][930/1251] eta 0:11:47 lr 0.000286 time 2.6626 (2.2055) loss 3.8417 (3.3438) grad_norm 1.9560 (1.8619) [2022-01-23 20:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][940/1251] eta 0:11:26 lr 0.000286 time 1.6676 (2.2060) loss 3.6343 (3.3448) grad_norm 2.2059 (1.8613) [2022-01-23 20:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][950/1251] eta 0:11:04 lr 0.000286 time 1.9701 (2.2064) loss 3.6541 (3.3458) grad_norm 
1.8710 (1.8617) [2022-01-23 20:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][960/1251] eta 0:10:42 lr 0.000286 time 2.4041 (2.2083) loss 3.4353 (3.3470) grad_norm 1.8980 (1.8614) [2022-01-23 20:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][970/1251] eta 0:10:20 lr 0.000286 time 2.8573 (2.2086) loss 3.9745 (3.3494) grad_norm 1.9609 (1.8623) [2022-01-23 20:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][980/1251] eta 0:09:58 lr 0.000286 time 2.4329 (2.2082) loss 2.9938 (3.3463) grad_norm 1.4382 (1.8612) [2022-01-23 20:28:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][990/1251] eta 0:09:35 lr 0.000286 time 2.2451 (2.2067) loss 2.9368 (3.3450) grad_norm 1.9057 (1.8607) [2022-01-23 20:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1000/1251] eta 0:09:13 lr 0.000286 time 2.3876 (2.2052) loss 3.5411 (3.3449) grad_norm 1.5432 (1.8589) [2022-01-23 20:28:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1010/1251] eta 0:08:51 lr 0.000286 time 2.6139 (2.2047) loss 2.3613 (3.3452) grad_norm 1.7594 (1.8592) [2022-01-23 20:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1020/1251] eta 0:08:29 lr 0.000286 time 2.1422 (2.2041) loss 3.8285 (3.3470) grad_norm 1.9900 (1.8592) [2022-01-23 20:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1030/1251] eta 0:08:07 lr 0.000286 time 2.1702 (2.2041) loss 3.6478 (3.3485) grad_norm 2.0979 (1.8596) [2022-01-23 20:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1040/1251] eta 0:07:44 lr 0.000286 time 2.2862 (2.2036) loss 3.6537 (3.3478) grad_norm 2.0399 (1.8594) [2022-01-23 20:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1050/1251] eta 0:07:22 lr 0.000286 time 1.5022 (2.2035) loss 2.9928 (3.3489) grad_norm 2.1028 (1.8601) [2022-01-23 20:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[193/300][1060/1251] eta 0:07:00 lr 0.000286 time 2.4655 (2.2033) loss 4.0241 (3.3491) grad_norm 2.4260 (1.8604) [2022-01-23 20:31:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1070/1251] eta 0:06:38 lr 0.000286 time 1.9320 (2.2023) loss 2.4963 (3.3495) grad_norm 1.8992 (1.8606) [2022-01-23 20:31:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1080/1251] eta 0:06:16 lr 0.000286 time 2.5377 (2.2012) loss 3.3827 (3.3462) grad_norm 2.2917 (1.8618) [2022-01-23 20:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1090/1251] eta 0:05:54 lr 0.000286 time 1.8059 (2.2002) loss 3.9402 (3.3461) grad_norm 2.3323 (1.8631) [2022-01-23 20:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1100/1251] eta 0:05:32 lr 0.000285 time 2.9828 (2.2015) loss 2.9071 (3.3425) grad_norm 1.8405 (1.8645) [2022-01-23 20:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1110/1251] eta 0:05:10 lr 0.000285 time 2.4994 (2.2024) loss 3.1927 (3.3439) grad_norm 1.8133 (1.8644) [2022-01-23 20:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1120/1251] eta 0:04:48 lr 0.000285 time 2.1829 (2.2028) loss 3.4577 (3.3428) grad_norm 2.0452 (1.8650) [2022-01-23 20:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1130/1251] eta 0:04:26 lr 0.000285 time 1.6076 (2.2031) loss 3.8101 (3.3414) grad_norm 1.8240 (1.8648) [2022-01-23 20:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1140/1251] eta 0:04:04 lr 0.000285 time 2.8275 (2.2026) loss 2.9784 (3.3433) grad_norm 1.9548 (1.8640) [2022-01-23 20:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1150/1251] eta 0:03:42 lr 0.000285 time 1.7561 (2.1999) loss 3.3129 (3.3409) grad_norm 2.1195 (1.8638) [2022-01-23 20:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1160/1251] eta 0:03:20 lr 0.000285 time 2.4323 (2.1980) loss 3.1600 (3.3413) 
grad_norm 1.8244 (1.8640) [2022-01-23 20:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1170/1251] eta 0:02:57 lr 0.000285 time 2.4248 (2.1972) loss 2.8891 (3.3402) grad_norm 1.7005 (1.8629) [2022-01-23 20:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1180/1251] eta 0:02:35 lr 0.000285 time 1.5120 (2.1962) loss 2.5101 (3.3398) grad_norm 1.6860 (1.8632) [2022-01-23 20:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1190/1251] eta 0:02:13 lr 0.000285 time 2.3875 (2.1966) loss 4.0494 (3.3395) grad_norm 1.7655 (1.8637) [2022-01-23 20:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1200/1251] eta 0:01:52 lr 0.000285 time 3.1649 (2.1970) loss 3.1150 (3.3362) grad_norm 1.7791 (1.8637) [2022-01-23 20:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1210/1251] eta 0:01:30 lr 0.000285 time 2.8964 (2.1976) loss 3.8679 (3.3366) grad_norm 1.8738 (1.8639) [2022-01-23 20:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1220/1251] eta 0:01:08 lr 0.000285 time 2.4902 (2.1973) loss 3.5057 (3.3384) grad_norm 1.7210 (1.8635) [2022-01-23 20:36:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1230/1251] eta 0:00:46 lr 0.000285 time 3.2063 (2.2002) loss 3.5826 (3.3395) grad_norm 1.7988 (1.8635) [2022-01-23 20:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1240/1251] eta 0:00:24 lr 0.000285 time 2.0936 (2.1992) loss 3.6886 (3.3427) grad_norm 1.7037 (1.8642) [2022-01-23 20:37:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [193/300][1250/1251] eta 0:00:02 lr 0.000285 time 1.2077 (2.1932) loss 3.3216 (3.3448) grad_norm 1.7649 (1.8641) [2022-01-23 20:37:29 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 193 training takes 0:45:44 [2022-01-23 20:37:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.912 (17.912) Loss 1.0044 (1.0044) Acc@1 75.293 (75.293) 
Acc@5 93.359 (93.359) [2022-01-23 20:38:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.288 (3.217) Loss 1.0179 (0.9429) Acc@1 75.977 (78.125) Acc@5 93.262 (94.292) [2022-01-23 20:38:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.941 (2.458) Loss 0.9090 (0.9340) Acc@1 79.004 (78.246) Acc@5 94.336 (94.434) [2022-01-23 20:38:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.616 (2.305) Loss 0.9306 (0.9456) Acc@1 78.027 (77.949) Acc@5 94.238 (94.311) [2022-01-23 20:38:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.853 (2.176) Loss 0.9891 (0.9435) Acc@1 78.223 (78.092) Acc@5 93.164 (94.331) [2022-01-23 20:39:06 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.138 Acc@5 94.310 [2022-01-23 20:39:06 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-01-23 20:39:06 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.21% [2022-01-23 20:39:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][0/1251] eta 7:38:56 lr 0.000285 time 22.0118 (22.0118) loss 3.0033 (3.0033) grad_norm 2.1691 (2.1691) [2022-01-23 20:39:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][10/1251] eta 1:23:29 lr 0.000285 time 1.9770 (4.0367) loss 3.6431 (3.3169) grad_norm 1.9433 (1.9166) [2022-01-23 20:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][20/1251] eta 1:03:53 lr 0.000285 time 1.6710 (3.1141) loss 3.9088 (3.3581) grad_norm 1.7333 (1.9362) [2022-01-23 20:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][30/1251] eta 0:57:17 lr 0.000285 time 1.6286 (2.8156) loss 3.3963 (3.3674) grad_norm 2.3122 (2.0088) [2022-01-23 20:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][40/1251] eta 0:55:32 lr 0.000285 time 6.6662 (2.7520) loss 3.6885 (3.4073) grad_norm 1.5770 (1.9818) [2022-01-23 20:41:20 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][50/1251] eta 0:52:25 lr 0.000285 time 1.3106 (2.6188) loss 3.3324 (3.4135) grad_norm 1.6616 (1.9493) [2022-01-23 20:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][60/1251] eta 0:50:37 lr 0.000285 time 2.2602 (2.5505) loss 3.1545 (3.3838) grad_norm 1.8134 (1.9124) [2022-01-23 20:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][70/1251] eta 0:49:02 lr 0.000285 time 1.7094 (2.4916) loss 4.1572 (3.4139) grad_norm 1.8449 (1.9096) [2022-01-23 20:42:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][80/1251] eta 0:48:22 lr 0.000285 time 3.1677 (2.4788) loss 3.7248 (3.4295) grad_norm 1.9201 (1.9013) [2022-01-23 20:42:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][90/1251] eta 0:47:27 lr 0.000285 time 1.6425 (2.4525) loss 3.5129 (3.4221) grad_norm 1.7609 (1.8999) [2022-01-23 20:43:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][100/1251] eta 0:46:38 lr 0.000285 time 2.2243 (2.4318) loss 3.7688 (3.4448) grad_norm 1.9469 (1.9025) [2022-01-23 20:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][110/1251] eta 0:45:34 lr 0.000284 time 1.7223 (2.3969) loss 2.6508 (3.4469) grad_norm 2.0277 (1.9055) [2022-01-23 20:43:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][120/1251] eta 0:44:44 lr 0.000284 time 2.3744 (2.3733) loss 2.9165 (3.4508) grad_norm 2.0751 (1.9088) [2022-01-23 20:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][130/1251] eta 0:43:46 lr 0.000284 time 1.5546 (2.3429) loss 2.4970 (3.4271) grad_norm 2.2797 (1.9092) [2022-01-23 20:44:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][140/1251] eta 0:43:18 lr 0.000284 time 2.3442 (2.3385) loss 2.8010 (3.4043) grad_norm 2.2288 (1.9121) [2022-01-23 20:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][150/1251] eta 0:42:40 lr 0.000284 
time 1.8715 (2.3258) loss 3.8230 (3.4077) grad_norm 2.2181 (1.9189) [2022-01-23 20:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][160/1251] eta 0:42:20 lr 0.000284 time 3.0792 (2.3290) loss 3.2784 (3.4014) grad_norm 1.5670 (1.9166) [2022-01-23 20:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][170/1251] eta 0:41:52 lr 0.000284 time 1.8039 (2.3244) loss 3.4955 (3.4028) grad_norm 1.9930 (1.9139) [2022-01-23 20:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][180/1251] eta 0:41:21 lr 0.000284 time 2.4919 (2.3171) loss 3.4128 (3.4051) grad_norm 1.7303 (1.9135) [2022-01-23 20:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][190/1251] eta 0:40:49 lr 0.000284 time 1.5955 (2.3090) loss 3.3562 (3.3954) grad_norm 1.8743 (1.9047) [2022-01-23 20:46:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][200/1251] eta 0:40:16 lr 0.000284 time 2.8179 (2.2990) loss 3.5852 (3.3838) grad_norm 2.0807 (1.9115) [2022-01-23 20:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][210/1251] eta 0:39:43 lr 0.000284 time 1.9079 (2.2899) loss 4.0581 (3.3767) grad_norm 1.7752 (1.9092) [2022-01-23 20:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][220/1251] eta 0:39:14 lr 0.000284 time 3.3189 (2.2841) loss 3.7109 (3.3789) grad_norm 1.7734 (1.9064) [2022-01-23 20:47:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][230/1251] eta 0:38:45 lr 0.000284 time 1.9357 (2.2781) loss 3.0829 (3.3778) grad_norm 1.9452 (1.9058) [2022-01-23 20:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][240/1251] eta 0:38:24 lr 0.000284 time 3.1680 (2.2795) loss 2.3622 (3.3671) grad_norm 1.7647 (1.9031) [2022-01-23 20:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][250/1251] eta 0:37:54 lr 0.000284 time 2.2344 (2.2719) loss 4.0946 (3.3615) grad_norm 1.8107 (1.9026) [2022-01-23 20:48:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][260/1251] eta 0:37:27 lr 0.000284 time 2.8487 (2.2680) loss 3.6657 (3.3520) grad_norm 1.9367 (1.9032) [2022-01-23 20:49:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][270/1251] eta 0:37:03 lr 0.000284 time 1.9097 (2.2669) loss 3.5691 (3.3497) grad_norm 1.9540 (1.9024) [2022-01-23 20:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][280/1251] eta 0:36:38 lr 0.000284 time 2.3883 (2.2640) loss 3.7401 (3.3548) grad_norm 1.5333 (1.8987) [2022-01-23 20:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][290/1251] eta 0:36:10 lr 0.000284 time 1.8435 (2.2583) loss 3.5491 (3.3558) grad_norm 1.7942 (1.8961) [2022-01-23 20:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][300/1251] eta 0:35:43 lr 0.000284 time 2.4345 (2.2536) loss 3.3432 (3.3612) grad_norm 1.8417 (1.8932) [2022-01-23 20:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][310/1251] eta 0:35:14 lr 0.000284 time 2.2001 (2.2473) loss 3.7617 (3.3595) grad_norm 1.9123 (1.8897) [2022-01-23 20:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][320/1251] eta 0:34:50 lr 0.000284 time 2.3032 (2.2452) loss 4.1497 (3.3609) grad_norm 1.6246 (1.8845) [2022-01-23 20:51:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][330/1251] eta 0:34:24 lr 0.000284 time 1.8122 (2.2413) loss 3.2528 (3.3648) grad_norm 1.8261 (1.8815) [2022-01-23 20:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][340/1251] eta 0:34:03 lr 0.000284 time 2.3927 (2.2428) loss 3.4715 (3.3643) grad_norm 1.6524 (1.8779) [2022-01-23 20:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][350/1251] eta 0:33:37 lr 0.000284 time 2.2584 (2.2388) loss 3.3147 (3.3642) grad_norm 1.7694 (1.8750) [2022-01-23 20:52:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][360/1251] eta 0:33:14 lr 
0.000284 time 2.3674 (2.2385) loss 3.1344 (3.3575) grad_norm 1.5248 (1.8716) [2022-01-23 20:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][370/1251] eta 0:32:50 lr 0.000284 time 1.8812 (2.2368) loss 4.0315 (3.3651) grad_norm 1.9691 (1.8723) [2022-01-23 20:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][380/1251] eta 0:32:29 lr 0.000283 time 2.1799 (2.2380) loss 3.0872 (3.3625) grad_norm 1.7091 (1.8730) [2022-01-23 20:53:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][390/1251] eta 0:32:06 lr 0.000283 time 1.8987 (2.2376) loss 3.4957 (3.3587) grad_norm 1.6760 (1.8743) [2022-01-23 20:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][400/1251] eta 0:31:46 lr 0.000283 time 1.8724 (2.2406) loss 3.5989 (3.3580) grad_norm 1.7468 (1.8739) [2022-01-23 20:54:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][410/1251] eta 0:31:24 lr 0.000283 time 1.8648 (2.2410) loss 2.3708 (3.3540) grad_norm 2.0092 (1.8741) [2022-01-23 20:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][420/1251] eta 0:31:01 lr 0.000283 time 2.4375 (2.2403) loss 2.7572 (3.3537) grad_norm 1.7746 (1.8727) [2022-01-23 20:55:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][430/1251] eta 0:30:32 lr 0.000283 time 1.8646 (2.2325) loss 3.6557 (3.3520) grad_norm 1.9099 (1.8737) [2022-01-23 20:55:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][440/1251] eta 0:30:07 lr 0.000283 time 1.6038 (2.2290) loss 3.5906 (3.3552) grad_norm 2.7943 (1.8772) [2022-01-23 20:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][450/1251] eta 0:29:42 lr 0.000283 time 1.8823 (2.2252) loss 3.0517 (3.3576) grad_norm 1.9546 (1.8806) [2022-01-23 20:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][460/1251] eta 0:29:20 lr 0.000283 time 1.8358 (2.2256) loss 4.0147 (3.3555) grad_norm 1.9088 (1.8810) [2022-01-23 20:56:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][470/1251] eta 0:28:57 lr 0.000283 time 1.8488 (2.2252) loss 3.7663 (3.3514) grad_norm 1.7456 (1.8809) [2022-01-23 20:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][480/1251] eta 0:28:35 lr 0.000283 time 2.2531 (2.2245) loss 3.3262 (3.3539) grad_norm 1.9894 (1.8797) [2022-01-23 20:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][490/1251] eta 0:28:10 lr 0.000283 time 1.5815 (2.2221) loss 3.9119 (3.3615) grad_norm 1.6034 (1.8789) [2022-01-23 20:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][500/1251] eta 0:27:49 lr 0.000283 time 2.7610 (2.2226) loss 3.1615 (3.3549) grad_norm 1.5720 (1.8793) [2022-01-23 20:58:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][510/1251] eta 0:27:27 lr 0.000283 time 2.1692 (2.2233) loss 2.1188 (3.3549) grad_norm 1.9149 (1.8789) [2022-01-23 20:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][520/1251] eta 0:27:06 lr 0.000283 time 2.1895 (2.2245) loss 2.4134 (3.3511) grad_norm 1.8348 (1.8774) [2022-01-23 20:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][530/1251] eta 0:26:41 lr 0.000283 time 1.7443 (2.2209) loss 3.1634 (3.3517) grad_norm 1.6203 (1.8748) [2022-01-23 20:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][540/1251] eta 0:26:19 lr 0.000283 time 2.9948 (2.2210) loss 4.0860 (3.3555) grad_norm 1.7515 (1.8758) [2022-01-23 20:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][550/1251] eta 0:25:56 lr 0.000283 time 1.8804 (2.2203) loss 3.5820 (3.3566) grad_norm 1.7073 (1.8746) [2022-01-23 20:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][560/1251] eta 0:25:34 lr 0.000283 time 1.8261 (2.2204) loss 3.4566 (3.3640) grad_norm 1.8560 (1.8748) [2022-01-23 21:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][570/1251] eta 0:25:10 lr 
0.000283 time 2.0014 (2.2174) loss 3.3880 (3.3647) grad_norm 1.8089 (1.8734) [2022-01-23 21:00:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][580/1251] eta 0:24:47 lr 0.000283 time 2.4675 (2.2164) loss 3.3551 (3.3654) grad_norm 1.7974 (1.8731) [2022-01-23 21:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][590/1251] eta 0:24:24 lr 0.000283 time 1.6096 (2.2150) loss 3.7243 (3.3690) grad_norm 1.6829 (1.8724) [2022-01-23 21:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][600/1251] eta 0:24:00 lr 0.000283 time 2.1857 (2.2130) loss 3.6393 (3.3669) grad_norm 1.7364 (1.8734) [2022-01-23 21:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][610/1251] eta 0:23:37 lr 0.000283 time 2.1444 (2.2118) loss 2.2735 (3.3604) grad_norm 1.9634 (1.8725) [2022-01-23 21:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][620/1251] eta 0:23:15 lr 0.000283 time 2.1714 (2.2113) loss 3.7021 (3.3571) grad_norm 2.4990 (1.8730) [2022-01-23 21:02:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][630/1251] eta 0:22:52 lr 0.000283 time 1.8603 (2.2100) loss 3.4085 (3.3557) grad_norm 2.0552 (1.8734) [2022-01-23 21:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][640/1251] eta 0:22:31 lr 0.000283 time 3.0187 (2.2121) loss 2.0661 (3.3503) grad_norm 2.1638 (1.8735) [2022-01-23 21:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][650/1251] eta 0:22:10 lr 0.000282 time 2.2085 (2.2139) loss 3.8582 (3.3557) grad_norm 1.5709 (1.8716) [2022-01-23 21:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][660/1251] eta 0:21:47 lr 0.000282 time 2.2855 (2.2128) loss 3.8586 (3.3542) grad_norm 2.1916 (1.8708) [2022-01-23 21:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][670/1251] eta 0:21:24 lr 0.000282 time 1.6839 (2.2110) loss 2.5364 (3.3514) grad_norm 1.6682 (1.8698) [2022-01-23 21:04:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][680/1251] eta 0:21:02 lr 0.000282 time 3.0576 (2.2115) loss 2.4901 (3.3478) grad_norm 1.9536 (1.8724) [2022-01-23 21:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][690/1251] eta 0:20:40 lr 0.000282 time 2.1999 (2.2118) loss 3.4151 (3.3480) grad_norm 1.8060 (1.8710) [2022-01-23 21:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][700/1251] eta 0:20:18 lr 0.000282 time 2.4866 (2.2118) loss 3.4560 (3.3475) grad_norm 1.9896 (1.8702) [2022-01-23 21:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][710/1251] eta 0:19:56 lr 0.000282 time 2.0777 (2.2124) loss 3.3843 (3.3493) grad_norm 1.7288 (1.8703) [2022-01-23 21:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][720/1251] eta 0:19:36 lr 0.000282 time 3.2729 (2.2150) loss 3.4822 (3.3489) grad_norm 1.8022 (1.8695) [2022-01-23 21:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][730/1251] eta 0:19:14 lr 0.000282 time 1.8485 (2.2164) loss 3.4466 (3.3496) grad_norm 1.9111 (1.8705) [2022-01-23 21:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][740/1251] eta 0:18:52 lr 0.000282 time 1.7340 (2.2153) loss 3.7984 (3.3521) grad_norm 1.9505 (1.8713) [2022-01-23 21:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][750/1251] eta 0:18:29 lr 0.000282 time 2.9205 (2.2145) loss 3.9079 (3.3542) grad_norm 2.1114 (1.8720) [2022-01-23 21:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][760/1251] eta 0:18:06 lr 0.000282 time 3.0811 (2.2129) loss 4.0214 (3.3541) grad_norm 1.9336 (1.8726) [2022-01-23 21:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][770/1251] eta 0:17:44 lr 0.000282 time 1.7922 (2.2122) loss 3.4459 (3.3510) grad_norm 1.9706 (1.8742) [2022-01-23 21:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][780/1251] eta 0:17:21 lr 
0.000282 time 1.6110 (2.2119) loss 3.8233 (3.3526) grad_norm 1.9976 (1.8738) [2022-01-23 21:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][790/1251] eta 0:17:00 lr 0.000282 time 2.9241 (2.2126) loss 3.5726 (3.3551) grad_norm 1.7599 (1.8743) [2022-01-23 21:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][800/1251] eta 0:16:37 lr 0.000282 time 3.1839 (2.2115) loss 3.6550 (3.3551) grad_norm 1.9441 (1.8743) [2022-01-23 21:08:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][810/1251] eta 0:16:14 lr 0.000282 time 2.0242 (2.2087) loss 4.3443 (3.3551) grad_norm 1.9002 (1.8757) [2022-01-23 21:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][820/1251] eta 0:15:52 lr 0.000282 time 1.9711 (2.2090) loss 3.9741 (3.3551) grad_norm 1.9261 (1.8757) [2022-01-23 21:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][830/1251] eta 0:15:30 lr 0.000282 time 3.0503 (2.2107) loss 3.8027 (3.3527) grad_norm 1.7249 (1.8751) [2022-01-23 21:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][840/1251] eta 0:15:09 lr 0.000282 time 2.9944 (2.2118) loss 4.0845 (3.3545) grad_norm 1.6663 (1.8751) [2022-01-23 21:10:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][850/1251] eta 0:14:46 lr 0.000282 time 1.5259 (2.2102) loss 2.8163 (3.3525) grad_norm 2.0955 (1.8752) [2022-01-23 21:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][860/1251] eta 0:14:23 lr 0.000282 time 1.9331 (2.2091) loss 2.1369 (3.3503) grad_norm 2.2230 (1.8769) [2022-01-23 21:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][870/1251] eta 0:14:01 lr 0.000282 time 2.2864 (2.2078) loss 2.8943 (3.3432) grad_norm 2.3142 (1.8768) [2022-01-23 21:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][880/1251] eta 0:13:39 lr 0.000282 time 3.4482 (2.2086) loss 3.7277 (3.3447) grad_norm 1.9428 (1.8784) [2022-01-23 21:11:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][890/1251] eta 0:13:16 lr 0.000282 time 2.2163 (2.2073) loss 2.4069 (3.3401) grad_norm 1.6911 (1.8789) [2022-01-23 21:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][900/1251] eta 0:12:54 lr 0.000282 time 1.7068 (2.2071) loss 3.6887 (3.3430) grad_norm 1.9147 (1.8791) [2022-01-23 21:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][910/1251] eta 0:12:32 lr 0.000282 time 1.9105 (2.2068) loss 3.5568 (3.3406) grad_norm 1.8100 (1.8793) [2022-01-23 21:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][920/1251] eta 0:12:10 lr 0.000281 time 2.1523 (2.2065) loss 3.4725 (3.3415) grad_norm 1.5926 (1.8783) [2022-01-23 21:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][930/1251] eta 0:11:48 lr 0.000281 time 2.2160 (2.2076) loss 4.1113 (3.3463) grad_norm 1.7765 (1.8782) [2022-01-23 21:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][940/1251] eta 0:11:25 lr 0.000281 time 1.6858 (2.2057) loss 3.9513 (3.3471) grad_norm 1.6048 (1.8778) [2022-01-23 21:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][950/1251] eta 0:11:03 lr 0.000281 time 1.6090 (2.2052) loss 2.3284 (3.3466) grad_norm 1.7960 (1.8781) [2022-01-23 21:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][960/1251] eta 0:10:41 lr 0.000281 time 2.8163 (2.2054) loss 3.1676 (3.3460) grad_norm 1.7719 (1.8776) [2022-01-23 21:14:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][970/1251] eta 0:10:19 lr 0.000281 time 2.4802 (2.2061) loss 3.6222 (3.3499) grad_norm 1.9799 (1.8780) [2022-01-23 21:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][980/1251] eta 0:09:57 lr 0.000281 time 1.8377 (2.2060) loss 3.0725 (3.3508) grad_norm 2.0602 (1.8782) [2022-01-23 21:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][990/1251] eta 0:09:35 lr 
0.000281 time 2.6420 (2.2055) loss 3.7898 (3.3500) grad_norm 1.8690 (1.8787) [2022-01-23 21:15:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1000/1251] eta 0:09:13 lr 0.000281 time 2.2855 (2.2041) loss 2.1395 (3.3490) grad_norm 1.9525 (1.8783) [2022-01-23 21:16:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1010/1251] eta 0:08:50 lr 0.000281 time 1.4918 (2.2026) loss 3.1258 (3.3489) grad_norm 1.4681 (1.8774) [2022-01-23 21:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1020/1251] eta 0:08:28 lr 0.000281 time 2.1306 (2.2016) loss 3.9202 (3.3479) grad_norm 2.1062 (1.8781) [2022-01-23 21:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1030/1251] eta 0:08:06 lr 0.000281 time 2.5830 (2.2016) loss 4.0468 (3.3492) grad_norm 1.9421 (1.8775) [2022-01-23 21:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1040/1251] eta 0:07:44 lr 0.000281 time 2.5310 (2.2008) loss 3.5462 (3.3494) grad_norm 2.1631 (1.8779) [2022-01-23 21:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1050/1251] eta 0:07:22 lr 0.000281 time 2.4299 (2.2019) loss 3.4951 (3.3510) grad_norm 1.9219 (1.8778) [2022-01-23 21:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1060/1251] eta 0:07:00 lr 0.000281 time 1.8981 (2.2013) loss 2.4734 (3.3513) grad_norm 1.8587 (1.8775) [2022-01-23 21:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1070/1251] eta 0:06:38 lr 0.000281 time 2.8898 (2.2021) loss 2.7065 (3.3524) grad_norm 2.1224 (1.8773) [2022-01-23 21:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1080/1251] eta 0:06:16 lr 0.000281 time 2.6150 (2.2027) loss 2.2783 (3.3498) grad_norm 2.0682 (1.8779) [2022-01-23 21:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1090/1251] eta 0:05:54 lr 0.000281 time 2.0855 (2.2030) loss 2.5292 (3.3466) grad_norm 1.7785 (1.8775) [2022-01-23 
21:19:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1100/1251] eta 0:05:32 lr 0.000281 time 1.5936 (2.2022) loss 3.7272 (3.3450) grad_norm 2.0074 (1.8785) [2022-01-23 21:19:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1110/1251] eta 0:05:10 lr 0.000281 time 2.4128 (2.2011) loss 3.4207 (3.3438) grad_norm 1.7704 (1.8781) [2022-01-23 21:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1120/1251] eta 0:04:48 lr 0.000281 time 2.4898 (2.2002) loss 3.6142 (3.3460) grad_norm 2.0192 (1.8775) [2022-01-23 21:20:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1130/1251] eta 0:04:26 lr 0.000281 time 2.2125 (2.2007) loss 2.7383 (3.3468) grad_norm 2.0881 (1.8774) [2022-01-23 21:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1140/1251] eta 0:04:04 lr 0.000281 time 1.8262 (2.2013) loss 3.9822 (3.3465) grad_norm 1.6839 (1.8769) [2022-01-23 21:21:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1150/1251] eta 0:03:42 lr 0.000281 time 1.6780 (2.2022) loss 2.3736 (3.3443) grad_norm 1.6387 (1.8763) [2022-01-23 21:21:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1160/1251] eta 0:03:20 lr 0.000281 time 2.0540 (2.2014) loss 3.2480 (3.3459) grad_norm 1.6825 (1.8751) [2022-01-23 21:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1170/1251] eta 0:02:58 lr 0.000281 time 2.5251 (2.2011) loss 3.4409 (3.3450) grad_norm 1.9381 (1.8759) [2022-01-23 21:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1180/1251] eta 0:02:36 lr 0.000281 time 1.6284 (2.2002) loss 2.9445 (3.3461) grad_norm 1.8747 (1.8759) [2022-01-23 21:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1190/1251] eta 0:02:14 lr 0.000280 time 2.2786 (2.1994) loss 2.6471 (3.3454) grad_norm 1.9010 (1.8755) [2022-01-23 21:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1200/1251] 
eta 0:01:52 lr 0.000280 time 2.6012 (2.1989) loss 2.1093 (3.3452) grad_norm 1.8164 (1.8760) [2022-01-23 21:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1210/1251] eta 0:01:30 lr 0.000280 time 2.2544 (2.1984) loss 2.7937 (3.3408) grad_norm 1.7145 (1.8754) [2022-01-23 21:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1220/1251] eta 0:01:08 lr 0.000280 time 1.8057 (2.1974) loss 3.7310 (3.3405) grad_norm 1.8164 (1.8760) [2022-01-23 21:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1230/1251] eta 0:00:46 lr 0.000280 time 2.4656 (2.1976) loss 2.6558 (3.3384) grad_norm 1.8058 (1.8760) [2022-01-23 21:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1240/1251] eta 0:00:24 lr 0.000280 time 2.6986 (2.1983) loss 3.3839 (3.3383) grad_norm 2.3339 (1.8778) [2022-01-23 21:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [194/300][1250/1251] eta 0:00:02 lr 0.000280 time 1.1750 (2.1933) loss 3.8494 (3.3405) grad_norm 1.8315 (1.8775) [2022-01-23 21:24:51 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 194 training takes 0:45:44 [2022-01-23 21:25:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.523 (18.523) Loss 0.7964 (0.7964) Acc@1 81.055 (81.055) Acc@5 95.410 (95.410) [2022-01-23 21:25:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.572 (3.186) Loss 0.8543 (0.9043) Acc@1 79.199 (78.232) Acc@5 95.508 (94.744) [2022-01-23 21:25:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.600 (2.484) Loss 0.9161 (0.9080) Acc@1 76.855 (78.227) Acc@5 95.312 (94.643) [2022-01-23 21:26:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.615 (2.241) Loss 0.9392 (0.9152) Acc@1 77.637 (78.197) Acc@5 94.141 (94.446) [2022-01-23 21:26:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.410 (2.140) Loss 0.9670 (0.9171) Acc@1 77.539 (78.218) Acc@5 93.750 (94.403) 
[2022-01-23 21:26:26 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.232 Acc@5 94.338 [2022-01-23 21:26:26 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-01-23 21:26:26 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.23% [2022-01-23 21:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][0/1251] eta 7:29:06 lr 0.000280 time 21.5402 (21.5402) loss 3.6333 (3.6333) grad_norm 2.0853 (2.0853) [2022-01-23 21:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][10/1251] eta 1:21:14 lr 0.000280 time 2.1698 (3.9281) loss 3.3996 (3.3210) grad_norm 1.6571 (1.8316) [2022-01-23 21:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][20/1251] eta 1:03:40 lr 0.000280 time 1.8196 (3.1033) loss 2.8085 (3.2344) grad_norm 1.7951 (1.7992) [2022-01-23 21:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][30/1251] eta 0:57:08 lr 0.000280 time 1.5358 (2.8080) loss 3.0199 (3.2256) grad_norm 2.0590 (1.9016) [2022-01-23 21:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][40/1251] eta 0:53:57 lr 0.000280 time 3.5419 (2.6732) loss 4.0692 (3.3233) grad_norm 1.9209 (1.9190) [2022-01-23 21:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][50/1251] eta 0:52:18 lr 0.000280 time 2.4303 (2.6136) loss 2.4592 (3.3174) grad_norm 2.0241 (1.9067) [2022-01-23 21:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][60/1251] eta 0:50:24 lr 0.000280 time 1.5553 (2.5396) loss 3.8047 (3.3666) grad_norm 1.8075 (1.8783) [2022-01-23 21:29:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][70/1251] eta 0:49:23 lr 0.000280 time 1.8665 (2.5092) loss 2.6537 (3.3675) grad_norm 1.7944 (1.8842) [2022-01-23 21:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][80/1251] eta 0:48:20 lr 0.000280 time 3.1666 (2.4773) loss 3.4482 (3.3095) 
grad_norm 1.5626 (1.8719) [2022-01-23 21:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][90/1251] eta 0:47:14 lr 0.000280 time 2.1044 (2.4415) loss 3.1971 (3.2991) grad_norm 1.8981 (1.8676) [2022-01-23 21:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][100/1251] eta 0:46:29 lr 0.000280 time 2.1291 (2.4239) loss 4.0637 (3.3059) grad_norm 1.9414 (1.8682) [2022-01-23 21:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][110/1251] eta 0:45:27 lr 0.000280 time 1.9130 (2.3901) loss 3.4272 (3.3216) grad_norm 1.7989 (1.8691) [2022-01-23 21:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][120/1251] eta 0:44:48 lr 0.000280 time 3.4601 (2.3771) loss 3.8308 (3.3275) grad_norm 2.1795 (1.8745) [2022-01-23 21:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][130/1251] eta 0:44:07 lr 0.000280 time 1.6264 (2.3617) loss 3.5390 (3.3350) grad_norm 2.1729 (1.8773) [2022-01-23 21:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][140/1251] eta 0:43:37 lr 0.000280 time 1.9549 (2.3562) loss 3.2150 (3.3414) grad_norm 1.8033 (1.8839) [2022-01-23 21:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][150/1251] eta 0:43:02 lr 0.000280 time 1.7948 (2.3453) loss 3.7762 (3.3511) grad_norm 2.0636 (1.8803) [2022-01-23 21:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][160/1251] eta 0:42:34 lr 0.000280 time 3.1167 (2.3412) loss 3.4177 (3.3413) grad_norm 1.9050 (1.8747) [2022-01-23 21:33:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][170/1251] eta 0:41:58 lr 0.000280 time 1.5715 (2.3300) loss 3.2626 (3.3265) grad_norm 1.8758 (1.8701) [2022-01-23 21:33:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][180/1251] eta 0:41:22 lr 0.000280 time 2.1471 (2.3182) loss 3.7679 (3.3375) grad_norm 2.3284 (1.8721) [2022-01-23 21:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [195/300][190/1251] eta 0:40:48 lr 0.000280 time 1.8972 (2.3076) loss 3.6117 (3.3337) grad_norm 1.7377 (1.8723) [2022-01-23 21:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][200/1251] eta 0:40:22 lr 0.000280 time 3.6365 (2.3052) loss 3.5897 (3.3415) grad_norm 1.9403 (1.8743) [2022-01-23 21:34:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][210/1251] eta 0:39:51 lr 0.000279 time 2.1692 (2.2969) loss 2.4832 (3.3487) grad_norm 2.5541 (1.8799) [2022-01-23 21:34:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][220/1251] eta 0:39:26 lr 0.000279 time 1.7620 (2.2952) loss 3.2116 (3.3547) grad_norm 1.7083 (1.8766) [2022-01-23 21:35:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][230/1251] eta 0:39:02 lr 0.000279 time 1.7501 (2.2942) loss 3.4163 (3.3556) grad_norm 2.5882 (1.8783) [2022-01-23 21:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][240/1251] eta 0:38:39 lr 0.000279 time 4.6597 (2.2945) loss 3.0874 (3.3524) grad_norm 1.7567 (1.8776) [2022-01-23 21:35:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][250/1251] eta 0:38:07 lr 0.000279 time 1.8465 (2.2850) loss 4.0974 (3.3538) grad_norm 1.9594 (1.8736) [2022-01-23 21:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][260/1251] eta 0:37:31 lr 0.000279 time 1.8672 (2.2720) loss 3.3916 (3.3518) grad_norm 2.1876 (1.8704) [2022-01-23 21:36:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][270/1251] eta 0:37:03 lr 0.000279 time 1.9006 (2.2667) loss 3.2456 (3.3556) grad_norm 1.7951 (1.8705) [2022-01-23 21:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][280/1251] eta 0:36:48 lr 0.000279 time 3.5851 (2.2747) loss 4.0013 (3.3637) grad_norm 1.9111 (1.8688) [2022-01-23 21:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][290/1251] eta 0:36:25 lr 0.000279 time 2.0927 (2.2740) loss 3.6490 (3.3615) 
grad_norm 2.0070 (1.8672) [2022-01-23 21:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][300/1251] eta 0:35:55 lr 0.000279 time 1.8856 (2.2665) loss 2.4208 (3.3576) grad_norm 1.6989 (1.8643) [2022-01-23 21:38:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][310/1251] eta 0:35:31 lr 0.000279 time 1.8296 (2.2648) loss 3.6581 (3.3585) grad_norm 1.7466 (1.8621) [2022-01-23 21:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][320/1251] eta 0:35:04 lr 0.000279 time 2.5872 (2.2607) loss 3.7961 (3.3603) grad_norm 1.7935 (1.8625) [2022-01-23 21:38:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][330/1251] eta 0:34:38 lr 0.000279 time 1.9725 (2.2572) loss 3.7715 (3.3579) grad_norm 1.6566 (1.8598) [2022-01-23 21:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][340/1251] eta 0:34:17 lr 0.000279 time 1.9680 (2.2581) loss 3.6196 (3.3587) grad_norm 1.9200 (1.8590) [2022-01-23 21:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][350/1251] eta 0:33:48 lr 0.000279 time 1.6695 (2.2515) loss 2.2337 (3.3571) grad_norm 1.8924 (1.8574) [2022-01-23 21:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][360/1251] eta 0:33:24 lr 0.000279 time 2.7426 (2.2496) loss 2.7175 (3.3478) grad_norm 2.0612 (1.8614) [2022-01-23 21:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][370/1251] eta 0:32:59 lr 0.000279 time 2.3035 (2.2469) loss 3.9580 (3.3491) grad_norm 1.9518 (1.8627) [2022-01-23 21:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][380/1251] eta 0:32:38 lr 0.000279 time 1.9441 (2.2482) loss 3.0162 (3.3602) grad_norm 1.7758 (1.8648) [2022-01-23 21:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][390/1251] eta 0:32:14 lr 0.000279 time 2.6053 (2.2470) loss 2.4233 (3.3584) grad_norm 2.0212 (1.8667) [2022-01-23 21:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [195/300][400/1251] eta 0:31:49 lr 0.000279 time 1.6822 (2.2444) loss 3.3463 (3.3597) grad_norm 1.8130 (1.8662) [2022-01-23 21:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][410/1251] eta 0:31:25 lr 0.000279 time 2.4441 (2.2418) loss 3.5690 (3.3618) grad_norm 1.9295 (1.8659) [2022-01-23 21:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][420/1251] eta 0:30:58 lr 0.000279 time 1.8075 (2.2368) loss 3.1594 (3.3589) grad_norm 1.6455 (1.8668) [2022-01-23 21:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][430/1251] eta 0:30:39 lr 0.000279 time 3.6193 (2.2409) loss 3.2689 (3.3615) grad_norm 2.3032 (1.8668) [2022-01-23 21:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][440/1251] eta 0:30:17 lr 0.000279 time 1.5658 (2.2405) loss 3.7475 (3.3558) grad_norm 1.7284 (1.8655) [2022-01-23 21:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][450/1251] eta 0:29:54 lr 0.000279 time 3.0132 (2.2405) loss 3.5505 (3.3481) grad_norm 1.7069 (1.8644) [2022-01-23 21:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][460/1251] eta 0:29:27 lr 0.000279 time 2.1964 (2.2351) loss 3.0142 (3.3482) grad_norm 1.8489 (1.8648) [2022-01-23 21:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][470/1251] eta 0:29:02 lr 0.000279 time 1.5107 (2.2312) loss 3.7321 (3.3481) grad_norm 1.7222 (1.8636) [2022-01-23 21:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][480/1251] eta 0:28:39 lr 0.000279 time 1.8800 (2.2303) loss 2.4895 (3.3530) grad_norm 2.5087 (1.8655) [2022-01-23 21:44:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][490/1251] eta 0:28:16 lr 0.000278 time 3.1015 (2.2297) loss 2.5644 (3.3509) grad_norm 2.0158 (1.8666) [2022-01-23 21:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][500/1251] eta 0:27:52 lr 0.000278 time 2.1808 (2.2268) loss 3.5001 (3.3536) 
grad_norm 1.4597 (1.8693) [2022-01-23 21:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][510/1251] eta 0:27:31 lr 0.000278 time 2.1732 (2.2288) loss 3.8791 (3.3553) grad_norm 1.9134 (1.8700) [2022-01-23 21:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][520/1251] eta 0:27:07 lr 0.000278 time 2.3020 (2.2267) loss 3.3268 (3.3592) grad_norm 2.0294 (1.8701) [2022-01-23 21:46:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][530/1251] eta 0:26:46 lr 0.000278 time 3.6843 (2.2276) loss 3.6363 (3.3595) grad_norm 1.9108 (1.8720) [2022-01-23 21:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][540/1251] eta 0:26:22 lr 0.000278 time 2.5377 (2.2261) loss 3.5901 (3.3566) grad_norm 2.2019 (1.8741) [2022-01-23 21:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][550/1251] eta 0:26:00 lr 0.000278 time 1.9228 (2.2255) loss 2.5950 (3.3560) grad_norm 1.8155 (1.8750) [2022-01-23 21:47:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][560/1251] eta 0:25:35 lr 0.000278 time 1.9684 (2.2215) loss 3.8312 (3.3568) grad_norm 2.0077 (1.8773) [2022-01-23 21:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][570/1251] eta 0:25:13 lr 0.000278 time 3.4212 (2.2225) loss 3.3627 (3.3542) grad_norm 2.1846 (1.8767) [2022-01-23 21:47:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][580/1251] eta 0:24:51 lr 0.000278 time 1.8881 (2.2232) loss 3.3343 (3.3498) grad_norm 1.7612 (1.8762) [2022-01-23 21:48:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][590/1251] eta 0:24:28 lr 0.000278 time 1.9264 (2.2214) loss 3.8913 (3.3563) grad_norm 2.2383 (1.8784) [2022-01-23 21:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][600/1251] eta 0:24:03 lr 0.000278 time 1.8203 (2.2174) loss 3.8392 (3.3525) grad_norm 1.5625 (1.8811) [2022-01-23 21:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [195/300][610/1251] eta 0:23:39 lr 0.000278 time 2.8461 (2.2148) loss 3.7001 (3.3501) grad_norm 1.8730 (1.8819) [2022-01-23 21:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][620/1251] eta 0:23:15 lr 0.000278 time 2.2246 (2.2121) loss 3.9735 (3.3523) grad_norm 1.7223 (1.8813) [2022-01-23 21:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][630/1251] eta 0:22:54 lr 0.000278 time 1.9453 (2.2135) loss 3.1032 (3.3532) grad_norm 2.1622 (1.8818) [2022-01-23 21:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][640/1251] eta 0:22:32 lr 0.000278 time 1.9197 (2.2137) loss 4.0233 (3.3557) grad_norm 2.1764 (1.8808) [2022-01-23 21:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][650/1251] eta 0:22:10 lr 0.000278 time 3.2371 (2.2134) loss 3.7034 (3.3564) grad_norm 1.6994 (1.8797) [2022-01-23 21:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][660/1251] eta 0:21:48 lr 0.000278 time 2.3855 (2.2145) loss 3.4188 (3.3547) grad_norm 1.9393 (1.8811) [2022-01-23 21:51:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][670/1251] eta 0:21:27 lr 0.000278 time 2.1279 (2.2158) loss 3.8006 (3.3572) grad_norm 1.9523 (1.8815) [2022-01-23 21:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][680/1251] eta 0:21:05 lr 0.000278 time 2.3508 (2.2168) loss 3.1076 (3.3586) grad_norm 1.7670 (1.8828) [2022-01-23 21:51:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][690/1251] eta 0:20:44 lr 0.000278 time 3.4647 (2.2187) loss 3.2230 (3.3605) grad_norm 1.9064 (1.8837) [2022-01-23 21:52:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][700/1251] eta 0:20:23 lr 0.000278 time 2.9116 (2.2196) loss 2.5913 (3.3566) grad_norm 1.9025 (1.8846) [2022-01-23 21:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][710/1251] eta 0:19:59 lr 0.000278 time 1.9845 (2.2165) loss 3.6763 (3.3562) 
grad_norm 1.6302 (1.8847) [2022-01-23 21:53:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][720/1251] eta 0:19:35 lr 0.000278 time 1.7297 (2.2132) loss 3.6427 (3.3547) grad_norm 1.9894 (1.8845) [2022-01-23 21:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][730/1251] eta 0:19:11 lr 0.000278 time 1.8899 (2.2100) loss 3.7574 (3.3570) grad_norm 1.5576 (1.8855) [2022-01-23 21:53:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][740/1251] eta 0:18:48 lr 0.000278 time 1.9033 (2.2077) loss 3.7095 (3.3578) grad_norm 1.7644 (1.8860) [2022-01-23 21:54:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][750/1251] eta 0:18:25 lr 0.000278 time 2.3179 (2.2069) loss 3.7803 (3.3619) grad_norm 1.7012 (1.8866) [2022-01-23 21:54:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][760/1251] eta 0:18:03 lr 0.000277 time 2.1631 (2.2066) loss 3.4527 (3.3619) grad_norm 1.6259 (1.8864) [2022-01-23 21:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][770/1251] eta 0:17:41 lr 0.000277 time 2.1388 (2.2065) loss 3.6565 (3.3604) grad_norm 1.6523 (1.8854) [2022-01-23 21:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][780/1251] eta 0:17:19 lr 0.000277 time 2.7864 (2.2078) loss 3.7465 (3.3610) grad_norm 1.6545 (1.8858) [2022-01-23 21:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][790/1251] eta 0:16:58 lr 0.000277 time 2.8519 (2.2089) loss 3.2269 (3.3621) grad_norm 2.0531 (1.8856) [2022-01-23 21:55:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][800/1251] eta 0:16:36 lr 0.000277 time 2.2038 (2.2087) loss 3.0255 (3.3590) grad_norm 2.1720 (1.8850) [2022-01-23 21:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][810/1251] eta 0:16:13 lr 0.000277 time 1.9110 (2.2071) loss 3.1289 (3.3579) grad_norm 1.8520 (1.8851) [2022-01-23 21:56:37 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [195/300][820/1251] eta 0:15:50 lr 0.000277 time 2.2060 (2.2062) loss 3.1951 (3.3594) grad_norm 1.8823 (1.8849) [2022-01-23 21:57:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][830/1251] eta 0:15:29 lr 0.000277 time 2.8412 (2.2078) loss 3.0620 (3.3577) grad_norm 1.7029 (1.8844) [2022-01-23 21:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][840/1251] eta 0:15:07 lr 0.000277 time 2.4677 (2.2081) loss 3.5008 (3.3583) grad_norm 1.6899 (1.8830) [2022-01-23 21:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][850/1251] eta 0:14:44 lr 0.000277 time 1.9242 (2.2064) loss 2.3660 (3.3550) grad_norm 2.1163 (1.8849) [2022-01-23 21:58:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][860/1251] eta 0:14:22 lr 0.000277 time 2.1851 (2.2047) loss 3.2710 (3.3527) grad_norm 1.9205 (1.8857) [2022-01-23 21:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][870/1251] eta 0:13:59 lr 0.000277 time 2.0937 (2.2038) loss 3.7522 (3.3514) grad_norm 1.7969 (1.8853) [2022-01-23 21:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][880/1251] eta 0:13:37 lr 0.000277 time 2.5902 (2.2027) loss 3.7532 (3.3510) grad_norm 1.6918 (1.8871) [2022-01-23 21:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][890/1251] eta 0:13:15 lr 0.000277 time 1.8278 (2.2025) loss 3.1661 (3.3503) grad_norm 1.6531 (1.8869) [2022-01-23 21:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][900/1251] eta 0:12:52 lr 0.000277 time 2.1971 (2.2021) loss 2.9107 (3.3497) grad_norm 1.7282 (1.8861) [2022-01-23 21:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][910/1251] eta 0:12:30 lr 0.000277 time 2.2871 (2.2009) loss 2.9215 (3.3501) grad_norm 1.7163 (1.8843) [2022-01-23 22:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][920/1251] eta 0:12:08 lr 0.000277 time 3.0346 (2.2017) loss 3.5570 (3.3491) 
grad_norm 1.9372 (1.8833) [2022-01-23 22:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][930/1251] eta 0:11:46 lr 0.000277 time 2.3658 (2.2013) loss 3.4500 (3.3527) grad_norm 2.0645 (1.8840) [2022-01-23 22:00:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][940/1251] eta 0:11:25 lr 0.000277 time 2.3912 (2.2037) loss 3.4685 (3.3539) grad_norm 2.1950 (1.8856) [2022-01-23 22:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][950/1251] eta 0:11:03 lr 0.000277 time 2.1298 (2.2040) loss 3.5141 (3.3556) grad_norm 1.6802 (1.8856) [2022-01-23 22:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][960/1251] eta 0:10:41 lr 0.000277 time 3.2105 (2.2054) loss 3.3845 (3.3546) grad_norm 1.7504 (1.8848) [2022-01-23 22:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][970/1251] eta 0:10:19 lr 0.000277 time 3.1071 (2.2061) loss 3.7646 (3.3572) grad_norm 1.9769 (1.8841) [2022-01-23 22:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][980/1251] eta 0:09:57 lr 0.000277 time 2.1647 (2.2056) loss 3.6198 (3.3594) grad_norm 1.7717 (1.8845) [2022-01-23 22:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][990/1251] eta 0:09:35 lr 0.000277 time 1.9328 (2.2032) loss 3.5306 (3.3603) grad_norm 2.3558 (1.8860) [2022-01-23 22:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1000/1251] eta 0:09:12 lr 0.000277 time 1.8737 (2.2012) loss 3.8364 (3.3605) grad_norm 2.1658 (1.8870) [2022-01-23 22:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1010/1251] eta 0:08:50 lr 0.000277 time 2.4451 (2.1996) loss 3.6375 (3.3624) grad_norm 1.9030 (1.8875) [2022-01-23 22:03:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1020/1251] eta 0:08:28 lr 0.000277 time 1.8366 (2.1994) loss 3.6621 (3.3638) grad_norm 1.5906 (1.8881) [2022-01-23 22:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [195/300][1030/1251] eta 0:08:06 lr 0.000276 time 1.8437 (2.2003) loss 3.7556 (3.3640) grad_norm 1.5671 (1.8871) [2022-01-23 22:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1040/1251] eta 0:07:44 lr 0.000276 time 2.5729 (2.2015) loss 3.0907 (3.3644) grad_norm 1.8666 (1.8863) [2022-01-23 22:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1050/1251] eta 0:07:22 lr 0.000276 time 2.5580 (2.2016) loss 2.9071 (3.3645) grad_norm 2.3201 (1.8866) [2022-01-23 22:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1060/1251] eta 0:07:00 lr 0.000276 time 1.9523 (2.2018) loss 3.6700 (3.3620) grad_norm 1.7363 (1.8868) [2022-01-23 22:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1070/1251] eta 0:06:38 lr 0.000276 time 1.6119 (2.2012) loss 3.9889 (3.3618) grad_norm 2.0020 (1.8867) [2022-01-23 22:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1080/1251] eta 0:06:16 lr 0.000276 time 2.0831 (2.2004) loss 2.8879 (3.3600) grad_norm 1.8712 (1.8864) [2022-01-23 22:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1090/1251] eta 0:05:54 lr 0.000276 time 2.5046 (2.2007) loss 3.7654 (3.3606) grad_norm 1.7228 (1.8856) [2022-01-23 22:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1100/1251] eta 0:05:32 lr 0.000276 time 2.2890 (2.2006) loss 2.5231 (3.3612) grad_norm 2.1748 (1.8857) [2022-01-23 22:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1110/1251] eta 0:05:10 lr 0.000276 time 1.8196 (2.1996) loss 3.2970 (3.3607) grad_norm 1.8657 (1.8853) [2022-01-23 22:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1120/1251] eta 0:04:48 lr 0.000276 time 2.2661 (2.2003) loss 2.2979 (3.3582) grad_norm 1.7658 (1.8858) [2022-01-23 22:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1130/1251] eta 0:04:26 lr 0.000276 time 3.1351 (2.2009) loss 2.4568 
(3.3553) grad_norm 1.9516 (1.8869) [2022-01-23 22:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1140/1251] eta 0:04:04 lr 0.000276 time 2.0128 (2.2010) loss 2.9712 (3.3543) grad_norm 1.5548 (1.8865) [2022-01-23 22:08:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1150/1251] eta 0:03:42 lr 0.000276 time 1.7524 (2.1992) loss 3.8232 (3.3562) grad_norm 1.8326 (1.8863) [2022-01-23 22:08:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1160/1251] eta 0:03:20 lr 0.000276 time 1.9346 (2.1989) loss 2.9875 (3.3552) grad_norm 1.7923 (1.8855) [2022-01-23 22:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1170/1251] eta 0:02:58 lr 0.000276 time 2.4236 (2.1985) loss 3.1899 (3.3541) grad_norm 1.8944 (1.8844) [2022-01-23 22:09:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1180/1251] eta 0:02:36 lr 0.000276 time 2.2473 (2.1976) loss 3.1663 (3.3534) grad_norm 1.7133 (1.8841) [2022-01-23 22:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1190/1251] eta 0:02:13 lr 0.000276 time 2.2718 (2.1967) loss 3.4449 (3.3549) grad_norm 1.6039 (1.8840) [2022-01-23 22:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1200/1251] eta 0:01:51 lr 0.000276 time 1.8791 (2.1955) loss 3.0744 (3.3546) grad_norm 1.7699 (1.8839) [2022-01-23 22:10:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1210/1251] eta 0:01:30 lr 0.000276 time 2.6942 (2.1958) loss 3.9070 (3.3558) grad_norm 1.6819 (1.8832) [2022-01-23 22:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1220/1251] eta 0:01:08 lr 0.000276 time 2.4417 (2.1960) loss 3.4176 (3.3562) grad_norm 2.1807 (1.8836) [2022-01-23 22:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1230/1251] eta 0:00:46 lr 0.000276 time 1.9108 (2.1964) loss 2.7932 (3.3559) grad_norm 2.3148 (1.8844) [2022-01-23 22:11:53 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [195/300][1240/1251] eta 0:00:24 lr 0.000276 time 1.4863 (2.1976) loss 2.6457 (3.3557) grad_norm 2.1136 (1.8838) [2022-01-23 22:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [195/300][1250/1251] eta 0:00:02 lr 0.000276 time 1.1838 (2.1923) loss 3.8539 (3.3564) grad_norm 2.1407 (1.8841) [2022-01-23 22:12:09 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 195 training takes 0:45:43 [2022-01-23 22:12:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.296 (18.296) Loss 0.9208 (0.9208) Acc@1 78.320 (78.320) Acc@5 94.043 (94.043) [2022-01-23 22:12:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.238 (3.243) Loss 0.7929 (0.9205) Acc@1 80.762 (78.125) Acc@5 96.094 (94.309) [2022-01-23 22:13:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.994 (2.444) Loss 0.9028 (0.9165) Acc@1 78.027 (78.288) Acc@5 94.238 (94.387) [2022-01-23 22:13:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.571 (2.281) Loss 0.8768 (0.9179) Acc@1 79.395 (78.270) Acc@5 94.434 (94.371) [2022-01-23 22:13:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.827 (2.130) Loss 0.9007 (0.9192) Acc@1 77.246 (78.161) Acc@5 94.922 (94.329) [2022-01-23 22:13:43 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.174 Acc@5 94.302 [2022-01-23 22:13:43 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.2% [2022-01-23 22:13:43 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.23% [2022-01-23 22:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][0/1251] eta 7:29:54 lr 0.000276 time 21.5784 (21.5784) loss 3.5666 (3.5666) grad_norm 2.3138 (2.3138) [2022-01-23 22:14:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][10/1251] eta 1:24:37 lr 0.000276 time 1.7049 (4.0914) loss 2.6127 (3.3904) grad_norm 1.5256 (1.8483) [2022-01-23 22:14:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][20/1251] eta 1:03:36 lr 0.000276 time 1.9523 (3.1001) loss 3.2638 (3.1598) grad_norm 1.9856 (1.8600) [2022-01-23 22:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][30/1251] eta 0:56:09 lr 0.000276 time 1.3559 (2.7598) loss 3.3199 (3.3285) grad_norm 2.0414 (1.8364) [2022-01-23 22:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][40/1251] eta 0:54:04 lr 0.000276 time 5.6742 (2.6796) loss 3.3009 (3.3333) grad_norm 1.7897 (1.8498) [2022-01-23 22:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][50/1251] eta 0:51:50 lr 0.000275 time 2.0396 (2.5901) loss 3.0441 (3.2919) grad_norm 1.7939 (1.8733) [2022-01-23 22:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][60/1251] eta 0:49:51 lr 0.000275 time 1.9623 (2.5113) loss 2.7935 (3.2995) grad_norm 2.2084 (1.8781) [2022-01-23 22:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][70/1251] eta 0:48:11 lr 0.000275 time 1.4922 (2.4484) loss 3.7819 (3.3077) grad_norm 1.9670 (1.8958) [2022-01-23 22:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][80/1251] eta 0:47:36 lr 0.000275 time 3.3655 (2.4391) loss 3.3761 (3.3304) grad_norm 1.9248 (1.9032) [2022-01-23 22:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][90/1251] eta 0:46:54 lr 0.000275 time 2.3573 (2.4240) loss 2.9924 (3.3229) grad_norm 1.9062 (1.9119) [2022-01-23 22:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][100/1251] eta 0:46:13 lr 0.000275 time 2.5336 (2.4100) loss 2.7448 (3.3195) grad_norm 2.2567 (1.9167) [2022-01-23 22:18:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][110/1251] eta 0:45:34 lr 0.000275 time 1.9589 (2.3969) loss 2.6079 (3.3124) grad_norm 1.8393 (1.9136) [2022-01-23 22:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][120/1251] eta 0:44:55 lr 0.000275 time 
2.8699 (2.3830) loss 3.5398 (3.2941) grad_norm 1.8385 (1.9191) [2022-01-23 22:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][130/1251] eta 0:44:22 lr 0.000275 time 2.4426 (2.3754) loss 3.6666 (3.2982) grad_norm 2.0239 (1.9237) [2022-01-23 22:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][140/1251] eta 0:43:50 lr 0.000275 time 1.7935 (2.3674) loss 3.3476 (3.3024) grad_norm 2.2813 (1.9256) [2022-01-23 22:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][150/1251] eta 0:43:01 lr 0.000275 time 1.9698 (2.3450) loss 3.6353 (3.3150) grad_norm 1.6646 (1.9237) [2022-01-23 22:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][160/1251] eta 0:42:12 lr 0.000275 time 2.2635 (2.3210) loss 2.8703 (3.2883) grad_norm 1.8407 (1.9182) [2022-01-23 22:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][170/1251] eta 0:41:41 lr 0.000275 time 1.9786 (2.3143) loss 3.1118 (3.2788) grad_norm 1.7519 (1.9155) [2022-01-23 22:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][180/1251] eta 0:41:11 lr 0.000275 time 2.2567 (2.3072) loss 4.1433 (3.2985) grad_norm 1.9577 (1.9164) [2022-01-23 22:21:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][190/1251] eta 0:40:44 lr 0.000275 time 2.2800 (2.3042) loss 3.3960 (3.3049) grad_norm 1.9881 (1.9162) [2022-01-23 22:21:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][200/1251] eta 0:40:21 lr 0.000275 time 2.7748 (2.3037) loss 2.3198 (3.2973) grad_norm 1.7712 (1.9127) [2022-01-23 22:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][210/1251] eta 0:39:48 lr 0.000275 time 2.2606 (2.2947) loss 2.9051 (3.2864) grad_norm 1.7712 (1.9130) [2022-01-23 22:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][220/1251] eta 0:39:19 lr 0.000275 time 1.7650 (2.2888) loss 2.4637 (3.2907) grad_norm 1.7001 (1.9069) [2022-01-23 22:22:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][230/1251] eta 0:38:49 lr 0.000275 time 1.8904 (2.2817) loss 2.8745 (3.2873) grad_norm 2.1918 (1.9059) [2022-01-23 22:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][240/1251] eta 0:38:26 lr 0.000275 time 3.1728 (2.2819) loss 3.2116 (3.2886) grad_norm 1.6846 (1.9047) [2022-01-23 22:23:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][250/1251] eta 0:38:05 lr 0.000275 time 2.5745 (2.2828) loss 3.6054 (3.2786) grad_norm 1.7903 (1.9051) [2022-01-23 22:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][260/1251] eta 0:37:30 lr 0.000275 time 1.6654 (2.2712) loss 2.3013 (3.2699) grad_norm 1.8892 (1.9029) [2022-01-23 22:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][270/1251] eta 0:37:00 lr 0.000275 time 2.2373 (2.2634) loss 3.1092 (3.2686) grad_norm 1.7948 (1.9004) [2022-01-23 22:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][280/1251] eta 0:36:32 lr 0.000275 time 2.1847 (2.2577) loss 3.4597 (3.2779) grad_norm 2.3069 (1.9005) [2022-01-23 22:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][290/1251] eta 0:36:09 lr 0.000275 time 2.7406 (2.2579) loss 2.1651 (3.2633) grad_norm 1.8378 (1.8988) [2022-01-23 22:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][300/1251] eta 0:35:44 lr 0.000275 time 2.6073 (2.2553) loss 3.7450 (3.2637) grad_norm 1.9187 (1.8983) [2022-01-23 22:25:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][310/1251] eta 0:35:24 lr 0.000275 time 2.1398 (2.2576) loss 3.5670 (3.2741) grad_norm 2.0314 (1.8999) [2022-01-23 22:25:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][320/1251] eta 0:34:57 lr 0.000274 time 2.5244 (2.2528) loss 3.4117 (3.2684) grad_norm 2.2125 (1.8991) [2022-01-23 22:26:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][330/1251] eta 0:34:31 lr 
0.000274 time 2.0860 (2.2490) loss 3.8411 (3.2729) grad_norm 2.0787 (1.8999) [2022-01-23 22:26:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][340/1251] eta 0:34:07 lr 0.000274 time 2.4979 (2.2472) loss 3.1936 (3.2668) grad_norm 1.8792 (1.9000) [2022-01-23 22:26:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][350/1251] eta 0:33:43 lr 0.000274 time 1.8619 (2.2463) loss 3.3947 (3.2702) grad_norm 1.8370 (1.8998) [2022-01-23 22:27:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][360/1251] eta 0:33:18 lr 0.000274 time 1.9233 (2.2429) loss 2.9003 (3.2718) grad_norm 1.7335 (1.8994) [2022-01-23 22:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][370/1251] eta 0:32:51 lr 0.000274 time 2.3931 (2.2372) loss 3.2219 (3.2699) grad_norm 1.9849 (1.8978) [2022-01-23 22:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][380/1251] eta 0:32:25 lr 0.000274 time 2.8701 (2.2333) loss 4.0939 (3.2770) grad_norm 1.7902 (1.8989) [2022-01-23 22:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][390/1251] eta 0:31:59 lr 0.000274 time 2.5690 (2.2293) loss 3.4863 (3.2748) grad_norm 1.7132 (1.8995) [2022-01-23 22:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][400/1251] eta 0:31:34 lr 0.000274 time 1.7567 (2.2265) loss 3.6550 (3.2795) grad_norm 1.8950 (1.8991) [2022-01-23 22:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][410/1251] eta 0:31:12 lr 0.000274 time 1.8484 (2.2261) loss 3.5335 (3.2779) grad_norm 1.9538 (1.8968) [2022-01-23 22:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][420/1251] eta 0:30:54 lr 0.000274 time 3.5069 (2.2313) loss 4.2758 (3.2801) grad_norm 2.4368 (1.8997) [2022-01-23 22:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][430/1251] eta 0:30:36 lr 0.000274 time 3.2070 (2.2364) loss 3.4663 (3.2857) grad_norm 1.8054 (1.9015) [2022-01-23 22:30:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][440/1251] eta 0:30:09 lr 0.000274 time 1.5500 (2.2310) loss 3.4940 (3.2874) grad_norm 2.8795 (1.9031) [2022-01-23 22:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][450/1251] eta 0:29:43 lr 0.000274 time 2.2195 (2.2271) loss 3.8213 (3.2889) grad_norm 1.8647 (1.9041) [2022-01-23 22:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][460/1251] eta 0:29:21 lr 0.000274 time 3.3881 (2.2267) loss 3.5635 (3.2875) grad_norm 2.2471 (1.9061) [2022-01-23 22:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][470/1251] eta 0:28:59 lr 0.000274 time 2.8329 (2.2272) loss 2.6772 (3.2936) grad_norm 1.8075 (1.9046) [2022-01-23 22:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][480/1251] eta 0:28:37 lr 0.000274 time 2.2629 (2.2281) loss 2.7584 (3.2908) grad_norm 1.8619 (1.9034) [2022-01-23 22:31:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][490/1251] eta 0:28:16 lr 0.000274 time 2.0265 (2.2298) loss 3.4447 (3.2968) grad_norm 1.9516 (1.9028) [2022-01-23 22:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][500/1251] eta 0:27:55 lr 0.000274 time 3.2454 (2.2307) loss 3.7794 (3.3022) grad_norm 2.0297 (1.9044) [2022-01-23 22:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][510/1251] eta 0:27:31 lr 0.000274 time 2.1796 (2.2294) loss 2.6796 (3.3021) grad_norm 2.0482 (1.9058) [2022-01-23 22:33:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][520/1251] eta 0:27:06 lr 0.000274 time 1.6494 (2.2244) loss 3.5950 (3.3057) grad_norm 1.9341 (1.9057) [2022-01-23 22:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][530/1251] eta 0:26:41 lr 0.000274 time 2.1273 (2.2218) loss 3.8196 (3.3102) grad_norm 1.8141 (1.9040) [2022-01-23 22:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][540/1251] eta 0:26:20 lr 
0.000274 time 3.4539 (2.2232) loss 2.6818 (3.3100) grad_norm 1.8032 (1.9032) [2022-01-23 22:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][550/1251] eta 0:25:55 lr 0.000274 time 1.5871 (2.2195) loss 2.4068 (3.3105) grad_norm 1.7151 (1.9011) [2022-01-23 22:34:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][560/1251] eta 0:25:32 lr 0.000274 time 1.5427 (2.2175) loss 3.0561 (3.3090) grad_norm 1.9018 (1.9013) [2022-01-23 22:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][570/1251] eta 0:25:07 lr 0.000274 time 1.5956 (2.2144) loss 3.7968 (3.3136) grad_norm 2.1523 (1.9000) [2022-01-23 22:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][580/1251] eta 0:24:44 lr 0.000274 time 2.2113 (2.2130) loss 3.2520 (3.3121) grad_norm 1.8263 (1.8997) [2022-01-23 22:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][590/1251] eta 0:24:22 lr 0.000274 time 2.1765 (2.2123) loss 3.5473 (3.3130) grad_norm 1.6505 (1.8986) [2022-01-23 22:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][600/1251] eta 0:24:01 lr 0.000273 time 2.0348 (2.2146) loss 3.7710 (3.3155) grad_norm 1.8556 (1.9002) [2022-01-23 22:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][610/1251] eta 0:23:39 lr 0.000273 time 2.1914 (2.2147) loss 3.6388 (3.3183) grad_norm 2.0376 (1.9021) [2022-01-23 22:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][620/1251] eta 0:23:16 lr 0.000273 time 2.2783 (2.2127) loss 2.5389 (3.3203) grad_norm 2.0379 (1.9012) [2022-01-23 22:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][630/1251] eta 0:22:53 lr 0.000273 time 1.7287 (2.2118) loss 3.7722 (3.3222) grad_norm 2.4624 (1.9037) [2022-01-23 22:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][640/1251] eta 0:22:33 lr 0.000273 time 2.2983 (2.2147) loss 3.4478 (3.3266) grad_norm 1.7235 (1.9046) [2022-01-23 22:37:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][650/1251] eta 0:22:12 lr 0.000273 time 2.6401 (2.2164) loss 2.4480 (3.3278) grad_norm 2.0463 (1.9051) [2022-01-23 22:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][660/1251] eta 0:21:49 lr 0.000273 time 1.7392 (2.2157) loss 3.6528 (3.3324) grad_norm 1.8321 (1.9052) [2022-01-23 22:38:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][670/1251] eta 0:21:26 lr 0.000273 time 2.2340 (2.2144) loss 3.8966 (3.3360) grad_norm 1.7679 (1.9038) [2022-01-23 22:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][680/1251] eta 0:21:04 lr 0.000273 time 3.1048 (2.2138) loss 3.5787 (3.3369) grad_norm 1.9592 (1.9049) [2022-01-23 22:39:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][690/1251] eta 0:20:41 lr 0.000273 time 2.0714 (2.2130) loss 2.5499 (3.3357) grad_norm 2.0065 (1.9045) [2022-01-23 22:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][700/1251] eta 0:20:19 lr 0.000273 time 2.2362 (2.2124) loss 3.7781 (3.3376) grad_norm 1.9140 (1.9036) [2022-01-23 22:39:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][710/1251] eta 0:19:57 lr 0.000273 time 1.8142 (2.2129) loss 2.4639 (3.3323) grad_norm 1.7199 (1.9038) [2022-01-23 22:40:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][720/1251] eta 0:19:35 lr 0.000273 time 2.7958 (2.2145) loss 3.3146 (3.3331) grad_norm 1.7079 (1.9020) [2022-01-23 22:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][730/1251] eta 0:19:13 lr 0.000273 time 2.2424 (2.2143) loss 3.1728 (3.3309) grad_norm 1.5862 (1.9012) [2022-01-23 22:41:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][740/1251] eta 0:18:52 lr 0.000273 time 2.0537 (2.2156) loss 3.3746 (3.3340) grad_norm 1.9073 (1.9027) [2022-01-23 22:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][750/1251] eta 0:18:28 lr 
0.000273 time 1.6169 (2.2128) loss 3.6239 (3.3338) grad_norm 1.6264 (1.9033) [2022-01-23 22:41:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][760/1251] eta 0:18:05 lr 0.000273 time 2.7505 (2.2118) loss 3.7496 (3.3334) grad_norm 1.9730 (1.9027) [2022-01-23 22:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][770/1251] eta 0:17:43 lr 0.000273 time 1.8960 (2.2101) loss 4.1944 (3.3370) grad_norm 2.2252 (1.9029) [2022-01-23 22:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][780/1251] eta 0:17:20 lr 0.000273 time 1.8700 (2.2087) loss 2.9558 (3.3355) grad_norm 1.8532 (1.9022) [2022-01-23 22:42:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][790/1251] eta 0:16:58 lr 0.000273 time 2.0505 (2.2100) loss 3.2423 (3.3370) grad_norm 2.2673 (1.9021) [2022-01-23 22:43:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][800/1251] eta 0:16:36 lr 0.000273 time 2.5261 (2.2098) loss 3.8513 (3.3376) grad_norm 1.7699 (1.9058) [2022-01-23 22:43:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][810/1251] eta 0:16:13 lr 0.000273 time 2.0720 (2.2081) loss 3.5780 (3.3384) grad_norm 1.7701 (1.9060) [2022-01-23 22:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][820/1251] eta 0:15:52 lr 0.000273 time 1.8668 (2.2097) loss 2.7854 (3.3363) grad_norm 1.6937 (1.9072) [2022-01-23 22:44:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][830/1251] eta 0:15:30 lr 0.000273 time 2.1385 (2.2107) loss 2.6618 (3.3341) grad_norm 2.0155 (1.9070) [2022-01-23 22:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][840/1251] eta 0:15:08 lr 0.000273 time 1.8089 (2.2102) loss 3.3636 (3.3335) grad_norm 1.8185 (1.9063) [2022-01-23 22:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][850/1251] eta 0:14:46 lr 0.000273 time 2.8762 (2.2100) loss 3.4217 (3.3316) grad_norm 1.6878 (1.9063) [2022-01-23 22:45:25 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][860/1251] eta 0:14:23 lr 0.000273 time 2.1528 (2.2096) loss 3.0321 (3.3317) grad_norm 1.7461 (1.9061) [2022-01-23 22:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][870/1251] eta 0:14:01 lr 0.000272 time 2.2631 (2.2079) loss 3.0126 (3.3294) grad_norm 2.2735 (1.9063) [2022-01-23 22:46:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][880/1251] eta 0:13:39 lr 0.000272 time 1.9822 (2.2078) loss 3.1826 (3.3293) grad_norm 1.7184 (1.9077) [2022-01-23 22:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][890/1251] eta 0:13:17 lr 0.000272 time 3.4734 (2.2097) loss 2.5704 (3.3285) grad_norm 1.6016 (1.9074) [2022-01-23 22:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][900/1251] eta 0:12:55 lr 0.000272 time 2.4874 (2.2094) loss 2.7334 (3.3281) grad_norm 1.6406 (1.9063) [2022-01-23 22:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][910/1251] eta 0:12:33 lr 0.000272 time 2.4764 (2.2087) loss 3.2990 (3.3287) grad_norm 1.8753 (1.9053) [2022-01-23 22:47:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][920/1251] eta 0:12:10 lr 0.000272 time 2.8136 (2.2074) loss 3.6183 (3.3318) grad_norm 1.9169 (1.9053) [2022-01-23 22:47:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][930/1251] eta 0:11:47 lr 0.000272 time 1.6518 (2.2050) loss 2.7331 (3.3298) grad_norm 1.8893 (1.9060) [2022-01-23 22:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][940/1251] eta 0:11:25 lr 0.000272 time 2.0758 (2.2035) loss 3.1991 (3.3293) grad_norm 1.9267 (1.9060) [2022-01-23 22:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][950/1251] eta 0:11:03 lr 0.000272 time 2.4394 (2.2033) loss 4.0966 (3.3293) grad_norm 1.7322 (1.9056) [2022-01-23 22:49:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][960/1251] eta 0:10:41 lr 
0.000272 time 3.0433 (2.2031) loss 3.5849 (3.3312) grad_norm 1.7521 (1.9046) [2022-01-23 22:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][970/1251] eta 0:10:19 lr 0.000272 time 2.1643 (2.2048) loss 2.7933 (3.3311) grad_norm 2.0581 (1.9059) [2022-01-23 22:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][980/1251] eta 0:09:57 lr 0.000272 time 1.8758 (2.2050) loss 3.9538 (3.3310) grad_norm 1.7182 (1.9056) [2022-01-23 22:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][990/1251] eta 0:09:35 lr 0.000272 time 2.2333 (2.2058) loss 3.7996 (3.3329) grad_norm 2.3577 (1.9059) [2022-01-23 22:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1000/1251] eta 0:09:13 lr 0.000272 time 2.7415 (2.2049) loss 2.9411 (3.3351) grad_norm 1.8124 (1.9072) [2022-01-23 22:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1010/1251] eta 0:08:51 lr 0.000272 time 2.9057 (2.2052) loss 3.4414 (3.3335) grad_norm 1.7174 (1.9075) [2022-01-23 22:51:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1020/1251] eta 0:08:29 lr 0.000272 time 1.9479 (2.2043) loss 3.9115 (3.3359) grad_norm 1.8606 (1.9078) [2022-01-23 22:51:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1030/1251] eta 0:08:06 lr 0.000272 time 1.6607 (2.2024) loss 4.0379 (3.3368) grad_norm 1.8681 (1.9067) [2022-01-23 22:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1040/1251] eta 0:07:44 lr 0.000272 time 2.6965 (2.2029) loss 3.6609 (3.3391) grad_norm 2.2932 (1.9074) [2022-01-23 22:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1050/1251] eta 0:07:22 lr 0.000272 time 2.7817 (2.2039) loss 3.7390 (3.3405) grad_norm 1.8363 (1.9081) [2022-01-23 22:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1060/1251] eta 0:07:01 lr 0.000272 time 2.0813 (2.2043) loss 3.9726 (3.3412) grad_norm 2.3285 (1.9085) [2022-01-23 
22:53:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1070/1251] eta 0:06:38 lr 0.000272 time 1.6917 (2.2032) loss 2.6299 (3.3398) grad_norm 1.8928 (1.9078) [2022-01-23 22:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1080/1251] eta 0:06:16 lr 0.000272 time 1.6615 (2.2035) loss 2.4916 (3.3401) grad_norm 1.7712 (1.9071) [2022-01-23 22:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1090/1251] eta 0:05:54 lr 0.000272 time 2.4240 (2.2028) loss 3.4076 (3.3409) grad_norm 1.7417 (1.9067) [2022-01-23 22:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1100/1251] eta 0:05:32 lr 0.000272 time 1.9442 (2.2014) loss 2.6218 (3.3411) grad_norm 1.5633 (1.9056) [2022-01-23 22:54:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1110/1251] eta 0:05:10 lr 0.000272 time 1.6002 (2.2008) loss 2.2061 (3.3387) grad_norm 1.9597 (1.9056) [2022-01-23 22:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1120/1251] eta 0:04:48 lr 0.000272 time 1.6925 (2.1999) loss 3.1491 (3.3406) grad_norm 1.8302 (1.9053) [2022-01-23 22:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1130/1251] eta 0:04:26 lr 0.000272 time 2.4684 (2.2003) loss 3.8631 (3.3422) grad_norm 2.0240 (1.9066) [2022-01-23 22:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1140/1251] eta 0:04:04 lr 0.000271 time 1.8041 (2.1994) loss 3.3532 (3.3442) grad_norm 1.6417 (1.9065) [2022-01-23 22:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1150/1251] eta 0:03:42 lr 0.000271 time 2.5135 (2.1992) loss 2.5701 (3.3439) grad_norm 1.8597 (1.9069) [2022-01-23 22:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1160/1251] eta 0:03:20 lr 0.000271 time 1.9653 (2.2015) loss 3.3464 (3.3449) grad_norm 1.7171 (1.9067) [2022-01-23 22:56:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1170/1251] 
eta 0:02:58 lr 0.000271 time 2.5213 (2.2026) loss 2.5312 (3.3435) grad_norm 1.8718 (1.9066) [2022-01-23 22:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1180/1251] eta 0:02:36 lr 0.000271 time 1.7391 (2.2025) loss 3.3573 (3.3432) grad_norm 1.6749 (1.9060) [2022-01-23 22:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1190/1251] eta 0:02:14 lr 0.000271 time 1.9888 (2.2013) loss 3.6955 (3.3451) grad_norm 1.8321 (1.9053) [2022-01-23 22:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1200/1251] eta 0:01:52 lr 0.000271 time 1.8427 (2.2003) loss 3.5445 (3.3472) grad_norm 1.8934 (1.9047) [2022-01-23 22:58:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1210/1251] eta 0:01:30 lr 0.000271 time 1.7267 (2.1983) loss 3.3259 (3.3472) grad_norm 2.0254 (1.9051) [2022-01-23 22:58:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1220/1251] eta 0:01:08 lr 0.000271 time 2.0180 (2.1974) loss 3.9536 (3.3455) grad_norm 2.3121 (1.9053) [2022-01-23 22:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1230/1251] eta 0:00:46 lr 0.000271 time 2.2304 (2.1987) loss 3.8564 (3.3455) grad_norm 1.9180 (1.9058) [2022-01-23 22:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1240/1251] eta 0:00:24 lr 0.000271 time 1.7055 (2.1995) loss 2.9227 (3.3455) grad_norm 1.8473 (1.9055) [2022-01-23 22:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [196/300][1250/1251] eta 0:00:02 lr 0.000271 time 1.1883 (2.1942) loss 2.4368 (3.3448) grad_norm 2.7521 (1.9053) [2022-01-23 22:59:28 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 196 training takes 0:45:45 [2022-01-23 22:59:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.615 (18.615) Loss 0.8878 (0.8878) Acc@1 79.688 (79.688) Acc@5 94.043 (94.043) [2022-01-23 23:00:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.592 (3.187) 
Loss 0.8784 (0.9241) Acc@1 79.395 (78.489) Acc@5 94.238 (94.300) [2022-01-23 23:00:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.941 (2.445) Loss 0.9541 (0.9284) Acc@1 77.930 (78.344) Acc@5 94.336 (94.303) [2022-01-23 23:00:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.277 (2.290) Loss 0.8920 (0.9303) Acc@1 79.395 (78.188) Acc@5 94.824 (94.386) [2022-01-23 23:00:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.563 (2.170) Loss 0.9602 (0.9336) Acc@1 79.004 (78.142) Acc@5 93.848 (94.334) [2022-01-23 23:01:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.090 Acc@5 94.274 [2022-01-23 23:01:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-01-23 23:01:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.23% [2022-01-23 23:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][0/1251] eta 7:39:19 lr 0.000271 time 22.0298 (22.0298) loss 3.6029 (3.6029) grad_norm 2.0025 (2.0025) [2022-01-23 23:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][10/1251] eta 1:24:58 lr 0.000271 time 2.4635 (4.1083) loss 2.5746 (3.3504) grad_norm 1.9031 (2.0820) [2022-01-23 23:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][20/1251] eta 1:05:42 lr 0.000271 time 1.5417 (3.2026) loss 3.7556 (3.2677) grad_norm 1.8287 (2.0088) [2022-01-23 23:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][30/1251] eta 0:59:43 lr 0.000271 time 1.8093 (2.9348) loss 4.2138 (3.3456) grad_norm 2.2054 (1.9490) [2022-01-23 23:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][40/1251] eta 0:55:27 lr 0.000271 time 2.9533 (2.7477) loss 3.8393 (3.3076) grad_norm 1.8892 (1.9473) [2022-01-23 23:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][50/1251] eta 0:53:18 lr 0.000271 time 3.2229 (2.6635) loss 3.5836 (3.2644) 
grad_norm 1.8041 (1.9389) [2022-01-23 23:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][60/1251] eta 0:51:17 lr 0.000271 time 2.4247 (2.5841) loss 2.5702 (3.2984) grad_norm 1.7811 (1.9269) [2022-01-23 23:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][70/1251] eta 0:49:41 lr 0.000271 time 1.8797 (2.5249) loss 3.7208 (3.3312) grad_norm 1.9662 (1.9135) [2022-01-23 23:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][80/1251] eta 0:48:27 lr 0.000271 time 2.5609 (2.4832) loss 3.3218 (3.3278) grad_norm 2.1360 (1.9160) [2022-01-23 23:04:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][90/1251] eta 0:47:42 lr 0.000271 time 2.8718 (2.4655) loss 4.2756 (3.3148) grad_norm 1.7664 (1.9205) [2022-01-23 23:05:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][100/1251] eta 0:46:53 lr 0.000271 time 3.0683 (2.4442) loss 3.5124 (3.3347) grad_norm 2.0051 (1.9234) [2022-01-23 23:05:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][110/1251] eta 0:45:55 lr 0.000271 time 1.7723 (2.4150) loss 3.2585 (3.3200) grad_norm 1.8126 (1.9252) [2022-01-23 23:05:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][120/1251] eta 0:45:13 lr 0.000271 time 2.3272 (2.3989) loss 4.0493 (3.3248) grad_norm 2.7813 (1.9330) [2022-01-23 23:06:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][130/1251] eta 0:44:20 lr 0.000271 time 1.6421 (2.3738) loss 3.5900 (3.3142) grad_norm 1.9263 (1.9357) [2022-01-23 23:06:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][140/1251] eta 0:43:41 lr 0.000271 time 2.6266 (2.3593) loss 3.1327 (3.3187) grad_norm 2.1234 (1.9378) [2022-01-23 23:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][150/1251] eta 0:42:55 lr 0.000271 time 2.1969 (2.3393) loss 3.2932 (3.3320) grad_norm 1.7509 (1.9414) [2022-01-23 23:07:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[197/300][160/1251] eta 0:42:11 lr 0.000271 time 1.8759 (2.3202) loss 3.3592 (3.3373) grad_norm 2.0054 (1.9370) [2022-01-23 23:07:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][170/1251] eta 0:41:28 lr 0.000270 time 1.6331 (2.3024) loss 3.4586 (3.3380) grad_norm 2.2700 (1.9376) [2022-01-23 23:08:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][180/1251] eta 0:40:59 lr 0.000270 time 1.6577 (2.2962) loss 2.1461 (3.3350) grad_norm 1.7833 (1.9329) [2022-01-23 23:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][190/1251] eta 0:40:33 lr 0.000270 time 2.1299 (2.2938) loss 3.2740 (3.3423) grad_norm 1.7053 (1.9284) [2022-01-23 23:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][200/1251] eta 0:40:10 lr 0.000270 time 1.7967 (2.2934) loss 3.4446 (3.3472) grad_norm 1.9586 (1.9241) [2022-01-23 23:09:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][210/1251] eta 0:39:41 lr 0.000270 time 1.8850 (2.2882) loss 3.5873 (3.3556) grad_norm 1.9048 (1.9207) [2022-01-23 23:09:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][220/1251] eta 0:39:17 lr 0.000270 time 2.2072 (2.2864) loss 3.5466 (3.3588) grad_norm 1.8498 (1.9197) [2022-01-23 23:09:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][230/1251] eta 0:38:50 lr 0.000270 time 2.4187 (2.2828) loss 2.7947 (3.3524) grad_norm 1.9481 (1.9168) [2022-01-23 23:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][240/1251] eta 0:38:29 lr 0.000270 time 2.2312 (2.2843) loss 3.4834 (3.3605) grad_norm 1.8124 (1.9128) [2022-01-23 23:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][250/1251] eta 0:38:01 lr 0.000270 time 1.8826 (2.2794) loss 4.1802 (3.3499) grad_norm 1.6164 (1.9118) [2022-01-23 23:10:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][260/1251] eta 0:37:30 lr 0.000270 time 2.0044 (2.2707) loss 2.4868 (3.3459) grad_norm 
1.7186 (1.9156) [2022-01-23 23:11:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][270/1251] eta 0:37:06 lr 0.000270 time 2.2044 (2.2699) loss 3.6124 (3.3495) grad_norm 1.9041 (1.9125) [2022-01-23 23:11:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][280/1251] eta 0:36:39 lr 0.000270 time 2.1627 (2.2649) loss 3.2431 (3.3434) grad_norm 1.6489 (1.9076) [2022-01-23 23:12:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][290/1251] eta 0:36:08 lr 0.000270 time 1.8751 (2.2565) loss 3.9788 (3.3499) grad_norm 1.8716 (1.9081) [2022-01-23 23:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][300/1251] eta 0:35:41 lr 0.000270 time 2.2523 (2.2522) loss 3.5827 (3.3561) grad_norm 1.9956 (1.9085) [2022-01-23 23:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][310/1251] eta 0:35:15 lr 0.000270 time 1.8713 (2.2479) loss 3.6345 (3.3562) grad_norm 2.2509 (1.9102) [2022-01-23 23:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][320/1251] eta 0:34:50 lr 0.000270 time 2.0976 (2.2455) loss 2.2738 (3.3504) grad_norm 1.6759 (1.9069) [2022-01-23 23:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][330/1251] eta 0:34:24 lr 0.000270 time 1.7930 (2.2413) loss 3.1801 (3.3485) grad_norm 1.8424 (1.9061) [2022-01-23 23:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][340/1251] eta 0:33:58 lr 0.000270 time 2.6148 (2.2375) loss 3.5075 (3.3488) grad_norm 1.6688 (1.9050) [2022-01-23 23:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][350/1251] eta 0:33:37 lr 0.000270 time 2.5619 (2.2394) loss 3.0519 (3.3489) grad_norm 1.6431 (1.9041) [2022-01-23 23:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][360/1251] eta 0:33:12 lr 0.000270 time 1.8118 (2.2367) loss 3.7850 (3.3512) grad_norm 1.8995 (1.9025) [2022-01-23 23:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[197/300][370/1251] eta 0:32:54 lr 0.000270 time 2.3622 (2.2410) loss 3.3006 (3.3429) grad_norm 1.7518 (1.9054) [2022-01-23 23:15:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][380/1251] eta 0:32:33 lr 0.000270 time 1.5758 (2.2425) loss 3.6016 (3.3382) grad_norm 1.7978 (1.9067) [2022-01-23 23:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][390/1251] eta 0:32:09 lr 0.000270 time 2.2869 (2.2412) loss 3.1948 (3.3351) grad_norm 2.2663 (1.9096) [2022-01-23 23:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][400/1251] eta 0:31:43 lr 0.000270 time 2.3163 (2.2373) loss 2.8304 (3.3379) grad_norm 1.7995 (1.9091) [2022-01-23 23:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][410/1251] eta 0:31:18 lr 0.000270 time 1.7192 (2.2332) loss 3.4759 (3.3384) grad_norm 1.7864 (1.9115) [2022-01-23 23:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][420/1251] eta 0:30:51 lr 0.000270 time 1.7947 (2.2284) loss 3.8926 (3.3397) grad_norm 1.7544 (1.9114) [2022-01-23 23:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][430/1251] eta 0:30:28 lr 0.000270 time 2.1090 (2.2269) loss 3.0248 (3.3371) grad_norm 2.0770 (1.9117) [2022-01-23 23:17:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][440/1251] eta 0:30:05 lr 0.000269 time 1.8349 (2.2260) loss 3.0838 (3.3363) grad_norm 2.1363 (1.9122) [2022-01-23 23:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][450/1251] eta 0:29:43 lr 0.000269 time 1.8529 (2.2260) loss 2.4837 (3.3300) grad_norm 1.9881 (1.9130) [2022-01-23 23:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][460/1251] eta 0:29:22 lr 0.000269 time 2.4528 (2.2282) loss 3.2053 (3.3298) grad_norm 1.8121 (1.9121) [2022-01-23 23:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][470/1251] eta 0:28:59 lr 0.000269 time 2.4517 (2.2277) loss 3.1418 (3.3291) grad_norm 
1.6234 (1.9139) [2022-01-23 23:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][480/1251] eta 0:28:37 lr 0.000269 time 1.8022 (2.2275) loss 3.7881 (3.3337) grad_norm 2.1228 (1.9160) [2022-01-23 23:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][490/1251] eta 0:28:12 lr 0.000269 time 2.0275 (2.2245) loss 3.7434 (3.3325) grad_norm 1.8874 (1.9150) [2022-01-23 23:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][500/1251] eta 0:27:47 lr 0.000269 time 2.2324 (2.2209) loss 3.3985 (3.3330) grad_norm 1.7009 (1.9148) [2022-01-23 23:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][510/1251] eta 0:27:23 lr 0.000269 time 1.6865 (2.2183) loss 3.5680 (3.3370) grad_norm 2.1066 (1.9160) [2022-01-23 23:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][520/1251] eta 0:27:02 lr 0.000269 time 1.8270 (2.2192) loss 3.2759 (3.3332) grad_norm 2.1576 (1.9153) [2022-01-23 23:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][530/1251] eta 0:26:41 lr 0.000269 time 1.9163 (2.2217) loss 3.7183 (3.3355) grad_norm 1.8391 (1.9142) [2022-01-23 23:21:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][540/1251] eta 0:26:20 lr 0.000269 time 2.5876 (2.2227) loss 2.9549 (3.3328) grad_norm 1.9913 (1.9163) [2022-01-23 23:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][550/1251] eta 0:25:57 lr 0.000269 time 2.7134 (2.2221) loss 2.7132 (3.3294) grad_norm 2.1611 (1.9152) [2022-01-23 23:21:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][560/1251] eta 0:25:34 lr 0.000269 time 1.7866 (2.2210) loss 2.2912 (3.3302) grad_norm 1.9220 (1.9158) [2022-01-23 23:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][570/1251] eta 0:25:12 lr 0.000269 time 1.8293 (2.2212) loss 3.6396 (3.3295) grad_norm 2.2775 (1.9168) [2022-01-23 23:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[197/300][580/1251] eta 0:24:50 lr 0.000269 time 1.8011 (2.2207) loss 2.3510 (3.3290) grad_norm 1.7604 (1.9168) [2022-01-23 23:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][590/1251] eta 0:24:27 lr 0.000269 time 2.4327 (2.2200) loss 3.7799 (3.3294) grad_norm 1.7682 (1.9154) [2022-01-23 23:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][600/1251] eta 0:24:03 lr 0.000269 time 1.9084 (2.2167) loss 3.8000 (3.3301) grad_norm 1.7289 (1.9185) [2022-01-23 23:23:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][610/1251] eta 0:23:39 lr 0.000269 time 2.1816 (2.2152) loss 2.5457 (3.3301) grad_norm 1.8296 (1.9190) [2022-01-23 23:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][620/1251] eta 0:23:17 lr 0.000269 time 1.9042 (2.2140) loss 2.6097 (3.3284) grad_norm 1.7798 (1.9195) [2022-01-23 23:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][630/1251] eta 0:22:55 lr 0.000269 time 1.7495 (2.2144) loss 3.5694 (3.3321) grad_norm 1.8203 (1.9189) [2022-01-23 23:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][640/1251] eta 0:22:33 lr 0.000269 time 1.7321 (2.2144) loss 3.8573 (3.3357) grad_norm 1.6689 (1.9165) [2022-01-23 23:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][650/1251] eta 0:22:11 lr 0.000269 time 2.5038 (2.2148) loss 3.4387 (3.3354) grad_norm 1.7672 (1.9149) [2022-01-23 23:25:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][660/1251] eta 0:21:49 lr 0.000269 time 2.2052 (2.2157) loss 3.1862 (3.3356) grad_norm 1.7077 (1.9152) [2022-01-23 23:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][670/1251] eta 0:21:26 lr 0.000269 time 2.2317 (2.2149) loss 3.7780 (3.3346) grad_norm 1.9480 (1.9149) [2022-01-23 23:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][680/1251] eta 0:21:04 lr 0.000269 time 1.9941 (2.2140) loss 3.0784 (3.3369) grad_norm 
2.0848 (1.9155) [2022-01-23 23:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][690/1251] eta 0:20:42 lr 0.000269 time 3.1360 (2.2148) loss 3.4592 (3.3337) grad_norm 1.8587 (1.9141) [2022-01-23 23:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][700/1251] eta 0:20:18 lr 0.000269 time 1.8459 (2.2123) loss 3.8074 (3.3375) grad_norm 1.9224 (1.9131) [2022-01-23 23:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][710/1251] eta 0:19:55 lr 0.000268 time 2.5230 (2.2105) loss 3.0768 (3.3395) grad_norm 1.8493 (1.9142) [2022-01-23 23:27:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][720/1251] eta 0:19:32 lr 0.000268 time 1.9137 (2.2074) loss 4.1083 (3.3406) grad_norm 1.8178 (1.9139) [2022-01-23 23:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][730/1251] eta 0:19:09 lr 0.000268 time 2.2193 (2.2062) loss 3.7499 (3.3441) grad_norm 1.8517 (1.9139) [2022-01-23 23:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][740/1251] eta 0:18:46 lr 0.000268 time 1.7760 (2.2037) loss 3.0729 (3.3449) grad_norm 2.0475 (1.9161) [2022-01-23 23:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][750/1251] eta 0:18:24 lr 0.000268 time 2.9896 (2.2052) loss 3.6198 (3.3444) grad_norm 1.8080 (1.9183) [2022-01-23 23:29:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][760/1251] eta 0:18:03 lr 0.000268 time 2.4905 (2.2074) loss 4.0190 (3.3456) grad_norm 2.1057 (1.9198) [2022-01-23 23:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][770/1251] eta 0:17:42 lr 0.000268 time 2.3959 (2.2098) loss 3.5434 (3.3444) grad_norm 1.8098 (1.9197) [2022-01-23 23:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][780/1251] eta 0:17:20 lr 0.000268 time 2.2683 (2.2099) loss 3.8388 (3.3450) grad_norm 1.6265 (1.9203) [2022-01-23 23:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[197/300][790/1251] eta 0:16:59 lr 0.000268 time 3.3519 (2.2107) loss 3.3690 (3.3453) grad_norm 2.0753 (1.9210) [2022-01-23 23:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][800/1251] eta 0:16:35 lr 0.000268 time 1.9853 (2.2070) loss 3.2417 (3.3449) grad_norm 1.7772 (1.9197) [2022-01-23 23:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][810/1251] eta 0:16:12 lr 0.000268 time 1.6526 (2.2048) loss 3.5209 (3.3493) grad_norm 1.8504 (1.9203) [2022-01-23 23:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][820/1251] eta 0:15:50 lr 0.000268 time 2.3316 (2.2045) loss 3.9775 (3.3475) grad_norm 2.1700 (1.9203) [2022-01-23 23:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][830/1251] eta 0:15:29 lr 0.000268 time 3.5723 (2.2078) loss 3.6467 (3.3493) grad_norm 2.1773 (1.9207) [2022-01-23 23:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][840/1251] eta 0:15:07 lr 0.000268 time 1.9258 (2.2076) loss 3.9800 (3.3485) grad_norm 1.7764 (1.9200) [2022-01-23 23:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][850/1251] eta 0:14:45 lr 0.000268 time 2.1732 (2.2073) loss 2.2819 (3.3413) grad_norm 1.7691 (1.9195) [2022-01-23 23:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][860/1251] eta 0:14:23 lr 0.000268 time 2.1890 (2.2078) loss 3.9013 (3.3441) grad_norm 1.7541 (1.9191) [2022-01-23 23:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][870/1251] eta 0:14:01 lr 0.000268 time 3.3445 (2.2091) loss 2.3198 (3.3390) grad_norm 1.8610 (1.9184) [2022-01-23 23:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][880/1251] eta 0:13:38 lr 0.000268 time 1.6086 (2.2068) loss 3.7296 (3.3376) grad_norm 1.7166 (1.9185) [2022-01-23 23:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][890/1251] eta 0:13:16 lr 0.000268 time 1.9116 (2.2062) loss 3.7342 (3.3343) grad_norm 
1.6538 (1.9183) [2022-01-23 23:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][900/1251] eta 0:12:54 lr 0.000268 time 2.2989 (2.2053) loss 3.4030 (3.3365) grad_norm 1.8612 (1.9180) [2022-01-23 23:34:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][910/1251] eta 0:12:32 lr 0.000268 time 3.6738 (2.2066) loss 3.3198 (3.3365) grad_norm 2.0502 (1.9179) [2022-01-23 23:34:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][920/1251] eta 0:12:10 lr 0.000268 time 2.2220 (2.2066) loss 3.4462 (3.3370) grad_norm 2.2747 (1.9189) [2022-01-23 23:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][930/1251] eta 0:11:48 lr 0.000268 time 1.7197 (2.2061) loss 3.6477 (3.3348) grad_norm 2.1334 (1.9192) [2022-01-23 23:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][940/1251] eta 0:11:25 lr 0.000268 time 1.8439 (2.2044) loss 3.7554 (3.3337) grad_norm 1.8675 (1.9205) [2022-01-23 23:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][950/1251] eta 0:11:03 lr 0.000268 time 3.0073 (2.2052) loss 2.3675 (3.3323) grad_norm 1.9705 (1.9218) [2022-01-23 23:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][960/1251] eta 0:10:41 lr 0.000268 time 2.1585 (2.2042) loss 2.7388 (3.3304) grad_norm 1.9081 (1.9238) [2022-01-23 23:36:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][970/1251] eta 0:10:19 lr 0.000268 time 1.7291 (2.2030) loss 2.7419 (3.3321) grad_norm 1.9434 (1.9234) [2022-01-23 23:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][980/1251] eta 0:09:56 lr 0.000268 time 2.2408 (2.2022) loss 3.5825 (3.3341) grad_norm 1.7284 (1.9236) [2022-01-23 23:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][990/1251] eta 0:09:34 lr 0.000267 time 3.4078 (2.2025) loss 2.8194 (3.3372) grad_norm 1.8380 (1.9236) [2022-01-23 23:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[197/300][1000/1251] eta 0:09:12 lr 0.000267 time 1.6947 (2.2017) loss 3.5215 (3.3407) grad_norm 1.7692 (1.9230) [2022-01-23 23:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1010/1251] eta 0:08:50 lr 0.000267 time 1.9054 (2.2033) loss 3.3276 (3.3395) grad_norm 1.8811 (1.9233) [2022-01-23 23:38:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1020/1251] eta 0:08:28 lr 0.000267 time 2.8108 (2.2029) loss 2.5158 (3.3403) grad_norm 1.7932 (1.9236) [2022-01-23 23:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1030/1251] eta 0:08:06 lr 0.000267 time 2.9805 (2.2029) loss 2.9822 (3.3399) grad_norm 2.4230 (1.9247) [2022-01-23 23:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1040/1251] eta 0:07:44 lr 0.000267 time 1.9388 (2.2022) loss 3.6573 (3.3388) grad_norm 1.8925 (1.9246) [2022-01-23 23:39:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1050/1251] eta 0:07:22 lr 0.000267 time 2.6699 (2.2023) loss 3.7670 (3.3417) grad_norm 2.0690 (1.9241) [2022-01-23 23:40:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1060/1251] eta 0:07:00 lr 0.000267 time 2.0669 (2.2023) loss 3.9893 (3.3415) grad_norm 2.1004 (1.9241) [2022-01-23 23:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1070/1251] eta 0:06:38 lr 0.000267 time 2.0183 (2.2012) loss 3.7936 (3.3431) grad_norm 2.0896 (1.9239) [2022-01-23 23:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1080/1251] eta 0:06:16 lr 0.000267 time 1.8744 (2.1999) loss 2.3178 (3.3427) grad_norm 1.9675 (1.9230) [2022-01-23 23:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1090/1251] eta 0:05:54 lr 0.000267 time 2.6561 (2.1997) loss 2.9887 (3.3437) grad_norm 2.1897 (1.9231) [2022-01-23 23:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1100/1251] eta 0:05:32 lr 0.000267 time 3.3194 (2.2009) loss 3.4851 (3.3436) 
grad_norm 1.9725 (1.9230) [2022-01-23 23:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1110/1251] eta 0:05:10 lr 0.000267 time 1.5497 (2.2008) loss 3.4026 (3.3431) grad_norm 1.9108 (1.9223) [2022-01-23 23:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1120/1251] eta 0:04:48 lr 0.000267 time 2.1190 (2.2022) loss 3.6706 (3.3454) grad_norm 2.3267 (1.9221) [2022-01-23 23:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1130/1251] eta 0:04:26 lr 0.000267 time 2.2788 (2.2023) loss 3.6279 (3.3452) grad_norm 2.1085 (1.9225) [2022-01-23 23:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1140/1251] eta 0:04:04 lr 0.000267 time 3.4402 (2.2023) loss 2.6850 (3.3440) grad_norm 1.7432 (1.9219) [2022-01-23 23:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1150/1251] eta 0:03:42 lr 0.000267 time 1.9801 (2.2010) loss 3.6429 (3.3460) grad_norm 2.5246 (1.9215) [2022-01-23 23:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1160/1251] eta 0:03:20 lr 0.000267 time 2.2422 (2.2006) loss 2.1552 (3.3463) grad_norm 1.6995 (1.9210) [2022-01-23 23:44:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1170/1251] eta 0:02:58 lr 0.000267 time 1.8140 (2.1992) loss 3.6284 (3.3464) grad_norm 1.8786 (1.9203) [2022-01-23 23:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1180/1251] eta 0:02:36 lr 0.000267 time 1.9001 (2.1987) loss 3.5163 (3.3469) grad_norm 1.8503 (1.9204) [2022-01-23 23:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1190/1251] eta 0:02:14 lr 0.000267 time 2.3024 (2.1983) loss 3.6145 (3.3489) grad_norm 1.7189 (1.9206) [2022-01-23 23:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1200/1251] eta 0:01:52 lr 0.000267 time 2.1595 (2.1984) loss 3.5058 (3.3487) grad_norm 1.9444 (1.9203) [2022-01-23 23:45:27 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [197/300][1210/1251] eta 0:01:30 lr 0.000267 time 2.3560 (2.1983) loss 4.0377 (3.3477) grad_norm 1.6849 (1.9200) [2022-01-23 23:45:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1220/1251] eta 0:01:08 lr 0.000267 time 2.0347 (2.1983) loss 3.9274 (3.3477) grad_norm 1.7801 (1.9194) [2022-01-23 23:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1230/1251] eta 0:00:46 lr 0.000267 time 1.4982 (2.1988) loss 3.6456 (3.3479) grad_norm 2.1494 (1.9191) [2022-01-23 23:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1240/1251] eta 0:00:24 lr 0.000267 time 1.4326 (2.1990) loss 2.5157 (3.3465) grad_norm 1.8991 (1.9188) [2022-01-23 23:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [197/300][1250/1251] eta 0:00:02 lr 0.000267 time 1.1977 (2.1930) loss 3.2563 (3.3468) grad_norm 2.1397 (1.9186) [2022-01-23 23:46:48 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 197 training takes 0:45:43 [2022-01-23 23:47:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.473 (18.473) Loss 0.9335 (0.9335) Acc@1 76.758 (76.758) Acc@5 94.531 (94.531) [2022-01-23 23:47:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.833 (3.377) Loss 0.8902 (0.9397) Acc@1 78.320 (77.468) Acc@5 94.531 (94.300) [2022-01-23 23:47:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.288 (2.536) Loss 0.9198 (0.9325) Acc@1 78.320 (77.911) Acc@5 95.410 (94.443) [2022-01-23 23:47:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.911 (2.195) Loss 0.9967 (0.9305) Acc@1 76.562 (77.990) Acc@5 93.652 (94.377) [2022-01-23 23:48:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.810 (2.179) Loss 0.9037 (0.9321) Acc@1 78.906 (77.968) Acc@5 94.531 (94.355) [2022-01-23 23:48:25 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.094 Acc@5 94.394 [2022-01-23 23:48:25 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 78.1% [2022-01-23 23:48:25 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.23% [2022-01-23 23:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][0/1251] eta 7:30:40 lr 0.000267 time 21.6154 (21.6154) loss 3.3291 (3.3291) grad_norm 1.7549 (1.7549) [2022-01-23 23:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][10/1251] eta 1:23:53 lr 0.000266 time 1.9949 (4.0561) loss 3.5467 (3.4670) grad_norm 1.7567 (1.7674) [2022-01-23 23:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][20/1251] eta 1:03:21 lr 0.000266 time 1.9579 (3.0882) loss 3.6122 (3.5586) grad_norm 1.8907 (1.8338) [2022-01-23 23:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][30/1251] eta 0:56:46 lr 0.000266 time 1.4750 (2.7901) loss 4.3628 (3.5901) grad_norm 1.9389 (1.8297) [2022-01-23 23:50:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][40/1251] eta 0:55:14 lr 0.000266 time 3.9478 (2.7366) loss 3.2011 (3.5486) grad_norm 1.7569 (1.8936) [2022-01-23 23:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][50/1251] eta 0:52:36 lr 0.000266 time 1.4320 (2.6278) loss 3.4911 (3.4785) grad_norm 1.9916 (1.9001) [2022-01-23 23:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][60/1251] eta 0:51:16 lr 0.000266 time 2.8701 (2.5835) loss 2.4870 (3.4603) grad_norm 1.6869 (1.8772) [2022-01-23 23:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][70/1251] eta 0:49:29 lr 0.000266 time 1.6844 (2.5140) loss 2.6986 (3.4315) grad_norm 2.1443 (1.8782) [2022-01-23 23:51:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][80/1251] eta 0:48:24 lr 0.000266 time 3.4139 (2.4806) loss 3.5637 (3.4141) grad_norm 1.9676 (1.8653) [2022-01-23 23:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][90/1251] eta 0:47:15 lr 0.000266 time 
1.8317 (2.4419) loss 2.1867 (3.4062) grad_norm 1.7691 (1.8669) [2022-01-23 23:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][100/1251] eta 0:46:24 lr 0.000266 time 2.4870 (2.4196) loss 3.4153 (3.3868) grad_norm 1.7228 (1.8699) [2022-01-23 23:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][110/1251] eta 0:45:33 lr 0.000266 time 1.9691 (2.3955) loss 3.1604 (3.4009) grad_norm 1.7753 (1.8706) [2022-01-23 23:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][120/1251] eta 0:44:44 lr 0.000266 time 2.8919 (2.3738) loss 3.1689 (3.3779) grad_norm 1.6450 (1.8728) [2022-01-23 23:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][130/1251] eta 0:44:03 lr 0.000266 time 2.1449 (2.3583) loss 3.6778 (3.3706) grad_norm 2.3209 (1.8695) [2022-01-23 23:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][140/1251] eta 0:43:29 lr 0.000266 time 1.9331 (2.3487) loss 3.6322 (3.3612) grad_norm 1.9685 (1.8694) [2022-01-23 23:54:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][150/1251] eta 0:42:44 lr 0.000266 time 1.8877 (2.3291) loss 3.5004 (3.3656) grad_norm 1.9757 (1.8867) [2022-01-23 23:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][160/1251] eta 0:42:06 lr 0.000266 time 2.8947 (2.3158) loss 2.4378 (3.3546) grad_norm 1.7660 (1.8867) [2022-01-23 23:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][170/1251] eta 0:41:31 lr 0.000266 time 2.0016 (2.3048) loss 3.3770 (3.3590) grad_norm 2.1593 (1.8939) [2022-01-23 23:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][180/1251] eta 0:40:53 lr 0.000266 time 1.7412 (2.2905) loss 3.4177 (3.3470) grad_norm 1.8721 (1.8977) [2022-01-23 23:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][190/1251] eta 0:40:26 lr 0.000266 time 2.1536 (2.2871) loss 2.5441 (3.3387) grad_norm 2.1301 (1.8981) [2022-01-23 23:56:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][200/1251] eta 0:40:04 lr 0.000266 time 3.3579 (2.2878) loss 4.0357 (3.3457) grad_norm 1.9063 (1.8994) [2022-01-23 23:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][210/1251] eta 0:39:36 lr 0.000266 time 1.9210 (2.2832) loss 3.6362 (3.3477) grad_norm 1.8307 (1.8986) [2022-01-23 23:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][220/1251] eta 0:39:07 lr 0.000266 time 2.1335 (2.2768) loss 2.4435 (3.3462) grad_norm 1.7765 (1.8975) [2022-01-23 23:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][230/1251] eta 0:38:43 lr 0.000266 time 1.7645 (2.2761) loss 3.5283 (3.3481) grad_norm 1.9037 (1.8956) [2022-01-23 23:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][240/1251] eta 0:38:22 lr 0.000266 time 3.4159 (2.2771) loss 3.4726 (3.3445) grad_norm 1.9926 (1.9003) [2022-01-23 23:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][250/1251] eta 0:37:52 lr 0.000266 time 1.6299 (2.2706) loss 3.7842 (3.3445) grad_norm 1.9099 (1.8975) [2022-01-23 23:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][260/1251] eta 0:37:23 lr 0.000266 time 2.1663 (2.2636) loss 3.9776 (3.3554) grad_norm 1.7142 (1.8980) [2022-01-23 23:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][270/1251] eta 0:36:56 lr 0.000266 time 2.5114 (2.2593) loss 3.6406 (3.3526) grad_norm 1.8188 (1.8968) [2022-01-23 23:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][280/1251] eta 0:36:36 lr 0.000266 time 2.4093 (2.2626) loss 3.6486 (3.3443) grad_norm 1.9291 (1.8960) [2022-01-23 23:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][290/1251] eta 0:36:13 lr 0.000265 time 2.1875 (2.2620) loss 3.8438 (3.3424) grad_norm 1.7452 (1.8951) [2022-01-23 23:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][300/1251] eta 0:35:50 lr 
0.000265 time 2.4563 (2.2608) loss 3.0275 (3.3301) grad_norm 2.1177 (1.8942) [2022-01-24 00:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][310/1251] eta 0:35:21 lr 0.000265 time 1.9593 (2.2550) loss 3.6362 (3.3242) grad_norm 2.1080 (1.8943) [2022-01-24 00:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][320/1251] eta 0:34:50 lr 0.000265 time 2.5306 (2.2454) loss 3.6502 (3.3207) grad_norm 2.1871 (1.8986) [2022-01-24 00:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][330/1251] eta 0:34:20 lr 0.000265 time 1.9871 (2.2376) loss 4.0744 (3.3186) grad_norm 2.0031 (1.8984) [2022-01-24 00:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][340/1251] eta 0:33:55 lr 0.000265 time 2.2298 (2.2338) loss 3.1873 (3.3204) grad_norm 1.5919 (1.8964) [2022-01-24 00:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][350/1251] eta 0:33:34 lr 0.000265 time 2.6527 (2.2358) loss 3.8036 (3.3237) grad_norm 1.7206 (1.8952) [2022-01-24 00:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][360/1251] eta 0:33:12 lr 0.000265 time 2.5014 (2.2360) loss 2.6290 (3.3155) grad_norm 1.6078 (1.8933) [2022-01-24 00:02:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][370/1251] eta 0:32:52 lr 0.000265 time 2.1566 (2.2387) loss 3.5842 (3.3209) grad_norm 2.6613 (1.8966) [2022-01-24 00:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][380/1251] eta 0:32:32 lr 0.000265 time 2.0646 (2.2411) loss 3.0836 (3.3257) grad_norm 2.0799 (1.8986) [2022-01-24 00:03:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][390/1251] eta 0:32:12 lr 0.000265 time 2.8570 (2.2445) loss 3.8028 (3.3330) grad_norm 2.2673 (1.9008) [2022-01-24 00:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][400/1251] eta 0:31:45 lr 0.000265 time 1.9180 (2.2386) loss 3.2926 (3.3396) grad_norm 1.8620 (1.9016) [2022-01-24 00:03:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][410/1251] eta 0:31:18 lr 0.000265 time 1.8494 (2.2333) loss 3.4509 (3.3454) grad_norm 2.4883 (1.9022) [2022-01-24 00:04:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][420/1251] eta 0:30:49 lr 0.000265 time 1.5755 (2.2262) loss 4.1128 (3.3466) grad_norm 1.7268 (1.9043) [2022-01-24 00:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][430/1251] eta 0:30:24 lr 0.000265 time 2.2504 (2.2220) loss 2.9397 (3.3429) grad_norm 2.2501 (1.9046) [2022-01-24 00:04:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][440/1251] eta 0:29:59 lr 0.000265 time 1.7728 (2.2192) loss 3.6600 (3.3369) grad_norm 1.8055 (1.9060) [2022-01-24 00:05:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][450/1251] eta 0:29:36 lr 0.000265 time 2.0572 (2.2183) loss 3.6223 (3.3342) grad_norm 1.8460 (1.9067) [2022-01-24 00:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][460/1251] eta 0:29:17 lr 0.000265 time 2.2492 (2.2224) loss 3.7957 (3.3345) grad_norm 1.9133 (1.9060) [2022-01-24 00:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][470/1251] eta 0:28:57 lr 0.000265 time 3.4756 (2.2253) loss 3.6099 (3.3385) grad_norm 1.9384 (1.9045) [2022-01-24 00:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][480/1251] eta 0:28:38 lr 0.000265 time 1.7832 (2.2288) loss 4.0064 (3.3361) grad_norm 1.8928 (1.9044) [2022-01-24 00:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][490/1251] eta 0:28:18 lr 0.000265 time 2.5699 (2.2315) loss 3.0291 (3.3308) grad_norm 1.6443 (1.9021) [2022-01-24 00:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][500/1251] eta 0:27:55 lr 0.000265 time 1.6408 (2.2308) loss 3.0994 (3.3312) grad_norm 1.6889 (1.9037) [2022-01-24 00:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][510/1251] eta 0:27:31 lr 
0.000265 time 1.9275 (2.2284) loss 3.8100 (3.3374) grad_norm 1.9173 (1.9036) [2022-01-24 00:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][520/1251] eta 0:27:05 lr 0.000265 time 1.8047 (2.2242) loss 3.4184 (3.3399) grad_norm 1.8264 (1.9019) [2022-01-24 00:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][530/1251] eta 0:26:41 lr 0.000265 time 1.5879 (2.2207) loss 3.6909 (3.3452) grad_norm 2.1365 (1.9030) [2022-01-24 00:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][540/1251] eta 0:26:16 lr 0.000265 time 2.2007 (2.2172) loss 3.4825 (3.3470) grad_norm 2.0858 (1.9052) [2022-01-24 00:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][550/1251] eta 0:25:52 lr 0.000265 time 2.1671 (2.2144) loss 2.4186 (3.3449) grad_norm 2.1157 (1.9047) [2022-01-24 00:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][560/1251] eta 0:25:28 lr 0.000265 time 1.9191 (2.2115) loss 4.1294 (3.3472) grad_norm 2.1042 (1.9065) [2022-01-24 00:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][570/1251] eta 0:25:07 lr 0.000264 time 2.8133 (2.2129) loss 3.8887 (3.3465) grad_norm 2.4682 (1.9101) [2022-01-24 00:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][580/1251] eta 0:24:44 lr 0.000264 time 1.8244 (2.2129) loss 3.9329 (3.3484) grad_norm 1.9324 (1.9104) [2022-01-24 00:10:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][590/1251] eta 0:24:22 lr 0.000264 time 2.2259 (2.2127) loss 3.3283 (3.3501) grad_norm 2.3421 (1.9125) [2022-01-24 00:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][600/1251] eta 0:24:02 lr 0.000264 time 2.4500 (2.2154) loss 3.0879 (3.3480) grad_norm 1.6327 (1.9129) [2022-01-24 00:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][610/1251] eta 0:23:41 lr 0.000264 time 2.8840 (2.2169) loss 3.4257 (3.3449) grad_norm 2.0805 (1.9138) [2022-01-24 00:11:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][620/1251] eta 0:23:20 lr 0.000264 time 1.7219 (2.2187) loss 2.3044 (3.3447) grad_norm 1.6473 (1.9126) [2022-01-24 00:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][630/1251] eta 0:22:57 lr 0.000264 time 2.0964 (2.2176) loss 3.9229 (3.3435) grad_norm 1.6204 (1.9109) [2022-01-24 00:12:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][640/1251] eta 0:22:34 lr 0.000264 time 2.3187 (2.2166) loss 3.6922 (3.3429) grad_norm 1.7917 (1.9093) [2022-01-24 00:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][650/1251] eta 0:22:10 lr 0.000264 time 2.5613 (2.2143) loss 3.5064 (3.3422) grad_norm 1.6425 (1.9075) [2022-01-24 00:12:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][660/1251] eta 0:21:47 lr 0.000264 time 1.6175 (2.2127) loss 3.3009 (3.3411) grad_norm 1.7612 (1.9062) [2022-01-24 00:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][670/1251] eta 0:21:25 lr 0.000264 time 2.1931 (2.2130) loss 3.7186 (3.3421) grad_norm 1.8463 (1.9051) [2022-01-24 00:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][680/1251] eta 0:21:03 lr 0.000264 time 1.9526 (2.2126) loss 3.5451 (3.3415) grad_norm 1.7745 (1.9032) [2022-01-24 00:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][690/1251] eta 0:20:41 lr 0.000264 time 2.2633 (2.2131) loss 3.2039 (3.3397) grad_norm 1.8160 (1.9021) [2022-01-24 00:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][700/1251] eta 0:20:19 lr 0.000264 time 1.8848 (2.2130) loss 3.0484 (3.3398) grad_norm 1.9937 (1.9022) [2022-01-24 00:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][710/1251] eta 0:19:56 lr 0.000264 time 1.9658 (2.2108) loss 3.9349 (3.3435) grad_norm 1.5831 (1.9014) [2022-01-24 00:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][720/1251] eta 0:19:33 lr 
0.000264 time 2.5936 (2.2100) loss 3.9383 (3.3437) grad_norm 1.7496 (1.9001) [2022-01-24 00:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][730/1251] eta 0:19:11 lr 0.000264 time 1.5846 (2.2093) loss 2.6870 (3.3461) grad_norm 1.8833 (1.9002) [2022-01-24 00:15:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][740/1251] eta 0:18:49 lr 0.000264 time 2.8235 (2.2107) loss 3.0428 (3.3483) grad_norm 2.3054 (1.9015) [2022-01-24 00:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][750/1251] eta 0:18:26 lr 0.000264 time 2.2084 (2.2096) loss 3.0391 (3.3437) grad_norm 1.9367 (1.9007) [2022-01-24 00:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][760/1251] eta 0:18:04 lr 0.000264 time 1.8944 (2.2079) loss 3.2729 (3.3415) grad_norm 1.8780 (1.8997) [2022-01-24 00:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][770/1251] eta 0:17:41 lr 0.000264 time 1.9010 (2.2073) loss 3.7358 (3.3421) grad_norm 1.9954 (1.9001) [2022-01-24 00:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][780/1251] eta 0:17:19 lr 0.000264 time 1.8174 (2.2069) loss 3.6942 (3.3422) grad_norm 2.1536 (1.9021) [2022-01-24 00:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][790/1251] eta 0:16:57 lr 0.000264 time 3.3814 (2.2062) loss 3.4780 (3.3415) grad_norm 1.6765 (1.9026) [2022-01-24 00:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][800/1251] eta 0:16:34 lr 0.000264 time 2.4895 (2.2060) loss 2.5146 (3.3398) grad_norm 1.8982 (1.9020) [2022-01-24 00:18:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][810/1251] eta 0:16:12 lr 0.000264 time 2.0262 (2.2043) loss 3.1871 (3.3423) grad_norm 2.0630 (1.9017) [2022-01-24 00:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][820/1251] eta 0:15:50 lr 0.000264 time 2.7152 (2.2055) loss 2.9317 (3.3406) grad_norm 1.8744 (1.9017) [2022-01-24 00:18:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][830/1251] eta 0:15:28 lr 0.000264 time 1.5213 (2.2049) loss 3.7911 (3.3413) grad_norm 1.8411 (1.9021) [2022-01-24 00:19:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][840/1251] eta 0:15:05 lr 0.000263 time 2.9002 (2.2039) loss 3.3680 (3.3380) grad_norm 1.9285 (1.9038) [2022-01-24 00:19:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][850/1251] eta 0:14:43 lr 0.000263 time 1.9026 (2.2038) loss 2.9985 (3.3387) grad_norm 2.0587 (1.9032) [2022-01-24 00:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][860/1251] eta 0:14:22 lr 0.000263 time 2.6365 (2.2056) loss 3.0387 (3.3398) grad_norm 1.8336 (1.9032) [2022-01-24 00:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][870/1251] eta 0:13:59 lr 0.000263 time 1.5853 (2.2045) loss 2.3417 (3.3385) grad_norm 1.8277 (1.9036) [2022-01-24 00:20:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][880/1251] eta 0:13:37 lr 0.000263 time 2.2977 (2.2033) loss 2.7560 (3.3364) grad_norm 2.0273 (1.9050) [2022-01-24 00:21:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][890/1251] eta 0:13:14 lr 0.000263 time 1.6963 (2.2005) loss 2.7798 (3.3369) grad_norm 1.7490 (1.9046) [2022-01-24 00:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][900/1251] eta 0:12:52 lr 0.000263 time 3.0929 (2.2001) loss 2.8678 (3.3333) grad_norm 1.7640 (1.9034) [2022-01-24 00:21:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][910/1251] eta 0:12:30 lr 0.000263 time 1.9565 (2.2008) loss 2.5724 (3.3329) grad_norm 1.7192 (1.9037) [2022-01-24 00:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][920/1251] eta 0:12:09 lr 0.000263 time 2.8254 (2.2032) loss 3.4763 (3.3331) grad_norm 1.7995 (1.9030) [2022-01-24 00:22:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][930/1251] eta 0:11:47 lr 
0.000263 time 2.1067 (2.2034) loss 3.7254 (3.3332) grad_norm 1.8433 (1.9018) [2022-01-24 00:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][940/1251] eta 0:11:25 lr 0.000263 time 3.3159 (2.2050) loss 3.3122 (3.3320) grad_norm 1.9210 (1.9046) [2022-01-24 00:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][950/1251] eta 0:11:03 lr 0.000263 time 2.4415 (2.2048) loss 2.6281 (3.3309) grad_norm 1.7270 (1.9060) [2022-01-24 00:23:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][960/1251] eta 0:10:41 lr 0.000263 time 1.6609 (2.2044) loss 3.3817 (3.3313) grad_norm 1.9309 (1.9057) [2022-01-24 00:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][970/1251] eta 0:10:19 lr 0.000263 time 2.2648 (2.2035) loss 3.6901 (3.3321) grad_norm 1.8520 (1.9060) [2022-01-24 00:24:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][980/1251] eta 0:09:56 lr 0.000263 time 1.9735 (2.2018) loss 3.1929 (3.3301) grad_norm 2.5268 (1.9062) [2022-01-24 00:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][990/1251] eta 0:09:34 lr 0.000263 time 2.6047 (2.2009) loss 3.4486 (3.3298) grad_norm 2.0713 (1.9064) [2022-01-24 00:25:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1000/1251] eta 0:09:12 lr 0.000263 time 2.4429 (2.1996) loss 3.1324 (3.3293) grad_norm 2.0023 (1.9077) [2022-01-24 00:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1010/1251] eta 0:08:49 lr 0.000263 time 2.0161 (2.1988) loss 3.6116 (3.3286) grad_norm 1.6450 (1.9071) [2022-01-24 00:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1020/1251] eta 0:08:27 lr 0.000263 time 1.8864 (2.1984) loss 2.2748 (3.3262) grad_norm 1.8504 (1.9077) [2022-01-24 00:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1030/1251] eta 0:08:05 lr 0.000263 time 2.8928 (2.1983) loss 3.5593 (3.3277) grad_norm 2.0071 (1.9083) [2022-01-24 
00:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1040/1251] eta 0:07:44 lr 0.000263 time 2.5213 (2.1996) loss 3.7104 (3.3275) grad_norm 1.6623 (1.9090) [2022-01-24 00:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1050/1251] eta 0:07:22 lr 0.000263 time 2.3538 (2.2018) loss 3.6003 (3.3268) grad_norm 2.0081 (1.9092) [2022-01-24 00:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1060/1251] eta 0:07:00 lr 0.000263 time 2.2070 (2.2013) loss 3.0101 (3.3260) grad_norm 1.9445 (1.9087) [2022-01-24 00:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1070/1251] eta 0:06:38 lr 0.000263 time 1.8721 (2.1997) loss 3.7348 (3.3269) grad_norm 1.9098 (1.9082) [2022-01-24 00:28:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1080/1251] eta 0:06:15 lr 0.000263 time 1.8256 (2.1977) loss 3.8984 (3.3279) grad_norm 2.1061 (1.9082) [2022-01-24 00:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1090/1251] eta 0:05:53 lr 0.000263 time 2.2872 (2.1966) loss 4.0531 (3.3285) grad_norm 2.1607 (1.9078) [2022-01-24 00:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1100/1251] eta 0:05:31 lr 0.000263 time 1.8437 (2.1951) loss 3.5704 (3.3285) grad_norm 1.7572 (1.9074) [2022-01-24 00:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1110/1251] eta 0:05:09 lr 0.000263 time 1.8994 (2.1946) loss 3.5107 (3.3305) grad_norm 1.9419 (1.9064) [2022-01-24 00:29:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1120/1251] eta 0:04:47 lr 0.000262 time 1.9746 (2.1946) loss 3.0754 (3.3321) grad_norm 1.9877 (1.9066) [2022-01-24 00:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1130/1251] eta 0:04:25 lr 0.000262 time 2.2596 (2.1938) loss 3.8106 (3.3317) grad_norm 1.7269 (1.9073) [2022-01-24 00:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1140/1251] 
eta 0:04:03 lr 0.000262 time 2.1399 (2.1934) loss 3.1116 (3.3325) grad_norm 1.9267 (1.9078) [2022-01-24 00:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1150/1251] eta 0:03:41 lr 0.000262 time 2.5972 (2.1935) loss 2.9701 (3.3318) grad_norm 1.9305 (1.9090) [2022-01-24 00:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1160/1251] eta 0:03:19 lr 0.000262 time 2.2194 (2.1942) loss 2.5001 (3.3292) grad_norm 1.6551 (1.9089) [2022-01-24 00:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1170/1251] eta 0:02:57 lr 0.000262 time 2.7450 (2.1952) loss 3.5316 (3.3304) grad_norm 1.6642 (1.9087) [2022-01-24 00:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1180/1251] eta 0:02:36 lr 0.000262 time 2.7656 (2.1978) loss 2.8706 (3.3307) grad_norm 2.2749 (1.9095) [2022-01-24 00:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1190/1251] eta 0:02:14 lr 0.000262 time 1.8953 (2.1988) loss 3.4238 (3.3314) grad_norm 1.9890 (1.9093) [2022-01-24 00:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1200/1251] eta 0:01:52 lr 0.000262 time 2.4541 (2.2003) loss 3.3209 (3.3329) grad_norm 1.6685 (1.9088) [2022-01-24 00:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1210/1251] eta 0:01:30 lr 0.000262 time 1.9289 (2.2005) loss 3.7655 (3.3326) grad_norm 1.7102 (1.9080) [2022-01-24 00:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1220/1251] eta 0:01:08 lr 0.000262 time 1.9171 (2.1989) loss 3.6631 (3.3325) grad_norm 1.8390 (1.9076) [2022-01-24 00:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1230/1251] eta 0:00:46 lr 0.000262 time 1.6353 (2.1972) loss 3.5247 (3.3332) grad_norm 1.7245 (1.9087) [2022-01-24 00:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1240/1251] eta 0:00:24 lr 0.000262 time 1.6465 (2.1952) loss 3.7908 (3.3338) grad_norm 1.9287 
(1.9083) [2022-01-24 00:34:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [198/300][1250/1251] eta 0:00:02 lr 0.000262 time 1.1914 (2.1897) loss 3.1692 (3.3336) grad_norm 1.7074 (1.9075) [2022-01-24 00:34:05 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 198 training takes 0:45:39 [2022-01-24 00:34:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.435 (18.435) Loss 0.8896 (0.8896) Acc@1 78.320 (78.320) Acc@5 95.312 (95.312) [2022-01-24 00:34:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.830 (3.234) Loss 0.9571 (0.9157) Acc@1 76.855 (78.622) Acc@5 94.629 (94.735) [2022-01-24 00:34:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.258 (2.469) Loss 0.9842 (0.9305) Acc@1 77.051 (78.306) Acc@5 93.457 (94.517) [2022-01-24 00:35:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.942 (2.212) Loss 0.9727 (0.9392) Acc@1 78.320 (78.254) Acc@5 93.359 (94.349) [2022-01-24 00:35:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.589 (2.178) Loss 0.9697 (0.9343) Acc@1 77.734 (78.320) Acc@5 93.750 (94.424) [2022-01-24 00:35:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.336 Acc@5 94.420 [2022-01-24 00:35:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.3% [2022-01-24 00:35:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.34% [2022-01-24 00:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][0/1251] eta 7:33:21 lr 0.000262 time 21.7438 (21.7438) loss 2.6113 (2.6113) grad_norm 1.8464 (1.8464) [2022-01-24 00:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][10/1251] eta 1:22:23 lr 0.000262 time 1.8083 (3.9834) loss 2.0909 (3.2045) grad_norm 1.6548 (1.7322) [2022-01-24 00:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][20/1251] eta 1:03:23 lr 0.000262 time 1.4200 (3.0896) loss 
3.3233 (3.3254) grad_norm 1.6805 (1.7626) [2022-01-24 00:37:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][30/1251] eta 0:56:08 lr 0.000262 time 1.4782 (2.7591) loss 2.8923 (3.3179) grad_norm 1.9336 (1.7994) [2022-01-24 00:37:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][40/1251] eta 0:53:27 lr 0.000262 time 3.3032 (2.6486) loss 3.8371 (3.3258) grad_norm 2.0906 (1.8157) [2022-01-24 00:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][50/1251] eta 0:52:26 lr 0.000262 time 2.7963 (2.6198) loss 2.6758 (3.3339) grad_norm 1.8580 (1.8195) [2022-01-24 00:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][60/1251] eta 0:50:11 lr 0.000262 time 1.3874 (2.5282) loss 3.9422 (3.3104) grad_norm 1.6985 (1.8130) [2022-01-24 00:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][70/1251] eta 0:48:57 lr 0.000262 time 1.5564 (2.4871) loss 2.8393 (3.2879) grad_norm 2.0028 (1.8327) [2022-01-24 00:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][80/1251] eta 0:48:03 lr 0.000262 time 2.7074 (2.4628) loss 3.6840 (3.3044) grad_norm 1.8448 (1.8359) [2022-01-24 00:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][90/1251] eta 0:47:31 lr 0.000262 time 2.5890 (2.4562) loss 3.7790 (3.2991) grad_norm 1.8354 (1.8373) [2022-01-24 00:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][100/1251] eta 0:46:54 lr 0.000262 time 1.5155 (2.4455) loss 3.8845 (3.2949) grad_norm 1.8299 (1.8514) [2022-01-24 00:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][110/1251] eta 0:46:01 lr 0.000262 time 2.2134 (2.4206) loss 3.6085 (3.3117) grad_norm 2.0058 (1.8558) [2022-01-24 00:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][120/1251] eta 0:45:03 lr 0.000262 time 2.6229 (2.3901) loss 4.3176 (3.3297) grad_norm 1.9869 (1.8558) [2022-01-24 00:40:50 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [199/300][130/1251] eta 0:44:02 lr 0.000262 time 1.9784 (2.3576) loss 2.9100 (3.3343) grad_norm 1.8593 (1.8545) [2022-01-24 00:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][140/1251] eta 0:43:15 lr 0.000261 time 2.4744 (2.3366) loss 3.6951 (3.3466) grad_norm 1.8715 (1.8543) [2022-01-24 00:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][150/1251] eta 0:42:35 lr 0.000261 time 2.3031 (2.3209) loss 3.2505 (3.3356) grad_norm 1.5983 (1.8570) [2022-01-24 00:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][160/1251] eta 0:42:00 lr 0.000261 time 2.6953 (2.3099) loss 3.6753 (3.3306) grad_norm 1.8264 (1.8597) [2022-01-24 00:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][170/1251] eta 0:41:28 lr 0.000261 time 1.8590 (2.3018) loss 2.7698 (3.3296) grad_norm 1.9042 (1.8668) [2022-01-24 00:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][180/1251] eta 0:41:02 lr 0.000261 time 2.5322 (2.2990) loss 3.1277 (3.3330) grad_norm 1.8576 (1.8691) [2022-01-24 00:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][190/1251] eta 0:40:45 lr 0.000261 time 2.4931 (2.3052) loss 3.3778 (3.3302) grad_norm 2.2143 (1.8731) [2022-01-24 00:43:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][200/1251] eta 0:40:35 lr 0.000261 time 3.0577 (2.3172) loss 2.3198 (3.3208) grad_norm 1.9093 (1.8750) [2022-01-24 00:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][210/1251] eta 0:40:06 lr 0.000261 time 2.4873 (2.3115) loss 2.6971 (3.3250) grad_norm 1.6316 (1.8753) [2022-01-24 00:44:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][220/1251] eta 0:39:28 lr 0.000261 time 2.0405 (2.2977) loss 3.7811 (3.3277) grad_norm 1.8402 (1.8818) [2022-01-24 00:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][230/1251] eta 0:38:50 lr 0.000261 time 1.8598 (2.2830) loss 3.6254 
(3.3274) grad_norm 2.1862 (1.8816) [2022-01-24 00:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][240/1251] eta 0:38:16 lr 0.000261 time 2.4991 (2.2711) loss 2.5688 (3.3355) grad_norm 1.7668 (1.8835) [2022-01-24 00:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][250/1251] eta 0:37:49 lr 0.000261 time 2.1189 (2.2668) loss 3.5812 (3.3336) grad_norm 1.8756 (1.8836) [2022-01-24 00:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][260/1251] eta 0:37:22 lr 0.000261 time 2.2167 (2.2627) loss 3.4390 (3.3324) grad_norm 1.9575 (1.8790) [2022-01-24 00:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][270/1251] eta 0:36:59 lr 0.000261 time 2.3023 (2.2621) loss 3.7725 (3.3362) grad_norm 1.8333 (1.8800) [2022-01-24 00:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][280/1251] eta 0:36:39 lr 0.000261 time 2.8700 (2.2652) loss 2.4520 (3.3289) grad_norm 1.8254 (1.8781) [2022-01-24 00:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][290/1251] eta 0:36:17 lr 0.000261 time 2.2865 (2.2661) loss 2.7810 (3.3342) grad_norm 1.5899 (1.8775) [2022-01-24 00:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][300/1251] eta 0:35:49 lr 0.000261 time 1.9258 (2.2603) loss 2.8425 (3.3257) grad_norm 1.6585 (1.8816) [2022-01-24 00:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][310/1251] eta 0:35:23 lr 0.000261 time 2.2380 (2.2568) loss 3.8137 (3.3148) grad_norm 1.8082 (1.8811) [2022-01-24 00:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][320/1251] eta 0:34:57 lr 0.000261 time 2.5377 (2.2527) loss 3.8715 (3.3185) grad_norm 2.0545 (1.8823) [2022-01-24 00:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][330/1251] eta 0:34:33 lr 0.000261 time 2.2046 (2.2514) loss 3.1673 (3.3197) grad_norm 1.7332 (1.8803) [2022-01-24 00:48:28 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [199/300][340/1251] eta 0:34:08 lr 0.000261 time 1.8378 (2.2489) loss 2.6054 (3.3182) grad_norm 1.8038 (1.8809) [2022-01-24 00:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][350/1251] eta 0:33:43 lr 0.000261 time 1.8880 (2.2459) loss 2.6191 (3.3165) grad_norm 1.9647 (1.8839) [2022-01-24 00:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][360/1251] eta 0:33:18 lr 0.000261 time 1.6327 (2.2427) loss 3.3675 (3.3198) grad_norm 1.8897 (1.8837) [2022-01-24 00:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][370/1251] eta 0:32:52 lr 0.000261 time 1.9118 (2.2387) loss 3.5533 (3.3240) grad_norm 1.9838 (1.8833) [2022-01-24 00:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][380/1251] eta 0:32:27 lr 0.000261 time 1.9835 (2.2364) loss 2.7384 (3.3221) grad_norm 1.7619 (1.8865) [2022-01-24 00:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][390/1251] eta 0:32:05 lr 0.000261 time 2.5454 (2.2358) loss 3.3773 (3.3181) grad_norm 2.3264 (1.8892) [2022-01-24 00:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][400/1251] eta 0:31:45 lr 0.000261 time 2.2963 (2.2387) loss 3.1705 (3.3196) grad_norm 1.6521 (1.8908) [2022-01-24 00:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][410/1251] eta 0:31:19 lr 0.000261 time 1.8431 (2.2352) loss 3.3737 (3.3170) grad_norm 1.8496 (1.8928) [2022-01-24 00:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][420/1251] eta 0:30:58 lr 0.000260 time 2.3189 (2.2359) loss 2.6826 (3.3206) grad_norm 1.6787 (1.8933) [2022-01-24 00:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][430/1251] eta 0:30:35 lr 0.000260 time 2.3838 (2.2359) loss 3.3190 (3.3205) grad_norm 2.0568 (1.8940) [2022-01-24 00:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][440/1251] eta 0:30:12 lr 0.000260 time 1.8356 (2.2349) loss 3.3344 
(3.3267) grad_norm 1.9790 (1.8972) [2022-01-24 00:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][450/1251] eta 0:29:50 lr 0.000260 time 2.1461 (2.2358) loss 3.5082 (3.3276) grad_norm 1.6515 (1.8976) [2022-01-24 00:52:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][460/1251] eta 0:29:25 lr 0.000260 time 1.7975 (2.2316) loss 3.5237 (3.3307) grad_norm 2.3805 (1.8975) [2022-01-24 00:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][470/1251] eta 0:29:00 lr 0.000260 time 2.0464 (2.2280) loss 2.5453 (3.3234) grad_norm 1.7367 (1.8984) [2022-01-24 00:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][480/1251] eta 0:28:34 lr 0.000260 time 1.9338 (2.2237) loss 2.2815 (3.3169) grad_norm 1.9394 (1.8986) [2022-01-24 00:53:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][490/1251] eta 0:28:10 lr 0.000260 time 2.0615 (2.2219) loss 2.3601 (3.3171) grad_norm 1.6795 (1.9011) [2022-01-24 00:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][500/1251] eta 0:27:48 lr 0.000260 time 2.1668 (2.2219) loss 2.5902 (3.3099) grad_norm 1.9372 (1.9027) [2022-01-24 00:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][510/1251] eta 0:27:26 lr 0.000260 time 2.4699 (2.2221) loss 2.6950 (3.3110) grad_norm 2.1166 (1.9021) [2022-01-24 00:54:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][520/1251] eta 0:27:03 lr 0.000260 time 1.5882 (2.2213) loss 2.9904 (3.3117) grad_norm 2.0204 (1.9034) [2022-01-24 00:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][530/1251] eta 0:26:43 lr 0.000260 time 2.5767 (2.2237) loss 3.1199 (3.3051) grad_norm 1.9319 (1.9041) [2022-01-24 00:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][540/1251] eta 0:26:19 lr 0.000260 time 2.0038 (2.2213) loss 2.8782 (3.3086) grad_norm 1.8836 (1.9041) [2022-01-24 00:56:03 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [199/300][550/1251] eta 0:25:55 lr 0.000260 time 1.8915 (2.2190) loss 2.3613 (3.3081) grad_norm 1.8655 (1.9049) [2022-01-24 00:56:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][560/1251] eta 0:25:32 lr 0.000260 time 2.0011 (2.2180) loss 3.4541 (3.3091) grad_norm 1.9174 (1.9041) [2022-01-24 00:56:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][570/1251] eta 0:25:09 lr 0.000260 time 2.5509 (2.2173) loss 2.5479 (3.3071) grad_norm 1.8541 (1.9028) [2022-01-24 00:57:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][580/1251] eta 0:24:47 lr 0.000260 time 2.4533 (2.2169) loss 3.8310 (3.3095) grad_norm 1.7392 (1.9024) [2022-01-24 00:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][590/1251] eta 0:24:26 lr 0.000260 time 1.8569 (2.2182) loss 2.2587 (3.3076) grad_norm 2.1304 (1.9021) [2022-01-24 00:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][600/1251] eta 0:24:05 lr 0.000260 time 2.2856 (2.2197) loss 3.6698 (3.3076) grad_norm 1.9188 (1.9026) [2022-01-24 00:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][610/1251] eta 0:23:42 lr 0.000260 time 2.5181 (2.2198) loss 3.5408 (3.3047) grad_norm 2.0929 (1.9038) [2022-01-24 00:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][620/1251] eta 0:23:19 lr 0.000260 time 2.2322 (2.2177) loss 4.0230 (3.3105) grad_norm 1.8814 (1.9040) [2022-01-24 00:58:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][630/1251] eta 0:22:55 lr 0.000260 time 1.6600 (2.2147) loss 3.0114 (3.3057) grad_norm 1.8798 (1.9046) [2022-01-24 00:59:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][640/1251] eta 0:22:31 lr 0.000260 time 1.9448 (2.2113) loss 3.8232 (3.3080) grad_norm 1.9165 (1.9042) [2022-01-24 00:59:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][650/1251] eta 0:22:08 lr 0.000260 time 2.0518 (2.2107) loss 3.5485 
(3.3114) grad_norm 1.8974 (1.9049) [2022-01-24 01:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][660/1251] eta 0:21:45 lr 0.000260 time 2.2692 (2.2092) loss 3.7449 (3.3144) grad_norm 1.8135 (1.9050) [2022-01-24 01:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][670/1251] eta 0:21:23 lr 0.000260 time 1.7246 (2.2085) loss 2.3614 (3.3109) grad_norm 1.7996 (1.9033) [2022-01-24 01:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][680/1251] eta 0:21:01 lr 0.000260 time 1.7982 (2.2097) loss 3.4371 (3.3123) grad_norm 1.9199 (1.9045) [2022-01-24 01:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][690/1251] eta 0:20:41 lr 0.000260 time 2.7648 (2.2125) loss 3.3127 (3.3131) grad_norm 1.9930 (1.9044) [2022-01-24 01:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][700/1251] eta 0:20:18 lr 0.000259 time 1.8554 (2.2119) loss 3.1268 (3.3114) grad_norm 2.7793 (1.9058) [2022-01-24 01:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][710/1251] eta 0:19:55 lr 0.000259 time 1.5648 (2.2090) loss 2.4347 (3.3144) grad_norm 1.6603 (1.9059) [2022-01-24 01:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][720/1251] eta 0:19:32 lr 0.000259 time 2.4257 (2.2088) loss 2.4258 (3.3164) grad_norm 1.9274 (1.9055) [2022-01-24 01:02:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][730/1251] eta 0:19:09 lr 0.000259 time 1.9800 (2.2065) loss 3.2127 (3.3166) grad_norm 1.6900 (1.9053) [2022-01-24 01:02:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][740/1251] eta 0:18:47 lr 0.000259 time 2.3041 (2.2060) loss 3.6577 (3.3179) grad_norm 1.9150 (1.9058) [2022-01-24 01:03:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][750/1251] eta 0:18:25 lr 0.000259 time 1.8306 (2.2062) loss 3.5034 (3.3186) grad_norm 1.5728 (1.9065) [2022-01-24 01:03:41 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [199/300][760/1251] eta 0:18:04 lr 0.000259 time 1.8498 (2.2084) loss 3.1429 (3.3222) grad_norm 1.9130 (1.9055) [2022-01-24 01:04:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][770/1251] eta 0:17:41 lr 0.000259 time 1.5898 (2.2074) loss 3.4977 (3.3204) grad_norm 2.1622 (1.9070) [2022-01-24 01:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][780/1251] eta 0:17:19 lr 0.000259 time 2.5719 (2.2073) loss 3.5413 (3.3194) grad_norm 1.6900 (1.9085) [2022-01-24 01:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][790/1251] eta 0:16:57 lr 0.000259 time 2.0833 (2.2070) loss 3.4886 (3.3197) grad_norm 1.8941 (1.9089) [2022-01-24 01:05:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][800/1251] eta 0:16:35 lr 0.000259 time 2.4922 (2.2068) loss 4.1419 (3.3185) grad_norm 1.8892 (1.9086) [2022-01-24 01:05:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][810/1251] eta 0:16:13 lr 0.000259 time 1.6351 (2.2066) loss 2.3937 (3.3219) grad_norm 1.7728 (1.9102) [2022-01-24 01:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][820/1251] eta 0:15:51 lr 0.000259 time 2.4976 (2.2079) loss 3.7753 (3.3229) grad_norm 2.0730 (1.9102) [2022-01-24 01:06:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][830/1251] eta 0:15:29 lr 0.000259 time 1.9245 (2.2069) loss 2.3076 (3.3231) grad_norm 1.9841 (1.9113) [2022-01-24 01:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][840/1251] eta 0:15:06 lr 0.000259 time 2.9881 (2.2060) loss 3.0248 (3.3237) grad_norm 1.8974 (1.9115) [2022-01-24 01:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][850/1251] eta 0:14:43 lr 0.000259 time 1.6658 (2.2035) loss 3.5429 (3.3250) grad_norm 1.7090 (1.9120) [2022-01-24 01:07:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][860/1251] eta 0:14:21 lr 0.000259 time 2.4491 (2.2030) loss 3.0801 
(3.3262) grad_norm 1.6420 (1.9124) [2022-01-24 01:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][870/1251] eta 0:13:59 lr 0.000259 time 2.1307 (2.2036) loss 3.6523 (3.3282) grad_norm 1.6731 (1.9121) [2022-01-24 01:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][880/1251] eta 0:13:38 lr 0.000259 time 3.5652 (2.2062) loss 3.7948 (3.3289) grad_norm 1.7988 (1.9117) [2022-01-24 01:08:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][890/1251] eta 0:13:16 lr 0.000259 time 1.7237 (2.2052) loss 2.3047 (3.3282) grad_norm 1.5603 (1.9103) [2022-01-24 01:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][900/1251] eta 0:12:53 lr 0.000259 time 2.1783 (2.2049) loss 2.6289 (3.3285) grad_norm 1.6564 (1.9092) [2022-01-24 01:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][910/1251] eta 0:12:31 lr 0.000259 time 2.2164 (2.2048) loss 3.5575 (3.3269) grad_norm 1.6621 (1.9085) [2022-01-24 01:09:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][920/1251] eta 0:12:09 lr 0.000259 time 2.2403 (2.2035) loss 3.3356 (3.3259) grad_norm 1.7660 (1.9082) [2022-01-24 01:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][930/1251] eta 0:11:46 lr 0.000259 time 2.2713 (2.2018) loss 2.9601 (3.3248) grad_norm 1.8027 (1.9091) [2022-01-24 01:10:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][940/1251] eta 0:11:25 lr 0.000259 time 2.7710 (2.2027) loss 3.6951 (3.3254) grad_norm 2.0805 (1.9110) [2022-01-24 01:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][950/1251] eta 0:11:03 lr 0.000259 time 2.5531 (2.2037) loss 3.1306 (3.3281) grad_norm 2.3960 (1.9127) [2022-01-24 01:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][960/1251] eta 0:10:41 lr 0.000259 time 2.0951 (2.2042) loss 3.6823 (3.3285) grad_norm 1.9060 (1.9124) [2022-01-24 01:11:22 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [199/300][970/1251] eta 0:10:19 lr 0.000259 time 1.8857 (2.2050) loss 3.5145 (3.3303) grad_norm 1.9289 (1.9122) [2022-01-24 01:11:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][980/1251] eta 0:09:57 lr 0.000258 time 1.9312 (2.2040) loss 3.4044 (3.3279) grad_norm 1.9298 (1.9123) [2022-01-24 01:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][990/1251] eta 0:09:35 lr 0.000258 time 2.5573 (2.2035) loss 2.1924 (3.3262) grad_norm 2.1610 (1.9123) [2022-01-24 01:12:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1000/1251] eta 0:09:12 lr 0.000258 time 1.9126 (2.2011) loss 3.7984 (3.3256) grad_norm 1.8129 (1.9111) [2022-01-24 01:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1010/1251] eta 0:08:49 lr 0.000258 time 1.8219 (2.1990) loss 2.6581 (3.3230) grad_norm 1.7740 (1.9119) [2022-01-24 01:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1020/1251] eta 0:08:28 lr 0.000258 time 2.2707 (2.1995) loss 4.0520 (3.3222) grad_norm 1.8903 (1.9137) [2022-01-24 01:13:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1030/1251] eta 0:08:06 lr 0.000258 time 2.0332 (2.2000) loss 2.6687 (3.3202) grad_norm 2.3510 (1.9137) [2022-01-24 01:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1040/1251] eta 0:07:44 lr 0.000258 time 2.0625 (2.2018) loss 3.3328 (3.3180) grad_norm 2.3715 (1.9138) [2022-01-24 01:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1050/1251] eta 0:07:23 lr 0.000258 time 3.0771 (2.2045) loss 3.6990 (3.3183) grad_norm 1.9976 (1.9138) [2022-01-24 01:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1060/1251] eta 0:07:01 lr 0.000258 time 1.9018 (2.2044) loss 3.7772 (3.3192) grad_norm 2.1992 (1.9144) [2022-01-24 01:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1070/1251] eta 0:06:39 lr 0.000258 time 1.8909 (2.2050) loss 
3.7763 (3.3182) grad_norm 1.9030 (1.9150) [2022-01-24 01:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1080/1251] eta 0:06:16 lr 0.000258 time 1.9056 (2.2034) loss 3.9188 (3.3191) grad_norm 2.1320 (1.9148) [2022-01-24 01:15:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1090/1251] eta 0:05:54 lr 0.000258 time 1.8079 (2.2020) loss 3.4894 (3.3198) grad_norm 1.9126 (1.9146) [2022-01-24 01:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1100/1251] eta 0:05:32 lr 0.000258 time 1.8084 (2.2008) loss 3.4712 (3.3174) grad_norm 1.8252 (1.9140) [2022-01-24 01:16:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1110/1251] eta 0:05:10 lr 0.000258 time 2.1783 (2.2011) loss 2.3753 (3.3156) grad_norm 1.7150 (1.9135) [2022-01-24 01:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1120/1251] eta 0:04:48 lr 0.000258 time 2.1923 (2.2011) loss 3.2028 (3.3139) grad_norm 2.1607 (1.9140) [2022-01-24 01:17:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1130/1251] eta 0:04:26 lr 0.000258 time 2.4185 (2.2020) loss 3.4157 (3.3143) grad_norm 1.7903 (1.9143) [2022-01-24 01:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1140/1251] eta 0:04:04 lr 0.000258 time 1.8730 (2.2016) loss 2.6942 (3.3128) grad_norm 1.7423 (1.9144) [2022-01-24 01:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1150/1251] eta 0:03:42 lr 0.000258 time 1.5716 (2.2012) loss 2.4549 (3.3098) grad_norm 1.9701 (1.9145) [2022-01-24 01:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1160/1251] eta 0:03:20 lr 0.000258 time 2.1371 (2.2009) loss 3.1966 (3.3079) grad_norm 2.0792 (1.9139) [2022-01-24 01:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1170/1251] eta 0:02:58 lr 0.000258 time 1.9174 (2.2000) loss 2.8064 (3.3080) grad_norm 1.8289 (1.9140) [2022-01-24 01:18:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1180/1251] eta 0:02:36 lr 0.000258 time 1.4137 (2.1989) loss 2.3981 (3.3082) grad_norm 2.0268 (1.9138) [2022-01-24 01:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1190/1251] eta 0:02:14 lr 0.000258 time 2.2168 (2.2000) loss 2.3776 (3.3060) grad_norm 1.8022 (1.9144) [2022-01-24 01:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1200/1251] eta 0:01:52 lr 0.000258 time 1.6607 (2.2005) loss 3.4951 (3.3076) grad_norm 2.1488 (1.9154) [2022-01-24 01:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1210/1251] eta 0:01:30 lr 0.000258 time 2.5434 (2.2017) loss 3.7509 (3.3100) grad_norm 1.8127 (1.9152) [2022-01-24 01:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1220/1251] eta 0:01:08 lr 0.000258 time 1.4816 (2.2022) loss 3.6619 (3.3098) grad_norm 1.8115 (1.9150) [2022-01-24 01:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1230/1251] eta 0:00:46 lr 0.000258 time 2.4127 (2.2022) loss 3.8552 (3.3093) grad_norm 2.1138 (1.9163) [2022-01-24 01:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1240/1251] eta 0:00:24 lr 0.000258 time 1.2828 (2.2000) loss 3.8419 (3.3094) grad_norm 2.0031 (1.9175) [2022-01-24 01:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [199/300][1250/1251] eta 0:00:02 lr 0.000258 time 1.2044 (2.1936) loss 3.4539 (3.3086) grad_norm 1.6506 (1.9177) [2022-01-24 01:21:25 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 199 training takes 0:45:44 [2022-01-24 01:21:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.497 (18.497) Loss 0.8443 (0.8443) Acc@1 79.785 (79.785) Acc@5 94.531 (94.531) [2022-01-24 01:22:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.940 (3.385) Loss 0.8718 (0.9088) Acc@1 79.199 (77.805) Acc@5 94.141 (94.229) [2022-01-24 01:22:19 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.219 (2.554) Loss 0.9518 (0.9088) Acc@1 77.441 (78.097) Acc@5 93.848 (94.308) [2022-01-24 01:22:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.596 (2.215) Loss 0.8961 (0.9012) Acc@1 79.199 (78.462) Acc@5 94.824 (94.421) [2022-01-24 01:22:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.870 (2.132) Loss 0.8119 (0.9015) Acc@1 79.297 (78.408) Acc@5 95.703 (94.431) [2022-01-24 01:23:00 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.396 Acc@5 94.418 [2022-01-24 01:23:00 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-01-24 01:23:00 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.40% [2022-01-24 01:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][0/1251] eta 6:50:39 lr 0.000258 time 19.6962 (19.6962) loss 3.2415 (3.2415) grad_norm 1.9278 (1.9278) [2022-01-24 01:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][10/1251] eta 1:24:00 lr 0.000257 time 3.0438 (4.0618) loss 2.4177 (3.2070) grad_norm 1.9742 (1.9851) [2022-01-24 01:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][20/1251] eta 1:05:52 lr 0.000257 time 1.9570 (3.2107) loss 2.5977 (3.2205) grad_norm 1.8678 (1.9907) [2022-01-24 01:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][30/1251] eta 0:59:39 lr 0.000257 time 1.5372 (2.9314) loss 3.9714 (3.2794) grad_norm 1.8811 (1.9549) [2022-01-24 01:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][40/1251] eta 0:57:39 lr 0.000257 time 3.5491 (2.8566) loss 3.4939 (3.3233) grad_norm 1.9435 (1.9776) [2022-01-24 01:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][50/1251] eta 0:54:52 lr 0.000257 time 2.1153 (2.7415) loss 2.3271 (3.2905) grad_norm 2.1875 (1.9747) [2022-01-24 01:25:40 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [200/300][60/1251] eta 0:51:57 lr 0.000257 time 1.9317 (2.6177) loss 3.4374 (3.3235) grad_norm 1.9121 (1.9681) [2022-01-24 01:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][70/1251] eta 0:49:59 lr 0.000257 time 1.6294 (2.5397) loss 3.4605 (3.3212) grad_norm 2.0182 (1.9688) [2022-01-24 01:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][80/1251] eta 0:48:55 lr 0.000257 time 3.0834 (2.5064) loss 3.7362 (3.3203) grad_norm 2.4474 (1.9663) [2022-01-24 01:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][90/1251] eta 0:47:38 lr 0.000257 time 1.8889 (2.4620) loss 3.4824 (3.3005) grad_norm 1.7288 (1.9671) [2022-01-24 01:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][100/1251] eta 0:46:20 lr 0.000257 time 1.8866 (2.4154) loss 3.5642 (3.3034) grad_norm 1.8144 (1.9617) [2022-01-24 01:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][110/1251] eta 0:45:26 lr 0.000257 time 1.9274 (2.3892) loss 3.1634 (3.2861) grad_norm 1.9452 (1.9557) [2022-01-24 01:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][120/1251] eta 0:44:42 lr 0.000257 time 2.8940 (2.3717) loss 2.3237 (3.2599) grad_norm 2.0888 (1.9595) [2022-01-24 01:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][130/1251] eta 0:43:57 lr 0.000257 time 1.8732 (2.3527) loss 3.6785 (3.2728) grad_norm 2.8188 (1.9610) [2022-01-24 01:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][140/1251] eta 0:43:16 lr 0.000257 time 1.9422 (2.3371) loss 2.9348 (3.2886) grad_norm 2.1074 (1.9581) [2022-01-24 01:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][150/1251] eta 0:42:37 lr 0.000257 time 2.1816 (2.3226) loss 2.5476 (3.2877) grad_norm 1.5681 (1.9582) [2022-01-24 01:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][160/1251] eta 0:42:08 lr 0.000257 time 2.9731 (2.3176) loss 4.0353 (3.2844) 
grad_norm 1.9644 (1.9581) [2022-01-24 01:29:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][170/1251] eta 0:42:03 lr 0.000257 time 1.5852 (2.3345) loss 2.8716 (3.2933) grad_norm 1.8720 (1.9557) [2022-01-24 01:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][180/1251] eta 0:41:34 lr 0.000257 time 2.4828 (2.3287) loss 3.6175 (3.2980) grad_norm 1.9259 (1.9481) [2022-01-24 01:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][190/1251] eta 0:40:51 lr 0.000257 time 2.0101 (2.3106) loss 3.6845 (3.3067) grad_norm 1.9414 (1.9458) [2022-01-24 01:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][200/1251] eta 0:40:13 lr 0.000257 time 1.9146 (2.2968) loss 3.2081 (3.2953) grad_norm 2.2919 (1.9481) [2022-01-24 01:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][210/1251] eta 0:39:39 lr 0.000257 time 2.1529 (2.2860) loss 2.5666 (3.2939) grad_norm 2.0947 (1.9448) [2022-01-24 01:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][220/1251] eta 0:39:08 lr 0.000257 time 1.9729 (2.2783) loss 3.2945 (3.3003) grad_norm 2.1804 (1.9436) [2022-01-24 01:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][230/1251] eta 0:38:41 lr 0.000257 time 2.2232 (2.2735) loss 2.6352 (3.2967) grad_norm 2.1228 (1.9414) [2022-01-24 01:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][240/1251] eta 0:38:18 lr 0.000257 time 1.5855 (2.2738) loss 2.3409 (3.2800) grad_norm 1.8096 (1.9405) [2022-01-24 01:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][250/1251] eta 0:37:59 lr 0.000257 time 1.5719 (2.2775) loss 3.3807 (3.2758) grad_norm 1.8663 (1.9403) [2022-01-24 01:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][260/1251] eta 0:37:29 lr 0.000257 time 1.6572 (2.2701) loss 3.4585 (3.2795) grad_norm 1.8789 (1.9410) [2022-01-24 01:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [200/300][270/1251] eta 0:37:04 lr 0.000257 time 2.1471 (2.2680) loss 3.6144 (3.2848) grad_norm 2.0696 (1.9400) [2022-01-24 01:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][280/1251] eta 0:36:34 lr 0.000256 time 1.5376 (2.2599) loss 2.9234 (3.2834) grad_norm 1.9068 (1.9403) [2022-01-24 01:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][290/1251] eta 0:36:08 lr 0.000256 time 2.2005 (2.2568) loss 2.5358 (3.2915) grad_norm 1.9035 (1.9428) [2022-01-24 01:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][300/1251] eta 0:35:47 lr 0.000256 time 2.6887 (2.2583) loss 3.2387 (3.2974) grad_norm 1.8579 (1.9502) [2022-01-24 01:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][310/1251] eta 0:35:21 lr 0.000256 time 1.6040 (2.2547) loss 3.3357 (3.2993) grad_norm 1.7594 (1.9497) [2022-01-24 01:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][320/1251] eta 0:34:53 lr 0.000256 time 1.8303 (2.2486) loss 3.5927 (3.2874) grad_norm 1.8155 (1.9464) [2022-01-24 01:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][330/1251] eta 0:34:24 lr 0.000256 time 1.9123 (2.2416) loss 2.5863 (3.2807) grad_norm 1.8470 (1.9447) [2022-01-24 01:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][340/1251] eta 0:34:00 lr 0.000256 time 2.2005 (2.2402) loss 4.0081 (3.2815) grad_norm 2.1046 (1.9453) [2022-01-24 01:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][350/1251] eta 0:33:35 lr 0.000256 time 2.4396 (2.2374) loss 3.2607 (3.2798) grad_norm 1.7589 (1.9448) [2022-01-24 01:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][360/1251] eta 0:33:15 lr 0.000256 time 1.5826 (2.2391) loss 4.1319 (3.2812) grad_norm 2.2863 (1.9461) [2022-01-24 01:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][370/1251] eta 0:32:52 lr 0.000256 time 1.6812 (2.2388) loss 3.2041 (3.2870) 
grad_norm 1.9827 (1.9456) [2022-01-24 01:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][380/1251] eta 0:32:28 lr 0.000256 time 1.8831 (2.2375) loss 3.8360 (3.2930) grad_norm 1.7488 (1.9449) [2022-01-24 01:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][390/1251] eta 0:32:05 lr 0.000256 time 2.0255 (2.2363) loss 3.4199 (3.2912) grad_norm 2.3049 (1.9457) [2022-01-24 01:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][400/1251] eta 0:31:40 lr 0.000256 time 1.7369 (2.2337) loss 3.2392 (3.2962) grad_norm 1.8148 (1.9436) [2022-01-24 01:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][410/1251] eta 0:31:17 lr 0.000256 time 1.9556 (2.2328) loss 3.3736 (3.3004) grad_norm 1.8456 (1.9411) [2022-01-24 01:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][420/1251] eta 0:30:50 lr 0.000256 time 1.8787 (2.2273) loss 2.2346 (3.2970) grad_norm 1.8215 (1.9394) [2022-01-24 01:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][430/1251] eta 0:30:27 lr 0.000256 time 2.2174 (2.2265) loss 3.5316 (3.2983) grad_norm 1.9033 (1.9369) [2022-01-24 01:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][440/1251] eta 0:30:03 lr 0.000256 time 1.5225 (2.2238) loss 4.0743 (3.2942) grad_norm 2.2133 (1.9376) [2022-01-24 01:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][450/1251] eta 0:29:41 lr 0.000256 time 1.7913 (2.2237) loss 3.6798 (3.2965) grad_norm 2.1443 (1.9407) [2022-01-24 01:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][460/1251] eta 0:29:22 lr 0.000256 time 2.1989 (2.2280) loss 3.5651 (3.2983) grad_norm 2.2909 (1.9443) [2022-01-24 01:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][470/1251] eta 0:29:01 lr 0.000256 time 1.6639 (2.2297) loss 3.5096 (3.2952) grad_norm 2.1325 (1.9449) [2022-01-24 01:40:52 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [200/300][480/1251] eta 0:28:38 lr 0.000256 time 1.9024 (2.2291) loss 2.8947 (3.2981) grad_norm 2.0436 (1.9436) [2022-01-24 01:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][490/1251] eta 0:28:13 lr 0.000256 time 1.5340 (2.2249) loss 2.3630 (3.2956) grad_norm 1.8774 (1.9434) [2022-01-24 01:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][500/1251] eta 0:27:49 lr 0.000256 time 1.9573 (2.2236) loss 3.5684 (3.2926) grad_norm 1.7889 (1.9441) [2022-01-24 01:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][510/1251] eta 0:27:25 lr 0.000256 time 2.3529 (2.2209) loss 3.2734 (3.2896) grad_norm 1.8363 (1.9427) [2022-01-24 01:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][520/1251] eta 0:27:03 lr 0.000256 time 1.9606 (2.2215) loss 2.2949 (3.2861) grad_norm 1.8720 (1.9443) [2022-01-24 01:42:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][530/1251] eta 0:26:42 lr 0.000256 time 1.8886 (2.2222) loss 3.9251 (3.2922) grad_norm 1.9810 (1.9445) [2022-01-24 01:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][540/1251] eta 0:26:19 lr 0.000256 time 2.2499 (2.2214) loss 3.5917 (3.2942) grad_norm 1.9967 (1.9444) [2022-01-24 01:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][550/1251] eta 0:25:56 lr 0.000256 time 2.1975 (2.2199) loss 2.6264 (3.2933) grad_norm 1.7362 (1.9451) [2022-01-24 01:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][560/1251] eta 0:25:33 lr 0.000255 time 2.1783 (2.2190) loss 3.5753 (3.2955) grad_norm 2.1447 (1.9462) [2022-01-24 01:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][570/1251] eta 0:25:09 lr 0.000255 time 2.2220 (2.2168) loss 3.0368 (3.2949) grad_norm 1.9295 (1.9462) [2022-01-24 01:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][580/1251] eta 0:24:48 lr 0.000255 time 2.6995 (2.2180) loss 3.6694 (3.2974) 
grad_norm 1.8403 (1.9449) [2022-01-24 01:44:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][590/1251] eta 0:24:27 lr 0.000255 time 1.4878 (2.2197) loss 2.5912 (3.2956) grad_norm 2.2615 (1.9449) [2022-01-24 01:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][600/1251] eta 0:24:04 lr 0.000255 time 2.3074 (2.2189) loss 3.4038 (3.2997) grad_norm 1.7953 (1.9464) [2022-01-24 01:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][610/1251] eta 0:23:41 lr 0.000255 time 2.5685 (2.2173) loss 3.8020 (3.3006) grad_norm 1.7307 (1.9458) [2022-01-24 01:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][620/1251] eta 0:23:18 lr 0.000255 time 2.2967 (2.2171) loss 3.5180 (3.3052) grad_norm 1.8391 (1.9440) [2022-01-24 01:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][630/1251] eta 0:22:56 lr 0.000255 time 1.8225 (2.2171) loss 2.4035 (3.3045) grad_norm 1.7682 (1.9431) [2022-01-24 01:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][640/1251] eta 0:22:34 lr 0.000255 time 1.8863 (2.2174) loss 3.3928 (3.3060) grad_norm 1.7218 (1.9420) [2022-01-24 01:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][650/1251] eta 0:22:12 lr 0.000255 time 1.8751 (2.2168) loss 2.5810 (3.3042) grad_norm 1.9656 (1.9408) [2022-01-24 01:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][660/1251] eta 0:21:48 lr 0.000255 time 2.4717 (2.2141) loss 3.5040 (3.3026) grad_norm 2.2520 (1.9401) [2022-01-24 01:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][670/1251] eta 0:21:24 lr 0.000255 time 1.9918 (2.2102) loss 3.5110 (3.3042) grad_norm 1.6954 (1.9380) [2022-01-24 01:48:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][680/1251] eta 0:21:01 lr 0.000255 time 2.8693 (2.2098) loss 2.9694 (3.3057) grad_norm 1.9420 (1.9367) [2022-01-24 01:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [200/300][690/1251] eta 0:20:39 lr 0.000255 time 2.1644 (2.2087) loss 3.2775 (3.3048) grad_norm 1.9254 (1.9366) [2022-01-24 01:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][700/1251] eta 0:20:16 lr 0.000255 time 1.9701 (2.2082) loss 3.1299 (3.3054) grad_norm 1.8317 (1.9377) [2022-01-24 01:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][710/1251] eta 0:19:55 lr 0.000255 time 2.2985 (2.2096) loss 3.1851 (3.3031) grad_norm 1.6867 (1.9371) [2022-01-24 01:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][720/1251] eta 0:19:33 lr 0.000255 time 2.7944 (2.2108) loss 3.5575 (3.3018) grad_norm 1.8598 (1.9378) [2022-01-24 01:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][730/1251] eta 0:19:11 lr 0.000255 time 1.9952 (2.2101) loss 2.7654 (3.3052) grad_norm 1.8569 (1.9377) [2022-01-24 01:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][740/1251] eta 0:18:49 lr 0.000255 time 2.5592 (2.2109) loss 4.0689 (3.3089) grad_norm 1.8896 (1.9388) [2022-01-24 01:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][750/1251] eta 0:18:28 lr 0.000255 time 2.4454 (2.2123) loss 3.8013 (3.3114) grad_norm 1.9509 (1.9385) [2022-01-24 01:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][760/1251] eta 0:18:04 lr 0.000255 time 1.8838 (2.2096) loss 3.6113 (3.3141) grad_norm 1.7159 (1.9382) [2022-01-24 01:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][770/1251] eta 0:17:41 lr 0.000255 time 1.7394 (2.2073) loss 3.3258 (3.3114) grad_norm 1.7330 (1.9375) [2022-01-24 01:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][780/1251] eta 0:17:19 lr 0.000255 time 1.9907 (2.2062) loss 2.5965 (3.3101) grad_norm 1.8910 (1.9375) [2022-01-24 01:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][790/1251] eta 0:16:57 lr 0.000255 time 2.5635 (2.2070) loss 3.2836 (3.3134) 
grad_norm 1.9813 (1.9376) [2022-01-24 01:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][800/1251] eta 0:16:35 lr 0.000255 time 2.1006 (2.2066) loss 3.5037 (3.3151) grad_norm 1.8567 (1.9362) [2022-01-24 01:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][810/1251] eta 0:16:13 lr 0.000255 time 1.9502 (2.2064) loss 3.7567 (3.3184) grad_norm 1.9707 (1.9372) [2022-01-24 01:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][820/1251] eta 0:15:51 lr 0.000255 time 2.1330 (2.2067) loss 2.5584 (3.3164) grad_norm 1.9748 (1.9375) [2022-01-24 01:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][830/1251] eta 0:15:28 lr 0.000255 time 2.3798 (2.2066) loss 4.1848 (3.3201) grad_norm 2.0306 (1.9365) [2022-01-24 01:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][840/1251] eta 0:15:07 lr 0.000254 time 2.4692 (2.2089) loss 3.7020 (3.3217) grad_norm 1.9634 (1.9363) [2022-01-24 01:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][850/1251] eta 0:14:45 lr 0.000254 time 1.9610 (2.2093) loss 2.6987 (3.3184) grad_norm 2.1714 (1.9357) [2022-01-24 01:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][860/1251] eta 0:14:23 lr 0.000254 time 3.3631 (2.2088) loss 3.4144 (3.3163) grad_norm 1.6814 (1.9351) [2022-01-24 01:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][870/1251] eta 0:14:00 lr 0.000254 time 1.9475 (2.2062) loss 2.9714 (3.3163) grad_norm 2.0933 (1.9362) [2022-01-24 01:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][880/1251] eta 0:13:38 lr 0.000254 time 2.2399 (2.2057) loss 3.6208 (3.3153) grad_norm 1.8027 (1.9376) [2022-01-24 01:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][890/1251] eta 0:13:15 lr 0.000254 time 1.9506 (2.2046) loss 3.0267 (3.3159) grad_norm 1.9760 (1.9378) [2022-01-24 01:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [200/300][900/1251] eta 0:12:53 lr 0.000254 time 2.4087 (2.2049) loss 3.4270 (3.3168) grad_norm 1.7959 (1.9395) [2022-01-24 01:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][910/1251] eta 0:12:31 lr 0.000254 time 2.1947 (2.2049) loss 3.4212 (3.3168) grad_norm 2.0918 (1.9416) [2022-01-24 01:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][920/1251] eta 0:12:09 lr 0.000254 time 2.0681 (2.2049) loss 2.3932 (3.3152) grad_norm 1.9301 (1.9416) [2022-01-24 01:57:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][930/1251] eta 0:11:47 lr 0.000254 time 1.9715 (2.2055) loss 3.5590 (3.3156) grad_norm 2.1393 (1.9414) [2022-01-24 01:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][940/1251] eta 0:11:25 lr 0.000254 time 1.8932 (2.2046) loss 3.7682 (3.3175) grad_norm 1.8815 (1.9434) [2022-01-24 01:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][950/1251] eta 0:11:03 lr 0.000254 time 3.0751 (2.2055) loss 3.3353 (3.3155) grad_norm 1.5431 (1.9438) [2022-01-24 01:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][960/1251] eta 0:10:41 lr 0.000254 time 1.8770 (2.2049) loss 2.6200 (3.3151) grad_norm 1.9472 (1.9434) [2022-01-24 01:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][970/1251] eta 0:10:19 lr 0.000254 time 2.5664 (2.2047) loss 3.6267 (3.3154) grad_norm 2.3479 (1.9448) [2022-01-24 01:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][980/1251] eta 0:09:57 lr 0.000254 time 1.8785 (2.2035) loss 3.5309 (3.3177) grad_norm 1.5779 (1.9446) [2022-01-24 01:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][990/1251] eta 0:09:35 lr 0.000254 time 3.2696 (2.2040) loss 3.6013 (3.3179) grad_norm 2.1574 (1.9458) [2022-01-24 01:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1000/1251] eta 0:09:12 lr 0.000254 time 1.8138 (2.2027) loss 2.4970 (3.3165) 
grad_norm 2.0754 (1.9483) [2022-01-24 02:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1010/1251] eta 0:08:50 lr 0.000254 time 2.0341 (2.2021) loss 3.9962 (3.3174) grad_norm 1.7512 (1.9481) [2022-01-24 02:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1020/1251] eta 0:08:28 lr 0.000254 time 2.0817 (2.2001) loss 3.7850 (3.3173) grad_norm 2.3217 (1.9486) [2022-01-24 02:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1030/1251] eta 0:08:06 lr 0.000254 time 2.8719 (2.2001) loss 4.0194 (3.3179) grad_norm 1.8095 (1.9481) [2022-01-24 02:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1040/1251] eta 0:07:44 lr 0.000254 time 2.1604 (2.1995) loss 3.4572 (3.3170) grad_norm 2.4772 (1.9489) [2022-01-24 02:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1050/1251] eta 0:07:22 lr 0.000254 time 2.4578 (2.2005) loss 3.6806 (3.3166) grad_norm 1.8942 (1.9483) [2022-01-24 02:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1060/1251] eta 0:07:00 lr 0.000254 time 1.9365 (2.2015) loss 3.2168 (3.3139) grad_norm 1.9111 (1.9485) [2022-01-24 02:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1070/1251] eta 0:06:38 lr 0.000254 time 2.8486 (2.2024) loss 3.1802 (3.3151) grad_norm 1.8885 (1.9485) [2022-01-24 02:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1080/1251] eta 0:06:16 lr 0.000254 time 1.8860 (2.2010) loss 3.3227 (3.3163) grad_norm 2.1465 (1.9478) [2022-01-24 02:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1090/1251] eta 0:05:53 lr 0.000254 time 1.9334 (2.1984) loss 3.7355 (3.3178) grad_norm 1.5995 (1.9468) [2022-01-24 02:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1100/1251] eta 0:05:31 lr 0.000254 time 2.2812 (2.1977) loss 3.7768 (3.3167) grad_norm 1.7638 (1.9463) [2022-01-24 02:03:41 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [200/300][1110/1251] eta 0:05:09 lr 0.000254 time 2.1578 (2.1974) loss 3.2915 (3.3170) grad_norm 1.8368 (1.9459) [2022-01-24 02:04:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1120/1251] eta 0:04:47 lr 0.000253 time 2.0284 (2.1964) loss 3.5708 (3.3199) grad_norm 2.2752 (1.9463) [2022-01-24 02:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1130/1251] eta 0:04:25 lr 0.000253 time 2.7939 (2.1973) loss 3.8616 (3.3186) grad_norm 2.3313 (1.9474) [2022-01-24 02:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1140/1251] eta 0:04:03 lr 0.000253 time 2.4831 (2.1974) loss 3.4499 (3.3196) grad_norm 1.7943 (1.9478) [2022-01-24 02:05:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1150/1251] eta 0:03:41 lr 0.000253 time 2.3396 (2.1968) loss 2.5303 (3.3176) grad_norm 1.9643 (1.9481) [2022-01-24 02:05:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1160/1251] eta 0:03:19 lr 0.000253 time 2.4963 (2.1962) loss 3.9610 (3.3196) grad_norm 1.9113 (1.9482) [2022-01-24 02:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1170/1251] eta 0:02:58 lr 0.000253 time 2.7888 (2.1984) loss 3.5050 (3.3210) grad_norm 2.0020 (1.9478) [2022-01-24 02:06:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1180/1251] eta 0:02:36 lr 0.000253 time 2.5654 (2.1986) loss 2.2763 (3.3222) grad_norm 1.8328 (1.9479) [2022-01-24 02:06:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1190/1251] eta 0:02:14 lr 0.000253 time 1.7023 (2.1974) loss 2.8304 (3.3235) grad_norm 1.9037 (1.9485) [2022-01-24 02:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1200/1251] eta 0:01:52 lr 0.000253 time 1.8601 (2.1965) loss 3.7676 (3.3246) grad_norm 2.0415 (1.9487) [2022-01-24 02:07:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1210/1251] eta 0:01:30 lr 0.000253 time 1.5871 (2.1959) loss 
3.3172 (3.3259) grad_norm 1.8978 (1.9504) [2022-01-24 02:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1220/1251] eta 0:01:08 lr 0.000253 time 2.2583 (2.1953) loss 2.8599 (3.3259) grad_norm 1.9273 (1.9502) [2022-01-24 02:08:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1230/1251] eta 0:00:46 lr 0.000253 time 1.9059 (2.1956) loss 3.2827 (3.3265) grad_norm 2.0821 (1.9514) [2022-01-24 02:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1240/1251] eta 0:00:24 lr 0.000253 time 1.2729 (2.1944) loss 3.5071 (3.3260) grad_norm 2.0350 (1.9551) [2022-01-24 02:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [200/300][1250/1251] eta 0:00:02 lr 0.000253 time 1.1757 (2.1891) loss 3.8951 (3.3291) grad_norm 2.0201 (1.9550) [2022-01-24 02:08:39 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 200 training takes 0:45:38 [2022-01-24 02:08:39 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_200 saving...... [2022-01-24 02:08:50 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_200 saved !!! 
[2022-01-24 02:09:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.207 (15.207) Loss 0.9469 (0.9469) Acc@1 78.320 (78.320) Acc@5 93.164 (93.164) [2022-01-24 02:09:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.968 (2.661) Loss 0.9927 (0.9120) Acc@1 76.855 (78.427) Acc@5 93.457 (94.425) [2022-01-24 02:09:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.721 (2.233) Loss 0.8864 (0.9114) Acc@1 79.492 (78.371) Acc@5 94.922 (94.420) [2022-01-24 02:09:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.969 (1.957) Loss 0.9581 (0.9189) Acc@1 78.320 (78.364) Acc@5 93.848 (94.317) [2022-01-24 02:10:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.493 (1.947) Loss 0.8496 (0.9180) Acc@1 80.078 (78.399) Acc@5 94.922 (94.310) [2022-01-24 02:10:18 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.300 Acc@5 94.328 [2022-01-24 02:10:18 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.3% [2022-01-24 02:10:18 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.40% [2022-01-24 02:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][0/1251] eta 7:15:18 lr 0.000253 time 20.8783 (20.8783) loss 2.6863 (2.6863) grad_norm 2.1116 (2.1116) [2022-01-24 02:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][10/1251] eta 1:21:07 lr 0.000253 time 1.5115 (3.9225) loss 3.5487 (3.3144) grad_norm 1.6138 (1.9444) [2022-01-24 02:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][20/1251] eta 1:02:47 lr 0.000253 time 1.4475 (3.0605) loss 3.6222 (3.4419) grad_norm 1.8705 (1.9180) [2022-01-24 02:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][30/1251] eta 0:57:38 lr 0.000253 time 1.8027 (2.8323) loss 3.8460 (3.4422) grad_norm 2.5386 (1.9402) [2022-01-24 02:12:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[201/300][40/1251] eta 0:54:53 lr 0.000253 time 3.8418 (2.7200) loss 2.7741 (3.3685) grad_norm 1.9569 (1.9430) [2022-01-24 02:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][50/1251] eta 0:53:41 lr 0.000253 time 3.1095 (2.6825) loss 3.6686 (3.3347) grad_norm 2.3370 (1.9706) [2022-01-24 02:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][60/1251] eta 0:51:24 lr 0.000253 time 1.5507 (2.5898) loss 3.3975 (3.3255) grad_norm 1.8046 (1.9737) [2022-01-24 02:13:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][70/1251] eta 0:49:57 lr 0.000253 time 1.9036 (2.5381) loss 3.2019 (3.3273) grad_norm 1.7959 (1.9867) [2022-01-24 02:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][80/1251] eta 0:49:11 lr 0.000253 time 3.8860 (2.5204) loss 4.1091 (3.3269) grad_norm 1.9971 (1.9854) [2022-01-24 02:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][90/1251] eta 0:48:21 lr 0.000253 time 1.8765 (2.4992) loss 3.7570 (3.3118) grad_norm 1.9491 (1.9887) [2022-01-24 02:14:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][100/1251] eta 0:47:15 lr 0.000253 time 1.6691 (2.4635) loss 3.3608 (3.3327) grad_norm 1.9394 (1.9861) [2022-01-24 02:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][110/1251] eta 0:46:09 lr 0.000253 time 1.7723 (2.4273) loss 3.6809 (3.3508) grad_norm 1.9914 (1.9772) [2022-01-24 02:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][120/1251] eta 0:45:40 lr 0.000253 time 3.6632 (2.4230) loss 3.5480 (3.3554) grad_norm 2.3952 (1.9796) [2022-01-24 02:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][130/1251] eta 0:44:50 lr 0.000253 time 1.7546 (2.3996) loss 4.0220 (3.3497) grad_norm 2.1135 (1.9686) [2022-01-24 02:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][140/1251] eta 0:44:02 lr 0.000253 time 2.1644 (2.3789) loss 3.8191 (3.3626) grad_norm 1.9737 
(1.9657) [2022-01-24 02:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][150/1251] eta 0:43:12 lr 0.000252 time 1.5806 (2.3545) loss 3.9410 (3.3898) grad_norm 1.8485 (1.9675) [2022-01-24 02:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][160/1251] eta 0:42:36 lr 0.000252 time 2.6005 (2.3436) loss 3.7049 (3.4072) grad_norm 1.8684 (1.9703) [2022-01-24 02:16:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][170/1251] eta 0:42:09 lr 0.000252 time 2.1531 (2.3396) loss 3.2036 (3.4067) grad_norm 2.3485 (1.9790) [2022-01-24 02:17:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][180/1251] eta 0:41:41 lr 0.000252 time 1.9165 (2.3356) loss 3.4436 (3.4059) grad_norm 2.5174 (1.9796) [2022-01-24 02:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][190/1251] eta 0:41:14 lr 0.000252 time 2.1016 (2.3320) loss 3.3205 (3.3967) grad_norm 2.2780 (1.9938) [2022-01-24 02:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][200/1251] eta 0:40:37 lr 0.000252 time 1.9312 (2.3190) loss 4.0465 (3.3946) grad_norm 2.4037 (2.0015) [2022-01-24 02:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][210/1251] eta 0:40:04 lr 0.000252 time 1.9743 (2.3100) loss 3.6987 (3.4016) grad_norm 1.8261 (1.9931) [2022-01-24 02:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][220/1251] eta 0:39:37 lr 0.000252 time 2.2771 (2.3063) loss 3.7655 (3.4021) grad_norm 1.9317 (1.9885) [2022-01-24 02:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][230/1251] eta 0:39:13 lr 0.000252 time 1.9125 (2.3051) loss 3.3225 (3.4030) grad_norm 2.3566 (1.9884) [2022-01-24 02:19:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][240/1251] eta 0:38:43 lr 0.000252 time 1.8581 (2.2986) loss 2.4911 (3.3979) grad_norm 1.8387 (1.9877) [2022-01-24 02:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[201/300][250/1251] eta 0:38:14 lr 0.000252 time 1.8717 (2.2924) loss 3.7212 (3.3872) grad_norm 2.1756 (1.9902) [2022-01-24 02:20:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][260/1251] eta 0:37:44 lr 0.000252 time 1.8910 (2.2854) loss 2.2209 (3.3648) grad_norm 1.5733 (1.9868) [2022-01-24 02:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][270/1251] eta 0:37:19 lr 0.000252 time 2.7255 (2.2830) loss 3.4019 (3.3618) grad_norm 1.8523 (1.9881) [2022-01-24 02:21:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][280/1251] eta 0:36:55 lr 0.000252 time 1.7563 (2.2819) loss 3.4310 (3.3511) grad_norm 1.8601 (1.9825) [2022-01-24 02:21:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][290/1251] eta 0:36:29 lr 0.000252 time 2.0875 (2.2787) loss 3.5795 (3.3439) grad_norm 2.0151 (1.9803) [2022-01-24 02:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][300/1251] eta 0:36:04 lr 0.000252 time 2.2550 (2.2765) loss 3.5524 (3.3413) grad_norm 1.7458 (1.9738) [2022-01-24 02:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][310/1251] eta 0:35:37 lr 0.000252 time 2.5495 (2.2719) loss 3.1028 (3.3431) grad_norm 2.0690 (1.9708) [2022-01-24 02:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][320/1251] eta 0:35:06 lr 0.000252 time 1.6374 (2.2626) loss 2.6404 (3.3353) grad_norm 1.8796 (1.9687) [2022-01-24 02:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][330/1251] eta 0:34:38 lr 0.000252 time 2.4438 (2.2565) loss 3.8446 (3.3404) grad_norm 1.9428 (1.9676) [2022-01-24 02:23:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][340/1251] eta 0:34:17 lr 0.000252 time 2.9764 (2.2590) loss 3.7126 (3.3423) grad_norm 1.7951 (1.9663) [2022-01-24 02:23:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][350/1251] eta 0:33:54 lr 0.000252 time 2.5123 (2.2580) loss 3.7268 (3.3390) grad_norm 
1.9192 (1.9632) [2022-01-24 02:23:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][360/1251] eta 0:33:31 lr 0.000252 time 2.2810 (2.2574) loss 2.7347 (3.3433) grad_norm 1.4797 (1.9617) [2022-01-24 02:24:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][370/1251] eta 0:33:07 lr 0.000252 time 2.2410 (2.2561) loss 3.1707 (3.3446) grad_norm 1.8604 (1.9621) [2022-01-24 02:24:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][380/1251] eta 0:32:45 lr 0.000252 time 2.9371 (2.2562) loss 4.0022 (3.3449) grad_norm 2.4494 (1.9642) [2022-01-24 02:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][390/1251] eta 0:32:21 lr 0.000252 time 2.4856 (2.2550) loss 3.0089 (3.3395) grad_norm 1.9714 (1.9656) [2022-01-24 02:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][400/1251] eta 0:31:56 lr 0.000252 time 1.7192 (2.2516) loss 3.5474 (3.3438) grad_norm 2.2139 (1.9651) [2022-01-24 02:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][410/1251] eta 0:31:30 lr 0.000252 time 1.8032 (2.2476) loss 3.5249 (3.3434) grad_norm 1.9806 (1.9673) [2022-01-24 02:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][420/1251] eta 0:31:03 lr 0.000252 time 1.9450 (2.2424) loss 2.6243 (3.3452) grad_norm 1.7516 (1.9644) [2022-01-24 02:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][430/1251] eta 0:30:37 lr 0.000251 time 2.2046 (2.2384) loss 2.5333 (3.3483) grad_norm 2.0326 (1.9632) [2022-01-24 02:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][440/1251] eta 0:30:14 lr 0.000251 time 2.4728 (2.2371) loss 3.0096 (3.3530) grad_norm 1.8983 (1.9617) [2022-01-24 02:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][450/1251] eta 0:29:53 lr 0.000251 time 2.8604 (2.2392) loss 4.1193 (3.3572) grad_norm 1.8705 (1.9603) [2022-01-24 02:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[201/300][460/1251] eta 0:29:31 lr 0.000251 time 2.0091 (2.2400) loss 3.1065 (3.3553) grad_norm 2.1296 (1.9614) [2022-01-24 02:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][470/1251] eta 0:29:09 lr 0.000251 time 2.1917 (2.2398) loss 3.4901 (3.3565) grad_norm 1.8509 (1.9606) [2022-01-24 02:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][480/1251] eta 0:28:46 lr 0.000251 time 2.4812 (2.2391) loss 3.8848 (3.3554) grad_norm 1.8732 (1.9596) [2022-01-24 02:28:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][490/1251] eta 0:28:23 lr 0.000251 time 2.5331 (2.2383) loss 3.8625 (3.3573) grad_norm 2.0639 (1.9588) [2022-01-24 02:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][500/1251] eta 0:27:58 lr 0.000251 time 1.6798 (2.2346) loss 3.5864 (3.3571) grad_norm 1.6315 (1.9645) [2022-01-24 02:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][510/1251] eta 0:27:35 lr 0.000251 time 2.7418 (2.2338) loss 3.4577 (3.3561) grad_norm 2.0985 (1.9643) [2022-01-24 02:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][520/1251] eta 0:27:09 lr 0.000251 time 1.9683 (2.2291) loss 2.5162 (3.3543) grad_norm 2.1551 (1.9646) [2022-01-24 02:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][530/1251] eta 0:26:46 lr 0.000251 time 2.5227 (2.2285) loss 3.2106 (3.3546) grad_norm 1.7611 (1.9620) [2022-01-24 02:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][540/1251] eta 0:26:24 lr 0.000251 time 2.2311 (2.2281) loss 3.6719 (3.3544) grad_norm 1.6397 (1.9633) [2022-01-24 02:30:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][550/1251] eta 0:26:04 lr 0.000251 time 2.6942 (2.2314) loss 3.1270 (3.3550) grad_norm 1.9867 (1.9641) [2022-01-24 02:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][560/1251] eta 0:25:40 lr 0.000251 time 1.5615 (2.2297) loss 3.4602 (3.3558) grad_norm 
2.2243 (1.9648) [2022-01-24 02:31:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][570/1251] eta 0:25:18 lr 0.000251 time 2.4840 (2.2297) loss 3.5713 (3.3552) grad_norm 2.2638 (1.9631) [2022-01-24 02:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][580/1251] eta 0:24:53 lr 0.000251 time 1.9482 (2.2261) loss 2.5690 (3.3510) grad_norm 1.7643 (1.9627) [2022-01-24 02:32:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][590/1251] eta 0:24:30 lr 0.000251 time 2.5899 (2.2244) loss 3.3008 (3.3457) grad_norm 1.7690 (1.9623) [2022-01-24 02:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][600/1251] eta 0:24:08 lr 0.000251 time 2.1770 (2.2246) loss 3.7532 (3.3482) grad_norm 1.7925 (1.9616) [2022-01-24 02:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][610/1251] eta 0:23:46 lr 0.000251 time 2.2551 (2.2257) loss 3.6625 (3.3509) grad_norm 1.8841 (1.9595) [2022-01-24 02:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][620/1251] eta 0:23:25 lr 0.000251 time 1.8344 (2.2270) loss 4.0302 (3.3496) grad_norm 2.0347 (1.9591) [2022-01-24 02:33:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][630/1251] eta 0:23:02 lr 0.000251 time 2.2848 (2.2263) loss 2.8196 (3.3482) grad_norm 2.2697 (1.9571) [2022-01-24 02:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][640/1251] eta 0:22:40 lr 0.000251 time 3.1935 (2.2266) loss 3.3718 (3.3454) grad_norm 1.8749 (1.9559) [2022-01-24 02:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][650/1251] eta 0:22:16 lr 0.000251 time 1.9477 (2.2245) loss 3.9914 (3.3448) grad_norm 1.9055 (1.9545) [2022-01-24 02:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][660/1251] eta 0:21:53 lr 0.000251 time 2.2689 (2.2222) loss 3.1080 (3.3481) grad_norm 1.7308 (1.9542) [2022-01-24 02:35:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[201/300][670/1251] eta 0:21:29 lr 0.000251 time 1.8388 (2.2203) loss 3.1835 (3.3451) grad_norm 1.7643 (1.9539) [2022-01-24 02:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][680/1251] eta 0:21:08 lr 0.000251 time 3.2156 (2.2216) loss 3.6916 (3.3478) grad_norm 1.8858 (1.9536) [2022-01-24 02:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][690/1251] eta 0:20:46 lr 0.000251 time 2.2440 (2.2217) loss 2.5884 (3.3429) grad_norm 1.8551 (1.9528) [2022-01-24 02:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][700/1251] eta 0:20:24 lr 0.000251 time 2.2019 (2.2223) loss 3.5709 (3.3420) grad_norm 2.0231 (1.9520) [2022-01-24 02:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][710/1251] eta 0:20:01 lr 0.000250 time 2.0690 (2.2211) loss 3.1912 (3.3440) grad_norm 1.9138 (1.9512) [2022-01-24 02:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][720/1251] eta 0:19:38 lr 0.000250 time 2.7443 (2.2192) loss 2.4890 (3.3422) grad_norm 1.9739 (1.9509) [2022-01-24 02:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][730/1251] eta 0:19:15 lr 0.000250 time 1.9632 (2.2171) loss 3.6102 (3.3412) grad_norm 1.6442 (1.9509) [2022-01-24 02:37:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][740/1251] eta 0:18:52 lr 0.000250 time 2.7773 (2.2167) loss 2.2636 (3.3373) grad_norm 1.9174 (1.9495) [2022-01-24 02:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][750/1251] eta 0:18:30 lr 0.000250 time 2.2004 (2.2174) loss 3.5813 (3.3395) grad_norm 1.7932 (1.9505) [2022-01-24 02:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][760/1251] eta 0:18:08 lr 0.000250 time 2.4959 (2.2175) loss 3.8459 (3.3439) grad_norm 1.8269 (1.9494) [2022-01-24 02:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][770/1251] eta 0:17:46 lr 0.000250 time 2.1536 (2.2176) loss 2.4023 (3.3453) grad_norm 
1.8049 (1.9489) [2022-01-24 02:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][780/1251] eta 0:17:24 lr 0.000250 time 2.7806 (2.2178) loss 3.9740 (3.3474) grad_norm 2.1332 (1.9514) [2022-01-24 02:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][790/1251] eta 0:17:02 lr 0.000250 time 1.6196 (2.2174) loss 3.2621 (3.3467) grad_norm 2.0378 (1.9535) [2022-01-24 02:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][800/1251] eta 0:16:40 lr 0.000250 time 2.4747 (2.2183) loss 3.8978 (3.3452) grad_norm 1.8975 (1.9532) [2022-01-24 02:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][810/1251] eta 0:16:17 lr 0.000250 time 2.1616 (2.2168) loss 3.9246 (3.3438) grad_norm 2.3126 (1.9546) [2022-01-24 02:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][820/1251] eta 0:15:54 lr 0.000250 time 1.8497 (2.2139) loss 3.1859 (3.3433) grad_norm 2.2011 (1.9547) [2022-01-24 02:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][830/1251] eta 0:15:31 lr 0.000250 time 2.4967 (2.2131) loss 3.8698 (3.3438) grad_norm 1.8913 (1.9534) [2022-01-24 02:41:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][840/1251] eta 0:15:08 lr 0.000250 time 1.9822 (2.2103) loss 3.7098 (3.3439) grad_norm 1.9275 (1.9527) [2022-01-24 02:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][850/1251] eta 0:14:46 lr 0.000250 time 2.4284 (2.2100) loss 3.0373 (3.3438) grad_norm 1.9636 (1.9525) [2022-01-24 02:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][860/1251] eta 0:14:24 lr 0.000250 time 2.0013 (2.2101) loss 3.6046 (3.3466) grad_norm 1.7917 (1.9515) [2022-01-24 02:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][870/1251] eta 0:14:02 lr 0.000250 time 2.4397 (2.2116) loss 2.7829 (3.3477) grad_norm 2.0126 (1.9522) [2022-01-24 02:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[201/300][880/1251] eta 0:13:40 lr 0.000250 time 2.9658 (2.2117) loss 3.5161 (3.3431) grad_norm 1.8868 (1.9529) [2022-01-24 02:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][890/1251] eta 0:13:18 lr 0.000250 time 1.9309 (2.2117) loss 3.4022 (3.3462) grad_norm 2.0237 (1.9522) [2022-01-24 02:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][900/1251] eta 0:12:56 lr 0.000250 time 1.8238 (2.2115) loss 3.5688 (3.3447) grad_norm 2.0194 (1.9533) [2022-01-24 02:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][910/1251] eta 0:12:33 lr 0.000250 time 2.1692 (2.2110) loss 3.2452 (3.3446) grad_norm 2.2269 (1.9541) [2022-01-24 02:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][920/1251] eta 0:12:11 lr 0.000250 time 2.7210 (2.2109) loss 3.6394 (3.3447) grad_norm 1.8749 (1.9549) [2022-01-24 02:44:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][930/1251] eta 0:11:49 lr 0.000250 time 2.6046 (2.2101) loss 3.6748 (3.3474) grad_norm 2.1250 (1.9561) [2022-01-24 02:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][940/1251] eta 0:11:27 lr 0.000250 time 1.8844 (2.2095) loss 3.5910 (3.3472) grad_norm 1.8448 (1.9557) [2022-01-24 02:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][950/1251] eta 0:11:04 lr 0.000250 time 1.4719 (2.2093) loss 3.2708 (3.3485) grad_norm 1.5983 (1.9558) [2022-01-24 02:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][960/1251] eta 0:10:43 lr 0.000250 time 3.3038 (2.2102) loss 3.4340 (3.3466) grad_norm 1.8378 (1.9563) [2022-01-24 02:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][970/1251] eta 0:10:21 lr 0.000250 time 2.7565 (2.2107) loss 4.1942 (3.3454) grad_norm 2.1613 (1.9565) [2022-01-24 02:46:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][980/1251] eta 0:09:58 lr 0.000250 time 2.2104 (2.2099) loss 4.2507 (3.3464) grad_norm 
1.9755 (1.9572) [2022-01-24 02:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][990/1251] eta 0:09:36 lr 0.000250 time 1.8541 (2.2098) loss 3.4304 (3.3453) grad_norm 1.9036 (1.9580) [2022-01-24 02:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1000/1251] eta 0:09:14 lr 0.000249 time 2.1389 (2.2087) loss 2.9140 (3.3449) grad_norm 1.8386 (1.9578) [2022-01-24 02:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1010/1251] eta 0:08:52 lr 0.000249 time 2.1898 (2.2081) loss 2.3545 (3.3439) grad_norm 1.6334 (1.9568) [2022-01-24 02:47:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1020/1251] eta 0:08:30 lr 0.000249 time 2.1474 (2.2084) loss 3.5366 (3.3422) grad_norm 1.9248 (1.9578) [2022-01-24 02:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1030/1251] eta 0:08:08 lr 0.000249 time 1.6986 (2.2096) loss 2.8056 (3.3418) grad_norm 1.8361 (1.9576) [2022-01-24 02:48:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1040/1251] eta 0:07:46 lr 0.000249 time 2.5072 (2.2102) loss 3.3353 (3.3413) grad_norm 1.7732 (1.9580) [2022-01-24 02:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1050/1251] eta 0:07:24 lr 0.000249 time 2.6142 (2.2096) loss 3.3405 (3.3407) grad_norm 2.0474 (1.9575) [2022-01-24 02:49:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1060/1251] eta 0:07:01 lr 0.000249 time 1.7490 (2.2080) loss 2.2476 (3.3394) grad_norm 2.3072 (1.9584) [2022-01-24 02:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1070/1251] eta 0:06:39 lr 0.000249 time 1.8883 (2.2069) loss 3.4826 (3.3415) grad_norm 1.9056 (1.9578) [2022-01-24 02:50:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1080/1251] eta 0:06:17 lr 0.000249 time 2.2050 (2.2053) loss 3.7995 (3.3415) grad_norm 3.2131 (1.9592) [2022-01-24 02:50:23 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [201/300][1090/1251] eta 0:05:54 lr 0.000249 time 1.5645 (2.2043) loss 2.1572 (3.3394) grad_norm 2.1608 (1.9599) [2022-01-24 02:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1100/1251] eta 0:05:32 lr 0.000249 time 2.2360 (2.2043) loss 3.9543 (3.3396) grad_norm 1.8371 (1.9590) [2022-01-24 02:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1110/1251] eta 0:05:10 lr 0.000249 time 1.8912 (2.2047) loss 2.0553 (3.3387) grad_norm 1.7402 (1.9587) [2022-01-24 02:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1120/1251] eta 0:04:49 lr 0.000249 time 3.4100 (2.2069) loss 3.6306 (3.3370) grad_norm 2.2381 (1.9591) [2022-01-24 02:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1130/1251] eta 0:04:27 lr 0.000249 time 1.6616 (2.2075) loss 3.8575 (3.3377) grad_norm 1.9988 (1.9590) [2022-01-24 02:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1140/1251] eta 0:04:05 lr 0.000249 time 2.9036 (2.2098) loss 3.6407 (3.3396) grad_norm 2.2087 (1.9583) [2022-01-24 02:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1150/1251] eta 0:03:43 lr 0.000249 time 2.2475 (2.2098) loss 2.7069 (3.3378) grad_norm 1.7139 (1.9586) [2022-01-24 02:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1160/1251] eta 0:03:20 lr 0.000249 time 2.2357 (2.2084) loss 2.2430 (3.3351) grad_norm 1.9190 (1.9580) [2022-01-24 02:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1170/1251] eta 0:02:58 lr 0.000249 time 1.9735 (2.2058) loss 3.1584 (3.3356) grad_norm 1.8827 (1.9578) [2022-01-24 02:53:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1180/1251] eta 0:02:36 lr 0.000249 time 1.8476 (2.2039) loss 3.8615 (3.3369) grad_norm 1.8842 (1.9589) [2022-01-24 02:54:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1190/1251] eta 0:02:14 lr 0.000249 time 2.1992 (2.2037) loss 2.7614 
(3.3353) grad_norm 2.0964 (1.9592) [2022-01-24 02:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1200/1251] eta 0:01:52 lr 0.000249 time 2.0579 (2.2032) loss 3.3587 (3.3371) grad_norm 2.0816 (1.9600) [2022-01-24 02:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1210/1251] eta 0:01:30 lr 0.000249 time 2.9319 (2.2036) loss 3.3989 (3.3376) grad_norm 1.8504 (1.9605) [2022-01-24 02:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1220/1251] eta 0:01:08 lr 0.000249 time 2.1063 (2.2047) loss 3.0238 (3.3377) grad_norm 1.9586 (1.9609) [2022-01-24 02:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1230/1251] eta 0:00:46 lr 0.000249 time 2.7879 (2.2050) loss 3.5933 (3.3373) grad_norm 1.9402 (1.9623) [2022-01-24 02:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1240/1251] eta 0:00:24 lr 0.000249 time 1.4775 (2.2046) loss 3.6554 (3.3373) grad_norm 1.8403 (1.9626) [2022-01-24 02:56:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [201/300][1250/1251] eta 0:00:02 lr 0.000249 time 1.3072 (2.2000) loss 3.3968 (3.3373) grad_norm 2.0220 (1.9622) [2022-01-24 02:56:11 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 201 training takes 0:45:52 [2022-01-24 02:56:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.554 (18.554) Loss 1.0418 (1.0418) Acc@1 76.660 (76.660) Acc@5 93.262 (93.262) [2022-01-24 02:56:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.111 (3.293) Loss 0.8872 (0.9133) Acc@1 80.273 (78.516) Acc@5 94.824 (94.460) [2022-01-24 02:57:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.189 (2.679) Loss 0.8688 (0.9116) Acc@1 79.102 (78.367) Acc@5 94.727 (94.438) [2022-01-24 02:57:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.314 (2.265) Loss 0.9614 (0.9157) Acc@1 77.637 (78.292) Acc@5 93.750 (94.352) [2022-01-24 02:57:38 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.686 (2.130) Loss 0.9210 (0.9091) Acc@1 78.418 (78.361) Acc@5 95.312 (94.484) [2022-01-24 02:57:45 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.336 Acc@5 94.490 [2022-01-24 02:57:45 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.3% [2022-01-24 02:57:45 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.40% [2022-01-24 02:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][0/1251] eta 7:27:21 lr 0.000249 time 21.4559 (21.4559) loss 3.4813 (3.4813) grad_norm 1.8247 (1.8247) [2022-01-24 02:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][10/1251] eta 1:25:42 lr 0.000249 time 2.5925 (4.1436) loss 3.4062 (3.0395) grad_norm 1.8915 (1.8853) [2022-01-24 02:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][20/1251] eta 1:05:34 lr 0.000249 time 1.4322 (3.1958) loss 3.4024 (3.1855) grad_norm 1.8547 (1.9768) [2022-01-24 02:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][30/1251] eta 0:58:39 lr 0.000248 time 2.0344 (2.8824) loss 4.0566 (3.3056) grad_norm 1.9063 (1.9874) [2022-01-24 02:59:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][40/1251] eta 0:55:33 lr 0.000248 time 2.8413 (2.7527) loss 3.2687 (3.3545) grad_norm 1.9028 (1.9762) [2022-01-24 03:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][50/1251] eta 0:53:47 lr 0.000248 time 2.0375 (2.6870) loss 3.6930 (3.3893) grad_norm 1.9837 (1.9721) [2022-01-24 03:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][60/1251] eta 0:52:02 lr 0.000248 time 1.6028 (2.6215) loss 3.9079 (3.4075) grad_norm 1.8912 (1.9899) [2022-01-24 03:00:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][70/1251] eta 0:50:22 lr 0.000248 time 2.3164 (2.5591) loss 3.2627 (3.4247) grad_norm 1.9460 (1.9798) [2022-01-24 03:01:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][80/1251] eta 0:48:54 lr 0.000248 time 2.2431 (2.5057) loss 3.6435 (3.4249) grad_norm 2.1115 (1.9802) [2022-01-24 03:01:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][90/1251] eta 0:47:28 lr 0.000248 time 1.9323 (2.4532) loss 2.4842 (3.4221) grad_norm 1.8630 (2.0143) [2022-01-24 03:01:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][100/1251] eta 0:46:13 lr 0.000248 time 1.8511 (2.4096) loss 3.8004 (3.4243) grad_norm 1.8337 (2.0123) [2022-01-24 03:02:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][110/1251] eta 0:45:18 lr 0.000248 time 2.1655 (2.3824) loss 3.2944 (3.4137) grad_norm 1.7820 (2.0044) [2022-01-24 03:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][120/1251] eta 0:44:25 lr 0.000248 time 2.4218 (2.3571) loss 3.5798 (3.3943) grad_norm 1.8955 (2.0046) [2022-01-24 03:02:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][130/1251] eta 0:44:00 lr 0.000248 time 2.9985 (2.3558) loss 2.2373 (3.3838) grad_norm 1.8849 (1.9958) [2022-01-24 03:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][140/1251] eta 0:43:47 lr 0.000248 time 2.6093 (2.3649) loss 3.0669 (3.3711) grad_norm 2.1211 (1.9936) [2022-01-24 03:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][150/1251] eta 0:43:27 lr 0.000248 time 2.0422 (2.3681) loss 3.4932 (3.3416) grad_norm 2.0339 (1.9920) [2022-01-24 03:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][160/1251] eta 0:42:54 lr 0.000248 time 2.1991 (2.3600) loss 3.1565 (3.3439) grad_norm 1.8763 (1.9856) [2022-01-24 03:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][170/1251] eta 0:42:17 lr 0.000248 time 2.5630 (2.3470) loss 3.4241 (3.3344) grad_norm 2.1058 (1.9823) [2022-01-24 03:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][180/1251] eta 0:41:28 lr 0.000248 
time 2.1778 (2.3233) loss 3.9039 (3.3477) grad_norm 2.2636 (1.9845) [2022-01-24 03:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][190/1251] eta 0:40:51 lr 0.000248 time 2.2987 (2.3108) loss 3.8445 (3.3605) grad_norm 1.8338 (1.9867) [2022-01-24 03:05:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][200/1251] eta 0:40:18 lr 0.000248 time 2.4810 (2.3007) loss 3.6532 (3.3561) grad_norm 1.7374 (1.9888) [2022-01-24 03:05:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][210/1251] eta 0:39:46 lr 0.000248 time 2.1723 (2.2930) loss 3.5594 (3.3590) grad_norm 1.6999 (1.9946) [2022-01-24 03:06:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][220/1251] eta 0:39:14 lr 0.000248 time 2.4483 (2.2834) loss 2.5575 (3.3494) grad_norm 2.1123 (1.9990) [2022-01-24 03:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][230/1251] eta 0:38:42 lr 0.000248 time 2.2153 (2.2748) loss 3.1358 (3.3412) grad_norm 2.2151 (2.0064) [2022-01-24 03:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][240/1251] eta 0:38:11 lr 0.000248 time 2.0085 (2.2669) loss 3.4430 (3.3525) grad_norm 1.8728 (2.0040) [2022-01-24 03:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][250/1251] eta 0:37:47 lr 0.000248 time 2.3633 (2.2656) loss 2.9986 (3.3458) grad_norm 1.8999 (1.9998) [2022-01-24 03:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][260/1251] eta 0:37:23 lr 0.000248 time 2.5392 (2.2636) loss 3.5584 (3.3461) grad_norm 1.8005 (1.9988) [2022-01-24 03:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][270/1251] eta 0:36:57 lr 0.000248 time 2.4380 (2.2603) loss 2.8823 (3.3411) grad_norm 1.9860 (2.0035) [2022-01-24 03:08:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][280/1251] eta 0:36:34 lr 0.000248 time 1.8907 (2.2602) loss 3.5255 (3.3356) grad_norm 1.6662 (2.0046) [2022-01-24 03:08:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][290/1251] eta 0:36:23 lr 0.000248 time 3.1245 (2.2719) loss 3.3051 (3.3341) grad_norm 2.0191 (2.0094) [2022-01-24 03:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][300/1251] eta 0:36:07 lr 0.000248 time 3.2584 (2.2790) loss 3.5969 (3.3316) grad_norm 1.7509 (2.0039) [2022-01-24 03:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][310/1251] eta 0:35:46 lr 0.000247 time 2.8707 (2.2812) loss 2.9015 (3.3246) grad_norm 2.0279 (1.9970) [2022-01-24 03:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][320/1251] eta 0:35:19 lr 0.000247 time 1.7669 (2.2768) loss 3.6537 (3.3246) grad_norm 2.0579 (1.9976) [2022-01-24 03:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][330/1251] eta 0:34:53 lr 0.000247 time 1.8360 (2.2725) loss 3.2075 (3.3232) grad_norm 2.0485 (1.9962) [2022-01-24 03:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][340/1251] eta 0:34:24 lr 0.000247 time 1.8493 (2.2658) loss 2.3592 (3.3181) grad_norm 1.8115 (1.9918) [2022-01-24 03:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][350/1251] eta 0:33:59 lr 0.000247 time 2.3979 (2.2634) loss 2.8674 (3.3148) grad_norm 2.2285 (1.9902) [2022-01-24 03:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][360/1251] eta 0:33:36 lr 0.000247 time 2.1515 (2.2637) loss 3.4187 (3.3168) grad_norm 2.0588 (1.9923) [2022-01-24 03:11:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][370/1251] eta 0:33:10 lr 0.000247 time 1.8033 (2.2596) loss 3.1131 (3.3132) grad_norm 1.9869 (1.9905) [2022-01-24 03:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][380/1251] eta 0:32:46 lr 0.000247 time 2.1845 (2.2575) loss 3.6811 (3.3155) grad_norm 2.2801 (1.9892) [2022-01-24 03:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][390/1251] eta 0:32:21 lr 
0.000247 time 1.9111 (2.2549) loss 3.4354 (3.3112) grad_norm 2.1508 (1.9909) [2022-01-24 03:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][400/1251] eta 0:31:58 lr 0.000247 time 2.0835 (2.2539) loss 3.8897 (3.3167) grad_norm 1.9778 (1.9905) [2022-01-24 03:13:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][410/1251] eta 0:31:31 lr 0.000247 time 2.2480 (2.2486) loss 3.4692 (3.3162) grad_norm 2.3809 (1.9918) [2022-01-24 03:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][420/1251] eta 0:31:04 lr 0.000247 time 1.8741 (2.2434) loss 2.9127 (3.3190) grad_norm 1.9953 (1.9914) [2022-01-24 03:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][430/1251] eta 0:30:36 lr 0.000247 time 1.9316 (2.2370) loss 3.3505 (3.3203) grad_norm 1.7296 (1.9926) [2022-01-24 03:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][440/1251] eta 0:30:12 lr 0.000247 time 2.2829 (2.2355) loss 3.5701 (3.3240) grad_norm 2.1523 (1.9926) [2022-01-24 03:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][450/1251] eta 0:29:48 lr 0.000247 time 2.7698 (2.2334) loss 3.1935 (3.3281) grad_norm 1.7908 (1.9901) [2022-01-24 03:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][460/1251] eta 0:29:28 lr 0.000247 time 2.1507 (2.2359) loss 3.7403 (3.3256) grad_norm 2.0370 (1.9892) [2022-01-24 03:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][470/1251] eta 0:29:12 lr 0.000247 time 2.9958 (2.2436) loss 3.7167 (3.3247) grad_norm 1.8971 (1.9866) [2022-01-24 03:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][480/1251] eta 0:28:52 lr 0.000247 time 2.8040 (2.2467) loss 4.2115 (3.3281) grad_norm 2.1318 (1.9863) [2022-01-24 03:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][490/1251] eta 0:28:29 lr 0.000247 time 2.1809 (2.2469) loss 3.6131 (3.3307) grad_norm 1.7496 (1.9875) [2022-01-24 03:16:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][500/1251] eta 0:28:05 lr 0.000247 time 1.9012 (2.2442) loss 3.6654 (3.3302) grad_norm 1.8390 (1.9890) [2022-01-24 03:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][510/1251] eta 0:27:38 lr 0.000247 time 2.2117 (2.2388) loss 3.2717 (3.3319) grad_norm 1.6836 (1.9886) [2022-01-24 03:17:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][520/1251] eta 0:27:13 lr 0.000247 time 2.7191 (2.2344) loss 2.7419 (3.3311) grad_norm 2.0384 (1.9924) [2022-01-24 03:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][530/1251] eta 0:26:48 lr 0.000247 time 2.1590 (2.2310) loss 3.5219 (3.3276) grad_norm 2.2340 (1.9931) [2022-01-24 03:17:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][540/1251] eta 0:26:24 lr 0.000247 time 1.9655 (2.2292) loss 3.6742 (3.3273) grad_norm 1.8284 (1.9922) [2022-01-24 03:18:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][550/1251] eta 0:26:02 lr 0.000247 time 1.6674 (2.2283) loss 3.7329 (3.3325) grad_norm 2.0189 (1.9922) [2022-01-24 03:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][560/1251] eta 0:25:40 lr 0.000247 time 2.8835 (2.2294) loss 3.4341 (3.3304) grad_norm 2.0188 (1.9930) [2022-01-24 03:18:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][570/1251] eta 0:25:19 lr 0.000247 time 3.1283 (2.2314) loss 3.5952 (3.3309) grad_norm 2.0383 (1.9912) [2022-01-24 03:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][580/1251] eta 0:24:57 lr 0.000247 time 2.0340 (2.2311) loss 3.2528 (3.3333) grad_norm 1.7989 (1.9903) [2022-01-24 03:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][590/1251] eta 0:24:34 lr 0.000246 time 1.6491 (2.2307) loss 3.7813 (3.3364) grad_norm 1.7791 (1.9889) [2022-01-24 03:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][600/1251] eta 0:24:13 lr 
0.000246 time 3.3755 (2.2321) loss 3.9449 (3.3389) grad_norm 2.1187 (1.9889) [2022-01-24 03:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][610/1251] eta 0:23:50 lr 0.000246 time 2.0546 (2.2314) loss 3.3947 (3.3390) grad_norm 1.8127 (1.9891) [2022-01-24 03:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][620/1251] eta 0:23:26 lr 0.000246 time 1.9234 (2.2284) loss 3.8375 (3.3401) grad_norm 2.0064 (1.9885) [2022-01-24 03:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][630/1251] eta 0:23:03 lr 0.000246 time 2.2556 (2.2273) loss 2.5809 (3.3361) grad_norm 2.0895 (1.9897) [2022-01-24 03:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][640/1251] eta 0:22:40 lr 0.000246 time 2.2502 (2.2274) loss 3.7180 (3.3393) grad_norm 1.7543 (1.9903) [2022-01-24 03:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][650/1251] eta 0:22:17 lr 0.000246 time 2.4434 (2.2259) loss 2.2448 (3.3318) grad_norm 1.7993 (1.9889) [2022-01-24 03:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][660/1251] eta 0:21:55 lr 0.000246 time 1.5767 (2.2253) loss 3.8456 (3.3294) grad_norm 1.9257 (1.9881) [2022-01-24 03:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][670/1251] eta 0:21:33 lr 0.000246 time 1.8301 (2.2261) loss 3.4342 (3.3348) grad_norm 1.6058 (1.9876) [2022-01-24 03:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][680/1251] eta 0:21:11 lr 0.000246 time 2.4083 (2.2265) loss 2.3783 (3.3342) grad_norm 1.7402 (1.9889) [2022-01-24 03:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][690/1251] eta 0:20:49 lr 0.000246 time 3.7183 (2.2280) loss 3.3363 (3.3374) grad_norm 1.6298 (1.9869) [2022-01-24 03:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][700/1251] eta 0:20:26 lr 0.000246 time 1.9781 (2.2256) loss 3.6996 (3.3361) grad_norm 2.3877 (1.9859) [2022-01-24 03:24:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][710/1251] eta 0:20:01 lr 0.000246 time 1.8854 (2.2218) loss 2.3650 (3.3334) grad_norm 1.5838 (1.9858) [2022-01-24 03:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][720/1251] eta 0:19:38 lr 0.000246 time 1.9382 (2.2199) loss 3.2063 (3.3296) grad_norm 1.8144 (1.9855) [2022-01-24 03:24:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][730/1251] eta 0:19:16 lr 0.000246 time 3.0117 (2.2202) loss 3.4017 (3.3297) grad_norm 1.8722 (1.9841) [2022-01-24 03:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][740/1251] eta 0:18:54 lr 0.000246 time 2.2753 (2.2202) loss 2.8932 (3.3273) grad_norm 2.0743 (1.9854) [2022-01-24 03:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][750/1251] eta 0:18:33 lr 0.000246 time 2.1087 (2.2221) loss 3.8701 (3.3303) grad_norm 2.2930 (1.9868) [2022-01-24 03:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][760/1251] eta 0:18:11 lr 0.000246 time 2.3172 (2.2226) loss 2.9453 (3.3298) grad_norm 1.9078 (1.9860) [2022-01-24 03:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][770/1251] eta 0:17:47 lr 0.000246 time 1.7408 (2.2200) loss 3.0515 (3.3275) grad_norm 1.8064 (1.9865) [2022-01-24 03:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][780/1251] eta 0:17:23 lr 0.000246 time 1.8700 (2.2164) loss 2.9948 (3.3267) grad_norm 1.8148 (1.9879) [2022-01-24 03:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][790/1251] eta 0:17:00 lr 0.000246 time 2.0548 (2.2141) loss 2.3783 (3.3269) grad_norm 1.7908 (1.9871) [2022-01-24 03:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][800/1251] eta 0:16:38 lr 0.000246 time 2.2484 (2.2137) loss 3.9758 (3.3278) grad_norm 2.1057 (1.9872) [2022-01-24 03:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][810/1251] eta 0:16:15 lr 
0.000246 time 1.5194 (2.2131) loss 3.5816 (3.3286) grad_norm 1.7638 (1.9875) [2022-01-24 03:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][820/1251] eta 0:15:54 lr 0.000246 time 2.7623 (2.2150) loss 3.9723 (3.3294) grad_norm 2.7210 (1.9919) [2022-01-24 03:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][830/1251] eta 0:15:32 lr 0.000246 time 1.7985 (2.2145) loss 2.9899 (3.3271) grad_norm 2.0549 (1.9916) [2022-01-24 03:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][840/1251] eta 0:15:10 lr 0.000246 time 2.6228 (2.2162) loss 3.1036 (3.3263) grad_norm 1.8903 (1.9923) [2022-01-24 03:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][850/1251] eta 0:14:49 lr 0.000246 time 1.5029 (2.2175) loss 3.3540 (3.3250) grad_norm 1.7756 (1.9920) [2022-01-24 03:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][860/1251] eta 0:14:26 lr 0.000246 time 1.6279 (2.2165) loss 3.6887 (3.3227) grad_norm 1.8400 (1.9919) [2022-01-24 03:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][870/1251] eta 0:14:04 lr 0.000245 time 1.5863 (2.2156) loss 3.4286 (3.3214) grad_norm 1.8305 (1.9900) [2022-01-24 03:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][880/1251] eta 0:13:41 lr 0.000245 time 2.1735 (2.2146) loss 3.8029 (3.3200) grad_norm 2.0434 (1.9917) [2022-01-24 03:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][890/1251] eta 0:13:19 lr 0.000245 time 2.2564 (2.2155) loss 3.6774 (3.3184) grad_norm 1.8750 (1.9908) [2022-01-24 03:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][900/1251] eta 0:12:57 lr 0.000245 time 1.8643 (2.2146) loss 3.0314 (3.3174) grad_norm 1.8992 (1.9919) [2022-01-24 03:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][910/1251] eta 0:12:34 lr 0.000245 time 1.7480 (2.2135) loss 2.8832 (3.3150) grad_norm 1.7459 (1.9919) [2022-01-24 03:31:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][920/1251] eta 0:12:12 lr 0.000245 time 2.2567 (2.2131) loss 3.6408 (3.3151) grad_norm 2.1422 (1.9918) [2022-01-24 03:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][930/1251] eta 0:11:49 lr 0.000245 time 2.1493 (2.2115) loss 2.9023 (3.3126) grad_norm 1.9331 (1.9919) [2022-01-24 03:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][940/1251] eta 0:11:27 lr 0.000245 time 1.8324 (2.2101) loss 3.2535 (3.3135) grad_norm 1.9577 (1.9912) [2022-01-24 03:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][950/1251] eta 0:11:05 lr 0.000245 time 2.1448 (2.2107) loss 3.5071 (3.3140) grad_norm 2.0501 (1.9912) [2022-01-24 03:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][960/1251] eta 0:10:43 lr 0.000245 time 2.3198 (2.2115) loss 4.0189 (3.3147) grad_norm 1.9435 (1.9915) [2022-01-24 03:33:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][970/1251] eta 0:10:21 lr 0.000245 time 1.9113 (2.2117) loss 2.7252 (3.3146) grad_norm 1.8525 (1.9918) [2022-01-24 03:33:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][980/1251] eta 0:09:59 lr 0.000245 time 1.6095 (2.2130) loss 3.4419 (3.3140) grad_norm 1.8120 (1.9910) [2022-01-24 03:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][990/1251] eta 0:09:37 lr 0.000245 time 1.8482 (2.2139) loss 3.3729 (3.3138) grad_norm 2.1357 (1.9907) [2022-01-24 03:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1000/1251] eta 0:09:15 lr 0.000245 time 1.5706 (2.2125) loss 3.5960 (3.3132) grad_norm 1.8619 (1.9897) [2022-01-24 03:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1010/1251] eta 0:08:53 lr 0.000245 time 1.8240 (2.2128) loss 2.8644 (3.3160) grad_norm 1.9261 (1.9897) [2022-01-24 03:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1020/1251] eta 0:08:30 lr 
0.000245 time 2.1478 (2.2118) loss 1.9287 (3.3131) grad_norm 1.8492 (1.9895) [2022-01-24 03:35:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1030/1251] eta 0:08:08 lr 0.000245 time 1.9908 (2.2114) loss 3.5553 (3.3141) grad_norm 2.1146 (1.9893) [2022-01-24 03:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1040/1251] eta 0:07:46 lr 0.000245 time 1.9174 (2.2102) loss 2.4231 (3.3112) grad_norm 2.3228 (1.9910) [2022-01-24 03:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1050/1251] eta 0:07:24 lr 0.000245 time 2.0388 (2.2109) loss 3.6478 (3.3124) grad_norm 1.8416 (1.9908) [2022-01-24 03:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1060/1251] eta 0:07:02 lr 0.000245 time 2.0467 (2.2097) loss 3.6511 (3.3126) grad_norm 2.3312 (1.9913) [2022-01-24 03:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1070/1251] eta 0:06:39 lr 0.000245 time 2.2221 (2.2096) loss 3.0358 (3.3127) grad_norm 2.0891 (1.9920) [2022-01-24 03:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1080/1251] eta 0:06:17 lr 0.000245 time 1.8083 (2.2101) loss 3.3055 (3.3145) grad_norm 1.9515 (1.9913) [2022-01-24 03:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1090/1251] eta 0:05:55 lr 0.000245 time 2.2382 (2.2100) loss 3.7270 (3.3166) grad_norm 2.0695 (1.9916) [2022-01-24 03:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1100/1251] eta 0:05:33 lr 0.000245 time 2.2917 (2.2098) loss 3.3022 (3.3152) grad_norm 1.9920 (1.9927) [2022-01-24 03:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1110/1251] eta 0:05:11 lr 0.000245 time 2.4026 (2.2084) loss 3.8371 (3.3126) grad_norm 2.1816 (1.9929) [2022-01-24 03:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1120/1251] eta 0:04:49 lr 0.000245 time 2.3182 (2.2070) loss 3.7550 (3.3131) grad_norm 1.9212 (1.9930) [2022-01-24 
03:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1130/1251] eta 0:04:26 lr 0.000245 time 2.2512 (2.2063) loss 3.5590 (3.3144) grad_norm 2.0410 (1.9916) [2022-01-24 03:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1140/1251] eta 0:04:04 lr 0.000245 time 1.9246 (2.2051) loss 4.0780 (3.3130) grad_norm 1.7384 (1.9901) [2022-01-24 03:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1150/1251] eta 0:03:42 lr 0.000245 time 3.3547 (2.2066) loss 2.3861 (3.3133) grad_norm 1.9188 (1.9906) [2022-01-24 03:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1160/1251] eta 0:03:20 lr 0.000244 time 1.9006 (2.2081) loss 2.2987 (3.3136) grad_norm 1.8188 (1.9916) [2022-01-24 03:40:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1170/1251] eta 0:02:58 lr 0.000244 time 2.6316 (2.2088) loss 3.7762 (3.3150) grad_norm 2.0050 (1.9909) [2022-01-24 03:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1180/1251] eta 0:02:36 lr 0.000244 time 1.7315 (2.2080) loss 3.2411 (3.3164) grad_norm 1.8340 (1.9905) [2022-01-24 03:41:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1190/1251] eta 0:02:14 lr 0.000244 time 2.5719 (2.2081) loss 3.6248 (3.3157) grad_norm 1.7962 (1.9898) [2022-01-24 03:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1200/1251] eta 0:01:52 lr 0.000244 time 1.6304 (2.2068) loss 3.4094 (3.3150) grad_norm 2.2142 (1.9896) [2022-01-24 03:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1210/1251] eta 0:01:30 lr 0.000244 time 2.2852 (2.2063) loss 3.5900 (3.3144) grad_norm 1.8319 (1.9897) [2022-01-24 03:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1220/1251] eta 0:01:08 lr 0.000244 time 1.7632 (2.2057) loss 3.8597 (3.3154) grad_norm 1.8507 (1.9892) [2022-01-24 03:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1230/1251] 
eta 0:00:46 lr 0.000244 time 2.7570 (2.2059) loss 3.8484 (3.3184) grad_norm 1.9245 (1.9888) [2022-01-24 03:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1240/1251] eta 0:00:24 lr 0.000244 time 1.4343 (2.2052) loss 2.5338 (3.3169) grad_norm 1.8294 (1.9885) [2022-01-24 03:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [202/300][1250/1251] eta 0:00:02 lr 0.000244 time 1.1611 (2.1998) loss 3.4526 (3.3151) grad_norm 1.7524 (1.9873) [2022-01-24 03:43:38 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 202 training takes 0:45:52 [2022-01-24 03:43:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.799 (18.799) Loss 0.8968 (0.8968) Acc@1 78.711 (78.711) Acc@5 94.531 (94.531) [2022-01-24 03:44:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.943 (3.389) Loss 0.8112 (0.9340) Acc@1 80.078 (78.178) Acc@5 95.801 (94.309) [2022-01-24 03:44:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.268 (2.589) Loss 0.8592 (0.9180) Acc@1 79.395 (78.339) Acc@5 94.531 (94.294) [2022-01-24 03:44:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.620 (2.264) Loss 0.8658 (0.9141) Acc@1 80.078 (78.380) Acc@5 94.727 (94.358) [2022-01-24 03:45:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.787 (2.177) Loss 0.9064 (0.9095) Acc@1 78.711 (78.442) Acc@5 93.848 (94.424) [2022-01-24 03:45:14 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.466 Acc@5 94.446 [2022-01-24 03:45:14 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-01-24 03:45:14 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.47% [2022-01-24 03:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][0/1251] eta 7:24:20 lr 0.000244 time 21.3116 (21.3116) loss 2.8047 (2.8047) grad_norm 1.8881 (1.8881) [2022-01-24 03:45:57 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [203/300][10/1251] eta 1:21:34 lr 0.000244 time 1.8370 (3.9436) loss 3.4103 (3.3696) grad_norm 1.9110 (1.9181) [2022-01-24 03:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][20/1251] eta 1:03:45 lr 0.000244 time 1.4167 (3.1075) loss 3.3636 (3.4094) grad_norm 1.9330 (1.9350) [2022-01-24 03:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][30/1251] eta 0:57:23 lr 0.000244 time 2.2202 (2.8206) loss 2.8135 (3.4208) grad_norm 2.1761 (1.9479) [2022-01-24 03:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][40/1251] eta 0:54:23 lr 0.000244 time 3.8325 (2.6952) loss 3.6493 (3.3939) grad_norm 1.9549 (1.9456) [2022-01-24 03:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][50/1251] eta 0:52:10 lr 0.000244 time 1.9360 (2.6062) loss 3.5785 (3.3403) grad_norm 2.4237 (1.9570) [2022-01-24 03:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][60/1251] eta 0:50:01 lr 0.000244 time 1.7396 (2.5200) loss 3.6655 (3.3555) grad_norm 1.7788 (1.9593) [2022-01-24 03:48:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][70/1251] eta 0:49:06 lr 0.000244 time 2.2601 (2.4952) loss 3.1800 (3.3262) grad_norm 2.0605 (1.9796) [2022-01-24 03:48:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][80/1251] eta 0:48:10 lr 0.000244 time 3.5688 (2.4687) loss 3.7576 (3.3400) grad_norm 2.3004 (1.9864) [2022-01-24 03:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][90/1251] eta 0:47:10 lr 0.000244 time 1.8932 (2.4380) loss 3.7567 (3.3287) grad_norm 1.7735 (1.9757) [2022-01-24 03:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][100/1251] eta 0:46:09 lr 0.000244 time 1.6405 (2.4058) loss 3.6439 (3.3324) grad_norm 2.0430 (1.9729) [2022-01-24 03:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][110/1251] eta 0:45:29 lr 0.000244 time 1.8739 (2.3924) loss 2.8219 (3.3442) grad_norm 
1.8104 (1.9783) [2022-01-24 03:50:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][120/1251] eta 0:44:57 lr 0.000244 time 3.5245 (2.3852) loss 3.7959 (3.3401) grad_norm 2.1653 (1.9809) [2022-01-24 03:50:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][130/1251] eta 0:44:13 lr 0.000244 time 1.6067 (2.3670) loss 3.8046 (3.3362) grad_norm 2.1709 (1.9790) [2022-01-24 03:50:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][140/1251] eta 0:43:26 lr 0.000244 time 1.9943 (2.3458) loss 3.8902 (3.3429) grad_norm 2.0290 (1.9785) [2022-01-24 03:51:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][150/1251] eta 0:42:50 lr 0.000244 time 1.7960 (2.3351) loss 3.8967 (3.3508) grad_norm 1.8395 (1.9752) [2022-01-24 03:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][160/1251] eta 0:42:10 lr 0.000244 time 2.8892 (2.3196) loss 3.4102 (3.3521) grad_norm 1.6587 (1.9788) [2022-01-24 03:51:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][170/1251] eta 0:41:41 lr 0.000244 time 2.5160 (2.3145) loss 3.7296 (3.3625) grad_norm 2.1563 (1.9736) [2022-01-24 03:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][180/1251] eta 0:41:11 lr 0.000244 time 2.2380 (2.3073) loss 3.3183 (3.3572) grad_norm 2.1080 (1.9740) [2022-01-24 03:52:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][190/1251] eta 0:40:37 lr 0.000243 time 1.9229 (2.2969) loss 3.5022 (3.3720) grad_norm 1.9359 (1.9698) [2022-01-24 03:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][200/1251] eta 0:40:09 lr 0.000243 time 2.5684 (2.2928) loss 4.2461 (3.3739) grad_norm 2.0285 (1.9697) [2022-01-24 03:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][210/1251] eta 0:39:33 lr 0.000243 time 2.6537 (2.2801) loss 2.9654 (3.3706) grad_norm 2.0681 (1.9740) [2022-01-24 03:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[203/300][220/1251] eta 0:39:03 lr 0.000243 time 2.0136 (2.2733) loss 3.0147 (3.3837) grad_norm 1.8392 (1.9688) [2022-01-24 03:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][230/1251] eta 0:38:36 lr 0.000243 time 2.2716 (2.2687) loss 2.5418 (3.3750) grad_norm 2.6751 (1.9733) [2022-01-24 03:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][240/1251] eta 0:38:08 lr 0.000243 time 2.5231 (2.2636) loss 3.0843 (3.3797) grad_norm 2.3116 (1.9767) [2022-01-24 03:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][250/1251] eta 0:37:41 lr 0.000243 time 1.6110 (2.2596) loss 2.2113 (3.3616) grad_norm 1.8340 (1.9720) [2022-01-24 03:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][260/1251] eta 0:37:24 lr 0.000243 time 3.0283 (2.2644) loss 3.2576 (3.3582) grad_norm 1.7602 (1.9737) [2022-01-24 03:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][270/1251] eta 0:36:56 lr 0.000243 time 2.2868 (2.2590) loss 3.5631 (3.3471) grad_norm 1.9035 (1.9725) [2022-01-24 03:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][280/1251] eta 0:36:26 lr 0.000243 time 1.8468 (2.2522) loss 3.7553 (3.3437) grad_norm 2.1861 (1.9725) [2022-01-24 03:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][290/1251] eta 0:35:57 lr 0.000243 time 1.9230 (2.2446) loss 3.3630 (3.3509) grad_norm 1.9051 (1.9738) [2022-01-24 03:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][300/1251] eta 0:35:34 lr 0.000243 time 2.0167 (2.2440) loss 3.3608 (3.3364) grad_norm 1.8122 (1.9748) [2022-01-24 03:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][310/1251] eta 0:35:08 lr 0.000243 time 2.4526 (2.2406) loss 3.1306 (3.3274) grad_norm 1.9831 (1.9705) [2022-01-24 03:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][320/1251] eta 0:34:49 lr 0.000243 time 2.1298 (2.2443) loss 3.3130 (3.3263) grad_norm 
1.9285 (1.9712) [2022-01-24 03:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][330/1251] eta 0:34:25 lr 0.000243 time 1.6988 (2.2428) loss 3.4367 (3.3261) grad_norm 1.8532 (1.9681) [2022-01-24 03:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][340/1251] eta 0:34:04 lr 0.000243 time 2.1348 (2.2447) loss 2.8922 (3.3262) grad_norm 1.8894 (1.9677) [2022-01-24 03:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][350/1251] eta 0:33:39 lr 0.000243 time 2.1910 (2.2413) loss 3.2562 (3.3184) grad_norm 1.9227 (1.9687) [2022-01-24 03:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][360/1251] eta 0:33:13 lr 0.000243 time 2.1241 (2.2378) loss 2.4171 (3.3149) grad_norm 2.0757 (1.9713) [2022-01-24 03:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][370/1251] eta 0:32:45 lr 0.000243 time 1.5210 (2.2308) loss 3.5663 (3.3166) grad_norm 2.1151 (1.9714) [2022-01-24 03:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][380/1251] eta 0:32:19 lr 0.000243 time 1.8392 (2.2264) loss 3.5166 (3.3140) grad_norm 1.8092 (1.9711) [2022-01-24 03:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][390/1251] eta 0:31:55 lr 0.000243 time 2.1915 (2.2250) loss 3.0834 (3.3157) grad_norm 1.7733 (1.9705) [2022-01-24 04:00:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][400/1251] eta 0:31:36 lr 0.000243 time 2.8131 (2.2284) loss 3.0998 (3.3173) grad_norm 1.7096 (1.9715) [2022-01-24 04:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][410/1251] eta 0:31:12 lr 0.000243 time 1.6380 (2.2271) loss 3.2303 (3.3145) grad_norm 2.3727 (1.9752) [2022-01-24 04:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][420/1251] eta 0:30:53 lr 0.000243 time 1.7109 (2.2309) loss 3.5802 (3.3213) grad_norm 2.0518 (1.9767) [2022-01-24 04:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[203/300][430/1251] eta 0:30:29 lr 0.000243 time 1.9535 (2.2288) loss 3.8622 (3.3205) grad_norm 1.7517 (1.9758) [2022-01-24 04:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][440/1251] eta 0:30:07 lr 0.000243 time 2.7626 (2.2282) loss 3.6935 (3.3233) grad_norm 1.7599 (1.9747) [2022-01-24 04:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][450/1251] eta 0:29:43 lr 0.000243 time 2.2404 (2.2269) loss 3.8962 (3.3209) grad_norm 1.9853 (1.9744) [2022-01-24 04:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][460/1251] eta 0:29:21 lr 0.000243 time 1.5985 (2.2267) loss 2.3244 (3.3202) grad_norm 2.0617 (1.9735) [2022-01-24 04:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][470/1251] eta 0:28:58 lr 0.000243 time 2.4999 (2.2256) loss 3.3982 (3.3201) grad_norm 2.7387 (1.9800) [2022-01-24 04:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][480/1251] eta 0:28:33 lr 0.000242 time 1.8639 (2.2220) loss 3.9224 (3.3229) grad_norm 1.7353 (1.9785) [2022-01-24 04:03:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][490/1251] eta 0:28:07 lr 0.000242 time 1.6035 (2.2177) loss 2.6622 (3.3177) grad_norm 1.9944 (1.9773) [2022-01-24 04:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][500/1251] eta 0:27:47 lr 0.000242 time 1.8871 (2.2198) loss 1.9228 (3.3098) grad_norm 1.8954 (1.9796) [2022-01-24 04:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][510/1251] eta 0:27:26 lr 0.000242 time 3.0594 (2.2221) loss 3.6320 (3.3104) grad_norm 1.8462 (1.9805) [2022-01-24 04:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][520/1251] eta 0:27:04 lr 0.000242 time 1.5221 (2.2227) loss 3.7271 (3.3106) grad_norm 1.9086 (1.9800) [2022-01-24 04:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][530/1251] eta 0:26:40 lr 0.000242 time 1.4841 (2.2201) loss 2.8341 (3.3135) grad_norm 
1.8541 (1.9808) [2022-01-24 04:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][540/1251] eta 0:26:15 lr 0.000242 time 1.8420 (2.2163) loss 3.4514 (3.3117) grad_norm 1.9482 (1.9808) [2022-01-24 04:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][550/1251] eta 0:25:50 lr 0.000242 time 1.9096 (2.2124) loss 3.4665 (3.3110) grad_norm 2.0031 (1.9795) [2022-01-24 04:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][560/1251] eta 0:25:28 lr 0.000242 time 2.3421 (2.2114) loss 3.9936 (3.3107) grad_norm 1.9137 (1.9788) [2022-01-24 04:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][570/1251] eta 0:25:07 lr 0.000242 time 1.4793 (2.2130) loss 3.7954 (3.3158) grad_norm 2.1411 (1.9781) [2022-01-24 04:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][580/1251] eta 0:24:45 lr 0.000242 time 2.4535 (2.2134) loss 2.9645 (3.3151) grad_norm 2.4806 (1.9788) [2022-01-24 04:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][590/1251] eta 0:24:22 lr 0.000242 time 1.9517 (2.2132) loss 3.6786 (3.3187) grad_norm 2.1060 (1.9849) [2022-01-24 04:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][600/1251] eta 0:24:01 lr 0.000242 time 2.9557 (2.2144) loss 4.0408 (3.3213) grad_norm 1.9153 (1.9864) [2022-01-24 04:07:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][610/1251] eta 0:23:39 lr 0.000242 time 1.8771 (2.2152) loss 3.5650 (3.3235) grad_norm 2.0905 (1.9865) [2022-01-24 04:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][620/1251] eta 0:23:17 lr 0.000242 time 1.6379 (2.2150) loss 2.3332 (3.3212) grad_norm 1.9738 (1.9881) [2022-01-24 04:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][630/1251] eta 0:22:54 lr 0.000242 time 1.9862 (2.2129) loss 2.3133 (3.3213) grad_norm 2.1885 (1.9885) [2022-01-24 04:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[203/300][640/1251] eta 0:22:30 lr 0.000242 time 2.2380 (2.2109) loss 3.6150 (3.3215) grad_norm 1.7054 (1.9872) [2022-01-24 04:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][650/1251] eta 0:22:08 lr 0.000242 time 2.1258 (2.2097) loss 3.7106 (3.3231) grad_norm 2.0371 (1.9860) [2022-01-24 04:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][660/1251] eta 0:21:44 lr 0.000242 time 2.0455 (2.2080) loss 3.0089 (3.3222) grad_norm 1.7449 (1.9850) [2022-01-24 04:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][670/1251] eta 0:21:22 lr 0.000242 time 2.1336 (2.2073) loss 3.8669 (3.3208) grad_norm 2.2785 (1.9858) [2022-01-24 04:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][680/1251] eta 0:21:00 lr 0.000242 time 2.1842 (2.2080) loss 2.5153 (3.3205) grad_norm 1.8914 (1.9850) [2022-01-24 04:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][690/1251] eta 0:20:38 lr 0.000242 time 2.4449 (2.2085) loss 3.5671 (3.3206) grad_norm 2.0233 (1.9847) [2022-01-24 04:11:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][700/1251] eta 0:20:17 lr 0.000242 time 2.2556 (2.2091) loss 3.0408 (3.3233) grad_norm 2.0611 (1.9842) [2022-01-24 04:11:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][710/1251] eta 0:19:57 lr 0.000242 time 3.3709 (2.2132) loss 3.1066 (3.3263) grad_norm 2.1021 (1.9839) [2022-01-24 04:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][720/1251] eta 0:19:35 lr 0.000242 time 3.1976 (2.2133) loss 3.4851 (3.3262) grad_norm 1.8182 (1.9832) [2022-01-24 04:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][730/1251] eta 0:19:11 lr 0.000242 time 1.7480 (2.2092) loss 3.4906 (3.3273) grad_norm 1.9500 (1.9837) [2022-01-24 04:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][740/1251] eta 0:18:46 lr 0.000242 time 1.9872 (2.2049) loss 3.5969 (3.3296) grad_norm 
1.8386 (1.9834) [2022-01-24 04:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][750/1251] eta 0:18:23 lr 0.000242 time 2.1352 (2.2019) loss 2.6945 (3.3291) grad_norm 1.9600 (1.9837) [2022-01-24 04:13:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][760/1251] eta 0:17:59 lr 0.000241 time 1.9049 (2.1992) loss 4.0396 (3.3299) grad_norm 2.1289 (1.9840) [2022-01-24 04:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][770/1251] eta 0:17:37 lr 0.000241 time 2.7546 (2.1993) loss 2.8713 (3.3323) grad_norm 1.9380 (1.9830) [2022-01-24 04:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][780/1251] eta 0:17:15 lr 0.000241 time 1.5320 (2.1986) loss 4.0975 (3.3333) grad_norm 2.0017 (1.9833) [2022-01-24 04:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][790/1251] eta 0:16:53 lr 0.000241 time 1.8122 (2.1986) loss 3.7423 (3.3343) grad_norm 2.0468 (1.9847) [2022-01-24 04:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][800/1251] eta 0:16:32 lr 0.000241 time 2.1724 (2.1998) loss 3.1034 (3.3363) grad_norm 1.7501 (1.9841) [2022-01-24 04:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][810/1251] eta 0:16:10 lr 0.000241 time 2.3717 (2.1998) loss 3.3324 (3.3352) grad_norm 2.0206 (1.9837) [2022-01-24 04:15:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][820/1251] eta 0:15:47 lr 0.000241 time 1.9016 (2.1984) loss 3.7261 (3.3312) grad_norm 2.5251 (1.9837) [2022-01-24 04:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][830/1251] eta 0:15:25 lr 0.000241 time 2.0009 (2.1986) loss 2.6492 (3.3302) grad_norm 1.9929 (1.9837) [2022-01-24 04:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][840/1251] eta 0:15:04 lr 0.000241 time 2.6951 (2.2012) loss 3.6747 (3.3302) grad_norm 1.7792 (1.9835) [2022-01-24 04:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[203/300][850/1251] eta 0:14:43 lr 0.000241 time 2.9291 (2.2037) loss 3.1469 (3.3281) grad_norm 1.9607 (1.9831) [2022-01-24 04:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][860/1251] eta 0:14:21 lr 0.000241 time 1.7859 (2.2041) loss 3.2344 (3.3289) grad_norm 1.9620 (1.9825) [2022-01-24 04:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][870/1251] eta 0:14:00 lr 0.000241 time 1.8682 (2.2051) loss 3.0259 (3.3297) grad_norm 1.8591 (1.9820) [2022-01-24 04:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][880/1251] eta 0:13:38 lr 0.000241 time 1.6198 (2.2054) loss 3.9050 (3.3310) grad_norm 2.1537 (1.9837) [2022-01-24 04:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][890/1251] eta 0:13:15 lr 0.000241 time 1.8282 (2.2029) loss 3.8951 (3.3337) grad_norm 2.0145 (1.9844) [2022-01-24 04:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][900/1251] eta 0:12:52 lr 0.000241 time 2.3003 (2.2003) loss 2.9770 (3.3343) grad_norm 1.7242 (1.9845) [2022-01-24 04:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][910/1251] eta 0:12:29 lr 0.000241 time 2.3372 (2.1991) loss 3.7887 (3.3356) grad_norm 2.1280 (1.9850) [2022-01-24 04:18:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][920/1251] eta 0:12:07 lr 0.000241 time 2.5131 (2.1986) loss 3.3536 (3.3383) grad_norm 2.0211 (1.9874) [2022-01-24 04:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][930/1251] eta 0:11:45 lr 0.000241 time 2.3793 (2.1986) loss 3.4311 (3.3391) grad_norm 2.3737 (1.9887) [2022-01-24 04:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][940/1251] eta 0:11:24 lr 0.000241 time 2.6753 (2.2000) loss 3.3605 (3.3402) grad_norm 1.7345 (1.9888) [2022-01-24 04:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][950/1251] eta 0:11:02 lr 0.000241 time 2.1141 (2.2000) loss 2.6336 (3.3382) grad_norm 
2.3771 (1.9888) [2022-01-24 04:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][960/1251] eta 0:10:39 lr 0.000241 time 2.3257 (2.1989) loss 3.2666 (3.3381) grad_norm 1.9335 (1.9894) [2022-01-24 04:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][970/1251] eta 0:10:18 lr 0.000241 time 2.5054 (2.1995) loss 3.7091 (3.3406) grad_norm 2.5818 (1.9912) [2022-01-24 04:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][980/1251] eta 0:09:56 lr 0.000241 time 2.7583 (2.2012) loss 2.8516 (3.3389) grad_norm 1.8391 (1.9917) [2022-01-24 04:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][990/1251] eta 0:09:34 lr 0.000241 time 2.1192 (2.2010) loss 3.8178 (3.3413) grad_norm 1.9992 (1.9925) [2022-01-24 04:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1000/1251] eta 0:09:12 lr 0.000241 time 1.8944 (2.2001) loss 2.5992 (3.3400) grad_norm 1.7276 (1.9921) [2022-01-24 04:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1010/1251] eta 0:08:49 lr 0.000241 time 1.8826 (2.1989) loss 3.5073 (3.3402) grad_norm 1.8141 (1.9915) [2022-01-24 04:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1020/1251] eta 0:08:27 lr 0.000241 time 2.4395 (2.1988) loss 3.3408 (3.3390) grad_norm 1.9910 (1.9919) [2022-01-24 04:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1030/1251] eta 0:08:06 lr 0.000241 time 2.9301 (2.1996) loss 3.6512 (3.3371) grad_norm 1.9385 (1.9912) [2022-01-24 04:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1040/1251] eta 0:07:44 lr 0.000241 time 1.8403 (2.2002) loss 2.5106 (3.3343) grad_norm 1.9156 (1.9910) [2022-01-24 04:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1050/1251] eta 0:07:22 lr 0.000240 time 3.3656 (2.2001) loss 3.6678 (3.3336) grad_norm 1.8377 (1.9899) [2022-01-24 04:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[203/300][1060/1251] eta 0:06:59 lr 0.000240 time 1.6679 (2.1984) loss 3.7834 (3.3353) grad_norm 2.0930 (1.9890) [2022-01-24 04:24:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1070/1251] eta 0:06:37 lr 0.000240 time 1.8631 (2.1968) loss 3.5631 (3.3354) grad_norm 1.8258 (1.9883) [2022-01-24 04:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1080/1251] eta 0:06:15 lr 0.000240 time 1.8752 (2.1949) loss 3.3650 (3.3337) grad_norm 2.0417 (1.9873) [2022-01-24 04:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1090/1251] eta 0:05:53 lr 0.000240 time 2.9251 (2.1952) loss 3.2217 (3.3353) grad_norm 2.1567 (1.9877) [2022-01-24 04:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1100/1251] eta 0:05:31 lr 0.000240 time 2.0715 (2.1955) loss 2.5836 (3.3370) grad_norm 1.8685 (1.9871) [2022-01-24 04:25:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1110/1251] eta 0:05:09 lr 0.000240 time 3.0619 (2.1961) loss 3.2926 (3.3343) grad_norm 2.0477 (1.9868) [2022-01-24 04:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1120/1251] eta 0:04:47 lr 0.000240 time 2.8222 (2.1976) loss 3.8250 (3.3344) grad_norm 1.9400 (1.9866) [2022-01-24 04:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1130/1251] eta 0:04:25 lr 0.000240 time 2.2272 (2.1975) loss 3.7294 (3.3372) grad_norm 1.9801 (1.9865) [2022-01-24 04:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1140/1251] eta 0:04:03 lr 0.000240 time 1.6224 (2.1957) loss 2.9824 (3.3372) grad_norm 1.8994 (1.9865) [2022-01-24 04:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1150/1251] eta 0:03:41 lr 0.000240 time 2.0146 (2.1942) loss 2.5389 (3.3358) grad_norm 1.9342 (1.9863) [2022-01-24 04:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1160/1251] eta 0:03:19 lr 0.000240 time 2.2074 (2.1929) loss 3.8865 (3.3351) 
grad_norm 2.0141 (1.9879) [2022-01-24 04:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1170/1251] eta 0:02:57 lr 0.000240 time 2.2316 (2.1935) loss 3.0714 (3.3359) grad_norm 1.7315 (1.9876) [2022-01-24 04:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1180/1251] eta 0:02:35 lr 0.000240 time 1.5113 (2.1926) loss 2.1928 (3.3370) grad_norm 1.6765 (1.9884) [2022-01-24 04:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1190/1251] eta 0:02:13 lr 0.000240 time 1.9565 (2.1929) loss 3.2347 (3.3377) grad_norm 2.1235 (1.9882) [2022-01-24 04:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1200/1251] eta 0:01:51 lr 0.000240 time 2.4018 (2.1936) loss 3.7975 (3.3397) grad_norm 1.7385 (1.9881) [2022-01-24 04:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1210/1251] eta 0:01:30 lr 0.000240 time 2.7334 (2.1953) loss 3.6970 (3.3404) grad_norm 1.8583 (1.9873) [2022-01-24 04:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1220/1251] eta 0:01:08 lr 0.000240 time 1.7212 (2.1947) loss 3.9677 (3.3406) grad_norm 2.1497 (1.9868) [2022-01-24 04:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1230/1251] eta 0:00:46 lr 0.000240 time 2.2110 (2.1951) loss 2.5406 (3.3390) grad_norm 1.9754 (1.9860) [2022-01-24 04:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1240/1251] eta 0:00:24 lr 0.000240 time 2.1641 (2.1946) loss 3.9016 (3.3392) grad_norm 2.0770 (1.9856) [2022-01-24 04:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [203/300][1250/1251] eta 0:00:02 lr 0.000240 time 1.2056 (2.1893) loss 2.6058 (3.3380) grad_norm 1.7979 (1.9851) [2022-01-24 04:30:53 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 203 training takes 0:45:39 [2022-01-24 04:31:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.817 (18.817) Loss 0.9722 (0.9722) Acc@1 76.465 (76.465) 
Acc@5 94.434 (94.434) [2022-01-24 04:31:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.313 (3.327) Loss 0.8584 (0.9154) Acc@1 79.590 (78.436) Acc@5 95.215 (94.744) [2022-01-24 04:31:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.166 (2.534) Loss 0.9024 (0.9161) Acc@1 79.297 (78.585) Acc@5 94.922 (94.689) [2022-01-24 04:32:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.678 (2.204) Loss 0.8325 (0.9200) Acc@1 79.492 (78.572) Acc@5 95.996 (94.497) [2022-01-24 04:32:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.843 (2.179) Loss 0.9038 (0.9269) Acc@1 78.906 (78.361) Acc@5 95.410 (94.426) [2022-01-24 04:32:30 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.428 Acc@5 94.408 [2022-01-24 04:32:30 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.4% [2022-01-24 04:32:30 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.47% [2022-01-24 04:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][0/1251] eta 8:28:07 lr 0.000240 time 24.3708 (24.3708) loss 2.3917 (2.3917) grad_norm 1.8991 (1.8991) [2022-01-24 04:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][10/1251] eta 1:25:45 lr 0.000240 time 1.5104 (4.1460) loss 3.5460 (3.3587) grad_norm 1.9722 (1.8778) [2022-01-24 04:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][20/1251] eta 1:05:19 lr 0.000240 time 1.5038 (3.1840) loss 3.3273 (3.2576) grad_norm 1.7319 (1.9661) [2022-01-24 04:33:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][30/1251] eta 0:58:10 lr 0.000240 time 1.9741 (2.8583) loss 3.1981 (3.2978) grad_norm 1.9987 (2.0210) [2022-01-24 04:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][40/1251] eta 0:55:42 lr 0.000240 time 6.3633 (2.7601) loss 3.6107 (3.3116) grad_norm 1.9989 (2.0325) [2022-01-24 04:34:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][50/1251] eta 0:53:07 lr 0.000240 time 1.6876 (2.6541) loss 3.8288 (3.3560) grad_norm 1.8724 (2.0203) [2022-01-24 04:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][60/1251] eta 0:50:53 lr 0.000240 time 1.9681 (2.5640) loss 3.2463 (3.3534) grad_norm 2.1645 (2.0128) [2022-01-24 04:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][70/1251] eta 0:48:56 lr 0.000240 time 1.6628 (2.4867) loss 3.2362 (3.3634) grad_norm 1.9182 (2.0139) [2022-01-24 04:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][80/1251] eta 0:48:18 lr 0.000239 time 3.5180 (2.4753) loss 2.4921 (3.3471) grad_norm 2.0516 (2.0127) [2022-01-24 04:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][90/1251] eta 0:47:17 lr 0.000239 time 1.9959 (2.4436) loss 3.6731 (3.3688) grad_norm 1.7924 (2.0119) [2022-01-24 04:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][100/1251] eta 0:46:04 lr 0.000239 time 1.8916 (2.4015) loss 3.4685 (3.3807) grad_norm 2.2431 (2.0211) [2022-01-24 04:36:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][110/1251] eta 0:45:18 lr 0.000239 time 1.9164 (2.3823) loss 3.6850 (3.3796) grad_norm 1.7711 (2.0140) [2022-01-24 04:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][120/1251] eta 0:44:35 lr 0.000239 time 2.9093 (2.3659) loss 2.8850 (3.3586) grad_norm 2.0236 (2.0124) [2022-01-24 04:37:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][130/1251] eta 0:44:01 lr 0.000239 time 2.4741 (2.3560) loss 3.5622 (3.3459) grad_norm 1.9035 (2.0022) [2022-01-24 04:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][140/1251] eta 0:43:24 lr 0.000239 time 1.6768 (2.3445) loss 3.0549 (3.3359) grad_norm 1.8940 (1.9936) [2022-01-24 04:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][150/1251] eta 0:42:46 lr 0.000239 
time 1.8948 (2.3313) loss 3.1624 (3.3355) grad_norm 2.4250 (2.0041) [2022-01-24 04:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][160/1251] eta 0:42:20 lr 0.000239 time 2.4155 (2.3287) loss 3.5376 (3.3214) grad_norm 1.7667 (2.0028) [2022-01-24 04:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][170/1251] eta 0:41:51 lr 0.000239 time 2.1416 (2.3230) loss 3.6597 (3.3255) grad_norm 1.7795 (1.9977) [2022-01-24 04:39:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][180/1251] eta 0:41:14 lr 0.000239 time 1.8966 (2.3107) loss 3.5678 (3.3083) grad_norm 2.2059 (1.9941) [2022-01-24 04:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][190/1251] eta 0:40:33 lr 0.000239 time 1.7906 (2.2936) loss 3.5915 (3.2977) grad_norm 1.8268 (1.9908) [2022-01-24 04:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][200/1251] eta 0:40:00 lr 0.000239 time 2.6421 (2.2836) loss 3.7067 (3.3180) grad_norm 1.8083 (1.9919) [2022-01-24 04:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][210/1251] eta 0:39:29 lr 0.000239 time 2.7547 (2.2759) loss 3.3644 (3.3256) grad_norm 1.7902 (1.9902) [2022-01-24 04:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][220/1251] eta 0:39:06 lr 0.000239 time 2.2267 (2.2764) loss 2.2434 (3.3252) grad_norm 2.2616 (1.9893) [2022-01-24 04:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][230/1251] eta 0:38:45 lr 0.000239 time 2.1879 (2.2777) loss 2.7750 (3.3272) grad_norm 2.0047 (1.9939) [2022-01-24 04:41:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][240/1251] eta 0:38:18 lr 0.000239 time 2.1861 (2.2736) loss 3.4278 (3.3312) grad_norm 2.2381 (2.0030) [2022-01-24 04:42:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][250/1251] eta 0:37:56 lr 0.000239 time 2.9763 (2.2743) loss 3.7468 (3.3409) grad_norm 1.9147 (1.9986) [2022-01-24 04:42:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][260/1251] eta 0:37:32 lr 0.000239 time 1.9213 (2.2733) loss 3.9852 (3.3426) grad_norm 2.0746 (1.9993) [2022-01-24 04:42:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][270/1251] eta 0:37:00 lr 0.000239 time 1.8555 (2.2633) loss 3.7199 (3.3545) grad_norm 1.8839 (2.0001) [2022-01-24 04:43:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][280/1251] eta 0:36:28 lr 0.000239 time 2.1395 (2.2535) loss 3.6663 (3.3477) grad_norm 1.9634 (1.9987) [2022-01-24 04:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][290/1251] eta 0:35:58 lr 0.000239 time 1.8252 (2.2456) loss 3.0313 (3.3391) grad_norm 1.9391 (1.9952) [2022-01-24 04:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][300/1251] eta 0:35:35 lr 0.000239 time 2.3487 (2.2454) loss 2.1351 (3.3328) grad_norm 1.8823 (1.9931) [2022-01-24 04:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][310/1251] eta 0:35:11 lr 0.000239 time 1.5185 (2.2440) loss 3.7873 (3.3278) grad_norm 1.8256 (1.9910) [2022-01-24 04:44:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][320/1251] eta 0:34:49 lr 0.000239 time 2.3868 (2.2447) loss 3.4607 (3.3329) grad_norm 2.0082 (1.9899) [2022-01-24 04:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][330/1251] eta 0:34:32 lr 0.000239 time 2.2713 (2.2507) loss 2.6615 (3.3292) grad_norm 2.2452 (1.9883) [2022-01-24 04:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][340/1251] eta 0:34:13 lr 0.000239 time 2.3736 (2.2545) loss 3.7546 (3.3360) grad_norm 2.7079 (1.9944) [2022-01-24 04:45:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][350/1251] eta 0:33:47 lr 0.000239 time 2.1900 (2.2504) loss 2.8984 (3.3377) grad_norm 1.9159 (1.9926) [2022-01-24 04:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][360/1251] eta 0:33:19 lr 
0.000239 time 2.2779 (2.2441) loss 4.1022 (3.3416) grad_norm 2.0786 (1.9931) [2022-01-24 04:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][370/1251] eta 0:32:51 lr 0.000238 time 1.9167 (2.2380) loss 3.5866 (3.3469) grad_norm 1.9374 (1.9942) [2022-01-24 04:46:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][380/1251] eta 0:32:27 lr 0.000238 time 2.1328 (2.2363) loss 3.6054 (3.3484) grad_norm 1.9247 (1.9915) [2022-01-24 04:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][390/1251] eta 0:32:01 lr 0.000238 time 1.8662 (2.2320) loss 3.5163 (3.3496) grad_norm 1.9141 (1.9935) [2022-01-24 04:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][400/1251] eta 0:31:39 lr 0.000238 time 2.5313 (2.2315) loss 3.0122 (3.3455) grad_norm 1.8876 (1.9935) [2022-01-24 04:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][410/1251] eta 0:31:15 lr 0.000238 time 2.1137 (2.2299) loss 3.6762 (3.3404) grad_norm 1.7540 (1.9920) [2022-01-24 04:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][420/1251] eta 0:30:53 lr 0.000238 time 1.9474 (2.2306) loss 3.4244 (3.3353) grad_norm 1.6822 (1.9915) [2022-01-24 04:48:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][430/1251] eta 0:30:31 lr 0.000238 time 1.6751 (2.2313) loss 3.9290 (3.3339) grad_norm 1.8731 (1.9880) [2022-01-24 04:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][440/1251] eta 0:30:09 lr 0.000238 time 2.2355 (2.2310) loss 3.5248 (3.3326) grad_norm 1.7265 (1.9861) [2022-01-24 04:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][450/1251] eta 0:29:47 lr 0.000238 time 2.6941 (2.2316) loss 3.7615 (3.3304) grad_norm 1.7467 (1.9848) [2022-01-24 04:49:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][460/1251] eta 0:29:25 lr 0.000238 time 1.7872 (2.2314) loss 3.4807 (3.3320) grad_norm 1.9926 (1.9825) [2022-01-24 04:50:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][470/1251] eta 0:29:03 lr 0.000238 time 2.0784 (2.2322) loss 3.9821 (3.3340) grad_norm 2.0678 (1.9844) [2022-01-24 04:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][480/1251] eta 0:28:36 lr 0.000238 time 1.8746 (2.2269) loss 3.4806 (3.3360) grad_norm 2.4727 (1.9847) [2022-01-24 04:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][490/1251] eta 0:28:12 lr 0.000238 time 2.0841 (2.2238) loss 3.1296 (3.3342) grad_norm 1.8571 (1.9842) [2022-01-24 04:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][500/1251] eta 0:27:48 lr 0.000238 time 1.8750 (2.2213) loss 3.4092 (3.3319) grad_norm 2.1543 (1.9834) [2022-01-24 04:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][510/1251] eta 0:27:24 lr 0.000238 time 2.2461 (2.2195) loss 3.6485 (3.3299) grad_norm 1.9955 (1.9819) [2022-01-24 04:51:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][520/1251] eta 0:27:02 lr 0.000238 time 1.9486 (2.2194) loss 3.8450 (3.3344) grad_norm 2.2892 (1.9817) [2022-01-24 04:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][530/1251] eta 0:26:41 lr 0.000238 time 2.7330 (2.2206) loss 3.0440 (3.3351) grad_norm 1.9675 (1.9809) [2022-01-24 04:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][540/1251] eta 0:26:19 lr 0.000238 time 2.1835 (2.2222) loss 2.5609 (3.3346) grad_norm 1.9044 (1.9806) [2022-01-24 04:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][550/1251] eta 0:25:57 lr 0.000238 time 1.8273 (2.2220) loss 4.0588 (3.3384) grad_norm 1.8787 (1.9795) [2022-01-24 04:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][560/1251] eta 0:25:35 lr 0.000238 time 1.8840 (2.2225) loss 3.4794 (3.3441) grad_norm 1.9202 (1.9793) [2022-01-24 04:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][570/1251] eta 0:25:10 lr 
0.000238 time 1.6165 (2.2173) loss 3.2106 (3.3443) grad_norm 2.0985 (1.9817) [2022-01-24 04:53:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][580/1251] eta 0:24:47 lr 0.000238 time 2.0883 (2.2164) loss 3.4237 (3.3436) grad_norm 1.7570 (1.9827) [2022-01-24 04:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][590/1251] eta 0:24:25 lr 0.000238 time 2.4437 (2.2164) loss 3.0911 (3.3421) grad_norm 2.6779 (1.9854) [2022-01-24 04:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][600/1251] eta 0:24:04 lr 0.000238 time 2.0698 (2.2185) loss 2.7338 (3.3456) grad_norm 2.2028 (1.9851) [2022-01-24 04:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][610/1251] eta 0:23:41 lr 0.000238 time 1.9173 (2.2173) loss 3.1958 (3.3407) grad_norm 2.1066 (1.9862) [2022-01-24 04:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][620/1251] eta 0:23:17 lr 0.000238 time 1.6910 (2.2145) loss 3.8123 (3.3394) grad_norm 2.0753 (1.9853) [2022-01-24 04:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][630/1251] eta 0:22:53 lr 0.000238 time 2.1422 (2.2120) loss 4.0221 (3.3390) grad_norm 2.5634 (1.9848) [2022-01-24 04:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][640/1251] eta 0:22:30 lr 0.000238 time 1.8300 (2.2109) loss 3.4531 (3.3417) grad_norm 1.8662 (1.9844) [2022-01-24 04:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][650/1251] eta 0:22:09 lr 0.000237 time 1.8843 (2.2114) loss 3.5696 (3.3351) grad_norm 1.9960 (1.9850) [2022-01-24 04:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][660/1251] eta 0:21:46 lr 0.000237 time 2.2604 (2.2102) loss 3.3769 (3.3364) grad_norm 1.9934 (1.9864) [2022-01-24 04:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][670/1251] eta 0:21:22 lr 0.000237 time 1.9434 (2.2070) loss 2.7396 (3.3357) grad_norm 1.8456 (1.9867) [2022-01-24 04:57:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][680/1251] eta 0:21:01 lr 0.000237 time 1.7720 (2.2090) loss 3.2747 (3.3393) grad_norm 2.2029 (1.9871) [2022-01-24 04:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][690/1251] eta 0:20:39 lr 0.000237 time 2.6229 (2.2096) loss 3.4437 (3.3409) grad_norm 2.1305 (1.9865) [2022-01-24 04:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][700/1251] eta 0:20:17 lr 0.000237 time 2.2519 (2.2100) loss 3.8533 (3.3429) grad_norm 2.1495 (1.9888) [2022-01-24 04:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][710/1251] eta 0:19:55 lr 0.000237 time 2.1327 (2.2097) loss 3.7644 (3.3454) grad_norm 1.8469 (1.9900) [2022-01-24 04:59:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][720/1251] eta 0:19:33 lr 0.000237 time 1.6763 (2.2101) loss 3.3930 (3.3426) grad_norm 2.0455 (1.9893) [2022-01-24 04:59:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][730/1251] eta 0:19:12 lr 0.000237 time 2.8102 (2.2115) loss 3.3968 (3.3421) grad_norm 2.0017 (1.9896) [2022-01-24 04:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][740/1251] eta 0:18:49 lr 0.000237 time 1.8535 (2.2106) loss 3.1738 (3.3435) grad_norm 2.0891 (1.9901) [2022-01-24 05:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][750/1251] eta 0:18:27 lr 0.000237 time 2.6765 (2.2098) loss 3.0609 (3.3449) grad_norm 2.0360 (1.9915) [2022-01-24 05:00:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][760/1251] eta 0:18:03 lr 0.000237 time 1.5925 (2.2075) loss 3.7706 (3.3471) grad_norm 2.1165 (1.9920) [2022-01-24 05:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][770/1251] eta 0:17:41 lr 0.000237 time 2.6998 (2.2060) loss 3.6572 (3.3457) grad_norm 1.9994 (1.9920) [2022-01-24 05:01:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][780/1251] eta 0:17:18 lr 
0.000237 time 1.9132 (2.2046) loss 3.5798 (3.3462) grad_norm 1.9041 (1.9924) [2022-01-24 05:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][790/1251] eta 0:16:56 lr 0.000237 time 2.0445 (2.2049) loss 3.2997 (3.3462) grad_norm 2.0710 (1.9933) [2022-01-24 05:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][800/1251] eta 0:16:34 lr 0.000237 time 2.2130 (2.2050) loss 3.1706 (3.3451) grad_norm 1.6743 (1.9928) [2022-01-24 05:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][810/1251] eta 0:16:12 lr 0.000237 time 2.7738 (2.2061) loss 3.6650 (3.3445) grad_norm 1.8528 (1.9922) [2022-01-24 05:02:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][820/1251] eta 0:15:51 lr 0.000237 time 2.1431 (2.2069) loss 2.8990 (3.3452) grad_norm 2.1318 (1.9934) [2022-01-24 05:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][830/1251] eta 0:15:28 lr 0.000237 time 1.9114 (2.2063) loss 3.5715 (3.3491) grad_norm 1.8940 (1.9934) [2022-01-24 05:03:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][840/1251] eta 0:15:05 lr 0.000237 time 2.0557 (2.2038) loss 3.7788 (3.3492) grad_norm 2.0583 (1.9948) [2022-01-24 05:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][850/1251] eta 0:14:43 lr 0.000237 time 2.2313 (2.2020) loss 2.8757 (3.3469) grad_norm 2.2341 (1.9957) [2022-01-24 05:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][860/1251] eta 0:14:20 lr 0.000237 time 1.9448 (2.2013) loss 3.7387 (3.3468) grad_norm 1.8075 (1.9967) [2022-01-24 05:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][870/1251] eta 0:13:58 lr 0.000237 time 2.2636 (2.2013) loss 3.1275 (3.3435) grad_norm 1.6669 (1.9948) [2022-01-24 05:04:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][880/1251] eta 0:13:36 lr 0.000237 time 2.4762 (2.2019) loss 2.3506 (3.3409) grad_norm 2.0080 (1.9945) [2022-01-24 05:05:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][890/1251] eta 0:13:14 lr 0.000237 time 1.9653 (2.2011) loss 3.3967 (3.3423) grad_norm 2.0979 (1.9941) [2022-01-24 05:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][900/1251] eta 0:12:52 lr 0.000237 time 2.8729 (2.2013) loss 2.6799 (3.3381) grad_norm 1.8964 (1.9938) [2022-01-24 05:05:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][910/1251] eta 0:12:30 lr 0.000237 time 1.8593 (2.2007) loss 2.6428 (3.3362) grad_norm 1.8827 (1.9954) [2022-01-24 05:06:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][920/1251] eta 0:12:07 lr 0.000237 time 2.0665 (2.1988) loss 3.9620 (3.3366) grad_norm 1.8198 (1.9951) [2022-01-24 05:06:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][930/1251] eta 0:11:45 lr 0.000237 time 2.5625 (2.1972) loss 3.2740 (3.3364) grad_norm 2.0071 (1.9958) [2022-01-24 05:06:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][940/1251] eta 0:11:22 lr 0.000236 time 2.2230 (2.1954) loss 3.6994 (3.3374) grad_norm 2.3350 (1.9978) [2022-01-24 05:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][950/1251] eta 0:11:00 lr 0.000236 time 2.3340 (2.1946) loss 3.2078 (3.3366) grad_norm 2.4250 (1.9990) [2022-01-24 05:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][960/1251] eta 0:10:39 lr 0.000236 time 2.3463 (2.1984) loss 3.6377 (3.3347) grad_norm 2.0594 (1.9998) [2022-01-24 05:08:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][970/1251] eta 0:10:18 lr 0.000236 time 2.2300 (2.1995) loss 2.3355 (3.3334) grad_norm 1.7821 (1.9995) [2022-01-24 05:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][980/1251] eta 0:09:56 lr 0.000236 time 2.3234 (2.2006) loss 2.6294 (3.3329) grad_norm 1.9095 (2.0000) [2022-01-24 05:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][990/1251] eta 0:09:34 lr 
0.000236 time 1.9150 (2.1998) loss 2.7727 (3.3314) grad_norm 2.0164 (1.9998) [2022-01-24 05:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1000/1251] eta 0:09:11 lr 0.000236 time 2.3969 (2.1988) loss 2.9444 (3.3313) grad_norm 2.0744 (1.9989) [2022-01-24 05:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1010/1251] eta 0:08:49 lr 0.000236 time 2.1480 (2.1972) loss 3.7269 (3.3342) grad_norm 1.8652 (1.9986) [2022-01-24 05:09:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1020/1251] eta 0:08:27 lr 0.000236 time 2.2734 (2.1964) loss 2.0362 (3.3334) grad_norm 2.0744 (1.9988) [2022-01-24 05:10:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1030/1251] eta 0:08:05 lr 0.000236 time 1.4925 (2.1952) loss 3.5394 (3.3352) grad_norm 1.8105 (1.9985) [2022-01-24 05:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1040/1251] eta 0:07:43 lr 0.000236 time 2.0938 (2.1946) loss 3.9436 (3.3359) grad_norm 2.2511 (1.9979) [2022-01-24 05:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1050/1251] eta 0:07:21 lr 0.000236 time 2.7126 (2.1970) loss 3.5796 (3.3360) grad_norm 2.0046 (1.9986) [2022-01-24 05:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1060/1251] eta 0:06:59 lr 0.000236 time 2.3939 (2.1979) loss 3.4785 (3.3350) grad_norm 2.3702 (1.9989) [2022-01-24 05:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1070/1251] eta 0:06:37 lr 0.000236 time 1.9875 (2.1983) loss 3.6435 (3.3358) grad_norm 2.0657 (1.9989) [2022-01-24 05:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1080/1251] eta 0:06:15 lr 0.000236 time 2.1829 (2.1978) loss 3.0961 (3.3377) grad_norm 2.1166 (1.9988) [2022-01-24 05:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1090/1251] eta 0:05:53 lr 0.000236 time 1.5646 (2.1974) loss 3.0370 (3.3384) grad_norm 1.8680 (1.9983) [2022-01-24 
05:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1100/1251] eta 0:05:31 lr 0.000236 time 1.9953 (2.1965) loss 2.8272 (3.3389) grad_norm 1.8946 (1.9976) [2022-01-24 05:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1110/1251] eta 0:05:09 lr 0.000236 time 2.2716 (2.1971) loss 3.3255 (3.3392) grad_norm 1.9381 (2.0027) [2022-01-24 05:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1120/1251] eta 0:04:47 lr 0.000236 time 1.8205 (2.1966) loss 2.6555 (3.3376) grad_norm 2.1010 (2.0034) [2022-01-24 05:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1130/1251] eta 0:04:25 lr 0.000236 time 2.0689 (2.1965) loss 2.4384 (3.3380) grad_norm 1.9543 (2.0042) [2022-01-24 05:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1140/1251] eta 0:04:03 lr 0.000236 time 2.5125 (2.1970) loss 3.1133 (3.3365) grad_norm 1.6302 (2.0043) [2022-01-24 05:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1150/1251] eta 0:03:41 lr 0.000236 time 1.5231 (2.1976) loss 3.8491 (3.3379) grad_norm 2.2495 (2.0054) [2022-01-24 05:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1160/1251] eta 0:03:19 lr 0.000236 time 1.6529 (2.1964) loss 3.6453 (3.3370) grad_norm 2.1783 (2.0062) [2022-01-24 05:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1170/1251] eta 0:02:57 lr 0.000236 time 1.8620 (2.1949) loss 3.5366 (3.3365) grad_norm 2.0367 (2.0069) [2022-01-24 05:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1180/1251] eta 0:02:36 lr 0.000236 time 2.9468 (2.1972) loss 3.3403 (3.3371) grad_norm 1.8224 (2.0064) [2022-01-24 05:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1190/1251] eta 0:02:14 lr 0.000236 time 2.2124 (2.1989) loss 3.2705 (3.3364) grad_norm 1.7053 (2.0065) [2022-01-24 05:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1200/1251] 
eta 0:01:52 lr 0.000236 time 1.5671 (2.1978) loss 3.0614 (3.3376) grad_norm 2.2012 (2.0070) [2022-01-24 05:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1210/1251] eta 0:01:30 lr 0.000236 time 1.5969 (2.1961) loss 3.0415 (3.3378) grad_norm 2.1244 (2.0074) [2022-01-24 05:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1220/1251] eta 0:01:08 lr 0.000236 time 3.1814 (2.1967) loss 3.7261 (3.3396) grad_norm 1.9593 (2.0071) [2022-01-24 05:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1230/1251] eta 0:00:46 lr 0.000235 time 2.2576 (2.1962) loss 2.9486 (3.3386) grad_norm 1.9193 (2.0065) [2022-01-24 05:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1240/1251] eta 0:00:24 lr 0.000235 time 1.4267 (2.1952) loss 2.8843 (3.3366) grad_norm 2.3665 (2.0063) [2022-01-24 05:18:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [204/300][1250/1251] eta 0:00:02 lr 0.000235 time 1.1708 (2.1900) loss 2.8573 (3.3367) grad_norm 2.0744 (2.0054) [2022-01-24 05:18:10 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 204 training takes 0:45:40 [2022-01-24 05:18:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.575 (18.575) Loss 0.9054 (0.9054) Acc@1 77.441 (77.441) Acc@5 95.215 (95.215) [2022-01-24 05:18:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.937 (3.413) Loss 0.8935 (0.9080) Acc@1 79.590 (78.746) Acc@5 95.215 (94.478) [2022-01-24 05:19:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.595 (2.493) Loss 0.9193 (0.9080) Acc@1 78.027 (78.492) Acc@5 94.336 (94.508) [2022-01-24 05:19:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.854 (2.327) Loss 0.9142 (0.9084) Acc@1 79.395 (78.528) Acc@5 93.848 (94.553) [2022-01-24 05:19:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.408 (2.163) Loss 0.8895 (0.9139) Acc@1 78.125 (78.442) Acc@5 94.727 (94.457) 
[2022-01-24 05:19:46 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.496 Acc@5 94.538 [2022-01-24 05:19:46 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.5% [2022-01-24 05:19:46 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.50% [2022-01-24 05:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][0/1251] eta 7:28:59 lr 0.000235 time 21.5348 (21.5348) loss 3.5117 (3.5117) grad_norm 2.1309 (2.1309) [2022-01-24 05:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][10/1251] eta 1:21:32 lr 0.000235 time 2.9061 (3.9424) loss 3.2877 (3.3948) grad_norm 1.8341 (1.9743) [2022-01-24 05:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][20/1251] eta 1:03:07 lr 0.000235 time 1.6895 (3.0765) loss 3.2592 (3.3122) grad_norm 1.7756 (1.9702) [2022-01-24 05:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][30/1251] eta 0:56:14 lr 0.000235 time 1.5912 (2.7639) loss 4.0885 (3.3975) grad_norm 2.2744 (1.9715) [2022-01-24 05:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][40/1251] eta 0:53:13 lr 0.000235 time 3.0645 (2.6368) loss 2.6888 (3.3817) grad_norm 1.8819 (1.9754) [2022-01-24 05:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][50/1251] eta 0:51:22 lr 0.000235 time 2.9659 (2.5663) loss 3.6149 (3.3338) grad_norm 2.4056 (1.9910) [2022-01-24 05:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][60/1251] eta 0:49:40 lr 0.000235 time 1.6400 (2.5025) loss 3.8975 (3.3446) grad_norm 2.0166 (1.9905) [2022-01-24 05:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][70/1251] eta 0:48:18 lr 0.000235 time 1.5172 (2.4545) loss 3.7286 (3.3705) grad_norm 1.9594 (1.9983) [2022-01-24 05:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][80/1251] eta 0:48:01 lr 0.000235 time 3.5662 (2.4605) loss 3.3657 (3.3971) 
grad_norm 1.9543 (1.9853) [2022-01-24 05:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][90/1251] eta 0:47:27 lr 0.000235 time 3.2570 (2.4525) loss 3.6830 (3.4215) grad_norm 1.9324 (1.9956) [2022-01-24 05:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][100/1251] eta 0:46:20 lr 0.000235 time 1.5279 (2.4161) loss 3.2498 (3.4232) grad_norm 2.0349 (2.0081) [2022-01-24 05:24:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][110/1251] eta 0:45:10 lr 0.000235 time 1.7655 (2.3752) loss 3.9810 (3.4054) grad_norm 2.2794 (2.0186) [2022-01-24 05:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][120/1251] eta 0:44:24 lr 0.000235 time 2.6581 (2.3558) loss 2.7690 (3.4011) grad_norm 2.2189 (2.0277) [2022-01-24 05:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][130/1251] eta 0:44:06 lr 0.000235 time 2.8120 (2.3612) loss 2.9254 (3.3778) grad_norm 1.9938 (2.0232) [2022-01-24 05:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][140/1251] eta 0:43:47 lr 0.000235 time 1.7842 (2.3649) loss 3.5785 (3.3728) grad_norm 2.2251 (2.0276) [2022-01-24 05:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][150/1251] eta 0:43:17 lr 0.000235 time 1.6124 (2.3589) loss 3.4803 (3.3560) grad_norm 1.9510 (2.0285) [2022-01-24 05:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][160/1251] eta 0:42:33 lr 0.000235 time 1.9183 (2.3407) loss 3.0028 (3.3546) grad_norm 2.3431 (2.0301) [2022-01-24 05:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][170/1251] eta 0:41:50 lr 0.000235 time 1.9322 (2.3220) loss 3.6988 (3.3637) grad_norm 1.7919 (2.0279) [2022-01-24 05:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][180/1251] eta 0:41:11 lr 0.000235 time 2.0004 (2.3079) loss 3.2407 (3.3546) grad_norm 2.3002 (2.0285) [2022-01-24 05:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [205/300][190/1251] eta 0:40:34 lr 0.000235 time 2.4143 (2.2941) loss 3.3605 (3.3479) grad_norm 2.0317 (2.0225) [2022-01-24 05:27:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][200/1251] eta 0:40:02 lr 0.000235 time 2.1770 (2.2857) loss 3.9837 (3.3500) grad_norm 2.0723 (2.0210) [2022-01-24 05:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][210/1251] eta 0:39:30 lr 0.000235 time 2.4807 (2.2767) loss 3.4897 (3.3402) grad_norm 2.0817 (2.0213) [2022-01-24 05:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][220/1251] eta 0:38:59 lr 0.000235 time 1.8887 (2.2689) loss 3.4740 (3.3324) grad_norm 2.1770 (2.0186) [2022-01-24 05:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][230/1251] eta 0:38:36 lr 0.000235 time 2.1590 (2.2691) loss 2.3635 (3.3189) grad_norm 1.8989 (2.0181) [2022-01-24 05:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][240/1251] eta 0:38:15 lr 0.000235 time 2.3724 (2.2704) loss 3.1427 (3.3156) grad_norm 2.0906 (2.0196) [2022-01-24 05:29:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][250/1251] eta 0:37:52 lr 0.000235 time 2.4866 (2.2705) loss 3.6848 (3.3015) grad_norm 1.9967 (2.0205) [2022-01-24 05:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][260/1251] eta 0:37:29 lr 0.000235 time 2.1621 (2.2699) loss 2.5301 (3.2980) grad_norm 1.8495 (2.0193) [2022-01-24 05:30:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][270/1251] eta 0:37:03 lr 0.000234 time 2.1527 (2.2665) loss 3.7782 (3.3035) grad_norm 1.7158 (2.0161) [2022-01-24 05:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][280/1251] eta 0:36:44 lr 0.000234 time 2.9180 (2.2707) loss 3.5925 (3.3032) grad_norm 1.8172 (2.0098) [2022-01-24 05:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][290/1251] eta 0:36:19 lr 0.000234 time 1.9510 (2.2675) loss 3.2245 (3.3073) 
grad_norm 1.9690 (2.0101) [2022-01-24 05:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][300/1251] eta 0:35:54 lr 0.000234 time 2.8300 (2.2654) loss 3.5551 (3.3071) grad_norm 2.0083 (2.0103) [2022-01-24 05:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][310/1251] eta 0:35:28 lr 0.000234 time 2.4341 (2.2624) loss 3.9621 (3.3132) grad_norm 2.0269 (2.0119) [2022-01-24 05:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][320/1251] eta 0:35:00 lr 0.000234 time 2.6984 (2.2562) loss 3.4671 (3.3166) grad_norm 1.9086 (2.0097) [2022-01-24 05:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][330/1251] eta 0:34:30 lr 0.000234 time 1.5891 (2.2486) loss 3.2566 (3.3139) grad_norm 1.9393 (2.0098) [2022-01-24 05:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][340/1251] eta 0:34:04 lr 0.000234 time 1.8993 (2.2438) loss 2.4607 (3.3046) grad_norm 1.9774 (2.0091) [2022-01-24 05:32:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][350/1251] eta 0:33:40 lr 0.000234 time 2.7370 (2.2428) loss 3.3394 (3.2998) grad_norm 1.8568 (2.0073) [2022-01-24 05:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][360/1251] eta 0:33:22 lr 0.000234 time 2.4563 (2.2472) loss 2.9666 (3.2994) grad_norm 2.2269 (2.0083) [2022-01-24 05:33:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][370/1251] eta 0:33:02 lr 0.000234 time 2.2091 (2.2505) loss 3.1417 (3.3024) grad_norm 2.1486 (2.0090) [2022-01-24 05:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][380/1251] eta 0:32:34 lr 0.000234 time 1.9034 (2.2438) loss 3.1345 (3.2941) grad_norm 2.2532 (2.0108) [2022-01-24 05:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][390/1251] eta 0:32:06 lr 0.000234 time 2.6923 (2.2376) loss 3.7913 (3.2980) grad_norm 2.1690 (2.0193) [2022-01-24 05:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [205/300][400/1251] eta 0:31:39 lr 0.000234 time 2.1188 (2.2325) loss 3.5744 (3.2994) grad_norm 1.8784 (2.0200) [2022-01-24 05:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][410/1251] eta 0:31:18 lr 0.000234 time 2.5996 (2.2337) loss 3.6565 (3.3061) grad_norm 1.8938 (2.0202) [2022-01-24 05:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][420/1251] eta 0:30:57 lr 0.000234 time 3.0450 (2.2348) loss 3.5535 (3.3114) grad_norm 2.1355 (2.0203) [2022-01-24 05:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][430/1251] eta 0:30:37 lr 0.000234 time 3.1311 (2.2382) loss 3.5659 (3.3167) grad_norm 1.9320 (2.0225) [2022-01-24 05:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][440/1251] eta 0:30:14 lr 0.000234 time 2.2735 (2.2379) loss 3.5056 (3.3115) grad_norm 2.4038 (2.0226) [2022-01-24 05:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][450/1251] eta 0:29:52 lr 0.000234 time 2.9197 (2.2374) loss 3.5223 (3.3181) grad_norm 2.1917 (2.0215) [2022-01-24 05:36:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][460/1251] eta 0:29:25 lr 0.000234 time 1.6020 (2.2321) loss 3.2578 (3.3177) grad_norm 1.7789 (2.0205) [2022-01-24 05:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][470/1251] eta 0:29:01 lr 0.000234 time 2.5999 (2.2296) loss 3.6476 (3.3127) grad_norm 1.8072 (2.0203) [2022-01-24 05:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][480/1251] eta 0:28:35 lr 0.000234 time 2.1569 (2.2253) loss 2.9956 (3.3165) grad_norm 2.0453 (2.0221) [2022-01-24 05:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][490/1251] eta 0:28:12 lr 0.000234 time 2.4960 (2.2247) loss 3.5786 (3.3161) grad_norm 2.3200 (2.0216) [2022-01-24 05:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][500/1251] eta 0:27:49 lr 0.000234 time 1.5572 (2.2234) loss 3.7120 (3.3171) 
grad_norm 1.7781 (2.0201) [2022-01-24 05:38:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][510/1251] eta 0:27:25 lr 0.000234 time 2.1950 (2.2204) loss 3.9201 (3.3199) grad_norm 1.7593 (2.0170) [2022-01-24 05:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][520/1251] eta 0:27:01 lr 0.000234 time 2.0254 (2.2186) loss 2.5232 (3.3192) grad_norm 2.1553 (2.0180) [2022-01-24 05:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][530/1251] eta 0:26:40 lr 0.000234 time 2.6101 (2.2197) loss 3.8244 (3.3185) grad_norm 2.3125 (2.0186) [2022-01-24 05:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][540/1251] eta 0:26:17 lr 0.000234 time 2.2187 (2.2190) loss 3.6836 (3.3171) grad_norm 1.9368 (2.0216) [2022-01-24 05:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][550/1251] eta 0:25:57 lr 0.000233 time 2.7797 (2.2217) loss 3.1736 (3.3167) grad_norm 1.6704 (2.0205) [2022-01-24 05:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][560/1251] eta 0:25:35 lr 0.000233 time 2.4462 (2.2229) loss 3.3709 (3.3200) grad_norm 1.8311 (2.0212) [2022-01-24 05:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][570/1251] eta 0:25:13 lr 0.000233 time 2.1656 (2.2225) loss 3.3527 (3.3197) grad_norm 1.9230 (2.0220) [2022-01-24 05:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][580/1251] eta 0:24:50 lr 0.000233 time 2.0191 (2.2208) loss 2.1167 (3.3183) grad_norm 2.2252 (2.0210) [2022-01-24 05:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][590/1251] eta 0:24:26 lr 0.000233 time 2.4608 (2.2182) loss 3.7337 (3.3174) grad_norm 2.1423 (2.0206) [2022-01-24 05:41:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][600/1251] eta 0:24:04 lr 0.000233 time 1.6597 (2.2186) loss 3.5499 (3.3210) grad_norm 1.6954 (2.0196) [2022-01-24 05:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [205/300][610/1251] eta 0:23:41 lr 0.000233 time 2.5091 (2.2180) loss 3.2765 (3.3224) grad_norm 2.0543 (2.0199) [2022-01-24 05:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][620/1251] eta 0:23:20 lr 0.000233 time 2.5787 (2.2200) loss 3.9280 (3.3240) grad_norm 2.1870 (2.0174) [2022-01-24 05:43:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][630/1251] eta 0:22:57 lr 0.000233 time 2.0669 (2.2186) loss 3.6138 (3.3220) grad_norm 2.3062 (2.0169) [2022-01-24 05:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][640/1251] eta 0:22:34 lr 0.000233 time 2.2603 (2.2163) loss 2.3410 (3.3217) grad_norm 2.0596 (2.0162) [2022-01-24 05:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][650/1251] eta 0:22:10 lr 0.000233 time 1.9686 (2.2132) loss 3.7054 (3.3174) grad_norm 2.1084 (2.0154) [2022-01-24 05:44:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][660/1251] eta 0:21:47 lr 0.000233 time 2.1215 (2.2132) loss 3.1910 (3.3145) grad_norm 2.4677 (2.0159) [2022-01-24 05:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][670/1251] eta 0:21:26 lr 0.000233 time 2.2765 (2.2139) loss 3.1682 (3.3123) grad_norm 1.8228 (2.0163) [2022-01-24 05:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][680/1251] eta 0:21:04 lr 0.000233 time 2.5175 (2.2152) loss 3.3334 (3.3133) grad_norm 1.8994 (2.0154) [2022-01-24 05:45:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][690/1251] eta 0:20:42 lr 0.000233 time 2.2721 (2.2147) loss 2.5901 (3.3089) grad_norm 1.8788 (2.0166) [2022-01-24 05:45:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][700/1251] eta 0:20:19 lr 0.000233 time 1.6303 (2.2140) loss 3.3043 (3.3094) grad_norm 2.0570 (2.0165) [2022-01-24 05:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][710/1251] eta 0:19:58 lr 0.000233 time 2.5562 (2.2152) loss 3.8496 (3.3074) 
grad_norm 2.1785 (2.0169) [2022-01-24 05:46:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][720/1251] eta 0:19:35 lr 0.000233 time 1.6412 (2.2132) loss 4.0937 (3.3069) grad_norm 2.1056 (2.0177) [2022-01-24 05:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][730/1251] eta 0:19:12 lr 0.000233 time 2.4415 (2.2123) loss 3.9359 (3.3093) grad_norm 1.8455 (2.0165) [2022-01-24 05:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][740/1251] eta 0:18:49 lr 0.000233 time 1.7072 (2.2113) loss 3.0289 (3.3108) grad_norm 1.9096 (2.0174) [2022-01-24 05:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][750/1251] eta 0:18:28 lr 0.000233 time 2.2640 (2.2121) loss 3.7254 (3.3108) grad_norm 1.9244 (2.0187) [2022-01-24 05:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][760/1251] eta 0:18:06 lr 0.000233 time 1.5794 (2.2128) loss 3.6255 (3.3118) grad_norm 1.7488 (2.0197) [2022-01-24 05:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][770/1251] eta 0:17:44 lr 0.000233 time 1.9463 (2.2138) loss 3.5167 (3.3113) grad_norm 1.9232 (2.0182) [2022-01-24 05:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][780/1251] eta 0:17:24 lr 0.000233 time 2.2836 (2.2166) loss 3.1463 (3.3122) grad_norm 2.3032 (2.0185) [2022-01-24 05:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][790/1251] eta 0:17:01 lr 0.000233 time 2.0307 (2.2167) loss 3.5547 (3.3129) grad_norm 1.7764 (2.0199) [2022-01-24 05:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][800/1251] eta 0:16:38 lr 0.000233 time 1.6503 (2.2143) loss 3.3784 (3.3096) grad_norm 1.9427 (2.0203) [2022-01-24 05:49:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][810/1251] eta 0:16:14 lr 0.000233 time 1.9180 (2.2099) loss 3.4810 (3.3120) grad_norm 2.2904 (2.0205) [2022-01-24 05:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [205/300][820/1251] eta 0:15:51 lr 0.000233 time 2.1915 (2.2073) loss 3.5110 (3.3130) grad_norm 1.9975 (2.0210) [2022-01-24 05:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][830/1251] eta 0:15:29 lr 0.000233 time 2.4554 (2.2076) loss 3.0265 (3.3131) grad_norm 1.8385 (2.0195) [2022-01-24 05:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][840/1251] eta 0:15:06 lr 0.000232 time 2.6276 (2.2064) loss 2.4981 (3.3132) grad_norm 1.8636 (2.0193) [2022-01-24 05:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][850/1251] eta 0:14:44 lr 0.000232 time 2.0347 (2.2053) loss 2.8182 (3.3128) grad_norm 1.9195 (2.0185) [2022-01-24 05:51:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][860/1251] eta 0:14:22 lr 0.000232 time 2.2730 (2.2059) loss 3.0884 (3.3123) grad_norm 2.4870 (2.0186) [2022-01-24 05:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][870/1251] eta 0:14:01 lr 0.000232 time 2.4802 (2.2086) loss 3.5132 (3.3109) grad_norm 2.2004 (2.0173) [2022-01-24 05:52:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][880/1251] eta 0:13:41 lr 0.000232 time 2.8177 (2.2131) loss 3.5504 (3.3100) grad_norm 2.4045 (2.0188) [2022-01-24 05:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][890/1251] eta 0:13:19 lr 0.000232 time 1.9346 (2.2146) loss 3.1075 (3.3082) grad_norm 1.8772 (2.0185) [2022-01-24 05:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][900/1251] eta 0:12:56 lr 0.000232 time 2.0452 (2.2129) loss 3.6415 (3.3066) grad_norm 1.6380 (2.0179) [2022-01-24 05:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][910/1251] eta 0:12:33 lr 0.000232 time 2.1050 (2.2100) loss 3.4856 (3.3069) grad_norm 1.9349 (2.0185) [2022-01-24 05:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][920/1251] eta 0:12:10 lr 0.000232 time 1.9405 (2.2070) loss 3.6505 (3.3065) 
grad_norm 2.1526 (2.0180) [2022-01-24 05:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][930/1251] eta 0:11:48 lr 0.000232 time 2.2332 (2.2062) loss 3.2516 (3.3063) grad_norm 2.3556 (2.0180) [2022-01-24 05:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][940/1251] eta 0:11:26 lr 0.000232 time 2.6733 (2.2064) loss 3.3195 (3.3072) grad_norm 2.1561 (2.0193) [2022-01-24 05:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][950/1251] eta 0:11:04 lr 0.000232 time 2.2689 (2.2068) loss 3.8327 (3.3064) grad_norm 1.8893 (2.0192) [2022-01-24 05:55:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][960/1251] eta 0:10:42 lr 0.000232 time 1.9962 (2.2080) loss 3.3397 (3.3086) grad_norm 2.0032 (2.0197) [2022-01-24 05:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][970/1251] eta 0:10:20 lr 0.000232 time 2.6395 (2.2096) loss 3.8235 (3.3098) grad_norm 2.0341 (2.0198) [2022-01-24 05:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][980/1251] eta 0:09:59 lr 0.000232 time 3.5488 (2.2121) loss 3.3008 (3.3101) grad_norm 2.2132 (2.0204) [2022-01-24 05:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][990/1251] eta 0:09:37 lr 0.000232 time 1.8431 (2.2113) loss 3.9599 (3.3078) grad_norm 2.0838 (2.0199) [2022-01-24 05:56:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1000/1251] eta 0:09:14 lr 0.000232 time 1.8244 (2.2087) loss 2.4643 (3.3063) grad_norm 1.7994 (2.0193) [2022-01-24 05:56:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1010/1251] eta 0:08:51 lr 0.000232 time 2.6012 (2.2070) loss 3.2386 (3.3048) grad_norm 1.6959 (2.0191) [2022-01-24 05:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1020/1251] eta 0:08:29 lr 0.000232 time 2.3162 (2.2058) loss 3.3644 (3.3056) grad_norm 2.0573 (2.0189) [2022-01-24 05:57:39 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [205/300][1030/1251] eta 0:08:07 lr 0.000232 time 1.8391 (2.2050) loss 3.6932 (3.3049) grad_norm 2.2317 (2.0192) [2022-01-24 05:58:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1040/1251] eta 0:07:45 lr 0.000232 time 2.8306 (2.2054) loss 3.6260 (3.3047) grad_norm 2.2600 (2.0193) [2022-01-24 05:58:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1050/1251] eta 0:07:23 lr 0.000232 time 3.1262 (2.2060) loss 3.5191 (3.3057) grad_norm 1.9856 (2.0194) [2022-01-24 05:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1060/1251] eta 0:07:01 lr 0.000232 time 2.4900 (2.2078) loss 4.1284 (3.3085) grad_norm 2.1369 (2.0201) [2022-01-24 05:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1070/1251] eta 0:06:39 lr 0.000232 time 2.0274 (2.2090) loss 3.8184 (3.3111) grad_norm 2.1688 (2.0198) [2022-01-24 05:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1080/1251] eta 0:06:17 lr 0.000232 time 1.8439 (2.2088) loss 2.1037 (3.3100) grad_norm 1.9952 (2.0194) [2022-01-24 05:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1090/1251] eta 0:05:55 lr 0.000232 time 1.9352 (2.2073) loss 2.7542 (3.3101) grad_norm 1.9746 (2.0186) [2022-01-24 06:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1100/1251] eta 0:05:33 lr 0.000232 time 2.6717 (2.2057) loss 2.1827 (3.3074) grad_norm 2.0893 (2.0183) [2022-01-24 06:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1110/1251] eta 0:05:10 lr 0.000232 time 3.1260 (2.2056) loss 2.4710 (3.3067) grad_norm 1.9449 (2.0185) [2022-01-24 06:00:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1120/1251] eta 0:04:48 lr 0.000232 time 2.5661 (2.2052) loss 3.2530 (3.3088) grad_norm 2.0724 (2.0187) [2022-01-24 06:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1130/1251] eta 0:04:27 lr 0.000231 time 2.1860 (2.2072) loss 3.7668 
(3.3081) grad_norm 1.9906 (2.0188) [2022-01-24 06:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1140/1251] eta 0:04:05 lr 0.000231 time 2.4554 (2.2082) loss 3.7370 (3.3077) grad_norm 2.7267 (2.0195) [2022-01-24 06:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1150/1251] eta 0:03:43 lr 0.000231 time 2.2223 (2.2080) loss 3.6162 (3.3070) grad_norm 2.1715 (2.0191) [2022-01-24 06:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1160/1251] eta 0:03:20 lr 0.000231 time 2.9551 (2.2083) loss 3.6187 (3.3078) grad_norm 1.9666 (2.0189) [2022-01-24 06:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1170/1251] eta 0:02:58 lr 0.000231 time 1.5121 (2.2086) loss 3.3724 (3.3075) grad_norm 2.0678 (2.0194) [2022-01-24 06:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1180/1251] eta 0:02:36 lr 0.000231 time 2.1403 (2.2070) loss 2.3265 (3.3083) grad_norm 1.9504 (2.0201) [2022-01-24 06:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1190/1251] eta 0:02:14 lr 0.000231 time 1.8881 (2.2060) loss 3.3016 (3.3077) grad_norm 2.0091 (2.0201) [2022-01-24 06:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1200/1251] eta 0:01:52 lr 0.000231 time 2.7823 (2.2055) loss 3.6828 (3.3073) grad_norm 1.9132 (2.0197) [2022-01-24 06:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1210/1251] eta 0:01:30 lr 0.000231 time 2.5671 (2.2055) loss 3.5013 (3.3073) grad_norm 3.1317 (2.0217) [2022-01-24 06:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1220/1251] eta 0:01:08 lr 0.000231 time 2.5789 (2.2057) loss 4.0003 (3.3089) grad_norm 2.1356 (2.0218) [2022-01-24 06:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1230/1251] eta 0:00:46 lr 0.000231 time 1.6066 (2.2060) loss 3.2924 (3.3091) grad_norm 2.2546 (2.0225) [2022-01-24 06:05:23 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [205/300][1240/1251] eta 0:00:24 lr 0.000231 time 2.0936 (2.2054) loss 3.2991 (3.3096) grad_norm 2.1365 (2.0234) [2022-01-24 06:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [205/300][1250/1251] eta 0:00:02 lr 0.000231 time 1.1840 (2.1997) loss 2.2349 (3.3065) grad_norm 5.2297 (2.0259) [2022-01-24 06:05:38 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 205 training takes 0:45:52 [2022-01-24 06:05:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.819 (18.819) Loss 0.9508 (0.9508) Acc@1 77.148 (77.148) Acc@5 94.434 (94.434) [2022-01-24 06:06:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.267 (3.559) Loss 0.9169 (0.9264) Acc@1 76.367 (77.947) Acc@5 95.020 (94.371) [2022-01-24 06:06:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.302 (2.657) Loss 0.8980 (0.9167) Acc@1 80.078 (78.255) Acc@5 94.336 (94.396) [2022-01-24 06:06:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.935 (2.298) Loss 0.9172 (0.9044) Acc@1 78.516 (78.434) Acc@5 93.555 (94.427) [2022-01-24 06:07:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.781 (2.206) Loss 0.9684 (0.9064) Acc@1 77.930 (78.387) Acc@5 93.359 (94.391) [2022-01-24 06:07:15 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.336 Acc@5 94.390 [2022-01-24 06:07:15 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.3% [2022-01-24 06:07:15 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.50% [2022-01-24 06:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][0/1251] eta 7:35:19 lr 0.000231 time 21.8378 (21.8378) loss 3.6264 (3.6264) grad_norm 2.0276 (2.0276) [2022-01-24 06:08:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][10/1251] eta 1:24:37 lr 0.000231 time 2.8021 (4.0914) loss 3.0778 (3.4037) grad_norm 1.7699 (1.9417) [2022-01-24 06:08:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][20/1251] eta 1:05:40 lr 0.000231 time 1.7094 (3.2010) loss 2.1911 (3.3133) grad_norm 2.0610 (1.9558) [2022-01-24 06:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][30/1251] eta 0:58:52 lr 0.000231 time 1.5420 (2.8929) loss 3.5712 (3.3124) grad_norm 1.7623 (1.9527) [2022-01-24 06:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][40/1251] eta 0:56:36 lr 0.000231 time 3.5024 (2.8043) loss 3.7864 (3.3327) grad_norm 2.2111 (2.0089) [2022-01-24 06:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][50/1251] eta 0:55:36 lr 0.000231 time 3.2170 (2.7782) loss 3.7913 (3.2924) grad_norm 1.9317 (1.9900) [2022-01-24 06:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][60/1251] eta 0:53:07 lr 0.000231 time 2.6556 (2.6760) loss 3.3001 (3.2849) grad_norm 1.9872 (1.9876) [2022-01-24 06:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][70/1251] eta 0:50:21 lr 0.000231 time 2.0173 (2.5587) loss 2.2271 (3.2776) grad_norm 2.1047 (1.9982) [2022-01-24 06:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][80/1251] eta 0:48:22 lr 0.000231 time 2.1458 (2.4785) loss 3.4722 (3.2974) grad_norm 2.1425 (2.0067) [2022-01-24 06:10:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][90/1251] eta 0:46:54 lr 0.000231 time 1.9815 (2.4244) loss 2.3781 (3.2790) grad_norm 2.0085 (1.9931) [2022-01-24 06:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][100/1251] eta 0:46:02 lr 0.000231 time 2.5422 (2.4004) loss 2.6745 (3.2892) grad_norm 1.9227 (1.9934) [2022-01-24 06:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][110/1251] eta 0:46:20 lr 0.000231 time 6.2091 (2.4367) loss 2.6401 (3.2729) grad_norm 1.8592 (2.0003) [2022-01-24 06:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][120/1251] eta 0:45:38 lr 0.000231 time 
2.0852 (2.4211) loss 3.7490 (3.2655) grad_norm 2.5862 (2.0087) [2022-01-24 06:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][130/1251] eta 0:44:50 lr 0.000231 time 1.9506 (2.4003) loss 2.6914 (3.2461) grad_norm 2.2533 (2.0051) [2022-01-24 06:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][140/1251] eta 0:44:03 lr 0.000231 time 2.1286 (2.3797) loss 3.5661 (3.2625) grad_norm 1.9104 (1.9957) [2022-01-24 06:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][150/1251] eta 0:43:41 lr 0.000231 time 4.5638 (2.3806) loss 3.6452 (3.2835) grad_norm 1.7688 (1.9923) [2022-01-24 06:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][160/1251] eta 0:42:50 lr 0.000231 time 1.7614 (2.3565) loss 3.3755 (3.3043) grad_norm 1.6090 (1.9946) [2022-01-24 06:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][170/1251] eta 0:42:11 lr 0.000230 time 2.0240 (2.3417) loss 3.7927 (3.3052) grad_norm 2.1758 (1.9901) [2022-01-24 06:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][180/1251] eta 0:41:32 lr 0.000230 time 1.8867 (2.3274) loss 3.7308 (3.3106) grad_norm 1.9002 (1.9894) [2022-01-24 06:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][190/1251] eta 0:41:14 lr 0.000230 time 3.6150 (2.3318) loss 3.0245 (3.3063) grad_norm 1.8941 (1.9915) [2022-01-24 06:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][200/1251] eta 0:40:53 lr 0.000230 time 1.9745 (2.3347) loss 4.3117 (3.3173) grad_norm 2.1308 (1.9948) [2022-01-24 06:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][210/1251] eta 0:40:22 lr 0.000230 time 1.7105 (2.3266) loss 3.5985 (3.3260) grad_norm 2.1980 (1.9949) [2022-01-24 06:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][220/1251] eta 0:39:52 lr 0.000230 time 1.7408 (2.3209) loss 3.8213 (3.3297) grad_norm 2.2579 (2.0001) [2022-01-24 06:16:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][230/1251] eta 0:39:28 lr 0.000230 time 3.9998 (2.3202) loss 3.6540 (3.3382) grad_norm 2.0733 (2.0015) [2022-01-24 06:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][240/1251] eta 0:38:55 lr 0.000230 time 1.7048 (2.3104) loss 3.5298 (3.3349) grad_norm 2.0054 (2.0058) [2022-01-24 06:16:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][250/1251] eta 0:38:22 lr 0.000230 time 1.6061 (2.3003) loss 3.6562 (3.3282) grad_norm 1.8823 (2.0037) [2022-01-24 06:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][260/1251] eta 0:37:57 lr 0.000230 time 2.2287 (2.2986) loss 3.7986 (3.3306) grad_norm 2.0148 (2.0073) [2022-01-24 06:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][270/1251] eta 0:37:36 lr 0.000230 time 3.6813 (2.3000) loss 2.9452 (3.3371) grad_norm 1.8817 (2.0044) [2022-01-24 06:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][280/1251] eta 0:37:14 lr 0.000230 time 1.8778 (2.3016) loss 2.6369 (3.3368) grad_norm 1.8433 (2.0047) [2022-01-24 06:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][290/1251] eta 0:36:42 lr 0.000230 time 1.7056 (2.2917) loss 3.7691 (3.3275) grad_norm 1.8834 (2.0059) [2022-01-24 06:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][300/1251] eta 0:36:18 lr 0.000230 time 1.8656 (2.2904) loss 2.3821 (3.3251) grad_norm 2.3341 (2.0102) [2022-01-24 06:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][310/1251] eta 0:35:58 lr 0.000230 time 5.5589 (2.2936) loss 3.6329 (3.3246) grad_norm 1.7237 (2.0101) [2022-01-24 06:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][320/1251] eta 0:35:28 lr 0.000230 time 1.8681 (2.2859) loss 3.0344 (3.3246) grad_norm 1.8624 (2.0095) [2022-01-24 06:19:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][330/1251] eta 0:34:57 lr 
0.000230 time 1.8642 (2.2774) loss 3.8591 (3.3192) grad_norm 2.1074 (2.0077) [2022-01-24 06:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][340/1251] eta 0:34:26 lr 0.000230 time 1.8959 (2.2682) loss 3.8342 (3.3231) grad_norm 1.9386 (2.0067) [2022-01-24 06:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][350/1251] eta 0:33:59 lr 0.000230 time 2.4262 (2.2639) loss 3.6000 (3.3113) grad_norm 2.2084 (2.0063) [2022-01-24 06:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][360/1251] eta 0:33:35 lr 0.000230 time 2.2467 (2.2625) loss 3.5046 (3.3094) grad_norm 1.8911 (2.0059) [2022-01-24 06:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][370/1251] eta 0:33:17 lr 0.000230 time 4.4610 (2.2673) loss 3.3591 (3.3116) grad_norm 2.3147 (2.0085) [2022-01-24 06:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][380/1251] eta 0:32:53 lr 0.000230 time 2.5531 (2.2661) loss 2.6831 (3.3007) grad_norm 2.0913 (2.0111) [2022-01-24 06:22:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][390/1251] eta 0:32:28 lr 0.000230 time 1.6783 (2.2631) loss 3.7094 (3.3048) grad_norm 2.5407 (2.0154) [2022-01-24 06:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][400/1251] eta 0:32:00 lr 0.000230 time 1.9197 (2.2566) loss 3.6701 (3.3078) grad_norm 2.1217 (2.0196) [2022-01-24 06:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][410/1251] eta 0:31:34 lr 0.000230 time 2.4653 (2.2528) loss 3.7206 (3.3057) grad_norm 1.8300 (2.0194) [2022-01-24 06:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][420/1251] eta 0:31:10 lr 0.000230 time 2.5108 (2.2509) loss 2.7908 (3.3020) grad_norm 1.7636 (2.0186) [2022-01-24 06:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][430/1251] eta 0:30:45 lr 0.000230 time 1.8462 (2.2476) loss 2.1563 (3.2929) grad_norm 1.9460 (2.0159) [2022-01-24 06:23:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][440/1251] eta 0:30:24 lr 0.000230 time 2.3098 (2.2491) loss 3.6304 (3.2888) grad_norm 1.9977 (2.0137) [2022-01-24 06:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][450/1251] eta 0:30:04 lr 0.000230 time 3.3590 (2.2528) loss 3.3937 (3.2908) grad_norm 1.8556 (2.0109) [2022-01-24 06:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][460/1251] eta 0:29:45 lr 0.000229 time 1.4234 (2.2573) loss 3.5312 (3.2983) grad_norm 2.0489 (2.0096) [2022-01-24 06:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][470/1251] eta 0:29:25 lr 0.000229 time 1.5191 (2.2606) loss 2.8607 (3.2952) grad_norm 1.8790 (2.0062) [2022-01-24 06:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][480/1251] eta 0:29:00 lr 0.000229 time 2.2540 (2.2578) loss 2.3257 (3.2918) grad_norm 1.8290 (2.0062) [2022-01-24 06:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][490/1251] eta 0:28:34 lr 0.000229 time 1.8604 (2.2531) loss 2.9075 (3.2866) grad_norm 2.0083 (2.0063) [2022-01-24 06:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][500/1251] eta 0:28:08 lr 0.000229 time 1.6379 (2.2482) loss 2.1387 (3.2844) grad_norm 1.9998 (2.0059) [2022-01-24 06:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][510/1251] eta 0:27:44 lr 0.000229 time 2.2259 (2.2467) loss 3.7821 (3.2826) grad_norm 1.9787 (2.0040) [2022-01-24 06:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][520/1251] eta 0:27:19 lr 0.000229 time 2.2282 (2.2430) loss 4.0950 (3.2872) grad_norm 2.0339 (2.0058) [2022-01-24 06:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][530/1251] eta 0:26:58 lr 0.000229 time 2.5409 (2.2448) loss 2.5872 (3.2840) grad_norm 2.1927 (2.0083) [2022-01-24 06:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][540/1251] eta 0:26:34 lr 
0.000229 time 1.8753 (2.2432) loss 2.9751 (3.2840) grad_norm 1.9584 (2.0105) [2022-01-24 06:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][550/1251] eta 0:26:10 lr 0.000229 time 2.4916 (2.2408) loss 3.6035 (3.2882) grad_norm 1.8410 (2.0092) [2022-01-24 06:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][560/1251] eta 0:25:45 lr 0.000229 time 1.8847 (2.2369) loss 3.4557 (3.2901) grad_norm 1.9107 (2.0092) [2022-01-24 06:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][570/1251] eta 0:25:22 lr 0.000229 time 2.1401 (2.2356) loss 2.3810 (3.2868) grad_norm 2.0406 (2.0081) [2022-01-24 06:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][580/1251] eta 0:25:01 lr 0.000229 time 2.4742 (2.2384) loss 2.4228 (3.2832) grad_norm 1.8975 (2.0074) [2022-01-24 06:29:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][590/1251] eta 0:24:40 lr 0.000229 time 2.3153 (2.2401) loss 3.5842 (3.2841) grad_norm 1.9574 (2.0085) [2022-01-24 06:29:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][600/1251] eta 0:24:19 lr 0.000229 time 2.5217 (2.2424) loss 3.5931 (3.2858) grad_norm 2.1541 (2.0097) [2022-01-24 06:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][610/1251] eta 0:23:58 lr 0.000229 time 2.1981 (2.2441) loss 3.1909 (3.2865) grad_norm 2.0352 (2.0085) [2022-01-24 06:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][620/1251] eta 0:23:33 lr 0.000229 time 2.2482 (2.2406) loss 3.2652 (3.2916) grad_norm 1.8523 (2.0083) [2022-01-24 06:30:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][630/1251] eta 0:23:09 lr 0.000229 time 2.5836 (2.2372) loss 3.4403 (3.2856) grad_norm 1.8985 (2.0070) [2022-01-24 06:31:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][640/1251] eta 0:22:44 lr 0.000229 time 2.2331 (2.2337) loss 3.2521 (3.2872) grad_norm 1.9301 (2.0063) [2022-01-24 06:31:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][650/1251] eta 0:22:21 lr 0.000229 time 1.9085 (2.2320) loss 2.1617 (3.2876) grad_norm 1.8763 (2.0060) [2022-01-24 06:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][660/1251] eta 0:21:58 lr 0.000229 time 2.2459 (2.2317) loss 2.7011 (3.2851) grad_norm 1.8420 (2.0041) [2022-01-24 06:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][670/1251] eta 0:21:36 lr 0.000229 time 2.1863 (2.2313) loss 3.2990 (3.2878) grad_norm 1.7121 (2.0030) [2022-01-24 06:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][680/1251] eta 0:21:14 lr 0.000229 time 2.5651 (2.2321) loss 2.0904 (3.2887) grad_norm 1.7392 (2.0022) [2022-01-24 06:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][690/1251] eta 0:20:52 lr 0.000229 time 2.0902 (2.2329) loss 3.9071 (3.2903) grad_norm 2.0410 (2.0024) [2022-01-24 06:33:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][700/1251] eta 0:20:29 lr 0.000229 time 1.5689 (2.2316) loss 3.3559 (3.2921) grad_norm 1.9836 (2.0015) [2022-01-24 06:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][710/1251] eta 0:20:07 lr 0.000229 time 2.4720 (2.2319) loss 2.9550 (3.2934) grad_norm 1.8181 (2.0023) [2022-01-24 06:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][720/1251] eta 0:19:43 lr 0.000229 time 2.4504 (2.2294) loss 2.7679 (3.2887) grad_norm 1.8710 (2.0015) [2022-01-24 06:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][730/1251] eta 0:19:19 lr 0.000229 time 2.5399 (2.2261) loss 2.8189 (3.2884) grad_norm 2.4274 (2.0087) [2022-01-24 06:34:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][740/1251] eta 0:18:56 lr 0.000229 time 1.9377 (2.2246) loss 2.1240 (3.2851) grad_norm 2.2010 (2.0110) [2022-01-24 06:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][750/1251] eta 0:18:34 lr 
0.000228 time 2.6514 (2.2254) loss 2.9557 (3.2813) grad_norm 2.0908 (2.0123) [2022-01-24 06:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][760/1251] eta 0:18:11 lr 0.000228 time 1.5528 (2.2234) loss 3.6097 (3.2836) grad_norm 2.1418 (2.0133) [2022-01-24 06:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][770/1251] eta 0:17:49 lr 0.000228 time 1.8796 (2.2238) loss 2.3033 (3.2819) grad_norm 1.9516 (2.0141) [2022-01-24 06:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][780/1251] eta 0:17:27 lr 0.000228 time 2.3169 (2.2250) loss 3.7500 (3.2817) grad_norm 2.0448 (2.0150) [2022-01-24 06:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][790/1251] eta 0:17:06 lr 0.000228 time 2.9966 (2.2271) loss 3.4141 (3.2817) grad_norm 1.8750 (2.0143) [2022-01-24 06:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][800/1251] eta 0:16:44 lr 0.000228 time 1.5441 (2.2275) loss 3.2917 (3.2836) grad_norm 1.7417 (2.0149) [2022-01-24 06:37:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][810/1251] eta 0:16:22 lr 0.000228 time 2.4265 (2.2270) loss 3.4966 (3.2816) grad_norm 1.8621 (2.0152) [2022-01-24 06:37:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][820/1251] eta 0:15:59 lr 0.000228 time 1.9211 (2.2267) loss 3.8180 (3.2848) grad_norm 1.9392 (2.0141) [2022-01-24 06:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][830/1251] eta 0:15:36 lr 0.000228 time 2.4799 (2.2251) loss 3.2250 (3.2851) grad_norm 2.0791 (2.0148) [2022-01-24 06:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][840/1251] eta 0:15:13 lr 0.000228 time 1.7611 (2.2233) loss 3.9421 (3.2875) grad_norm 2.0873 (2.0142) [2022-01-24 06:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][850/1251] eta 0:14:50 lr 0.000228 time 2.0741 (2.2212) loss 3.7485 (3.2880) grad_norm 2.0014 (2.0141) [2022-01-24 06:39:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][860/1251] eta 0:14:28 lr 0.000228 time 2.1079 (2.2210) loss 2.6286 (3.2899) grad_norm 2.1435 (2.0143) [2022-01-24 06:39:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][870/1251] eta 0:14:06 lr 0.000228 time 2.7468 (2.2212) loss 3.8856 (3.2894) grad_norm 2.5103 (2.0145) [2022-01-24 06:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][880/1251] eta 0:13:44 lr 0.000228 time 1.8326 (2.2229) loss 2.4782 (3.2905) grad_norm 2.1202 (2.0146) [2022-01-24 06:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][890/1251] eta 0:13:22 lr 0.000228 time 2.0128 (2.2225) loss 3.0364 (3.2926) grad_norm 1.7905 (2.0139) [2022-01-24 06:40:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][900/1251] eta 0:13:00 lr 0.000228 time 1.8633 (2.2226) loss 3.2128 (3.2934) grad_norm 2.5658 (2.0143) [2022-01-24 06:41:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][910/1251] eta 0:12:37 lr 0.000228 time 1.6396 (2.2222) loss 2.3651 (3.2935) grad_norm 1.9149 (2.0143) [2022-01-24 06:41:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][920/1251] eta 0:12:15 lr 0.000228 time 1.9176 (2.2214) loss 2.9578 (3.2935) grad_norm 2.0217 (2.0135) [2022-01-24 06:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][930/1251] eta 0:11:52 lr 0.000228 time 2.1574 (2.2200) loss 2.8063 (3.2942) grad_norm 2.0841 (2.0131) [2022-01-24 06:42:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][940/1251] eta 0:11:29 lr 0.000228 time 2.2822 (2.2172) loss 2.5415 (3.2941) grad_norm 1.6845 (2.0127) [2022-01-24 06:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][950/1251] eta 0:11:07 lr 0.000228 time 2.1204 (2.2171) loss 3.3615 (3.2945) grad_norm 1.8173 (2.0126) [2022-01-24 06:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][960/1251] eta 0:10:44 lr 
0.000228 time 1.9145 (2.2159) loss 2.4570 (3.2951) grad_norm 1.9332 (2.0124) [2022-01-24 06:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][970/1251] eta 0:10:22 lr 0.000228 time 1.8046 (2.2155) loss 3.8933 (3.2955) grad_norm 1.7892 (2.0112) [2022-01-24 06:43:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][980/1251] eta 0:10:00 lr 0.000228 time 1.9101 (2.2155) loss 3.5225 (3.2949) grad_norm 1.8724 (2.0109) [2022-01-24 06:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][990/1251] eta 0:09:38 lr 0.000228 time 2.7391 (2.2168) loss 2.8406 (3.2929) grad_norm 1.8830 (2.0102) [2022-01-24 06:44:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1000/1251] eta 0:09:16 lr 0.000228 time 2.3808 (2.2164) loss 3.4915 (3.2946) grad_norm 1.7961 (2.0094) [2022-01-24 06:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1010/1251] eta 0:08:54 lr 0.000228 time 1.9896 (2.2173) loss 3.2319 (3.2968) grad_norm 1.6095 (2.0081) [2022-01-24 06:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1020/1251] eta 0:08:31 lr 0.000228 time 1.8629 (2.2156) loss 3.8591 (3.2978) grad_norm 1.8965 (2.0086) [2022-01-24 06:45:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1030/1251] eta 0:08:09 lr 0.000228 time 2.5564 (2.2160) loss 3.6486 (3.3003) grad_norm 1.7666 (2.0068) [2022-01-24 06:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1040/1251] eta 0:07:47 lr 0.000227 time 2.2062 (2.2151) loss 3.4071 (3.3005) grad_norm 1.9003 (2.0066) [2022-01-24 06:46:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1050/1251] eta 0:07:24 lr 0.000227 time 1.6487 (2.2139) loss 3.4768 (3.3006) grad_norm 2.3757 (2.0065) [2022-01-24 06:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1060/1251] eta 0:07:02 lr 0.000227 time 2.2300 (2.2140) loss 3.8958 (3.3032) grad_norm 1.9866 (2.0062) [2022-01-24 
06:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1070/1251] eta 0:06:40 lr 0.000227 time 2.4387 (2.2151) loss 2.4634 (3.3028) grad_norm 2.2862 (2.0060) [2022-01-24 06:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1080/1251] eta 0:06:18 lr 0.000227 time 2.2691 (2.2134) loss 2.3861 (3.3033) grad_norm 1.8828 (2.0052) [2022-01-24 06:47:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1090/1251] eta 0:05:56 lr 0.000227 time 1.8439 (2.2123) loss 3.3495 (3.3053) grad_norm 1.9422 (2.0057) [2022-01-24 06:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1100/1251] eta 0:05:34 lr 0.000227 time 1.9958 (2.2122) loss 3.1668 (3.3065) grad_norm 1.7778 (2.0049) [2022-01-24 06:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1110/1251] eta 0:05:12 lr 0.000227 time 2.8523 (2.2131) loss 2.0228 (3.3046) grad_norm 1.9228 (2.0047) [2022-01-24 06:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1120/1251] eta 0:04:50 lr 0.000227 time 2.6891 (2.2144) loss 2.2749 (3.3037) grad_norm 2.3128 (2.0052) [2022-01-24 06:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1130/1251] eta 0:04:27 lr 0.000227 time 1.5304 (2.2141) loss 3.4284 (3.3039) grad_norm 1.9938 (2.0062) [2022-01-24 06:49:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1140/1251] eta 0:04:05 lr 0.000227 time 1.9801 (2.2135) loss 2.2882 (3.3010) grad_norm 1.8547 (2.0067) [2022-01-24 06:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1150/1251] eta 0:03:43 lr 0.000227 time 2.0727 (2.2130) loss 3.5524 (3.3004) grad_norm 1.7052 (2.0059) [2022-01-24 06:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1160/1251] eta 0:03:21 lr 0.000227 time 2.2204 (2.2121) loss 3.2743 (3.3025) grad_norm 2.1983 (2.0061) [2022-01-24 06:50:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1170/1251] 
eta 0:02:59 lr 0.000227 time 2.1309 (2.2109) loss 3.8186 (3.3040) grad_norm 1.8653 (2.0060) [2022-01-24 06:50:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1180/1251] eta 0:02:36 lr 0.000227 time 1.6912 (2.2109) loss 2.8555 (3.3040) grad_norm 1.8379 (2.0069) [2022-01-24 06:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1190/1251] eta 0:02:14 lr 0.000227 time 1.5504 (2.2108) loss 3.7992 (3.3051) grad_norm 2.0937 (2.0078) [2022-01-24 06:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1200/1251] eta 0:01:52 lr 0.000227 time 2.8274 (2.2126) loss 3.8072 (3.3067) grad_norm 2.1178 (2.0078) [2022-01-24 06:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1210/1251] eta 0:01:30 lr 0.000227 time 2.5231 (2.2132) loss 3.7400 (3.3088) grad_norm 1.9483 (2.0073) [2022-01-24 06:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1220/1251] eta 0:01:08 lr 0.000227 time 1.8161 (2.2124) loss 2.5551 (3.3078) grad_norm 1.8747 (2.0065) [2022-01-24 06:52:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1230/1251] eta 0:00:46 lr 0.000227 time 1.8107 (2.2098) loss 3.5839 (3.3097) grad_norm 2.0570 (2.0062) [2022-01-24 06:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1240/1251] eta 0:00:24 lr 0.000227 time 1.8287 (2.2083) loss 3.5634 (3.3099) grad_norm 1.9679 (2.0062) [2022-01-24 06:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [206/300][1250/1251] eta 0:00:02 lr 0.000227 time 1.1742 (2.2026) loss 2.7568 (3.3083) grad_norm 2.1284 (2.0060) [2022-01-24 06:53:11 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 206 training takes 0:45:55 [2022-01-24 06:53:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.341 (18.341) Loss 1.0157 (1.0157) Acc@1 77.539 (77.539) Acc@5 93.457 (93.457) [2022-01-24 06:53:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.996 (3.543) 
Loss 1.0265 (0.9592) Acc@1 76.172 (77.681) Acc@5 93.066 (94.221) [2022-01-24 06:54:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.314 (2.673) Loss 0.8773 (0.9369) Acc@1 78.906 (78.111) Acc@5 94.629 (94.424) [2022-01-24 06:54:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.992 (2.286) Loss 0.9049 (0.9247) Acc@1 78.809 (78.409) Acc@5 94.238 (94.544) [2022-01-24 06:54:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.449 (2.210) Loss 0.9593 (0.9202) Acc@1 78.223 (78.520) Acc@5 93.750 (94.484) [2022-01-24 06:54:49 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.614 Acc@5 94.518 [2022-01-24 06:54:49 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.6% [2022-01-24 06:54:49 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.61% [2022-01-24 06:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][0/1251] eta 7:34:44 lr 0.000227 time 21.8105 (21.8105) loss 3.3460 (3.3460) grad_norm 2.2938 (2.2938) [2022-01-24 06:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][10/1251] eta 1:20:59 lr 0.000227 time 1.4944 (3.9159) loss 2.8422 (3.4777) grad_norm 1.7365 (1.9594) [2022-01-24 06:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][20/1251] eta 1:02:21 lr 0.000227 time 2.2200 (3.0394) loss 4.0095 (3.4235) grad_norm 2.0052 (1.9627) [2022-01-24 06:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][30/1251] eta 0:55:37 lr 0.000227 time 1.8797 (2.7337) loss 3.5499 (3.3267) grad_norm 2.2313 (1.9556) [2022-01-24 06:56:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][40/1251] eta 0:52:52 lr 0.000227 time 3.3960 (2.6197) loss 3.7319 (3.3739) grad_norm 1.8911 (1.9700) [2022-01-24 06:56:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][50/1251] eta 0:50:48 lr 0.000227 time 2.3026 (2.5384) loss 2.4736 (3.3031) 
grad_norm 2.0245 (1.9846) [2022-01-24 06:57:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][60/1251] eta 0:49:44 lr 0.000227 time 3.0910 (2.5060) loss 3.6127 (3.3091) grad_norm 1.8839 (1.9956) [2022-01-24 06:57:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][70/1251] eta 0:48:19 lr 0.000227 time 1.8671 (2.4554) loss 2.3901 (3.2698) grad_norm 1.8324 (1.9842) [2022-01-24 06:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][80/1251] eta 0:47:41 lr 0.000226 time 3.7967 (2.4438) loss 3.9224 (3.2459) grad_norm 1.8328 (1.9796) [2022-01-24 06:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][90/1251] eta 0:46:52 lr 0.000226 time 1.6168 (2.4224) loss 3.3808 (3.2671) grad_norm 1.9551 (1.9895) [2022-01-24 06:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][100/1251] eta 0:46:08 lr 0.000226 time 1.7316 (2.4051) loss 3.3580 (3.2273) grad_norm 2.1524 (1.9975) [2022-01-24 06:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][110/1251] eta 0:45:24 lr 0.000226 time 1.8147 (2.3877) loss 4.1089 (3.2417) grad_norm 2.1921 (2.0078) [2022-01-24 06:59:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][120/1251] eta 0:45:07 lr 0.000226 time 3.4434 (2.3939) loss 3.3201 (3.2495) grad_norm 2.8958 (2.0180) [2022-01-24 07:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][130/1251] eta 0:44:52 lr 0.000226 time 2.7826 (2.4023) loss 3.3552 (3.2608) grad_norm 1.8190 (2.0145) [2022-01-24 07:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][140/1251] eta 0:44:01 lr 0.000226 time 1.8457 (2.3776) loss 4.2022 (3.2556) grad_norm 1.8850 (2.0148) [2022-01-24 07:00:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][150/1251] eta 0:43:12 lr 0.000226 time 1.9279 (2.3548) loss 2.9508 (3.2536) grad_norm 2.2352 (2.0193) [2022-01-24 07:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[207/300][160/1251] eta 0:42:26 lr 0.000226 time 1.9774 (2.3344) loss 3.2237 (3.2637) grad_norm 2.1126 (2.0193) [2022-01-24 07:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][170/1251] eta 0:41:50 lr 0.000226 time 2.0093 (2.3224) loss 3.3185 (3.2721) grad_norm 2.0363 (2.0173) [2022-01-24 07:01:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][180/1251] eta 0:41:17 lr 0.000226 time 1.9476 (2.3134) loss 3.1470 (3.2664) grad_norm 1.9451 (2.0116) [2022-01-24 07:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][190/1251] eta 0:40:46 lr 0.000226 time 2.2219 (2.3055) loss 3.7248 (3.2700) grad_norm 1.9462 (2.0064) [2022-01-24 07:02:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][200/1251] eta 0:40:17 lr 0.000226 time 2.6652 (2.3006) loss 3.5587 (3.2862) grad_norm 1.7524 (2.0030) [2022-01-24 07:02:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][210/1251] eta 0:39:53 lr 0.000226 time 2.4565 (2.2992) loss 3.8345 (3.2916) grad_norm 1.7896 (2.0026) [2022-01-24 07:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][220/1251] eta 0:39:28 lr 0.000226 time 1.8344 (2.2972) loss 2.7180 (3.2974) grad_norm 1.8726 (2.0076) [2022-01-24 07:03:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][230/1251] eta 0:39:13 lr 0.000226 time 3.0786 (2.3054) loss 2.8040 (3.2968) grad_norm 2.1104 (2.0098) [2022-01-24 07:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][240/1251] eta 0:38:51 lr 0.000226 time 2.4974 (2.3061) loss 3.7399 (3.2991) grad_norm 1.9780 (2.0152) [2022-01-24 07:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][250/1251] eta 0:38:18 lr 0.000226 time 1.8684 (2.2965) loss 2.8113 (3.2905) grad_norm 1.9271 (2.0156) [2022-01-24 07:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][260/1251] eta 0:37:46 lr 0.000226 time 1.9115 (2.2872) loss 4.1835 (3.2888) grad_norm 
2.2576 (2.0159) [2022-01-24 07:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][270/1251] eta 0:37:15 lr 0.000226 time 2.9330 (2.2793) loss 3.4378 (3.2871) grad_norm 2.1459 (2.0198) [2022-01-24 07:05:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][280/1251] eta 0:36:46 lr 0.000226 time 2.1757 (2.2723) loss 3.6888 (3.2826) grad_norm 2.1149 (2.0171) [2022-01-24 07:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][290/1251] eta 0:36:17 lr 0.000226 time 2.1792 (2.2659) loss 3.4391 (3.2878) grad_norm 2.0536 (2.0205) [2022-01-24 07:06:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][300/1251] eta 0:35:51 lr 0.000226 time 1.8264 (2.2622) loss 3.6696 (3.2982) grad_norm 2.1754 (2.0199) [2022-01-24 07:06:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][310/1251] eta 0:35:26 lr 0.000226 time 2.7890 (2.2594) loss 3.8435 (3.2973) grad_norm 1.8913 (2.0165) [2022-01-24 07:06:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][320/1251] eta 0:35:03 lr 0.000226 time 2.2264 (2.2590) loss 3.0265 (3.2978) grad_norm 2.0461 (2.0143) [2022-01-24 07:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][330/1251] eta 0:34:39 lr 0.000226 time 2.1711 (2.2578) loss 2.4610 (3.2938) grad_norm 1.9496 (2.0150) [2022-01-24 07:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][340/1251] eta 0:34:18 lr 0.000226 time 2.5039 (2.2600) loss 3.8561 (3.2940) grad_norm 1.9487 (2.0165) [2022-01-24 07:08:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][350/1251] eta 0:33:58 lr 0.000226 time 2.7910 (2.2627) loss 3.1039 (3.2850) grad_norm 2.0692 (2.0167) [2022-01-24 07:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][360/1251] eta 0:33:40 lr 0.000226 time 2.8009 (2.2681) loss 3.9755 (3.2864) grad_norm 1.8795 (2.0176) [2022-01-24 07:08:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[207/300][370/1251] eta 0:33:15 lr 0.000226 time 1.9641 (2.2649) loss 3.6994 (3.2933) grad_norm 1.8988 (2.0161) [2022-01-24 07:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][380/1251] eta 0:32:47 lr 0.000225 time 1.9002 (2.2591) loss 3.0516 (3.2943) grad_norm 2.1409 (2.0182) [2022-01-24 07:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][390/1251] eta 0:32:15 lr 0.000225 time 1.8085 (2.2476) loss 3.4645 (3.2976) grad_norm 2.2398 (2.0195) [2022-01-24 07:09:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][400/1251] eta 0:31:44 lr 0.000225 time 1.9562 (2.2376) loss 2.8501 (3.2990) grad_norm 1.8045 (2.0210) [2022-01-24 07:10:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][410/1251] eta 0:31:20 lr 0.000225 time 1.8132 (2.2366) loss 2.9795 (3.2982) grad_norm 2.3673 (2.0247) [2022-01-24 07:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][420/1251] eta 0:30:55 lr 0.000225 time 2.0205 (2.2331) loss 3.0794 (3.3037) grad_norm 1.7603 (2.0248) [2022-01-24 07:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][430/1251] eta 0:30:33 lr 0.000225 time 2.5775 (2.2328) loss 3.8118 (3.3106) grad_norm 1.8923 (2.0242) [2022-01-24 07:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][440/1251] eta 0:30:11 lr 0.000225 time 2.1039 (2.2334) loss 4.0729 (3.3092) grad_norm 2.0692 (2.0229) [2022-01-24 07:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][450/1251] eta 0:29:51 lr 0.000225 time 1.9245 (2.2360) loss 3.5406 (3.3067) grad_norm 2.2709 (2.0231) [2022-01-24 07:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][460/1251] eta 0:29:28 lr 0.000225 time 1.9034 (2.2361) loss 2.4011 (3.3040) grad_norm 1.9137 (2.0209) [2022-01-24 07:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][470/1251] eta 0:29:05 lr 0.000225 time 2.0624 (2.2353) loss 3.5377 (3.3014) grad_norm 
2.2067 (2.0204) [2022-01-24 07:12:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][480/1251] eta 0:28:45 lr 0.000225 time 3.0503 (2.2382) loss 4.1012 (3.3035) grad_norm 2.1184 (2.0205) [2022-01-24 07:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][490/1251] eta 0:28:25 lr 0.000225 time 1.7093 (2.2416) loss 3.5300 (3.3016) grad_norm 1.7256 (2.0204) [2022-01-24 07:13:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][500/1251] eta 0:28:02 lr 0.000225 time 2.1298 (2.2406) loss 3.4697 (3.3029) grad_norm 1.8171 (2.0201) [2022-01-24 07:13:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][510/1251] eta 0:27:37 lr 0.000225 time 2.2802 (2.2366) loss 3.9583 (3.3037) grad_norm 1.8786 (2.0187) [2022-01-24 07:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][520/1251] eta 0:27:12 lr 0.000225 time 2.2205 (2.2333) loss 2.8283 (3.3031) grad_norm 1.9804 (2.0184) [2022-01-24 07:14:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][530/1251] eta 0:26:48 lr 0.000225 time 2.1936 (2.2310) loss 3.6951 (3.2995) grad_norm 1.9521 (2.0198) [2022-01-24 07:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][540/1251] eta 0:26:26 lr 0.000225 time 2.3951 (2.2308) loss 2.7336 (3.2960) grad_norm 1.9161 (2.0233) [2022-01-24 07:15:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][550/1251] eta 0:26:03 lr 0.000225 time 2.1858 (2.2307) loss 3.9257 (3.2960) grad_norm 2.0107 (2.0240) [2022-01-24 07:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][560/1251] eta 0:25:41 lr 0.000225 time 2.2206 (2.2307) loss 2.3756 (3.2938) grad_norm 2.0447 (2.0244) [2022-01-24 07:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][570/1251] eta 0:25:18 lr 0.000225 time 2.4063 (2.2302) loss 2.8051 (3.2913) grad_norm 2.3335 (2.0266) [2022-01-24 07:16:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[207/300][580/1251] eta 0:24:56 lr 0.000225 time 2.3947 (2.2295) loss 3.4245 (3.2921) grad_norm 1.7816 (2.0252) [2022-01-24 07:16:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][590/1251] eta 0:24:32 lr 0.000225 time 1.9484 (2.2269) loss 2.3318 (3.2906) grad_norm 1.9548 (2.0255) [2022-01-24 07:17:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][600/1251] eta 0:24:07 lr 0.000225 time 1.9075 (2.2241) loss 3.7045 (3.2904) grad_norm 2.0943 (2.0262) [2022-01-24 07:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][610/1251] eta 0:23:44 lr 0.000225 time 2.1485 (2.2225) loss 2.6664 (3.2946) grad_norm 1.8666 (2.0257) [2022-01-24 07:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][620/1251] eta 0:23:23 lr 0.000225 time 2.8496 (2.2235) loss 2.9285 (3.2935) grad_norm 1.8745 (2.0266) [2022-01-24 07:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][630/1251] eta 0:22:59 lr 0.000225 time 1.9720 (2.2216) loss 2.9889 (3.2931) grad_norm 2.0634 (2.0261) [2022-01-24 07:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][640/1251] eta 0:22:37 lr 0.000225 time 2.1675 (2.2212) loss 3.3250 (3.2922) grad_norm 1.8413 (2.0260) [2022-01-24 07:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][650/1251] eta 0:22:15 lr 0.000225 time 1.5934 (2.2222) loss 3.2782 (3.2889) grad_norm 2.1279 (2.0265) [2022-01-24 07:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][660/1251] eta 0:21:54 lr 0.000225 time 3.0500 (2.2235) loss 2.3576 (3.2822) grad_norm 1.9890 (2.0257) [2022-01-24 07:19:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][670/1251] eta 0:21:30 lr 0.000224 time 1.5398 (2.2212) loss 3.4070 (3.2813) grad_norm 2.1566 (2.0269) [2022-01-24 07:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][680/1251] eta 0:21:07 lr 0.000224 time 2.3295 (2.2205) loss 2.6939 (3.2789) grad_norm 
1.7514 (2.0274) [2022-01-24 07:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][690/1251] eta 0:20:44 lr 0.000224 time 1.7206 (2.2183) loss 3.4403 (3.2776) grad_norm 2.1208 (2.0274) [2022-01-24 07:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][700/1251] eta 0:20:22 lr 0.000224 time 2.1811 (2.2188) loss 4.0187 (3.2796) grad_norm 2.9185 (2.0273) [2022-01-24 07:21:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][710/1251] eta 0:19:59 lr 0.000224 time 1.7024 (2.2174) loss 3.4087 (3.2813) grad_norm 1.8838 (2.0285) [2022-01-24 07:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][720/1251] eta 0:19:36 lr 0.000224 time 2.2004 (2.2165) loss 4.0172 (3.2807) grad_norm 2.2790 (2.0279) [2022-01-24 07:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][730/1251] eta 0:19:14 lr 0.000224 time 1.8615 (2.2150) loss 4.0861 (3.2790) grad_norm 2.1571 (2.0278) [2022-01-24 07:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][740/1251] eta 0:18:52 lr 0.000224 time 2.1635 (2.2165) loss 3.7065 (3.2791) grad_norm 1.7007 (2.0264) [2022-01-24 07:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][750/1251] eta 0:18:29 lr 0.000224 time 1.8061 (2.2139) loss 3.5609 (3.2752) grad_norm 2.2106 (2.0270) [2022-01-24 07:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][760/1251] eta 0:18:06 lr 0.000224 time 1.8098 (2.2128) loss 2.8140 (3.2773) grad_norm 1.8431 (2.0268) [2022-01-24 07:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][770/1251] eta 0:17:43 lr 0.000224 time 2.6366 (2.2118) loss 2.5778 (3.2782) grad_norm 2.1381 (2.0267) [2022-01-24 07:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][780/1251] eta 0:17:21 lr 0.000224 time 1.7549 (2.2103) loss 3.4795 (3.2809) grad_norm 1.9584 (2.0273) [2022-01-24 07:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[207/300][790/1251] eta 0:16:58 lr 0.000224 time 1.9131 (2.2103) loss 3.6341 (3.2791) grad_norm 1.8900 (2.0285) [2022-01-24 07:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][800/1251] eta 0:16:37 lr 0.000224 time 2.9042 (2.2124) loss 3.7471 (3.2787) grad_norm 1.9665 (2.0279) [2022-01-24 07:24:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][810/1251] eta 0:16:15 lr 0.000224 time 2.5867 (2.2113) loss 3.8850 (3.2797) grad_norm 2.0993 (2.0271) [2022-01-24 07:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][820/1251] eta 0:15:53 lr 0.000224 time 1.8062 (2.2117) loss 4.1694 (3.2847) grad_norm 2.1292 (2.0265) [2022-01-24 07:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][830/1251] eta 0:15:31 lr 0.000224 time 2.4839 (2.2117) loss 3.6150 (3.2830) grad_norm 1.8569 (2.0250) [2022-01-24 07:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][840/1251] eta 0:15:08 lr 0.000224 time 1.8525 (2.2107) loss 3.0289 (3.2817) grad_norm 2.1476 (2.0250) [2022-01-24 07:26:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][850/1251] eta 0:14:45 lr 0.000224 time 1.7013 (2.2088) loss 3.2737 (3.2804) grad_norm 1.8805 (2.0244) [2022-01-24 07:26:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][860/1251] eta 0:14:22 lr 0.000224 time 2.1763 (2.2071) loss 3.7031 (3.2811) grad_norm 1.7829 (2.0231) [2022-01-24 07:26:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][870/1251] eta 0:14:00 lr 0.000224 time 1.5287 (2.2062) loss 2.9183 (3.2772) grad_norm 1.8758 (2.0221) [2022-01-24 07:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][880/1251] eta 0:13:39 lr 0.000224 time 2.3994 (2.2093) loss 3.5619 (3.2800) grad_norm 1.8381 (2.0214) [2022-01-24 07:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][890/1251] eta 0:13:17 lr 0.000224 time 1.6570 (2.2095) loss 3.3458 (3.2792) grad_norm 
2.1830 (2.0208) [2022-01-24 07:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][900/1251] eta 0:12:55 lr 0.000224 time 1.8354 (2.2092) loss 3.2219 (3.2766) grad_norm 1.6886 (2.0204) [2022-01-24 07:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][910/1251] eta 0:12:32 lr 0.000224 time 1.8907 (2.2077) loss 3.7232 (3.2792) grad_norm 2.5190 (2.0196) [2022-01-24 07:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][920/1251] eta 0:12:10 lr 0.000224 time 1.7993 (2.2075) loss 3.3781 (3.2813) grad_norm 1.8306 (2.0194) [2022-01-24 07:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][930/1251] eta 0:11:47 lr 0.000224 time 1.9357 (2.2049) loss 2.7779 (3.2816) grad_norm 2.0410 (2.0190) [2022-01-24 07:29:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][940/1251] eta 0:11:25 lr 0.000224 time 1.9359 (2.2038) loss 2.7855 (3.2808) grad_norm 2.0668 (2.0198) [2022-01-24 07:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][950/1251] eta 0:11:03 lr 0.000224 time 2.7565 (2.2038) loss 3.8007 (3.2793) grad_norm 1.8433 (2.0228) [2022-01-24 07:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][960/1251] eta 0:10:41 lr 0.000223 time 2.2143 (2.2049) loss 3.4470 (3.2812) grad_norm 2.0227 (2.0225) [2022-01-24 07:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][970/1251] eta 0:10:19 lr 0.000223 time 1.8602 (2.2048) loss 3.4775 (3.2805) grad_norm 2.0392 (2.0227) [2022-01-24 07:30:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][980/1251] eta 0:09:57 lr 0.000223 time 2.4541 (2.2058) loss 3.7468 (3.2801) grad_norm 1.7248 (2.0230) [2022-01-24 07:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][990/1251] eta 0:09:35 lr 0.000223 time 2.3897 (2.2058) loss 3.6832 (3.2829) grad_norm 2.1660 (2.0234) [2022-01-24 07:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[207/300][1000/1251] eta 0:09:13 lr 0.000223 time 1.8648 (2.2045) loss 3.5819 (3.2834) grad_norm 1.8079 (2.0233) [2022-01-24 07:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1010/1251] eta 0:08:51 lr 0.000223 time 1.9783 (2.2035) loss 3.4563 (3.2856) grad_norm 1.6402 (2.0236) [2022-01-24 07:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1020/1251] eta 0:08:29 lr 0.000223 time 2.8097 (2.2041) loss 2.5400 (3.2857) grad_norm 2.0154 (2.0241) [2022-01-24 07:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1030/1251] eta 0:08:07 lr 0.000223 time 2.2282 (2.2041) loss 3.1824 (3.2856) grad_norm 1.9023 (2.0241) [2022-01-24 07:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1040/1251] eta 0:07:45 lr 0.000223 time 2.5319 (2.2052) loss 3.0971 (3.2843) grad_norm 1.9550 (2.0236) [2022-01-24 07:33:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1050/1251] eta 0:07:23 lr 0.000223 time 1.6382 (2.2050) loss 3.6361 (3.2846) grad_norm 2.0744 (2.0233) [2022-01-24 07:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1060/1251] eta 0:07:01 lr 0.000223 time 3.1240 (2.2052) loss 3.2377 (3.2834) grad_norm 1.9385 (2.0226) [2022-01-24 07:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1070/1251] eta 0:06:38 lr 0.000223 time 1.6156 (2.2043) loss 3.2888 (3.2856) grad_norm 1.9412 (2.0223) [2022-01-24 07:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1080/1251] eta 0:06:17 lr 0.000223 time 2.7679 (2.2050) loss 3.9100 (3.2900) grad_norm 1.8880 (2.0215) [2022-01-24 07:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1090/1251] eta 0:05:54 lr 0.000223 time 2.1992 (2.2041) loss 3.4948 (3.2902) grad_norm 1.8468 (2.0206) [2022-01-24 07:35:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1100/1251] eta 0:05:32 lr 0.000223 time 1.9277 (2.2028) loss 4.0045 (3.2915) 
grad_norm 1.9738 (2.0196) [2022-01-24 07:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1110/1251] eta 0:05:10 lr 0.000223 time 1.8933 (2.2011) loss 3.1353 (3.2915) grad_norm 1.9392 (2.0192) [2022-01-24 07:35:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1120/1251] eta 0:04:48 lr 0.000223 time 1.8635 (2.2002) loss 2.8652 (3.2906) grad_norm 1.7607 (2.0176) [2022-01-24 07:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1130/1251] eta 0:04:26 lr 0.000223 time 2.5200 (2.1999) loss 2.0085 (3.2901) grad_norm 1.8424 (2.0172) [2022-01-24 07:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1140/1251] eta 0:04:04 lr 0.000223 time 1.4562 (2.2002) loss 2.5217 (3.2876) grad_norm 1.6897 (2.0167) [2022-01-24 07:37:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1150/1251] eta 0:03:42 lr 0.000223 time 1.7918 (2.2001) loss 3.4703 (3.2860) grad_norm 2.1020 (2.0172) [2022-01-24 07:37:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1160/1251] eta 0:03:20 lr 0.000223 time 2.1760 (2.2009) loss 2.8375 (3.2852) grad_norm 1.8396 (2.0174) [2022-01-24 07:37:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1170/1251] eta 0:02:58 lr 0.000223 time 2.7881 (2.2017) loss 3.1254 (3.2856) grad_norm 1.8375 (2.0179) [2022-01-24 07:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1180/1251] eta 0:02:36 lr 0.000223 time 1.8519 (2.2016) loss 3.3491 (3.2855) grad_norm 1.8957 (2.0185) [2022-01-24 07:38:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1190/1251] eta 0:02:14 lr 0.000223 time 1.5734 (2.2010) loss 3.2465 (3.2863) grad_norm 1.9410 (2.0187) [2022-01-24 07:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1200/1251] eta 0:01:52 lr 0.000223 time 1.8091 (2.2000) loss 2.6923 (3.2847) grad_norm 2.2934 (2.0183) [2022-01-24 07:39:13 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [207/300][1210/1251] eta 0:01:30 lr 0.000223 time 2.9624 (2.1998) loss 3.1139 (3.2864) grad_norm 1.7419 (2.0181) [2022-01-24 07:39:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1220/1251] eta 0:01:08 lr 0.000223 time 2.1541 (2.2001) loss 3.5810 (3.2848) grad_norm 2.3277 (2.0184) [2022-01-24 07:39:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1230/1251] eta 0:00:46 lr 0.000223 time 1.6810 (2.2000) loss 3.7585 (3.2838) grad_norm 1.8337 (2.0189) [2022-01-24 07:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1240/1251] eta 0:00:24 lr 0.000223 time 1.2384 (2.1987) loss 3.3424 (3.2847) grad_norm 1.9074 (2.0192) [2022-01-24 07:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [207/300][1250/1251] eta 0:00:02 lr 0.000223 time 1.1967 (2.1938) loss 3.3251 (3.2846) grad_norm 1.8231 (2.0192) [2022-01-24 07:40:34 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 207 training takes 0:45:44 [2022-01-24 07:40:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.534 (20.534) Loss 0.8590 (0.8590) Acc@1 80.176 (80.176) Acc@5 95.703 (95.703) [2022-01-24 07:41:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.670 (3.210) Loss 0.9491 (0.9200) Acc@1 77.734 (78.382) Acc@5 94.531 (94.434) [2022-01-24 07:41:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.661 (2.529) Loss 0.8747 (0.8923) Acc@1 78.906 (78.818) Acc@5 94.824 (94.717) [2022-01-24 07:41:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.917 (2.298) Loss 0.7907 (0.8913) Acc@1 80.273 (78.897) Acc@5 95.996 (94.686) [2022-01-24 07:42:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.014 (2.232) Loss 0.8829 (0.8988) Acc@1 79.004 (78.792) Acc@5 94.629 (94.536) [2022-01-24 07:42:13 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.808 Acc@5 94.604 [2022-01-24 07:42:13 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-01-24 07:42:13 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.81% [2022-01-24 07:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][0/1251] eta 7:17:56 lr 0.000222 time 21.0041 (21.0041) loss 3.9569 (3.9569) grad_norm 2.1016 (2.1016) [2022-01-24 07:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][10/1251] eta 1:24:16 lr 0.000222 time 1.8692 (4.0745) loss 2.9819 (3.4589) grad_norm 2.0838 (2.0716) [2022-01-24 07:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][20/1251] eta 1:05:51 lr 0.000222 time 2.2744 (3.2104) loss 3.8610 (3.2581) grad_norm 1.6508 (2.0421) [2022-01-24 07:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][30/1251] eta 0:58:32 lr 0.000222 time 1.7387 (2.8764) loss 3.7835 (3.2777) grad_norm 2.2736 (2.0722) [2022-01-24 07:44:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][40/1251] eta 0:55:15 lr 0.000222 time 3.6652 (2.7381) loss 2.2496 (3.2487) grad_norm 2.0946 (2.1132) [2022-01-24 07:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][50/1251] eta 0:53:08 lr 0.000222 time 2.0855 (2.6548) loss 3.6491 (3.2624) grad_norm 2.0170 (2.1030) [2022-01-24 07:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][60/1251] eta 0:51:00 lr 0.000222 time 1.8880 (2.5699) loss 3.4680 (3.2522) grad_norm 1.8681 (2.0866) [2022-01-24 07:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][70/1251] eta 0:49:42 lr 0.000222 time 1.8643 (2.5252) loss 3.6221 (3.2884) grad_norm 2.1301 (2.0931) [2022-01-24 07:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][80/1251] eta 0:48:44 lr 0.000222 time 3.3725 (2.4971) loss 3.1874 (3.2959) grad_norm 2.3725 (2.1016) [2022-01-24 07:45:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][90/1251] eta 0:47:49 lr 0.000222 time 
3.3937 (2.4713) loss 3.5924 (3.2643) grad_norm 1.7361 (2.0869) [2022-01-24 07:46:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][100/1251] eta 0:46:51 lr 0.000222 time 1.9264 (2.4431) loss 3.0121 (3.2697) grad_norm 2.0156 (2.0915) [2022-01-24 07:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][110/1251] eta 0:45:47 lr 0.000222 time 2.1719 (2.4079) loss 2.8707 (3.2688) grad_norm 1.8206 (2.0808) [2022-01-24 07:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][120/1251] eta 0:45:01 lr 0.000222 time 2.3446 (2.3882) loss 3.7580 (3.2650) grad_norm 2.3515 (2.0776) [2022-01-24 07:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][130/1251] eta 0:44:18 lr 0.000222 time 2.4258 (2.3716) loss 3.2230 (3.2640) grad_norm 2.2808 (2.0682) [2022-01-24 07:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][140/1251] eta 0:43:42 lr 0.000222 time 2.1252 (2.3602) loss 2.7687 (3.2713) grad_norm 2.0780 (2.0618) [2022-01-24 07:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][150/1251] eta 0:43:01 lr 0.000222 time 1.5850 (2.3446) loss 3.4138 (3.2611) grad_norm 1.9050 (2.0581) [2022-01-24 07:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][160/1251] eta 0:42:25 lr 0.000222 time 2.9428 (2.3335) loss 3.6288 (3.2725) grad_norm 1.8448 (2.0589) [2022-01-24 07:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][170/1251] eta 0:41:52 lr 0.000222 time 2.1490 (2.3240) loss 2.7290 (3.2785) grad_norm 1.9318 (2.0551) [2022-01-24 07:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][180/1251] eta 0:41:19 lr 0.000222 time 1.9650 (2.3147) loss 3.7747 (3.2852) grad_norm 2.1615 (2.0545) [2022-01-24 07:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][190/1251] eta 0:40:49 lr 0.000222 time 1.6580 (2.3086) loss 3.5839 (3.2781) grad_norm 1.8626 (2.0592) [2022-01-24 07:49:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][200/1251] eta 0:40:15 lr 0.000222 time 1.8882 (2.2980) loss 2.3115 (3.2736) grad_norm 2.0091 (2.0543) [2022-01-24 07:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][210/1251] eta 0:39:54 lr 0.000222 time 2.4085 (2.2998) loss 3.5719 (3.2644) grad_norm 1.9113 (2.0515) [2022-01-24 07:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][220/1251] eta 0:39:27 lr 0.000222 time 1.9140 (2.2962) loss 2.8671 (3.2649) grad_norm 1.9277 (2.0471) [2022-01-24 07:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][230/1251] eta 0:39:02 lr 0.000222 time 2.5264 (2.2947) loss 3.3558 (3.2646) grad_norm 2.1097 (2.0492) [2022-01-24 07:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][240/1251] eta 0:38:32 lr 0.000222 time 2.0155 (2.2870) loss 3.3237 (3.2639) grad_norm 2.0892 (2.0504) [2022-01-24 07:51:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][250/1251] eta 0:38:07 lr 0.000222 time 1.8583 (2.2854) loss 3.0305 (3.2427) grad_norm 2.1921 (2.0462) [2022-01-24 07:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][260/1251] eta 0:37:30 lr 0.000222 time 1.9774 (2.2712) loss 2.9364 (3.2411) grad_norm 1.8937 (2.0521) [2022-01-24 07:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][270/1251] eta 0:37:03 lr 0.000222 time 1.7739 (2.2667) loss 3.2466 (3.2464) grad_norm 1.9791 (2.0539) [2022-01-24 07:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][280/1251] eta 0:36:34 lr 0.000222 time 1.9428 (2.2606) loss 3.6466 (3.2520) grad_norm 1.8515 (2.0502) [2022-01-24 07:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][290/1251] eta 0:36:13 lr 0.000222 time 1.9326 (2.2614) loss 2.4692 (3.2557) grad_norm 2.4836 (2.0500) [2022-01-24 07:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][300/1251] eta 0:35:47 lr 
0.000221 time 2.7919 (2.2582) loss 2.4328 (3.2578) grad_norm 1.8864 (2.0473) [2022-01-24 07:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][310/1251] eta 0:35:26 lr 0.000221 time 2.4228 (2.2603) loss 3.8731 (3.2593) grad_norm 2.0520 (2.0468) [2022-01-24 07:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][320/1251] eta 0:35:03 lr 0.000221 time 1.9851 (2.2597) loss 3.8344 (3.2629) grad_norm 2.3206 (2.0514) [2022-01-24 07:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][330/1251] eta 0:34:39 lr 0.000221 time 2.5127 (2.2581) loss 3.1445 (3.2644) grad_norm 2.2181 (2.0510) [2022-01-24 07:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][340/1251] eta 0:34:10 lr 0.000221 time 2.1695 (2.2512) loss 2.1826 (3.2626) grad_norm 1.7258 (2.0509) [2022-01-24 07:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][350/1251] eta 0:33:43 lr 0.000221 time 2.8086 (2.2459) loss 3.6027 (3.2677) grad_norm 1.9772 (2.0555) [2022-01-24 07:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][360/1251] eta 0:33:16 lr 0.000221 time 1.7859 (2.2406) loss 3.7518 (3.2625) grad_norm 1.7727 (2.0565) [2022-01-24 07:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][370/1251] eta 0:32:54 lr 0.000221 time 2.6475 (2.2413) loss 2.2669 (3.2599) grad_norm 2.5369 (2.0618) [2022-01-24 07:56:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][380/1251] eta 0:32:29 lr 0.000221 time 2.1111 (2.2378) loss 3.4220 (3.2682) grad_norm 1.9560 (2.0637) [2022-01-24 07:56:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][390/1251] eta 0:32:10 lr 0.000221 time 3.2200 (2.2423) loss 2.8733 (3.2627) grad_norm 1.9756 (2.0634) [2022-01-24 07:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][400/1251] eta 0:31:51 lr 0.000221 time 2.9120 (2.2465) loss 2.9979 (3.2606) grad_norm 1.9559 (2.0613) [2022-01-24 07:57:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][410/1251] eta 0:31:30 lr 0.000221 time 2.1189 (2.2474) loss 3.5435 (3.2612) grad_norm 2.4299 (2.0611) [2022-01-24 07:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][420/1251] eta 0:31:03 lr 0.000221 time 2.0478 (2.2430) loss 3.7285 (3.2616) grad_norm 2.2129 (2.0599) [2022-01-24 07:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][430/1251] eta 0:30:39 lr 0.000221 time 3.5080 (2.2410) loss 3.4284 (3.2581) grad_norm 2.1935 (2.0596) [2022-01-24 07:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][440/1251] eta 0:30:12 lr 0.000221 time 2.2339 (2.2354) loss 3.1688 (3.2584) grad_norm 1.9010 (2.0585) [2022-01-24 07:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][450/1251] eta 0:29:46 lr 0.000221 time 1.9279 (2.2303) loss 4.0626 (3.2615) grad_norm 2.1887 (2.0576) [2022-01-24 07:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][460/1251] eta 0:29:23 lr 0.000221 time 2.5312 (2.2300) loss 3.1959 (3.2645) grad_norm 1.8842 (2.0586) [2022-01-24 07:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][470/1251] eta 0:29:04 lr 0.000221 time 3.0035 (2.2332) loss 3.5810 (3.2640) grad_norm 1.8943 (2.0591) [2022-01-24 08:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][480/1251] eta 0:28:39 lr 0.000221 time 1.5762 (2.2300) loss 3.3834 (3.2633) grad_norm 2.2346 (2.0589) [2022-01-24 08:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][490/1251] eta 0:28:13 lr 0.000221 time 1.6746 (2.2252) loss 3.4710 (3.2672) grad_norm 1.8504 (2.0599) [2022-01-24 08:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][500/1251] eta 0:27:51 lr 0.000221 time 2.0997 (2.2255) loss 3.5982 (3.2663) grad_norm 1.8401 (2.0608) [2022-01-24 08:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][510/1251] eta 0:27:26 lr 
0.000221 time 2.0709 (2.2225) loss 3.1363 (3.2690) grad_norm 2.0717 (2.0585) [2022-01-24 08:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][520/1251] eta 0:27:03 lr 0.000221 time 1.8413 (2.2209) loss 3.5698 (3.2712) grad_norm 2.1528 (2.0594) [2022-01-24 08:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][530/1251] eta 0:26:41 lr 0.000221 time 1.8022 (2.2206) loss 2.1915 (3.2673) grad_norm 1.7825 (2.0608) [2022-01-24 08:02:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][540/1251] eta 0:26:19 lr 0.000221 time 1.7934 (2.2215) loss 3.2156 (3.2707) grad_norm 1.9326 (2.0628) [2022-01-24 08:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][550/1251] eta 0:25:56 lr 0.000221 time 1.9047 (2.2209) loss 3.7765 (3.2660) grad_norm 2.0044 (2.0629) [2022-01-24 08:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][560/1251] eta 0:25:35 lr 0.000221 time 2.0561 (2.2223) loss 3.1854 (3.2674) grad_norm 2.0115 (2.0641) [2022-01-24 08:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][570/1251] eta 0:25:14 lr 0.000221 time 1.4792 (2.2237) loss 3.6646 (3.2667) grad_norm 2.0804 (2.0651) [2022-01-24 08:03:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][580/1251] eta 0:24:52 lr 0.000221 time 1.5493 (2.2240) loss 3.5747 (3.2667) grad_norm 1.9864 (2.0638) [2022-01-24 08:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][590/1251] eta 0:24:28 lr 0.000220 time 2.2038 (2.2221) loss 3.3621 (3.2648) grad_norm 2.0078 (2.0619) [2022-01-24 08:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][600/1251] eta 0:24:05 lr 0.000220 time 1.7945 (2.2205) loss 3.6203 (3.2679) grad_norm 1.8378 (2.0614) [2022-01-24 08:04:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][610/1251] eta 0:23:42 lr 0.000220 time 2.4978 (2.2197) loss 3.4673 (3.2665) grad_norm 1.8705 (2.0603) [2022-01-24 08:05:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][620/1251] eta 0:23:19 lr 0.000220 time 2.2220 (2.2186) loss 3.6948 (3.2695) grad_norm 1.8677 (2.0629) [2022-01-24 08:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][630/1251] eta 0:22:56 lr 0.000220 time 1.9123 (2.2161) loss 3.1113 (3.2659) grad_norm 2.3824 (2.0622) [2022-01-24 08:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][640/1251] eta 0:22:32 lr 0.000220 time 1.8566 (2.2128) loss 2.8349 (3.2677) grad_norm 1.7157 (2.0599) [2022-01-24 08:06:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][650/1251] eta 0:22:09 lr 0.000220 time 3.0639 (2.2130) loss 3.6067 (3.2668) grad_norm 1.7821 (2.0578) [2022-01-24 08:06:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][660/1251] eta 0:21:47 lr 0.000220 time 2.2266 (2.2123) loss 3.3846 (3.2644) grad_norm 1.7819 (2.0561) [2022-01-24 08:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][670/1251] eta 0:21:25 lr 0.000220 time 2.0370 (2.2132) loss 3.4261 (3.2690) grad_norm 1.9493 (2.0558) [2022-01-24 08:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][680/1251] eta 0:21:03 lr 0.000220 time 1.5617 (2.2128) loss 3.6297 (3.2680) grad_norm 2.0339 (2.0554) [2022-01-24 08:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][690/1251] eta 0:20:41 lr 0.000220 time 2.4894 (2.2136) loss 3.5529 (3.2667) grad_norm 2.0847 (2.0540) [2022-01-24 08:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][700/1251] eta 0:20:19 lr 0.000220 time 2.3715 (2.2136) loss 3.9014 (3.2708) grad_norm 2.0777 (2.0532) [2022-01-24 08:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][710/1251] eta 0:19:56 lr 0.000220 time 2.0267 (2.2119) loss 3.2838 (3.2703) grad_norm 1.9865 (2.0524) [2022-01-24 08:08:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][720/1251] eta 0:19:33 lr 
0.000220 time 1.9572 (2.2104) loss 2.6549 (3.2678) grad_norm 2.1130 (2.0524) [2022-01-24 08:09:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][730/1251] eta 0:19:11 lr 0.000220 time 2.1882 (2.2096) loss 3.7061 (3.2635) grad_norm 2.1248 (2.0532) [2022-01-24 08:09:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][740/1251] eta 0:18:49 lr 0.000220 time 2.1673 (2.2099) loss 3.9038 (3.2626) grad_norm 2.0973 (2.0532) [2022-01-24 08:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][750/1251] eta 0:18:28 lr 0.000220 time 2.1651 (2.2130) loss 3.8960 (3.2619) grad_norm 1.9275 (2.0531) [2022-01-24 08:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][760/1251] eta 0:18:06 lr 0.000220 time 1.8414 (2.2134) loss 3.7292 (3.2643) grad_norm 2.0093 (2.0546) [2022-01-24 08:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][770/1251] eta 0:17:43 lr 0.000220 time 1.9565 (2.2103) loss 3.0593 (3.2652) grad_norm 1.9598 (2.0551) [2022-01-24 08:10:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][780/1251] eta 0:17:19 lr 0.000220 time 1.6790 (2.2074) loss 3.9026 (3.2650) grad_norm 2.4492 (2.0570) [2022-01-24 08:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][790/1251] eta 0:16:56 lr 0.000220 time 1.9316 (2.2051) loss 3.4958 (3.2635) grad_norm 1.9241 (2.0577) [2022-01-24 08:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][800/1251] eta 0:16:33 lr 0.000220 time 1.7885 (2.2026) loss 2.7070 (3.2652) grad_norm 1.9688 (2.0587) [2022-01-24 08:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][810/1251] eta 0:16:11 lr 0.000220 time 2.2302 (2.2023) loss 3.1066 (3.2647) grad_norm 2.4594 (2.0613) [2022-01-24 08:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][820/1251] eta 0:15:48 lr 0.000220 time 1.9312 (2.2016) loss 3.7392 (3.2637) grad_norm 1.9136 (2.0602) [2022-01-24 08:12:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][830/1251] eta 0:15:26 lr 0.000220 time 2.2109 (2.2019) loss 2.2173 (3.2640) grad_norm 2.0337 (2.0593) [2022-01-24 08:13:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][840/1251] eta 0:15:05 lr 0.000220 time 2.5265 (2.2034) loss 3.2508 (3.2638) grad_norm 1.7262 (2.0585) [2022-01-24 08:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][850/1251] eta 0:14:44 lr 0.000220 time 2.4685 (2.2061) loss 3.5488 (3.2651) grad_norm 2.3371 (2.0590) [2022-01-24 08:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][860/1251] eta 0:14:23 lr 0.000220 time 2.5731 (2.2080) loss 3.6753 (3.2629) grad_norm 1.9473 (2.0581) [2022-01-24 08:14:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][870/1251] eta 0:14:01 lr 0.000220 time 1.8233 (2.2074) loss 2.7528 (3.2593) grad_norm 1.7843 (2.0557) [2022-01-24 08:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][880/1251] eta 0:13:38 lr 0.000220 time 2.1024 (2.2053) loss 2.5842 (3.2605) grad_norm 1.7810 (2.0548) [2022-01-24 08:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][890/1251] eta 0:13:15 lr 0.000219 time 1.9133 (2.2031) loss 3.6026 (3.2616) grad_norm 1.7571 (2.0533) [2022-01-24 08:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][900/1251] eta 0:12:52 lr 0.000219 time 2.0714 (2.2011) loss 3.7869 (3.2627) grad_norm 2.1300 (2.0545) [2022-01-24 08:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][910/1251] eta 0:12:30 lr 0.000219 time 2.1806 (2.1999) loss 3.8330 (3.2617) grad_norm 2.2721 (2.0551) [2022-01-24 08:16:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][920/1251] eta 0:12:08 lr 0.000219 time 2.7479 (2.2006) loss 3.5016 (3.2655) grad_norm 2.1960 (2.0553) [2022-01-24 08:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][930/1251] eta 0:11:46 lr 
0.000219 time 2.0147 (2.2012) loss 3.6006 (3.2662) grad_norm 1.8767 (2.0554) [2022-01-24 08:16:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][940/1251] eta 0:11:24 lr 0.000219 time 1.9481 (2.2022) loss 3.6996 (3.2653) grad_norm 1.9541 (2.0555) [2022-01-24 08:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][950/1251] eta 0:11:02 lr 0.000219 time 2.1872 (2.2020) loss 3.2384 (3.2650) grad_norm 1.9080 (2.0547) [2022-01-24 08:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][960/1251] eta 0:10:40 lr 0.000219 time 2.4337 (2.2017) loss 3.2467 (3.2655) grad_norm 2.3945 (2.0555) [2022-01-24 08:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][970/1251] eta 0:10:18 lr 0.000219 time 1.9605 (2.2014) loss 3.8205 (3.2674) grad_norm 2.2876 (2.0554) [2022-01-24 08:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][980/1251] eta 0:09:56 lr 0.000219 time 2.1740 (2.2014) loss 2.4984 (3.2642) grad_norm 1.8542 (2.0554) [2022-01-24 08:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][990/1251] eta 0:09:34 lr 0.000219 time 2.4057 (2.2010) loss 3.2927 (3.2642) grad_norm 2.0177 (2.0563) [2022-01-24 08:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1000/1251] eta 0:09:12 lr 0.000219 time 2.3974 (2.1998) loss 3.9061 (3.2659) grad_norm 2.0772 (2.0560) [2022-01-24 08:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1010/1251] eta 0:08:49 lr 0.000219 time 2.6317 (2.1988) loss 4.0423 (3.2669) grad_norm 2.0270 (2.0558) [2022-01-24 08:19:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1020/1251] eta 0:08:27 lr 0.000219 time 1.9424 (2.1988) loss 3.2413 (3.2670) grad_norm 2.4416 (2.0553) [2022-01-24 08:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1030/1251] eta 0:08:06 lr 0.000219 time 2.4817 (2.2000) loss 3.3607 (3.2672) grad_norm 2.1500 (2.0566) [2022-01-24 
08:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1040/1251] eta 0:07:44 lr 0.000219 time 3.0869 (2.2026) loss 3.7118 (3.2690) grad_norm 2.0964 (2.0559) [2022-01-24 08:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1050/1251] eta 0:07:22 lr 0.000219 time 1.9394 (2.2023) loss 3.6581 (3.2683) grad_norm 2.2498 (2.0553) [2022-01-24 08:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1060/1251] eta 0:07:00 lr 0.000219 time 2.1202 (2.2005) loss 3.7246 (3.2695) grad_norm 1.7720 (2.0546) [2022-01-24 08:21:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1070/1251] eta 0:06:37 lr 0.000219 time 2.5127 (2.1987) loss 3.2211 (3.2689) grad_norm 1.7854 (2.0530) [2022-01-24 08:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1080/1251] eta 0:06:15 lr 0.000219 time 2.2324 (2.1969) loss 3.0548 (3.2672) grad_norm 2.0027 (2.0522) [2022-01-24 08:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1090/1251] eta 0:05:53 lr 0.000219 time 2.2507 (2.1962) loss 2.3207 (3.2680) grad_norm 2.0877 (2.0518) [2022-01-24 08:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1100/1251] eta 0:05:31 lr 0.000219 time 2.5715 (2.1972) loss 2.2261 (3.2684) grad_norm 2.2022 (2.0506) [2022-01-24 08:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1110/1251] eta 0:05:09 lr 0.000219 time 2.5579 (2.1978) loss 2.6870 (3.2669) grad_norm 2.1863 (2.0505) [2022-01-24 08:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1120/1251] eta 0:04:48 lr 0.000219 time 2.7379 (2.1992) loss 3.3929 (3.2692) grad_norm 2.0109 (2.0500) [2022-01-24 08:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1130/1251] eta 0:04:26 lr 0.000219 time 2.2393 (2.1998) loss 3.5425 (3.2707) grad_norm 2.0457 (2.0498) [2022-01-24 08:24:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1140/1251] 
eta 0:04:04 lr 0.000219 time 2.6950 (2.2001) loss 3.6691 (3.2713) grad_norm 2.2389 (2.0505) [2022-01-24 08:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1150/1251] eta 0:03:42 lr 0.000219 time 2.8239 (2.2013) loss 2.8723 (3.2691) grad_norm 2.1486 (2.0523) [2022-01-24 08:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1160/1251] eta 0:03:20 lr 0.000219 time 1.6993 (2.1986) loss 3.5428 (3.2693) grad_norm 2.2053 (2.0530) [2022-01-24 08:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1170/1251] eta 0:02:57 lr 0.000219 time 1.8668 (2.1975) loss 3.1658 (3.2684) grad_norm 1.7036 (2.0527) [2022-01-24 08:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1180/1251] eta 0:02:36 lr 0.000218 time 2.1613 (2.1973) loss 2.8639 (3.2680) grad_norm 2.0393 (2.0525) [2022-01-24 08:25:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1190/1251] eta 0:02:14 lr 0.000218 time 3.0433 (2.1972) loss 3.3417 (3.2695) grad_norm 2.4693 (2.0545) [2022-01-24 08:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1200/1251] eta 0:01:52 lr 0.000218 time 2.4433 (2.1969) loss 3.0707 (3.2703) grad_norm 1.8125 (2.0544) [2022-01-24 08:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1210/1251] eta 0:01:30 lr 0.000218 time 2.0446 (2.1965) loss 3.4916 (3.2718) grad_norm 2.1800 (2.0536) [2022-01-24 08:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1220/1251] eta 0:01:08 lr 0.000218 time 1.9818 (2.1960) loss 3.9471 (3.2725) grad_norm 1.9239 (2.0529) [2022-01-24 08:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1230/1251] eta 0:00:46 lr 0.000218 time 2.4788 (2.1961) loss 2.9525 (3.2738) grad_norm 2.3019 (2.0539) [2022-01-24 08:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1240/1251] eta 0:00:24 lr 0.000218 time 2.3322 (2.1961) loss 2.8130 (3.2724) grad_norm 1.8595 
(2.0537) [2022-01-24 08:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [208/300][1250/1251] eta 0:00:02 lr 0.000218 time 1.1833 (2.1903) loss 2.8683 (3.2713) grad_norm 1.8099 (2.0541) [2022-01-24 08:27:53 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 208 training takes 0:45:40 [2022-01-24 08:28:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.112 (19.112) Loss 0.9073 (0.9073) Acc@1 78.125 (78.125) Acc@5 94.727 (94.727) [2022-01-24 08:28:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.188 (3.561) Loss 0.9049 (0.8996) Acc@1 78.320 (78.427) Acc@5 94.238 (94.877) [2022-01-24 08:28:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.310 (2.497) Loss 0.8134 (0.8936) Acc@1 79.395 (78.711) Acc@5 96.387 (94.759) [2022-01-24 08:29:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.620 (2.274) Loss 0.9747 (0.8982) Acc@1 77.051 (78.648) Acc@5 93.457 (94.623) [2022-01-24 08:29:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.421 (2.183) Loss 0.8785 (0.8979) Acc@1 78.809 (78.701) Acc@5 95.508 (94.641) [2022-01-24 08:29:29 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.674 Acc@5 94.590 [2022-01-24 08:29:29 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-01-24 08:29:29 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.81% [2022-01-24 08:29:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][0/1251] eta 7:27:49 lr 0.000218 time 21.4782 (21.4782) loss 2.9806 (2.9806) grad_norm 2.2072 (2.2072) [2022-01-24 08:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][10/1251] eta 1:22:50 lr 0.000218 time 1.5293 (4.0054) loss 3.3348 (3.2154) grad_norm 1.9479 (2.0394) [2022-01-24 08:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][20/1251] eta 1:06:09 lr 0.000218 time 1.3653 (3.2243) loss 
3.8241 (3.2811) grad_norm 1.8992 (2.0655) [2022-01-24 08:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][30/1251] eta 0:58:35 lr 0.000218 time 1.6533 (2.8794) loss 3.2919 (3.2510) grad_norm 2.3578 (2.1056) [2022-01-24 08:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][40/1251] eta 0:54:53 lr 0.000218 time 2.9524 (2.7199) loss 2.8269 (3.2573) grad_norm 2.0271 (2.1399) [2022-01-24 08:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][50/1251] eta 0:53:07 lr 0.000218 time 1.7750 (2.6541) loss 3.2612 (3.2989) grad_norm 1.9160 (2.1020) [2022-01-24 08:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][60/1251] eta 0:51:39 lr 0.000218 time 1.6325 (2.6024) loss 3.7124 (3.3060) grad_norm 1.9457 (2.0842) [2022-01-24 08:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][70/1251] eta 0:49:44 lr 0.000218 time 1.6778 (2.5271) loss 2.7693 (3.3052) grad_norm 2.0583 (2.0960) [2022-01-24 08:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][80/1251] eta 0:48:25 lr 0.000218 time 3.1830 (2.4810) loss 3.7551 (3.3205) grad_norm 1.8592 (2.0828) [2022-01-24 08:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][90/1251] eta 0:47:02 lr 0.000218 time 1.5420 (2.4310) loss 3.6590 (3.3165) grad_norm 2.0185 (2.0769) [2022-01-24 08:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][100/1251] eta 0:46:22 lr 0.000218 time 1.5044 (2.4175) loss 2.6715 (3.3416) grad_norm 1.9920 (2.0708) [2022-01-24 08:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][110/1251] eta 0:45:22 lr 0.000218 time 1.8014 (2.3859) loss 3.3872 (3.3262) grad_norm 2.0279 (2.0609) [2022-01-24 08:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][120/1251] eta 0:44:41 lr 0.000218 time 3.1315 (2.3706) loss 2.0664 (3.3080) grad_norm 2.1468 (2.0644) [2022-01-24 08:34:39 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [209/300][130/1251] eta 0:44:10 lr 0.000218 time 2.1652 (2.3647) loss 2.5541 (3.3074) grad_norm 1.9685 (2.0612) [2022-01-24 08:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][140/1251] eta 0:43:34 lr 0.000218 time 2.2056 (2.3530) loss 3.8390 (3.3297) grad_norm 2.7610 (2.0611) [2022-01-24 08:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][150/1251] eta 0:43:07 lr 0.000218 time 2.5810 (2.3499) loss 3.4442 (3.3248) grad_norm 2.0262 (2.0635) [2022-01-24 08:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][160/1251] eta 0:42:41 lr 0.000218 time 3.0812 (2.3479) loss 3.1393 (3.3101) grad_norm 1.9569 (2.0639) [2022-01-24 08:36:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][170/1251] eta 0:42:10 lr 0.000218 time 1.6328 (2.3412) loss 3.0787 (3.3152) grad_norm 2.3350 (2.0600) [2022-01-24 08:36:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][180/1251] eta 0:41:33 lr 0.000218 time 1.7122 (2.3285) loss 3.1537 (3.3093) grad_norm 1.8798 (2.0544) [2022-01-24 08:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][190/1251] eta 0:40:56 lr 0.000218 time 2.8833 (2.3150) loss 3.4653 (3.3019) grad_norm 1.9953 (2.0521) [2022-01-24 08:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][200/1251] eta 0:40:18 lr 0.000218 time 2.2466 (2.3011) loss 2.7160 (3.2924) grad_norm 1.8610 (2.0460) [2022-01-24 08:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][210/1251] eta 0:39:49 lr 0.000218 time 1.9082 (2.2950) loss 3.1667 (3.2928) grad_norm 1.9329 (2.0457) [2022-01-24 08:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][220/1251] eta 0:39:26 lr 0.000218 time 3.4462 (2.2955) loss 3.5948 (3.2880) grad_norm 1.9656 (2.0420) [2022-01-24 08:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][230/1251] eta 0:38:54 lr 0.000217 time 1.9110 (2.2862) loss 2.5337 
(3.2866) grad_norm 1.9003 (2.0390) [2022-01-24 08:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][240/1251] eta 0:38:23 lr 0.000217 time 1.8749 (2.2783) loss 3.5222 (3.2788) grad_norm 1.7178 (2.0416) [2022-01-24 08:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][250/1251] eta 0:37:59 lr 0.000217 time 1.8627 (2.2776) loss 3.1298 (3.2798) grad_norm 2.0109 (2.0398) [2022-01-24 08:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][260/1251] eta 0:37:33 lr 0.000217 time 2.5853 (2.2742) loss 3.1962 (3.2880) grad_norm 2.0375 (2.0462) [2022-01-24 08:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][270/1251] eta 0:37:11 lr 0.000217 time 2.7089 (2.2743) loss 2.4837 (3.2874) grad_norm 1.9247 (2.0524) [2022-01-24 08:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][280/1251] eta 0:36:47 lr 0.000217 time 2.0565 (2.2739) loss 3.6381 (3.2883) grad_norm 1.8985 (2.0514) [2022-01-24 08:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][290/1251] eta 0:36:22 lr 0.000217 time 1.7371 (2.2715) loss 2.9774 (3.2798) grad_norm 2.7938 (2.0601) [2022-01-24 08:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][300/1251] eta 0:35:52 lr 0.000217 time 1.9287 (2.2630) loss 3.1502 (3.2737) grad_norm 1.8035 (2.0596) [2022-01-24 08:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][310/1251] eta 0:35:27 lr 0.000217 time 1.9475 (2.2604) loss 3.4575 (3.2830) grad_norm 2.8041 (2.0617) [2022-01-24 08:41:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][320/1251] eta 0:35:04 lr 0.000217 time 1.9261 (2.2607) loss 3.1096 (3.2825) grad_norm 2.0609 (2.0612) [2022-01-24 08:41:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][330/1251] eta 0:34:43 lr 0.000217 time 2.3992 (2.2626) loss 3.2347 (3.2779) grad_norm 2.2013 (2.0713) [2022-01-24 08:42:21 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [209/300][340/1251] eta 0:34:22 lr 0.000217 time 2.1974 (2.2639) loss 3.1068 (3.2748) grad_norm 2.3299 (2.0706) [2022-01-24 08:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][350/1251] eta 0:33:50 lr 0.000217 time 1.7916 (2.2541) loss 3.4090 (3.2760) grad_norm 1.9462 (2.0697) [2022-01-24 08:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][360/1251] eta 0:33:24 lr 0.000217 time 1.8973 (2.2500) loss 3.5419 (3.2751) grad_norm 1.9717 (2.0682) [2022-01-24 08:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][370/1251] eta 0:32:59 lr 0.000217 time 2.0708 (2.2464) loss 4.0418 (3.2728) grad_norm 2.0418 (2.0647) [2022-01-24 08:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][380/1251] eta 0:32:32 lr 0.000217 time 1.9325 (2.2422) loss 2.7443 (3.2730) grad_norm 2.0301 (2.0697) [2022-01-24 08:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][390/1251] eta 0:32:09 lr 0.000217 time 2.1758 (2.2414) loss 3.7644 (3.2723) grad_norm 2.6724 (2.0757) [2022-01-24 08:44:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][400/1251] eta 0:31:45 lr 0.000217 time 1.8218 (2.2387) loss 3.5979 (3.2723) grad_norm 2.3640 (2.0791) [2022-01-24 08:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][410/1251] eta 0:31:19 lr 0.000217 time 2.1670 (2.2349) loss 3.2740 (3.2799) grad_norm 2.1524 (2.0800) [2022-01-24 08:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][420/1251] eta 0:30:57 lr 0.000217 time 2.7391 (2.2354) loss 3.4202 (3.2778) grad_norm 1.7549 (2.0793) [2022-01-24 08:45:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][430/1251] eta 0:30:37 lr 0.000217 time 2.1214 (2.2378) loss 3.4773 (3.2769) grad_norm 1.7443 (2.0786) [2022-01-24 08:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][440/1251] eta 0:30:17 lr 0.000217 time 2.7420 (2.2405) loss 3.7465 
(3.2758) grad_norm 2.4088 (2.0870) [2022-01-24 08:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][450/1251] eta 0:29:54 lr 0.000217 time 2.2448 (2.2400) loss 3.4241 (3.2781) grad_norm 1.8983 (2.0896) [2022-01-24 08:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][460/1251] eta 0:29:28 lr 0.000217 time 1.5920 (2.2356) loss 2.8717 (3.2784) grad_norm 1.9701 (2.0892) [2022-01-24 08:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][470/1251] eta 0:29:05 lr 0.000217 time 1.6198 (2.2351) loss 3.2509 (3.2767) grad_norm 2.0733 (2.0880) [2022-01-24 08:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][480/1251] eta 0:28:42 lr 0.000217 time 2.4120 (2.2343) loss 3.4323 (3.2719) grad_norm 1.7953 (2.0850) [2022-01-24 08:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][490/1251] eta 0:28:17 lr 0.000217 time 1.7217 (2.2311) loss 4.2860 (3.2759) grad_norm 2.0638 (2.0847) [2022-01-24 08:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][500/1251] eta 0:27:54 lr 0.000217 time 1.9963 (2.2293) loss 3.7190 (3.2799) grad_norm 1.8410 (2.0864) [2022-01-24 08:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][510/1251] eta 0:27:34 lr 0.000217 time 1.8914 (2.2324) loss 3.5573 (3.2857) grad_norm 1.9579 (2.0881) [2022-01-24 08:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][520/1251] eta 0:27:12 lr 0.000217 time 2.5484 (2.2330) loss 3.9621 (3.2792) grad_norm 2.0149 (2.0954) [2022-01-24 08:49:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][530/1251] eta 0:26:48 lr 0.000216 time 1.8705 (2.2315) loss 3.1323 (3.2812) grad_norm 2.2241 (2.0944) [2022-01-24 08:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][540/1251] eta 0:26:24 lr 0.000216 time 1.9219 (2.2284) loss 3.4599 (3.2820) grad_norm 1.8530 (2.0927) [2022-01-24 08:49:59 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [209/300][550/1251] eta 0:26:04 lr 0.000216 time 2.2496 (2.2324) loss 3.6151 (3.2777) grad_norm 2.3632 (2.0910) [2022-01-24 08:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][560/1251] eta 0:25:40 lr 0.000216 time 1.8156 (2.2288) loss 2.0074 (3.2724) grad_norm 2.5414 (2.0926) [2022-01-24 08:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][570/1251] eta 0:25:15 lr 0.000216 time 1.7966 (2.2251) loss 3.8940 (3.2734) grad_norm 2.1493 (2.0937) [2022-01-24 08:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][580/1251] eta 0:24:51 lr 0.000216 time 1.9992 (2.2233) loss 3.6048 (3.2749) grad_norm 2.0031 (2.0911) [2022-01-24 08:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][590/1251] eta 0:24:29 lr 0.000216 time 1.8695 (2.2229) loss 3.5001 (3.2757) grad_norm 1.9351 (2.0911) [2022-01-24 08:51:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][600/1251] eta 0:24:07 lr 0.000216 time 2.1853 (2.2235) loss 2.5564 (3.2696) grad_norm 1.7756 (2.0914) [2022-01-24 08:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][610/1251] eta 0:23:44 lr 0.000216 time 2.0438 (2.2216) loss 3.3529 (3.2712) grad_norm 1.8255 (2.0883) [2022-01-24 08:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][620/1251] eta 0:23:22 lr 0.000216 time 2.2512 (2.2225) loss 3.4371 (3.2721) grad_norm 2.0475 (2.0871) [2022-01-24 08:52:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][630/1251] eta 0:23:00 lr 0.000216 time 1.5875 (2.2233) loss 3.4148 (3.2696) grad_norm 2.2225 (2.0856) [2022-01-24 08:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][640/1251] eta 0:22:36 lr 0.000216 time 1.5438 (2.2196) loss 3.6164 (3.2744) grad_norm 1.7755 (2.0839) [2022-01-24 08:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][650/1251] eta 0:22:12 lr 0.000216 time 1.8712 (2.2170) loss 3.3891 
(3.2764) grad_norm 2.2113 (2.0811) [2022-01-24 08:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][660/1251] eta 0:21:49 lr 0.000216 time 2.3098 (2.2153) loss 3.6850 (3.2754) grad_norm 1.9827 (2.0806) [2022-01-24 08:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][670/1251] eta 0:21:26 lr 0.000216 time 1.9599 (2.2149) loss 3.6694 (3.2742) grad_norm 2.0045 (2.0799) [2022-01-24 08:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][680/1251] eta 0:21:04 lr 0.000216 time 2.1603 (2.2138) loss 3.8872 (3.2681) grad_norm 2.0048 (2.0794) [2022-01-24 08:54:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][690/1251] eta 0:20:41 lr 0.000216 time 2.3272 (2.2127) loss 3.0601 (3.2682) grad_norm 2.1423 (2.0786) [2022-01-24 08:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][700/1251] eta 0:20:18 lr 0.000216 time 1.9294 (2.2117) loss 3.0133 (3.2727) grad_norm 2.1532 (2.0783) [2022-01-24 08:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][710/1251] eta 0:19:56 lr 0.000216 time 2.3329 (2.2109) loss 2.7622 (3.2723) grad_norm 1.9695 (2.0776) [2022-01-24 08:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][720/1251] eta 0:19:34 lr 0.000216 time 3.1051 (2.2117) loss 2.1512 (3.2744) grad_norm 1.8858 (2.0780) [2022-01-24 08:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][730/1251] eta 0:19:12 lr 0.000216 time 2.1517 (2.2125) loss 2.8868 (3.2730) grad_norm 1.9121 (2.0768) [2022-01-24 08:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][740/1251] eta 0:18:51 lr 0.000216 time 2.5350 (2.2135) loss 3.8990 (3.2743) grad_norm 2.1668 (2.0772) [2022-01-24 08:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][750/1251] eta 0:18:29 lr 0.000216 time 2.1905 (2.2141) loss 3.0116 (3.2778) grad_norm 2.0516 (2.0781) [2022-01-24 08:57:36 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [209/300][760/1251] eta 0:18:08 lr 0.000216 time 3.1527 (2.2164) loss 3.4955 (3.2770) grad_norm 2.0587 (2.0770) [2022-01-24 08:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][770/1251] eta 0:17:45 lr 0.000216 time 1.7728 (2.2145) loss 2.7656 (3.2741) grad_norm 1.7829 (2.0749) [2022-01-24 08:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][780/1251] eta 0:17:22 lr 0.000216 time 1.7961 (2.2136) loss 3.6517 (3.2752) grad_norm 2.2096 (2.0754) [2022-01-24 08:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][790/1251] eta 0:17:00 lr 0.000216 time 1.8266 (2.2128) loss 2.9373 (3.2755) grad_norm 1.9744 (2.0757) [2022-01-24 08:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][800/1251] eta 0:16:37 lr 0.000216 time 2.0182 (2.2124) loss 3.4135 (3.2780) grad_norm 1.6727 (2.0756) [2022-01-24 08:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][810/1251] eta 0:16:15 lr 0.000216 time 2.1252 (2.2128) loss 3.0706 (3.2792) grad_norm 1.9794 (2.0759) [2022-01-24 08:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][820/1251] eta 0:15:52 lr 0.000215 time 1.6062 (2.2109) loss 3.4040 (3.2805) grad_norm 2.5887 (2.0771) [2022-01-24 09:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][830/1251] eta 0:15:30 lr 0.000215 time 2.6299 (2.2096) loss 3.5004 (3.2802) grad_norm 1.8667 (2.0778) [2022-01-24 09:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][840/1251] eta 0:15:07 lr 0.000215 time 1.8491 (2.2080) loss 2.9658 (3.2770) grad_norm 1.8227 (2.0781) [2022-01-24 09:00:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][850/1251] eta 0:14:45 lr 0.000215 time 2.9396 (2.2090) loss 2.5571 (3.2783) grad_norm 1.9112 (2.0791) [2022-01-24 09:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][860/1251] eta 0:14:23 lr 0.000215 time 1.8763 (2.2096) loss 3.3933 
(3.2772) grad_norm 1.9651 (2.0798) [2022-01-24 09:01:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][870/1251] eta 0:14:02 lr 0.000215 time 2.4265 (2.2106) loss 3.0975 (3.2767) grad_norm 2.6235 (2.0803) [2022-01-24 09:01:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][880/1251] eta 0:13:39 lr 0.000215 time 1.6180 (2.2081) loss 3.6955 (3.2757) grad_norm 2.0313 (2.0815) [2022-01-24 09:02:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][890/1251] eta 0:13:17 lr 0.000215 time 3.1915 (2.2080) loss 2.5415 (3.2744) grad_norm 1.9616 (2.0817) [2022-01-24 09:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][900/1251] eta 0:12:54 lr 0.000215 time 1.8709 (2.2069) loss 3.9600 (3.2757) grad_norm 2.1485 (2.0816) [2022-01-24 09:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][910/1251] eta 0:12:32 lr 0.000215 time 2.0342 (2.2074) loss 3.1318 (3.2776) grad_norm 1.8623 (2.0805) [2022-01-24 09:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][920/1251] eta 0:12:10 lr 0.000215 time 1.6974 (2.2065) loss 3.4146 (3.2775) grad_norm 2.1487 (2.0808) [2022-01-24 09:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][930/1251] eta 0:11:49 lr 0.000215 time 3.6637 (2.2087) loss 3.5947 (3.2790) grad_norm 2.1785 (2.0809) [2022-01-24 09:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][940/1251] eta 0:11:26 lr 0.000215 time 2.2058 (2.2080) loss 3.7518 (3.2803) grad_norm 2.9654 (2.0815) [2022-01-24 09:04:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][950/1251] eta 0:11:04 lr 0.000215 time 2.0083 (2.2080) loss 3.2105 (3.2791) grad_norm 2.1003 (2.0813) [2022-01-24 09:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][960/1251] eta 0:10:42 lr 0.000215 time 1.9295 (2.2064) loss 3.2492 (3.2803) grad_norm 1.9329 (2.0811) [2022-01-24 09:05:12 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [209/300][970/1251] eta 0:10:20 lr 0.000215 time 2.1733 (2.2069) loss 3.9467 (3.2828) grad_norm 1.9340 (2.0811) [2022-01-24 09:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][980/1251] eta 0:09:57 lr 0.000215 time 1.6107 (2.2061) loss 3.4776 (3.2833) grad_norm 1.8926 (2.0808) [2022-01-24 09:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][990/1251] eta 0:09:35 lr 0.000215 time 2.4396 (2.2061) loss 4.0182 (3.2832) grad_norm 2.5943 (2.0823) [2022-01-24 09:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1000/1251] eta 0:09:13 lr 0.000215 time 2.1198 (2.2056) loss 3.5797 (3.2852) grad_norm 1.8704 (2.0823) [2022-01-24 09:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1010/1251] eta 0:08:52 lr 0.000215 time 1.8525 (2.2075) loss 2.9160 (3.2856) grad_norm 1.7255 (2.0828) [2022-01-24 09:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1020/1251] eta 0:08:29 lr 0.000215 time 1.6431 (2.2067) loss 3.2082 (3.2860) grad_norm 2.4973 (2.0837) [2022-01-24 09:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1030/1251] eta 0:08:07 lr 0.000215 time 1.5714 (2.2061) loss 3.4121 (3.2857) grad_norm 1.8938 (2.0827) [2022-01-24 09:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1040/1251] eta 0:07:45 lr 0.000215 time 1.6337 (2.2059) loss 3.4295 (3.2866) grad_norm 2.2182 (2.0818) [2022-01-24 09:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1050/1251] eta 0:07:23 lr 0.000215 time 1.8534 (2.2055) loss 3.1914 (3.2850) grad_norm 2.1810 (2.0810) [2022-01-24 09:08:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1060/1251] eta 0:07:01 lr 0.000215 time 2.1682 (2.2049) loss 2.8361 (3.2876) grad_norm 1.9891 (2.0806) [2022-01-24 09:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1070/1251] eta 0:06:39 lr 0.000215 time 1.8605 (2.2050) loss 
3.5266 (3.2863) grad_norm 1.9113 (2.0792) [2022-01-24 09:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1080/1251] eta 0:06:17 lr 0.000215 time 1.6616 (2.2051) loss 3.4372 (3.2856) grad_norm 2.1049 (2.0783) [2022-01-24 09:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1090/1251] eta 0:05:54 lr 0.000215 time 2.1233 (2.2045) loss 2.5970 (3.2858) grad_norm 1.6788 (2.0783) [2022-01-24 09:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1100/1251] eta 0:05:32 lr 0.000215 time 2.0358 (2.2027) loss 3.3521 (3.2869) grad_norm 1.8179 (2.0775) [2022-01-24 09:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1110/1251] eta 0:05:10 lr 0.000215 time 1.9809 (2.2014) loss 3.2780 (3.2860) grad_norm 1.9329 (2.0769) [2022-01-24 09:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1120/1251] eta 0:04:48 lr 0.000214 time 2.2159 (2.2012) loss 3.5062 (3.2852) grad_norm 1.8259 (2.0765) [2022-01-24 09:10:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1130/1251] eta 0:04:26 lr 0.000214 time 1.9261 (2.2006) loss 3.4971 (3.2859) grad_norm 2.1430 (2.0756) [2022-01-24 09:11:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1140/1251] eta 0:04:04 lr 0.000214 time 2.2509 (2.2015) loss 3.4892 (3.2855) grad_norm 1.9353 (2.0754) [2022-01-24 09:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1150/1251] eta 0:03:42 lr 0.000214 time 2.5765 (2.2022) loss 3.0701 (3.2837) grad_norm 2.1228 (2.0744) [2022-01-24 09:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1160/1251] eta 0:03:20 lr 0.000214 time 2.8362 (2.2036) loss 3.7425 (3.2840) grad_norm 2.0309 (2.0734) [2022-01-24 09:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1170/1251] eta 0:02:58 lr 0.000214 time 1.7146 (2.2022) loss 3.1336 (3.2834) grad_norm 1.9017 (2.0736) [2022-01-24 09:12:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1180/1251] eta 0:02:36 lr 0.000214 time 1.5289 (2.2011) loss 3.0909 (3.2841) grad_norm 2.1290 (2.0746) [2022-01-24 09:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1190/1251] eta 0:02:14 lr 0.000214 time 2.1527 (2.2008) loss 2.2446 (3.2844) grad_norm 2.0689 (2.0738) [2022-01-24 09:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1200/1251] eta 0:01:52 lr 0.000214 time 2.5390 (2.2026) loss 3.5326 (3.2853) grad_norm 2.2726 (2.0739) [2022-01-24 09:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1210/1251] eta 0:01:30 lr 0.000214 time 1.5646 (2.2024) loss 2.6266 (3.2830) grad_norm 2.0028 (2.0737) [2022-01-24 09:14:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1220/1251] eta 0:01:08 lr 0.000214 time 2.0077 (2.2027) loss 3.0140 (3.2834) grad_norm 2.0485 (2.0725) [2022-01-24 09:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1230/1251] eta 0:00:46 lr 0.000214 time 2.5248 (2.2019) loss 3.5526 (3.2826) grad_norm 2.0144 (2.0719) [2022-01-24 09:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1240/1251] eta 0:00:24 lr 0.000214 time 1.4375 (2.2005) loss 3.6555 (3.2821) grad_norm 2.1519 (2.0713) [2022-01-24 09:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [209/300][1250/1251] eta 0:00:02 lr 0.000214 time 1.3693 (2.1947) loss 2.6880 (3.2810) grad_norm 1.8899 (2.0705) [2022-01-24 09:15:15 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 209 training takes 0:45:45 [2022-01-24 09:15:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.718 (18.718) Loss 0.9510 (0.9510) Acc@1 77.344 (77.344) Acc@5 94.238 (94.238) [2022-01-24 09:15:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.640 (3.286) Loss 0.8692 (0.8971) Acc@1 77.344 (78.471) Acc@5 95.996 (94.656) [2022-01-24 09:16:11 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.891 (2.670) Loss 0.8107 (0.8930) Acc@1 81.250 (78.795) Acc@5 95.020 (94.582) [2022-01-24 09:16:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.621 (2.230) Loss 0.8737 (0.8922) Acc@1 80.566 (78.818) Acc@5 95.020 (94.553) [2022-01-24 09:16:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.595 (2.139) Loss 0.8904 (0.8940) Acc@1 79.395 (78.828) Acc@5 94.531 (94.534) [2022-01-24 09:16:50 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.672 Acc@5 94.534 [2022-01-24 09:16:50 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.7% [2022-01-24 09:16:50 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.81% [2022-01-24 09:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][0/1251] eta 7:32:43 lr 0.000214 time 21.7138 (21.7138) loss 3.9093 (3.9093) grad_norm 2.1001 (2.1001) [2022-01-24 09:17:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][10/1251] eta 1:25:31 lr 0.000214 time 2.6639 (4.1346) loss 2.0653 (3.0143) grad_norm 2.0301 (2.0256) [2022-01-24 09:17:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][20/1251] eta 1:04:13 lr 0.000214 time 1.5465 (3.1301) loss 2.4858 (3.1835) grad_norm 1.9098 (2.0731) [2022-01-24 09:18:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][30/1251] eta 0:58:36 lr 0.000214 time 1.4624 (2.8797) loss 3.0705 (3.2922) grad_norm 1.9665 (2.0867) [2022-01-24 09:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][40/1251] eta 0:55:26 lr 0.000214 time 3.1850 (2.7467) loss 3.6509 (3.2748) grad_norm 1.7903 (2.0962) [2022-01-24 09:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][50/1251] eta 0:52:56 lr 0.000214 time 2.7317 (2.6448) loss 3.0984 (3.2323) grad_norm 2.1684 (2.0839) [2022-01-24 09:19:27 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [210/300][60/1251] eta 0:50:57 lr 0.000214 time 1.8397 (2.5674) loss 3.0800 (3.2188) grad_norm 1.7822 (2.0724) [2022-01-24 09:19:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][70/1251] eta 0:49:16 lr 0.000214 time 1.9766 (2.5033) loss 3.7760 (3.2541) grad_norm 2.6749 (2.0839) [2022-01-24 09:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][80/1251] eta 0:48:08 lr 0.000214 time 2.9449 (2.4665) loss 3.8735 (3.2511) grad_norm 2.2320 (2.0844) [2022-01-24 09:20:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][90/1251] eta 0:47:42 lr 0.000214 time 2.7686 (2.4659) loss 3.7050 (3.2082) grad_norm 1.9863 (2.0826) [2022-01-24 09:20:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][100/1251] eta 0:46:48 lr 0.000214 time 1.9344 (2.4402) loss 2.1101 (3.2117) grad_norm 2.0752 (2.0766) [2022-01-24 09:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][110/1251] eta 0:45:37 lr 0.000214 time 1.6469 (2.3990) loss 3.5350 (3.2111) grad_norm 1.7698 (2.0736) [2022-01-24 09:21:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][120/1251] eta 0:44:28 lr 0.000214 time 1.9951 (2.3598) loss 3.9140 (3.2043) grad_norm 1.9270 (2.0723) [2022-01-24 09:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][130/1251] eta 0:43:45 lr 0.000214 time 1.8866 (2.3418) loss 3.5458 (3.2117) grad_norm 2.8622 (2.0720) [2022-01-24 09:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][140/1251] eta 0:43:12 lr 0.000214 time 2.2809 (2.3334) loss 2.5478 (3.2152) grad_norm 2.5348 (2.0811) [2022-01-24 09:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][150/1251] eta 0:42:34 lr 0.000214 time 2.0808 (2.3198) loss 3.7238 (3.2239) grad_norm 2.0186 (2.0953) [2022-01-24 09:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][160/1251] eta 0:41:55 lr 0.000214 time 2.1000 (2.3057) loss 2.5694 (3.2302) 
grad_norm 1.7238 (2.0909) [2022-01-24 09:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][170/1251] eta 0:41:25 lr 0.000213 time 2.5572 (2.2997) loss 3.3762 (3.2315) grad_norm 2.0857 (2.0879) [2022-01-24 09:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][180/1251] eta 0:40:54 lr 0.000213 time 2.1990 (2.2921) loss 3.4665 (3.2459) grad_norm 2.1717 (2.0802) [2022-01-24 09:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][190/1251] eta 0:40:28 lr 0.000213 time 1.8635 (2.2890) loss 3.4695 (3.2530) grad_norm 1.9618 (2.0718) [2022-01-24 09:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][200/1251] eta 0:40:06 lr 0.000213 time 2.2197 (2.2901) loss 3.4138 (3.2498) grad_norm 2.1143 (2.0742) [2022-01-24 09:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][210/1251] eta 0:39:42 lr 0.000213 time 3.0515 (2.2885) loss 4.1082 (3.2735) grad_norm 1.8845 (2.0712) [2022-01-24 09:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][220/1251] eta 0:39:10 lr 0.000213 time 2.5792 (2.2800) loss 2.6401 (3.2670) grad_norm 2.0039 (2.0734) [2022-01-24 09:25:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][230/1251] eta 0:38:42 lr 0.000213 time 2.4684 (2.2744) loss 3.7601 (3.2793) grad_norm 2.2667 (2.0666) [2022-01-24 09:25:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][240/1251] eta 0:38:07 lr 0.000213 time 1.9657 (2.2629) loss 3.7102 (3.2851) grad_norm 1.8185 (2.0605) [2022-01-24 09:26:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][250/1251] eta 0:37:37 lr 0.000213 time 2.3883 (2.2550) loss 3.6427 (3.2831) grad_norm 1.8899 (2.0580) [2022-01-24 09:26:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][260/1251] eta 0:37:22 lr 0.000213 time 2.5610 (2.2626) loss 3.9383 (3.2808) grad_norm 2.1431 (2.0585) [2022-01-24 09:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [210/300][270/1251] eta 0:36:55 lr 0.000213 time 1.9056 (2.2579) loss 2.5533 (3.2717) grad_norm 1.9264 (2.0589) [2022-01-24 09:27:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][280/1251] eta 0:36:27 lr 0.000213 time 1.8814 (2.2525) loss 3.5254 (3.2673) grad_norm 2.1700 (2.0595) [2022-01-24 09:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][290/1251] eta 0:36:04 lr 0.000213 time 2.8306 (2.2521) loss 2.6721 (3.2782) grad_norm 2.0098 (2.0637) [2022-01-24 09:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][300/1251] eta 0:35:42 lr 0.000213 time 2.4098 (2.2524) loss 4.0830 (3.2788) grad_norm 1.7882 (2.0612) [2022-01-24 09:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][310/1251] eta 0:35:20 lr 0.000213 time 3.2993 (2.2540) loss 4.0772 (3.2877) grad_norm 2.0128 (2.0616) [2022-01-24 09:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][320/1251] eta 0:34:55 lr 0.000213 time 1.9525 (2.2504) loss 3.6117 (3.2811) grad_norm 2.0731 (2.0604) [2022-01-24 09:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][330/1251] eta 0:34:28 lr 0.000213 time 1.9181 (2.2455) loss 3.8647 (3.2932) grad_norm 2.0169 (2.0604) [2022-01-24 09:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][340/1251] eta 0:34:02 lr 0.000213 time 2.8135 (2.2424) loss 2.8818 (3.2880) grad_norm 2.0076 (2.0577) [2022-01-24 09:29:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][350/1251] eta 0:33:44 lr 0.000213 time 3.8388 (2.2465) loss 3.4656 (3.2886) grad_norm 2.0129 (2.0555) [2022-01-24 09:30:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][360/1251] eta 0:33:18 lr 0.000213 time 2.2814 (2.2430) loss 3.6340 (3.2927) grad_norm 2.1882 (2.0582) [2022-01-24 09:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][370/1251] eta 0:32:53 lr 0.000213 time 1.5725 (2.2403) loss 3.5740 (3.2941) 
grad_norm 2.2920 (2.0579) [2022-01-24 09:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][380/1251] eta 0:32:28 lr 0.000213 time 2.5294 (2.2370) loss 3.6668 (3.2913) grad_norm 1.8683 (2.0593) [2022-01-24 09:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][390/1251] eta 0:32:05 lr 0.000213 time 2.4814 (2.2363) loss 3.9566 (3.2897) grad_norm 2.0135 (2.0594) [2022-01-24 09:31:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][400/1251] eta 0:31:40 lr 0.000213 time 2.0414 (2.2338) loss 3.4648 (3.2885) grad_norm 1.9484 (2.0582) [2022-01-24 09:32:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][410/1251] eta 0:31:13 lr 0.000213 time 2.0675 (2.2280) loss 2.2502 (3.2908) grad_norm 2.0240 (2.0555) [2022-01-24 09:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][420/1251] eta 0:30:52 lr 0.000213 time 2.3553 (2.2295) loss 3.2624 (3.2887) grad_norm 2.0677 (2.0560) [2022-01-24 09:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][430/1251] eta 0:30:30 lr 0.000213 time 2.7070 (2.2295) loss 3.5668 (3.2942) grad_norm 1.8495 (2.0569) [2022-01-24 09:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][440/1251] eta 0:30:06 lr 0.000213 time 1.5132 (2.2274) loss 3.3750 (3.2895) grad_norm 1.9684 (2.0555) [2022-01-24 09:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][450/1251] eta 0:29:47 lr 0.000213 time 2.9257 (2.2322) loss 2.8981 (3.2811) grad_norm 1.8832 (2.0547) [2022-01-24 09:33:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][460/1251] eta 0:29:23 lr 0.000213 time 1.8681 (2.2296) loss 2.1601 (3.2816) grad_norm 2.1129 (2.0548) [2022-01-24 09:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][470/1251] eta 0:29:00 lr 0.000212 time 2.4466 (2.2281) loss 3.8011 (3.2785) grad_norm 1.7112 (2.0537) [2022-01-24 09:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [210/300][480/1251] eta 0:28:35 lr 0.000212 time 1.9726 (2.2252) loss 2.6005 (3.2785) grad_norm 1.9453 (2.0551) [2022-01-24 09:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][490/1251] eta 0:28:13 lr 0.000212 time 2.7246 (2.2249) loss 3.3601 (3.2817) grad_norm 1.9232 (2.0556) [2022-01-24 09:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][500/1251] eta 0:27:50 lr 0.000212 time 2.4048 (2.2240) loss 3.4390 (3.2818) grad_norm 1.9329 (2.0539) [2022-01-24 09:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][510/1251] eta 0:27:27 lr 0.000212 time 1.9090 (2.2228) loss 3.3298 (3.2812) grad_norm 1.8696 (2.0514) [2022-01-24 09:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][520/1251] eta 0:27:05 lr 0.000212 time 2.5078 (2.2233) loss 4.0680 (3.2858) grad_norm 2.1740 (2.0502) [2022-01-24 09:36:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][530/1251] eta 0:26:42 lr 0.000212 time 2.8146 (2.2224) loss 3.7238 (3.2828) grad_norm 1.9556 (2.0481) [2022-01-24 09:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][540/1251] eta 0:26:16 lr 0.000212 time 1.9443 (2.2178) loss 3.6062 (3.2834) grad_norm 2.0421 (2.0483) [2022-01-24 09:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][550/1251] eta 0:25:51 lr 0.000212 time 1.9232 (2.2135) loss 2.8218 (3.2845) grad_norm 1.8000 (2.0468) [2022-01-24 09:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][560/1251] eta 0:25:27 lr 0.000212 time 2.7663 (2.2111) loss 3.6599 (3.2869) grad_norm 2.3998 (2.0475) [2022-01-24 09:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][570/1251] eta 0:25:06 lr 0.000212 time 3.1505 (2.2128) loss 3.8794 (3.2885) grad_norm 2.2842 (2.0464) [2022-01-24 09:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][580/1251] eta 0:24:45 lr 0.000212 time 2.1315 (2.2134) loss 3.6051 (3.2886) 
grad_norm 1.9179 (2.0462) [2022-01-24 09:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][590/1251] eta 0:24:22 lr 0.000212 time 2.1090 (2.2132) loss 2.8419 (3.2909) grad_norm 1.8880 (2.0452) [2022-01-24 09:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][600/1251] eta 0:24:01 lr 0.000212 time 1.9075 (2.2144) loss 3.6559 (3.2864) grad_norm 1.9924 (2.0478) [2022-01-24 09:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][610/1251] eta 0:23:40 lr 0.000212 time 2.4700 (2.2157) loss 3.9071 (3.2876) grad_norm 2.2452 (2.0477) [2022-01-24 09:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][620/1251] eta 0:23:17 lr 0.000212 time 2.5500 (2.2141) loss 3.1694 (3.2896) grad_norm 1.8795 (2.0470) [2022-01-24 09:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][630/1251] eta 0:22:53 lr 0.000212 time 1.5867 (2.2112) loss 3.0126 (3.2919) grad_norm 2.0916 (2.0483) [2022-01-24 09:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][640/1251] eta 0:22:32 lr 0.000212 time 1.9128 (2.2130) loss 3.3739 (3.2939) grad_norm 2.0169 (2.0475) [2022-01-24 09:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][650/1251] eta 0:22:09 lr 0.000212 time 2.2191 (2.2124) loss 2.6720 (3.2969) grad_norm 2.0451 (2.0467) [2022-01-24 09:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][660/1251] eta 0:21:47 lr 0.000212 time 2.2821 (2.2129) loss 2.5285 (3.2974) grad_norm 1.9317 (2.0468) [2022-01-24 09:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][670/1251] eta 0:21:24 lr 0.000212 time 2.2060 (2.2114) loss 2.8422 (3.2964) grad_norm 1.8045 (2.0472) [2022-01-24 09:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][680/1251] eta 0:21:02 lr 0.000212 time 1.6854 (2.2103) loss 2.6297 (3.2964) grad_norm 1.9640 (2.0488) [2022-01-24 09:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [210/300][690/1251] eta 0:20:39 lr 0.000212 time 2.5221 (2.2098) loss 2.9380 (3.2993) grad_norm 1.9018 (2.0491) [2022-01-24 09:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][700/1251] eta 0:20:17 lr 0.000212 time 1.5138 (2.2088) loss 3.7174 (3.2977) grad_norm 1.9328 (2.0484) [2022-01-24 09:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][710/1251] eta 0:19:54 lr 0.000212 time 1.6063 (2.2073) loss 3.2150 (3.2988) grad_norm 2.1725 (2.0484) [2022-01-24 09:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][720/1251] eta 0:19:32 lr 0.000212 time 2.2554 (2.2074) loss 3.5686 (3.2995) grad_norm 2.9752 (2.0483) [2022-01-24 09:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][730/1251] eta 0:19:09 lr 0.000212 time 2.1833 (2.2069) loss 3.5938 (3.3029) grad_norm 1.9725 (2.0504) [2022-01-24 09:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][740/1251] eta 0:18:47 lr 0.000212 time 2.0166 (2.2072) loss 4.0381 (3.3023) grad_norm 1.9282 (2.0508) [2022-01-24 09:44:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][750/1251] eta 0:18:25 lr 0.000212 time 1.7877 (2.2059) loss 3.8133 (3.2994) grad_norm 2.0713 (2.0525) [2022-01-24 09:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][760/1251] eta 0:18:03 lr 0.000212 time 2.5247 (2.2061) loss 3.7085 (3.2960) grad_norm 2.2334 (2.0530) [2022-01-24 09:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][770/1251] eta 0:17:40 lr 0.000211 time 2.1163 (2.2051) loss 2.8642 (3.2956) grad_norm 3.2006 (2.0551) [2022-01-24 09:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][780/1251] eta 0:17:18 lr 0.000211 time 1.8145 (2.2049) loss 3.5481 (3.2976) grad_norm 1.9236 (2.0547) [2022-01-24 09:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][790/1251] eta 0:16:56 lr 0.000211 time 2.2264 (2.2052) loss 3.4508 (3.2942) 
grad_norm 1.8417 (2.0534) [2022-01-24 09:46:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][800/1251] eta 0:16:34 lr 0.000211 time 2.0824 (2.2040) loss 2.4213 (3.2947) grad_norm 2.0139 (2.0536) [2022-01-24 09:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][810/1251] eta 0:16:12 lr 0.000211 time 1.5569 (2.2043) loss 3.0778 (3.2937) grad_norm 2.3342 (2.0542) [2022-01-24 09:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][820/1251] eta 0:15:50 lr 0.000211 time 2.4095 (2.2043) loss 3.4961 (3.2931) grad_norm 2.0911 (2.0535) [2022-01-24 09:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][830/1251] eta 0:15:27 lr 0.000211 time 3.0367 (2.2036) loss 3.5911 (3.2930) grad_norm 1.7978 (2.0530) [2022-01-24 09:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][840/1251] eta 0:15:05 lr 0.000211 time 2.0575 (2.2031) loss 3.5283 (3.2950) grad_norm 1.8995 (2.0536) [2022-01-24 09:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][850/1251] eta 0:14:42 lr 0.000211 time 2.0169 (2.2006) loss 3.0597 (3.2974) grad_norm 2.0983 (2.0532) [2022-01-24 09:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][860/1251] eta 0:14:20 lr 0.000211 time 1.9920 (2.1999) loss 4.0309 (3.2985) grad_norm 2.1513 (2.0530) [2022-01-24 09:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][870/1251] eta 0:13:58 lr 0.000211 time 2.3784 (2.2008) loss 3.6877 (3.3004) grad_norm 1.8162 (2.0527) [2022-01-24 09:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][880/1251] eta 0:13:37 lr 0.000211 time 2.1538 (2.2027) loss 3.6369 (3.3012) grad_norm 2.0744 (2.0530) [2022-01-24 09:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][890/1251] eta 0:13:15 lr 0.000211 time 1.9019 (2.2027) loss 3.7054 (3.3016) grad_norm 2.1986 (2.0533) [2022-01-24 09:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [210/300][900/1251] eta 0:12:52 lr 0.000211 time 1.9054 (2.2008) loss 3.4430 (3.3028) grad_norm 1.7343 (2.0528) [2022-01-24 09:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][910/1251] eta 0:12:31 lr 0.000211 time 2.7884 (2.2024) loss 3.6256 (3.3019) grad_norm 1.9707 (2.0528) [2022-01-24 09:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][920/1251] eta 0:12:09 lr 0.000211 time 1.6195 (2.2036) loss 3.8338 (3.3028) grad_norm 2.1987 (2.0534) [2022-01-24 09:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][930/1251] eta 0:11:46 lr 0.000211 time 1.6660 (2.2011) loss 2.5806 (3.2999) grad_norm 2.2120 (2.0542) [2022-01-24 09:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][940/1251] eta 0:11:24 lr 0.000211 time 1.6391 (2.2002) loss 3.6271 (3.2985) grad_norm 2.3251 (2.0553) [2022-01-24 09:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][950/1251] eta 0:11:02 lr 0.000211 time 3.4136 (2.2005) loss 3.4041 (3.3015) grad_norm 1.7660 (2.0550) [2022-01-24 09:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][960/1251] eta 0:10:40 lr 0.000211 time 2.0175 (2.2020) loss 3.1130 (3.3005) grad_norm 1.8938 (2.0565) [2022-01-24 09:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][970/1251] eta 0:10:18 lr 0.000211 time 2.5250 (2.2019) loss 3.0141 (3.2994) grad_norm 2.3947 (2.0579) [2022-01-24 09:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][980/1251] eta 0:09:56 lr 0.000211 time 1.8491 (2.1999) loss 2.1156 (3.2988) grad_norm 2.0381 (2.0579) [2022-01-24 09:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][990/1251] eta 0:09:34 lr 0.000211 time 2.1801 (2.1999) loss 2.7374 (3.2967) grad_norm 2.1753 (2.0581) [2022-01-24 09:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1000/1251] eta 0:09:12 lr 0.000211 time 2.5222 (2.1994) loss 3.3030 (3.2990) 
grad_norm 2.2697 (2.0589) [2022-01-24 09:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1010/1251] eta 0:08:49 lr 0.000211 time 2.5570 (2.1989) loss 3.5490 (3.2982) grad_norm 2.1456 (2.0594) [2022-01-24 09:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1020/1251] eta 0:08:27 lr 0.000211 time 2.3539 (2.1983) loss 3.5117 (3.2999) grad_norm 1.7977 (2.0582) [2022-01-24 09:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1030/1251] eta 0:08:06 lr 0.000211 time 2.0383 (2.1996) loss 2.5926 (3.2983) grad_norm 2.5447 (2.0590) [2022-01-24 09:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1040/1251] eta 0:07:44 lr 0.000211 time 1.7667 (2.2024) loss 3.1779 (3.2982) grad_norm 2.2656 (2.0581) [2022-01-24 09:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1050/1251] eta 0:07:22 lr 0.000211 time 1.7785 (2.2029) loss 3.5126 (3.2990) grad_norm 1.9397 (2.0577) [2022-01-24 09:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1060/1251] eta 0:07:00 lr 0.000211 time 2.2269 (2.2023) loss 2.3365 (3.2994) grad_norm 2.2591 (2.0581) [2022-01-24 09:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1070/1251] eta 0:06:38 lr 0.000210 time 1.7292 (2.2001) loss 3.3113 (3.2995) grad_norm 2.0988 (2.0581) [2022-01-24 09:56:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1080/1251] eta 0:06:15 lr 0.000210 time 1.9093 (2.1984) loss 2.5532 (3.2977) grad_norm 2.2086 (2.0576) [2022-01-24 09:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1090/1251] eta 0:05:54 lr 0.000210 time 1.9936 (2.1998) loss 3.7649 (3.2970) grad_norm 2.1546 (2.0596) [2022-01-24 09:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1100/1251] eta 0:05:32 lr 0.000210 time 2.4210 (2.1996) loss 3.4613 (3.2965) grad_norm 2.2372 (2.0598) [2022-01-24 09:57:33 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [210/300][1110/1251] eta 0:05:10 lr 0.000210 time 1.8279 (2.1991) loss 3.5309 (3.2984) grad_norm 2.0773 (2.0599) [2022-01-24 09:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1120/1251] eta 0:04:48 lr 0.000210 time 3.1344 (2.1996) loss 3.3910 (3.2979) grad_norm 1.9466 (2.0605) [2022-01-24 09:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1130/1251] eta 0:04:26 lr 0.000210 time 1.5932 (2.1999) loss 2.1799 (3.2977) grad_norm 1.8841 (2.0617) [2022-01-24 09:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1140/1251] eta 0:04:04 lr 0.000210 time 1.6200 (2.1990) loss 2.3267 (3.2962) grad_norm 1.7705 (2.0615) [2022-01-24 09:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1150/1251] eta 0:03:41 lr 0.000210 time 1.8243 (2.1974) loss 3.0609 (3.2977) grad_norm 2.2361 (2.0615) [2022-01-24 09:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1160/1251] eta 0:03:20 lr 0.000210 time 4.3536 (2.1985) loss 3.3785 (3.2978) grad_norm 1.8142 (2.0613) [2022-01-24 09:59:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1170/1251] eta 0:02:58 lr 0.000210 time 2.3502 (2.2024) loss 3.4090 (3.2966) grad_norm 2.0229 (2.0625) [2022-01-24 10:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1180/1251] eta 0:02:36 lr 0.000210 time 1.4882 (2.2018) loss 3.3507 (3.2966) grad_norm 1.9379 (2.0629) [2022-01-24 10:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1190/1251] eta 0:02:14 lr 0.000210 time 1.7892 (2.2008) loss 2.9132 (3.2973) grad_norm 2.0354 (2.0641) [2022-01-24 10:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1200/1251] eta 0:01:52 lr 0.000210 time 2.4178 (2.2000) loss 3.0918 (3.2968) grad_norm 1.7350 (2.0633) [2022-01-24 10:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1210/1251] eta 0:01:30 lr 0.000210 time 2.0035 (2.1982) loss 
2.8629 (3.2949) grad_norm 1.7814 (2.0628)
[2022-01-24 10:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1220/1251] eta 0:01:08 lr 0.000210 time 2.0991 (2.1974) loss 2.7881 (3.2928) grad_norm 2.0553 (2.0630)
[2022-01-24 10:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1230/1251] eta 0:00:46 lr 0.000210 time 2.2209 (2.1979) loss 3.6398 (3.2931) grad_norm 1.9388 (2.0631)
[2022-01-24 10:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1240/1251] eta 0:00:24 lr 0.000210 time 1.8011 (2.1980) loss 3.2891 (3.2907) grad_norm 2.4736 (2.0640)
[2022-01-24 10:02:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [210/300][1250/1251] eta 0:00:02 lr 0.000210 time 1.1891 (2.1932) loss 3.6029 (3.2914) grad_norm 1.9857 (2.0639)
[2022-01-24 10:02:34 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 210 training takes 0:45:44
[2022-01-24 10:02:34 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_210 saving......
[2022-01-24 10:02:45 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_210 saved !!!
[2022-01-24 10:03:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.974 (16.974) Loss 0.9581 (0.9581) Acc@1 77.441 (77.441) Acc@5 94.531 (94.531)
[2022-01-24 10:03:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.986 (2.915) Loss 0.9014 (0.9062) Acc@1 79.395 (78.267) Acc@5 94.141 (94.513)
[2022-01-24 10:03:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.333 (2.276) Loss 0.9183 (0.9147) Acc@1 78.906 (78.330) Acc@5 94.141 (94.457)
[2022-01-24 10:03:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.628 (2.096) Loss 0.8534 (0.9068) Acc@1 79.785 (78.497) Acc@5 95.605 (94.512)
[2022-01-24 10:04:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.657 (2.000) Loss 0.9002 (0.9024) Acc@1 78.809 (78.613) Acc@5 94.238 (94.536)
[2022-01-24 10:04:15 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.644 Acc@5 94.592
[2022-01-24 10:04:15 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.6%
[2022-01-24 10:04:15 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.81%
[2022-01-24 10:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][0/1251] eta 7:29:04 lr 0.000210 time 21.5380 (21.5380) loss 3.1223 (3.1223) grad_norm 2.3635 (2.3635)
[2022-01-24 10:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][10/1251] eta 1:22:39 lr 0.000210 time 1.8316 (3.9961) loss 3.6990 (3.1021) grad_norm 2.1223 (2.0903)
[2022-01-24 10:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][20/1251] eta 1:03:09 lr 0.000210 time 1.8182 (3.0785) loss 2.7987 (3.0326) grad_norm 2.0127 (2.0653)
[2022-01-24 10:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][30/1251] eta 0:55:49 lr 0.000210 time 1.6647 (2.7435) loss 2.9490 (3.1357) grad_norm 3.1384 (2.1203)
[2022-01-24 10:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[211/300][40/1251] eta 0:54:37 lr 0.000210 time 5.9517 (2.7066) loss 2.3178 (3.1530) grad_norm 2.0575 (2.1568) [2022-01-24 10:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][50/1251] eta 0:52:24 lr 0.000210 time 2.4109 (2.6184) loss 2.8471 (3.1641) grad_norm 2.2506 (2.1530) [2022-01-24 10:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][60/1251] eta 0:50:34 lr 0.000210 time 1.5681 (2.5475) loss 3.5718 (3.1912) grad_norm 2.5842 (2.1410) [2022-01-24 10:07:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][70/1251] eta 0:49:13 lr 0.000210 time 1.4904 (2.5011) loss 3.4461 (3.1527) grad_norm 1.9285 (2.1318) [2022-01-24 10:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][80/1251] eta 0:48:38 lr 0.000210 time 3.8282 (2.4922) loss 2.7581 (3.1652) grad_norm 2.0121 (2.1160) [2022-01-24 10:08:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][90/1251] eta 0:48:00 lr 0.000210 time 3.1829 (2.4809) loss 3.2409 (3.1640) grad_norm 1.8768 (2.1072) [2022-01-24 10:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][100/1251] eta 0:47:08 lr 0.000210 time 1.5379 (2.4578) loss 2.3557 (3.1770) grad_norm 1.9294 (2.0996) [2022-01-24 10:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][110/1251] eta 0:46:05 lr 0.000210 time 1.9281 (2.4236) loss 3.3201 (3.1958) grad_norm 1.8366 (2.0869) [2022-01-24 10:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][120/1251] eta 0:44:57 lr 0.000209 time 2.4595 (2.3852) loss 2.9987 (3.1940) grad_norm 2.0713 (2.0784) [2022-01-24 10:09:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][130/1251] eta 0:44:12 lr 0.000209 time 3.1077 (2.3664) loss 3.7702 (3.1999) grad_norm 2.1058 (2.0702) [2022-01-24 10:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][140/1251] eta 0:43:42 lr 0.000209 time 2.5190 (2.3608) loss 3.3820 (3.2035) grad_norm 2.1884 
(2.0696) [2022-01-24 10:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][150/1251] eta 0:43:11 lr 0.000209 time 1.8099 (2.3537) loss 3.9046 (3.1983) grad_norm 2.3203 (2.0728) [2022-01-24 10:10:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][160/1251] eta 0:42:53 lr 0.000209 time 2.6367 (2.3587) loss 3.5133 (3.1970) grad_norm 2.1113 (2.0680) [2022-01-24 10:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][170/1251] eta 0:42:31 lr 0.000209 time 3.4259 (2.3601) loss 2.7102 (3.1769) grad_norm 2.0433 (2.0619) [2022-01-24 10:11:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][180/1251] eta 0:41:53 lr 0.000209 time 2.2105 (2.3468) loss 3.7229 (3.1786) grad_norm 1.9858 (2.0561) [2022-01-24 10:11:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][190/1251] eta 0:41:11 lr 0.000209 time 1.9859 (2.3292) loss 2.9209 (3.1801) grad_norm 1.8943 (2.0537) [2022-01-24 10:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][200/1251] eta 0:40:26 lr 0.000209 time 1.9398 (2.3084) loss 2.9984 (3.1717) grad_norm 1.8584 (2.0544) [2022-01-24 10:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][210/1251] eta 0:40:00 lr 0.000209 time 3.3622 (2.3057) loss 3.3963 (3.1797) grad_norm 2.0022 (2.0522) [2022-01-24 10:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][220/1251] eta 0:39:28 lr 0.000209 time 2.2535 (2.2976) loss 3.7424 (3.1978) grad_norm 2.1520 (2.0608) [2022-01-24 10:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][230/1251] eta 0:38:59 lr 0.000209 time 1.5852 (2.2913) loss 3.6419 (3.2038) grad_norm 2.2075 (2.0606) [2022-01-24 10:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][240/1251] eta 0:38:27 lr 0.000209 time 1.9210 (2.2825) loss 3.7857 (3.2110) grad_norm 1.8670 (2.0579) [2022-01-24 10:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[211/300][250/1251] eta 0:38:07 lr 0.000209 time 2.9048 (2.2857) loss 3.2487 (3.2083) grad_norm 1.8255 (2.0524) [2022-01-24 10:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][260/1251] eta 0:37:45 lr 0.000209 time 1.8930 (2.2863) loss 3.6620 (3.2012) grad_norm 1.8703 (2.0474) [2022-01-24 10:14:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][270/1251] eta 0:37:17 lr 0.000209 time 2.1820 (2.2813) loss 3.5756 (3.1968) grad_norm 1.8542 (2.0470) [2022-01-24 10:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][280/1251] eta 0:36:54 lr 0.000209 time 2.4077 (2.2811) loss 3.7041 (3.2097) grad_norm 2.1280 (2.0484) [2022-01-24 10:15:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][290/1251] eta 0:36:31 lr 0.000209 time 2.4044 (2.2804) loss 3.6081 (3.2079) grad_norm 2.0852 (2.0482) [2022-01-24 10:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][300/1251] eta 0:36:04 lr 0.000209 time 2.5642 (2.2762) loss 3.9712 (3.2156) grad_norm 2.0821 (2.0502) [2022-01-24 10:16:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][310/1251] eta 0:35:33 lr 0.000209 time 1.9547 (2.2677) loss 3.7387 (3.2207) grad_norm 2.1785 (2.0507) [2022-01-24 10:16:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][320/1251] eta 0:35:09 lr 0.000209 time 2.2693 (2.2659) loss 2.7838 (3.2209) grad_norm 1.9484 (2.0507) [2022-01-24 10:16:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][330/1251] eta 0:34:44 lr 0.000209 time 2.2014 (2.2632) loss 3.4744 (3.2229) grad_norm 1.9894 (2.0490) [2022-01-24 10:17:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][340/1251] eta 0:34:18 lr 0.000209 time 2.8437 (2.2596) loss 3.7709 (3.2246) grad_norm 2.8327 (2.0522) [2022-01-24 10:17:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][350/1251] eta 0:33:54 lr 0.000209 time 1.9097 (2.2584) loss 3.4147 (3.2217) grad_norm 
1.8493 (2.0501) [2022-01-24 10:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][360/1251] eta 0:33:29 lr 0.000209 time 2.1919 (2.2550) loss 3.6993 (3.2147) grad_norm 1.9517 (2.0474) [2022-01-24 10:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][370/1251] eta 0:33:02 lr 0.000209 time 1.8461 (2.2497) loss 3.1205 (3.2119) grad_norm 2.3096 (2.0502) [2022-01-24 10:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][380/1251] eta 0:32:37 lr 0.000209 time 1.9999 (2.2470) loss 2.7797 (3.2167) grad_norm 1.8394 (2.0527) [2022-01-24 10:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][390/1251] eta 0:32:15 lr 0.000209 time 2.4346 (2.2477) loss 2.9326 (3.2134) grad_norm 2.6522 (2.0582) [2022-01-24 10:19:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][400/1251] eta 0:31:50 lr 0.000209 time 2.1873 (2.2453) loss 3.1138 (3.2220) grad_norm 2.3067 (2.0568) [2022-01-24 10:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][410/1251] eta 0:31:25 lr 0.000209 time 1.8597 (2.2417) loss 2.9545 (3.2215) grad_norm 2.0404 (2.0564) [2022-01-24 10:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][420/1251] eta 0:31:00 lr 0.000208 time 2.6460 (2.2390) loss 3.9799 (3.2299) grad_norm 1.9339 (2.0555) [2022-01-24 10:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][430/1251] eta 0:30:35 lr 0.000208 time 1.8642 (2.2362) loss 2.6953 (3.2306) grad_norm 2.1975 (2.0572) [2022-01-24 10:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][440/1251] eta 0:30:14 lr 0.000208 time 2.3967 (2.2379) loss 4.1543 (3.2378) grad_norm 2.2933 (2.0571) [2022-01-24 10:21:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][450/1251] eta 0:29:53 lr 0.000208 time 2.4884 (2.2386) loss 2.1934 (3.2398) grad_norm 1.9704 (2.0566) [2022-01-24 10:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[211/300][460/1251] eta 0:29:29 lr 0.000208 time 2.5057 (2.2376) loss 3.3756 (3.2443) grad_norm 1.9383 (2.0563) [2022-01-24 10:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][470/1251] eta 0:29:04 lr 0.000208 time 1.8802 (2.2341) loss 3.8090 (3.2458) grad_norm 2.4393 (2.0570) [2022-01-24 10:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][480/1251] eta 0:28:41 lr 0.000208 time 2.1287 (2.2325) loss 2.7582 (3.2406) grad_norm 2.1043 (2.0573) [2022-01-24 10:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][490/1251] eta 0:28:18 lr 0.000208 time 1.8780 (2.2316) loss 3.3914 (3.2354) grad_norm 1.8551 (2.0569) [2022-01-24 10:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][500/1251] eta 0:27:58 lr 0.000208 time 2.5060 (2.2353) loss 3.9737 (3.2327) grad_norm 2.1230 (2.0566) [2022-01-24 10:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][510/1251] eta 0:27:36 lr 0.000208 time 2.1644 (2.2352) loss 3.3441 (3.2367) grad_norm 2.4372 (2.0573) [2022-01-24 10:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][520/1251] eta 0:27:12 lr 0.000208 time 2.7571 (2.2336) loss 3.6462 (3.2403) grad_norm 1.8058 (2.0571) [2022-01-24 10:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][530/1251] eta 0:26:47 lr 0.000208 time 1.8698 (2.2301) loss 4.0927 (3.2440) grad_norm 1.7305 (2.0553) [2022-01-24 10:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][540/1251] eta 0:26:24 lr 0.000208 time 2.7966 (2.2285) loss 2.4814 (3.2468) grad_norm 1.8861 (2.0558) [2022-01-24 10:24:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][550/1251] eta 0:26:01 lr 0.000208 time 1.7677 (2.2270) loss 2.5607 (3.2431) grad_norm 2.1260 (2.0590) [2022-01-24 10:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][560/1251] eta 0:25:38 lr 0.000208 time 2.2776 (2.2259) loss 3.8107 (3.2456) grad_norm 
1.9933 (2.0621) [2022-01-24 10:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][570/1251] eta 0:25:14 lr 0.000208 time 1.9588 (2.2247) loss 2.5307 (3.2409) grad_norm 1.9312 (2.0621) [2022-01-24 10:25:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][580/1251] eta 0:24:54 lr 0.000208 time 3.6412 (2.2277) loss 2.3555 (3.2346) grad_norm 2.0902 (2.0619) [2022-01-24 10:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][590/1251] eta 0:24:31 lr 0.000208 time 1.8436 (2.2265) loss 3.0441 (3.2402) grad_norm 1.9685 (2.0625) [2022-01-24 10:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][600/1251] eta 0:24:08 lr 0.000208 time 1.8893 (2.2244) loss 3.1884 (3.2416) grad_norm 2.3100 (2.0644) [2022-01-24 10:26:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][610/1251] eta 0:23:44 lr 0.000208 time 1.9531 (2.2223) loss 3.7397 (3.2450) grad_norm 1.8187 (2.0638) [2022-01-24 10:27:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][620/1251] eta 0:23:23 lr 0.000208 time 3.0388 (2.2250) loss 3.5715 (3.2477) grad_norm 2.2325 (2.0637) [2022-01-24 10:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][630/1251] eta 0:23:00 lr 0.000208 time 1.6415 (2.2224) loss 3.3197 (3.2511) grad_norm 2.5957 (2.0642) [2022-01-24 10:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][640/1251] eta 0:22:37 lr 0.000208 time 1.8005 (2.2211) loss 2.9891 (3.2475) grad_norm 1.9266 (2.0651) [2022-01-24 10:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][650/1251] eta 0:22:14 lr 0.000208 time 2.0979 (2.2206) loss 3.6936 (3.2485) grad_norm 1.9984 (2.0642) [2022-01-24 10:28:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][660/1251] eta 0:21:52 lr 0.000208 time 2.7115 (2.2201) loss 4.0053 (3.2472) grad_norm 2.1760 (2.0636) [2022-01-24 10:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[211/300][670/1251] eta 0:21:28 lr 0.000208 time 2.2926 (2.2172) loss 3.2730 (3.2483) grad_norm 2.6702 (2.0630) [2022-01-24 10:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][680/1251] eta 0:21:06 lr 0.000208 time 1.8663 (2.2180) loss 2.5736 (3.2479) grad_norm 2.0417 (2.0622) [2022-01-24 10:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][690/1251] eta 0:20:44 lr 0.000208 time 2.0914 (2.2191) loss 3.1905 (3.2477) grad_norm 2.2306 (2.0605) [2022-01-24 10:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][700/1251] eta 0:20:22 lr 0.000208 time 2.1689 (2.2187) loss 3.1038 (3.2498) grad_norm 2.0156 (2.0610) [2022-01-24 10:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][710/1251] eta 0:19:59 lr 0.000208 time 2.4524 (2.2175) loss 3.1503 (3.2526) grad_norm 2.0969 (2.0604) [2022-01-24 10:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][720/1251] eta 0:19:37 lr 0.000207 time 1.9227 (2.2169) loss 3.4421 (3.2543) grad_norm 2.0555 (2.0599) [2022-01-24 10:31:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][730/1251] eta 0:19:14 lr 0.000207 time 3.2413 (2.2158) loss 3.8247 (3.2562) grad_norm 1.9299 (2.0603) [2022-01-24 10:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][740/1251] eta 0:18:53 lr 0.000207 time 2.2172 (2.2183) loss 3.2691 (3.2575) grad_norm 1.8248 (2.0596) [2022-01-24 10:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][750/1251] eta 0:18:31 lr 0.000207 time 1.9433 (2.2185) loss 3.4692 (3.2592) grad_norm 1.9351 (2.0589) [2022-01-24 10:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][760/1251] eta 0:18:09 lr 0.000207 time 1.8174 (2.2182) loss 2.2415 (3.2592) grad_norm 2.0351 (2.0595) [2022-01-24 10:32:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][770/1251] eta 0:17:45 lr 0.000207 time 2.0491 (2.2157) loss 3.5198 (3.2602) grad_norm 
2.2559 (2.0596) [2022-01-24 10:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][780/1251] eta 0:17:23 lr 0.000207 time 1.9401 (2.2146) loss 3.0263 (3.2580) grad_norm 2.1245 (2.0606) [2022-01-24 10:33:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][790/1251] eta 0:17:00 lr 0.000207 time 1.8452 (2.2128) loss 3.4078 (3.2578) grad_norm 2.1298 (2.0622) [2022-01-24 10:33:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][800/1251] eta 0:16:37 lr 0.000207 time 2.1354 (2.2123) loss 2.5768 (3.2569) grad_norm 1.9909 (2.0631) [2022-01-24 10:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][810/1251] eta 0:16:15 lr 0.000207 time 2.3636 (2.2118) loss 4.2575 (3.2631) grad_norm 2.3179 (2.0666) [2022-01-24 10:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][820/1251] eta 0:15:53 lr 0.000207 time 2.5055 (2.2131) loss 3.1723 (3.2625) grad_norm 2.0880 (2.0678) [2022-01-24 10:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][830/1251] eta 0:15:31 lr 0.000207 time 1.8481 (2.2127) loss 3.0180 (3.2633) grad_norm 1.9977 (2.0677) [2022-01-24 10:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][840/1251] eta 0:15:10 lr 0.000207 time 3.1786 (2.2149) loss 3.7148 (3.2648) grad_norm 2.1260 (2.0675) [2022-01-24 10:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][850/1251] eta 0:14:48 lr 0.000207 time 2.0378 (2.2151) loss 3.9363 (3.2660) grad_norm 2.2127 (2.0670) [2022-01-24 10:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][860/1251] eta 0:14:25 lr 0.000207 time 2.3829 (2.2135) loss 3.6383 (3.2658) grad_norm 1.9352 (2.0664) [2022-01-24 10:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][870/1251] eta 0:14:02 lr 0.000207 time 2.0399 (2.2117) loss 3.3407 (3.2664) grad_norm 1.8558 (2.0656) [2022-01-24 10:36:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[211/300][880/1251] eta 0:13:40 lr 0.000207 time 2.2976 (2.2112) loss 3.7810 (3.2655) grad_norm 2.1703 (2.0651) [2022-01-24 10:37:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][890/1251] eta 0:13:18 lr 0.000207 time 3.3775 (2.2112) loss 3.2751 (3.2662) grad_norm 2.1174 (2.0660) [2022-01-24 10:37:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][900/1251] eta 0:12:55 lr 0.000207 time 2.2446 (2.2097) loss 2.4281 (3.2668) grad_norm 2.0137 (2.0679) [2022-01-24 10:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][910/1251] eta 0:12:33 lr 0.000207 time 1.8711 (2.2095) loss 2.9848 (3.2641) grad_norm 2.2367 (2.0685) [2022-01-24 10:38:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][920/1251] eta 0:12:11 lr 0.000207 time 1.9177 (2.2098) loss 3.6917 (3.2657) grad_norm 2.0928 (2.0689) [2022-01-24 10:38:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][930/1251] eta 0:11:49 lr 0.000207 time 2.7592 (2.2107) loss 3.1784 (3.2664) grad_norm 1.8572 (2.0690) [2022-01-24 10:38:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][940/1251] eta 0:11:27 lr 0.000207 time 1.9122 (2.2101) loss 2.6437 (3.2659) grad_norm 2.0924 (2.0694) [2022-01-24 10:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][950/1251] eta 0:11:04 lr 0.000207 time 1.9495 (2.2090) loss 2.9239 (3.2637) grad_norm 1.8146 (2.0697) [2022-01-24 10:39:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][960/1251] eta 0:10:41 lr 0.000207 time 2.0718 (2.2060) loss 2.7746 (3.2638) grad_norm 1.8445 (2.0699) [2022-01-24 10:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][970/1251] eta 0:10:19 lr 0.000207 time 2.3053 (2.2040) loss 3.1354 (3.2637) grad_norm 2.1690 (2.0698) [2022-01-24 10:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][980/1251] eta 0:09:57 lr 0.000207 time 1.5651 (2.2038) loss 2.6073 (3.2651) grad_norm 
2.0789 (2.0695) [2022-01-24 10:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][990/1251] eta 0:09:35 lr 0.000207 time 2.2144 (2.2048) loss 3.6604 (3.2646) grad_norm 2.1916 (2.0695) [2022-01-24 10:41:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1000/1251] eta 0:09:13 lr 0.000207 time 1.9470 (2.2045) loss 3.2736 (3.2651) grad_norm 2.1582 (2.0686) [2022-01-24 10:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1010/1251] eta 0:08:51 lr 0.000207 time 2.2071 (2.2056) loss 1.9765 (3.2658) grad_norm 1.8758 (2.0678) [2022-01-24 10:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1020/1251] eta 0:08:29 lr 0.000206 time 2.2073 (2.2071) loss 3.3702 (3.2664) grad_norm 2.3433 (2.0681) [2022-01-24 10:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1030/1251] eta 0:08:08 lr 0.000206 time 2.2215 (2.2091) loss 3.5570 (3.2648) grad_norm 2.6745 (2.0690) [2022-01-24 10:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1040/1251] eta 0:07:45 lr 0.000206 time 1.5483 (2.2083) loss 2.0177 (3.2622) grad_norm 2.3258 (2.0697) [2022-01-24 10:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1050/1251] eta 0:07:23 lr 0.000206 time 1.9726 (2.2082) loss 3.2876 (3.2638) grad_norm 2.3527 (2.0705) [2022-01-24 10:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1060/1251] eta 0:07:01 lr 0.000206 time 2.2034 (2.2060) loss 3.9507 (3.2648) grad_norm 2.1200 (2.0713) [2022-01-24 10:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1070/1251] eta 0:06:39 lr 0.000206 time 1.8616 (2.2050) loss 3.0538 (3.2664) grad_norm 1.8252 (2.0701) [2022-01-24 10:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1080/1251] eta 0:06:16 lr 0.000206 time 1.5815 (2.2032) loss 2.9533 (3.2651) grad_norm 2.1202 (2.0713) [2022-01-24 10:44:19 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [211/300][1090/1251] eta 0:05:54 lr 0.000206 time 1.8750 (2.2030) loss 2.0406 (3.2637) grad_norm 2.1048 (2.0714) [2022-01-24 10:44:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1100/1251] eta 0:05:32 lr 0.000206 time 1.9301 (2.2025) loss 3.4804 (3.2611) grad_norm 2.3732 (2.0711) [2022-01-24 10:45:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1110/1251] eta 0:05:10 lr 0.000206 time 2.2323 (2.2031) loss 2.3353 (3.2603) grad_norm 2.4207 (2.0711) [2022-01-24 10:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1120/1251] eta 0:04:48 lr 0.000206 time 2.8253 (2.2039) loss 3.7684 (3.2598) grad_norm 1.9767 (2.0731) [2022-01-24 10:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1130/1251] eta 0:04:26 lr 0.000206 time 2.0733 (2.2041) loss 3.8156 (3.2611) grad_norm 1.8000 (2.0738) [2022-01-24 10:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1140/1251] eta 0:04:04 lr 0.000206 time 1.5764 (2.2031) loss 3.6302 (3.2608) grad_norm 1.8696 (2.0732) [2022-01-24 10:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1150/1251] eta 0:03:42 lr 0.000206 time 2.4945 (2.2038) loss 2.3466 (3.2592) grad_norm 2.1224 (2.0725) [2022-01-24 10:46:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1160/1251] eta 0:03:20 lr 0.000206 time 2.9891 (2.2052) loss 2.8185 (3.2590) grad_norm 1.8080 (2.0719) [2022-01-24 10:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1170/1251] eta 0:02:58 lr 0.000206 time 2.4502 (2.2061) loss 2.8970 (3.2591) grad_norm 1.9581 (2.0716) [2022-01-24 10:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1180/1251] eta 0:02:36 lr 0.000206 time 1.8051 (2.2042) loss 3.7032 (3.2586) grad_norm 2.0801 (2.0713) [2022-01-24 10:47:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1190/1251] eta 0:02:14 lr 0.000206 time 2.1394 (2.2018) loss 2.7707 
(3.2578) grad_norm 1.9622 (2.0714) [2022-01-24 10:48:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1200/1251] eta 0:01:52 lr 0.000206 time 1.8143 (2.2004) loss 2.2991 (3.2561) grad_norm 2.2683 (2.0717) [2022-01-24 10:48:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1210/1251] eta 0:01:30 lr 0.000206 time 2.2015 (2.2012) loss 2.3998 (3.2548) grad_norm 1.9344 (2.0709) [2022-01-24 10:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1220/1251] eta 0:01:08 lr 0.000206 time 2.7785 (2.2075) loss 3.8388 (3.2547) grad_norm 2.0391 (2.0696) [2022-01-24 10:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1230/1251] eta 0:00:46 lr 0.000206 time 2.1351 (2.2081) loss 3.0068 (3.2567) grad_norm 1.8424 (2.0698) [2022-01-24 10:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1240/1251] eta 0:00:24 lr 0.000206 time 1.4673 (2.2064) loss 3.9126 (3.2572) grad_norm 2.0397 (2.0691) [2022-01-24 10:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [211/300][1250/1251] eta 0:00:02 lr 0.000206 time 1.1705 (2.2010) loss 3.1837 (3.2580) grad_norm 1.9289 (2.0693) [2022-01-24 10:50:09 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 211 training takes 0:45:53 [2022-01-24 10:50:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.099 (19.099) Loss 0.8596 (0.8596) Acc@1 81.641 (81.641) Acc@5 94.922 (94.922) [2022-01-24 10:50:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.619 (3.486) Loss 0.8122 (0.8593) Acc@1 80.664 (79.892) Acc@5 95.410 (94.895) [2022-01-24 10:51:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.962 (2.807) Loss 0.8797 (0.8740) Acc@1 79.395 (79.506) Acc@5 94.336 (94.745) [2022-01-24 10:51:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.603 (2.342) Loss 0.8715 (0.8853) Acc@1 79.199 (79.300) Acc@5 95.410 (94.629) [2022-01-24 10:51:37 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.171 (2.136) Loss 1.0065 (0.8912) Acc@1 75.000 (79.028) Acc@5 93.945 (94.650) [2022-01-24 10:51:44 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.992 Acc@5 94.590 [2022-01-24 10:51:44 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.0% [2022-01-24 10:51:44 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.99% [2022-01-24 10:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][0/1251] eta 7:25:47 lr 0.000206 time 21.3806 (21.3806) loss 2.7056 (2.7056) grad_norm 1.9276 (1.9276) [2022-01-24 10:52:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][10/1251] eta 1:25:11 lr 0.000206 time 2.9424 (4.1186) loss 2.9296 (3.1943) grad_norm 2.1803 (2.0377) [2022-01-24 10:52:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][20/1251] eta 1:06:25 lr 0.000206 time 1.4764 (3.2376) loss 3.7462 (3.2666) grad_norm 1.8668 (2.0402) [2022-01-24 10:53:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][30/1251] eta 0:58:35 lr 0.000206 time 1.8790 (2.8794) loss 2.8595 (3.2442) grad_norm 2.4975 (2.1474) [2022-01-24 10:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][40/1251] eta 0:55:09 lr 0.000206 time 3.7735 (2.7327) loss 3.6546 (3.1941) grad_norm 1.8901 (2.1518) [2022-01-24 10:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][50/1251] eta 0:53:26 lr 0.000206 time 2.6793 (2.6695) loss 3.0864 (3.2034) grad_norm 2.1911 (2.1268) [2022-01-24 10:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][60/1251] eta 0:51:39 lr 0.000206 time 1.3455 (2.6023) loss 2.2272 (3.1769) grad_norm 2.0513 (2.1150) [2022-01-24 10:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][70/1251] eta 0:50:04 lr 0.000205 time 2.2007 (2.5439) loss 3.2142 (3.2210) grad_norm 3.0386 (2.1269) [2022-01-24 10:55:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][80/1251] eta 0:48:52 lr 0.000205 time 2.9689 (2.5043) loss 3.4472 (3.2272) grad_norm 2.1379 (2.1139) [2022-01-24 10:55:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][90/1251] eta 0:47:22 lr 0.000205 time 1.8754 (2.4486) loss 3.4593 (3.2041) grad_norm 2.1400 (2.1028) [2022-01-24 10:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][100/1251] eta 0:46:07 lr 0.000205 time 1.9575 (2.4040) loss 2.3071 (3.2183) grad_norm 2.1644 (2.0941) [2022-01-24 10:56:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][110/1251] eta 0:45:19 lr 0.000205 time 2.0656 (2.3832) loss 2.1422 (3.2117) grad_norm 1.8852 (2.0881) [2022-01-24 10:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][120/1251] eta 0:44:34 lr 0.000205 time 1.9009 (2.3644) loss 3.6141 (3.2168) grad_norm 2.3692 (2.0883) [2022-01-24 10:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][130/1251] eta 0:44:10 lr 0.000205 time 2.0192 (2.3643) loss 3.7335 (3.2440) grad_norm 1.9693 (2.0928) [2022-01-24 10:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][140/1251] eta 0:43:37 lr 0.000205 time 2.4806 (2.3562) loss 3.2989 (3.2491) grad_norm 2.1702 (2.0868) [2022-01-24 10:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][150/1251] eta 0:43:02 lr 0.000205 time 1.4946 (2.3456) loss 3.7988 (3.2620) grad_norm 2.2713 (2.0870) [2022-01-24 10:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][160/1251] eta 0:42:24 lr 0.000205 time 1.7279 (2.3323) loss 3.5466 (3.2700) grad_norm 2.0627 (2.0955) [2022-01-24 10:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][170/1251] eta 0:41:55 lr 0.000205 time 1.7087 (2.3272) loss 3.6463 (3.2722) grad_norm 2.0781 (2.0925) [2022-01-24 10:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][180/1251] eta 0:41:23 lr 0.000205 
time 1.6389 (2.3187) loss 3.2434 (3.2617) grad_norm 2.2328 (2.0907) [2022-01-24 10:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][190/1251] eta 0:40:46 lr 0.000205 time 1.8247 (2.3054) loss 3.0531 (3.2722) grad_norm 1.9793 (2.0910) [2022-01-24 10:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][200/1251] eta 0:40:12 lr 0.000205 time 1.6891 (2.2957) loss 3.3658 (3.2799) grad_norm 1.9762 (2.0887) [2022-01-24 10:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][210/1251] eta 0:39:44 lr 0.000205 time 1.9325 (2.2910) loss 2.9277 (3.2820) grad_norm 2.0300 (2.0940) [2022-01-24 11:00:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][220/1251] eta 0:39:14 lr 0.000205 time 1.8194 (2.2835) loss 2.7289 (3.2743) grad_norm 2.3738 (2.0960) [2022-01-24 11:00:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][230/1251] eta 0:38:44 lr 0.000205 time 1.5691 (2.2770) loss 2.4221 (3.2747) grad_norm 1.9416 (2.0970) [2022-01-24 11:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][240/1251] eta 0:38:21 lr 0.000205 time 1.6707 (2.2760) loss 3.1343 (3.2626) grad_norm 1.7872 (2.0944) [2022-01-24 11:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][250/1251] eta 0:38:02 lr 0.000205 time 1.8976 (2.2799) loss 3.3483 (3.2645) grad_norm 2.0331 (2.0942) [2022-01-24 11:01:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][260/1251] eta 0:37:32 lr 0.000205 time 1.5433 (2.2726) loss 3.3377 (3.2574) grad_norm 1.8366 (2.0937) [2022-01-24 11:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][270/1251] eta 0:37:03 lr 0.000205 time 1.8317 (2.2667) loss 3.3217 (3.2511) grad_norm 1.9245 (2.0928) [2022-01-24 11:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][280/1251] eta 0:36:34 lr 0.000205 time 1.7410 (2.2604) loss 3.4473 (3.2442) grad_norm 2.1846 (2.0925) [2022-01-24 11:02:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][290/1251] eta 0:36:10 lr 0.000205 time 2.2470 (2.2584) loss 3.9927 (3.2432) grad_norm 2.1182 (2.0931) [2022-01-24 11:03:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][300/1251] eta 0:35:43 lr 0.000205 time 1.8391 (2.2534) loss 2.3324 (3.2385) grad_norm 1.8742 (2.0900) [2022-01-24 11:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][310/1251] eta 0:35:19 lr 0.000205 time 2.4603 (2.2524) loss 3.9942 (3.2357) grad_norm 2.2182 (2.0890) [2022-01-24 11:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][320/1251] eta 0:34:49 lr 0.000205 time 1.8007 (2.2445) loss 2.8763 (3.2299) grad_norm 2.0106 (2.0901) [2022-01-24 11:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][330/1251] eta 0:34:23 lr 0.000205 time 2.0088 (2.2405) loss 3.9108 (3.2263) grad_norm 2.1652 (2.0920) [2022-01-24 11:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][340/1251] eta 0:33:59 lr 0.000205 time 2.5119 (2.2388) loss 3.0145 (3.2241) grad_norm 2.0739 (2.0909) [2022-01-24 11:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][350/1251] eta 0:33:37 lr 0.000205 time 3.0105 (2.2393) loss 3.7809 (3.2313) grad_norm 2.5645 (2.0902) [2022-01-24 11:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][360/1251] eta 0:33:15 lr 0.000205 time 1.8967 (2.2396) loss 3.2346 (3.2368) grad_norm 2.1038 (2.0914) [2022-01-24 11:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][370/1251] eta 0:32:53 lr 0.000205 time 2.1287 (2.2406) loss 2.6811 (3.2376) grad_norm 2.0508 (2.0935) [2022-01-24 11:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][380/1251] eta 0:32:30 lr 0.000204 time 2.2271 (2.2397) loss 3.1756 (3.2341) grad_norm 2.0579 (2.0953) [2022-01-24 11:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][390/1251] eta 0:32:05 lr 
0.000204 time 2.7213 (2.2366) loss 2.3227 (3.2299) grad_norm 1.9244 (2.0953) [2022-01-24 11:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][400/1251] eta 0:31:42 lr 0.000204 time 2.0965 (2.2355) loss 3.8395 (3.2329) grad_norm 2.2190 (2.0965) [2022-01-24 11:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][410/1251] eta 0:31:19 lr 0.000204 time 1.6509 (2.2344) loss 2.4791 (3.2371) grad_norm 2.1178 (2.0964) [2022-01-24 11:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][420/1251] eta 0:30:57 lr 0.000204 time 2.2949 (2.2352) loss 3.5030 (3.2305) grad_norm 1.8582 (2.0966) [2022-01-24 11:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][430/1251] eta 0:30:33 lr 0.000204 time 2.2281 (2.2334) loss 2.6260 (3.2270) grad_norm 2.2019 (2.0968) [2022-01-24 11:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][440/1251] eta 0:30:09 lr 0.000204 time 2.0217 (2.2315) loss 2.6955 (3.2272) grad_norm 2.1112 (2.0946) [2022-01-24 11:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][450/1251] eta 0:29:44 lr 0.000204 time 1.7607 (2.2274) loss 3.9795 (3.2320) grad_norm 1.9013 (2.0937) [2022-01-24 11:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][460/1251] eta 0:29:20 lr 0.000204 time 2.5235 (2.2259) loss 2.6791 (3.2296) grad_norm 1.8575 (2.0904) [2022-01-24 11:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][470/1251] eta 0:28:55 lr 0.000204 time 1.6075 (2.2224) loss 3.4651 (3.2314) grad_norm 1.7445 (2.0895) [2022-01-24 11:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][480/1251] eta 0:28:36 lr 0.000204 time 2.7486 (2.2267) loss 3.7977 (3.2333) grad_norm 1.9038 (2.0881) [2022-01-24 11:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][490/1251] eta 0:28:15 lr 0.000204 time 2.2986 (2.2278) loss 2.9239 (3.2312) grad_norm 1.8271 (2.0856) [2022-01-24 11:10:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][500/1251] eta 0:27:53 lr 0.000204 time 2.8517 (2.2289) loss 3.5495 (3.2300) grad_norm 2.2967 (2.0964) [2022-01-24 11:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][510/1251] eta 0:27:27 lr 0.000204 time 1.6392 (2.2239) loss 3.7881 (3.2274) grad_norm 2.0142 (2.0991) [2022-01-24 11:11:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][520/1251] eta 0:27:05 lr 0.000204 time 1.6145 (2.2237) loss 3.1188 (3.2303) grad_norm 2.0077 (2.0993) [2022-01-24 11:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][530/1251] eta 0:26:43 lr 0.000204 time 2.2513 (2.2240) loss 3.4869 (3.2308) grad_norm 1.9505 (2.0968) [2022-01-24 11:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][540/1251] eta 0:26:22 lr 0.000204 time 2.9131 (2.2255) loss 3.3207 (3.2329) grad_norm 2.1465 (2.0965) [2022-01-24 11:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][550/1251] eta 0:25:58 lr 0.000204 time 1.9571 (2.2233) loss 3.4749 (3.2364) grad_norm 3.0442 (2.0971) [2022-01-24 11:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][560/1251] eta 0:25:36 lr 0.000204 time 2.3458 (2.2231) loss 3.9738 (3.2380) grad_norm 2.1659 (2.0967) [2022-01-24 11:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][570/1251] eta 0:25:14 lr 0.000204 time 1.8023 (2.2238) loss 3.4185 (3.2397) grad_norm 2.1908 (2.0986) [2022-01-24 11:13:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][580/1251] eta 0:24:53 lr 0.000204 time 2.4338 (2.2252) loss 3.8527 (3.2392) grad_norm 2.2408 (2.0986) [2022-01-24 11:13:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][590/1251] eta 0:24:28 lr 0.000204 time 1.7479 (2.2218) loss 2.4737 (3.2394) grad_norm 2.5887 (2.1001) [2022-01-24 11:13:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][600/1251] eta 0:24:06 lr 
0.000204 time 3.0153 (2.2218) loss 3.0067 (3.2425) grad_norm 2.2557 (2.1003) [2022-01-24 11:14:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][610/1251] eta 0:23:43 lr 0.000204 time 1.8805 (2.2207) loss 2.9649 (3.2438) grad_norm 1.9801 (2.1025) [2022-01-24 11:14:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][620/1251] eta 0:23:20 lr 0.000204 time 2.2240 (2.2195) loss 3.4816 (3.2458) grad_norm 2.2783 (2.1018) [2022-01-24 11:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][630/1251] eta 0:22:56 lr 0.000204 time 2.0158 (2.2173) loss 3.4328 (3.2446) grad_norm 2.3149 (2.1019) [2022-01-24 11:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][640/1251] eta 0:22:34 lr 0.000204 time 3.2421 (2.2165) loss 2.9015 (3.2438) grad_norm 2.2289 (2.1002) [2022-01-24 11:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][650/1251] eta 0:22:11 lr 0.000204 time 1.6237 (2.2150) loss 3.4092 (3.2467) grad_norm 1.7328 (2.0993) [2022-01-24 11:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][660/1251] eta 0:21:48 lr 0.000204 time 2.5879 (2.2145) loss 3.6336 (3.2491) grad_norm 1.9077 (2.0995) [2022-01-24 11:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][670/1251] eta 0:21:26 lr 0.000204 time 1.7888 (2.2135) loss 3.6648 (3.2524) grad_norm 1.7755 (2.0988) [2022-01-24 11:16:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][680/1251] eta 0:21:04 lr 0.000203 time 3.2819 (2.2154) loss 3.2726 (3.2505) grad_norm 1.9869 (2.0978) [2022-01-24 11:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][690/1251] eta 0:20:42 lr 0.000203 time 2.1384 (2.2141) loss 2.3793 (3.2478) grad_norm 2.0164 (2.0958) [2022-01-24 11:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][700/1251] eta 0:20:18 lr 0.000203 time 2.1775 (2.2116) loss 3.4017 (3.2491) grad_norm 1.9460 (2.0946) [2022-01-24 11:17:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][710/1251] eta 0:19:55 lr 0.000203 time 1.7119 (2.2098) loss 3.6258 (3.2497) grad_norm 2.1292 (2.0943) [2022-01-24 11:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][720/1251] eta 0:19:33 lr 0.000203 time 2.4117 (2.2100) loss 3.4107 (3.2510) grad_norm 1.8571 (2.0930) [2022-01-24 11:18:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][730/1251] eta 0:19:11 lr 0.000203 time 2.0781 (2.2103) loss 3.3967 (3.2526) grad_norm 1.8992 (2.0917) [2022-01-24 11:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][740/1251] eta 0:18:48 lr 0.000203 time 1.5023 (2.2076) loss 3.3777 (3.2487) grad_norm 1.8361 (2.0924) [2022-01-24 11:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][750/1251] eta 0:18:25 lr 0.000203 time 1.7204 (2.2067) loss 3.1438 (3.2491) grad_norm 4.0929 (2.0948) [2022-01-24 11:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][760/1251] eta 0:18:02 lr 0.000203 time 2.5107 (2.2053) loss 2.8994 (3.2470) grad_norm 2.1588 (2.0994) [2022-01-24 11:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][770/1251] eta 0:17:41 lr 0.000203 time 2.7180 (2.2060) loss 3.1699 (3.2466) grad_norm 1.9069 (2.1001) [2022-01-24 11:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][780/1251] eta 0:17:19 lr 0.000203 time 2.2001 (2.2069) loss 3.7850 (3.2470) grad_norm 1.9801 (2.1005) [2022-01-24 11:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][790/1251] eta 0:16:58 lr 0.000203 time 2.1818 (2.2085) loss 2.8122 (3.2497) grad_norm 1.8471 (2.1013) [2022-01-24 11:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][800/1251] eta 0:16:36 lr 0.000203 time 2.6232 (2.2087) loss 3.5347 (3.2520) grad_norm 2.0829 (2.1016) [2022-01-24 11:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][810/1251] eta 0:16:14 lr 
0.000203 time 2.0902 (2.2092) loss 3.0415 (3.2526) grad_norm 2.0818 (2.1026) [2022-01-24 11:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][820/1251] eta 0:15:51 lr 0.000203 time 2.2165 (2.2088) loss 3.1775 (3.2507) grad_norm 2.3202 (2.1042) [2022-01-24 11:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][830/1251] eta 0:15:29 lr 0.000203 time 1.8587 (2.2089) loss 2.6761 (3.2508) grad_norm 3.6402 (2.1068) [2022-01-24 11:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][840/1251] eta 0:15:07 lr 0.000203 time 2.2753 (2.2076) loss 3.0645 (3.2488) grad_norm 2.2958 (2.1083) [2022-01-24 11:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][850/1251] eta 0:14:45 lr 0.000203 time 2.4773 (2.2077) loss 3.1430 (3.2508) grad_norm 1.8836 (2.1080) [2022-01-24 11:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][860/1251] eta 0:14:23 lr 0.000203 time 1.9426 (2.2079) loss 2.9684 (3.2501) grad_norm 1.8946 (2.1080) [2022-01-24 11:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][870/1251] eta 0:14:00 lr 0.000203 time 2.0132 (2.2065) loss 3.6531 (3.2499) grad_norm 1.9987 (2.1072) [2022-01-24 11:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][880/1251] eta 0:13:38 lr 0.000203 time 2.2561 (2.2058) loss 3.7657 (3.2548) grad_norm 2.4375 (2.1076) [2022-01-24 11:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][890/1251] eta 0:13:16 lr 0.000203 time 2.4024 (2.2053) loss 2.7977 (3.2534) grad_norm 1.8033 (2.1066) [2022-01-24 11:24:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][900/1251] eta 0:12:54 lr 0.000203 time 2.2671 (2.2062) loss 3.5463 (3.2554) grad_norm 2.0289 (2.1056) [2022-01-24 11:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][910/1251] eta 0:12:32 lr 0.000203 time 2.3506 (2.2064) loss 3.0529 (3.2557) grad_norm 1.9284 (2.1036) [2022-01-24 11:25:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][920/1251] eta 0:12:10 lr 0.000203 time 1.5941 (2.2065) loss 3.5442 (3.2553) grad_norm 1.8877 (2.1018) [2022-01-24 11:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][930/1251] eta 0:11:47 lr 0.000203 time 2.4378 (2.2054) loss 3.7480 (3.2567) grad_norm 1.9153 (2.1011) [2022-01-24 11:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][940/1251] eta 0:11:25 lr 0.000203 time 1.9135 (2.2036) loss 3.2976 (3.2558) grad_norm 1.9955 (2.1003) [2022-01-24 11:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][950/1251] eta 0:11:02 lr 0.000203 time 2.5064 (2.2017) loss 2.3261 (3.2546) grad_norm 1.9199 (2.1001) [2022-01-24 11:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][960/1251] eta 0:10:40 lr 0.000203 time 2.1914 (2.2010) loss 2.8592 (3.2545) grad_norm 2.1314 (2.1002) [2022-01-24 11:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][970/1251] eta 0:10:18 lr 0.000203 time 2.1783 (2.2005) loss 2.8518 (3.2547) grad_norm 2.0089 (2.1010) [2022-01-24 11:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][980/1251] eta 0:09:56 lr 0.000202 time 2.1479 (2.2016) loss 2.2337 (3.2537) grad_norm 1.8193 (2.1014) [2022-01-24 11:28:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][990/1251] eta 0:09:34 lr 0.000202 time 2.3790 (2.2022) loss 2.3210 (3.2540) grad_norm 1.8609 (2.1015) [2022-01-24 11:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1000/1251] eta 0:09:12 lr 0.000202 time 1.9519 (2.2028) loss 3.6915 (3.2544) grad_norm 2.5283 (2.1011) [2022-01-24 11:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1010/1251] eta 0:08:50 lr 0.000202 time 2.4369 (2.2028) loss 3.2272 (3.2556) grad_norm 1.7953 (2.1017) [2022-01-24 11:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1020/1251] eta 0:08:28 lr 
0.000202 time 2.6250 (2.2033) loss 2.2375 (3.2537) grad_norm 2.0195 (2.1020) [2022-01-24 11:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1030/1251] eta 0:08:06 lr 0.000202 time 2.4657 (2.2019) loss 2.9277 (3.2525) grad_norm 2.1718 (2.1015) [2022-01-24 11:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1040/1251] eta 0:07:44 lr 0.000202 time 2.0264 (2.2016) loss 2.9641 (3.2523) grad_norm 2.4401 (2.1011) [2022-01-24 11:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1050/1251] eta 0:07:22 lr 0.000202 time 2.5429 (2.2017) loss 2.8570 (3.2503) grad_norm 1.9764 (2.1000) [2022-01-24 11:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1060/1251] eta 0:07:00 lr 0.000202 time 1.4956 (2.2015) loss 3.7222 (3.2529) grad_norm 2.0955 (2.0991) [2022-01-24 11:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1070/1251] eta 0:06:38 lr 0.000202 time 2.1749 (2.2013) loss 3.1336 (3.2528) grad_norm 2.0293 (2.0983) [2022-01-24 11:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1080/1251] eta 0:06:16 lr 0.000202 time 1.9285 (2.2005) loss 2.3322 (3.2536) grad_norm 1.9555 (2.0987) [2022-01-24 11:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1090/1251] eta 0:05:54 lr 0.000202 time 1.8858 (2.1989) loss 3.4875 (3.2537) grad_norm 2.1312 (2.0978) [2022-01-24 11:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1100/1251] eta 0:05:31 lr 0.000202 time 1.9133 (2.1985) loss 2.6949 (3.2528) grad_norm 2.1274 (2.0974) [2022-01-24 11:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1110/1251] eta 0:05:10 lr 0.000202 time 1.9157 (2.1986) loss 3.2318 (3.2537) grad_norm 1.9520 (2.0971) [2022-01-24 11:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1120/1251] eta 0:04:48 lr 0.000202 time 1.4664 (2.1988) loss 3.5434 (3.2544) grad_norm 1.8905 (2.0971) [2022-01-24 
11:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1130/1251] eta 0:04:26 lr 0.000202 time 1.6880 (2.1996) loss 3.2723 (3.2528) grad_norm 2.0851 (2.0965) [2022-01-24 11:33:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1140/1251] eta 0:04:04 lr 0.000202 time 2.2504 (2.1993) loss 2.5049 (3.2538) grad_norm 1.9478 (2.0960) [2022-01-24 11:33:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1150/1251] eta 0:03:42 lr 0.000202 time 1.9125 (2.1991) loss 3.6959 (3.2544) grad_norm 1.9663 (2.0953) [2022-01-24 11:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1160/1251] eta 0:03:20 lr 0.000202 time 1.8044 (2.1994) loss 3.4860 (3.2552) grad_norm 2.2297 (2.0951) [2022-01-24 11:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1170/1251] eta 0:02:58 lr 0.000202 time 1.8214 (2.1981) loss 3.2746 (3.2526) grad_norm 2.0207 (2.0956) [2022-01-24 11:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1180/1251] eta 0:02:35 lr 0.000202 time 1.9068 (2.1963) loss 3.3278 (3.2526) grad_norm 1.9168 (2.0953) [2022-01-24 11:35:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1190/1251] eta 0:02:13 lr 0.000202 time 1.9107 (2.1956) loss 2.6508 (3.2542) grad_norm 2.0501 (2.0955) [2022-01-24 11:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1200/1251] eta 0:01:51 lr 0.000202 time 2.4942 (2.1953) loss 3.1255 (3.2543) grad_norm 1.8272 (2.0947) [2022-01-24 11:36:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1210/1251] eta 0:01:30 lr 0.000202 time 2.1543 (2.1956) loss 4.1128 (3.2547) grad_norm 2.2594 (2.0943) [2022-01-24 11:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1220/1251] eta 0:01:08 lr 0.000202 time 1.9777 (2.1960) loss 3.6914 (3.2562) grad_norm 2.0490 (2.0941) [2022-01-24 11:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1230/1251] 
eta 0:00:46 lr 0.000202 time 2.1268 (2.1968) loss 2.3055 (3.2555) grad_norm 1.8290 (2.0938) [2022-01-24 11:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1240/1251] eta 0:00:24 lr 0.000202 time 2.2258 (2.1970) loss 3.0579 (3.2542) grad_norm 2.0384 (2.0934) [2022-01-24 11:37:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [212/300][1250/1251] eta 0:00:02 lr 0.000202 time 1.1213 (2.1910) loss 2.7309 (3.2533) grad_norm 1.8407 (2.0923) [2022-01-24 11:37:25 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 212 training takes 0:45:41 [2022-01-24 11:37:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.312 (18.312) Loss 0.8599 (0.8599) Acc@1 80.371 (80.371) Acc@5 94.824 (94.824) [2022-01-24 11:38:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.084 (3.170) Loss 0.8966 (0.9214) Acc@1 79.102 (78.622) Acc@5 95.117 (94.158) [2022-01-24 11:38:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.984 (2.629) Loss 0.7919 (0.9000) Acc@1 80.957 (78.790) Acc@5 96.289 (94.522) [2022-01-24 11:38:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.612 (2.286) Loss 0.8953 (0.9033) Acc@1 78.125 (78.676) Acc@5 94.336 (94.509) [2022-01-24 11:38:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.874 (2.141) Loss 0.8564 (0.8979) Acc@1 78.809 (78.699) Acc@5 94.922 (94.617) [2022-01-24 11:39:00 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.772 Acc@5 94.690 [2022-01-24 11:39:00 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-01-24 11:39:00 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.99% [2022-01-24 11:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][0/1251] eta 7:25:30 lr 0.000202 time 21.3674 (21.3674) loss 4.0257 (4.0257) grad_norm 1.9693 (1.9693) [2022-01-24 11:39:44 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [213/300][10/1251] eta 1:23:30 lr 0.000202 time 2.8040 (4.0374) loss 3.6266 (3.5017) grad_norm 1.7509 (1.9910) [2022-01-24 11:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][20/1251] eta 1:06:34 lr 0.000202 time 2.2534 (3.2449) loss 3.5933 (3.3634) grad_norm 1.6066 (1.9769) [2022-01-24 11:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][30/1251] eta 0:59:09 lr 0.000202 time 1.9334 (2.9073) loss 2.3558 (3.2668) grad_norm 2.0998 (2.0238) [2022-01-24 11:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][40/1251] eta 0:55:25 lr 0.000201 time 2.8307 (2.7458) loss 2.6545 (3.2595) grad_norm 1.8916 (2.0426) [2022-01-24 11:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][50/1251] eta 0:53:13 lr 0.000201 time 2.8394 (2.6588) loss 2.8593 (3.2418) grad_norm 1.8676 (2.0669) [2022-01-24 11:41:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][60/1251] eta 0:51:25 lr 0.000201 time 2.2516 (2.5911) loss 2.4738 (3.2142) grad_norm 2.4219 (2.0665) [2022-01-24 11:42:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][70/1251] eta 0:49:47 lr 0.000201 time 1.8324 (2.5296) loss 3.8636 (3.2290) grad_norm 1.9039 (2.0969) [2022-01-24 11:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][80/1251] eta 0:48:25 lr 0.000201 time 2.2702 (2.4811) loss 3.5954 (3.2160) grad_norm 2.0054 (2.0905) [2022-01-24 11:42:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][90/1251] eta 0:47:21 lr 0.000201 time 2.5108 (2.4473) loss 3.5384 (3.2158) grad_norm 2.1599 (2.0966) [2022-01-24 11:43:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][100/1251] eta 0:46:12 lr 0.000201 time 1.6127 (2.4088) loss 3.7901 (3.2305) grad_norm 2.0824 (2.0926) [2022-01-24 11:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][110/1251] eta 0:45:27 lr 0.000201 time 2.2752 (2.3905) loss 3.5236 (3.2368) grad_norm 
2.6630 (2.0910) [2022-01-24 11:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][120/1251] eta 0:44:48 lr 0.000201 time 2.2734 (2.3770) loss 4.1097 (3.2563) grad_norm 2.0655 (2.0935) [2022-01-24 11:44:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][130/1251] eta 0:44:07 lr 0.000201 time 1.8303 (2.3621) loss 3.1261 (3.2664) grad_norm 2.2770 (2.0972) [2022-01-24 11:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][140/1251] eta 0:43:29 lr 0.000201 time 1.5053 (2.3492) loss 2.4570 (3.2609) grad_norm 2.2137 (2.0940) [2022-01-24 11:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][150/1251] eta 0:43:00 lr 0.000201 time 2.1336 (2.3440) loss 2.6829 (3.2636) grad_norm 1.8256 (2.1176) [2022-01-24 11:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][160/1251] eta 0:42:37 lr 0.000201 time 2.8827 (2.3441) loss 2.7422 (3.2625) grad_norm 2.4474 (2.1193) [2022-01-24 11:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][170/1251] eta 0:42:05 lr 0.000201 time 2.1807 (2.3362) loss 3.8141 (3.2471) grad_norm 2.4293 (2.1151) [2022-01-24 11:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][180/1251] eta 0:41:23 lr 0.000201 time 1.8868 (2.3191) loss 3.3322 (3.2520) grad_norm 2.4364 (2.1150) [2022-01-24 11:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][190/1251] eta 0:40:33 lr 0.000201 time 1.8270 (2.2933) loss 3.1304 (3.2618) grad_norm 1.8315 (2.1214) [2022-01-24 11:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][200/1251] eta 0:39:53 lr 0.000201 time 1.6234 (2.2776) loss 3.9246 (3.2689) grad_norm 2.2104 (2.1169) [2022-01-24 11:46:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][210/1251] eta 0:39:21 lr 0.000201 time 2.2855 (2.2689) loss 2.1884 (3.2631) grad_norm 2.2248 (2.1186) [2022-01-24 11:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[213/300][220/1251] eta 0:38:55 lr 0.000201 time 2.5363 (2.2653) loss 2.8294 (3.2605) grad_norm 1.9513 (2.1147) [2022-01-24 11:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][230/1251] eta 0:38:35 lr 0.000201 time 2.8205 (2.2684) loss 3.6800 (3.2596) grad_norm 2.4890 (2.1144) [2022-01-24 11:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][240/1251] eta 0:38:12 lr 0.000201 time 1.5833 (2.2671) loss 3.6764 (3.2745) grad_norm 1.9943 (2.1210) [2022-01-24 11:48:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][250/1251] eta 0:37:54 lr 0.000201 time 1.8380 (2.2718) loss 3.6781 (3.2687) grad_norm 2.0059 (2.1222) [2022-01-24 11:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][260/1251] eta 0:37:34 lr 0.000201 time 1.7547 (2.2749) loss 3.3613 (3.2825) grad_norm 2.1863 (2.1208) [2022-01-24 11:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][270/1251] eta 0:37:11 lr 0.000201 time 2.5125 (2.2747) loss 3.4796 (3.2819) grad_norm 1.9020 (2.1208) [2022-01-24 11:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][280/1251] eta 0:36:41 lr 0.000201 time 1.6668 (2.2672) loss 3.6492 (3.2786) grad_norm 1.8462 (2.1214) [2022-01-24 11:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][290/1251] eta 0:36:11 lr 0.000201 time 1.9366 (2.2601) loss 3.3084 (3.2841) grad_norm 2.1012 (2.1196) [2022-01-24 11:50:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][300/1251] eta 0:35:44 lr 0.000201 time 2.2324 (2.2548) loss 3.6876 (3.2840) grad_norm 2.0664 (2.1144) [2022-01-24 11:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][310/1251] eta 0:35:22 lr 0.000201 time 2.6409 (2.2561) loss 3.5298 (3.2855) grad_norm 1.9078 (2.1097) [2022-01-24 11:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][320/1251] eta 0:35:00 lr 0.000201 time 1.5132 (2.2558) loss 3.6972 (3.2815) grad_norm 
1.8012 (2.1064) [2022-01-24 11:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][330/1251] eta 0:34:37 lr 0.000201 time 1.8472 (2.2555) loss 2.7664 (3.2798) grad_norm 2.2128 (2.1036) [2022-01-24 11:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][340/1251] eta 0:34:14 lr 0.000200 time 2.5353 (2.2548) loss 3.5573 (3.2694) grad_norm 1.8728 (2.1003) [2022-01-24 11:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][350/1251] eta 0:33:48 lr 0.000200 time 2.1463 (2.2518) loss 3.2110 (3.2680) grad_norm 1.8744 (2.0980) [2022-01-24 11:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][360/1251] eta 0:33:20 lr 0.000200 time 1.5400 (2.2450) loss 3.7806 (3.2667) grad_norm 2.1022 (2.0965) [2022-01-24 11:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][370/1251] eta 0:32:52 lr 0.000200 time 2.0750 (2.2392) loss 3.0035 (3.2679) grad_norm 2.0663 (2.0966) [2022-01-24 11:53:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][380/1251] eta 0:32:32 lr 0.000200 time 2.3888 (2.2422) loss 3.5675 (3.2667) grad_norm 1.8333 (2.0953) [2022-01-24 11:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][390/1251] eta 0:32:13 lr 0.000200 time 2.8031 (2.2457) loss 3.5599 (3.2682) grad_norm 2.0631 (2.0968) [2022-01-24 11:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][400/1251] eta 0:31:55 lr 0.000200 time 1.7922 (2.2504) loss 2.4312 (3.2689) grad_norm 2.1342 (2.0978) [2022-01-24 11:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][410/1251] eta 0:31:29 lr 0.000200 time 2.2000 (2.2473) loss 3.2364 (3.2716) grad_norm 2.6933 (2.0979) [2022-01-24 11:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][420/1251] eta 0:31:00 lr 0.000200 time 1.8899 (2.2390) loss 2.6606 (3.2697) grad_norm 1.9661 (2.0957) [2022-01-24 11:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[213/300][430/1251] eta 0:30:32 lr 0.000200 time 2.1919 (2.2325) loss 3.2031 (3.2705) grad_norm 1.9157 (2.0933) [2022-01-24 11:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][440/1251] eta 0:30:07 lr 0.000200 time 1.7489 (2.2284) loss 3.4239 (3.2715) grad_norm 2.8851 (2.0989) [2022-01-24 11:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][450/1251] eta 0:29:44 lr 0.000200 time 2.7663 (2.2279) loss 2.2512 (3.2664) grad_norm 1.9682 (2.1012) [2022-01-24 11:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][460/1251] eta 0:29:22 lr 0.000200 time 2.4979 (2.2283) loss 2.8681 (3.2627) grad_norm 2.3054 (2.1027) [2022-01-24 11:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][470/1251] eta 0:28:58 lr 0.000200 time 1.8277 (2.2260) loss 3.3917 (3.2566) grad_norm 2.2046 (2.1028) [2022-01-24 11:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][480/1251] eta 0:28:36 lr 0.000200 time 2.2610 (2.2260) loss 3.7033 (3.2517) grad_norm 2.0345 (2.1022) [2022-01-24 11:57:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][490/1251] eta 0:28:13 lr 0.000200 time 2.2201 (2.2259) loss 3.6427 (3.2502) grad_norm 2.0213 (2.1031) [2022-01-24 11:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][500/1251] eta 0:27:55 lr 0.000200 time 3.1121 (2.2307) loss 2.1223 (3.2464) grad_norm 1.9207 (2.1041) [2022-01-24 11:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][510/1251] eta 0:27:33 lr 0.000200 time 1.5733 (2.2308) loss 3.8046 (3.2471) grad_norm 2.0787 (2.1048) [2022-01-24 11:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][520/1251] eta 0:27:10 lr 0.000200 time 2.2267 (2.2310) loss 3.5418 (3.2472) grad_norm 2.0258 (2.1064) [2022-01-24 11:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][530/1251] eta 0:26:45 lr 0.000200 time 1.6012 (2.2272) loss 3.7068 (3.2475) grad_norm 
1.9876 (2.1086) [2022-01-24 11:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][540/1251] eta 0:26:20 lr 0.000200 time 2.4461 (2.2225) loss 3.3557 (3.2468) grad_norm 2.2411 (2.1111) [2022-01-24 11:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][550/1251] eta 0:25:56 lr 0.000200 time 1.8582 (2.2209) loss 3.8374 (3.2462) grad_norm 2.2849 (2.1126) [2022-01-24 11:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][560/1251] eta 0:25:33 lr 0.000200 time 1.6213 (2.2198) loss 2.9072 (3.2463) grad_norm 2.3518 (2.1147) [2022-01-24 12:00:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][570/1251] eta 0:25:10 lr 0.000200 time 1.6047 (2.2184) loss 3.6724 (3.2513) grad_norm 2.0481 (2.1155) [2022-01-24 12:00:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][580/1251] eta 0:24:52 lr 0.000200 time 3.3984 (2.2236) loss 3.6871 (3.2536) grad_norm 1.9786 (2.1147) [2022-01-24 12:00:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][590/1251] eta 0:24:30 lr 0.000200 time 2.1741 (2.2245) loss 3.6214 (3.2545) grad_norm 1.8002 (2.1152) [2022-01-24 12:01:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][600/1251] eta 0:24:09 lr 0.000200 time 2.0330 (2.2260) loss 3.6288 (3.2550) grad_norm 2.2983 (2.1166) [2022-01-24 12:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][610/1251] eta 0:23:46 lr 0.000200 time 1.9680 (2.2248) loss 3.7401 (3.2592) grad_norm 2.0200 (2.1161) [2022-01-24 12:02:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][620/1251] eta 0:23:23 lr 0.000200 time 2.5225 (2.2236) loss 3.5382 (3.2563) grad_norm 2.2317 (2.1148) [2022-01-24 12:02:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][630/1251] eta 0:22:59 lr 0.000200 time 1.9901 (2.2208) loss 2.4526 (3.2569) grad_norm 2.4854 (2.1161) [2022-01-24 12:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[213/300][640/1251] eta 0:22:36 lr 0.000200 time 2.2902 (2.2208) loss 3.7695 (3.2586) grad_norm 2.0520 (2.1143) [2022-01-24 12:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][650/1251] eta 0:22:13 lr 0.000199 time 1.5783 (2.2190) loss 3.2016 (3.2585) grad_norm 1.9891 (2.1112) [2022-01-24 12:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][660/1251] eta 0:21:53 lr 0.000199 time 2.2149 (2.2225) loss 3.7942 (3.2601) grad_norm 2.3381 (2.1108) [2022-01-24 12:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][670/1251] eta 0:21:30 lr 0.000199 time 2.0209 (2.2211) loss 3.9162 (3.2648) grad_norm 2.6178 (2.1099) [2022-01-24 12:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][680/1251] eta 0:21:07 lr 0.000199 time 2.1714 (2.2198) loss 3.5726 (3.2649) grad_norm 1.8724 (2.1088) [2022-01-24 12:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][690/1251] eta 0:20:43 lr 0.000199 time 1.9351 (2.2170) loss 3.6808 (3.2675) grad_norm 1.7562 (2.1075) [2022-01-24 12:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][700/1251] eta 0:20:20 lr 0.000199 time 1.9371 (2.2157) loss 2.3523 (3.2659) grad_norm 2.0338 (2.1059) [2022-01-24 12:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][710/1251] eta 0:19:58 lr 0.000199 time 2.2765 (2.2151) loss 2.3187 (3.2676) grad_norm 2.1756 (2.1044) [2022-01-24 12:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][720/1251] eta 0:19:35 lr 0.000199 time 2.4904 (2.2141) loss 3.0583 (3.2619) grad_norm 2.1163 (2.1057) [2022-01-24 12:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][730/1251] eta 0:19:13 lr 0.000199 time 2.0795 (2.2140) loss 3.4279 (3.2601) grad_norm 1.9672 (2.1061) [2022-01-24 12:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][740/1251] eta 0:18:50 lr 0.000199 time 1.7275 (2.2120) loss 2.1576 (3.2604) grad_norm 
2.0711 (2.1067) [2022-01-24 12:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][750/1251] eta 0:18:28 lr 0.000199 time 2.4641 (2.2126) loss 3.7916 (3.2630) grad_norm 2.2076 (2.1069) [2022-01-24 12:07:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][760/1251] eta 0:18:05 lr 0.000199 time 1.6926 (2.2114) loss 3.4751 (3.2628) grad_norm 2.7298 (2.1069) [2022-01-24 12:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][770/1251] eta 0:17:43 lr 0.000199 time 2.8172 (2.2103) loss 3.0142 (3.2624) grad_norm 2.1899 (2.1072) [2022-01-24 12:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][780/1251] eta 0:17:20 lr 0.000199 time 1.6614 (2.2098) loss 3.4921 (3.2568) grad_norm 1.8710 (2.1076) [2022-01-24 12:08:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][790/1251] eta 0:17:00 lr 0.000199 time 2.4475 (2.2137) loss 2.6022 (3.2576) grad_norm 2.1330 (2.1073) [2022-01-24 12:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][800/1251] eta 0:16:37 lr 0.000199 time 2.2972 (2.2128) loss 3.6859 (3.2590) grad_norm 2.1583 (2.1079) [2022-01-24 12:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][810/1251] eta 0:16:16 lr 0.000199 time 2.4634 (2.2135) loss 2.0900 (3.2579) grad_norm 1.8993 (2.1092) [2022-01-24 12:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][820/1251] eta 0:15:52 lr 0.000199 time 1.6330 (2.2103) loss 3.5895 (3.2613) grad_norm 2.0204 (2.1101) [2022-01-24 12:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][830/1251] eta 0:15:29 lr 0.000199 time 2.0704 (2.2084) loss 3.1334 (3.2624) grad_norm 2.2321 (2.1096) [2022-01-24 12:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][840/1251] eta 0:15:07 lr 0.000199 time 1.8497 (2.2072) loss 3.6518 (3.2629) grad_norm 2.0131 (2.1093) [2022-01-24 12:10:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[213/300][850/1251] eta 0:14:45 lr 0.000199 time 2.6656 (2.2086) loss 3.2464 (3.2620) grad_norm 2.0493 (2.1092) [2022-01-24 12:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][860/1251] eta 0:14:23 lr 0.000199 time 2.1822 (2.2084) loss 3.0275 (3.2603) grad_norm 2.3506 (2.1089) [2022-01-24 12:11:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][870/1251] eta 0:14:01 lr 0.000199 time 2.2638 (2.2074) loss 3.4964 (3.2569) grad_norm 2.0733 (2.1088) [2022-01-24 12:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][880/1251] eta 0:13:38 lr 0.000199 time 2.2531 (2.2065) loss 3.6139 (3.2568) grad_norm 2.1281 (2.1092) [2022-01-24 12:11:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][890/1251] eta 0:13:16 lr 0.000199 time 2.5942 (2.2075) loss 3.6527 (3.2575) grad_norm 1.7478 (2.1086) [2022-01-24 12:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][900/1251] eta 0:12:54 lr 0.000199 time 1.8808 (2.2069) loss 3.7978 (3.2570) grad_norm 2.3248 (2.1085) [2022-01-24 12:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][910/1251] eta 0:12:32 lr 0.000199 time 1.7464 (2.2071) loss 2.5734 (3.2563) grad_norm 1.8987 (2.1076) [2022-01-24 12:12:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][920/1251] eta 0:12:10 lr 0.000199 time 2.1905 (2.2070) loss 2.1956 (3.2566) grad_norm 1.9947 (2.1065) [2022-01-24 12:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][930/1251] eta 0:11:48 lr 0.000199 time 2.2818 (2.2065) loss 3.0752 (3.2583) grad_norm 2.5309 (2.1062) [2022-01-24 12:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][940/1251] eta 0:11:25 lr 0.000199 time 2.0257 (2.2043) loss 3.4397 (3.2576) grad_norm 2.5285 (2.1082) [2022-01-24 12:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][950/1251] eta 0:11:03 lr 0.000199 time 1.9533 (2.2031) loss 3.2872 (3.2584) grad_norm 
2.2995 (2.1088) [2022-01-24 12:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][960/1251] eta 0:10:41 lr 0.000198 time 2.7444 (2.2032) loss 3.2731 (3.2568) grad_norm 2.1395 (2.1083) [2022-01-24 12:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][970/1251] eta 0:10:19 lr 0.000198 time 2.7350 (2.2042) loss 3.3858 (3.2566) grad_norm 1.9676 (2.1073) [2022-01-24 12:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][980/1251] eta 0:09:57 lr 0.000198 time 2.7973 (2.2051) loss 4.0565 (3.2543) grad_norm 2.4930 (2.1068) [2022-01-24 12:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][990/1251] eta 0:09:35 lr 0.000198 time 1.5965 (2.2044) loss 3.5580 (3.2549) grad_norm 2.4364 (2.1071) [2022-01-24 12:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1000/1251] eta 0:09:13 lr 0.000198 time 2.4270 (2.2050) loss 3.0369 (3.2535) grad_norm 2.2503 (2.1063) [2022-01-24 12:16:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1010/1251] eta 0:08:51 lr 0.000198 time 3.0440 (2.2053) loss 3.5165 (3.2555) grad_norm 2.1305 (2.1062) [2022-01-24 12:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1020/1251] eta 0:08:29 lr 0.000198 time 2.5629 (2.2044) loss 3.0350 (3.2559) grad_norm 2.0437 (2.1074) [2022-01-24 12:16:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1030/1251] eta 0:08:07 lr 0.000198 time 1.8201 (2.2048) loss 2.4245 (3.2537) grad_norm 2.0686 (2.1071) [2022-01-24 12:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1040/1251] eta 0:07:45 lr 0.000198 time 1.6690 (2.2051) loss 2.8250 (3.2558) grad_norm 3.0366 (2.1086) [2022-01-24 12:17:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1050/1251] eta 0:07:23 lr 0.000198 time 4.2857 (2.2052) loss 3.6415 (3.2570) grad_norm 2.0502 (2.1094) [2022-01-24 12:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[213/300][1060/1251] eta 0:07:00 lr 0.000198 time 1.5681 (2.2037) loss 3.6264 (3.2567) grad_norm 2.1630 (2.1094) [2022-01-24 12:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1070/1251] eta 0:06:38 lr 0.000198 time 2.2651 (2.2037) loss 3.6511 (3.2584) grad_norm 2.2263 (2.1078) [2022-01-24 12:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1080/1251] eta 0:06:16 lr 0.000198 time 2.1497 (2.2024) loss 3.5853 (3.2592) grad_norm 2.0820 (2.1070) [2022-01-24 12:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1090/1251] eta 0:05:54 lr 0.000198 time 3.2629 (2.2027) loss 3.0856 (3.2608) grad_norm 1.9740 (2.1062) [2022-01-24 12:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1100/1251] eta 0:05:32 lr 0.000198 time 1.7914 (2.2028) loss 3.8161 (3.2619) grad_norm 2.3828 (2.1058) [2022-01-24 12:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1110/1251] eta 0:05:10 lr 0.000198 time 2.1856 (2.2011) loss 3.7079 (3.2615) grad_norm 2.1474 (2.1053) [2022-01-24 12:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1120/1251] eta 0:04:48 lr 0.000198 time 2.1535 (2.2005) loss 3.7607 (3.2598) grad_norm 2.0761 (2.1051) [2022-01-24 12:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1130/1251] eta 0:04:26 lr 0.000198 time 3.5134 (2.2017) loss 2.5458 (3.2598) grad_norm 2.3014 (2.1058) [2022-01-24 12:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1140/1251] eta 0:04:04 lr 0.000198 time 2.6045 (2.2021) loss 3.1593 (3.2583) grad_norm 2.1136 (2.1054) [2022-01-24 12:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1150/1251] eta 0:03:42 lr 0.000198 time 1.8893 (2.2018) loss 2.9038 (3.2590) grad_norm 1.8923 (2.1049) [2022-01-24 12:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1160/1251] eta 0:03:20 lr 0.000198 time 1.6453 (2.2005) loss 3.1220 (3.2592) 
grad_norm 2.3304 (2.1054) [2022-01-24 12:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1170/1251] eta 0:02:58 lr 0.000198 time 2.5268 (2.2000) loss 3.6199 (3.2565) grad_norm 2.1953 (2.1042) [2022-01-24 12:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1180/1251] eta 0:02:36 lr 0.000198 time 1.9573 (2.1998) loss 3.5464 (3.2588) grad_norm 2.1419 (2.1040) [2022-01-24 12:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1190/1251] eta 0:02:14 lr 0.000198 time 2.2718 (2.1990) loss 3.6723 (3.2585) grad_norm 1.8676 (2.1046) [2022-01-24 12:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1200/1251] eta 0:01:52 lr 0.000198 time 2.1559 (2.1982) loss 3.9697 (3.2591) grad_norm 1.8410 (2.1038) [2022-01-24 12:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1210/1251] eta 0:01:30 lr 0.000198 time 3.1946 (2.1992) loss 3.6241 (3.2608) grad_norm 1.7386 (2.1020) [2022-01-24 12:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1220/1251] eta 0:01:08 lr 0.000198 time 1.8093 (2.1983) loss 3.2412 (3.2610) grad_norm 2.0727 (2.1015) [2022-01-24 12:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1230/1251] eta 0:00:46 lr 0.000198 time 2.9823 (2.1997) loss 3.7410 (3.2629) grad_norm 1.9630 (2.1005) [2022-01-24 12:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1240/1251] eta 0:00:24 lr 0.000198 time 1.4185 (2.1989) loss 4.0193 (3.2643) grad_norm 2.3757 (2.1005) [2022-01-24 12:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [213/300][1250/1251] eta 0:00:02 lr 0.000198 time 1.1831 (2.1934) loss 2.3688 (3.2638) grad_norm 2.0285 (2.1011) [2022-01-24 12:24:44 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 213 training takes 0:45:44 [2022-01-24 12:25:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.747 (18.747) Loss 0.8982 (0.8982) Acc@1 76.855 (76.855) 
Acc@5 95.020 (95.020) [2022-01-24 12:25:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.628 (3.504) Loss 0.8635 (0.8759) Acc@1 78.613 (79.119) Acc@5 95.996 (94.949) [2022-01-24 12:25:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.603 (2.635) Loss 0.8253 (0.8838) Acc@1 79.883 (78.869) Acc@5 95.898 (94.820) [2022-01-24 12:25:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.595 (2.320) Loss 0.8048 (0.8913) Acc@1 80.957 (78.821) Acc@5 95.801 (94.698) [2022-01-24 12:26:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.588 (2.150) Loss 0.8524 (0.8940) Acc@1 80.273 (78.821) Acc@5 94.336 (94.681) [2022-01-24 12:26:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 78.802 Acc@5 94.664 [2022-01-24 12:26:20 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 78.8% [2022-01-24 12:26:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 78.99% [2022-01-24 12:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][0/1251] eta 7:21:54 lr 0.000198 time 21.1950 (21.1950) loss 3.4745 (3.4745) grad_norm 2.0076 (2.0076) [2022-01-24 12:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][10/1251] eta 1:26:40 lr 0.000197 time 2.2411 (4.1904) loss 3.0502 (3.4342) grad_norm 2.0079 (2.0585) [2022-01-24 12:27:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][20/1251] eta 1:05:49 lr 0.000197 time 1.5052 (3.2087) loss 3.4281 (3.3620) grad_norm 2.4422 (2.1285) [2022-01-24 12:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][30/1251] eta 0:58:55 lr 0.000197 time 1.3733 (2.8954) loss 3.7898 (3.3809) grad_norm 2.2064 (2.1461) [2022-01-24 12:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][40/1251] eta 0:55:42 lr 0.000197 time 3.2931 (2.7600) loss 3.2344 (3.3122) grad_norm 2.1856 (2.1194) [2022-01-24 12:28:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][50/1251] eta 0:54:16 lr 0.000197 time 3.3103 (2.7111) loss 3.1805 (3.2844) grad_norm 2.1631 (2.1243) [2022-01-24 12:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][60/1251] eta 0:52:35 lr 0.000197 time 1.7275 (2.6495) loss 3.1436 (3.2849) grad_norm 2.0887 (2.1066) [2022-01-24 12:29:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][70/1251] eta 0:50:49 lr 0.000197 time 1.8924 (2.5819) loss 3.4628 (3.2648) grad_norm 1.9787 (2.0915) [2022-01-24 12:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][80/1251] eta 0:49:02 lr 0.000197 time 2.4977 (2.5127) loss 2.6069 (3.2621) grad_norm 2.2157 (2.1043) [2022-01-24 12:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][90/1251] eta 0:47:25 lr 0.000197 time 2.0112 (2.4505) loss 3.2206 (3.2795) grad_norm 2.2706 (2.0953) [2022-01-24 12:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][100/1251] eta 0:46:03 lr 0.000197 time 2.2753 (2.4014) loss 3.9181 (3.2740) grad_norm 2.2308 (2.0998) [2022-01-24 12:30:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][110/1251] eta 0:45:18 lr 0.000197 time 2.2464 (2.3823) loss 3.1360 (3.2713) grad_norm 1.9799 (2.1064) [2022-01-24 12:31:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][120/1251] eta 0:44:38 lr 0.000197 time 1.9689 (2.3680) loss 4.0574 (3.2994) grad_norm 2.5664 (2.1049) [2022-01-24 12:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][130/1251] eta 0:44:04 lr 0.000197 time 1.8462 (2.3589) loss 2.0883 (3.2811) grad_norm 1.9918 (2.1058) [2022-01-24 12:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][140/1251] eta 0:43:36 lr 0.000197 time 2.5447 (2.3549) loss 3.5179 (3.2851) grad_norm 1.9246 (2.0991) [2022-01-24 12:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][150/1251] eta 0:43:02 lr 0.000197 
time 2.2367 (2.3452) loss 3.3863 (3.2699) grad_norm 1.8378 (2.0929) [2022-01-24 12:32:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][160/1251] eta 0:42:32 lr 0.000197 time 1.9701 (2.3391) loss 3.1824 (3.2744) grad_norm 1.5871 (2.0834) [2022-01-24 12:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][170/1251] eta 0:42:09 lr 0.000197 time 2.0672 (2.3400) loss 3.4858 (3.2762) grad_norm 2.0608 (2.0793) [2022-01-24 12:33:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][180/1251] eta 0:41:34 lr 0.000197 time 2.5185 (2.3290) loss 3.3664 (3.2636) grad_norm 2.3608 (2.0818) [2022-01-24 12:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][190/1251] eta 0:40:54 lr 0.000197 time 2.0472 (2.3130) loss 2.9828 (3.2565) grad_norm 1.8504 (2.0816) [2022-01-24 12:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][200/1251] eta 0:40:16 lr 0.000197 time 2.0484 (2.2996) loss 3.3733 (3.2629) grad_norm 2.0357 (2.0867) [2022-01-24 12:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][210/1251] eta 0:39:49 lr 0.000197 time 2.2139 (2.2950) loss 2.9427 (3.2699) grad_norm 2.0679 (2.0850) [2022-01-24 12:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][220/1251] eta 0:39:18 lr 0.000197 time 2.0457 (2.2874) loss 3.8900 (3.2738) grad_norm 2.0717 (2.0871) [2022-01-24 12:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][230/1251] eta 0:38:48 lr 0.000197 time 2.1920 (2.2806) loss 3.2908 (3.2657) grad_norm 1.8997 (2.0929) [2022-01-24 12:35:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][240/1251] eta 0:38:19 lr 0.000197 time 1.6803 (2.2746) loss 3.4089 (3.2657) grad_norm 1.9558 (2.0961) [2022-01-24 12:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][250/1251] eta 0:37:51 lr 0.000197 time 1.8935 (2.2690) loss 2.2715 (3.2648) grad_norm 2.1781 (2.0940) [2022-01-24 12:36:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][260/1251] eta 0:37:24 lr 0.000197 time 2.2134 (2.2647) loss 3.4084 (3.2746) grad_norm 1.9907 (2.0926) [2022-01-24 12:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][270/1251] eta 0:36:56 lr 0.000197 time 2.1276 (2.2591) loss 2.4295 (3.2761) grad_norm 1.8084 (2.0986) [2022-01-24 12:36:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][280/1251] eta 0:36:33 lr 0.000197 time 2.1830 (2.2589) loss 3.8509 (3.2763) grad_norm 1.9304 (2.0967) [2022-01-24 12:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][290/1251] eta 0:36:15 lr 0.000197 time 2.1971 (2.2640) loss 2.8886 (3.2842) grad_norm 2.0727 (2.0953) [2022-01-24 12:37:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][300/1251] eta 0:35:57 lr 0.000197 time 3.1605 (2.2691) loss 3.3285 (3.2811) grad_norm 2.1641 (2.0948) [2022-01-24 12:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][310/1251] eta 0:35:29 lr 0.000197 time 2.0469 (2.2630) loss 3.9320 (3.2725) grad_norm 2.0494 (2.0915) [2022-01-24 12:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][320/1251] eta 0:34:58 lr 0.000196 time 1.7314 (2.2543) loss 2.3054 (3.2707) grad_norm 2.0414 (2.0926) [2022-01-24 12:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][330/1251] eta 0:34:26 lr 0.000196 time 1.9711 (2.2440) loss 3.4492 (3.2759) grad_norm 2.1495 (2.0930) [2022-01-24 12:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][340/1251] eta 0:33:58 lr 0.000196 time 2.7253 (2.2375) loss 2.9309 (3.2662) grad_norm 2.0176 (2.0909) [2022-01-24 12:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][350/1251] eta 0:33:33 lr 0.000196 time 2.2349 (2.2348) loss 3.4712 (3.2546) grad_norm 2.1265 (2.0872) [2022-01-24 12:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][360/1251] eta 0:33:10 lr 
0.000196 time 2.3025 (2.2342) loss 3.1031 (3.2612) grad_norm 1.8520 (2.0846) [2022-01-24 12:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][370/1251] eta 0:32:45 lr 0.000196 time 1.7998 (2.2309) loss 3.1708 (3.2528) grad_norm 2.3077 (2.0899) [2022-01-24 12:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][380/1251] eta 0:32:25 lr 0.000196 time 2.8178 (2.2333) loss 2.3813 (3.2462) grad_norm 2.2046 (2.0902) [2022-01-24 12:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][390/1251] eta 0:32:02 lr 0.000196 time 1.7813 (2.2325) loss 3.8084 (3.2577) grad_norm 2.1405 (2.0936) [2022-01-24 12:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][400/1251] eta 0:31:40 lr 0.000196 time 2.8373 (2.2336) loss 3.7632 (3.2635) grad_norm 2.0321 (2.0936) [2022-01-24 12:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][410/1251] eta 0:31:16 lr 0.000196 time 1.8592 (2.2309) loss 2.4089 (3.2635) grad_norm 2.2927 (2.0950) [2022-01-24 12:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][420/1251] eta 0:30:55 lr 0.000196 time 3.1622 (2.2334) loss 3.8610 (3.2703) grad_norm 2.2671 (2.0978) [2022-01-24 12:42:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][430/1251] eta 0:30:32 lr 0.000196 time 2.1049 (2.2318) loss 3.5543 (3.2675) grad_norm 2.0697 (2.0995) [2022-01-24 12:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][440/1251] eta 0:30:11 lr 0.000196 time 2.5697 (2.2337) loss 3.3723 (3.2693) grad_norm 2.4586 (2.1009) [2022-01-24 12:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][450/1251] eta 0:29:49 lr 0.000196 time 1.5865 (2.2340) loss 3.2231 (3.2663) grad_norm 1.9027 (2.1031) [2022-01-24 12:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][460/1251] eta 0:29:28 lr 0.000196 time 3.5539 (2.2357) loss 3.8590 (3.2665) grad_norm 2.1283 (2.1037) [2022-01-24 12:43:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][470/1251] eta 0:29:03 lr 0.000196 time 2.8554 (2.2329) loss 3.5429 (3.2658) grad_norm 1.9312 (2.1014) [2022-01-24 12:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][480/1251] eta 0:28:37 lr 0.000196 time 2.1383 (2.2278) loss 3.7012 (3.2647) grad_norm 2.2458 (2.1039) [2022-01-24 12:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][490/1251] eta 0:28:15 lr 0.000196 time 2.7665 (2.2284) loss 3.6074 (3.2657) grad_norm 2.1052 (2.1057) [2022-01-24 12:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][500/1251] eta 0:27:53 lr 0.000196 time 1.9693 (2.2280) loss 3.0684 (3.2652) grad_norm 2.2992 (2.1085) [2022-01-24 12:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][510/1251] eta 0:27:28 lr 0.000196 time 2.1613 (2.2250) loss 4.0787 (3.2667) grad_norm 2.1382 (2.1072) [2022-01-24 12:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][520/1251] eta 0:27:08 lr 0.000196 time 2.5109 (2.2278) loss 3.4998 (3.2730) grad_norm 2.9951 (2.1101) [2022-01-24 12:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][530/1251] eta 0:26:46 lr 0.000196 time 2.7307 (2.2277) loss 3.0804 (3.2685) grad_norm 2.0034 (2.1092) [2022-01-24 12:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][540/1251] eta 0:26:23 lr 0.000196 time 1.9218 (2.2271) loss 3.4790 (3.2710) grad_norm 1.8975 (2.1115) [2022-01-24 12:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][550/1251] eta 0:25:58 lr 0.000196 time 1.9793 (2.2230) loss 3.8073 (3.2726) grad_norm 2.1459 (2.1117) [2022-01-24 12:47:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][560/1251] eta 0:25:34 lr 0.000196 time 1.8615 (2.2200) loss 3.4314 (3.2771) grad_norm 2.2610 (2.1134) [2022-01-24 12:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][570/1251] eta 0:25:10 lr 
0.000196 time 2.9572 (2.2187) loss 3.5300 (3.2770) grad_norm 2.8489 (2.1150) [2022-01-24 12:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][580/1251] eta 0:24:46 lr 0.000196 time 1.5990 (2.2157) loss 3.5595 (3.2754) grad_norm 1.9266 (2.1141) [2022-01-24 12:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][590/1251] eta 0:24:26 lr 0.000196 time 2.2632 (2.2191) loss 2.7790 (3.2783) grad_norm 2.1506 (2.1133) [2022-01-24 12:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][600/1251] eta 0:24:06 lr 0.000196 time 2.4072 (2.2214) loss 3.4863 (3.2804) grad_norm 2.4058 (2.1123) [2022-01-24 12:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][610/1251] eta 0:23:45 lr 0.000196 time 3.6329 (2.2238) loss 3.5251 (3.2818) grad_norm 1.8309 (2.1125) [2022-01-24 12:49:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][620/1251] eta 0:23:22 lr 0.000196 time 2.0509 (2.2227) loss 2.6886 (3.2801) grad_norm 1.9360 (2.1120) [2022-01-24 12:49:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][630/1251] eta 0:22:58 lr 0.000195 time 1.6230 (2.2199) loss 2.9179 (3.2754) grad_norm 2.3247 (2.1149) [2022-01-24 12:50:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][640/1251] eta 0:22:34 lr 0.000195 time 2.0635 (2.2172) loss 3.9106 (3.2773) grad_norm 2.1791 (2.1161) [2022-01-24 12:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][650/1251] eta 0:22:13 lr 0.000195 time 3.7090 (2.2195) loss 3.4365 (3.2766) grad_norm 1.9204 (2.1152) [2022-01-24 12:50:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][660/1251] eta 0:21:52 lr 0.000195 time 2.2147 (2.2200) loss 2.8752 (3.2729) grad_norm 1.9339 (2.1145) [2022-01-24 12:51:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][670/1251] eta 0:21:29 lr 0.000195 time 1.6235 (2.2195) loss 3.1418 (3.2738) grad_norm 2.0106 (2.1153) [2022-01-24 12:51:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][680/1251] eta 0:21:07 lr 0.000195 time 2.3692 (2.2192) loss 3.3862 (3.2709) grad_norm 2.2887 (2.1171) [2022-01-24 12:51:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][690/1251] eta 0:20:43 lr 0.000195 time 3.0876 (2.2169) loss 2.7502 (3.2683) grad_norm 2.2272 (2.1169) [2022-01-24 12:52:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][700/1251] eta 0:20:20 lr 0.000195 time 1.6383 (2.2145) loss 4.1345 (3.2685) grad_norm 2.2483 (2.1162) [2022-01-24 12:52:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][710/1251] eta 0:19:57 lr 0.000195 time 1.8716 (2.2127) loss 3.3899 (3.2699) grad_norm 2.1042 (2.1165) [2022-01-24 12:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][720/1251] eta 0:19:34 lr 0.000195 time 2.5806 (2.2121) loss 3.3854 (3.2712) grad_norm 2.1026 (2.1173) [2022-01-24 12:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][730/1251] eta 0:19:12 lr 0.000195 time 1.8374 (2.2112) loss 2.6127 (3.2700) grad_norm 2.3860 (2.1180) [2022-01-24 12:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][740/1251] eta 0:18:49 lr 0.000195 time 1.8393 (2.2105) loss 3.9638 (3.2720) grad_norm 2.0489 (2.1181) [2022-01-24 12:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][750/1251] eta 0:18:27 lr 0.000195 time 1.8825 (2.2099) loss 2.7340 (3.2708) grad_norm 1.9442 (2.1192) [2022-01-24 12:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][760/1251] eta 0:18:05 lr 0.000195 time 2.1954 (2.2102) loss 3.3390 (3.2690) grad_norm 1.9914 (2.1197) [2022-01-24 12:54:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][770/1251] eta 0:17:43 lr 0.000195 time 2.0435 (2.2106) loss 3.4367 (3.2671) grad_norm 2.0243 (2.1191) [2022-01-24 12:55:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][780/1251] eta 0:17:20 lr 
0.000195 time 2.2315 (2.2092) loss 3.8176 (3.2713) grad_norm 2.0142 (2.1194) [2022-01-24 12:55:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][790/1251] eta 0:16:58 lr 0.000195 time 1.8860 (2.2094) loss 2.4711 (3.2711) grad_norm 1.9812 (2.1181) [2022-01-24 12:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][800/1251] eta 0:16:36 lr 0.000195 time 2.3772 (2.2093) loss 3.0847 (3.2731) grad_norm 1.9132 (2.1175) [2022-01-24 12:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][810/1251] eta 0:16:14 lr 0.000195 time 1.8935 (2.2098) loss 2.5815 (3.2733) grad_norm 1.9136 (2.1176) [2022-01-24 12:56:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][820/1251] eta 0:15:53 lr 0.000195 time 2.5449 (2.2125) loss 3.8467 (3.2760) grad_norm 2.1587 (2.1181) [2022-01-24 12:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][830/1251] eta 0:15:31 lr 0.000195 time 1.9064 (2.2128) loss 3.7735 (3.2779) grad_norm 1.9452 (2.1179) [2022-01-24 12:57:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][840/1251] eta 0:15:09 lr 0.000195 time 2.7305 (2.2131) loss 3.7284 (3.2768) grad_norm 2.1533 (2.1196) [2022-01-24 12:57:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][850/1251] eta 0:14:46 lr 0.000195 time 1.5115 (2.2110) loss 3.1221 (3.2749) grad_norm 2.1752 (2.1194) [2022-01-24 12:58:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][860/1251] eta 0:14:23 lr 0.000195 time 1.9268 (2.2095) loss 3.3516 (3.2759) grad_norm 2.0605 (2.1195) [2022-01-24 12:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][870/1251] eta 0:14:00 lr 0.000195 time 1.5872 (2.2066) loss 3.3904 (3.2750) grad_norm 2.0884 (2.1231) [2022-01-24 12:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][880/1251] eta 0:13:38 lr 0.000195 time 2.1173 (2.2069) loss 3.5668 (3.2766) grad_norm 1.9824 (2.1257) [2022-01-24 12:59:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][890/1251] eta 0:13:16 lr 0.000195 time 2.2515 (2.2069) loss 3.3475 (3.2772) grad_norm 2.2713 (2.1270) [2022-01-24 12:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][900/1251] eta 0:12:55 lr 0.000195 time 2.2501 (2.2080) loss 3.9270 (3.2780) grad_norm 2.1727 (2.1278) [2022-01-24 12:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][910/1251] eta 0:12:32 lr 0.000195 time 2.5852 (2.2074) loss 3.0906 (3.2780) grad_norm 2.0633 (2.1280) [2022-01-24 13:00:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][920/1251] eta 0:12:10 lr 0.000195 time 1.4942 (2.2064) loss 3.8599 (3.2794) grad_norm 2.2829 (2.1273) [2022-01-24 13:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][930/1251] eta 0:11:47 lr 0.000195 time 1.9026 (2.2049) loss 3.5984 (3.2775) grad_norm 2.0447 (2.1276) [2022-01-24 13:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][940/1251] eta 0:11:25 lr 0.000194 time 1.9234 (2.2056) loss 3.0282 (3.2771) grad_norm 2.5880 (2.1284) [2022-01-24 13:01:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][950/1251] eta 0:11:03 lr 0.000194 time 2.4172 (2.2055) loss 3.6807 (3.2796) grad_norm 2.1239 (2.1280) [2022-01-24 13:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][960/1251] eta 0:10:41 lr 0.000194 time 1.7855 (2.2055) loss 3.8391 (3.2801) grad_norm 2.1947 (2.1293) [2022-01-24 13:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][970/1251] eta 0:10:19 lr 0.000194 time 1.7711 (2.2061) loss 3.0849 (3.2771) grad_norm 2.3395 (2.1294) [2022-01-24 13:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][980/1251] eta 0:09:58 lr 0.000194 time 1.5656 (2.2067) loss 3.3590 (3.2793) grad_norm 2.0292 (2.1301) [2022-01-24 13:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][990/1251] eta 0:09:35 lr 
0.000194 time 2.8077 (2.2069) loss 2.5174 (3.2791) grad_norm 1.9916 (2.1293) [2022-01-24 13:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1000/1251] eta 0:09:13 lr 0.000194 time 1.7950 (2.2059) loss 3.1239 (3.2799) grad_norm 1.9213 (2.1288) [2022-01-24 13:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1010/1251] eta 0:08:51 lr 0.000194 time 1.8251 (2.2050) loss 3.7215 (3.2810) grad_norm 2.1241 (2.1286) [2022-01-24 13:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1020/1251] eta 0:08:29 lr 0.000194 time 1.8802 (2.2037) loss 3.0156 (3.2820) grad_norm 1.9444 (2.1284) [2022-01-24 13:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1030/1251] eta 0:08:06 lr 0.000194 time 2.1648 (2.2031) loss 3.9947 (3.2812) grad_norm 2.3017 (2.1298) [2022-01-24 13:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1040/1251] eta 0:07:44 lr 0.000194 time 2.6897 (2.2025) loss 3.7705 (3.2829) grad_norm 2.0718 (2.1282) [2022-01-24 13:04:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1050/1251] eta 0:07:22 lr 0.000194 time 2.4888 (2.2033) loss 3.3101 (3.2833) grad_norm 2.3380 (2.1282) [2022-01-24 13:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1060/1251] eta 0:07:00 lr 0.000194 time 2.1384 (2.2040) loss 2.5657 (3.2813) grad_norm 1.8820 (2.1279) [2022-01-24 13:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1070/1251] eta 0:06:38 lr 0.000194 time 2.9258 (2.2032) loss 2.9384 (3.2808) grad_norm 1.9305 (2.1272) [2022-01-24 13:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1080/1251] eta 0:06:16 lr 0.000194 time 2.6108 (2.2027) loss 2.9189 (3.2800) grad_norm 1.8534 (2.1262) [2022-01-24 13:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1090/1251] eta 0:05:54 lr 0.000194 time 1.5594 (2.2029) loss 3.9463 (3.2787) grad_norm 2.3886 (2.1258) [2022-01-24 
13:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1100/1251] eta 0:05:32 lr 0.000194 time 2.2365 (2.2027) loss 2.4544 (3.2778) grad_norm 2.1527 (2.1253) [2022-01-24 13:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1110/1251] eta 0:05:10 lr 0.000194 time 2.2632 (2.2021) loss 3.7053 (3.2789) grad_norm 2.1097 (2.1242) [2022-01-24 13:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1120/1251] eta 0:04:48 lr 0.000194 time 2.2793 (2.2010) loss 4.0612 (3.2800) grad_norm 2.3616 (2.1229) [2022-01-24 13:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1130/1251] eta 0:04:26 lr 0.000194 time 1.9381 (2.2015) loss 3.9511 (3.2810) grad_norm 2.1420 (2.1230) [2022-01-24 13:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1140/1251] eta 0:04:04 lr 0.000194 time 1.9898 (2.2012) loss 3.4805 (3.2792) grad_norm 2.0486 (2.1228) [2022-01-24 13:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1150/1251] eta 0:03:42 lr 0.000194 time 2.5998 (2.2018) loss 4.0427 (3.2792) grad_norm 2.2098 (2.1237) [2022-01-24 13:08:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1160/1251] eta 0:03:20 lr 0.000194 time 1.5956 (2.2016) loss 2.3010 (3.2773) grad_norm 2.1590 (2.1233) [2022-01-24 13:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1170/1251] eta 0:02:58 lr 0.000194 time 1.9525 (2.2013) loss 3.3313 (3.2772) grad_norm 2.3594 (2.1236) [2022-01-24 13:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1180/1251] eta 0:02:36 lr 0.000194 time 1.5472 (2.2011) loss 2.9169 (3.2760) grad_norm 1.9140 (2.1229) [2022-01-24 13:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1190/1251] eta 0:02:14 lr 0.000194 time 2.5574 (2.2011) loss 3.0661 (3.2759) grad_norm 2.3034 (2.1232) [2022-01-24 13:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1200/1251] 
eta 0:01:52 lr 0.000194 time 1.8665 (2.1995) loss 3.5106 (3.2764) grad_norm 1.9888 (2.1240) [2022-01-24 13:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1210/1251] eta 0:01:30 lr 0.000194 time 1.6170 (2.1989) loss 3.7752 (3.2770) grad_norm 2.0003 (2.1238) [2022-01-24 13:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1220/1251] eta 0:01:08 lr 0.000194 time 2.0073 (2.1984) loss 2.4938 (3.2751) grad_norm 2.1877 (2.1238) [2022-01-24 13:11:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1230/1251] eta 0:00:46 lr 0.000194 time 3.1860 (2.1989) loss 4.2196 (3.2767) grad_norm 2.2774 (2.1237) [2022-01-24 13:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1240/1251] eta 0:00:24 lr 0.000194 time 1.7565 (2.1982) loss 2.2170 (3.2776) grad_norm 2.2571 (2.1223) [2022-01-24 13:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [214/300][1250/1251] eta 0:00:02 lr 0.000193 time 1.1375 (2.1931) loss 2.5008 (3.2755) grad_norm 2.1296 (2.1230) [2022-01-24 13:12:04 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 214 training takes 0:45:43 [2022-01-24 13:12:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.021 (18.021) Loss 0.8609 (0.8609) Acc@1 79.004 (79.004) Acc@5 95.703 (95.703) [2022-01-24 13:12:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.851 (3.311) Loss 0.9749 (0.9014) Acc@1 77.051 (79.128) Acc@5 94.727 (94.593) [2022-01-24 13:13:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.484 (2.688) Loss 0.7989 (0.8970) Acc@1 80.469 (78.818) Acc@5 95.801 (94.806) [2022-01-24 13:13:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.611 (2.354) Loss 0.8645 (0.8953) Acc@1 80.957 (78.799) Acc@5 95.117 (94.777) [2022-01-24 13:13:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.314 (2.280) Loss 0.8610 (0.8852) Acc@1 80.469 (79.121) Acc@5 94.629 (94.843) 
[2022-01-24 13:13:45 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.074 Acc@5 94.758 [2022-01-24 13:13:45 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-01-24 13:13:45 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.07% [2022-01-24 13:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][0/1251] eta 7:30:53 lr 0.000193 time 21.6252 (21.6252) loss 2.5657 (2.5657) grad_norm 2.2340 (2.2340) [2022-01-24 13:14:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][10/1251] eta 1:22:18 lr 0.000193 time 2.2058 (3.9796) loss 3.4944 (3.3854) grad_norm 2.2546 (2.0955) [2022-01-24 13:14:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][20/1251] eta 1:05:07 lr 0.000193 time 2.2629 (3.1743) loss 3.8282 (3.3795) grad_norm 1.9804 (2.0661) [2022-01-24 13:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][30/1251] eta 0:57:47 lr 0.000193 time 1.3780 (2.8402) loss 3.7563 (3.4209) grad_norm 2.1549 (2.0676) [2022-01-24 13:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][40/1251] eta 0:54:51 lr 0.000193 time 3.8060 (2.7176) loss 3.6084 (3.3911) grad_norm 1.8937 (2.0709) [2022-01-24 13:16:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][50/1251] eta 0:52:58 lr 0.000193 time 3.0861 (2.6468) loss 3.3562 (3.3904) grad_norm 2.0853 (2.0619) [2022-01-24 13:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][60/1251] eta 0:51:05 lr 0.000193 time 2.1047 (2.5736) loss 3.2278 (3.3217) grad_norm 2.1174 (2.0745) [2022-01-24 13:16:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][70/1251] eta 0:49:41 lr 0.000193 time 1.8902 (2.5249) loss 3.3590 (3.3366) grad_norm 2.7598 (2.0953) [2022-01-24 13:17:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][80/1251] eta 0:48:32 lr 0.000193 time 2.7705 (2.4875) loss 2.5223 (3.3128) 
grad_norm 1.9266 (2.0953) [2022-01-24 13:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][90/1251] eta 0:47:17 lr 0.000193 time 2.1829 (2.4437) loss 3.7843 (3.3010) grad_norm 2.0562 (2.0893) [2022-01-24 13:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][100/1251] eta 0:46:08 lr 0.000193 time 1.9145 (2.4050) loss 3.6694 (3.2775) grad_norm 2.0162 (2.1045) [2022-01-24 13:18:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][110/1251] eta 0:45:22 lr 0.000193 time 2.0305 (2.3862) loss 2.2444 (3.2802) grad_norm 1.7525 (2.1014) [2022-01-24 13:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][120/1251] eta 0:44:38 lr 0.000193 time 2.5708 (2.3681) loss 2.8948 (3.2724) grad_norm 2.9112 (2.1076) [2022-01-24 13:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][130/1251] eta 0:43:57 lr 0.000193 time 2.0510 (2.3529) loss 3.2666 (3.2699) grad_norm 2.3321 (2.1158) [2022-01-24 13:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][140/1251] eta 0:43:19 lr 0.000193 time 1.5319 (2.3401) loss 3.6120 (3.2622) grad_norm 2.0337 (2.1101) [2022-01-24 13:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][150/1251] eta 0:42:47 lr 0.000193 time 1.9446 (2.3319) loss 2.6950 (3.2618) grad_norm 2.4192 (2.1251) [2022-01-24 13:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][160/1251] eta 0:42:19 lr 0.000193 time 3.4174 (2.3276) loss 2.6032 (3.2536) grad_norm 2.2453 (2.1323) [2022-01-24 13:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][170/1251] eta 0:41:47 lr 0.000193 time 1.9261 (2.3193) loss 3.0947 (3.2481) grad_norm 2.1535 (2.1353) [2022-01-24 13:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][180/1251] eta 0:41:21 lr 0.000193 time 1.8463 (2.3166) loss 3.1353 (3.2516) grad_norm 1.9731 (2.1289) [2022-01-24 13:21:05 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [215/300][190/1251] eta 0:40:45 lr 0.000193 time 1.7404 (2.3048) loss 3.9435 (3.2530) grad_norm 1.9620 (2.1320) [2022-01-24 13:21:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][200/1251] eta 0:40:24 lr 0.000193 time 3.7083 (2.3064) loss 4.1079 (3.2571) grad_norm 2.0427 (2.1349) [2022-01-24 13:21:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][210/1251] eta 0:39:51 lr 0.000193 time 1.8224 (2.2970) loss 2.9651 (3.2645) grad_norm 2.6858 (2.1396) [2022-01-24 13:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][220/1251] eta 0:39:29 lr 0.000193 time 1.8652 (2.2984) loss 3.0985 (3.2548) grad_norm 2.2539 (2.1382) [2022-01-24 13:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][230/1251] eta 0:39:03 lr 0.000193 time 2.0058 (2.2957) loss 3.8657 (3.2501) grad_norm 2.1549 (2.1404) [2022-01-24 13:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][240/1251] eta 0:38:37 lr 0.000193 time 2.4960 (2.2919) loss 3.9071 (3.2462) grad_norm 2.1169 (2.1415) [2022-01-24 13:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][250/1251] eta 0:38:03 lr 0.000193 time 1.6042 (2.2807) loss 3.2484 (3.2519) grad_norm 1.7596 (2.1519) [2022-01-24 13:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][260/1251] eta 0:37:37 lr 0.000193 time 2.2135 (2.2776) loss 3.6975 (3.2496) grad_norm 1.7862 (2.1518) [2022-01-24 13:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][270/1251] eta 0:37:07 lr 0.000193 time 2.1852 (2.2702) loss 2.3680 (3.2422) grad_norm 1.9647 (2.1553) [2022-01-24 13:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][280/1251] eta 0:36:41 lr 0.000193 time 2.2267 (2.2676) loss 3.7354 (3.2478) grad_norm 2.4459 (2.1573) [2022-01-24 13:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][290/1251] eta 0:36:14 lr 0.000193 time 1.9321 (2.2628) loss 3.2350 (3.2513) 
grad_norm 2.1128 (2.1635) [2022-01-24 13:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][300/1251] eta 0:35:47 lr 0.000193 time 2.6399 (2.2580) loss 3.4292 (3.2561) grad_norm 2.0201 (2.1638) [2022-01-24 13:25:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][310/1251] eta 0:35:19 lr 0.000192 time 2.2388 (2.2523) loss 2.5641 (3.2513) grad_norm 2.3783 (2.1640) [2022-01-24 13:25:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][320/1251] eta 0:34:52 lr 0.000192 time 2.1096 (2.2475) loss 2.0823 (3.2515) grad_norm 2.1416 (2.1653) [2022-01-24 13:26:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][330/1251] eta 0:34:29 lr 0.000192 time 1.7476 (2.2465) loss 3.3273 (3.2467) grad_norm 2.3081 (2.1662) [2022-01-24 13:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][340/1251] eta 0:34:09 lr 0.000192 time 2.5196 (2.2501) loss 3.2431 (3.2518) grad_norm 1.9465 (2.1676) [2022-01-24 13:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][350/1251] eta 0:33:46 lr 0.000192 time 2.2343 (2.2490) loss 2.3099 (3.2520) grad_norm 1.9886 (2.1646) [2022-01-24 13:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][360/1251] eta 0:33:21 lr 0.000192 time 2.6364 (2.2466) loss 3.6576 (3.2528) grad_norm 2.1670 (2.1627) [2022-01-24 13:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][370/1251] eta 0:32:55 lr 0.000192 time 1.6069 (2.2422) loss 3.9051 (3.2569) grad_norm 2.3860 (2.1620) [2022-01-24 13:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][380/1251] eta 0:32:33 lr 0.000192 time 2.0057 (2.2427) loss 3.6269 (3.2632) grad_norm 2.0691 (2.1608) [2022-01-24 13:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][390/1251] eta 0:32:07 lr 0.000192 time 2.1069 (2.2390) loss 3.8399 (3.2666) grad_norm 2.5464 (2.1641) [2022-01-24 13:28:43 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [215/300][400/1251] eta 0:31:45 lr 0.000192 time 2.4067 (2.2393) loss 3.1993 (3.2673) grad_norm 2.0406 (2.1629) [2022-01-24 13:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][410/1251] eta 0:31:24 lr 0.000192 time 1.8963 (2.2409) loss 3.7023 (3.2629) grad_norm 2.1020 (2.1644) [2022-01-24 13:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][420/1251] eta 0:31:01 lr 0.000192 time 1.8922 (2.2402) loss 3.9292 (3.2650) grad_norm 1.9175 (2.1627) [2022-01-24 13:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][430/1251] eta 0:30:35 lr 0.000192 time 2.0619 (2.2360) loss 4.1165 (3.2741) grad_norm 1.7960 (2.1610) [2022-01-24 13:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][440/1251] eta 0:30:10 lr 0.000192 time 2.0225 (2.2328) loss 2.3261 (3.2734) grad_norm 2.0872 (2.1594) [2022-01-24 13:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][450/1251] eta 0:29:45 lr 0.000192 time 1.9370 (2.2287) loss 3.3367 (3.2766) grad_norm 2.0252 (2.1588) [2022-01-24 13:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][460/1251] eta 0:29:20 lr 0.000192 time 2.1845 (2.2255) loss 3.0689 (3.2739) grad_norm 2.2162 (2.1574) [2022-01-24 13:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][470/1251] eta 0:28:55 lr 0.000192 time 2.0187 (2.2227) loss 2.9801 (3.2727) grad_norm 1.9097 (2.1556) [2022-01-24 13:31:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][480/1251] eta 0:28:34 lr 0.000192 time 2.7500 (2.2233) loss 3.9587 (3.2754) grad_norm 2.0357 (2.1529) [2022-01-24 13:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][490/1251] eta 0:28:12 lr 0.000192 time 2.4417 (2.2235) loss 3.5231 (3.2735) grad_norm 1.7494 (2.1511) [2022-01-24 13:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][500/1251] eta 0:27:50 lr 0.000192 time 2.2193 (2.2243) loss 3.2160 (3.2713) 
grad_norm 2.3712 (2.1544) [2022-01-24 13:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][510/1251] eta 0:27:28 lr 0.000192 time 2.4677 (2.2252) loss 2.7766 (3.2703) grad_norm 2.2177 (2.1576) [2022-01-24 13:33:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][520/1251] eta 0:27:07 lr 0.000192 time 3.5562 (2.2266) loss 3.6215 (3.2720) grad_norm 2.0675 (2.1617) [2022-01-24 13:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][530/1251] eta 0:26:45 lr 0.000192 time 2.1252 (2.2262) loss 3.3287 (3.2677) grad_norm 1.9572 (2.1606) [2022-01-24 13:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][540/1251] eta 0:26:21 lr 0.000192 time 2.3284 (2.2249) loss 3.1649 (3.2661) grad_norm 2.2645 (2.1626) [2022-01-24 13:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][550/1251] eta 0:25:59 lr 0.000192 time 2.5076 (2.2244) loss 2.6326 (3.2632) grad_norm 2.1980 (2.1637) [2022-01-24 13:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][560/1251] eta 0:25:35 lr 0.000192 time 2.6554 (2.2228) loss 2.6216 (3.2667) grad_norm 2.3081 (2.1653) [2022-01-24 13:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][570/1251] eta 0:25:13 lr 0.000192 time 2.4950 (2.2218) loss 3.5173 (3.2648) grad_norm 2.4356 (2.1653) [2022-01-24 13:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][580/1251] eta 0:24:50 lr 0.000192 time 2.5255 (2.2211) loss 2.7084 (3.2654) grad_norm 2.0243 (2.1626) [2022-01-24 13:35:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][590/1251] eta 0:24:27 lr 0.000192 time 1.8242 (2.2207) loss 3.3803 (3.2705) grad_norm 2.9317 (2.1647) [2022-01-24 13:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][600/1251] eta 0:24:06 lr 0.000192 time 2.8466 (2.2223) loss 2.2579 (3.2666) grad_norm 2.5451 (2.1663) [2022-01-24 13:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [215/300][610/1251] eta 0:23:44 lr 0.000192 time 2.8693 (2.2216) loss 3.6671 (3.2639) grad_norm 2.3181 (2.1654) [2022-01-24 13:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][620/1251] eta 0:23:19 lr 0.000191 time 2.0326 (2.2181) loss 3.7041 (3.2676) grad_norm 1.9693 (2.1641) [2022-01-24 13:37:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][630/1251] eta 0:22:55 lr 0.000191 time 1.8634 (2.2157) loss 2.5493 (3.2688) grad_norm 2.0547 (2.1642) [2022-01-24 13:37:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][640/1251] eta 0:22:33 lr 0.000191 time 1.8595 (2.2157) loss 3.8864 (3.2676) grad_norm 2.1368 (2.1636) [2022-01-24 13:37:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][650/1251] eta 0:22:11 lr 0.000191 time 1.6307 (2.2155) loss 3.9443 (3.2651) grad_norm 2.0608 (2.1628) [2022-01-24 13:38:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][660/1251] eta 0:21:50 lr 0.000191 time 2.6876 (2.2177) loss 3.7016 (3.2648) grad_norm 2.0675 (2.1629) [2022-01-24 13:38:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][670/1251] eta 0:21:29 lr 0.000191 time 1.8683 (2.2196) loss 3.5658 (3.2647) grad_norm 1.9744 (2.1642) [2022-01-24 13:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][680/1251] eta 0:21:05 lr 0.000191 time 2.0325 (2.2160) loss 2.6127 (3.2647) grad_norm 3.0389 (2.1655) [2022-01-24 13:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][690/1251] eta 0:20:41 lr 0.000191 time 1.6120 (2.2122) loss 3.4305 (3.2662) grad_norm 1.9592 (2.1652) [2022-01-24 13:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][700/1251] eta 0:20:17 lr 0.000191 time 2.1123 (2.2099) loss 2.2865 (3.2619) grad_norm 2.6665 (2.1657) [2022-01-24 13:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][710/1251] eta 0:19:53 lr 0.000191 time 1.8350 (2.2062) loss 2.9205 (3.2632) 
grad_norm 2.1138 (2.1647) [2022-01-24 13:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][720/1251] eta 0:19:30 lr 0.000191 time 2.6350 (2.2052) loss 3.6412 (3.2640) grad_norm 2.0154 (2.1621) [2022-01-24 13:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][730/1251] eta 0:19:09 lr 0.000191 time 2.4413 (2.2059) loss 3.4666 (3.2638) grad_norm 2.2772 (2.1618) [2022-01-24 13:41:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][740/1251] eta 0:18:47 lr 0.000191 time 1.5800 (2.2061) loss 2.3657 (3.2637) grad_norm 2.0093 (2.1619) [2022-01-24 13:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][750/1251] eta 0:18:27 lr 0.000191 time 2.8149 (2.2098) loss 3.4708 (3.2680) grad_norm 1.9665 (2.1630) [2022-01-24 13:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][760/1251] eta 0:18:05 lr 0.000191 time 1.8891 (2.2102) loss 2.3783 (3.2653) grad_norm 2.2604 (2.1629) [2022-01-24 13:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][770/1251] eta 0:17:44 lr 0.000191 time 1.7772 (2.2126) loss 3.4056 (3.2669) grad_norm 1.9362 (2.1621) [2022-01-24 13:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][780/1251] eta 0:17:21 lr 0.000191 time 1.7228 (2.2120) loss 3.3189 (3.2661) grad_norm 2.2788 (2.1615) [2022-01-24 13:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][790/1251] eta 0:16:59 lr 0.000191 time 2.3111 (2.2125) loss 2.9825 (3.2654) grad_norm 2.3005 (2.1619) [2022-01-24 13:43:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][800/1251] eta 0:16:37 lr 0.000191 time 1.8191 (2.2121) loss 3.4088 (3.2644) grad_norm 1.8812 (2.1612) [2022-01-24 13:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][810/1251] eta 0:16:15 lr 0.000191 time 1.5978 (2.2128) loss 3.5994 (3.2643) grad_norm 2.2994 (2.1612) [2022-01-24 13:44:00 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [215/300][820/1251] eta 0:15:52 lr 0.000191 time 1.8783 (2.2109) loss 3.6234 (3.2638) grad_norm 2.5372 (2.1629) [2022-01-24 13:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][830/1251] eta 0:15:30 lr 0.000191 time 1.9698 (2.2094) loss 3.6279 (3.2623) grad_norm 2.0986 (2.1628) [2022-01-24 13:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][840/1251] eta 0:15:07 lr 0.000191 time 1.9995 (2.2078) loss 3.5074 (3.2651) grad_norm 1.9736 (2.1626) [2022-01-24 13:45:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][850/1251] eta 0:14:45 lr 0.000191 time 2.0228 (2.2070) loss 3.3034 (3.2646) grad_norm 1.9617 (2.1619) [2022-01-24 13:45:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][860/1251] eta 0:14:22 lr 0.000191 time 1.9423 (2.2067) loss 3.4615 (3.2639) grad_norm 2.1481 (2.1604) [2022-01-24 13:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][870/1251] eta 0:14:00 lr 0.000191 time 2.3242 (2.2063) loss 3.7125 (3.2645) grad_norm 2.1358 (2.1593) [2022-01-24 13:46:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][880/1251] eta 0:13:38 lr 0.000191 time 1.8619 (2.2073) loss 3.3475 (3.2664) grad_norm 2.0064 (2.1589) [2022-01-24 13:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][890/1251] eta 0:13:17 lr 0.000191 time 3.2436 (2.2100) loss 3.9364 (3.2678) grad_norm 2.0049 (2.1576) [2022-01-24 13:46:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][900/1251] eta 0:12:55 lr 0.000191 time 1.8542 (2.2103) loss 3.1722 (3.2700) grad_norm 1.9095 (2.1571) [2022-01-24 13:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][910/1251] eta 0:12:33 lr 0.000191 time 1.5750 (2.2088) loss 2.5999 (3.2695) grad_norm 1.9327 (2.1568) [2022-01-24 13:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][920/1251] eta 0:12:10 lr 0.000191 time 1.6630 (2.2072) loss 3.4579 (3.2700) 
grad_norm 2.1040 (2.1565) [2022-01-24 13:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][930/1251] eta 0:11:48 lr 0.000191 time 1.9555 (2.2078) loss 2.7484 (3.2692) grad_norm 1.9796 (2.1559) [2022-01-24 13:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][940/1251] eta 0:11:27 lr 0.000190 time 2.4410 (2.2100) loss 3.4217 (3.2709) grad_norm 2.1921 (2.1559) [2022-01-24 13:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][950/1251] eta 0:11:05 lr 0.000190 time 1.5292 (2.2105) loss 3.5795 (3.2680) grad_norm 1.8051 (2.1549) [2022-01-24 13:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][960/1251] eta 0:10:42 lr 0.000190 time 1.7396 (2.2090) loss 3.5409 (3.2692) grad_norm 2.0129 (2.1550) [2022-01-24 13:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][970/1251] eta 0:10:19 lr 0.000190 time 1.6758 (2.2058) loss 3.5063 (3.2672) grad_norm 2.3884 (2.1549) [2022-01-24 13:49:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][980/1251] eta 0:09:57 lr 0.000190 time 2.2273 (2.2045) loss 3.6411 (3.2677) grad_norm 2.4530 (2.1536) [2022-01-24 13:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][990/1251] eta 0:09:34 lr 0.000190 time 2.2075 (2.2026) loss 2.3549 (3.2660) grad_norm 2.0427 (2.1522) [2022-01-24 13:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1000/1251] eta 0:09:12 lr 0.000190 time 1.9204 (2.2002) loss 3.5009 (3.2668) grad_norm 1.9381 (2.1510) [2022-01-24 13:50:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1010/1251] eta 0:08:50 lr 0.000190 time 2.4258 (2.2014) loss 3.7507 (3.2659) grad_norm 1.9613 (2.1503) [2022-01-24 13:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1020/1251] eta 0:08:28 lr 0.000190 time 2.4613 (2.2014) loss 2.9422 (3.2656) grad_norm 1.9981 (2.1511) [2022-01-24 13:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [215/300][1030/1251] eta 0:08:06 lr 0.000190 time 1.9631 (2.2020) loss 3.5750 (3.2680) grad_norm 1.9107 (2.1531) [2022-01-24 13:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1040/1251] eta 0:07:44 lr 0.000190 time 1.6946 (2.2019) loss 3.7103 (3.2695) grad_norm 2.2001 (2.1525) [2022-01-24 13:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1050/1251] eta 0:07:23 lr 0.000190 time 3.3714 (2.2040) loss 3.3060 (3.2713) grad_norm 2.0179 (2.1516) [2022-01-24 13:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1060/1251] eta 0:07:00 lr 0.000190 time 2.5089 (2.2030) loss 3.7620 (3.2704) grad_norm 1.8973 (2.1502) [2022-01-24 13:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1070/1251] eta 0:06:38 lr 0.000190 time 2.0550 (2.2032) loss 3.5181 (3.2700) grad_norm 1.8189 (2.1486) [2022-01-24 13:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1080/1251] eta 0:06:16 lr 0.000190 time 1.5706 (2.2027) loss 3.5323 (3.2703) grad_norm 2.4462 (2.1483) [2022-01-24 13:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1090/1251] eta 0:05:54 lr 0.000190 time 2.5501 (2.2039) loss 2.7937 (3.2678) grad_norm 2.1201 (2.1482) [2022-01-24 13:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1100/1251] eta 0:05:33 lr 0.000190 time 1.8236 (2.2063) loss 2.3470 (3.2695) grad_norm 1.9629 (2.1470) [2022-01-24 13:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1110/1251] eta 0:05:11 lr 0.000190 time 2.5291 (2.2076) loss 2.7103 (3.2720) grad_norm 2.0368 (2.1461) [2022-01-24 13:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1120/1251] eta 0:04:49 lr 0.000190 time 1.8342 (2.2083) loss 2.6005 (3.2713) grad_norm 1.9235 (2.1450) [2022-01-24 13:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1130/1251] eta 0:04:26 lr 0.000190 time 2.0193 (2.2065) loss 3.3714 
(3.2719) grad_norm 1.9579 (2.1443) [2022-01-24 13:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1140/1251] eta 0:04:04 lr 0.000190 time 2.0402 (2.2034) loss 2.0346 (3.2698) grad_norm 1.8567 (2.1434) [2022-01-24 13:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1150/1251] eta 0:03:42 lr 0.000190 time 1.8850 (2.2011) loss 3.3238 (3.2694) grad_norm 1.9125 (2.1420) [2022-01-24 13:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1160/1251] eta 0:03:20 lr 0.000190 time 2.1969 (2.2008) loss 3.6472 (3.2691) grad_norm 2.5416 (2.1416) [2022-01-24 13:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1170/1251] eta 0:02:58 lr 0.000190 time 1.9269 (2.1992) loss 2.7083 (3.2655) grad_norm 2.3731 (2.1410) [2022-01-24 13:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1180/1251] eta 0:02:36 lr 0.000190 time 1.7602 (2.1992) loss 3.5324 (3.2674) grad_norm 2.0180 (2.1409) [2022-01-24 13:57:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1190/1251] eta 0:02:14 lr 0.000190 time 2.3704 (2.2010) loss 3.3831 (3.2658) grad_norm 2.5002 (2.1408) [2022-01-24 13:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1200/1251] eta 0:01:52 lr 0.000190 time 2.0424 (2.2025) loss 3.6196 (3.2653) grad_norm 1.9118 (2.1400) [2022-01-24 13:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1210/1251] eta 0:01:30 lr 0.000190 time 2.0148 (2.2025) loss 3.4027 (3.2650) grad_norm 2.2161 (2.1392) [2022-01-24 13:58:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1220/1251] eta 0:01:08 lr 0.000190 time 1.6268 (2.2026) loss 3.6660 (3.2669) grad_norm 1.9738 (2.1379) [2022-01-24 13:58:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1230/1251] eta 0:00:46 lr 0.000190 time 2.1491 (2.2019) loss 3.6905 (3.2662) grad_norm 2.0811 (2.1383) [2022-01-24 13:59:16 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [215/300][1240/1251] eta 0:00:24 lr 0.000190 time 1.8566 (2.2011) loss 3.3305 (3.2649) grad_norm 2.3411 (2.1380) [2022-01-24 13:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [215/300][1250/1251] eta 0:00:02 lr 0.000189 time 1.1122 (2.1958) loss 2.9131 (3.2628) grad_norm 2.1034 (2.1373) [2022-01-24 13:59:32 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 215 training takes 0:45:47 [2022-01-24 13:59:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.153 (18.153) Loss 0.8920 (0.8920) Acc@1 78.906 (78.906) Acc@5 95.020 (95.020) [2022-01-24 14:00:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.537 (3.609) Loss 0.8682 (0.8923) Acc@1 79.004 (79.084) Acc@5 95.508 (94.664) [2022-01-24 14:00:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.288 (2.776) Loss 0.8549 (0.8881) Acc@1 79.883 (78.999) Acc@5 95.508 (94.713) [2022-01-24 14:00:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.332 (2.355) Loss 0.9004 (0.8918) Acc@1 78.711 (78.919) Acc@5 94.922 (94.660) [2022-01-24 14:01:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.981 (2.223) Loss 0.7950 (0.8858) Acc@1 81.445 (79.090) Acc@5 95.508 (94.729) [2022-01-24 14:01:11 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.122 Acc@5 94.732 [2022-01-24 14:01:11 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-01-24 14:01:11 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.12% [2022-01-24 14:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][0/1251] eta 7:22:51 lr 0.000189 time 21.2402 (21.2402) loss 3.0005 (3.0005) grad_norm 2.2580 (2.2580) [2022-01-24 14:01:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][10/1251] eta 1:23:03 lr 0.000189 time 1.6767 (4.0160) loss 3.5449 (3.2837) grad_norm 1.7628 (2.0439) [2022-01-24 14:02:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][20/1251] eta 1:05:19 lr 0.000189 time 1.9591 (3.1841) loss 3.5514 (3.3921) grad_norm 2.1648 (2.0453) [2022-01-24 14:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][30/1251] eta 0:57:54 lr 0.000189 time 1.3064 (2.8456) loss 3.8240 (3.3117) grad_norm 2.5925 (2.0611) [2022-01-24 14:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][40/1251] eta 0:55:00 lr 0.000189 time 3.4428 (2.7254) loss 3.6785 (3.2771) grad_norm 1.8177 (2.0613) [2022-01-24 14:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][50/1251] eta 0:53:42 lr 0.000189 time 2.9065 (2.6829) loss 3.1798 (3.2578) grad_norm 2.1328 (2.0732) [2022-01-24 14:03:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][60/1251] eta 0:51:40 lr 0.000189 time 1.9976 (2.6035) loss 3.4899 (3.2516) grad_norm 1.7487 (2.0781) [2022-01-24 14:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][70/1251] eta 0:49:50 lr 0.000189 time 1.5885 (2.5323) loss 2.7870 (3.2547) grad_norm 2.2570 (2.1218) [2022-01-24 14:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][80/1251] eta 0:48:34 lr 0.000189 time 2.5518 (2.4888) loss 3.3860 (3.2258) grad_norm 1.9997 (2.1261) [2022-01-24 14:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][90/1251] eta 0:47:23 lr 0.000189 time 2.1381 (2.4488) loss 3.3442 (3.2227) grad_norm 2.0712 (2.1278) [2022-01-24 14:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][100/1251] eta 0:46:17 lr 0.000189 time 2.1094 (2.4134) loss 3.8182 (3.2375) grad_norm 2.1396 (2.1205) [2022-01-24 14:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][110/1251] eta 0:45:29 lr 0.000189 time 1.9192 (2.3922) loss 3.3756 (3.2460) grad_norm 1.7547 (2.1182) [2022-01-24 14:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][120/1251] eta 0:44:41 lr 0.000189 time 
2.4603 (2.3713) loss 2.4748 (3.2272) grad_norm 2.2197 (2.1181) [2022-01-24 14:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][130/1251] eta 0:43:49 lr 0.000189 time 1.6259 (2.3454) loss 3.2737 (3.2239) grad_norm 2.4089 (2.1320) [2022-01-24 14:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][140/1251] eta 0:43:02 lr 0.000189 time 2.0977 (2.3247) loss 2.0139 (3.1942) grad_norm 1.9318 (2.1257) [2022-01-24 14:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][150/1251] eta 0:42:30 lr 0.000189 time 1.8905 (2.3164) loss 2.7554 (3.1910) grad_norm 2.3781 (2.1324) [2022-01-24 14:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][160/1251] eta 0:42:12 lr 0.000189 time 3.3324 (2.3209) loss 3.8282 (3.2046) grad_norm 2.0848 (2.1328) [2022-01-24 14:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][170/1251] eta 0:41:39 lr 0.000189 time 1.9538 (2.3121) loss 3.5495 (3.2033) grad_norm 3.5875 (2.1400) [2022-01-24 14:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][180/1251] eta 0:41:12 lr 0.000189 time 2.3669 (2.3086) loss 2.8429 (3.2016) grad_norm 2.3198 (2.1337) [2022-01-24 14:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][190/1251] eta 0:40:39 lr 0.000189 time 1.9347 (2.2995) loss 3.9446 (3.2029) grad_norm 2.0333 (2.1254) [2022-01-24 14:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][200/1251] eta 0:40:09 lr 0.000189 time 2.4476 (2.2930) loss 3.3751 (3.2045) grad_norm 2.1436 (2.1215) [2022-01-24 14:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][210/1251] eta 0:39:34 lr 0.000189 time 1.9821 (2.2808) loss 2.8239 (3.2012) grad_norm 2.2257 (2.1221) [2022-01-24 14:09:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][220/1251] eta 0:39:05 lr 0.000189 time 2.1680 (2.2745) loss 2.7439 (3.1955) grad_norm 1.9847 (2.1197) [2022-01-24 14:09:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][230/1251] eta 0:38:48 lr 0.000189 time 2.6199 (2.2806) loss 3.6205 (3.1960) grad_norm 2.1555 (2.1170) [2022-01-24 14:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][240/1251] eta 0:38:25 lr 0.000189 time 2.1431 (2.2802) loss 3.3375 (3.1903) grad_norm 2.2107 (2.1229) [2022-01-24 14:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][250/1251] eta 0:38:03 lr 0.000189 time 2.1738 (2.2814) loss 3.4883 (3.1894) grad_norm 2.2450 (2.1227) [2022-01-24 14:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][260/1251] eta 0:37:32 lr 0.000189 time 2.2434 (2.2728) loss 3.8934 (3.1881) grad_norm 2.8328 (2.1268) [2022-01-24 14:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][270/1251] eta 0:36:58 lr 0.000189 time 2.3189 (2.2612) loss 3.4098 (3.1946) grad_norm 2.2553 (2.1342) [2022-01-24 14:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][280/1251] eta 0:36:30 lr 0.000189 time 2.4748 (2.2562) loss 2.2589 (3.1995) grad_norm 2.0224 (2.1385) [2022-01-24 14:12:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][290/1251] eta 0:36:01 lr 0.000189 time 2.5072 (2.2494) loss 3.5804 (3.2028) grad_norm 2.2377 (2.1382) [2022-01-24 14:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][300/1251] eta 0:35:37 lr 0.000189 time 2.4701 (2.2471) loss 3.3375 (3.2056) grad_norm 2.2475 (2.1416) [2022-01-24 14:12:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][310/1251] eta 0:35:13 lr 0.000188 time 2.7642 (2.2460) loss 2.6901 (3.1965) grad_norm 2.3500 (2.1450) [2022-01-24 14:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][320/1251] eta 0:34:50 lr 0.000188 time 2.0741 (2.2459) loss 2.5897 (3.1934) grad_norm 2.4943 (2.1472) [2022-01-24 14:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][330/1251] eta 0:34:31 lr 
0.000188 time 2.4580 (2.2494) loss 3.8693 (3.1982) grad_norm 2.3919 (2.1477) [2022-01-24 14:13:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][340/1251] eta 0:34:11 lr 0.000188 time 2.4218 (2.2514) loss 2.7581 (3.1991) grad_norm 2.0759 (2.1461) [2022-01-24 14:14:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][350/1251] eta 0:33:47 lr 0.000188 time 2.5211 (2.2507) loss 2.6755 (3.1999) grad_norm 2.0208 (2.1478) [2022-01-24 14:14:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][360/1251] eta 0:33:21 lr 0.000188 time 1.8245 (2.2469) loss 3.3119 (3.2035) grad_norm 2.1329 (2.1474) [2022-01-24 14:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][370/1251] eta 0:32:53 lr 0.000188 time 2.1268 (2.2397) loss 2.1200 (3.2029) grad_norm 1.9841 (2.1485) [2022-01-24 14:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][380/1251] eta 0:32:24 lr 0.000188 time 1.9723 (2.2326) loss 3.6873 (3.2119) grad_norm 2.1003 (2.1523) [2022-01-24 14:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][390/1251] eta 0:31:57 lr 0.000188 time 2.0874 (2.2273) loss 3.1720 (3.2151) grad_norm 2.3339 (2.1509) [2022-01-24 14:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][400/1251] eta 0:31:31 lr 0.000188 time 1.5778 (2.2224) loss 3.6452 (3.2196) grad_norm 1.8728 (2.1524) [2022-01-24 14:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][410/1251] eta 0:31:09 lr 0.000188 time 1.9676 (2.2231) loss 2.3684 (3.2151) grad_norm 2.0180 (2.1526) [2022-01-24 14:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][420/1251] eta 0:30:51 lr 0.000188 time 1.8430 (2.2285) loss 3.4379 (3.2145) grad_norm 2.1320 (2.1521) [2022-01-24 14:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][430/1251] eta 0:30:36 lr 0.000188 time 3.3145 (2.2364) loss 3.2056 (3.2120) grad_norm 1.7804 (2.1521) [2022-01-24 14:17:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][440/1251] eta 0:30:18 lr 0.000188 time 2.1690 (2.2420) loss 2.4969 (3.2139) grad_norm 2.1259 (2.1524) [2022-01-24 14:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][450/1251] eta 0:29:51 lr 0.000188 time 1.7724 (2.2367) loss 3.5729 (3.2151) grad_norm 2.6715 (2.1527) [2022-01-24 14:18:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][460/1251] eta 0:29:23 lr 0.000188 time 1.9614 (2.2299) loss 3.8773 (3.2186) grad_norm 2.3153 (2.1543) [2022-01-24 14:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][470/1251] eta 0:28:57 lr 0.000188 time 1.8349 (2.2242) loss 3.0044 (3.2179) grad_norm 2.1099 (2.1528) [2022-01-24 14:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][480/1251] eta 0:28:30 lr 0.000188 time 2.0210 (2.2190) loss 3.0526 (3.2168) grad_norm 2.4013 (2.1546) [2022-01-24 14:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][490/1251] eta 0:28:08 lr 0.000188 time 2.4995 (2.2188) loss 3.2752 (3.2205) grad_norm 2.3414 (2.1526) [2022-01-24 14:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][500/1251] eta 0:27:47 lr 0.000188 time 1.8163 (2.2204) loss 3.2350 (3.2251) grad_norm 1.8598 (2.1504) [2022-01-24 14:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][510/1251] eta 0:27:24 lr 0.000188 time 1.9339 (2.2194) loss 3.4504 (3.2262) grad_norm 2.7696 (2.1517) [2022-01-24 14:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][520/1251] eta 0:27:01 lr 0.000188 time 2.1954 (2.2181) loss 2.9758 (3.2244) grad_norm 2.5731 (2.1527) [2022-01-24 14:20:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][530/1251] eta 0:26:42 lr 0.000188 time 4.1327 (2.2224) loss 2.2491 (3.2240) grad_norm 3.7213 (2.1570) [2022-01-24 14:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][540/1251] eta 0:26:20 lr 
0.000188 time 2.6555 (2.2231) loss 2.3320 (3.2183) grad_norm 2.2926 (2.1593) [2022-01-24 14:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][550/1251] eta 0:25:57 lr 0.000188 time 1.5246 (2.2218) loss 3.8383 (3.2203) grad_norm 2.1728 (2.1583) [2022-01-24 14:21:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][560/1251] eta 0:25:35 lr 0.000188 time 2.1772 (2.2228) loss 3.3388 (3.2225) grad_norm 2.3752 (2.1589) [2022-01-24 14:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][570/1251] eta 0:25:15 lr 0.000188 time 3.5034 (2.2257) loss 3.2155 (3.2222) grad_norm 1.8932 (2.1572) [2022-01-24 14:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][580/1251] eta 0:24:49 lr 0.000188 time 1.9271 (2.2205) loss 2.3600 (3.2141) grad_norm 1.9789 (2.1542) [2022-01-24 14:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][590/1251] eta 0:24:26 lr 0.000188 time 1.9380 (2.2189) loss 3.7390 (3.2123) grad_norm 2.3099 (2.1541) [2022-01-24 14:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][600/1251] eta 0:24:03 lr 0.000188 time 1.4680 (2.2176) loss 3.3880 (3.2088) grad_norm 1.8686 (2.1527) [2022-01-24 14:23:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][610/1251] eta 0:23:43 lr 0.000188 time 3.6141 (2.2208) loss 3.9716 (3.2074) grad_norm 2.3365 (2.1519) [2022-01-24 14:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][620/1251] eta 0:23:21 lr 0.000187 time 1.7399 (2.2206) loss 2.4513 (3.2066) grad_norm 2.1006 (2.1505) [2022-01-24 14:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][630/1251] eta 0:22:59 lr 0.000187 time 1.8692 (2.2213) loss 3.6425 (3.2066) grad_norm 2.1070 (2.1499) [2022-01-24 14:24:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][640/1251] eta 0:22:36 lr 0.000187 time 1.9224 (2.2205) loss 2.9671 (3.2095) grad_norm 2.1477 (2.1501) [2022-01-24 14:25:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][650/1251] eta 0:22:15 lr 0.000187 time 3.6741 (2.2217) loss 3.4984 (3.2098) grad_norm 1.9521 (2.1502) [2022-01-24 14:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][660/1251] eta 0:21:51 lr 0.000187 time 1.7872 (2.2188) loss 3.6175 (3.2118) grad_norm 2.0605 (2.1498) [2022-01-24 14:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][670/1251] eta 0:21:27 lr 0.000187 time 1.9290 (2.2164) loss 2.0646 (3.2096) grad_norm 1.9912 (2.1472) [2022-01-24 14:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][680/1251] eta 0:21:04 lr 0.000187 time 2.0889 (2.2149) loss 2.2990 (3.2075) grad_norm 2.2158 (2.1460) [2022-01-24 14:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][690/1251] eta 0:20:43 lr 0.000187 time 3.1557 (2.2168) loss 3.0588 (3.2102) grad_norm 2.0004 (2.1459) [2022-01-24 14:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][700/1251] eta 0:20:21 lr 0.000187 time 1.6287 (2.2174) loss 3.6736 (3.2095) grad_norm 2.1994 (2.1441) [2022-01-24 14:27:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][710/1251] eta 0:19:59 lr 0.000187 time 2.5076 (2.2177) loss 2.7414 (3.2092) grad_norm 2.7296 (2.1465) [2022-01-24 14:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][720/1251] eta 0:19:37 lr 0.000187 time 2.0294 (2.2179) loss 3.6186 (3.2076) grad_norm 2.4257 (2.1475) [2022-01-24 14:28:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][730/1251] eta 0:19:15 lr 0.000187 time 2.4755 (2.2169) loss 3.3860 (3.2072) grad_norm 2.0643 (2.1486) [2022-01-24 14:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][740/1251] eta 0:18:51 lr 0.000187 time 1.5972 (2.2142) loss 2.6138 (3.2065) grad_norm 1.8598 (2.1489) [2022-01-24 14:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][750/1251] eta 0:18:27 lr 
0.000187 time 2.2008 (2.2115) loss 3.3920 (3.2092) grad_norm 2.1033 (2.1488) [2022-01-24 14:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][760/1251] eta 0:18:04 lr 0.000187 time 1.9549 (2.2097) loss 2.9414 (3.2096) grad_norm 2.1906 (2.1494) [2022-01-24 14:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][770/1251] eta 0:17:42 lr 0.000187 time 2.2163 (2.2094) loss 3.6564 (3.2129) grad_norm 2.4399 (2.1513) [2022-01-24 14:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][780/1251] eta 0:17:20 lr 0.000187 time 2.5448 (2.2093) loss 3.3311 (3.2119) grad_norm 2.1297 (2.1521) [2022-01-24 14:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][790/1251] eta 0:16:58 lr 0.000187 time 2.7870 (2.2103) loss 2.3381 (3.2087) grad_norm 2.1942 (2.1524) [2022-01-24 14:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][800/1251] eta 0:16:36 lr 0.000187 time 2.3970 (2.2098) loss 2.8523 (3.2099) grad_norm 2.0265 (2.1526) [2022-01-24 14:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][810/1251] eta 0:16:14 lr 0.000187 time 2.1831 (2.2094) loss 3.7550 (3.2120) grad_norm 2.3230 (2.1522) [2022-01-24 14:31:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][820/1251] eta 0:15:52 lr 0.000187 time 1.9861 (2.2091) loss 2.1881 (3.2094) grad_norm 2.3213 (2.1538) [2022-01-24 14:31:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][830/1251] eta 0:15:31 lr 0.000187 time 2.7268 (2.2118) loss 3.1872 (3.2090) grad_norm 1.9124 (2.1543) [2022-01-24 14:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][840/1251] eta 0:15:10 lr 0.000187 time 2.1062 (2.2151) loss 2.0998 (3.2077) grad_norm 2.0207 (2.1534) [2022-01-24 14:32:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][850/1251] eta 0:14:48 lr 0.000187 time 1.9114 (2.2164) loss 2.2784 (3.2055) grad_norm 2.2846 (2.1548) [2022-01-24 14:32:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][860/1251] eta 0:14:25 lr 0.000187 time 1.6037 (2.2139) loss 3.6696 (3.2079) grad_norm 1.9581 (2.1539) [2022-01-24 14:33:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][870/1251] eta 0:14:02 lr 0.000187 time 1.9167 (2.2102) loss 2.4444 (3.2091) grad_norm 2.2948 (2.1528) [2022-01-24 14:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][880/1251] eta 0:13:39 lr 0.000187 time 1.6742 (2.2083) loss 4.0200 (3.2130) grad_norm 2.5622 (2.1554) [2022-01-24 14:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][890/1251] eta 0:13:17 lr 0.000187 time 2.3516 (2.2091) loss 3.8352 (3.2154) grad_norm 2.1974 (2.1563) [2022-01-24 14:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][900/1251] eta 0:12:55 lr 0.000187 time 1.9275 (2.2105) loss 3.2537 (3.2140) grad_norm 1.9867 (2.1557) [2022-01-24 14:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][910/1251] eta 0:12:33 lr 0.000187 time 2.1896 (2.2105) loss 4.1567 (3.2156) grad_norm 1.8038 (2.1552) [2022-01-24 14:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][920/1251] eta 0:12:11 lr 0.000187 time 2.1907 (2.2108) loss 2.9635 (3.2151) grad_norm 2.1685 (2.1547) [2022-01-24 14:35:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][930/1251] eta 0:11:49 lr 0.000187 time 2.2825 (2.2106) loss 3.3900 (3.2146) grad_norm 2.0792 (2.1536) [2022-01-24 14:35:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][940/1251] eta 0:11:27 lr 0.000186 time 1.9317 (2.2101) loss 3.7328 (3.2168) grad_norm 2.5398 (2.1555) [2022-01-24 14:36:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][950/1251] eta 0:11:04 lr 0.000186 time 1.8466 (2.2091) loss 3.6400 (3.2202) grad_norm 2.0142 (2.1558) [2022-01-24 14:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][960/1251] eta 0:10:42 lr 
0.000186 time 1.9839 (2.2073) loss 3.1466 (3.2189) grad_norm 2.2262 (2.1553) [2022-01-24 14:36:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][970/1251] eta 0:10:20 lr 0.000186 time 1.6473 (2.2078) loss 3.6701 (3.2212) grad_norm 2.5805 (2.1546) [2022-01-24 14:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][980/1251] eta 0:09:58 lr 0.000186 time 2.4802 (2.2077) loss 3.5428 (3.2197) grad_norm 2.1149 (2.1537) [2022-01-24 14:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][990/1251] eta 0:09:36 lr 0.000186 time 1.5256 (2.2079) loss 2.5425 (3.2202) grad_norm 2.0947 (2.1541) [2022-01-24 14:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1000/1251] eta 0:09:14 lr 0.000186 time 2.8112 (2.2097) loss 2.8678 (3.2190) grad_norm 2.0026 (2.1533) [2022-01-24 14:38:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1010/1251] eta 0:08:52 lr 0.000186 time 2.1664 (2.2115) loss 4.0458 (3.2203) grad_norm 2.0950 (2.1544) [2022-01-24 14:38:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1020/1251] eta 0:08:30 lr 0.000186 time 2.0503 (2.2113) loss 3.2400 (3.2204) grad_norm 2.5957 (2.1547) [2022-01-24 14:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1030/1251] eta 0:08:08 lr 0.000186 time 1.9066 (2.2104) loss 3.2270 (3.2203) grad_norm 2.6309 (2.1546) [2022-01-24 14:39:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1040/1251] eta 0:07:45 lr 0.000186 time 1.5428 (2.2062) loss 3.9163 (3.2220) grad_norm 2.2052 (2.1537) [2022-01-24 14:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1050/1251] eta 0:07:22 lr 0.000186 time 1.7907 (2.2035) loss 3.5955 (3.2223) grad_norm 2.0394 (2.1544) [2022-01-24 14:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1060/1251] eta 0:07:00 lr 0.000186 time 2.2668 (2.2015) loss 3.7461 (3.2249) grad_norm 2.1680 (2.1548) [2022-01-24 
14:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1070/1251] eta 0:06:38 lr 0.000186 time 1.7751 (2.2002) loss 3.8464 (3.2263) grad_norm 2.0687 (2.1543) [2022-01-24 14:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1080/1251] eta 0:06:16 lr 0.000186 time 2.4985 (2.2003) loss 4.0984 (3.2282) grad_norm 1.9546 (2.1533) [2022-01-24 14:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1090/1251] eta 0:05:54 lr 0.000186 time 2.2112 (2.2011) loss 3.8611 (3.2320) grad_norm 2.2678 (2.1543) [2022-01-24 14:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1100/1251] eta 0:05:32 lr 0.000186 time 2.1868 (2.2003) loss 3.6349 (3.2348) grad_norm 2.0452 (2.1533) [2022-01-24 14:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1110/1251] eta 0:05:10 lr 0.000186 time 1.6029 (2.2009) loss 3.7286 (3.2347) grad_norm 2.5600 (2.1530) [2022-01-24 14:42:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1120/1251] eta 0:04:48 lr 0.000186 time 2.8650 (2.2027) loss 2.2410 (3.2363) grad_norm 2.0141 (2.1529) [2022-01-24 14:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1130/1251] eta 0:04:26 lr 0.000186 time 2.0694 (2.2052) loss 3.4992 (3.2363) grad_norm 2.3439 (2.1519) [2022-01-24 14:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1140/1251] eta 0:04:04 lr 0.000186 time 1.8581 (2.2059) loss 2.7855 (3.2366) grad_norm 1.9797 (2.1513) [2022-01-24 14:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1150/1251] eta 0:03:42 lr 0.000186 time 2.0287 (2.2067) loss 2.7934 (3.2354) grad_norm 2.1053 (2.1503) [2022-01-24 14:43:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1160/1251] eta 0:03:20 lr 0.000186 time 1.9965 (2.2082) loss 3.2925 (3.2344) grad_norm 2.0771 (2.1502) [2022-01-24 14:44:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1170/1251] 
eta 0:02:58 lr 0.000186 time 1.8064 (2.2073) loss 2.5618 (3.2328) grad_norm 2.2078 (2.1503) [2022-01-24 14:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1180/1251] eta 0:02:36 lr 0.000186 time 2.7628 (2.2062) loss 3.7009 (3.2351) grad_norm 2.3517 (2.1509) [2022-01-24 14:44:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1190/1251] eta 0:02:14 lr 0.000186 time 1.9449 (2.2038) loss 3.4956 (3.2355) grad_norm 2.4277 (2.1527) [2022-01-24 14:45:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1200/1251] eta 0:01:52 lr 0.000186 time 2.6504 (2.2024) loss 3.8871 (3.2366) grad_norm 2.1006 (2.1531) [2022-01-24 14:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1210/1251] eta 0:01:30 lr 0.000186 time 2.3812 (2.2008) loss 2.4377 (3.2370) grad_norm 2.1887 (2.1524) [2022-01-24 14:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1220/1251] eta 0:01:08 lr 0.000186 time 2.1909 (2.2020) loss 3.2745 (3.2378) grad_norm 1.9226 (2.1519) [2022-01-24 14:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1230/1251] eta 0:00:46 lr 0.000186 time 2.8634 (2.2032) loss 2.5343 (3.2371) grad_norm 2.0620 (2.1524) [2022-01-24 14:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1240/1251] eta 0:00:24 lr 0.000186 time 1.7151 (2.2028) loss 4.1531 (3.2381) grad_norm 2.3903 (2.1535) [2022-01-24 14:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [216/300][1250/1251] eta 0:00:02 lr 0.000186 time 1.1440 (2.1979) loss 3.8492 (3.2388) grad_norm 2.1218 (2.1535) [2022-01-24 14:47:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 216 training takes 0:45:49 [2022-01-24 14:47:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.341 (18.341) Loss 0.8897 (0.8897) Acc@1 79.102 (79.102) Acc@5 94.238 (94.238) [2022-01-24 14:47:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.945 (3.302) 
Loss 0.8770 (0.8869) Acc@1 79.590 (79.537) Acc@5 95.020 (94.869) [2022-01-24 14:47:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.328 (2.563) Loss 0.9506 (0.8946) Acc@1 77.539 (79.167) Acc@5 93.262 (94.782) [2022-01-24 14:48:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.612 (2.307) Loss 0.8876 (0.8900) Acc@1 79.395 (79.265) Acc@5 94.922 (94.771) [2022-01-24 14:48:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.191 (2.162) Loss 0.8168 (0.8906) Acc@1 80.762 (79.194) Acc@5 95.215 (94.734) [2022-01-24 14:48:37 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.068 Acc@5 94.716 [2022-01-24 14:48:37 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-01-24 14:48:37 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.12% [2022-01-24 14:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][0/1251] eta 7:31:47 lr 0.000185 time 21.6684 (21.6684) loss 3.2664 (3.2664) grad_norm 2.1834 (2.1834) [2022-01-24 14:49:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][10/1251] eta 1:22:59 lr 0.000185 time 2.0373 (4.0124) loss 2.7787 (3.0718) grad_norm 1.9227 (2.1562) [2022-01-24 14:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][20/1251] eta 1:05:47 lr 0.000185 time 1.5428 (3.2069) loss 3.7871 (3.3508) grad_norm 2.1579 (2.1532) [2022-01-24 14:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][30/1251] eta 0:57:50 lr 0.000185 time 1.4103 (2.8422) loss 2.6087 (3.2373) grad_norm 2.1298 (2.1779) [2022-01-24 14:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][40/1251] eta 0:54:37 lr 0.000185 time 3.5455 (2.7061) loss 2.5342 (3.2123) grad_norm 1.9292 (2.1531) [2022-01-24 14:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][50/1251] eta 0:53:05 lr 0.000185 time 2.5608 (2.6520) loss 3.3615 (3.1626) 
grad_norm 1.9775 (2.1248) [2022-01-24 14:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][60/1251] eta 0:51:17 lr 0.000185 time 1.6822 (2.5837) loss 2.5753 (3.1701) grad_norm 1.9292 (2.1342) [2022-01-24 14:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][70/1251] eta 0:49:25 lr 0.000185 time 1.9088 (2.5112) loss 3.8386 (3.1954) grad_norm 2.3317 (2.1381) [2022-01-24 14:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][80/1251] eta 0:48:10 lr 0.000185 time 2.6347 (2.4681) loss 3.5088 (3.2061) grad_norm 2.2606 (2.1397) [2022-01-24 14:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][90/1251] eta 0:47:32 lr 0.000185 time 3.2226 (2.4572) loss 2.5262 (3.1843) grad_norm 2.0567 (2.1385) [2022-01-24 14:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][100/1251] eta 0:46:48 lr 0.000185 time 1.8372 (2.4399) loss 3.5018 (3.2005) grad_norm 2.6236 (2.1402) [2022-01-24 14:53:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][110/1251] eta 0:45:49 lr 0.000185 time 1.5879 (2.4098) loss 3.4368 (3.2176) grad_norm 2.2154 (2.1448) [2022-01-24 14:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][120/1251] eta 0:44:53 lr 0.000185 time 1.6930 (2.3812) loss 2.1338 (3.1853) grad_norm 2.3105 (2.1431) [2022-01-24 14:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][130/1251] eta 0:44:38 lr 0.000185 time 3.8558 (2.3889) loss 3.6505 (3.2160) grad_norm 2.2634 (2.1500) [2022-01-24 14:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][140/1251] eta 0:44:11 lr 0.000185 time 1.9532 (2.3866) loss 2.5603 (3.2171) grad_norm 2.2415 (2.1487) [2022-01-24 14:54:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][150/1251] eta 0:43:26 lr 0.000185 time 1.6169 (2.3675) loss 2.4795 (3.2160) grad_norm 2.2261 (2.1655) [2022-01-24 14:54:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[217/300][160/1251] eta 0:42:43 lr 0.000185 time 1.8572 (2.3499) loss 3.2850 (3.2148) grad_norm 1.9659 (2.1653) [2022-01-24 14:55:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][170/1251] eta 0:42:08 lr 0.000185 time 2.8222 (2.3390) loss 3.6532 (3.2269) grad_norm 2.1155 (2.1674) [2022-01-24 14:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][180/1251] eta 0:41:35 lr 0.000185 time 1.8908 (2.3298) loss 3.5707 (3.2255) grad_norm 2.1920 (2.1663) [2022-01-24 14:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][190/1251] eta 0:40:52 lr 0.000185 time 1.9435 (2.3114) loss 2.8318 (3.2309) grad_norm 1.9793 (2.1648) [2022-01-24 14:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][200/1251] eta 0:40:17 lr 0.000185 time 2.0305 (2.3006) loss 3.5414 (3.2255) grad_norm 2.2674 (2.1618) [2022-01-24 14:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][210/1251] eta 0:39:43 lr 0.000185 time 2.1601 (2.2897) loss 3.5397 (3.2422) grad_norm 1.9784 (2.1551) [2022-01-24 14:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][220/1251] eta 0:39:15 lr 0.000185 time 2.2124 (2.2845) loss 3.8072 (3.2414) grad_norm 2.5869 (2.1564) [2022-01-24 14:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][230/1251] eta 0:38:47 lr 0.000185 time 1.9274 (2.2799) loss 2.7489 (3.2276) grad_norm 2.7199 (2.1687) [2022-01-24 14:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][240/1251] eta 0:38:21 lr 0.000185 time 2.4585 (2.2768) loss 3.4825 (3.2303) grad_norm 2.4433 (2.1673) [2022-01-24 14:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][250/1251] eta 0:37:58 lr 0.000185 time 1.8333 (2.2761) loss 3.3940 (3.2214) grad_norm 2.1953 (2.1655) [2022-01-24 14:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][260/1251] eta 0:37:39 lr 0.000185 time 2.7575 (2.2803) loss 3.1003 (3.2129) grad_norm 
2.4278 (2.1680) [2022-01-24 14:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][270/1251] eta 0:37:18 lr 0.000185 time 2.2323 (2.2819) loss 3.8619 (3.2121) grad_norm 2.1373 (2.1722) [2022-01-24 14:59:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][280/1251] eta 0:36:52 lr 0.000185 time 2.0621 (2.2786) loss 2.2873 (3.2174) grad_norm 1.9143 (2.1682) [2022-01-24 14:59:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][290/1251] eta 0:36:20 lr 0.000185 time 1.7074 (2.2692) loss 3.1632 (3.2138) grad_norm 1.8506 (2.1660) [2022-01-24 14:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][300/1251] eta 0:35:46 lr 0.000185 time 1.9692 (2.2573) loss 3.0958 (3.2119) grad_norm 1.6737 (2.1626) [2022-01-24 15:00:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][310/1251] eta 0:35:21 lr 0.000185 time 1.7580 (2.2546) loss 3.6179 (3.2142) grad_norm 2.0978 (2.1595) [2022-01-24 15:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][320/1251] eta 0:34:56 lr 0.000184 time 2.1766 (2.2518) loss 3.1315 (3.2158) grad_norm 2.4291 (2.1587) [2022-01-24 15:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][330/1251] eta 0:34:28 lr 0.000184 time 2.2589 (2.2464) loss 2.1292 (3.2181) grad_norm 1.9966 (2.1579) [2022-01-24 15:01:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][340/1251] eta 0:33:59 lr 0.000184 time 1.6487 (2.2391) loss 2.8787 (3.2157) grad_norm 2.0690 (2.1558) [2022-01-24 15:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][350/1251] eta 0:33:40 lr 0.000184 time 2.2059 (2.2423) loss 3.5422 (3.2268) grad_norm 2.0849 (2.1554) [2022-01-24 15:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][360/1251] eta 0:33:17 lr 0.000184 time 1.8907 (2.2420) loss 2.2417 (3.2257) grad_norm 2.1912 (2.1524) [2022-01-24 15:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[217/300][370/1251] eta 0:32:53 lr 0.000184 time 1.9277 (2.2406) loss 2.4980 (3.2242) grad_norm 1.9077 (2.1508) [2022-01-24 15:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][380/1251] eta 0:32:34 lr 0.000184 time 2.9634 (2.2439) loss 2.3235 (3.2255) grad_norm 1.7574 (2.1490) [2022-01-24 15:03:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][390/1251] eta 0:32:13 lr 0.000184 time 1.5673 (2.2454) loss 3.3016 (3.2252) grad_norm 2.0562 (2.1497) [2022-01-24 15:03:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][400/1251] eta 0:31:49 lr 0.000184 time 1.6193 (2.2443) loss 3.3563 (3.2248) grad_norm 2.3707 (2.1524) [2022-01-24 15:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][410/1251] eta 0:31:27 lr 0.000184 time 1.8261 (2.2444) loss 2.2804 (3.2218) grad_norm 2.5579 (2.1531) [2022-01-24 15:04:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][420/1251] eta 0:31:03 lr 0.000184 time 2.5707 (2.2422) loss 3.5374 (3.2218) grad_norm 1.9610 (2.1546) [2022-01-24 15:04:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][430/1251] eta 0:30:39 lr 0.000184 time 2.1763 (2.2408) loss 3.5662 (3.2239) grad_norm 1.8573 (2.1531) [2022-01-24 15:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][440/1251] eta 0:30:17 lr 0.000184 time 1.9519 (2.2413) loss 3.4163 (3.2220) grad_norm 2.0910 (2.1537) [2022-01-24 15:05:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][450/1251] eta 0:29:55 lr 0.000184 time 1.9716 (2.2413) loss 2.9378 (3.2209) grad_norm 2.0281 (2.1554) [2022-01-24 15:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][460/1251] eta 0:29:33 lr 0.000184 time 1.7973 (2.2419) loss 3.5535 (3.2238) grad_norm 2.0045 (2.1537) [2022-01-24 15:06:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][470/1251] eta 0:29:09 lr 0.000184 time 1.6024 (2.2403) loss 4.0448 (3.2280) grad_norm 
2.2758 (2.1527) [2022-01-24 15:06:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][480/1251] eta 0:28:43 lr 0.000184 time 2.0577 (2.2356) loss 3.5636 (3.2274) grad_norm 2.1448 (2.1522) [2022-01-24 15:06:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][490/1251] eta 0:28:19 lr 0.000184 time 1.9799 (2.2328) loss 3.5487 (3.2293) grad_norm 2.2222 (2.1523) [2022-01-24 15:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][500/1251] eta 0:27:55 lr 0.000184 time 2.1663 (2.2310) loss 2.6717 (3.2322) grad_norm 1.9632 (2.1513) [2022-01-24 15:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][510/1251] eta 0:27:32 lr 0.000184 time 2.5189 (2.2303) loss 2.1564 (3.2322) grad_norm 1.8254 (2.1507) [2022-01-24 15:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][520/1251] eta 0:27:09 lr 0.000184 time 2.3652 (2.2290) loss 3.2940 (3.2312) grad_norm 2.2454 (2.1507) [2022-01-24 15:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][530/1251] eta 0:26:46 lr 0.000184 time 2.1641 (2.2287) loss 3.2133 (3.2341) grad_norm 2.1048 (2.1524) [2022-01-24 15:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][540/1251] eta 0:26:25 lr 0.000184 time 2.1587 (2.2306) loss 2.5658 (3.2329) grad_norm 2.4995 (2.1525) [2022-01-24 15:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][550/1251] eta 0:26:01 lr 0.000184 time 1.9707 (2.2279) loss 3.2663 (3.2383) grad_norm 2.1768 (2.1543) [2022-01-24 15:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][560/1251] eta 0:25:38 lr 0.000184 time 1.8089 (2.2261) loss 3.2316 (3.2400) grad_norm 2.0447 (2.1552) [2022-01-24 15:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][570/1251] eta 0:25:16 lr 0.000184 time 2.0632 (2.2265) loss 2.2717 (3.2396) grad_norm 2.0562 (2.1551) [2022-01-24 15:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[217/300][580/1251] eta 0:24:53 lr 0.000184 time 2.5464 (2.2255) loss 3.1053 (3.2428) grad_norm 2.0778 (2.1566) [2022-01-24 15:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][590/1251] eta 0:24:31 lr 0.000184 time 2.0874 (2.2256) loss 2.6330 (3.2431) grad_norm 2.3274 (2.1570) [2022-01-24 15:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][600/1251] eta 0:24:08 lr 0.000184 time 2.4781 (2.2244) loss 3.2047 (3.2426) grad_norm 2.0269 (2.1576) [2022-01-24 15:11:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][610/1251] eta 0:23:44 lr 0.000184 time 1.8710 (2.2223) loss 3.1361 (3.2413) grad_norm 2.0887 (2.1586) [2022-01-24 15:11:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][620/1251] eta 0:23:20 lr 0.000184 time 1.9383 (2.2194) loss 3.9977 (3.2428) grad_norm 2.1324 (2.1585) [2022-01-24 15:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][630/1251] eta 0:22:57 lr 0.000184 time 1.8381 (2.2179) loss 2.7685 (3.2424) grad_norm 2.6262 (2.1582) [2022-01-24 15:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][640/1251] eta 0:22:35 lr 0.000183 time 2.1970 (2.2179) loss 3.3526 (3.2426) grad_norm 2.6945 (2.1586) [2022-01-24 15:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][650/1251] eta 0:22:13 lr 0.000183 time 2.7506 (2.2186) loss 3.3746 (3.2474) grad_norm 2.1639 (2.1589) [2022-01-24 15:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][660/1251] eta 0:21:51 lr 0.000183 time 2.2489 (2.2184) loss 4.0345 (3.2478) grad_norm 1.9345 (2.1584) [2022-01-24 15:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][670/1251] eta 0:21:29 lr 0.000183 time 2.5539 (2.2188) loss 3.9412 (3.2469) grad_norm 2.0666 (2.1595) [2022-01-24 15:13:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][680/1251] eta 0:21:08 lr 0.000183 time 2.8250 (2.2209) loss 2.8987 (3.2485) grad_norm 
1.9810 (2.1581) [2022-01-24 15:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][690/1251] eta 0:20:45 lr 0.000183 time 2.7671 (2.2208) loss 2.3566 (3.2448) grad_norm 2.3519 (2.1582) [2022-01-24 15:14:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][700/1251] eta 0:20:21 lr 0.000183 time 1.6446 (2.2169) loss 3.4906 (3.2455) grad_norm 2.2674 (2.1571) [2022-01-24 15:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][710/1251] eta 0:19:58 lr 0.000183 time 2.1682 (2.2150) loss 3.8095 (3.2457) grad_norm 2.4315 (2.1595) [2022-01-24 15:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][720/1251] eta 0:19:35 lr 0.000183 time 2.5463 (2.2134) loss 3.7163 (3.2452) grad_norm 2.1952 (2.1598) [2022-01-24 15:15:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][730/1251] eta 0:19:12 lr 0.000183 time 2.5204 (2.2126) loss 3.5664 (3.2444) grad_norm 2.3528 (2.1615) [2022-01-24 15:15:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][740/1251] eta 0:18:49 lr 0.000183 time 1.7137 (2.2110) loss 2.3482 (3.2421) grad_norm 1.9338 (2.1612) [2022-01-24 15:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][750/1251] eta 0:18:27 lr 0.000183 time 1.9115 (2.2104) loss 3.0559 (3.2401) grad_norm 2.0900 (2.1611) [2022-01-24 15:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][760/1251] eta 0:18:06 lr 0.000183 time 2.7871 (2.2122) loss 4.0789 (3.2445) grad_norm 2.2338 (2.1613) [2022-01-24 15:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][770/1251] eta 0:17:44 lr 0.000183 time 2.5276 (2.2140) loss 4.0647 (3.2478) grad_norm 2.2398 (2.1617) [2022-01-24 15:17:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][780/1251] eta 0:17:22 lr 0.000183 time 2.2229 (2.2140) loss 2.2809 (3.2477) grad_norm 1.8917 (2.1611) [2022-01-24 15:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[217/300][790/1251] eta 0:17:00 lr 0.000183 time 1.8275 (2.2139) loss 2.9813 (3.2449) grad_norm 2.0640 (2.1608) [2022-01-24 15:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][800/1251] eta 0:16:39 lr 0.000183 time 2.8308 (2.2159) loss 2.4414 (3.2465) grad_norm 1.8928 (2.1593) [2022-01-24 15:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][810/1251] eta 0:16:16 lr 0.000183 time 1.9217 (2.2148) loss 3.3105 (3.2458) grad_norm 2.4686 (2.1628) [2022-01-24 15:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][820/1251] eta 0:15:53 lr 0.000183 time 1.8496 (2.2125) loss 3.8298 (3.2424) grad_norm 2.4147 (2.1631) [2022-01-24 15:19:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][830/1251] eta 0:15:30 lr 0.000183 time 1.9734 (2.2104) loss 3.1313 (3.2416) grad_norm 2.0707 (2.1628) [2022-01-24 15:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][840/1251] eta 0:15:08 lr 0.000183 time 2.7793 (2.2095) loss 3.6611 (3.2388) grad_norm 2.1933 (2.1627) [2022-01-24 15:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][850/1251] eta 0:14:45 lr 0.000183 time 2.3192 (2.2087) loss 3.8607 (3.2404) grad_norm 2.0060 (2.1640) [2022-01-24 15:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][860/1251] eta 0:14:23 lr 0.000183 time 2.1522 (2.2086) loss 3.6061 (3.2407) grad_norm 2.1825 (2.1642) [2022-01-24 15:20:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][870/1251] eta 0:14:01 lr 0.000183 time 2.2139 (2.2074) loss 3.4191 (3.2395) grad_norm 1.9255 (2.1631) [2022-01-24 15:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][880/1251] eta 0:13:39 lr 0.000183 time 2.5517 (2.2091) loss 3.4825 (3.2378) grad_norm 1.9257 (2.1630) [2022-01-24 15:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][890/1251] eta 0:13:18 lr 0.000183 time 2.1622 (2.2106) loss 3.6246 (3.2392) grad_norm 
2.1808 (2.1612) [2022-01-24 15:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][900/1251] eta 0:12:55 lr 0.000183 time 1.8540 (2.2093) loss 3.5364 (3.2393) grad_norm 1.8246 (2.1597) [2022-01-24 15:22:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][910/1251] eta 0:12:33 lr 0.000183 time 3.4161 (2.2095) loss 3.2069 (3.2392) grad_norm 2.0011 (2.1592) [2022-01-24 15:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][920/1251] eta 0:12:11 lr 0.000183 time 2.6612 (2.2094) loss 3.3502 (3.2360) grad_norm 1.8264 (2.1569) [2022-01-24 15:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][930/1251] eta 0:11:49 lr 0.000183 time 2.1507 (2.2091) loss 2.7403 (3.2351) grad_norm 1.9580 (2.1561) [2022-01-24 15:23:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][940/1251] eta 0:11:27 lr 0.000183 time 1.7012 (2.2094) loss 3.7157 (3.2372) grad_norm 1.9955 (2.1566) [2022-01-24 15:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][950/1251] eta 0:11:05 lr 0.000183 time 3.3015 (2.2102) loss 3.0086 (3.2391) grad_norm 1.9487 (2.1562) [2022-01-24 15:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][960/1251] eta 0:10:42 lr 0.000182 time 1.9212 (2.2084) loss 3.5109 (3.2387) grad_norm 1.9388 (2.1559) [2022-01-24 15:24:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][970/1251] eta 0:10:20 lr 0.000182 time 1.8916 (2.2066) loss 3.6300 (3.2398) grad_norm 2.2540 (2.1560) [2022-01-24 15:24:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][980/1251] eta 0:09:58 lr 0.000182 time 2.1773 (2.2070) loss 2.9355 (3.2381) grad_norm 1.9975 (2.1566) [2022-01-24 15:25:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][990/1251] eta 0:09:36 lr 0.000182 time 2.8911 (2.2077) loss 2.6968 (3.2354) grad_norm 2.3922 (2.1568) [2022-01-24 15:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[217/300][1000/1251] eta 0:09:14 lr 0.000182 time 1.8666 (2.2074) loss 3.6394 (3.2359) grad_norm 1.9796 (2.1565) [2022-01-24 15:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1010/1251] eta 0:08:51 lr 0.000182 time 2.0185 (2.2060) loss 3.0443 (3.2362) grad_norm 1.9526 (2.1561) [2022-01-24 15:26:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1020/1251] eta 0:08:29 lr 0.000182 time 1.9997 (2.2066) loss 3.1657 (3.2364) grad_norm 2.2510 (2.1573) [2022-01-24 15:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1030/1251] eta 0:08:08 lr 0.000182 time 3.0694 (2.2082) loss 3.7604 (3.2355) grad_norm 2.0341 (2.1572) [2022-01-24 15:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1040/1251] eta 0:07:45 lr 0.000182 time 1.8436 (2.2081) loss 2.0121 (3.2333) grad_norm 2.3862 (2.1571) [2022-01-24 15:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1050/1251] eta 0:07:23 lr 0.000182 time 1.7682 (2.2062) loss 3.0895 (3.2320) grad_norm 2.2650 (2.1569) [2022-01-24 15:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1060/1251] eta 0:07:01 lr 0.000182 time 1.7357 (2.2050) loss 3.7628 (3.2298) grad_norm 2.2405 (2.1566) [2022-01-24 15:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1070/1251] eta 0:06:39 lr 0.000182 time 3.1215 (2.2049) loss 3.2965 (3.2259) grad_norm 2.2402 (2.1557) [2022-01-24 15:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1080/1251] eta 0:06:17 lr 0.000182 time 2.2305 (2.2051) loss 2.0675 (3.2247) grad_norm 1.9263 (2.1551) [2022-01-24 15:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1090/1251] eta 0:05:54 lr 0.000182 time 1.8332 (2.2034) loss 2.6741 (3.2240) grad_norm 2.2093 (2.1548) [2022-01-24 15:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1100/1251] eta 0:05:32 lr 0.000182 time 2.1018 (2.2029) loss 3.1731 (3.2232) 
grad_norm 2.0833 (2.1547) [2022-01-24 15:29:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1110/1251] eta 0:05:10 lr 0.000182 time 1.8884 (2.2029) loss 2.7197 (3.2200) grad_norm 2.1634 (2.1543) [2022-01-24 15:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1120/1251] eta 0:04:48 lr 0.000182 time 2.1783 (2.2018) loss 3.6733 (3.2197) grad_norm 2.1039 (2.1541) [2022-01-24 15:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1130/1251] eta 0:04:26 lr 0.000182 time 3.0607 (2.2018) loss 2.3584 (3.2220) grad_norm 1.9053 (2.1534) [2022-01-24 15:30:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1140/1251] eta 0:04:04 lr 0.000182 time 2.0060 (2.2023) loss 2.4953 (3.2221) grad_norm 2.1725 (2.1535) [2022-01-24 15:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1150/1251] eta 0:03:42 lr 0.000182 time 1.8407 (2.2035) loss 3.4574 (3.2237) grad_norm 2.2755 (2.1524) [2022-01-24 15:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1160/1251] eta 0:03:20 lr 0.000182 time 2.1812 (2.2041) loss 3.6650 (3.2229) grad_norm 1.9747 (2.1534) [2022-01-24 15:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1170/1251] eta 0:02:58 lr 0.000182 time 3.5137 (2.2059) loss 2.6802 (3.2231) grad_norm 2.2768 (2.1541) [2022-01-24 15:32:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1180/1251] eta 0:02:36 lr 0.000182 time 2.1694 (2.2058) loss 3.1369 (3.2236) grad_norm 2.1476 (2.1542) [2022-01-24 15:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1190/1251] eta 0:02:14 lr 0.000182 time 1.5394 (2.2045) loss 3.4894 (3.2238) grad_norm 1.9697 (2.1541) [2022-01-24 15:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1200/1251] eta 0:01:52 lr 0.000182 time 2.0835 (2.2023) loss 3.2487 (3.2228) grad_norm 1.9790 (2.1534) [2022-01-24 15:33:03 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [217/300][1210/1251] eta 0:01:30 lr 0.000182 time 2.0684 (2.2011) loss 2.8881 (3.2231) grad_norm 1.8208 (2.1522) [2022-01-24 15:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1220/1251] eta 0:01:08 lr 0.000182 time 2.2480 (2.2003) loss 3.8166 (3.2238) grad_norm 2.0612 (2.1510) [2022-01-24 15:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1230/1251] eta 0:00:46 lr 0.000182 time 2.1843 (2.2001) loss 3.5641 (3.2259) grad_norm 2.1420 (2.1500) [2022-01-24 15:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1240/1251] eta 0:00:24 lr 0.000182 time 1.3270 (2.1989) loss 2.1355 (3.2263) grad_norm 2.2813 (2.1497) [2022-01-24 15:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [217/300][1250/1251] eta 0:00:02 lr 0.000182 time 1.1731 (2.1940) loss 3.3313 (3.2277) grad_norm 2.0179 (2.1496) [2022-01-24 15:34:23 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 217 training takes 0:45:45 [2022-01-24 15:34:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.844 (18.844) Loss 0.8222 (0.8222) Acc@1 80.957 (80.957) Acc@5 94.629 (94.629) [2022-01-24 15:34:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.647 (3.318) Loss 0.8501 (0.8859) Acc@1 78.418 (78.702) Acc@5 94.922 (94.256) [2022-01-24 15:35:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.545 (2.723) Loss 0.9020 (0.8722) Acc@1 78.711 (79.078) Acc@5 94.824 (94.587) [2022-01-24 15:35:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.584 (2.379) Loss 0.8246 (0.8712) Acc@1 80.762 (79.083) Acc@5 94.434 (94.585) [2022-01-24 15:35:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.707 (2.249) Loss 0.8570 (0.8757) Acc@1 80.664 (79.054) Acc@5 95.020 (94.600) [2022-01-24 15:36:02 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.000 Acc@5 94.666 [2022-01-24 15:36:02 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 79.0% [2022-01-24 15:36:02 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.12% [2022-01-24 15:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][0/1251] eta 7:28:49 lr 0.000182 time 21.5261 (21.5261) loss 3.4507 (3.4507) grad_norm 2.2672 (2.2672) [2022-01-24 15:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][10/1251] eta 1:22:47 lr 0.000182 time 1.5424 (4.0027) loss 2.2346 (3.2469) grad_norm 1.8180 (2.1526) [2022-01-24 15:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][20/1251] eta 1:03:29 lr 0.000181 time 1.6986 (3.0947) loss 3.1994 (3.3582) grad_norm 2.1580 (2.1421) [2022-01-24 15:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][30/1251] eta 0:56:01 lr 0.000181 time 1.9765 (2.7533) loss 2.5522 (3.2700) grad_norm 2.0921 (2.1305) [2022-01-24 15:37:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][40/1251] eta 0:53:24 lr 0.000181 time 3.6880 (2.6459) loss 3.8042 (3.3548) grad_norm 2.0469 (2.1736) [2022-01-24 15:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][50/1251] eta 0:51:13 lr 0.000181 time 1.9229 (2.5591) loss 2.7307 (3.3079) grad_norm 2.1248 (2.1920) [2022-01-24 15:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][60/1251] eta 0:48:42 lr 0.000181 time 1.5455 (2.4542) loss 3.0527 (3.2626) grad_norm 1.8313 (2.1743) [2022-01-24 15:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][70/1251] eta 0:47:37 lr 0.000181 time 2.1328 (2.4198) loss 3.7892 (3.2705) grad_norm 1.8352 (2.1695) [2022-01-24 15:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][80/1251] eta 0:46:46 lr 0.000181 time 2.8719 (2.3970) loss 3.8016 (3.2741) grad_norm 2.7218 (2.1628) [2022-01-24 15:39:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][90/1251] eta 0:45:52 lr 0.000181 time 
2.0397 (2.3709) loss 2.3105 (3.2670) grad_norm 2.4156 (2.1653) [2022-01-24 15:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][100/1251] eta 0:45:00 lr 0.000181 time 1.5183 (2.3459) loss 2.2943 (3.2521) grad_norm 2.0551 (2.1550) [2022-01-24 15:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][110/1251] eta 0:44:35 lr 0.000181 time 2.6656 (2.3452) loss 4.0308 (3.2558) grad_norm 1.8582 (2.1482) [2022-01-24 15:40:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][120/1251] eta 0:44:08 lr 0.000181 time 2.8149 (2.3418) loss 3.4459 (3.2569) grad_norm 2.2467 (2.1538) [2022-01-24 15:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][130/1251] eta 0:43:47 lr 0.000181 time 3.3838 (2.3436) loss 3.2723 (3.2520) grad_norm 1.8878 (2.1451) [2022-01-24 15:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][140/1251] eta 0:43:05 lr 0.000181 time 1.6191 (2.3270) loss 3.2459 (3.2477) grad_norm 2.1280 (2.1389) [2022-01-24 15:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][150/1251] eta 0:42:41 lr 0.000181 time 3.6074 (2.3264) loss 3.5134 (3.2352) grad_norm 1.8650 (2.1401) [2022-01-24 15:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][160/1251] eta 0:42:07 lr 0.000181 time 1.8880 (2.3167) loss 2.5337 (3.2441) grad_norm 1.8372 (2.1396) [2022-01-24 15:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][170/1251] eta 0:41:51 lr 0.000181 time 2.5539 (2.3233) loss 3.0492 (3.2423) grad_norm 1.9564 (2.1315) [2022-01-24 15:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][180/1251] eta 0:41:12 lr 0.000181 time 1.6420 (2.3086) loss 3.7864 (3.2429) grad_norm 2.3859 (2.1361) [2022-01-24 15:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][190/1251] eta 0:40:39 lr 0.000181 time 3.1528 (2.2996) loss 3.6760 (3.2470) grad_norm 2.1990 (2.1364) [2022-01-24 15:43:42 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][200/1251] eta 0:40:04 lr 0.000181 time 1.9741 (2.2880) loss 3.2972 (3.2659) grad_norm 2.1493 (2.1388) [2022-01-24 15:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][210/1251] eta 0:39:46 lr 0.000181 time 2.1691 (2.2926) loss 3.7157 (3.2722) grad_norm 2.2302 (2.1443) [2022-01-24 15:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][220/1251] eta 0:39:21 lr 0.000181 time 2.1995 (2.2908) loss 3.5373 (3.2732) grad_norm 2.0208 (2.1467) [2022-01-24 15:44:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][230/1251] eta 0:39:03 lr 0.000181 time 2.5485 (2.2951) loss 3.2715 (3.2635) grad_norm 2.3352 (2.1482) [2022-01-24 15:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][240/1251] eta 0:38:35 lr 0.000181 time 2.4168 (2.2906) loss 3.6982 (3.2690) grad_norm 1.9492 (2.1429) [2022-01-24 15:45:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][250/1251] eta 0:38:01 lr 0.000181 time 1.8760 (2.2792) loss 3.7904 (3.2706) grad_norm 2.1458 (2.1400) [2022-01-24 15:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][260/1251] eta 0:37:30 lr 0.000181 time 2.2468 (2.2710) loss 2.9264 (3.2658) grad_norm 2.0609 (2.1366) [2022-01-24 15:46:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][270/1251] eta 0:37:01 lr 0.000181 time 2.1945 (2.2647) loss 2.6117 (3.2571) grad_norm 1.8513 (2.1376) [2022-01-24 15:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][280/1251] eta 0:36:35 lr 0.000181 time 1.9215 (2.2614) loss 2.5030 (3.2517) grad_norm 1.9716 (2.1338) [2022-01-24 15:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][290/1251] eta 0:36:15 lr 0.000181 time 2.2211 (2.2636) loss 3.2702 (3.2376) grad_norm 2.1205 (2.1316) [2022-01-24 15:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][300/1251] eta 0:35:52 lr 
0.000181 time 1.8751 (2.2631) loss 3.5991 (3.2432) grad_norm 1.9162 (2.1270) [2022-01-24 15:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][310/1251] eta 0:35:33 lr 0.000181 time 3.5365 (2.2670) loss 2.3667 (3.2409) grad_norm 2.2878 (2.1262) [2022-01-24 15:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][320/1251] eta 0:35:05 lr 0.000181 time 1.8984 (2.2612) loss 3.8140 (3.2387) grad_norm 2.6636 (2.1269) [2022-01-24 15:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][330/1251] eta 0:34:38 lr 0.000181 time 1.9470 (2.2564) loss 3.3189 (3.2418) grad_norm 2.0391 (2.1263) [2022-01-24 15:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][340/1251] eta 0:34:07 lr 0.000180 time 1.8576 (2.2481) loss 4.1457 (3.2464) grad_norm 2.1356 (2.1272) [2022-01-24 15:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][350/1251] eta 0:33:42 lr 0.000180 time 2.2331 (2.2449) loss 2.9955 (3.2488) grad_norm 2.2641 (2.1257) [2022-01-24 15:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][360/1251] eta 0:33:17 lr 0.000180 time 2.1893 (2.2419) loss 2.7884 (3.2520) grad_norm 1.9948 (2.1247) [2022-01-24 15:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][370/1251] eta 0:32:53 lr 0.000180 time 2.8994 (2.2405) loss 3.6328 (3.2529) grad_norm 2.2320 (2.1257) [2022-01-24 15:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][380/1251] eta 0:32:28 lr 0.000180 time 1.9532 (2.2374) loss 3.6359 (3.2463) grad_norm 2.2971 (2.1278) [2022-01-24 15:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][390/1251] eta 0:32:08 lr 0.000180 time 2.9277 (2.2399) loss 2.6218 (3.2432) grad_norm 1.9069 (2.1326) [2022-01-24 15:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][400/1251] eta 0:31:49 lr 0.000180 time 2.1418 (2.2434) loss 3.4626 (3.2417) grad_norm 1.8876 (2.1359) [2022-01-24 15:51:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][410/1251] eta 0:31:30 lr 0.000180 time 3.4559 (2.2476) loss 3.4383 (3.2381) grad_norm 2.0610 (2.1395) [2022-01-24 15:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][420/1251] eta 0:31:06 lr 0.000180 time 2.4307 (2.2467) loss 2.6689 (3.2429) grad_norm 2.0198 (2.1428) [2022-01-24 15:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][430/1251] eta 0:30:40 lr 0.000180 time 1.8404 (2.2423) loss 3.4282 (3.2421) grad_norm 2.0724 (2.1425) [2022-01-24 15:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][440/1251] eta 0:30:12 lr 0.000180 time 1.9423 (2.2353) loss 3.6117 (3.2424) grad_norm 2.1884 (2.1413) [2022-01-24 15:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][450/1251] eta 0:29:46 lr 0.000180 time 1.8916 (2.2306) loss 3.5935 (3.2431) grad_norm 2.0122 (2.1379) [2022-01-24 15:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][460/1251] eta 0:29:21 lr 0.000180 time 1.8933 (2.2272) loss 3.6048 (3.2459) grad_norm 1.9039 (2.1366) [2022-01-24 15:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][470/1251] eta 0:28:58 lr 0.000180 time 2.2584 (2.2256) loss 3.2980 (3.2443) grad_norm 1.9714 (2.1343) [2022-01-24 15:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][480/1251] eta 0:28:38 lr 0.000180 time 2.5216 (2.2283) loss 3.2142 (3.2441) grad_norm 2.1655 (2.1342) [2022-01-24 15:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][490/1251] eta 0:28:15 lr 0.000180 time 1.9313 (2.2280) loss 2.9409 (3.2422) grad_norm 1.9777 (2.1339) [2022-01-24 15:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][500/1251] eta 0:27:52 lr 0.000180 time 1.9606 (2.2273) loss 3.0678 (3.2427) grad_norm 1.8220 (2.1354) [2022-01-24 15:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][510/1251] eta 0:27:30 lr 
0.000180 time 1.9620 (2.2280) loss 3.7077 (3.2434) grad_norm 1.8822 (2.1336) [2022-01-24 15:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][520/1251] eta 0:27:11 lr 0.000180 time 3.0971 (2.2313) loss 2.9843 (3.2423) grad_norm 2.1404 (2.1336) [2022-01-24 15:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][530/1251] eta 0:26:46 lr 0.000180 time 1.8563 (2.2277) loss 3.8476 (3.2434) grad_norm 1.7831 (2.1348) [2022-01-24 15:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][540/1251] eta 0:26:20 lr 0.000180 time 1.8543 (2.2235) loss 3.7109 (3.2471) grad_norm 2.5373 (2.1399) [2022-01-24 15:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][550/1251] eta 0:25:58 lr 0.000180 time 2.3023 (2.2236) loss 2.9482 (3.2471) grad_norm 2.4941 (2.1409) [2022-01-24 15:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][560/1251] eta 0:25:38 lr 0.000180 time 2.8738 (2.2259) loss 3.5815 (3.2479) grad_norm 1.8420 (2.1425) [2022-01-24 15:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][570/1251] eta 0:25:13 lr 0.000180 time 2.2255 (2.2232) loss 2.3896 (3.2404) grad_norm 2.0468 (2.1413) [2022-01-24 15:57:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][580/1251] eta 0:24:50 lr 0.000180 time 1.8617 (2.2217) loss 3.6199 (3.2443) grad_norm 1.9353 (2.1407) [2022-01-24 15:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][590/1251] eta 0:24:28 lr 0.000180 time 2.2356 (2.2213) loss 3.7076 (3.2466) grad_norm 2.1350 (2.1409) [2022-01-24 15:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][600/1251] eta 0:24:06 lr 0.000180 time 3.0914 (2.2224) loss 3.9208 (3.2481) grad_norm 1.9613 (2.1395) [2022-01-24 15:58:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][610/1251] eta 0:23:44 lr 0.000180 time 1.9251 (2.2220) loss 3.3262 (3.2459) grad_norm 2.6476 (2.1409) [2022-01-24 15:59:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][620/1251] eta 0:23:21 lr 0.000180 time 1.8349 (2.2211) loss 3.3782 (3.2474) grad_norm 2.3085 (2.1401) [2022-01-24 15:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][630/1251] eta 0:22:58 lr 0.000180 time 2.5283 (2.2205) loss 2.3430 (3.2494) grad_norm 1.9981 (2.1391) [2022-01-24 15:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][640/1251] eta 0:22:36 lr 0.000180 time 2.4904 (2.2204) loss 3.4659 (3.2447) grad_norm 2.1612 (2.1388) [2022-01-24 16:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][650/1251] eta 0:22:12 lr 0.000180 time 2.1793 (2.2178) loss 3.6543 (3.2480) grad_norm 2.2501 (2.1406) [2022-01-24 16:00:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][660/1251] eta 0:21:50 lr 0.000179 time 2.2888 (2.2177) loss 3.1297 (3.2483) grad_norm 1.8523 (2.1408) [2022-01-24 16:00:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][670/1251] eta 0:21:27 lr 0.000179 time 2.4926 (2.2167) loss 2.2418 (3.2474) grad_norm 2.2097 (2.1393) [2022-01-24 16:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][680/1251] eta 0:21:04 lr 0.000179 time 2.1165 (2.2141) loss 3.2522 (3.2488) grad_norm 2.1067 (2.1381) [2022-01-24 16:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][690/1251] eta 0:20:41 lr 0.000179 time 1.9119 (2.2127) loss 3.5650 (3.2480) grad_norm 2.1651 (2.1385) [2022-01-24 16:01:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][700/1251] eta 0:20:19 lr 0.000179 time 2.4347 (2.2131) loss 3.3260 (3.2506) grad_norm 2.0355 (2.1386) [2022-01-24 16:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][710/1251] eta 0:19:57 lr 0.000179 time 2.4526 (2.2136) loss 2.4198 (3.2479) grad_norm 1.9015 (2.1373) [2022-01-24 16:02:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][720/1251] eta 0:19:35 lr 
0.000179 time 2.3175 (2.2140) loss 3.3116 (3.2453) grad_norm 1.9340 (2.1366) [2022-01-24 16:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][730/1251] eta 0:19:13 lr 0.000179 time 2.0814 (2.2131) loss 3.8571 (3.2457) grad_norm 2.0412 (2.1388) [2022-01-24 16:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][740/1251] eta 0:18:50 lr 0.000179 time 1.8452 (2.2126) loss 2.8832 (3.2500) grad_norm 2.1777 (2.1394) [2022-01-24 16:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][750/1251] eta 0:18:28 lr 0.000179 time 2.3671 (2.2116) loss 2.3865 (3.2497) grad_norm 2.1134 (2.1391) [2022-01-24 16:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][760/1251] eta 0:18:05 lr 0.000179 time 1.6192 (2.2113) loss 3.2838 (3.2523) grad_norm 2.1038 (2.1401) [2022-01-24 16:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][770/1251] eta 0:17:43 lr 0.000179 time 2.5913 (2.2108) loss 3.4607 (3.2501) grad_norm 2.3395 (2.1398) [2022-01-24 16:04:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][780/1251] eta 0:17:20 lr 0.000179 time 1.8239 (2.2099) loss 3.7552 (3.2509) grad_norm 1.9700 (2.1432) [2022-01-24 16:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][790/1251] eta 0:16:58 lr 0.000179 time 2.2301 (2.2094) loss 2.4590 (3.2500) grad_norm 2.0878 (2.1430) [2022-01-24 16:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][800/1251] eta 0:16:36 lr 0.000179 time 1.7330 (2.2087) loss 3.3698 (3.2518) grad_norm 1.9292 (2.1432) [2022-01-24 16:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][810/1251] eta 0:16:14 lr 0.000179 time 2.6268 (2.2089) loss 3.0920 (3.2523) grad_norm 2.2253 (2.1422) [2022-01-24 16:06:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][820/1251] eta 0:15:51 lr 0.000179 time 1.5169 (2.2083) loss 3.7420 (3.2524) grad_norm 1.9918 (2.1420) [2022-01-24 16:06:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][830/1251] eta 0:15:30 lr 0.000179 time 2.2070 (2.2100) loss 3.3388 (3.2532) grad_norm 2.0985 (2.1424) [2022-01-24 16:07:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][840/1251] eta 0:15:08 lr 0.000179 time 1.7951 (2.2093) loss 3.4959 (3.2546) grad_norm 1.9946 (2.1416) [2022-01-24 16:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][850/1251] eta 0:14:45 lr 0.000179 time 1.9737 (2.2080) loss 3.3316 (3.2569) grad_norm 2.0414 (2.1424) [2022-01-24 16:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][860/1251] eta 0:14:22 lr 0.000179 time 1.7193 (2.2071) loss 3.2177 (3.2540) grad_norm 2.0077 (2.1423) [2022-01-24 16:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][870/1251] eta 0:14:01 lr 0.000179 time 2.8915 (2.2086) loss 3.9106 (3.2512) grad_norm 2.1473 (2.1417) [2022-01-24 16:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][880/1251] eta 0:13:39 lr 0.000179 time 2.4199 (2.2086) loss 3.9361 (3.2512) grad_norm 2.2812 (2.1426) [2022-01-24 16:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][890/1251] eta 0:13:17 lr 0.000179 time 2.3117 (2.2085) loss 2.6134 (3.2526) grad_norm 2.0588 (2.1421) [2022-01-24 16:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][900/1251] eta 0:12:54 lr 0.000179 time 2.0238 (2.2068) loss 3.4624 (3.2534) grad_norm 2.3107 (2.1469) [2022-01-24 16:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][910/1251] eta 0:12:31 lr 0.000179 time 2.0017 (2.2051) loss 3.6019 (3.2534) grad_norm 1.9868 (2.1463) [2022-01-24 16:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][920/1251] eta 0:12:09 lr 0.000179 time 1.8196 (2.2034) loss 3.6656 (3.2546) grad_norm 2.0215 (2.1464) [2022-01-24 16:10:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][930/1251] eta 0:11:47 lr 
0.000179 time 1.6881 (2.2027) loss 3.7132 (3.2546) grad_norm 1.8569 (2.1459) [2022-01-24 16:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][940/1251] eta 0:11:25 lr 0.000179 time 2.3636 (2.2036) loss 3.4191 (3.2539) grad_norm 2.4570 (2.1467) [2022-01-24 16:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][950/1251] eta 0:11:03 lr 0.000179 time 2.2952 (2.2048) loss 2.2094 (3.2509) grad_norm 1.9145 (2.1470) [2022-01-24 16:11:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][960/1251] eta 0:10:41 lr 0.000179 time 2.2046 (2.2051) loss 3.3858 (3.2486) grad_norm 1.9684 (2.1471) [2022-01-24 16:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][970/1251] eta 0:10:19 lr 0.000179 time 1.9022 (2.2060) loss 2.9684 (3.2481) grad_norm 1.9527 (2.1471) [2022-01-24 16:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][980/1251] eta 0:09:58 lr 0.000178 time 2.3625 (2.2068) loss 3.2005 (3.2498) grad_norm 1.9281 (2.1470) [2022-01-24 16:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][990/1251] eta 0:09:35 lr 0.000178 time 1.7214 (2.2063) loss 2.8465 (3.2499) grad_norm 2.1646 (2.1480) [2022-01-24 16:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1000/1251] eta 0:09:13 lr 0.000178 time 1.8850 (2.2047) loss 3.1346 (3.2483) grad_norm 2.1291 (2.1493) [2022-01-24 16:13:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1010/1251] eta 0:08:50 lr 0.000178 time 1.8685 (2.2032) loss 3.0647 (3.2478) grad_norm 1.9926 (2.1493) [2022-01-24 16:13:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1020/1251] eta 0:08:28 lr 0.000178 time 2.6805 (2.2031) loss 3.0398 (3.2467) grad_norm 1.8225 (2.1490) [2022-01-24 16:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1030/1251] eta 0:08:06 lr 0.000178 time 2.1382 (2.2036) loss 3.5980 (3.2477) grad_norm 2.3494 (2.1488) [2022-01-24 
16:14:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1040/1251] eta 0:07:44 lr 0.000178 time 1.8318 (2.2026) loss 3.5793 (3.2439) grad_norm 2.2406 (2.1487) [2022-01-24 16:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1050/1251] eta 0:07:22 lr 0.000178 time 1.8647 (2.2016) loss 3.4439 (3.2433) grad_norm 2.2073 (2.1495) [2022-01-24 16:14:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1060/1251] eta 0:07:00 lr 0.000178 time 2.2385 (2.2008) loss 3.3885 (3.2449) grad_norm 2.2327 (2.1499) [2022-01-24 16:15:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1070/1251] eta 0:06:38 lr 0.000178 time 2.5342 (2.2007) loss 3.7889 (3.2464) grad_norm 2.0181 (2.1511) [2022-01-24 16:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1080/1251] eta 0:06:16 lr 0.000178 time 1.8213 (2.2005) loss 3.0802 (3.2483) grad_norm 2.1269 (2.1514) [2022-01-24 16:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1090/1251] eta 0:05:54 lr 0.000178 time 2.8447 (2.2008) loss 3.7605 (3.2490) grad_norm 2.2728 (2.1521) [2022-01-24 16:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1100/1251] eta 0:05:32 lr 0.000178 time 2.0026 (2.2006) loss 2.4624 (3.2475) grad_norm 2.0817 (2.1515) [2022-01-24 16:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1110/1251] eta 0:05:10 lr 0.000178 time 2.1910 (2.2005) loss 3.2183 (3.2482) grad_norm 2.1992 (2.1517) [2022-01-24 16:17:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1120/1251] eta 0:04:48 lr 0.000178 time 1.8521 (2.2007) loss 3.5156 (3.2464) grad_norm 2.2832 (2.1523) [2022-01-24 16:17:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1130/1251] eta 0:04:26 lr 0.000178 time 2.3432 (2.2011) loss 3.4407 (3.2470) grad_norm 2.3314 (2.1522) [2022-01-24 16:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1140/1251] 
eta 0:04:04 lr 0.000178 time 2.1581 (2.2011) loss 3.6411 (3.2474) grad_norm 2.3699 (2.1522) [2022-01-24 16:18:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1150/1251] eta 0:03:42 lr 0.000178 time 2.3198 (2.1994) loss 3.4404 (3.2471) grad_norm 2.3105 (2.1522) [2022-01-24 16:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1160/1251] eta 0:03:20 lr 0.000178 time 2.1875 (2.1998) loss 2.1382 (3.2451) grad_norm 2.1844 (2.1521) [2022-01-24 16:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1170/1251] eta 0:02:58 lr 0.000178 time 1.9574 (2.1995) loss 3.6780 (3.2473) grad_norm 2.0195 (2.1515) [2022-01-24 16:19:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1180/1251] eta 0:02:36 lr 0.000178 time 1.8619 (2.2002) loss 2.4888 (3.2464) grad_norm 2.1859 (2.1531) [2022-01-24 16:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1190/1251] eta 0:02:14 lr 0.000178 time 2.0138 (2.2002) loss 3.7192 (3.2469) grad_norm 2.1881 (2.1537) [2022-01-24 16:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1200/1251] eta 0:01:52 lr 0.000178 time 1.5711 (2.1995) loss 3.5438 (3.2484) grad_norm 1.9930 (2.1550) [2022-01-24 16:20:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1210/1251] eta 0:01:30 lr 0.000178 time 1.6985 (2.1980) loss 2.8947 (3.2474) grad_norm 2.1703 (2.1554) [2022-01-24 16:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1220/1251] eta 0:01:08 lr 0.000178 time 2.0025 (2.1973) loss 3.7321 (3.2447) grad_norm 2.9766 (2.1558) [2022-01-24 16:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1230/1251] eta 0:00:46 lr 0.000178 time 1.9363 (2.1967) loss 2.8941 (3.2446) grad_norm 3.2184 (2.1564) [2022-01-24 16:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1240/1251] eta 0:00:24 lr 0.000178 time 1.8968 (2.1973) loss 3.5824 (3.2459) grad_norm 2.1372 
(2.1587) [2022-01-24 16:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [218/300][1250/1251] eta 0:00:02 lr 0.000178 time 1.1734 (2.1927) loss 3.0007 (3.2473) grad_norm 2.1861 (2.1580) [2022-01-24 16:21:46 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 218 training takes 0:45:43 [2022-01-24 16:22:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.390 (18.390) Loss 0.8453 (0.8453) Acc@1 78.711 (78.711) Acc@5 95.605 (95.605) [2022-01-24 16:22:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.318 (3.499) Loss 0.8664 (0.8651) Acc@1 80.078 (79.812) Acc@5 94.727 (94.949) [2022-01-24 16:22:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.264 (2.596) Loss 0.8445 (0.8683) Acc@1 79.785 (79.655) Acc@5 95.508 (94.899) [2022-01-24 16:22:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.315 (2.345) Loss 0.9344 (0.8753) Acc@1 79.199 (79.435) Acc@5 94.629 (94.783) [2022-01-24 16:23:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.953 (2.209) Loss 0.9008 (0.8761) Acc@1 78.516 (79.333) Acc@5 94.531 (94.808) [2022-01-24 16:23:23 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.330 Acc@5 94.786 [2022-01-24 16:23:23 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.3% [2022-01-24 16:23:23 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.33% [2022-01-24 16:23:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][0/1251] eta 7:32:40 lr 0.000178 time 21.7114 (21.7114) loss 3.6318 (3.6318) grad_norm 2.0835 (2.0835) [2022-01-24 16:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][10/1251] eta 1:23:42 lr 0.000178 time 2.6467 (4.0474) loss 2.5784 (3.2955) grad_norm 2.1430 (2.1083) [2022-01-24 16:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][20/1251] eta 1:05:03 lr 0.000178 time 2.1241 (3.1707) loss 
2.0808 (3.2125) grad_norm 2.0875 (2.1215) [2022-01-24 16:24:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][30/1251] eta 0:59:32 lr 0.000178 time 1.6637 (2.9256) loss 3.7196 (3.2478) grad_norm 2.0958 (2.1319) [2022-01-24 16:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][40/1251] eta 0:56:11 lr 0.000178 time 2.9872 (2.7841) loss 3.7137 (3.2603) grad_norm 1.9617 (2.1302) [2022-01-24 16:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][50/1251] eta 0:53:46 lr 0.000177 time 2.2439 (2.6865) loss 3.3097 (3.2786) grad_norm 2.2318 (2.1111) [2022-01-24 16:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][60/1251] eta 0:51:58 lr 0.000177 time 2.7128 (2.6184) loss 2.2078 (3.2492) grad_norm 2.1778 (2.1073) [2022-01-24 16:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][70/1251] eta 0:50:04 lr 0.000177 time 1.6825 (2.5441) loss 3.6491 (3.2494) grad_norm 3.0568 (2.1533) [2022-01-24 16:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][80/1251] eta 0:48:42 lr 0.000177 time 2.4719 (2.4961) loss 2.2755 (3.2395) grad_norm 1.9595 (2.1478) [2022-01-24 16:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][90/1251] eta 0:47:39 lr 0.000177 time 3.1386 (2.4633) loss 3.5432 (3.2468) grad_norm 2.1517 (2.2198) [2022-01-24 16:27:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][100/1251] eta 0:46:33 lr 0.000177 time 2.4112 (2.4271) loss 2.6031 (3.2354) grad_norm 2.1484 (2.2274) [2022-01-24 16:27:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][110/1251] eta 0:45:27 lr 0.000177 time 1.8852 (2.3902) loss 3.5914 (3.2602) grad_norm 2.1832 (2.2236) [2022-01-24 16:28:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][120/1251] eta 0:44:33 lr 0.000177 time 1.8807 (2.3635) loss 3.4463 (3.2546) grad_norm 2.5525 (2.2267) [2022-01-24 16:28:31 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [219/300][130/1251] eta 0:43:56 lr 0.000177 time 2.9256 (2.3522) loss 3.4210 (3.2525) grad_norm 1.9485 (2.2197) [2022-01-24 16:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][140/1251] eta 0:43:12 lr 0.000177 time 2.1972 (2.3339) loss 3.7753 (3.2348) grad_norm 2.4783 (2.2168) [2022-01-24 16:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][150/1251] eta 0:42:33 lr 0.000177 time 2.2747 (2.3189) loss 2.4581 (3.2525) grad_norm 2.1872 (2.2147) [2022-01-24 16:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][160/1251] eta 0:42:00 lr 0.000177 time 2.2592 (2.3102) loss 3.7724 (3.2656) grad_norm 1.9959 (2.2128) [2022-01-24 16:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][170/1251] eta 0:41:26 lr 0.000177 time 2.8027 (2.3002) loss 3.6406 (3.2666) grad_norm 2.4400 (2.2110) [2022-01-24 16:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][180/1251] eta 0:40:52 lr 0.000177 time 1.8966 (2.2897) loss 2.7682 (3.2743) grad_norm 2.1996 (2.2043) [2022-01-24 16:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][190/1251] eta 0:40:20 lr 0.000177 time 1.9753 (2.2817) loss 3.2161 (3.2638) grad_norm 2.1567 (2.2049) [2022-01-24 16:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][200/1251] eta 0:40:02 lr 0.000177 time 1.5532 (2.2856) loss 3.6557 (3.2539) grad_norm 2.3488 (2.2048) [2022-01-24 16:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][210/1251] eta 0:39:29 lr 0.000177 time 2.1370 (2.2764) loss 3.6166 (3.2586) grad_norm 1.9054 (2.2005) [2022-01-24 16:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][220/1251] eta 0:39:02 lr 0.000177 time 2.2976 (2.2720) loss 3.3091 (3.2608) grad_norm 2.3214 (2.1966) [2022-01-24 16:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][230/1251] eta 0:38:36 lr 0.000177 time 2.0037 (2.2687) loss 3.4699 
(3.2515) grad_norm 2.6061 (2.1975) [2022-01-24 16:32:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][240/1251] eta 0:38:14 lr 0.000177 time 2.5523 (2.2699) loss 2.2230 (3.2548) grad_norm 2.3918 (2.2004) [2022-01-24 16:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][250/1251] eta 0:37:49 lr 0.000177 time 2.2902 (2.2676) loss 3.7453 (3.2563) grad_norm 2.3789 (2.1972) [2022-01-24 16:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][260/1251] eta 0:37:27 lr 0.000177 time 2.8829 (2.2678) loss 3.1045 (3.2596) grad_norm 2.1554 (2.1976) [2022-01-24 16:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][270/1251] eta 0:36:59 lr 0.000177 time 1.5953 (2.2628) loss 3.5318 (3.2535) grad_norm 2.1952 (2.2009) [2022-01-24 16:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][280/1251] eta 0:36:37 lr 0.000177 time 2.1540 (2.2630) loss 3.5554 (3.2550) grad_norm 1.9488 (2.1984) [2022-01-24 16:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][290/1251] eta 0:36:09 lr 0.000177 time 2.2552 (2.2578) loss 3.3908 (3.2524) grad_norm 2.0273 (2.1959) [2022-01-24 16:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][300/1251] eta 0:35:41 lr 0.000177 time 2.6463 (2.2518) loss 2.8582 (3.2515) grad_norm 2.0498 (2.1952) [2022-01-24 16:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][310/1251] eta 0:35:10 lr 0.000177 time 1.7042 (2.2429) loss 3.5567 (3.2477) grad_norm 2.6410 (2.1979) [2022-01-24 16:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][320/1251] eta 0:34:44 lr 0.000177 time 1.8828 (2.2390) loss 3.7059 (3.2487) grad_norm 2.5020 (2.1983) [2022-01-24 16:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][330/1251] eta 0:34:22 lr 0.000177 time 2.2281 (2.2392) loss 2.8616 (3.2414) grad_norm 2.1297 (2.1971) [2022-01-24 16:36:07 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [219/300][340/1251] eta 0:34:00 lr 0.000177 time 2.5715 (2.2400) loss 3.4970 (3.2360) grad_norm 2.0309 (2.1933) [2022-01-24 16:36:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][350/1251] eta 0:33:40 lr 0.000177 time 2.1627 (2.2420) loss 3.7513 (3.2413) grad_norm 2.2759 (2.1952) [2022-01-24 16:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][360/1251] eta 0:33:14 lr 0.000177 time 1.9202 (2.2382) loss 3.6403 (3.2393) grad_norm 2.0829 (2.1929) [2022-01-24 16:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][370/1251] eta 0:32:47 lr 0.000177 time 1.8146 (2.2332) loss 3.4765 (3.2427) grad_norm 2.1541 (2.1937) [2022-01-24 16:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][380/1251] eta 0:32:25 lr 0.000176 time 2.3954 (2.2341) loss 3.5745 (3.2439) grad_norm 2.0081 (2.1920) [2022-01-24 16:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][390/1251] eta 0:32:04 lr 0.000176 time 2.0954 (2.2356) loss 3.2755 (3.2464) grad_norm 2.3632 (2.1917) [2022-01-24 16:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][400/1251] eta 0:31:42 lr 0.000176 time 1.6025 (2.2352) loss 3.6556 (3.2482) grad_norm 2.2245 (2.1924) [2022-01-24 16:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][410/1251] eta 0:31:17 lr 0.000176 time 2.1439 (2.2327) loss 3.1619 (3.2522) grad_norm 3.6598 (2.2051) [2022-01-24 16:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][420/1251] eta 0:30:53 lr 0.000176 time 2.2608 (2.2306) loss 3.5741 (3.2549) grad_norm 2.0863 (2.2063) [2022-01-24 16:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][430/1251] eta 0:30:29 lr 0.000176 time 1.6319 (2.2281) loss 3.3184 (3.2498) grad_norm 1.9738 (2.2059) [2022-01-24 16:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][440/1251] eta 0:30:05 lr 0.000176 time 2.1219 (2.2265) loss 3.3702 
(3.2483) grad_norm 2.6177 (2.2057) [2022-01-24 16:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][450/1251] eta 0:29:39 lr 0.000176 time 2.0739 (2.2214) loss 2.8874 (3.2424) grad_norm 2.0853 (2.2054) [2022-01-24 16:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][460/1251] eta 0:29:18 lr 0.000176 time 2.2715 (2.2234) loss 2.9917 (3.2372) grad_norm 2.1780 (2.2036) [2022-01-24 16:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][470/1251] eta 0:28:55 lr 0.000176 time 2.1887 (2.2225) loss 3.6498 (3.2384) grad_norm 1.9542 (2.2010) [2022-01-24 16:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][480/1251] eta 0:28:33 lr 0.000176 time 1.5916 (2.2230) loss 2.6162 (3.2269) grad_norm 2.1368 (2.2011) [2022-01-24 16:41:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][490/1251] eta 0:28:12 lr 0.000176 time 1.9500 (2.2238) loss 3.7841 (3.2316) grad_norm 2.1807 (2.1991) [2022-01-24 16:41:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][500/1251] eta 0:27:50 lr 0.000176 time 1.9566 (2.2241) loss 2.3284 (3.2340) grad_norm 2.0999 (2.1986) [2022-01-24 16:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][510/1251] eta 0:27:29 lr 0.000176 time 1.9336 (2.2260) loss 2.4380 (3.2334) grad_norm 2.1456 (2.1983) [2022-01-24 16:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][520/1251] eta 0:27:04 lr 0.000176 time 1.6498 (2.2225) loss 2.4284 (3.2346) grad_norm 2.2749 (2.2009) [2022-01-24 16:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][530/1251] eta 0:26:38 lr 0.000176 time 2.1249 (2.2167) loss 3.9242 (3.2339) grad_norm 2.0862 (2.2015) [2022-01-24 16:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][540/1251] eta 0:26:16 lr 0.000176 time 1.8916 (2.2170) loss 3.3337 (3.2308) grad_norm 2.3588 (2.2018) [2022-01-24 16:43:44 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [219/300][550/1251] eta 0:25:53 lr 0.000176 time 1.9525 (2.2162) loss 2.6630 (3.2345) grad_norm 1.9575 (2.2016) [2022-01-24 16:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][560/1251] eta 0:25:31 lr 0.000176 time 2.3325 (2.2159) loss 3.6798 (3.2370) grad_norm 2.1847 (2.2012) [2022-01-24 16:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][570/1251] eta 0:25:07 lr 0.000176 time 2.5042 (2.2142) loss 3.3621 (3.2436) grad_norm 2.4556 (2.2043) [2022-01-24 16:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][580/1251] eta 0:24:45 lr 0.000176 time 1.5067 (2.2137) loss 2.0539 (3.2442) grad_norm 2.0985 (2.2041) [2022-01-24 16:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][590/1251] eta 0:24:23 lr 0.000176 time 2.8467 (2.2135) loss 2.3062 (3.2388) grad_norm 2.1615 (2.2038) [2022-01-24 16:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][600/1251] eta 0:24:02 lr 0.000176 time 2.1457 (2.2151) loss 3.5123 (3.2381) grad_norm 2.1316 (2.2017) [2022-01-24 16:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][610/1251] eta 0:23:39 lr 0.000176 time 2.0516 (2.2140) loss 3.3265 (3.2335) grad_norm 2.3653 (2.2002) [2022-01-24 16:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][620/1251] eta 0:23:17 lr 0.000176 time 1.9103 (2.2142) loss 2.3669 (3.2344) grad_norm 2.1804 (2.1984) [2022-01-24 16:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][630/1251] eta 0:22:54 lr 0.000176 time 1.9111 (2.2127) loss 2.5821 (3.2359) grad_norm 2.0632 (2.1969) [2022-01-24 16:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][640/1251] eta 0:22:30 lr 0.000176 time 1.8148 (2.2100) loss 2.4400 (3.2323) grad_norm 2.0522 (2.1968) [2022-01-24 16:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][650/1251] eta 0:22:07 lr 0.000176 time 1.8050 (2.2082) loss 3.6828 
(3.2314) grad_norm 2.0819 (2.1955) [2022-01-24 16:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][660/1251] eta 0:21:45 lr 0.000176 time 2.0811 (2.2094) loss 3.5086 (3.2336) grad_norm 2.0471 (2.1949) [2022-01-24 16:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][670/1251] eta 0:21:24 lr 0.000176 time 2.1628 (2.2101) loss 3.8296 (3.2365) grad_norm 2.2041 (2.1953) [2022-01-24 16:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][680/1251] eta 0:21:01 lr 0.000176 time 2.0921 (2.2099) loss 3.3754 (3.2373) grad_norm 2.2822 (2.1953) [2022-01-24 16:48:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][690/1251] eta 0:20:40 lr 0.000176 time 2.0421 (2.2112) loss 3.5465 (3.2366) grad_norm 2.3845 (2.1945) [2022-01-24 16:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][700/1251] eta 0:20:17 lr 0.000175 time 2.1999 (2.2094) loss 3.3613 (3.2373) grad_norm 2.3193 (2.1932) [2022-01-24 16:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][710/1251] eta 0:19:54 lr 0.000175 time 1.8311 (2.2075) loss 3.6089 (3.2352) grad_norm 1.8678 (2.1914) [2022-01-24 16:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][720/1251] eta 0:19:32 lr 0.000175 time 1.9280 (2.2075) loss 3.7950 (3.2341) grad_norm 2.2472 (2.1911) [2022-01-24 16:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][730/1251] eta 0:19:08 lr 0.000175 time 1.9841 (2.2053) loss 3.7131 (3.2326) grad_norm 2.1891 (2.1913) [2022-01-24 16:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][740/1251] eta 0:18:46 lr 0.000175 time 2.3542 (2.2054) loss 3.6108 (3.2354) grad_norm 2.0177 (2.1917) [2022-01-24 16:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][750/1251] eta 0:18:25 lr 0.000175 time 2.7777 (2.2070) loss 3.6711 (3.2348) grad_norm 2.3523 (2.1928) [2022-01-24 16:51:23 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [219/300][760/1251] eta 0:18:03 lr 0.000175 time 1.9226 (2.2072) loss 3.5294 (3.2359) grad_norm 2.0780 (2.1922) [2022-01-24 16:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][770/1251] eta 0:17:40 lr 0.000175 time 2.0054 (2.2053) loss 3.7408 (3.2347) grad_norm 2.0014 (2.1913) [2022-01-24 16:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][780/1251] eta 0:17:18 lr 0.000175 time 2.4300 (2.2056) loss 3.2057 (3.2350) grad_norm 2.2630 (2.1916) [2022-01-24 16:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][790/1251] eta 0:16:56 lr 0.000175 time 2.1549 (2.2044) loss 3.5386 (3.2357) grad_norm 2.2379 (2.1913) [2022-01-24 16:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][800/1251] eta 0:16:33 lr 0.000175 time 1.7645 (2.2025) loss 3.1243 (3.2344) grad_norm 2.0357 (2.1909) [2022-01-24 16:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][810/1251] eta 0:16:11 lr 0.000175 time 2.0447 (2.2029) loss 3.6435 (3.2351) grad_norm 2.0839 (2.1903) [2022-01-24 16:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][820/1251] eta 0:15:49 lr 0.000175 time 2.2355 (2.2030) loss 3.6992 (3.2345) grad_norm 2.3264 (2.1889) [2022-01-24 16:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][830/1251] eta 0:15:27 lr 0.000175 time 2.2545 (2.2030) loss 3.4030 (3.2361) grad_norm 1.9315 (2.1897) [2022-01-24 16:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][840/1251] eta 0:15:04 lr 0.000175 time 1.9076 (2.2012) loss 2.0646 (3.2366) grad_norm 2.0268 (2.1892) [2022-01-24 16:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][850/1251] eta 0:14:42 lr 0.000175 time 2.0822 (2.2016) loss 3.9379 (3.2347) grad_norm 2.4266 (2.1878) [2022-01-24 16:54:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][860/1251] eta 0:14:20 lr 0.000175 time 2.2361 (2.2010) loss 3.9630 
(3.2376) grad_norm 2.1847 (2.1873) [2022-01-24 16:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][870/1251] eta 0:13:58 lr 0.000175 time 1.7044 (2.2011) loss 2.7066 (3.2352) grad_norm 2.1189 (2.1865) [2022-01-24 16:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][880/1251] eta 0:13:36 lr 0.000175 time 1.7322 (2.1996) loss 4.0370 (3.2368) grad_norm 2.2706 (2.1867) [2022-01-24 16:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][890/1251] eta 0:13:14 lr 0.000175 time 1.8511 (2.1995) loss 2.3201 (3.2399) grad_norm 2.2490 (2.1878) [2022-01-24 16:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][900/1251] eta 0:12:52 lr 0.000175 time 2.1447 (2.2014) loss 3.3271 (3.2367) grad_norm 1.8105 (2.1877) [2022-01-24 16:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][910/1251] eta 0:12:31 lr 0.000175 time 2.4833 (2.2031) loss 3.8331 (3.2380) grad_norm 2.1581 (2.1884) [2022-01-24 16:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][920/1251] eta 0:12:08 lr 0.000175 time 2.2023 (2.2013) loss 3.5572 (3.2399) grad_norm 2.3332 (2.1890) [2022-01-24 16:57:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][930/1251] eta 0:11:45 lr 0.000175 time 1.9430 (2.1994) loss 3.9046 (3.2406) grad_norm 2.3618 (2.1902) [2022-01-24 16:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][940/1251] eta 0:11:23 lr 0.000175 time 2.4053 (2.1987) loss 3.7562 (3.2401) grad_norm 3.0115 (2.1924) [2022-01-24 16:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][950/1251] eta 0:11:02 lr 0.000175 time 2.2940 (2.1996) loss 3.8909 (3.2422) grad_norm 2.2271 (2.1940) [2022-01-24 16:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][960/1251] eta 0:10:40 lr 0.000175 time 2.0758 (2.2000) loss 2.6476 (3.2361) grad_norm 2.3940 (2.1945) [2022-01-24 16:58:59 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [219/300][970/1251] eta 0:10:18 lr 0.000175 time 2.0662 (2.1995) loss 3.3727 (3.2370) grad_norm 1.9901 (2.1947) [2022-01-24 16:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][980/1251] eta 0:09:55 lr 0.000175 time 2.0866 (2.1986) loss 3.2508 (3.2375) grad_norm 2.2190 (2.1958) [2022-01-24 16:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][990/1251] eta 0:09:33 lr 0.000175 time 2.3186 (2.1987) loss 3.3099 (3.2373) grad_norm 2.0524 (2.1949) [2022-01-24 17:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1000/1251] eta 0:09:11 lr 0.000175 time 1.8336 (2.1983) loss 2.8395 (3.2369) grad_norm 1.8978 (2.1947) [2022-01-24 17:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1010/1251] eta 0:08:49 lr 0.000175 time 1.8520 (2.1980) loss 3.5693 (3.2361) grad_norm 2.1317 (2.1941) [2022-01-24 17:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1020/1251] eta 0:08:27 lr 0.000174 time 2.5537 (2.1982) loss 3.6134 (3.2336) grad_norm 2.2437 (2.1944) [2022-01-24 17:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1030/1251] eta 0:08:05 lr 0.000174 time 1.6316 (2.1985) loss 2.9672 (3.2352) grad_norm 2.2906 (2.1946) [2022-01-24 17:01:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1040/1251] eta 0:07:43 lr 0.000174 time 2.1030 (2.1975) loss 3.6117 (3.2350) grad_norm 2.2967 (2.1946) [2022-01-24 17:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1050/1251] eta 0:07:21 lr 0.000174 time 1.8982 (2.1970) loss 3.4485 (3.2341) grad_norm 2.0801 (2.1930) [2022-01-24 17:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1060/1251] eta 0:06:59 lr 0.000174 time 2.2560 (2.1961) loss 3.6284 (3.2357) grad_norm 2.2501 (2.1933) [2022-01-24 17:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1070/1251] eta 0:06:37 lr 0.000174 time 1.8523 (2.1960) loss 
3.6057 (3.2354) grad_norm 1.9970 (2.1928) [2022-01-24 17:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1080/1251] eta 0:06:15 lr 0.000174 time 2.2586 (2.1955) loss 3.2310 (3.2328) grad_norm 2.0218 (2.1937) [2022-01-24 17:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1090/1251] eta 0:05:53 lr 0.000174 time 2.0962 (2.1943) loss 3.7252 (3.2347) grad_norm 2.0104 (2.1925) [2022-01-24 17:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1100/1251] eta 0:05:31 lr 0.000174 time 2.2227 (2.1943) loss 2.7332 (3.2361) grad_norm 2.1209 (2.1918) [2022-01-24 17:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1110/1251] eta 0:05:09 lr 0.000174 time 2.0726 (2.1938) loss 3.3056 (3.2368) grad_norm 1.9168 (2.1909) [2022-01-24 17:04:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1120/1251] eta 0:04:47 lr 0.000174 time 1.8766 (2.1943) loss 3.3238 (3.2376) grad_norm 1.8850 (2.1909) [2022-01-24 17:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1130/1251] eta 0:04:25 lr 0.000174 time 2.4951 (2.1952) loss 3.4920 (3.2358) grad_norm 2.5837 (2.1930) [2022-01-24 17:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1140/1251] eta 0:04:03 lr 0.000174 time 2.2186 (2.1944) loss 2.6474 (3.2341) grad_norm 1.9652 (2.1927) [2022-01-24 17:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1150/1251] eta 0:03:41 lr 0.000174 time 1.8512 (2.1941) loss 3.2439 (3.2340) grad_norm 2.1224 (2.1917) [2022-01-24 17:05:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1160/1251] eta 0:03:19 lr 0.000174 time 1.6612 (2.1930) loss 3.8198 (3.2341) grad_norm 2.3695 (2.1912) [2022-01-24 17:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1170/1251] eta 0:02:57 lr 0.000174 time 2.2766 (2.1935) loss 2.2702 (3.2328) grad_norm 1.9731 (2.1914) [2022-01-24 17:06:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1180/1251] eta 0:02:35 lr 0.000174 time 2.5108 (2.1933) loss 3.2470 (3.2338) grad_norm 2.2965 (2.1915) [2022-01-24 17:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1190/1251] eta 0:02:13 lr 0.000174 time 2.4790 (2.1936) loss 3.8501 (3.2333) grad_norm 2.5590 (2.1911) [2022-01-24 17:07:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1200/1251] eta 0:01:51 lr 0.000174 time 1.8569 (2.1936) loss 2.6128 (3.2307) grad_norm 2.6175 (2.1908) [2022-01-24 17:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1210/1251] eta 0:01:29 lr 0.000174 time 2.2221 (2.1941) loss 3.4969 (3.2288) grad_norm 2.1601 (2.1908) [2022-01-24 17:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1220/1251] eta 0:01:08 lr 0.000174 time 2.2077 (2.1942) loss 3.3621 (3.2302) grad_norm 1.9327 (2.1905) [2022-01-24 17:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1230/1251] eta 0:00:46 lr 0.000174 time 1.5587 (2.1948) loss 3.5087 (3.2288) grad_norm 1.9266 (2.1898) [2022-01-24 17:08:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1240/1251] eta 0:00:24 lr 0.000174 time 1.6404 (2.1938) loss 3.4606 (3.2298) grad_norm 2.3797 (2.1892) [2022-01-24 17:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [219/300][1250/1251] eta 0:00:02 lr 0.000174 time 1.2114 (2.1881) loss 3.3137 (3.2308) grad_norm 2.6252 (2.1891) [2022-01-24 17:09:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 219 training takes 0:45:37 [2022-01-24 17:09:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.596 (18.596) Loss 0.8955 (0.8955) Acc@1 78.809 (78.809) Acc@5 95.117 (95.117) [2022-01-24 17:09:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.256 (3.546) Loss 0.9703 (0.8777) Acc@1 76.855 (79.510) Acc@5 94.238 (94.869) [2022-01-24 17:09:53 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.958 (2.489) Loss 0.9320 (0.8839) Acc@1 78.223 (79.204) Acc@5 93.555 (94.754) [2022-01-24 17:10:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.609 (2.274) Loss 0.9073 (0.8874) Acc@1 78.613 (79.080) Acc@5 94.727 (94.749) [2022-01-24 17:10:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.322 (2.156) Loss 0.9206 (0.8805) Acc@1 78.516 (79.137) Acc@5 95.020 (94.841) [2022-01-24 17:10:36 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.114 Acc@5 94.804 [2022-01-24 17:10:36 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-01-24 17:10:36 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.33% [2022-01-24 17:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][0/1251] eta 7:42:31 lr 0.000174 time 22.1832 (22.1832) loss 3.5796 (3.5796) grad_norm 2.1415 (2.1415) [2022-01-24 17:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][10/1251] eta 1:25:46 lr 0.000174 time 2.1815 (4.1469) loss 2.5628 (3.0022) grad_norm 2.1401 (2.0717) [2022-01-24 17:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][20/1251] eta 1:06:34 lr 0.000174 time 1.8366 (3.2446) loss 3.0910 (3.0801) grad_norm 2.0984 (2.0924) [2022-01-24 17:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][30/1251] eta 0:59:27 lr 0.000174 time 1.3596 (2.9215) loss 3.1349 (3.1199) grad_norm 2.2102 (2.1065) [2022-01-24 17:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][40/1251] eta 0:56:29 lr 0.000174 time 3.6465 (2.7990) loss 2.9250 (3.1404) grad_norm 2.2572 (2.1176) [2022-01-24 17:12:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][50/1251] eta 0:53:45 lr 0.000174 time 2.6540 (2.6860) loss 3.6443 (3.2017) grad_norm 2.0592 (2.1413) [2022-01-24 17:13:15 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [220/300][60/1251] eta 0:51:37 lr 0.000174 time 1.8831 (2.6009) loss 3.4727 (3.1963) grad_norm 2.0984 (2.1397) [2022-01-24 17:13:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][70/1251] eta 0:49:18 lr 0.000174 time 1.6576 (2.5052) loss 3.6240 (3.1797) grad_norm 1.9545 (2.1335) [2022-01-24 17:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][80/1251] eta 0:47:54 lr 0.000174 time 2.9793 (2.4545) loss 3.6377 (3.2048) grad_norm 1.8877 (2.1436) [2022-01-24 17:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][90/1251] eta 0:46:57 lr 0.000174 time 1.6164 (2.4272) loss 3.9704 (3.2186) grad_norm 2.3425 (2.1628) [2022-01-24 17:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][100/1251] eta 0:46:12 lr 0.000173 time 2.1982 (2.4085) loss 3.4480 (3.2146) grad_norm 2.1514 (2.1759) [2022-01-24 17:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][110/1251] eta 0:45:25 lr 0.000173 time 2.2183 (2.3887) loss 2.9503 (3.2034) grad_norm 2.0133 (2.1711) [2022-01-24 17:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][120/1251] eta 0:44:58 lr 0.000173 time 2.7140 (2.3860) loss 4.0875 (3.2254) grad_norm 2.5870 (2.1855) [2022-01-24 17:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][130/1251] eta 0:44:39 lr 0.000173 time 2.7657 (2.3900) loss 3.6757 (3.2255) grad_norm 2.1735 (2.1865) [2022-01-24 17:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][140/1251] eta 0:43:58 lr 0.000173 time 1.8780 (2.3753) loss 2.8915 (3.2310) grad_norm 2.5457 (2.1849) [2022-01-24 17:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][150/1251] eta 0:43:18 lr 0.000173 time 1.8141 (2.3600) loss 3.8566 (3.2480) grad_norm 2.4633 (2.1938) [2022-01-24 17:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][160/1251] eta 0:42:24 lr 0.000173 time 1.9363 (2.3321) loss 3.6746 (3.2661) 
grad_norm 1.8456 (2.1993) [2022-01-24 17:17:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][170/1251] eta 0:41:45 lr 0.000173 time 2.7714 (2.3178) loss 2.7806 (3.2666) grad_norm 2.2944 (2.2009) [2022-01-24 17:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][180/1251] eta 0:41:09 lr 0.000173 time 1.9686 (2.3056) loss 2.8485 (3.2424) grad_norm 1.9489 (2.1949) [2022-01-24 17:17:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][190/1251] eta 0:40:34 lr 0.000173 time 2.0116 (2.2944) loss 3.6566 (3.2493) grad_norm 2.4524 (2.1910) [2022-01-24 17:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][200/1251] eta 0:39:57 lr 0.000173 time 2.1853 (2.2808) loss 3.5859 (3.2560) grad_norm 2.6293 (2.1959) [2022-01-24 17:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][210/1251] eta 0:39:32 lr 0.000173 time 2.1516 (2.2792) loss 3.8111 (3.2488) grad_norm 2.1419 (2.2037) [2022-01-24 17:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][220/1251] eta 0:39:14 lr 0.000173 time 1.9166 (2.2834) loss 1.8896 (3.2484) grad_norm 2.3537 (2.2068) [2022-01-24 17:19:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][230/1251] eta 0:38:49 lr 0.000173 time 2.5617 (2.2813) loss 3.6519 (3.2454) grad_norm 2.0225 (2.2026) [2022-01-24 17:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][240/1251] eta 0:38:24 lr 0.000173 time 1.8526 (2.2794) loss 3.4216 (3.2539) grad_norm 2.1770 (2.2015) [2022-01-24 17:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][250/1251] eta 0:38:07 lr 0.000173 time 3.4058 (2.2852) loss 3.3802 (3.2589) grad_norm 2.1356 (2.2012) [2022-01-24 17:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][260/1251] eta 0:37:45 lr 0.000173 time 2.1449 (2.2860) loss 2.8300 (3.2562) grad_norm 2.0751 (2.2020) [2022-01-24 17:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [220/300][270/1251] eta 0:37:15 lr 0.000173 time 1.7960 (2.2789) loss 2.5844 (3.2529) grad_norm 2.0106 (2.1983) [2022-01-24 17:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][280/1251] eta 0:36:43 lr 0.000173 time 1.8408 (2.2692) loss 3.3777 (3.2571) grad_norm 2.1862 (2.1980) [2022-01-24 17:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][290/1251] eta 0:36:13 lr 0.000173 time 2.8279 (2.2616) loss 3.1654 (3.2487) grad_norm 2.0758 (2.2001) [2022-01-24 17:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][300/1251] eta 0:35:41 lr 0.000173 time 1.8907 (2.2519) loss 3.1130 (3.2438) grad_norm 2.0705 (2.1967) [2022-01-24 17:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][310/1251] eta 0:35:16 lr 0.000173 time 2.4243 (2.2490) loss 2.3655 (3.2326) grad_norm 2.1473 (2.1928) [2022-01-24 17:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][320/1251] eta 0:34:55 lr 0.000173 time 1.8485 (2.2503) loss 3.6914 (3.2323) grad_norm 2.1952 (2.1914) [2022-01-24 17:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][330/1251] eta 0:34:29 lr 0.000173 time 2.1042 (2.2466) loss 3.7955 (3.2400) grad_norm 2.5786 (2.1917) [2022-01-24 17:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][340/1251] eta 0:34:08 lr 0.000173 time 2.6719 (2.2481) loss 3.0458 (3.2282) grad_norm 2.5249 (2.1949) [2022-01-24 17:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][350/1251] eta 0:33:46 lr 0.000173 time 3.3016 (2.2493) loss 3.6773 (3.2313) grad_norm 2.9394 (2.1985) [2022-01-24 17:24:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][360/1251] eta 0:33:25 lr 0.000173 time 2.5881 (2.2504) loss 3.3549 (3.2374) grad_norm 1.9881 (2.1991) [2022-01-24 17:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][370/1251] eta 0:33:01 lr 0.000173 time 2.1188 (2.2494) loss 3.6308 (3.2357) 
grad_norm 2.1052 (2.1994) [2022-01-24 17:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][380/1251] eta 0:32:37 lr 0.000173 time 1.8676 (2.2479) loss 3.4342 (3.2375) grad_norm 2.5894 (2.1998) [2022-01-24 17:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][390/1251] eta 0:32:16 lr 0.000173 time 3.4632 (2.2489) loss 3.7536 (3.2336) grad_norm 3.1420 (2.2076) [2022-01-24 17:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][400/1251] eta 0:31:52 lr 0.000173 time 2.4252 (2.2471) loss 2.1958 (3.2360) grad_norm 2.6347 (2.2114) [2022-01-24 17:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][410/1251] eta 0:31:25 lr 0.000173 time 1.8705 (2.2426) loss 2.0605 (3.2316) grad_norm 2.7625 (2.2141) [2022-01-24 17:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][420/1251] eta 0:31:00 lr 0.000172 time 2.8744 (2.2393) loss 3.2866 (3.2379) grad_norm 2.0216 (2.2128) [2022-01-24 17:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][430/1251] eta 0:30:33 lr 0.000172 time 1.5965 (2.2336) loss 2.7958 (3.2332) grad_norm 2.3146 (2.2096) [2022-01-24 17:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][440/1251] eta 0:30:10 lr 0.000172 time 2.2023 (2.2328) loss 1.9147 (3.2283) grad_norm 2.1692 (2.2088) [2022-01-24 17:27:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][450/1251] eta 0:29:46 lr 0.000172 time 2.6508 (2.2303) loss 2.6040 (3.2232) grad_norm 2.1059 (2.2105) [2022-01-24 17:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][460/1251] eta 0:29:22 lr 0.000172 time 2.4820 (2.2281) loss 3.3245 (3.2211) grad_norm 2.1307 (2.2105) [2022-01-24 17:28:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][470/1251] eta 0:29:00 lr 0.000172 time 1.9799 (2.2287) loss 3.0617 (3.2205) grad_norm 1.9856 (2.2087) [2022-01-24 17:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [220/300][480/1251] eta 0:28:40 lr 0.000172 time 2.6980 (2.2318) loss 4.1736 (3.2275) grad_norm 2.3556 (2.2069) [2022-01-24 17:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][490/1251] eta 0:28:19 lr 0.000172 time 1.9068 (2.2339) loss 2.5904 (3.2304) grad_norm 2.1757 (2.2095) [2022-01-24 17:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][500/1251] eta 0:27:57 lr 0.000172 time 2.4524 (2.2335) loss 3.4784 (3.2272) grad_norm 2.1214 (2.2120) [2022-01-24 17:29:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][510/1251] eta 0:27:30 lr 0.000172 time 1.5981 (2.2281) loss 3.6583 (3.2295) grad_norm 2.2046 (2.2114) [2022-01-24 17:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][520/1251] eta 0:27:05 lr 0.000172 time 1.9188 (2.2233) loss 2.8958 (3.2272) grad_norm 2.1342 (2.2130) [2022-01-24 17:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][530/1251] eta 0:26:41 lr 0.000172 time 2.5992 (2.2210) loss 2.1244 (3.2253) grad_norm 1.9554 (2.2108) [2022-01-24 17:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][540/1251] eta 0:26:17 lr 0.000172 time 2.2829 (2.2182) loss 3.4786 (3.2259) grad_norm 2.1992 (2.2108) [2022-01-24 17:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][550/1251] eta 0:25:55 lr 0.000172 time 2.5271 (2.2184) loss 3.1248 (3.2268) grad_norm 1.7758 (2.2093) [2022-01-24 17:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][560/1251] eta 0:25:34 lr 0.000172 time 1.8971 (2.2203) loss 3.6092 (3.2266) grad_norm 2.2348 (2.2094) [2022-01-24 17:31:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][570/1251] eta 0:25:13 lr 0.000172 time 2.3621 (2.2231) loss 3.6862 (3.2246) grad_norm 2.2347 (2.2075) [2022-01-24 17:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][580/1251] eta 0:24:50 lr 0.000172 time 2.4973 (2.2214) loss 3.4901 (3.2278) 
grad_norm 2.2023 (2.2060) [2022-01-24 17:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][590/1251] eta 0:24:26 lr 0.000172 time 1.5660 (2.2181) loss 3.8668 (3.2259) grad_norm 1.9695 (2.2064) [2022-01-24 17:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][600/1251] eta 0:24:03 lr 0.000172 time 1.8899 (2.2176) loss 3.2697 (3.2213) grad_norm 1.9155 (2.2073) [2022-01-24 17:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][610/1251] eta 0:23:40 lr 0.000172 time 1.9200 (2.2157) loss 2.9919 (3.2229) grad_norm 2.0960 (2.2078) [2022-01-24 17:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][620/1251] eta 0:23:17 lr 0.000172 time 2.1180 (2.2149) loss 3.7204 (3.2235) grad_norm 1.9896 (2.2077) [2022-01-24 17:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][630/1251] eta 0:22:55 lr 0.000172 time 2.4476 (2.2150) loss 3.6641 (3.2250) grad_norm 2.2792 (2.2070) [2022-01-24 17:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][640/1251] eta 0:22:34 lr 0.000172 time 2.8355 (2.2163) loss 3.7761 (3.2268) grad_norm 2.1452 (2.2055) [2022-01-24 17:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][650/1251] eta 0:22:11 lr 0.000172 time 1.5101 (2.2157) loss 3.3720 (3.2276) grad_norm 2.1116 (2.2043) [2022-01-24 17:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][660/1251] eta 0:21:49 lr 0.000172 time 2.2428 (2.2166) loss 3.7093 (3.2278) grad_norm 1.8480 (2.2028) [2022-01-24 17:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][670/1251] eta 0:21:27 lr 0.000172 time 2.0582 (2.2162) loss 3.9687 (3.2324) grad_norm 2.1656 (2.2015) [2022-01-24 17:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][680/1251] eta 0:21:05 lr 0.000172 time 1.8907 (2.2164) loss 2.7656 (3.2315) grad_norm 2.5581 (2.2025) [2022-01-24 17:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [220/300][690/1251] eta 0:20:43 lr 0.000172 time 2.1238 (2.2174) loss 2.8461 (3.2308) grad_norm 2.2671 (2.2018) [2022-01-24 17:36:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][700/1251] eta 0:20:20 lr 0.000172 time 2.1780 (2.2158) loss 3.7655 (3.2320) grad_norm 2.2917 (2.2019) [2022-01-24 17:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][710/1251] eta 0:19:58 lr 0.000172 time 2.5991 (2.2146) loss 2.1199 (3.2321) grad_norm 1.9789 (2.2005) [2022-01-24 17:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][720/1251] eta 0:19:34 lr 0.000172 time 2.0015 (2.2123) loss 3.6407 (3.2352) grad_norm 2.0146 (2.1984) [2022-01-24 17:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][730/1251] eta 0:19:12 lr 0.000172 time 2.2431 (2.2127) loss 3.2965 (3.2364) grad_norm 2.2211 (2.1979) [2022-01-24 17:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][740/1251] eta 0:18:50 lr 0.000172 time 2.1427 (2.2120) loss 2.6172 (3.2334) grad_norm 2.1567 (2.1980) [2022-01-24 17:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][750/1251] eta 0:18:26 lr 0.000171 time 2.1812 (2.2094) loss 3.4995 (3.2363) grad_norm 2.2012 (2.1969) [2022-01-24 17:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][760/1251] eta 0:18:05 lr 0.000171 time 2.1872 (2.2098) loss 2.6407 (3.2339) grad_norm 2.0741 (2.1984) [2022-01-24 17:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][770/1251] eta 0:17:43 lr 0.000171 time 2.1513 (2.2101) loss 3.6288 (3.2348) grad_norm 2.1249 (2.1982) [2022-01-24 17:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][780/1251] eta 0:17:20 lr 0.000171 time 1.8676 (2.2098) loss 3.2386 (3.2355) grad_norm 2.0213 (2.1995) [2022-01-24 17:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][790/1251] eta 0:16:58 lr 0.000171 time 1.8232 (2.2104) loss 3.6274 (3.2343) 
grad_norm 2.1945 (2.1996) [2022-01-24 17:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][800/1251] eta 0:16:36 lr 0.000171 time 1.8827 (2.2095) loss 3.1176 (3.2347) grad_norm 1.8734 (2.1997) [2022-01-24 17:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][810/1251] eta 0:16:13 lr 0.000171 time 1.9492 (2.2079) loss 3.4664 (3.2368) grad_norm 1.9669 (2.1988) [2022-01-24 17:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][820/1251] eta 0:15:51 lr 0.000171 time 2.2145 (2.2082) loss 2.1177 (3.2336) grad_norm 2.4170 (2.1979) [2022-01-24 17:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][830/1251] eta 0:15:28 lr 0.000171 time 1.9352 (2.2065) loss 2.6384 (3.2342) grad_norm 2.3869 (2.1984) [2022-01-24 17:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][840/1251] eta 0:15:06 lr 0.000171 time 1.9415 (2.2047) loss 3.1617 (3.2359) grad_norm 1.8970 (2.1983) [2022-01-24 17:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][850/1251] eta 0:14:43 lr 0.000171 time 1.5066 (2.2027) loss 3.3097 (3.2361) grad_norm 2.2215 (2.1976) [2022-01-24 17:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][860/1251] eta 0:14:21 lr 0.000171 time 1.9410 (2.2025) loss 2.3531 (3.2362) grad_norm 1.8528 (2.1981) [2022-01-24 17:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][870/1251] eta 0:13:59 lr 0.000171 time 2.4655 (2.2027) loss 3.5702 (3.2374) grad_norm 2.4518 (2.1968) [2022-01-24 17:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][880/1251] eta 0:13:37 lr 0.000171 time 2.8152 (2.2035) loss 3.7115 (3.2382) grad_norm 1.9887 (2.1983) [2022-01-24 17:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][890/1251] eta 0:13:15 lr 0.000171 time 1.7450 (2.2042) loss 2.4412 (3.2373) grad_norm 2.0903 (2.1998) [2022-01-24 17:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [220/300][900/1251] eta 0:12:54 lr 0.000171 time 1.8030 (2.2054) loss 2.8351 (3.2374) grad_norm 2.1980 (2.2006) [2022-01-24 17:44:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][910/1251] eta 0:12:32 lr 0.000171 time 2.6393 (2.2053) loss 2.6728 (3.2356) grad_norm 2.3203 (2.2014) [2022-01-24 17:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][920/1251] eta 0:12:09 lr 0.000171 time 1.9581 (2.2035) loss 3.9625 (3.2347) grad_norm 2.1603 (2.2012) [2022-01-24 17:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][930/1251] eta 0:11:47 lr 0.000171 time 1.6919 (2.2056) loss 4.0681 (3.2369) grad_norm 2.1713 (2.2017) [2022-01-24 17:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][940/1251] eta 0:11:26 lr 0.000171 time 2.8422 (2.2067) loss 3.3642 (3.2395) grad_norm 2.4103 (2.2025) [2022-01-24 17:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][950/1251] eta 0:11:03 lr 0.000171 time 1.7011 (2.2040) loss 3.2222 (3.2392) grad_norm 2.2323 (2.2039) [2022-01-24 17:45:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][960/1251] eta 0:10:40 lr 0.000171 time 1.7070 (2.2018) loss 3.3692 (3.2408) grad_norm 2.1355 (2.2038) [2022-01-24 17:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][970/1251] eta 0:10:18 lr 0.000171 time 1.9174 (2.2007) loss 3.6083 (3.2408) grad_norm 2.5293 (2.2034) [2022-01-24 17:46:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][980/1251] eta 0:09:56 lr 0.000171 time 2.3244 (2.2004) loss 3.4803 (3.2414) grad_norm 1.9344 (2.2030) [2022-01-24 17:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][990/1251] eta 0:09:34 lr 0.000171 time 2.5558 (2.2005) loss 3.9460 (3.2434) grad_norm 2.5147 (2.2021) [2022-01-24 17:47:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1000/1251] eta 0:09:12 lr 0.000171 time 1.8307 (2.1993) loss 2.2933 (3.2420) 
grad_norm 2.1819 (2.2045) [2022-01-24 17:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1010/1251] eta 0:08:50 lr 0.000171 time 1.8732 (2.2002) loss 3.5424 (3.2419) grad_norm 2.1261 (2.2055) [2022-01-24 17:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1020/1251] eta 0:08:28 lr 0.000171 time 2.5675 (2.2001) loss 3.7544 (3.2407) grad_norm 1.9883 (2.2086) [2022-01-24 17:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1030/1251] eta 0:08:05 lr 0.000171 time 2.2826 (2.1987) loss 3.6344 (3.2417) grad_norm 2.3755 (2.2086) [2022-01-24 17:48:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1040/1251] eta 0:07:43 lr 0.000171 time 2.7793 (2.1985) loss 3.5364 (3.2412) grad_norm 1.9540 (2.2080) [2022-01-24 17:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1050/1251] eta 0:07:21 lr 0.000171 time 1.5340 (2.1975) loss 2.6868 (3.2407) grad_norm 2.0652 (2.2075) [2022-01-24 17:49:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1060/1251] eta 0:06:59 lr 0.000171 time 3.1755 (2.1987) loss 3.7696 (3.2407) grad_norm 2.0176 (2.2065) [2022-01-24 17:49:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1070/1251] eta 0:06:38 lr 0.000170 time 2.7801 (2.1997) loss 2.8181 (3.2391) grad_norm 2.4014 (2.2059) [2022-01-24 17:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1080/1251] eta 0:06:16 lr 0.000170 time 2.8110 (2.2005) loss 2.9410 (3.2405) grad_norm 2.0942 (2.2054) [2022-01-24 17:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1090/1251] eta 0:05:54 lr 0.000170 time 1.9396 (2.2024) loss 3.0933 (3.2377) grad_norm 2.2147 (2.2052) [2022-01-24 17:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1100/1251] eta 0:05:32 lr 0.000170 time 2.8730 (2.2032) loss 2.7012 (3.2320) grad_norm 2.4560 (2.2048) [2022-01-24 17:51:24 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [220/300][1110/1251] eta 0:05:10 lr 0.000170 time 2.4842 (2.2026) loss 2.7500 (3.2327) grad_norm 2.0332 (2.2055) [2022-01-24 17:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1120/1251] eta 0:04:48 lr 0.000170 time 1.8094 (2.2009) loss 3.0973 (3.2317) grad_norm 2.0052 (2.2047) [2022-01-24 17:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1130/1251] eta 0:04:26 lr 0.000170 time 2.0129 (2.2001) loss 3.5519 (3.2315) grad_norm 2.0253 (2.2052) [2022-01-24 17:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1140/1251] eta 0:04:04 lr 0.000170 time 1.9374 (2.2002) loss 3.3209 (3.2323) grad_norm 2.4959 (2.2056) [2022-01-24 17:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1150/1251] eta 0:03:42 lr 0.000170 time 2.1950 (2.2006) loss 2.5331 (3.2321) grad_norm 2.1206 (2.2052) [2022-01-24 17:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1160/1251] eta 0:03:20 lr 0.000170 time 2.2036 (2.2005) loss 4.0414 (3.2337) grad_norm 2.3828 (2.2049) [2022-01-24 17:53:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1170/1251] eta 0:02:58 lr 0.000170 time 2.5687 (2.2012) loss 3.0613 (3.2333) grad_norm 2.0944 (2.2054) [2022-01-24 17:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1180/1251] eta 0:02:36 lr 0.000170 time 1.8229 (2.2003) loss 3.5631 (3.2343) grad_norm 2.1110 (2.2052) [2022-01-24 17:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1190/1251] eta 0:02:14 lr 0.000170 time 1.6246 (2.1984) loss 3.5797 (3.2336) grad_norm 1.9337 (2.2040) [2022-01-24 17:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1200/1251] eta 0:01:52 lr 0.000170 time 2.2580 (2.1978) loss 3.7878 (3.2343) grad_norm 2.1734 (2.2038) [2022-01-24 17:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1210/1251] eta 0:01:30 lr 0.000170 time 1.7411 (2.1971) loss 
3.2028 (3.2332) grad_norm 2.0150 (2.2026) [2022-01-24 17:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1220/1251] eta 0:01:08 lr 0.000170 time 2.0950 (2.1981) loss 3.7416 (3.2335) grad_norm 2.1368 (2.2017) [2022-01-24 17:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1230/1251] eta 0:00:46 lr 0.000170 time 2.2231 (2.1982) loss 3.4977 (3.2339) grad_norm 2.1283 (2.2016) [2022-01-24 17:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1240/1251] eta 0:00:24 lr 0.000170 time 1.5583 (2.1972) loss 3.7258 (3.2364) grad_norm 2.5200 (2.2058) [2022-01-24 17:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [220/300][1250/1251] eta 0:00:02 lr 0.000170 time 1.3029 (2.1930) loss 3.7074 (3.2368) grad_norm 2.1697 (2.2058) [2022-01-24 17:56:20 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 220 training takes 0:45:43 [2022-01-24 17:56:20 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_220 saving...... [2022-01-24 17:56:32 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_220 saved !!! 
[2022-01-24 17:56:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.981 (15.981) Loss 0.9069 (0.9069) Acc@1 79.004 (79.004) Acc@5 94.824 (94.824) [2022-01-24 17:57:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.007 (2.777) Loss 0.7858 (0.8978) Acc@1 81.152 (79.102) Acc@5 95.703 (94.407) [2022-01-24 17:57:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.683 (2.157) Loss 0.9166 (0.8902) Acc@1 78.516 (79.069) Acc@5 94.727 (94.568) [2022-01-24 17:57:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.293 (2.118) Loss 0.8327 (0.8859) Acc@1 79.883 (79.124) Acc@5 95.410 (94.689) [2022-01-24 17:57:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 0.782 (2.001) Loss 0.9387 (0.8872) Acc@1 77.832 (79.087) Acc@5 94.922 (94.708) [2022-01-24 17:58:02 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.050 Acc@5 94.690 [2022-01-24 17:58:02 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.1% [2022-01-24 17:58:02 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.33% [2022-01-24 17:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][0/1251] eta 6:40:44 lr 0.000170 time 19.2203 (19.2203) loss 2.5398 (2.5398) grad_norm 1.9492 (1.9492) [2022-01-24 17:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][10/1251] eta 1:20:57 lr 0.000170 time 3.0608 (3.9139) loss 3.7172 (3.2383) grad_norm 2.5091 (2.2931) [2022-01-24 17:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][20/1251] eta 1:02:57 lr 0.000170 time 1.5685 (3.0688) loss 2.2388 (3.2672) grad_norm 2.3940 (2.2887) [2022-01-24 17:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][30/1251] eta 0:56:51 lr 0.000170 time 1.8213 (2.7938) loss 2.8686 (3.2587) grad_norm 2.3339 (2.2711) [2022-01-24 17:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[221/300][40/1251] eta 0:53:33 lr 0.000170 time 2.7647 (2.6540) loss 3.2541 (3.3176) grad_norm 1.9733 (2.2514) [2022-01-24 18:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][50/1251] eta 0:52:31 lr 0.000170 time 2.0072 (2.6242) loss 2.8615 (3.3229) grad_norm 1.9430 (2.2190) [2022-01-24 18:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][60/1251] eta 0:50:40 lr 0.000170 time 1.8574 (2.5529) loss 3.2293 (3.3079) grad_norm 2.4655 (2.2110) [2022-01-24 18:00:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][70/1251] eta 0:49:00 lr 0.000170 time 2.2119 (2.4899) loss 3.2919 (3.2687) grad_norm 2.1133 (2.2091) [2022-01-24 18:01:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][80/1251] eta 0:47:43 lr 0.000170 time 2.3013 (2.4457) loss 3.8670 (3.2832) grad_norm 2.0821 (2.2204) [2022-01-24 18:01:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][90/1251] eta 0:46:53 lr 0.000170 time 1.8262 (2.4235) loss 3.0910 (3.2566) grad_norm 2.5757 (2.2215) [2022-01-24 18:02:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][100/1251] eta 0:46:43 lr 0.000170 time 2.4665 (2.4355) loss 2.0437 (3.2186) grad_norm 2.3165 (2.2205) [2022-01-24 18:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][110/1251] eta 0:45:53 lr 0.000170 time 2.1184 (2.4135) loss 3.2755 (3.2205) grad_norm 2.1895 (2.2258) [2022-01-24 18:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][120/1251] eta 0:44:54 lr 0.000170 time 1.6281 (2.3825) loss 2.5885 (3.2165) grad_norm 2.4936 (2.2384) [2022-01-24 18:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][130/1251] eta 0:44:02 lr 0.000170 time 2.1866 (2.3575) loss 3.6036 (3.2110) grad_norm 2.4755 (2.2324) [2022-01-24 18:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][140/1251] eta 0:43:23 lr 0.000170 time 2.4186 (2.3434) loss 3.3821 (3.2159) grad_norm 1.9742 
(2.2143) [2022-01-24 18:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][150/1251] eta 0:42:57 lr 0.000169 time 2.2366 (2.3410) loss 3.1703 (3.2082) grad_norm 2.1049 (2.2156) [2022-01-24 18:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][160/1251] eta 0:42:24 lr 0.000169 time 2.2023 (2.3322) loss 3.6713 (3.2290) grad_norm 2.4035 (2.2135) [2022-01-24 18:04:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][170/1251] eta 0:41:43 lr 0.000169 time 2.4134 (2.3161) loss 3.2239 (3.2224) grad_norm 2.1248 (2.2127) [2022-01-24 18:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][180/1251] eta 0:41:18 lr 0.000169 time 1.8179 (2.3145) loss 2.1172 (3.2144) grad_norm 2.0899 (2.2032) [2022-01-24 18:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][190/1251] eta 0:40:52 lr 0.000169 time 1.8399 (2.3119) loss 3.2326 (3.2126) grad_norm 2.1542 (2.1986) [2022-01-24 18:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][200/1251] eta 0:40:21 lr 0.000169 time 1.8486 (2.3037) loss 2.8907 (3.2256) grad_norm 2.9340 (2.2025) [2022-01-24 18:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][210/1251] eta 0:39:48 lr 0.000169 time 2.3779 (2.2944) loss 3.8883 (3.2293) grad_norm 2.1711 (2.2035) [2022-01-24 18:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][220/1251] eta 0:39:14 lr 0.000169 time 2.0085 (2.2834) loss 2.0378 (3.2154) grad_norm 2.5911 (2.2103) [2022-01-24 18:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][230/1251] eta 0:38:48 lr 0.000169 time 1.9637 (2.2807) loss 3.2767 (3.2207) grad_norm 2.3368 (2.2122) [2022-01-24 18:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][240/1251] eta 0:38:18 lr 0.000169 time 2.0642 (2.2734) loss 2.9463 (3.2270) grad_norm 2.3373 (2.2154) [2022-01-24 18:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[221/300][250/1251] eta 0:37:54 lr 0.000169 time 2.4521 (2.2725) loss 3.6825 (3.2320) grad_norm 1.9389 (2.2123) [2022-01-24 18:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][260/1251] eta 0:37:29 lr 0.000169 time 1.9473 (2.2696) loss 3.4680 (3.2397) grad_norm 1.9371 (2.2170) [2022-01-24 18:08:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][270/1251] eta 0:37:04 lr 0.000169 time 2.1544 (2.2674) loss 3.6146 (3.2483) grad_norm 2.3014 (2.2186) [2022-01-24 18:08:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][280/1251] eta 0:36:38 lr 0.000169 time 1.8657 (2.2642) loss 2.3189 (3.2412) grad_norm 1.8368 (2.2192) [2022-01-24 18:09:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][290/1251] eta 0:36:13 lr 0.000169 time 2.3871 (2.2621) loss 3.3848 (3.2308) grad_norm 2.3995 (2.2242) [2022-01-24 18:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][300/1251] eta 0:35:47 lr 0.000169 time 1.9479 (2.2584) loss 3.6108 (3.2285) grad_norm 1.9685 (2.2251) [2022-01-24 18:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][310/1251] eta 0:35:22 lr 0.000169 time 2.8258 (2.2559) loss 3.3679 (3.2221) grad_norm 2.0650 (2.2230) [2022-01-24 18:10:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][320/1251] eta 0:34:53 lr 0.000169 time 1.7472 (2.2491) loss 3.4812 (3.2264) grad_norm 2.3429 (2.2199) [2022-01-24 18:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][330/1251] eta 0:34:28 lr 0.000169 time 2.0418 (2.2455) loss 3.3021 (3.2236) grad_norm 2.3037 (2.2194) [2022-01-24 18:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][340/1251] eta 0:34:02 lr 0.000169 time 2.4837 (2.2425) loss 3.2772 (3.2273) grad_norm 2.0877 (2.2171) [2022-01-24 18:11:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][350/1251] eta 0:33:44 lr 0.000169 time 3.8002 (2.2468) loss 2.8564 (3.2236) grad_norm 
5.5822 (2.2294) [2022-01-24 18:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][360/1251] eta 0:33:20 lr 0.000169 time 1.6038 (2.2455) loss 2.3111 (3.2220) grad_norm 2.0824 (2.2282) [2022-01-24 18:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][370/1251] eta 0:33:03 lr 0.000169 time 3.2571 (2.2510) loss 2.9963 (3.2181) grad_norm 1.9551 (2.2259) [2022-01-24 18:12:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][380/1251] eta 0:32:37 lr 0.000169 time 1.7973 (2.2476) loss 4.1475 (3.2206) grad_norm 2.0325 (2.2276) [2022-01-24 18:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][390/1251] eta 0:32:11 lr 0.000169 time 2.2338 (2.2433) loss 3.6897 (3.2176) grad_norm 2.3201 (2.2302) [2022-01-24 18:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][400/1251] eta 0:31:43 lr 0.000169 time 1.8544 (2.2362) loss 3.8254 (3.2251) grad_norm 2.3959 (2.2320) [2022-01-24 18:13:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][410/1251] eta 0:31:15 lr 0.000169 time 1.9497 (2.2301) loss 3.8469 (3.2224) grad_norm 2.0921 (2.2358) [2022-01-24 18:13:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][420/1251] eta 0:30:48 lr 0.000169 time 1.9437 (2.2245) loss 3.6177 (3.2195) grad_norm 1.8350 (2.2331) [2022-01-24 18:13:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][430/1251] eta 0:30:22 lr 0.000169 time 1.9637 (2.2198) loss 3.9583 (3.2153) grad_norm 2.2728 (2.2315) [2022-01-24 18:14:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][440/1251] eta 0:30:01 lr 0.000169 time 2.0039 (2.2212) loss 3.2573 (3.2106) grad_norm 2.1520 (2.2278) [2022-01-24 18:14:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][450/1251] eta 0:29:39 lr 0.000169 time 2.4706 (2.2221) loss 2.8609 (3.2126) grad_norm 1.9484 (2.2247) [2022-01-24 18:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[221/300][460/1251] eta 0:29:18 lr 0.000169 time 2.6928 (2.2237) loss 3.4083 (3.2183) grad_norm 2.4431 (2.2239) [2022-01-24 18:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][470/1251] eta 0:29:01 lr 0.000169 time 2.9383 (2.2302) loss 3.2054 (3.2167) grad_norm 2.3386 (2.2227) [2022-01-24 18:15:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][480/1251] eta 0:28:45 lr 0.000168 time 2.7787 (2.2384) loss 3.5110 (3.2150) grad_norm 2.4129 (2.2244) [2022-01-24 18:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][490/1251] eta 0:28:24 lr 0.000168 time 2.3707 (2.2402) loss 3.3183 (3.2189) grad_norm 2.1821 (2.2241) [2022-01-24 18:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][500/1251] eta 0:28:00 lr 0.000168 time 2.2401 (2.2373) loss 3.6078 (3.2164) grad_norm 2.8812 (2.2262) [2022-01-24 18:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][510/1251] eta 0:27:32 lr 0.000168 time 1.7604 (2.2301) loss 2.3422 (3.2181) grad_norm 2.1330 (2.2254) [2022-01-24 18:17:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][520/1251] eta 0:27:04 lr 0.000168 time 1.7220 (2.2230) loss 2.4907 (3.2187) grad_norm 2.3272 (2.2261) [2022-01-24 18:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][530/1251] eta 0:26:39 lr 0.000168 time 2.0871 (2.2191) loss 3.6856 (3.2226) grad_norm 1.9234 (2.2256) [2022-01-24 18:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][540/1251] eta 0:26:15 lr 0.000168 time 1.9714 (2.2164) loss 3.5513 (3.2236) grad_norm 2.0308 (2.2265) [2022-01-24 18:18:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][550/1251] eta 0:25:54 lr 0.000168 time 1.5757 (2.2180) loss 3.2656 (3.2243) grad_norm 2.2492 (2.2250) [2022-01-24 18:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][560/1251] eta 0:25:32 lr 0.000168 time 1.8529 (2.2185) loss 2.2212 (3.2261) grad_norm 
2.7193 (2.2291) [2022-01-24 18:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][570/1251] eta 0:25:14 lr 0.000168 time 2.8477 (2.2238) loss 3.5832 (3.2310) grad_norm 2.3800 (2.2297) [2022-01-24 18:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][580/1251] eta 0:24:54 lr 0.000168 time 2.0814 (2.2267) loss 3.8977 (3.2282) grad_norm 2.3064 (2.2300) [2022-01-24 18:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][590/1251] eta 0:24:34 lr 0.000168 time 2.0802 (2.2312) loss 4.0719 (3.2286) grad_norm 2.1122 (2.2307) [2022-01-24 18:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][600/1251] eta 0:24:12 lr 0.000168 time 1.7042 (2.2307) loss 3.4641 (3.2288) grad_norm 2.2695 (2.2310) [2022-01-24 18:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][610/1251] eta 0:23:47 lr 0.000168 time 2.2718 (2.2274) loss 3.0031 (3.2281) grad_norm 2.1958 (2.2307) [2022-01-24 18:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][620/1251] eta 0:23:22 lr 0.000168 time 2.0480 (2.2226) loss 2.4109 (3.2286) grad_norm 2.3488 (2.2294) [2022-01-24 18:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][630/1251] eta 0:22:59 lr 0.000168 time 2.2111 (2.2206) loss 3.1111 (3.2290) grad_norm 2.5419 (2.2305) [2022-01-24 18:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][640/1251] eta 0:22:35 lr 0.000168 time 2.6025 (2.2182) loss 2.6803 (3.2326) grad_norm 1.8515 (2.2280) [2022-01-24 18:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][650/1251] eta 0:22:13 lr 0.000168 time 2.2561 (2.2186) loss 3.4881 (3.2337) grad_norm 2.3941 (2.2289) [2022-01-24 18:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][660/1251] eta 0:21:51 lr 0.000168 time 2.7161 (2.2191) loss 3.9186 (3.2390) grad_norm 2.1141 (2.2271) [2022-01-24 18:22:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[221/300][670/1251] eta 0:21:29 lr 0.000168 time 1.5152 (2.2188) loss 2.7294 (3.2348) grad_norm 2.3326 (2.2265) [2022-01-24 18:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][680/1251] eta 0:21:05 lr 0.000168 time 2.0153 (2.2170) loss 3.6329 (3.2345) grad_norm 1.9283 (2.2252) [2022-01-24 18:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][690/1251] eta 0:20:44 lr 0.000168 time 1.9063 (2.2180) loss 4.1636 (3.2358) grad_norm 1.9809 (2.2230) [2022-01-24 18:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][700/1251] eta 0:20:22 lr 0.000168 time 2.6558 (2.2184) loss 3.7308 (3.2378) grad_norm 2.1078 (2.2208) [2022-01-24 18:24:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][710/1251] eta 0:20:00 lr 0.000168 time 1.5153 (2.2188) loss 3.7104 (3.2380) grad_norm 2.2607 (2.2197) [2022-01-24 18:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][720/1251] eta 0:19:37 lr 0.000168 time 1.9161 (2.2170) loss 3.5150 (3.2367) grad_norm 2.2399 (2.2190) [2022-01-24 18:25:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][730/1251] eta 0:19:15 lr 0.000168 time 2.0879 (2.2180) loss 3.4557 (3.2393) grad_norm 2.7489 (2.2200) [2022-01-24 18:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][740/1251] eta 0:18:53 lr 0.000168 time 2.8115 (2.2186) loss 2.6350 (3.2398) grad_norm 2.3039 (2.2197) [2022-01-24 18:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][750/1251] eta 0:18:31 lr 0.000168 time 1.7812 (2.2183) loss 3.5939 (3.2365) grad_norm 2.3207 (2.2202) [2022-01-24 18:26:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][760/1251] eta 0:18:08 lr 0.000168 time 1.8632 (2.2170) loss 2.3897 (3.2342) grad_norm 2.1247 (2.2207) [2022-01-24 18:26:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][770/1251] eta 0:17:46 lr 0.000168 time 1.6023 (2.2168) loss 3.3352 (3.2323) grad_norm 
2.0814 (2.2186) [2022-01-24 18:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][780/1251] eta 0:17:23 lr 0.000168 time 2.7940 (2.2157) loss 3.3768 (3.2325) grad_norm 2.3581 (2.2185) [2022-01-24 18:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][790/1251] eta 0:17:01 lr 0.000168 time 3.3999 (2.2156) loss 3.2449 (3.2333) grad_norm 2.2053 (2.2169) [2022-01-24 18:27:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][800/1251] eta 0:16:40 lr 0.000168 time 2.8254 (2.2175) loss 3.3974 (3.2324) grad_norm 2.2493 (2.2159) [2022-01-24 18:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][810/1251] eta 0:16:17 lr 0.000167 time 1.7426 (2.2170) loss 3.4246 (3.2292) grad_norm 2.0656 (2.2154) [2022-01-24 18:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][820/1251] eta 0:15:55 lr 0.000167 time 2.4676 (2.2180) loss 2.2931 (3.2279) grad_norm 2.2708 (2.2155) [2022-01-24 18:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][830/1251] eta 0:15:33 lr 0.000167 time 2.3207 (2.2184) loss 2.9176 (3.2255) grad_norm 2.3298 (2.2149) [2022-01-24 18:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][840/1251] eta 0:15:11 lr 0.000167 time 2.1446 (2.2176) loss 3.2403 (3.2234) grad_norm 1.8744 (2.2144) [2022-01-24 18:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][850/1251] eta 0:14:47 lr 0.000167 time 1.9319 (2.2137) loss 2.9664 (3.2207) grad_norm 2.0670 (2.2134) [2022-01-24 18:29:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][860/1251] eta 0:14:24 lr 0.000167 time 2.4117 (2.2116) loss 3.6763 (3.2224) grad_norm 2.2933 (2.2147) [2022-01-24 18:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][870/1251] eta 0:14:03 lr 0.000167 time 2.7252 (2.2138) loss 2.9773 (3.2216) grad_norm 2.1407 (2.2139) [2022-01-24 18:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[221/300][880/1251] eta 0:13:41 lr 0.000167 time 2.2837 (2.2151) loss 3.8851 (3.2241) grad_norm 2.1675 (2.2150) [2022-01-24 18:30:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][890/1251] eta 0:13:20 lr 0.000167 time 1.8205 (2.2180) loss 2.3935 (3.2218) grad_norm 2.4429 (2.2149) [2022-01-24 18:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][900/1251] eta 0:12:58 lr 0.000167 time 1.9092 (2.2170) loss 3.1808 (3.2209) grad_norm 2.0701 (2.2142) [2022-01-24 18:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][910/1251] eta 0:12:35 lr 0.000167 time 2.8219 (2.2165) loss 3.4026 (3.2199) grad_norm 1.9471 (2.2140) [2022-01-24 18:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][920/1251] eta 0:12:12 lr 0.000167 time 1.9095 (2.2137) loss 2.9538 (3.2196) grad_norm 1.9184 (2.2129) [2022-01-24 18:32:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][930/1251] eta 0:11:49 lr 0.000167 time 2.2591 (2.2108) loss 3.8843 (3.2213) grad_norm 2.2859 (2.2115) [2022-01-24 18:32:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][940/1251] eta 0:11:27 lr 0.000167 time 2.1256 (2.2101) loss 3.3145 (3.2197) grad_norm 2.5821 (2.2126) [2022-01-24 18:33:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][950/1251] eta 0:11:05 lr 0.000167 time 2.5611 (2.2097) loss 3.2994 (3.2204) grad_norm 2.2178 (2.2128) [2022-01-24 18:33:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][960/1251] eta 0:10:42 lr 0.000167 time 2.4276 (2.2088) loss 2.4377 (3.2199) grad_norm 2.2120 (2.2134) [2022-01-24 18:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][970/1251] eta 0:10:21 lr 0.000167 time 2.3217 (2.2111) loss 2.1465 (3.2222) grad_norm 2.1792 (2.2141) [2022-01-24 18:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][980/1251] eta 0:09:59 lr 0.000167 time 1.7307 (2.2112) loss 3.2840 (3.2216) grad_norm 
2.1662 (2.2141) [2022-01-24 18:34:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][990/1251] eta 0:09:37 lr 0.000167 time 2.4388 (2.2124) loss 3.9233 (3.2208) grad_norm 2.1375 (2.2145) [2022-01-24 18:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1000/1251] eta 0:09:15 lr 0.000167 time 1.9337 (2.2136) loss 3.6398 (3.2236) grad_norm 2.2133 (2.2137) [2022-01-24 18:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1010/1251] eta 0:08:53 lr 0.000167 time 2.5404 (2.2142) loss 3.8525 (3.2246) grad_norm 2.4059 (2.2131) [2022-01-24 18:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1020/1251] eta 0:08:31 lr 0.000167 time 1.7478 (2.2136) loss 3.2584 (3.2225) grad_norm 1.9780 (2.2112) [2022-01-24 18:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1030/1251] eta 0:08:08 lr 0.000167 time 1.6782 (2.2118) loss 3.5140 (3.2232) grad_norm 2.0622 (2.2096) [2022-01-24 18:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1040/1251] eta 0:07:46 lr 0.000167 time 2.0106 (2.2103) loss 2.1992 (3.2212) grad_norm 2.8417 (2.2102) [2022-01-24 18:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1050/1251] eta 0:07:24 lr 0.000167 time 2.1869 (2.2099) loss 3.0736 (3.2219) grad_norm 1.9383 (2.2093) [2022-01-24 18:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1060/1251] eta 0:07:02 lr 0.000167 time 2.5619 (2.2109) loss 3.4413 (3.2214) grad_norm 2.3847 (2.2092) [2022-01-24 18:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1070/1251] eta 0:06:40 lr 0.000167 time 1.9108 (2.2123) loss 3.1271 (3.2230) grad_norm 1.8214 (2.2072) [2022-01-24 18:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1080/1251] eta 0:06:18 lr 0.000167 time 1.9349 (2.2113) loss 3.5124 (3.2239) grad_norm 2.5147 (2.2074) [2022-01-24 18:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [221/300][1090/1251] eta 0:05:55 lr 0.000167 time 2.2046 (2.2107) loss 1.8793 (3.2205) grad_norm 2.2500 (2.2062) [2022-01-24 18:38:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1100/1251] eta 0:05:33 lr 0.000167 time 2.0980 (2.2091) loss 2.1297 (3.2219) grad_norm 2.0062 (2.2065) [2022-01-24 18:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1110/1251] eta 0:05:11 lr 0.000167 time 2.1904 (2.2084) loss 3.3932 (3.2218) grad_norm 2.2319 (2.2062) [2022-01-24 18:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1120/1251] eta 0:04:49 lr 0.000167 time 2.2561 (2.2074) loss 3.4140 (3.2221) grad_norm 2.4099 (2.2062) [2022-01-24 18:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1130/1251] eta 0:04:26 lr 0.000167 time 2.2316 (2.2065) loss 3.6136 (3.2229) grad_norm 2.2333 (2.2053) [2022-01-24 18:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1140/1251] eta 0:04:05 lr 0.000166 time 2.6007 (2.2083) loss 3.2605 (3.2223) grad_norm 2.1666 (2.2038) [2022-01-24 18:40:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1150/1251] eta 0:03:42 lr 0.000166 time 1.7838 (2.2074) loss 2.6596 (3.2208) grad_norm 2.0418 (2.2037) [2022-01-24 18:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1160/1251] eta 0:03:20 lr 0.000166 time 2.3101 (2.2066) loss 2.4874 (3.2201) grad_norm 2.0597 (2.2055) [2022-01-24 18:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1170/1251] eta 0:02:58 lr 0.000166 time 2.4335 (2.2075) loss 3.0600 (3.2195) grad_norm 2.1474 (2.2059) [2022-01-24 18:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1180/1251] eta 0:02:36 lr 0.000166 time 1.8473 (2.2088) loss 2.3633 (3.2174) grad_norm 2.4792 (2.2070) [2022-01-24 18:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1190/1251] eta 0:02:14 lr 0.000166 time 1.8671 (2.2084) loss 2.3080 
(3.2174) grad_norm 1.9939 (2.2065) [2022-01-24 18:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1200/1251] eta 0:01:52 lr 0.000166 time 2.0298 (2.2076) loss 3.6034 (3.2198) grad_norm 2.2597 (2.2076) [2022-01-24 18:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1210/1251] eta 0:01:30 lr 0.000166 time 1.7397 (2.2064) loss 3.4144 (3.2192) grad_norm 2.2616 (2.2074) [2022-01-24 18:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1220/1251] eta 0:01:08 lr 0.000166 time 1.9201 (2.2070) loss 3.5400 (3.2199) grad_norm 2.1417 (2.2070) [2022-01-24 18:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1230/1251] eta 0:00:46 lr 0.000166 time 2.6132 (2.2076) loss 3.4725 (3.2195) grad_norm 2.1633 (2.2064) [2022-01-24 18:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1240/1251] eta 0:00:24 lr 0.000166 time 1.2340 (2.2062) loss 2.3274 (3.2203) grad_norm 2.2245 (2.2071) [2022-01-24 18:43:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [221/300][1250/1251] eta 0:00:02 lr 0.000166 time 1.1905 (2.2007) loss 3.0767 (3.2193) grad_norm 2.8741 (2.2113) [2022-01-24 18:43:55 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 221 training takes 0:45:53 [2022-01-24 18:44:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.262 (19.262) Loss 0.8242 (0.8242) Acc@1 80.176 (80.176) Acc@5 95.703 (95.703) [2022-01-24 18:44:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.591 (3.273) Loss 0.8716 (0.8839) Acc@1 80.664 (79.093) Acc@5 94.824 (94.957) [2022-01-24 18:44:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.613 (2.519) Loss 0.9105 (0.8733) Acc@1 78.809 (79.422) Acc@5 94.922 (94.978) [2022-01-24 18:45:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.828 (2.226) Loss 0.8731 (0.8732) Acc@1 78.711 (79.407) Acc@5 95.312 (94.938) [2022-01-24 18:45:25 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.302 (2.189) Loss 0.8513 (0.8733) Acc@1 79.785 (79.335) Acc@5 94.629 (94.970) [2022-01-24 18:45:32 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.304 Acc@5 94.920 [2022-01-24 18:45:32 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.3% [2022-01-24 18:45:32 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.33% [2022-01-24 18:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][0/1251] eta 8:19:26 lr 0.000166 time 23.9539 (23.9539) loss 3.6238 (3.6238) grad_norm 2.3955 (2.3955) [2022-01-24 18:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][10/1251] eta 1:30:21 lr 0.000166 time 1.7124 (4.3684) loss 3.5957 (3.4349) grad_norm 2.4772 (2.2588) [2022-01-24 18:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][20/1251] eta 1:06:44 lr 0.000166 time 1.2296 (3.2527) loss 2.2585 (3.2970) grad_norm 1.9385 (2.2235) [2022-01-24 18:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][30/1251] eta 0:59:23 lr 0.000166 time 1.5748 (2.9184) loss 3.1311 (3.3300) grad_norm 2.1703 (2.2117) [2022-01-24 18:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][40/1251] eta 0:56:06 lr 0.000166 time 4.2852 (2.7800) loss 2.5619 (3.3113) grad_norm 2.2025 (2.1799) [2022-01-24 18:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][50/1251] eta 0:54:36 lr 0.000166 time 2.4501 (2.7278) loss 3.2028 (3.2623) grad_norm 2.0660 (2.1837) [2022-01-24 18:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][60/1251] eta 0:52:07 lr 0.000166 time 1.8927 (2.6260) loss 3.3196 (3.1965) grad_norm 2.0957 (2.1918) [2022-01-24 18:48:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][70/1251] eta 0:50:23 lr 0.000166 time 1.7557 (2.5604) loss 2.8250 (3.2034) grad_norm 2.4407 (2.1952) [2022-01-24 18:48:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][80/1251] eta 0:49:29 lr 0.000166 time 3.4650 (2.5358) loss 3.6282 (3.2295) grad_norm 2.1132 (2.2216) [2022-01-24 18:49:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][90/1251] eta 0:48:32 lr 0.000166 time 2.1676 (2.5082) loss 2.5158 (3.1780) grad_norm 2.1838 (2.2353) [2022-01-24 18:49:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][100/1251] eta 0:47:22 lr 0.000166 time 1.8843 (2.4695) loss 3.5723 (3.1421) grad_norm 2.1857 (2.2346) [2022-01-24 18:50:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][110/1251] eta 0:46:14 lr 0.000166 time 1.8322 (2.4319) loss 3.3750 (3.1321) grad_norm 2.0212 (2.2210) [2022-01-24 18:50:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][120/1251] eta 0:45:16 lr 0.000166 time 1.6055 (2.4018) loss 2.8998 (3.1335) grad_norm 2.2058 (2.2174) [2022-01-24 18:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][130/1251] eta 0:44:21 lr 0.000166 time 1.8283 (2.3743) loss 3.6970 (3.1464) grad_norm 2.2131 (2.2127) [2022-01-24 18:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][140/1251] eta 0:43:42 lr 0.000166 time 1.8774 (2.3601) loss 3.3792 (3.1468) grad_norm 2.3250 (2.2138) [2022-01-24 18:51:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][150/1251] eta 0:43:12 lr 0.000166 time 3.2037 (2.3549) loss 3.5245 (3.1429) grad_norm 1.7711 (2.2103) [2022-01-24 18:51:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][160/1251] eta 0:42:40 lr 0.000166 time 2.8029 (2.3472) loss 3.2875 (3.1470) grad_norm 2.1662 (2.2150) [2022-01-24 18:52:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][170/1251] eta 0:42:14 lr 0.000166 time 2.1688 (2.3443) loss 3.6438 (3.1541) grad_norm 2.4467 (2.2127) [2022-01-24 18:52:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][180/1251] eta 0:41:41 lr 0.000166 
time 1.5788 (2.3361) loss 3.2620 (3.1495) grad_norm 2.0934 (2.2089) [2022-01-24 18:52:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][190/1251] eta 0:41:09 lr 0.000166 time 2.7631 (2.3276) loss 3.9083 (3.1492) grad_norm 2.3142 (2.2070) [2022-01-24 18:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][200/1251] eta 0:40:40 lr 0.000166 time 2.5224 (2.3217) loss 3.7335 (3.1532) grad_norm 1.9250 (2.2078) [2022-01-24 18:53:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][210/1251] eta 0:40:13 lr 0.000166 time 2.8825 (2.3180) loss 3.7484 (3.1705) grad_norm 2.0363 (2.2111) [2022-01-24 18:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][220/1251] eta 0:39:40 lr 0.000165 time 1.6525 (2.3087) loss 2.9312 (3.1713) grad_norm 2.4704 (2.2171) [2022-01-24 18:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][230/1251] eta 0:39:11 lr 0.000165 time 2.4879 (2.3033) loss 3.7551 (3.1652) grad_norm 2.3783 (2.2210) [2022-01-24 18:54:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][240/1251] eta 0:38:42 lr 0.000165 time 2.1981 (2.2975) loss 3.7211 (3.1644) grad_norm 2.2337 (2.2202) [2022-01-24 18:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][250/1251] eta 0:38:04 lr 0.000165 time 1.9508 (2.2818) loss 3.5084 (3.1745) grad_norm 2.1059 (2.2181) [2022-01-24 18:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][260/1251] eta 0:37:31 lr 0.000165 time 1.6277 (2.2721) loss 2.8228 (3.1726) grad_norm 2.2611 (2.2179) [2022-01-24 18:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][270/1251] eta 0:37:05 lr 0.000165 time 1.8813 (2.2687) loss 3.3849 (3.1778) grad_norm 1.8847 (2.2180) [2022-01-24 18:56:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][280/1251] eta 0:36:39 lr 0.000165 time 2.4526 (2.2647) loss 2.9537 (3.1820) grad_norm 2.0780 (2.2165) [2022-01-24 18:56:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][290/1251] eta 0:36:18 lr 0.000165 time 2.5135 (2.2671) loss 2.8461 (3.1883) grad_norm 2.0788 (2.2178) [2022-01-24 18:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][300/1251] eta 0:36:01 lr 0.000165 time 2.0470 (2.2724) loss 2.2724 (3.1783) grad_norm 1.9479 (2.2148) [2022-01-24 18:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][310/1251] eta 0:35:41 lr 0.000165 time 1.9613 (2.2754) loss 3.4086 (3.1724) grad_norm 2.5521 (2.2145) [2022-01-24 18:57:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][320/1251] eta 0:35:16 lr 0.000165 time 2.1833 (2.2736) loss 2.6556 (3.1681) grad_norm 2.3353 (2.2139) [2022-01-24 18:58:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][330/1251] eta 0:34:50 lr 0.000165 time 1.9188 (2.2701) loss 3.8180 (3.1704) grad_norm 2.1760 (2.2135) [2022-01-24 18:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][340/1251] eta 0:34:19 lr 0.000165 time 1.6094 (2.2602) loss 3.4297 (3.1713) grad_norm 2.2409 (2.2123) [2022-01-24 18:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][350/1251] eta 0:33:50 lr 0.000165 time 1.9017 (2.2536) loss 2.8409 (3.1682) grad_norm 2.4335 (2.2124) [2022-01-24 18:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][360/1251] eta 0:33:25 lr 0.000165 time 2.4851 (2.2507) loss 2.2631 (3.1698) grad_norm 2.4446 (2.2131) [2022-01-24 18:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][370/1251] eta 0:33:04 lr 0.000165 time 1.8999 (2.2524) loss 3.7422 (3.1744) grad_norm 2.3985 (2.2143) [2022-01-24 18:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][380/1251] eta 0:32:42 lr 0.000165 time 2.1116 (2.2535) loss 3.1632 (3.1668) grad_norm 2.2835 (2.2136) [2022-01-24 19:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][390/1251] eta 0:32:21 lr 
0.000165 time 1.9839 (2.2545) loss 2.8731 (3.1700) grad_norm 2.1630 (2.2190) [2022-01-24 19:00:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][400/1251] eta 0:32:00 lr 0.000165 time 2.0548 (2.2567) loss 3.1191 (3.1728) grad_norm 1.9147 (2.2181) [2022-01-24 19:00:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][410/1251] eta 0:31:35 lr 0.000165 time 1.5954 (2.2543) loss 2.4713 (3.1695) grad_norm 2.0558 (2.2143) [2022-01-24 19:01:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][420/1251] eta 0:31:09 lr 0.000165 time 1.5635 (2.2501) loss 3.4561 (3.1734) grad_norm 1.9476 (2.2196) [2022-01-24 19:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][430/1251] eta 0:30:43 lr 0.000165 time 1.9171 (2.2452) loss 3.8911 (3.1786) grad_norm 2.2833 (2.2228) [2022-01-24 19:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][440/1251] eta 0:30:19 lr 0.000165 time 2.5378 (2.2437) loss 3.4378 (3.1758) grad_norm 2.4436 (2.2230) [2022-01-24 19:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][450/1251] eta 0:29:58 lr 0.000165 time 2.2201 (2.2449) loss 2.6024 (3.1736) grad_norm 2.4588 (2.2229) [2022-01-24 19:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][460/1251] eta 0:29:33 lr 0.000165 time 2.1162 (2.2420) loss 3.1859 (3.1758) grad_norm 2.1710 (2.2211) [2022-01-24 19:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][470/1251] eta 0:29:12 lr 0.000165 time 2.6376 (2.2439) loss 2.3951 (3.1710) grad_norm 2.1704 (2.2179) [2022-01-24 19:03:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][480/1251] eta 0:28:48 lr 0.000165 time 2.1769 (2.2423) loss 3.8891 (3.1761) grad_norm 2.2039 (2.2204) [2022-01-24 19:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][490/1251] eta 0:28:26 lr 0.000165 time 1.9792 (2.2426) loss 3.2902 (3.1744) grad_norm 2.2765 (2.2234) [2022-01-24 19:04:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][500/1251] eta 0:28:01 lr 0.000165 time 1.6498 (2.2385) loss 2.2349 (3.1694) grad_norm 2.1947 (2.2259) [2022-01-24 19:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][510/1251] eta 0:27:35 lr 0.000165 time 2.0920 (2.2336) loss 3.5501 (3.1704) grad_norm 2.5951 (2.2269) [2022-01-24 19:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][520/1251] eta 0:27:09 lr 0.000165 time 1.6860 (2.2289) loss 3.5005 (3.1697) grad_norm 2.1156 (2.2266) [2022-01-24 19:05:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][530/1251] eta 0:26:48 lr 0.000165 time 2.2301 (2.2306) loss 3.6675 (3.1678) grad_norm 2.1328 (2.2260) [2022-01-24 19:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][540/1251] eta 0:26:25 lr 0.000165 time 2.4738 (2.2302) loss 2.6450 (3.1669) grad_norm 2.1472 (2.2267) [2022-01-24 19:06:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][550/1251] eta 0:26:04 lr 0.000164 time 2.4750 (2.2316) loss 2.6646 (3.1634) grad_norm 2.5999 (2.2280) [2022-01-24 19:06:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][560/1251] eta 0:25:42 lr 0.000164 time 2.7643 (2.2326) loss 3.4214 (3.1709) grad_norm 2.2776 (2.2268) [2022-01-24 19:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][570/1251] eta 0:25:23 lr 0.000164 time 3.0885 (2.2367) loss 3.4610 (3.1724) grad_norm 2.5025 (2.2281) [2022-01-24 19:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][580/1251] eta 0:25:00 lr 0.000164 time 1.7652 (2.2366) loss 3.0018 (3.1731) grad_norm 2.3989 (2.2328) [2022-01-24 19:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][590/1251] eta 0:24:36 lr 0.000164 time 2.1466 (2.2343) loss 2.3677 (3.1725) grad_norm 3.5020 (2.2373) [2022-01-24 19:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][600/1251] eta 0:24:11 lr 
0.000164 time 1.9615 (2.2303) loss 3.4729 (3.1739) grad_norm 1.9948 (2.2379) [2022-01-24 19:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][610/1251] eta 0:23:46 lr 0.000164 time 1.5758 (2.2261) loss 3.3772 (3.1761) grad_norm 2.3304 (2.2364) [2022-01-24 19:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][620/1251] eta 0:23:23 lr 0.000164 time 1.7916 (2.2237) loss 2.3557 (3.1739) grad_norm 2.1885 (2.2349) [2022-01-24 19:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][630/1251] eta 0:23:02 lr 0.000164 time 2.5439 (2.2259) loss 3.2991 (3.1739) grad_norm 2.3988 (2.2361) [2022-01-24 19:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][640/1251] eta 0:22:40 lr 0.000164 time 2.4723 (2.2263) loss 2.4200 (3.1728) grad_norm 2.3484 (2.2364) [2022-01-24 19:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][650/1251] eta 0:22:19 lr 0.000164 time 2.1734 (2.2293) loss 3.4771 (3.1708) grad_norm 1.9920 (2.2348) [2022-01-24 19:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][660/1251] eta 0:21:58 lr 0.000164 time 2.2446 (2.2303) loss 2.3724 (3.1695) grad_norm 2.3936 (2.2337) [2022-01-24 19:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][670/1251] eta 0:21:34 lr 0.000164 time 2.3256 (2.2288) loss 3.4792 (3.1714) grad_norm 2.0676 (2.2334) [2022-01-24 19:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][680/1251] eta 0:21:10 lr 0.000164 time 2.1923 (2.2252) loss 2.7251 (3.1712) grad_norm 2.3890 (2.2335) [2022-01-24 19:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][690/1251] eta 0:20:47 lr 0.000164 time 2.3076 (2.2232) loss 3.8199 (3.1696) grad_norm 2.3291 (2.2315) [2022-01-24 19:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][700/1251] eta 0:20:23 lr 0.000164 time 1.7057 (2.2208) loss 3.5562 (3.1692) grad_norm 2.0278 (2.2310) [2022-01-24 19:11:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][710/1251] eta 0:20:01 lr 0.000164 time 2.5222 (2.2216) loss 1.9425 (3.1649) grad_norm 2.3204 (2.2319) [2022-01-24 19:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][720/1251] eta 0:19:39 lr 0.000164 time 2.7989 (2.2213) loss 3.5169 (3.1664) grad_norm 2.2789 (2.2309) [2022-01-24 19:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][730/1251] eta 0:19:16 lr 0.000164 time 1.8621 (2.2206) loss 3.5587 (3.1661) grad_norm 2.1124 (2.2305) [2022-01-24 19:12:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][740/1251] eta 0:18:54 lr 0.000164 time 2.2109 (2.2208) loss 3.1033 (3.1650) grad_norm 2.0044 (2.2296) [2022-01-24 19:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][750/1251] eta 0:18:33 lr 0.000164 time 2.2995 (2.2221) loss 3.2465 (3.1671) grad_norm 2.2071 (2.2306) [2022-01-24 19:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][760/1251] eta 0:18:11 lr 0.000164 time 1.6357 (2.2234) loss 3.0293 (3.1687) grad_norm 2.1936 (2.2321) [2022-01-24 19:14:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][770/1251] eta 0:17:48 lr 0.000164 time 1.9036 (2.2218) loss 3.5948 (3.1685) grad_norm 2.0446 (2.2336) [2022-01-24 19:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][780/1251] eta 0:17:24 lr 0.000164 time 1.5742 (2.2182) loss 3.9362 (3.1701) grad_norm 2.2009 (2.2332) [2022-01-24 19:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][790/1251] eta 0:17:02 lr 0.000164 time 2.2780 (2.2181) loss 3.1730 (3.1669) grad_norm 1.9781 (2.2323) [2022-01-24 19:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][800/1251] eta 0:16:41 lr 0.000164 time 2.4807 (2.2203) loss 4.0408 (3.1684) grad_norm 2.1317 (2.2312) [2022-01-24 19:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][810/1251] eta 0:16:18 lr 
0.000164 time 2.4697 (2.2198) loss 2.4382 (3.1692) grad_norm 2.0959 (2.2311) [2022-01-24 19:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][820/1251] eta 0:15:56 lr 0.000164 time 1.8566 (2.2186) loss 3.4880 (3.1722) grad_norm 2.0961 (2.2304) [2022-01-24 19:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][830/1251] eta 0:15:34 lr 0.000164 time 2.4372 (2.2188) loss 3.6593 (3.1735) grad_norm 2.0585 (2.2290) [2022-01-24 19:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][840/1251] eta 0:15:12 lr 0.000164 time 1.6867 (2.2198) loss 3.3305 (3.1734) grad_norm 1.9252 (2.2272) [2022-01-24 19:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][850/1251] eta 0:14:49 lr 0.000164 time 1.6574 (2.2182) loss 3.0250 (3.1751) grad_norm 2.1543 (2.2292) [2022-01-24 19:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][860/1251] eta 0:14:27 lr 0.000164 time 2.1380 (2.2177) loss 3.6867 (3.1778) grad_norm 2.1956 (2.2308) [2022-01-24 19:17:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][870/1251] eta 0:14:05 lr 0.000164 time 1.8958 (2.2187) loss 3.2211 (3.1774) grad_norm 2.1708 (2.2302) [2022-01-24 19:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][880/1251] eta 0:13:43 lr 0.000164 time 2.0609 (2.2202) loss 3.3474 (3.1776) grad_norm 2.1206 (2.2305) [2022-01-24 19:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][890/1251] eta 0:13:20 lr 0.000163 time 1.5256 (2.2175) loss 3.5890 (3.1794) grad_norm 2.4503 (2.2306) [2022-01-24 19:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][900/1251] eta 0:12:58 lr 0.000163 time 1.7910 (2.2175) loss 2.1399 (3.1792) grad_norm 2.2278 (2.2315) [2022-01-24 19:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][910/1251] eta 0:12:35 lr 0.000163 time 1.9074 (2.2168) loss 3.9053 (3.1777) grad_norm 2.3691 (2.2313) [2022-01-24 19:19:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][920/1251] eta 0:12:14 lr 0.000163 time 1.9058 (2.2184) loss 2.9762 (3.1804) grad_norm 1.9941 (2.2315) [2022-01-24 19:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][930/1251] eta 0:11:51 lr 0.000163 time 1.8954 (2.2179) loss 3.9517 (3.1811) grad_norm 2.2873 (2.2305) [2022-01-24 19:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][940/1251] eta 0:11:29 lr 0.000163 time 1.7126 (2.2179) loss 3.5616 (3.1791) grad_norm 2.6675 (2.2304) [2022-01-24 19:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][950/1251] eta 0:11:07 lr 0.000163 time 1.8695 (2.2174) loss 3.1744 (3.1801) grad_norm 2.3282 (2.2301) [2022-01-24 19:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][960/1251] eta 0:10:44 lr 0.000163 time 1.6785 (2.2160) loss 3.5417 (3.1829) grad_norm 2.0128 (2.2302) [2022-01-24 19:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][970/1251] eta 0:10:22 lr 0.000163 time 2.6293 (2.2143) loss 3.1044 (3.1846) grad_norm 2.4532 (2.2303) [2022-01-24 19:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][980/1251] eta 0:09:59 lr 0.000163 time 1.5984 (2.2131) loss 2.6912 (3.1857) grad_norm 2.2191 (2.2301) [2022-01-24 19:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][990/1251] eta 0:09:37 lr 0.000163 time 2.1479 (2.2134) loss 3.9232 (3.1843) grad_norm 2.1873 (2.2314) [2022-01-24 19:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1000/1251] eta 0:09:15 lr 0.000163 time 2.1766 (2.2128) loss 3.5901 (3.1832) grad_norm 2.4305 (2.2329) [2022-01-24 19:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1010/1251] eta 0:08:53 lr 0.000163 time 2.8569 (2.2127) loss 2.1225 (3.1812) grad_norm 2.1279 (2.2322) [2022-01-24 19:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1020/1251] eta 0:08:31 lr 
0.000163 time 2.1621 (2.2131) loss 2.8159 (3.1828) grad_norm 2.1482 (2.2334) [2022-01-24 19:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1030/1251] eta 0:08:09 lr 0.000163 time 2.6105 (2.2140) loss 2.6174 (3.1819) grad_norm 2.3715 (2.2344) [2022-01-24 19:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1040/1251] eta 0:07:46 lr 0.000163 time 2.1135 (2.2127) loss 3.4987 (3.1808) grad_norm 2.4902 (2.2334) [2022-01-24 19:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1050/1251] eta 0:07:24 lr 0.000163 time 2.0152 (2.2128) loss 3.0045 (3.1821) grad_norm 1.9326 (2.2319) [2022-01-24 19:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1060/1251] eta 0:07:02 lr 0.000163 time 1.8958 (2.2122) loss 2.5772 (3.1788) grad_norm 2.1105 (2.2324) [2022-01-24 19:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1070/1251] eta 0:06:40 lr 0.000163 time 1.9897 (2.2114) loss 3.5842 (3.1793) grad_norm 2.2867 (2.2325) [2022-01-24 19:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1080/1251] eta 0:06:18 lr 0.000163 time 2.2269 (2.2116) loss 3.4724 (3.1810) grad_norm 2.3602 (2.2324) [2022-01-24 19:25:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1090/1251] eta 0:05:55 lr 0.000163 time 1.7957 (2.2102) loss 3.3462 (3.1810) grad_norm 2.2084 (2.2314) [2022-01-24 19:26:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1100/1251] eta 0:05:33 lr 0.000163 time 2.2179 (2.2097) loss 3.1700 (3.1823) grad_norm 2.3587 (2.2315) [2022-01-24 19:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1110/1251] eta 0:05:11 lr 0.000163 time 2.2184 (2.2083) loss 3.7677 (3.1831) grad_norm 2.1120 (2.2306) [2022-01-24 19:26:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1120/1251] eta 0:04:49 lr 0.000163 time 2.5801 (2.2085) loss 3.4136 (3.1833) grad_norm 3.0054 (2.2310) [2022-01-24 
19:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1130/1251] eta 0:04:27 lr 0.000163 time 1.6871 (2.2084) loss 3.7222 (3.1867) grad_norm 2.1727 (2.2311) [2022-01-24 19:27:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1140/1251] eta 0:04:05 lr 0.000163 time 2.4800 (2.2081) loss 2.6086 (3.1863) grad_norm 2.4726 (2.2333) [2022-01-24 19:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1150/1251] eta 0:03:42 lr 0.000163 time 2.0469 (2.2077) loss 3.6267 (3.1882) grad_norm 2.0670 (2.2330) [2022-01-24 19:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1160/1251] eta 0:03:20 lr 0.000163 time 2.3010 (2.2087) loss 3.2507 (3.1871) grad_norm 2.1880 (2.2347) [2022-01-24 19:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1170/1251] eta 0:02:58 lr 0.000163 time 1.9811 (2.2083) loss 3.3452 (3.1848) grad_norm 2.0588 (2.2348) [2022-01-24 19:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1180/1251] eta 0:02:36 lr 0.000163 time 2.0462 (2.2097) loss 2.9886 (3.1851) grad_norm 1.8893 (2.2358) [2022-01-24 19:29:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1190/1251] eta 0:02:14 lr 0.000163 time 1.5869 (2.2092) loss 3.4933 (3.1868) grad_norm 2.3224 (2.2368) [2022-01-24 19:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1200/1251] eta 0:01:52 lr 0.000163 time 1.9576 (2.2080) loss 3.2538 (3.1870) grad_norm 2.2615 (2.2361) [2022-01-24 19:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1210/1251] eta 0:01:30 lr 0.000163 time 2.2214 (2.2072) loss 3.8485 (3.1867) grad_norm 1.8325 (2.2355) [2022-01-24 19:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1220/1251] eta 0:01:08 lr 0.000162 time 2.1978 (2.2069) loss 2.9771 (3.1881) grad_norm 2.2128 (2.2360) [2022-01-24 19:30:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1230/1251] 
eta 0:00:46 lr 0.000162 time 2.1695 (2.2059) loss 2.3868 (3.1883) grad_norm 2.2047 (2.2356) [2022-01-24 19:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1240/1251] eta 0:00:24 lr 0.000162 time 1.3368 (2.2046) loss 3.4552 (3.1877) grad_norm 2.0956 (2.2361) [2022-01-24 19:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [222/300][1250/1251] eta 0:00:02 lr 0.000162 time 1.1616 (2.1995) loss 3.1439 (3.1868) grad_norm 2.2189 (2.2360) [2022-01-24 19:31:24 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 222 training takes 0:45:52 [2022-01-24 19:31:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.737 (18.737) Loss 0.8487 (0.8487) Acc@1 80.469 (80.469) Acc@5 95.312 (95.312) [2022-01-24 19:32:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.948 (3.468) Loss 0.8388 (0.8933) Acc@1 81.348 (79.439) Acc@5 94.434 (94.718) [2022-01-24 19:32:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.956 (2.592) Loss 0.9526 (0.8941) Acc@1 77.051 (79.260) Acc@5 94.141 (94.745) [2022-01-24 19:32:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.607 (2.287) Loss 0.8682 (0.8923) Acc@1 79.492 (79.193) Acc@5 94.824 (94.761) [2022-01-24 19:32:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.822 (2.190) Loss 0.8212 (0.8920) Acc@1 80.469 (79.161) Acc@5 95.410 (94.746) [2022-01-24 19:33:01 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.230 Acc@5 94.806 [2022-01-24 19:33:01 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.2% [2022-01-24 19:33:01 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.33% [2022-01-24 19:33:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][0/1251] eta 7:28:19 lr 0.000162 time 21.5024 (21.5024) loss 3.6230 (3.6230) grad_norm 2.1886 (2.1886) [2022-01-24 19:33:46 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [223/300][10/1251] eta 1:24:08 lr 0.000162 time 1.6449 (4.0678) loss 2.6501 (2.9597) grad_norm 2.2895 (2.1727) [2022-01-24 19:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][20/1251] eta 1:03:18 lr 0.000162 time 1.5010 (3.0856) loss 3.3629 (3.0720) grad_norm 2.0908 (2.2081) [2022-01-24 19:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][30/1251] eta 0:56:54 lr 0.000162 time 1.7185 (2.7962) loss 3.8442 (3.1658) grad_norm 2.4510 (2.1923) [2022-01-24 19:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][40/1251] eta 0:54:30 lr 0.000162 time 3.3307 (2.7006) loss 3.5247 (3.1702) grad_norm 2.2322 (2.2022) [2022-01-24 19:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][50/1251] eta 0:54:29 lr 0.000162 time 2.5057 (2.7225) loss 3.6873 (3.1651) grad_norm 2.2046 (2.2139) [2022-01-24 19:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][60/1251] eta 0:52:18 lr 0.000162 time 1.6461 (2.6354) loss 3.4707 (3.1604) grad_norm 2.2851 (2.2286) [2022-01-24 19:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][70/1251] eta 0:50:15 lr 0.000162 time 1.8370 (2.5530) loss 3.4385 (3.1756) grad_norm 2.5821 (2.2666) [2022-01-24 19:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][80/1251] eta 0:48:36 lr 0.000162 time 2.0443 (2.4910) loss 3.8642 (3.1892) grad_norm 2.2342 (2.2964) [2022-01-24 19:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][90/1251] eta 0:47:49 lr 0.000162 time 1.9344 (2.4717) loss 3.3454 (3.1993) grad_norm 2.5060 (2.2957) [2022-01-24 19:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][100/1251] eta 0:47:28 lr 0.000162 time 1.7391 (2.4746) loss 2.5928 (3.1939) grad_norm 2.0804 (2.3029) [2022-01-24 19:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][110/1251] eta 0:46:21 lr 0.000162 time 1.6171 (2.4381) loss 2.6904 (3.1947) grad_norm 
1.9432 (2.2837) [2022-01-24 19:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][120/1251] eta 0:45:18 lr 0.000162 time 1.5639 (2.4040) loss 2.1807 (3.1746) grad_norm 2.5117 (2.2873) [2022-01-24 19:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][130/1251] eta 0:44:43 lr 0.000162 time 1.8632 (2.3940) loss 3.1933 (3.1814) grad_norm 2.5930 (2.2808) [2022-01-24 19:38:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][140/1251] eta 0:43:55 lr 0.000162 time 1.9007 (2.3723) loss 3.1159 (3.1868) grad_norm 2.1677 (2.2761) [2022-01-24 19:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][150/1251] eta 0:43:11 lr 0.000162 time 1.8102 (2.3542) loss 3.0739 (3.1853) grad_norm 2.0670 (2.2790) [2022-01-24 19:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][160/1251] eta 0:42:39 lr 0.000162 time 1.9625 (2.3461) loss 3.8148 (3.1850) grad_norm 1.9659 (2.2701) [2022-01-24 19:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][170/1251] eta 0:42:07 lr 0.000162 time 2.2607 (2.3377) loss 2.1202 (3.1698) grad_norm 2.3998 (2.2701) [2022-01-24 19:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][180/1251] eta 0:41:31 lr 0.000162 time 1.8777 (2.3265) loss 3.3683 (3.1852) grad_norm 2.3522 (2.2642) [2022-01-24 19:40:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][190/1251] eta 0:40:59 lr 0.000162 time 2.2926 (2.3185) loss 3.7647 (3.1813) grad_norm 2.4125 (2.2608) [2022-01-24 19:40:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][200/1251] eta 0:40:24 lr 0.000162 time 1.5825 (2.3064) loss 3.5506 (3.1892) grad_norm 2.1951 (2.2645) [2022-01-24 19:41:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][210/1251] eta 0:39:52 lr 0.000162 time 2.5240 (2.2980) loss 3.4117 (3.1913) grad_norm 1.9468 (2.2607) [2022-01-24 19:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[223/300][220/1251] eta 0:39:22 lr 0.000162 time 1.9167 (2.2917) loss 2.0786 (3.1880) grad_norm 2.2328 (2.2548) [2022-01-24 19:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][230/1251] eta 0:39:08 lr 0.000162 time 2.2135 (2.2999) loss 3.8925 (3.2068) grad_norm 2.4188 (2.2610) [2022-01-24 19:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][240/1251] eta 0:38:34 lr 0.000162 time 1.6610 (2.2892) loss 2.3780 (3.2067) grad_norm 2.0552 (2.2659) [2022-01-24 19:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][250/1251] eta 0:38:06 lr 0.000162 time 3.0058 (2.2845) loss 3.2825 (3.1934) grad_norm 1.9746 (2.2658) [2022-01-24 19:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][260/1251] eta 0:37:34 lr 0.000162 time 1.8581 (2.2754) loss 3.3719 (3.2065) grad_norm 1.9378 (2.2666) [2022-01-24 19:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][270/1251] eta 0:37:12 lr 0.000162 time 1.7048 (2.2757) loss 2.8514 (3.2062) grad_norm 2.0145 (2.2626) [2022-01-24 19:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][280/1251] eta 0:36:45 lr 0.000162 time 1.9035 (2.2712) loss 3.4116 (3.1991) grad_norm 2.0809 (2.2600) [2022-01-24 19:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][290/1251] eta 0:36:18 lr 0.000162 time 2.5813 (2.2673) loss 2.9331 (3.1838) grad_norm 2.3597 (2.2560) [2022-01-24 19:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][300/1251] eta 0:35:50 lr 0.000161 time 2.1674 (2.2611) loss 3.5290 (3.1781) grad_norm 2.0605 (2.2496) [2022-01-24 19:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][310/1251] eta 0:35:28 lr 0.000161 time 1.8416 (2.2620) loss 3.3283 (3.1816) grad_norm 2.2386 (2.2480) [2022-01-24 19:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][320/1251] eta 0:35:07 lr 0.000161 time 2.2715 (2.2642) loss 3.7878 (3.1842) grad_norm 
2.2587 (2.2458) [2022-01-24 19:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][330/1251] eta 0:34:45 lr 0.000161 time 2.8438 (2.2643) loss 3.1693 (3.1857) grad_norm 2.2865 (2.2426) [2022-01-24 19:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][340/1251] eta 0:34:17 lr 0.000161 time 2.1272 (2.2590) loss 4.0504 (3.1901) grad_norm 2.1199 (2.2420) [2022-01-24 19:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][350/1251] eta 0:33:53 lr 0.000161 time 1.9003 (2.2573) loss 2.7467 (3.1995) grad_norm 2.2622 (2.2450) [2022-01-24 19:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][360/1251] eta 0:33:26 lr 0.000161 time 1.8532 (2.2514) loss 3.0670 (3.2004) grad_norm 1.8082 (2.2429) [2022-01-24 19:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][370/1251] eta 0:33:00 lr 0.000161 time 2.5088 (2.2482) loss 3.6569 (3.2008) grad_norm 2.5202 (2.2441) [2022-01-24 19:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][380/1251] eta 0:32:36 lr 0.000161 time 2.3572 (2.2464) loss 3.2080 (3.2007) grad_norm 1.8818 (2.2394) [2022-01-24 19:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][390/1251] eta 0:32:12 lr 0.000161 time 1.5250 (2.2440) loss 2.6616 (3.2018) grad_norm 2.4492 (2.2414) [2022-01-24 19:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][400/1251] eta 0:31:45 lr 0.000161 time 2.1592 (2.2397) loss 3.5904 (3.2042) grad_norm 2.0758 (2.2417) [2022-01-24 19:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][410/1251] eta 0:31:21 lr 0.000161 time 2.2191 (2.2375) loss 3.3521 (3.2111) grad_norm 2.6178 (2.2419) [2022-01-24 19:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][420/1251] eta 0:30:59 lr 0.000161 time 1.9533 (2.2381) loss 3.9231 (3.2201) grad_norm 2.1179 (2.2442) [2022-01-24 19:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[223/300][430/1251] eta 0:30:37 lr 0.000161 time 1.8276 (2.2384) loss 3.1292 (3.2155) grad_norm 2.1767 (2.2410) [2022-01-24 19:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][440/1251] eta 0:30:13 lr 0.000161 time 2.8607 (2.2360) loss 3.6250 (3.2168) grad_norm 2.8950 (2.2417) [2022-01-24 19:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][450/1251] eta 0:29:49 lr 0.000161 time 1.9140 (2.2343) loss 2.2426 (3.2205) grad_norm 2.3130 (2.2399) [2022-01-24 19:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][460/1251] eta 0:29:26 lr 0.000161 time 2.2072 (2.2339) loss 3.4932 (3.2224) grad_norm 2.2755 (2.2420) [2022-01-24 19:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][470/1251] eta 0:29:05 lr 0.000161 time 1.5836 (2.2355) loss 3.5493 (3.2225) grad_norm 3.1575 (2.2419) [2022-01-24 19:50:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][480/1251] eta 0:28:44 lr 0.000161 time 2.9194 (2.2370) loss 3.4803 (3.2267) grad_norm 2.3137 (2.2410) [2022-01-24 19:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][490/1251] eta 0:28:22 lr 0.000161 time 1.8316 (2.2368) loss 3.7610 (3.2199) grad_norm 2.0641 (2.2401) [2022-01-24 19:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][500/1251] eta 0:27:58 lr 0.000161 time 1.7577 (2.2348) loss 3.4406 (3.2203) grad_norm 2.4533 (2.2395) [2022-01-24 19:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][510/1251] eta 0:27:35 lr 0.000161 time 1.6494 (2.2346) loss 3.5853 (3.2231) grad_norm 2.0791 (2.2409) [2022-01-24 19:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][520/1251] eta 0:27:12 lr 0.000161 time 2.6280 (2.2339) loss 3.4898 (3.2233) grad_norm 2.1326 (2.2417) [2022-01-24 19:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][530/1251] eta 0:26:48 lr 0.000161 time 1.5775 (2.2309) loss 3.4153 (3.2224) grad_norm 
2.1699 (2.2424) [2022-01-24 19:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][540/1251] eta 0:26:24 lr 0.000161 time 1.7562 (2.2279) loss 3.4139 (3.2199) grad_norm 2.0986 (2.2444) [2022-01-24 19:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][550/1251] eta 0:26:04 lr 0.000161 time 2.5451 (2.2315) loss 3.8879 (3.2189) grad_norm 2.4097 (2.2429) [2022-01-24 19:53:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][560/1251] eta 0:25:40 lr 0.000161 time 2.3483 (2.2297) loss 3.9947 (3.2247) grad_norm 2.3603 (2.2450) [2022-01-24 19:54:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][570/1251] eta 0:25:16 lr 0.000161 time 1.9855 (2.2274) loss 3.0024 (3.2284) grad_norm 3.1443 (2.2472) [2022-01-24 19:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][580/1251] eta 0:24:53 lr 0.000161 time 2.2681 (2.2262) loss 2.6227 (3.2285) grad_norm 2.1435 (2.2461) [2022-01-24 19:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][590/1251] eta 0:24:30 lr 0.000161 time 1.9160 (2.2240) loss 2.8720 (3.2270) grad_norm 2.2396 (2.2476) [2022-01-24 19:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][600/1251] eta 0:24:08 lr 0.000161 time 2.5675 (2.2249) loss 3.2090 (3.2290) grad_norm 2.1667 (2.2488) [2022-01-24 19:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][610/1251] eta 0:23:46 lr 0.000161 time 1.9410 (2.2259) loss 3.7319 (3.2312) grad_norm 2.2882 (2.2512) [2022-01-24 19:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][620/1251] eta 0:23:24 lr 0.000161 time 2.4450 (2.2255) loss 3.0298 (3.2327) grad_norm 2.2368 (2.2514) [2022-01-24 19:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][630/1251] eta 0:22:59 lr 0.000161 time 1.9440 (2.2220) loss 3.5678 (3.2348) grad_norm 2.8045 (2.2503) [2022-01-24 19:56:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[223/300][640/1251] eta 0:22:36 lr 0.000160 time 2.2469 (2.2204) loss 2.6351 (3.2372) grad_norm 1.8209 (2.2492) [2022-01-24 19:57:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][650/1251] eta 0:22:13 lr 0.000160 time 1.5861 (2.2184) loss 3.9658 (3.2372) grad_norm 2.3004 (2.2478) [2022-01-24 19:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][660/1251] eta 0:21:51 lr 0.000160 time 2.4677 (2.2184) loss 3.3514 (3.2367) grad_norm 2.0424 (2.2450) [2022-01-24 19:57:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][670/1251] eta 0:21:28 lr 0.000160 time 2.1670 (2.2182) loss 3.1549 (3.2357) grad_norm 1.9509 (2.2433) [2022-01-24 19:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][680/1251] eta 0:21:07 lr 0.000160 time 2.5764 (2.2189) loss 3.1015 (3.2375) grad_norm 2.0021 (2.2413) [2022-01-24 19:58:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][690/1251] eta 0:20:44 lr 0.000160 time 1.6148 (2.2185) loss 2.8735 (3.2374) grad_norm 2.0000 (2.2447) [2022-01-24 19:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][700/1251] eta 0:20:23 lr 0.000160 time 2.4596 (2.2200) loss 3.8115 (3.2405) grad_norm 2.2415 (2.2433) [2022-01-24 19:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][710/1251] eta 0:20:01 lr 0.000160 time 1.9025 (2.2202) loss 3.5043 (3.2397) grad_norm 2.2080 (2.2437) [2022-01-24 19:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][720/1251] eta 0:19:40 lr 0.000160 time 3.7407 (2.2236) loss 3.5308 (3.2381) grad_norm 2.2451 (2.2441) [2022-01-24 20:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][730/1251] eta 0:19:17 lr 0.000160 time 1.7146 (2.2216) loss 2.8942 (3.2391) grad_norm 2.3446 (2.2441) [2022-01-24 20:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][740/1251] eta 0:18:54 lr 0.000160 time 1.8265 (2.2202) loss 3.0878 (3.2354) grad_norm 
2.2813 (2.2444) [2022-01-24 20:00:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][750/1251] eta 0:18:31 lr 0.000160 time 1.9076 (2.2178) loss 3.4800 (3.2371) grad_norm 2.1896 (2.2446) [2022-01-24 20:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][760/1251] eta 0:18:08 lr 0.000160 time 3.1974 (2.2168) loss 3.0934 (3.2386) grad_norm 2.2544 (2.2441) [2022-01-24 20:01:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][770/1251] eta 0:17:45 lr 0.000160 time 1.9394 (2.2153) loss 3.0306 (3.2366) grad_norm 2.2344 (2.2433) [2022-01-24 20:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][780/1251] eta 0:17:23 lr 0.000160 time 2.7443 (2.2163) loss 3.4977 (3.2359) grad_norm 2.1733 (2.2434) [2022-01-24 20:02:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][790/1251] eta 0:17:01 lr 0.000160 time 1.5329 (2.2162) loss 2.8221 (3.2311) grad_norm 2.0292 (2.2428) [2022-01-24 20:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][800/1251] eta 0:16:41 lr 0.000160 time 2.0104 (2.2201) loss 3.2488 (3.2295) grad_norm 1.9500 (2.2424) [2022-01-24 20:03:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][810/1251] eta 0:16:19 lr 0.000160 time 2.9480 (2.2203) loss 3.1845 (3.2290) grad_norm 2.0252 (2.2418) [2022-01-24 20:03:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][820/1251] eta 0:15:56 lr 0.000160 time 1.5415 (2.2188) loss 2.8806 (3.2262) grad_norm 2.7637 (2.2413) [2022-01-24 20:03:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][830/1251] eta 0:15:33 lr 0.000160 time 1.9429 (2.2166) loss 2.7622 (3.2260) grad_norm 2.0009 (2.2407) [2022-01-24 20:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][840/1251] eta 0:15:10 lr 0.000160 time 1.8608 (2.2153) loss 3.4316 (3.2262) grad_norm 2.2817 (2.2413) [2022-01-24 20:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[223/300][850/1251] eta 0:14:47 lr 0.000160 time 2.0593 (2.2144) loss 2.9176 (3.2260) grad_norm 2.2289 (2.2403) [2022-01-24 20:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][860/1251] eta 0:14:25 lr 0.000160 time 2.6412 (2.2137) loss 2.3773 (3.2261) grad_norm 2.3925 (2.2407) [2022-01-24 20:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][870/1251] eta 0:14:03 lr 0.000160 time 2.2892 (2.2151) loss 3.2067 (3.2271) grad_norm 2.4632 (2.2411) [2022-01-24 20:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][880/1251] eta 0:13:41 lr 0.000160 time 2.1517 (2.2131) loss 2.6088 (3.2247) grad_norm 2.0356 (2.2428) [2022-01-24 20:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][890/1251] eta 0:13:18 lr 0.000160 time 2.1197 (2.2114) loss 3.1668 (3.2246) grad_norm 2.9518 (2.2450) [2022-01-24 20:06:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][900/1251] eta 0:12:55 lr 0.000160 time 1.8816 (2.2105) loss 3.4328 (3.2235) grad_norm 2.3047 (2.2475) [2022-01-24 20:06:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][910/1251] eta 0:12:33 lr 0.000160 time 2.2107 (2.2105) loss 3.0906 (3.2255) grad_norm 1.8752 (2.2469) [2022-01-24 20:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][920/1251] eta 0:12:11 lr 0.000160 time 2.4624 (2.2099) loss 3.4260 (3.2242) grad_norm 2.1805 (2.2462) [2022-01-24 20:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][930/1251] eta 0:11:49 lr 0.000160 time 2.5455 (2.2089) loss 3.0888 (3.2235) grad_norm 2.3762 (2.2458) [2022-01-24 20:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][940/1251] eta 0:11:27 lr 0.000160 time 3.0927 (2.2114) loss 2.9967 (3.2236) grad_norm 2.6161 (2.2456) [2022-01-24 20:08:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][950/1251] eta 0:11:06 lr 0.000160 time 2.7524 (2.2129) loss 2.8801 (3.2210) grad_norm 
2.0890 (2.2454) [2022-01-24 20:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][960/1251] eta 0:10:43 lr 0.000160 time 1.7916 (2.2118) loss 3.9149 (3.2218) grad_norm 2.0043 (2.2439) [2022-01-24 20:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][970/1251] eta 0:10:21 lr 0.000159 time 1.8290 (2.2105) loss 3.7489 (3.2233) grad_norm 2.1335 (2.2425) [2022-01-24 20:09:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][980/1251] eta 0:09:58 lr 0.000159 time 1.8901 (2.2086) loss 3.0320 (3.2223) grad_norm 2.2730 (2.2422) [2022-01-24 20:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][990/1251] eta 0:09:36 lr 0.000159 time 2.4618 (2.2097) loss 3.4117 (3.2240) grad_norm 2.1894 (2.2420) [2022-01-24 20:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1000/1251] eta 0:09:14 lr 0.000159 time 2.3288 (2.2095) loss 3.8474 (3.2250) grad_norm 2.0657 (2.2432) [2022-01-24 20:10:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1010/1251] eta 0:08:52 lr 0.000159 time 1.8167 (2.2091) loss 3.1685 (3.2271) grad_norm 2.5144 (2.2430) [2022-01-24 20:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1020/1251] eta 0:08:30 lr 0.000159 time 1.5106 (2.2091) loss 3.5898 (3.2287) grad_norm 2.3687 (2.2419) [2022-01-24 20:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1030/1251] eta 0:08:08 lr 0.000159 time 3.1321 (2.2119) loss 3.9106 (3.2296) grad_norm 2.0644 (2.2421) [2022-01-24 20:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1040/1251] eta 0:07:46 lr 0.000159 time 1.5719 (2.2119) loss 2.0821 (3.2283) grad_norm 2.6145 (2.2414) [2022-01-24 20:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1050/1251] eta 0:07:24 lr 0.000159 time 1.5593 (2.2111) loss 2.3322 (3.2261) grad_norm 2.1817 (2.2406) [2022-01-24 20:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[223/300][1060/1251] eta 0:07:02 lr 0.000159 time 1.9223 (2.2107) loss 2.9508 (3.2261) grad_norm 1.9879 (2.2412) [2022-01-24 20:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1070/1251] eta 0:06:40 lr 0.000159 time 2.5623 (2.2108) loss 3.5270 (3.2276) grad_norm 2.1325 (2.2401) [2022-01-24 20:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1080/1251] eta 0:06:17 lr 0.000159 time 1.6961 (2.2092) loss 3.9386 (3.2281) grad_norm 2.4857 (2.2394) [2022-01-24 20:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1090/1251] eta 0:05:55 lr 0.000159 time 2.1475 (2.2091) loss 3.1671 (3.2281) grad_norm 2.5052 (2.2391) [2022-01-24 20:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1100/1251] eta 0:05:33 lr 0.000159 time 2.0424 (2.2080) loss 1.9273 (3.2276) grad_norm 2.2054 (2.2385) [2022-01-24 20:13:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1110/1251] eta 0:05:11 lr 0.000159 time 1.8437 (2.2076) loss 3.4134 (3.2253) grad_norm 2.1198 (2.2394) [2022-01-24 20:14:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1120/1251] eta 0:04:49 lr 0.000159 time 2.0591 (2.2071) loss 3.2760 (3.2249) grad_norm 2.7713 (2.2394) [2022-01-24 20:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1130/1251] eta 0:04:27 lr 0.000159 time 1.6007 (2.2071) loss 3.7063 (3.2260) grad_norm 2.2484 (2.2394) [2022-01-24 20:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1140/1251] eta 0:04:05 lr 0.000159 time 2.0198 (2.2074) loss 4.1012 (3.2259) grad_norm 2.1605 (2.2393) [2022-01-24 20:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1150/1251] eta 0:03:43 lr 0.000159 time 2.6890 (2.2083) loss 3.1171 (3.2265) grad_norm 2.3404 (2.2401) [2022-01-24 20:15:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1160/1251] eta 0:03:20 lr 0.000159 time 1.6942 (2.2071) loss 2.6120 (3.2237) 
grad_norm 2.1096 (2.2402) [2022-01-24 20:16:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1170/1251] eta 0:02:58 lr 0.000159 time 1.8988 (2.2074) loss 3.2427 (3.2220) grad_norm 2.0535 (2.2391) [2022-01-24 20:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1180/1251] eta 0:02:36 lr 0.000159 time 1.6121 (2.2066) loss 3.6299 (3.2233) grad_norm 2.1708 (2.2396) [2022-01-24 20:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1190/1251] eta 0:02:14 lr 0.000159 time 2.5334 (2.2056) loss 3.7536 (3.2235) grad_norm 2.1498 (2.2413) [2022-01-24 20:17:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1200/1251] eta 0:01:52 lr 0.000159 time 1.4417 (2.2047) loss 1.9496 (3.2214) grad_norm 1.9732 (2.2423) [2022-01-24 20:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1210/1251] eta 0:01:30 lr 0.000159 time 2.1635 (2.2042) loss 3.4705 (3.2223) grad_norm 2.4036 (2.2439) [2022-01-24 20:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1220/1251] eta 0:01:08 lr 0.000159 time 1.8211 (2.2046) loss 3.4362 (3.2228) grad_norm 2.1672 (2.2440) [2022-01-24 20:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1230/1251] eta 0:00:46 lr 0.000159 time 2.6418 (2.2060) loss 3.5132 (3.2227) grad_norm 2.2996 (2.2448) [2022-01-24 20:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1240/1251] eta 0:00:24 lr 0.000159 time 1.3738 (2.2049) loss 3.9572 (3.2219) grad_norm 2.1844 (2.2445) [2022-01-24 20:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [223/300][1250/1251] eta 0:00:02 lr 0.000159 time 1.2003 (2.1991) loss 2.9332 (3.2229) grad_norm 2.0266 (2.2431) [2022-01-24 20:18:53 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 223 training takes 0:45:51 [2022-01-24 20:19:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.794 (20.794) Loss 0.8937 (0.8937) Acc@1 78.418 (78.418) 
Acc@5 94.727 (94.727) [2022-01-24 20:19:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.257 (3.344) Loss 0.8668 (0.8824) Acc@1 79.004 (79.226) Acc@5 95.117 (94.771) [2022-01-24 20:19:49 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.905 (2.673) Loss 0.9081 (0.8701) Acc@1 78.320 (79.492) Acc@5 94.922 (94.913) [2022-01-24 20:20:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.569 (2.230) Loss 0.8705 (0.8696) Acc@1 79.297 (79.458) Acc@5 95.215 (94.960) [2022-01-24 20:20:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.289 (2.161) Loss 0.8525 (0.8699) Acc@1 81.055 (79.549) Acc@5 94.727 (94.941) [2022-01-24 20:20:29 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.472 Acc@5 94.876 [2022-01-24 20:20:29 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-01-24 20:20:29 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.47% [2022-01-24 20:20:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][0/1251] eta 7:34:41 lr 0.000159 time 21.8080 (21.8080) loss 3.2592 (3.2592) grad_norm 1.9753 (1.9753) [2022-01-24 20:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][10/1251] eta 1:26:50 lr 0.000159 time 2.2680 (4.1985) loss 3.2202 (3.2945) grad_norm 2.9437 (2.2582) [2022-01-24 20:21:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][20/1251] eta 1:04:49 lr 0.000159 time 1.5275 (3.1592) loss 3.0373 (3.2501) grad_norm 1.9614 (2.2124) [2022-01-24 20:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][30/1251] eta 0:57:36 lr 0.000159 time 1.6891 (2.8306) loss 2.1628 (3.2618) grad_norm 2.1459 (2.2382) [2022-01-24 20:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][40/1251] eta 0:54:25 lr 0.000159 time 3.1137 (2.6964) loss 3.6812 (3.2735) grad_norm 2.0940 (2.2538) [2022-01-24 20:22:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][50/1251] eta 0:53:37 lr 0.000159 time 3.2565 (2.6789) loss 2.7451 (3.2976) grad_norm 2.0048 (2.2609) [2022-01-24 20:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][60/1251] eta 0:51:42 lr 0.000158 time 1.8139 (2.6052) loss 3.6754 (3.2943) grad_norm 2.0690 (2.2447) [2022-01-24 20:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][70/1251] eta 0:49:55 lr 0.000158 time 1.5681 (2.5368) loss 3.2496 (3.2654) grad_norm 2.4453 (2.2695) [2022-01-24 20:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][80/1251] eta 0:48:41 lr 0.000158 time 3.0862 (2.4950) loss 3.8203 (3.2866) grad_norm 2.1963 (2.2690) [2022-01-24 20:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][90/1251] eta 0:47:21 lr 0.000158 time 1.9926 (2.4479) loss 4.0965 (3.2732) grad_norm 2.3455 (2.2648) [2022-01-24 20:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][100/1251] eta 0:46:17 lr 0.000158 time 1.8951 (2.4130) loss 3.6922 (3.2903) grad_norm 1.9948 (2.2565) [2022-01-24 20:24:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][110/1251] eta 0:45:07 lr 0.000158 time 2.0256 (2.3733) loss 2.1692 (3.2878) grad_norm 2.2153 (2.2441) [2022-01-24 20:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][120/1251] eta 0:44:21 lr 0.000158 time 2.2103 (2.3528) loss 2.1812 (3.2869) grad_norm 2.0766 (2.2429) [2022-01-24 20:25:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][130/1251] eta 0:43:39 lr 0.000158 time 2.1954 (2.3366) loss 3.2851 (3.2640) grad_norm 2.3143 (2.2521) [2022-01-24 20:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][140/1251] eta 0:43:04 lr 0.000158 time 1.5737 (2.3263) loss 2.4775 (3.2442) grad_norm 2.1075 (2.2436) [2022-01-24 20:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][150/1251] eta 0:42:35 lr 0.000158 
time 2.4188 (2.3214) loss 2.0620 (3.2349) grad_norm 1.9714 (2.2388) [2022-01-24 20:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][160/1251] eta 0:42:23 lr 0.000158 time 3.1942 (2.3313) loss 2.7038 (3.2060) grad_norm 2.0285 (2.2381) [2022-01-24 20:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][170/1251] eta 0:42:01 lr 0.000158 time 2.4647 (2.3329) loss 3.1705 (3.1999) grad_norm 2.4618 (2.2405) [2022-01-24 20:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][180/1251] eta 0:41:31 lr 0.000158 time 1.7645 (2.3268) loss 2.4003 (3.1753) grad_norm 2.5240 (2.2349) [2022-01-24 20:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][190/1251] eta 0:41:03 lr 0.000158 time 2.4259 (2.3216) loss 3.4887 (3.1792) grad_norm 2.1392 (2.2303) [2022-01-24 20:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][200/1251] eta 0:40:20 lr 0.000158 time 1.5669 (2.3031) loss 3.0646 (3.1819) grad_norm 2.5148 (2.2332) [2022-01-24 20:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][210/1251] eta 0:39:44 lr 0.000158 time 1.9118 (2.2905) loss 3.4818 (3.1922) grad_norm 1.9719 (2.2316) [2022-01-24 20:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][220/1251] eta 0:39:10 lr 0.000158 time 2.1982 (2.2802) loss 2.0449 (3.1816) grad_norm 2.0889 (2.2367) [2022-01-24 20:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][230/1251] eta 0:38:41 lr 0.000158 time 1.9683 (2.2735) loss 2.9463 (3.1828) grad_norm 2.4293 (2.2363) [2022-01-24 20:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][240/1251] eta 0:38:15 lr 0.000158 time 1.9591 (2.2702) loss 2.4044 (3.1889) grad_norm 2.2230 (2.2419) [2022-01-24 20:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][250/1251] eta 0:37:47 lr 0.000158 time 2.5074 (2.2652) loss 3.4129 (3.1884) grad_norm 2.0935 (2.2417) [2022-01-24 20:30:20 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][260/1251] eta 0:37:24 lr 0.000158 time 2.7208 (2.2647) loss 2.2571 (3.1828) grad_norm 1.9602 (2.2443) [2022-01-24 20:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][270/1251] eta 0:37:03 lr 0.000158 time 3.0608 (2.2663) loss 2.9521 (3.1916) grad_norm 2.2018 (2.2425) [2022-01-24 20:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][280/1251] eta 0:36:48 lr 0.000158 time 1.4354 (2.2740) loss 3.3736 (3.1973) grad_norm 2.1448 (2.2427) [2022-01-24 20:31:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][290/1251] eta 0:36:26 lr 0.000158 time 2.2146 (2.2752) loss 3.8054 (3.2029) grad_norm 2.4639 (2.2457) [2022-01-24 20:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][300/1251] eta 0:35:57 lr 0.000158 time 2.2830 (2.2688) loss 3.7484 (3.2018) grad_norm 2.2976 (2.2433) [2022-01-24 20:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][310/1251] eta 0:35:27 lr 0.000158 time 2.1529 (2.2607) loss 3.5199 (3.2047) grad_norm 2.2665 (2.2461) [2022-01-24 20:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][320/1251] eta 0:34:59 lr 0.000158 time 1.8337 (2.2549) loss 3.4337 (3.2134) grad_norm 1.8580 (2.2462) [2022-01-24 20:32:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][330/1251] eta 0:34:36 lr 0.000158 time 2.3823 (2.2546) loss 3.5928 (3.2161) grad_norm 2.6901 (2.2470) [2022-01-24 20:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][340/1251] eta 0:34:07 lr 0.000158 time 1.9244 (2.2477) loss 3.6332 (3.2141) grad_norm 2.5318 (2.2466) [2022-01-24 20:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][350/1251] eta 0:33:46 lr 0.000158 time 2.5818 (2.2491) loss 2.6397 (3.2068) grad_norm 2.0940 (2.2474) [2022-01-24 20:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][360/1251] eta 0:33:22 lr 
0.000158 time 2.1650 (2.2474) loss 3.2842 (3.2074) grad_norm 2.3043 (2.2520) [2022-01-24 20:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][370/1251] eta 0:32:59 lr 0.000158 time 2.5038 (2.2469) loss 2.8773 (3.2089) grad_norm 2.3444 (2.2536) [2022-01-24 20:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][380/1251] eta 0:32:38 lr 0.000158 time 2.4516 (2.2487) loss 2.5721 (3.2052) grad_norm 2.0624 (2.2530) [2022-01-24 20:35:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][390/1251] eta 0:32:17 lr 0.000158 time 2.1813 (2.2497) loss 3.3086 (3.2058) grad_norm 2.3698 (2.2550) [2022-01-24 20:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][400/1251] eta 0:31:49 lr 0.000157 time 1.7995 (2.2439) loss 3.1273 (3.2033) grad_norm 2.1125 (2.2550) [2022-01-24 20:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][410/1251] eta 0:31:21 lr 0.000157 time 1.6064 (2.2376) loss 3.3111 (3.2016) grad_norm 2.1945 (2.2544) [2022-01-24 20:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][420/1251] eta 0:30:56 lr 0.000157 time 1.9465 (2.2336) loss 2.9123 (3.2047) grad_norm 2.2584 (2.2551) [2022-01-24 20:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][430/1251] eta 0:30:30 lr 0.000157 time 1.8681 (2.2295) loss 2.6688 (3.2000) grad_norm 2.2915 (2.2528) [2022-01-24 20:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][440/1251] eta 0:30:06 lr 0.000157 time 1.7903 (2.2272) loss 3.3429 (3.2011) grad_norm 2.5296 (2.2527) [2022-01-24 20:37:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][450/1251] eta 0:29:43 lr 0.000157 time 1.8941 (2.2270) loss 3.9246 (3.2091) grad_norm 2.2627 (2.2577) [2022-01-24 20:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][460/1251] eta 0:29:22 lr 0.000157 time 2.2741 (2.2285) loss 3.8699 (3.2124) grad_norm 2.3338 (2.2606) [2022-01-24 20:37:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][470/1251] eta 0:28:59 lr 0.000157 time 2.0781 (2.2271) loss 3.6160 (3.2177) grad_norm 2.1227 (2.2577) [2022-01-24 20:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][480/1251] eta 0:28:36 lr 0.000157 time 1.6314 (2.2265) loss 3.7076 (3.2166) grad_norm 2.2562 (2.2551) [2022-01-24 20:38:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][490/1251] eta 0:28:15 lr 0.000157 time 1.6587 (2.2278) loss 2.6019 (3.2168) grad_norm 2.4443 (2.2549) [2022-01-24 20:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][500/1251] eta 0:27:55 lr 0.000157 time 1.9664 (2.2315) loss 2.9806 (3.2181) grad_norm 2.2152 (2.2547) [2022-01-24 20:39:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][510/1251] eta 0:27:32 lr 0.000157 time 1.5242 (2.2302) loss 1.9185 (3.2166) grad_norm 2.3333 (2.2534) [2022-01-24 20:39:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][520/1251] eta 0:27:08 lr 0.000157 time 1.7729 (2.2276) loss 3.7075 (3.2158) grad_norm 2.4046 (2.2524) [2022-01-24 20:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][530/1251] eta 0:26:43 lr 0.000157 time 1.9059 (2.2243) loss 2.8540 (3.2152) grad_norm 2.1720 (2.2502) [2022-01-24 20:40:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][540/1251] eta 0:26:22 lr 0.000157 time 2.5417 (2.2251) loss 3.5577 (3.2186) grad_norm 1.9322 (2.2499) [2022-01-24 20:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][550/1251] eta 0:25:59 lr 0.000157 time 2.1582 (2.2246) loss 2.2592 (3.2195) grad_norm 2.5314 (2.2506) [2022-01-24 20:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][560/1251] eta 0:25:35 lr 0.000157 time 1.8458 (2.2224) loss 2.3696 (3.2152) grad_norm 1.9106 (2.2502) [2022-01-24 20:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][570/1251] eta 0:25:12 lr 
0.000157 time 2.1965 (2.2212) loss 3.6352 (3.2194) grad_norm 2.3065 (2.2508) [2022-01-24 20:41:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][580/1251] eta 0:24:48 lr 0.000157 time 2.2408 (2.2189) loss 2.7043 (3.2164) grad_norm 2.0914 (2.2497) [2022-01-24 20:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][590/1251] eta 0:24:24 lr 0.000157 time 1.9101 (2.2153) loss 3.1059 (3.2171) grad_norm 2.3657 (2.2514) [2022-01-24 20:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][600/1251] eta 0:24:01 lr 0.000157 time 1.9612 (2.2139) loss 3.5133 (3.2175) grad_norm 2.1687 (2.2526) [2022-01-24 20:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][610/1251] eta 0:23:37 lr 0.000157 time 2.1909 (2.2116) loss 3.6668 (3.2158) grad_norm 2.5166 (2.2524) [2022-01-24 20:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][620/1251] eta 0:23:15 lr 0.000157 time 2.5792 (2.2115) loss 3.7227 (3.2086) grad_norm 1.9155 (2.2494) [2022-01-24 20:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][630/1251] eta 0:22:52 lr 0.000157 time 2.4878 (2.2107) loss 3.1234 (3.2103) grad_norm 2.3341 (2.2480) [2022-01-24 20:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][640/1251] eta 0:22:30 lr 0.000157 time 2.0676 (2.2111) loss 2.4575 (3.2073) grad_norm 2.1258 (2.2464) [2022-01-24 20:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][650/1251] eta 0:22:09 lr 0.000157 time 2.6210 (2.2115) loss 3.3181 (3.2067) grad_norm 2.0185 (2.2462) [2022-01-24 20:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][660/1251] eta 0:21:47 lr 0.000157 time 2.2737 (2.2116) loss 3.4628 (3.2071) grad_norm 2.0827 (2.2450) [2022-01-24 20:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][670/1251] eta 0:21:25 lr 0.000157 time 2.1822 (2.2119) loss 3.9223 (3.2118) grad_norm 2.2367 (2.2435) [2022-01-24 20:45:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][680/1251] eta 0:21:03 lr 0.000157 time 2.7994 (2.2136) loss 3.8178 (3.2129) grad_norm 2.4806 (2.2425) [2022-01-24 20:45:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][690/1251] eta 0:20:41 lr 0.000157 time 3.0835 (2.2135) loss 3.7994 (3.2123) grad_norm 2.0185 (2.2427) [2022-01-24 20:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][700/1251] eta 0:20:19 lr 0.000157 time 2.0749 (2.2130) loss 3.4598 (3.2104) grad_norm 2.5426 (2.2424) [2022-01-24 20:46:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][710/1251] eta 0:19:57 lr 0.000157 time 1.7955 (2.2127) loss 2.8735 (3.2094) grad_norm 2.3529 (2.2445) [2022-01-24 20:47:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][720/1251] eta 0:19:36 lr 0.000157 time 2.5483 (2.2150) loss 3.5491 (3.2094) grad_norm 2.3205 (2.2454) [2022-01-24 20:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][730/1251] eta 0:19:12 lr 0.000157 time 1.9025 (2.2130) loss 3.6124 (3.2082) grad_norm 2.1320 (2.2445) [2022-01-24 20:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][740/1251] eta 0:18:49 lr 0.000156 time 1.7977 (2.2110) loss 2.5646 (3.2092) grad_norm 1.9860 (2.2431) [2022-01-24 20:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][750/1251] eta 0:18:27 lr 0.000156 time 1.8870 (2.2096) loss 3.2911 (3.2068) grad_norm 2.4119 (2.2442) [2022-01-24 20:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][760/1251] eta 0:18:04 lr 0.000156 time 2.5953 (2.2086) loss 3.5749 (3.2045) grad_norm 2.1850 (2.2449) [2022-01-24 20:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][770/1251] eta 0:17:41 lr 0.000156 time 2.1352 (2.2063) loss 3.5632 (3.2032) grad_norm 2.2439 (2.2457) [2022-01-24 20:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][780/1251] eta 0:17:18 lr 
0.000156 time 1.6925 (2.2047) loss 3.4874 (3.2033) grad_norm 2.2287 (2.2461) [2022-01-24 20:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][790/1251] eta 0:16:56 lr 0.000156 time 1.8976 (2.2042) loss 2.7109 (3.2018) grad_norm 2.0841 (2.2456) [2022-01-24 20:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][800/1251] eta 0:16:34 lr 0.000156 time 2.1349 (2.2043) loss 2.5704 (3.2003) grad_norm 1.8662 (2.2439) [2022-01-24 20:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][810/1251] eta 0:16:12 lr 0.000156 time 2.8788 (2.2041) loss 3.5362 (3.2009) grad_norm 2.5290 (2.2438) [2022-01-24 20:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][820/1251] eta 0:15:50 lr 0.000156 time 2.7758 (2.2047) loss 2.8148 (3.2001) grad_norm 2.2063 (2.2440) [2022-01-24 20:51:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][830/1251] eta 0:15:28 lr 0.000156 time 2.4720 (2.2058) loss 4.1346 (3.2031) grad_norm 2.1238 (2.2433) [2022-01-24 20:51:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][840/1251] eta 0:15:06 lr 0.000156 time 2.3289 (2.2063) loss 2.4443 (3.2037) grad_norm 1.9603 (2.2441) [2022-01-24 20:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][850/1251] eta 0:14:45 lr 0.000156 time 3.1036 (2.2079) loss 3.8499 (3.2059) grad_norm 2.3891 (2.2445) [2022-01-24 20:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][860/1251] eta 0:14:22 lr 0.000156 time 2.6720 (2.2071) loss 3.4538 (3.2058) grad_norm 2.2624 (2.2440) [2022-01-24 20:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][870/1251] eta 0:14:00 lr 0.000156 time 2.2325 (2.2063) loss 2.6315 (3.2068) grad_norm 2.4672 (2.2440) [2022-01-24 20:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][880/1251] eta 0:13:38 lr 0.000156 time 1.9031 (2.2066) loss 3.6984 (3.2084) grad_norm 2.1296 (2.2448) [2022-01-24 20:53:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][890/1251] eta 0:13:16 lr 0.000156 time 2.3709 (2.2063) loss 3.8531 (3.2090) grad_norm 2.5151 (2.2441) [2022-01-24 20:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][900/1251] eta 0:12:54 lr 0.000156 time 2.5766 (2.2060) loss 3.4658 (3.2125) grad_norm 1.8593 (2.2437) [2022-01-24 20:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][910/1251] eta 0:12:32 lr 0.000156 time 2.5814 (2.2063) loss 3.6657 (3.2098) grad_norm 2.0041 (2.2434) [2022-01-24 20:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][920/1251] eta 0:12:10 lr 0.000156 time 1.5628 (2.2063) loss 3.5730 (3.2111) grad_norm 2.1947 (2.2443) [2022-01-24 20:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][930/1251] eta 0:11:47 lr 0.000156 time 1.5690 (2.2049) loss 1.9627 (3.2079) grad_norm 2.2111 (2.2448) [2022-01-24 20:55:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][940/1251] eta 0:11:26 lr 0.000156 time 3.0909 (2.2074) loss 2.5745 (3.2084) grad_norm 2.1485 (2.2445) [2022-01-24 20:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][950/1251] eta 0:11:03 lr 0.000156 time 1.9507 (2.2054) loss 3.8904 (3.2120) grad_norm 2.3543 (2.2450) [2022-01-24 20:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][960/1251] eta 0:10:41 lr 0.000156 time 1.8529 (2.2040) loss 3.5679 (3.2128) grad_norm 1.9750 (2.2458) [2022-01-24 20:56:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][970/1251] eta 0:10:19 lr 0.000156 time 2.1341 (2.2038) loss 3.6331 (3.2110) grad_norm 2.0507 (2.2459) [2022-01-24 20:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][980/1251] eta 0:09:57 lr 0.000156 time 2.9688 (2.2051) loss 3.8179 (3.2120) grad_norm 2.0448 (2.2466) [2022-01-24 20:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][990/1251] eta 0:09:35 lr 
0.000156 time 1.7631 (2.2054) loss 3.6421 (3.2139) grad_norm 2.5381 (2.2466) [2022-01-24 20:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1000/1251] eta 0:09:13 lr 0.000156 time 2.1523 (2.2071) loss 3.5847 (3.2123) grad_norm 2.2218 (2.2464) [2022-01-24 20:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1010/1251] eta 0:08:51 lr 0.000156 time 1.5408 (2.2056) loss 3.5332 (3.2120) grad_norm 2.3850 (2.2466) [2022-01-24 20:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1020/1251] eta 0:08:29 lr 0.000156 time 1.7754 (2.2035) loss 4.1649 (3.2119) grad_norm 2.0844 (2.2464) [2022-01-24 20:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1030/1251] eta 0:08:06 lr 0.000156 time 2.5488 (2.2036) loss 3.2098 (3.2108) grad_norm 2.2338 (2.2457) [2022-01-24 20:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1040/1251] eta 0:07:44 lr 0.000156 time 2.6439 (2.2037) loss 3.5810 (3.2110) grad_norm 2.4558 (2.2463) [2022-01-24 20:59:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1050/1251] eta 0:07:22 lr 0.000156 time 1.8644 (2.2025) loss 3.5529 (3.2120) grad_norm 2.4194 (2.2476) [2022-01-24 20:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1060/1251] eta 0:07:00 lr 0.000156 time 1.8865 (2.2011) loss 3.3723 (3.2133) grad_norm 2.6724 (2.2485) [2022-01-24 20:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1070/1251] eta 0:06:38 lr 0.000156 time 1.8848 (2.2010) loss 2.6322 (3.2098) grad_norm 2.4890 (2.2479) [2022-01-24 21:00:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1080/1251] eta 0:06:16 lr 0.000155 time 2.3649 (2.2011) loss 3.2504 (3.2111) grad_norm 2.7091 (2.2515) [2022-01-24 21:00:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1090/1251] eta 0:05:54 lr 0.000155 time 1.6965 (2.2010) loss 2.2356 (3.2094) grad_norm 2.0993 (2.2510) [2022-01-24 
21:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1100/1251] eta 0:05:32 lr 0.000155 time 1.5821 (2.2001) loss 3.5762 (3.2105) grad_norm 2.2254 (2.2513) [2022-01-24 21:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1110/1251] eta 0:05:10 lr 0.000155 time 1.9132 (2.1997) loss 3.5792 (3.2095) grad_norm 2.2419 (2.2514) [2022-01-24 21:01:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1120/1251] eta 0:04:48 lr 0.000155 time 2.6252 (2.1999) loss 3.6343 (3.2072) grad_norm 2.2200 (2.2508) [2022-01-24 21:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1130/1251] eta 0:04:26 lr 0.000155 time 2.0066 (2.1997) loss 2.8782 (3.2064) grad_norm 2.6225 (2.2528) [2022-01-24 21:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1140/1251] eta 0:04:04 lr 0.000155 time 2.5786 (2.2008) loss 2.5912 (3.2073) grad_norm 2.1321 (2.2525) [2022-01-24 21:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1150/1251] eta 0:03:42 lr 0.000155 time 2.1174 (2.2007) loss 3.2308 (3.2085) grad_norm 2.0586 (2.2526) [2022-01-24 21:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1160/1251] eta 0:03:20 lr 0.000155 time 3.0375 (2.2019) loss 3.5589 (3.2083) grad_norm 2.0233 (2.2531) [2022-01-24 21:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1170/1251] eta 0:02:58 lr 0.000155 time 1.9079 (2.2006) loss 3.4246 (3.2102) grad_norm 2.0038 (2.2532) [2022-01-24 21:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1180/1251] eta 0:02:36 lr 0.000155 time 2.4288 (2.1995) loss 2.3039 (3.2100) grad_norm 2.7303 (2.2537) [2022-01-24 21:04:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1190/1251] eta 0:02:14 lr 0.000155 time 1.6003 (2.1993) loss 3.4620 (3.2084) grad_norm 2.3334 (2.2563) [2022-01-24 21:04:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1200/1251] 
eta 0:01:52 lr 0.000155 time 2.4863 (2.1997) loss 3.7870 (3.2090) grad_norm 1.9880 (2.2562) [2022-01-24 21:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1210/1251] eta 0:01:30 lr 0.000155 time 2.1346 (2.1999) loss 3.2118 (3.2094) grad_norm 2.0995 (2.2559) [2022-01-24 21:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1220/1251] eta 0:01:08 lr 0.000155 time 2.2297 (2.1992) loss 2.2105 (3.2084) grad_norm 1.9362 (2.2559) [2022-01-24 21:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1230/1251] eta 0:00:46 lr 0.000155 time 1.9708 (2.1986) loss 3.3203 (3.2076) grad_norm 1.9860 (2.2564) [2022-01-24 21:05:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1240/1251] eta 0:00:24 lr 0.000155 time 2.3118 (2.1969) loss 3.2991 (3.2090) grad_norm 2.2235 (2.2561) [2022-01-24 21:06:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [224/300][1250/1251] eta 0:00:02 lr 0.000155 time 1.1695 (2.1913) loss 2.4333 (3.2078) grad_norm 2.0343 (2.2551) [2022-01-24 21:06:10 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 224 training takes 0:45:41 [2022-01-24 21:06:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.266 (18.266) Loss 0.8687 (0.8687) Acc@1 80.176 (80.176) Acc@5 94.922 (94.922) [2022-01-24 21:06:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.248 (3.348) Loss 0.8367 (0.8631) Acc@1 79.980 (79.510) Acc@5 96.191 (95.339) [2022-01-24 21:07:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.593 (2.505) Loss 0.9022 (0.8733) Acc@1 79.590 (79.488) Acc@5 94.434 (95.061) [2022-01-24 21:07:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.646 (2.245) Loss 0.8610 (0.8754) Acc@1 78.906 (79.366) Acc@5 95.215 (95.029) [2022-01-24 21:07:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.212 (2.204) Loss 0.8697 (0.8742) Acc@1 78.516 (79.349) Acc@5 95.508 (95.034) 
[2022-01-24 21:07:48 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.408 Acc@5 95.042 [2022-01-24 21:07:48 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.4% [2022-01-24 21:07:48 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.47% [2022-01-24 21:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][0/1251] eta 7:37:23 lr 0.000155 time 21.9372 (21.9372) loss 2.5172 (2.5172) grad_norm 2.2710 (2.2710) [2022-01-24 21:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][10/1251] eta 1:24:15 lr 0.000155 time 1.6556 (4.0740) loss 3.2099 (3.0797) grad_norm 2.5376 (2.1313) [2022-01-24 21:08:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][20/1251] eta 1:04:39 lr 0.000155 time 1.4744 (3.1516) loss 3.0752 (2.9729) grad_norm 2.1100 (2.1520) [2022-01-24 21:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][30/1251] eta 0:57:20 lr 0.000155 time 1.3689 (2.8182) loss 3.5352 (2.9152) grad_norm 2.3735 (2.1651) [2022-01-24 21:09:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][40/1251] eta 0:54:16 lr 0.000155 time 3.9623 (2.6892) loss 3.3258 (2.9187) grad_norm 2.1400 (2.1710) [2022-01-24 21:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][50/1251] eta 0:52:09 lr 0.000155 time 2.1074 (2.6058) loss 2.7331 (2.9585) grad_norm 2.5033 (2.1802) [2022-01-24 21:10:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][60/1251] eta 0:50:28 lr 0.000155 time 2.5098 (2.5432) loss 2.5824 (3.0377) grad_norm 2.6193 (2.2098) [2022-01-24 21:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][70/1251] eta 0:48:37 lr 0.000155 time 1.5766 (2.4707) loss 2.8406 (3.0289) grad_norm 2.1321 (2.2189) [2022-01-24 21:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][80/1251] eta 0:47:47 lr 0.000155 time 3.9351 (2.4486) loss 3.1554 (3.0260) 
grad_norm 2.2500 (2.2293) [2022-01-24 21:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][90/1251] eta 0:47:05 lr 0.000155 time 3.0186 (2.4337) loss 3.7929 (3.0573) grad_norm 2.1829 (2.2401) [2022-01-24 21:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][100/1251] eta 0:46:32 lr 0.000155 time 2.4512 (2.4260) loss 3.1459 (3.0867) grad_norm 2.2738 (2.2484) [2022-01-24 21:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][110/1251] eta 0:45:47 lr 0.000155 time 1.8924 (2.4076) loss 3.2402 (3.1013) grad_norm 2.1443 (2.2596) [2022-01-24 21:12:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][120/1251] eta 0:44:58 lr 0.000155 time 2.2178 (2.3857) loss 3.0197 (3.1098) grad_norm 2.4856 (2.2598) [2022-01-24 21:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][130/1251] eta 0:44:19 lr 0.000155 time 3.0588 (2.3723) loss 3.5664 (3.0997) grad_norm 2.4288 (2.2542) [2022-01-24 21:13:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][140/1251] eta 0:43:26 lr 0.000155 time 1.7024 (2.3458) loss 3.1775 (3.0984) grad_norm 2.1772 (2.2433) [2022-01-24 21:13:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][150/1251] eta 0:42:40 lr 0.000155 time 2.2246 (2.3253) loss 2.9747 (3.1067) grad_norm 2.0309 (2.2389) [2022-01-24 21:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][160/1251] eta 0:42:12 lr 0.000155 time 2.8322 (2.3214) loss 2.3622 (3.1022) grad_norm 2.1680 (2.2376) [2022-01-24 21:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][170/1251] eta 0:41:39 lr 0.000154 time 2.1624 (2.3124) loss 2.4123 (3.1098) grad_norm 2.1442 (2.2394) [2022-01-24 21:14:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][180/1251] eta 0:41:16 lr 0.000154 time 2.3598 (2.3122) loss 3.5630 (3.1024) grad_norm 2.2508 (2.2387) [2022-01-24 21:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [225/300][190/1251] eta 0:40:45 lr 0.000154 time 2.1276 (2.3048) loss 4.0711 (3.1088) grad_norm 2.2392 (2.2388) [2022-01-24 21:15:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][200/1251] eta 0:40:11 lr 0.000154 time 2.8284 (2.2941) loss 3.3573 (3.1134) grad_norm 2.2321 (2.2427) [2022-01-24 21:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][210/1251] eta 0:39:32 lr 0.000154 time 1.8934 (2.2790) loss 3.6923 (3.1205) grad_norm 2.6301 (2.2417) [2022-01-24 21:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][220/1251] eta 0:38:59 lr 0.000154 time 1.9079 (2.2693) loss 2.1169 (3.1138) grad_norm 2.4258 (2.2459) [2022-01-24 21:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][230/1251] eta 0:38:33 lr 0.000154 time 3.1733 (2.2655) loss 3.2878 (3.1204) grad_norm 2.1891 (2.2468) [2022-01-24 21:16:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][240/1251] eta 0:38:11 lr 0.000154 time 1.5211 (2.2666) loss 3.5426 (3.1314) grad_norm 2.0119 (2.2504) [2022-01-24 21:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][250/1251] eta 0:37:47 lr 0.000154 time 1.8785 (2.2649) loss 3.2781 (3.1312) grad_norm 2.0223 (2.2482) [2022-01-24 21:17:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][260/1251] eta 0:37:20 lr 0.000154 time 1.9499 (2.2605) loss 2.4589 (3.1363) grad_norm 2.3587 (2.2492) [2022-01-24 21:17:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][270/1251] eta 0:36:52 lr 0.000154 time 2.6150 (2.2549) loss 3.4720 (3.1362) grad_norm 1.9771 (2.2515) [2022-01-24 21:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][280/1251] eta 0:36:26 lr 0.000154 time 1.9243 (2.2521) loss 3.6826 (3.1290) grad_norm 2.3511 (2.2488) [2022-01-24 21:18:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][290/1251] eta 0:36:02 lr 0.000154 time 2.4451 (2.2501) loss 3.2762 (3.1315) 
grad_norm 2.2662 (2.2445) [2022-01-24 21:19:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][300/1251] eta 0:35:40 lr 0.000154 time 3.0149 (2.2509) loss 2.2798 (3.1282) grad_norm 1.8439 (2.2408) [2022-01-24 21:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][310/1251] eta 0:35:16 lr 0.000154 time 2.4830 (2.2496) loss 3.7071 (3.1191) grad_norm 2.1410 (2.2389) [2022-01-24 21:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][320/1251] eta 0:34:53 lr 0.000154 time 1.7862 (2.2491) loss 3.0854 (3.1197) grad_norm 1.9016 (2.2373) [2022-01-24 21:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][330/1251] eta 0:34:30 lr 0.000154 time 2.5014 (2.2482) loss 3.7746 (3.1143) grad_norm 2.5567 (2.2400) [2022-01-24 21:20:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][340/1251] eta 0:34:06 lr 0.000154 time 2.4969 (2.2464) loss 3.1851 (3.1077) grad_norm 2.1268 (2.2386) [2022-01-24 21:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][350/1251] eta 0:33:40 lr 0.000154 time 2.8096 (2.2427) loss 3.5156 (3.1099) grad_norm 2.3001 (2.2359) [2022-01-24 21:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][360/1251] eta 0:33:12 lr 0.000154 time 2.0232 (2.2368) loss 3.7702 (3.1078) grad_norm 2.2464 (2.2355) [2022-01-24 21:21:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][370/1251] eta 0:32:46 lr 0.000154 time 1.6274 (2.2317) loss 3.0494 (3.1035) grad_norm 1.9185 (2.2352) [2022-01-24 21:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][380/1251] eta 0:32:21 lr 0.000154 time 2.6016 (2.2287) loss 2.4989 (3.1076) grad_norm 2.4094 (2.2389) [2022-01-24 21:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][390/1251] eta 0:31:56 lr 0.000154 time 1.8987 (2.2254) loss 3.0191 (3.1142) grad_norm 2.1944 (2.2419) [2022-01-24 21:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [225/300][400/1251] eta 0:31:32 lr 0.000154 time 1.7719 (2.2237) loss 3.4324 (3.1143) grad_norm 1.9573 (2.2428) [2022-01-24 21:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][410/1251] eta 0:31:11 lr 0.000154 time 1.8558 (2.2255) loss 3.1084 (3.1161) grad_norm 2.2535 (2.2438) [2022-01-24 21:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][420/1251] eta 0:30:49 lr 0.000154 time 2.7218 (2.2256) loss 3.1204 (3.1217) grad_norm 2.0681 (2.2452) [2022-01-24 21:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][430/1251] eta 0:30:26 lr 0.000154 time 2.5913 (2.2253) loss 2.6909 (3.1241) grad_norm 2.0856 (2.2444) [2022-01-24 21:24:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][440/1251] eta 0:30:02 lr 0.000154 time 1.5857 (2.2224) loss 2.5073 (3.1297) grad_norm 2.4421 (2.2449) [2022-01-24 21:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][450/1251] eta 0:29:40 lr 0.000154 time 1.9245 (2.2223) loss 3.7279 (3.1299) grad_norm 2.0935 (2.2424) [2022-01-24 21:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][460/1251] eta 0:29:15 lr 0.000154 time 1.7212 (2.2194) loss 3.2991 (3.1346) grad_norm 3.0205 (2.2461) [2022-01-24 21:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][470/1251] eta 0:28:54 lr 0.000154 time 2.7379 (2.2205) loss 3.1777 (3.1323) grad_norm 2.0750 (2.2437) [2022-01-24 21:25:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][480/1251] eta 0:28:31 lr 0.000154 time 1.8820 (2.2203) loss 3.5412 (3.1353) grad_norm 2.2764 (2.2430) [2022-01-24 21:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][490/1251] eta 0:28:12 lr 0.000154 time 3.1867 (2.2236) loss 3.5538 (3.1364) grad_norm 1.9745 (2.2406) [2022-01-24 21:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][500/1251] eta 0:27:49 lr 0.000154 time 2.1970 (2.2224) loss 3.4692 (3.1357) 
grad_norm 2.0370 (2.2403) [2022-01-24 21:26:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][510/1251] eta 0:27:29 lr 0.000153 time 3.9627 (2.2259) loss 3.1338 (3.1332) grad_norm 2.2103 (2.2407) [2022-01-24 21:27:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][520/1251] eta 0:27:05 lr 0.000153 time 2.5431 (2.2231) loss 3.7268 (3.1360) grad_norm 2.5297 (2.2422) [2022-01-24 21:27:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][530/1251] eta 0:26:40 lr 0.000153 time 2.1938 (2.2205) loss 3.7984 (3.1365) grad_norm 2.0802 (2.2416) [2022-01-24 21:27:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][540/1251] eta 0:26:18 lr 0.000153 time 1.6236 (2.2202) loss 4.0589 (3.1369) grad_norm 2.2077 (2.2433) [2022-01-24 21:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][550/1251] eta 0:25:55 lr 0.000153 time 2.5746 (2.2183) loss 3.3464 (3.1413) grad_norm 2.1296 (2.2416) [2022-01-24 21:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][560/1251] eta 0:25:33 lr 0.000153 time 3.1333 (2.2193) loss 2.4899 (3.1419) grad_norm 2.2732 (2.2407) [2022-01-24 21:28:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][570/1251] eta 0:25:09 lr 0.000153 time 1.5701 (2.2172) loss 3.4111 (3.1449) grad_norm 2.2990 (2.2416) [2022-01-24 21:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][580/1251] eta 0:24:46 lr 0.000153 time 1.9212 (2.2155) loss 3.5191 (3.1440) grad_norm 2.5584 (2.2417) [2022-01-24 21:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][590/1251] eta 0:24:22 lr 0.000153 time 1.6003 (2.2129) loss 2.5716 (3.1446) grad_norm 2.2916 (2.2403) [2022-01-24 21:29:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][600/1251] eta 0:24:01 lr 0.000153 time 3.0716 (2.2136) loss 2.3042 (3.1439) grad_norm 2.3332 (2.2414) [2022-01-24 21:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [225/300][610/1251] eta 0:23:36 lr 0.000153 time 1.6237 (2.2104) loss 3.4052 (3.1454) grad_norm 2.0647 (2.2422) [2022-01-24 21:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][620/1251] eta 0:23:14 lr 0.000153 time 2.1351 (2.2094) loss 3.1788 (3.1443) grad_norm 2.2016 (2.2403) [2022-01-24 21:31:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][630/1251] eta 0:22:54 lr 0.000153 time 2.1773 (2.2127) loss 3.0718 (3.1451) grad_norm 2.5279 (2.2404) [2022-01-24 21:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][640/1251] eta 0:22:32 lr 0.000153 time 2.9507 (2.2143) loss 3.7801 (3.1473) grad_norm 2.1920 (2.2391) [2022-01-24 21:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][650/1251] eta 0:22:11 lr 0.000153 time 2.1413 (2.2148) loss 3.0765 (3.1477) grad_norm 2.4026 (2.2372) [2022-01-24 21:32:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][660/1251] eta 0:21:48 lr 0.000153 time 2.2665 (2.2133) loss 2.9645 (3.1491) grad_norm 2.2520 (2.2373) [2022-01-24 21:32:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][670/1251] eta 0:21:26 lr 0.000153 time 1.6011 (2.2145) loss 3.3754 (3.1540) grad_norm 2.1468 (2.2377) [2022-01-24 21:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][680/1251] eta 0:21:03 lr 0.000153 time 2.2214 (2.2123) loss 2.7343 (3.1544) grad_norm 2.6316 (2.2391) [2022-01-24 21:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][690/1251] eta 0:20:40 lr 0.000153 time 1.7852 (2.2108) loss 2.5113 (3.1578) grad_norm 2.1319 (2.2393) [2022-01-24 21:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][700/1251] eta 0:20:18 lr 0.000153 time 2.9070 (2.2110) loss 1.9796 (3.1552) grad_norm 2.2047 (2.2387) [2022-01-24 21:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][710/1251] eta 0:19:55 lr 0.000153 time 1.9032 (2.2102) loss 3.8424 (3.1582) 
grad_norm 2.2142 (2.2396) [2022-01-24 21:34:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][720/1251] eta 0:19:32 lr 0.000153 time 2.4516 (2.2087) loss 3.7338 (3.1607) grad_norm 2.3889 (2.2407) [2022-01-24 21:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][730/1251] eta 0:19:10 lr 0.000153 time 1.9510 (2.2079) loss 2.1193 (3.1592) grad_norm 2.4018 (2.2420) [2022-01-24 21:35:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][740/1251] eta 0:18:47 lr 0.000153 time 2.5589 (2.2073) loss 2.8719 (3.1614) grad_norm 2.2165 (2.2439) [2022-01-24 21:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][750/1251] eta 0:18:26 lr 0.000153 time 1.7875 (2.2080) loss 2.4633 (3.1615) grad_norm 2.2896 (2.2452) [2022-01-24 21:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][760/1251] eta 0:18:04 lr 0.000153 time 2.4644 (2.2092) loss 3.4700 (3.1606) grad_norm 2.2801 (2.2462) [2022-01-24 21:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][770/1251] eta 0:17:41 lr 0.000153 time 1.7840 (2.2060) loss 2.4290 (3.1580) grad_norm 2.3776 (2.2468) [2022-01-24 21:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][780/1251] eta 0:17:18 lr 0.000153 time 2.4045 (2.2042) loss 3.1537 (3.1570) grad_norm 2.1952 (2.2478) [2022-01-24 21:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][790/1251] eta 0:16:56 lr 0.000153 time 2.7027 (2.2051) loss 3.3955 (3.1577) grad_norm 2.0543 (2.2465) [2022-01-24 21:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][800/1251] eta 0:16:34 lr 0.000153 time 2.1873 (2.2044) loss 3.4590 (3.1612) grad_norm 2.1439 (2.2465) [2022-01-24 21:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][810/1251] eta 0:16:12 lr 0.000153 time 2.1664 (2.2054) loss 2.8414 (3.1648) grad_norm 2.2266 (2.2470) [2022-01-24 21:37:59 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [225/300][820/1251] eta 0:15:50 lr 0.000153 time 1.7884 (2.2057) loss 2.6296 (3.1634) grad_norm 2.2921 (2.2483) [2022-01-24 21:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][830/1251] eta 0:15:29 lr 0.000153 time 1.8637 (2.2071) loss 2.9431 (3.1635) grad_norm 2.2553 (2.2487) [2022-01-24 21:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][840/1251] eta 0:15:06 lr 0.000153 time 1.6746 (2.2061) loss 3.5401 (3.1658) grad_norm 2.2528 (2.2495) [2022-01-24 21:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][850/1251] eta 0:14:44 lr 0.000153 time 2.7899 (2.2049) loss 3.6551 (3.1684) grad_norm 1.9455 (2.2487) [2022-01-24 21:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][860/1251] eta 0:14:21 lr 0.000152 time 2.0600 (2.2025) loss 2.6553 (3.1675) grad_norm 1.9917 (2.2488) [2022-01-24 21:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][870/1251] eta 0:13:59 lr 0.000152 time 1.8959 (2.2032) loss 3.1654 (3.1670) grad_norm 2.2162 (2.2485) [2022-01-24 21:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][880/1251] eta 0:13:37 lr 0.000152 time 2.2200 (2.2041) loss 3.5493 (3.1684) grad_norm 2.1277 (2.2493) [2022-01-24 21:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][890/1251] eta 0:13:16 lr 0.000152 time 1.9608 (2.2055) loss 3.5044 (3.1663) grad_norm 2.4473 (2.2499) [2022-01-24 21:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][900/1251] eta 0:12:54 lr 0.000152 time 2.1532 (2.2074) loss 2.4740 (3.1666) grad_norm 2.1089 (2.2511) [2022-01-24 21:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][910/1251] eta 0:12:32 lr 0.000152 time 1.8325 (2.2067) loss 3.1758 (3.1688) grad_norm 2.1091 (2.2545) [2022-01-24 21:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][920/1251] eta 0:12:09 lr 0.000152 time 1.6091 (2.2035) loss 3.3135 (3.1696) 
grad_norm 2.1071 (2.2551) [2022-01-24 21:41:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][930/1251] eta 0:11:46 lr 0.000152 time 2.2046 (2.2006) loss 3.2104 (3.1709) grad_norm 2.6143 (2.2558) [2022-01-24 21:42:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][940/1251] eta 0:11:23 lr 0.000152 time 1.7977 (2.1993) loss 3.2476 (3.1716) grad_norm 2.3271 (2.2550) [2022-01-24 21:42:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][950/1251] eta 0:11:02 lr 0.000152 time 2.5412 (2.1997) loss 2.9145 (3.1727) grad_norm 2.4870 (2.2548) [2022-01-24 21:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][960/1251] eta 0:10:39 lr 0.000152 time 2.4910 (2.1985) loss 2.8240 (3.1685) grad_norm 1.9922 (2.2536) [2022-01-24 21:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][970/1251] eta 0:10:17 lr 0.000152 time 2.5414 (2.1975) loss 2.8067 (3.1695) grad_norm 2.0789 (2.2547) [2022-01-24 21:43:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][980/1251] eta 0:09:55 lr 0.000152 time 2.8079 (2.1977) loss 3.9386 (3.1708) grad_norm 2.3103 (2.2544) [2022-01-24 21:44:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][990/1251] eta 0:09:34 lr 0.000152 time 2.0731 (2.1998) loss 3.7059 (3.1707) grad_norm 2.4401 (2.2551) [2022-01-24 21:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1000/1251] eta 0:09:12 lr 0.000152 time 2.5862 (2.2005) loss 2.9470 (3.1714) grad_norm 2.3176 (2.2555) [2022-01-24 21:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1010/1251] eta 0:08:50 lr 0.000152 time 2.9862 (2.2009) loss 3.6293 (3.1730) grad_norm 2.2035 (2.2543) [2022-01-24 21:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1020/1251] eta 0:08:28 lr 0.000152 time 1.8679 (2.2000) loss 3.2777 (3.1743) grad_norm 2.1593 (2.2548) [2022-01-24 21:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [225/300][1030/1251] eta 0:08:06 lr 0.000152 time 2.5087 (2.1999) loss 3.2079 (3.1734) grad_norm 2.2534 (2.2539) [2022-01-24 21:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1040/1251] eta 0:07:44 lr 0.000152 time 2.2309 (2.2013) loss 3.3314 (3.1744) grad_norm 2.0916 (2.2535) [2022-01-24 21:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1050/1251] eta 0:07:23 lr 0.000152 time 3.1519 (2.2040) loss 3.1618 (3.1758) grad_norm 2.3098 (2.2538) [2022-01-24 21:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1060/1251] eta 0:07:00 lr 0.000152 time 1.5507 (2.2033) loss 3.0608 (3.1749) grad_norm 2.2578 (2.2553) [2022-01-24 21:47:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1070/1251] eta 0:06:38 lr 0.000152 time 1.9176 (2.2021) loss 3.7353 (3.1760) grad_norm 2.2143 (2.2548) [2022-01-24 21:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1080/1251] eta 0:06:16 lr 0.000152 time 1.9619 (2.2005) loss 3.5446 (3.1788) grad_norm 2.1683 (2.2562) [2022-01-24 21:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1090/1251] eta 0:05:54 lr 0.000152 time 2.9641 (2.2000) loss 3.7130 (3.1795) grad_norm 2.1167 (2.2560) [2022-01-24 21:48:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1100/1251] eta 0:05:32 lr 0.000152 time 2.3609 (2.2003) loss 3.3183 (3.1797) grad_norm 2.0416 (2.2551) [2022-01-24 21:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1110/1251] eta 0:05:10 lr 0.000152 time 2.2007 (2.2009) loss 2.1088 (3.1790) grad_norm 2.1969 (2.2544) [2022-01-24 21:48:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1120/1251] eta 0:04:48 lr 0.000152 time 2.5029 (2.2018) loss 2.9401 (3.1786) grad_norm 2.2506 (2.2539) [2022-01-24 21:49:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1130/1251] eta 0:04:26 lr 0.000152 time 2.2464 (2.2007) loss 3.5367 
(3.1789) grad_norm 2.2714 (2.2556) [2022-01-24 21:49:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1140/1251] eta 0:04:04 lr 0.000152 time 2.1864 (2.1996) loss 2.8355 (3.1759) grad_norm 2.2682 (2.2558) [2022-01-24 21:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1150/1251] eta 0:03:42 lr 0.000152 time 2.2485 (2.1988) loss 3.7355 (3.1781) grad_norm 2.1679 (2.2561) [2022-01-24 21:50:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1160/1251] eta 0:03:19 lr 0.000152 time 2.1983 (2.1978) loss 3.5879 (3.1811) grad_norm 2.3033 (2.2563) [2022-01-24 21:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1170/1251] eta 0:02:57 lr 0.000152 time 1.7322 (2.1957) loss 3.7959 (3.1796) grad_norm 2.1746 (2.2557) [2022-01-24 21:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1180/1251] eta 0:02:35 lr 0.000152 time 2.2913 (2.1954) loss 3.4601 (3.1808) grad_norm 2.1765 (2.2566) [2022-01-24 21:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1190/1251] eta 0:02:13 lr 0.000152 time 2.3796 (2.1950) loss 3.7152 (3.1809) grad_norm 2.2815 (2.2564) [2022-01-24 21:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1200/1251] eta 0:01:51 lr 0.000151 time 2.3646 (2.1953) loss 3.5766 (3.1802) grad_norm 2.0967 (2.2564) [2022-01-24 21:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1210/1251] eta 0:01:29 lr 0.000151 time 2.1363 (2.1951) loss 2.2195 (3.1793) grad_norm 2.1270 (2.2564) [2022-01-24 21:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1220/1251] eta 0:01:08 lr 0.000151 time 3.3053 (2.1971) loss 3.4564 (3.1796) grad_norm 2.3111 (2.2582) [2022-01-24 21:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1230/1251] eta 0:00:46 lr 0.000151 time 2.2694 (2.1981) loss 2.8528 (3.1783) grad_norm 2.2860 (2.2587) [2022-01-24 21:53:15 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [225/300][1240/1251] eta 0:00:24 lr 0.000151 time 1.7601 (2.1976) loss 3.2459 (3.1808) grad_norm 2.0774 (2.2598) [2022-01-24 21:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [225/300][1250/1251] eta 0:00:02 lr 0.000151 time 1.3302 (2.1926) loss 3.7087 (3.1814) grad_norm 2.4755 (2.2601) [2022-01-24 21:53:31 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 225 training takes 0:45:43 [2022-01-24 21:53:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.620 (18.620) Loss 0.8211 (0.8211) Acc@1 80.566 (80.566) Acc@5 95.801 (95.801) [2022-01-24 21:54:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.942 (3.547) Loss 0.9401 (0.8640) Acc@1 77.734 (79.545) Acc@5 93.848 (94.886) [2022-01-24 21:54:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.545 (2.711) Loss 0.8953 (0.8625) Acc@1 77.734 (79.590) Acc@5 95.215 (94.899) [2022-01-24 21:54:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.922 (2.331) Loss 0.9125 (0.8685) Acc@1 79.395 (79.458) Acc@5 93.262 (94.824) [2022-01-24 21:55:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.249 (2.236) Loss 0.9060 (0.8675) Acc@1 77.344 (79.445) Acc@5 94.238 (94.815) [2022-01-24 21:55:10 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.476 Acc@5 94.864 [2022-01-24 21:55:10 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-01-24 21:55:10 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.48% [2022-01-24 21:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][0/1251] eta 7:24:49 lr 0.000151 time 21.3346 (21.3346) loss 3.6159 (3.6159) grad_norm 2.6968 (2.6968) [2022-01-24 21:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][10/1251] eta 1:22:17 lr 0.000151 time 1.5255 (3.9783) loss 3.2595 (3.1271) grad_norm 2.3097 (2.1910) [2022-01-24 21:56:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][20/1251] eta 1:05:21 lr 0.000151 time 1.3379 (3.1857) loss 2.6081 (3.0377) grad_norm 2.2824 (2.2430) [2022-01-24 21:56:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][30/1251] eta 0:57:05 lr 0.000151 time 1.5052 (2.8053) loss 2.5649 (3.1250) grad_norm 2.5818 (2.2820) [2022-01-24 21:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][40/1251] eta 0:54:45 lr 0.000151 time 3.6788 (2.7128) loss 3.8672 (3.1503) grad_norm 2.1562 (2.2736) [2022-01-24 21:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][50/1251] eta 0:52:54 lr 0.000151 time 2.8368 (2.6433) loss 3.4749 (3.1871) grad_norm 2.0072 (2.2702) [2022-01-24 21:57:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][60/1251] eta 0:51:03 lr 0.000151 time 2.0382 (2.5721) loss 2.9982 (3.1304) grad_norm 2.0189 (2.2483) [2022-01-24 21:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][70/1251] eta 0:49:15 lr 0.000151 time 1.6539 (2.5024) loss 3.1856 (3.1595) grad_norm 3.2571 (2.2528) [2022-01-24 21:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][80/1251] eta 0:48:21 lr 0.000151 time 3.0945 (2.4780) loss 3.6879 (3.1699) grad_norm 2.5812 (2.2579) [2022-01-24 21:58:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][90/1251] eta 0:47:26 lr 0.000151 time 1.8883 (2.4522) loss 3.5935 (3.1978) grad_norm 2.4856 (2.2512) [2022-01-24 21:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][100/1251] eta 0:46:18 lr 0.000151 time 2.0053 (2.4142) loss 2.4958 (3.1790) grad_norm 2.2553 (2.2642) [2022-01-24 21:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][110/1251] eta 0:45:25 lr 0.000151 time 1.5809 (2.3884) loss 3.5022 (3.1885) grad_norm 2.2622 (2.2599) [2022-01-24 21:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][120/1251] eta 0:44:49 lr 0.000151 time 
2.1840 (2.3782) loss 3.6468 (3.2029) grad_norm 2.5216 (2.2661) [2022-01-24 22:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][130/1251] eta 0:44:19 lr 0.000151 time 1.9189 (2.3725) loss 3.2761 (3.1927) grad_norm 2.0901 (2.2586) [2022-01-24 22:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][140/1251] eta 0:43:29 lr 0.000151 time 1.9207 (2.3486) loss 2.7103 (3.1992) grad_norm 2.3802 (2.2607) [2022-01-24 22:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][150/1251] eta 0:42:42 lr 0.000151 time 2.1046 (2.3273) loss 2.7279 (3.1895) grad_norm 2.1589 (2.2642) [2022-01-24 22:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][160/1251] eta 0:42:24 lr 0.000151 time 2.0377 (2.3323) loss 3.8213 (3.1929) grad_norm 2.2024 (2.2604) [2022-01-24 22:01:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][170/1251] eta 0:42:02 lr 0.000151 time 2.1686 (2.3331) loss 3.4436 (3.2045) grad_norm 2.6978 (2.2629) [2022-01-24 22:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][180/1251] eta 0:41:35 lr 0.000151 time 1.8496 (2.3297) loss 3.3406 (3.2043) grad_norm 2.6943 (2.2635) [2022-01-24 22:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][190/1251] eta 0:40:54 lr 0.000151 time 1.6368 (2.3137) loss 3.5228 (3.1954) grad_norm 2.1198 (2.2718) [2022-01-24 22:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][200/1251] eta 0:40:13 lr 0.000151 time 1.9174 (2.2964) loss 3.7752 (3.1944) grad_norm 2.4126 (2.2787) [2022-01-24 22:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][210/1251] eta 0:39:31 lr 0.000151 time 2.2479 (2.2785) loss 2.5228 (3.1767) grad_norm 1.9404 (2.2745) [2022-01-24 22:03:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][220/1251] eta 0:38:57 lr 0.000151 time 2.4742 (2.2676) loss 3.1048 (3.1685) grad_norm 2.2602 (2.2749) [2022-01-24 22:03:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][230/1251] eta 0:38:30 lr 0.000151 time 2.1644 (2.2627) loss 3.3096 (3.1729) grad_norm 2.5772 (2.2805) [2022-01-24 22:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][240/1251] eta 0:38:08 lr 0.000151 time 2.0164 (2.2638) loss 3.9213 (3.1771) grad_norm 2.7685 (2.2836) [2022-01-24 22:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][250/1251] eta 0:37:42 lr 0.000151 time 1.9065 (2.2598) loss 3.3657 (3.1776) grad_norm 2.1007 (2.2811) [2022-01-24 22:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][260/1251] eta 0:37:23 lr 0.000151 time 2.3378 (2.2639) loss 2.8274 (3.1720) grad_norm 1.9851 (2.2798) [2022-01-24 22:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][270/1251] eta 0:37:08 lr 0.000151 time 2.2086 (2.2717) loss 3.4712 (3.1740) grad_norm 2.1539 (2.2817) [2022-01-24 22:05:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][280/1251] eta 0:36:49 lr 0.000151 time 2.4197 (2.2758) loss 3.9139 (3.1835) grad_norm 2.0705 (2.2791) [2022-01-24 22:06:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][290/1251] eta 0:36:22 lr 0.000150 time 1.8682 (2.2716) loss 2.4352 (3.1753) grad_norm 2.4409 (2.2807) [2022-01-24 22:06:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][300/1251] eta 0:35:51 lr 0.000150 time 1.9477 (2.2627) loss 3.3089 (3.1796) grad_norm 2.0859 (2.2781) [2022-01-24 22:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][310/1251] eta 0:35:17 lr 0.000150 time 1.9828 (2.2502) loss 3.4685 (3.1857) grad_norm 2.4747 (2.2788) [2022-01-24 22:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][320/1251] eta 0:34:47 lr 0.000150 time 1.8791 (2.2419) loss 3.5266 (3.1896) grad_norm 2.0780 (2.2798) [2022-01-24 22:07:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][330/1251] eta 0:34:19 lr 
0.000150 time 2.1587 (2.2366) loss 3.0724 (3.1848) grad_norm 2.2906 (2.2793) [2022-01-24 22:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][340/1251] eta 0:33:53 lr 0.000150 time 1.6346 (2.2327) loss 2.3298 (3.1778) grad_norm 1.9987 (2.2781) [2022-01-24 22:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][350/1251] eta 0:33:28 lr 0.000150 time 2.1503 (2.2297) loss 3.8178 (3.1849) grad_norm 2.2568 (2.2782) [2022-01-24 22:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][360/1251] eta 0:33:19 lr 0.000150 time 2.8444 (2.2446) loss 3.7472 (3.1849) grad_norm 2.0461 (2.2805) [2022-01-24 22:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][370/1251] eta 0:33:06 lr 0.000150 time 2.5177 (2.2550) loss 3.5413 (3.1832) grad_norm 2.5129 (2.2814) [2022-01-24 22:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][380/1251] eta 0:32:41 lr 0.000150 time 1.4766 (2.2515) loss 3.3629 (3.1768) grad_norm 2.1518 (2.2792) [2022-01-24 22:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][390/1251] eta 0:32:14 lr 0.000150 time 1.8092 (2.2471) loss 3.4140 (3.1797) grad_norm 2.3427 (2.2792) [2022-01-24 22:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][400/1251] eta 0:31:49 lr 0.000150 time 1.9112 (2.2433) loss 3.3420 (3.1858) grad_norm 2.1332 (2.2779) [2022-01-24 22:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][410/1251] eta 0:31:21 lr 0.000150 time 1.8436 (2.2371) loss 3.1716 (3.1897) grad_norm 2.4099 (2.2774) [2022-01-24 22:10:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][420/1251] eta 0:31:01 lr 0.000150 time 1.9081 (2.2402) loss 3.4321 (3.1894) grad_norm 1.9536 (2.2739) [2022-01-24 22:11:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][430/1251] eta 0:30:37 lr 0.000150 time 1.9533 (2.2380) loss 3.7148 (3.1904) grad_norm 2.4889 (2.2730) [2022-01-24 22:11:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][440/1251] eta 0:30:16 lr 0.000150 time 2.1840 (2.2395) loss 3.0959 (3.1899) grad_norm 2.4400 (2.2835) [2022-01-24 22:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][450/1251] eta 0:29:51 lr 0.000150 time 1.9034 (2.2370) loss 3.0116 (3.1902) grad_norm 2.2106 (2.2851) [2022-01-24 22:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][460/1251] eta 0:29:26 lr 0.000150 time 1.6676 (2.2336) loss 3.5829 (3.1913) grad_norm 2.9512 (2.2863) [2022-01-24 22:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][470/1251] eta 0:29:03 lr 0.000150 time 1.4844 (2.2327) loss 3.5968 (3.1979) grad_norm 2.3202 (2.2866) [2022-01-24 22:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][480/1251] eta 0:28:43 lr 0.000150 time 2.1657 (2.2358) loss 3.7678 (3.1931) grad_norm 2.5115 (2.2867) [2022-01-24 22:13:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][490/1251] eta 0:28:19 lr 0.000150 time 2.2448 (2.2337) loss 2.4183 (3.1848) grad_norm 2.2816 (2.2885) [2022-01-24 22:13:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][500/1251] eta 0:27:56 lr 0.000150 time 2.2914 (2.2329) loss 3.3503 (3.1887) grad_norm 2.0491 (2.2866) [2022-01-24 22:14:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][510/1251] eta 0:27:31 lr 0.000150 time 1.5442 (2.2292) loss 2.3968 (3.1795) grad_norm 2.1713 (2.2849) [2022-01-24 22:14:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][520/1251] eta 0:27:09 lr 0.000150 time 1.8812 (2.2297) loss 2.7455 (3.1799) grad_norm 2.1711 (2.2874) [2022-01-24 22:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][530/1251] eta 0:26:47 lr 0.000150 time 1.5395 (2.2298) loss 4.0097 (3.1772) grad_norm 2.3949 (2.2884) [2022-01-24 22:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][540/1251] eta 0:26:23 lr 
0.000150 time 2.3113 (2.2275) loss 3.8834 (3.1748) grad_norm 2.5227 (2.2907) [2022-01-24 22:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][550/1251] eta 0:25:59 lr 0.000150 time 1.5879 (2.2249) loss 3.6419 (3.1748) grad_norm 2.1729 (2.2898) [2022-01-24 22:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][560/1251] eta 0:25:35 lr 0.000150 time 2.0300 (2.2227) loss 3.9047 (3.1768) grad_norm 2.7006 (2.2927) [2022-01-24 22:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][570/1251] eta 0:25:13 lr 0.000150 time 1.9971 (2.2221) loss 2.8217 (3.1812) grad_norm 2.3183 (2.2971) [2022-01-24 22:16:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][580/1251] eta 0:24:49 lr 0.000150 time 1.9535 (2.2205) loss 2.5002 (3.1797) grad_norm 2.2203 (2.2979) [2022-01-24 22:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][590/1251] eta 0:24:27 lr 0.000150 time 2.2131 (2.2206) loss 3.3747 (3.1818) grad_norm 2.0898 (2.2972) [2022-01-24 22:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][600/1251] eta 0:24:04 lr 0.000150 time 2.2531 (2.2195) loss 3.2611 (3.1782) grad_norm 2.0031 (2.2977) [2022-01-24 22:17:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][610/1251] eta 0:23:41 lr 0.000150 time 1.5669 (2.2181) loss 3.5230 (3.1812) grad_norm 2.3499 (2.2981) [2022-01-24 22:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][620/1251] eta 0:23:19 lr 0.000150 time 2.6695 (2.2186) loss 3.7209 (3.1792) grad_norm 1.9306 (2.2957) [2022-01-24 22:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][630/1251] eta 0:22:56 lr 0.000150 time 1.5333 (2.2160) loss 3.5308 (3.1812) grad_norm 2.6065 (2.2957) [2022-01-24 22:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][640/1251] eta 0:22:33 lr 0.000149 time 2.2203 (2.2155) loss 2.5690 (3.1840) grad_norm 2.3392 (2.2956) [2022-01-24 22:19:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][650/1251] eta 0:22:11 lr 0.000149 time 1.5911 (2.2155) loss 3.3742 (3.1851) grad_norm 2.2027 (2.2970) [2022-01-24 22:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][660/1251] eta 0:21:50 lr 0.000149 time 2.9585 (2.2173) loss 3.2400 (3.1853) grad_norm 1.9557 (2.2972) [2022-01-24 22:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][670/1251] eta 0:21:28 lr 0.000149 time 1.6418 (2.2172) loss 2.9579 (3.1838) grad_norm 2.2159 (2.2990) [2022-01-24 22:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][680/1251] eta 0:21:06 lr 0.000149 time 2.2063 (2.2182) loss 3.0744 (3.1833) grad_norm 2.2964 (2.3000) [2022-01-24 22:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][690/1251] eta 0:20:44 lr 0.000149 time 1.8786 (2.2187) loss 2.7170 (3.1817) grad_norm 2.7300 (2.2996) [2022-01-24 22:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][700/1251] eta 0:20:21 lr 0.000149 time 1.8799 (2.2168) loss 3.0098 (3.1845) grad_norm 1.9129 (2.2994) [2022-01-24 22:21:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][710/1251] eta 0:19:58 lr 0.000149 time 1.5899 (2.2147) loss 2.9711 (3.1822) grad_norm 2.3701 (2.2989) [2022-01-24 22:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][720/1251] eta 0:19:35 lr 0.000149 time 1.9797 (2.2133) loss 3.4606 (3.1844) grad_norm 2.2211 (2.2970) [2022-01-24 22:22:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][730/1251] eta 0:19:15 lr 0.000149 time 2.1138 (2.2173) loss 2.8452 (3.1854) grad_norm 2.4160 (2.2986) [2022-01-24 22:22:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][740/1251] eta 0:18:52 lr 0.000149 time 1.9192 (2.2162) loss 3.7178 (3.1844) grad_norm 2.0974 (2.2989) [2022-01-24 22:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][750/1251] eta 0:18:30 lr 
0.000149 time 1.6091 (2.2165) loss 3.6131 (3.1880) grad_norm 2.5331 (2.2986) [2022-01-24 22:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][760/1251] eta 0:18:07 lr 0.000149 time 1.7229 (2.2143) loss 3.7344 (3.1928) grad_norm 2.2200 (2.3005) [2022-01-24 22:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][770/1251] eta 0:17:44 lr 0.000149 time 1.9124 (2.2131) loss 3.5637 (3.1956) grad_norm 2.0302 (2.3026) [2022-01-24 22:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][780/1251] eta 0:17:23 lr 0.000149 time 2.1689 (2.2152) loss 4.0137 (3.2002) grad_norm 2.6450 (2.3032) [2022-01-24 22:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][790/1251] eta 0:17:01 lr 0.000149 time 2.1992 (2.2161) loss 3.5648 (3.1987) grad_norm 2.3260 (2.3056) [2022-01-24 22:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][800/1251] eta 0:16:38 lr 0.000149 time 1.9075 (2.2147) loss 3.6751 (3.1981) grad_norm 2.2751 (2.3057) [2022-01-24 22:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][810/1251] eta 0:16:16 lr 0.000149 time 1.9069 (2.2152) loss 3.3833 (3.1990) grad_norm 2.1533 (2.3059) [2022-01-24 22:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][820/1251] eta 0:15:54 lr 0.000149 time 1.6095 (2.2146) loss 3.7516 (3.1983) grad_norm 2.4488 (2.3107) [2022-01-24 22:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][830/1251] eta 0:15:31 lr 0.000149 time 2.1338 (2.2137) loss 4.1143 (3.2014) grad_norm 2.7887 (2.3120) [2022-01-24 22:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][840/1251] eta 0:15:09 lr 0.000149 time 1.8923 (2.2136) loss 3.5707 (3.2033) grad_norm 2.3194 (2.3110) [2022-01-24 22:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][850/1251] eta 0:14:47 lr 0.000149 time 1.9140 (2.2123) loss 3.5701 (3.2046) grad_norm 1.8035 (2.3088) [2022-01-24 22:26:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][860/1251] eta 0:14:24 lr 0.000149 time 1.6728 (2.2110) loss 3.7495 (3.2052) grad_norm 2.6223 (2.3076) [2022-01-24 22:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][870/1251] eta 0:14:01 lr 0.000149 time 2.0588 (2.2096) loss 3.3096 (3.2081) grad_norm 2.2244 (2.3082) [2022-01-24 22:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][880/1251] eta 0:13:38 lr 0.000149 time 1.9509 (2.2069) loss 2.0632 (3.2087) grad_norm 2.1757 (2.3075) [2022-01-24 22:27:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][890/1251] eta 0:13:16 lr 0.000149 time 1.8386 (2.2065) loss 3.2360 (3.2100) grad_norm 2.4963 (2.3083) [2022-01-24 22:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][900/1251] eta 0:12:54 lr 0.000149 time 2.4804 (2.2071) loss 3.0243 (3.2125) grad_norm 1.9241 (2.3082) [2022-01-24 22:28:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][910/1251] eta 0:12:32 lr 0.000149 time 2.5686 (2.2077) loss 3.5748 (3.2152) grad_norm 2.0658 (2.3089) [2022-01-24 22:29:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][920/1251] eta 0:12:10 lr 0.000149 time 2.8231 (2.2081) loss 3.6176 (3.2139) grad_norm 2.1157 (2.3073) [2022-01-24 22:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][930/1251] eta 0:11:49 lr 0.000149 time 2.0120 (2.2091) loss 3.2403 (3.2134) grad_norm 2.1427 (2.3069) [2022-01-24 22:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][940/1251] eta 0:11:27 lr 0.000149 time 2.4447 (2.2091) loss 3.0322 (3.2134) grad_norm 2.3121 (2.3072) [2022-01-24 22:30:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][950/1251] eta 0:11:04 lr 0.000149 time 2.4737 (2.2085) loss 3.1044 (3.2111) grad_norm 1.8875 (2.3059) [2022-01-24 22:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][960/1251] eta 0:10:43 lr 
0.000149 time 3.1052 (2.2099) loss 3.4460 (3.2106) grad_norm 2.0188 (2.3062) [2022-01-24 22:30:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][970/1251] eta 0:10:20 lr 0.000149 time 2.1645 (2.2093) loss 2.2392 (3.2100) grad_norm 2.9642 (2.3068) [2022-01-24 22:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][980/1251] eta 0:09:58 lr 0.000149 time 2.0042 (2.2083) loss 3.3457 (3.2113) grad_norm 2.4226 (2.3064) [2022-01-24 22:31:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][990/1251] eta 0:09:36 lr 0.000148 time 1.6882 (2.2071) loss 2.5618 (3.2109) grad_norm 2.1955 (2.3053) [2022-01-24 22:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1000/1251] eta 0:09:14 lr 0.000148 time 2.6932 (2.2075) loss 2.6194 (3.2108) grad_norm 2.2991 (2.3049) [2022-01-24 22:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1010/1251] eta 0:08:51 lr 0.000148 time 1.9592 (2.2067) loss 3.3157 (3.2111) grad_norm 2.2459 (2.3050) [2022-01-24 22:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1020/1251] eta 0:08:29 lr 0.000148 time 2.2926 (2.2071) loss 2.1675 (3.2117) grad_norm 2.2412 (2.3041) [2022-01-24 22:33:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1030/1251] eta 0:08:07 lr 0.000148 time 1.6580 (2.2062) loss 3.4466 (3.2110) grad_norm 2.1994 (2.3033) [2022-01-24 22:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1040/1251] eta 0:07:45 lr 0.000148 time 2.5057 (2.2071) loss 3.0394 (3.2107) grad_norm 2.9067 (2.3040) [2022-01-24 22:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1050/1251] eta 0:07:23 lr 0.000148 time 2.4402 (2.2072) loss 3.6785 (3.2098) grad_norm 2.5231 (2.3034) [2022-01-24 22:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1060/1251] eta 0:07:01 lr 0.000148 time 2.2868 (2.2061) loss 3.8176 (3.2088) grad_norm 2.3048 (2.3029) [2022-01-24 
22:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1070/1251] eta 0:06:39 lr 0.000148 time 2.7800 (2.2054) loss 3.1520 (3.2077) grad_norm 2.2312 (2.3038) [2022-01-24 22:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1080/1251] eta 0:06:17 lr 0.000148 time 1.4725 (2.2055) loss 2.1403 (3.2060) grad_norm 2.4693 (2.3037) [2022-01-24 22:35:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1090/1251] eta 0:05:55 lr 0.000148 time 1.8910 (2.2057) loss 2.7696 (3.2065) grad_norm 2.0202 (2.3034) [2022-01-24 22:35:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1100/1251] eta 0:05:33 lr 0.000148 time 2.7268 (2.2058) loss 3.7101 (3.2075) grad_norm 2.1062 (2.3032) [2022-01-24 22:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1110/1251] eta 0:05:10 lr 0.000148 time 2.0277 (2.2054) loss 3.8753 (3.2076) grad_norm 2.0035 (2.3023) [2022-01-24 22:36:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1120/1251] eta 0:04:48 lr 0.000148 time 1.6204 (2.2048) loss 3.1936 (3.2071) grad_norm 2.4909 (2.3014) [2022-01-24 22:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1130/1251] eta 0:04:26 lr 0.000148 time 2.0865 (2.2038) loss 3.6467 (3.2066) grad_norm 2.2664 (2.3026) [2022-01-24 22:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1140/1251] eta 0:04:04 lr 0.000148 time 3.2360 (2.2044) loss 3.4979 (3.2061) grad_norm 2.2038 (2.3022) [2022-01-24 22:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1150/1251] eta 0:03:42 lr 0.000148 time 2.7710 (2.2047) loss 3.4819 (3.2064) grad_norm 2.5226 (2.3021) [2022-01-24 22:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1160/1251] eta 0:03:20 lr 0.000148 time 2.1628 (2.2037) loss 3.6185 (3.2057) grad_norm 2.1887 (2.3026) [2022-01-24 22:38:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1170/1251] 
eta 0:02:58 lr 0.000148 time 2.1542 (2.2030) loss 3.0943 (3.2056) grad_norm 1.9498 (2.3029) [2022-01-24 22:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1180/1251] eta 0:02:36 lr 0.000148 time 2.6891 (2.2031) loss 3.6754 (3.2074) grad_norm 2.0213 (2.3034) [2022-01-24 22:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1190/1251] eta 0:02:14 lr 0.000148 time 1.9382 (2.2032) loss 3.4913 (3.2068) grad_norm 2.2893 (2.3035) [2022-01-24 22:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1200/1251] eta 0:01:52 lr 0.000148 time 2.2063 (2.2019) loss 3.5585 (3.2077) grad_norm 2.4134 (2.3051) [2022-01-24 22:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1210/1251] eta 0:01:30 lr 0.000148 time 2.8111 (2.2020) loss 3.7559 (3.2069) grad_norm 2.0981 (2.3048) [2022-01-24 22:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1220/1251] eta 0:01:08 lr 0.000148 time 2.7006 (2.2032) loss 3.2827 (3.2079) grad_norm 2.3927 (2.3062) [2022-01-24 22:40:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1230/1251] eta 0:00:46 lr 0.000148 time 1.8460 (2.2027) loss 3.5041 (3.2072) grad_norm 2.5793 (2.3080) [2022-01-24 22:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1240/1251] eta 0:00:24 lr 0.000148 time 1.8960 (2.2019) loss 3.2991 (3.2079) grad_norm 2.3238 (2.3087) [2022-01-24 22:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [226/300][1250/1251] eta 0:00:02 lr 0.000148 time 1.1989 (2.1973) loss 3.1709 (3.2087) grad_norm 1.9571 (2.3075) [2022-01-24 22:40:59 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 226 training takes 0:45:49 [2022-01-24 22:41:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.693 (17.693) Loss 0.7876 (0.7876) Acc@1 81.055 (81.055) Acc@5 96.289 (96.289) [2022-01-24 22:41:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.249 (3.220) 
Loss 0.8408 (0.8458) Acc@1 80.371 (80.265) Acc@5 95.703 (95.046) [2022-01-24 22:41:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.896 (2.458) Loss 0.8188 (0.8576) Acc@1 82.031 (79.869) Acc@5 95.410 (95.020) [2022-01-24 22:42:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.301 (2.253) Loss 0.8592 (0.8633) Acc@1 79.980 (79.580) Acc@5 94.629 (94.994) [2022-01-24 22:42:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.074 (2.115) Loss 0.8511 (0.8638) Acc@1 79.102 (79.597) Acc@5 95.508 (94.936) [2022-01-24 22:42:33 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.534 Acc@5 94.946 [2022-01-24 22:42:33 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-01-24 22:42:33 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.53% [2022-01-24 22:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][0/1251] eta 8:22:59 lr 0.000148 time 24.1245 (24.1245) loss 2.4193 (2.4193) grad_norm 2.9269 (2.9269) [2022-01-24 22:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][10/1251] eta 1:32:17 lr 0.000148 time 2.9682 (4.4622) loss 3.3749 (2.8975) grad_norm 2.3632 (2.2892) [2022-01-24 22:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][20/1251] eta 1:10:45 lr 0.000148 time 1.4650 (3.4490) loss 3.2783 (3.1948) grad_norm 2.2424 (2.3230) [2022-01-24 22:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][30/1251] eta 1:01:47 lr 0.000148 time 1.6258 (3.0362) loss 3.8314 (3.1884) grad_norm 2.5553 (2.3220) [2022-01-24 22:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][40/1251] eta 0:56:49 lr 0.000148 time 2.2639 (2.8154) loss 3.5412 (3.1934) grad_norm 2.4323 (2.3147) [2022-01-24 22:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][50/1251] eta 0:53:37 lr 0.000148 time 2.0037 (2.6790) loss 3.8849 (3.1833) 
grad_norm 2.2709 (2.3137) [2022-01-24 22:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][60/1251] eta 0:51:13 lr 0.000148 time 1.8121 (2.5806) loss 3.5073 (3.2096) grad_norm 2.6165 (2.3102) [2022-01-24 22:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][70/1251] eta 0:49:23 lr 0.000148 time 2.2368 (2.5090) loss 3.8156 (3.2457) grad_norm 2.1310 (2.3189) [2022-01-24 22:45:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][80/1251] eta 0:47:59 lr 0.000147 time 1.8952 (2.4592) loss 3.5119 (3.2570) grad_norm 2.3732 (2.3137) [2022-01-24 22:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][90/1251] eta 0:46:56 lr 0.000147 time 1.5661 (2.4259) loss 3.0409 (3.2574) grad_norm 2.4659 (2.3039) [2022-01-24 22:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][100/1251] eta 0:46:05 lr 0.000147 time 2.8226 (2.4026) loss 2.8871 (3.2229) grad_norm 2.7098 (2.3122) [2022-01-24 22:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][110/1251] eta 0:45:10 lr 0.000147 time 2.2480 (2.3756) loss 3.3848 (3.2381) grad_norm 2.0163 (2.3247) [2022-01-24 22:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][120/1251] eta 0:44:29 lr 0.000147 time 2.1344 (2.3607) loss 3.4918 (3.2416) grad_norm 2.4268 (2.3322) [2022-01-24 22:47:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][130/1251] eta 0:43:47 lr 0.000147 time 1.8062 (2.3435) loss 2.8667 (3.2294) grad_norm 2.5494 (2.3321) [2022-01-24 22:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][140/1251] eta 0:43:20 lr 0.000147 time 2.2015 (2.3407) loss 2.8762 (3.2216) grad_norm 2.4974 (2.3301) [2022-01-24 22:48:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][150/1251] eta 0:43:04 lr 0.000147 time 2.3573 (2.3477) loss 3.3195 (3.2289) grad_norm 2.4283 (2.3302) [2022-01-24 22:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[227/300][160/1251] eta 0:42:38 lr 0.000147 time 2.0947 (2.3450) loss 4.0052 (3.2266) grad_norm 2.5041 (2.3263) [2022-01-24 22:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][170/1251] eta 0:42:03 lr 0.000147 time 1.7703 (2.3341) loss 3.5806 (3.2266) grad_norm 2.8352 (2.3259) [2022-01-24 22:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][180/1251] eta 0:41:32 lr 0.000147 time 3.2549 (2.3277) loss 3.4527 (3.2327) grad_norm 2.1549 (2.3284) [2022-01-24 22:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][190/1251] eta 0:40:58 lr 0.000147 time 2.1653 (2.3170) loss 3.6274 (3.2419) grad_norm 2.0931 (2.3254) [2022-01-24 22:50:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][200/1251] eta 0:40:27 lr 0.000147 time 2.2530 (2.3094) loss 3.0362 (3.2373) grad_norm 2.0677 (2.3187) [2022-01-24 22:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][210/1251] eta 0:39:51 lr 0.000147 time 1.6920 (2.2976) loss 3.2677 (3.2254) grad_norm 2.2378 (2.3149) [2022-01-24 22:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][220/1251] eta 0:39:16 lr 0.000147 time 2.2124 (2.2855) loss 2.6201 (3.2280) grad_norm 2.8887 (2.3135) [2022-01-24 22:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][230/1251] eta 0:38:49 lr 0.000147 time 2.0551 (2.2818) loss 2.5021 (3.2122) grad_norm 2.2817 (2.3128) [2022-01-24 22:51:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][240/1251] eta 0:38:24 lr 0.000147 time 2.2306 (2.2791) loss 3.9680 (3.2160) grad_norm 2.2555 (2.3111) [2022-01-24 22:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][250/1251] eta 0:37:59 lr 0.000147 time 1.8975 (2.2770) loss 2.8573 (3.2152) grad_norm 2.2413 (2.3098) [2022-01-24 22:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][260/1251] eta 0:37:35 lr 0.000147 time 2.9594 (2.2765) loss 2.3201 (3.2103) grad_norm 
2.3646 (2.3070) [2022-01-24 22:52:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][270/1251] eta 0:37:13 lr 0.000147 time 2.8123 (2.2768) loss 2.6901 (3.2128) grad_norm 1.9980 (2.3061) [2022-01-24 22:53:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][280/1251] eta 0:36:48 lr 0.000147 time 2.4099 (2.2747) loss 3.1259 (3.2071) grad_norm 1.8046 (2.3012) [2022-01-24 22:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][290/1251] eta 0:36:18 lr 0.000147 time 1.9541 (2.2665) loss 2.3912 (3.2003) grad_norm 2.2117 (2.2980) [2022-01-24 22:53:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][300/1251] eta 0:35:52 lr 0.000147 time 2.6537 (2.2632) loss 3.2547 (3.2059) grad_norm 2.3429 (2.3028) [2022-01-24 22:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][310/1251] eta 0:35:22 lr 0.000147 time 1.7691 (2.2551) loss 4.0794 (3.2076) grad_norm 2.7915 (2.3086) [2022-01-24 22:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][320/1251] eta 0:35:01 lr 0.000147 time 2.5929 (2.2578) loss 2.8053 (3.2009) grad_norm 2.8694 (2.3104) [2022-01-24 22:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][330/1251] eta 0:34:39 lr 0.000147 time 2.7635 (2.2575) loss 3.3505 (3.2026) grad_norm 2.1854 (2.3163) [2022-01-24 22:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][340/1251] eta 0:34:12 lr 0.000147 time 2.3065 (2.2535) loss 2.2480 (3.2042) grad_norm 1.9588 (2.3135) [2022-01-24 22:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][350/1251] eta 0:33:47 lr 0.000147 time 2.1552 (2.2505) loss 3.8317 (3.2061) grad_norm 2.2816 (2.3115) [2022-01-24 22:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][360/1251] eta 0:33:23 lr 0.000147 time 2.5381 (2.2491) loss 3.3381 (3.2052) grad_norm 2.1882 (2.3094) [2022-01-24 22:56:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[227/300][370/1251] eta 0:32:56 lr 0.000147 time 1.8216 (2.2431) loss 3.6254 (3.2026) grad_norm 2.1942 (2.3101) [2022-01-24 22:56:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][380/1251] eta 0:32:28 lr 0.000147 time 1.9294 (2.2370) loss 2.6625 (3.1972) grad_norm 2.4093 (2.3104) [2022-01-24 22:57:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][390/1251] eta 0:32:03 lr 0.000147 time 2.3556 (2.2340) loss 3.6280 (3.2006) grad_norm 2.0127 (2.3116) [2022-01-24 22:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][400/1251] eta 0:31:42 lr 0.000147 time 2.2325 (2.2360) loss 2.7327 (3.1940) grad_norm 3.0252 (2.3149) [2022-01-24 22:57:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][410/1251] eta 0:31:20 lr 0.000147 time 2.2184 (2.2361) loss 3.1708 (3.1926) grad_norm 2.7334 (2.3157) [2022-01-24 22:58:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][420/1251] eta 0:30:58 lr 0.000147 time 2.5101 (2.2365) loss 3.4688 (3.1948) grad_norm 2.4475 (2.3139) [2022-01-24 22:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][430/1251] eta 0:30:36 lr 0.000146 time 2.1801 (2.2363) loss 3.7119 (3.1990) grad_norm 2.5495 (2.3131) [2022-01-24 22:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][440/1251] eta 0:30:13 lr 0.000146 time 2.5405 (2.2356) loss 2.9659 (3.2001) grad_norm 2.1247 (2.3167) [2022-01-24 22:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][450/1251] eta 0:29:49 lr 0.000146 time 2.0131 (2.2338) loss 3.1803 (3.2021) grad_norm 2.3468 (2.3171) [2022-01-24 22:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][460/1251] eta 0:29:23 lr 0.000146 time 2.1877 (2.2296) loss 3.8430 (3.2080) grad_norm 2.6667 (2.3165) [2022-01-24 23:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][470/1251] eta 0:28:58 lr 0.000146 time 2.1540 (2.2258) loss 3.2096 (3.2043) grad_norm 
2.1000 (2.3185) [2022-01-24 23:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][480/1251] eta 0:28:34 lr 0.000146 time 2.0397 (2.2233) loss 2.5197 (3.2024) grad_norm 2.2384 (2.3201) [2022-01-24 23:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][490/1251] eta 0:28:12 lr 0.000146 time 2.3280 (2.2241) loss 3.1217 (3.1991) grad_norm 2.3149 (2.3219) [2022-01-24 23:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][500/1251] eta 0:27:50 lr 0.000146 time 2.1712 (2.2238) loss 3.5668 (3.1986) grad_norm 2.3698 (2.3212) [2022-01-24 23:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][510/1251] eta 0:27:29 lr 0.000146 time 2.7120 (2.2258) loss 3.2590 (3.1978) grad_norm 2.1744 (2.3204) [2022-01-24 23:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][520/1251] eta 0:27:06 lr 0.000146 time 2.8481 (2.2248) loss 3.5211 (3.1994) grad_norm 2.3508 (2.3213) [2022-01-24 23:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][530/1251] eta 0:26:42 lr 0.000146 time 2.4926 (2.2227) loss 3.4671 (3.1999) grad_norm 2.0332 (2.3182) [2022-01-24 23:02:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][540/1251] eta 0:26:18 lr 0.000146 time 1.9064 (2.2205) loss 2.3387 (3.1997) grad_norm 2.3661 (2.3187) [2022-01-24 23:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][550/1251] eta 0:25:58 lr 0.000146 time 3.0794 (2.2236) loss 2.3858 (3.1977) grad_norm 2.0339 (2.3180) [2022-01-24 23:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][560/1251] eta 0:25:36 lr 0.000146 time 2.1444 (2.2239) loss 2.4235 (3.1970) grad_norm 2.3224 (2.3174) [2022-01-24 23:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][570/1251] eta 0:25:15 lr 0.000146 time 2.5335 (2.2257) loss 3.7974 (3.1980) grad_norm 2.6757 (2.3168) [2022-01-24 23:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[227/300][580/1251] eta 0:24:50 lr 0.000146 time 1.6703 (2.2214) loss 3.4337 (3.1968) grad_norm 2.1908 (2.3170) [2022-01-24 23:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][590/1251] eta 0:24:27 lr 0.000146 time 3.3336 (2.2197) loss 2.6398 (3.1920) grad_norm 2.3688 (2.3168) [2022-01-24 23:04:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][600/1251] eta 0:24:02 lr 0.000146 time 1.8124 (2.2159) loss 3.6051 (3.1925) grad_norm 2.3771 (2.3184) [2022-01-24 23:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][610/1251] eta 0:23:40 lr 0.000146 time 2.2123 (2.2159) loss 2.6414 (3.1912) grad_norm 2.5476 (2.3194) [2022-01-24 23:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][620/1251] eta 0:23:18 lr 0.000146 time 1.8474 (2.2165) loss 2.9871 (3.1902) grad_norm 2.1607 (2.3186) [2022-01-24 23:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][630/1251] eta 0:22:58 lr 0.000146 time 3.0108 (2.2193) loss 2.4594 (3.1925) grad_norm 2.1213 (2.3166) [2022-01-24 23:06:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][640/1251] eta 0:22:36 lr 0.000146 time 2.1842 (2.2200) loss 2.6391 (3.1936) grad_norm 2.1632 (2.3153) [2022-01-24 23:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][650/1251] eta 0:22:15 lr 0.000146 time 3.1532 (2.2221) loss 3.3995 (3.1955) grad_norm 2.5400 (2.3138) [2022-01-24 23:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][660/1251] eta 0:21:52 lr 0.000146 time 1.7483 (2.2213) loss 3.2520 (3.1957) grad_norm 2.7379 (2.3130) [2022-01-24 23:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][670/1251] eta 0:21:31 lr 0.000146 time 3.3885 (2.2225) loss 3.2648 (3.1927) grad_norm 2.1394 (2.3129) [2022-01-24 23:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][680/1251] eta 0:21:07 lr 0.000146 time 1.8721 (2.2199) loss 2.4653 (3.1943) grad_norm 
2.6547 (2.3115) [2022-01-24 23:08:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][690/1251] eta 0:20:43 lr 0.000146 time 2.3474 (2.2168) loss 3.6706 (3.1950) grad_norm 2.5421 (2.3118) [2022-01-24 23:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][700/1251] eta 0:20:20 lr 0.000146 time 1.9568 (2.2145) loss 3.6644 (3.1956) grad_norm 2.2164 (2.3122) [2022-01-24 23:08:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][710/1251] eta 0:19:56 lr 0.000146 time 2.4734 (2.2120) loss 3.5817 (3.1965) grad_norm 2.2666 (2.3110) [2022-01-24 23:09:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][720/1251] eta 0:19:34 lr 0.000146 time 2.2420 (2.2112) loss 3.2086 (3.1922) grad_norm 2.3625 (2.3104) [2022-01-24 23:09:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][730/1251] eta 0:19:11 lr 0.000146 time 2.2326 (2.2108) loss 3.2915 (3.1947) grad_norm 2.2929 (2.3083) [2022-01-24 23:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][740/1251] eta 0:18:48 lr 0.000146 time 1.5780 (2.2083) loss 3.5724 (3.1968) grad_norm 2.4955 (2.3079) [2022-01-24 23:10:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][750/1251] eta 0:18:27 lr 0.000146 time 2.3467 (2.2111) loss 3.3865 (3.1982) grad_norm 2.3600 (2.3080) [2022-01-24 23:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][760/1251] eta 0:18:06 lr 0.000146 time 1.9528 (2.2127) loss 3.6249 (3.1998) grad_norm 2.3591 (2.3081) [2022-01-24 23:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][770/1251] eta 0:17:45 lr 0.000146 time 2.2415 (2.2144) loss 3.4937 (3.2023) grad_norm 2.3703 (2.3097) [2022-01-24 23:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][780/1251] eta 0:17:23 lr 0.000145 time 2.4556 (2.2158) loss 3.8625 (3.2050) grad_norm 2.3670 (2.3110) [2022-01-24 23:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[227/300][790/1251] eta 0:17:01 lr 0.000145 time 1.8642 (2.2151) loss 3.7107 (3.2029) grad_norm 2.6577 (2.3106) [2022-01-24 23:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][800/1251] eta 0:16:37 lr 0.000145 time 1.8684 (2.2121) loss 3.4494 (3.2039) grad_norm 1.9977 (2.3095) [2022-01-24 23:12:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][810/1251] eta 0:16:14 lr 0.000145 time 1.8679 (2.2093) loss 3.5886 (3.2039) grad_norm 2.1690 (2.3095) [2022-01-24 23:12:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][820/1251] eta 0:15:51 lr 0.000145 time 1.8938 (2.2068) loss 2.6486 (3.2049) grad_norm 2.2370 (2.3097) [2022-01-24 23:13:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][830/1251] eta 0:15:29 lr 0.000145 time 2.2923 (2.2078) loss 2.9111 (3.2085) grad_norm 2.2947 (2.3090) [2022-01-24 23:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][840/1251] eta 0:15:08 lr 0.000145 time 2.8853 (2.2101) loss 2.3561 (3.2084) grad_norm 2.0630 (2.3080) [2022-01-24 23:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][850/1251] eta 0:14:46 lr 0.000145 time 2.4631 (2.2116) loss 2.6275 (3.2062) grad_norm 2.1736 (2.3071) [2022-01-24 23:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][860/1251] eta 0:14:24 lr 0.000145 time 1.9188 (2.2121) loss 3.4190 (3.2098) grad_norm 1.9793 (2.3073) [2022-01-24 23:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][870/1251] eta 0:14:02 lr 0.000145 time 1.9384 (2.2124) loss 2.3894 (3.2064) grad_norm 2.2286 (2.3068) [2022-01-24 23:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][880/1251] eta 0:13:40 lr 0.000145 time 2.5349 (2.2109) loss 2.1685 (3.2056) grad_norm 2.1418 (2.3058) [2022-01-24 23:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][890/1251] eta 0:13:17 lr 0.000145 time 2.1691 (2.2092) loss 3.3663 (3.2074) grad_norm 
2.2537 (2.3092) [2022-01-24 23:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][900/1251] eta 0:12:54 lr 0.000145 time 1.9238 (2.2065) loss 3.2122 (3.2085) grad_norm 2.1341 (2.3094) [2022-01-24 23:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][910/1251] eta 0:12:31 lr 0.000145 time 2.1478 (2.2051) loss 2.1197 (3.2079) grad_norm 2.1394 (2.3094) [2022-01-24 23:16:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][920/1251] eta 0:12:09 lr 0.000145 time 2.2169 (2.2041) loss 3.8975 (3.2034) grad_norm 2.0836 (2.3080) [2022-01-24 23:16:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][930/1251] eta 0:11:47 lr 0.000145 time 2.2719 (2.2040) loss 3.6197 (3.2055) grad_norm 2.0971 (2.3072) [2022-01-24 23:17:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][940/1251] eta 0:11:25 lr 0.000145 time 2.2429 (2.2030) loss 3.1174 (3.2027) grad_norm 2.1963 (2.3070) [2022-01-24 23:17:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][950/1251] eta 0:11:03 lr 0.000145 time 2.1409 (2.2059) loss 2.1704 (3.1984) grad_norm 2.0149 (2.3064) [2022-01-24 23:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][960/1251] eta 0:10:42 lr 0.000145 time 2.2248 (2.2068) loss 2.7666 (3.1946) grad_norm 1.9769 (2.3060) [2022-01-24 23:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][970/1251] eta 0:10:20 lr 0.000145 time 2.5853 (2.2086) loss 3.1567 (3.1954) grad_norm 2.0784 (2.3058) [2022-01-24 23:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][980/1251] eta 0:09:58 lr 0.000145 time 1.8126 (2.2080) loss 3.1977 (3.1941) grad_norm 2.3410 (2.3066) [2022-01-24 23:18:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][990/1251] eta 0:09:35 lr 0.000145 time 2.0264 (2.2063) loss 3.5179 (3.1929) grad_norm 2.5546 (2.3096) [2022-01-24 23:19:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[227/300][1000/1251] eta 0:09:13 lr 0.000145 time 1.6927 (2.2046) loss 3.5046 (3.1950) grad_norm 2.1356 (2.3105) [2022-01-24 23:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1010/1251] eta 0:08:51 lr 0.000145 time 2.1690 (2.2041) loss 2.8658 (3.1950) grad_norm 2.2243 (2.3087) [2022-01-24 23:20:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1020/1251] eta 0:08:29 lr 0.000145 time 2.2253 (2.2043) loss 2.9035 (3.1950) grad_norm 2.3960 (2.3077) [2022-01-24 23:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1030/1251] eta 0:08:07 lr 0.000145 time 2.1671 (2.2038) loss 3.7574 (3.1958) grad_norm 2.4024 (2.3071) [2022-01-24 23:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1040/1251] eta 0:07:44 lr 0.000145 time 1.8754 (2.2036) loss 3.4639 (3.1977) grad_norm 2.9595 (2.3070) [2022-01-24 23:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1050/1251] eta 0:07:23 lr 0.000145 time 1.8541 (2.2043) loss 3.6895 (3.2006) grad_norm 2.5190 (2.3066) [2022-01-24 23:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1060/1251] eta 0:07:01 lr 0.000145 time 2.1538 (2.2051) loss 2.3412 (3.1985) grad_norm 5.9715 (2.3108) [2022-01-24 23:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1070/1251] eta 0:06:39 lr 0.000145 time 3.0406 (2.2075) loss 2.7877 (3.2002) grad_norm 2.3976 (2.3132) [2022-01-24 23:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1080/1251] eta 0:06:17 lr 0.000145 time 1.6054 (2.2064) loss 3.1387 (3.1995) grad_norm 2.5124 (2.3127) [2022-01-24 23:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1090/1251] eta 0:05:55 lr 0.000145 time 2.1737 (2.2052) loss 2.6943 (3.2004) grad_norm 2.8613 (2.3133) [2022-01-24 23:22:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1100/1251] eta 0:05:32 lr 0.000145 time 1.5564 (2.2033) loss 2.6279 (3.1992) 
grad_norm 2.0745 (2.3131) [2022-01-24 23:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1110/1251] eta 0:05:10 lr 0.000145 time 2.2848 (2.2028) loss 3.5041 (3.1975) grad_norm 2.0579 (2.3124) [2022-01-24 23:23:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1120/1251] eta 0:04:48 lr 0.000145 time 2.2970 (2.2009) loss 2.4944 (3.1954) grad_norm 2.1600 (2.3123) [2022-01-24 23:24:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1130/1251] eta 0:04:26 lr 0.000145 time 2.1352 (2.2004) loss 2.3114 (3.1958) grad_norm 2.4536 (2.3116) [2022-01-24 23:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1140/1251] eta 0:04:04 lr 0.000144 time 1.8782 (2.2003) loss 2.6263 (3.1968) grad_norm 2.3596 (2.3107) [2022-01-24 23:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1150/1251] eta 0:03:42 lr 0.000144 time 2.7487 (2.2012) loss 3.7796 (3.1971) grad_norm 2.0960 (2.3093) [2022-01-24 23:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1160/1251] eta 0:03:20 lr 0.000144 time 2.0052 (2.2004) loss 3.0713 (3.1981) grad_norm 1.9110 (2.3077) [2022-01-24 23:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1170/1251] eta 0:02:58 lr 0.000144 time 1.4910 (2.2012) loss 3.5729 (3.1984) grad_norm 2.2578 (2.3069) [2022-01-24 23:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1180/1251] eta 0:02:36 lr 0.000144 time 1.8026 (2.2008) loss 2.3784 (3.1983) grad_norm 2.6290 (2.3059) [2022-01-24 23:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1190/1251] eta 0:02:14 lr 0.000144 time 1.7610 (2.2014) loss 3.6992 (3.1997) grad_norm 1.9239 (2.3045) [2022-01-24 23:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1200/1251] eta 0:01:52 lr 0.000144 time 1.7978 (2.2015) loss 2.2222 (3.1988) grad_norm 2.3048 (2.3041) [2022-01-24 23:27:00 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [227/300][1210/1251] eta 0:01:30 lr 0.000144 time 1.7369 (2.2022) loss 3.7983 (3.1982) grad_norm 2.2299 (2.3036) [2022-01-24 23:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1220/1251] eta 0:01:08 lr 0.000144 time 1.5427 (2.2011) loss 2.6554 (3.1985) grad_norm 2.0930 (2.3033) [2022-01-24 23:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1230/1251] eta 0:00:46 lr 0.000144 time 1.9199 (2.2002) loss 3.2535 (3.1970) grad_norm 2.4405 (2.3028) [2022-01-24 23:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1240/1251] eta 0:00:24 lr 0.000144 time 1.7275 (2.1990) loss 2.9755 (3.1942) grad_norm 2.2136 (2.3023) [2022-01-24 23:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [227/300][1250/1251] eta 0:00:02 lr 0.000144 time 1.1866 (2.1931) loss 2.1249 (3.1933) grad_norm 2.1209 (2.3015) [2022-01-24 23:28:17 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 227 training takes 0:45:44 [2022-01-24 23:28:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.976 (18.976) Loss 0.9365 (0.9365) Acc@1 79.297 (79.297) Acc@5 94.336 (94.336) [2022-01-24 23:28:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.285 (3.485) Loss 0.8839 (0.8707) Acc@1 79.102 (79.545) Acc@5 94.824 (95.108) [2022-01-24 23:29:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.963 (2.553) Loss 0.8897 (0.8679) Acc@1 78.418 (79.515) Acc@5 94.629 (94.950) [2022-01-24 23:29:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.279 (2.294) Loss 0.8273 (0.8729) Acc@1 81.250 (79.432) Acc@5 95.898 (94.963) [2022-01-24 23:29:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.786 (2.191) Loss 0.8261 (0.8662) Acc@1 81.250 (79.502) Acc@5 95.801 (95.000) [2022-01-24 23:29:53 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.500 Acc@5 95.072 [2022-01-24 23:29:53 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-01-24 23:29:53 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.53% [2022-01-24 23:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][0/1251] eta 7:28:32 lr 0.000144 time 21.5125 (21.5125) loss 3.7372 (3.7372) grad_norm 2.3888 (2.3888) [2022-01-24 23:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][10/1251] eta 1:25:01 lr 0.000144 time 1.7522 (4.1111) loss 3.4710 (3.2793) grad_norm 2.6413 (2.3823) [2022-01-24 23:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][20/1251] eta 1:04:21 lr 0.000144 time 1.2991 (3.1368) loss 2.8613 (3.2829) grad_norm 2.2348 (2.3531) [2022-01-24 23:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][30/1251] eta 0:57:30 lr 0.000144 time 1.8727 (2.8259) loss 2.4965 (3.3358) grad_norm 2.7260 (2.3859) [2022-01-24 23:31:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][40/1251] eta 0:55:37 lr 0.000144 time 4.3931 (2.7562) loss 3.1784 (3.3063) grad_norm 2.4777 (2.3896) [2022-01-24 23:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][50/1251] eta 0:52:44 lr 0.000144 time 1.2256 (2.6347) loss 1.9991 (3.2679) grad_norm 2.4324 (2.3364) [2022-01-24 23:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][60/1251] eta 0:50:23 lr 0.000144 time 1.6598 (2.5385) loss 3.5837 (3.2749) grad_norm 2.2514 (2.3186) [2022-01-24 23:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][70/1251] eta 0:48:24 lr 0.000144 time 1.9231 (2.4594) loss 2.8895 (3.2294) grad_norm 2.0710 (2.3261) [2022-01-24 23:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][80/1251] eta 0:47:23 lr 0.000144 time 3.3298 (2.4287) loss 3.6384 (3.2323) grad_norm 3.0948 (2.3481) [2022-01-24 23:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][90/1251] eta 0:46:16 lr 0.000144 time 
1.4112 (2.3917) loss 2.7528 (3.2184) grad_norm 1.9453 (2.3498) [2022-01-24 23:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][100/1251] eta 0:45:18 lr 0.000144 time 1.7233 (2.3622) loss 3.6056 (3.2446) grad_norm 2.4414 (2.3511) [2022-01-24 23:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][110/1251] eta 0:44:58 lr 0.000144 time 1.7816 (2.3650) loss 3.5609 (3.2114) grad_norm 2.7013 (2.3506) [2022-01-24 23:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][120/1251] eta 0:44:34 lr 0.000144 time 3.2375 (2.3651) loss 3.7442 (3.2131) grad_norm 2.1091 (2.3504) [2022-01-24 23:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][130/1251] eta 0:43:48 lr 0.000144 time 1.2318 (2.3447) loss 3.8583 (3.2186) grad_norm 2.2422 (2.3426) [2022-01-24 23:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][140/1251] eta 0:43:06 lr 0.000144 time 1.8794 (2.3279) loss 3.2229 (3.1969) grad_norm 2.7452 (2.3370) [2022-01-24 23:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][150/1251] eta 0:42:30 lr 0.000144 time 2.0452 (2.3165) loss 2.3254 (3.1884) grad_norm 2.2174 (2.3383) [2022-01-24 23:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][160/1251] eta 0:42:01 lr 0.000144 time 3.5252 (2.3109) loss 2.3835 (3.1959) grad_norm 3.0883 (2.3407) [2022-01-24 23:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][170/1251] eta 0:41:32 lr 0.000144 time 2.1706 (2.3062) loss 2.7348 (3.1964) grad_norm 2.4631 (2.3411) [2022-01-24 23:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][180/1251] eta 0:41:08 lr 0.000144 time 2.1440 (2.3048) loss 2.9722 (3.1845) grad_norm 2.6385 (2.3456) [2022-01-24 23:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][190/1251] eta 0:40:38 lr 0.000144 time 2.2436 (2.2981) loss 2.8879 (3.1801) grad_norm 2.1901 (2.3465) [2022-01-24 23:37:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][200/1251] eta 0:40:16 lr 0.000144 time 2.9467 (2.2990) loss 2.5630 (3.1900) grad_norm 2.0399 (2.3455) [2022-01-24 23:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][210/1251] eta 0:39:46 lr 0.000144 time 1.7982 (2.2927) loss 2.6923 (3.1952) grad_norm 2.2671 (2.3431) [2022-01-24 23:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][220/1251] eta 0:39:16 lr 0.000144 time 2.1157 (2.2861) loss 2.8649 (3.1965) grad_norm 2.6181 (2.3396) [2022-01-24 23:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][230/1251] eta 0:38:43 lr 0.000144 time 1.9574 (2.2753) loss 3.7851 (3.1987) grad_norm 2.2582 (2.3356) [2022-01-24 23:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][240/1251] eta 0:38:16 lr 0.000143 time 3.2973 (2.2718) loss 3.6469 (3.2110) grad_norm 2.1383 (2.3391) [2022-01-24 23:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][250/1251] eta 0:37:46 lr 0.000143 time 1.7918 (2.2639) loss 2.7771 (3.2071) grad_norm 2.1979 (2.3374) [2022-01-24 23:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][260/1251] eta 0:37:16 lr 0.000143 time 1.8013 (2.2564) loss 2.6984 (3.2019) grad_norm 2.3188 (2.3325) [2022-01-24 23:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][270/1251] eta 0:36:49 lr 0.000143 time 1.8060 (2.2521) loss 3.3349 (3.2044) grad_norm 2.0902 (2.3320) [2022-01-24 23:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][280/1251] eta 0:36:26 lr 0.000143 time 3.1845 (2.2521) loss 2.9708 (3.2160) grad_norm 2.0011 (2.3304) [2022-01-24 23:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][290/1251] eta 0:36:00 lr 0.000143 time 2.0965 (2.2482) loss 3.2587 (3.2129) grad_norm 2.1998 (2.3309) [2022-01-24 23:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][300/1251] eta 0:35:31 lr 
0.000143 time 1.8562 (2.2412) loss 3.8058 (3.2196) grad_norm 2.1912 (2.3260) [2022-01-24 23:41:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][310/1251] eta 0:35:04 lr 0.000143 time 1.9762 (2.2367) loss 2.8893 (3.2195) grad_norm 2.5496 (2.3253) [2022-01-24 23:41:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][320/1251] eta 0:34:39 lr 0.000143 time 2.3188 (2.2331) loss 2.5629 (3.2184) grad_norm 2.1172 (2.3228) [2022-01-24 23:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][330/1251] eta 0:34:12 lr 0.000143 time 1.7824 (2.2285) loss 3.8796 (3.2159) grad_norm 2.3350 (2.3244) [2022-01-24 23:42:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][340/1251] eta 0:33:48 lr 0.000143 time 2.1141 (2.2272) loss 3.0421 (3.2072) grad_norm 2.3888 (2.3237) [2022-01-24 23:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][350/1251] eta 0:33:25 lr 0.000143 time 2.2293 (2.2263) loss 3.3187 (3.2110) grad_norm 2.3140 (2.3247) [2022-01-24 23:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][360/1251] eta 0:33:04 lr 0.000143 time 1.9854 (2.2277) loss 3.3420 (3.2086) grad_norm 2.7973 (2.3259) [2022-01-24 23:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][370/1251] eta 0:32:45 lr 0.000143 time 2.6852 (2.2306) loss 3.2589 (3.2111) grad_norm 2.8360 (2.3324) [2022-01-24 23:44:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][380/1251] eta 0:32:27 lr 0.000143 time 2.5563 (2.2360) loss 2.8813 (3.2106) grad_norm 2.0005 (2.3368) [2022-01-24 23:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][390/1251] eta 0:32:01 lr 0.000143 time 1.8778 (2.2321) loss 3.8151 (3.2054) grad_norm 2.4712 (2.3381) [2022-01-24 23:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][400/1251] eta 0:31:37 lr 0.000143 time 1.9267 (2.2301) loss 3.7754 (3.2021) grad_norm 2.1897 (2.3380) [2022-01-24 23:45:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][410/1251] eta 0:31:16 lr 0.000143 time 2.1649 (2.2309) loss 3.5092 (3.2002) grad_norm 2.4126 (2.3375) [2022-01-24 23:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][420/1251] eta 0:30:50 lr 0.000143 time 1.8768 (2.2274) loss 3.6718 (3.2030) grad_norm 3.0028 (2.3375) [2022-01-24 23:45:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][430/1251] eta 0:30:27 lr 0.000143 time 2.2385 (2.2265) loss 3.8245 (3.2010) grad_norm 2.7406 (2.3398) [2022-01-24 23:46:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][440/1251] eta 0:30:05 lr 0.000143 time 2.2116 (2.2262) loss 3.3821 (3.2067) grad_norm 2.2939 (2.3396) [2022-01-24 23:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][450/1251] eta 0:29:42 lr 0.000143 time 2.5854 (2.2250) loss 3.6396 (3.2024) grad_norm 2.4462 (2.3399) [2022-01-24 23:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][460/1251] eta 0:29:17 lr 0.000143 time 1.5386 (2.2214) loss 4.0204 (3.2053) grad_norm 2.2593 (2.3387) [2022-01-24 23:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][470/1251] eta 0:28:56 lr 0.000143 time 2.4310 (2.2234) loss 3.6331 (3.2089) grad_norm 2.8262 (2.3385) [2022-01-24 23:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][480/1251] eta 0:28:36 lr 0.000143 time 2.5901 (2.2266) loss 2.9958 (3.2070) grad_norm 2.2821 (2.3355) [2022-01-24 23:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][490/1251] eta 0:28:12 lr 0.000143 time 2.3309 (2.2245) loss 2.9254 (3.2037) grad_norm 2.2350 (2.3318) [2022-01-24 23:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][500/1251] eta 0:27:46 lr 0.000143 time 1.9718 (2.2195) loss 2.9641 (3.2048) grad_norm 2.2990 (2.3299) [2022-01-24 23:48:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][510/1251] eta 0:27:21 lr 
0.000143 time 2.1314 (2.2149) loss 4.0380 (3.2076) grad_norm 2.8139 (2.3307) [2022-01-24 23:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][520/1251] eta 0:26:57 lr 0.000143 time 2.2623 (2.2132) loss 3.6891 (3.2029) grad_norm 2.3587 (2.3308) [2022-01-24 23:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][530/1251] eta 0:26:37 lr 0.000143 time 2.2325 (2.2155) loss 2.5953 (3.2042) grad_norm 2.3212 (2.3299) [2022-01-24 23:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][540/1251] eta 0:26:17 lr 0.000143 time 2.0378 (2.2186) loss 3.7087 (3.2056) grad_norm 2.2348 (2.3306) [2022-01-24 23:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][550/1251] eta 0:25:54 lr 0.000143 time 1.8637 (2.2178) loss 3.4346 (3.2054) grad_norm 1.9911 (2.3313) [2022-01-24 23:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][560/1251] eta 0:25:32 lr 0.000143 time 3.2162 (2.2174) loss 3.3443 (3.2016) grad_norm 2.1247 (2.3319) [2022-01-24 23:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][570/1251] eta 0:25:10 lr 0.000143 time 1.8564 (2.2174) loss 2.8956 (3.1989) grad_norm 2.3939 (2.3326) [2022-01-24 23:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][580/1251] eta 0:24:46 lr 0.000143 time 1.7812 (2.2160) loss 3.8977 (3.1984) grad_norm 2.4476 (2.3339) [2022-01-24 23:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][590/1251] eta 0:24:24 lr 0.000142 time 2.2580 (2.2155) loss 3.8879 (3.2017) grad_norm 2.7249 (2.3335) [2022-01-24 23:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][600/1251] eta 0:24:02 lr 0.000142 time 3.1800 (2.2155) loss 3.5244 (3.2050) grad_norm 2.2362 (2.3347) [2022-01-24 23:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][610/1251] eta 0:23:40 lr 0.000142 time 1.7004 (2.2153) loss 3.3288 (3.2040) grad_norm 2.5870 (2.3346) [2022-01-24 23:52:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][620/1251] eta 0:23:18 lr 0.000142 time 1.9202 (2.2157) loss 3.4970 (3.2046) grad_norm 2.4772 (2.3351) [2022-01-24 23:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][630/1251] eta 0:22:55 lr 0.000142 time 1.6442 (2.2143) loss 2.2631 (3.2045) grad_norm 2.2838 (2.3369) [2022-01-24 23:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][640/1251] eta 0:22:32 lr 0.000142 time 2.8230 (2.2131) loss 3.8486 (3.2090) grad_norm 2.1779 (2.3358) [2022-01-24 23:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][650/1251] eta 0:22:10 lr 0.000142 time 2.1701 (2.2141) loss 3.4115 (3.2110) grad_norm 1.9754 (2.3343) [2022-01-24 23:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][660/1251] eta 0:21:48 lr 0.000142 time 1.7953 (2.2133) loss 3.0641 (3.2138) grad_norm 2.0214 (2.3336) [2022-01-24 23:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][670/1251] eta 0:21:25 lr 0.000142 time 1.8279 (2.2128) loss 3.5836 (3.2110) grad_norm 2.2853 (2.3352) [2022-01-24 23:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][680/1251] eta 0:21:01 lr 0.000142 time 1.7639 (2.2085) loss 3.9093 (3.2103) grad_norm 2.4971 (2.3331) [2022-01-24 23:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][690/1251] eta 0:20:38 lr 0.000142 time 2.8727 (2.2080) loss 3.5461 (3.2115) grad_norm 2.3891 (2.3335) [2022-01-24 23:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][700/1251] eta 0:20:16 lr 0.000142 time 2.2128 (2.2072) loss 3.2801 (3.2127) grad_norm 2.3380 (2.3323) [2022-01-24 23:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][710/1251] eta 0:19:53 lr 0.000142 time 1.8247 (2.2066) loss 2.6157 (3.2100) grad_norm 2.4467 (2.3329) [2022-01-24 23:56:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][720/1251] eta 0:19:32 lr 
0.000142 time 2.8110 (2.2089) loss 3.3774 (3.2107) grad_norm 2.2736 (2.3328) [2022-01-24 23:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][730/1251] eta 0:19:10 lr 0.000142 time 2.3048 (2.2092) loss 3.6991 (3.2109) grad_norm 2.1942 (2.3329) [2022-01-24 23:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][740/1251] eta 0:18:49 lr 0.000142 time 2.6192 (2.2103) loss 3.1901 (3.2128) grad_norm 2.2607 (2.3334) [2022-01-24 23:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][750/1251] eta 0:18:28 lr 0.000142 time 2.3673 (2.2120) loss 2.5394 (3.2097) grad_norm 2.3606 (2.3325) [2022-01-24 23:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][760/1251] eta 0:18:06 lr 0.000142 time 2.8040 (2.2138) loss 3.6976 (3.2078) grad_norm 2.2031 (2.3323) [2022-01-24 23:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][770/1251] eta 0:17:43 lr 0.000142 time 2.0723 (2.2120) loss 3.7542 (3.2088) grad_norm 2.2446 (2.3305) [2022-01-24 23:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][780/1251] eta 0:17:20 lr 0.000142 time 1.7976 (2.2088) loss 2.2650 (3.2061) grad_norm 2.0117 (2.3312) [2022-01-24 23:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][790/1251] eta 0:16:57 lr 0.000142 time 2.2517 (2.2074) loss 2.2797 (3.2018) grad_norm 2.1999 (2.3322) [2022-01-24 23:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][800/1251] eta 0:16:35 lr 0.000142 time 2.7091 (2.2066) loss 3.4316 (3.2010) grad_norm 1.9886 (2.3321) [2022-01-24 23:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][810/1251] eta 0:16:13 lr 0.000142 time 2.8832 (2.2070) loss 3.4319 (3.2021) grad_norm 2.2942 (2.3314) [2022-01-25 00:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][820/1251] eta 0:15:50 lr 0.000142 time 2.5688 (2.2062) loss 3.3553 (3.2022) grad_norm 2.1287 (2.3307) [2022-01-25 00:00:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][830/1251] eta 0:15:28 lr 0.000142 time 2.1786 (2.2049) loss 3.2732 (3.2015) grad_norm 2.2777 (2.3304) [2022-01-25 00:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][840/1251] eta 0:15:06 lr 0.000142 time 1.7581 (2.2052) loss 3.1286 (3.2031) grad_norm 1.9211 (2.3290) [2022-01-25 00:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][850/1251] eta 0:14:44 lr 0.000142 time 2.3976 (2.2058) loss 3.5443 (3.2037) grad_norm 2.4356 (2.3282) [2022-01-25 00:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][860/1251] eta 0:14:22 lr 0.000142 time 2.3431 (2.2054) loss 3.5735 (3.2033) grad_norm 2.0013 (2.3296) [2022-01-25 00:01:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][870/1251] eta 0:14:00 lr 0.000142 time 2.7687 (2.2059) loss 3.6761 (3.2054) grad_norm 2.5752 (2.3290) [2022-01-25 00:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][880/1251] eta 0:13:38 lr 0.000142 time 2.4608 (2.2055) loss 2.2449 (3.2044) grad_norm 2.5308 (2.3300) [2022-01-25 00:02:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][890/1251] eta 0:13:15 lr 0.000142 time 2.1678 (2.2042) loss 2.6946 (3.2004) grad_norm 2.3114 (2.3308) [2022-01-25 00:02:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][900/1251] eta 0:12:53 lr 0.000142 time 1.9949 (2.2037) loss 2.0811 (3.2011) grad_norm 2.0529 (2.3301) [2022-01-25 00:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][910/1251] eta 0:12:32 lr 0.000142 time 2.4681 (2.2073) loss 3.4263 (3.2022) grad_norm 2.5475 (2.3290) [2022-01-25 00:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][920/1251] eta 0:12:11 lr 0.000142 time 2.0499 (2.2087) loss 2.6509 (3.2033) grad_norm 2.1070 (2.3274) [2022-01-25 00:04:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][930/1251] eta 0:11:49 lr 
0.000142 time 2.4078 (2.2087) loss 2.4372 (3.2038) grad_norm 2.2216 (2.3261) [2022-01-25 00:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][940/1251] eta 0:11:26 lr 0.000142 time 1.9028 (2.2084) loss 3.7851 (3.2043) grad_norm 2.1976 (2.3263) [2022-01-25 00:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][950/1251] eta 0:11:04 lr 0.000141 time 1.8898 (2.2068) loss 2.9902 (3.1993) grad_norm 3.7422 (2.3277) [2022-01-25 00:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][960/1251] eta 0:10:41 lr 0.000141 time 1.9155 (2.2029) loss 3.5862 (3.2019) grad_norm 2.0484 (2.3273) [2022-01-25 00:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][970/1251] eta 0:10:18 lr 0.000141 time 1.7187 (2.2011) loss 3.3536 (3.2028) grad_norm 3.1678 (2.3345) [2022-01-25 00:05:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][980/1251] eta 0:09:56 lr 0.000141 time 2.1392 (2.2005) loss 3.5935 (3.2053) grad_norm 2.2169 (2.3369) [2022-01-25 00:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][990/1251] eta 0:09:33 lr 0.000141 time 1.9363 (2.1989) loss 4.0664 (3.2060) grad_norm 2.1465 (2.3362) [2022-01-25 00:06:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1000/1251] eta 0:09:11 lr 0.000141 time 2.2530 (2.1981) loss 2.0199 (3.2046) grad_norm 2.4327 (2.3357) [2022-01-25 00:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1010/1251] eta 0:08:49 lr 0.000141 time 2.0979 (2.1981) loss 3.4185 (3.2069) grad_norm 2.0314 (2.3348) [2022-01-25 00:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1020/1251] eta 0:08:27 lr 0.000141 time 2.2800 (2.1976) loss 1.9041 (3.2064) grad_norm 2.0939 (2.3352) [2022-01-25 00:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1030/1251] eta 0:08:05 lr 0.000141 time 1.6979 (2.1989) loss 3.8958 (3.2077) grad_norm 2.3945 (2.3339) [2022-01-25 
00:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1040/1251] eta 0:07:44 lr 0.000141 time 2.2606 (2.2002) loss 3.6631 (3.2090) grad_norm 3.0288 (2.3346) [2022-01-25 00:08:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1050/1251] eta 0:07:22 lr 0.000141 time 2.1642 (2.2006) loss 3.8932 (3.2081) grad_norm 2.5211 (2.3333) [2022-01-25 00:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1060/1251] eta 0:07:01 lr 0.000141 time 3.4401 (2.2050) loss 3.5343 (3.2071) grad_norm 2.3717 (2.3321) [2022-01-25 00:09:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1070/1251] eta 0:06:39 lr 0.000141 time 1.5101 (2.2060) loss 3.5037 (3.2088) grad_norm 2.3462 (2.3323) [2022-01-25 00:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1080/1251] eta 0:06:17 lr 0.000141 time 2.0266 (2.2051) loss 2.5488 (3.2084) grad_norm 2.5051 (2.3327) [2022-01-25 00:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1090/1251] eta 0:05:54 lr 0.000141 time 1.8491 (2.2038) loss 3.8677 (3.2093) grad_norm 2.3882 (2.3315) [2022-01-25 00:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1100/1251] eta 0:05:32 lr 0.000141 time 1.7140 (2.2027) loss 3.4734 (3.2106) grad_norm 2.2765 (2.3311) [2022-01-25 00:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1110/1251] eta 0:05:10 lr 0.000141 time 1.8680 (2.2006) loss 3.1899 (3.2107) grad_norm 2.3967 (2.3304) [2022-01-25 00:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1120/1251] eta 0:04:48 lr 0.000141 time 2.1063 (2.2004) loss 3.6806 (3.2118) grad_norm 2.3551 (2.3310) [2022-01-25 00:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1130/1251] eta 0:04:26 lr 0.000141 time 1.8561 (2.2004) loss 3.8072 (3.2111) grad_norm 2.1462 (2.3311) [2022-01-25 00:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1140/1251] 
eta 0:04:04 lr 0.000141 time 1.8463 (2.2003) loss 2.4579 (3.2090) grad_norm 2.3389 (2.3311) [2022-01-25 00:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1150/1251] eta 0:03:42 lr 0.000141 time 2.8622 (2.2017) loss 3.9280 (3.2075) grad_norm 2.1177 (2.3310) [2022-01-25 00:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1160/1251] eta 0:03:20 lr 0.000141 time 1.6581 (2.2015) loss 3.4984 (3.2084) grad_norm 2.2738 (2.3306) [2022-01-25 00:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1170/1251] eta 0:02:58 lr 0.000141 time 2.1699 (2.2020) loss 3.3368 (3.2097) grad_norm 2.1537 (2.3301) [2022-01-25 00:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1180/1251] eta 0:02:36 lr 0.000141 time 2.5352 (2.2025) loss 2.4026 (3.2095) grad_norm 2.0866 (2.3292) [2022-01-25 00:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1190/1251] eta 0:02:14 lr 0.000141 time 2.1885 (2.2022) loss 3.1236 (3.2099) grad_norm 2.4750 (2.3293) [2022-01-25 00:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1200/1251] eta 0:01:52 lr 0.000141 time 2.0326 (2.2011) loss 3.9773 (3.2115) grad_norm 2.1915 (2.3294) [2022-01-25 00:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1210/1251] eta 0:01:30 lr 0.000141 time 1.8673 (2.2005) loss 3.4576 (3.2122) grad_norm 2.1291 (2.3296) [2022-01-25 00:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1220/1251] eta 0:01:08 lr 0.000141 time 2.1675 (2.1994) loss 3.5886 (3.2090) grad_norm 2.3801 (2.3298) [2022-01-25 00:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1230/1251] eta 0:00:46 lr 0.000141 time 2.5197 (2.1993) loss 2.3632 (3.2075) grad_norm 2.2646 (2.3287) [2022-01-25 00:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1240/1251] eta 0:00:24 lr 0.000141 time 1.4299 (2.1997) loss 3.3746 (3.2080) grad_norm 2.2624 
(2.3288) [2022-01-25 00:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [228/300][1250/1251] eta 0:00:02 lr 0.000141 time 1.1785 (2.1949) loss 3.5195 (3.2061) grad_norm 2.2959 (2.3330) [2022-01-25 00:15:40 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 228 training takes 0:45:46 [2022-01-25 00:15:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.097 (19.097) Loss 0.8507 (0.8507) Acc@1 80.566 (80.566) Acc@5 95.605 (95.605) [2022-01-25 00:16:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.024 (3.210) Loss 0.8872 (0.8730) Acc@1 77.734 (78.915) Acc@5 94.922 (95.215) [2022-01-25 00:16:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.995 (2.606) Loss 0.8952 (0.8683) Acc@1 80.078 (79.367) Acc@5 94.238 (95.178) [2022-01-25 00:16:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.306 (2.353) Loss 0.8168 (0.8685) Acc@1 80.566 (79.413) Acc@5 95.605 (95.136) [2022-01-25 00:17:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.221 (2.230) Loss 0.8911 (0.8693) Acc@1 78.809 (79.456) Acc@5 94.141 (95.029) [2022-01-25 00:17:18 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.510 Acc@5 95.014 [2022-01-25 00:17:18 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-01-25 00:17:18 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.53% [2022-01-25 00:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][0/1251] eta 7:33:03 lr 0.000141 time 21.7291 (21.7291) loss 2.2374 (2.2374) grad_norm 2.8203 (2.8203) [2022-01-25 00:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][10/1251] eta 1:25:59 lr 0.000141 time 2.5971 (4.1579) loss 3.1250 (3.0771) grad_norm 2.0164 (2.3464) [2022-01-25 00:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][20/1251] eta 1:06:46 lr 0.000141 time 1.5224 (3.2547) loss 
3.2928 (3.1774) grad_norm 2.4284 (2.4160) [2022-01-25 00:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][30/1251] eta 1:00:24 lr 0.000141 time 1.7960 (2.9688) loss 3.3205 (3.1767) grad_norm 2.3434 (2.4526) [2022-01-25 00:19:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][40/1251] eta 0:57:00 lr 0.000141 time 3.9340 (2.8243) loss 3.9202 (3.1722) grad_norm 2.2807 (2.4327) [2022-01-25 00:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][50/1251] eta 0:53:55 lr 0.000140 time 2.1865 (2.6937) loss 2.5310 (3.1511) grad_norm 2.4715 (2.4079) [2022-01-25 00:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][60/1251] eta 0:51:41 lr 0.000140 time 1.8350 (2.6041) loss 2.5510 (3.1294) grad_norm 2.3889 (2.3782) [2022-01-25 00:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][70/1251] eta 0:50:10 lr 0.000140 time 1.9152 (2.5491) loss 2.7146 (3.0984) grad_norm 2.4258 (2.3832) [2022-01-25 00:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][80/1251] eta 0:49:27 lr 0.000140 time 3.5555 (2.5339) loss 3.5301 (3.1192) grad_norm 2.2555 (2.3771) [2022-01-25 00:21:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][90/1251] eta 0:48:17 lr 0.000140 time 1.5045 (2.4960) loss 3.7095 (3.1161) grad_norm 2.1372 (2.3582) [2022-01-25 00:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][100/1251] eta 0:46:54 lr 0.000140 time 1.9352 (2.4451) loss 3.3157 (3.1493) grad_norm 2.3139 (2.3415) [2022-01-25 00:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][110/1251] eta 0:45:59 lr 0.000140 time 1.9791 (2.4188) loss 3.6486 (3.1464) grad_norm 2.0995 (2.3204) [2022-01-25 00:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][120/1251] eta 0:44:58 lr 0.000140 time 2.2710 (2.3860) loss 3.7098 (3.1253) grad_norm 2.8921 (2.3141) [2022-01-25 00:22:28 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [229/300][130/1251] eta 0:44:12 lr 0.000140 time 2.1558 (2.3665) loss 3.5869 (3.1251) grad_norm 2.5308 (2.3037) [2022-01-25 00:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][140/1251] eta 0:43:28 lr 0.000140 time 2.4047 (2.3483) loss 3.4003 (3.1304) grad_norm 2.1747 (2.2999) [2022-01-25 00:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][150/1251] eta 0:42:59 lr 0.000140 time 2.1369 (2.3432) loss 3.1419 (3.1228) grad_norm 2.0446 (2.2925) [2022-01-25 00:23:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][160/1251] eta 0:42:36 lr 0.000140 time 3.1066 (2.3430) loss 3.9676 (3.1455) grad_norm 2.1874 (2.2967) [2022-01-25 00:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][170/1251] eta 0:42:12 lr 0.000140 time 2.5546 (2.3423) loss 3.9476 (3.1573) grad_norm 2.4816 (2.2996) [2022-01-25 00:24:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][180/1251] eta 0:41:35 lr 0.000140 time 1.8419 (2.3302) loss 3.5113 (3.1697) grad_norm 2.3786 (2.3020) [2022-01-25 00:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][190/1251] eta 0:40:55 lr 0.000140 time 1.9333 (2.3143) loss 2.5562 (3.1520) grad_norm 2.1409 (2.3002) [2022-01-25 00:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][200/1251] eta 0:40:19 lr 0.000140 time 2.4217 (2.3021) loss 3.1031 (3.1574) grad_norm 2.3239 (2.2977) [2022-01-25 00:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][210/1251] eta 0:39:39 lr 0.000140 time 1.8644 (2.2862) loss 3.4529 (3.1573) grad_norm 2.5003 (2.2989) [2022-01-25 00:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][220/1251] eta 0:39:09 lr 0.000140 time 2.1895 (2.2786) loss 2.5693 (3.1687) grad_norm 2.4794 (2.3005) [2022-01-25 00:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][230/1251] eta 0:38:37 lr 0.000140 time 2.7619 (2.2701) loss 3.1227 
(3.1649) grad_norm 2.5122 (2.2969) [2022-01-25 00:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][240/1251] eta 0:38:05 lr 0.000140 time 1.9650 (2.2611) loss 3.4851 (3.1720) grad_norm 2.1353 (2.2979) [2022-01-25 00:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][250/1251] eta 0:37:48 lr 0.000140 time 3.1897 (2.2666) loss 3.3190 (3.1611) grad_norm 2.1708 (2.2998) [2022-01-25 00:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][260/1251] eta 0:37:26 lr 0.000140 time 1.8491 (2.2669) loss 2.6530 (3.1619) grad_norm 2.2410 (2.3015) [2022-01-25 00:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][270/1251] eta 0:37:12 lr 0.000140 time 2.4504 (2.2755) loss 3.6234 (3.1634) grad_norm 2.3475 (2.3050) [2022-01-25 00:27:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][280/1251] eta 0:36:45 lr 0.000140 time 2.6300 (2.2709) loss 3.9025 (3.1634) grad_norm 2.2886 (2.3075) [2022-01-25 00:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][290/1251] eta 0:36:18 lr 0.000140 time 3.1910 (2.2672) loss 3.1246 (3.1719) grad_norm 2.3253 (2.3092) [2022-01-25 00:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][300/1251] eta 0:35:54 lr 0.000140 time 2.1209 (2.2657) loss 3.7102 (3.1740) grad_norm 2.0793 (2.3091) [2022-01-25 00:29:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][310/1251] eta 0:35:31 lr 0.000140 time 2.5112 (2.2653) loss 3.5678 (3.1724) grad_norm 2.3459 (2.3114) [2022-01-25 00:29:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][320/1251] eta 0:35:06 lr 0.000140 time 2.1680 (2.2627) loss 3.4011 (3.1660) grad_norm 2.4184 (2.3100) [2022-01-25 00:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][330/1251] eta 0:34:38 lr 0.000140 time 1.9109 (2.2563) loss 2.8414 (3.1649) grad_norm 2.0425 (2.3114) [2022-01-25 00:30:06 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [229/300][340/1251] eta 0:34:12 lr 0.000140 time 1.5009 (2.2529) loss 3.6055 (3.1704) grad_norm 2.0922 (2.3126) [2022-01-25 00:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][350/1251] eta 0:33:45 lr 0.000140 time 1.9227 (2.2482) loss 3.5710 (3.1678) grad_norm 2.2926 (2.3129) [2022-01-25 00:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][360/1251] eta 0:33:22 lr 0.000140 time 2.1936 (2.2475) loss 3.4263 (3.1725) grad_norm 2.2427 (2.3110) [2022-01-25 00:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][370/1251] eta 0:33:00 lr 0.000140 time 2.1096 (2.2484) loss 3.2679 (3.1733) grad_norm 2.2415 (2.3084) [2022-01-25 00:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][380/1251] eta 0:32:41 lr 0.000140 time 1.8666 (2.2523) loss 2.4268 (3.1667) grad_norm 2.2990 (2.3063) [2022-01-25 00:31:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][390/1251] eta 0:32:15 lr 0.000140 time 1.5904 (2.2480) loss 3.2821 (3.1610) grad_norm 1.9624 (2.3045) [2022-01-25 00:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][400/1251] eta 0:31:48 lr 0.000140 time 1.9817 (2.2431) loss 3.1918 (3.1598) grad_norm 2.2573 (2.3045) [2022-01-25 00:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][410/1251] eta 0:31:22 lr 0.000139 time 2.5306 (2.2387) loss 3.7775 (3.1591) grad_norm 2.2127 (2.3031) [2022-01-25 00:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][420/1251] eta 0:30:59 lr 0.000139 time 1.8891 (2.2382) loss 2.8426 (3.1546) grad_norm 2.1412 (2.3034) [2022-01-25 00:33:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][430/1251] eta 0:30:36 lr 0.000139 time 1.8160 (2.2367) loss 2.6920 (3.1452) grad_norm 2.2631 (2.3003) [2022-01-25 00:33:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][440/1251] eta 0:30:12 lr 0.000139 time 1.7451 (2.2344) loss 3.7466 
(3.1445) grad_norm 2.9307 (2.3049) [2022-01-25 00:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][450/1251] eta 0:29:51 lr 0.000139 time 3.0492 (2.2365) loss 2.2495 (3.1401) grad_norm 2.2241 (2.3089) [2022-01-25 00:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][460/1251] eta 0:29:27 lr 0.000139 time 1.9028 (2.2348) loss 3.1343 (3.1433) grad_norm 2.1537 (2.3084) [2022-01-25 00:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][470/1251] eta 0:29:05 lr 0.000139 time 1.9298 (2.2345) loss 3.2717 (3.1471) grad_norm 2.5471 (2.3082) [2022-01-25 00:35:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][480/1251] eta 0:28:43 lr 0.000139 time 2.5689 (2.2352) loss 3.0133 (3.1487) grad_norm 2.2340 (2.3077) [2022-01-25 00:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][490/1251] eta 0:28:18 lr 0.000139 time 2.1854 (2.2318) loss 3.4674 (3.1522) grad_norm 2.0018 (2.3061) [2022-01-25 00:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][500/1251] eta 0:27:52 lr 0.000139 time 2.1822 (2.2265) loss 2.8031 (3.1472) grad_norm 2.0112 (2.3026) [2022-01-25 00:36:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][510/1251] eta 0:27:27 lr 0.000139 time 2.3694 (2.2234) loss 2.5689 (3.1497) grad_norm 2.1463 (2.3017) [2022-01-25 00:36:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][520/1251] eta 0:27:05 lr 0.000139 time 2.7297 (2.2231) loss 3.6341 (3.1539) grad_norm 2.7215 (2.3069) [2022-01-25 00:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][530/1251] eta 0:26:43 lr 0.000139 time 2.6348 (2.2235) loss 2.2773 (3.1537) grad_norm 2.5975 (2.3106) [2022-01-25 00:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][540/1251] eta 0:26:19 lr 0.000139 time 1.9876 (2.2220) loss 3.5190 (3.1565) grad_norm 2.1215 (2.3137) [2022-01-25 00:37:42 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [229/300][550/1251] eta 0:25:57 lr 0.000139 time 1.8578 (2.2214) loss 3.1661 (3.1614) grad_norm 2.3408 (2.3169) [2022-01-25 00:38:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][560/1251] eta 0:25:35 lr 0.000139 time 2.7525 (2.2227) loss 3.7122 (3.1646) grad_norm 2.2132 (2.3159) [2022-01-25 00:38:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][570/1251] eta 0:25:12 lr 0.000139 time 2.5438 (2.2215) loss 3.4685 (3.1681) grad_norm 2.1154 (2.3186) [2022-01-25 00:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][580/1251] eta 0:24:49 lr 0.000139 time 1.6386 (2.2204) loss 2.2790 (3.1673) grad_norm 2.1129 (2.3180) [2022-01-25 00:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][590/1251] eta 0:24:27 lr 0.000139 time 2.2671 (2.2203) loss 2.5755 (3.1636) grad_norm 2.2805 (2.3161) [2022-01-25 00:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][600/1251] eta 0:24:05 lr 0.000139 time 3.4543 (2.2211) loss 3.5034 (3.1594) grad_norm 2.1410 (2.3185) [2022-01-25 00:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][610/1251] eta 0:23:42 lr 0.000139 time 2.5853 (2.2189) loss 3.4649 (3.1628) grad_norm 2.0223 (2.3185) [2022-01-25 00:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][620/1251] eta 0:23:19 lr 0.000139 time 1.6317 (2.2173) loss 2.9178 (3.1625) grad_norm 2.1771 (2.3193) [2022-01-25 00:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][630/1251] eta 0:22:56 lr 0.000139 time 1.8320 (2.2174) loss 2.8615 (3.1645) grad_norm 2.1812 (2.3176) [2022-01-25 00:41:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][640/1251] eta 0:22:35 lr 0.000139 time 4.1321 (2.2185) loss 3.7912 (3.1681) grad_norm 2.1863 (2.3201) [2022-01-25 00:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][650/1251] eta 0:22:13 lr 0.000139 time 2.0453 (2.2190) loss 3.8237 
(3.1692) grad_norm 2.3845 (2.3186) [2022-01-25 00:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][660/1251] eta 0:21:50 lr 0.000139 time 1.6077 (2.2170) loss 3.7009 (3.1706) grad_norm 2.4504 (2.3175) [2022-01-25 00:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][670/1251] eta 0:21:27 lr 0.000139 time 1.9737 (2.2160) loss 3.6112 (3.1699) grad_norm 2.0930 (2.3167) [2022-01-25 00:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][680/1251] eta 0:21:04 lr 0.000139 time 2.0905 (2.2145) loss 3.0566 (3.1689) grad_norm 2.2330 (2.3166) [2022-01-25 00:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][690/1251] eta 0:20:41 lr 0.000139 time 2.2779 (2.2123) loss 3.5112 (3.1688) grad_norm 2.3719 (2.3147) [2022-01-25 00:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][700/1251] eta 0:20:17 lr 0.000139 time 1.6308 (2.2103) loss 3.5614 (3.1685) grad_norm 2.2548 (2.3146) [2022-01-25 00:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][710/1251] eta 0:19:56 lr 0.000139 time 1.7368 (2.2107) loss 2.7799 (3.1664) grad_norm 2.3560 (2.3171) [2022-01-25 00:43:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][720/1251] eta 0:19:34 lr 0.000139 time 2.2823 (2.2128) loss 3.2304 (3.1683) grad_norm 2.2118 (2.3160) [2022-01-25 00:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][730/1251] eta 0:19:12 lr 0.000139 time 2.5431 (2.2121) loss 2.5622 (3.1668) grad_norm 2.1487 (2.3171) [2022-01-25 00:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][740/1251] eta 0:18:50 lr 0.000139 time 1.6131 (2.2122) loss 2.0919 (3.1661) grad_norm 2.1805 (2.3197) [2022-01-25 00:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][750/1251] eta 0:18:27 lr 0.000139 time 2.0155 (2.2104) loss 2.1409 (3.1642) grad_norm 2.1340 (2.3207) [2022-01-25 00:45:21 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [229/300][760/1251] eta 0:18:05 lr 0.000139 time 2.1587 (2.2108) loss 3.3235 (3.1623) grad_norm 2.1103 (2.3200) [2022-01-25 00:45:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][770/1251] eta 0:17:43 lr 0.000138 time 2.4985 (2.2109) loss 3.7733 (3.1613) grad_norm 2.2826 (2.3198) [2022-01-25 00:46:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][780/1251] eta 0:17:21 lr 0.000138 time 2.3216 (2.2108) loss 2.9982 (3.1595) grad_norm 2.1146 (2.3214) [2022-01-25 00:46:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][790/1251] eta 0:16:58 lr 0.000138 time 1.5694 (2.2101) loss 2.6730 (3.1529) grad_norm 2.0194 (2.3205) [2022-01-25 00:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][800/1251] eta 0:16:36 lr 0.000138 time 2.3481 (2.2090) loss 2.9183 (3.1528) grad_norm 2.2281 (2.3186) [2022-01-25 00:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][810/1251] eta 0:16:13 lr 0.000138 time 2.2496 (2.2086) loss 2.2368 (3.1523) grad_norm 2.7214 (2.3200) [2022-01-25 00:47:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][820/1251] eta 0:15:52 lr 0.000138 time 2.2024 (2.2101) loss 3.6155 (3.1523) grad_norm 2.6367 (2.3210) [2022-01-25 00:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][830/1251] eta 0:15:30 lr 0.000138 time 1.8280 (2.2091) loss 3.5700 (3.1539) grad_norm 2.3040 (2.3217) [2022-01-25 00:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][840/1251] eta 0:15:07 lr 0.000138 time 1.9546 (2.2092) loss 3.2805 (3.1569) grad_norm 2.1876 (2.3221) [2022-01-25 00:48:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][850/1251] eta 0:14:46 lr 0.000138 time 2.8904 (2.2101) loss 3.2478 (3.1547) grad_norm 2.2497 (2.3221) [2022-01-25 00:49:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][860/1251] eta 0:14:24 lr 0.000138 time 2.0507 (2.2107) loss 3.0883 
(3.1542) grad_norm 2.3179 (2.3232) [2022-01-25 00:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][870/1251] eta 0:14:01 lr 0.000138 time 2.0396 (2.2097) loss 2.6370 (3.1524) grad_norm 3.1135 (2.3252) [2022-01-25 00:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][880/1251] eta 0:13:39 lr 0.000138 time 2.2626 (2.2096) loss 3.9475 (3.1511) grad_norm 2.3893 (2.3263) [2022-01-25 00:50:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][890/1251] eta 0:13:17 lr 0.000138 time 2.9592 (2.2091) loss 2.3592 (3.1525) grad_norm 2.5609 (2.3247) [2022-01-25 00:50:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][900/1251] eta 0:12:54 lr 0.000138 time 1.9444 (2.2073) loss 3.3692 (3.1549) grad_norm 2.1238 (2.3255) [2022-01-25 00:50:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][910/1251] eta 0:12:31 lr 0.000138 time 1.9023 (2.2052) loss 3.5299 (3.1551) grad_norm 2.4346 (2.3264) [2022-01-25 00:51:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][920/1251] eta 0:12:09 lr 0.000138 time 2.2150 (2.2050) loss 3.5662 (3.1530) grad_norm 2.4448 (2.3323) [2022-01-25 00:51:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][930/1251] eta 0:11:47 lr 0.000138 time 2.5224 (2.2049) loss 3.1225 (3.1555) grad_norm 2.8469 (2.3333) [2022-01-25 00:51:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][940/1251] eta 0:11:25 lr 0.000138 time 2.2168 (2.2045) loss 3.6551 (3.1552) grad_norm 2.5662 (2.3331) [2022-01-25 00:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][950/1251] eta 0:11:03 lr 0.000138 time 2.4808 (2.2055) loss 3.0393 (3.1542) grad_norm 2.2605 (2.3342) [2022-01-25 00:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][960/1251] eta 0:10:42 lr 0.000138 time 2.1645 (2.2066) loss 3.2444 (3.1553) grad_norm 2.2705 (2.3369) [2022-01-25 00:53:03 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [229/300][970/1251] eta 0:10:20 lr 0.000138 time 2.8247 (2.2085) loss 3.3147 (3.1551) grad_norm 2.5141 (2.3375) [2022-01-25 00:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][980/1251] eta 0:09:58 lr 0.000138 time 2.1784 (2.2090) loss 3.1614 (3.1546) grad_norm 2.0491 (2.3379) [2022-01-25 00:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][990/1251] eta 0:09:36 lr 0.000138 time 2.5213 (2.2081) loss 3.2237 (3.1533) grad_norm 2.2534 (2.3383) [2022-01-25 00:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1000/1251] eta 0:09:13 lr 0.000138 time 1.8098 (2.2061) loss 2.6266 (3.1551) grad_norm 2.0049 (2.3396) [2022-01-25 00:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1010/1251] eta 0:08:51 lr 0.000138 time 2.7939 (2.2050) loss 3.6176 (3.1543) grad_norm 2.2400 (2.3389) [2022-01-25 00:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1020/1251] eta 0:08:29 lr 0.000138 time 2.2117 (2.2055) loss 3.6356 (3.1558) grad_norm 2.2372 (2.3377) [2022-01-25 00:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1030/1251] eta 0:08:07 lr 0.000138 time 1.8852 (2.2062) loss 3.3420 (3.1572) grad_norm 2.3085 (2.3366) [2022-01-25 00:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1040/1251] eta 0:07:45 lr 0.000138 time 2.2138 (2.2060) loss 2.5148 (3.1559) grad_norm 2.8918 (2.3369) [2022-01-25 00:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1050/1251] eta 0:07:23 lr 0.000138 time 2.9779 (2.2050) loss 3.4889 (3.1584) grad_norm 2.9462 (2.3367) [2022-01-25 00:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1060/1251] eta 0:07:00 lr 0.000138 time 1.8250 (2.2033) loss 2.9334 (3.1588) grad_norm 2.4866 (2.3365) [2022-01-25 00:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1070/1251] eta 0:06:38 lr 0.000138 time 2.4838 (2.2035) loss 
3.2339 (3.1587) grad_norm 2.1221 (2.3354) [2022-01-25 00:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1080/1251] eta 0:06:16 lr 0.000138 time 2.2740 (2.2043) loss 3.8507 (3.1618) grad_norm 2.1611 (2.3357) [2022-01-25 00:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1090/1251] eta 0:05:54 lr 0.000138 time 2.9539 (2.2043) loss 3.6442 (3.1623) grad_norm 2.5221 (2.3372) [2022-01-25 00:57:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1100/1251] eta 0:05:32 lr 0.000138 time 1.8140 (2.2042) loss 3.0351 (3.1643) grad_norm 3.2912 (2.3394) [2022-01-25 00:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1110/1251] eta 0:05:10 lr 0.000138 time 2.6268 (2.2045) loss 3.0975 (3.1632) grad_norm 3.0813 (2.3402) [2022-01-25 00:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1120/1251] eta 0:04:48 lr 0.000138 time 2.2365 (2.2041) loss 3.4388 (3.1628) grad_norm 2.1389 (2.3419) [2022-01-25 00:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1130/1251] eta 0:04:26 lr 0.000137 time 1.8053 (2.2042) loss 3.6777 (3.1640) grad_norm 2.4487 (2.3420) [2022-01-25 00:59:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1140/1251] eta 0:04:04 lr 0.000137 time 2.3048 (2.2036) loss 3.2998 (3.1631) grad_norm 2.2234 (2.3408) [2022-01-25 00:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1150/1251] eta 0:03:42 lr 0.000137 time 3.1191 (2.2048) loss 2.1856 (3.1646) grad_norm 2.0776 (2.3405) [2022-01-25 00:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1160/1251] eta 0:03:20 lr 0.000137 time 1.9466 (2.2036) loss 3.5155 (3.1670) grad_norm 2.0789 (2.3398) [2022-01-25 01:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1170/1251] eta 0:02:58 lr 0.000137 time 1.9803 (2.2015) loss 2.9798 (3.1680) grad_norm 2.0133 (2.3392) [2022-01-25 01:00:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1180/1251] eta 0:02:36 lr 0.000137 time 2.2378 (2.2008) loss 3.1598 (3.1692) grad_norm 1.9962 (2.3379) [2022-01-25 01:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1190/1251] eta 0:02:14 lr 0.000137 time 2.2349 (2.2019) loss 2.0419 (3.1693) grad_norm 2.3506 (2.3371) [2022-01-25 01:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1200/1251] eta 0:01:52 lr 0.000137 time 2.1063 (2.2028) loss 3.3649 (3.1712) grad_norm 2.4005 (2.3362) [2022-01-25 01:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1210/1251] eta 0:01:30 lr 0.000137 time 1.9721 (2.2024) loss 3.4014 (3.1712) grad_norm 3.0230 (2.3360) [2022-01-25 01:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1220/1251] eta 0:01:08 lr 0.000137 time 1.9838 (2.2016) loss 2.1417 (3.1690) grad_norm 2.4781 (2.3360) [2022-01-25 01:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1230/1251] eta 0:00:46 lr 0.000137 time 2.7690 (2.2010) loss 3.4279 (3.1700) grad_norm 2.2922 (2.3353) [2022-01-25 01:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1240/1251] eta 0:00:24 lr 0.000137 time 1.4777 (2.1999) loss 3.6354 (3.1710) grad_norm 2.1053 (2.3353) [2022-01-25 01:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [229/300][1250/1251] eta 0:00:02 lr 0.000137 time 1.2907 (2.1943) loss 2.2293 (3.1703) grad_norm 2.3088 (2.3350) [2022-01-25 01:03:04 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 229 training takes 0:45:45 [2022-01-25 01:03:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.999 (18.999) Loss 0.9517 (0.9517) Acc@1 77.246 (77.246) Acc@5 93.945 (93.945) [2022-01-25 01:03:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.363 (3.350) Loss 0.8263 (0.8602) Acc@1 80.469 (79.714) Acc@5 95.703 (95.046) [2022-01-25 01:04:00 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.634 (2.668) Loss 0.8497 (0.8648) Acc@1 80.371 (79.567) Acc@5 95.117 (94.996) [2022-01-25 01:04:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.656 (2.357) Loss 0.8293 (0.8699) Acc@1 79.688 (79.442) Acc@5 94.629 (94.912) [2022-01-25 01:04:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.378 (2.221) Loss 0.7644 (0.8595) Acc@1 82.031 (79.671) Acc@5 94.922 (94.981) [2022-01-25 01:04:43 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.692 Acc@5 95.028 [2022-01-25 01:04:43 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-01-25 01:04:43 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.69% [2022-01-25 01:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][0/1251] eta 7:30:03 lr 0.000137 time 21.5857 (21.5857) loss 3.3694 (3.3694) grad_norm 2.4170 (2.4170) [2022-01-25 01:05:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][10/1251] eta 1:22:44 lr 0.000137 time 2.1433 (4.0005) loss 3.4034 (3.2918) grad_norm 2.2425 (2.4662) [2022-01-25 01:05:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][20/1251] eta 1:05:27 lr 0.000137 time 1.6778 (3.1908) loss 3.6075 (3.1199) grad_norm 2.2003 (2.3779) [2022-01-25 01:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][30/1251] eta 0:58:49 lr 0.000137 time 1.4905 (2.8907) loss 3.2057 (3.1401) grad_norm 2.4696 (2.3768) [2022-01-25 01:06:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][40/1251] eta 0:55:24 lr 0.000137 time 3.8861 (2.7451) loss 3.8234 (3.1458) grad_norm 2.2242 (2.3816) [2022-01-25 01:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][50/1251] eta 0:52:58 lr 0.000137 time 2.8052 (2.6461) loss 3.6125 (3.1172) grad_norm 2.4124 (2.3644) [2022-01-25 01:07:21 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [230/300][60/1251] eta 0:51:37 lr 0.000137 time 1.8469 (2.6004) loss 2.4139 (3.1061) grad_norm 2.5529 (2.3543) [2022-01-25 01:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][70/1251] eta 0:50:07 lr 0.000137 time 1.9348 (2.5465) loss 3.5284 (3.0760) grad_norm 2.3644 (2.3619) [2022-01-25 01:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][80/1251] eta 0:49:09 lr 0.000137 time 3.3492 (2.5190) loss 3.6375 (3.0945) grad_norm 2.4319 (2.3556) [2022-01-25 01:08:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][90/1251] eta 0:48:02 lr 0.000137 time 2.8228 (2.4827) loss 3.0557 (3.1020) grad_norm 2.1304 (2.3443) [2022-01-25 01:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][100/1251] eta 0:46:24 lr 0.000137 time 1.9900 (2.4193) loss 2.5663 (3.1048) grad_norm 2.0972 (2.3287) [2022-01-25 01:09:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][110/1251] eta 0:45:26 lr 0.000137 time 1.6048 (2.3892) loss 3.4343 (3.1098) grad_norm 2.1324 (2.3272) [2022-01-25 01:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][120/1251] eta 0:44:51 lr 0.000137 time 2.8266 (2.3801) loss 2.7487 (3.1083) grad_norm 2.8111 (2.3299) [2022-01-25 01:09:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][130/1251] eta 0:44:04 lr 0.000137 time 2.5295 (2.3589) loss 2.1057 (3.1051) grad_norm 2.5272 (2.3513) [2022-01-25 01:10:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][140/1251] eta 0:43:28 lr 0.000137 time 1.9097 (2.3482) loss 3.5125 (3.1116) grad_norm 3.5339 (2.3684) [2022-01-25 01:10:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][150/1251] eta 0:42:51 lr 0.000137 time 1.8876 (2.3359) loss 3.4542 (3.1181) grad_norm 2.3501 (2.3737) [2022-01-25 01:10:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][160/1251] eta 0:42:25 lr 0.000137 time 2.8299 (2.3334) loss 3.2824 (3.1213) 
grad_norm 2.3479 (2.3830) [2022-01-25 01:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][170/1251] eta 0:41:47 lr 0.000137 time 2.3002 (2.3200) loss 3.0130 (3.1177) grad_norm 2.8623 (2.3906) [2022-01-25 01:11:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][180/1251] eta 0:41:15 lr 0.000137 time 1.8697 (2.3117) loss 1.9324 (3.1030) grad_norm 2.4959 (2.4093) [2022-01-25 01:12:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][190/1251] eta 0:40:45 lr 0.000137 time 1.8326 (2.3045) loss 2.9436 (3.1004) grad_norm 2.5540 (2.4084) [2022-01-25 01:12:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][200/1251] eta 0:40:19 lr 0.000137 time 2.7560 (2.3022) loss 3.6041 (3.1124) grad_norm 2.4750 (2.4037) [2022-01-25 01:12:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][210/1251] eta 0:39:48 lr 0.000137 time 1.9444 (2.2944) loss 2.1815 (3.1172) grad_norm 2.1385 (2.3984) [2022-01-25 01:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][220/1251] eta 0:39:24 lr 0.000137 time 2.7156 (2.2931) loss 2.1484 (3.1220) grad_norm 2.2700 (2.3944) [2022-01-25 01:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][230/1251] eta 0:39:04 lr 0.000137 time 2.4999 (2.2962) loss 2.6936 (3.1099) grad_norm 2.5053 (2.3961) [2022-01-25 01:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][240/1251] eta 0:38:34 lr 0.000136 time 2.0363 (2.2895) loss 3.6550 (3.1082) grad_norm 2.1772 (2.3905) [2022-01-25 01:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][250/1251] eta 0:37:58 lr 0.000136 time 2.2593 (2.2767) loss 3.0903 (3.0990) grad_norm 2.5623 (2.3853) [2022-01-25 01:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][260/1251] eta 0:37:31 lr 0.000136 time 2.1958 (2.2718) loss 2.5635 (3.0959) grad_norm 2.1133 (2.3779) [2022-01-25 01:14:57 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [230/300][270/1251] eta 0:37:04 lr 0.000136 time 2.6872 (2.2680) loss 3.7323 (3.0956) grad_norm 2.1485 (2.3761) [2022-01-25 01:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][280/1251] eta 0:36:41 lr 0.000136 time 3.3551 (2.2668) loss 2.6804 (3.1001) grad_norm 2.0750 (2.3744) [2022-01-25 01:15:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][290/1251] eta 0:36:20 lr 0.000136 time 1.8173 (2.2691) loss 3.3064 (3.0957) grad_norm 2.3485 (2.3759) [2022-01-25 01:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][300/1251] eta 0:35:56 lr 0.000136 time 1.8340 (2.2674) loss 3.3641 (3.0942) grad_norm 2.2420 (2.3750) [2022-01-25 01:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][310/1251] eta 0:35:30 lr 0.000136 time 2.8127 (2.2642) loss 3.8769 (3.0965) grad_norm 2.4492 (2.3786) [2022-01-25 01:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][320/1251] eta 0:35:00 lr 0.000136 time 2.0538 (2.2565) loss 3.1844 (3.0994) grad_norm 2.2382 (2.3747) [2022-01-25 01:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][330/1251] eta 0:34:29 lr 0.000136 time 2.3276 (2.2474) loss 3.2332 (3.0950) grad_norm 2.3650 (2.3717) [2022-01-25 01:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][340/1251] eta 0:34:01 lr 0.000136 time 1.7712 (2.2407) loss 3.6543 (3.0989) grad_norm 2.2103 (2.3646) [2022-01-25 01:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][350/1251] eta 0:33:32 lr 0.000136 time 1.9538 (2.2339) loss 2.1430 (3.1001) grad_norm 2.0506 (2.3611) [2022-01-25 01:18:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][360/1251] eta 0:33:11 lr 0.000136 time 2.8314 (2.2346) loss 2.0743 (3.0996) grad_norm 2.6947 (2.3642) [2022-01-25 01:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][370/1251] eta 0:32:46 lr 0.000136 time 2.7588 (2.2318) loss 2.2930 (3.0993) 
grad_norm 2.2015 (2.3647) [2022-01-25 01:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][380/1251] eta 0:32:25 lr 0.000136 time 2.3043 (2.2340) loss 3.8450 (3.1002) grad_norm 2.6299 (2.3629) [2022-01-25 01:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][390/1251] eta 0:32:06 lr 0.000136 time 1.9065 (2.2370) loss 2.2630 (3.0995) grad_norm 2.8853 (2.3663) [2022-01-25 01:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][400/1251] eta 0:31:50 lr 0.000136 time 2.5322 (2.2444) loss 3.4987 (3.1038) grad_norm 2.2186 (2.3654) [2022-01-25 01:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][410/1251] eta 0:31:28 lr 0.000136 time 2.8978 (2.2451) loss 2.9052 (3.1014) grad_norm 3.0362 (2.3666) [2022-01-25 01:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][420/1251] eta 0:31:04 lr 0.000136 time 1.8648 (2.2436) loss 2.1146 (3.1038) grad_norm 2.3855 (2.3671) [2022-01-25 01:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][430/1251] eta 0:30:37 lr 0.000136 time 2.1438 (2.2382) loss 3.4635 (3.1024) grad_norm 2.4076 (2.3649) [2022-01-25 01:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][440/1251] eta 0:30:11 lr 0.000136 time 1.8820 (2.2331) loss 3.1945 (3.1010) grad_norm 2.4939 (2.3630) [2022-01-25 01:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][450/1251] eta 0:29:46 lr 0.000136 time 2.1278 (2.2301) loss 2.5019 (3.1014) grad_norm 2.1940 (2.3625) [2022-01-25 01:21:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][460/1251] eta 0:29:22 lr 0.000136 time 2.2230 (2.2286) loss 2.4329 (3.1007) grad_norm 2.5968 (2.3668) [2022-01-25 01:22:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][470/1251] eta 0:29:00 lr 0.000136 time 2.8150 (2.2283) loss 3.7709 (3.1010) grad_norm 2.6373 (2.3658) [2022-01-25 01:22:34 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [230/300][480/1251] eta 0:28:37 lr 0.000136 time 2.2336 (2.2274) loss 3.0370 (3.1015) grad_norm 2.3917 (2.3636) [2022-01-25 01:22:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][490/1251] eta 0:28:14 lr 0.000136 time 2.1934 (2.2267) loss 3.9650 (3.1039) grad_norm 2.5199 (2.3660) [2022-01-25 01:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][500/1251] eta 0:27:52 lr 0.000136 time 2.8610 (2.2273) loss 3.6448 (3.1099) grad_norm 2.2889 (2.3645) [2022-01-25 01:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][510/1251] eta 0:27:30 lr 0.000136 time 1.9036 (2.2272) loss 2.2609 (3.1123) grad_norm 2.6756 (2.3648) [2022-01-25 01:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][520/1251] eta 0:27:09 lr 0.000136 time 1.7964 (2.2294) loss 3.2933 (3.1140) grad_norm 3.2933 (2.3661) [2022-01-25 01:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][530/1251] eta 0:26:46 lr 0.000136 time 2.5456 (2.2286) loss 3.2312 (3.1155) grad_norm 4.5226 (2.3738) [2022-01-25 01:24:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][540/1251] eta 0:26:24 lr 0.000136 time 2.6738 (2.2279) loss 3.7051 (3.1157) grad_norm 2.0665 (2.3821) [2022-01-25 01:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][550/1251] eta 0:26:00 lr 0.000136 time 2.1566 (2.2255) loss 3.5650 (3.1188) grad_norm 2.4955 (2.3820) [2022-01-25 01:25:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][560/1251] eta 0:25:34 lr 0.000136 time 1.8949 (2.2212) loss 3.7529 (3.1232) grad_norm 2.2748 (2.3853) [2022-01-25 01:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][570/1251] eta 0:25:10 lr 0.000136 time 2.0928 (2.2176) loss 3.3093 (3.1277) grad_norm 2.6209 (2.3846) [2022-01-25 01:26:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][580/1251] eta 0:24:47 lr 0.000136 time 2.6577 (2.2171) loss 3.7595 (3.1297) 
grad_norm 2.7713 (2.3857) [2022-01-25 01:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][590/1251] eta 0:24:26 lr 0.000136 time 2.9953 (2.2190) loss 3.5705 (3.1299) grad_norm 2.3422 (2.3846) [2022-01-25 01:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][600/1251] eta 0:24:04 lr 0.000135 time 1.8151 (2.2187) loss 3.2546 (3.1297) grad_norm 2.4167 (2.3847) [2022-01-25 01:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][610/1251] eta 0:23:42 lr 0.000135 time 1.7279 (2.2184) loss 3.7961 (3.1308) grad_norm 2.2728 (2.3855) [2022-01-25 01:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][620/1251] eta 0:23:20 lr 0.000135 time 1.9954 (2.2197) loss 3.0201 (3.1295) grad_norm 2.2218 (2.3843) [2022-01-25 01:28:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][630/1251] eta 0:23:00 lr 0.000135 time 3.2840 (2.2227) loss 2.8495 (3.1299) grad_norm 2.0868 (2.3815) [2022-01-25 01:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][640/1251] eta 0:22:37 lr 0.000135 time 1.9186 (2.2225) loss 3.1135 (3.1296) grad_norm 2.0329 (2.3776) [2022-01-25 01:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][650/1251] eta 0:22:13 lr 0.000135 time 1.8633 (2.2185) loss 2.5886 (3.1312) grad_norm 2.1172 (2.3759) [2022-01-25 01:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][660/1251] eta 0:21:49 lr 0.000135 time 2.1995 (2.2162) loss 2.8218 (3.1271) grad_norm 2.1745 (2.3734) [2022-01-25 01:29:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][670/1251] eta 0:21:27 lr 0.000135 time 2.1532 (2.2156) loss 3.1815 (3.1307) grad_norm 2.0094 (2.3730) [2022-01-25 01:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][680/1251] eta 0:21:05 lr 0.000135 time 1.8967 (2.2156) loss 3.2700 (3.1283) grad_norm 2.3402 (2.3723) [2022-01-25 01:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [230/300][690/1251] eta 0:20:43 lr 0.000135 time 2.0884 (2.2157) loss 2.0774 (3.1244) grad_norm 2.0872 (2.3697) [2022-01-25 01:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][700/1251] eta 0:20:21 lr 0.000135 time 2.3568 (2.2162) loss 3.3438 (3.1236) grad_norm 2.5349 (2.3695) [2022-01-25 01:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][710/1251] eta 0:19:59 lr 0.000135 time 1.6690 (2.2165) loss 3.6882 (3.1274) grad_norm 2.2936 (2.3689) [2022-01-25 01:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][720/1251] eta 0:19:38 lr 0.000135 time 2.2310 (2.2187) loss 3.7760 (3.1287) grad_norm 2.0860 (2.3677) [2022-01-25 01:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][730/1251] eta 0:19:15 lr 0.000135 time 2.2592 (2.2181) loss 3.2114 (3.1283) grad_norm 2.2087 (2.3675) [2022-01-25 01:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][740/1251] eta 0:18:51 lr 0.000135 time 1.6666 (2.2142) loss 3.6553 (3.1301) grad_norm 2.0205 (2.3661) [2022-01-25 01:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][750/1251] eta 0:18:27 lr 0.000135 time 1.6344 (2.2108) loss 3.4237 (3.1330) grad_norm 2.4826 (2.3684) [2022-01-25 01:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][760/1251] eta 0:18:05 lr 0.000135 time 2.5439 (2.2111) loss 3.1666 (3.1305) grad_norm 2.6337 (2.3707) [2022-01-25 01:33:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][770/1251] eta 0:17:43 lr 0.000135 time 1.9427 (2.2101) loss 3.5542 (3.1316) grad_norm 2.3467 (2.3709) [2022-01-25 01:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][780/1251] eta 0:17:20 lr 0.000135 time 1.8699 (2.2081) loss 2.5304 (3.1299) grad_norm 2.2397 (2.3689) [2022-01-25 01:33:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][790/1251] eta 0:16:57 lr 0.000135 time 2.8599 (2.2079) loss 2.6309 (3.1295) 
grad_norm 2.0036 (2.3683) [2022-01-25 01:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][800/1251] eta 0:16:35 lr 0.000135 time 1.9088 (2.2074) loss 2.8906 (3.1292) grad_norm 2.0978 (2.3687) [2022-01-25 01:34:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][810/1251] eta 0:16:13 lr 0.000135 time 2.4989 (2.2085) loss 3.6868 (3.1271) grad_norm 2.4937 (2.3684) [2022-01-25 01:34:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][820/1251] eta 0:15:51 lr 0.000135 time 2.2196 (2.2081) loss 3.6532 (3.1286) grad_norm 2.5204 (2.3688) [2022-01-25 01:35:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][830/1251] eta 0:15:30 lr 0.000135 time 2.8724 (2.2101) loss 3.3054 (3.1251) grad_norm 2.5701 (2.3708) [2022-01-25 01:35:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][840/1251] eta 0:15:08 lr 0.000135 time 2.0384 (2.2098) loss 3.3011 (3.1255) grad_norm 2.2392 (2.3718) [2022-01-25 01:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][850/1251] eta 0:14:47 lr 0.000135 time 3.0011 (2.2141) loss 2.3813 (3.1277) grad_norm 2.3508 (2.3722) [2022-01-25 01:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][860/1251] eta 0:14:25 lr 0.000135 time 1.9395 (2.2129) loss 2.0823 (3.1280) grad_norm 2.2655 (2.3716) [2022-01-25 01:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][870/1251] eta 0:14:02 lr 0.000135 time 1.8768 (2.2105) loss 2.8625 (3.1252) grad_norm 2.3577 (2.3694) [2022-01-25 01:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][880/1251] eta 0:13:39 lr 0.000135 time 1.8556 (2.2082) loss 3.1618 (3.1254) grad_norm 2.4546 (2.3697) [2022-01-25 01:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][890/1251] eta 0:13:17 lr 0.000135 time 3.0521 (2.2090) loss 3.6542 (3.1288) grad_norm 2.3700 (2.3698) [2022-01-25 01:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [230/300][900/1251] eta 0:12:55 lr 0.000135 time 2.1378 (2.2088) loss 3.6452 (3.1268) grad_norm 2.0425 (2.3678) [2022-01-25 01:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][910/1251] eta 0:12:32 lr 0.000135 time 1.6212 (2.2078) loss 3.6530 (3.1264) grad_norm 2.3828 (2.3666) [2022-01-25 01:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][920/1251] eta 0:12:10 lr 0.000135 time 2.1495 (2.2079) loss 2.8862 (3.1276) grad_norm 2.1085 (2.3663) [2022-01-25 01:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][930/1251] eta 0:11:49 lr 0.000135 time 3.3117 (2.2105) loss 3.4547 (3.1268) grad_norm 2.0265 (2.3659) [2022-01-25 01:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][940/1251] eta 0:11:27 lr 0.000135 time 2.2773 (2.2122) loss 3.2359 (3.1300) grad_norm 2.3157 (2.3670) [2022-01-25 01:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][950/1251] eta 0:11:05 lr 0.000135 time 1.8798 (2.2114) loss 3.3229 (3.1289) grad_norm 2.0882 (2.3662) [2022-01-25 01:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][960/1251] eta 0:10:42 lr 0.000134 time 1.9186 (2.2095) loss 3.5450 (3.1285) grad_norm 2.2698 (2.3656) [2022-01-25 01:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][970/1251] eta 0:10:20 lr 0.000134 time 2.8341 (2.2084) loss 3.5995 (3.1287) grad_norm 2.0439 (2.3640) [2022-01-25 01:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][980/1251] eta 0:09:58 lr 0.000134 time 1.9772 (2.2082) loss 3.2246 (3.1306) grad_norm 2.2962 (2.3674) [2022-01-25 01:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][990/1251] eta 0:09:36 lr 0.000134 time 2.5399 (2.2082) loss 3.5344 (3.1333) grad_norm 2.1845 (2.3663) [2022-01-25 01:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1000/1251] eta 0:09:14 lr 0.000134 time 2.2144 (2.2076) loss 3.6125 (3.1362) 
grad_norm 2.3013 (2.3656) [2022-01-25 01:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1010/1251] eta 0:08:51 lr 0.000134 time 2.7812 (2.2072) loss 3.6670 (3.1367) grad_norm 2.1145 (2.3658) [2022-01-25 01:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1020/1251] eta 0:08:29 lr 0.000134 time 2.4927 (2.2061) loss 3.3424 (3.1392) grad_norm 2.5969 (2.3656) [2022-01-25 01:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1030/1251] eta 0:08:07 lr 0.000134 time 2.1720 (2.2063) loss 3.1411 (3.1412) grad_norm 2.4793 (2.3666) [2022-01-25 01:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1040/1251] eta 0:07:45 lr 0.000134 time 2.4901 (2.2059) loss 3.0400 (3.1379) grad_norm 2.9234 (2.3666) [2022-01-25 01:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1050/1251] eta 0:07:23 lr 0.000134 time 2.1865 (2.2056) loss 2.6848 (3.1379) grad_norm 2.4959 (2.3669) [2022-01-25 01:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1060/1251] eta 0:07:01 lr 0.000134 time 2.1402 (2.2048) loss 3.0074 (3.1404) grad_norm 2.5194 (2.3658) [2022-01-25 01:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1070/1251] eta 0:06:39 lr 0.000134 time 2.7401 (2.2047) loss 3.6219 (3.1406) grad_norm 2.1154 (2.3653) [2022-01-25 01:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1080/1251] eta 0:06:16 lr 0.000134 time 2.2391 (2.2043) loss 3.5133 (3.1397) grad_norm 2.4466 (2.3642) [2022-01-25 01:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1090/1251] eta 0:05:54 lr 0.000134 time 2.4608 (2.2045) loss 2.6826 (3.1400) grad_norm 2.3792 (2.3645) [2022-01-25 01:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1100/1251] eta 0:05:32 lr 0.000134 time 1.9288 (2.2040) loss 3.0926 (3.1421) grad_norm 2.3712 (2.3645) [2022-01-25 01:45:32 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [230/300][1110/1251] eta 0:05:10 lr 0.000134 time 2.0992 (2.2043) loss 2.8556 (3.1403) grad_norm 2.2132 (2.3631) [2022-01-25 01:45:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1120/1251] eta 0:04:48 lr 0.000134 time 2.3187 (2.2034) loss 2.4061 (3.1395) grad_norm 2.6693 (2.3626) [2022-01-25 01:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1130/1251] eta 0:04:26 lr 0.000134 time 2.4947 (2.2031) loss 2.9301 (3.1377) grad_norm 2.5455 (2.3622) [2022-01-25 01:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1140/1251] eta 0:04:04 lr 0.000134 time 1.8191 (2.2027) loss 3.0307 (3.1393) grad_norm 2.3362 (2.3611) [2022-01-25 01:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1150/1251] eta 0:03:42 lr 0.000134 time 2.9603 (2.2023) loss 3.5177 (3.1403) grad_norm 2.1729 (2.3608) [2022-01-25 01:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1160/1251] eta 0:03:20 lr 0.000134 time 2.5966 (2.2024) loss 2.3328 (3.1418) grad_norm 2.1471 (2.3600) [2022-01-25 01:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1170/1251] eta 0:02:58 lr 0.000134 time 2.0410 (2.2014) loss 3.2738 (3.1431) grad_norm 2.2137 (2.3591) [2022-01-25 01:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1180/1251] eta 0:02:36 lr 0.000134 time 2.3610 (2.2015) loss 3.5594 (3.1442) grad_norm 2.3475 (2.3591) [2022-01-25 01:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1190/1251] eta 0:02:14 lr 0.000134 time 2.2042 (2.2015) loss 3.2812 (3.1464) grad_norm 2.0934 (2.3582) [2022-01-25 01:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1200/1251] eta 0:01:52 lr 0.000134 time 2.4187 (2.2019) loss 3.6204 (3.1469) grad_norm 2.9997 (2.3585) [2022-01-25 01:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1210/1251] eta 0:01:30 lr 0.000134 time 1.9012 (2.2017) loss 
2.5361 (3.1479) grad_norm 2.2681 (2.3580)
[2022-01-25 01:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1220/1251] eta 0:01:08 lr 0.000134 time 1.6295 (2.2024) loss 3.6546 (3.1487) grad_norm 2.2014 (2.3572)
[2022-01-25 01:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1230/1251] eta 0:00:46 lr 0.000134 time 2.1895 (2.2017) loss 3.3263 (3.1494) grad_norm 2.3556 (2.3566)
[2022-01-25 01:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1240/1251] eta 0:00:24 lr 0.000134 time 2.2258 (2.2012) loss 3.7425 (3.1490) grad_norm 2.2381 (2.3551)
[2022-01-25 01:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [230/300][1250/1251] eta 0:00:02 lr 0.000134 time 1.3514 (2.1953) loss 3.3165 (3.1510) grad_norm 2.2233 (2.3544)
[2022-01-25 01:50:29 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 230 training takes 0:45:46
[2022-01-25 01:50:29 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_230 saving......
[2022-01-25 01:50:41 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_230 saved !!!
[2022-01-25 01:50:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.392 (15.392) Loss 0.9035 (0.9035) Acc@1 78.125 (78.125) Acc@5 95.508 (95.508)
[2022-01-25 01:51:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.594 (2.701) Loss 0.7920 (0.8529) Acc@1 81.055 (79.954) Acc@5 95.996 (95.197)
[2022-01-25 01:51:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 3.737 (2.362) Loss 0.8335 (0.8602) Acc@1 78.809 (79.539) Acc@5 95.508 (95.043)
[2022-01-25 01:51:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.340 (2.178) Loss 0.8556 (0.8635) Acc@1 80.469 (79.672) Acc@5 95.410 (94.947)
[2022-01-25 01:52:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.185 (1.971) Loss 0.8234 (0.8583) Acc@1 79.492 (79.699) Acc@5 96.289 (95.029)
[2022-01-25 01:52:10 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.662 Acc@5 95.032
[2022-01-25 01:52:10 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.7%
[2022-01-25 01:52:10 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.69%
[2022-01-25 01:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][0/1251] eta 7:30:21 lr 0.000134 time 21.6000 (21.6000) loss 2.6835 (2.6835) grad_norm 2.2602 (2.2602)
[2022-01-25 01:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][10/1251] eta 1:26:04 lr 0.000134 time 2.8546 (4.1616) loss 3.6429 (3.1572) grad_norm 2.1971 (2.2318)
[2022-01-25 01:53:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][20/1251] eta 1:06:54 lr 0.000134 time 1.4694 (3.2612) loss 3.2075 (3.1906) grad_norm 2.2571 (2.2480)
[2022-01-25 01:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][30/1251] eta 0:58:19 lr 0.000134 time 1.5493 (2.8662) loss 3.2982 (3.1548) grad_norm 2.2781 (2.2652)
[2022-01-25 01:54:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train:
[231/300][40/1251] eta 0:55:43 lr 0.000134 time 3.7374 (2.7611) loss 2.9925 (3.1399) grad_norm 2.2631 (2.2603) [2022-01-25 01:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][50/1251] eta 0:53:49 lr 0.000134 time 3.1336 (2.6886) loss 3.4653 (3.1432) grad_norm 2.2264 (2.2775) [2022-01-25 01:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][60/1251] eta 0:51:24 lr 0.000134 time 1.7028 (2.5901) loss 3.6728 (3.2008) grad_norm 2.5240 (2.2918) [2022-01-25 01:55:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][70/1251] eta 0:49:18 lr 0.000134 time 1.7934 (2.5050) loss 3.3198 (3.2224) grad_norm 2.2242 (2.2944) [2022-01-25 01:55:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][80/1251] eta 0:48:04 lr 0.000133 time 3.2397 (2.4634) loss 2.6162 (3.2362) grad_norm 2.5794 (2.3119) [2022-01-25 01:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][90/1251] eta 0:47:04 lr 0.000133 time 2.7928 (2.4327) loss 2.8720 (3.2263) grad_norm 2.3925 (2.3203) [2022-01-25 01:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][100/1251] eta 0:46:12 lr 0.000133 time 2.1111 (2.4089) loss 3.4048 (3.2405) grad_norm 2.0369 (2.3214) [2022-01-25 01:56:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][110/1251] eta 0:45:39 lr 0.000133 time 1.5825 (2.4006) loss 2.1719 (3.2251) grad_norm 2.3496 (2.3196) [2022-01-25 01:57:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][120/1251] eta 0:45:12 lr 0.000133 time 3.1837 (2.3982) loss 2.7552 (3.2093) grad_norm 2.2571 (2.3281) [2022-01-25 01:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][130/1251] eta 0:45:02 lr 0.000133 time 3.1320 (2.4105) loss 3.4973 (3.2125) grad_norm 2.7298 (2.3300) [2022-01-25 01:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][140/1251] eta 0:44:09 lr 0.000133 time 1.8747 (2.3848) loss 3.5788 (3.2252) grad_norm 2.2050 
(2.3209) [2022-01-25 01:58:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][150/1251] eta 0:43:10 lr 0.000133 time 2.0219 (2.3526) loss 3.6001 (3.2307) grad_norm 2.3709 (2.3161) [2022-01-25 01:58:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][160/1251] eta 0:42:16 lr 0.000133 time 1.8455 (2.3250) loss 3.3522 (3.2337) grad_norm 2.3854 (2.3168) [2022-01-25 01:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][170/1251] eta 0:41:34 lr 0.000133 time 2.0544 (2.3079) loss 3.6912 (3.2293) grad_norm 2.2403 (2.3147) [2022-01-25 01:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][180/1251] eta 0:41:06 lr 0.000133 time 2.9182 (2.3026) loss 2.0806 (3.2217) grad_norm 2.1967 (2.3122) [2022-01-25 01:59:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][190/1251] eta 0:40:34 lr 0.000133 time 1.9697 (2.2946) loss 3.3335 (3.2253) grad_norm 2.4754 (2.3192) [2022-01-25 01:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][200/1251] eta 0:40:06 lr 0.000133 time 2.1625 (2.2901) loss 2.9564 (3.2317) grad_norm 2.1606 (2.3344) [2022-01-25 02:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][210/1251] eta 0:39:47 lr 0.000133 time 2.4748 (2.2933) loss 3.7422 (3.2275) grad_norm 2.1116 (2.3343) [2022-01-25 02:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][220/1251] eta 0:39:39 lr 0.000133 time 3.1237 (2.3081) loss 3.4823 (3.2276) grad_norm 2.6065 (2.3312) [2022-01-25 02:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][230/1251] eta 0:39:13 lr 0.000133 time 2.2602 (2.3052) loss 3.3116 (3.2299) grad_norm 2.2673 (2.3278) [2022-01-25 02:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][240/1251] eta 0:38:41 lr 0.000133 time 1.8767 (2.2958) loss 2.9179 (3.2180) grad_norm 2.2344 (2.3238) [2022-01-25 02:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[231/300][250/1251] eta 0:38:05 lr 0.000133 time 1.8867 (2.2834) loss 2.7389 (3.2062) grad_norm 2.0406 (2.3209) [2022-01-25 02:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][260/1251] eta 0:37:30 lr 0.000133 time 1.9790 (2.2713) loss 2.9141 (3.2058) grad_norm 2.1330 (2.3182) [2022-01-25 02:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][270/1251] eta 0:37:01 lr 0.000133 time 1.8514 (2.2646) loss 2.7133 (3.2001) grad_norm 2.4193 (2.3204) [2022-01-25 02:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][280/1251] eta 0:36:38 lr 0.000133 time 3.3307 (2.2639) loss 3.7579 (3.2014) grad_norm 2.2982 (2.3264) [2022-01-25 02:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][290/1251] eta 0:36:12 lr 0.000133 time 2.0968 (2.2607) loss 2.4587 (3.1876) grad_norm 2.5643 (2.3338) [2022-01-25 02:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][300/1251] eta 0:35:48 lr 0.000133 time 2.3283 (2.2596) loss 3.4755 (3.1861) grad_norm 2.3153 (2.3351) [2022-01-25 02:03:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][310/1251] eta 0:35:25 lr 0.000133 time 1.5969 (2.2592) loss 2.8968 (3.1831) grad_norm 2.8449 (2.3385) [2022-01-25 02:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][320/1251] eta 0:35:04 lr 0.000133 time 2.8647 (2.2603) loss 3.9066 (3.1850) grad_norm 2.2669 (2.3420) [2022-01-25 02:04:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][330/1251] eta 0:34:39 lr 0.000133 time 2.0421 (2.2576) loss 3.3326 (3.1816) grad_norm 2.0074 (2.3405) [2022-01-25 02:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][340/1251] eta 0:34:15 lr 0.000133 time 2.6920 (2.2566) loss 3.9335 (3.1808) grad_norm 2.2415 (2.3451) [2022-01-25 02:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][350/1251] eta 0:33:47 lr 0.000133 time 1.5873 (2.2508) loss 3.5744 (3.1744) grad_norm 
2.7395 (2.3456) [2022-01-25 02:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][360/1251] eta 0:33:18 lr 0.000133 time 2.0413 (2.2435) loss 3.3108 (3.1681) grad_norm 2.8635 (2.3461) [2022-01-25 02:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][370/1251] eta 0:32:55 lr 0.000133 time 2.1762 (2.2422) loss 2.9542 (3.1676) grad_norm 2.4241 (2.3466) [2022-01-25 02:06:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][380/1251] eta 0:32:30 lr 0.000133 time 2.3008 (2.2396) loss 3.4540 (3.1697) grad_norm 1.9916 (2.3467) [2022-01-25 02:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][390/1251] eta 0:32:07 lr 0.000133 time 1.7803 (2.2382) loss 3.3849 (3.1731) grad_norm 2.8849 (2.3466) [2022-01-25 02:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][400/1251] eta 0:31:49 lr 0.000133 time 2.1748 (2.2435) loss 3.3321 (3.1772) grad_norm 2.2808 (2.3436) [2022-01-25 02:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][410/1251] eta 0:31:28 lr 0.000133 time 2.0178 (2.2451) loss 3.6776 (3.1773) grad_norm 2.6718 (2.3437) [2022-01-25 02:07:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][420/1251] eta 0:31:04 lr 0.000133 time 1.9125 (2.2436) loss 2.5731 (3.1793) grad_norm 2.1255 (2.3414) [2022-01-25 02:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][430/1251] eta 0:30:37 lr 0.000133 time 1.9318 (2.2383) loss 2.9114 (3.1707) grad_norm 2.4576 (2.3423) [2022-01-25 02:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][440/1251] eta 0:30:11 lr 0.000132 time 2.2539 (2.2339) loss 3.8291 (3.1751) grad_norm 3.1721 (2.3424) [2022-01-25 02:08:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][450/1251] eta 0:29:46 lr 0.000132 time 1.9121 (2.2309) loss 3.3602 (3.1770) grad_norm 2.1384 (2.3431) [2022-01-25 02:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[231/300][460/1251] eta 0:29:24 lr 0.000132 time 1.9605 (2.2312) loss 3.4212 (3.1802) grad_norm 2.3503 (2.3461) [2022-01-25 02:09:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][470/1251] eta 0:29:03 lr 0.000132 time 1.8841 (2.2329) loss 2.6672 (3.1795) grad_norm 1.9866 (2.3436) [2022-01-25 02:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][480/1251] eta 0:28:39 lr 0.000132 time 1.6175 (2.2306) loss 3.1805 (3.1769) grad_norm 2.8064 (2.3472) [2022-01-25 02:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][490/1251] eta 0:28:16 lr 0.000132 time 1.9970 (2.2298) loss 3.4994 (3.1769) grad_norm 2.1156 (2.3458) [2022-01-25 02:10:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][500/1251] eta 0:27:52 lr 0.000132 time 2.1527 (2.2265) loss 3.7376 (3.1780) grad_norm 2.1204 (2.3460) [2022-01-25 02:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][510/1251] eta 0:27:26 lr 0.000132 time 1.6207 (2.2226) loss 2.9570 (3.1765) grad_norm 2.5641 (2.3496) [2022-01-25 02:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][520/1251] eta 0:27:03 lr 0.000132 time 2.1573 (2.2203) loss 3.4548 (3.1787) grad_norm 2.2286 (2.3506) [2022-01-25 02:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][530/1251] eta 0:26:40 lr 0.000132 time 2.0514 (2.2201) loss 3.5042 (3.1843) grad_norm 2.1952 (2.3487) [2022-01-25 02:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][540/1251] eta 0:26:20 lr 0.000132 time 2.7699 (2.2224) loss 4.1612 (3.1891) grad_norm 2.2371 (2.3455) [2022-01-25 02:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][550/1251] eta 0:25:58 lr 0.000132 time 2.0319 (2.2232) loss 3.3323 (3.1853) grad_norm 2.3182 (2.3430) [2022-01-25 02:12:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][560/1251] eta 0:25:37 lr 0.000132 time 2.4835 (2.2255) loss 3.6271 (3.1817) grad_norm 
2.3822 (2.3426) [2022-01-25 02:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][570/1251] eta 0:25:15 lr 0.000132 time 1.6623 (2.2258) loss 3.6487 (3.1774) grad_norm 2.7758 (2.3457) [2022-01-25 02:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][580/1251] eta 0:24:51 lr 0.000132 time 2.2831 (2.2232) loss 3.3826 (3.1754) grad_norm 2.0312 (2.3458) [2022-01-25 02:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][590/1251] eta 0:24:26 lr 0.000132 time 2.2198 (2.2191) loss 2.5230 (3.1730) grad_norm 2.2556 (2.3455) [2022-01-25 02:14:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][600/1251] eta 0:24:02 lr 0.000132 time 1.9896 (2.2153) loss 3.3961 (3.1755) grad_norm 2.2667 (2.3460) [2022-01-25 02:14:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][610/1251] eta 0:23:39 lr 0.000132 time 1.8788 (2.2146) loss 2.9822 (3.1786) grad_norm 2.1142 (2.3456) [2022-01-25 02:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][620/1251] eta 0:23:19 lr 0.000132 time 2.4569 (2.2177) loss 3.5197 (3.1811) grad_norm 2.3041 (2.3431) [2022-01-25 02:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][630/1251] eta 0:22:58 lr 0.000132 time 2.1619 (2.2200) loss 3.2474 (3.1780) grad_norm 2.2070 (2.3409) [2022-01-25 02:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][640/1251] eta 0:22:38 lr 0.000132 time 2.4181 (2.2226) loss 3.2915 (3.1776) grad_norm 1.9946 (2.3411) [2022-01-25 02:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][650/1251] eta 0:22:15 lr 0.000132 time 1.9231 (2.2224) loss 3.6075 (3.1811) grad_norm 2.1326 (2.3410) [2022-01-25 02:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][660/1251] eta 0:21:53 lr 0.000132 time 2.5383 (2.2220) loss 3.4868 (3.1835) grad_norm 2.3457 (2.3400) [2022-01-25 02:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[231/300][670/1251] eta 0:21:29 lr 0.000132 time 2.3898 (2.2199) loss 3.1593 (3.1806) grad_norm 2.4651 (2.3390) [2022-01-25 02:17:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][680/1251] eta 0:21:06 lr 0.000132 time 2.1920 (2.2181) loss 3.0453 (3.1766) grad_norm 2.0243 (2.3367) [2022-01-25 02:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][690/1251] eta 0:20:42 lr 0.000132 time 1.7106 (2.2154) loss 2.9947 (3.1743) grad_norm 1.9353 (2.3357) [2022-01-25 02:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][700/1251] eta 0:20:21 lr 0.000132 time 2.9594 (2.2165) loss 2.6936 (3.1732) grad_norm 2.6592 (2.3362) [2022-01-25 02:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][710/1251] eta 0:19:59 lr 0.000132 time 2.3710 (2.2170) loss 3.6207 (3.1755) grad_norm 2.5999 (2.3354) [2022-01-25 02:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][720/1251] eta 0:19:36 lr 0.000132 time 2.4161 (2.2160) loss 3.4777 (3.1773) grad_norm 2.4639 (2.3338) [2022-01-25 02:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][730/1251] eta 0:19:14 lr 0.000132 time 1.8620 (2.2161) loss 3.4761 (3.1818) grad_norm 2.0901 (2.3343) [2022-01-25 02:19:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][740/1251] eta 0:18:52 lr 0.000132 time 2.4726 (2.2167) loss 3.1394 (3.1781) grad_norm 2.3520 (2.3332) [2022-01-25 02:19:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][750/1251] eta 0:18:30 lr 0.000132 time 2.5489 (2.2173) loss 3.6022 (3.1767) grad_norm 2.4653 (2.3351) [2022-01-25 02:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][760/1251] eta 0:18:09 lr 0.000132 time 2.2513 (2.2193) loss 3.1902 (3.1768) grad_norm 2.3164 (2.3349) [2022-01-25 02:20:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][770/1251] eta 0:17:46 lr 0.000132 time 1.5917 (2.2178) loss 2.0115 (3.1777) grad_norm 
2.0892 (2.3355) [2022-01-25 02:21:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][780/1251] eta 0:17:23 lr 0.000132 time 1.6086 (2.2156) loss 3.0580 (3.1775) grad_norm 2.3165 (2.3398) [2022-01-25 02:21:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][790/1251] eta 0:17:00 lr 0.000132 time 2.1006 (2.2132) loss 3.4567 (3.1799) grad_norm 2.6137 (2.3416) [2022-01-25 02:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][800/1251] eta 0:16:36 lr 0.000132 time 2.1240 (2.2104) loss 3.3811 (3.1820) grad_norm 2.1548 (2.3416) [2022-01-25 02:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][810/1251] eta 0:16:13 lr 0.000131 time 1.9246 (2.2083) loss 3.2851 (3.1839) grad_norm 1.9768 (2.3410) [2022-01-25 02:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][820/1251] eta 0:15:51 lr 0.000131 time 2.4649 (2.2078) loss 3.7728 (3.1806) grad_norm 2.3520 (2.3413) [2022-01-25 02:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][830/1251] eta 0:15:29 lr 0.000131 time 1.6999 (2.2077) loss 2.8101 (3.1805) grad_norm 2.4037 (2.3415) [2022-01-25 02:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][840/1251] eta 0:15:09 lr 0.000131 time 2.8764 (2.2120) loss 3.6308 (3.1825) grad_norm 2.0231 (2.3415) [2022-01-25 02:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][850/1251] eta 0:14:47 lr 0.000131 time 1.8825 (2.2129) loss 3.4280 (3.1817) grad_norm 1.9274 (2.3406) [2022-01-25 02:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][860/1251] eta 0:14:25 lr 0.000131 time 2.0889 (2.2135) loss 3.3122 (3.1792) grad_norm 2.1459 (2.3402) [2022-01-25 02:24:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][870/1251] eta 0:14:02 lr 0.000131 time 1.9032 (2.2122) loss 3.8197 (3.1779) grad_norm 2.3051 (2.3393) [2022-01-25 02:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[231/300][880/1251] eta 0:13:40 lr 0.000131 time 2.5571 (2.2108) loss 3.9839 (3.1798) grad_norm 2.6605 (2.3414) [2022-01-25 02:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][890/1251] eta 0:13:17 lr 0.000131 time 1.9506 (2.2085) loss 2.9718 (3.1758) grad_norm 2.2972 (2.3427) [2022-01-25 02:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][900/1251] eta 0:12:54 lr 0.000131 time 2.2518 (2.2074) loss 3.1819 (3.1756) grad_norm 2.0214 (2.3422) [2022-01-25 02:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][910/1251] eta 0:12:32 lr 0.000131 time 1.8476 (2.2058) loss 3.4662 (3.1744) grad_norm 2.2797 (2.3421) [2022-01-25 02:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][920/1251] eta 0:12:10 lr 0.000131 time 2.6396 (2.2070) loss 2.6342 (3.1741) grad_norm 2.2189 (2.3426) [2022-01-25 02:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][930/1251] eta 0:11:48 lr 0.000131 time 2.2667 (2.2066) loss 3.3054 (3.1710) grad_norm 2.2438 (2.3415) [2022-01-25 02:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][940/1251] eta 0:11:26 lr 0.000131 time 2.4552 (2.2076) loss 3.5755 (3.1717) grad_norm 2.3016 (2.3425) [2022-01-25 02:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][950/1251] eta 0:11:04 lr 0.000131 time 1.5012 (2.2078) loss 3.3486 (3.1733) grad_norm 2.1969 (2.3420) [2022-01-25 02:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][960/1251] eta 0:10:42 lr 0.000131 time 3.4534 (2.2079) loss 2.8374 (3.1711) grad_norm 2.1363 (2.3418) [2022-01-25 02:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][970/1251] eta 0:10:20 lr 0.000131 time 1.6533 (2.2074) loss 2.7289 (3.1681) grad_norm 2.2580 (2.3412) [2022-01-25 02:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][980/1251] eta 0:09:58 lr 0.000131 time 1.5832 (2.2081) loss 2.4541 (3.1695) grad_norm 
2.2344 (2.3409) [2022-01-25 02:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][990/1251] eta 0:09:36 lr 0.000131 time 1.9067 (2.2082) loss 3.8542 (3.1702) grad_norm 2.4269 (2.3413) [2022-01-25 02:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1000/1251] eta 0:09:14 lr 0.000131 time 3.3488 (2.2090) loss 3.3000 (3.1710) grad_norm 2.1824 (2.3407) [2022-01-25 02:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1010/1251] eta 0:08:51 lr 0.000131 time 1.6743 (2.2072) loss 3.3228 (3.1697) grad_norm 2.2363 (2.3401) [2022-01-25 02:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1020/1251] eta 0:08:29 lr 0.000131 time 1.8077 (2.2055) loss 2.8554 (3.1673) grad_norm 2.3203 (2.3399) [2022-01-25 02:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1030/1251] eta 0:08:07 lr 0.000131 time 1.6429 (2.2055) loss 2.3403 (3.1683) grad_norm 2.4486 (2.3399) [2022-01-25 02:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1040/1251] eta 0:07:45 lr 0.000131 time 3.6756 (2.2076) loss 2.8030 (3.1666) grad_norm 2.7631 (2.3403) [2022-01-25 02:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1050/1251] eta 0:07:23 lr 0.000131 time 1.9105 (2.2071) loss 3.7053 (3.1639) grad_norm 2.5744 (2.3406) [2022-01-25 02:31:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1060/1251] eta 0:07:01 lr 0.000131 time 1.8614 (2.2070) loss 3.2727 (3.1644) grad_norm 2.5839 (2.3399) [2022-01-25 02:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1070/1251] eta 0:06:39 lr 0.000131 time 1.6777 (2.2056) loss 3.3981 (3.1670) grad_norm 2.1506 (2.3394) [2022-01-25 02:31:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1080/1251] eta 0:06:17 lr 0.000131 time 3.5355 (2.2067) loss 2.1425 (3.1657) grad_norm 2.4922 (2.3394) [2022-01-25 02:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [231/300][1090/1251] eta 0:05:55 lr 0.000131 time 1.9889 (2.2063) loss 2.8110 (3.1652) grad_norm 2.2963 (2.3396) [2022-01-25 02:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1100/1251] eta 0:05:33 lr 0.000131 time 1.9930 (2.2054) loss 2.5780 (3.1651) grad_norm 2.1692 (2.3396) [2022-01-25 02:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1110/1251] eta 0:05:10 lr 0.000131 time 2.2100 (2.2036) loss 3.1602 (3.1635) grad_norm 2.2813 (2.3383) [2022-01-25 02:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1120/1251] eta 0:04:48 lr 0.000131 time 2.5838 (2.2044) loss 3.5904 (3.1662) grad_norm 2.6559 (2.3375) [2022-01-25 02:33:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1130/1251] eta 0:04:26 lr 0.000131 time 3.0961 (2.2045) loss 3.3240 (3.1671) grad_norm 2.1866 (2.3367) [2022-01-25 02:34:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1140/1251] eta 0:04:04 lr 0.000131 time 1.5452 (2.2049) loss 3.8056 (3.1674) grad_norm 1.9799 (2.3363) [2022-01-25 02:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1150/1251] eta 0:03:42 lr 0.000131 time 1.8766 (2.2039) loss 3.8715 (3.1680) grad_norm 2.3335 (2.3364) [2022-01-25 02:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1160/1251] eta 0:03:20 lr 0.000131 time 1.6121 (2.2029) loss 3.0037 (3.1693) grad_norm 2.3417 (2.3355) [2022-01-25 02:35:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1170/1251] eta 0:02:58 lr 0.000131 time 1.6143 (2.2022) loss 2.6898 (3.1691) grad_norm 2.1518 (2.3354) [2022-01-25 02:35:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1180/1251] eta 0:02:36 lr 0.000130 time 1.9065 (2.2035) loss 3.2808 (3.1694) grad_norm 2.4071 (2.3369) [2022-01-25 02:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1190/1251] eta 0:02:14 lr 0.000130 time 1.8960 (2.2035) loss 3.3747 
(3.1696) grad_norm 2.4506 (2.3378) [2022-01-25 02:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1200/1251] eta 0:01:52 lr 0.000130 time 2.2493 (2.2043) loss 3.2994 (3.1713) grad_norm 2.7377 (2.3380) [2022-01-25 02:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1210/1251] eta 0:01:30 lr 0.000130 time 1.9267 (2.2042) loss 3.3978 (3.1711) grad_norm 2.2668 (2.3379) [2022-01-25 02:37:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1220/1251] eta 0:01:08 lr 0.000130 time 2.2238 (2.2045) loss 3.5116 (3.1697) grad_norm 2.2915 (2.3384) [2022-01-25 02:37:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1230/1251] eta 0:00:46 lr 0.000130 time 1.5707 (2.2023) loss 3.2543 (3.1718) grad_norm 2.0692 (2.3386) [2022-01-25 02:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1240/1251] eta 0:00:24 lr 0.000130 time 1.3007 (2.2002) loss 3.4329 (3.1722) grad_norm 1.8928 (2.3375) [2022-01-25 02:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [231/300][1250/1251] eta 0:00:02 lr 0.000130 time 1.1903 (2.1944) loss 3.3331 (3.1742) grad_norm 2.4322 (2.3369) [2022-01-25 02:37:55 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 231 training takes 0:45:45 [2022-01-25 02:38:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.460 (18.460) Loss 0.8310 (0.8310) Acc@1 79.004 (79.004) Acc@5 95.605 (95.605) [2022-01-25 02:38:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.198 (3.426) Loss 0.9158 (0.8703) Acc@1 78.320 (79.492) Acc@5 94.238 (94.984) [2022-01-25 02:38:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.530 (2.626) Loss 0.8025 (0.8621) Acc@1 80.273 (79.636) Acc@5 95.605 (95.085) [2022-01-25 02:39:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.646 (2.232) Loss 0.8866 (0.8668) Acc@1 78.125 (79.543) Acc@5 94.922 (94.994) [2022-01-25 02:39:25 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.365 (2.200) Loss 0.8401 (0.8640) Acc@1 78.027 (79.509) Acc@5 95.117 (95.020) [2022-01-25 02:39:33 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.520 Acc@5 95.050 [2022-01-25 02:39:33 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.5% [2022-01-25 02:39:33 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.69% [2022-01-25 02:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][0/1251] eta 7:25:09 lr 0.000130 time 21.3503 (21.3503) loss 3.5116 (3.5116) grad_norm 2.1936 (2.1936) [2022-01-25 02:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][10/1251] eta 1:20:07 lr 0.000130 time 1.8982 (3.8743) loss 3.4772 (3.2882) grad_norm 2.0999 (2.2972) [2022-01-25 02:40:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][20/1251] eta 1:03:32 lr 0.000130 time 1.4587 (3.0972) loss 3.5401 (3.2763) grad_norm 2.0306 (2.3193) [2022-01-25 02:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][30/1251] eta 0:55:43 lr 0.000130 time 1.3952 (2.7383) loss 3.9275 (3.3194) grad_norm 2.5227 (2.3418) [2022-01-25 02:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][40/1251] eta 0:52:46 lr 0.000130 time 3.3469 (2.6146) loss 3.3869 (3.2930) grad_norm 2.0565 (2.3565) [2022-01-25 02:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][50/1251] eta 0:51:33 lr 0.000130 time 2.6126 (2.5756) loss 2.9199 (3.2509) grad_norm 2.3377 (2.3595) [2022-01-25 02:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][60/1251] eta 0:50:02 lr 0.000130 time 1.4730 (2.5211) loss 3.9121 (3.2411) grad_norm 2.4281 (2.3357) [2022-01-25 02:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][70/1251] eta 0:48:55 lr 0.000130 time 1.5779 (2.4854) loss 3.1796 (3.2122) grad_norm 2.4330 (2.3800) [2022-01-25 02:42:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][80/1251] eta 0:48:18 lr 0.000130 time 2.5833 (2.4749) loss 3.2902 (3.2157) grad_norm 2.7044 (2.3800) [2022-01-25 02:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][90/1251] eta 0:47:29 lr 0.000130 time 1.8815 (2.4546) loss 3.2409 (3.2180) grad_norm 2.0108 (2.3888) [2022-01-25 02:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][100/1251] eta 0:46:19 lr 0.000130 time 1.5751 (2.4145) loss 3.0085 (3.2067) grad_norm 2.6577 (2.3884) [2022-01-25 02:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][110/1251] eta 0:45:20 lr 0.000130 time 2.0644 (2.3841) loss 3.0039 (3.1934) grad_norm 1.9449 (2.3770) [2022-01-25 02:44:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][120/1251] eta 0:44:40 lr 0.000130 time 1.8798 (2.3704) loss 3.3899 (3.1922) grad_norm 2.6692 (2.3712) [2022-01-25 02:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][130/1251] eta 0:44:37 lr 0.000130 time 3.7036 (2.3885) loss 2.6465 (3.1753) grad_norm 2.3547 (2.3561) [2022-01-25 02:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][140/1251] eta 0:44:11 lr 0.000130 time 2.3440 (2.3863) loss 3.6237 (3.1749) grad_norm 2.1028 (2.3583) [2022-01-25 02:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][150/1251] eta 0:43:34 lr 0.000130 time 1.8591 (2.3748) loss 2.6761 (3.1643) grad_norm 2.1767 (2.3562) [2022-01-25 02:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][160/1251] eta 0:43:07 lr 0.000130 time 2.1391 (2.3720) loss 2.5807 (3.1417) grad_norm 2.1521 (2.3485) [2022-01-25 02:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][170/1251] eta 0:42:17 lr 0.000130 time 1.6236 (2.3469) loss 3.1666 (3.1351) grad_norm 2.2421 (2.3412) [2022-01-25 02:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][180/1251] eta 0:41:30 lr 0.000130 
time 1.7733 (2.3257) loss 3.7232 (3.1320) grad_norm 2.2055 (2.3366) [2022-01-25 02:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][190/1251] eta 0:40:56 lr 0.000130 time 2.1834 (2.3154) loss 3.6767 (3.1413) grad_norm 2.1535 (2.3302) [2022-01-25 02:47:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][200/1251] eta 0:40:30 lr 0.000130 time 2.6147 (2.3123) loss 3.7333 (3.1483) grad_norm 2.2068 (2.3302) [2022-01-25 02:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][210/1251] eta 0:40:08 lr 0.000130 time 2.2589 (2.3135) loss 2.1726 (3.1471) grad_norm 2.2938 (2.3310) [2022-01-25 02:48:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][220/1251] eta 0:39:47 lr 0.000130 time 2.7059 (2.3153) loss 3.3065 (3.1498) grad_norm 2.1722 (2.3337) [2022-01-25 02:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][230/1251] eta 0:39:23 lr 0.000130 time 2.5234 (2.3151) loss 3.6494 (3.1513) grad_norm 2.2184 (2.3364) [2022-01-25 02:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][240/1251] eta 0:38:53 lr 0.000130 time 1.8988 (2.3083) loss 3.2246 (3.1507) grad_norm 2.3225 (2.3340) [2022-01-25 02:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][250/1251] eta 0:38:15 lr 0.000130 time 1.9290 (2.2937) loss 3.7878 (3.1579) grad_norm 2.1430 (2.3310) [2022-01-25 02:49:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][260/1251] eta 0:37:44 lr 0.000130 time 1.8625 (2.2855) loss 2.1399 (3.1633) grad_norm 2.0773 (2.3321) [2022-01-25 02:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][270/1251] eta 0:37:16 lr 0.000130 time 1.8280 (2.2794) loss 3.6163 (3.1629) grad_norm 2.1092 (2.3274) [2022-01-25 02:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][280/1251] eta 0:36:58 lr 0.000130 time 2.7347 (2.2847) loss 2.7938 (3.1627) grad_norm 2.0640 (2.3256) [2022-01-25 02:50:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][290/1251] eta 0:36:31 lr 0.000130 time 1.8620 (2.2802) loss 3.4863 (3.1549) grad_norm 2.1501 (2.3255) [2022-01-25 02:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][300/1251] eta 0:36:06 lr 0.000129 time 2.0262 (2.2780) loss 2.3564 (3.1596) grad_norm 2.1040 (2.3245) [2022-01-25 02:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][310/1251] eta 0:35:46 lr 0.000129 time 2.5730 (2.2813) loss 3.1036 (3.1506) grad_norm 2.6651 (2.3246) [2022-01-25 02:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][320/1251] eta 0:35:22 lr 0.000129 time 2.3328 (2.2800) loss 3.0210 (3.1460) grad_norm 2.8062 (2.3262) [2022-01-25 02:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][330/1251] eta 0:34:53 lr 0.000129 time 2.2535 (2.2732) loss 3.6939 (3.1488) grad_norm 2.2697 (2.3292) [2022-01-25 02:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][340/1251] eta 0:34:21 lr 0.000129 time 2.1006 (2.2633) loss 3.3114 (3.1537) grad_norm 2.0627 (2.3300) [2022-01-25 02:52:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][350/1251] eta 0:33:53 lr 0.000129 time 1.8436 (2.2570) loss 2.2573 (3.1501) grad_norm 2.7443 (2.3321) [2022-01-25 02:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][360/1251] eta 0:33:27 lr 0.000129 time 1.9059 (2.2534) loss 2.7309 (3.1511) grad_norm 2.5814 (2.3336) [2022-01-25 02:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][370/1251] eta 0:33:01 lr 0.000129 time 1.9626 (2.2489) loss 3.3930 (3.1556) grad_norm 2.7214 (2.3372) [2022-01-25 02:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][380/1251] eta 0:32:39 lr 0.000129 time 2.8729 (2.2493) loss 3.1343 (3.1559) grad_norm 2.5339 (2.3393) [2022-01-25 02:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][390/1251] eta 0:32:22 lr 
0.000129 time 2.5699 (2.2556) loss 2.2068 (3.1529) grad_norm 2.1485 (2.3386) [2022-01-25 02:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][400/1251] eta 0:31:58 lr 0.000129 time 1.8567 (2.2548) loss 2.8635 (3.1545) grad_norm 2.3409 (2.3372) [2022-01-25 02:54:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][410/1251] eta 0:31:35 lr 0.000129 time 1.5941 (2.2539) loss 3.3858 (3.1519) grad_norm 2.4627 (2.3354) [2022-01-25 02:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][420/1251] eta 0:31:11 lr 0.000129 time 2.1539 (2.2523) loss 3.0915 (3.1582) grad_norm 2.3912 (2.3366) [2022-01-25 02:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][430/1251] eta 0:30:45 lr 0.000129 time 2.0965 (2.2484) loss 3.1985 (3.1586) grad_norm 2.0122 (2.3340) [2022-01-25 02:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][440/1251] eta 0:30:22 lr 0.000129 time 1.5345 (2.2477) loss 2.5093 (3.1513) grad_norm 2.1926 (2.3304) [2022-01-25 02:56:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][450/1251] eta 0:29:59 lr 0.000129 time 1.7328 (2.2463) loss 3.1990 (3.1516) grad_norm 2.0072 (2.3280) [2022-01-25 02:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][460/1251] eta 0:29:36 lr 0.000129 time 1.8519 (2.2455) loss 3.4190 (3.1545) grad_norm 2.2338 (2.3263) [2022-01-25 02:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][470/1251] eta 0:29:13 lr 0.000129 time 1.6283 (2.2450) loss 3.5463 (3.1550) grad_norm 2.2866 (2.3235) [2022-01-25 02:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][480/1251] eta 0:28:49 lr 0.000129 time 1.8446 (2.2437) loss 2.2388 (3.1500) grad_norm 2.9000 (2.3242) [2022-01-25 02:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][490/1251] eta 0:28:29 lr 0.000129 time 2.7218 (2.2461) loss 3.8033 (3.1507) grad_norm 2.3552 (2.3242) [2022-01-25 02:58:17 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][500/1251] eta 0:28:04 lr 0.000129 time 1.9122 (2.2430) loss 3.1849 (3.1523) grad_norm 2.4384 (2.3256) [2022-01-25 02:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][510/1251] eta 0:27:40 lr 0.000129 time 1.6794 (2.2414) loss 2.5542 (3.1507) grad_norm 2.7710 (2.3270) [2022-01-25 02:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][520/1251] eta 0:27:16 lr 0.000129 time 2.1822 (2.2394) loss 3.2961 (3.1567) grad_norm 2.4028 (2.3255) [2022-01-25 02:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][530/1251] eta 0:26:53 lr 0.000129 time 2.2512 (2.2378) loss 3.3615 (3.1579) grad_norm 2.2703 (2.3238) [2022-01-25 02:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][540/1251] eta 0:26:31 lr 0.000129 time 2.2370 (2.2385) loss 3.4406 (3.1604) grad_norm 2.0998 (2.3222) [2022-01-25 03:00:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][550/1251] eta 0:26:11 lr 0.000129 time 2.1310 (2.2413) loss 3.8957 (3.1618) grad_norm 2.3758 (2.3234) [2022-01-25 03:00:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][560/1251] eta 0:25:48 lr 0.000129 time 1.9486 (2.2415) loss 3.2895 (3.1578) grad_norm 2.3326 (2.3254) [2022-01-25 03:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][570/1251] eta 0:25:23 lr 0.000129 time 1.8255 (2.2378) loss 2.7058 (3.1528) grad_norm 2.5497 (2.3278) [2022-01-25 03:01:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][580/1251] eta 0:24:58 lr 0.000129 time 1.7244 (2.2327) loss 3.3609 (3.1546) grad_norm 2.0650 (2.3278) [2022-01-25 03:01:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][590/1251] eta 0:24:36 lr 0.000129 time 1.9303 (2.2340) loss 3.3652 (3.1524) grad_norm 3.3477 (2.3353) [2022-01-25 03:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][600/1251] eta 0:24:14 lr 
0.000129 time 2.5009 (2.2346) loss 3.8720 (3.1542) grad_norm 2.3465 (2.3387) [2022-01-25 03:02:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][610/1251] eta 0:23:51 lr 0.000129 time 1.8703 (2.2338) loss 3.5099 (3.1551) grad_norm 2.5339 (2.3403) [2022-01-25 03:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][620/1251] eta 0:23:28 lr 0.000129 time 1.9019 (2.2326) loss 2.6357 (3.1533) grad_norm 2.4416 (2.3403) [2022-01-25 03:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][630/1251] eta 0:23:07 lr 0.000129 time 2.2773 (2.2351) loss 3.4527 (3.1514) grad_norm 2.3618 (2.3411) [2022-01-25 03:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][640/1251] eta 0:22:44 lr 0.000129 time 2.0613 (2.2330) loss 3.3580 (3.1527) grad_norm 2.1059 (2.3406) [2022-01-25 03:03:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][650/1251] eta 0:22:20 lr 0.000129 time 1.5812 (2.2304) loss 3.1411 (3.1572) grad_norm 2.4608 (2.3405) [2022-01-25 03:04:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][660/1251] eta 0:21:57 lr 0.000129 time 2.4755 (2.2285) loss 3.0674 (3.1594) grad_norm 2.2404 (2.3406) [2022-01-25 03:04:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][670/1251] eta 0:21:33 lr 0.000128 time 1.6232 (2.2266) loss 3.8153 (3.1619) grad_norm 2.2034 (2.3402) [2022-01-25 03:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][680/1251] eta 0:21:12 lr 0.000128 time 2.1461 (2.2280) loss 3.2127 (3.1598) grad_norm 2.4495 (2.3402) [2022-01-25 03:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][690/1251] eta 0:20:49 lr 0.000128 time 1.8083 (2.2276) loss 3.5427 (3.1617) grad_norm 2.3782 (2.3393) [2022-01-25 03:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][700/1251] eta 0:20:27 lr 0.000128 time 2.4512 (2.2272) loss 3.5361 (3.1657) grad_norm 2.3131 (2.3384) [2022-01-25 03:05:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][710/1251] eta 0:20:04 lr 0.000128 time 1.5852 (2.2272) loss 3.7008 (3.1603) grad_norm 2.5382 (2.3384) [2022-01-25 03:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][720/1251] eta 0:19:42 lr 0.000128 time 1.9083 (2.2265) loss 3.4243 (3.1623) grad_norm 2.2656 (2.3377) [2022-01-25 03:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][730/1251] eta 0:19:20 lr 0.000128 time 1.8954 (2.2278) loss 2.6366 (3.1619) grad_norm 2.2557 (2.3387) [2022-01-25 03:07:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][740/1251] eta 0:18:58 lr 0.000128 time 2.2186 (2.2276) loss 3.4314 (3.1605) grad_norm 2.2142 (2.3380) [2022-01-25 03:07:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][750/1251] eta 0:18:34 lr 0.000128 time 1.8727 (2.2237) loss 3.5657 (3.1616) grad_norm 2.3725 (2.3382) [2022-01-25 03:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][760/1251] eta 0:18:10 lr 0.000128 time 1.8196 (2.2202) loss 3.5147 (3.1631) grad_norm 2.6215 (2.3389) [2022-01-25 03:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][770/1251] eta 0:17:46 lr 0.000128 time 1.9190 (2.2169) loss 2.8289 (3.1623) grad_norm 2.4011 (2.3392) [2022-01-25 03:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][780/1251] eta 0:17:24 lr 0.000128 time 2.2465 (2.2176) loss 1.9695 (3.1601) grad_norm 2.2161 (2.3421) [2022-01-25 03:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][790/1251] eta 0:17:02 lr 0.000128 time 2.0489 (2.2176) loss 2.7810 (3.1606) grad_norm 2.2326 (2.3438) [2022-01-25 03:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][800/1251] eta 0:16:40 lr 0.000128 time 2.4574 (2.2194) loss 3.3370 (3.1608) grad_norm 1.8720 (2.3448) [2022-01-25 03:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][810/1251] eta 0:16:20 lr 
0.000128 time 3.2195 (2.2234) loss 2.7120 (3.1604) grad_norm 2.5750 (2.3441) [2022-01-25 03:09:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][820/1251] eta 0:15:58 lr 0.000128 time 1.7229 (2.2236) loss 3.0572 (3.1607) grad_norm 2.3727 (2.3431) [2022-01-25 03:10:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][830/1251] eta 0:15:35 lr 0.000128 time 1.6520 (2.2226) loss 3.2918 (3.1612) grad_norm 2.1674 (2.3432) [2022-01-25 03:10:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][840/1251] eta 0:15:12 lr 0.000128 time 1.7464 (2.2208) loss 2.3598 (3.1580) grad_norm 2.0770 (2.3413) [2022-01-25 03:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][850/1251] eta 0:14:50 lr 0.000128 time 3.3511 (2.2218) loss 3.2300 (3.1554) grad_norm 2.1481 (2.3423) [2022-01-25 03:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][860/1251] eta 0:14:28 lr 0.000128 time 1.5970 (2.2208) loss 3.4105 (3.1562) grad_norm 2.2202 (2.3412) [2022-01-25 03:11:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][870/1251] eta 0:14:06 lr 0.000128 time 2.2863 (2.2209) loss 3.0608 (3.1537) grad_norm 2.2576 (2.3392) [2022-01-25 03:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][880/1251] eta 0:13:43 lr 0.000128 time 1.9187 (2.2207) loss 3.4761 (3.1505) grad_norm 2.1955 (2.3386) [2022-01-25 03:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][890/1251] eta 0:13:21 lr 0.000128 time 3.3738 (2.2202) loss 2.5426 (3.1488) grad_norm 2.2748 (2.3396) [2022-01-25 03:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][900/1251] eta 0:12:58 lr 0.000128 time 1.8556 (2.2178) loss 2.2351 (3.1491) grad_norm 2.5204 (2.3405) [2022-01-25 03:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][910/1251] eta 0:12:36 lr 0.000128 time 2.2307 (2.2172) loss 3.1294 (3.1478) grad_norm 2.6672 (2.3428) [2022-01-25 03:13:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][920/1251] eta 0:12:13 lr 0.000128 time 2.1954 (2.2168) loss 2.9227 (3.1470) grad_norm 2.0979 (2.3431) [2022-01-25 03:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][930/1251] eta 0:11:51 lr 0.000128 time 2.5428 (2.2167) loss 3.8037 (3.1487) grad_norm 2.6106 (2.3446) [2022-01-25 03:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][940/1251] eta 0:11:29 lr 0.000128 time 2.3766 (2.2162) loss 3.4278 (3.1517) grad_norm 2.3786 (2.3486) [2022-01-25 03:14:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][950/1251] eta 0:11:07 lr 0.000128 time 2.6894 (2.2176) loss 3.4886 (3.1534) grad_norm 2.3920 (2.3499) [2022-01-25 03:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][960/1251] eta 0:10:45 lr 0.000128 time 2.1916 (2.2171) loss 3.9640 (3.1529) grad_norm 2.3770 (2.3483) [2022-01-25 03:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][970/1251] eta 0:10:22 lr 0.000128 time 2.6818 (2.2166) loss 2.7066 (3.1513) grad_norm 2.2724 (2.3470) [2022-01-25 03:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][980/1251] eta 0:10:00 lr 0.000128 time 1.6010 (2.2142) loss 3.4074 (3.1534) grad_norm 2.1194 (2.3474) [2022-01-25 03:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][990/1251] eta 0:09:37 lr 0.000128 time 2.9369 (2.2136) loss 3.2305 (3.1549) grad_norm 2.1280 (2.3457) [2022-01-25 03:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1000/1251] eta 0:09:15 lr 0.000128 time 2.8148 (2.2127) loss 3.5281 (3.1546) grad_norm 2.1521 (2.3448) [2022-01-25 03:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1010/1251] eta 0:08:53 lr 0.000128 time 2.6443 (2.2143) loss 3.7641 (3.1560) grad_norm 2.0724 (2.3452) [2022-01-25 03:17:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1020/1251] eta 0:08:31 lr 
0.000128 time 1.8141 (2.2135) loss 2.6170 (3.1559) grad_norm 2.1820 (2.3458) [2022-01-25 03:17:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1030/1251] eta 0:08:09 lr 0.000128 time 2.0235 (2.2136) loss 3.1403 (3.1561) grad_norm 2.2280 (2.3442) [2022-01-25 03:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1040/1251] eta 0:07:47 lr 0.000127 time 2.9767 (2.2142) loss 3.5192 (3.1573) grad_norm 2.9718 (2.3459) [2022-01-25 03:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1050/1251] eta 0:07:25 lr 0.000127 time 2.1949 (2.2141) loss 3.4617 (3.1558) grad_norm 2.3837 (2.3468) [2022-01-25 03:18:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1060/1251] eta 0:07:02 lr 0.000127 time 2.4633 (2.2136) loss 2.3454 (3.1566) grad_norm 2.3824 (2.3478) [2022-01-25 03:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1070/1251] eta 0:06:40 lr 0.000127 time 1.6342 (2.2127) loss 3.3330 (3.1569) grad_norm 2.4446 (2.3484) [2022-01-25 03:19:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1080/1251] eta 0:06:18 lr 0.000127 time 1.5619 (2.2113) loss 2.0831 (3.1573) grad_norm 2.3593 (2.3482) [2022-01-25 03:19:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1090/1251] eta 0:05:56 lr 0.000127 time 3.5627 (2.2132) loss 2.9036 (3.1577) grad_norm 2.4867 (2.3478) [2022-01-25 03:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1100/1251] eta 0:05:34 lr 0.000127 time 1.8486 (2.2123) loss 3.2581 (3.1543) grad_norm 2.7183 (2.3472) [2022-01-25 03:20:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1110/1251] eta 0:05:11 lr 0.000127 time 1.8178 (2.2122) loss 3.5947 (3.1541) grad_norm 2.2715 (2.3467) [2022-01-25 03:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1120/1251] eta 0:04:49 lr 0.000127 time 1.7355 (2.2129) loss 3.5755 (3.1556) grad_norm 2.3394 (2.3473) [2022-01-25 
03:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1130/1251] eta 0:04:27 lr 0.000127 time 2.4581 (2.2125) loss 2.3379 (3.1543) grad_norm 2.4511 (2.3479) [2022-01-25 03:21:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1140/1251] eta 0:04:05 lr 0.000127 time 2.2277 (2.2111) loss 3.3687 (3.1542) grad_norm 2.0052 (2.3471) [2022-01-25 03:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1150/1251] eta 0:03:43 lr 0.000127 time 1.6082 (2.2091) loss 3.5557 (3.1548) grad_norm 2.3229 (2.3471) [2022-01-25 03:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1160/1251] eta 0:03:21 lr 0.000127 time 2.5536 (2.2098) loss 2.7727 (3.1539) grad_norm 2.2945 (2.3473) [2022-01-25 03:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1170/1251] eta 0:02:58 lr 0.000127 time 2.0170 (2.2096) loss 3.2814 (3.1522) grad_norm 2.0856 (2.3467) [2022-01-25 03:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1180/1251] eta 0:02:36 lr 0.000127 time 2.2641 (2.2097) loss 2.0894 (3.1504) grad_norm 2.5530 (2.3472) [2022-01-25 03:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1190/1251] eta 0:02:14 lr 0.000127 time 1.8306 (2.2090) loss 2.0510 (3.1495) grad_norm 2.3442 (2.3472) [2022-01-25 03:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1200/1251] eta 0:01:52 lr 0.000127 time 2.1374 (2.2090) loss 3.3711 (3.1503) grad_norm 2.0957 (2.3473) [2022-01-25 03:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1210/1251] eta 0:01:30 lr 0.000127 time 2.0357 (2.2082) loss 3.2355 (3.1516) grad_norm 2.5313 (2.3464) [2022-01-25 03:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1220/1251] eta 0:01:08 lr 0.000127 time 2.0303 (2.2078) loss 3.9119 (3.1520) grad_norm 2.3044 (2.3456) [2022-01-25 03:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1230/1251] 
eta 0:00:46 lr 0.000127 time 1.8018 (2.2082) loss 3.4686 (3.1511) grad_norm 2.0803 (2.3450) [2022-01-25 03:25:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1240/1251] eta 0:00:24 lr 0.000127 time 1.5308 (2.2076) loss 3.4203 (3.1518) grad_norm 2.6616 (2.3460) [2022-01-25 03:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [232/300][1250/1251] eta 0:00:02 lr 0.000127 time 1.1946 (2.2019) loss 3.7425 (3.1545) grad_norm 2.4328 (2.3458) [2022-01-25 03:25:28 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 232 training takes 0:45:55 [2022-01-25 03:25:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.731 (18.731) Loss 0.8285 (0.8285) Acc@1 80.957 (80.957) Acc@5 96.191 (96.191) [2022-01-25 03:26:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.257 (3.537) Loss 0.8153 (0.8472) Acc@1 81.348 (80.114) Acc@5 95.703 (95.295) [2022-01-25 03:26:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.272 (2.566) Loss 0.9183 (0.8556) Acc@1 77.734 (79.734) Acc@5 94.629 (95.206) [2022-01-25 03:26:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.315 (2.281) Loss 0.8665 (0.8573) Acc@1 80.273 (79.801) Acc@5 95.020 (95.149) [2022-01-25 03:26:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.230 (2.174) Loss 0.8544 (0.8571) Acc@1 79.590 (79.783) Acc@5 95.312 (95.127) [2022-01-25 03:27:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.778 Acc@5 95.160 [2022-01-25 03:27:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-01-25 03:27:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.78% [2022-01-25 03:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][0/1251] eta 7:24:03 lr 0.000127 time 21.2975 (21.2975) loss 2.9436 (2.9436) grad_norm 2.2815 (2.2815) [2022-01-25 03:27:51 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [233/300][10/1251] eta 1:27:25 lr 0.000127 time 1.9929 (4.2267) loss 3.5668 (3.3209) grad_norm 2.1790 (2.2919) [2022-01-25 03:28:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][20/1251] eta 1:06:10 lr 0.000127 time 1.2172 (3.2256) loss 3.5573 (3.3094) grad_norm 2.3957 (2.2849) [2022-01-25 03:28:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][30/1251] eta 0:59:03 lr 0.000127 time 1.5416 (2.9020) loss 3.1455 (3.2749) grad_norm 2.4911 (2.3093) [2022-01-25 03:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][40/1251] eta 0:55:42 lr 0.000127 time 3.3125 (2.7602) loss 3.5444 (3.2622) grad_norm 1.9437 (2.3145) [2022-01-25 03:29:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][50/1251] eta 0:53:20 lr 0.000127 time 2.2671 (2.6646) loss 3.7977 (3.2882) grad_norm 2.1527 (2.3124) [2022-01-25 03:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][60/1251] eta 0:51:10 lr 0.000127 time 1.7159 (2.5782) loss 3.5440 (3.2956) grad_norm 2.6883 (2.3858) [2022-01-25 03:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][70/1251] eta 0:49:31 lr 0.000127 time 1.7409 (2.5162) loss 3.7329 (3.3067) grad_norm 2.5196 (2.3900) [2022-01-25 03:30:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][80/1251] eta 0:48:35 lr 0.000127 time 3.5898 (2.4896) loss 3.5127 (3.2838) grad_norm 3.2433 (2.4092) [2022-01-25 03:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][90/1251] eta 0:47:40 lr 0.000127 time 1.9143 (2.4641) loss 3.5287 (3.2791) grad_norm 2.4750 (2.4209) [2022-01-25 03:31:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][100/1251] eta 0:46:37 lr 0.000127 time 1.7606 (2.4301) loss 2.3196 (3.2277) grad_norm 2.6098 (2.4486) [2022-01-25 03:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][110/1251] eta 0:45:52 lr 0.000127 time 1.6248 (2.4122) loss 3.6518 (3.2312) grad_norm 
2.0396 (2.4389) [2022-01-25 03:31:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][120/1251] eta 0:45:00 lr 0.000127 time 2.3639 (2.3874) loss 2.5418 (3.2283) grad_norm 2.4389 (2.4449) [2022-01-25 03:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][130/1251] eta 0:44:16 lr 0.000127 time 2.1977 (2.3700) loss 3.0282 (3.2305) grad_norm 2.3701 (2.4276) [2022-01-25 03:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][140/1251] eta 0:43:36 lr 0.000127 time 2.0967 (2.3552) loss 3.4579 (3.2438) grad_norm 2.4600 (2.4266) [2022-01-25 03:32:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][150/1251] eta 0:43:08 lr 0.000127 time 2.0967 (2.3515) loss 2.5341 (3.2493) grad_norm 2.6053 (2.4379) [2022-01-25 03:33:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][160/1251] eta 0:42:38 lr 0.000126 time 2.1937 (2.3453) loss 2.3250 (3.2479) grad_norm 2.4302 (2.4373) [2022-01-25 03:33:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][170/1251] eta 0:42:04 lr 0.000126 time 1.8695 (2.3352) loss 2.3838 (3.2442) grad_norm 2.5062 (2.4289) [2022-01-25 03:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][180/1251] eta 0:41:17 lr 0.000126 time 1.9278 (2.3137) loss 2.5498 (3.2362) grad_norm 2.1889 (2.4243) [2022-01-25 03:34:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][190/1251] eta 0:40:43 lr 0.000126 time 1.8617 (2.3033) loss 2.1376 (3.2289) grad_norm 2.0276 (2.4238) [2022-01-25 03:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][200/1251] eta 0:40:11 lr 0.000126 time 2.0636 (2.2944) loss 3.8156 (3.2297) grad_norm 2.1656 (2.4201) [2022-01-25 03:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][210/1251] eta 0:39:40 lr 0.000126 time 1.9300 (2.2868) loss 3.3701 (3.2363) grad_norm 2.2957 (2.4208) [2022-01-25 03:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[233/300][220/1251] eta 0:39:11 lr 0.000126 time 2.2266 (2.2804) loss 3.3918 (3.2334) grad_norm 2.5050 (2.4150) [2022-01-25 03:35:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][230/1251] eta 0:38:48 lr 0.000126 time 1.5633 (2.2809) loss 3.8739 (3.2441) grad_norm 2.4066 (2.4123) [2022-01-25 03:36:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][240/1251] eta 0:38:25 lr 0.000126 time 2.5365 (2.2808) loss 3.1220 (3.2282) grad_norm 2.4565 (2.4085) [2022-01-25 03:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][250/1251] eta 0:38:02 lr 0.000126 time 2.1544 (2.2804) loss 3.2608 (3.2199) grad_norm 2.6381 (2.4059) [2022-01-25 03:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][260/1251] eta 0:37:34 lr 0.000126 time 2.2363 (2.2745) loss 3.2652 (3.2193) grad_norm 2.7397 (2.4092) [2022-01-25 03:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][270/1251] eta 0:37:08 lr 0.000126 time 1.9526 (2.2716) loss 3.2228 (3.2171) grad_norm 2.1757 (2.4035) [2022-01-25 03:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][280/1251] eta 0:36:37 lr 0.000126 time 2.2116 (2.2632) loss 2.8559 (3.2149) grad_norm 2.2930 (2.3976) [2022-01-25 03:38:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][290/1251] eta 0:36:09 lr 0.000126 time 1.9867 (2.2577) loss 2.9641 (3.2028) grad_norm 2.4330 (2.3943) [2022-01-25 03:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][300/1251] eta 0:35:44 lr 0.000126 time 2.1015 (2.2547) loss 3.7399 (3.2012) grad_norm 2.4358 (2.3957) [2022-01-25 03:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][310/1251] eta 0:35:29 lr 0.000126 time 1.9031 (2.2626) loss 3.8232 (3.2030) grad_norm 2.3042 (2.3940) [2022-01-25 03:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][320/1251] eta 0:35:05 lr 0.000126 time 1.9604 (2.2616) loss 3.8304 (3.2005) grad_norm 
2.0497 (2.3883) [2022-01-25 03:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][330/1251] eta 0:34:37 lr 0.000126 time 1.9783 (2.2556) loss 3.5536 (3.1984) grad_norm 2.0920 (2.3862) [2022-01-25 03:39:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][340/1251] eta 0:34:08 lr 0.000126 time 1.6535 (2.2492) loss 2.9216 (3.1989) grad_norm 2.1756 (2.3843) [2022-01-25 03:40:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][350/1251] eta 0:33:43 lr 0.000126 time 1.8454 (2.2458) loss 3.5994 (3.2054) grad_norm 2.2274 (2.3845) [2022-01-25 03:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][360/1251] eta 0:33:19 lr 0.000126 time 2.2041 (2.2440) loss 3.3476 (3.2087) grad_norm 2.3249 (2.3877) [2022-01-25 03:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][370/1251] eta 0:32:56 lr 0.000126 time 1.9088 (2.2436) loss 2.5934 (3.2035) grad_norm 2.3546 (2.3905) [2022-01-25 03:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][380/1251] eta 0:32:37 lr 0.000126 time 2.8768 (2.2470) loss 2.2198 (3.2054) grad_norm 2.4947 (2.3917) [2022-01-25 03:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][390/1251] eta 0:32:15 lr 0.000126 time 1.4781 (2.2480) loss 3.1013 (3.2023) grad_norm 2.6272 (2.3895) [2022-01-25 03:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][400/1251] eta 0:31:50 lr 0.000126 time 2.5810 (2.2451) loss 3.5540 (3.2121) grad_norm 2.4322 (2.3891) [2022-01-25 03:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][410/1251] eta 0:31:22 lr 0.000126 time 1.8181 (2.2385) loss 3.6630 (3.2183) grad_norm 2.2030 (2.3880) [2022-01-25 03:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][420/1251] eta 0:30:55 lr 0.000126 time 1.9593 (2.2327) loss 3.3769 (3.2213) grad_norm 2.2195 (2.3879) [2022-01-25 03:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[233/300][430/1251] eta 0:30:32 lr 0.000126 time 2.2197 (2.2323) loss 3.7689 (3.2287) grad_norm 2.3799 (2.3905) [2022-01-25 03:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][440/1251] eta 0:30:09 lr 0.000126 time 1.9581 (2.2308) loss 2.7166 (3.2222) grad_norm 2.4586 (2.3926) [2022-01-25 03:43:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][450/1251] eta 0:29:46 lr 0.000126 time 1.8725 (2.2297) loss 3.8022 (3.2217) grad_norm 2.2918 (2.3917) [2022-01-25 03:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][460/1251] eta 0:29:22 lr 0.000126 time 1.9164 (2.2279) loss 3.1144 (3.2219) grad_norm 2.0315 (2.3888) [2022-01-25 03:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][470/1251] eta 0:29:00 lr 0.000126 time 2.1527 (2.2292) loss 2.7454 (3.2229) grad_norm 2.4170 (2.3852) [2022-01-25 03:44:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][480/1251] eta 0:28:36 lr 0.000126 time 1.6025 (2.2264) loss 3.8203 (3.2247) grad_norm 2.7468 (2.3835) [2022-01-25 03:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][490/1251] eta 0:28:14 lr 0.000126 time 2.4907 (2.2263) loss 3.4151 (3.2230) grad_norm 2.1750 (2.3849) [2022-01-25 03:45:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][500/1251] eta 0:27:51 lr 0.000126 time 2.1291 (2.2258) loss 3.0999 (3.2230) grad_norm 2.9238 (2.3877) [2022-01-25 03:46:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][510/1251] eta 0:27:29 lr 0.000126 time 2.3278 (2.2259) loss 3.1502 (3.2238) grad_norm 2.1545 (2.3874) [2022-01-25 03:46:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][520/1251] eta 0:27:03 lr 0.000126 time 1.7613 (2.2203) loss 3.1734 (3.2240) grad_norm 2.0614 (2.3864) [2022-01-25 03:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][530/1251] eta 0:26:40 lr 0.000126 time 1.7546 (2.2195) loss 3.9514 (3.2282) grad_norm 
2.4154 (2.3857) [2022-01-25 03:47:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][540/1251] eta 0:26:16 lr 0.000125 time 1.9215 (2.2173) loss 3.4933 (3.2292) grad_norm 2.1956 (2.3871) [2022-01-25 03:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][550/1251] eta 0:25:54 lr 0.000125 time 3.2920 (2.2177) loss 3.8011 (3.2306) grad_norm 2.1791 (2.3853) [2022-01-25 03:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][560/1251] eta 0:25:34 lr 0.000125 time 1.5030 (2.2212) loss 2.5743 (3.2273) grad_norm 2.2037 (2.3857) [2022-01-25 03:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][570/1251] eta 0:25:13 lr 0.000125 time 2.3524 (2.2218) loss 3.3447 (3.2302) grad_norm 2.1180 (2.3854) [2022-01-25 03:48:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][580/1251] eta 0:24:48 lr 0.000125 time 1.8099 (2.2178) loss 1.9719 (3.2260) grad_norm 2.0163 (2.3820) [2022-01-25 03:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][590/1251] eta 0:24:23 lr 0.000125 time 2.7472 (2.2145) loss 3.4467 (3.2201) grad_norm 2.3865 (2.3811) [2022-01-25 03:49:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][600/1251] eta 0:24:00 lr 0.000125 time 1.6642 (2.2123) loss 3.1493 (3.2192) grad_norm 2.2561 (2.3817) [2022-01-25 03:49:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][610/1251] eta 0:23:38 lr 0.000125 time 1.9595 (2.2127) loss 2.2206 (3.2152) grad_norm 1.9976 (2.3822) [2022-01-25 03:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][620/1251] eta 0:23:17 lr 0.000125 time 2.4702 (2.2149) loss 3.1610 (3.2130) grad_norm 2.2217 (2.3837) [2022-01-25 03:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][630/1251] eta 0:22:55 lr 0.000125 time 1.8620 (2.2154) loss 3.2972 (3.2155) grad_norm 2.4631 (2.3838) [2022-01-25 03:50:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[233/300][640/1251] eta 0:22:34 lr 0.000125 time 1.8579 (2.2172) loss 3.0672 (3.2149) grad_norm 2.5330 (2.3838) [2022-01-25 03:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][650/1251] eta 0:22:12 lr 0.000125 time 2.2608 (2.2169) loss 3.5372 (3.2108) grad_norm 2.0644 (2.3807) [2022-01-25 03:51:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][660/1251] eta 0:21:48 lr 0.000125 time 1.8546 (2.2139) loss 3.0378 (3.2106) grad_norm 2.3895 (2.3830) [2022-01-25 03:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][670/1251] eta 0:21:24 lr 0.000125 time 1.7990 (2.2101) loss 3.7354 (3.2074) grad_norm 2.2111 (2.3825) [2022-01-25 03:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][680/1251] eta 0:21:00 lr 0.000125 time 1.9250 (2.2082) loss 3.5520 (3.2089) grad_norm 2.9066 (2.3879) [2022-01-25 03:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][690/1251] eta 0:20:39 lr 0.000125 time 2.4560 (2.2087) loss 3.5231 (3.2089) grad_norm 2.4715 (2.3875) [2022-01-25 03:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][700/1251] eta 0:20:18 lr 0.000125 time 2.9299 (2.2109) loss 3.1353 (3.2086) grad_norm 2.2634 (2.3854) [2022-01-25 03:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][710/1251] eta 0:19:56 lr 0.000125 time 2.9139 (2.2119) loss 3.8528 (3.2091) grad_norm 5.0563 (2.3881) [2022-01-25 03:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][720/1251] eta 0:19:34 lr 0.000125 time 3.5294 (2.2120) loss 2.9814 (3.2044) grad_norm 2.4751 (2.3900) [2022-01-25 03:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][730/1251] eta 0:19:12 lr 0.000125 time 1.6951 (2.2121) loss 3.8146 (3.2069) grad_norm 2.3887 (2.3894) [2022-01-25 03:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][740/1251] eta 0:18:50 lr 0.000125 time 2.4354 (2.2127) loss 3.2879 (3.2076) grad_norm 
2.3896 (2.3886) [2022-01-25 03:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][750/1251] eta 0:18:29 lr 0.000125 time 3.4372 (2.2147) loss 2.6446 (3.2062) grad_norm 2.5042 (2.3878) [2022-01-25 03:55:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][760/1251] eta 0:18:08 lr 0.000125 time 3.8946 (2.2168) loss 3.5940 (3.2033) grad_norm 2.9672 (2.3886) [2022-01-25 03:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][770/1251] eta 0:17:45 lr 0.000125 time 1.8039 (2.2143) loss 3.3833 (3.2020) grad_norm 2.3406 (2.3881) [2022-01-25 03:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][780/1251] eta 0:17:21 lr 0.000125 time 1.8873 (2.2105) loss 2.7813 (3.1993) grad_norm 2.2269 (2.3877) [2022-01-25 03:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][790/1251] eta 0:16:57 lr 0.000125 time 1.7913 (2.2082) loss 4.0460 (3.2002) grad_norm 2.4186 (2.3879) [2022-01-25 03:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][800/1251] eta 0:16:35 lr 0.000125 time 2.8679 (2.2070) loss 3.5679 (3.1985) grad_norm 2.3900 (2.3874) [2022-01-25 03:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][810/1251] eta 0:16:12 lr 0.000125 time 2.0094 (2.2061) loss 3.7760 (3.2023) grad_norm 2.4844 (2.3880) [2022-01-25 03:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][820/1251] eta 0:15:51 lr 0.000125 time 2.7712 (2.2068) loss 3.4808 (3.2031) grad_norm 2.8626 (2.3890) [2022-01-25 03:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][830/1251] eta 0:15:28 lr 0.000125 time 2.6117 (2.2059) loss 3.4481 (3.2011) grad_norm 2.1588 (2.3911) [2022-01-25 03:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][840/1251] eta 0:15:06 lr 0.000125 time 1.9184 (2.2053) loss 3.6132 (3.2032) grad_norm 2.0222 (2.3904) [2022-01-25 03:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[233/300][850/1251] eta 0:14:45 lr 0.000125 time 2.1487 (2.2074) loss 3.3019 (3.2036) grad_norm 3.2327 (2.3914) [2022-01-25 03:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][860/1251] eta 0:14:24 lr 0.000125 time 2.3359 (2.2107) loss 2.0723 (3.2034) grad_norm 2.8663 (2.3932) [2022-01-25 03:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][870/1251] eta 0:14:02 lr 0.000125 time 3.2130 (2.2120) loss 3.1936 (3.1975) grad_norm 2.5075 (2.3929) [2022-01-25 03:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][880/1251] eta 0:13:40 lr 0.000125 time 1.8986 (2.2108) loss 3.4752 (3.1981) grad_norm 2.7617 (2.3942) [2022-01-25 03:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][890/1251] eta 0:13:17 lr 0.000125 time 1.8966 (2.2086) loss 2.8661 (3.1961) grad_norm 2.9255 (2.3954) [2022-01-25 04:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][900/1251] eta 0:12:54 lr 0.000125 time 2.2244 (2.2070) loss 3.5885 (3.1958) grad_norm 2.1593 (2.3965) [2022-01-25 04:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][910/1251] eta 0:12:32 lr 0.000124 time 3.5074 (2.2076) loss 3.3889 (3.1976) grad_norm 2.3369 (2.3968) [2022-01-25 04:00:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][920/1251] eta 0:12:10 lr 0.000124 time 1.8663 (2.2068) loss 2.1396 (3.1984) grad_norm 2.4351 (2.3963) [2022-01-25 04:01:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][930/1251] eta 0:11:48 lr 0.000124 time 1.9548 (2.2066) loss 3.3567 (3.2001) grad_norm 2.2314 (2.3956) [2022-01-25 04:01:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][940/1251] eta 0:11:26 lr 0.000124 time 2.5226 (2.2063) loss 2.0025 (3.2001) grad_norm 2.3505 (2.3955) [2022-01-25 04:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][950/1251] eta 0:11:04 lr 0.000124 time 3.1326 (2.2066) loss 2.6900 (3.1991) grad_norm 
2.2686 (2.3951) [2022-01-25 04:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][960/1251] eta 0:10:42 lr 0.000124 time 2.3562 (2.2065) loss 2.2044 (3.1990) grad_norm 2.4271 (2.3945) [2022-01-25 04:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][970/1251] eta 0:10:19 lr 0.000124 time 1.9038 (2.2048) loss 3.0280 (3.1995) grad_norm 2.4752 (2.3947) [2022-01-25 04:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][980/1251] eta 0:09:57 lr 0.000124 time 2.0083 (2.2056) loss 3.8962 (3.2006) grad_norm 2.5372 (2.3936) [2022-01-25 04:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][990/1251] eta 0:09:36 lr 0.000124 time 2.6021 (2.2071) loss 3.5246 (3.2022) grad_norm 2.5883 (2.3934) [2022-01-25 04:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1000/1251] eta 0:09:13 lr 0.000124 time 2.4955 (2.2067) loss 2.4257 (3.2011) grad_norm 2.2592 (2.3941) [2022-01-25 04:04:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1010/1251] eta 0:08:51 lr 0.000124 time 1.6930 (2.2047) loss 3.2808 (3.2022) grad_norm 2.3389 (2.3954) [2022-01-25 04:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1020/1251] eta 0:08:28 lr 0.000124 time 1.8266 (2.2025) loss 3.8403 (3.2042) grad_norm 2.4744 (2.3951) [2022-01-25 04:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1030/1251] eta 0:08:06 lr 0.000124 time 1.8983 (2.2004) loss 2.9130 (3.2042) grad_norm 2.3368 (2.3946) [2022-01-25 04:05:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1040/1251] eta 0:07:44 lr 0.000124 time 2.6306 (2.2012) loss 3.4243 (3.2052) grad_norm 2.9417 (2.4001) [2022-01-25 04:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1050/1251] eta 0:07:22 lr 0.000124 time 2.1321 (2.2026) loss 2.4222 (3.2063) grad_norm 2.7700 (2.4010) [2022-01-25 04:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[233/300][1060/1251] eta 0:07:00 lr 0.000124 time 2.5383 (2.2023) loss 3.4468 (3.2072) grad_norm 2.5282 (2.4002) [2022-01-25 04:06:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1070/1251] eta 0:06:38 lr 0.000124 time 1.9172 (2.2007) loss 2.7552 (3.2089) grad_norm 2.0004 (2.3985) [2022-01-25 04:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1080/1251] eta 0:06:16 lr 0.000124 time 1.9594 (2.1992) loss 2.9174 (3.2059) grad_norm 2.5146 (2.3988) [2022-01-25 04:07:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1090/1251] eta 0:05:54 lr 0.000124 time 2.3794 (2.1991) loss 3.5766 (3.2070) grad_norm 2.3767 (2.4002) [2022-01-25 04:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1100/1251] eta 0:05:31 lr 0.000124 time 1.5831 (2.1983) loss 3.3754 (3.2099) grad_norm 2.2577 (2.4009) [2022-01-25 04:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1110/1251] eta 0:05:09 lr 0.000124 time 1.9516 (2.1977) loss 3.6533 (3.2116) grad_norm 2.4788 (2.4007) [2022-01-25 04:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1120/1251] eta 0:04:48 lr 0.000124 time 1.8560 (2.1988) loss 3.3869 (3.2111) grad_norm 2.5743 (2.4028) [2022-01-25 04:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1130/1251] eta 0:04:26 lr 0.000124 time 2.5609 (2.2001) loss 3.6039 (3.2116) grad_norm 2.2927 (2.4026) [2022-01-25 04:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1140/1251] eta 0:04:04 lr 0.000124 time 2.2157 (2.2005) loss 3.5748 (3.2089) grad_norm 2.4341 (2.4036) [2022-01-25 04:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1150/1251] eta 0:03:42 lr 0.000124 time 2.2179 (2.2009) loss 2.4299 (3.2097) grad_norm 2.5721 (2.4042) [2022-01-25 04:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1160/1251] eta 0:03:20 lr 0.000124 time 1.7099 (2.2007) loss 3.3596 (3.2093) 
grad_norm 2.2611 (2.4041) [2022-01-25 04:10:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1170/1251] eta 0:02:58 lr 0.000124 time 1.8978 (2.1999) loss 3.1768 (3.2097) grad_norm 2.4642 (2.4036) [2022-01-25 04:10:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1180/1251] eta 0:02:36 lr 0.000124 time 1.9500 (2.1986) loss 2.9915 (3.2120) grad_norm 2.1326 (2.4037) [2022-01-25 04:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1190/1251] eta 0:02:14 lr 0.000124 time 2.4551 (2.1981) loss 2.9952 (3.2115) grad_norm 2.3189 (2.4036) [2022-01-25 04:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1200/1251] eta 0:01:52 lr 0.000124 time 1.9376 (2.1978) loss 3.6256 (3.2117) grad_norm 2.6557 (2.4049) [2022-01-25 04:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1210/1251] eta 0:01:30 lr 0.000124 time 2.2027 (2.1979) loss 3.3747 (3.2116) grad_norm 2.0692 (2.4045) [2022-01-25 04:11:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1220/1251] eta 0:01:08 lr 0.000124 time 2.1497 (2.1986) loss 3.0128 (3.2117) grad_norm 2.3516 (2.4040) [2022-01-25 04:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1230/1251] eta 0:00:46 lr 0.000124 time 2.4150 (2.2003) loss 2.2109 (3.2091) grad_norm 2.5257 (2.4028) [2022-01-25 04:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1240/1251] eta 0:00:24 lr 0.000124 time 1.8955 (2.1988) loss 3.4220 (3.2112) grad_norm 2.2809 (2.4030) [2022-01-25 04:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [233/300][1250/1251] eta 0:00:02 lr 0.000124 time 1.2129 (2.1933) loss 3.5124 (3.2134) grad_norm 2.5301 (2.4033) [2022-01-25 04:12:49 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 233 training takes 0:45:44 [2022-01-25 04:13:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.767 (18.767) Loss 0.8802 (0.8802) Acc@1 78.613 (78.613) 
Acc@5 95.508 (95.508) [2022-01-25 04:13:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.900 (3.524) Loss 0.8189 (0.8463) Acc@1 81.055 (79.980) Acc@5 95.020 (95.162) [2022-01-25 04:13:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.640 (2.679) Loss 0.8554 (0.8729) Acc@1 79.883 (79.548) Acc@5 95.215 (94.987) [2022-01-25 04:14:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.659 (2.300) Loss 0.8309 (0.8695) Acc@1 80.371 (79.602) Acc@5 95.605 (94.947) [2022-01-25 04:14:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.161 (2.221) Loss 0.8156 (0.8657) Acc@1 81.250 (79.716) Acc@5 96.289 (95.024) [2022-01-25 04:14:28 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.736 Acc@5 95.042 [2022-01-25 04:14:28 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.7% [2022-01-25 04:14:28 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.78% [2022-01-25 04:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][0/1251] eta 8:21:37 lr 0.000124 time 24.0585 (24.0585) loss 3.6809 (3.6809) grad_norm 2.0408 (2.0408) [2022-01-25 04:15:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][10/1251] eta 1:26:08 lr 0.000124 time 1.2514 (4.1647) loss 2.9889 (3.4350) grad_norm 2.2403 (2.3999) [2022-01-25 04:15:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][20/1251] eta 1:04:06 lr 0.000124 time 1.5313 (3.1251) loss 3.4927 (3.4315) grad_norm 2.1731 (2.4164) [2022-01-25 04:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][30/1251] eta 0:56:52 lr 0.000124 time 1.7459 (2.7945) loss 3.5010 (3.3405) grad_norm 2.8640 (2.4106) [2022-01-25 04:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][40/1251] eta 0:55:05 lr 0.000123 time 4.2146 (2.7298) loss 3.2174 (3.3480) grad_norm 2.4866 (2.4578) [2022-01-25 04:16:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][50/1251] eta 0:52:55 lr 0.000123 time 1.8921 (2.6437) loss 3.1874 (3.3646) grad_norm 2.4254 (2.4444) [2022-01-25 04:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][60/1251] eta 0:50:55 lr 0.000123 time 1.2959 (2.5658) loss 3.2460 (3.2868) grad_norm 2.3184 (2.4188) [2022-01-25 04:17:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][70/1251] eta 0:49:33 lr 0.000123 time 1.5830 (2.5181) loss 2.9814 (3.2861) grad_norm 2.2292 (2.4043) [2022-01-25 04:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][80/1251] eta 0:48:41 lr 0.000123 time 3.5175 (2.4946) loss 2.7747 (3.2883) grad_norm 2.4666 (2.3947) [2022-01-25 04:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][90/1251] eta 0:47:45 lr 0.000123 time 1.5510 (2.4677) loss 3.6839 (3.2821) grad_norm 2.2879 (2.4258) [2022-01-25 04:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][100/1251] eta 0:46:38 lr 0.000123 time 1.8167 (2.4316) loss 3.5344 (3.2828) grad_norm 2.2205 (2.4234) [2022-01-25 04:18:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][110/1251] eta 0:45:33 lr 0.000123 time 1.5222 (2.3955) loss 1.9417 (3.2577) grad_norm 2.1793 (2.4155) [2022-01-25 04:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][120/1251] eta 0:44:41 lr 0.000123 time 2.1446 (2.3713) loss 3.7606 (3.2577) grad_norm 2.5037 (2.4104) [2022-01-25 04:19:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][130/1251] eta 0:44:25 lr 0.000123 time 2.5343 (2.3776) loss 3.3552 (3.2495) grad_norm 2.4863 (2.4039) [2022-01-25 04:20:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][140/1251] eta 0:43:42 lr 0.000123 time 1.5885 (2.3609) loss 2.9954 (3.2415) grad_norm 2.7456 (2.4050) [2022-01-25 04:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][150/1251] eta 0:43:06 lr 0.000123 
time 1.8408 (2.3488) loss 2.3308 (3.2367) grad_norm 2.2239 (2.4088) [2022-01-25 04:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][160/1251] eta 0:42:25 lr 0.000123 time 1.8065 (2.3330) loss 2.5953 (3.2309) grad_norm 2.0508 (2.4048) [2022-01-25 04:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][170/1251] eta 0:42:06 lr 0.000123 time 2.5488 (2.3374) loss 2.2219 (3.2128) grad_norm 2.8225 (2.4044) [2022-01-25 04:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][180/1251] eta 0:41:30 lr 0.000123 time 1.6190 (2.3256) loss 3.4162 (3.2161) grad_norm 2.4708 (2.4026) [2022-01-25 04:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][190/1251] eta 0:40:45 lr 0.000123 time 1.7667 (2.3046) loss 3.4406 (3.2117) grad_norm 2.3358 (2.4024) [2022-01-25 04:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][200/1251] eta 0:40:07 lr 0.000123 time 1.9297 (2.2906) loss 2.6929 (3.1995) grad_norm 2.4456 (2.4047) [2022-01-25 04:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][210/1251] eta 0:39:34 lr 0.000123 time 2.1831 (2.2807) loss 3.5516 (3.1853) grad_norm 2.5694 (2.4029) [2022-01-25 04:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][220/1251] eta 0:39:12 lr 0.000123 time 2.7172 (2.2815) loss 3.5537 (3.1840) grad_norm 2.2933 (2.4019) [2022-01-25 04:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][230/1251] eta 0:38:50 lr 0.000123 time 2.2815 (2.2822) loss 2.6277 (3.1862) grad_norm 2.6065 (2.3999) [2022-01-25 04:23:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][240/1251] eta 0:38:26 lr 0.000123 time 1.9849 (2.2816) loss 3.7205 (3.1877) grad_norm 2.2712 (2.4013) [2022-01-25 04:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][250/1251] eta 0:38:04 lr 0.000123 time 2.5376 (2.2826) loss 3.6496 (3.1887) grad_norm 2.2605 (2.3957) [2022-01-25 04:24:24 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][260/1251] eta 0:37:43 lr 0.000123 time 2.5713 (2.2836) loss 2.7383 (3.1802) grad_norm 2.0211 (2.4004) [2022-01-25 04:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][270/1251] eta 0:37:11 lr 0.000123 time 1.6347 (2.2750) loss 3.2399 (3.1777) grad_norm 2.4679 (2.4039) [2022-01-25 04:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][280/1251] eta 0:36:37 lr 0.000123 time 1.9125 (2.2636) loss 3.4383 (3.1713) grad_norm 2.0027 (2.3995) [2022-01-25 04:25:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][290/1251] eta 0:36:07 lr 0.000123 time 1.9795 (2.2550) loss 3.5024 (3.1653) grad_norm 2.2947 (2.3974) [2022-01-25 04:25:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][300/1251] eta 0:35:41 lr 0.000123 time 1.9792 (2.2523) loss 3.5536 (3.1690) grad_norm 4.0659 (2.4104) [2022-01-25 04:26:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][310/1251] eta 0:35:18 lr 0.000123 time 1.9514 (2.2517) loss 3.0961 (3.1741) grad_norm 2.3497 (2.4082) [2022-01-25 04:26:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][320/1251] eta 0:34:53 lr 0.000123 time 2.0834 (2.2491) loss 3.5336 (3.1739) grad_norm 2.3280 (2.4092) [2022-01-25 04:26:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][330/1251] eta 0:34:32 lr 0.000123 time 2.1892 (2.2503) loss 3.2777 (3.1788) grad_norm 2.1128 (2.4129) [2022-01-25 04:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][340/1251] eta 0:34:08 lr 0.000123 time 1.7434 (2.2483) loss 2.9750 (3.1822) grad_norm 2.7651 (2.4219) [2022-01-25 04:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][350/1251] eta 0:33:46 lr 0.000123 time 2.3398 (2.2488) loss 2.5608 (3.1742) grad_norm 3.1204 (2.4231) [2022-01-25 04:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][360/1251] eta 0:33:23 lr 
0.000123 time 1.9682 (2.2481) loss 3.3457 (3.1641) grad_norm 2.1174 (2.4566) [2022-01-25 04:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][370/1251] eta 0:33:02 lr 0.000123 time 3.0630 (2.2505) loss 3.7352 (3.1699) grad_norm 2.4309 (2.4579) [2022-01-25 04:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][380/1251] eta 0:32:36 lr 0.000123 time 2.1158 (2.2464) loss 2.5036 (3.1725) grad_norm 2.5055 (2.4583) [2022-01-25 04:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][390/1251] eta 0:32:12 lr 0.000123 time 2.5216 (2.2446) loss 2.3075 (3.1738) grad_norm 2.4880 (2.4584) [2022-01-25 04:29:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][400/1251] eta 0:31:44 lr 0.000123 time 1.9072 (2.2383) loss 2.8021 (3.1757) grad_norm 2.4145 (2.4597) [2022-01-25 04:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][410/1251] eta 0:31:21 lr 0.000123 time 2.8203 (2.2375) loss 2.5128 (3.1746) grad_norm 3.3081 (2.4587) [2022-01-25 04:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][420/1251] eta 0:31:00 lr 0.000122 time 2.5721 (2.2392) loss 3.9677 (3.1758) grad_norm 2.1857 (2.4588) [2022-01-25 04:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][430/1251] eta 0:30:36 lr 0.000122 time 2.0691 (2.2374) loss 2.3760 (3.1697) grad_norm 2.3384 (2.4584) [2022-01-25 04:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][440/1251] eta 0:30:13 lr 0.000122 time 1.7755 (2.2362) loss 2.7152 (3.1652) grad_norm 2.5933 (2.4629) [2022-01-25 04:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][450/1251] eta 0:29:50 lr 0.000122 time 3.0058 (2.2359) loss 3.2752 (3.1664) grad_norm 2.2182 (2.4613) [2022-01-25 04:31:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][460/1251] eta 0:29:27 lr 0.000122 time 1.8172 (2.2350) loss 3.7933 (3.1654) grad_norm 2.1775 (2.4576) [2022-01-25 04:31:58 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][470/1251] eta 0:29:01 lr 0.000122 time 1.9026 (2.2292) loss 2.0513 (3.1638) grad_norm 2.2845 (2.4532) [2022-01-25 04:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][480/1251] eta 0:28:35 lr 0.000122 time 1.8615 (2.2253) loss 2.6321 (3.1630) grad_norm 2.4547 (2.4509) [2022-01-25 04:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][490/1251] eta 0:28:11 lr 0.000122 time 2.5193 (2.2225) loss 3.8147 (3.1638) grad_norm 2.4130 (2.4499) [2022-01-25 04:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][500/1251] eta 0:27:48 lr 0.000122 time 2.4952 (2.2216) loss 3.4222 (3.1668) grad_norm 2.0862 (2.4525) [2022-01-25 04:33:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][510/1251] eta 0:27:25 lr 0.000122 time 2.5490 (2.2207) loss 3.1850 (3.1684) grad_norm 2.2928 (2.4550) [2022-01-25 04:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][520/1251] eta 0:27:04 lr 0.000122 time 2.2943 (2.2221) loss 3.7108 (3.1681) grad_norm 2.5640 (2.4552) [2022-01-25 04:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][530/1251] eta 0:26:43 lr 0.000122 time 2.4195 (2.2242) loss 3.3366 (3.1704) grad_norm 3.0059 (2.4546) [2022-01-25 04:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][540/1251] eta 0:26:22 lr 0.000122 time 2.0952 (2.2261) loss 3.6812 (3.1749) grad_norm 2.3636 (2.4566) [2022-01-25 04:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][550/1251] eta 0:26:00 lr 0.000122 time 1.9055 (2.2256) loss 3.6432 (3.1754) grad_norm 2.3017 (2.4559) [2022-01-25 04:35:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][560/1251] eta 0:25:37 lr 0.000122 time 2.7966 (2.2250) loss 3.0400 (3.1756) grad_norm 2.6308 (2.4546) [2022-01-25 04:35:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][570/1251] eta 0:25:13 lr 
0.000122 time 2.0262 (2.2225) loss 3.5328 (3.1758) grad_norm 2.4611 (2.4535) [2022-01-25 04:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][580/1251] eta 0:24:50 lr 0.000122 time 2.6900 (2.2211) loss 3.1639 (3.1735) grad_norm 2.3436 (2.4508) [2022-01-25 04:36:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][590/1251] eta 0:24:28 lr 0.000122 time 2.0639 (2.2215) loss 3.1941 (3.1736) grad_norm 2.8523 (2.4510) [2022-01-25 04:36:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][600/1251] eta 0:24:07 lr 0.000122 time 2.5947 (2.2231) loss 3.6235 (3.1740) grad_norm 2.3068 (2.4505) [2022-01-25 04:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][610/1251] eta 0:23:45 lr 0.000122 time 2.1856 (2.2240) loss 2.1935 (3.1739) grad_norm 2.0747 (2.4498) [2022-01-25 04:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][620/1251] eta 0:23:20 lr 0.000122 time 2.4415 (2.2202) loss 3.5578 (3.1725) grad_norm 2.5738 (2.4471) [2022-01-25 04:37:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][630/1251] eta 0:22:55 lr 0.000122 time 1.6059 (2.2154) loss 2.6673 (3.1703) grad_norm 2.6835 (2.4451) [2022-01-25 04:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][640/1251] eta 0:22:33 lr 0.000122 time 2.8380 (2.2157) loss 3.3823 (3.1676) grad_norm 2.2579 (2.4418) [2022-01-25 04:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][650/1251] eta 0:22:12 lr 0.000122 time 2.6886 (2.2179) loss 3.1570 (3.1676) grad_norm 2.1843 (2.4388) [2022-01-25 04:38:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][660/1251] eta 0:21:51 lr 0.000122 time 1.8982 (2.2198) loss 3.0620 (3.1689) grad_norm 2.1926 (2.4376) [2022-01-25 04:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][670/1251] eta 0:21:29 lr 0.000122 time 2.1351 (2.2189) loss 3.5696 (3.1694) grad_norm 2.6149 (2.4349) [2022-01-25 04:39:39 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][680/1251] eta 0:21:06 lr 0.000122 time 3.2413 (2.2184) loss 3.3954 (3.1704) grad_norm 2.8051 (2.4364) [2022-01-25 04:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][690/1251] eta 0:20:43 lr 0.000122 time 2.3600 (2.2172) loss 2.1403 (3.1696) grad_norm 2.2304 (2.4357) [2022-01-25 04:40:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][700/1251] eta 0:20:19 lr 0.000122 time 2.4650 (2.2139) loss 3.4137 (3.1695) grad_norm 2.6560 (2.4353) [2022-01-25 04:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][710/1251] eta 0:19:56 lr 0.000122 time 2.2085 (2.2121) loss 3.5210 (3.1708) grad_norm 2.5430 (2.4346) [2022-01-25 04:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][720/1251] eta 0:19:35 lr 0.000122 time 2.9890 (2.2131) loss 3.4734 (3.1704) grad_norm 2.0599 (2.4326) [2022-01-25 04:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][730/1251] eta 0:19:12 lr 0.000122 time 1.8249 (2.2117) loss 3.2168 (3.1709) grad_norm 2.1950 (2.4349) [2022-01-25 04:41:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][740/1251] eta 0:18:49 lr 0.000122 time 2.5688 (2.2107) loss 2.4829 (3.1696) grad_norm 2.5715 (2.4367) [2022-01-25 04:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][750/1251] eta 0:18:27 lr 0.000122 time 1.8633 (2.2102) loss 3.4774 (3.1690) grad_norm 2.6621 (2.4372) [2022-01-25 04:42:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][760/1251] eta 0:18:05 lr 0.000122 time 2.0234 (2.2102) loss 3.6210 (3.1706) grad_norm 2.3097 (2.4359) [2022-01-25 04:42:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][770/1251] eta 0:17:43 lr 0.000122 time 1.8456 (2.2114) loss 3.4978 (3.1724) grad_norm 2.5320 (2.4375) [2022-01-25 04:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][780/1251] eta 0:17:22 lr 
0.000122 time 2.7305 (2.2128) loss 3.5095 (3.1751) grad_norm 2.3653 (2.4390) [2022-01-25 04:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][790/1251] eta 0:16:59 lr 0.000122 time 1.5666 (2.2118) loss 2.0875 (3.1756) grad_norm 2.3911 (2.4412) [2022-01-25 04:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][800/1251] eta 0:16:37 lr 0.000121 time 1.9712 (2.2112) loss 2.6552 (3.1740) grad_norm 2.1322 (2.4404) [2022-01-25 04:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][810/1251] eta 0:16:15 lr 0.000121 time 1.6709 (2.2109) loss 3.4575 (3.1756) grad_norm 2.4159 (2.4416) [2022-01-25 04:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][820/1251] eta 0:15:52 lr 0.000121 time 2.3190 (2.2108) loss 3.2598 (3.1732) grad_norm 2.4399 (2.4420) [2022-01-25 04:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][830/1251] eta 0:15:29 lr 0.000121 time 1.6289 (2.2090) loss 3.4166 (3.1746) grad_norm 2.1878 (2.4405) [2022-01-25 04:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][840/1251] eta 0:15:07 lr 0.000121 time 2.2214 (2.2071) loss 2.4537 (3.1729) grad_norm 2.2166 (2.4398) [2022-01-25 04:45:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][850/1251] eta 0:14:45 lr 0.000121 time 2.2386 (2.2078) loss 3.0274 (3.1717) grad_norm 2.1928 (2.4381) [2022-01-25 04:46:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][860/1251] eta 0:14:22 lr 0.000121 time 1.9787 (2.2071) loss 2.5054 (3.1722) grad_norm 2.1112 (2.4380) [2022-01-25 04:46:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][870/1251] eta 0:14:00 lr 0.000121 time 1.8542 (2.2063) loss 3.3035 (3.1731) grad_norm 2.8555 (2.4363) [2022-01-25 04:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][880/1251] eta 0:13:38 lr 0.000121 time 2.2176 (2.2060) loss 3.2946 (3.1722) grad_norm 2.6640 (2.4357) [2022-01-25 04:47:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][890/1251] eta 0:13:15 lr 0.000121 time 2.1393 (2.2049) loss 3.1631 (3.1749) grad_norm 2.0585 (2.4347) [2022-01-25 04:47:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][900/1251] eta 0:12:53 lr 0.000121 time 2.1712 (2.2029) loss 3.6201 (3.1773) grad_norm 2.0268 (2.4335) [2022-01-25 04:47:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][910/1251] eta 0:12:31 lr 0.000121 time 1.8143 (2.2029) loss 3.2942 (3.1781) grad_norm 2.8005 (2.4329) [2022-01-25 04:48:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][920/1251] eta 0:12:09 lr 0.000121 time 2.8624 (2.2025) loss 3.6844 (3.1773) grad_norm 2.3870 (2.4322) [2022-01-25 04:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][930/1251] eta 0:11:47 lr 0.000121 time 3.1994 (2.2044) loss 2.6060 (3.1768) grad_norm 2.2123 (2.4302) [2022-01-25 04:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][940/1251] eta 0:11:26 lr 0.000121 time 2.9640 (2.2064) loss 3.3625 (3.1782) grad_norm 2.4303 (2.4315) [2022-01-25 04:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][950/1251] eta 0:11:04 lr 0.000121 time 1.7027 (2.2061) loss 2.5800 (3.1772) grad_norm 2.4095 (2.4322) [2022-01-25 04:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][960/1251] eta 0:10:41 lr 0.000121 time 1.6024 (2.2044) loss 3.3900 (3.1757) grad_norm 2.0880 (2.4329) [2022-01-25 04:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][970/1251] eta 0:10:19 lr 0.000121 time 2.2065 (2.2032) loss 3.8853 (3.1772) grad_norm 3.1297 (2.4342) [2022-01-25 04:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][980/1251] eta 0:09:56 lr 0.000121 time 2.2343 (2.2016) loss 3.8046 (3.1771) grad_norm 2.3148 (2.4345) [2022-01-25 04:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][990/1251] eta 0:09:34 lr 
0.000121 time 2.2481 (2.2006) loss 3.0891 (3.1758) grad_norm 2.2940 (2.4332) [2022-01-25 04:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1000/1251] eta 0:09:12 lr 0.000121 time 2.4792 (2.2009) loss 3.5743 (3.1756) grad_norm 2.1428 (2.4320) [2022-01-25 04:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1010/1251] eta 0:08:50 lr 0.000121 time 1.6130 (2.1994) loss 3.0374 (3.1747) grad_norm 2.4230 (2.4308) [2022-01-25 04:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1020/1251] eta 0:08:28 lr 0.000121 time 2.5794 (2.2021) loss 2.9361 (3.1729) grad_norm 2.0613 (2.4300) [2022-01-25 04:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1030/1251] eta 0:08:06 lr 0.000121 time 2.4604 (2.2024) loss 3.0847 (3.1736) grad_norm 2.3261 (2.4285) [2022-01-25 04:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1040/1251] eta 0:07:44 lr 0.000121 time 2.6961 (2.2037) loss 2.6542 (3.1718) grad_norm 2.6359 (2.4278) [2022-01-25 04:53:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1050/1251] eta 0:07:22 lr 0.000121 time 2.0235 (2.2028) loss 3.4175 (3.1748) grad_norm 2.7026 (2.4275) [2022-01-25 04:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1060/1251] eta 0:07:00 lr 0.000121 time 2.1198 (2.2032) loss 3.2478 (3.1757) grad_norm 2.3508 (2.4277) [2022-01-25 04:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1070/1251] eta 0:06:38 lr 0.000121 time 3.0418 (2.2037) loss 3.7436 (3.1781) grad_norm 2.6591 (2.4275) [2022-01-25 04:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1080/1251] eta 0:06:16 lr 0.000121 time 1.6563 (2.2013) loss 2.4809 (3.1770) grad_norm 3.5896 (2.4286) [2022-01-25 04:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1090/1251] eta 0:05:54 lr 0.000121 time 1.6975 (2.1993) loss 3.1405 (3.1789) grad_norm 2.1887 (2.4292) [2022-01-25 
04:54:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1100/1251] eta 0:05:32 lr 0.000121 time 2.3965 (2.1996) loss 3.6919 (3.1785) grad_norm 2.4054 (2.4298) [2022-01-25 04:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1110/1251] eta 0:05:10 lr 0.000121 time 3.6264 (2.2025) loss 3.7226 (3.1789) grad_norm 2.3851 (2.4359) [2022-01-25 04:55:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1120/1251] eta 0:04:48 lr 0.000121 time 1.9573 (2.2028) loss 3.0546 (3.1790) grad_norm 2.5968 (2.4350) [2022-01-25 04:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1130/1251] eta 0:04:26 lr 0.000121 time 1.6263 (2.2018) loss 2.7937 (3.1760) grad_norm 2.5934 (2.4353) [2022-01-25 04:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1140/1251] eta 0:04:04 lr 0.000121 time 1.8288 (2.2019) loss 3.2331 (3.1768) grad_norm 1.9799 (2.4334) [2022-01-25 04:56:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1150/1251] eta 0:03:42 lr 0.000121 time 2.9299 (2.2020) loss 3.6037 (3.1758) grad_norm 2.2034 (2.4332) [2022-01-25 04:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1160/1251] eta 0:03:20 lr 0.000121 time 2.4499 (2.2004) loss 3.6275 (3.1748) grad_norm 2.6573 (2.4333) [2022-01-25 04:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1170/1251] eta 0:02:58 lr 0.000121 time 1.7966 (2.1997) loss 2.6196 (3.1751) grad_norm 2.2626 (2.4330) [2022-01-25 04:57:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1180/1251] eta 0:02:36 lr 0.000120 time 2.0097 (2.1991) loss 2.6418 (3.1740) grad_norm 2.4631 (2.4331) [2022-01-25 04:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1190/1251] eta 0:02:14 lr 0.000120 time 2.2251 (2.1989) loss 2.3859 (3.1730) grad_norm 2.1988 (2.4347) [2022-01-25 04:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1200/1251] 
eta 0:01:52 lr 0.000120 time 2.1621 (2.1981) loss 3.0288 (3.1737) grad_norm 2.1913 (2.4338) [2022-01-25 04:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1210/1251] eta 0:01:30 lr 0.000120 time 2.5565 (2.1980) loss 3.0264 (3.1728) grad_norm 2.1208 (2.4323) [2022-01-25 04:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1220/1251] eta 0:01:08 lr 0.000120 time 2.4800 (2.1976) loss 2.9933 (3.1699) grad_norm 2.1253 (2.4318) [2022-01-25 04:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1230/1251] eta 0:00:46 lr 0.000120 time 1.8303 (2.1975) loss 3.0404 (3.1701) grad_norm 2.0733 (2.4320) [2022-01-25 04:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1240/1251] eta 0:00:24 lr 0.000120 time 1.7105 (2.1968) loss 3.5568 (3.1702) grad_norm 2.1938 (2.4323) [2022-01-25 05:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [234/300][1250/1251] eta 0:00:02 lr 0.000120 time 1.1737 (2.1913) loss 3.5128 (3.1713) grad_norm 2.2568 (2.4328) [2022-01-25 05:00:10 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 234 training takes 0:45:41 [2022-01-25 05:00:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.898 (18.898) Loss 0.9800 (0.9800) Acc@1 77.441 (77.441) Acc@5 93.359 (93.359) [2022-01-25 05:00:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.316 (3.450) Loss 0.9190 (0.8766) Acc@1 78.223 (79.785) Acc@5 94.629 (94.691) [2022-01-25 05:01:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.318 (2.627) Loss 0.8707 (0.8681) Acc@1 79.883 (79.832) Acc@5 95.312 (94.927) [2022-01-25 05:01:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.597 (2.259) Loss 0.7989 (0.8644) Acc@1 80.273 (79.895) Acc@5 95.898 (94.988) [2022-01-25 05:01:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 5.029 (2.213) Loss 0.8360 (0.8565) Acc@1 81.348 (79.904) Acc@5 95.117 (95.091) 
[2022-01-25 05:01:47 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.818 Acc@5 95.022 [2022-01-25 05:01:47 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-01-25 05:01:47 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.82% [2022-01-25 05:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][0/1251] eta 8:32:11 lr 0.000120 time 24.5653 (24.5653) loss 3.1984 (3.1984) grad_norm 2.4376 (2.4376) [2022-01-25 05:02:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][10/1251] eta 1:26:07 lr 0.000120 time 2.2451 (4.1638) loss 3.1736 (3.0965) grad_norm 2.8175 (2.4483) [2022-01-25 05:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][20/1251] eta 1:04:19 lr 0.000120 time 1.8043 (3.1353) loss 3.4154 (3.1305) grad_norm 2.1756 (2.4206) [2022-01-25 05:03:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][30/1251] eta 0:57:51 lr 0.000120 time 1.6830 (2.8434) loss 3.2143 (3.1522) grad_norm 2.9608 (2.4586) [2022-01-25 05:03:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][40/1251] eta 0:54:38 lr 0.000120 time 4.6911 (2.7071) loss 3.5268 (3.2086) grad_norm 2.4708 (2.4649) [2022-01-25 05:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][50/1251] eta 0:52:14 lr 0.000120 time 2.1862 (2.6096) loss 3.5714 (3.2066) grad_norm 2.7695 (2.4909) [2022-01-25 05:04:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][60/1251] eta 0:49:57 lr 0.000120 time 1.2671 (2.5168) loss 2.9989 (3.1577) grad_norm 2.3714 (2.4895) [2022-01-25 05:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][70/1251] eta 0:49:42 lr 0.000120 time 1.2827 (2.5253) loss 3.5279 (3.1785) grad_norm 1.9567 (2.4926) [2022-01-25 05:05:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][80/1251] eta 0:49:00 lr 0.000120 time 3.6010 (2.5108) loss 3.6307 (3.1459) 
grad_norm 2.5227 (2.4746) [2022-01-25 05:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][90/1251] eta 0:47:33 lr 0.000120 time 1.5352 (2.4580) loss 2.2214 (3.1507) grad_norm 2.5915 (2.4598) [2022-01-25 05:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][100/1251] eta 0:46:21 lr 0.000120 time 1.5379 (2.4162) loss 3.6305 (3.1307) grad_norm 2.4331 (2.4645) [2022-01-25 05:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][110/1251] eta 0:45:20 lr 0.000120 time 1.5986 (2.3847) loss 3.4983 (3.1116) grad_norm 2.5583 (2.4539) [2022-01-25 05:06:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][120/1251] eta 0:44:59 lr 0.000120 time 3.1287 (2.3869) loss 3.3405 (3.1074) grad_norm 2.1778 (2.4514) [2022-01-25 05:07:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][130/1251] eta 0:44:33 lr 0.000120 time 1.6599 (2.3853) loss 3.5951 (3.1181) grad_norm 2.3878 (2.4413) [2022-01-25 05:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][140/1251] eta 0:43:57 lr 0.000120 time 2.2245 (2.3738) loss 2.3879 (3.1268) grad_norm 2.3121 (2.4313) [2022-01-25 05:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][150/1251] eta 0:43:20 lr 0.000120 time 1.7834 (2.3621) loss 3.4617 (3.1252) grad_norm 2.2955 (2.4404) [2022-01-25 05:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][160/1251] eta 0:42:55 lr 0.000120 time 3.1159 (2.3609) loss 3.4890 (3.1143) grad_norm 2.3777 (2.4451) [2022-01-25 05:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][170/1251] eta 0:42:04 lr 0.000120 time 1.8536 (2.3353) loss 3.6848 (3.1251) grad_norm 2.2625 (2.4383) [2022-01-25 05:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][180/1251] eta 0:41:25 lr 0.000120 time 1.9682 (2.3206) loss 2.7918 (3.1315) grad_norm 2.8847 (2.4386) [2022-01-25 05:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [235/300][190/1251] eta 0:40:53 lr 0.000120 time 2.1715 (2.3123) loss 3.7704 (3.1387) grad_norm 2.5980 (2.4401) [2022-01-25 05:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][200/1251] eta 0:40:21 lr 0.000120 time 2.3686 (2.3044) loss 3.0977 (3.1510) grad_norm 2.5313 (2.4358) [2022-01-25 05:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][210/1251] eta 0:39:54 lr 0.000120 time 2.4308 (2.3000) loss 3.6209 (3.1512) grad_norm 5.8504 (2.4500) [2022-01-25 05:10:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][220/1251] eta 0:39:22 lr 0.000120 time 2.1354 (2.2910) loss 3.4697 (3.1517) grad_norm 2.2307 (2.4506) [2022-01-25 05:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][230/1251] eta 0:39:00 lr 0.000120 time 2.5167 (2.2924) loss 3.7793 (3.1425) grad_norm 2.4641 (2.4549) [2022-01-25 05:11:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][240/1251] eta 0:38:37 lr 0.000120 time 3.0740 (2.2927) loss 3.6358 (3.1463) grad_norm 2.2007 (2.4517) [2022-01-25 05:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][250/1251] eta 0:38:11 lr 0.000120 time 1.5226 (2.2892) loss 3.3646 (3.1445) grad_norm 2.7331 (2.4516) [2022-01-25 05:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][260/1251] eta 0:37:44 lr 0.000120 time 2.2123 (2.2847) loss 3.7175 (3.1464) grad_norm 2.3720 (2.4502) [2022-01-25 05:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][270/1251] eta 0:37:12 lr 0.000120 time 1.5942 (2.2757) loss 2.7969 (3.1523) grad_norm 2.2945 (2.4443) [2022-01-25 05:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][280/1251] eta 0:36:46 lr 0.000120 time 2.4578 (2.2720) loss 2.7254 (3.1534) grad_norm 2.2215 (2.4435) [2022-01-25 05:12:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][290/1251] eta 0:36:18 lr 0.000120 time 1.8204 (2.2669) loss 3.2158 (3.1517) 
grad_norm 2.2428 (2.4398) [2022-01-25 05:13:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][300/1251] eta 0:35:54 lr 0.000120 time 2.2492 (2.2654) loss 3.5377 (3.1589) grad_norm 2.2662 (2.4359) [2022-01-25 05:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][310/1251] eta 0:35:32 lr 0.000120 time 3.0462 (2.2667) loss 2.2741 (3.1621) grad_norm 2.5778 (2.4362) [2022-01-25 05:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][320/1251] eta 0:35:13 lr 0.000119 time 3.2507 (2.2700) loss 3.6779 (3.1662) grad_norm 2.4835 (2.4365) [2022-01-25 05:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][330/1251] eta 0:34:49 lr 0.000119 time 1.9146 (2.2686) loss 3.6149 (3.1668) grad_norm 2.5702 (2.4345) [2022-01-25 05:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][340/1251] eta 0:34:26 lr 0.000119 time 2.2527 (2.2688) loss 3.0667 (3.1637) grad_norm 2.2935 (2.4349) [2022-01-25 05:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][350/1251] eta 0:33:57 lr 0.000119 time 2.2012 (2.2613) loss 2.0889 (3.1563) grad_norm 2.2708 (2.4329) [2022-01-25 05:15:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][360/1251] eta 0:33:26 lr 0.000119 time 2.1886 (2.2519) loss 3.5366 (3.1563) grad_norm 2.0986 (2.4296) [2022-01-25 05:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][370/1251] eta 0:32:55 lr 0.000119 time 1.6540 (2.2422) loss 3.4914 (3.1586) grad_norm 2.4457 (2.4337) [2022-01-25 05:16:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][380/1251] eta 0:32:30 lr 0.000119 time 2.1742 (2.2393) loss 3.4752 (3.1563) grad_norm 3.3934 (2.4400) [2022-01-25 05:16:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][390/1251] eta 0:32:05 lr 0.000119 time 2.5391 (2.2363) loss 1.8612 (3.1523) grad_norm 2.2915 (2.4432) [2022-01-25 05:16:44 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [235/300][400/1251] eta 0:31:42 lr 0.000119 time 2.7542 (2.2355) loss 3.1192 (3.1473) grad_norm 2.3714 (2.4436) [2022-01-25 05:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][410/1251] eta 0:31:21 lr 0.000119 time 1.8150 (2.2366) loss 3.0314 (3.1448) grad_norm 2.2558 (2.4440) [2022-01-25 05:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][420/1251] eta 0:31:00 lr 0.000119 time 1.8329 (2.2393) loss 3.1689 (3.1448) grad_norm 2.3335 (2.4465) [2022-01-25 05:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][430/1251] eta 0:30:40 lr 0.000119 time 1.8837 (2.2417) loss 3.6817 (3.1465) grad_norm 2.3102 (2.4455) [2022-01-25 05:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][440/1251] eta 0:30:16 lr 0.000119 time 2.2158 (2.2398) loss 3.4565 (3.1455) grad_norm 2.3955 (2.4449) [2022-01-25 05:18:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][450/1251] eta 0:29:57 lr 0.000119 time 3.0079 (2.2441) loss 3.7295 (3.1448) grad_norm 2.3966 (2.4457) [2022-01-25 05:19:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][460/1251] eta 0:29:34 lr 0.000119 time 2.0418 (2.2435) loss 3.1382 (3.1431) grad_norm 2.5694 (2.4442) [2022-01-25 05:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][470/1251] eta 0:29:09 lr 0.000119 time 1.8616 (2.2396) loss 3.3752 (3.1432) grad_norm 2.6371 (2.4499) [2022-01-25 05:19:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][480/1251] eta 0:28:42 lr 0.000119 time 1.9754 (2.2343) loss 2.5778 (3.1395) grad_norm 2.0214 (2.4450) [2022-01-25 05:20:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][490/1251] eta 0:28:18 lr 0.000119 time 2.4345 (2.2316) loss 2.0907 (3.1385) grad_norm 2.2994 (2.4458) [2022-01-25 05:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][500/1251] eta 0:27:56 lr 0.000119 time 2.5267 (2.2324) loss 2.8475 (3.1380) 
grad_norm 2.1093 (2.4457) [2022-01-25 05:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][510/1251] eta 0:27:32 lr 0.000119 time 2.1161 (2.2302) loss 2.3453 (3.1393) grad_norm 2.3158 (2.4460) [2022-01-25 05:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][520/1251] eta 0:27:09 lr 0.000119 time 2.5056 (2.2296) loss 2.2013 (3.1380) grad_norm 2.6026 (2.4471) [2022-01-25 05:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][530/1251] eta 0:26:47 lr 0.000119 time 2.4215 (2.2302) loss 2.6654 (3.1424) grad_norm 2.0216 (2.4439) [2022-01-25 05:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][540/1251] eta 0:26:23 lr 0.000119 time 2.2925 (2.2272) loss 3.3547 (3.1467) grad_norm 2.3313 (2.4437) [2022-01-25 05:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][550/1251] eta 0:25:59 lr 0.000119 time 2.5176 (2.2247) loss 3.1857 (3.1501) grad_norm 2.0536 (2.4414) [2022-01-25 05:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][560/1251] eta 0:25:36 lr 0.000119 time 2.1829 (2.2230) loss 3.4799 (3.1486) grad_norm 2.6584 (2.4402) [2022-01-25 05:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][570/1251] eta 0:25:14 lr 0.000119 time 1.8235 (2.2236) loss 3.5920 (3.1501) grad_norm 2.7174 (2.4399) [2022-01-25 05:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][580/1251] eta 0:24:52 lr 0.000119 time 2.0119 (2.2242) loss 2.8438 (3.1474) grad_norm 2.3506 (2.4379) [2022-01-25 05:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][590/1251] eta 0:24:28 lr 0.000119 time 2.4696 (2.2219) loss 3.1563 (3.1476) grad_norm 3.0962 (2.4379) [2022-01-25 05:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][600/1251] eta 0:24:04 lr 0.000119 time 2.1805 (2.2183) loss 3.3957 (3.1532) grad_norm 2.2984 (2.4447) [2022-01-25 05:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [235/300][610/1251] eta 0:23:40 lr 0.000119 time 1.9101 (2.2168) loss 2.6071 (3.1522) grad_norm 2.5371 (2.4432) [2022-01-25 05:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][620/1251] eta 0:23:20 lr 0.000119 time 3.2398 (2.2197) loss 3.0453 (3.1548) grad_norm 2.5086 (2.4448) [2022-01-25 05:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][630/1251] eta 0:22:58 lr 0.000119 time 2.5429 (2.2204) loss 2.9007 (3.1553) grad_norm 2.3037 (2.4481) [2022-01-25 05:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][640/1251] eta 0:22:36 lr 0.000119 time 2.1047 (2.2201) loss 3.8382 (3.1583) grad_norm 2.2667 (2.4484) [2022-01-25 05:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][650/1251] eta 0:22:12 lr 0.000119 time 2.5399 (2.2175) loss 3.3128 (3.1563) grad_norm 2.1528 (2.4474) [2022-01-25 05:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][660/1251] eta 0:21:50 lr 0.000119 time 2.7359 (2.2166) loss 2.7883 (3.1571) grad_norm 2.4496 (2.4467) [2022-01-25 05:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][670/1251] eta 0:21:25 lr 0.000119 time 1.5574 (2.2132) loss 2.0718 (3.1552) grad_norm 2.2972 (2.4468) [2022-01-25 05:26:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][680/1251] eta 0:21:02 lr 0.000119 time 1.8999 (2.2114) loss 2.5981 (3.1526) grad_norm 2.2244 (2.4468) [2022-01-25 05:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][690/1251] eta 0:20:40 lr 0.000119 time 2.4906 (2.2114) loss 3.4607 (3.1508) grad_norm 2.2334 (2.4450) [2022-01-25 05:27:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][700/1251] eta 0:20:19 lr 0.000118 time 3.2027 (2.2135) loss 3.5777 (3.1527) grad_norm 2.5437 (2.4431) [2022-01-25 05:28:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][710/1251] eta 0:19:57 lr 0.000118 time 1.7958 (2.2131) loss 3.5214 (3.1537) 
grad_norm 2.5471 (2.4442) [2022-01-25 05:28:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][720/1251] eta 0:19:34 lr 0.000118 time 1.6438 (2.2128) loss 3.2397 (3.1535) grad_norm 2.4536 (2.4445) [2022-01-25 05:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][730/1251] eta 0:19:12 lr 0.000118 time 1.6174 (2.2114) loss 2.2644 (3.1509) grad_norm 2.3380 (2.4424) [2022-01-25 05:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][740/1251] eta 0:18:50 lr 0.000118 time 2.5760 (2.2124) loss 3.8056 (3.1507) grad_norm 2.7631 (2.4421) [2022-01-25 05:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][750/1251] eta 0:18:27 lr 0.000118 time 2.6376 (2.2116) loss 2.2941 (3.1507) grad_norm 2.3098 (2.4416) [2022-01-25 05:29:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][760/1251] eta 0:18:04 lr 0.000118 time 2.1189 (2.2096) loss 2.3738 (3.1518) grad_norm 2.2025 (2.4429) [2022-01-25 05:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][770/1251] eta 0:17:42 lr 0.000118 time 2.2183 (2.2089) loss 3.3852 (3.1523) grad_norm 2.5886 (2.4427) [2022-01-25 05:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][780/1251] eta 0:17:19 lr 0.000118 time 2.5127 (2.2076) loss 2.9430 (3.1502) grad_norm 2.3443 (2.4435) [2022-01-25 05:30:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][790/1251] eta 0:16:57 lr 0.000118 time 1.8924 (2.2082) loss 3.3850 (3.1514) grad_norm 2.6752 (2.4481) [2022-01-25 05:31:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][800/1251] eta 0:16:36 lr 0.000118 time 2.8333 (2.2096) loss 3.5554 (3.1529) grad_norm 2.2754 (2.4483) [2022-01-25 05:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][810/1251] eta 0:16:15 lr 0.000118 time 2.1428 (2.2111) loss 2.4377 (3.1530) grad_norm 2.8126 (2.4481) [2022-01-25 05:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [235/300][820/1251] eta 0:15:53 lr 0.000118 time 2.3800 (2.2125) loss 3.5981 (3.1533) grad_norm 3.2207 (2.4494) [2022-01-25 05:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][830/1251] eta 0:15:32 lr 0.000118 time 2.1530 (2.2147) loss 3.4473 (3.1545) grad_norm 2.4835 (2.4494) [2022-01-25 05:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][840/1251] eta 0:15:10 lr 0.000118 time 2.7654 (2.2147) loss 3.6405 (3.1562) grad_norm 2.4837 (2.4509) [2022-01-25 05:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][850/1251] eta 0:14:46 lr 0.000118 time 1.8483 (2.2111) loss 3.1410 (3.1585) grad_norm 2.4368 (2.4509) [2022-01-25 05:33:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][860/1251] eta 0:14:23 lr 0.000118 time 1.7605 (2.2076) loss 3.7510 (3.1605) grad_norm 2.5832 (2.4511) [2022-01-25 05:33:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][870/1251] eta 0:14:00 lr 0.000118 time 2.0169 (2.2068) loss 3.3522 (3.1611) grad_norm 2.5884 (2.4497) [2022-01-25 05:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][880/1251] eta 0:13:38 lr 0.000118 time 2.4530 (2.2053) loss 3.6005 (3.1609) grad_norm 2.5875 (2.4506) [2022-01-25 05:34:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][890/1251] eta 0:13:16 lr 0.000118 time 2.5125 (2.2059) loss 2.1491 (3.1588) grad_norm 2.3343 (2.4537) [2022-01-25 05:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][900/1251] eta 0:12:54 lr 0.000118 time 1.8511 (2.2054) loss 3.7142 (3.1610) grad_norm 2.3356 (2.4532) [2022-01-25 05:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][910/1251] eta 0:12:32 lr 0.000118 time 2.8396 (2.2070) loss 2.6352 (3.1586) grad_norm 2.3439 (2.4530) [2022-01-25 05:35:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][920/1251] eta 0:12:10 lr 0.000118 time 2.3045 (2.2062) loss 3.1917 (3.1587) 
grad_norm 2.4378 (2.4518) [2022-01-25 05:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][930/1251] eta 0:11:48 lr 0.000118 time 2.2445 (2.2059) loss 3.3895 (3.1573) grad_norm 2.5570 (2.4545) [2022-01-25 05:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][940/1251] eta 0:11:25 lr 0.000118 time 2.2152 (2.2058) loss 3.1046 (3.1573) grad_norm 2.6715 (2.4553) [2022-01-25 05:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][950/1251] eta 0:11:04 lr 0.000118 time 2.2115 (2.2069) loss 2.9969 (3.1562) grad_norm 1.9798 (2.4546) [2022-01-25 05:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][960/1251] eta 0:10:42 lr 0.000118 time 2.1872 (2.2080) loss 3.0251 (3.1553) grad_norm 2.1473 (2.4532) [2022-01-25 05:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][970/1251] eta 0:10:20 lr 0.000118 time 2.3766 (2.2092) loss 3.3571 (3.1578) grad_norm 2.8976 (2.4548) [2022-01-25 05:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][980/1251] eta 0:09:58 lr 0.000118 time 1.6113 (2.2069) loss 3.6120 (3.1593) grad_norm 2.8342 (2.4545) [2022-01-25 05:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][990/1251] eta 0:09:35 lr 0.000118 time 1.9011 (2.2062) loss 3.2132 (3.1601) grad_norm 2.4288 (2.4549) [2022-01-25 05:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1000/1251] eta 0:09:14 lr 0.000118 time 4.1511 (2.2078) loss 3.1235 (3.1591) grad_norm 2.2043 (2.4544) [2022-01-25 05:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1010/1251] eta 0:08:52 lr 0.000118 time 1.8320 (2.2086) loss 3.3958 (3.1600) grad_norm 2.5672 (2.4547) [2022-01-25 05:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1020/1251] eta 0:08:29 lr 0.000118 time 1.8051 (2.2075) loss 3.7945 (3.1590) grad_norm 2.6838 (2.4583) [2022-01-25 05:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [235/300][1030/1251] eta 0:08:07 lr 0.000118 time 2.7117 (2.2065) loss 3.2500 (3.1596) grad_norm 2.7380 (2.4579) [2022-01-25 05:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1040/1251] eta 0:07:45 lr 0.000118 time 2.0571 (2.2053) loss 3.7247 (3.1624) grad_norm 2.9468 (2.4573) [2022-01-25 05:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1050/1251] eta 0:07:23 lr 0.000118 time 1.8438 (2.2052) loss 3.8657 (3.1638) grad_norm 2.7073 (2.4581) [2022-01-25 05:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1060/1251] eta 0:07:01 lr 0.000118 time 1.4642 (2.2066) loss 3.4457 (3.1651) grad_norm 2.2973 (2.4577) [2022-01-25 05:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1070/1251] eta 0:06:39 lr 0.000118 time 2.6213 (2.2068) loss 4.0777 (3.1644) grad_norm 2.6085 (2.4569) [2022-01-25 05:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1080/1251] eta 0:06:17 lr 0.000118 time 1.9289 (2.2064) loss 3.1124 (3.1651) grad_norm 2.2685 (2.4573) [2022-01-25 05:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1090/1251] eta 0:05:55 lr 0.000117 time 2.2041 (2.2059) loss 3.2722 (3.1640) grad_norm 2.5098 (2.4577) [2022-01-25 05:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1100/1251] eta 0:05:32 lr 0.000117 time 1.5802 (2.2037) loss 3.1492 (3.1651) grad_norm 2.2042 (2.4572) [2022-01-25 05:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1110/1251] eta 0:05:10 lr 0.000117 time 1.9794 (2.2020) loss 3.4035 (3.1656) grad_norm 2.3952 (2.4560) [2022-01-25 05:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1120/1251] eta 0:04:48 lr 0.000117 time 1.8238 (2.2013) loss 3.3923 (3.1664) grad_norm 2.2747 (2.4556) [2022-01-25 05:43:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1130/1251] eta 0:04:26 lr 0.000117 time 1.8688 (2.2012) loss 2.7592 
(3.1656) grad_norm 2.2270 (2.4542) [2022-01-25 05:43:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1140/1251] eta 0:04:04 lr 0.000117 time 2.0001 (2.2010) loss 3.2708 (3.1652) grad_norm 2.1791 (2.4531) [2022-01-25 05:44:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1150/1251] eta 0:03:42 lr 0.000117 time 1.9683 (2.2003) loss 3.3120 (3.1633) grad_norm 2.3537 (2.4539) [2022-01-25 05:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1160/1251] eta 0:03:20 lr 0.000117 time 1.9496 (2.1995) loss 2.9656 (3.1632) grad_norm 2.6741 (2.4528) [2022-01-25 05:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1170/1251] eta 0:02:58 lr 0.000117 time 1.9916 (2.2004) loss 3.8089 (3.1646) grad_norm 2.3038 (2.4522) [2022-01-25 05:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1180/1251] eta 0:02:36 lr 0.000117 time 2.1038 (2.2011) loss 3.5532 (3.1647) grad_norm 2.3465 (2.4517) [2022-01-25 05:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1190/1251] eta 0:02:14 lr 0.000117 time 2.2351 (2.2025) loss 2.3387 (3.1657) grad_norm 2.2393 (2.4503) [2022-01-25 05:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1200/1251] eta 0:01:52 lr 0.000117 time 1.5571 (2.2039) loss 2.3841 (3.1635) grad_norm 2.2627 (2.4492) [2022-01-25 05:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1210/1251] eta 0:01:30 lr 0.000117 time 2.1016 (2.2049) loss 2.1104 (3.1591) grad_norm 2.5299 (2.4489) [2022-01-25 05:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1220/1251] eta 0:01:08 lr 0.000117 time 2.2767 (2.2046) loss 3.8160 (3.1584) grad_norm 2.2120 (2.4470) [2022-01-25 05:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1230/1251] eta 0:00:46 lr 0.000117 time 1.6680 (2.2035) loss 2.3143 (3.1567) grad_norm 2.3360 (2.4464) [2022-01-25 05:47:20 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [235/300][1240/1251] eta 0:00:24 lr 0.000117 time 1.7428 (2.2016) loss 3.6903 (3.1550) grad_norm 3.1517 (2.4466) [2022-01-25 05:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [235/300][1250/1251] eta 0:00:02 lr 0.000117 time 1.1839 (2.1960) loss 3.5083 (3.1557) grad_norm 2.9964 (2.4463) [2022-01-25 05:47:35 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 235 training takes 0:45:47 [2022-01-25 05:47:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.154 (18.154) Loss 0.8595 (0.8595) Acc@1 79.980 (79.980) Acc@5 95.410 (95.410) [2022-01-25 05:48:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.593 (3.365) Loss 0.7828 (0.8599) Acc@1 82.324 (79.918) Acc@5 95.605 (94.966) [2022-01-25 05:48:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.232 (2.492) Loss 0.8376 (0.8613) Acc@1 80.176 (79.869) Acc@5 95.605 (94.959) [2022-01-25 05:48:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.319 (2.310) Loss 0.8577 (0.8712) Acc@1 78.906 (79.606) Acc@5 95.410 (94.934) [2022-01-25 05:49:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.009 (2.180) Loss 0.8477 (0.8666) Acc@1 80.566 (79.690) Acc@5 94.824 (95.015) [2022-01-25 05:49:11 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.764 Acc@5 95.004 [2022-01-25 05:49:11 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-01-25 05:49:11 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.82% [2022-01-25 05:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][0/1251] eta 7:27:05 lr 0.000117 time 21.4430 (21.4430) loss 3.6914 (3.6914) grad_norm 2.3261 (2.3261) [2022-01-25 05:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][10/1251] eta 1:26:47 lr 0.000117 time 2.2863 (4.1959) loss 3.5459 (3.3956) grad_norm 2.2529 (2.3106) [2022-01-25 05:50:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][20/1251] eta 1:05:21 lr 0.000117 time 1.4721 (3.1853) loss 2.5710 (3.2049) grad_norm 2.3350 (2.3599) [2022-01-25 05:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][30/1251] eta 0:58:30 lr 0.000117 time 1.2640 (2.8750) loss 2.7756 (3.2298) grad_norm 2.5076 (2.3522) [2022-01-25 05:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][40/1251] eta 0:54:58 lr 0.000117 time 3.6634 (2.7238) loss 3.9548 (3.2692) grad_norm 2.2092 (2.3486) [2022-01-25 05:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][50/1251] eta 0:53:06 lr 0.000117 time 2.6113 (2.6532) loss 3.6569 (3.2447) grad_norm 2.7298 (2.3756) [2022-01-25 05:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][60/1251] eta 0:50:57 lr 0.000117 time 1.5416 (2.5670) loss 3.4589 (3.2751) grad_norm 2.4144 (2.3953) [2022-01-25 05:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][70/1251] eta 0:49:38 lr 0.000117 time 2.0967 (2.5217) loss 3.7822 (3.2452) grad_norm 2.3583 (2.3948) [2022-01-25 05:52:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][80/1251] eta 0:48:37 lr 0.000117 time 3.8419 (2.4912) loss 3.6673 (3.2317) grad_norm 2.4252 (2.3906) [2022-01-25 05:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][90/1251] eta 0:47:47 lr 0.000117 time 3.2448 (2.4702) loss 2.6763 (3.2150) grad_norm 2.4018 (2.3905) [2022-01-25 05:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][100/1251] eta 0:46:35 lr 0.000117 time 2.1139 (2.4288) loss 2.8246 (3.2066) grad_norm 2.3968 (2.3859) [2022-01-25 05:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][110/1251] eta 0:45:41 lr 0.000117 time 1.6291 (2.4028) loss 3.5262 (3.1824) grad_norm 2.2388 (2.3976) [2022-01-25 05:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][120/1251] eta 0:45:14 lr 0.000117 time 
3.5028 (2.4004) loss 2.8373 (3.1907) grad_norm 2.5223 (2.3921) [2022-01-25 05:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][130/1251] eta 0:44:26 lr 0.000117 time 2.5965 (2.3789) loss 3.6262 (3.2049) grad_norm 2.8946 (2.3973) [2022-01-25 05:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][140/1251] eta 0:43:32 lr 0.000117 time 1.8836 (2.3518) loss 2.7812 (3.1969) grad_norm 2.3930 (2.4172) [2022-01-25 05:55:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][150/1251] eta 0:43:06 lr 0.000117 time 2.2876 (2.3497) loss 3.2058 (3.1933) grad_norm 3.2814 (2.4213) [2022-01-25 05:55:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][160/1251] eta 0:42:38 lr 0.000117 time 3.0691 (2.3450) loss 3.3858 (3.1764) grad_norm 2.1515 (2.4168) [2022-01-25 05:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][170/1251] eta 0:42:07 lr 0.000117 time 2.5515 (2.3382) loss 3.3806 (3.1732) grad_norm 2.6658 (2.4219) [2022-01-25 05:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][180/1251] eta 0:41:33 lr 0.000117 time 2.2898 (2.3284) loss 3.4250 (3.1701) grad_norm 2.6583 (2.4224) [2022-01-25 05:56:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][190/1251] eta 0:40:55 lr 0.000117 time 1.9682 (2.3143) loss 3.5813 (3.1661) grad_norm 2.4803 (2.4208) [2022-01-25 05:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][200/1251] eta 0:40:16 lr 0.000117 time 2.4627 (2.2992) loss 3.6547 (3.1720) grad_norm 2.5226 (2.4216) [2022-01-25 05:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][210/1251] eta 0:39:44 lr 0.000117 time 2.1109 (2.2905) loss 3.3524 (3.1788) grad_norm 2.5698 (2.4196) [2022-01-25 05:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][220/1251] eta 0:39:08 lr 0.000117 time 2.0072 (2.2781) loss 3.3888 (3.1654) grad_norm 2.1080 (2.4171) [2022-01-25 05:57:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][230/1251] eta 0:38:36 lr 0.000116 time 1.8935 (2.2690) loss 2.3994 (3.1660) grad_norm 2.4295 (2.4131) [2022-01-25 05:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][240/1251] eta 0:38:10 lr 0.000116 time 2.0757 (2.2658) loss 2.9524 (3.1662) grad_norm 2.4562 (2.4268) [2022-01-25 05:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][250/1251] eta 0:37:45 lr 0.000116 time 2.6579 (2.2628) loss 3.2019 (3.1485) grad_norm 2.3389 (2.4259) [2022-01-25 05:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][260/1251] eta 0:37:20 lr 0.000116 time 2.1415 (2.2610) loss 3.1063 (3.1380) grad_norm 2.1799 (2.4246) [2022-01-25 05:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][270/1251] eta 0:37:00 lr 0.000116 time 2.3627 (2.2635) loss 2.0604 (3.1366) grad_norm 2.4152 (2.4294) [2022-01-25 05:59:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][280/1251] eta 0:36:42 lr 0.000116 time 2.0765 (2.2681) loss 2.3640 (3.1341) grad_norm 2.1971 (2.4282) [2022-01-25 06:00:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][290/1251] eta 0:36:21 lr 0.000116 time 3.1893 (2.2698) loss 3.4265 (3.1403) grad_norm 2.2397 (2.4282) [2022-01-25 06:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][300/1251] eta 0:35:54 lr 0.000116 time 2.0470 (2.2656) loss 3.6254 (3.1516) grad_norm 2.0834 (2.4295) [2022-01-25 06:00:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][310/1251] eta 0:35:27 lr 0.000116 time 1.7128 (2.2612) loss 3.1966 (3.1561) grad_norm 2.2386 (2.4300) [2022-01-25 06:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][320/1251] eta 0:34:58 lr 0.000116 time 1.6277 (2.2543) loss 3.4991 (3.1610) grad_norm 2.6078 (2.4341) [2022-01-25 06:01:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][330/1251] eta 0:34:41 lr 
0.000116 time 3.6671 (2.2604) loss 2.2851 (3.1621) grad_norm 2.2542 (2.4331) [2022-01-25 06:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][340/1251] eta 0:34:21 lr 0.000116 time 1.8970 (2.2626) loss 3.3407 (3.1594) grad_norm 2.6464 (2.4332) [2022-01-25 06:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][350/1251] eta 0:34:00 lr 0.000116 time 1.7382 (2.2643) loss 3.1477 (3.1596) grad_norm 2.2141 (2.4302) [2022-01-25 06:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][360/1251] eta 0:33:31 lr 0.000116 time 1.6441 (2.2571) loss 3.6842 (3.1576) grad_norm 2.2884 (2.4298) [2022-01-25 06:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][370/1251] eta 0:33:02 lr 0.000116 time 2.5174 (2.2502) loss 3.7232 (3.1560) grad_norm 2.3548 (2.4251) [2022-01-25 06:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][380/1251] eta 0:32:34 lr 0.000116 time 1.9306 (2.2445) loss 3.2050 (3.1523) grad_norm 2.1713 (2.4288) [2022-01-25 06:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][390/1251] eta 0:32:08 lr 0.000116 time 1.5687 (2.2394) loss 2.6256 (3.1578) grad_norm 2.3815 (2.4286) [2022-01-25 06:04:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][400/1251] eta 0:31:43 lr 0.000116 time 2.1194 (2.2366) loss 3.2851 (3.1569) grad_norm 2.6911 (2.4279) [2022-01-25 06:04:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][410/1251] eta 0:31:18 lr 0.000116 time 2.2033 (2.2338) loss 2.6716 (3.1575) grad_norm 2.3037 (2.4276) [2022-01-25 06:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][420/1251] eta 0:30:55 lr 0.000116 time 2.6274 (2.2333) loss 3.3585 (3.1596) grad_norm 2.2298 (2.4289) [2022-01-25 06:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][430/1251] eta 0:30:34 lr 0.000116 time 2.2597 (2.2344) loss 2.3207 (3.1524) grad_norm 2.9200 (2.4307) [2022-01-25 06:05:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][440/1251] eta 0:30:18 lr 0.000116 time 2.1517 (2.2422) loss 3.5438 (3.1582) grad_norm 2.2968 (2.4284) [2022-01-25 06:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][450/1251] eta 0:29:58 lr 0.000116 time 2.5765 (2.2451) loss 3.2198 (3.1597) grad_norm 2.2844 (2.4292) [2022-01-25 06:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][460/1251] eta 0:29:36 lr 0.000116 time 2.2495 (2.2454) loss 3.6462 (3.1574) grad_norm 2.2887 (2.4319) [2022-01-25 06:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][470/1251] eta 0:29:13 lr 0.000116 time 1.5954 (2.2448) loss 2.0472 (3.1568) grad_norm 2.4309 (2.4308) [2022-01-25 06:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][480/1251] eta 0:28:47 lr 0.000116 time 1.9034 (2.2400) loss 3.3429 (3.1551) grad_norm 2.7170 (2.4345) [2022-01-25 06:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][490/1251] eta 0:28:20 lr 0.000116 time 1.8025 (2.2348) loss 3.3165 (3.1571) grad_norm 2.5351 (2.4361) [2022-01-25 06:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][500/1251] eta 0:27:56 lr 0.000116 time 2.3238 (2.2319) loss 2.2254 (3.1537) grad_norm 2.1634 (2.4364) [2022-01-25 06:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][510/1251] eta 0:27:33 lr 0.000116 time 3.1396 (2.2317) loss 2.6697 (3.1511) grad_norm 2.4536 (2.4347) [2022-01-25 06:08:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][520/1251] eta 0:27:11 lr 0.000116 time 1.8650 (2.2315) loss 3.4613 (3.1500) grad_norm 2.4806 (2.4357) [2022-01-25 06:08:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][530/1251] eta 0:26:50 lr 0.000116 time 1.9618 (2.2336) loss 3.6492 (3.1494) grad_norm 2.3959 (2.4338) [2022-01-25 06:09:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][540/1251] eta 0:26:28 lr 
0.000116 time 2.5726 (2.2339) loss 3.8472 (3.1546) grad_norm 2.5318 (2.4398) [2022-01-25 06:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][550/1251] eta 0:26:07 lr 0.000116 time 2.2371 (2.2355) loss 3.2695 (3.1548) grad_norm 2.2015 (2.4406) [2022-01-25 06:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][560/1251] eta 0:25:42 lr 0.000116 time 2.8013 (2.2317) loss 3.0706 (3.1498) grad_norm 2.2184 (2.4417) [2022-01-25 06:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][570/1251] eta 0:25:18 lr 0.000116 time 2.1428 (2.2301) loss 3.1530 (3.1495) grad_norm 2.4968 (2.4418) [2022-01-25 06:10:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][580/1251] eta 0:24:53 lr 0.000116 time 1.9025 (2.2258) loss 3.5982 (3.1469) grad_norm 2.8583 (2.4452) [2022-01-25 06:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][590/1251] eta 0:24:29 lr 0.000116 time 2.2021 (2.2239) loss 3.5715 (3.1494) grad_norm 2.2572 (2.4479) [2022-01-25 06:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][600/1251] eta 0:24:07 lr 0.000116 time 2.4299 (2.2240) loss 3.3496 (3.1489) grad_norm 2.9086 (2.4494) [2022-01-25 06:11:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][610/1251] eta 0:23:47 lr 0.000116 time 2.3177 (2.2267) loss 3.7518 (3.1483) grad_norm 2.3947 (2.4473) [2022-01-25 06:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][620/1251] eta 0:23:23 lr 0.000115 time 1.6040 (2.2236) loss 3.5902 (3.1506) grad_norm 2.3926 (2.4460) [2022-01-25 06:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][630/1251] eta 0:22:59 lr 0.000115 time 1.9292 (2.2216) loss 2.5959 (3.1507) grad_norm 2.4748 (2.4460) [2022-01-25 06:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][640/1251] eta 0:22:39 lr 0.000115 time 2.7678 (2.2243) loss 2.0915 (3.1487) grad_norm 2.1129 (2.4424) [2022-01-25 06:13:20 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][650/1251] eta 0:22:17 lr 0.000115 time 2.8434 (2.2248) loss 3.5438 (3.1483) grad_norm 2.2858 (2.4405) [2022-01-25 06:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][660/1251] eta 0:21:53 lr 0.000115 time 1.8615 (2.2233) loss 3.8965 (3.1503) grad_norm 2.5059 (2.4411) [2022-01-25 06:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][670/1251] eta 0:21:30 lr 0.000115 time 1.6208 (2.2206) loss 3.4546 (3.1458) grad_norm 2.7629 (2.4401) [2022-01-25 06:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][680/1251] eta 0:21:06 lr 0.000115 time 2.4723 (2.2189) loss 3.6135 (3.1414) grad_norm 2.3711 (2.4412) [2022-01-25 06:14:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][690/1251] eta 0:20:45 lr 0.000115 time 3.8167 (2.2203) loss 3.3592 (3.1436) grad_norm 2.3613 (2.4407) [2022-01-25 06:15:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][700/1251] eta 0:20:23 lr 0.000115 time 1.9569 (2.2207) loss 3.2981 (3.1424) grad_norm 2.6629 (2.4374) [2022-01-25 06:15:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][710/1251] eta 0:20:00 lr 0.000115 time 1.6699 (2.2194) loss 3.0433 (3.1428) grad_norm 2.2164 (2.4366) [2022-01-25 06:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][720/1251] eta 0:19:38 lr 0.000115 time 2.7027 (2.2196) loss 3.4293 (3.1435) grad_norm 2.2607 (2.4355) [2022-01-25 06:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][730/1251] eta 0:19:14 lr 0.000115 time 2.5507 (2.2166) loss 3.2917 (3.1472) grad_norm 2.4161 (2.4361) [2022-01-25 06:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][740/1251] eta 0:18:52 lr 0.000115 time 2.0415 (2.2163) loss 2.6130 (3.1465) grad_norm 2.2907 (2.4381) [2022-01-25 06:16:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][750/1251] eta 0:18:29 lr 
0.000115 time 1.6708 (2.2142) loss 3.3197 (3.1458) grad_norm 2.3448 (2.4374) [2022-01-25 06:17:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][760/1251] eta 0:18:07 lr 0.000115 time 2.7762 (2.2154) loss 3.2738 (3.1465) grad_norm 2.4106 (2.4374) [2022-01-25 06:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][770/1251] eta 0:17:45 lr 0.000115 time 1.8624 (2.2151) loss 2.2040 (3.1453) grad_norm 2.6986 (2.4382) [2022-01-25 06:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][780/1251] eta 0:17:23 lr 0.000115 time 1.9416 (2.2146) loss 2.5893 (3.1448) grad_norm 2.4333 (2.4374) [2022-01-25 06:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][790/1251] eta 0:17:00 lr 0.000115 time 2.6720 (2.2139) loss 3.2029 (3.1465) grad_norm 2.5672 (2.4375) [2022-01-25 06:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][800/1251] eta 0:16:37 lr 0.000115 time 2.3206 (2.2127) loss 3.1254 (3.1444) grad_norm 2.1813 (2.4357) [2022-01-25 06:19:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][810/1251] eta 0:16:14 lr 0.000115 time 1.9253 (2.2102) loss 2.4848 (3.1441) grad_norm 2.2828 (2.4368) [2022-01-25 06:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][820/1251] eta 0:15:51 lr 0.000115 time 1.8559 (2.2083) loss 2.7359 (3.1432) grad_norm 2.6245 (2.4376) [2022-01-25 06:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][830/1251] eta 0:15:29 lr 0.000115 time 2.6785 (2.2072) loss 2.4547 (3.1430) grad_norm 2.8852 (2.4382) [2022-01-25 06:20:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][840/1251] eta 0:15:07 lr 0.000115 time 2.2383 (2.2071) loss 2.5814 (3.1412) grad_norm 2.0504 (2.4366) [2022-01-25 06:20:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][850/1251] eta 0:14:45 lr 0.000115 time 2.2236 (2.2074) loss 3.8506 (3.1429) grad_norm 2.5970 (2.4360) [2022-01-25 06:20:54 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][860/1251] eta 0:14:23 lr 0.000115 time 2.5277 (2.2095) loss 2.6773 (3.1413) grad_norm 2.1105 (2.4369) [2022-01-25 06:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][870/1251] eta 0:14:02 lr 0.000115 time 2.6302 (2.2101) loss 2.9705 (3.1416) grad_norm 2.2597 (2.4358) [2022-01-25 06:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][880/1251] eta 0:13:40 lr 0.000115 time 2.4530 (2.2102) loss 3.8196 (3.1416) grad_norm 2.4044 (2.4365) [2022-01-25 06:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][890/1251] eta 0:13:18 lr 0.000115 time 1.8415 (2.2107) loss 3.3362 (3.1419) grad_norm 2.7000 (2.4368) [2022-01-25 06:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][900/1251] eta 0:12:56 lr 0.000115 time 3.1080 (2.2123) loss 3.0003 (3.1408) grad_norm 2.9376 (2.4387) [2022-01-25 06:22:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][910/1251] eta 0:12:34 lr 0.000115 time 3.1546 (2.2139) loss 3.0347 (3.1416) grad_norm 2.3420 (2.4395) [2022-01-25 06:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][920/1251] eta 0:12:12 lr 0.000115 time 2.2739 (2.2129) loss 2.4200 (3.1427) grad_norm 2.1046 (2.4388) [2022-01-25 06:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][930/1251] eta 0:11:49 lr 0.000115 time 1.7191 (2.2096) loss 3.4684 (3.1434) grad_norm 2.3874 (2.4389) [2022-01-25 06:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][940/1251] eta 0:11:26 lr 0.000115 time 2.6584 (2.2078) loss 3.1759 (3.1429) grad_norm 2.5688 (2.4378) [2022-01-25 06:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][950/1251] eta 0:11:04 lr 0.000115 time 2.7959 (2.2082) loss 3.7984 (3.1447) grad_norm 2.4469 (2.4395) [2022-01-25 06:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][960/1251] eta 0:10:42 lr 
0.000115 time 2.2327 (2.2086) loss 3.2843 (3.1447) grad_norm 2.1699 (2.4400) [2022-01-25 06:24:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][970/1251] eta 0:10:20 lr 0.000115 time 1.9867 (2.2094) loss 3.1464 (3.1434) grad_norm 2.3010 (2.4398) [2022-01-25 06:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][980/1251] eta 0:09:58 lr 0.000115 time 2.9018 (2.2097) loss 3.1673 (3.1424) grad_norm 2.5482 (2.4399) [2022-01-25 06:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][990/1251] eta 0:09:36 lr 0.000115 time 2.9419 (2.2099) loss 3.4955 (3.1442) grad_norm 2.3922 (2.4400) [2022-01-25 06:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1000/1251] eta 0:09:14 lr 0.000115 time 1.6418 (2.2082) loss 3.6151 (3.1435) grad_norm 2.9677 (2.4399) [2022-01-25 06:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1010/1251] eta 0:08:51 lr 0.000114 time 2.6464 (2.2073) loss 2.4418 (3.1434) grad_norm 2.4190 (2.4393) [2022-01-25 06:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1020/1251] eta 0:08:29 lr 0.000114 time 1.8816 (2.2066) loss 3.4347 (3.1459) grad_norm 2.5876 (2.4385) [2022-01-25 06:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1030/1251] eta 0:08:07 lr 0.000114 time 3.6309 (2.2078) loss 3.3432 (3.1451) grad_norm 3.6013 (2.4398) [2022-01-25 06:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1040/1251] eta 0:07:45 lr 0.000114 time 1.8099 (2.2071) loss 2.8673 (3.1447) grad_norm 2.5256 (2.4400) [2022-01-25 06:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1050/1251] eta 0:07:23 lr 0.000114 time 1.4772 (2.2078) loss 3.6368 (3.1454) grad_norm 2.2712 (2.4404) [2022-01-25 06:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1060/1251] eta 0:07:01 lr 0.000114 time 2.0681 (2.2083) loss 2.8177 (3.1436) grad_norm 2.3489 (2.4400) [2022-01-25 
06:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1070/1251] eta 0:06:40 lr 0.000114 time 3.6714 (2.2103) loss 3.0250 (3.1430) grad_norm 2.2563 (2.4388) [2022-01-25 06:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1080/1251] eta 0:06:17 lr 0.000114 time 1.6057 (2.2081) loss 3.3713 (3.1445) grad_norm 2.6550 (2.4389) [2022-01-25 06:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1090/1251] eta 0:05:55 lr 0.000114 time 1.7331 (2.2050) loss 3.6422 (3.1464) grad_norm 2.3057 (2.4380) [2022-01-25 06:29:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1100/1251] eta 0:05:32 lr 0.000114 time 1.9744 (2.2047) loss 3.0217 (3.1470) grad_norm 2.5041 (2.4372) [2022-01-25 06:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1110/1251] eta 0:05:10 lr 0.000114 time 2.0494 (2.2043) loss 2.5204 (3.1471) grad_norm 2.3532 (2.4364) [2022-01-25 06:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1120/1251] eta 0:04:48 lr 0.000114 time 1.9157 (2.2035) loss 3.3696 (3.1483) grad_norm 2.1575 (2.4358) [2022-01-25 06:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1130/1251] eta 0:04:26 lr 0.000114 time 1.9514 (2.2033) loss 3.5032 (3.1465) grad_norm 1.9566 (2.4355) [2022-01-25 06:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1140/1251] eta 0:04:04 lr 0.000114 time 2.2469 (2.2039) loss 3.4323 (3.1475) grad_norm 2.0997 (2.4341) [2022-01-25 06:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1150/1251] eta 0:03:42 lr 0.000114 time 3.2132 (2.2052) loss 3.1723 (3.1493) grad_norm 2.2922 (2.4370) [2022-01-25 06:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1160/1251] eta 0:03:20 lr 0.000114 time 1.9880 (2.2054) loss 3.3309 (3.1495) grad_norm 2.2898 (2.4365) [2022-01-25 06:32:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1170/1251] 
eta 0:02:58 lr 0.000114 time 1.9412 (2.2047) loss 3.4584 (3.1501) grad_norm 2.5540 (2.4362) [2022-01-25 06:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1180/1251] eta 0:02:36 lr 0.000114 time 1.8897 (2.2045) loss 2.1868 (3.1501) grad_norm 2.4094 (2.4372) [2022-01-25 06:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1190/1251] eta 0:02:14 lr 0.000114 time 3.5973 (2.2039) loss 3.2356 (3.1504) grad_norm 2.1998 (2.4367) [2022-01-25 06:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1200/1251] eta 0:01:52 lr 0.000114 time 1.8931 (2.2030) loss 3.7313 (3.1506) grad_norm 2.4863 (2.4359) [2022-01-25 06:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1210/1251] eta 0:01:30 lr 0.000114 time 2.2240 (2.2038) loss 3.2249 (3.1528) grad_norm 2.4381 (2.4356) [2022-01-25 06:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1220/1251] eta 0:01:08 lr 0.000114 time 2.2320 (2.2042) loss 3.4210 (3.1517) grad_norm 2.2819 (2.4349) [2022-01-25 06:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1230/1251] eta 0:00:46 lr 0.000114 time 3.7326 (2.2053) loss 3.4405 (3.1536) grad_norm 2.2725 (2.4349) [2022-01-25 06:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1240/1251] eta 0:00:24 lr 0.000114 time 1.3208 (2.2026) loss 3.3329 (3.1554) grad_norm 2.3399 (2.4352) [2022-01-25 06:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [236/300][1250/1251] eta 0:00:02 lr 0.000114 time 1.3517 (2.1967) loss 3.2923 (3.1543) grad_norm 2.1666 (2.4340) [2022-01-25 06:35:00 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 236 training takes 0:45:48 [2022-01-25 06:35:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.675 (18.675) Loss 0.8514 (0.8514) Acc@1 78.906 (78.906) Acc@5 95.410 (95.410) [2022-01-25 06:35:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.321 (3.505) 
Loss 0.8083 (0.8349) Acc@1 79.980 (80.167) Acc@5 95.508 (95.250) [2022-01-25 06:35:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.593 (2.587) Loss 0.8576 (0.8454) Acc@1 79.297 (79.999) Acc@5 94.238 (95.196) [2022-01-25 06:36:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.611 (2.265) Loss 0.8823 (0.8549) Acc@1 80.176 (79.952) Acc@5 94.434 (95.123) [2022-01-25 06:36:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.309 (2.190) Loss 0.8963 (0.8563) Acc@1 79.297 (79.864) Acc@5 95.117 (95.096) [2022-01-25 06:36:37 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.818 Acc@5 95.102 [2022-01-25 06:36:37 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-01-25 06:36:37 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.82% [2022-01-25 06:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][0/1251] eta 7:34:10 lr 0.000114 time 21.7827 (21.7827) loss 2.8402 (2.8402) grad_norm 2.6889 (2.6889) [2022-01-25 06:37:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][10/1251] eta 1:25:09 lr 0.000114 time 2.7250 (4.1169) loss 3.5471 (3.3300) grad_norm 2.5100 (2.5043) [2022-01-25 06:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][20/1251] eta 1:06:45 lr 0.000114 time 2.3520 (3.2541) loss 2.6081 (3.1329) grad_norm 2.6271 (2.4961) [2022-01-25 06:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][30/1251] eta 1:00:26 lr 0.000114 time 1.9338 (2.9704) loss 2.9925 (3.1205) grad_norm 2.3114 (2.4840) [2022-01-25 06:38:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][40/1251] eta 0:57:20 lr 0.000114 time 3.5608 (2.8408) loss 3.5548 (3.1443) grad_norm 2.0109 (2.4531) [2022-01-25 06:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][50/1251] eta 0:55:02 lr 0.000114 time 2.0429 (2.7499) loss 3.4941 (3.1417) 
grad_norm 2.2164 (2.4309) [2022-01-25 06:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][60/1251] eta 0:52:08 lr 0.000114 time 1.5879 (2.6264) loss 3.3315 (3.1047) grad_norm 2.1446 (2.4337) [2022-01-25 06:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][70/1251] eta 0:49:57 lr 0.000114 time 1.9431 (2.5382) loss 2.9476 (3.0978) grad_norm 2.3296 (2.4157) [2022-01-25 06:39:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][80/1251] eta 0:48:01 lr 0.000114 time 1.9360 (2.4608) loss 2.7590 (3.0708) grad_norm 2.3060 (2.4069) [2022-01-25 06:40:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][90/1251] eta 0:47:11 lr 0.000114 time 1.9613 (2.4389) loss 3.3379 (3.0777) grad_norm 2.2996 (2.4004) [2022-01-25 06:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][100/1251] eta 0:46:25 lr 0.000114 time 2.2987 (2.4201) loss 2.5788 (3.0658) grad_norm 2.2990 (2.3999) [2022-01-25 06:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][110/1251] eta 0:45:34 lr 0.000114 time 2.2646 (2.3962) loss 2.0673 (3.0561) grad_norm 2.3530 (2.4025) [2022-01-25 06:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][120/1251] eta 0:44:58 lr 0.000114 time 2.6733 (2.3858) loss 3.5138 (3.0609) grad_norm 2.8944 (2.4027) [2022-01-25 06:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][130/1251] eta 0:44:14 lr 0.000114 time 1.5519 (2.3679) loss 2.1258 (3.0309) grad_norm 2.8809 (2.3975) [2022-01-25 06:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][140/1251] eta 0:43:39 lr 0.000114 time 2.6285 (2.3577) loss 2.7085 (3.0229) grad_norm 2.8948 (2.4108) [2022-01-25 06:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][150/1251] eta 0:43:11 lr 0.000113 time 2.1081 (2.3535) loss 3.3780 (3.0389) grad_norm 2.8549 (2.4276) [2022-01-25 06:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[237/300][160/1251] eta 0:42:37 lr 0.000113 time 1.9208 (2.3444) loss 3.3881 (3.0530) grad_norm 2.1478 (2.4312) [2022-01-25 06:43:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][170/1251] eta 0:41:56 lr 0.000113 time 1.9953 (2.3279) loss 2.6488 (3.0588) grad_norm 2.8234 (2.4337) [2022-01-25 06:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][180/1251] eta 0:41:18 lr 0.000113 time 2.2404 (2.3138) loss 3.1572 (3.0750) grad_norm 2.6011 (2.4359) [2022-01-25 06:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][190/1251] eta 0:40:46 lr 0.000113 time 1.6652 (2.3061) loss 2.3631 (3.0821) grad_norm 2.4290 (2.4386) [2022-01-25 06:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][200/1251] eta 0:40:25 lr 0.000113 time 2.1621 (2.3082) loss 3.3539 (3.0794) grad_norm 2.2251 (2.4288) [2022-01-25 06:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][210/1251] eta 0:40:01 lr 0.000113 time 2.2556 (2.3065) loss 1.9719 (3.0720) grad_norm 2.3007 (2.4286) [2022-01-25 06:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][220/1251] eta 0:39:30 lr 0.000113 time 1.9408 (2.2992) loss 3.2933 (3.0669) grad_norm 2.1183 (2.4404) [2022-01-25 06:45:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][230/1251] eta 0:38:55 lr 0.000113 time 1.7055 (2.2870) loss 3.7754 (3.0755) grad_norm 2.6517 (2.4375) [2022-01-25 06:45:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][240/1251] eta 0:38:18 lr 0.000113 time 1.8518 (2.2733) loss 3.3407 (3.0825) grad_norm 2.8530 (2.4376) [2022-01-25 06:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][250/1251] eta 0:37:47 lr 0.000113 time 1.8998 (2.2656) loss 3.6012 (3.0892) grad_norm 2.6559 (2.4395) [2022-01-25 06:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][260/1251] eta 0:37:27 lr 0.000113 time 1.7903 (2.2675) loss 3.4781 (3.0990) grad_norm 
2.3135 (2.4421) [2022-01-25 06:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][270/1251] eta 0:37:02 lr 0.000113 time 2.1058 (2.2660) loss 3.5996 (3.0961) grad_norm 2.2703 (2.4403) [2022-01-25 06:47:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][280/1251] eta 0:36:41 lr 0.000113 time 1.6413 (2.2671) loss 3.4419 (3.1047) grad_norm 2.3484 (2.4410) [2022-01-25 06:47:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][290/1251] eta 0:36:15 lr 0.000113 time 1.8781 (2.2636) loss 2.3821 (3.0988) grad_norm 2.6916 (2.4419) [2022-01-25 06:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][300/1251] eta 0:35:54 lr 0.000113 time 1.9524 (2.2655) loss 3.8596 (3.1049) grad_norm 2.6400 (2.4393) [2022-01-25 06:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][310/1251] eta 0:35:30 lr 0.000113 time 1.6628 (2.2646) loss 3.5960 (3.1066) grad_norm 2.2171 (2.4393) [2022-01-25 06:48:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][320/1251] eta 0:35:03 lr 0.000113 time 1.8741 (2.2598) loss 2.6940 (3.1085) grad_norm 2.6125 (2.4377) [2022-01-25 06:49:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][330/1251] eta 0:34:36 lr 0.000113 time 1.8631 (2.2544) loss 2.5271 (3.1064) grad_norm 2.3983 (2.4385) [2022-01-25 06:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][340/1251] eta 0:34:10 lr 0.000113 time 2.0110 (2.2507) loss 2.8278 (3.1036) grad_norm 2.3685 (2.4409) [2022-01-25 06:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][350/1251] eta 0:33:43 lr 0.000113 time 1.9255 (2.2458) loss 2.9739 (3.1027) grad_norm 2.6451 (2.4406) [2022-01-25 06:50:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][360/1251] eta 0:33:15 lr 0.000113 time 1.7784 (2.2398) loss 3.6831 (3.1106) grad_norm 2.5478 (2.4423) [2022-01-25 06:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[237/300][370/1251] eta 0:32:54 lr 0.000113 time 2.4793 (2.2416) loss 3.1972 (3.1127) grad_norm 3.5599 (2.4478) [2022-01-25 06:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][380/1251] eta 0:32:35 lr 0.000113 time 3.4756 (2.2449) loss 2.9270 (3.1130) grad_norm 2.1379 (2.4480) [2022-01-25 06:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][390/1251] eta 0:32:14 lr 0.000113 time 1.7749 (2.2464) loss 2.5624 (3.1125) grad_norm 2.3846 (2.4485) [2022-01-25 06:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][400/1251] eta 0:31:51 lr 0.000113 time 2.2414 (2.2460) loss 3.0269 (3.1158) grad_norm 2.2840 (2.4554) [2022-01-25 06:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][410/1251] eta 0:31:28 lr 0.000113 time 2.5323 (2.2453) loss 3.1809 (3.1160) grad_norm 2.7868 (2.4540) [2022-01-25 06:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][420/1251] eta 0:31:00 lr 0.000113 time 1.9500 (2.2394) loss 2.9045 (3.1203) grad_norm 2.3632 (2.4513) [2022-01-25 06:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][430/1251] eta 0:30:35 lr 0.000113 time 1.8750 (2.2357) loss 2.8194 (3.1209) grad_norm 2.4512 (2.4488) [2022-01-25 06:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][440/1251] eta 0:30:11 lr 0.000113 time 1.7984 (2.2331) loss 3.5390 (3.1198) grad_norm 2.6032 (2.4511) [2022-01-25 06:53:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][450/1251] eta 0:29:47 lr 0.000113 time 2.0910 (2.2312) loss 3.5698 (3.1243) grad_norm 2.3209 (2.4528) [2022-01-25 06:53:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][460/1251] eta 0:29:24 lr 0.000113 time 2.8128 (2.2306) loss 2.2157 (3.1218) grad_norm 2.3639 (2.4510) [2022-01-25 06:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][470/1251] eta 0:29:01 lr 0.000113 time 2.6470 (2.2303) loss 3.1141 (3.1202) grad_norm 
2.1052 (2.4503) [2022-01-25 06:54:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][480/1251] eta 0:28:41 lr 0.000113 time 2.2181 (2.2322) loss 3.1318 (3.1209) grad_norm 2.2034 (2.4490) [2022-01-25 06:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][490/1251] eta 0:28:19 lr 0.000113 time 2.0573 (2.2330) loss 2.3635 (3.1220) grad_norm 2.1861 (2.4479) [2022-01-25 06:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][500/1251] eta 0:27:56 lr 0.000113 time 2.4904 (2.2321) loss 3.0635 (3.1206) grad_norm 2.3141 (2.4510) [2022-01-25 06:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][510/1251] eta 0:27:36 lr 0.000113 time 3.2473 (2.2349) loss 2.3460 (3.1196) grad_norm 2.4380 (2.4551) [2022-01-25 06:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][520/1251] eta 0:27:10 lr 0.000113 time 1.7211 (2.2311) loss 3.5470 (3.1221) grad_norm 3.2367 (2.4589) [2022-01-25 06:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][530/1251] eta 0:26:45 lr 0.000113 time 1.8358 (2.2262) loss 2.1295 (3.1222) grad_norm 2.4938 (2.4594) [2022-01-25 06:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][540/1251] eta 0:26:21 lr 0.000113 time 2.1864 (2.2246) loss 3.4396 (3.1232) grad_norm 2.1013 (2.4634) [2022-01-25 06:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][550/1251] eta 0:25:59 lr 0.000112 time 3.1889 (2.2246) loss 2.6964 (3.1211) grad_norm 2.2118 (2.4623) [2022-01-25 06:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][560/1251] eta 0:25:35 lr 0.000112 time 2.1436 (2.2224) loss 2.8751 (3.1215) grad_norm 2.5031 (2.4615) [2022-01-25 06:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][570/1251] eta 0:25:13 lr 0.000112 time 1.9357 (2.2218) loss 3.2306 (3.1227) grad_norm 2.5652 (2.4628) [2022-01-25 06:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[237/300][580/1251] eta 0:24:51 lr 0.000112 time 3.1770 (2.2227) loss 3.0873 (3.1266) grad_norm 2.5884 (2.4650) [2022-01-25 06:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][590/1251] eta 0:24:28 lr 0.000112 time 2.3702 (2.2221) loss 3.7102 (3.1304) grad_norm 2.5472 (2.4661) [2022-01-25 06:58:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][600/1251] eta 0:24:05 lr 0.000112 time 2.1902 (2.2212) loss 3.6671 (3.1325) grad_norm 2.2747 (2.4662) [2022-01-25 06:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][610/1251] eta 0:23:43 lr 0.000112 time 2.8178 (2.2208) loss 2.5481 (3.1316) grad_norm 2.4990 (2.4670) [2022-01-25 06:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][620/1251] eta 0:23:18 lr 0.000112 time 1.5789 (2.2166) loss 3.1934 (3.1371) grad_norm 2.3382 (2.4652) [2022-01-25 06:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][630/1251] eta 0:22:56 lr 0.000112 time 2.2665 (2.2161) loss 3.6274 (3.1361) grad_norm 2.8988 (2.4665) [2022-01-25 07:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][640/1251] eta 0:22:32 lr 0.000112 time 2.1692 (2.2138) loss 2.1817 (3.1349) grad_norm 2.5035 (2.4644) [2022-01-25 07:00:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][650/1251] eta 0:22:10 lr 0.000112 time 2.9103 (2.2142) loss 3.2513 (3.1326) grad_norm 2.5059 (2.4620) [2022-01-25 07:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][660/1251] eta 0:21:48 lr 0.000112 time 1.8498 (2.2148) loss 2.6459 (3.1331) grad_norm 2.8311 (2.4612) [2022-01-25 07:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][670/1251] eta 0:21:26 lr 0.000112 time 2.3950 (2.2143) loss 3.2282 (3.1324) grad_norm 2.5961 (2.4623) [2022-01-25 07:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][680/1251] eta 0:21:04 lr 0.000112 time 2.2726 (2.2139) loss 2.6286 (3.1320) grad_norm 
2.7223 (2.4626) [2022-01-25 07:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][690/1251] eta 0:20:42 lr 0.000112 time 1.6700 (2.2146) loss 3.2561 (3.1294) grad_norm 2.3285 (2.4642) [2022-01-25 07:02:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][700/1251] eta 0:20:20 lr 0.000112 time 1.6654 (2.2154) loss 3.6608 (3.1304) grad_norm 2.5325 (2.4629) [2022-01-25 07:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][710/1251] eta 0:19:58 lr 0.000112 time 3.1459 (2.2159) loss 2.1290 (3.1307) grad_norm 2.5769 (2.4671) [2022-01-25 07:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][720/1251] eta 0:19:36 lr 0.000112 time 2.4188 (2.2149) loss 1.9125 (3.1283) grad_norm 2.1422 (2.4671) [2022-01-25 07:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][730/1251] eta 0:19:13 lr 0.000112 time 1.5864 (2.2147) loss 2.4788 (3.1296) grad_norm 2.6177 (2.4680) [2022-01-25 07:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][740/1251] eta 0:18:52 lr 0.000112 time 2.0717 (2.2161) loss 3.8686 (3.1274) grad_norm 2.5352 (2.4687) [2022-01-25 07:04:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][750/1251] eta 0:18:30 lr 0.000112 time 2.8116 (2.2171) loss 3.3045 (3.1279) grad_norm 2.3012 (2.4698) [2022-01-25 07:04:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][760/1251] eta 0:18:07 lr 0.000112 time 2.1154 (2.2148) loss 3.9515 (3.1265) grad_norm 2.3922 (2.4678) [2022-01-25 07:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][770/1251] eta 0:17:45 lr 0.000112 time 1.7935 (2.2145) loss 3.4453 (3.1289) grad_norm 2.3183 (2.4672) [2022-01-25 07:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][780/1251] eta 0:17:22 lr 0.000112 time 1.7881 (2.2140) loss 3.0239 (3.1292) grad_norm 2.1490 (2.4681) [2022-01-25 07:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[237/300][790/1251] eta 0:17:00 lr 0.000112 time 2.2488 (2.2139) loss 2.2552 (3.1298) grad_norm 2.1178 (2.4677) [2022-01-25 07:06:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][800/1251] eta 0:16:38 lr 0.000112 time 2.8203 (2.2149) loss 3.6425 (3.1317) grad_norm 2.6369 (2.4679) [2022-01-25 07:06:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][810/1251] eta 0:16:16 lr 0.000112 time 1.5709 (2.2145) loss 3.4493 (3.1337) grad_norm 2.5447 (2.4685) [2022-01-25 07:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][820/1251] eta 0:15:53 lr 0.000112 time 1.9159 (2.2122) loss 3.7591 (3.1317) grad_norm 3.0828 (2.4694) [2022-01-25 07:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][830/1251] eta 0:15:30 lr 0.000112 time 1.5743 (2.2105) loss 3.5831 (3.1319) grad_norm 2.2226 (2.4692) [2022-01-25 07:07:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][840/1251] eta 0:15:08 lr 0.000112 time 2.3327 (2.2107) loss 2.3237 (3.1294) grad_norm 2.9877 (2.4702) [2022-01-25 07:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][850/1251] eta 0:14:46 lr 0.000112 time 1.9065 (2.2109) loss 3.5124 (3.1296) grad_norm 2.4970 (2.4682) [2022-01-25 07:08:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][860/1251] eta 0:14:24 lr 0.000112 time 2.2361 (2.2120) loss 3.6221 (3.1319) grad_norm 3.0555 (2.4716) [2022-01-25 07:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][870/1251] eta 0:14:03 lr 0.000112 time 1.7842 (2.2131) loss 2.4633 (3.1324) grad_norm 2.2036 (2.4700) [2022-01-25 07:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][880/1251] eta 0:13:42 lr 0.000112 time 2.5782 (2.2163) loss 3.9955 (3.1321) grad_norm 2.5120 (2.4704) [2022-01-25 07:09:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][890/1251] eta 0:13:19 lr 0.000112 time 1.9155 (2.2145) loss 2.3562 (3.1327) grad_norm 
2.0164 (2.4701) [2022-01-25 07:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][900/1251] eta 0:12:56 lr 0.000112 time 1.8488 (2.2110) loss 2.0696 (3.1330) grad_norm 2.1620 (2.4687) [2022-01-25 07:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][910/1251] eta 0:12:33 lr 0.000112 time 2.3109 (2.2090) loss 3.2284 (3.1336) grad_norm 2.3210 (2.4665) [2022-01-25 07:10:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][920/1251] eta 0:12:10 lr 0.000112 time 1.8986 (2.2077) loss 3.4641 (3.1326) grad_norm 2.4253 (2.4651) [2022-01-25 07:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][930/1251] eta 0:11:48 lr 0.000112 time 2.2785 (2.2064) loss 3.8335 (3.1327) grad_norm 2.5475 (2.4638) [2022-01-25 07:11:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][940/1251] eta 0:11:26 lr 0.000111 time 1.9083 (2.2061) loss 3.0090 (3.1307) grad_norm 2.2860 (2.4745) [2022-01-25 07:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][950/1251] eta 0:11:04 lr 0.000111 time 2.4661 (2.2060) loss 3.7606 (3.1349) grad_norm 2.3621 (2.4742) [2022-01-25 07:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][960/1251] eta 0:10:42 lr 0.000111 time 1.8504 (2.2069) loss 2.8915 (3.1345) grad_norm 2.1622 (2.4740) [2022-01-25 07:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][970/1251] eta 0:10:20 lr 0.000111 time 1.6437 (2.2069) loss 3.7593 (3.1334) grad_norm 3.4222 (2.4740) [2022-01-25 07:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][980/1251] eta 0:09:58 lr 0.000111 time 2.5122 (2.2077) loss 3.6189 (3.1338) grad_norm 2.2324 (2.4721) [2022-01-25 07:13:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][990/1251] eta 0:09:36 lr 0.000111 time 3.0335 (2.2098) loss 3.7050 (3.1361) grad_norm 2.5680 (2.4717) [2022-01-25 07:13:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[237/300][1000/1251] eta 0:09:14 lr 0.000111 time 1.5018 (2.2102) loss 2.9765 (3.1359) grad_norm 1.9743 (2.4702) [2022-01-25 07:13:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1010/1251] eta 0:08:52 lr 0.000111 time 1.9215 (2.2103) loss 3.0327 (3.1348) grad_norm 2.1981 (2.4683) [2022-01-25 07:14:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1020/1251] eta 0:08:30 lr 0.000111 time 2.8462 (2.2101) loss 3.4551 (3.1361) grad_norm 2.3953 (2.4670) [2022-01-25 07:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1030/1251] eta 0:08:08 lr 0.000111 time 1.8869 (2.2097) loss 2.5160 (3.1332) grad_norm 2.5775 (2.4672) [2022-01-25 07:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1040/1251] eta 0:07:46 lr 0.000111 time 1.5869 (2.2123) loss 3.3747 (3.1349) grad_norm 3.1440 (2.4663) [2022-01-25 07:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1050/1251] eta 0:07:24 lr 0.000111 time 1.7571 (2.2118) loss 2.8282 (3.1361) grad_norm 2.1441 (2.4647) [2022-01-25 07:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1060/1251] eta 0:07:02 lr 0.000111 time 2.1635 (2.2105) loss 2.4814 (3.1340) grad_norm 2.3983 (2.4636) [2022-01-25 07:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1070/1251] eta 0:06:39 lr 0.000111 time 1.9039 (2.2083) loss 1.9545 (3.1349) grad_norm 2.0874 (2.4623) [2022-01-25 07:16:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1080/1251] eta 0:06:17 lr 0.000111 time 1.8194 (2.2075) loss 3.3184 (3.1318) grad_norm 2.5123 (2.4621) [2022-01-25 07:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1090/1251] eta 0:05:55 lr 0.000111 time 2.4576 (2.2098) loss 3.4345 (3.1326) grad_norm 2.7825 (2.4620) [2022-01-25 07:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1100/1251] eta 0:05:33 lr 0.000111 time 1.9976 (2.2102) loss 3.0036 (3.1334) 
grad_norm 2.1739 (2.4624) [2022-01-25 07:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1110/1251] eta 0:05:11 lr 0.000111 time 2.6341 (2.2112) loss 3.5265 (3.1346) grad_norm 2.1467 (2.4623) [2022-01-25 07:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1120/1251] eta 0:04:49 lr 0.000111 time 1.8271 (2.2101) loss 3.5093 (3.1350) grad_norm 2.7659 (2.4626) [2022-01-25 07:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1130/1251] eta 0:04:27 lr 0.000111 time 2.2349 (2.2097) loss 2.5516 (3.1354) grad_norm 2.1512 (2.4619) [2022-01-25 07:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1140/1251] eta 0:04:05 lr 0.000111 time 1.5991 (2.2074) loss 3.1670 (3.1356) grad_norm 2.3034 (2.4630) [2022-01-25 07:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1150/1251] eta 0:03:42 lr 0.000111 time 2.2153 (2.2075) loss 2.3985 (3.1344) grad_norm 2.5705 (2.4632) [2022-01-25 07:19:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1160/1251] eta 0:03:20 lr 0.000111 time 1.9041 (2.2077) loss 3.1339 (3.1336) grad_norm 2.1481 (2.4632) [2022-01-25 07:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1170/1251] eta 0:02:58 lr 0.000111 time 2.9001 (2.2095) loss 3.3363 (3.1343) grad_norm 2.3590 (2.4626) [2022-01-25 07:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1180/1251] eta 0:02:36 lr 0.000111 time 2.3949 (2.2100) loss 3.5452 (3.1347) grad_norm 2.3223 (2.4620) [2022-01-25 07:20:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1190/1251] eta 0:02:14 lr 0.000111 time 1.8979 (2.2094) loss 3.3002 (3.1350) grad_norm 2.3814 (2.4622) [2022-01-25 07:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1200/1251] eta 0:01:52 lr 0.000111 time 1.8886 (2.2083) loss 3.4629 (3.1354) grad_norm 2.1037 (2.4616) [2022-01-25 07:21:12 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [237/300][1210/1251] eta 0:01:30 lr 0.000111 time 2.2058 (2.2086) loss 3.3425 (3.1369) grad_norm 2.2219 (2.4613) [2022-01-25 07:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1220/1251] eta 0:01:08 lr 0.000111 time 1.7866 (2.2089) loss 3.2994 (3.1357) grad_norm 2.6042 (2.4629) [2022-01-25 07:21:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1230/1251] eta 0:00:46 lr 0.000111 time 1.6361 (2.2077) loss 3.3614 (3.1364) grad_norm 2.4023 (2.4622) [2022-01-25 07:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1240/1251] eta 0:00:24 lr 0.000111 time 1.4953 (2.2061) loss 3.4240 (3.1377) grad_norm 2.4158 (2.4618) [2022-01-25 07:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [237/300][1250/1251] eta 0:00:02 lr 0.000111 time 1.2066 (2.2003) loss 3.2838 (3.1382) grad_norm 2.3517 (2.4607) [2022-01-25 07:22:30 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 237 training takes 0:45:52 [2022-01-25 07:22:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.246 (18.246) Loss 0.8670 (0.8670) Acc@1 78.613 (78.613) Acc@5 95.410 (95.410) [2022-01-25 07:23:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.552 (3.566) Loss 0.8543 (0.8529) Acc@1 79.492 (79.750) Acc@5 95.312 (95.091) [2022-01-25 07:23:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.952 (2.744) Loss 0.8141 (0.8586) Acc@1 81.543 (79.669) Acc@5 95.312 (95.066) [2022-01-25 07:23:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.889 (2.296) Loss 0.8438 (0.8632) Acc@1 80.078 (79.700) Acc@5 95.215 (95.010) [2022-01-25 07:24:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.073 (2.198) Loss 0.9800 (0.8650) Acc@1 77.539 (79.597) Acc@5 94.043 (95.022) [2022-01-25 07:24:08 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.768 Acc@5 95.098 [2022-01-25 07:24:08 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 79.8% [2022-01-25 07:24:08 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.82% [2022-01-25 07:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][0/1251] eta 7:36:30 lr 0.000111 time 21.8949 (21.8949) loss 3.4373 (3.4373) grad_norm 2.4441 (2.4441) [2022-01-25 07:24:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][10/1251] eta 1:22:05 lr 0.000111 time 2.5082 (3.9690) loss 2.1429 (3.1066) grad_norm 2.1792 (2.3482) [2022-01-25 07:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][20/1251] eta 1:04:14 lr 0.000111 time 1.4547 (3.1308) loss 2.9025 (3.2156) grad_norm 2.5066 (2.4801) [2022-01-25 07:25:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][30/1251] eta 0:56:59 lr 0.000111 time 1.6495 (2.8007) loss 3.0443 (3.1977) grad_norm 2.5545 (2.4472) [2022-01-25 07:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][40/1251] eta 0:54:01 lr 0.000111 time 3.5953 (2.6763) loss 3.5445 (3.2420) grad_norm 2.3193 (2.4319) [2022-01-25 07:26:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][50/1251] eta 0:52:27 lr 0.000111 time 2.8543 (2.6205) loss 3.8680 (3.2230) grad_norm 2.9045 (2.4233) [2022-01-25 07:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][60/1251] eta 0:50:27 lr 0.000111 time 2.1526 (2.5420) loss 3.3819 (3.2159) grad_norm 2.3844 (2.4362) [2022-01-25 07:27:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][70/1251] eta 0:48:56 lr 0.000111 time 1.8295 (2.4868) loss 3.4661 (3.2159) grad_norm 2.2607 (2.4366) [2022-01-25 07:27:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][80/1251] eta 0:48:04 lr 0.000111 time 3.5848 (2.4637) loss 2.8513 (3.2119) grad_norm 2.2396 (2.4346) [2022-01-25 07:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][90/1251] eta 0:47:11 lr 0.000110 time 
1.8252 (2.4385) loss 3.3485 (3.2000) grad_norm 2.1680 (2.4400) [2022-01-25 07:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][100/1251] eta 0:46:06 lr 0.000110 time 1.7512 (2.4032) loss 3.2676 (3.2032) grad_norm 2.3195 (2.4564) [2022-01-25 07:28:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][110/1251] eta 0:44:58 lr 0.000110 time 1.5438 (2.3655) loss 2.5625 (3.1782) grad_norm 2.6664 (2.4545) [2022-01-25 07:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][120/1251] eta 0:44:43 lr 0.000110 time 2.8641 (2.3730) loss 3.6620 (3.1666) grad_norm 2.8548 (2.4685) [2022-01-25 07:29:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][130/1251] eta 0:44:17 lr 0.000110 time 1.8460 (2.3705) loss 3.2544 (3.1672) grad_norm 3.5742 (2.4751) [2022-01-25 07:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][140/1251] eta 0:43:46 lr 0.000110 time 2.4969 (2.3641) loss 3.3491 (3.1657) grad_norm 2.5659 (2.4666) [2022-01-25 07:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][150/1251] eta 0:43:09 lr 0.000110 time 1.5725 (2.3521) loss 2.2694 (3.1638) grad_norm 2.3716 (2.4653) [2022-01-25 07:30:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][160/1251] eta 0:42:33 lr 0.000110 time 2.8967 (2.3406) loss 2.9668 (3.1809) grad_norm 2.5788 (2.4669) [2022-01-25 07:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][170/1251] eta 0:41:40 lr 0.000110 time 1.8948 (2.3131) loss 3.1870 (3.1767) grad_norm 2.6234 (2.4695) [2022-01-25 07:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][180/1251] eta 0:40:57 lr 0.000110 time 1.7896 (2.2944) loss 3.2149 (3.1826) grad_norm 2.7019 (2.4740) [2022-01-25 07:31:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][190/1251] eta 0:40:22 lr 0.000110 time 1.8736 (2.2829) loss 3.2390 (3.1851) grad_norm 2.3780 (2.4668) [2022-01-25 07:31:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][200/1251] eta 0:40:00 lr 0.000110 time 2.5284 (2.2841) loss 3.3776 (3.1780) grad_norm 2.2667 (2.4617) [2022-01-25 07:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][210/1251] eta 0:39:45 lr 0.000110 time 2.8106 (2.2916) loss 3.4969 (3.1861) grad_norm 2.3205 (2.4661) [2022-01-25 07:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][220/1251] eta 0:39:22 lr 0.000110 time 1.5755 (2.2917) loss 2.9200 (3.1875) grad_norm 2.3591 (2.4650) [2022-01-25 07:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][230/1251] eta 0:38:54 lr 0.000110 time 1.9865 (2.2864) loss 2.8115 (3.1806) grad_norm 2.4475 (2.4574) [2022-01-25 07:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][240/1251] eta 0:38:27 lr 0.000110 time 1.9043 (2.2827) loss 3.2409 (3.1797) grad_norm 2.7564 (2.4684) [2022-01-25 07:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][250/1251] eta 0:37:59 lr 0.000110 time 2.1509 (2.2770) loss 2.2383 (3.1727) grad_norm 2.6408 (2.4791) [2022-01-25 07:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][260/1251] eta 0:37:28 lr 0.000110 time 1.8400 (2.2693) loss 2.6961 (3.1704) grad_norm 3.2505 (2.4919) [2022-01-25 07:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][270/1251] eta 0:37:03 lr 0.000110 time 2.5040 (2.2669) loss 2.0576 (3.1547) grad_norm 2.2406 (2.4914) [2022-01-25 07:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][280/1251] eta 0:36:35 lr 0.000110 time 1.9981 (2.2606) loss 3.7511 (3.1585) grad_norm 2.5447 (2.5267) [2022-01-25 07:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][290/1251] eta 0:36:09 lr 0.000110 time 2.6007 (2.2580) loss 3.3767 (3.1609) grad_norm 2.6639 (2.5318) [2022-01-25 07:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][300/1251] eta 0:35:46 lr 
0.000110 time 1.8793 (2.2573) loss 3.4937 (3.1578) grad_norm 2.6356 (2.5274) [2022-01-25 07:35:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][310/1251] eta 0:35:22 lr 0.000110 time 2.4472 (2.2561) loss 2.9092 (3.1634) grad_norm 2.3944 (2.5240) [2022-01-25 07:36:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][320/1251] eta 0:35:03 lr 0.000110 time 2.3189 (2.2591) loss 2.4552 (3.1634) grad_norm 2.1944 (2.5203) [2022-01-25 07:36:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][330/1251] eta 0:34:39 lr 0.000110 time 2.8711 (2.2583) loss 3.7286 (3.1699) grad_norm 2.6799 (2.5174) [2022-01-25 07:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][340/1251] eta 0:34:12 lr 0.000110 time 1.6365 (2.2525) loss 3.9431 (3.1621) grad_norm 2.5914 (2.5154) [2022-01-25 07:37:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][350/1251] eta 0:33:41 lr 0.000110 time 1.9100 (2.2432) loss 3.7135 (3.1682) grad_norm 2.0625 (2.5145) [2022-01-25 07:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][360/1251] eta 0:33:14 lr 0.000110 time 2.4662 (2.2383) loss 2.6993 (3.1701) grad_norm 2.3679 (2.5156) [2022-01-25 07:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][370/1251] eta 0:32:50 lr 0.000110 time 2.5492 (2.2367) loss 2.2468 (3.1703) grad_norm 2.7227 (2.5190) [2022-01-25 07:38:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][380/1251] eta 0:32:27 lr 0.000110 time 1.6361 (2.2357) loss 4.1723 (3.1757) grad_norm 2.4463 (2.5215) [2022-01-25 07:38:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][390/1251] eta 0:32:03 lr 0.000110 time 2.3049 (2.2346) loss 3.7814 (3.1758) grad_norm 2.6645 (2.5267) [2022-01-25 07:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][400/1251] eta 0:31:41 lr 0.000110 time 2.4138 (2.2347) loss 3.4246 (3.1822) grad_norm 2.7218 (2.5333) [2022-01-25 07:39:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][410/1251] eta 0:31:19 lr 0.000110 time 2.1127 (2.2347) loss 2.4926 (3.1708) grad_norm 1.9329 (2.5351) [2022-01-25 07:39:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][420/1251] eta 0:30:58 lr 0.000110 time 2.1458 (2.2362) loss 2.3156 (3.1676) grad_norm 2.3750 (2.5341) [2022-01-25 07:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][430/1251] eta 0:30:35 lr 0.000110 time 2.2834 (2.2359) loss 2.7004 (3.1663) grad_norm 2.4061 (2.5313) [2022-01-25 07:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][440/1251] eta 0:30:11 lr 0.000110 time 2.1612 (2.2333) loss 2.6533 (3.1647) grad_norm 2.6618 (2.5308) [2022-01-25 07:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][450/1251] eta 0:29:46 lr 0.000110 time 2.0729 (2.2301) loss 3.2728 (3.1673) grad_norm 2.2616 (2.5266) [2022-01-25 07:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][460/1251] eta 0:29:23 lr 0.000110 time 2.8911 (2.2290) loss 3.6758 (3.1637) grad_norm 2.2236 (2.5268) [2022-01-25 07:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][470/1251] eta 0:29:02 lr 0.000110 time 2.1179 (2.2305) loss 2.9495 (3.1636) grad_norm 3.0642 (2.5292) [2022-01-25 07:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][480/1251] eta 0:28:40 lr 0.000110 time 2.6092 (2.2313) loss 3.4351 (3.1620) grad_norm 2.3825 (2.5313) [2022-01-25 07:42:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][490/1251] eta 0:28:17 lr 0.000109 time 2.0132 (2.2306) loss 2.9365 (3.1503) grad_norm 4.6021 (2.5347) [2022-01-25 07:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][500/1251] eta 0:27:55 lr 0.000109 time 2.6232 (2.2312) loss 3.2053 (3.1467) grad_norm 2.3154 (2.5361) [2022-01-25 07:43:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][510/1251] eta 0:27:28 lr 
0.000109 time 1.9184 (2.2249) loss 2.9926 (3.1478) grad_norm 2.6789 (2.5383) [2022-01-25 07:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][520/1251] eta 0:27:02 lr 0.000109 time 1.9157 (2.2199) loss 3.6973 (3.1499) grad_norm 2.3549 (2.5407) [2022-01-25 07:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][530/1251] eta 0:26:38 lr 0.000109 time 1.8811 (2.2174) loss 3.6576 (3.1519) grad_norm 3.5533 (2.5427) [2022-01-25 07:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][540/1251] eta 0:26:15 lr 0.000109 time 2.4783 (2.2155) loss 3.5597 (3.1512) grad_norm 2.1289 (2.5418) [2022-01-25 07:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][550/1251] eta 0:25:52 lr 0.000109 time 2.6000 (2.2145) loss 3.1386 (3.1512) grad_norm 2.3545 (2.5400) [2022-01-25 07:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][560/1251] eta 0:25:29 lr 0.000109 time 1.7159 (2.2128) loss 3.6901 (3.1505) grad_norm 2.5304 (2.5401) [2022-01-25 07:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][570/1251] eta 0:25:07 lr 0.000109 time 2.0522 (2.2143) loss 3.0204 (3.1498) grad_norm 2.5473 (2.5417) [2022-01-25 07:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][580/1251] eta 0:24:46 lr 0.000109 time 1.9634 (2.2158) loss 3.2370 (3.1478) grad_norm 2.5460 (2.5418) [2022-01-25 07:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][590/1251] eta 0:24:24 lr 0.000109 time 2.5558 (2.2152) loss 2.8109 (3.1509) grad_norm 2.9618 (2.5428) [2022-01-25 07:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][600/1251] eta 0:24:03 lr 0.000109 time 2.8510 (2.2167) loss 2.3434 (3.1498) grad_norm 2.5985 (2.5412) [2022-01-25 07:46:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][610/1251] eta 0:23:41 lr 0.000109 time 1.8636 (2.2176) loss 4.0594 (3.1503) grad_norm 2.3196 (2.5412) [2022-01-25 07:47:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][620/1251] eta 0:23:18 lr 0.000109 time 2.4577 (2.2171) loss 3.6207 (3.1529) grad_norm 2.3577 (2.5387) [2022-01-25 07:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][630/1251] eta 0:22:55 lr 0.000109 time 2.0662 (2.2143) loss 2.3397 (3.1550) grad_norm 2.0471 (2.5368) [2022-01-25 07:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][640/1251] eta 0:22:31 lr 0.000109 time 1.9191 (2.2123) loss 3.0656 (3.1533) grad_norm 2.2841 (2.5332) [2022-01-25 07:48:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][650/1251] eta 0:22:09 lr 0.000109 time 1.8696 (2.2114) loss 2.7514 (3.1548) grad_norm 2.1928 (2.5324) [2022-01-25 07:48:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][660/1251] eta 0:21:47 lr 0.000109 time 3.1573 (2.2131) loss 2.2156 (3.1512) grad_norm 2.1982 (2.5296) [2022-01-25 07:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][670/1251] eta 0:21:25 lr 0.000109 time 2.1112 (2.2125) loss 2.7960 (3.1509) grad_norm 2.0926 (2.5269) [2022-01-25 07:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][680/1251] eta 0:21:03 lr 0.000109 time 2.1218 (2.2128) loss 2.2922 (3.1480) grad_norm 2.2941 (2.5249) [2022-01-25 07:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][690/1251] eta 0:20:41 lr 0.000109 time 1.7151 (2.2128) loss 3.1983 (3.1468) grad_norm 2.1519 (2.5216) [2022-01-25 07:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][700/1251] eta 0:20:19 lr 0.000109 time 2.1901 (2.2135) loss 3.2511 (3.1472) grad_norm 2.2311 (2.5186) [2022-01-25 07:50:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][710/1251] eta 0:19:58 lr 0.000109 time 2.5024 (2.2157) loss 3.4699 (3.1501) grad_norm 2.5762 (2.5184) [2022-01-25 07:50:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][720/1251] eta 0:19:35 lr 
0.000109 time 2.5656 (2.2139) loss 3.4989 (3.1510) grad_norm 3.7551 (2.5183) [2022-01-25 07:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][730/1251] eta 0:19:11 lr 0.000109 time 1.8575 (2.2108) loss 3.3748 (3.1547) grad_norm 2.2673 (2.5165) [2022-01-25 07:51:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][740/1251] eta 0:18:48 lr 0.000109 time 2.0356 (2.2091) loss 3.3821 (3.1503) grad_norm 2.1395 (2.5147) [2022-01-25 07:51:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][750/1251] eta 0:18:26 lr 0.000109 time 2.3679 (2.2080) loss 2.7996 (3.1511) grad_norm 2.4331 (2.5147) [2022-01-25 07:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][760/1251] eta 0:18:03 lr 0.000109 time 2.1967 (2.2075) loss 3.3468 (3.1499) grad_norm 2.2558 (2.5161) [2022-01-25 07:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][770/1251] eta 0:17:42 lr 0.000109 time 2.3230 (2.2080) loss 3.5087 (3.1506) grad_norm 2.6242 (2.5162) [2022-01-25 07:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][780/1251] eta 0:17:19 lr 0.000109 time 1.8501 (2.2066) loss 3.7491 (3.1528) grad_norm 2.2082 (2.5182) [2022-01-25 07:53:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][790/1251] eta 0:16:57 lr 0.000109 time 2.4654 (2.2075) loss 3.2477 (3.1559) grad_norm 2.2957 (2.5217) [2022-01-25 07:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][800/1251] eta 0:16:36 lr 0.000109 time 2.6019 (2.2085) loss 3.3698 (3.1541) grad_norm 2.4342 (2.5256) [2022-01-25 07:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][810/1251] eta 0:16:13 lr 0.000109 time 2.2270 (2.2085) loss 2.3728 (3.1543) grad_norm 2.4770 (2.5296) [2022-01-25 07:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][820/1251] eta 0:15:51 lr 0.000109 time 1.9310 (2.2065) loss 2.5572 (3.1540) grad_norm 2.6842 (2.5331) [2022-01-25 07:54:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][830/1251] eta 0:15:28 lr 0.000109 time 2.4586 (2.2062) loss 2.4004 (3.1519) grad_norm 2.5327 (2.5357) [2022-01-25 07:55:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][840/1251] eta 0:15:06 lr 0.000109 time 1.8017 (2.2048) loss 3.3875 (3.1543) grad_norm 2.7157 (2.5360) [2022-01-25 07:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][850/1251] eta 0:14:43 lr 0.000109 time 1.6700 (2.2029) loss 3.3646 (3.1547) grad_norm 3.2369 (2.5370) [2022-01-25 07:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][860/1251] eta 0:14:21 lr 0.000109 time 1.6375 (2.2026) loss 2.3284 (3.1530) grad_norm 2.2798 (2.5379) [2022-01-25 07:56:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][870/1251] eta 0:13:59 lr 0.000109 time 2.2717 (2.2041) loss 3.0757 (3.1521) grad_norm 2.1373 (2.5361) [2022-01-25 07:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][880/1251] eta 0:13:37 lr 0.000109 time 2.3074 (2.2047) loss 3.6232 (3.1494) grad_norm 2.4779 (2.5350) [2022-01-25 07:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][890/1251] eta 0:13:16 lr 0.000108 time 1.9241 (2.2057) loss 3.3159 (3.1499) grad_norm 2.2779 (2.5332) [2022-01-25 07:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][900/1251] eta 0:12:54 lr 0.000108 time 1.8278 (2.2057) loss 3.1676 (3.1522) grad_norm 2.2621 (2.5319) [2022-01-25 07:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][910/1251] eta 0:12:32 lr 0.000108 time 2.2867 (2.2053) loss 3.2979 (3.1533) grad_norm 2.4544 (2.5311) [2022-01-25 07:57:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][920/1251] eta 0:12:09 lr 0.000108 time 1.6568 (2.2031) loss 3.1962 (3.1535) grad_norm 2.1641 (2.5295) [2022-01-25 07:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][930/1251] eta 0:11:46 lr 
0.000108 time 1.9814 (2.2016) loss 3.2256 (3.1540) grad_norm 2.4349 (2.5276) [2022-01-25 07:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][940/1251] eta 0:11:24 lr 0.000108 time 2.6065 (2.2006) loss 2.9088 (3.1538) grad_norm 2.2800 (2.5262) [2022-01-25 07:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][950/1251] eta 0:11:02 lr 0.000108 time 2.2277 (2.2006) loss 3.3271 (3.1544) grad_norm 2.1956 (2.5249) [2022-01-25 07:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][960/1251] eta 0:10:40 lr 0.000108 time 1.5242 (2.1999) loss 3.4304 (3.1570) grad_norm 2.3490 (2.5228) [2022-01-25 07:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][970/1251] eta 0:10:18 lr 0.000108 time 1.7990 (2.2002) loss 3.3284 (3.1556) grad_norm 2.4823 (2.5254) [2022-01-25 08:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][980/1251] eta 0:09:56 lr 0.000108 time 1.6801 (2.1999) loss 2.9173 (3.1574) grad_norm 2.1516 (2.5240) [2022-01-25 08:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][990/1251] eta 0:09:34 lr 0.000108 time 2.6384 (2.2003) loss 3.1491 (3.1593) grad_norm 2.6820 (2.5237) [2022-01-25 08:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1000/1251] eta 0:09:12 lr 0.000108 time 1.7828 (2.2003) loss 3.2795 (3.1568) grad_norm 2.3689 (2.5219) [2022-01-25 08:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1010/1251] eta 0:08:50 lr 0.000108 time 1.6659 (2.2013) loss 3.2713 (3.1575) grad_norm 2.6604 (2.5211) [2022-01-25 08:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1020/1251] eta 0:08:28 lr 0.000108 time 1.7744 (2.2021) loss 3.1480 (3.1594) grad_norm 2.5525 (2.5200) [2022-01-25 08:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1030/1251] eta 0:08:06 lr 0.000108 time 2.1621 (2.2022) loss 2.8290 (3.1590) grad_norm 2.4906 (2.5203) [2022-01-25 
08:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1040/1251] eta 0:07:44 lr 0.000108 time 1.5156 (2.2008) loss 2.4957 (3.1583) grad_norm 2.0647 (2.5214) [2022-01-25 08:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1050/1251] eta 0:07:22 lr 0.000108 time 2.0060 (2.2020) loss 3.7747 (3.1599) grad_norm 2.2981 (2.5215) [2022-01-25 08:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1060/1251] eta 0:07:00 lr 0.000108 time 2.4697 (2.2029) loss 2.6761 (3.1589) grad_norm 2.2613 (2.5220) [2022-01-25 08:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1070/1251] eta 0:06:38 lr 0.000108 time 1.6199 (2.2014) loss 3.3028 (3.1582) grad_norm 3.0464 (2.5214) [2022-01-25 08:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1080/1251] eta 0:06:16 lr 0.000108 time 1.8436 (2.1994) loss 3.5483 (3.1610) grad_norm 2.3026 (2.5209) [2022-01-25 08:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1090/1251] eta 0:05:53 lr 0.000108 time 2.1795 (2.1987) loss 2.8296 (3.1574) grad_norm 2.5062 (2.5211) [2022-01-25 08:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1100/1251] eta 0:05:31 lr 0.000108 time 1.8473 (2.1977) loss 3.5372 (3.1590) grad_norm 2.5373 (2.5208) [2022-01-25 08:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1110/1251] eta 0:05:09 lr 0.000108 time 2.2417 (2.1981) loss 3.5771 (3.1572) grad_norm 2.9655 (2.5216) [2022-01-25 08:05:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1120/1251] eta 0:04:47 lr 0.000108 time 1.8737 (2.1969) loss 3.2258 (3.1580) grad_norm 2.5198 (2.5203) [2022-01-25 08:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1130/1251] eta 0:04:25 lr 0.000108 time 1.9905 (2.1972) loss 3.2948 (3.1584) grad_norm 2.4877 (2.5204) [2022-01-25 08:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1140/1251] 
eta 0:04:03 lr 0.000108 time 2.6048 (2.1981) loss 3.5182 (3.1571) grad_norm 2.2997 (2.5193) [2022-01-25 08:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1150/1251] eta 0:03:42 lr 0.000108 time 1.6231 (2.1986) loss 3.4681 (3.1573) grad_norm 2.3050 (2.5182) [2022-01-25 08:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1160/1251] eta 0:03:20 lr 0.000108 time 1.5158 (2.1991) loss 3.0572 (3.1590) grad_norm 2.9515 (2.5180) [2022-01-25 08:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1170/1251] eta 0:02:58 lr 0.000108 time 2.1052 (2.2002) loss 2.3582 (3.1578) grad_norm 2.5472 (2.5170) [2022-01-25 08:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1180/1251] eta 0:02:36 lr 0.000108 time 1.7164 (2.1982) loss 3.4632 (3.1568) grad_norm 2.5508 (2.5166) [2022-01-25 08:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1190/1251] eta 0:02:14 lr 0.000108 time 1.8644 (2.1968) loss 3.7466 (3.1582) grad_norm 2.4849 (2.5177) [2022-01-25 08:08:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1200/1251] eta 0:01:51 lr 0.000108 time 1.5961 (2.1951) loss 3.3933 (3.1580) grad_norm 2.6268 (2.5179) [2022-01-25 08:08:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1210/1251] eta 0:01:29 lr 0.000108 time 1.4835 (2.1948) loss 3.1835 (3.1571) grad_norm 2.2799 (2.5166) [2022-01-25 08:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1220/1251] eta 0:01:08 lr 0.000108 time 1.8263 (2.1941) loss 3.5609 (3.1554) grad_norm 2.4743 (2.5172) [2022-01-25 08:09:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1230/1251] eta 0:00:46 lr 0.000108 time 1.5314 (2.1937) loss 3.2467 (3.1538) grad_norm 2.1620 (2.5168) [2022-01-25 08:09:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1240/1251] eta 0:00:24 lr 0.000108 time 1.6951 (2.1927) loss 3.7489 (3.1535) grad_norm 2.6824 
(2.5156) [2022-01-25 08:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [238/300][1250/1251] eta 0:00:02 lr 0.000108 time 1.1705 (2.1877) loss 3.0972 (3.1541) grad_norm 2.4639 (2.5154) [2022-01-25 08:09:45 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 238 training takes 0:45:37 [2022-01-25 08:10:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.434 (18.434) Loss 0.9483 (0.9483) Acc@1 79.199 (79.199) Acc@5 93.066 (93.066) [2022-01-25 08:10:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.744 (3.538) Loss 0.8801 (0.8743) Acc@1 79.297 (79.643) Acc@5 95.312 (94.709) [2022-01-25 08:10:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.866 (2.676) Loss 0.8617 (0.8667) Acc@1 79.883 (79.794) Acc@5 94.922 (94.945) [2022-01-25 08:10:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.297 (2.364) Loss 0.8502 (0.8575) Acc@1 79.199 (79.921) Acc@5 95.996 (95.171) [2022-01-25 08:11:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.626 (2.216) Loss 0.8501 (0.8591) Acc@1 80.176 (79.899) Acc@5 95.605 (95.120) [2022-01-25 08:11:24 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.922 Acc@5 95.070 [2022-01-25 08:11:24 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.9% [2022-01-25 08:11:24 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 79.92% [2022-01-25 08:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][0/1251] eta 7:20:57 lr 0.000108 time 21.1487 (21.1487) loss 3.4380 (3.4380) grad_norm 2.3685 (2.3685) [2022-01-25 08:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][10/1251] eta 1:23:24 lr 0.000108 time 3.1903 (4.0326) loss 3.4159 (3.0547) grad_norm 2.3904 (2.4008) [2022-01-25 08:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][20/1251] eta 1:04:03 lr 0.000108 time 1.8078 (3.1222) loss 
3.2039 (2.9961) grad_norm 2.2445 (2.4382) [2022-01-25 08:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][30/1251] eta 0:57:46 lr 0.000108 time 1.5196 (2.8394) loss 2.4024 (2.9509) grad_norm 2.4534 (2.4275) [2022-01-25 08:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][40/1251] eta 0:54:40 lr 0.000108 time 3.8081 (2.7090) loss 3.5361 (3.0147) grad_norm 2.3878 (2.4201) [2022-01-25 08:13:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][50/1251] eta 0:52:51 lr 0.000107 time 2.7374 (2.6410) loss 3.4862 (3.0645) grad_norm 2.2847 (2.4062) [2022-01-25 08:14:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][60/1251] eta 0:50:37 lr 0.000107 time 1.2232 (2.5500) loss 3.3882 (3.0551) grad_norm 2.6038 (2.4283) [2022-01-25 08:14:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][70/1251] eta 0:49:26 lr 0.000107 time 1.8715 (2.5119) loss 2.6117 (3.0590) grad_norm 2.1303 (2.4484) [2022-01-25 08:14:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][80/1251] eta 0:48:39 lr 0.000107 time 3.3653 (2.4929) loss 3.1766 (3.0448) grad_norm 2.2857 (2.4396) [2022-01-25 08:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][90/1251] eta 0:48:20 lr 0.000107 time 2.4642 (2.4980) loss 3.0811 (3.0551) grad_norm 2.7195 (2.4501) [2022-01-25 08:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][100/1251] eta 0:47:21 lr 0.000107 time 1.8832 (2.4684) loss 2.4017 (3.0630) grad_norm 2.4789 (2.4496) [2022-01-25 08:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][110/1251] eta 0:46:18 lr 0.000107 time 1.8156 (2.4348) loss 2.1896 (3.0567) grad_norm 2.3058 (2.4591) [2022-01-25 08:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][120/1251] eta 0:45:19 lr 0.000107 time 2.5087 (2.4043) loss 3.5781 (3.0754) grad_norm 2.9436 (2.4729) [2022-01-25 08:16:34 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [239/300][130/1251] eta 0:44:16 lr 0.000107 time 1.8941 (2.3693) loss 3.1984 (3.0967) grad_norm 2.4099 (2.4708) [2022-01-25 08:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][140/1251] eta 0:43:29 lr 0.000107 time 1.9185 (2.3491) loss 2.2385 (3.0973) grad_norm 2.5440 (2.4759) [2022-01-25 08:17:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][150/1251] eta 0:42:55 lr 0.000107 time 2.2180 (2.3393) loss 3.4160 (3.1221) grad_norm 2.4126 (2.4800) [2022-01-25 08:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][160/1251] eta 0:42:35 lr 0.000107 time 2.9083 (2.3419) loss 3.3625 (3.1303) grad_norm 2.1777 (2.4744) [2022-01-25 08:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][170/1251] eta 0:42:13 lr 0.000107 time 2.3956 (2.3440) loss 3.2971 (3.1486) grad_norm 2.5317 (2.4701) [2022-01-25 08:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][180/1251] eta 0:41:43 lr 0.000107 time 2.1576 (2.3372) loss 3.7718 (3.1452) grad_norm 2.3428 (2.4768) [2022-01-25 08:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][190/1251] eta 0:41:03 lr 0.000107 time 2.2061 (2.3218) loss 2.3882 (3.1477) grad_norm 2.5417 (2.4772) [2022-01-25 08:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][200/1251] eta 0:40:30 lr 0.000107 time 2.1859 (2.3124) loss 2.5257 (3.1599) grad_norm 2.4015 (2.4781) [2022-01-25 08:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][210/1251] eta 0:39:55 lr 0.000107 time 1.6799 (2.3011) loss 2.2548 (3.1599) grad_norm 2.0651 (2.4839) [2022-01-25 08:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][220/1251] eta 0:39:23 lr 0.000107 time 2.0055 (2.2927) loss 3.1490 (3.1677) grad_norm 3.0236 (2.4847) [2022-01-25 08:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][230/1251] eta 0:38:55 lr 0.000107 time 1.8605 (2.2878) loss 3.4044 
(3.1665) grad_norm 2.9253 (2.4865) [2022-01-25 08:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][240/1251] eta 0:38:38 lr 0.000107 time 2.3874 (2.2936) loss 3.1874 (3.1714) grad_norm 2.6477 (2.4919) [2022-01-25 08:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][250/1251] eta 0:38:15 lr 0.000107 time 1.6100 (2.2928) loss 3.4318 (3.1700) grad_norm 2.3211 (2.4934) [2022-01-25 08:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][260/1251] eta 0:37:50 lr 0.000107 time 2.3065 (2.2911) loss 3.3705 (3.1631) grad_norm 2.1821 (2.4948) [2022-01-25 08:21:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][270/1251] eta 0:37:15 lr 0.000107 time 1.7002 (2.2791) loss 2.6858 (3.1588) grad_norm 2.8132 (2.4965) [2022-01-25 08:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][280/1251] eta 0:36:45 lr 0.000107 time 1.7803 (2.2713) loss 3.4092 (3.1683) grad_norm 2.6822 (2.4972) [2022-01-25 08:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][290/1251] eta 0:36:15 lr 0.000107 time 1.8991 (2.2637) loss 2.5802 (3.1669) grad_norm 2.3928 (2.5035) [2022-01-25 08:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][300/1251] eta 0:35:55 lr 0.000107 time 1.8872 (2.2661) loss 3.4057 (3.1646) grad_norm 2.8818 (2.5121) [2022-01-25 08:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][310/1251] eta 0:35:35 lr 0.000107 time 2.2023 (2.2699) loss 3.5353 (3.1640) grad_norm 2.7759 (2.5097) [2022-01-25 08:23:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][320/1251] eta 0:35:16 lr 0.000107 time 2.0282 (2.2735) loss 2.4324 (3.1615) grad_norm 2.2898 (2.5058) [2022-01-25 08:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][330/1251] eta 0:34:51 lr 0.000107 time 2.0996 (2.2713) loss 3.2372 (3.1670) grad_norm 2.4663 (2.5045) [2022-01-25 08:24:17 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [239/300][340/1251] eta 0:34:24 lr 0.000107 time 1.8839 (2.2661) loss 3.0922 (3.1697) grad_norm 2.1400 (2.5050) [2022-01-25 08:24:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][350/1251] eta 0:33:53 lr 0.000107 time 2.1560 (2.2574) loss 3.0404 (3.1727) grad_norm 2.3733 (2.5006) [2022-01-25 08:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][360/1251] eta 0:33:24 lr 0.000107 time 2.1647 (2.2500) loss 2.8969 (3.1704) grad_norm 2.8917 (2.5069) [2022-01-25 08:25:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][370/1251] eta 0:32:59 lr 0.000107 time 2.4384 (2.2473) loss 3.5241 (3.1662) grad_norm 2.4373 (2.5078) [2022-01-25 08:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][380/1251] eta 0:32:34 lr 0.000107 time 2.2395 (2.2437) loss 3.2014 (3.1660) grad_norm 3.0636 (2.5096) [2022-01-25 08:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][390/1251] eta 0:32:09 lr 0.000107 time 1.8908 (2.2405) loss 3.5812 (3.1654) grad_norm 2.2295 (2.5127) [2022-01-25 08:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][400/1251] eta 0:31:43 lr 0.000107 time 2.1301 (2.2366) loss 3.2722 (3.1729) grad_norm 3.5402 (2.5143) [2022-01-25 08:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][410/1251] eta 0:31:21 lr 0.000107 time 3.0601 (2.2370) loss 3.2821 (3.1714) grad_norm 2.8281 (2.5193) [2022-01-25 08:27:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][420/1251] eta 0:30:57 lr 0.000107 time 2.1131 (2.2353) loss 3.7262 (3.1656) grad_norm 2.2234 (2.5186) [2022-01-25 08:27:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][430/1251] eta 0:30:35 lr 0.000107 time 1.8415 (2.2354) loss 3.3708 (3.1646) grad_norm 3.1502 (2.5196) [2022-01-25 08:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][440/1251] eta 0:30:15 lr 0.000107 time 2.7275 (2.2383) loss 2.6170 
(3.1672) grad_norm 2.9563 (2.5270) [2022-01-25 08:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][450/1251] eta 0:29:55 lr 0.000106 time 3.0875 (2.2413) loss 2.1382 (3.1659) grad_norm 2.5317 (2.5235) [2022-01-25 08:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][460/1251] eta 0:29:34 lr 0.000106 time 2.8105 (2.2434) loss 3.9704 (3.1727) grad_norm 3.4214 (2.5259) [2022-01-25 08:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][470/1251] eta 0:29:10 lr 0.000106 time 1.8956 (2.2411) loss 3.6697 (3.1763) grad_norm 2.1913 (2.5251) [2022-01-25 08:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][480/1251] eta 0:28:48 lr 0.000106 time 2.7567 (2.2421) loss 2.1623 (3.1654) grad_norm 2.4646 (2.5222) [2022-01-25 08:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][490/1251] eta 0:28:22 lr 0.000106 time 1.7831 (2.2371) loss 3.8423 (3.1669) grad_norm 2.3005 (2.5206) [2022-01-25 08:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][500/1251] eta 0:27:58 lr 0.000106 time 1.9073 (2.2354) loss 2.9413 (3.1662) grad_norm 2.0416 (2.5179) [2022-01-25 08:30:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][510/1251] eta 0:27:35 lr 0.000106 time 2.1526 (2.2340) loss 4.0694 (3.1663) grad_norm 2.9236 (2.5200) [2022-01-25 08:30:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][520/1251] eta 0:27:13 lr 0.000106 time 3.7778 (2.2345) loss 3.6156 (3.1672) grad_norm 2.5559 (2.5168) [2022-01-25 08:31:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][530/1251] eta 0:26:50 lr 0.000106 time 2.1979 (2.2333) loss 3.2945 (3.1675) grad_norm 2.0955 (2.5133) [2022-01-25 08:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][540/1251] eta 0:26:27 lr 0.000106 time 1.9316 (2.2325) loss 3.2989 (3.1655) grad_norm 2.4174 (2.5119) [2022-01-25 08:31:55 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [239/300][550/1251] eta 0:26:06 lr 0.000106 time 2.2152 (2.2345) loss 3.6256 (3.1661) grad_norm 2.5926 (2.5102) [2022-01-25 08:32:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][560/1251] eta 0:25:44 lr 0.000106 time 2.8558 (2.2355) loss 2.7726 (3.1667) grad_norm 2.6482 (2.5126) [2022-01-25 08:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][570/1251] eta 0:25:20 lr 0.000106 time 1.5862 (2.2322) loss 3.5054 (3.1678) grad_norm 2.8216 (2.5157) [2022-01-25 08:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][580/1251] eta 0:24:54 lr 0.000106 time 1.8330 (2.2272) loss 3.0773 (3.1636) grad_norm 2.3694 (2.5168) [2022-01-25 08:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][590/1251] eta 0:24:30 lr 0.000106 time 2.0780 (2.2243) loss 3.2961 (3.1634) grad_norm 2.6939 (2.5176) [2022-01-25 08:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][600/1251] eta 0:24:06 lr 0.000106 time 1.8827 (2.2226) loss 3.6411 (3.1699) grad_norm 2.3639 (2.5174) [2022-01-25 08:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][610/1251] eta 0:23:45 lr 0.000106 time 3.0257 (2.2237) loss 3.5792 (3.1721) grad_norm 2.4794 (2.5176) [2022-01-25 08:34:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][620/1251] eta 0:23:22 lr 0.000106 time 1.8916 (2.2221) loss 3.4664 (3.1742) grad_norm 2.4108 (2.5174) [2022-01-25 08:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][630/1251] eta 0:22:59 lr 0.000106 time 2.8312 (2.2219) loss 2.6518 (3.1685) grad_norm 2.3897 (2.5195) [2022-01-25 08:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][640/1251] eta 0:22:38 lr 0.000106 time 1.9759 (2.2228) loss 3.9359 (3.1671) grad_norm 2.5834 (2.5225) [2022-01-25 08:35:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][650/1251] eta 0:22:16 lr 0.000106 time 2.5706 (2.2235) loss 2.3503 
(3.1679) grad_norm 2.4721 (2.5208) [2022-01-25 08:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][660/1251] eta 0:21:54 lr 0.000106 time 1.8859 (2.2237) loss 3.3798 (3.1696) grad_norm 2.4049 (2.5196) [2022-01-25 08:36:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][670/1251] eta 0:21:30 lr 0.000106 time 1.6873 (2.2220) loss 4.0027 (3.1657) grad_norm 2.4136 (2.5184) [2022-01-25 08:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][680/1251] eta 0:21:08 lr 0.000106 time 1.9000 (2.2210) loss 3.8716 (3.1635) grad_norm 2.8352 (2.5192) [2022-01-25 08:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][690/1251] eta 0:20:45 lr 0.000106 time 2.4819 (2.2200) loss 3.6232 (3.1645) grad_norm 2.5465 (2.5185) [2022-01-25 08:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][700/1251] eta 0:20:23 lr 0.000106 time 2.4535 (2.2199) loss 3.8135 (3.1654) grad_norm 2.5998 (2.5187) [2022-01-25 08:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][710/1251] eta 0:19:59 lr 0.000106 time 2.0205 (2.2171) loss 3.2720 (3.1631) grad_norm 2.5552 (2.5196) [2022-01-25 08:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][720/1251] eta 0:19:37 lr 0.000106 time 1.9157 (2.2182) loss 3.3793 (3.1629) grad_norm 2.3210 (2.5192) [2022-01-25 08:38:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][730/1251] eta 0:19:16 lr 0.000106 time 3.0294 (2.2200) loss 3.6393 (3.1630) grad_norm 2.3958 (2.5177) [2022-01-25 08:38:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][740/1251] eta 0:18:55 lr 0.000106 time 2.3275 (2.2214) loss 2.9549 (3.1611) grad_norm 2.4611 (2.5183) [2022-01-25 08:39:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][750/1251] eta 0:18:32 lr 0.000106 time 1.6502 (2.2209) loss 3.5216 (3.1632) grad_norm 2.3697 (2.5217) [2022-01-25 08:39:33 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [239/300][760/1251] eta 0:18:09 lr 0.000106 time 1.9565 (2.2191) loss 2.6541 (3.1663) grad_norm 2.1211 (2.5208) [2022-01-25 08:39:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][770/1251] eta 0:17:45 lr 0.000106 time 2.2997 (2.2151) loss 3.6113 (3.1676) grad_norm 2.4794 (2.5218) [2022-01-25 08:40:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][780/1251] eta 0:17:22 lr 0.000106 time 2.1350 (2.2128) loss 3.5136 (3.1677) grad_norm 2.3295 (2.5226) [2022-01-25 08:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][790/1251] eta 0:16:59 lr 0.000106 time 1.6742 (2.2123) loss 2.8448 (3.1691) grad_norm 2.8664 (2.5220) [2022-01-25 08:40:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][800/1251] eta 0:16:39 lr 0.000106 time 4.1404 (2.2161) loss 3.8284 (3.1644) grad_norm 2.2141 (2.5214) [2022-01-25 08:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][810/1251] eta 0:16:18 lr 0.000106 time 1.8003 (2.2186) loss 3.1871 (3.1633) grad_norm 2.4525 (2.5206) [2022-01-25 08:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][820/1251] eta 0:15:57 lr 0.000106 time 2.5434 (2.2213) loss 3.7022 (3.1612) grad_norm 2.7370 (2.5207) [2022-01-25 08:42:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][830/1251] eta 0:15:35 lr 0.000106 time 1.8355 (2.2216) loss 3.5548 (3.1646) grad_norm 2.5708 (2.5210) [2022-01-25 08:42:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][840/1251] eta 0:15:11 lr 0.000106 time 1.6460 (2.2184) loss 3.4140 (3.1655) grad_norm 2.1282 (2.5200) [2022-01-25 08:42:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][850/1251] eta 0:14:48 lr 0.000106 time 2.1078 (2.2153) loss 2.4749 (3.1652) grad_norm 2.2607 (2.5197) [2022-01-25 08:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][860/1251] eta 0:14:25 lr 0.000105 time 2.0261 (2.2125) loss 3.2144 
(3.1626) grad_norm 2.4385 (2.5176) [2022-01-25 08:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][870/1251] eta 0:14:02 lr 0.000105 time 2.3638 (2.2124) loss 3.1611 (3.1621) grad_norm 2.1127 (2.5149) [2022-01-25 08:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][880/1251] eta 0:13:40 lr 0.000105 time 2.2409 (2.2122) loss 3.8259 (3.1627) grad_norm 2.4608 (2.5150) [2022-01-25 08:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][890/1251] eta 0:13:18 lr 0.000105 time 2.1760 (2.2122) loss 3.1420 (3.1633) grad_norm 2.7261 (2.5152) [2022-01-25 08:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][900/1251] eta 0:12:56 lr 0.000105 time 1.4697 (2.2115) loss 2.1756 (3.1614) grad_norm 2.0519 (2.5162) [2022-01-25 08:45:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][910/1251] eta 0:12:34 lr 0.000105 time 2.9784 (2.2134) loss 2.7871 (3.1603) grad_norm 2.3576 (2.5134) [2022-01-25 08:45:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][920/1251] eta 0:12:12 lr 0.000105 time 2.1469 (2.2137) loss 3.0730 (3.1593) grad_norm 2.3641 (2.5123) [2022-01-25 08:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][930/1251] eta 0:11:50 lr 0.000105 time 2.2153 (2.2145) loss 3.4705 (3.1562) grad_norm 2.3133 (2.5142) [2022-01-25 08:46:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][940/1251] eta 0:11:29 lr 0.000105 time 2.1042 (2.2156) loss 3.5023 (3.1577) grad_norm 2.3455 (2.5132) [2022-01-25 08:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][950/1251] eta 0:11:07 lr 0.000105 time 2.6944 (2.2169) loss 3.1981 (3.1556) grad_norm 2.1836 (2.5120) [2022-01-25 08:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][960/1251] eta 0:10:44 lr 0.000105 time 1.9743 (2.2135) loss 3.1773 (3.1547) grad_norm 2.2637 (2.5118) [2022-01-25 08:47:10 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [239/300][970/1251] eta 0:10:21 lr 0.000105 time 2.1997 (2.2104) loss 3.2468 (3.1531) grad_norm 2.5090 (2.5111) [2022-01-25 08:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][980/1251] eta 0:09:58 lr 0.000105 time 2.2007 (2.2082) loss 3.3310 (3.1503) grad_norm 2.6709 (2.5101) [2022-01-25 08:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][990/1251] eta 0:09:36 lr 0.000105 time 2.2097 (2.2070) loss 2.9103 (3.1455) grad_norm 2.6766 (2.5090) [2022-01-25 08:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1000/1251] eta 0:09:13 lr 0.000105 time 1.3824 (2.2062) loss 3.2825 (3.1458) grad_norm 2.4226 (2.5078) [2022-01-25 08:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1010/1251] eta 0:08:51 lr 0.000105 time 2.1112 (2.2065) loss 3.6401 (3.1485) grad_norm 3.5692 (2.5091) [2022-01-25 08:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1020/1251] eta 0:08:30 lr 0.000105 time 3.0505 (2.2089) loss 2.2418 (3.1472) grad_norm 2.5017 (2.5077) [2022-01-25 08:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1030/1251] eta 0:08:08 lr 0.000105 time 2.5686 (2.2103) loss 3.3154 (3.1457) grad_norm 3.4753 (2.5079) [2022-01-25 08:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1040/1251] eta 0:07:46 lr 0.000105 time 1.8965 (2.2103) loss 2.6160 (3.1479) grad_norm 3.1967 (2.5073) [2022-01-25 08:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1050/1251] eta 0:07:24 lr 0.000105 time 2.3195 (2.2114) loss 3.4539 (3.1464) grad_norm 2.5694 (2.5067) [2022-01-25 08:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1060/1251] eta 0:07:02 lr 0.000105 time 2.2069 (2.2124) loss 3.1997 (3.1472) grad_norm 2.5269 (2.5058) [2022-01-25 08:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1070/1251] eta 0:06:40 lr 0.000105 time 1.8795 (2.2113) loss 
3.1643 (3.1472) grad_norm 2.3817 (2.5050) [2022-01-25 08:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1080/1251] eta 0:06:17 lr 0.000105 time 1.9191 (2.2096) loss 3.4659 (3.1467) grad_norm 2.4213 (2.5072) [2022-01-25 08:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1090/1251] eta 0:05:55 lr 0.000105 time 2.2074 (2.2083) loss 3.6496 (3.1457) grad_norm 3.3356 (2.5067) [2022-01-25 08:51:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1100/1251] eta 0:05:33 lr 0.000105 time 1.9506 (2.2061) loss 3.2015 (3.1462) grad_norm 2.7710 (2.5085) [2022-01-25 08:52:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1110/1251] eta 0:05:11 lr 0.000105 time 1.9414 (2.2061) loss 2.1378 (3.1469) grad_norm 2.5964 (2.5095) [2022-01-25 08:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1120/1251] eta 0:04:49 lr 0.000105 time 2.8804 (2.2077) loss 3.0924 (3.1485) grad_norm 2.3414 (2.5095) [2022-01-25 08:53:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1130/1251] eta 0:04:27 lr 0.000105 time 3.0920 (2.2092) loss 3.5471 (3.1492) grad_norm 2.3724 (2.5090) [2022-01-25 08:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1140/1251] eta 0:04:05 lr 0.000105 time 2.2214 (2.2082) loss 3.4256 (3.1500) grad_norm 2.5476 (2.5084) [2022-01-25 08:53:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1150/1251] eta 0:03:43 lr 0.000105 time 2.1653 (2.2079) loss 3.5755 (3.1503) grad_norm 2.7127 (2.5073) [2022-01-25 08:54:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1160/1251] eta 0:03:20 lr 0.000105 time 1.9271 (2.2063) loss 3.4205 (3.1513) grad_norm 2.6340 (2.5069) [2022-01-25 08:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1170/1251] eta 0:02:58 lr 0.000105 time 1.6101 (2.2057) loss 2.2966 (3.1495) grad_norm 2.6729 (2.5072) [2022-01-25 08:54:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1180/1251] eta 0:02:36 lr 0.000105 time 2.3238 (2.2067) loss 3.0538 (3.1494) grad_norm 2.2607 (2.5085) [2022-01-25 08:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1190/1251] eta 0:02:14 lr 0.000105 time 2.2338 (2.2079) loss 2.1202 (3.1496) grad_norm 2.3601 (2.5080) [2022-01-25 08:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1200/1251] eta 0:01:52 lr 0.000105 time 2.1900 (2.2073) loss 3.3940 (3.1491) grad_norm 2.2298 (2.5075) [2022-01-25 08:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1210/1251] eta 0:01:30 lr 0.000105 time 1.7267 (2.2061) loss 3.1681 (3.1489) grad_norm 2.4853 (2.5071) [2022-01-25 08:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1220/1251] eta 0:01:08 lr 0.000105 time 1.9090 (2.2059) loss 3.2583 (3.1496) grad_norm 2.5888 (2.5070) [2022-01-25 08:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1230/1251] eta 0:00:46 lr 0.000105 time 1.5512 (2.2070) loss 3.0997 (3.1500) grad_norm 2.2454 (2.5069) [2022-01-25 08:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1240/1251] eta 0:00:24 lr 0.000105 time 1.2471 (2.2057) loss 3.7460 (3.1509) grad_norm 2.4688 (2.5072) [2022-01-25 08:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [239/300][1250/1251] eta 0:00:02 lr 0.000105 time 1.1943 (2.2003) loss 3.6188 (3.1510) grad_norm 2.8617 (2.5075) [2022-01-25 08:57:17 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 239 training takes 0:45:52 [2022-01-25 08:57:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.044 (19.044) Loss 0.9090 (0.9090) Acc@1 78.516 (78.516) Acc@5 94.531 (94.531) [2022-01-25 08:57:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.574 (3.085) Loss 0.8266 (0.8561) Acc@1 79.395 (80.211) Acc@5 95.898 (94.913) [2022-01-25 08:58:09 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.137 (2.480) Loss 0.8003 (0.8670) Acc@1 82.031 (79.915) Acc@5 95.801 (94.875) [2022-01-25 08:58:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.657 (2.217) Loss 0.8654 (0.8662) Acc@1 79.395 (79.795) Acc@5 95.020 (94.938) [2022-01-25 08:58:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.323 (2.176) Loss 0.8300 (0.8626) Acc@1 81.250 (79.985) Acc@5 95.605 (94.965) [2022-01-25 08:58:53 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.012 Acc@5 94.990 [2022-01-25 08:58:53 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-01-25 08:58:53 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.01% [2022-01-25 08:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][0/1251] eta 8:20:45 lr 0.000105 time 24.0171 (24.0171) loss 2.5618 (2.5618) grad_norm 3.0028 (3.0028) [2022-01-25 08:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][10/1251] eta 1:29:08 lr 0.000105 time 2.8716 (4.3100) loss 2.5498 (2.9577) grad_norm 2.3128 (2.5327) [2022-01-25 09:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][20/1251] eta 1:07:28 lr 0.000104 time 2.4729 (3.2888) loss 3.4365 (3.1144) grad_norm 3.3704 (2.4855) [2022-01-25 09:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][30/1251] eta 1:00:01 lr 0.000104 time 1.6527 (2.9493) loss 2.3851 (3.1388) grad_norm 2.2905 (2.4669) [2022-01-25 09:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][40/1251] eta 0:56:37 lr 0.000104 time 3.9086 (2.8054) loss 3.3682 (3.1488) grad_norm 1.8754 (2.4726) [2022-01-25 09:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][50/1251] eta 0:53:19 lr 0.000104 time 2.1142 (2.6639) loss 3.0302 (3.1448) grad_norm 2.4803 (2.5017) [2022-01-25 09:01:29 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [240/300][60/1251] eta 0:50:47 lr 0.000104 time 1.9003 (2.5584) loss 3.2266 (3.1709) grad_norm 2.4272 (2.4972) [2022-01-25 09:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][70/1251] eta 0:49:06 lr 0.000104 time 1.5358 (2.4946) loss 3.0392 (3.1812) grad_norm 2.4176 (2.5221) [2022-01-25 09:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][80/1251] eta 0:48:11 lr 0.000104 time 3.8402 (2.4693) loss 3.5407 (3.1723) grad_norm 2.5659 (2.5116) [2022-01-25 09:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][90/1251] eta 0:47:10 lr 0.000104 time 1.5926 (2.4384) loss 3.5248 (3.1735) grad_norm 2.4192 (2.5088) [2022-01-25 09:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][100/1251] eta 0:46:22 lr 0.000104 time 2.6927 (2.4178) loss 3.5770 (3.1797) grad_norm 2.5042 (2.5145) [2022-01-25 09:03:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][110/1251] eta 0:45:49 lr 0.000104 time 2.1396 (2.4097) loss 3.0182 (3.1822) grad_norm 2.3456 (2.5114) [2022-01-25 09:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][120/1251] eta 0:45:14 lr 0.000104 time 2.2387 (2.4003) loss 2.7800 (3.1913) grad_norm 2.6666 (2.5137) [2022-01-25 09:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][130/1251] eta 0:44:24 lr 0.000104 time 2.1611 (2.3770) loss 2.9205 (3.1914) grad_norm 2.5638 (2.5206) [2022-01-25 09:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][140/1251] eta 0:43:37 lr 0.000104 time 1.7685 (2.3563) loss 3.6923 (3.1915) grad_norm 2.4624 (2.5198) [2022-01-25 09:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][150/1251] eta 0:42:59 lr 0.000104 time 2.2789 (2.3433) loss 3.1228 (3.1860) grad_norm 2.3456 (2.5257) [2022-01-25 09:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][160/1251] eta 0:42:30 lr 0.000104 time 2.4637 (2.3379) loss 3.5804 (3.1736) 
grad_norm 2.2962 (2.5250) [2022-01-25 09:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][170/1251] eta 0:41:53 lr 0.000104 time 2.0178 (2.3253) loss 2.8302 (3.1596) grad_norm 2.6943 (2.5230) [2022-01-25 09:05:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][180/1251] eta 0:41:19 lr 0.000104 time 1.8915 (2.3152) loss 1.9836 (3.1484) grad_norm 3.0544 (2.5245) [2022-01-25 09:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][190/1251] eta 0:40:48 lr 0.000104 time 2.3123 (2.3076) loss 2.3504 (3.1400) grad_norm 2.6913 (2.5373) [2022-01-25 09:06:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][200/1251] eta 0:40:15 lr 0.000104 time 2.2936 (2.2986) loss 3.4814 (3.1412) grad_norm 2.0575 (2.5353) [2022-01-25 09:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][210/1251] eta 0:39:58 lr 0.000104 time 3.4398 (2.3039) loss 2.1836 (3.1407) grad_norm 2.5028 (2.5310) [2022-01-25 09:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][220/1251] eta 0:39:30 lr 0.000104 time 1.6091 (2.2992) loss 3.5972 (3.1439) grad_norm 2.2763 (2.5299) [2022-01-25 09:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][230/1251] eta 0:39:04 lr 0.000104 time 2.4499 (2.2962) loss 3.2474 (3.1510) grad_norm 3.1000 (2.5309) [2022-01-25 09:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][240/1251] eta 0:38:41 lr 0.000104 time 2.7765 (2.2960) loss 3.3899 (3.1567) grad_norm 2.1564 (2.5271) [2022-01-25 09:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][250/1251] eta 0:38:07 lr 0.000104 time 2.1822 (2.2855) loss 3.4306 (3.1503) grad_norm 2.0774 (2.5211) [2022-01-25 09:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][260/1251] eta 0:37:33 lr 0.000104 time 1.8941 (2.2744) loss 3.0688 (3.1541) grad_norm 2.4054 (2.5249) [2022-01-25 09:09:07 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [240/300][270/1251] eta 0:37:01 lr 0.000104 time 1.5798 (2.2641) loss 3.4656 (3.1543) grad_norm 2.0031 (2.5199) [2022-01-25 09:09:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][280/1251] eta 0:36:36 lr 0.000104 time 1.9650 (2.2623) loss 3.1020 (3.1478) grad_norm 2.6760 (2.5170) [2022-01-25 09:09:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][290/1251] eta 0:36:15 lr 0.000104 time 2.7378 (2.2637) loss 3.4432 (3.1551) grad_norm 2.6880 (2.5148) [2022-01-25 09:10:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][300/1251] eta 0:35:52 lr 0.000104 time 2.4500 (2.2639) loss 2.7594 (3.1543) grad_norm 2.0269 (2.5100) [2022-01-25 09:10:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][310/1251] eta 0:35:23 lr 0.000104 time 2.1641 (2.2571) loss 3.7242 (3.1552) grad_norm 2.2117 (2.5070) [2022-01-25 09:10:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][320/1251] eta 0:35:03 lr 0.000104 time 2.0900 (2.2593) loss 2.9928 (3.1553) grad_norm 2.6389 (2.5063) [2022-01-25 09:11:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][330/1251] eta 0:34:39 lr 0.000104 time 2.2426 (2.2581) loss 3.4542 (3.1557) grad_norm 2.4147 (2.5026) [2022-01-25 09:11:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][340/1251] eta 0:34:17 lr 0.000104 time 2.1316 (2.2586) loss 2.9423 (3.1524) grad_norm 2.2441 (2.4999) [2022-01-25 09:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][350/1251] eta 0:33:49 lr 0.000104 time 1.7043 (2.2524) loss 3.3560 (3.1467) grad_norm 2.9557 (2.5044) [2022-01-25 09:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][360/1251] eta 0:33:26 lr 0.000104 time 1.9510 (2.2518) loss 3.0389 (3.1483) grad_norm 2.2768 (2.5010) [2022-01-25 09:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][370/1251] eta 0:33:04 lr 0.000104 time 2.5946 (2.2523) loss 3.5939 (3.1525) 
grad_norm 2.6491 (2.5015) [2022-01-25 09:13:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][380/1251] eta 0:32:41 lr 0.000104 time 3.4320 (2.2526) loss 3.5032 (3.1493) grad_norm 2.7210 (2.5007) [2022-01-25 09:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][390/1251] eta 0:32:16 lr 0.000104 time 1.6720 (2.2486) loss 3.2155 (3.1480) grad_norm 2.8065 (2.5003) [2022-01-25 09:13:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][400/1251] eta 0:31:48 lr 0.000104 time 1.5640 (2.2428) loss 3.7553 (3.1500) grad_norm 2.4352 (2.4996) [2022-01-25 09:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][410/1251] eta 0:31:23 lr 0.000104 time 1.9177 (2.2400) loss 3.4337 (3.1519) grad_norm 2.7717 (2.5065) [2022-01-25 09:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][420/1251] eta 0:31:03 lr 0.000104 time 3.6994 (2.2431) loss 3.4725 (3.1521) grad_norm 2.1177 (2.5120) [2022-01-25 09:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][430/1251] eta 0:30:42 lr 0.000103 time 1.8233 (2.2439) loss 3.6604 (3.1516) grad_norm 2.4123 (2.5154) [2022-01-25 09:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][440/1251] eta 0:30:20 lr 0.000103 time 1.5787 (2.2445) loss 2.5621 (3.1472) grad_norm 2.9844 (2.5176) [2022-01-25 09:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][450/1251] eta 0:29:55 lr 0.000103 time 2.1653 (2.2414) loss 2.6250 (3.1506) grad_norm 2.3727 (2.5176) [2022-01-25 09:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][460/1251] eta 0:29:30 lr 0.000103 time 2.5427 (2.2385) loss 1.8353 (3.1469) grad_norm 2.5141 (2.5178) [2022-01-25 09:16:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][470/1251] eta 0:29:08 lr 0.000103 time 2.5448 (2.2382) loss 3.7236 (3.1456) grad_norm 2.2916 (2.5175) [2022-01-25 09:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [240/300][480/1251] eta 0:28:44 lr 0.000103 time 2.1479 (2.2371) loss 2.5791 (3.1493) grad_norm 2.4843 (2.5172) [2022-01-25 09:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][490/1251] eta 0:28:19 lr 0.000103 time 1.9440 (2.2331) loss 3.6755 (3.1551) grad_norm 2.6021 (2.5177) [2022-01-25 09:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][500/1251] eta 0:27:54 lr 0.000103 time 1.9601 (2.2297) loss 2.5108 (3.1598) grad_norm 2.3269 (2.5189) [2022-01-25 09:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][510/1251] eta 0:27:28 lr 0.000103 time 1.9257 (2.2250) loss 2.6465 (3.1558) grad_norm 2.5634 (2.5210) [2022-01-25 09:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][520/1251] eta 0:27:05 lr 0.000103 time 2.1868 (2.2233) loss 2.2860 (3.1516) grad_norm 2.5783 (2.5237) [2022-01-25 09:18:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][530/1251] eta 0:26:42 lr 0.000103 time 2.3455 (2.2232) loss 2.3995 (3.1497) grad_norm 2.5662 (2.5214) [2022-01-25 09:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][540/1251] eta 0:26:19 lr 0.000103 time 2.4168 (2.2221) loss 2.9954 (3.1515) grad_norm 2.7334 (2.5261) [2022-01-25 09:19:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][550/1251] eta 0:25:58 lr 0.000103 time 1.9234 (2.2230) loss 3.6638 (3.1515) grad_norm 2.4795 (2.5262) [2022-01-25 09:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][560/1251] eta 0:25:37 lr 0.000103 time 3.1053 (2.2245) loss 3.4524 (3.1528) grad_norm 2.5871 (2.5298) [2022-01-25 09:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][570/1251] eta 0:25:16 lr 0.000103 time 2.5532 (2.2262) loss 2.2857 (3.1521) grad_norm 2.6383 (2.5324) [2022-01-25 09:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][580/1251] eta 0:24:54 lr 0.000103 time 2.7476 (2.2268) loss 3.6464 (3.1539) 
grad_norm 2.6959 (2.5332) [2022-01-25 09:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][590/1251] eta 0:24:31 lr 0.000103 time 2.3921 (2.2262) loss 2.4803 (3.1527) grad_norm 2.9573 (2.5343) [2022-01-25 09:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][600/1251] eta 0:24:07 lr 0.000103 time 1.9733 (2.2230) loss 2.9991 (3.1491) grad_norm 2.6513 (2.5361) [2022-01-25 09:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][610/1251] eta 0:23:43 lr 0.000103 time 2.2843 (2.2201) loss 2.8247 (3.1489) grad_norm 2.4690 (2.5363) [2022-01-25 09:21:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][620/1251] eta 0:23:20 lr 0.000103 time 2.0963 (2.2194) loss 3.2564 (3.1484) grad_norm 2.3643 (2.5348) [2022-01-25 09:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][630/1251] eta 0:22:58 lr 0.000103 time 2.8091 (2.2204) loss 2.9433 (3.1498) grad_norm 2.3424 (2.5325) [2022-01-25 09:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][640/1251] eta 0:22:35 lr 0.000103 time 1.8430 (2.2178) loss 3.5685 (3.1531) grad_norm 2.4903 (2.5311) [2022-01-25 09:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][650/1251] eta 0:22:13 lr 0.000103 time 2.6013 (2.2180) loss 3.0442 (3.1525) grad_norm 2.5177 (2.5294) [2022-01-25 09:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][660/1251] eta 0:21:49 lr 0.000103 time 2.1939 (2.2156) loss 2.0885 (3.1498) grad_norm 2.3552 (2.5282) [2022-01-25 09:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][670/1251] eta 0:21:26 lr 0.000103 time 2.1037 (2.2146) loss 3.5859 (3.1507) grad_norm 2.4576 (2.5281) [2022-01-25 09:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][680/1251] eta 0:21:04 lr 0.000103 time 2.7542 (2.2142) loss 3.4840 (3.1529) grad_norm 2.4341 (2.5284) [2022-01-25 09:24:24 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [240/300][690/1251] eta 0:20:42 lr 0.000103 time 1.5222 (2.2148) loss 3.7156 (3.1525) grad_norm 2.1027 (2.5271) [2022-01-25 09:24:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][700/1251] eta 0:20:22 lr 0.000103 time 1.9003 (2.2179) loss 2.4120 (3.1518) grad_norm 2.4832 (2.5257) [2022-01-25 09:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][710/1251] eta 0:20:00 lr 0.000103 time 1.9437 (2.2185) loss 2.7693 (3.1492) grad_norm 3.1529 (2.5261) [2022-01-25 09:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][720/1251] eta 0:19:38 lr 0.000103 time 2.1711 (2.2188) loss 3.7242 (3.1500) grad_norm 2.6308 (2.5262) [2022-01-25 09:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][730/1251] eta 0:19:14 lr 0.000103 time 1.6000 (2.2157) loss 3.0360 (3.1498) grad_norm 2.1426 (2.5265) [2022-01-25 09:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][740/1251] eta 0:18:50 lr 0.000103 time 2.1891 (2.2132) loss 2.7692 (3.1506) grad_norm 2.0962 (2.5252) [2022-01-25 09:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][750/1251] eta 0:18:27 lr 0.000103 time 2.5712 (2.2115) loss 3.1535 (3.1515) grad_norm 2.2463 (2.5245) [2022-01-25 09:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][760/1251] eta 0:18:05 lr 0.000103 time 2.1868 (2.2116) loss 3.2069 (3.1525) grad_norm 2.8313 (2.5239) [2022-01-25 09:27:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][770/1251] eta 0:17:43 lr 0.000103 time 2.1408 (2.2106) loss 3.6966 (3.1511) grad_norm 2.4515 (2.5235) [2022-01-25 09:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][780/1251] eta 0:17:21 lr 0.000103 time 2.5817 (2.2123) loss 2.7740 (3.1523) grad_norm 2.4712 (2.5255) [2022-01-25 09:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][790/1251] eta 0:17:00 lr 0.000103 time 1.9721 (2.2130) loss 3.3833 (3.1518) 
grad_norm 2.4567 (2.5250) [2022-01-25 09:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][800/1251] eta 0:16:38 lr 0.000103 time 2.1421 (2.2141) loss 2.2811 (3.1505) grad_norm 2.3903 (2.5236) [2022-01-25 09:28:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][810/1251] eta 0:16:15 lr 0.000103 time 2.2021 (2.2127) loss 3.3920 (3.1487) grad_norm 2.2298 (2.5218) [2022-01-25 09:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][820/1251] eta 0:15:52 lr 0.000103 time 1.9538 (2.2108) loss 3.4759 (3.1506) grad_norm 2.5870 (2.5213) [2022-01-25 09:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][830/1251] eta 0:15:31 lr 0.000103 time 1.9002 (2.2118) loss 3.3118 (3.1517) grad_norm 2.3416 (2.5202) [2022-01-25 09:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][840/1251] eta 0:15:08 lr 0.000103 time 2.1318 (2.2116) loss 2.7757 (3.1530) grad_norm 3.0658 (2.5209) [2022-01-25 09:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][850/1251] eta 0:14:45 lr 0.000102 time 2.0913 (2.2093) loss 3.5621 (3.1535) grad_norm 2.4919 (2.5202) [2022-01-25 09:30:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][860/1251] eta 0:14:23 lr 0.000102 time 2.1044 (2.2088) loss 2.8973 (3.1544) grad_norm 2.6646 (2.5197) [2022-01-25 09:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][870/1251] eta 0:14:01 lr 0.000102 time 2.5222 (2.2074) loss 3.3043 (3.1565) grad_norm 2.7196 (2.5180) [2022-01-25 09:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][880/1251] eta 0:13:38 lr 0.000102 time 2.1690 (2.2074) loss 3.3688 (3.1570) grad_norm 2.1802 (2.5170) [2022-01-25 09:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][890/1251] eta 0:13:17 lr 0.000102 time 1.9720 (2.2089) loss 3.2652 (3.1576) grad_norm 2.4344 (2.5190) [2022-01-25 09:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [240/300][900/1251] eta 0:12:55 lr 0.000102 time 2.1978 (2.2084) loss 3.9301 (3.1574) grad_norm 2.4180 (2.5191) [2022-01-25 09:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][910/1251] eta 0:12:33 lr 0.000102 time 1.8417 (2.2083) loss 3.2340 (3.1564) grad_norm 2.4944 (2.5176) [2022-01-25 09:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][920/1251] eta 0:12:11 lr 0.000102 time 1.8236 (2.2087) loss 2.9728 (3.1538) grad_norm 2.4136 (2.5157) [2022-01-25 09:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][930/1251] eta 0:11:49 lr 0.000102 time 1.9006 (2.2090) loss 3.7593 (3.1512) grad_norm 2.6283 (2.5138) [2022-01-25 09:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][940/1251] eta 0:11:26 lr 0.000102 time 1.8652 (2.2080) loss 2.8085 (3.1510) grad_norm 2.5305 (2.5130) [2022-01-25 09:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][950/1251] eta 0:11:04 lr 0.000102 time 1.6407 (2.2075) loss 3.5770 (3.1497) grad_norm 2.6935 (2.5129) [2022-01-25 09:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][960/1251] eta 0:10:42 lr 0.000102 time 1.9161 (2.2067) loss 2.8665 (3.1481) grad_norm 2.1154 (2.5122) [2022-01-25 09:34:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][970/1251] eta 0:10:19 lr 0.000102 time 2.2087 (2.2046) loss 3.2476 (3.1496) grad_norm 2.6199 (2.5121) [2022-01-25 09:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][980/1251] eta 0:09:57 lr 0.000102 time 2.1024 (2.2031) loss 3.4210 (3.1490) grad_norm 2.1457 (2.5119) [2022-01-25 09:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][990/1251] eta 0:09:34 lr 0.000102 time 2.1677 (2.2018) loss 3.2435 (3.1502) grad_norm 2.6445 (2.5111) [2022-01-25 09:35:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1000/1251] eta 0:09:12 lr 0.000102 time 2.3300 (2.2016) loss 3.3582 (3.1517) 
grad_norm 2.2323 (2.5093) [2022-01-25 09:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1010/1251] eta 0:08:50 lr 0.000102 time 2.5568 (2.2027) loss 3.1356 (3.1516) grad_norm 2.3576 (2.5084) [2022-01-25 09:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1020/1251] eta 0:08:28 lr 0.000102 time 1.5334 (2.2030) loss 3.5262 (3.1511) grad_norm 2.6169 (2.5080) [2022-01-25 09:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1030/1251] eta 0:08:07 lr 0.000102 time 2.5671 (2.2041) loss 3.2530 (3.1507) grad_norm 3.2078 (2.5093) [2022-01-25 09:37:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1040/1251] eta 0:07:45 lr 0.000102 time 2.1676 (2.2045) loss 3.1219 (3.1523) grad_norm 2.5641 (2.5082) [2022-01-25 09:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1050/1251] eta 0:07:23 lr 0.000102 time 2.5433 (2.2062) loss 2.4259 (3.1526) grad_norm 2.4604 (2.5084) [2022-01-25 09:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1060/1251] eta 0:07:01 lr 0.000102 time 1.8830 (2.2068) loss 2.9752 (3.1531) grad_norm 2.5728 (2.5124) [2022-01-25 09:38:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1070/1251] eta 0:06:39 lr 0.000102 time 1.9132 (2.2063) loss 3.0940 (3.1524) grad_norm 2.3952 (2.5120) [2022-01-25 09:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1080/1251] eta 0:06:17 lr 0.000102 time 2.6222 (2.2054) loss 2.5554 (3.1482) grad_norm 3.0765 (2.5118) [2022-01-25 09:38:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1090/1251] eta 0:05:54 lr 0.000102 time 1.8046 (2.2047) loss 3.3803 (3.1482) grad_norm 2.1782 (2.5105) [2022-01-25 09:39:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1100/1251] eta 0:05:32 lr 0.000102 time 2.0209 (2.2026) loss 3.4905 (3.1473) grad_norm 2.6108 (2.5102) [2022-01-25 09:39:39 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [240/300][1110/1251] eta 0:05:10 lr 0.000102 time 1.8237 (2.2014) loss 3.2385 (3.1465) grad_norm 2.2211 (2.5110) [2022-01-25 09:40:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1120/1251] eta 0:04:48 lr 0.000102 time 1.8527 (2.2004) loss 2.6568 (3.1447) grad_norm 3.0459 (2.5132) [2022-01-25 09:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1130/1251] eta 0:04:26 lr 0.000102 time 2.9424 (2.2009) loss 3.3297 (3.1442) grad_norm 2.1306 (2.5125) [2022-01-25 09:40:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1140/1251] eta 0:04:04 lr 0.000102 time 2.0663 (2.2014) loss 2.9415 (3.1443) grad_norm 2.6832 (2.5124) [2022-01-25 09:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1150/1251] eta 0:03:42 lr 0.000102 time 2.0185 (2.2016) loss 2.0245 (3.1448) grad_norm 2.5116 (2.5124) [2022-01-25 09:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1160/1251] eta 0:03:20 lr 0.000102 time 2.4622 (2.2028) loss 3.4004 (3.1455) grad_norm 2.7519 (2.5118) [2022-01-25 09:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1170/1251] eta 0:02:58 lr 0.000102 time 2.5148 (2.2044) loss 2.9308 (3.1459) grad_norm 2.4703 (2.5112) [2022-01-25 09:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1180/1251] eta 0:02:36 lr 0.000102 time 1.9350 (2.2035) loss 3.2270 (3.1456) grad_norm 3.5013 (2.5111) [2022-01-25 09:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1190/1251] eta 0:02:14 lr 0.000102 time 2.0696 (2.2017) loss 3.6754 (3.1457) grad_norm 2.5386 (2.5111) [2022-01-25 09:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1200/1251] eta 0:01:52 lr 0.000102 time 1.9786 (2.2005) loss 2.6866 (3.1453) grad_norm 2.1070 (2.5115) [2022-01-25 09:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1210/1251] eta 0:01:30 lr 0.000102 time 2.5010 (2.2015) loss 
2.8305 (3.1448) grad_norm 2.3849 (2.5111) [2022-01-25 09:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1220/1251] eta 0:01:08 lr 0.000102 time 2.2872 (2.2012) loss 2.8780 (3.1422) grad_norm 3.8185 (2.5115) [2022-01-25 09:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1230/1251] eta 0:00:46 lr 0.000102 time 1.8284 (2.2020) loss 3.5261 (3.1430) grad_norm 2.5226 (2.5106) [2022-01-25 09:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1240/1251] eta 0:00:24 lr 0.000102 time 1.5363 (2.2010) loss 2.6808 (3.1439) grad_norm 2.0642 (2.5098) [2022-01-25 09:44:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [240/300][1250/1251] eta 0:00:02 lr 0.000102 time 1.3498 (2.1957) loss 2.5898 (3.1436) grad_norm 2.0699 (2.5098) [2022-01-25 09:44:40 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 240 training takes 0:45:47 [2022-01-25 09:44:40 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_240 saving...... [2022-01-25 09:44:52 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_240 saved !!! 
[2022-01-25 09:45:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 12.674 (12.674) Loss 0.7978 (0.7978) Acc@1 80.762 (80.762) Acc@5 95.312 (95.312) [2022-01-25 09:45:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 5.144 (2.747) Loss 0.8025 (0.8550) Acc@1 80.371 (79.652) Acc@5 95.898 (94.789) [2022-01-25 09:45:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.315 (2.297) Loss 0.8432 (0.8476) Acc@1 81.250 (79.985) Acc@5 95.215 (94.968) [2022-01-25 09:45:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.358 (2.028) Loss 0.8279 (0.8425) Acc@1 81.055 (80.182) Acc@5 95.312 (94.982) [2022-01-25 09:46:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.728 (1.945) Loss 0.7870 (0.8431) Acc@1 82.227 (80.119) Acc@5 95.410 (95.070) [2022-01-25 09:46:19 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.056 Acc@5 95.104 [2022-01-25 09:46:19 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.1% [2022-01-25 09:46:19 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.06% [2022-01-25 09:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][0/1251] eta 7:53:26 lr 0.000102 time 22.7069 (22.7069) loss 3.1795 (3.1795) grad_norm 2.4521 (2.4521) [2022-01-25 09:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][10/1251] eta 1:24:01 lr 0.000101 time 2.0596 (4.0623) loss 2.7249 (2.8975) grad_norm 2.4075 (2.3606) [2022-01-25 09:47:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][20/1251] eta 1:05:59 lr 0.000101 time 1.4446 (3.2167) loss 3.5472 (2.9722) grad_norm 2.5757 (2.4223) [2022-01-25 09:47:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][30/1251] eta 0:59:19 lr 0.000101 time 1.3924 (2.9153) loss 3.5622 (3.0804) grad_norm 2.2860 (2.4111) [2022-01-25 09:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[241/300][40/1251] eta 0:56:07 lr 0.000101 time 3.6283 (2.7810) loss 3.1770 (3.0955) grad_norm 1.9434 (2.3987) [2022-01-25 09:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][50/1251] eta 0:53:32 lr 0.000101 time 2.3613 (2.6749) loss 3.0542 (3.0971) grad_norm 2.2143 (2.3867) [2022-01-25 09:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][60/1251] eta 0:51:34 lr 0.000101 time 1.5925 (2.5981) loss 3.2810 (3.1031) grad_norm 2.4043 (2.3892) [2022-01-25 09:49:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][70/1251] eta 0:49:41 lr 0.000101 time 1.6004 (2.5242) loss 2.2318 (3.0816) grad_norm 2.2017 (2.4185) [2022-01-25 09:49:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][80/1251] eta 0:48:23 lr 0.000101 time 2.4863 (2.4795) loss 3.4800 (3.0816) grad_norm 2.3206 (2.4480) [2022-01-25 09:50:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][90/1251] eta 0:47:17 lr 0.000101 time 1.6226 (2.4439) loss 3.1958 (3.1173) grad_norm 2.5818 (2.4491) [2022-01-25 09:50:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][100/1251] eta 0:46:29 lr 0.000101 time 1.5918 (2.4235) loss 3.4727 (3.1398) grad_norm 2.3889 (2.4652) [2022-01-25 09:50:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][110/1251] eta 0:45:45 lr 0.000101 time 1.6369 (2.4065) loss 2.8185 (3.1517) grad_norm 2.3910 (2.4668) [2022-01-25 09:51:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][120/1251] eta 0:45:00 lr 0.000101 time 3.3947 (2.3874) loss 2.1529 (3.1512) grad_norm 3.2111 (2.4784) [2022-01-25 09:51:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][130/1251] eta 0:44:16 lr 0.000101 time 1.7148 (2.3693) loss 3.2353 (3.1655) grad_norm 2.5117 (2.4846) [2022-01-25 09:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][140/1251] eta 0:43:33 lr 0.000101 time 1.7457 (2.3528) loss 3.5860 (3.1593) grad_norm 2.7923 
(2.4871) [2022-01-25 09:52:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][150/1251] eta 0:42:52 lr 0.000101 time 2.0009 (2.3365) loss 3.6733 (3.1708) grad_norm 2.3988 (2.4844) [2022-01-25 09:52:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][160/1251] eta 0:42:27 lr 0.000101 time 2.8995 (2.3346) loss 3.2167 (3.1713) grad_norm 2.3437 (2.4913) [2022-01-25 09:52:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][170/1251] eta 0:41:55 lr 0.000101 time 2.0121 (2.3270) loss 3.4066 (3.1649) grad_norm 2.3360 (2.4875) [2022-01-25 09:53:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][180/1251] eta 0:41:30 lr 0.000101 time 1.6639 (2.3255) loss 3.3547 (3.1638) grad_norm 2.9326 (2.4861) [2022-01-25 09:53:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][190/1251] eta 0:40:51 lr 0.000101 time 2.2172 (2.3106) loss 3.2324 (3.1552) grad_norm 2.0503 (2.4929) [2022-01-25 09:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][200/1251] eta 0:40:15 lr 0.000101 time 1.6549 (2.2981) loss 3.5477 (3.1515) grad_norm 3.0249 (2.4960) [2022-01-25 09:54:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][210/1251] eta 0:39:58 lr 0.000101 time 2.2789 (2.3037) loss 3.5898 (3.1589) grad_norm 2.3467 (2.4960) [2022-01-25 09:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][220/1251] eta 0:39:34 lr 0.000101 time 2.1547 (2.3027) loss 3.4922 (3.1599) grad_norm 2.8721 (2.4993) [2022-01-25 09:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][230/1251] eta 0:39:06 lr 0.000101 time 2.3468 (2.2987) loss 3.3797 (3.1625) grad_norm 2.2318 (2.5041) [2022-01-25 09:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][240/1251] eta 0:38:37 lr 0.000101 time 1.6701 (2.2923) loss 3.4755 (3.1605) grad_norm 2.5578 (2.4978) [2022-01-25 09:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[241/300][250/1251] eta 0:38:07 lr 0.000101 time 1.6144 (2.2855) loss 3.1356 (3.1591) grad_norm 2.6104 (2.5001) [2022-01-25 09:56:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][260/1251] eta 0:37:34 lr 0.000101 time 1.9644 (2.2748) loss 3.0691 (3.1621) grad_norm 2.1211 (2.4927) [2022-01-25 09:56:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][270/1251] eta 0:37:13 lr 0.000101 time 2.7339 (2.2767) loss 1.9737 (3.1623) grad_norm 2.3154 (2.4916) [2022-01-25 09:56:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][280/1251] eta 0:36:43 lr 0.000101 time 1.9361 (2.2698) loss 2.8133 (3.1512) grad_norm 2.6018 (2.4903) [2022-01-25 09:57:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][290/1251] eta 0:36:21 lr 0.000101 time 1.8633 (2.2704) loss 3.8552 (3.1509) grad_norm 2.5003 (2.4887) [2022-01-25 09:57:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][300/1251] eta 0:35:57 lr 0.000101 time 2.2570 (2.2687) loss 3.1372 (3.1495) grad_norm 2.4080 (2.4874) [2022-01-25 09:58:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][310/1251] eta 0:35:32 lr 0.000101 time 2.2016 (2.2666) loss 2.3430 (3.1447) grad_norm 2.4574 (2.4892) [2022-01-25 09:58:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][320/1251] eta 0:35:03 lr 0.000101 time 1.7029 (2.2597) loss 3.6883 (3.1399) grad_norm 2.3088 (2.4870) [2022-01-25 09:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][330/1251] eta 0:34:44 lr 0.000101 time 2.0418 (2.2634) loss 2.5753 (3.1380) grad_norm 2.7708 (2.4864) [2022-01-25 09:59:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][340/1251] eta 0:34:20 lr 0.000101 time 1.6370 (2.2614) loss 2.6062 (3.1341) grad_norm 2.5530 (2.4888) [2022-01-25 09:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][350/1251] eta 0:33:54 lr 0.000101 time 1.9423 (2.2586) loss 3.2254 (3.1406) grad_norm 
2.3399 (2.4867) [2022-01-25 09:59:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][360/1251] eta 0:33:28 lr 0.000101 time 1.7477 (2.2541) loss 3.1309 (3.1400) grad_norm 2.4445 (2.4878) [2022-01-25 10:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][370/1251] eta 0:33:04 lr 0.000101 time 1.9893 (2.2520) loss 2.0719 (3.1374) grad_norm 2.3995 (2.4899) [2022-01-25 10:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][380/1251] eta 0:32:37 lr 0.000101 time 2.5259 (2.2470) loss 2.2574 (3.1388) grad_norm 2.2999 (2.4891) [2022-01-25 10:00:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][390/1251] eta 0:32:10 lr 0.000101 time 1.9123 (2.2420) loss 3.3841 (3.1417) grad_norm 2.4338 (2.4900) [2022-01-25 10:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][400/1251] eta 0:31:41 lr 0.000101 time 2.2894 (2.2347) loss 3.7821 (3.1402) grad_norm 2.6521 (2.4996) [2022-01-25 10:01:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][410/1251] eta 0:31:18 lr 0.000101 time 2.4330 (2.2340) loss 3.1415 (3.1426) grad_norm 2.8201 (2.5021) [2022-01-25 10:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][420/1251] eta 0:30:54 lr 0.000101 time 1.7200 (2.2318) loss 3.6479 (3.1460) grad_norm 2.5107 (2.5104) [2022-01-25 10:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][430/1251] eta 0:30:30 lr 0.000100 time 1.6163 (2.2295) loss 2.5333 (3.1452) grad_norm 2.7242 (2.5119) [2022-01-25 10:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][440/1251] eta 0:30:12 lr 0.000100 time 2.7553 (2.2344) loss 3.3788 (3.1510) grad_norm 2.3537 (2.5103) [2022-01-25 10:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][450/1251] eta 0:29:52 lr 0.000100 time 2.1967 (2.2373) loss 3.3411 (3.1518) grad_norm 2.7256 (2.5090) [2022-01-25 10:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[241/300][460/1251] eta 0:29:29 lr 0.000100 time 2.5276 (2.2365) loss 3.4875 (3.1527) grad_norm 2.4704 (2.5069) [2022-01-25 10:03:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][470/1251] eta 0:29:06 lr 0.000100 time 2.2458 (2.2368) loss 2.5965 (3.1560) grad_norm 2.8805 (2.5051) [2022-01-25 10:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][480/1251] eta 0:28:43 lr 0.000100 time 2.7278 (2.2353) loss 2.5103 (3.1559) grad_norm 2.6243 (2.5035) [2022-01-25 10:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][490/1251] eta 0:28:18 lr 0.000100 time 1.9885 (2.2324) loss 3.7699 (3.1614) grad_norm 2.5412 (2.5037) [2022-01-25 10:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][500/1251] eta 0:27:55 lr 0.000100 time 2.7289 (2.2311) loss 2.4293 (3.1569) grad_norm 2.2233 (2.5033) [2022-01-25 10:05:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][510/1251] eta 0:27:32 lr 0.000100 time 1.8718 (2.2295) loss 3.2934 (3.1552) grad_norm 2.2032 (2.5009) [2022-01-25 10:05:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][520/1251] eta 0:27:08 lr 0.000100 time 2.2874 (2.2278) loss 3.1232 (3.1523) grad_norm 2.9417 (2.5000) [2022-01-25 10:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][530/1251] eta 0:26:44 lr 0.000100 time 2.3184 (2.2257) loss 3.6699 (3.1507) grad_norm 2.6322 (2.5067) [2022-01-25 10:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][540/1251] eta 0:26:20 lr 0.000100 time 2.0783 (2.2236) loss 2.5647 (3.1461) grad_norm 2.3885 (2.5143) [2022-01-25 10:06:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][550/1251] eta 0:25:58 lr 0.000100 time 1.6708 (2.2230) loss 3.4530 (3.1452) grad_norm 2.1122 (2.5128) [2022-01-25 10:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][560/1251] eta 0:25:35 lr 0.000100 time 2.2306 (2.2217) loss 3.2795 (3.1507) grad_norm 
2.3930 (2.5150) [2022-01-25 10:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][570/1251] eta 0:25:12 lr 0.000100 time 2.6453 (2.2207) loss 2.2201 (3.1488) grad_norm 3.3290 (2.5185) [2022-01-25 10:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][580/1251] eta 0:24:49 lr 0.000100 time 2.5739 (2.2195) loss 1.9871 (3.1451) grad_norm 2.2692 (2.5194) [2022-01-25 10:08:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][590/1251] eta 0:24:27 lr 0.000100 time 1.9143 (2.2204) loss 3.3985 (3.1447) grad_norm 2.8730 (2.5196) [2022-01-25 10:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][600/1251] eta 0:24:04 lr 0.000100 time 1.9778 (2.2190) loss 3.4314 (3.1481) grad_norm 2.1261 (2.5231) [2022-01-25 10:08:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][610/1251] eta 0:23:43 lr 0.000100 time 2.5288 (2.2206) loss 3.5135 (3.1493) grad_norm 2.8895 (2.5225) [2022-01-25 10:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][620/1251] eta 0:23:21 lr 0.000100 time 1.7616 (2.2203) loss 3.0539 (3.1488) grad_norm 2.9100 (2.5231) [2022-01-25 10:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][630/1251] eta 0:22:58 lr 0.000100 time 2.4591 (2.2204) loss 3.4832 (3.1476) grad_norm 2.7225 (2.5229) [2022-01-25 10:09:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][640/1251] eta 0:22:34 lr 0.000100 time 1.8275 (2.2161) loss 3.1875 (3.1455) grad_norm 2.1456 (2.5218) [2022-01-25 10:10:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][650/1251] eta 0:22:10 lr 0.000100 time 1.9392 (2.2141) loss 2.5240 (3.1440) grad_norm 2.4548 (2.5217) [2022-01-25 10:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][660/1251] eta 0:21:48 lr 0.000100 time 1.7901 (2.2136) loss 3.3574 (3.1459) grad_norm 2.4627 (2.5234) [2022-01-25 10:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[241/300][670/1251] eta 0:21:29 lr 0.000100 time 6.5860 (2.2193) loss 2.8346 (3.1477) grad_norm 2.1299 (2.5229) [2022-01-25 10:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][680/1251] eta 0:21:05 lr 0.000100 time 1.5866 (2.2171) loss 3.0986 (3.1441) grad_norm 2.7667 (2.5226) [2022-01-25 10:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][690/1251] eta 0:20:44 lr 0.000100 time 2.6218 (2.2177) loss 3.4734 (3.1391) grad_norm 2.4673 (2.5248) [2022-01-25 10:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][700/1251] eta 0:20:20 lr 0.000100 time 1.5648 (2.2156) loss 2.7353 (3.1365) grad_norm 3.1163 (2.5272) [2022-01-25 10:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][710/1251] eta 0:19:57 lr 0.000100 time 2.6412 (2.2143) loss 3.3200 (3.1390) grad_norm 2.7130 (2.5294) [2022-01-25 10:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][720/1251] eta 0:19:34 lr 0.000100 time 1.9367 (2.2121) loss 3.7553 (3.1395) grad_norm 2.0034 (2.5279) [2022-01-25 10:13:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][730/1251] eta 0:19:13 lr 0.000100 time 2.7711 (2.2142) loss 2.1211 (3.1380) grad_norm 2.0065 (2.5276) [2022-01-25 10:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][740/1251] eta 0:18:51 lr 0.000100 time 1.9741 (2.2144) loss 3.1022 (3.1404) grad_norm 2.5435 (2.5304) [2022-01-25 10:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][750/1251] eta 0:18:28 lr 0.000100 time 3.3134 (2.2130) loss 2.8618 (3.1419) grad_norm 2.3108 (2.5285) [2022-01-25 10:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][760/1251] eta 0:18:06 lr 0.000100 time 1.8320 (2.2130) loss 3.7374 (3.1435) grad_norm 2.3659 (2.5265) [2022-01-25 10:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][770/1251] eta 0:17:44 lr 0.000100 time 2.5141 (2.2128) loss 2.8805 (3.1441) grad_norm 
2.1989 (2.5250) [2022-01-25 10:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][780/1251] eta 0:17:21 lr 0.000100 time 1.9224 (2.2108) loss 3.0315 (3.1436) grad_norm 2.4439 (2.5256) [2022-01-25 10:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][790/1251] eta 0:16:58 lr 0.000100 time 1.9226 (2.2093) loss 3.2089 (3.1421) grad_norm 2.3148 (2.5233) [2022-01-25 10:15:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][800/1251] eta 0:16:35 lr 0.000100 time 1.9504 (2.2079) loss 3.5259 (3.1450) grad_norm 2.2567 (2.5223) [2022-01-25 10:16:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][810/1251] eta 0:16:14 lr 0.000100 time 2.4695 (2.2089) loss 2.1661 (3.1432) grad_norm 2.4707 (2.5218) [2022-01-25 10:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][820/1251] eta 0:15:52 lr 0.000100 time 1.5613 (2.2090) loss 3.5858 (3.1419) grad_norm 2.6640 (2.5209) [2022-01-25 10:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][830/1251] eta 0:15:30 lr 0.000100 time 2.1142 (2.2099) loss 2.4818 (3.1427) grad_norm 2.4421 (2.5193) [2022-01-25 10:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][840/1251] eta 0:15:07 lr 0.000100 time 1.7575 (2.2084) loss 3.2634 (3.1389) grad_norm 3.2268 (2.5195) [2022-01-25 10:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][850/1251] eta 0:14:46 lr 0.000099 time 2.0232 (2.2095) loss 3.3123 (3.1407) grad_norm 3.0373 (2.5195) [2022-01-25 10:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][860/1251] eta 0:14:23 lr 0.000099 time 1.5435 (2.2083) loss 3.5272 (3.1430) grad_norm 2.5608 (2.5200) [2022-01-25 10:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][870/1251] eta 0:14:00 lr 0.000099 time 2.5246 (2.2070) loss 3.2857 (3.1449) grad_norm 2.2190 (2.5207) [2022-01-25 10:18:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[241/300][880/1251] eta 0:13:38 lr 0.000099 time 1.5873 (2.2055) loss 3.7284 (3.1486) grad_norm 2.7203 (2.5217) [2022-01-25 10:19:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][890/1251] eta 0:13:16 lr 0.000099 time 1.8410 (2.2055) loss 3.3221 (3.1487) grad_norm 2.6005 (2.5226) [2022-01-25 10:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][900/1251] eta 0:12:53 lr 0.000099 time 1.8525 (2.2044) loss 3.1116 (3.1499) grad_norm 2.2886 (2.5223) [2022-01-25 10:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][910/1251] eta 0:12:31 lr 0.000099 time 2.4663 (2.2048) loss 3.0613 (3.1497) grad_norm 2.3320 (2.5219) [2022-01-25 10:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][920/1251] eta 0:12:10 lr 0.000099 time 2.2472 (2.2066) loss 3.3458 (3.1508) grad_norm 2.7958 (2.5229) [2022-01-25 10:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][930/1251] eta 0:11:48 lr 0.000099 time 2.4853 (2.2069) loss 3.3872 (3.1534) grad_norm 2.9758 (2.5222) [2022-01-25 10:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][940/1251] eta 0:11:26 lr 0.000099 time 2.1117 (2.2064) loss 3.5547 (3.1535) grad_norm 2.4803 (2.5227) [2022-01-25 10:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][950/1251] eta 0:11:03 lr 0.000099 time 2.9597 (2.2058) loss 2.1929 (3.1516) grad_norm 2.2274 (2.5229) [2022-01-25 10:21:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][960/1251] eta 0:10:41 lr 0.000099 time 1.7172 (2.2053) loss 2.9726 (3.1509) grad_norm 2.6623 (2.5218) [2022-01-25 10:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][970/1251] eta 0:10:19 lr 0.000099 time 2.0723 (2.2046) loss 3.2229 (3.1486) grad_norm 2.7225 (2.5239) [2022-01-25 10:22:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][980/1251] eta 0:09:57 lr 0.000099 time 1.6097 (2.2050) loss 3.3847 (3.1503) grad_norm 
2.5857 (2.5233) [2022-01-25 10:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][990/1251] eta 0:09:35 lr 0.000099 time 1.7684 (2.2046) loss 2.4232 (3.1488) grad_norm 7.1774 (2.5286) [2022-01-25 10:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1000/1251] eta 0:09:13 lr 0.000099 time 1.8816 (2.2058) loss 2.4116 (3.1480) grad_norm 2.0781 (2.5301) [2022-01-25 10:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1010/1251] eta 0:08:51 lr 0.000099 time 2.2367 (2.2056) loss 2.0055 (3.1453) grad_norm 2.6623 (2.5315) [2022-01-25 10:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1020/1251] eta 0:08:29 lr 0.000099 time 1.6280 (2.2050) loss 2.6422 (3.1443) grad_norm 3.2212 (2.5311) [2022-01-25 10:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1030/1251] eta 0:08:07 lr 0.000099 time 1.8069 (2.2043) loss 3.4028 (3.1440) grad_norm 2.5010 (2.5307) [2022-01-25 10:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1040/1251] eta 0:07:45 lr 0.000099 time 1.9936 (2.2045) loss 3.1713 (3.1443) grad_norm 2.6755 (2.5305) [2022-01-25 10:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1050/1251] eta 0:07:22 lr 0.000099 time 1.8533 (2.2034) loss 3.3910 (3.1464) grad_norm 3.0620 (2.5329) [2022-01-25 10:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1060/1251] eta 0:07:00 lr 0.000099 time 2.2604 (2.2030) loss 3.5800 (3.1458) grad_norm 2.4696 (2.5328) [2022-01-25 10:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1070/1251] eta 0:06:38 lr 0.000099 time 2.3257 (2.2023) loss 2.9343 (3.1453) grad_norm 2.5724 (2.5315) [2022-01-25 10:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1080/1251] eta 0:06:16 lr 0.000099 time 2.5274 (2.2034) loss 3.3301 (3.1469) grad_norm 2.9359 (2.5311) [2022-01-25 10:26:22 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [241/300][1090/1251] eta 0:05:54 lr 0.000099 time 2.1921 (2.2023) loss 3.5829 (3.1459) grad_norm 2.3967 (2.5332) [2022-01-25 10:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1100/1251] eta 0:05:32 lr 0.000099 time 2.2302 (2.2017) loss 3.4479 (3.1443) grad_norm 2.1991 (2.5326) [2022-01-25 10:27:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1110/1251] eta 0:05:10 lr 0.000099 time 2.2546 (2.2017) loss 3.4476 (3.1443) grad_norm 2.2321 (2.5321) [2022-01-25 10:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1120/1251] eta 0:04:48 lr 0.000099 time 2.3613 (2.2013) loss 3.5095 (3.1450) grad_norm 2.6360 (2.5328) [2022-01-25 10:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1130/1251] eta 0:04:26 lr 0.000099 time 1.6939 (2.1998) loss 3.9698 (3.1442) grad_norm 2.6433 (2.5338) [2022-01-25 10:28:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1140/1251] eta 0:04:04 lr 0.000099 time 2.5283 (2.1994) loss 3.7364 (3.1438) grad_norm 2.3532 (2.5342) [2022-01-25 10:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1150/1251] eta 0:03:42 lr 0.000099 time 1.8272 (2.1993) loss 3.3082 (3.1430) grad_norm 2.3179 (2.5330) [2022-01-25 10:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1160/1251] eta 0:03:20 lr 0.000099 time 1.6345 (2.1987) loss 2.7970 (3.1427) grad_norm 2.3746 (2.5327) [2022-01-25 10:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1170/1251] eta 0:02:58 lr 0.000099 time 1.5878 (2.1982) loss 3.3060 (3.1436) grad_norm 2.8407 (2.5320) [2022-01-25 10:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1180/1251] eta 0:02:36 lr 0.000099 time 2.7892 (2.1990) loss 3.6621 (3.1438) grad_norm 2.0861 (2.5314) [2022-01-25 10:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1190/1251] eta 0:02:14 lr 0.000099 time 2.0503 (2.1987) loss 3.3901 
(3.1459) grad_norm 2.3555 (2.5320) [2022-01-25 10:30:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1200/1251] eta 0:01:52 lr 0.000099 time 2.2254 (2.1993) loss 3.0956 (3.1447) grad_norm 2.3913 (2.5326) [2022-01-25 10:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1210/1251] eta 0:01:30 lr 0.000099 time 1.6324 (2.1995) loss 2.9270 (3.1456) grad_norm 2.5496 (2.5321) [2022-01-25 10:31:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1220/1251] eta 0:01:08 lr 0.000099 time 2.4935 (2.1983) loss 3.1525 (3.1463) grad_norm 2.7516 (2.5330) [2022-01-25 10:31:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1230/1251] eta 0:00:46 lr 0.000099 time 2.3885 (2.1992) loss 2.3913 (3.1450) grad_norm 2.3960 (2.5314) [2022-01-25 10:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1240/1251] eta 0:00:24 lr 0.000099 time 1.3883 (2.1985) loss 3.5973 (3.1467) grad_norm 2.5221 (2.5303) [2022-01-25 10:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [241/300][1250/1251] eta 0:00:02 lr 0.000099 time 1.1627 (2.1933) loss 3.0019 (3.1485) grad_norm 2.6380 (2.5313) [2022-01-25 10:32:03 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 241 training takes 0:45:44 [2022-01-25 10:32:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.969 (17.969) Loss 0.8704 (0.8704) Acc@1 80.078 (80.078) Acc@5 94.238 (94.238) [2022-01-25 10:32:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.950 (3.566) Loss 0.8612 (0.8533) Acc@1 78.516 (79.679) Acc@5 95.020 (95.162) [2022-01-25 10:32:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.583 (2.561) Loss 0.8320 (0.8440) Acc@1 81.934 (80.046) Acc@5 94.824 (95.257) [2022-01-25 10:33:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.992 (2.311) Loss 0.8614 (0.8466) Acc@1 80.371 (79.968) Acc@5 95.020 (95.183) [2022-01-25 10:33:30 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.652 (2.121) Loss 0.8234 (0.8481) Acc@1 80.664 (79.921) Acc@5 95.703 (95.160) [2022-01-25 10:33:38 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 79.916 Acc@5 95.166 [2022-01-25 10:33:38 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 79.9% [2022-01-25 10:33:38 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.06% [2022-01-25 10:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][0/1251] eta 7:27:43 lr 0.000099 time 21.4734 (21.4734) loss 3.4207 (3.4207) grad_norm 2.4850 (2.4850) [2022-01-25 10:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][10/1251] eta 1:25:05 lr 0.000099 time 1.7274 (4.1142) loss 3.0752 (2.9072) grad_norm 2.5080 (2.4365) [2022-01-25 10:34:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][20/1251] eta 1:05:54 lr 0.000098 time 2.6868 (3.2124) loss 2.4842 (2.9876) grad_norm 2.3381 (2.4489) [2022-01-25 10:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][30/1251] eta 0:57:47 lr 0.000098 time 1.5934 (2.8398) loss 2.7366 (3.1271) grad_norm 2.6542 (2.4660) [2022-01-25 10:35:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][40/1251] eta 0:54:28 lr 0.000098 time 3.7714 (2.6991) loss 2.9097 (3.1082) grad_norm 2.1777 (2.4573) [2022-01-25 10:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][50/1251] eta 0:52:58 lr 0.000098 time 1.8279 (2.6467) loss 2.0619 (3.0586) grad_norm 2.7223 (2.4579) [2022-01-25 10:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][60/1251] eta 0:51:29 lr 0.000098 time 2.6845 (2.5941) loss 3.6932 (3.0389) grad_norm 2.4476 (2.4746) [2022-01-25 10:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][70/1251] eta 0:49:40 lr 0.000098 time 1.9677 (2.5241) loss 3.5694 (3.0657) grad_norm 2.4136 (2.4789) [2022-01-25 10:37:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][80/1251] eta 0:48:39 lr 0.000098 time 2.9072 (2.4929) loss 3.6769 (3.0849) grad_norm 2.4323 (2.4713) [2022-01-25 10:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][90/1251] eta 0:47:47 lr 0.000098 time 2.1914 (2.4700) loss 2.8509 (3.1061) grad_norm 2.4040 (2.4802) [2022-01-25 10:37:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][100/1251] eta 0:46:33 lr 0.000098 time 1.7618 (2.4266) loss 3.5775 (3.0991) grad_norm 2.3735 (2.4693) [2022-01-25 10:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][110/1251] eta 0:45:31 lr 0.000098 time 1.9356 (2.3938) loss 3.3510 (3.1069) grad_norm 2.2397 (2.4655) [2022-01-25 10:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][120/1251] eta 0:44:45 lr 0.000098 time 1.9738 (2.3746) loss 3.5422 (3.1117) grad_norm 2.5722 (2.4676) [2022-01-25 10:38:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][130/1251] eta 0:44:13 lr 0.000098 time 2.4905 (2.3670) loss 2.5138 (3.1137) grad_norm 2.6265 (2.4725) [2022-01-25 10:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][140/1251] eta 0:43:31 lr 0.000098 time 1.6506 (2.3508) loss 3.4810 (3.0983) grad_norm 2.5402 (2.4538) [2022-01-25 10:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][150/1251] eta 0:42:59 lr 0.000098 time 1.7494 (2.3430) loss 3.1953 (3.1112) grad_norm 2.6741 (2.4618) [2022-01-25 10:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][160/1251] eta 0:42:25 lr 0.000098 time 2.5731 (2.3328) loss 2.5621 (3.1135) grad_norm 2.8504 (2.4663) [2022-01-25 10:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][170/1251] eta 0:41:45 lr 0.000098 time 2.0057 (2.3179) loss 3.7360 (3.1195) grad_norm 2.6570 (2.4655) [2022-01-25 10:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][180/1251] eta 0:41:07 lr 0.000098 
time 2.0444 (2.3038) loss 3.3480 (3.1120) grad_norm 2.7323 (2.4650) [2022-01-25 10:40:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][190/1251] eta 0:40:31 lr 0.000098 time 1.6321 (2.2918) loss 3.4907 (3.1070) grad_norm 2.3754 (2.4679) [2022-01-25 10:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][200/1251] eta 0:40:04 lr 0.000098 time 3.4549 (2.2875) loss 3.4470 (3.1080) grad_norm 2.9932 (2.4698) [2022-01-25 10:41:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][210/1251] eta 0:39:38 lr 0.000098 time 1.5309 (2.2852) loss 3.0078 (3.1087) grad_norm 2.7559 (2.4766) [2022-01-25 10:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][220/1251] eta 0:39:15 lr 0.000098 time 2.2318 (2.2850) loss 3.4934 (3.1235) grad_norm 2.3780 (2.4774) [2022-01-25 10:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][230/1251] eta 0:38:47 lr 0.000098 time 1.5575 (2.2798) loss 3.4362 (3.1263) grad_norm 2.5778 (2.4766) [2022-01-25 10:42:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][240/1251] eta 0:38:27 lr 0.000098 time 3.6529 (2.2826) loss 3.0332 (3.1072) grad_norm 2.6494 (2.4784) [2022-01-25 10:43:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][250/1251] eta 0:38:08 lr 0.000098 time 1.8672 (2.2861) loss 3.4269 (3.1128) grad_norm 2.5215 (2.4815) [2022-01-25 10:43:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][260/1251] eta 0:37:36 lr 0.000098 time 1.6002 (2.2774) loss 3.3470 (3.1120) grad_norm 2.3261 (2.4854) [2022-01-25 10:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][270/1251] eta 0:37:06 lr 0.000098 time 1.7712 (2.2692) loss 3.1638 (3.1070) grad_norm 2.1792 (2.4849) [2022-01-25 10:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][280/1251] eta 0:36:40 lr 0.000098 time 2.8301 (2.2662) loss 2.2915 (3.1039) grad_norm 2.4663 (2.4865) [2022-01-25 10:44:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][290/1251] eta 0:36:14 lr 0.000098 time 1.8233 (2.2624) loss 3.6045 (3.0998) grad_norm 2.7454 (2.4849) [2022-01-25 10:44:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][300/1251] eta 0:35:46 lr 0.000098 time 1.5952 (2.2572) loss 3.3479 (3.0999) grad_norm 2.5741 (2.4862) [2022-01-25 10:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][310/1251] eta 0:35:18 lr 0.000098 time 1.7761 (2.2518) loss 2.5750 (3.0924) grad_norm 2.5156 (2.4897) [2022-01-25 10:45:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][320/1251] eta 0:34:51 lr 0.000098 time 2.2475 (2.2470) loss 3.9922 (3.0959) grad_norm 2.3719 (2.4869) [2022-01-25 10:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][330/1251] eta 0:34:24 lr 0.000098 time 1.5934 (2.2412) loss 3.6706 (3.1001) grad_norm 2.2814 (2.4823) [2022-01-25 10:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][340/1251] eta 0:34:01 lr 0.000098 time 2.0241 (2.2410) loss 3.4664 (3.1028) grad_norm 2.1765 (2.4816) [2022-01-25 10:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][350/1251] eta 0:33:40 lr 0.000098 time 2.5778 (2.2428) loss 3.7375 (3.1023) grad_norm 2.2082 (2.4858) [2022-01-25 10:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][360/1251] eta 0:33:18 lr 0.000098 time 2.5501 (2.2431) loss 3.3497 (3.1036) grad_norm 2.4749 (2.4864) [2022-01-25 10:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][370/1251] eta 0:32:55 lr 0.000098 time 1.8683 (2.2418) loss 3.4109 (3.0992) grad_norm 2.3133 (2.4858) [2022-01-25 10:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][380/1251] eta 0:32:33 lr 0.000098 time 2.0647 (2.2428) loss 3.8145 (3.1009) grad_norm 2.5234 (2.4870) [2022-01-25 10:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][390/1251] eta 0:32:11 lr 
0.000098 time 2.7544 (2.2433) loss 2.8765 (3.0992) grad_norm 2.8006 (2.4924) [2022-01-25 10:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][400/1251] eta 0:31:47 lr 0.000098 time 1.7224 (2.2410) loss 3.8284 (3.0967) grad_norm 2.6963 (2.4935) [2022-01-25 10:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][410/1251] eta 0:31:25 lr 0.000098 time 1.8294 (2.2414) loss 3.7369 (3.1016) grad_norm 3.1534 (2.4990) [2022-01-25 10:49:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][420/1251] eta 0:30:59 lr 0.000098 time 2.2805 (2.2375) loss 3.4303 (3.1049) grad_norm 2.4181 (2.5019) [2022-01-25 10:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][430/1251] eta 0:30:36 lr 0.000098 time 2.3236 (2.2363) loss 3.6228 (3.1050) grad_norm 2.5002 (2.5015) [2022-01-25 10:50:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][440/1251] eta 0:30:10 lr 0.000097 time 1.9426 (2.2329) loss 3.4692 (3.1076) grad_norm 2.4925 (2.5014) [2022-01-25 10:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][450/1251] eta 0:29:49 lr 0.000097 time 2.3478 (2.2345) loss 3.1002 (3.1132) grad_norm 2.3814 (2.5014) [2022-01-25 10:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][460/1251] eta 0:29:29 lr 0.000097 time 1.8323 (2.2369) loss 3.3356 (3.1083) grad_norm 2.4280 (2.5005) [2022-01-25 10:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][470/1251] eta 0:29:06 lr 0.000097 time 1.8932 (2.2367) loss 3.5455 (3.1109) grad_norm 2.4568 (2.5006) [2022-01-25 10:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][480/1251] eta 0:28:44 lr 0.000097 time 1.8074 (2.2361) loss 3.4790 (3.1123) grad_norm 2.1573 (2.5003) [2022-01-25 10:51:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][490/1251] eta 0:28:18 lr 0.000097 time 2.2030 (2.2319) loss 2.7436 (3.1126) grad_norm 2.7440 (2.4987) [2022-01-25 10:52:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][500/1251] eta 0:27:52 lr 0.000097 time 2.1773 (2.2274) loss 3.2796 (3.1096) grad_norm 2.4109 (2.4994) [2022-01-25 10:52:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][510/1251] eta 0:27:28 lr 0.000097 time 1.9151 (2.2243) loss 3.2820 (3.1151) grad_norm 2.6500 (2.5030) [2022-01-25 10:52:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][520/1251] eta 0:27:07 lr 0.000097 time 2.5065 (2.2259) loss 3.3653 (3.1180) grad_norm 2.2824 (2.5050) [2022-01-25 10:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][530/1251] eta 0:26:44 lr 0.000097 time 1.9077 (2.2251) loss 3.4265 (3.1210) grad_norm 2.4460 (2.5065) [2022-01-25 10:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][540/1251] eta 0:26:22 lr 0.000097 time 2.1158 (2.2263) loss 3.5669 (3.1190) grad_norm 2.3360 (2.5089) [2022-01-25 10:54:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][550/1251] eta 0:26:01 lr 0.000097 time 2.3720 (2.2269) loss 2.1502 (3.1168) grad_norm 2.2317 (2.5116) [2022-01-25 10:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][560/1251] eta 0:25:38 lr 0.000097 time 2.2462 (2.2264) loss 3.4761 (3.1189) grad_norm 2.2332 (2.5142) [2022-01-25 10:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][570/1251] eta 0:25:14 lr 0.000097 time 1.9104 (2.2240) loss 3.5008 (3.1209) grad_norm 2.3069 (2.5137) [2022-01-25 10:55:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][580/1251] eta 0:24:51 lr 0.000097 time 1.6009 (2.2221) loss 3.5258 (3.1219) grad_norm 2.3954 (2.5172) [2022-01-25 10:55:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][590/1251] eta 0:24:27 lr 0.000097 time 1.8011 (2.2202) loss 3.1383 (3.1239) grad_norm 2.7768 (2.5188) [2022-01-25 10:55:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][600/1251] eta 0:24:03 lr 
0.000097 time 2.2801 (2.2175) loss 3.4641 (3.1224) grad_norm 3.5699 (2.5210) [2022-01-25 10:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][610/1251] eta 0:23:40 lr 0.000097 time 2.4975 (2.2158) loss 3.7760 (3.1242) grad_norm 2.5378 (2.5236) [2022-01-25 10:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][620/1251] eta 0:23:16 lr 0.000097 time 1.6849 (2.2125) loss 2.2165 (3.1258) grad_norm 2.5242 (2.5259) [2022-01-25 10:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][630/1251] eta 0:22:53 lr 0.000097 time 2.2550 (2.2118) loss 2.5275 (3.1261) grad_norm 2.9736 (2.5272) [2022-01-25 10:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][640/1251] eta 0:22:30 lr 0.000097 time 2.0257 (2.2105) loss 2.5744 (3.1250) grad_norm 3.0554 (2.5270) [2022-01-25 10:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][650/1251] eta 0:22:09 lr 0.000097 time 2.1645 (2.2113) loss 3.2306 (3.1265) grad_norm 2.4361 (2.5275) [2022-01-25 10:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][660/1251] eta 0:21:47 lr 0.000097 time 2.0449 (2.2121) loss 3.2094 (3.1265) grad_norm 2.3123 (2.5268) [2022-01-25 10:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][670/1251] eta 0:21:25 lr 0.000097 time 2.2824 (2.2128) loss 3.1112 (3.1282) grad_norm 3.0180 (2.5290) [2022-01-25 10:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][680/1251] eta 0:21:03 lr 0.000097 time 2.1318 (2.2136) loss 3.4936 (3.1271) grad_norm 2.4074 (2.5304) [2022-01-25 10:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][690/1251] eta 0:20:42 lr 0.000097 time 2.5059 (2.2155) loss 2.5017 (3.1250) grad_norm 2.9593 (2.5287) [2022-01-25 10:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][700/1251] eta 0:20:21 lr 0.000097 time 1.8249 (2.2164) loss 3.4470 (3.1236) grad_norm 2.4919 (2.5308) [2022-01-25 10:59:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][710/1251] eta 0:19:58 lr 0.000097 time 2.5221 (2.2146) loss 2.2843 (3.1181) grad_norm 2.3747 (2.5299) [2022-01-25 11:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][720/1251] eta 0:19:34 lr 0.000097 time 1.5401 (2.2124) loss 3.5045 (3.1164) grad_norm 2.3589 (2.5293) [2022-01-25 11:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][730/1251] eta 0:19:12 lr 0.000097 time 2.1531 (2.2121) loss 2.7853 (3.1201) grad_norm 2.3185 (2.5289) [2022-01-25 11:00:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][740/1251] eta 0:18:50 lr 0.000097 time 2.2133 (2.2119) loss 3.1986 (3.1198) grad_norm 2.4130 (2.5292) [2022-01-25 11:01:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][750/1251] eta 0:18:28 lr 0.000097 time 1.9447 (2.2132) loss 3.4516 (3.1227) grad_norm 3.0356 (2.5329) [2022-01-25 11:01:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][760/1251] eta 0:18:05 lr 0.000097 time 2.1664 (2.2117) loss 3.4745 (3.1257) grad_norm 2.2341 (2.5329) [2022-01-25 11:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][770/1251] eta 0:17:43 lr 0.000097 time 2.0100 (2.2100) loss 3.1368 (3.1263) grad_norm 2.4349 (2.5324) [2022-01-25 11:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][780/1251] eta 0:17:22 lr 0.000097 time 2.4549 (2.2126) loss 3.4098 (3.1241) grad_norm 2.3619 (2.5316) [2022-01-25 11:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][790/1251] eta 0:16:59 lr 0.000097 time 1.6580 (2.2109) loss 3.4357 (3.1256) grad_norm 2.4127 (2.5327) [2022-01-25 11:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][800/1251] eta 0:16:38 lr 0.000097 time 3.1094 (2.2133) loss 3.6001 (3.1271) grad_norm 2.1476 (2.5332) [2022-01-25 11:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][810/1251] eta 0:16:16 lr 
0.000097 time 1.8793 (2.2144) loss 2.3470 (3.1287) grad_norm 2.3129 (2.5333) [2022-01-25 11:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][820/1251] eta 0:15:55 lr 0.000097 time 2.4889 (2.2179) loss 2.5106 (3.1272) grad_norm 2.1315 (2.5330) [2022-01-25 11:04:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][830/1251] eta 0:15:33 lr 0.000097 time 1.6850 (2.2171) loss 3.7074 (3.1263) grad_norm 2.3732 (2.5316) [2022-01-25 11:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][840/1251] eta 0:15:10 lr 0.000097 time 1.8506 (2.2148) loss 3.3498 (3.1267) grad_norm 2.2837 (2.5303) [2022-01-25 11:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][850/1251] eta 0:14:47 lr 0.000097 time 1.8209 (2.2126) loss 3.4443 (3.1254) grad_norm 2.4688 (2.5293) [2022-01-25 11:05:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][860/1251] eta 0:14:25 lr 0.000097 time 1.9086 (2.2133) loss 3.5675 (3.1269) grad_norm 2.5982 (2.5303) [2022-01-25 11:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][870/1251] eta 0:14:02 lr 0.000096 time 1.6032 (2.2123) loss 3.4217 (3.1241) grad_norm 2.5960 (2.5311) [2022-01-25 11:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][880/1251] eta 0:13:40 lr 0.000096 time 1.9284 (2.2114) loss 3.0049 (3.1231) grad_norm 2.7987 (2.5320) [2022-01-25 11:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][890/1251] eta 0:13:18 lr 0.000096 time 1.9336 (2.2114) loss 3.2814 (3.1213) grad_norm 2.2596 (2.5342) [2022-01-25 11:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][900/1251] eta 0:12:57 lr 0.000096 time 2.0282 (2.2141) loss 2.4584 (3.1194) grad_norm 1.9587 (2.5321) [2022-01-25 11:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][910/1251] eta 0:12:34 lr 0.000096 time 1.9379 (2.2132) loss 3.3711 (3.1184) grad_norm 2.8264 (2.5325) [2022-01-25 11:07:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][920/1251] eta 0:12:12 lr 0.000096 time 1.5798 (2.2121) loss 3.5438 (3.1168) grad_norm 2.6835 (2.5313) [2022-01-25 11:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][930/1251] eta 0:11:49 lr 0.000096 time 1.8499 (2.2110) loss 3.4032 (3.1175) grad_norm 2.7168 (2.5310) [2022-01-25 11:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][940/1251] eta 0:11:27 lr 0.000096 time 1.7493 (2.2115) loss 3.2430 (3.1151) grad_norm 2.3995 (2.5305) [2022-01-25 11:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][950/1251] eta 0:11:05 lr 0.000096 time 1.9707 (2.2104) loss 3.2304 (3.1147) grad_norm 2.5692 (2.5306) [2022-01-25 11:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][960/1251] eta 0:10:42 lr 0.000096 time 1.9167 (2.2093) loss 3.5273 (3.1160) grad_norm 2.2872 (2.5303) [2022-01-25 11:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][970/1251] eta 0:10:21 lr 0.000096 time 2.4551 (2.2106) loss 2.4558 (3.1175) grad_norm 2.0474 (2.5286) [2022-01-25 11:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][980/1251] eta 0:09:59 lr 0.000096 time 2.1407 (2.2128) loss 3.3429 (3.1169) grad_norm 2.5351 (2.5281) [2022-01-25 11:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][990/1251] eta 0:09:37 lr 0.000096 time 1.6146 (2.2123) loss 3.6354 (3.1178) grad_norm 2.6969 (2.5288) [2022-01-25 11:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1000/1251] eta 0:09:15 lr 0.000096 time 1.8412 (2.2128) loss 3.1931 (3.1199) grad_norm 2.3118 (2.5287) [2022-01-25 11:10:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1010/1251] eta 0:08:52 lr 0.000096 time 1.9976 (2.2110) loss 2.4399 (3.1192) grad_norm 2.2073 (2.5281) [2022-01-25 11:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1020/1251] eta 0:08:30 lr 
0.000096 time 1.8200 (2.2082) loss 3.5833 (3.1186) grad_norm 2.4325 (2.5277) [2022-01-25 11:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1030/1251] eta 0:08:07 lr 0.000096 time 1.6925 (2.2071) loss 3.4252 (3.1190) grad_norm 2.4733 (2.5279) [2022-01-25 11:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1040/1251] eta 0:07:45 lr 0.000096 time 1.9017 (2.2075) loss 2.4642 (3.1184) grad_norm 2.8678 (2.5275) [2022-01-25 11:12:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1050/1251] eta 0:07:23 lr 0.000096 time 2.0592 (2.2080) loss 2.6614 (3.1162) grad_norm 2.6304 (2.5272) [2022-01-25 11:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1060/1251] eta 0:07:01 lr 0.000096 time 1.6384 (2.2071) loss 3.6207 (3.1159) grad_norm 2.2930 (2.5272) [2022-01-25 11:13:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1070/1251] eta 0:06:39 lr 0.000096 time 2.2757 (2.2063) loss 2.3589 (3.1156) grad_norm 2.3354 (2.5261) [2022-01-25 11:13:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1080/1251] eta 0:06:17 lr 0.000096 time 1.9554 (2.2070) loss 3.6886 (3.1164) grad_norm 2.6916 (2.5260) [2022-01-25 11:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1090/1251] eta 0:05:55 lr 0.000096 time 1.9117 (2.2065) loss 3.5384 (3.1169) grad_norm 2.5860 (2.5252) [2022-01-25 11:14:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1100/1251] eta 0:05:33 lr 0.000096 time 1.9056 (2.2075) loss 1.9059 (3.1126) grad_norm 2.2239 (2.5264) [2022-01-25 11:14:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1110/1251] eta 0:05:11 lr 0.000096 time 2.3848 (2.2082) loss 2.1495 (3.1111) grad_norm 2.6609 (2.5258) [2022-01-25 11:14:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1120/1251] eta 0:04:49 lr 0.000096 time 1.8765 (2.2085) loss 2.2787 (3.1096) grad_norm 2.3243 (2.5260) [2022-01-25 
11:15:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1130/1251] eta 0:04:27 lr 0.000096 time 1.5945 (2.2076) loss 3.6430 (3.1088) grad_norm 2.5034 (2.5255) [2022-01-25 11:15:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1140/1251] eta 0:04:04 lr 0.000096 time 2.2141 (2.2064) loss 2.3096 (3.1080) grad_norm 2.6082 (2.5247) [2022-01-25 11:15:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1150/1251] eta 0:03:42 lr 0.000096 time 1.8553 (2.2054) loss 2.4835 (3.1072) grad_norm 2.5663 (2.5254) [2022-01-25 11:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1160/1251] eta 0:03:20 lr 0.000096 time 2.4410 (2.2060) loss 3.2722 (3.1072) grad_norm 2.1621 (2.5252) [2022-01-25 11:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1170/1251] eta 0:02:58 lr 0.000096 time 2.0340 (2.2060) loss 3.7355 (3.1065) grad_norm 2.6683 (2.5261) [2022-01-25 11:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1180/1251] eta 0:02:36 lr 0.000096 time 2.3328 (2.2063) loss 3.1617 (3.1062) grad_norm 2.6462 (2.5274) [2022-01-25 11:17:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1190/1251] eta 0:02:14 lr 0.000096 time 1.8488 (2.2063) loss 3.5108 (3.1060) grad_norm 3.2528 (2.5283) [2022-01-25 11:17:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1200/1251] eta 0:01:52 lr 0.000096 time 2.4127 (2.2061) loss 3.2213 (3.1044) grad_norm 2.4820 (2.5300) [2022-01-25 11:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1210/1251] eta 0:01:30 lr 0.000096 time 2.2067 (2.2048) loss 3.4183 (3.1054) grad_norm 2.3211 (2.5296) [2022-01-25 11:18:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1220/1251] eta 0:01:08 lr 0.000096 time 2.2514 (2.2044) loss 3.5218 (3.1057) grad_norm 2.3360 (2.5289) [2022-01-25 11:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1230/1251] 
eta 0:00:46 lr 0.000096 time 2.2995 (2.2043) loss 3.4677 (3.1065) grad_norm 2.3641 (2.5283) [2022-01-25 11:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1240/1251] eta 0:00:24 lr 0.000096 time 1.8131 (2.2034) loss 2.0062 (3.1048) grad_norm 2.7965 (2.5276) [2022-01-25 11:19:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [242/300][1250/1251] eta 0:00:02 lr 0.000096 time 1.3106 (2.1979) loss 2.9487 (3.1030) grad_norm 2.2183 (2.5278) [2022-01-25 11:19:28 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 242 training takes 0:45:49 [2022-01-25 11:19:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.211 (18.211) Loss 0.8038 (0.8038) Acc@1 81.641 (81.641) Acc@5 95.898 (95.898) [2022-01-25 11:20:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 2.915 (3.539) Loss 0.8612 (0.8557) Acc@1 81.055 (80.273) Acc@5 94.629 (95.082) [2022-01-25 11:20:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.918 (2.555) Loss 0.8595 (0.8580) Acc@1 79.883 (80.297) Acc@5 95.215 (95.047) [2022-01-25 11:20:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.581 (2.302) Loss 0.8871 (0.8588) Acc@1 80.762 (80.192) Acc@5 94.141 (95.038) [2022-01-25 11:20:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.025 (2.197) Loss 0.8357 (0.8534) Acc@1 81.348 (80.266) Acc@5 95.508 (95.177) [2022-01-25 11:21:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.178 Acc@5 95.210 [2022-01-25 11:21:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-01-25 11:21:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.18% [2022-01-25 11:21:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][0/1251] eta 7:36:09 lr 0.000096 time 21.8784 (21.8784) loss 3.1021 (3.1021) grad_norm 2.3601 (2.3601) [2022-01-25 11:21:54 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [243/300][10/1251] eta 1:32:46 lr 0.000096 time 3.0743 (4.4853) loss 2.9688 (3.2863) grad_norm 2.6042 (2.5911) [2022-01-25 11:22:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][20/1251] eta 1:09:53 lr 0.000096 time 1.4254 (3.4066) loss 3.5267 (3.3651) grad_norm 2.2954 (2.5423) [2022-01-25 11:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][30/1251] eta 1:01:15 lr 0.000096 time 1.7230 (3.0105) loss 3.3673 (3.3144) grad_norm 3.1865 (2.4930) [2022-01-25 11:23:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][40/1251] eta 0:57:04 lr 0.000096 time 2.8132 (2.8281) loss 2.8818 (3.2718) grad_norm 2.9283 (2.5053) [2022-01-25 11:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][50/1251] eta 0:54:34 lr 0.000095 time 3.5257 (2.7264) loss 3.3078 (3.2740) grad_norm 2.3230 (2.5354) [2022-01-25 11:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][60/1251] eta 0:51:48 lr 0.000095 time 1.9475 (2.6101) loss 3.1153 (3.2554) grad_norm 2.4104 (2.5388) [2022-01-25 11:24:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][70/1251] eta 0:49:43 lr 0.000095 time 1.8194 (2.5264) loss 2.6685 (3.1914) grad_norm 2.6229 (2.5706) [2022-01-25 11:24:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][80/1251] eta 0:48:40 lr 0.000095 time 1.8415 (2.4943) loss 3.8458 (3.1991) grad_norm 2.7688 (2.5933) [2022-01-25 11:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][90/1251] eta 0:47:56 lr 0.000095 time 2.9245 (2.4772) loss 3.6769 (3.1998) grad_norm 2.5670 (2.5780) [2022-01-25 11:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][100/1251] eta 0:47:04 lr 0.000095 time 1.9454 (2.4537) loss 3.0746 (3.2241) grad_norm 2.3912 (2.5717) [2022-01-25 11:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][110/1251] eta 0:46:06 lr 0.000095 time 2.1908 (2.4245) loss 2.8207 (3.2034) grad_norm 
2.3081 (2.5670) [2022-01-25 11:25:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][120/1251] eta 0:45:07 lr 0.000095 time 2.4495 (2.3938) loss 3.8684 (3.1770) grad_norm 3.2823 (2.5563) [2022-01-25 11:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][130/1251] eta 0:44:07 lr 0.000095 time 1.8218 (2.3621) loss 2.7454 (3.1763) grad_norm 2.5546 (2.5644) [2022-01-25 11:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][140/1251] eta 0:43:25 lr 0.000095 time 2.2668 (2.3452) loss 3.5285 (3.1805) grad_norm 3.1744 (2.5599) [2022-01-25 11:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][150/1251] eta 0:43:05 lr 0.000095 time 2.7242 (2.3482) loss 2.8724 (3.1759) grad_norm 2.4629 (2.5562) [2022-01-25 11:27:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][160/1251] eta 0:42:44 lr 0.000095 time 2.1218 (2.3510) loss 3.3561 (3.1765) grad_norm 2.1085 (2.5511) [2022-01-25 11:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][170/1251] eta 0:42:12 lr 0.000095 time 1.8850 (2.3429) loss 3.3084 (3.1710) grad_norm 2.0372 (2.5513) [2022-01-25 11:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][180/1251] eta 0:41:38 lr 0.000095 time 1.6752 (2.3332) loss 2.9835 (3.1657) grad_norm 2.5852 (2.5541) [2022-01-25 11:28:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][190/1251] eta 0:41:02 lr 0.000095 time 2.7199 (2.3213) loss 3.7315 (3.1693) grad_norm 2.4907 (2.5473) [2022-01-25 11:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][200/1251] eta 0:40:35 lr 0.000095 time 1.8027 (2.3176) loss 2.3953 (3.1651) grad_norm 2.8211 (2.5456) [2022-01-25 11:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][210/1251] eta 0:40:03 lr 0.000095 time 2.0164 (2.3086) loss 2.6654 (3.1727) grad_norm 2.6373 (2.5453) [2022-01-25 11:29:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[243/300][220/1251] eta 0:39:31 lr 0.000095 time 1.8388 (2.3004) loss 3.3631 (3.1732) grad_norm 2.9075 (2.5435) [2022-01-25 11:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][230/1251] eta 0:39:09 lr 0.000095 time 2.3930 (2.3007) loss 2.2265 (3.1609) grad_norm 2.5601 (2.5384) [2022-01-25 11:30:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][240/1251] eta 0:38:50 lr 0.000095 time 1.8406 (2.3052) loss 2.5338 (3.1554) grad_norm 2.6208 (2.5458) [2022-01-25 11:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][250/1251] eta 0:38:17 lr 0.000095 time 1.8661 (2.2952) loss 3.4695 (3.1574) grad_norm 2.1997 (2.5463) [2022-01-25 11:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][260/1251] eta 0:37:46 lr 0.000095 time 1.9734 (2.2869) loss 2.6624 (3.1547) grad_norm 2.2089 (2.5502) [2022-01-25 11:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][270/1251] eta 0:37:20 lr 0.000095 time 1.8691 (2.2838) loss 3.4605 (3.1544) grad_norm 2.2207 (2.5515) [2022-01-25 11:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][280/1251] eta 0:37:01 lr 0.000095 time 1.8287 (2.2881) loss 3.2896 (3.1483) grad_norm 2.2175 (2.5475) [2022-01-25 11:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][290/1251] eta 0:36:29 lr 0.000095 time 1.9183 (2.2783) loss 3.4917 (3.1499) grad_norm 2.2451 (2.5436) [2022-01-25 11:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][300/1251] eta 0:36:01 lr 0.000095 time 1.8803 (2.2725) loss 3.7665 (3.1501) grad_norm 2.5860 (2.5409) [2022-01-25 11:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][310/1251] eta 0:35:32 lr 0.000095 time 2.1516 (2.2666) loss 3.2115 (3.1480) grad_norm 3.0306 (2.5406) [2022-01-25 11:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][320/1251] eta 0:35:15 lr 0.000095 time 2.3438 (2.2719) loss 3.5295 (3.1461) grad_norm 
2.6416 (2.5449) [2022-01-25 11:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][330/1251] eta 0:34:50 lr 0.000095 time 1.8773 (2.2695) loss 3.8684 (3.1518) grad_norm 3.0515 (2.5508) [2022-01-25 11:33:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][340/1251] eta 0:34:25 lr 0.000095 time 2.1901 (2.2676) loss 2.5997 (3.1448) grad_norm 3.0560 (2.5513) [2022-01-25 11:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][350/1251] eta 0:33:59 lr 0.000095 time 2.1627 (2.2635) loss 3.5773 (3.1475) grad_norm 2.5498 (2.5506) [2022-01-25 11:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][360/1251] eta 0:33:31 lr 0.000095 time 1.7532 (2.2579) loss 3.6030 (3.1443) grad_norm 2.3290 (2.5493) [2022-01-25 11:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][370/1251] eta 0:33:01 lr 0.000095 time 1.9090 (2.2492) loss 3.6232 (3.1458) grad_norm 3.0269 (2.5483) [2022-01-25 11:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][380/1251] eta 0:32:35 lr 0.000095 time 2.1504 (2.2455) loss 3.3827 (3.1497) grad_norm 2.3944 (2.5456) [2022-01-25 11:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][390/1251] eta 0:32:14 lr 0.000095 time 1.8016 (2.2471) loss 2.8258 (3.1524) grad_norm 2.8535 (2.5473) [2022-01-25 11:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][400/1251] eta 0:31:52 lr 0.000095 time 2.2800 (2.2479) loss 3.2646 (3.1583) grad_norm 2.3974 (2.5457) [2022-01-25 11:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][410/1251] eta 0:31:31 lr 0.000095 time 1.8784 (2.2490) loss 2.5482 (3.1628) grad_norm 2.4478 (2.5432) [2022-01-25 11:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][420/1251] eta 0:31:09 lr 0.000095 time 1.8597 (2.2492) loss 2.3916 (3.1589) grad_norm 2.4473 (2.5410) [2022-01-25 11:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[243/300][430/1251] eta 0:30:47 lr 0.000095 time 2.2627 (2.2502) loss 2.4206 (3.1557) grad_norm 2.2994 (2.5400) [2022-01-25 11:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][440/1251] eta 0:30:23 lr 0.000095 time 2.0738 (2.2489) loss 2.7983 (3.1484) grad_norm 2.4705 (2.5382) [2022-01-25 11:37:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][450/1251] eta 0:29:58 lr 0.000095 time 1.8457 (2.2456) loss 3.1965 (3.1468) grad_norm 2.6587 (2.5381) [2022-01-25 11:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][460/1251] eta 0:29:33 lr 0.000095 time 1.9364 (2.2426) loss 2.6200 (3.1431) grad_norm 2.7661 (2.5373) [2022-01-25 11:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][470/1251] eta 0:29:11 lr 0.000095 time 1.9469 (2.2423) loss 3.3650 (3.1444) grad_norm 2.4146 (2.5355) [2022-01-25 11:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][480/1251] eta 0:28:48 lr 0.000094 time 2.7974 (2.2422) loss 3.7249 (3.1423) grad_norm 2.9058 (2.5337) [2022-01-25 11:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][490/1251] eta 0:28:24 lr 0.000094 time 1.9149 (2.2398) loss 2.2065 (3.1430) grad_norm 2.6533 (2.5325) [2022-01-25 11:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][500/1251] eta 0:27:59 lr 0.000094 time 1.8606 (2.2359) loss 3.4317 (3.1461) grad_norm 2.3044 (2.5299) [2022-01-25 11:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][510/1251] eta 0:27:36 lr 0.000094 time 2.0898 (2.2351) loss 3.4278 (3.1481) grad_norm 2.3917 (2.5297) [2022-01-25 11:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][520/1251] eta 0:27:12 lr 0.000094 time 2.0276 (2.2336) loss 3.4394 (3.1490) grad_norm 2.1791 (2.5292) [2022-01-25 11:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][530/1251] eta 0:26:49 lr 0.000094 time 1.5937 (2.2323) loss 3.9040 (3.1508) grad_norm 
2.7636 (2.5293) [2022-01-25 11:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][540/1251] eta 0:26:24 lr 0.000094 time 2.0221 (2.2284) loss 2.8949 (3.1513) grad_norm 2.5995 (2.5312) [2022-01-25 11:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][550/1251] eta 0:26:02 lr 0.000094 time 2.2071 (2.2296) loss 2.5568 (3.1502) grad_norm 2.3711 (2.5334) [2022-01-25 11:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][560/1251] eta 0:25:39 lr 0.000094 time 2.1912 (2.2281) loss 3.0503 (3.1490) grad_norm 2.4962 (2.5362) [2022-01-25 11:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][570/1251] eta 0:25:18 lr 0.000094 time 1.6410 (2.2298) loss 1.8826 (3.1453) grad_norm 2.6871 (2.5343) [2022-01-25 11:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][580/1251] eta 0:24:56 lr 0.000094 time 1.9087 (2.2308) loss 3.0103 (3.1459) grad_norm 2.6772 (2.5369) [2022-01-25 11:43:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][590/1251] eta 0:24:35 lr 0.000094 time 1.7841 (2.2316) loss 3.3813 (3.1474) grad_norm 3.1173 (2.5399) [2022-01-25 11:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][600/1251] eta 0:24:11 lr 0.000094 time 1.5307 (2.2299) loss 2.8460 (3.1472) grad_norm 2.1904 (2.5373) [2022-01-25 11:43:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][610/1251] eta 0:23:47 lr 0.000094 time 1.5363 (2.2265) loss 3.2271 (3.1455) grad_norm 2.2842 (2.5392) [2022-01-25 11:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][620/1251] eta 0:23:23 lr 0.000094 time 1.9863 (2.2241) loss 3.5193 (3.1480) grad_norm 2.2887 (2.5422) [2022-01-25 11:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][630/1251] eta 0:23:01 lr 0.000094 time 2.4204 (2.2250) loss 3.0972 (3.1471) grad_norm 2.2584 (2.5430) [2022-01-25 11:44:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[243/300][640/1251] eta 0:22:38 lr 0.000094 time 2.1337 (2.2238) loss 3.5237 (3.1467) grad_norm 2.2018 (2.5419) [2022-01-25 11:45:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][650/1251] eta 0:22:16 lr 0.000094 time 2.3164 (2.2242) loss 2.9937 (3.1430) grad_norm 2.3246 (2.5413) [2022-01-25 11:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][660/1251] eta 0:21:52 lr 0.000094 time 1.6758 (2.2208) loss 2.1439 (3.1390) grad_norm 2.4702 (2.5384) [2022-01-25 11:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][670/1251] eta 0:21:29 lr 0.000094 time 1.5672 (2.2197) loss 3.4953 (3.1419) grad_norm 2.8653 (2.5398) [2022-01-25 11:46:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][680/1251] eta 0:21:06 lr 0.000094 time 1.9459 (2.2183) loss 3.6130 (3.1427) grad_norm 3.3127 (2.5402) [2022-01-25 11:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][690/1251] eta 0:20:44 lr 0.000094 time 1.5900 (2.2176) loss 3.3380 (3.1422) grad_norm 2.3582 (2.5385) [2022-01-25 11:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][700/1251] eta 0:20:21 lr 0.000094 time 1.9340 (2.2164) loss 1.8811 (3.1396) grad_norm 2.7574 (2.5389) [2022-01-25 11:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][710/1251] eta 0:19:59 lr 0.000094 time 2.9689 (2.2174) loss 3.5265 (3.1377) grad_norm 2.7824 (2.5411) [2022-01-25 11:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][720/1251] eta 0:19:38 lr 0.000094 time 2.5382 (2.2197) loss 3.4861 (3.1360) grad_norm 2.5338 (2.5393) [2022-01-25 11:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][730/1251] eta 0:19:17 lr 0.000094 time 1.9636 (2.2219) loss 2.4455 (3.1368) grad_norm 2.5045 (2.5390) [2022-01-25 11:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][740/1251] eta 0:18:54 lr 0.000094 time 1.6119 (2.2198) loss 2.3494 (3.1326) grad_norm 
2.8708 (2.5405) [2022-01-25 11:48:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][750/1251] eta 0:18:30 lr 0.000094 time 1.8662 (2.2176) loss 2.2406 (3.1287) grad_norm 2.2330 (2.5424) [2022-01-25 11:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][760/1251] eta 0:18:08 lr 0.000094 time 1.8736 (2.2164) loss 3.8446 (3.1281) grad_norm 2.5056 (2.5438) [2022-01-25 11:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][770/1251] eta 0:17:44 lr 0.000094 time 2.2475 (2.2140) loss 2.3095 (3.1300) grad_norm 2.6834 (2.5479) [2022-01-25 11:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][780/1251] eta 0:17:22 lr 0.000094 time 2.1274 (2.2132) loss 3.5434 (3.1274) grad_norm 2.5941 (2.5482) [2022-01-25 11:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][790/1251] eta 0:17:00 lr 0.000094 time 2.4533 (2.2133) loss 3.7121 (3.1287) grad_norm 2.3534 (2.5496) [2022-01-25 11:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][800/1251] eta 0:16:39 lr 0.000094 time 2.1010 (2.2160) loss 3.5745 (3.1302) grad_norm 2.2037 (2.5468) [2022-01-25 11:51:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][810/1251] eta 0:16:16 lr 0.000094 time 1.9439 (2.2150) loss 3.4309 (3.1299) grad_norm 2.5136 (2.5469) [2022-01-25 11:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][820/1251] eta 0:15:54 lr 0.000094 time 2.1522 (2.2138) loss 2.1421 (3.1292) grad_norm 2.2542 (2.5480) [2022-01-25 11:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][830/1251] eta 0:15:31 lr 0.000094 time 2.2122 (2.2135) loss 2.0509 (3.1278) grad_norm 2.4806 (2.5458) [2022-01-25 11:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][840/1251] eta 0:15:09 lr 0.000094 time 2.2265 (2.2134) loss 3.1311 (3.1269) grad_norm 2.5644 (2.5453) [2022-01-25 11:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[243/300][850/1251] eta 0:14:47 lr 0.000094 time 2.2570 (2.2123) loss 3.4931 (3.1274) grad_norm 2.1951 (2.5454) [2022-01-25 11:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][860/1251] eta 0:14:24 lr 0.000094 time 1.5283 (2.2105) loss 2.5557 (3.1241) grad_norm 2.1614 (2.5450) [2022-01-25 11:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][870/1251] eta 0:14:02 lr 0.000094 time 2.3766 (2.2101) loss 2.4985 (3.1237) grad_norm 2.5280 (2.5455) [2022-01-25 11:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][880/1251] eta 0:13:40 lr 0.000094 time 2.2285 (2.2116) loss 2.7100 (3.1232) grad_norm 2.3411 (2.5444) [2022-01-25 11:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][890/1251] eta 0:13:18 lr 0.000094 time 2.3126 (2.2120) loss 3.1963 (3.1268) grad_norm 2.3171 (2.5438) [2022-01-25 11:54:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][900/1251] eta 0:12:56 lr 0.000094 time 1.5611 (2.2112) loss 3.0154 (3.1262) grad_norm 2.4985 (2.5443) [2022-01-25 11:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][910/1251] eta 0:12:33 lr 0.000093 time 2.4219 (2.2107) loss 2.4205 (3.1278) grad_norm 2.3797 (2.5451) [2022-01-25 11:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][920/1251] eta 0:12:11 lr 0.000093 time 2.2593 (2.2101) loss 3.1876 (3.1290) grad_norm 2.6990 (2.5443) [2022-01-25 11:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][930/1251] eta 0:11:48 lr 0.000093 time 1.5445 (2.2087) loss 3.2312 (3.1296) grad_norm 2.4420 (2.5448) [2022-01-25 11:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][940/1251] eta 0:11:26 lr 0.000093 time 2.2737 (2.2083) loss 3.6310 (3.1319) grad_norm 2.9185 (2.5463) [2022-01-25 11:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][950/1251] eta 0:11:04 lr 0.000093 time 2.5042 (2.2087) loss 2.0416 (3.1311) grad_norm 
2.5275 (2.5461) [2022-01-25 11:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][960/1251] eta 0:10:42 lr 0.000093 time 2.2784 (2.2093) loss 3.6575 (3.1319) grad_norm 2.2847 (2.5455) [2022-01-25 11:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][970/1251] eta 0:10:20 lr 0.000093 time 1.5606 (2.2093) loss 2.2094 (3.1322) grad_norm 2.8882 (2.5460) [2022-01-25 11:57:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][980/1251] eta 0:09:58 lr 0.000093 time 1.9211 (2.2089) loss 3.5748 (3.1343) grad_norm 2.4526 (2.5455) [2022-01-25 11:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][990/1251] eta 0:09:36 lr 0.000093 time 1.9277 (2.2073) loss 3.3830 (3.1361) grad_norm 2.5152 (2.5464) [2022-01-25 11:57:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1000/1251] eta 0:09:13 lr 0.000093 time 2.3218 (2.2061) loss 3.3569 (3.1351) grad_norm 2.3837 (2.5459) [2022-01-25 11:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1010/1251] eta 0:08:51 lr 0.000093 time 1.8751 (2.2058) loss 3.2199 (3.1351) grad_norm 2.1691 (2.5457) [2022-01-25 11:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1020/1251] eta 0:08:29 lr 0.000093 time 2.4962 (2.2058) loss 3.3722 (3.1363) grad_norm 2.3091 (2.5448) [2022-01-25 11:58:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1030/1251] eta 0:08:07 lr 0.000093 time 1.8316 (2.2050) loss 3.3081 (3.1392) grad_norm 2.6849 (2.5452) [2022-01-25 11:59:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1040/1251] eta 0:07:45 lr 0.000093 time 2.1862 (2.2068) loss 3.3836 (3.1421) grad_norm 2.3071 (2.5446) [2022-01-25 11:59:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1050/1251] eta 0:07:23 lr 0.000093 time 1.7499 (2.2072) loss 2.8412 (3.1411) grad_norm 2.6044 (2.5452) [2022-01-25 12:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[243/300][1060/1251] eta 0:07:01 lr 0.000093 time 2.2950 (2.2054) loss 3.1499 (3.1397) grad_norm 2.5149 (2.5451) [2022-01-25 12:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1070/1251] eta 0:06:38 lr 0.000093 time 1.9005 (2.2039) loss 3.3484 (3.1392) grad_norm 2.2767 (2.5446) [2022-01-25 12:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1080/1251] eta 0:06:16 lr 0.000093 time 1.5364 (2.2023) loss 3.2309 (3.1410) grad_norm 2.2812 (2.5452) [2022-01-25 12:01:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1090/1251] eta 0:05:54 lr 0.000093 time 2.1054 (2.2015) loss 3.7251 (3.1411) grad_norm 2.5271 (2.5452) [2022-01-25 12:01:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1100/1251] eta 0:05:32 lr 0.000093 time 2.4377 (2.2019) loss 3.1612 (3.1396) grad_norm 2.7869 (2.5460) [2022-01-25 12:01:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1110/1251] eta 0:05:10 lr 0.000093 time 2.4603 (2.2043) loss 3.1533 (3.1402) grad_norm 2.4093 (2.5464) [2022-01-25 12:02:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1120/1251] eta 0:04:48 lr 0.000093 time 2.2786 (2.2050) loss 2.2989 (3.1363) grad_norm 2.4970 (2.5462) [2022-01-25 12:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1130/1251] eta 0:04:26 lr 0.000093 time 1.6465 (2.2057) loss 3.4261 (3.1377) grad_norm 2.9536 (2.5466) [2022-01-25 12:03:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1140/1251] eta 0:04:04 lr 0.000093 time 2.5515 (2.2063) loss 2.3249 (3.1376) grad_norm 2.1530 (2.5473) [2022-01-25 12:03:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1150/1251] eta 0:03:42 lr 0.000093 time 1.5672 (2.2054) loss 3.2884 (3.1366) grad_norm 2.6273 (2.5469) [2022-01-25 12:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1160/1251] eta 0:03:20 lr 0.000093 time 1.6152 (2.2032) loss 2.3334 (3.1330) 
grad_norm 2.3923 (2.5467) [2022-01-25 12:04:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1170/1251] eta 0:02:58 lr 0.000093 time 1.7814 (2.2022) loss 3.0531 (3.1307) grad_norm 2.7031 (2.5459) [2022-01-25 12:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1180/1251] eta 0:02:36 lr 0.000093 time 2.2634 (2.2024) loss 3.6183 (3.1295) grad_norm 2.4948 (2.5467) [2022-01-25 12:04:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1190/1251] eta 0:02:14 lr 0.000093 time 2.3237 (2.2037) loss 3.3668 (3.1267) grad_norm 2.3715 (2.5469) [2022-01-25 12:05:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1200/1251] eta 0:01:52 lr 0.000093 time 2.0058 (2.2044) loss 2.8303 (3.1247) grad_norm 2.4491 (2.5475) [2022-01-25 12:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1210/1251] eta 0:01:30 lr 0.000093 time 1.9154 (2.2046) loss 2.6990 (3.1226) grad_norm 2.2922 (2.5492) [2022-01-25 12:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1220/1251] eta 0:01:08 lr 0.000093 time 1.8084 (2.2043) loss 3.5545 (3.1230) grad_norm 2.8382 (2.5485) [2022-01-25 12:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1230/1251] eta 0:00:46 lr 0.000093 time 1.8329 (2.2037) loss 3.5250 (3.1244) grad_norm 2.2677 (2.5485) [2022-01-25 12:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1240/1251] eta 0:00:24 lr 0.000093 time 1.4049 (2.2025) loss 2.2527 (3.1232) grad_norm 2.4791 (2.5488) [2022-01-25 12:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [243/300][1250/1251] eta 0:00:02 lr 0.000093 time 1.2248 (2.1973) loss 3.6898 (3.1229) grad_norm 2.5496 (2.5489) [2022-01-25 12:06:54 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 243 training takes 0:45:49 [2022-01-25 12:07:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.765 (16.765) Loss 0.7749 (0.7749) Acc@1 80.371 (80.371) 
Acc@5 95.898 (95.898) [2022-01-25 12:07:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.941 (3.233) Loss 0.7812 (0.8491) Acc@1 80.762 (79.980) Acc@5 96.582 (95.028) [2022-01-25 12:07:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 2.042 (2.667) Loss 0.8532 (0.8487) Acc@1 78.711 (80.166) Acc@5 94.922 (95.006) [2022-01-25 12:08:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.276 (2.298) Loss 0.8369 (0.8471) Acc@1 79.492 (80.163) Acc@5 95.508 (95.035) [2022-01-25 12:08:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.399 (2.190) Loss 0.8191 (0.8434) Acc@1 81.348 (80.176) Acc@5 95.117 (95.108) [2022-01-25 12:08:30 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.170 Acc@5 95.136 [2022-01-25 12:08:30 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-01-25 12:08:30 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.18% [2022-01-25 12:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][0/1251] eta 8:32:00 lr 0.000093 time 24.5569 (24.5569) loss 2.8486 (2.8486) grad_norm 2.8268 (2.8268) [2022-01-25 12:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][10/1251] eta 1:29:53 lr 0.000093 time 1.5640 (4.3457) loss 3.2439 (3.1897) grad_norm 2.2668 (2.8959) [2022-01-25 12:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][20/1251] eta 1:11:04 lr 0.000093 time 1.1959 (3.4643) loss 3.2946 (3.1941) grad_norm 2.6047 (2.6963) [2022-01-25 12:10:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][30/1251] eta 1:01:36 lr 0.000093 time 1.9741 (3.0278) loss 3.2963 (3.2527) grad_norm 2.4103 (2.6536) [2022-01-25 12:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][40/1251] eta 0:56:45 lr 0.000093 time 2.8894 (2.8119) loss 2.1302 (3.2437) grad_norm 2.5168 (2.6666) [2022-01-25 12:10:46 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][50/1251] eta 0:53:23 lr 0.000093 time 1.9442 (2.6674) loss 3.2089 (3.1959) grad_norm 2.5846 (2.6655) [2022-01-25 12:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][60/1251] eta 0:51:15 lr 0.000093 time 2.1995 (2.5823) loss 3.4380 (3.2096) grad_norm 2.1215 (2.6544) [2022-01-25 12:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][70/1251] eta 0:49:41 lr 0.000093 time 2.2924 (2.5246) loss 3.3480 (3.2269) grad_norm 2.6959 (2.6571) [2022-01-25 12:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][80/1251] eta 0:48:23 lr 0.000093 time 2.6479 (2.4791) loss 3.2768 (3.1975) grad_norm 2.8307 (2.7463) [2022-01-25 12:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][90/1251] eta 0:47:09 lr 0.000092 time 1.8689 (2.4368) loss 3.8473 (3.1772) grad_norm 2.8568 (2.7471) [2022-01-25 12:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][100/1251] eta 0:46:24 lr 0.000092 time 2.2910 (2.4190) loss 2.8875 (3.1863) grad_norm 2.2730 (2.7404) [2022-01-25 12:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][110/1251] eta 0:45:34 lr 0.000092 time 2.1197 (2.3965) loss 3.1123 (3.1976) grad_norm 2.3660 (2.7244) [2022-01-25 12:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][120/1251] eta 0:44:50 lr 0.000092 time 2.8231 (2.3790) loss 2.4364 (3.1954) grad_norm 2.5889 (2.7087) [2022-01-25 12:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][130/1251] eta 0:44:10 lr 0.000092 time 1.6469 (2.3646) loss 2.7577 (3.1916) grad_norm 3.0761 (2.7057) [2022-01-25 12:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][140/1251] eta 0:43:33 lr 0.000092 time 2.4677 (2.3520) loss 3.4634 (3.1705) grad_norm 2.5120 (2.6836) [2022-01-25 12:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][150/1251] eta 0:42:59 lr 0.000092 
time 2.3115 (2.3429) loss 2.2007 (3.1653) grad_norm 3.1444 (2.6779) [2022-01-25 12:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][160/1251] eta 0:42:31 lr 0.000092 time 2.3697 (2.3387) loss 3.4829 (3.1602) grad_norm 2.5208 (2.6670) [2022-01-25 12:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][170/1251] eta 0:41:58 lr 0.000092 time 1.7218 (2.3294) loss 3.0196 (3.1628) grad_norm 2.7605 (2.6613) [2022-01-25 12:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][180/1251] eta 0:41:35 lr 0.000092 time 3.3694 (2.3303) loss 2.6510 (3.1541) grad_norm 2.4559 (2.6450) [2022-01-25 12:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][190/1251] eta 0:41:05 lr 0.000092 time 2.6035 (2.3239) loss 2.4982 (3.1397) grad_norm 2.2204 (2.6326) [2022-01-25 12:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][200/1251] eta 0:40:32 lr 0.000092 time 1.7455 (2.3142) loss 2.1169 (3.1341) grad_norm 2.4742 (2.6279) [2022-01-25 12:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][210/1251] eta 0:39:54 lr 0.000092 time 2.0287 (2.3004) loss 3.8413 (3.1354) grad_norm 2.8790 (2.6265) [2022-01-25 12:16:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][220/1251] eta 0:39:24 lr 0.000092 time 2.7243 (2.2931) loss 2.4333 (3.1274) grad_norm 3.4712 (2.6237) [2022-01-25 12:17:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][230/1251] eta 0:38:48 lr 0.000092 time 1.9711 (2.2809) loss 2.3415 (3.1185) grad_norm 2.8430 (2.6261) [2022-01-25 12:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][240/1251] eta 0:38:21 lr 0.000092 time 2.5301 (2.2764) loss 3.6166 (3.1305) grad_norm 2.9146 (2.6288) [2022-01-25 12:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][250/1251] eta 0:37:54 lr 0.000092 time 1.9180 (2.2724) loss 2.5844 (3.1283) grad_norm 2.3159 (2.6289) [2022-01-25 12:18:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][260/1251] eta 0:37:30 lr 0.000092 time 2.2444 (2.2708) loss 2.5493 (3.1263) grad_norm 2.0823 (2.6218) [2022-01-25 12:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][270/1251] eta 0:37:07 lr 0.000092 time 1.9312 (2.2706) loss 3.8971 (3.1332) grad_norm 2.2852 (2.6143) [2022-01-25 12:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][280/1251] eta 0:36:41 lr 0.000092 time 2.4493 (2.2673) loss 2.5507 (3.1256) grad_norm 2.1028 (2.6076) [2022-01-25 12:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][290/1251] eta 0:36:13 lr 0.000092 time 1.9453 (2.2619) loss 3.1625 (3.1305) grad_norm 2.5025 (2.6044) [2022-01-25 12:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][300/1251] eta 0:35:45 lr 0.000092 time 2.0320 (2.2565) loss 3.1978 (3.1279) grad_norm 2.5068 (2.6081) [2022-01-25 12:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][310/1251] eta 0:35:23 lr 0.000092 time 2.5330 (2.2561) loss 3.5144 (3.1317) grad_norm 2.6485 (2.6058) [2022-01-25 12:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][320/1251] eta 0:35:03 lr 0.000092 time 2.3814 (2.2589) loss 3.0234 (3.1354) grad_norm 2.5371 (2.6058) [2022-01-25 12:20:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][330/1251] eta 0:34:36 lr 0.000092 time 2.2149 (2.2547) loss 3.3879 (3.1309) grad_norm 2.3396 (2.6003) [2022-01-25 12:21:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][340/1251] eta 0:34:10 lr 0.000092 time 2.2196 (2.2512) loss 3.6329 (3.1301) grad_norm 2.9416 (2.5976) [2022-01-25 12:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][350/1251] eta 0:33:47 lr 0.000092 time 1.7711 (2.2505) loss 3.1078 (3.1282) grad_norm 2.3844 (2.5926) [2022-01-25 12:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][360/1251] eta 0:33:27 lr 
0.000092 time 2.3851 (2.2527) loss 3.4791 (3.1349) grad_norm 2.3128 (2.5875) [2022-01-25 12:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][370/1251] eta 0:33:00 lr 0.000092 time 2.2161 (2.2481) loss 3.1448 (3.1281) grad_norm 2.7144 (2.5901) [2022-01-25 12:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][380/1251] eta 0:32:35 lr 0.000092 time 1.9674 (2.2455) loss 3.5720 (3.1297) grad_norm 2.5507 (2.5934) [2022-01-25 12:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][390/1251] eta 0:32:08 lr 0.000092 time 2.1481 (2.2399) loss 2.6434 (3.1261) grad_norm 2.4958 (2.5981) [2022-01-25 12:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][400/1251] eta 0:31:44 lr 0.000092 time 2.6232 (2.2381) loss 3.0259 (3.1200) grad_norm 2.7266 (2.5989) [2022-01-25 12:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][410/1251] eta 0:31:20 lr 0.000092 time 1.9547 (2.2365) loss 3.0902 (3.1258) grad_norm 3.0956 (2.5965) [2022-01-25 12:24:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][420/1251] eta 0:30:57 lr 0.000092 time 2.2276 (2.2354) loss 3.3617 (3.1255) grad_norm 2.0636 (2.5954) [2022-01-25 12:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][430/1251] eta 0:30:34 lr 0.000092 time 2.2051 (2.2344) loss 2.6881 (3.1332) grad_norm 2.2689 (2.5919) [2022-01-25 12:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][440/1251] eta 0:30:13 lr 0.000092 time 2.4967 (2.2358) loss 3.7623 (3.1327) grad_norm 2.9430 (2.5934) [2022-01-25 12:25:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][450/1251] eta 0:29:45 lr 0.000092 time 1.6018 (2.2291) loss 2.1955 (3.1265) grad_norm 2.4229 (2.5911) [2022-01-25 12:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][460/1251] eta 0:29:20 lr 0.000092 time 2.2065 (2.2258) loss 3.2137 (3.1274) grad_norm 2.3899 (2.5905) [2022-01-25 12:25:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][470/1251] eta 0:28:55 lr 0.000092 time 1.9788 (2.2222) loss 2.2460 (3.1274) grad_norm 2.5550 (2.5883) [2022-01-25 12:26:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][480/1251] eta 0:28:35 lr 0.000092 time 2.6870 (2.2250) loss 3.6161 (3.1301) grad_norm 3.3894 (2.5884) [2022-01-25 12:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][490/1251] eta 0:28:11 lr 0.000092 time 2.1665 (2.2228) loss 3.6141 (3.1292) grad_norm 2.2924 (2.5850) [2022-01-25 12:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][500/1251] eta 0:27:49 lr 0.000092 time 2.5844 (2.2234) loss 2.7586 (3.1251) grad_norm 2.4298 (2.5835) [2022-01-25 12:27:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][510/1251] eta 0:27:29 lr 0.000092 time 3.0719 (2.2257) loss 3.7336 (3.1316) grad_norm 2.3690 (2.5819) [2022-01-25 12:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][520/1251] eta 0:27:09 lr 0.000092 time 2.2830 (2.2286) loss 3.0598 (3.1292) grad_norm 3.2539 (2.5820) [2022-01-25 12:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][530/1251] eta 0:26:46 lr 0.000091 time 1.9723 (2.2282) loss 3.8770 (3.1287) grad_norm 2.1045 (2.5804) [2022-01-25 12:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][540/1251] eta 0:26:24 lr 0.000091 time 2.6185 (2.2290) loss 2.6694 (3.1252) grad_norm 2.6087 (2.5832) [2022-01-25 12:28:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][550/1251] eta 0:26:00 lr 0.000091 time 1.8692 (2.2258) loss 3.6114 (3.1268) grad_norm 2.6441 (2.5821) [2022-01-25 12:29:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][560/1251] eta 0:25:37 lr 0.000091 time 2.2574 (2.2248) loss 3.2150 (3.1246) grad_norm 2.2015 (2.5806) [2022-01-25 12:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][570/1251] eta 0:25:14 lr 
0.000091 time 2.2356 (2.2238) loss 2.6835 (3.1259) grad_norm 2.3357 (2.5782) [2022-01-25 12:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][580/1251] eta 0:24:52 lr 0.000091 time 2.4771 (2.2238) loss 1.9886 (3.1215) grad_norm 2.7417 (2.5776) [2022-01-25 12:30:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][590/1251] eta 0:24:29 lr 0.000091 time 1.7684 (2.2236) loss 2.6044 (3.1205) grad_norm 2.3614 (2.5756) [2022-01-25 12:30:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][600/1251] eta 0:24:08 lr 0.000091 time 1.9642 (2.2256) loss 3.5269 (3.1238) grad_norm 2.3091 (2.5757) [2022-01-25 12:31:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][610/1251] eta 0:23:43 lr 0.000091 time 1.6023 (2.2214) loss 3.5660 (3.1226) grad_norm 2.5265 (2.5745) [2022-01-25 12:31:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][620/1251] eta 0:23:20 lr 0.000091 time 2.2414 (2.2188) loss 3.1935 (3.1264) grad_norm 2.5652 (2.5737) [2022-01-25 12:31:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][630/1251] eta 0:22:56 lr 0.000091 time 1.6987 (2.2171) loss 1.9592 (3.1234) grad_norm 2.9672 (2.5732) [2022-01-25 12:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][640/1251] eta 0:22:35 lr 0.000091 time 2.5590 (2.2184) loss 3.1764 (3.1192) grad_norm 2.5420 (2.5720) [2022-01-25 12:32:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][650/1251] eta 0:22:12 lr 0.000091 time 2.1403 (2.2171) loss 3.4550 (3.1162) grad_norm 2.4089 (2.5704) [2022-01-25 12:32:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][660/1251] eta 0:21:52 lr 0.000091 time 2.2519 (2.2210) loss 3.4499 (3.1157) grad_norm 2.4061 (2.5698) [2022-01-25 12:33:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][670/1251] eta 0:21:30 lr 0.000091 time 2.2756 (2.2205) loss 3.5961 (3.1141) grad_norm 2.7867 (2.5693) [2022-01-25 12:33:43 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][680/1251] eta 0:21:08 lr 0.000091 time 2.3620 (2.2207) loss 3.1044 (3.1149) grad_norm 2.6058 (2.5711) [2022-01-25 12:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][690/1251] eta 0:20:44 lr 0.000091 time 1.8568 (2.2183) loss 3.4984 (3.1139) grad_norm 2.3604 (2.5683) [2022-01-25 12:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][700/1251] eta 0:20:19 lr 0.000091 time 2.2196 (2.2137) loss 3.1346 (3.1147) grad_norm 2.9716 (2.5679) [2022-01-25 12:34:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][710/1251] eta 0:19:55 lr 0.000091 time 1.6748 (2.2103) loss 3.3248 (3.1141) grad_norm 2.7420 (2.5679) [2022-01-25 12:35:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][720/1251] eta 0:19:32 lr 0.000091 time 2.1154 (2.2088) loss 3.5243 (3.1139) grad_norm 2.9370 (2.5668) [2022-01-25 12:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][730/1251] eta 0:19:11 lr 0.000091 time 1.9925 (2.2096) loss 2.3658 (3.1107) grad_norm 2.7075 (2.5692) [2022-01-25 12:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][740/1251] eta 0:18:50 lr 0.000091 time 2.8560 (2.2118) loss 2.0373 (3.1070) grad_norm 2.1269 (2.5703) [2022-01-25 12:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][750/1251] eta 0:18:27 lr 0.000091 time 1.9475 (2.2114) loss 2.8382 (3.1053) grad_norm 2.2029 (2.5711) [2022-01-25 12:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][760/1251] eta 0:18:05 lr 0.000091 time 2.0123 (2.2115) loss 3.2711 (3.1029) grad_norm 2.2116 (2.5703) [2022-01-25 12:36:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][770/1251] eta 0:17:43 lr 0.000091 time 1.7586 (2.2105) loss 2.4997 (3.1028) grad_norm 2.3174 (2.5684) [2022-01-25 12:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][780/1251] eta 0:17:21 lr 
0.000091 time 1.7745 (2.2113) loss 2.8208 (3.1050) grad_norm 2.5542 (2.5688) [2022-01-25 12:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][790/1251] eta 0:16:59 lr 0.000091 time 2.7883 (2.2121) loss 2.8691 (3.1063) grad_norm 2.5646 (2.5687) [2022-01-25 12:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][800/1251] eta 0:16:37 lr 0.000091 time 1.9496 (2.2118) loss 2.4873 (3.1072) grad_norm 2.0898 (2.5669) [2022-01-25 12:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][810/1251] eta 0:16:15 lr 0.000091 time 1.6235 (2.2111) loss 3.7559 (3.1080) grad_norm 2.7531 (2.5689) [2022-01-25 12:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][820/1251] eta 0:15:52 lr 0.000091 time 2.2410 (2.2108) loss 3.2182 (3.1080) grad_norm 2.3210 (2.5673) [2022-01-25 12:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][830/1251] eta 0:15:31 lr 0.000091 time 3.0327 (2.2133) loss 3.2856 (3.1112) grad_norm 2.2649 (2.5661) [2022-01-25 12:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][840/1251] eta 0:15:09 lr 0.000091 time 2.1248 (2.2127) loss 3.5047 (3.1123) grad_norm 2.3194 (2.5659) [2022-01-25 12:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][850/1251] eta 0:14:46 lr 0.000091 time 1.6493 (2.2116) loss 3.1364 (3.1112) grad_norm 2.0564 (2.5654) [2022-01-25 12:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][860/1251] eta 0:14:24 lr 0.000091 time 1.6337 (2.2109) loss 3.1049 (3.1121) grad_norm 2.3299 (2.5652) [2022-01-25 12:40:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][870/1251] eta 0:14:03 lr 0.000091 time 2.5700 (2.2127) loss 2.7035 (3.1112) grad_norm 2.2498 (2.5633) [2022-01-25 12:41:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][880/1251] eta 0:13:40 lr 0.000091 time 1.7254 (2.2129) loss 2.3775 (3.1116) grad_norm 2.4862 (2.5640) [2022-01-25 12:41:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][890/1251] eta 0:13:19 lr 0.000091 time 1.9745 (2.2135) loss 2.2964 (3.1088) grad_norm 2.5795 (2.5625) [2022-01-25 12:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][900/1251] eta 0:12:56 lr 0.000091 time 1.9091 (2.2124) loss 3.5015 (3.1119) grad_norm 2.7868 (2.5623) [2022-01-25 12:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][910/1251] eta 0:12:33 lr 0.000091 time 1.7647 (2.2111) loss 3.3820 (3.1147) grad_norm 2.4749 (2.5601) [2022-01-25 12:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][920/1251] eta 0:12:10 lr 0.000091 time 1.8359 (2.2078) loss 2.6614 (3.1133) grad_norm 2.3019 (2.5621) [2022-01-25 12:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][930/1251] eta 0:11:48 lr 0.000091 time 1.8399 (2.2064) loss 2.9686 (3.1150) grad_norm 2.6858 (2.5629) [2022-01-25 12:43:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][940/1251] eta 0:11:25 lr 0.000091 time 2.2986 (2.2053) loss 3.1049 (3.1162) grad_norm 2.7148 (2.5618) [2022-01-25 12:43:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][950/1251] eta 0:11:03 lr 0.000091 time 1.9590 (2.2044) loss 3.2364 (3.1164) grad_norm 2.2713 (2.5600) [2022-01-25 12:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][960/1251] eta 0:10:41 lr 0.000091 time 2.1577 (2.2043) loss 3.9675 (3.1173) grad_norm 2.5167 (2.5598) [2022-01-25 12:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][970/1251] eta 0:10:19 lr 0.000090 time 2.6492 (2.2057) loss 3.0893 (3.1178) grad_norm 2.4460 (2.5600) [2022-01-25 12:44:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][980/1251] eta 0:09:58 lr 0.000090 time 2.5208 (2.2088) loss 3.2979 (3.1181) grad_norm 2.5064 (2.5594) [2022-01-25 12:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][990/1251] eta 0:09:36 lr 
0.000090 time 2.2012 (2.2100) loss 3.8722 (3.1191) grad_norm 3.1404 (2.5597) [2022-01-25 12:45:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1000/1251] eta 0:09:14 lr 0.000090 time 1.8688 (2.2087) loss 3.3102 (3.1191) grad_norm 2.4050 (2.5585) [2022-01-25 12:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1010/1251] eta 0:08:52 lr 0.000090 time 1.8367 (2.2076) loss 3.3591 (3.1185) grad_norm 2.5537 (2.5584) [2022-01-25 12:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1020/1251] eta 0:08:29 lr 0.000090 time 2.3406 (2.2071) loss 3.6271 (3.1210) grad_norm 2.6180 (2.5586) [2022-01-25 12:46:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1030/1251] eta 0:08:07 lr 0.000090 time 2.7516 (2.2073) loss 3.6160 (3.1204) grad_norm 3.6721 (2.5607) [2022-01-25 12:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1040/1251] eta 0:07:45 lr 0.000090 time 1.8921 (2.2067) loss 3.4678 (3.1199) grad_norm 2.4812 (2.5609) [2022-01-25 12:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1050/1251] eta 0:07:23 lr 0.000090 time 2.3109 (2.2067) loss 2.2597 (3.1183) grad_norm 2.5282 (2.5608) [2022-01-25 12:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1060/1251] eta 0:07:01 lr 0.000090 time 2.2186 (2.2066) loss 2.8884 (3.1202) grad_norm 2.2131 (2.5599) [2022-01-25 12:47:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1070/1251] eta 0:06:39 lr 0.000090 time 2.1579 (2.2069) loss 3.1457 (3.1201) grad_norm 3.6044 (2.5607) [2022-01-25 12:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1080/1251] eta 0:06:17 lr 0.000090 time 1.8762 (2.2057) loss 3.8355 (3.1198) grad_norm 2.8970 (2.5617) [2022-01-25 12:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1090/1251] eta 0:05:55 lr 0.000090 time 2.3308 (2.2052) loss 2.3565 (3.1173) grad_norm 2.5132 (2.5617) [2022-01-25 
12:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1100/1251] eta 0:05:33 lr 0.000090 time 2.0475 (2.2058) loss 3.6068 (3.1181) grad_norm 2.3754 (2.5620) [2022-01-25 12:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1110/1251] eta 0:05:11 lr 0.000090 time 2.4742 (2.2076) loss 2.0265 (3.1173) grad_norm 2.7222 (2.5622) [2022-01-25 12:49:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1120/1251] eta 0:04:49 lr 0.000090 time 1.9355 (2.2067) loss 3.4952 (3.1174) grad_norm 2.4655 (2.5622) [2022-01-25 12:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1130/1251] eta 0:04:26 lr 0.000090 time 1.8768 (2.2051) loss 3.5952 (3.1205) grad_norm 2.2290 (2.5604) [2022-01-25 12:50:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1140/1251] eta 0:04:04 lr 0.000090 time 1.7881 (2.2023) loss 3.2489 (3.1203) grad_norm 2.2488 (2.5591) [2022-01-25 12:50:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1150/1251] eta 0:03:42 lr 0.000090 time 2.8894 (2.2026) loss 3.5976 (3.1192) grad_norm 2.6817 (2.5599) [2022-01-25 12:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1160/1251] eta 0:03:20 lr 0.000090 time 2.4687 (2.2020) loss 3.6345 (3.1193) grad_norm 2.2614 (2.5592) [2022-01-25 12:51:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1170/1251] eta 0:02:58 lr 0.000090 time 2.6095 (2.2028) loss 3.1476 (3.1202) grad_norm 2.8449 (2.5660) [2022-01-25 12:51:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1180/1251] eta 0:02:36 lr 0.000090 time 3.0957 (2.2038) loss 3.7627 (3.1205) grad_norm 2.5093 (2.5689) [2022-01-25 12:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1190/1251] eta 0:02:14 lr 0.000090 time 2.1886 (2.2046) loss 3.5515 (3.1213) grad_norm 2.7129 (2.5690) [2022-01-25 12:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1200/1251] 
eta 0:01:52 lr 0.000090 time 2.1849 (2.2049) loss 3.1553 (3.1211) grad_norm 2.3840 (2.5689) [2022-01-25 12:53:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1210/1251] eta 0:01:30 lr 0.000090 time 2.2412 (2.2042) loss 3.0252 (3.1200) grad_norm 2.7592 (2.5689) [2022-01-25 12:53:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1220/1251] eta 0:01:08 lr 0.000090 time 2.9136 (2.2046) loss 3.1410 (3.1216) grad_norm 2.5095 (2.5692) [2022-01-25 12:53:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1230/1251] eta 0:00:46 lr 0.000090 time 1.8976 (2.2028) loss 3.3063 (3.1234) grad_norm 2.3563 (2.5686) [2022-01-25 12:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1240/1251] eta 0:00:24 lr 0.000090 time 1.6680 (2.2004) loss 3.4708 (3.1230) grad_norm 2.3919 (2.5689) [2022-01-25 12:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [244/300][1250/1251] eta 0:00:02 lr 0.000090 time 1.1731 (2.1948) loss 3.1933 (3.1224) grad_norm 2.4051 (2.5682) [2022-01-25 12:54:17 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 244 training takes 0:45:46 [2022-01-25 12:54:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.239 (18.239) Loss 0.8287 (0.8287) Acc@1 80.469 (80.469) Acc@5 95.020 (95.020) [2022-01-25 12:54:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.857 (3.294) Loss 0.8109 (0.8316) Acc@1 80.664 (80.433) Acc@5 95.605 (95.437) [2022-01-25 12:55:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.887 (2.450) Loss 0.7976 (0.8355) Acc@1 80.664 (80.315) Acc@5 95.605 (95.364) [2022-01-25 12:55:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.271 (2.239) Loss 0.8676 (0.8359) Acc@1 80.762 (80.343) Acc@5 94.043 (95.294) [2022-01-25 12:55:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.201 (2.193) Loss 0.8438 (0.8447) Acc@1 80.176 (80.171) Acc@5 95.312 (95.181) 
[2022-01-25 12:55:53 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.178 Acc@5 95.184 [2022-01-25 12:55:53 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-01-25 12:55:53 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.18% [2022-01-25 12:56:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][0/1251] eta 7:37:21 lr 0.000090 time 21.9360 (21.9360) loss 2.7362 (2.7362) grad_norm 2.4632 (2.4632) [2022-01-25 12:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][10/1251] eta 1:24:04 lr 0.000090 time 2.1667 (4.0645) loss 2.1511 (3.0719) grad_norm 2.1404 (2.5179) [2022-01-25 12:57:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][20/1251] eta 1:04:36 lr 0.000090 time 2.2681 (3.1494) loss 2.5608 (3.1192) grad_norm 2.3979 (2.5190) [2022-01-25 12:57:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][30/1251] eta 0:56:40 lr 0.000090 time 1.3749 (2.7847) loss 3.1550 (3.1433) grad_norm 2.8851 (2.5532) [2022-01-25 12:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][40/1251] eta 0:54:39 lr 0.000090 time 3.3515 (2.7081) loss 3.3673 (3.1057) grad_norm 2.4307 (2.5544) [2022-01-25 12:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][50/1251] eta 0:52:26 lr 0.000090 time 1.7521 (2.6196) loss 3.2523 (3.1686) grad_norm 3.1594 (2.6248) [2022-01-25 12:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][60/1251] eta 0:50:38 lr 0.000090 time 2.4668 (2.5515) loss 2.6574 (3.1429) grad_norm 2.3557 (2.6165) [2022-01-25 12:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][70/1251] eta 0:49:12 lr 0.000090 time 1.6821 (2.4996) loss 3.0692 (3.1536) grad_norm 2.6460 (2.6421) [2022-01-25 12:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][80/1251] eta 0:48:05 lr 0.000090 time 2.8998 (2.4643) loss 3.5809 (3.1480) 
grad_norm 2.9169 (2.6249) [2022-01-25 12:59:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][90/1251] eta 0:47:13 lr 0.000090 time 2.1580 (2.4404) loss 3.2600 (3.1677) grad_norm 2.8697 (2.6293) [2022-01-25 12:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][100/1251] eta 0:46:18 lr 0.000090 time 2.0361 (2.4137) loss 2.7225 (3.1605) grad_norm 2.4620 (2.6208) [2022-01-25 13:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][110/1251] eta 0:45:20 lr 0.000090 time 1.9697 (2.3847) loss 2.2533 (3.1208) grad_norm 2.5267 (2.6052) [2022-01-25 13:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][120/1251] eta 0:44:36 lr 0.000090 time 2.9315 (2.3667) loss 3.3420 (3.1150) grad_norm 3.1308 (2.6012) [2022-01-25 13:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][130/1251] eta 0:44:01 lr 0.000090 time 2.1275 (2.3560) loss 3.4770 (3.1270) grad_norm 2.6165 (2.5923) [2022-01-25 13:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][140/1251] eta 0:43:17 lr 0.000090 time 1.7980 (2.3383) loss 3.2942 (3.1226) grad_norm 2.7756 (2.5861) [2022-01-25 13:01:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][150/1251] eta 0:42:35 lr 0.000090 time 2.3101 (2.3213) loss 3.5106 (3.1213) grad_norm 2.5364 (2.5949) [2022-01-25 13:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][160/1251] eta 0:42:04 lr 0.000089 time 2.3153 (2.3141) loss 2.5329 (3.1148) grad_norm 2.4761 (2.5991) [2022-01-25 13:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][170/1251] eta 0:41:37 lr 0.000089 time 2.8056 (2.3104) loss 3.7906 (3.1189) grad_norm 2.5891 (2.5976) [2022-01-25 13:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][180/1251] eta 0:41:05 lr 0.000089 time 1.8925 (2.3024) loss 3.4082 (3.1095) grad_norm 2.8160 (2.5983) [2022-01-25 13:03:13 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [245/300][190/1251] eta 0:40:40 lr 0.000089 time 2.5438 (2.3000) loss 3.1895 (3.1092) grad_norm 2.3759 (2.6001) [2022-01-25 13:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][200/1251] eta 0:40:15 lr 0.000089 time 2.8237 (2.2982) loss 3.6677 (3.1156) grad_norm 2.6031 (2.5971) [2022-01-25 13:03:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][210/1251] eta 0:39:54 lr 0.000089 time 2.5371 (2.2998) loss 2.7771 (3.1206) grad_norm 2.5484 (2.6006) [2022-01-25 13:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][220/1251] eta 0:39:17 lr 0.000089 time 1.8502 (2.2867) loss 2.1223 (3.1200) grad_norm 2.6847 (2.5995) [2022-01-25 13:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][230/1251] eta 0:38:44 lr 0.000089 time 1.7563 (2.2769) loss 3.3577 (3.1186) grad_norm 2.4160 (2.5951) [2022-01-25 13:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][240/1251] eta 0:38:10 lr 0.000089 time 1.8358 (2.2660) loss 3.4691 (3.1097) grad_norm 2.3510 (2.6015) [2022-01-25 13:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][250/1251] eta 0:37:41 lr 0.000089 time 1.8737 (2.2588) loss 3.0116 (3.0898) grad_norm 2.5242 (2.5997) [2022-01-25 13:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][260/1251] eta 0:37:18 lr 0.000089 time 3.1374 (2.2590) loss 2.5747 (3.0919) grad_norm 2.9256 (2.5995) [2022-01-25 13:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][270/1251] eta 0:36:54 lr 0.000089 time 2.6834 (2.2573) loss 3.5398 (3.0919) grad_norm 2.4443 (2.5986) [2022-01-25 13:06:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][280/1251] eta 0:36:26 lr 0.000089 time 1.9038 (2.2518) loss 3.3317 (3.0932) grad_norm 2.4034 (2.5974) [2022-01-25 13:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][290/1251] eta 0:36:01 lr 0.000089 time 1.8355 (2.2489) loss 3.7271 (3.0941) 
grad_norm 2.6761 (2.5959) [2022-01-25 13:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][300/1251] eta 0:35:42 lr 0.000089 time 3.1714 (2.2531) loss 3.7254 (3.1001) grad_norm 2.5253 (2.5937) [2022-01-25 13:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][310/1251] eta 0:35:18 lr 0.000089 time 2.5082 (2.2510) loss 3.4204 (3.0954) grad_norm 2.7425 (2.5987) [2022-01-25 13:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][320/1251] eta 0:34:54 lr 0.000089 time 1.9893 (2.2498) loss 3.3616 (3.1034) grad_norm 2.5741 (2.5962) [2022-01-25 13:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][330/1251] eta 0:34:29 lr 0.000089 time 2.1781 (2.2470) loss 3.0513 (3.1020) grad_norm 2.2930 (2.5956) [2022-01-25 13:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][340/1251] eta 0:34:08 lr 0.000089 time 3.5883 (2.2486) loss 3.4947 (3.1008) grad_norm 2.5544 (2.5997) [2022-01-25 13:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][350/1251] eta 0:33:49 lr 0.000089 time 2.5579 (2.2525) loss 2.9450 (3.1014) grad_norm 2.4376 (2.5972) [2022-01-25 13:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][360/1251] eta 0:33:19 lr 0.000089 time 2.0798 (2.2443) loss 3.3049 (3.1064) grad_norm 2.3947 (2.5962) [2022-01-25 13:09:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][370/1251] eta 0:32:53 lr 0.000089 time 1.7563 (2.2402) loss 3.3745 (3.1012) grad_norm 2.8252 (2.5993) [2022-01-25 13:10:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][380/1251] eta 0:32:26 lr 0.000089 time 2.1799 (2.2351) loss 3.1393 (3.0977) grad_norm 2.5831 (2.6027) [2022-01-25 13:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][390/1251] eta 0:32:01 lr 0.000089 time 1.8558 (2.2322) loss 2.9183 (3.1000) grad_norm 2.5380 (2.6005) [2022-01-25 13:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [245/300][400/1251] eta 0:31:37 lr 0.000089 time 1.8654 (2.2294) loss 3.5705 (3.1013) grad_norm 2.3881 (2.5996) [2022-01-25 13:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][410/1251] eta 0:31:11 lr 0.000089 time 1.9394 (2.2258) loss 3.4268 (3.1023) grad_norm 2.4573 (2.6000) [2022-01-25 13:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][420/1251] eta 0:30:49 lr 0.000089 time 2.3803 (2.2254) loss 2.2935 (3.1048) grad_norm 2.3462 (2.6018) [2022-01-25 13:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][430/1251] eta 0:30:31 lr 0.000089 time 2.3274 (2.2306) loss 2.7142 (3.0944) grad_norm 2.5377 (2.6033) [2022-01-25 13:12:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][440/1251] eta 0:30:08 lr 0.000089 time 1.4429 (2.2300) loss 3.7534 (3.0976) grad_norm 3.0943 (2.6014) [2022-01-25 13:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][450/1251] eta 0:29:46 lr 0.000089 time 2.4477 (2.2306) loss 2.2456 (3.0959) grad_norm 2.4199 (2.5999) [2022-01-25 13:13:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][460/1251] eta 0:29:24 lr 0.000089 time 3.0310 (2.2304) loss 3.6495 (3.1018) grad_norm 2.9995 (2.6026) [2022-01-25 13:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][470/1251] eta 0:28:58 lr 0.000089 time 1.5663 (2.2256) loss 2.1825 (3.1000) grad_norm 3.2514 (2.6038) [2022-01-25 13:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][480/1251] eta 0:28:32 lr 0.000089 time 2.2571 (2.2217) loss 3.0950 (3.1008) grad_norm 2.6227 (2.6008) [2022-01-25 13:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][490/1251] eta 0:28:09 lr 0.000089 time 1.9245 (2.2203) loss 3.2165 (3.1014) grad_norm 2.5033 (2.5966) [2022-01-25 13:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][500/1251] eta 0:27:48 lr 0.000089 time 3.0672 (2.2212) loss 3.3955 (3.1028) 
grad_norm 2.4622 (2.5934) [2022-01-25 13:14:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][510/1251] eta 0:27:28 lr 0.000089 time 2.4996 (2.2249) loss 3.2983 (3.0975) grad_norm 2.3940 (2.5928) [2022-01-25 13:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][520/1251] eta 0:27:05 lr 0.000089 time 2.1099 (2.2241) loss 3.2083 (3.1022) grad_norm 2.7199 (2.5942) [2022-01-25 13:15:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][530/1251] eta 0:26:43 lr 0.000089 time 2.2092 (2.2240) loss 3.4125 (3.0952) grad_norm 2.4406 (2.5964) [2022-01-25 13:15:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][540/1251] eta 0:26:20 lr 0.000089 time 3.5121 (2.2228) loss 2.6971 (3.0957) grad_norm 3.1992 (2.6022) [2022-01-25 13:16:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][550/1251] eta 0:25:57 lr 0.000089 time 1.9353 (2.2218) loss 3.3786 (3.0973) grad_norm 2.4576 (2.6005) [2022-01-25 13:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][560/1251] eta 0:25:33 lr 0.000089 time 2.2162 (2.2199) loss 3.3446 (3.0990) grad_norm 2.9121 (2.6004) [2022-01-25 13:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][570/1251] eta 0:25:10 lr 0.000089 time 1.9051 (2.2178) loss 2.9877 (3.1010) grad_norm 3.0937 (2.5984) [2022-01-25 13:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][580/1251] eta 0:24:47 lr 0.000089 time 2.3232 (2.2171) loss 3.5551 (3.0980) grad_norm 2.1474 (2.5973) [2022-01-25 13:17:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][590/1251] eta 0:24:25 lr 0.000089 time 2.0397 (2.2166) loss 2.1614 (3.0993) grad_norm 3.0436 (2.5963) [2022-01-25 13:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][600/1251] eta 0:24:02 lr 0.000089 time 2.2100 (2.2151) loss 2.8196 (3.0998) grad_norm 2.9008 (2.5982) [2022-01-25 13:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [245/300][610/1251] eta 0:23:39 lr 0.000088 time 1.6014 (2.2144) loss 2.5454 (3.0999) grad_norm 2.5595 (2.5963) [2022-01-25 13:18:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][620/1251] eta 0:23:17 lr 0.000088 time 2.5683 (2.2145) loss 3.5898 (3.1003) grad_norm 3.0173 (2.6023) [2022-01-25 13:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][630/1251] eta 0:22:54 lr 0.000088 time 2.4493 (2.2141) loss 3.2785 (3.0979) grad_norm 2.6666 (2.6026) [2022-01-25 13:19:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][640/1251] eta 0:22:34 lr 0.000088 time 3.3840 (2.2165) loss 3.5165 (3.1013) grad_norm 2.6159 (2.6017) [2022-01-25 13:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][650/1251] eta 0:22:13 lr 0.000088 time 1.7360 (2.2187) loss 3.1986 (3.1059) grad_norm 2.6950 (2.6033) [2022-01-25 13:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][660/1251] eta 0:21:52 lr 0.000088 time 2.5452 (2.2200) loss 3.3051 (3.1089) grad_norm 2.3994 (2.6016) [2022-01-25 13:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][670/1251] eta 0:21:28 lr 0.000088 time 2.0235 (2.2181) loss 3.3873 (3.1106) grad_norm 2.3266 (2.6000) [2022-01-25 13:21:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][680/1251] eta 0:21:04 lr 0.000088 time 2.4746 (2.2147) loss 3.3271 (3.1098) grad_norm 2.4352 (2.6001) [2022-01-25 13:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][690/1251] eta 0:20:41 lr 0.000088 time 2.2120 (2.2133) loss 3.5775 (3.1130) grad_norm 2.4941 (2.5976) [2022-01-25 13:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][700/1251] eta 0:20:18 lr 0.000088 time 2.1785 (2.2113) loss 3.4125 (3.1171) grad_norm 2.5362 (2.5985) [2022-01-25 13:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][710/1251] eta 0:19:56 lr 0.000088 time 2.8456 (2.2111) loss 2.8318 (3.1143) 
grad_norm 3.0071 (2.6016) [2022-01-25 13:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][720/1251] eta 0:19:35 lr 0.000088 time 2.9855 (2.2131) loss 3.3737 (3.1161) grad_norm 4.2584 (2.6038) [2022-01-25 13:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][730/1251] eta 0:19:13 lr 0.000088 time 2.6737 (2.2140) loss 2.5032 (3.1171) grad_norm 2.4887 (2.6040) [2022-01-25 13:23:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][740/1251] eta 0:18:50 lr 0.000088 time 1.8277 (2.2132) loss 3.4003 (3.1172) grad_norm 2.5856 (2.6033) [2022-01-25 13:23:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][750/1251] eta 0:18:27 lr 0.000088 time 2.3748 (2.2114) loss 3.5651 (3.1184) grad_norm 2.4442 (2.6041) [2022-01-25 13:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][760/1251] eta 0:18:05 lr 0.000088 time 2.8001 (2.2098) loss 2.5061 (3.1202) grad_norm 2.6684 (2.6058) [2022-01-25 13:24:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][770/1251] eta 0:17:42 lr 0.000088 time 1.9502 (2.2089) loss 3.5159 (3.1163) grad_norm 2.3617 (2.6044) [2022-01-25 13:24:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][780/1251] eta 0:17:19 lr 0.000088 time 2.4813 (2.2080) loss 2.7905 (3.1132) grad_norm 2.6175 (2.6036) [2022-01-25 13:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][790/1251] eta 0:16:56 lr 0.000088 time 1.6180 (2.2054) loss 2.8866 (3.1089) grad_norm 2.2359 (2.6058) [2022-01-25 13:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][800/1251] eta 0:16:34 lr 0.000088 time 2.3783 (2.2042) loss 2.2523 (3.1055) grad_norm 2.3152 (2.6056) [2022-01-25 13:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][810/1251] eta 0:16:11 lr 0.000088 time 1.5778 (2.2033) loss 3.0653 (3.1060) grad_norm 2.4711 (2.6045) [2022-01-25 13:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [245/300][820/1251] eta 0:15:49 lr 0.000088 time 2.8458 (2.2036) loss 3.3667 (3.1073) grad_norm 2.4902 (2.6038) [2022-01-25 13:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][830/1251] eta 0:15:27 lr 0.000088 time 1.8706 (2.2034) loss 2.2800 (3.1068) grad_norm 2.5342 (2.6042) [2022-01-25 13:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][840/1251] eta 0:15:05 lr 0.000088 time 2.7211 (2.2038) loss 3.4101 (3.1088) grad_norm 2.6195 (2.6104) [2022-01-25 13:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][850/1251] eta 0:14:44 lr 0.000088 time 2.4664 (2.2054) loss 2.7966 (3.1097) grad_norm 2.7812 (2.6117) [2022-01-25 13:27:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][860/1251] eta 0:14:22 lr 0.000088 time 2.4086 (2.2071) loss 2.1176 (3.1055) grad_norm 2.5654 (2.6100) [2022-01-25 13:27:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][870/1251] eta 0:14:00 lr 0.000088 time 1.8544 (2.2064) loss 2.9761 (3.1053) grad_norm 2.3035 (2.6082) [2022-01-25 13:28:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][880/1251] eta 0:13:38 lr 0.000088 time 3.4461 (2.2070) loss 3.8113 (3.1064) grad_norm 3.0314 (2.6096) [2022-01-25 13:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][890/1251] eta 0:13:16 lr 0.000088 time 1.9102 (2.2067) loss 3.7750 (3.1100) grad_norm 2.5997 (2.6111) [2022-01-25 13:29:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][900/1251] eta 0:12:54 lr 0.000088 time 2.4363 (2.2066) loss 2.5341 (3.1082) grad_norm 2.3157 (2.6107) [2022-01-25 13:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][910/1251] eta 0:12:31 lr 0.000088 time 2.1954 (2.2051) loss 2.3112 (3.1090) grad_norm 3.0847 (2.6105) [2022-01-25 13:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][920/1251] eta 0:12:09 lr 0.000088 time 3.6233 (2.2047) loss 2.4314 (3.1076) 
grad_norm 2.2418 (2.6091) [2022-01-25 13:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][930/1251] eta 0:11:47 lr 0.000088 time 2.1656 (2.2035) loss 3.2535 (3.1100) grad_norm 3.1646 (2.6087) [2022-01-25 13:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][940/1251] eta 0:11:25 lr 0.000088 time 2.2849 (2.2039) loss 3.6214 (3.1081) grad_norm 2.2094 (2.6100) [2022-01-25 13:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][950/1251] eta 0:11:03 lr 0.000088 time 1.9118 (2.2032) loss 3.8176 (3.1071) grad_norm 2.4154 (2.6084) [2022-01-25 13:31:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][960/1251] eta 0:10:41 lr 0.000088 time 3.2945 (2.2031) loss 3.1355 (3.1043) grad_norm 2.2696 (2.6073) [2022-01-25 13:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][970/1251] eta 0:10:18 lr 0.000088 time 2.1703 (2.2023) loss 3.0753 (3.1039) grad_norm 2.8853 (2.6065) [2022-01-25 13:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][980/1251] eta 0:09:56 lr 0.000088 time 2.2398 (2.2026) loss 3.3949 (3.1028) grad_norm 2.3973 (2.6095) [2022-01-25 13:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][990/1251] eta 0:09:34 lr 0.000088 time 2.0892 (2.2023) loss 3.3568 (3.1042) grad_norm 2.6413 (2.6092) [2022-01-25 13:32:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1000/1251] eta 0:09:13 lr 0.000088 time 4.9739 (2.2042) loss 3.1754 (3.1037) grad_norm 2.5494 (2.6083) [2022-01-25 13:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1010/1251] eta 0:08:50 lr 0.000088 time 2.2082 (2.2023) loss 3.2056 (3.1043) grad_norm 2.2002 (2.6085) [2022-01-25 13:33:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1020/1251] eta 0:08:28 lr 0.000088 time 1.4929 (2.2028) loss 2.2745 (3.1046) grad_norm 2.6987 (2.6081) [2022-01-25 13:33:43 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [245/300][1030/1251] eta 0:08:06 lr 0.000088 time 2.2074 (2.2015) loss 3.7418 (3.1051) grad_norm 4.2867 (2.6100) [2022-01-25 13:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1040/1251] eta 0:07:44 lr 0.000088 time 3.4499 (2.2022) loss 3.1940 (3.1073) grad_norm 2.7701 (2.6108) [2022-01-25 13:34:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1050/1251] eta 0:07:22 lr 0.000088 time 3.4436 (2.2034) loss 3.6756 (3.1082) grad_norm 2.2979 (2.6111) [2022-01-25 13:34:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1060/1251] eta 0:07:00 lr 0.000087 time 1.8654 (2.2038) loss 2.8708 (3.1079) grad_norm 3.0143 (2.6111) [2022-01-25 13:35:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1070/1251] eta 0:06:38 lr 0.000087 time 1.7686 (2.2032) loss 3.2498 (3.1085) grad_norm 2.6391 (2.6113) [2022-01-25 13:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1080/1251] eta 0:06:16 lr 0.000087 time 2.4653 (2.2027) loss 2.3304 (3.1064) grad_norm 2.7620 (2.6101) [2022-01-25 13:35:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1090/1251] eta 0:05:54 lr 0.000087 time 2.9024 (2.2011) loss 3.3920 (3.1078) grad_norm 2.9196 (2.6109) [2022-01-25 13:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1100/1251] eta 0:05:32 lr 0.000087 time 1.9233 (2.2010) loss 3.6203 (3.1051) grad_norm 2.5191 (2.6105) [2022-01-25 13:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1110/1251] eta 0:05:10 lr 0.000087 time 2.1856 (2.1999) loss 3.1064 (3.1060) grad_norm 2.5405 (2.6100) [2022-01-25 13:37:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1120/1251] eta 0:04:48 lr 0.000087 time 2.5905 (2.2002) loss 3.0090 (3.1063) grad_norm 2.5580 (2.6099) [2022-01-25 13:37:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1130/1251] eta 0:04:26 lr 0.000087 time 4.5491 (2.2023) loss 3.5196 
(3.1054) grad_norm 2.4331 (2.6107) [2022-01-25 13:37:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1140/1251] eta 0:04:04 lr 0.000087 time 1.5219 (2.2010) loss 2.7768 (3.1052) grad_norm 2.4816 (2.6099) [2022-01-25 13:38:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1150/1251] eta 0:03:42 lr 0.000087 time 2.1892 (2.2001) loss 3.4025 (3.1034) grad_norm 2.5542 (2.6092) [2022-01-25 13:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1160/1251] eta 0:03:20 lr 0.000087 time 1.8669 (2.2000) loss 3.3403 (3.1047) grad_norm 2.7048 (2.6112) [2022-01-25 13:38:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1170/1251] eta 0:02:58 lr 0.000087 time 2.4930 (2.1996) loss 3.4658 (3.1052) grad_norm 2.3751 (2.6101) [2022-01-25 13:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1180/1251] eta 0:02:36 lr 0.000087 time 2.1614 (2.1993) loss 3.8028 (3.1067) grad_norm 2.3464 (2.6101) [2022-01-25 13:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1190/1251] eta 0:02:14 lr 0.000087 time 2.1707 (2.1995) loss 3.6791 (3.1068) grad_norm 2.3296 (2.6095) [2022-01-25 13:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1200/1251] eta 0:01:52 lr 0.000087 time 2.5248 (2.1991) loss 3.5000 (3.1083) grad_norm 2.6837 (2.6112) [2022-01-25 13:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1210/1251] eta 0:01:30 lr 0.000087 time 2.2196 (2.1991) loss 3.0974 (3.1079) grad_norm 2.4114 (2.6112) [2022-01-25 13:40:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1220/1251] eta 0:01:08 lr 0.000087 time 2.2745 (2.1998) loss 2.8948 (3.1096) grad_norm 2.2218 (2.6103) [2022-01-25 13:41:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1230/1251] eta 0:00:46 lr 0.000087 time 1.8295 (2.1997) loss 2.5462 (3.1092) grad_norm 2.2558 (2.6106) [2022-01-25 13:41:22 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [245/300][1240/1251] eta 0:00:24 lr 0.000087 time 1.4065 (2.1986) loss 2.0744 (3.1080) grad_norm 2.4358 (2.6108) [2022-01-25 13:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [245/300][1250/1251] eta 0:00:02 lr 0.000087 time 1.3216 (2.1931) loss 2.2117 (3.1082) grad_norm 2.6990 (2.6099) [2022-01-25 13:41:37 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 245 training takes 0:45:43 [2022-01-25 13:41:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.441 (18.441) Loss 0.9066 (0.9066) Acc@1 77.344 (77.344) Acc@5 94.922 (94.922) [2022-01-25 13:42:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.250 (3.260) Loss 0.8385 (0.8632) Acc@1 79.688 (79.537) Acc@5 95.117 (95.259) [2022-01-25 13:42:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.535 (2.615) Loss 0.8977 (0.8623) Acc@1 79.199 (79.822) Acc@5 95.215 (95.261) [2022-01-25 13:42:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.916 (2.278) Loss 0.8045 (0.8605) Acc@1 81.641 (79.880) Acc@5 95.898 (95.196) [2022-01-25 13:43:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.572 (2.192) Loss 0.7698 (0.8537) Acc@1 80.859 (79.983) Acc@5 96.289 (95.217) [2022-01-25 13:43:14 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.034 Acc@5 95.230 [2022-01-25 13:43:14 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.0% [2022-01-25 13:43:14 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.18% [2022-01-25 13:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][0/1251] eta 7:40:36 lr 0.000087 time 22.0911 (22.0911) loss 3.5991 (3.5991) grad_norm 2.6240 (2.6240) [2022-01-25 13:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][10/1251] eta 1:25:51 lr 0.000087 time 1.7730 (4.1508) loss 3.3061 (3.2821) grad_norm 2.5434 (2.5609) [2022-01-25 13:44:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][20/1251] eta 1:10:19 lr 0.000087 time 1.6133 (3.4281) loss 2.4001 (3.1447) grad_norm 2.7339 (2.5226) [2022-01-25 13:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][30/1251] eta 1:01:48 lr 0.000087 time 1.9444 (3.0370) loss 3.2094 (3.1935) grad_norm 2.7577 (2.5253) [2022-01-25 13:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][40/1251] eta 0:57:11 lr 0.000087 time 2.7815 (2.8334) loss 3.0791 (3.1524) grad_norm 2.2851 (2.5132) [2022-01-25 13:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][50/1251] eta 0:53:46 lr 0.000087 time 1.7934 (2.6869) loss 2.1707 (3.1075) grad_norm 2.5358 (2.4848) [2022-01-25 13:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][60/1251] eta 0:51:12 lr 0.000087 time 2.1271 (2.5799) loss 3.8175 (3.1315) grad_norm 2.6024 (2.4950) [2022-01-25 13:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][70/1251] eta 0:49:11 lr 0.000087 time 1.5622 (2.4994) loss 3.1817 (3.1374) grad_norm 2.3520 (2.5398) [2022-01-25 13:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][80/1251] eta 0:47:54 lr 0.000087 time 2.4894 (2.4546) loss 3.3840 (3.1382) grad_norm 2.3433 (2.5526) [2022-01-25 13:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][90/1251] eta 0:46:46 lr 0.000087 time 2.1709 (2.4175) loss 2.4462 (3.0940) grad_norm 2.2878 (2.5475) [2022-01-25 13:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][100/1251] eta 0:46:06 lr 0.000087 time 1.7354 (2.4038) loss 3.3768 (3.0973) grad_norm 2.3214 (2.5474) [2022-01-25 13:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][110/1251] eta 0:45:08 lr 0.000087 time 1.8628 (2.3739) loss 3.2395 (3.0855) grad_norm 2.4902 (2.5407) [2022-01-25 13:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][120/1251] eta 0:44:57 lr 0.000087 time 
3.4188 (2.3852) loss 2.7612 (3.0850) grad_norm 3.2711 (2.5501) [2022-01-25 13:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][130/1251] eta 0:44:22 lr 0.000087 time 1.9213 (2.3753) loss 2.8539 (3.0867) grad_norm 2.9398 (2.5465) [2022-01-25 13:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][140/1251] eta 0:43:44 lr 0.000087 time 1.8643 (2.3622) loss 2.7624 (3.0798) grad_norm 2.8690 (2.5522) [2022-01-25 13:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][150/1251] eta 0:43:07 lr 0.000087 time 2.5657 (2.3501) loss 3.4893 (3.0766) grad_norm 2.8533 (2.5573) [2022-01-25 13:49:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][160/1251] eta 0:42:25 lr 0.000087 time 1.8737 (2.3328) loss 2.2375 (3.0917) grad_norm 2.4892 (2.5622) [2022-01-25 13:49:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][170/1251] eta 0:41:44 lr 0.000087 time 1.8795 (2.3164) loss 2.3578 (3.0732) grad_norm 2.4478 (2.5644) [2022-01-25 13:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][180/1251] eta 0:41:06 lr 0.000087 time 1.9421 (2.3034) loss 2.8208 (3.0738) grad_norm 2.5245 (2.5569) [2022-01-25 13:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][190/1251] eta 0:40:40 lr 0.000087 time 1.8508 (2.2999) loss 2.9463 (3.0729) grad_norm 2.3840 (2.5555) [2022-01-25 13:50:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][200/1251] eta 0:40:21 lr 0.000087 time 1.9422 (2.3038) loss 3.6108 (3.0779) grad_norm 2.5523 (2.5548) [2022-01-25 13:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][210/1251] eta 0:39:58 lr 0.000087 time 1.8124 (2.3036) loss 2.1709 (3.0827) grad_norm 2.5665 (2.5561) [2022-01-25 13:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][220/1251] eta 0:39:26 lr 0.000087 time 1.6669 (2.2958) loss 2.5182 (3.0922) grad_norm 2.3921 (2.5552) [2022-01-25 13:52:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][230/1251] eta 0:38:53 lr 0.000087 time 1.6988 (2.2853) loss 2.2522 (3.0959) grad_norm 2.8591 (2.5653) [2022-01-25 13:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][240/1251] eta 0:38:27 lr 0.000087 time 1.6858 (2.2824) loss 3.2289 (3.0927) grad_norm 2.3621 (2.5628) [2022-01-25 13:52:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][250/1251] eta 0:37:59 lr 0.000087 time 1.9373 (2.2770) loss 3.5527 (3.0942) grad_norm 2.6948 (2.5714) [2022-01-25 13:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][260/1251] eta 0:37:39 lr 0.000086 time 2.0069 (2.2804) loss 3.2972 (3.0976) grad_norm 2.6819 (2.5730) [2022-01-25 13:53:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][270/1251] eta 0:37:07 lr 0.000086 time 1.6046 (2.2705) loss 2.5421 (3.0850) grad_norm 3.2504 (2.5767) [2022-01-25 13:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][280/1251] eta 0:36:35 lr 0.000086 time 1.8104 (2.2611) loss 3.4169 (3.0918) grad_norm 2.5919 (2.5782) [2022-01-25 13:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][290/1251] eta 0:36:27 lr 0.000086 time 1.5463 (2.2764) loss 3.1976 (3.0936) grad_norm 2.7692 (2.5802) [2022-01-25 13:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][300/1251] eta 0:36:01 lr 0.000086 time 1.5992 (2.2729) loss 2.7785 (3.0887) grad_norm 2.2027 (2.5795) [2022-01-25 13:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][310/1251] eta 0:35:38 lr 0.000086 time 2.1203 (2.2724) loss 3.5555 (3.1016) grad_norm 2.5283 (2.5777) [2022-01-25 13:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][320/1251] eta 0:35:07 lr 0.000086 time 1.8728 (2.2638) loss 2.1372 (3.1005) grad_norm 2.4231 (2.5770) [2022-01-25 13:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][330/1251] eta 0:34:42 lr 
0.000086 time 2.0058 (2.2612) loss 3.3921 (3.1027) grad_norm 2.4575 (2.5777) [2022-01-25 13:56:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][340/1251] eta 0:34:28 lr 0.000086 time 1.9400 (2.2709) loss 3.0814 (3.1081) grad_norm 2.4734 (2.5762) [2022-01-25 13:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][350/1251] eta 0:34:00 lr 0.000086 time 1.6990 (2.2652) loss 3.4983 (3.1064) grad_norm 2.6084 (2.5773) [2022-01-25 13:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][360/1251] eta 0:33:33 lr 0.000086 time 1.9082 (2.2600) loss 2.7072 (3.1083) grad_norm 2.3233 (2.5754) [2022-01-25 13:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][370/1251] eta 0:33:05 lr 0.000086 time 1.6334 (2.2533) loss 1.9103 (3.1090) grad_norm 2.3585 (2.5744) [2022-01-25 13:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][380/1251] eta 0:32:45 lr 0.000086 time 2.1160 (2.2572) loss 3.6555 (3.1114) grad_norm 2.7469 (2.5729) [2022-01-25 13:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][390/1251] eta 0:32:20 lr 0.000086 time 1.9141 (2.2538) loss 3.2244 (3.1140) grad_norm 2.9961 (2.5711) [2022-01-25 13:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][400/1251] eta 0:31:54 lr 0.000086 time 2.5118 (2.2492) loss 2.1600 (3.1076) grad_norm 2.7792 (2.5697) [2022-01-25 13:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][410/1251] eta 0:31:27 lr 0.000086 time 1.9052 (2.2438) loss 3.7017 (3.1133) grad_norm 2.5708 (2.5693) [2022-01-25 13:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][420/1251] eta 0:31:08 lr 0.000086 time 2.1766 (2.2481) loss 2.1081 (3.1104) grad_norm 2.4666 (2.5720) [2022-01-25 13:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][430/1251] eta 0:30:46 lr 0.000086 time 1.8531 (2.2491) loss 3.4015 (3.1116) grad_norm 2.5136 (2.5740) [2022-01-25 13:59:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][440/1251] eta 0:30:22 lr 0.000086 time 1.7688 (2.2469) loss 2.7256 (3.1151) grad_norm 2.9214 (2.5729) [2022-01-25 14:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][450/1251] eta 0:29:55 lr 0.000086 time 1.6048 (2.2413) loss 3.6555 (3.1195) grad_norm 2.4188 (2.5730) [2022-01-25 14:00:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][460/1251] eta 0:29:30 lr 0.000086 time 1.8681 (2.2388) loss 2.9307 (3.1206) grad_norm 2.4690 (2.5725) [2022-01-25 14:00:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][470/1251] eta 0:29:08 lr 0.000086 time 2.2937 (2.2389) loss 3.1965 (3.1235) grad_norm 2.6386 (2.5757) [2022-01-25 14:01:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][480/1251] eta 0:28:43 lr 0.000086 time 2.1777 (2.2352) loss 3.2564 (3.1289) grad_norm 2.4625 (2.5745) [2022-01-25 14:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][490/1251] eta 0:28:19 lr 0.000086 time 1.9953 (2.2332) loss 3.6980 (3.1351) grad_norm 2.5350 (2.5773) [2022-01-25 14:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][500/1251] eta 0:27:54 lr 0.000086 time 2.4554 (2.2303) loss 2.0814 (3.1297) grad_norm 2.7134 (2.5828) [2022-01-25 14:02:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][510/1251] eta 0:27:32 lr 0.000086 time 2.5418 (2.2300) loss 2.7902 (3.1303) grad_norm 3.1475 (2.5895) [2022-01-25 14:02:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][520/1251] eta 0:27:09 lr 0.000086 time 1.5144 (2.2286) loss 3.7314 (3.1282) grad_norm 2.8299 (2.5948) [2022-01-25 14:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][530/1251] eta 0:26:46 lr 0.000086 time 1.6506 (2.2276) loss 3.6992 (3.1257) grad_norm 2.6653 (2.5992) [2022-01-25 14:03:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][540/1251] eta 0:26:24 lr 
0.000086 time 2.0752 (2.2288) loss 2.0313 (3.1212) grad_norm 3.1870 (2.6036) [2022-01-25 14:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][550/1251] eta 0:26:05 lr 0.000086 time 2.4919 (2.2329) loss 3.8313 (3.1221) grad_norm 2.9459 (2.6022) [2022-01-25 14:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][560/1251] eta 0:25:43 lr 0.000086 time 1.4951 (2.2339) loss 2.1813 (3.1169) grad_norm 2.7201 (2.6021) [2022-01-25 14:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][570/1251] eta 0:25:19 lr 0.000086 time 1.6590 (2.2313) loss 3.3211 (3.1193) grad_norm 6.6183 (2.6081) [2022-01-25 14:04:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][580/1251] eta 0:24:54 lr 0.000086 time 1.8183 (2.2278) loss 3.0168 (3.1172) grad_norm 3.1116 (2.6075) [2022-01-25 14:05:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][590/1251] eta 0:24:30 lr 0.000086 time 1.7744 (2.2242) loss 2.3553 (3.1131) grad_norm 2.9225 (2.6132) [2022-01-25 14:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][600/1251] eta 0:24:06 lr 0.000086 time 1.8310 (2.2218) loss 4.0332 (3.1116) grad_norm 2.9877 (2.6129) [2022-01-25 14:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][610/1251] eta 0:23:43 lr 0.000086 time 2.5185 (2.2206) loss 3.9557 (3.1147) grad_norm 2.8828 (2.6158) [2022-01-25 14:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][620/1251] eta 0:23:20 lr 0.000086 time 2.4210 (2.2200) loss 3.4679 (3.1191) grad_norm 2.5506 (2.6156) [2022-01-25 14:06:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][630/1251] eta 0:22:57 lr 0.000086 time 2.7771 (2.2189) loss 3.1908 (3.1168) grad_norm 2.5441 (2.6141) [2022-01-25 14:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][640/1251] eta 0:22:35 lr 0.000086 time 1.6049 (2.2184) loss 3.6141 (3.1160) grad_norm 2.5875 (2.6136) [2022-01-25 14:07:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][650/1251] eta 0:22:13 lr 0.000086 time 1.8955 (2.2190) loss 3.2238 (3.1187) grad_norm 2.2949 (2.6127) [2022-01-25 14:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][660/1251] eta 0:21:52 lr 0.000086 time 2.2327 (2.2208) loss 2.0985 (3.1137) grad_norm 2.3570 (2.6123) [2022-01-25 14:08:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][670/1251] eta 0:21:29 lr 0.000086 time 2.5191 (2.2201) loss 2.5938 (3.1106) grad_norm 2.4070 (2.6112) [2022-01-25 14:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][680/1251] eta 0:21:07 lr 0.000086 time 1.5973 (2.2192) loss 3.2383 (3.1097) grad_norm 2.3744 (2.6097) [2022-01-25 14:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][690/1251] eta 0:20:43 lr 0.000086 time 1.6376 (2.2165) loss 2.1406 (3.1053) grad_norm 2.5109 (2.6105) [2022-01-25 14:09:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][700/1251] eta 0:20:20 lr 0.000086 time 2.5372 (2.2150) loss 2.5862 (3.1011) grad_norm 3.3490 (2.6099) [2022-01-25 14:09:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][710/1251] eta 0:19:58 lr 0.000085 time 2.2278 (2.2153) loss 3.3939 (3.0966) grad_norm 2.9147 (2.6100) [2022-01-25 14:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][720/1251] eta 0:19:36 lr 0.000085 time 2.1510 (2.2150) loss 3.7696 (3.0965) grad_norm 2.3534 (2.6097) [2022-01-25 14:10:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][730/1251] eta 0:19:14 lr 0.000085 time 2.0136 (2.2159) loss 3.2695 (3.0965) grad_norm 2.4466 (2.6109) [2022-01-25 14:10:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][740/1251] eta 0:18:52 lr 0.000085 time 1.7346 (2.2168) loss 2.4443 (3.0985) grad_norm 2.3380 (2.6124) [2022-01-25 14:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][750/1251] eta 0:18:30 lr 
0.000085 time 1.8202 (2.2174) loss 2.3089 (3.0978) grad_norm 4.0881 (2.6175) [2022-01-25 14:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][760/1251] eta 0:18:09 lr 0.000085 time 3.2875 (2.2197) loss 3.0239 (3.0976) grad_norm 2.6713 (2.6232) [2022-01-25 14:11:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][770/1251] eta 0:17:47 lr 0.000085 time 1.8253 (2.2186) loss 2.4186 (3.0974) grad_norm 3.5830 (2.6267) [2022-01-25 14:12:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][780/1251] eta 0:17:24 lr 0.000085 time 1.8259 (2.2174) loss 2.6277 (3.0967) grad_norm 2.5759 (2.6252) [2022-01-25 14:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][790/1251] eta 0:17:00 lr 0.000085 time 2.0152 (2.2147) loss 3.3627 (3.0987) grad_norm 2.3571 (2.6265) [2022-01-25 14:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][800/1251] eta 0:16:38 lr 0.000085 time 2.7972 (2.2148) loss 3.5880 (3.0999) grad_norm 2.3577 (2.6248) [2022-01-25 14:13:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][810/1251] eta 0:16:16 lr 0.000085 time 1.5591 (2.2142) loss 2.7952 (3.1009) grad_norm 2.1568 (2.6238) [2022-01-25 14:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][820/1251] eta 0:15:53 lr 0.000085 time 1.6669 (2.2120) loss 3.3821 (3.1024) grad_norm 2.4058 (2.6223) [2022-01-25 14:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][830/1251] eta 0:15:30 lr 0.000085 time 2.1711 (2.2112) loss 2.7576 (3.0976) grad_norm 2.7843 (2.6216) [2022-01-25 14:14:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][840/1251] eta 0:15:08 lr 0.000085 time 3.4083 (2.2114) loss 3.2700 (3.0980) grad_norm 2.3321 (2.6204) [2022-01-25 14:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][850/1251] eta 0:14:46 lr 0.000085 time 1.8165 (2.2111) loss 2.7056 (3.0978) grad_norm 2.5612 (2.6194) [2022-01-25 14:14:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][860/1251] eta 0:14:25 lr 0.000085 time 2.1400 (2.2125) loss 3.5215 (3.0996) grad_norm 2.5713 (2.6197) [2022-01-25 14:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][870/1251] eta 0:14:02 lr 0.000085 time 1.8075 (2.2121) loss 3.2227 (3.1009) grad_norm 2.2823 (2.6190) [2022-01-25 14:15:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][880/1251] eta 0:13:40 lr 0.000085 time 2.9136 (2.2123) loss 3.7100 (3.1026) grad_norm 2.8172 (2.6189) [2022-01-25 14:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][890/1251] eta 0:13:18 lr 0.000085 time 2.1897 (2.2109) loss 2.9498 (3.1027) grad_norm 2.8211 (2.6182) [2022-01-25 14:16:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][900/1251] eta 0:12:56 lr 0.000085 time 2.2068 (2.2110) loss 3.4245 (3.1025) grad_norm 2.5879 (2.6178) [2022-01-25 14:16:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][910/1251] eta 0:12:33 lr 0.000085 time 1.8610 (2.2106) loss 2.9210 (3.1032) grad_norm 2.7191 (2.6353) [2022-01-25 14:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][920/1251] eta 0:12:12 lr 0.000085 time 3.1657 (2.2127) loss 3.1816 (3.1049) grad_norm 2.4740 (2.6352) [2022-01-25 14:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][930/1251] eta 0:11:50 lr 0.000085 time 1.6803 (2.2124) loss 3.4384 (3.1065) grad_norm 2.2973 (2.6340) [2022-01-25 14:17:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][940/1251] eta 0:11:27 lr 0.000085 time 1.7109 (2.2101) loss 3.8020 (3.1069) grad_norm 3.2251 (2.6363) [2022-01-25 14:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][950/1251] eta 0:11:04 lr 0.000085 time 1.7748 (2.2093) loss 3.3266 (3.1077) grad_norm 2.1461 (2.6355) [2022-01-25 14:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][960/1251] eta 0:10:43 lr 
0.000085 time 2.9153 (2.2097) loss 3.5992 (3.1085) grad_norm 2.5024 (2.6338) [2022-01-25 14:18:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][970/1251] eta 0:10:20 lr 0.000085 time 1.8924 (2.2096) loss 3.2271 (3.1092) grad_norm 2.3623 (2.6325) [2022-01-25 14:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][980/1251] eta 0:09:58 lr 0.000085 time 2.5454 (2.2101) loss 2.7194 (3.1091) grad_norm 2.3337 (2.6321) [2022-01-25 14:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][990/1251] eta 0:09:36 lr 0.000085 time 1.7521 (2.2091) loss 2.7132 (3.1104) grad_norm 2.7193 (2.6323) [2022-01-25 14:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1000/1251] eta 0:09:14 lr 0.000085 time 2.9243 (2.2089) loss 3.4377 (3.1096) grad_norm 3.1863 (2.6305) [2022-01-25 14:20:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1010/1251] eta 0:08:51 lr 0.000085 time 2.0492 (2.2063) loss 3.1950 (3.1094) grad_norm 2.3759 (2.6302) [2022-01-25 14:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1020/1251] eta 0:08:29 lr 0.000085 time 1.8915 (2.2047) loss 3.3179 (3.1083) grad_norm 2.4133 (2.6283) [2022-01-25 14:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1030/1251] eta 0:08:07 lr 0.000085 time 2.4158 (2.2041) loss 3.7313 (3.1105) grad_norm 2.3408 (2.6270) [2022-01-25 14:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1040/1251] eta 0:07:45 lr 0.000085 time 2.5372 (2.2050) loss 3.6151 (3.1097) grad_norm 3.1029 (2.6275) [2022-01-25 14:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1050/1251] eta 0:07:23 lr 0.000085 time 2.6605 (2.2057) loss 3.5271 (3.1089) grad_norm 2.7946 (2.6270) [2022-01-25 14:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1060/1251] eta 0:07:01 lr 0.000085 time 2.5692 (2.2063) loss 3.0233 (3.1089) grad_norm 2.5847 (2.6279) [2022-01-25 
14:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1070/1251] eta 0:06:39 lr 0.000085 time 1.8210 (2.2051) loss 2.6108 (3.1069) grad_norm 3.1528 (2.6283) [2022-01-25 14:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1080/1251] eta 0:06:17 lr 0.000085 time 3.2406 (2.2053) loss 2.3801 (3.1051) grad_norm 2.4708 (2.6269) [2022-01-25 14:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1090/1251] eta 0:05:55 lr 0.000085 time 2.5018 (2.2052) loss 2.9752 (3.1050) grad_norm 2.6912 (2.6264) [2022-01-25 14:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1100/1251] eta 0:05:32 lr 0.000085 time 1.6023 (2.2050) loss 2.4631 (3.1036) grad_norm 2.7107 (2.6270) [2022-01-25 14:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1110/1251] eta 0:05:10 lr 0.000085 time 1.9649 (2.2030) loss 2.9067 (3.1028) grad_norm 3.2588 (2.6282) [2022-01-25 14:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1120/1251] eta 0:04:48 lr 0.000085 time 4.1698 (2.2054) loss 3.5013 (3.1057) grad_norm 3.7326 (2.6316) [2022-01-25 14:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1130/1251] eta 0:04:26 lr 0.000085 time 1.8564 (2.2038) loss 2.9976 (3.1069) grad_norm 2.5712 (2.6324) [2022-01-25 14:25:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1140/1251] eta 0:04:04 lr 0.000085 time 2.2967 (2.2032) loss 2.8150 (3.1066) grad_norm 2.4349 (2.6316) [2022-01-25 14:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1150/1251] eta 0:03:42 lr 0.000085 time 2.4739 (2.2033) loss 2.6239 (3.1062) grad_norm 3.7309 (2.6319) [2022-01-25 14:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1160/1251] eta 0:03:20 lr 0.000085 time 3.0354 (2.2034) loss 3.5319 (3.1054) grad_norm 2.3681 (2.6319) [2022-01-25 14:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1170/1251] 
eta 0:02:58 lr 0.000084 time 2.2634 (2.2034) loss 3.1835 (3.1054) grad_norm 2.2431 (2.6317) [2022-01-25 14:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1180/1251] eta 0:02:36 lr 0.000084 time 1.6135 (2.2020) loss 3.0775 (3.1072) grad_norm 2.5538 (2.6308) [2022-01-25 14:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1190/1251] eta 0:02:14 lr 0.000084 time 2.5493 (2.2012) loss 2.2498 (3.1068) grad_norm 2.1810 (2.6304) [2022-01-25 14:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1200/1251] eta 0:01:52 lr 0.000084 time 3.1141 (2.2020) loss 3.1838 (3.1080) grad_norm 2.5916 (2.6300) [2022-01-25 14:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1210/1251] eta 0:01:30 lr 0.000084 time 2.2786 (2.2025) loss 2.1707 (3.1065) grad_norm 2.9984 (2.6299) [2022-01-25 14:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1220/1251] eta 0:01:08 lr 0.000084 time 2.0713 (2.2024) loss 3.4349 (3.1062) grad_norm 2.4985 (2.6287) [2022-01-25 14:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1230/1251] eta 0:00:46 lr 0.000084 time 2.1012 (2.2020) loss 2.3278 (3.1051) grad_norm 2.4187 (2.6282) [2022-01-25 14:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1240/1251] eta 0:00:24 lr 0.000084 time 2.3119 (2.2015) loss 2.8854 (3.1062) grad_norm 2.7494 (2.6281) [2022-01-25 14:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [246/300][1250/1251] eta 0:00:02 lr 0.000084 time 1.1933 (2.1954) loss 2.7437 (3.1070) grad_norm 2.7939 (2.6284) [2022-01-25 14:29:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 246 training takes 0:45:46 [2022-01-25 14:29:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.579 (18.579) Loss 0.7780 (0.7780) Acc@1 80.371 (80.371) Acc@5 95.898 (95.898) [2022-01-25 14:29:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.959 (3.342) 
Loss 0.8974 (0.8541) Acc@1 79.883 (79.616) Acc@5 94.336 (95.135) [2022-01-25 14:29:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.259 (2.543) Loss 0.8890 (0.8394) Acc@1 78.027 (79.994) Acc@5 93.750 (95.206) [2022-01-25 14:30:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.948 (2.285) Loss 0.9194 (0.8307) Acc@1 79.395 (80.232) Acc@5 93.848 (95.325) [2022-01-25 14:30:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.189 (2.145) Loss 0.7816 (0.8297) Acc@1 81.348 (80.290) Acc@5 95.801 (95.322) [2022-01-25 14:30:36 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.240 Acc@5 95.266 [2022-01-25 14:30:36 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-01-25 14:30:36 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.24% [2022-01-25 14:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][0/1251] eta 7:29:30 lr 0.000084 time 21.5590 (21.5590) loss 2.4366 (2.4366) grad_norm 2.3305 (2.3305) [2022-01-25 14:31:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][10/1251] eta 1:22:49 lr 0.000084 time 1.8364 (4.0041) loss 3.3996 (3.2022) grad_norm 2.5634 (2.6648) [2022-01-25 14:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][20/1251] eta 1:03:23 lr 0.000084 time 1.9288 (3.0899) loss 3.8158 (3.2126) grad_norm 2.5840 (2.6420) [2022-01-25 14:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][30/1251] eta 0:57:09 lr 0.000084 time 2.1994 (2.8084) loss 3.1682 (3.1592) grad_norm 2.6157 (2.6742) [2022-01-25 14:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][40/1251] eta 0:54:30 lr 0.000084 time 3.6686 (2.7011) loss 3.8071 (3.1461) grad_norm 3.5084 (2.6574) [2022-01-25 14:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][50/1251] eta 0:51:59 lr 0.000084 time 1.2629 (2.5975) loss 3.6837 (3.1070) 
grad_norm 2.2399 (2.6238) [2022-01-25 14:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][60/1251] eta 0:50:46 lr 0.000084 time 2.7733 (2.5581) loss 3.4494 (3.1143) grad_norm 2.9068 (2.6416) [2022-01-25 14:33:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][70/1251] eta 0:49:11 lr 0.000084 time 1.8660 (2.4990) loss 3.4585 (3.1309) grad_norm 2.3616 (2.6466) [2022-01-25 14:33:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][80/1251] eta 0:48:16 lr 0.000084 time 3.4787 (2.4736) loss 2.1368 (3.1020) grad_norm 2.8118 (2.6569) [2022-01-25 14:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][90/1251] eta 0:46:54 lr 0.000084 time 1.5687 (2.4241) loss 2.6663 (3.0914) grad_norm 2.4092 (2.6320) [2022-01-25 14:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][100/1251] eta 0:45:53 lr 0.000084 time 2.2137 (2.3923) loss 3.4566 (3.0902) grad_norm 2.6535 (2.6259) [2022-01-25 14:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][110/1251] eta 0:45:07 lr 0.000084 time 1.5997 (2.3731) loss 2.3544 (3.0977) grad_norm 2.4958 (2.6318) [2022-01-25 14:35:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][120/1251] eta 0:44:54 lr 0.000084 time 3.9701 (2.3821) loss 3.4369 (3.0814) grad_norm 3.0640 (2.6495) [2022-01-25 14:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][130/1251] eta 0:44:18 lr 0.000084 time 2.3202 (2.3714) loss 3.3029 (3.0862) grad_norm 2.5140 (2.6484) [2022-01-25 14:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][140/1251] eta 0:43:44 lr 0.000084 time 2.1800 (2.3626) loss 3.3870 (3.0927) grad_norm 2.6790 (2.6424) [2022-01-25 14:36:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][150/1251] eta 0:43:10 lr 0.000084 time 1.8315 (2.3532) loss 2.9019 (3.0980) grad_norm 2.5987 (2.6364) [2022-01-25 14:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[247/300][160/1251] eta 0:42:32 lr 0.000084 time 3.1971 (2.3399) loss 3.4434 (3.0858) grad_norm 2.3962 (2.6435) [2022-01-25 14:37:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][170/1251] eta 0:41:49 lr 0.000084 time 1.9085 (2.3213) loss 2.7870 (3.0883) grad_norm 2.3510 (2.6359) [2022-01-25 14:37:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][180/1251] eta 0:41:15 lr 0.000084 time 1.9202 (2.3115) loss 3.2854 (3.1023) grad_norm 2.7386 (2.6349) [2022-01-25 14:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][190/1251] eta 0:40:42 lr 0.000084 time 1.8752 (2.3025) loss 2.2272 (3.0895) grad_norm 2.1841 (2.6274) [2022-01-25 14:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][200/1251] eta 0:40:11 lr 0.000084 time 1.9499 (2.2949) loss 2.9371 (3.0814) grad_norm 2.7751 (2.6376) [2022-01-25 14:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][210/1251] eta 0:39:44 lr 0.000084 time 2.9043 (2.2906) loss 3.1448 (3.0748) grad_norm 2.6668 (2.6366) [2022-01-25 14:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][220/1251] eta 0:39:18 lr 0.000084 time 2.5629 (2.2872) loss 3.2847 (3.0891) grad_norm 2.7169 (2.6372) [2022-01-25 14:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][230/1251] eta 0:38:52 lr 0.000084 time 2.5431 (2.2849) loss 3.2083 (3.0943) grad_norm 2.8972 (2.6398) [2022-01-25 14:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][240/1251] eta 0:38:26 lr 0.000084 time 2.2245 (2.2813) loss 3.4160 (3.0959) grad_norm 2.4131 (2.6468) [2022-01-25 14:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][250/1251] eta 0:37:58 lr 0.000084 time 2.8554 (2.2760) loss 3.6357 (3.0922) grad_norm 2.3109 (2.6517) [2022-01-25 14:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][260/1251] eta 0:37:36 lr 0.000084 time 1.7551 (2.2769) loss 2.4610 (3.0962) grad_norm 
2.7437 (2.6614) [2022-01-25 14:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][270/1251] eta 0:37:15 lr 0.000084 time 2.7921 (2.2791) loss 3.2427 (3.0961) grad_norm 2.6970 (2.6684) [2022-01-25 14:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][280/1251] eta 0:36:53 lr 0.000084 time 2.5087 (2.2797) loss 2.4646 (3.0921) grad_norm 2.0172 (2.6626) [2022-01-25 14:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][290/1251] eta 0:36:24 lr 0.000084 time 2.1502 (2.2727) loss 3.5352 (3.0944) grad_norm 2.3788 (2.6724) [2022-01-25 14:41:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][300/1251] eta 0:35:53 lr 0.000084 time 2.0573 (2.2642) loss 2.7916 (3.0950) grad_norm 2.7675 (2.6758) [2022-01-25 14:42:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][310/1251] eta 0:35:26 lr 0.000084 time 2.7648 (2.2599) loss 3.3181 (3.0992) grad_norm 2.4650 (2.6715) [2022-01-25 14:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][320/1251] eta 0:35:02 lr 0.000084 time 2.2781 (2.2578) loss 3.0861 (3.0982) grad_norm 3.0099 (2.6711) [2022-01-25 14:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][330/1251] eta 0:34:37 lr 0.000084 time 1.5289 (2.2556) loss 2.8910 (3.1034) grad_norm 2.6663 (2.6700) [2022-01-25 14:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][340/1251] eta 0:34:15 lr 0.000084 time 1.9764 (2.2560) loss 3.2583 (3.0996) grad_norm 2.2702 (2.6644) [2022-01-25 14:43:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][350/1251] eta 0:33:52 lr 0.000084 time 3.0852 (2.2553) loss 3.7830 (3.1009) grad_norm 2.5026 (2.6618) [2022-01-25 14:44:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][360/1251] eta 0:33:30 lr 0.000084 time 3.5576 (2.2563) loss 2.8484 (3.1003) grad_norm 1.9987 (2.6563) [2022-01-25 14:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[247/300][370/1251] eta 0:33:09 lr 0.000083 time 2.0945 (2.2581) loss 2.8510 (3.0990) grad_norm 2.6901 (2.6516) [2022-01-25 14:44:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][380/1251] eta 0:32:45 lr 0.000083 time 1.9225 (2.2566) loss 3.6611 (3.1020) grad_norm 3.4089 (2.6530) [2022-01-25 14:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][390/1251] eta 0:32:23 lr 0.000083 time 2.2049 (2.2568) loss 2.3454 (3.1018) grad_norm 2.4454 (2.6527) [2022-01-25 14:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][400/1251] eta 0:31:58 lr 0.000083 time 3.3847 (2.2543) loss 3.1364 (3.1008) grad_norm 2.3933 (2.6530) [2022-01-25 14:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][410/1251] eta 0:31:31 lr 0.000083 time 1.9751 (2.2496) loss 3.6354 (3.1075) grad_norm 2.7044 (2.6547) [2022-01-25 14:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][420/1251] eta 0:31:07 lr 0.000083 time 2.1993 (2.2469) loss 2.8646 (3.0997) grad_norm 2.3216 (2.6555) [2022-01-25 14:46:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][430/1251] eta 0:30:41 lr 0.000083 time 1.8373 (2.2428) loss 3.5850 (3.0988) grad_norm 3.5764 (2.6571) [2022-01-25 14:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][440/1251] eta 0:30:22 lr 0.000083 time 3.7100 (2.2475) loss 3.1489 (3.0987) grad_norm 3.3556 (2.6572) [2022-01-25 14:47:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][450/1251] eta 0:29:59 lr 0.000083 time 1.7036 (2.2470) loss 2.2831 (3.0917) grad_norm 2.5549 (2.6584) [2022-01-25 14:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][460/1251] eta 0:29:37 lr 0.000083 time 2.9263 (2.2477) loss 3.6126 (3.0884) grad_norm 2.7256 (2.6614) [2022-01-25 14:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][470/1251] eta 0:29:13 lr 0.000083 time 1.9977 (2.2450) loss 3.6711 (3.0853) grad_norm 
2.4196 (2.6586) [2022-01-25 14:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][480/1251] eta 0:28:49 lr 0.000083 time 2.0203 (2.2428) loss 3.2180 (3.0865) grad_norm 2.4527 (2.6535) [2022-01-25 14:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][490/1251] eta 0:28:21 lr 0.000083 time 1.8462 (2.2359) loss 3.2518 (3.0897) grad_norm 2.9545 (2.6546) [2022-01-25 14:49:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][500/1251] eta 0:27:55 lr 0.000083 time 2.2930 (2.2308) loss 3.3250 (3.0898) grad_norm 2.7412 (2.6561) [2022-01-25 14:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][510/1251] eta 0:27:31 lr 0.000083 time 2.2948 (2.2288) loss 3.4484 (3.0897) grad_norm 2.7533 (2.6557) [2022-01-25 14:49:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][520/1251] eta 0:27:10 lr 0.000083 time 3.0323 (2.2299) loss 2.7932 (3.0892) grad_norm 2.9168 (2.6600) [2022-01-25 14:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][530/1251] eta 0:26:48 lr 0.000083 time 2.0962 (2.2311) loss 3.1702 (3.0910) grad_norm 2.8996 (2.6568) [2022-01-25 14:50:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][540/1251] eta 0:26:28 lr 0.000083 time 2.8070 (2.2336) loss 3.7289 (3.0923) grad_norm 2.5864 (2.6539) [2022-01-25 14:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][550/1251] eta 0:26:05 lr 0.000083 time 2.2095 (2.2338) loss 3.4125 (3.0980) grad_norm 2.6575 (2.6539) [2022-01-25 14:51:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][560/1251] eta 0:25:45 lr 0.000083 time 2.9271 (2.2362) loss 3.0620 (3.0979) grad_norm 2.5028 (2.6538) [2022-01-25 14:51:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][570/1251] eta 0:25:21 lr 0.000083 time 2.2456 (2.2342) loss 3.1534 (3.0985) grad_norm 4.2408 (2.6578) [2022-01-25 14:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[247/300][580/1251] eta 0:24:56 lr 0.000083 time 2.0394 (2.2300) loss 2.9166 (3.1005) grad_norm 2.9481 (2.6560) [2022-01-25 14:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][590/1251] eta 0:24:31 lr 0.000083 time 1.9185 (2.2269) loss 2.4584 (3.1000) grad_norm 2.6986 (2.6553) [2022-01-25 14:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][600/1251] eta 0:24:08 lr 0.000083 time 2.5227 (2.2245) loss 2.4345 (3.1008) grad_norm 2.3784 (2.6529) [2022-01-25 14:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][610/1251] eta 0:23:46 lr 0.000083 time 2.6211 (2.2256) loss 3.6458 (3.1038) grad_norm 2.8929 (2.6543) [2022-01-25 14:53:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][620/1251] eta 0:23:24 lr 0.000083 time 2.4462 (2.2264) loss 2.9637 (3.1040) grad_norm 2.5021 (2.6529) [2022-01-25 14:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][630/1251] eta 0:23:01 lr 0.000083 time 1.7576 (2.2249) loss 3.6394 (3.1049) grad_norm 2.5229 (2.6525) [2022-01-25 14:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][640/1251] eta 0:22:39 lr 0.000083 time 2.1655 (2.2250) loss 3.5417 (3.1033) grad_norm 2.4728 (2.6529) [2022-01-25 14:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][650/1251] eta 0:22:17 lr 0.000083 time 2.5485 (2.2250) loss 3.3920 (3.1035) grad_norm 2.4316 (2.6495) [2022-01-25 14:55:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][660/1251] eta 0:21:54 lr 0.000083 time 2.2232 (2.2237) loss 3.1178 (3.1032) grad_norm 2.3740 (2.6486) [2022-01-25 14:55:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][670/1251] eta 0:21:32 lr 0.000083 time 1.8718 (2.2238) loss 3.4007 (3.1036) grad_norm 2.2740 (2.6489) [2022-01-25 14:55:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][680/1251] eta 0:21:09 lr 0.000083 time 2.3075 (2.2239) loss 3.1646 (3.1005) grad_norm 
2.5130 (2.6484) [2022-01-25 14:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][690/1251] eta 0:20:47 lr 0.000083 time 2.1163 (2.2232) loss 3.1276 (3.1008) grad_norm 2.4183 (2.6501) [2022-01-25 14:56:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][700/1251] eta 0:20:24 lr 0.000083 time 2.4890 (2.2227) loss 1.9933 (3.0992) grad_norm 2.3814 (2.6484) [2022-01-25 14:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][710/1251] eta 0:20:01 lr 0.000083 time 1.8582 (2.2216) loss 3.0803 (3.1002) grad_norm 3.2240 (2.6512) [2022-01-25 14:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][720/1251] eta 0:19:38 lr 0.000083 time 1.8798 (2.2203) loss 3.4345 (3.1009) grad_norm 2.6226 (2.6497) [2022-01-25 14:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][730/1251] eta 0:19:17 lr 0.000083 time 2.5862 (2.2223) loss 3.5567 (3.1032) grad_norm 2.9642 (2.6496) [2022-01-25 14:58:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][740/1251] eta 0:18:54 lr 0.000083 time 1.6023 (2.2198) loss 3.5512 (3.1056) grad_norm 2.3764 (2.6489) [2022-01-25 14:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][750/1251] eta 0:18:31 lr 0.000083 time 1.8930 (2.2179) loss 3.0537 (3.1055) grad_norm 2.5742 (2.6484) [2022-01-25 14:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][760/1251] eta 0:18:08 lr 0.000083 time 2.2331 (2.2173) loss 3.4606 (3.1065) grad_norm 2.3425 (2.6480) [2022-01-25 14:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][770/1251] eta 0:17:45 lr 0.000083 time 1.9412 (2.2157) loss 3.7028 (3.1095) grad_norm 2.7376 (2.6483) [2022-01-25 14:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][780/1251] eta 0:17:22 lr 0.000083 time 2.1591 (2.2141) loss 3.2689 (3.1128) grad_norm 2.5433 (2.6470) [2022-01-25 14:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[247/300][790/1251] eta 0:17:00 lr 0.000083 time 2.0809 (2.2127) loss 3.2332 (3.1136) grad_norm 2.3443 (2.6456) [2022-01-25 15:00:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][800/1251] eta 0:16:37 lr 0.000083 time 2.3523 (2.2126) loss 2.3804 (3.1112) grad_norm 2.7514 (2.6455) [2022-01-25 15:00:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][810/1251] eta 0:16:15 lr 0.000083 time 2.1373 (2.2128) loss 3.6042 (3.1124) grad_norm 2.5547 (2.6487) [2022-01-25 15:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][820/1251] eta 0:15:53 lr 0.000083 time 1.6403 (2.2126) loss 3.3510 (3.1127) grad_norm 2.4466 (2.6467) [2022-01-25 15:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][830/1251] eta 0:15:31 lr 0.000083 time 2.5475 (2.2132) loss 3.2173 (3.1130) grad_norm 2.6919 (2.6451) [2022-01-25 15:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][840/1251] eta 0:15:10 lr 0.000082 time 2.3680 (2.2147) loss 2.6584 (3.1140) grad_norm 2.5321 (2.6439) [2022-01-25 15:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][850/1251] eta 0:14:48 lr 0.000082 time 3.5187 (2.2160) loss 3.3778 (3.1141) grad_norm 3.0576 (2.6449) [2022-01-25 15:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][860/1251] eta 0:14:26 lr 0.000082 time 1.8991 (2.2155) loss 3.2525 (3.1164) grad_norm 2.6356 (2.6445) [2022-01-25 15:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][870/1251] eta 0:14:03 lr 0.000082 time 2.1983 (2.2142) loss 2.8095 (3.1159) grad_norm 2.6803 (2.6443) [2022-01-25 15:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][880/1251] eta 0:13:40 lr 0.000082 time 2.3025 (2.2129) loss 3.6053 (3.1174) grad_norm 3.0473 (2.6458) [2022-01-25 15:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][890/1251] eta 0:13:18 lr 0.000082 time 1.8752 (2.2115) loss 3.0006 (3.1179) grad_norm 
2.3975 (2.6462) [2022-01-25 15:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][900/1251] eta 0:12:55 lr 0.000082 time 2.3217 (2.2108) loss 2.2015 (3.1176) grad_norm 2.8603 (2.6477) [2022-01-25 15:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][910/1251] eta 0:12:33 lr 0.000082 time 2.3525 (2.2103) loss 3.5908 (3.1164) grad_norm 2.4383 (2.6479) [2022-01-25 15:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][920/1251] eta 0:12:11 lr 0.000082 time 1.9187 (2.2099) loss 2.8506 (3.1189) grad_norm 2.9038 (2.6466) [2022-01-25 15:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][930/1251] eta 0:11:49 lr 0.000082 time 2.7316 (2.2101) loss 2.8554 (3.1195) grad_norm 2.6267 (2.6445) [2022-01-25 15:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][940/1251] eta 0:11:27 lr 0.000082 time 2.2351 (2.2100) loss 2.9516 (3.1191) grad_norm 2.4133 (2.6447) [2022-01-25 15:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][950/1251] eta 0:11:05 lr 0.000082 time 2.1349 (2.2098) loss 2.9705 (3.1179) grad_norm 2.4365 (2.6455) [2022-01-25 15:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][960/1251] eta 0:10:42 lr 0.000082 time 2.6976 (2.2092) loss 3.1832 (3.1204) grad_norm 2.5716 (2.6454) [2022-01-25 15:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][970/1251] eta 0:10:20 lr 0.000082 time 2.1650 (2.2080) loss 2.5225 (3.1181) grad_norm 2.7738 (2.6470) [2022-01-25 15:06:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][980/1251] eta 0:09:58 lr 0.000082 time 2.4396 (2.2082) loss 3.3424 (3.1203) grad_norm 2.1515 (2.6485) [2022-01-25 15:07:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][990/1251] eta 0:09:36 lr 0.000082 time 1.8079 (2.2083) loss 2.8190 (3.1176) grad_norm 2.5844 (2.6482) [2022-01-25 15:07:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[247/300][1000/1251] eta 0:09:14 lr 0.000082 time 3.0370 (2.2094) loss 3.7862 (3.1153) grad_norm 2.5786 (2.6467) [2022-01-25 15:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1010/1251] eta 0:08:52 lr 0.000082 time 2.1608 (2.2083) loss 2.0122 (3.1151) grad_norm 2.3232 (2.6467) [2022-01-25 15:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1020/1251] eta 0:08:29 lr 0.000082 time 2.2691 (2.2074) loss 3.0227 (3.1146) grad_norm 2.4177 (2.6440) [2022-01-25 15:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1030/1251] eta 0:08:07 lr 0.000082 time 1.9187 (2.2061) loss 3.5667 (3.1139) grad_norm 2.2284 (2.6436) [2022-01-25 15:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1040/1251] eta 0:07:45 lr 0.000082 time 2.9100 (2.2070) loss 3.4831 (3.1139) grad_norm 2.2974 (2.6435) [2022-01-25 15:09:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1050/1251] eta 0:07:23 lr 0.000082 time 2.6906 (2.2080) loss 3.4574 (3.1160) grad_norm 2.4499 (2.6433) [2022-01-25 15:09:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1060/1251] eta 0:07:01 lr 0.000082 time 2.1942 (2.2076) loss 3.3283 (3.1157) grad_norm 2.5564 (2.6478) [2022-01-25 15:09:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1070/1251] eta 0:06:39 lr 0.000082 time 2.2149 (2.2062) loss 2.0965 (3.1153) grad_norm 2.5276 (2.6478) [2022-01-25 15:10:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1080/1251] eta 0:06:17 lr 0.000082 time 2.2688 (2.2047) loss 2.9494 (3.1159) grad_norm 2.7658 (2.6476) [2022-01-25 15:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1090/1251] eta 0:05:54 lr 0.000082 time 2.2102 (2.2035) loss 3.1006 (3.1139) grad_norm 2.8769 (2.6471) [2022-01-25 15:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1100/1251] eta 0:05:32 lr 0.000082 time 2.0703 (2.2030) loss 3.3636 (3.1165) 
grad_norm 2.9269 (2.6464) [2022-01-25 15:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1110/1251] eta 0:05:10 lr 0.000082 time 2.5826 (2.2024) loss 3.1950 (3.1185) grad_norm 2.4355 (2.6451) [2022-01-25 15:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1120/1251] eta 0:04:48 lr 0.000082 time 2.6420 (2.2028) loss 3.3044 (3.1178) grad_norm 2.7306 (2.6445) [2022-01-25 15:12:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1130/1251] eta 0:04:26 lr 0.000082 time 2.0628 (2.2025) loss 2.2158 (3.1169) grad_norm 2.4402 (2.6442) [2022-01-25 15:12:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1140/1251] eta 0:04:04 lr 0.000082 time 1.8562 (2.2013) loss 2.1819 (3.1157) grad_norm 2.8637 (2.6449) [2022-01-25 15:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1150/1251] eta 0:03:42 lr 0.000082 time 2.1933 (2.2023) loss 3.4265 (3.1135) grad_norm 2.4077 (2.6444) [2022-01-25 15:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1160/1251] eta 0:03:20 lr 0.000082 time 3.3620 (2.2028) loss 2.8697 (3.1120) grad_norm 2.6509 (2.6437) [2022-01-25 15:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1170/1251] eta 0:02:58 lr 0.000082 time 2.1410 (2.2035) loss 3.7001 (3.1118) grad_norm 2.7741 (2.6472) [2022-01-25 15:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1180/1251] eta 0:02:36 lr 0.000082 time 1.7577 (2.2022) loss 2.6068 (3.1098) grad_norm 2.5620 (2.6480) [2022-01-25 15:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1190/1251] eta 0:02:14 lr 0.000082 time 1.8668 (2.2014) loss 2.6197 (3.1093) grad_norm 4.1644 (2.6505) [2022-01-25 15:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1200/1251] eta 0:01:52 lr 0.000082 time 2.7751 (2.2018) loss 3.5716 (3.1109) grad_norm 2.3840 (2.6512) [2022-01-25 15:15:01 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [247/300][1210/1251] eta 0:01:30 lr 0.000082 time 2.2975 (2.2011) loss 2.3319 (3.1104) grad_norm 2.4110 (2.6495) [2022-01-25 15:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1220/1251] eta 0:01:08 lr 0.000082 time 2.2848 (2.2006) loss 3.6228 (3.1111) grad_norm 2.6155 (2.6489) [2022-01-25 15:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1230/1251] eta 0:00:46 lr 0.000082 time 2.0289 (2.2004) loss 2.9916 (3.1109) grad_norm 2.4765 (2.6483) [2022-01-25 15:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1240/1251] eta 0:00:24 lr 0.000082 time 1.4401 (2.2010) loss 3.2095 (3.1098) grad_norm 2.5995 (2.6476) [2022-01-25 15:16:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [247/300][1250/1251] eta 0:00:02 lr 0.000082 time 1.2001 (2.1962) loss 2.1662 (3.1105) grad_norm 2.8000 (2.6460) [2022-01-25 15:16:24 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 247 training takes 0:45:47 [2022-01-25 15:16:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.276 (18.276) Loss 0.8217 (0.8217) Acc@1 80.762 (80.762) Acc@5 95.703 (95.703) [2022-01-25 15:17:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.265 (3.341) Loss 0.8787 (0.8225) Acc@1 78.613 (80.176) Acc@5 94.727 (95.446) [2022-01-25 15:17:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.595 (2.498) Loss 0.8910 (0.8319) Acc@1 79.883 (80.222) Acc@5 95.215 (95.387) [2022-01-25 15:17:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.593 (2.232) Loss 0.8687 (0.8302) Acc@1 79.297 (80.289) Acc@5 95.020 (95.372) [2022-01-25 15:17:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.833 (2.156) Loss 0.7707 (0.8305) Acc@1 82.324 (80.312) Acc@5 95.898 (95.308) [2022-01-25 15:17:59 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.242 Acc@5 95.270 [2022-01-25 15:17:59 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 80.2% [2022-01-25 15:17:59 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.24% [2022-01-25 15:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][0/1251] eta 7:36:39 lr 0.000082 time 21.9025 (21.9025) loss 2.1089 (2.1089) grad_norm 2.8887 (2.8887) [2022-01-25 15:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][10/1251] eta 1:24:13 lr 0.000082 time 2.2215 (4.0719) loss 3.5822 (3.0206) grad_norm 3.3432 (2.7753) [2022-01-25 15:19:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][20/1251] eta 1:06:14 lr 0.000082 time 1.8013 (3.2287) loss 3.2627 (3.1255) grad_norm 2.4227 (2.7216) [2022-01-25 15:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][30/1251] eta 0:58:28 lr 0.000082 time 1.8949 (2.8733) loss 3.0892 (3.0755) grad_norm 2.5514 (2.6367) [2022-01-25 15:19:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][40/1251] eta 0:56:11 lr 0.000082 time 4.4828 (2.7839) loss 2.4223 (3.0776) grad_norm 2.4029 (2.6085) [2022-01-25 15:20:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][50/1251] eta 0:53:35 lr 0.000081 time 2.1886 (2.6771) loss 2.1695 (3.0226) grad_norm 2.6365 (2.6026) [2022-01-25 15:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][60/1251] eta 0:51:42 lr 0.000081 time 1.8499 (2.6048) loss 2.4370 (3.0082) grad_norm 2.3888 (2.5777) [2022-01-25 15:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][70/1251] eta 0:50:15 lr 0.000081 time 2.1290 (2.5534) loss 3.9858 (3.0289) grad_norm 2.5136 (2.5884) [2022-01-25 15:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][80/1251] eta 0:49:09 lr 0.000081 time 3.5576 (2.5191) loss 3.1836 (3.0522) grad_norm 2.9652 (2.6722) [2022-01-25 15:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][90/1251] eta 0:47:50 lr 0.000081 time 
2.1186 (2.4723) loss 3.3281 (3.0848) grad_norm 2.7547 (2.6575) [2022-01-25 15:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][100/1251] eta 0:46:46 lr 0.000081 time 1.6506 (2.4380) loss 2.6955 (3.0722) grad_norm 2.7044 (2.6690) [2022-01-25 15:22:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][110/1251] eta 0:45:43 lr 0.000081 time 1.6812 (2.4042) loss 2.8731 (3.0724) grad_norm 2.4503 (2.6937) [2022-01-25 15:22:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][120/1251] eta 0:44:50 lr 0.000081 time 3.1177 (2.3788) loss 3.0237 (3.0598) grad_norm 3.0487 (2.6873) [2022-01-25 15:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][130/1251] eta 0:44:17 lr 0.000081 time 2.4232 (2.3710) loss 3.5636 (3.0359) grad_norm 2.8475 (2.6796) [2022-01-25 15:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][140/1251] eta 0:43:39 lr 0.000081 time 2.2087 (2.3580) loss 3.1416 (3.0514) grad_norm 3.0476 (2.6685) [2022-01-25 15:23:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][150/1251] eta 0:43:01 lr 0.000081 time 2.0510 (2.3445) loss 1.9290 (3.0602) grad_norm 2.6195 (2.6662) [2022-01-25 15:24:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][160/1251] eta 0:42:25 lr 0.000081 time 2.2572 (2.3332) loss 3.6093 (3.0645) grad_norm 2.2934 (2.6602) [2022-01-25 15:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][170/1251] eta 0:41:52 lr 0.000081 time 2.6075 (2.3239) loss 3.4089 (3.0778) grad_norm 2.4027 (2.6583) [2022-01-25 15:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][180/1251] eta 0:41:15 lr 0.000081 time 1.6057 (2.3111) loss 2.8580 (3.0770) grad_norm 2.4853 (2.6609) [2022-01-25 15:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][190/1251] eta 0:40:43 lr 0.000081 time 1.6645 (2.3026) loss 3.8069 (3.0885) grad_norm 2.4078 (2.6569) [2022-01-25 15:25:42 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][200/1251] eta 0:40:16 lr 0.000081 time 3.5684 (2.2997) loss 3.5325 (3.0992) grad_norm 2.9283 (2.6596) [2022-01-25 15:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][210/1251] eta 0:39:48 lr 0.000081 time 2.5498 (2.2944) loss 1.8299 (3.1079) grad_norm 2.3415 (2.6549) [2022-01-25 15:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][220/1251] eta 0:39:18 lr 0.000081 time 1.7970 (2.2879) loss 3.1988 (3.1104) grad_norm 2.2094 (2.6480) [2022-01-25 15:26:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][230/1251] eta 0:38:57 lr 0.000081 time 2.4445 (2.2898) loss 2.9992 (3.1015) grad_norm 2.4329 (2.6479) [2022-01-25 15:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][240/1251] eta 0:38:31 lr 0.000081 time 2.2713 (2.2861) loss 2.2508 (3.1041) grad_norm 2.2906 (2.6406) [2022-01-25 15:27:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][250/1251] eta 0:38:04 lr 0.000081 time 3.2406 (2.2821) loss 2.1041 (3.1035) grad_norm 2.4195 (2.6367) [2022-01-25 15:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][260/1251] eta 0:37:37 lr 0.000081 time 1.6312 (2.2777) loss 3.6533 (3.1046) grad_norm 2.3849 (2.6373) [2022-01-25 15:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][270/1251] eta 0:37:06 lr 0.000081 time 2.1153 (2.2698) loss 2.8402 (3.0998) grad_norm 2.7854 (2.6349) [2022-01-25 15:28:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][280/1251] eta 0:36:35 lr 0.000081 time 2.2470 (2.2608) loss 3.1596 (3.1045) grad_norm 2.7588 (2.6400) [2022-01-25 15:28:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][290/1251] eta 0:36:12 lr 0.000081 time 3.2728 (2.2611) loss 3.1067 (3.1006) grad_norm 2.4994 (2.6374) [2022-01-25 15:29:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][300/1251] eta 0:35:47 lr 
0.000081 time 2.3193 (2.2583) loss 2.7474 (3.0959) grad_norm 2.4082 (2.6338) [2022-01-25 15:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][310/1251] eta 0:35:20 lr 0.000081 time 1.8319 (2.2537) loss 2.9411 (3.0937) grad_norm 2.8921 (2.6367) [2022-01-25 15:30:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][320/1251] eta 0:34:53 lr 0.000081 time 2.8138 (2.2486) loss 3.3784 (3.1002) grad_norm 2.7919 (2.6367) [2022-01-25 15:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][330/1251] eta 0:34:30 lr 0.000081 time 2.8280 (2.2480) loss 2.7355 (3.0973) grad_norm 2.7899 (2.6378) [2022-01-25 15:30:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][340/1251] eta 0:34:08 lr 0.000081 time 1.8934 (2.2490) loss 2.9811 (3.1000) grad_norm 2.4581 (2.6330) [2022-01-25 15:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][350/1251] eta 0:33:47 lr 0.000081 time 2.5325 (2.2504) loss 2.8825 (3.0997) grad_norm 2.5335 (2.6332) [2022-01-25 15:31:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][360/1251] eta 0:33:22 lr 0.000081 time 2.1574 (2.2476) loss 3.3302 (3.0928) grad_norm 2.2867 (2.6313) [2022-01-25 15:31:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][370/1251] eta 0:32:59 lr 0.000081 time 2.4036 (2.2471) loss 3.2173 (3.0912) grad_norm 2.8077 (2.6319) [2022-01-25 15:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][380/1251] eta 0:32:35 lr 0.000081 time 1.9622 (2.2454) loss 3.3775 (3.0957) grad_norm 2.4216 (2.6356) [2022-01-25 15:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][390/1251] eta 0:32:11 lr 0.000081 time 2.4657 (2.2428) loss 3.7724 (3.1019) grad_norm 3.5734 (2.6389) [2022-01-25 15:32:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][400/1251] eta 0:31:45 lr 0.000081 time 2.3432 (2.2394) loss 3.1847 (3.0984) grad_norm 2.6671 (2.6413) [2022-01-25 15:33:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][410/1251] eta 0:31:19 lr 0.000081 time 2.2609 (2.2348) loss 3.0694 (3.0987) grad_norm 2.6416 (2.6412) [2022-01-25 15:33:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][420/1251] eta 0:30:54 lr 0.000081 time 2.1787 (2.2315) loss 3.4270 (3.0999) grad_norm 2.3054 (2.6401) [2022-01-25 15:34:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][430/1251] eta 0:30:31 lr 0.000081 time 2.2644 (2.2312) loss 3.3025 (3.1026) grad_norm 2.4110 (2.6377) [2022-01-25 15:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][440/1251] eta 0:30:06 lr 0.000081 time 2.2277 (2.2275) loss 3.2731 (3.1014) grad_norm 2.5745 (2.6347) [2022-01-25 15:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][450/1251] eta 0:29:43 lr 0.000081 time 1.9072 (2.2262) loss 2.5972 (3.0995) grad_norm 2.5629 (2.6379) [2022-01-25 15:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][460/1251] eta 0:29:23 lr 0.000081 time 2.3443 (2.2297) loss 3.1573 (3.0946) grad_norm 2.5942 (2.6360) [2022-01-25 15:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][470/1251] eta 0:29:01 lr 0.000081 time 2.5547 (2.2300) loss 3.7299 (3.0964) grad_norm 2.7941 (2.6343) [2022-01-25 15:35:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][480/1251] eta 0:28:38 lr 0.000081 time 3.1639 (2.2285) loss 3.7460 (3.0948) grad_norm 2.5860 (2.6352) [2022-01-25 15:36:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][490/1251] eta 0:28:16 lr 0.000081 time 2.2423 (2.2288) loss 2.1300 (3.0972) grad_norm 2.3975 (2.6346) [2022-01-25 15:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][500/1251] eta 0:27:55 lr 0.000081 time 1.9826 (2.2308) loss 3.6746 (3.1038) grad_norm 2.8889 (2.6426) [2022-01-25 15:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][510/1251] eta 0:27:31 lr 
0.000081 time 2.2380 (2.2291) loss 2.2930 (3.1026) grad_norm 2.2880 (2.6446) [2022-01-25 15:37:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][520/1251] eta 0:27:11 lr 0.000080 time 3.6824 (2.2320) loss 3.2333 (3.1035) grad_norm 2.6931 (2.6469) [2022-01-25 15:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][530/1251] eta 0:26:48 lr 0.000080 time 1.8417 (2.2313) loss 3.9385 (3.1060) grad_norm 2.2168 (2.6713) [2022-01-25 15:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][540/1251] eta 0:26:23 lr 0.000080 time 1.6549 (2.2271) loss 3.2122 (3.1061) grad_norm 2.4923 (2.6728) [2022-01-25 15:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][550/1251] eta 0:25:58 lr 0.000080 time 1.9210 (2.2230) loss 2.2301 (3.1063) grad_norm 2.4838 (2.6734) [2022-01-25 15:38:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][560/1251] eta 0:25:39 lr 0.000080 time 5.7347 (2.2285) loss 2.2356 (3.1053) grad_norm 2.7847 (2.6752) [2022-01-25 15:39:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][570/1251] eta 0:25:16 lr 0.000080 time 1.9119 (2.2268) loss 3.4111 (3.1076) grad_norm 4.1076 (2.6774) [2022-01-25 15:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][580/1251] eta 0:24:53 lr 0.000080 time 1.7553 (2.2259) loss 3.3263 (3.1086) grad_norm 2.5630 (2.6789) [2022-01-25 15:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][590/1251] eta 0:24:29 lr 0.000080 time 2.0934 (2.2235) loss 3.5183 (3.1103) grad_norm 2.6814 (2.6816) [2022-01-25 15:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][600/1251] eta 0:24:06 lr 0.000080 time 2.9860 (2.2214) loss 3.5569 (3.1093) grad_norm 2.8720 (2.6841) [2022-01-25 15:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][610/1251] eta 0:23:42 lr 0.000080 time 2.3172 (2.2191) loss 3.4702 (3.1079) grad_norm 3.3717 (2.6850) [2022-01-25 15:40:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][620/1251] eta 0:23:19 lr 0.000080 time 1.8733 (2.2180) loss 3.7067 (3.1105) grad_norm 2.8344 (2.6814) [2022-01-25 15:41:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][630/1251] eta 0:22:57 lr 0.000080 time 1.8992 (2.2183) loss 2.8133 (3.1070) grad_norm 2.9866 (2.6826) [2022-01-25 15:41:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][640/1251] eta 0:22:36 lr 0.000080 time 3.2633 (2.2204) loss 3.5528 (3.1092) grad_norm 2.3727 (2.6810) [2022-01-25 15:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][650/1251] eta 0:22:14 lr 0.000080 time 2.2582 (2.2212) loss 3.1640 (3.1102) grad_norm 2.9653 (2.6813) [2022-01-25 15:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][660/1251] eta 0:21:50 lr 0.000080 time 1.9763 (2.2181) loss 2.0571 (3.1075) grad_norm 2.2132 (2.6807) [2022-01-25 15:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][670/1251] eta 0:21:27 lr 0.000080 time 2.1863 (2.2168) loss 3.1463 (3.1034) grad_norm 2.5436 (2.6787) [2022-01-25 15:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][680/1251] eta 0:21:06 lr 0.000080 time 3.0027 (2.2172) loss 3.4627 (3.1025) grad_norm 2.9874 (2.6784) [2022-01-25 15:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][690/1251] eta 0:20:43 lr 0.000080 time 2.3203 (2.2171) loss 3.0772 (3.1025) grad_norm 2.5091 (2.6756) [2022-01-25 15:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][700/1251] eta 0:20:20 lr 0.000080 time 1.8609 (2.2159) loss 3.4775 (3.1033) grad_norm 2.7050 (2.6748) [2022-01-25 15:44:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][710/1251] eta 0:19:59 lr 0.000080 time 1.8392 (2.2169) loss 3.5235 (3.1035) grad_norm 2.8675 (2.6743) [2022-01-25 15:44:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][720/1251] eta 0:19:37 lr 
0.000080 time 3.0467 (2.2179) loss 2.0991 (3.1001) grad_norm 2.6587 (2.6727) [2022-01-25 15:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][730/1251] eta 0:19:16 lr 0.000080 time 1.8442 (2.2189) loss 3.3790 (3.0961) grad_norm 3.0486 (2.6712) [2022-01-25 15:45:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][740/1251] eta 0:18:52 lr 0.000080 time 1.6625 (2.2164) loss 3.3814 (3.0928) grad_norm 2.3780 (2.6712) [2022-01-25 15:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][750/1251] eta 0:18:28 lr 0.000080 time 1.8786 (2.2129) loss 3.6765 (3.0929) grad_norm 2.5539 (2.6727) [2022-01-25 15:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][760/1251] eta 0:18:05 lr 0.000080 time 2.1988 (2.2102) loss 3.4519 (3.0957) grad_norm 2.2591 (2.6707) [2022-01-25 15:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][770/1251] eta 0:17:42 lr 0.000080 time 2.4658 (2.2097) loss 3.8493 (3.0965) grad_norm 2.4389 (2.6698) [2022-01-25 15:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][780/1251] eta 0:17:20 lr 0.000080 time 2.1541 (2.2096) loss 3.1698 (3.0957) grad_norm 2.9814 (2.6705) [2022-01-25 15:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][790/1251] eta 0:16:59 lr 0.000080 time 2.7830 (2.2114) loss 3.2820 (3.0964) grad_norm 2.4160 (2.6696) [2022-01-25 15:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][800/1251] eta 0:16:37 lr 0.000080 time 2.5388 (2.2126) loss 2.9436 (3.0970) grad_norm 2.3736 (2.6706) [2022-01-25 15:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][810/1251] eta 0:16:14 lr 0.000080 time 1.8112 (2.2102) loss 3.6339 (3.0974) grad_norm 2.6006 (2.6703) [2022-01-25 15:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][820/1251] eta 0:15:51 lr 0.000080 time 2.1956 (2.2088) loss 3.2804 (3.0967) grad_norm 2.4647 (2.6691) [2022-01-25 15:48:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][830/1251] eta 0:15:30 lr 0.000080 time 1.8948 (2.2096) loss 2.1919 (3.0971) grad_norm 2.4829 (2.6690) [2022-01-25 15:48:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][840/1251] eta 0:15:08 lr 0.000080 time 3.1407 (2.2096) loss 3.4142 (3.0968) grad_norm 2.1976 (2.6672) [2022-01-25 15:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][850/1251] eta 0:14:45 lr 0.000080 time 1.6039 (2.2083) loss 3.3107 (3.1008) grad_norm 2.7711 (2.6683) [2022-01-25 15:49:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][860/1251] eta 0:14:23 lr 0.000080 time 2.1518 (2.2091) loss 3.7249 (3.0991) grad_norm 2.8427 (2.6668) [2022-01-25 15:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][870/1251] eta 0:14:01 lr 0.000080 time 1.8130 (2.2095) loss 2.8107 (3.0998) grad_norm 2.4190 (2.6656) [2022-01-25 15:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][880/1251] eta 0:13:39 lr 0.000080 time 2.8732 (2.2100) loss 3.5458 (3.0998) grad_norm 2.5981 (2.6644) [2022-01-25 15:50:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][890/1251] eta 0:13:17 lr 0.000080 time 1.8723 (2.2080) loss 2.5157 (3.0969) grad_norm 2.5408 (2.6660) [2022-01-25 15:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][900/1251] eta 0:12:54 lr 0.000080 time 1.8891 (2.2066) loss 3.6846 (3.0987) grad_norm 2.4464 (2.6665) [2022-01-25 15:51:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][910/1251] eta 0:12:33 lr 0.000080 time 1.8701 (2.2087) loss 2.3466 (3.0991) grad_norm 2.5686 (2.6647) [2022-01-25 15:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][920/1251] eta 0:12:11 lr 0.000080 time 2.9558 (2.2112) loss 2.0011 (3.0997) grad_norm 2.7723 (2.6643) [2022-01-25 15:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][930/1251] eta 0:11:49 lr 
0.000080 time 1.8279 (2.2093) loss 2.4748 (3.0988) grad_norm 2.9227 (2.6656) [2022-01-25 15:52:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][940/1251] eta 0:11:26 lr 0.000080 time 1.9654 (2.2074) loss 3.0041 (3.0974) grad_norm 2.4402 (2.6646) [2022-01-25 15:52:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][950/1251] eta 0:11:03 lr 0.000080 time 2.2318 (2.2055) loss 2.4319 (3.0972) grad_norm 2.2693 (2.6637) [2022-01-25 15:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][960/1251] eta 0:10:41 lr 0.000080 time 2.4501 (2.2041) loss 3.7176 (3.0990) grad_norm 2.4643 (2.6634) [2022-01-25 15:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][970/1251] eta 0:10:19 lr 0.000080 time 2.4830 (2.2038) loss 2.4510 (3.0965) grad_norm 3.2830 (2.6625) [2022-01-25 15:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][980/1251] eta 0:09:57 lr 0.000080 time 2.2234 (2.2033) loss 3.6408 (3.0979) grad_norm 2.5181 (2.6621) [2022-01-25 15:54:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][990/1251] eta 0:09:35 lr 0.000079 time 1.9541 (2.2057) loss 3.6489 (3.1008) grad_norm 5.2173 (2.6642) [2022-01-25 15:54:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1000/1251] eta 0:09:13 lr 0.000079 time 2.9557 (2.2068) loss 3.7808 (3.1044) grad_norm 2.6605 (2.6634) [2022-01-25 15:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1010/1251] eta 0:08:52 lr 0.000079 time 2.2576 (2.2083) loss 2.5653 (3.1056) grad_norm 2.4829 (2.6635) [2022-01-25 15:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1020/1251] eta 0:08:30 lr 0.000079 time 1.5146 (2.2088) loss 3.4224 (3.1051) grad_norm 2.5158 (2.6615) [2022-01-25 15:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1030/1251] eta 0:08:08 lr 0.000079 time 1.7312 (2.2086) loss 2.4125 (3.1046) grad_norm 2.6904 (2.6606) [2022-01-25 
15:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1040/1251] eta 0:07:45 lr 0.000079 time 3.3696 (2.2084) loss 2.3970 (3.1052) grad_norm 3.8166 (2.6612) [2022-01-25 15:56:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1050/1251] eta 0:07:23 lr 0.000079 time 1.9564 (2.2065) loss 3.3474 (3.1068) grad_norm 2.5734 (2.6614) [2022-01-25 15:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1060/1251] eta 0:07:01 lr 0.000079 time 2.2265 (2.2066) loss 2.8438 (3.1086) grad_norm 3.2981 (2.6621) [2022-01-25 15:57:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1070/1251] eta 0:06:39 lr 0.000079 time 2.1658 (2.2058) loss 3.6211 (3.1096) grad_norm 2.8250 (2.6619) [2022-01-25 15:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1080/1251] eta 0:06:17 lr 0.000079 time 3.2590 (2.2058) loss 3.4195 (3.1123) grad_norm 2.5535 (2.6613) [2022-01-25 15:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1090/1251] eta 0:05:55 lr 0.000079 time 2.4567 (2.2055) loss 3.5193 (3.1146) grad_norm 2.3701 (2.6606) [2022-01-25 15:58:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1100/1251] eta 0:05:32 lr 0.000079 time 2.2449 (2.2050) loss 3.6501 (3.1179) grad_norm 2.7567 (2.6607) [2022-01-25 15:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1110/1251] eta 0:05:10 lr 0.000079 time 1.7465 (2.2041) loss 2.7779 (3.1173) grad_norm 2.7886 (2.6600) [2022-01-25 15:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1120/1251] eta 0:04:49 lr 0.000079 time 3.9283 (2.2079) loss 3.5165 (3.1188) grad_norm 2.7086 (2.6612) [2022-01-25 15:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1130/1251] eta 0:04:27 lr 0.000079 time 1.9119 (2.2072) loss 2.6569 (3.1188) grad_norm 2.8333 (2.6601) [2022-01-25 15:59:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1140/1251] 
eta 0:04:04 lr 0.000079 time 2.1559 (2.2065) loss 2.6567 (3.1180) grad_norm 2.5268 (2.6593) [2022-01-25 16:00:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1150/1251] eta 0:03:42 lr 0.000079 time 2.2440 (2.2052) loss 2.5910 (3.1183) grad_norm 2.4091 (2.6604) [2022-01-25 16:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1160/1251] eta 0:03:20 lr 0.000079 time 3.6666 (2.2054) loss 3.2079 (3.1165) grad_norm 2.4215 (2.6598) [2022-01-25 16:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1170/1251] eta 0:02:58 lr 0.000079 time 1.8697 (2.2054) loss 2.7127 (3.1150) grad_norm 2.0556 (2.6586) [2022-01-25 16:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1180/1251] eta 0:02:36 lr 0.000079 time 2.2104 (2.2052) loss 3.7509 (3.1147) grad_norm 2.7165 (2.6595) [2022-01-25 16:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1190/1251] eta 0:02:14 lr 0.000079 time 2.2803 (2.2054) loss 2.0551 (3.1116) grad_norm 2.5474 (2.6596) [2022-01-25 16:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1200/1251] eta 0:01:52 lr 0.000079 time 3.6268 (2.2066) loss 3.4765 (3.1107) grad_norm 2.3167 (2.6593) [2022-01-25 16:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1210/1251] eta 0:01:30 lr 0.000079 time 1.8382 (2.2049) loss 3.2134 (3.1113) grad_norm 2.4597 (2.6595) [2022-01-25 16:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1220/1251] eta 0:01:08 lr 0.000079 time 2.1210 (2.2032) loss 3.5255 (3.1116) grad_norm 2.3001 (2.6589) [2022-01-25 16:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1230/1251] eta 0:00:46 lr 0.000079 time 2.2616 (2.2011) loss 2.7681 (3.1130) grad_norm 2.4436 (2.6585) [2022-01-25 16:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1240/1251] eta 0:00:24 lr 0.000079 time 2.2397 (2.2000) loss 2.5310 (3.1130) grad_norm 2.5951 
(2.6577) [2022-01-25 16:03:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [248/300][1250/1251] eta 0:00:02 lr 0.000079 time 1.1557 (2.1947) loss 3.2875 (3.1117) grad_norm 2.1227 (2.6571) [2022-01-25 16:03:45 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 248 training takes 0:45:45 [2022-01-25 16:04:03 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.211 (18.211) Loss 0.7903 (0.7903) Acc@1 82.129 (82.129) Acc@5 95.898 (95.898) [2022-01-25 16:04:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.544 (3.405) Loss 0.8219 (0.8344) Acc@1 81.641 (80.575) Acc@5 95.605 (95.508) [2022-01-25 16:04:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.305 (2.513) Loss 0.8262 (0.8401) Acc@1 80.664 (80.185) Acc@5 95.020 (95.289) [2022-01-25 16:04:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.309 (2.263) Loss 0.8578 (0.8385) Acc@1 79.199 (80.343) Acc@5 94.629 (95.265) [2022-01-25 16:05:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.170 (2.216) Loss 0.8247 (0.8336) Acc@1 81.738 (80.488) Acc@5 95.117 (95.258) [2022-01-25 16:05:23 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.380 Acc@5 95.270 [2022-01-25 16:05:23 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-01-25 16:05:23 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.38% [2022-01-25 16:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][0/1251] eta 7:20:12 lr 0.000079 time 21.1131 (21.1131) loss 2.9155 (2.9155) grad_norm 2.7112 (2.7112) [2022-01-25 16:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][10/1251] eta 1:22:18 lr 0.000079 time 1.6092 (3.9796) loss 2.9748 (3.0082) grad_norm 2.1444 (2.7139) [2022-01-25 16:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][20/1251] eta 1:05:17 lr 0.000079 time 1.8392 (3.1824) loss 
3.7834 (3.0210) grad_norm 2.8007 (2.6965) [2022-01-25 16:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][30/1251] eta 0:57:53 lr 0.000079 time 1.3882 (2.8445) loss 2.6395 (3.0361) grad_norm 2.6947 (2.6795) [2022-01-25 16:07:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][40/1251] eta 0:55:21 lr 0.000079 time 3.9910 (2.7430) loss 2.9917 (3.0133) grad_norm 2.9196 (2.7142) [2022-01-25 16:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][50/1251] eta 0:53:11 lr 0.000079 time 2.4494 (2.6577) loss 3.4807 (2.9822) grad_norm 2.5697 (2.6928) [2022-01-25 16:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][60/1251] eta 0:51:29 lr 0.000079 time 2.4066 (2.5940) loss 1.9839 (2.9836) grad_norm 2.4224 (2.6819) [2022-01-25 16:08:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][70/1251] eta 0:49:36 lr 0.000079 time 1.7370 (2.5200) loss 2.6558 (2.9686) grad_norm 2.8386 (2.6797) [2022-01-25 16:08:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][80/1251] eta 0:48:24 lr 0.000079 time 3.4589 (2.4805) loss 2.5824 (2.9678) grad_norm 2.9428 (2.6858) [2022-01-25 16:09:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][90/1251] eta 0:47:10 lr 0.000079 time 2.1606 (2.4380) loss 2.5305 (2.9747) grad_norm 3.4240 (2.7949) [2022-01-25 16:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][100/1251] eta 0:46:07 lr 0.000079 time 2.0538 (2.4045) loss 3.1025 (2.9738) grad_norm 3.0795 (2.7811) [2022-01-25 16:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][110/1251] eta 0:45:10 lr 0.000079 time 1.8803 (2.3756) loss 3.0572 (2.9635) grad_norm 2.4038 (2.7607) [2022-01-25 16:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][120/1251] eta 0:44:49 lr 0.000079 time 3.1887 (2.3782) loss 3.2384 (2.9603) grad_norm 2.8176 (2.7603) [2022-01-25 16:10:34 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [249/300][130/1251] eta 0:44:21 lr 0.000079 time 2.2141 (2.3743) loss 2.3741 (2.9487) grad_norm 2.8510 (2.7555) [2022-01-25 16:10:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][140/1251] eta 0:43:52 lr 0.000079 time 1.6482 (2.3693) loss 2.7580 (2.9633) grad_norm 3.5679 (2.7462) [2022-01-25 16:11:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][150/1251] eta 0:43:14 lr 0.000079 time 1.6734 (2.3565) loss 3.4708 (2.9559) grad_norm 2.8284 (2.7475) [2022-01-25 16:11:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][160/1251] eta 0:42:46 lr 0.000079 time 3.6981 (2.3521) loss 3.4325 (2.9667) grad_norm 2.4457 (2.7334) [2022-01-25 16:12:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][170/1251] eta 0:41:58 lr 0.000079 time 1.9396 (2.3294) loss 3.0237 (2.9594) grad_norm 2.6839 (2.7222) [2022-01-25 16:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][180/1251] eta 0:41:16 lr 0.000079 time 2.0708 (2.3127) loss 3.2530 (2.9682) grad_norm 3.2772 (2.7108) [2022-01-25 16:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][190/1251] eta 0:40:38 lr 0.000079 time 1.8716 (2.2985) loss 3.6999 (2.9793) grad_norm 2.8542 (2.7010) [2022-01-25 16:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][200/1251] eta 0:40:07 lr 0.000079 time 2.5360 (2.2908) loss 3.7569 (2.9951) grad_norm 2.7734 (2.7046) [2022-01-25 16:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][210/1251] eta 0:39:39 lr 0.000078 time 2.2948 (2.2862) loss 3.5489 (3.0022) grad_norm 2.3405 (2.6981) [2022-01-25 16:13:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][220/1251] eta 0:39:13 lr 0.000078 time 1.9360 (2.2825) loss 2.7807 (3.0147) grad_norm 2.7987 (2.7064) [2022-01-25 16:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][230/1251] eta 0:38:48 lr 0.000078 time 2.2026 (2.2807) loss 3.3155 
(3.0171) grad_norm 3.0263 (2.7074) [2022-01-25 16:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][240/1251] eta 0:38:27 lr 0.000078 time 3.0217 (2.2824) loss 3.3014 (3.0308) grad_norm 2.3924 (2.7052) [2022-01-25 16:14:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][250/1251] eta 0:37:59 lr 0.000078 time 2.2232 (2.2768) loss 3.0495 (3.0318) grad_norm 2.5559 (2.6975) [2022-01-25 16:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][260/1251] eta 0:37:34 lr 0.000078 time 2.7659 (2.2745) loss 2.5061 (3.0284) grad_norm 2.4270 (2.6992) [2022-01-25 16:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][270/1251] eta 0:37:10 lr 0.000078 time 2.1670 (2.2742) loss 3.4167 (3.0261) grad_norm 2.4327 (2.6913) [2022-01-25 16:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][280/1251] eta 0:36:51 lr 0.000078 time 3.6583 (2.2773) loss 2.9081 (3.0309) grad_norm 2.3493 (2.6890) [2022-01-25 16:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][290/1251] eta 0:36:24 lr 0.000078 time 1.6318 (2.2727) loss 3.4349 (3.0415) grad_norm 2.7865 (2.6848) [2022-01-25 16:16:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][300/1251] eta 0:35:53 lr 0.000078 time 1.8394 (2.2649) loss 3.2146 (3.0351) grad_norm 2.3761 (2.6797) [2022-01-25 16:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][310/1251] eta 0:35:19 lr 0.000078 time 1.8934 (2.2528) loss 3.2264 (3.0331) grad_norm 2.8209 (2.6728) [2022-01-25 16:17:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][320/1251] eta 0:34:56 lr 0.000078 time 2.8083 (2.2520) loss 2.2874 (3.0310) grad_norm 2.7210 (2.6732) [2022-01-25 16:17:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][330/1251] eta 0:34:33 lr 0.000078 time 2.2690 (2.2517) loss 3.3079 (3.0370) grad_norm 2.4856 (2.6713) [2022-01-25 16:18:11 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [249/300][340/1251] eta 0:34:10 lr 0.000078 time 2.1895 (2.2511) loss 3.2889 (3.0367) grad_norm 2.7836 (2.6701) [2022-01-25 16:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][350/1251] eta 0:33:47 lr 0.000078 time 2.2224 (2.2507) loss 3.5580 (3.0395) grad_norm 3.0893 (2.6698) [2022-01-25 16:18:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][360/1251] eta 0:33:25 lr 0.000078 time 3.1582 (2.2512) loss 2.3402 (3.0395) grad_norm 2.7588 (2.6731) [2022-01-25 16:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][370/1251] eta 0:32:58 lr 0.000078 time 1.9613 (2.2458) loss 2.4720 (3.0321) grad_norm 2.8907 (2.6729) [2022-01-25 16:19:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][380/1251] eta 0:32:31 lr 0.000078 time 2.6430 (2.2411) loss 2.0284 (3.0317) grad_norm 2.8353 (2.6703) [2022-01-25 16:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][390/1251] eta 0:32:06 lr 0.000078 time 2.5158 (2.2378) loss 3.4003 (3.0315) grad_norm 2.2402 (2.6724) [2022-01-25 16:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][400/1251] eta 0:31:45 lr 0.000078 time 2.1553 (2.2395) loss 2.5782 (3.0308) grad_norm 2.6446 (2.6671) [2022-01-25 16:20:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][410/1251] eta 0:31:22 lr 0.000078 time 2.0039 (2.2383) loss 3.0814 (3.0309) grad_norm 2.6041 (2.6661) [2022-01-25 16:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][420/1251] eta 0:31:04 lr 0.000078 time 3.4651 (2.2441) loss 2.5135 (3.0305) grad_norm 2.5207 (2.6649) [2022-01-25 16:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][430/1251] eta 0:30:45 lr 0.000078 time 3.4039 (2.2482) loss 3.6144 (3.0348) grad_norm 2.5710 (2.6617) [2022-01-25 16:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][440/1251] eta 0:30:21 lr 0.000078 time 2.5977 (2.2456) loss 3.0376 
(3.0298) grad_norm 2.8276 (2.6638) [2022-01-25 16:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][450/1251] eta 0:29:54 lr 0.000078 time 1.8870 (2.2399) loss 3.0048 (3.0341) grad_norm 2.5399 (2.6610) [2022-01-25 16:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][460/1251] eta 0:29:27 lr 0.000078 time 2.0006 (2.2339) loss 1.9449 (3.0358) grad_norm 2.7360 (2.6624) [2022-01-25 16:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][470/1251] eta 0:29:02 lr 0.000078 time 2.0039 (2.2312) loss 3.7073 (3.0415) grad_norm 3.0509 (2.6652) [2022-01-25 16:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][480/1251] eta 0:28:38 lr 0.000078 time 2.4547 (2.2286) loss 2.8681 (3.0435) grad_norm 2.7078 (2.6640) [2022-01-25 16:23:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][490/1251] eta 0:28:14 lr 0.000078 time 1.4721 (2.2271) loss 3.4269 (3.0429) grad_norm 2.8625 (2.6641) [2022-01-25 16:23:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][500/1251] eta 0:27:51 lr 0.000078 time 2.1103 (2.2260) loss 2.0869 (3.0400) grad_norm 2.3552 (2.6601) [2022-01-25 16:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][510/1251] eta 0:27:30 lr 0.000078 time 2.1566 (2.2280) loss 3.4394 (3.0435) grad_norm 2.5058 (2.6595) [2022-01-25 16:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][520/1251] eta 0:27:13 lr 0.000078 time 2.7410 (2.2342) loss 3.6281 (3.0506) grad_norm 2.5460 (2.6599) [2022-01-25 16:25:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][530/1251] eta 0:26:52 lr 0.000078 time 1.5045 (2.2358) loss 3.4173 (3.0532) grad_norm 2.2530 (2.6614) [2022-01-25 16:25:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][540/1251] eta 0:26:28 lr 0.000078 time 1.5674 (2.2338) loss 3.4327 (3.0560) grad_norm 2.9787 (2.6656) [2022-01-25 16:25:52 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [249/300][550/1251] eta 0:26:03 lr 0.000078 time 1.9111 (2.2304) loss 3.3646 (3.0567) grad_norm 2.8881 (2.6689) [2022-01-25 16:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][560/1251] eta 0:25:37 lr 0.000078 time 2.0486 (2.2252) loss 3.7510 (3.0599) grad_norm 2.8254 (2.6747) [2022-01-25 16:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][570/1251] eta 0:25:13 lr 0.000078 time 2.3832 (2.2222) loss 3.5921 (3.0571) grad_norm 2.8521 (2.6766) [2022-01-25 16:26:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][580/1251] eta 0:24:51 lr 0.000078 time 2.5880 (2.2222) loss 3.9008 (3.0601) grad_norm 2.7800 (2.6776) [2022-01-25 16:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][590/1251] eta 0:24:28 lr 0.000078 time 2.5428 (2.2217) loss 3.2506 (3.0641) grad_norm 2.2936 (2.6750) [2022-01-25 16:27:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][600/1251] eta 0:24:07 lr 0.000078 time 2.4654 (2.2228) loss 3.6558 (3.0684) grad_norm 2.5093 (2.6768) [2022-01-25 16:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][610/1251] eta 0:23:45 lr 0.000078 time 2.3578 (2.2232) loss 3.7871 (3.0684) grad_norm 3.0588 (2.6769) [2022-01-25 16:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][620/1251] eta 0:23:23 lr 0.000078 time 2.7440 (2.2239) loss 3.2698 (3.0718) grad_norm 2.4032 (2.6756) [2022-01-25 16:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][630/1251] eta 0:23:00 lr 0.000078 time 1.5863 (2.2223) loss 3.5869 (3.0717) grad_norm 2.8733 (2.6749) [2022-01-25 16:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][640/1251] eta 0:22:37 lr 0.000078 time 2.2118 (2.2222) loss 2.1225 (3.0683) grad_norm 2.4394 (2.6742) [2022-01-25 16:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][650/1251] eta 0:22:16 lr 0.000078 time 3.1327 (2.2238) loss 3.2671 
(3.0662) grad_norm 2.5848 (2.6736) [2022-01-25 16:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][660/1251] eta 0:21:53 lr 0.000078 time 1.6387 (2.2224) loss 3.1218 (3.0669) grad_norm 2.3978 (2.6798) [2022-01-25 16:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][670/1251] eta 0:21:31 lr 0.000078 time 2.4851 (2.2221) loss 3.5524 (3.0691) grad_norm 2.2813 (2.6784) [2022-01-25 16:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][680/1251] eta 0:21:09 lr 0.000078 time 2.4957 (2.2234) loss 3.5546 (3.0716) grad_norm 2.6647 (2.6796) [2022-01-25 16:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][690/1251] eta 0:20:47 lr 0.000077 time 3.2235 (2.2233) loss 3.7547 (3.0718) grad_norm 2.8059 (2.6790) [2022-01-25 16:31:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][700/1251] eta 0:20:22 lr 0.000077 time 1.7886 (2.2190) loss 3.3810 (3.0742) grad_norm 2.5119 (2.6798) [2022-01-25 16:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][710/1251] eta 0:19:59 lr 0.000077 time 1.9447 (2.2163) loss 3.0955 (3.0769) grad_norm 2.7351 (2.6805) [2022-01-25 16:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][720/1251] eta 0:19:38 lr 0.000077 time 3.2887 (2.2190) loss 2.2637 (3.0757) grad_norm 2.9299 (2.6844) [2022-01-25 16:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][730/1251] eta 0:19:17 lr 0.000077 time 2.0654 (2.2209) loss 3.3626 (3.0794) grad_norm 3.0955 (2.6859) [2022-01-25 16:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][740/1251] eta 0:18:54 lr 0.000077 time 2.1863 (2.2201) loss 2.5990 (3.0783) grad_norm 2.5593 (2.6842) [2022-01-25 16:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][750/1251] eta 0:18:32 lr 0.000077 time 2.3588 (2.2200) loss 3.2877 (3.0781) grad_norm 2.4734 (2.6834) [2022-01-25 16:33:31 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [249/300][760/1251] eta 0:18:08 lr 0.000077 time 1.5779 (2.2170) loss 2.8922 (3.0799) grad_norm 2.4588 (2.6856) [2022-01-25 16:33:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][770/1251] eta 0:17:45 lr 0.000077 time 1.8562 (2.2154) loss 3.8149 (3.0847) grad_norm 2.8061 (2.6862) [2022-01-25 16:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][780/1251] eta 0:17:23 lr 0.000077 time 2.1695 (2.2158) loss 2.5266 (3.0829) grad_norm 2.3415 (2.6863) [2022-01-25 16:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][790/1251] eta 0:17:01 lr 0.000077 time 2.2075 (2.2158) loss 2.9958 (3.0840) grad_norm 2.8603 (2.6861) [2022-01-25 16:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][800/1251] eta 0:16:39 lr 0.000077 time 1.8748 (2.2156) loss 1.9174 (3.0829) grad_norm 3.4985 (2.6866) [2022-01-25 16:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][810/1251] eta 0:16:16 lr 0.000077 time 1.9071 (2.2152) loss 2.5439 (3.0857) grad_norm 2.5158 (2.6861) [2022-01-25 16:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][820/1251] eta 0:15:55 lr 0.000077 time 1.8644 (2.2159) loss 3.2678 (3.0875) grad_norm 2.4862 (2.6868) [2022-01-25 16:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][830/1251] eta 0:15:32 lr 0.000077 time 2.4915 (2.2155) loss 3.2472 (3.0854) grad_norm 2.5616 (2.6853) [2022-01-25 16:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][840/1251] eta 0:15:09 lr 0.000077 time 1.9860 (2.2131) loss 2.6648 (3.0860) grad_norm 2.4047 (2.6841) [2022-01-25 16:36:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][850/1251] eta 0:14:46 lr 0.000077 time 2.2366 (2.2103) loss 2.7639 (3.0854) grad_norm 2.9084 (2.6825) [2022-01-25 16:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][860/1251] eta 0:14:24 lr 0.000077 time 2.1063 (2.2104) loss 2.2929 
(3.0844) grad_norm 2.7531 (2.6968) [2022-01-25 16:37:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][870/1251] eta 0:14:01 lr 0.000077 time 2.4882 (2.2090) loss 3.2027 (3.0854) grad_norm 3.7757 (2.7010) [2022-01-25 16:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][880/1251] eta 0:13:39 lr 0.000077 time 1.5848 (2.2087) loss 3.3443 (3.0867) grad_norm 2.3055 (2.7058) [2022-01-25 16:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][890/1251] eta 0:13:17 lr 0.000077 time 2.7351 (2.2090) loss 2.6835 (3.0867) grad_norm 2.3285 (2.7060) [2022-01-25 16:38:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][900/1251] eta 0:12:55 lr 0.000077 time 1.9505 (2.2082) loss 3.0690 (3.0886) grad_norm 2.5732 (2.7071) [2022-01-25 16:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][910/1251] eta 0:12:33 lr 0.000077 time 3.5774 (2.2104) loss 2.0907 (3.0883) grad_norm 2.6259 (2.7070) [2022-01-25 16:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][920/1251] eta 0:12:12 lr 0.000077 time 2.2190 (2.2116) loss 3.1746 (3.0898) grad_norm 2.5236 (2.7059) [2022-01-25 16:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][930/1251] eta 0:11:50 lr 0.000077 time 2.4726 (2.2127) loss 3.4931 (3.0921) grad_norm 2.5138 (2.7037) [2022-01-25 16:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][940/1251] eta 0:11:28 lr 0.000077 time 1.7430 (2.2136) loss 2.2957 (3.0909) grad_norm 2.7585 (2.7022) [2022-01-25 16:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][950/1251] eta 0:11:05 lr 0.000077 time 2.1668 (2.2118) loss 2.9864 (3.0914) grad_norm 2.9881 (2.7027) [2022-01-25 16:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][960/1251] eta 0:10:42 lr 0.000077 time 1.5801 (2.2093) loss 3.4240 (3.0921) grad_norm 2.4489 (2.7024) [2022-01-25 16:41:07 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [249/300][970/1251] eta 0:10:20 lr 0.000077 time 1.7711 (2.2076) loss 2.3289 (3.0905) grad_norm 2.5154 (2.7021) [2022-01-25 16:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][980/1251] eta 0:09:57 lr 0.000077 time 1.9090 (2.2059) loss 2.2908 (3.0857) grad_norm 2.9334 (2.7020) [2022-01-25 16:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][990/1251] eta 0:09:35 lr 0.000077 time 1.9618 (2.2048) loss 2.7761 (3.0869) grad_norm 2.5487 (2.7010) [2022-01-25 16:42:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1000/1251] eta 0:09:13 lr 0.000077 time 2.5485 (2.2055) loss 2.3601 (3.0848) grad_norm 2.6294 (2.6999) [2022-01-25 16:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1010/1251] eta 0:08:51 lr 0.000077 time 1.9349 (2.2066) loss 2.7293 (3.0856) grad_norm 2.5251 (2.7049) [2022-01-25 16:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1020/1251] eta 0:08:30 lr 0.000077 time 1.8431 (2.2091) loss 3.4185 (3.0887) grad_norm 2.4102 (2.7050) [2022-01-25 16:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1030/1251] eta 0:08:07 lr 0.000077 time 1.8713 (2.2072) loss 2.8353 (3.0881) grad_norm 2.6339 (2.7059) [2022-01-25 16:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1040/1251] eta 0:07:45 lr 0.000077 time 2.5409 (2.2061) loss 3.7059 (3.0883) grad_norm 3.5458 (2.7066) [2022-01-25 16:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1050/1251] eta 0:07:23 lr 0.000077 time 1.8970 (2.2057) loss 3.0275 (3.0878) grad_norm 6.3989 (2.7101) [2022-01-25 16:44:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1060/1251] eta 0:07:01 lr 0.000077 time 1.8692 (2.2053) loss 3.5473 (3.0886) grad_norm 2.2664 (2.7105) [2022-01-25 16:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1070/1251] eta 0:06:39 lr 0.000077 time 2.2844 (2.2063) loss 
2.9566 (3.0885) grad_norm 2.8283 (2.7104) [2022-01-25 16:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1080/1251] eta 0:06:17 lr 0.000077 time 1.7239 (2.2056) loss 3.3122 (3.0909) grad_norm 2.9330 (2.7112) [2022-01-25 16:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1090/1251] eta 0:05:54 lr 0.000077 time 1.9634 (2.2049) loss 3.4538 (3.0896) grad_norm 2.5303 (2.7108) [2022-01-25 16:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1100/1251] eta 0:05:32 lr 0.000077 time 1.8887 (2.2040) loss 3.5072 (3.0892) grad_norm 3.1296 (2.7104) [2022-01-25 16:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1110/1251] eta 0:05:10 lr 0.000077 time 1.8786 (2.2036) loss 2.5301 (3.0892) grad_norm 2.7678 (2.7103) [2022-01-25 16:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1120/1251] eta 0:04:48 lr 0.000077 time 2.1192 (2.2031) loss 3.3527 (3.0907) grad_norm 2.4139 (2.7094) [2022-01-25 16:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1130/1251] eta 0:04:26 lr 0.000077 time 2.0253 (2.2044) loss 2.4217 (3.0895) grad_norm 2.5125 (2.7092) [2022-01-25 16:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1140/1251] eta 0:04:04 lr 0.000077 time 1.4621 (2.2047) loss 3.2276 (3.0904) grad_norm 2.6219 (2.7091) [2022-01-25 16:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1150/1251] eta 0:03:42 lr 0.000077 time 2.0040 (2.2047) loss 2.0623 (3.0881) grad_norm 3.0541 (2.7082) [2022-01-25 16:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1160/1251] eta 0:03:20 lr 0.000077 time 2.5272 (2.2040) loss 3.0923 (3.0872) grad_norm 2.8437 (2.7086) [2022-01-25 16:48:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1170/1251] eta 0:02:58 lr 0.000076 time 1.5277 (2.2031) loss 3.7028 (3.0878) grad_norm 2.3203 (2.7075) [2022-01-25 16:48:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1180/1251] eta 0:02:36 lr 0.000076 time 1.8932 (2.2020) loss 3.4032 (3.0866) grad_norm 3.0152 (2.7072) [2022-01-25 16:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1190/1251] eta 0:02:14 lr 0.000076 time 2.2937 (2.2017) loss 3.4994 (3.0877) grad_norm 2.6360 (2.7083) [2022-01-25 16:49:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1200/1251] eta 0:01:52 lr 0.000076 time 2.7021 (2.2016) loss 2.1471 (3.0862) grad_norm 2.7473 (2.7068) [2022-01-25 16:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1210/1251] eta 0:01:30 lr 0.000076 time 1.7403 (2.2016) loss 3.1037 (3.0871) grad_norm 3.2197 (2.7061) [2022-01-25 16:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1220/1251] eta 0:01:08 lr 0.000076 time 2.4869 (2.2035) loss 3.5623 (3.0894) grad_norm 2.5618 (2.7050) [2022-01-25 16:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1230/1251] eta 0:00:46 lr 0.000076 time 2.2941 (2.2034) loss 3.1180 (3.0887) grad_norm 2.5258 (2.7047) [2022-01-25 16:50:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1240/1251] eta 0:00:24 lr 0.000076 time 2.1811 (2.2024) loss 2.4184 (3.0863) grad_norm 2.7223 (2.7044) [2022-01-25 16:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [249/300][1250/1251] eta 0:00:02 lr 0.000076 time 1.3875 (2.1966) loss 3.2849 (3.0865) grad_norm 2.5753 (2.7039) [2022-01-25 16:51:12 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 249 training takes 0:45:48 [2022-01-25 16:51:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.259 (18.259) Loss 0.7840 (0.7840) Acc@1 81.934 (81.934) Acc@5 95.410 (95.410) [2022-01-25 16:51:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.958 (3.311) Loss 0.8695 (0.8204) Acc@1 79.297 (80.824) Acc@5 94.824 (95.410) [2022-01-25 16:52:06 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.628 (2.562) Loss 0.8887 (0.8287) Acc@1 78.809 (80.539) Acc@5 95.117 (95.387) [2022-01-25 16:52:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.644 (2.257) Loss 0.8877 (0.8400) Acc@1 80.371 (80.390) Acc@5 93.750 (95.231) [2022-01-25 16:52:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 5.438 (2.208) Loss 0.6911 (0.8374) Acc@1 84.668 (80.447) Acc@5 96.484 (95.258) [2022-01-25 16:52:49 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.274 Acc@5 95.234 [2022-01-25 16:52:49 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-01-25 16:52:49 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.38% [2022-01-25 16:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][0/1251] eta 7:26:37 lr 0.000076 time 21.4206 (21.4206) loss 2.0666 (2.0666) grad_norm 2.6323 (2.6323) [2022-01-25 16:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][10/1251] eta 1:21:20 lr 0.000076 time 2.2161 (3.9326) loss 3.5302 (2.8116) grad_norm 3.3320 (2.7499) [2022-01-25 16:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][20/1251] eta 1:04:12 lr 0.000076 time 2.2819 (3.1293) loss 2.4683 (2.8855) grad_norm 2.4968 (2.7095) [2022-01-25 16:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][30/1251] eta 0:57:00 lr 0.000076 time 1.8189 (2.8013) loss 2.8492 (2.9924) grad_norm 2.7027 (2.7294) [2022-01-25 16:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][40/1251] eta 0:55:05 lr 0.000076 time 5.2676 (2.7298) loss 3.2524 (2.9966) grad_norm 2.8288 (2.7681) [2022-01-25 16:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][50/1251] eta 0:52:45 lr 0.000076 time 2.7541 (2.6353) loss 3.4961 (3.0189) grad_norm 2.7666 (2.7489) [2022-01-25 16:55:23 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [250/300][60/1251] eta 0:50:10 lr 0.000076 time 1.8034 (2.5280) loss 3.3998 (3.0033) grad_norm 2.6375 (2.7620) [2022-01-25 16:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][70/1251] eta 0:48:32 lr 0.000076 time 1.8466 (2.4664) loss 2.7097 (3.0121) grad_norm 2.5760 (2.7517) [2022-01-25 16:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][80/1251] eta 0:47:16 lr 0.000076 time 2.2563 (2.4227) loss 3.1617 (3.0396) grad_norm 2.4804 (2.7277) [2022-01-25 16:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][90/1251] eta 0:46:31 lr 0.000076 time 2.6445 (2.4042) loss 2.5865 (3.0457) grad_norm 3.0618 (2.7151) [2022-01-25 16:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][100/1251] eta 0:45:52 lr 0.000076 time 1.9407 (2.3913) loss 3.4356 (3.0169) grad_norm 2.7251 (2.6992) [2022-01-25 16:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][110/1251] eta 0:45:23 lr 0.000076 time 2.1076 (2.3871) loss 2.8760 (3.0448) grad_norm 2.4231 (2.6951) [2022-01-25 16:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][120/1251] eta 0:44:43 lr 0.000076 time 2.2071 (2.3729) loss 2.4246 (3.0366) grad_norm 2.7453 (2.6885) [2022-01-25 16:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][130/1251] eta 0:44:07 lr 0.000076 time 2.2987 (2.3616) loss 3.0741 (3.0284) grad_norm 3.6761 (2.6843) [2022-01-25 16:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][140/1251] eta 0:43:18 lr 0.000076 time 1.7957 (2.3391) loss 2.5137 (3.0247) grad_norm 2.6545 (2.6834) [2022-01-25 16:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][150/1251] eta 0:42:41 lr 0.000076 time 2.0403 (2.3269) loss 3.5835 (3.0350) grad_norm 2.7571 (2.6827) [2022-01-25 16:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][160/1251] eta 0:42:02 lr 0.000076 time 2.2186 (2.3121) loss 3.4123 (3.0462) 
grad_norm 2.2560 (2.6914) [2022-01-25 16:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][170/1251] eta 0:41:33 lr 0.000076 time 2.7104 (2.3068) loss 3.6372 (3.0493) grad_norm 2.5983 (2.6876) [2022-01-25 16:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][180/1251] eta 0:41:08 lr 0.000076 time 2.1808 (2.3044) loss 2.5874 (3.0580) grad_norm 3.0885 (2.6943) [2022-01-25 17:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][190/1251] eta 0:40:45 lr 0.000076 time 2.1271 (2.3050) loss 3.0998 (3.0586) grad_norm 2.2435 (2.6938) [2022-01-25 17:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][200/1251] eta 0:40:14 lr 0.000076 time 2.9258 (2.2975) loss 3.4565 (3.0619) grad_norm 3.0684 (2.7049) [2022-01-25 17:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][210/1251] eta 0:39:42 lr 0.000076 time 1.8902 (2.2885) loss 2.9097 (3.0685) grad_norm 2.4167 (2.7028) [2022-01-25 17:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][220/1251] eta 0:39:14 lr 0.000076 time 2.1363 (2.2835) loss 3.1655 (3.0734) grad_norm 2.4807 (2.7004) [2022-01-25 17:01:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][230/1251] eta 0:38:49 lr 0.000076 time 2.4461 (2.2814) loss 3.6869 (3.0721) grad_norm 3.2945 (2.7087) [2022-01-25 17:01:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][240/1251] eta 0:38:18 lr 0.000076 time 2.1734 (2.2736) loss 3.7131 (3.0725) grad_norm 4.5603 (2.7135) [2022-01-25 17:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][250/1251] eta 0:37:53 lr 0.000076 time 2.5702 (2.2716) loss 3.1464 (3.0675) grad_norm 2.8126 (2.7268) [2022-01-25 17:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][260/1251] eta 0:37:30 lr 0.000076 time 2.2140 (2.2707) loss 3.4169 (3.0649) grad_norm 2.7248 (2.7238) [2022-01-25 17:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [250/300][270/1251] eta 0:37:09 lr 0.000076 time 3.2295 (2.2725) loss 3.1115 (3.0683) grad_norm 2.6896 (2.7346) [2022-01-25 17:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][280/1251] eta 0:36:40 lr 0.000076 time 2.0267 (2.2660) loss 3.4673 (3.0677) grad_norm 2.2833 (2.7318) [2022-01-25 17:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][290/1251] eta 0:36:10 lr 0.000076 time 1.9744 (2.2585) loss 2.9125 (3.0696) grad_norm 2.5726 (2.7282) [2022-01-25 17:04:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][300/1251] eta 0:35:40 lr 0.000076 time 2.3703 (2.2512) loss 3.5952 (3.0681) grad_norm 2.9855 (2.7300) [2022-01-25 17:04:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][310/1251] eta 0:35:18 lr 0.000076 time 2.8732 (2.2516) loss 2.5978 (3.0702) grad_norm 3.1365 (2.7249) [2022-01-25 17:04:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][320/1251] eta 0:34:56 lr 0.000076 time 1.9017 (2.2516) loss 2.8687 (3.0656) grad_norm 2.2143 (2.7155) [2022-01-25 17:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][330/1251] eta 0:34:34 lr 0.000076 time 2.4485 (2.2520) loss 3.6475 (3.0699) grad_norm 2.7054 (2.7107) [2022-01-25 17:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][340/1251] eta 0:34:09 lr 0.000076 time 1.8307 (2.2499) loss 2.5312 (3.0678) grad_norm 2.5325 (2.7064) [2022-01-25 17:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][350/1251] eta 0:33:51 lr 0.000076 time 3.7193 (2.2546) loss 3.6020 (3.0655) grad_norm 2.4182 (2.7055) [2022-01-25 17:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][360/1251] eta 0:33:26 lr 0.000076 time 1.7492 (2.2523) loss 3.2960 (3.0717) grad_norm 2.4750 (2.7055) [2022-01-25 17:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][370/1251] eta 0:32:59 lr 0.000076 time 2.0251 (2.2474) loss 2.2823 (3.0668) 
grad_norm 2.7119 (2.7071) [2022-01-25 17:07:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][380/1251] eta 0:32:33 lr 0.000076 time 2.0295 (2.2425) loss 3.8676 (3.0723) grad_norm 2.7662 (2.7109) [2022-01-25 17:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][390/1251] eta 0:32:10 lr 0.000076 time 3.1226 (2.2423) loss 3.3591 (3.0728) grad_norm 2.5496 (2.7152) [2022-01-25 17:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][400/1251] eta 0:31:43 lr 0.000075 time 2.0062 (2.2371) loss 3.2356 (3.0736) grad_norm 2.7610 (2.7164) [2022-01-25 17:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][410/1251] eta 0:31:15 lr 0.000075 time 1.8348 (2.2306) loss 2.0278 (3.0769) grad_norm 2.8662 (2.7148) [2022-01-25 17:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][420/1251] eta 0:30:52 lr 0.000075 time 2.2662 (2.2297) loss 3.1639 (3.0779) grad_norm 2.1494 (2.7124) [2022-01-25 17:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][430/1251] eta 0:30:33 lr 0.000075 time 3.0703 (2.2337) loss 2.0090 (3.0769) grad_norm 3.1184 (2.7110) [2022-01-25 17:09:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][440/1251] eta 0:30:14 lr 0.000075 time 2.8679 (2.2378) loss 3.4910 (3.0740) grad_norm 2.9692 (2.7128) [2022-01-25 17:09:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][450/1251] eta 0:29:51 lr 0.000075 time 1.6032 (2.2371) loss 3.4130 (3.0784) grad_norm 2.7200 (2.7167) [2022-01-25 17:10:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][460/1251] eta 0:29:28 lr 0.000075 time 1.9786 (2.2357) loss 2.7774 (3.0771) grad_norm 3.2008 (2.7152) [2022-01-25 17:10:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][470/1251] eta 0:29:02 lr 0.000075 time 2.5699 (2.2310) loss 2.4744 (3.0771) grad_norm 2.6505 (2.7158) [2022-01-25 17:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [250/300][480/1251] eta 0:28:36 lr 0.000075 time 2.2024 (2.2263) loss 3.0790 (3.0794) grad_norm 2.7836 (2.7126) [2022-01-25 17:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][490/1251] eta 0:28:12 lr 0.000075 time 2.1490 (2.2241) loss 2.6637 (3.0751) grad_norm 2.6367 (2.7122) [2022-01-25 17:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][500/1251] eta 0:27:51 lr 0.000075 time 2.1043 (2.2264) loss 3.2326 (3.0782) grad_norm 2.1994 (2.7091) [2022-01-25 17:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][510/1251] eta 0:27:31 lr 0.000075 time 2.7621 (2.2288) loss 3.4992 (3.0758) grad_norm 2.3277 (2.7052) [2022-01-25 17:12:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][520/1251] eta 0:27:09 lr 0.000075 time 2.1253 (2.2297) loss 3.3574 (3.0727) grad_norm 2.6754 (2.7034) [2022-01-25 17:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][530/1251] eta 0:26:48 lr 0.000075 time 2.2120 (2.2304) loss 3.1468 (3.0713) grad_norm 3.0849 (2.7051) [2022-01-25 17:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][540/1251] eta 0:26:23 lr 0.000075 time 1.9388 (2.2269) loss 3.2474 (3.0754) grad_norm 2.6800 (2.7107) [2022-01-25 17:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][550/1251] eta 0:25:59 lr 0.000075 time 2.7392 (2.2247) loss 2.7670 (3.0754) grad_norm 2.4719 (2.7106) [2022-01-25 17:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][560/1251] eta 0:25:34 lr 0.000075 time 1.7446 (2.2210) loss 3.5193 (3.0804) grad_norm 2.7246 (2.7085) [2022-01-25 17:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][570/1251] eta 0:25:09 lr 0.000075 time 1.7534 (2.2162) loss 2.7804 (3.0773) grad_norm 2.9275 (2.7082) [2022-01-25 17:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][580/1251] eta 0:24:45 lr 0.000075 time 1.8712 (2.2141) loss 3.5936 (3.0781) 
grad_norm 2.6014 (2.7086) [2022-01-25 17:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][590/1251] eta 0:24:25 lr 0.000075 time 3.8006 (2.2167) loss 3.3148 (3.0795) grad_norm 2.9224 (2.7089) [2022-01-25 17:15:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][600/1251] eta 0:24:03 lr 0.000075 time 2.1182 (2.2179) loss 2.9210 (3.0800) grad_norm 2.7907 (2.7097) [2022-01-25 17:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][610/1251] eta 0:23:42 lr 0.000075 time 2.3716 (2.2199) loss 3.3685 (3.0846) grad_norm 2.2289 (2.7068) [2022-01-25 17:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][620/1251] eta 0:23:21 lr 0.000075 time 2.8858 (2.2213) loss 2.2310 (3.0823) grad_norm 2.7368 (2.7052) [2022-01-25 17:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][630/1251] eta 0:22:59 lr 0.000075 time 2.4906 (2.2218) loss 3.1560 (3.0855) grad_norm 2.5984 (2.7035) [2022-01-25 17:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][640/1251] eta 0:22:36 lr 0.000075 time 1.8981 (2.2195) loss 3.3858 (3.0870) grad_norm 2.5348 (2.7010) [2022-01-25 17:16:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][650/1251] eta 0:22:12 lr 0.000075 time 2.5882 (2.2176) loss 3.4158 (3.0882) grad_norm 2.3389 (2.6982) [2022-01-25 17:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][660/1251] eta 0:21:49 lr 0.000075 time 2.1907 (2.2156) loss 3.3968 (3.0915) grad_norm 2.3848 (2.6971) [2022-01-25 17:17:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][670/1251] eta 0:21:26 lr 0.000075 time 2.1735 (2.2143) loss 3.2962 (3.0947) grad_norm 2.5146 (2.7032) [2022-01-25 17:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][680/1251] eta 0:21:03 lr 0.000075 time 2.2694 (2.2135) loss 2.3508 (3.0935) grad_norm 2.6119 (2.7079) [2022-01-25 17:18:19 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [250/300][690/1251] eta 0:20:41 lr 0.000075 time 2.4437 (2.2138) loss 3.4107 (3.0937) grad_norm 3.0634 (2.7086) [2022-01-25 17:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][700/1251] eta 0:20:19 lr 0.000075 time 2.2389 (2.2132) loss 2.3055 (3.0918) grad_norm 2.7343 (2.7075) [2022-01-25 17:19:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][710/1251] eta 0:19:56 lr 0.000075 time 2.0712 (2.2107) loss 2.4486 (3.0902) grad_norm 3.7777 (2.7097) [2022-01-25 17:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][720/1251] eta 0:19:35 lr 0.000075 time 1.8548 (2.2135) loss 3.1287 (3.0902) grad_norm 2.6951 (2.7090) [2022-01-25 17:19:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][730/1251] eta 0:19:14 lr 0.000075 time 3.3583 (2.2155) loss 2.8981 (3.0906) grad_norm 2.5843 (2.7086) [2022-01-25 17:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][740/1251] eta 0:18:51 lr 0.000075 time 1.8793 (2.2151) loss 2.0982 (3.0879) grad_norm 2.4612 (2.7054) [2022-01-25 17:20:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][750/1251] eta 0:18:28 lr 0.000075 time 1.5697 (2.2127) loss 3.1116 (3.0873) grad_norm 3.1644 (2.7043) [2022-01-25 17:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][760/1251] eta 0:18:07 lr 0.000075 time 2.3051 (2.2147) loss 3.7512 (3.0890) grad_norm 2.5879 (2.7024) [2022-01-25 17:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][770/1251] eta 0:17:44 lr 0.000075 time 3.0475 (2.2140) loss 3.4850 (3.0901) grad_norm 2.6611 (2.7047) [2022-01-25 17:21:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][780/1251] eta 0:17:21 lr 0.000075 time 1.6759 (2.2114) loss 3.7912 (3.0935) grad_norm 2.4305 (2.7055) [2022-01-25 17:21:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][790/1251] eta 0:16:59 lr 0.000075 time 1.9230 (2.2106) loss 2.7436 (3.0922) 
grad_norm 2.7893 (2.7042) [2022-01-25 17:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][800/1251] eta 0:16:36 lr 0.000075 time 2.5203 (2.2105) loss 3.4353 (3.0954) grad_norm 2.5891 (2.7052) [2022-01-25 17:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][810/1251] eta 0:16:15 lr 0.000075 time 3.6753 (2.2115) loss 3.0449 (3.0948) grad_norm 2.6254 (2.7053) [2022-01-25 17:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][820/1251] eta 0:15:53 lr 0.000075 time 2.0454 (2.2127) loss 3.3353 (3.0959) grad_norm 2.4815 (2.7061) [2022-01-25 17:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][830/1251] eta 0:15:31 lr 0.000075 time 1.6911 (2.2121) loss 3.3289 (3.0937) grad_norm 2.3639 (2.7048) [2022-01-25 17:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][840/1251] eta 0:15:09 lr 0.000075 time 2.2672 (2.2120) loss 2.9043 (3.0941) grad_norm 2.5437 (2.7083) [2022-01-25 17:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][850/1251] eta 0:14:46 lr 0.000075 time 2.4851 (2.2111) loss 3.4867 (3.0949) grad_norm 2.8399 (2.7099) [2022-01-25 17:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][860/1251] eta 0:14:24 lr 0.000075 time 1.9473 (2.2102) loss 3.6859 (3.0968) grad_norm 2.4780 (2.7097) [2022-01-25 17:24:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][870/1251] eta 0:14:00 lr 0.000075 time 1.5224 (2.2070) loss 3.4160 (3.0963) grad_norm 2.5015 (2.7078) [2022-01-25 17:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][880/1251] eta 0:13:38 lr 0.000075 time 1.5871 (2.2055) loss 3.2912 (3.0948) grad_norm 2.5833 (2.7088) [2022-01-25 17:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][890/1251] eta 0:13:16 lr 0.000074 time 2.5796 (2.2056) loss 3.3414 (3.0961) grad_norm 2.5770 (2.7088) [2022-01-25 17:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [250/300][900/1251] eta 0:12:55 lr 0.000074 time 3.2430 (2.2081) loss 3.1651 (3.0949) grad_norm 2.5962 (2.7090) [2022-01-25 17:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][910/1251] eta 0:12:34 lr 0.000074 time 1.8837 (2.2129) loss 3.4270 (3.0938) grad_norm 2.5751 (2.7071) [2022-01-25 17:26:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][920/1251] eta 0:12:13 lr 0.000074 time 1.5446 (2.2157) loss 2.2284 (3.0921) grad_norm 2.6079 (2.7060) [2022-01-25 17:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][930/1251] eta 0:11:51 lr 0.000074 time 2.6845 (2.2174) loss 3.5722 (3.0930) grad_norm 2.7553 (2.7048) [2022-01-25 17:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][940/1251] eta 0:11:29 lr 0.000074 time 2.3279 (2.2169) loss 2.8125 (3.0964) grad_norm 2.6916 (2.7066) [2022-01-25 17:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][950/1251] eta 0:11:06 lr 0.000074 time 1.6283 (2.2132) loss 2.8300 (3.0955) grad_norm 3.0207 (2.7057) [2022-01-25 17:28:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][960/1251] eta 0:10:42 lr 0.000074 time 2.2006 (2.2096) loss 2.8423 (3.0924) grad_norm 2.7074 (2.7052) [2022-01-25 17:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][970/1251] eta 0:10:20 lr 0.000074 time 1.6598 (2.2080) loss 3.3099 (3.0929) grad_norm 2.5821 (2.7044) [2022-01-25 17:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][980/1251] eta 0:09:58 lr 0.000074 time 2.2247 (2.2077) loss 2.7626 (3.0924) grad_norm 2.4208 (2.7038) [2022-01-25 17:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][990/1251] eta 0:09:35 lr 0.000074 time 1.5847 (2.2053) loss 2.8170 (3.0908) grad_norm 2.7676 (2.7012) [2022-01-25 17:29:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1000/1251] eta 0:09:13 lr 0.000074 time 2.9420 (2.2056) loss 3.1038 (3.0908) 
grad_norm 2.5294 (2.6995) [2022-01-25 17:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1010/1251] eta 0:08:51 lr 0.000074 time 1.9546 (2.2039) loss 2.5583 (3.0880) grad_norm 2.2465 (2.6979) [2022-01-25 17:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1020/1251] eta 0:08:29 lr 0.000074 time 1.8708 (2.2053) loss 2.8870 (3.0865) grad_norm 2.4575 (2.6955) [2022-01-25 17:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1030/1251] eta 0:08:07 lr 0.000074 time 1.5919 (2.2066) loss 3.6576 (3.0868) grad_norm 2.5570 (2.6944) [2022-01-25 17:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1040/1251] eta 0:07:46 lr 0.000074 time 3.3897 (2.2094) loss 3.3621 (3.0864) grad_norm 2.4987 (2.6939) [2022-01-25 17:31:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1050/1251] eta 0:07:24 lr 0.000074 time 1.8020 (2.2091) loss 2.1537 (3.0866) grad_norm 2.4647 (2.6924) [2022-01-25 17:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1060/1251] eta 0:07:02 lr 0.000074 time 2.0934 (2.2104) loss 3.6841 (3.0881) grad_norm 2.7166 (2.6915) [2022-01-25 17:32:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1070/1251] eta 0:06:40 lr 0.000074 time 1.5498 (2.2103) loss 2.1939 (3.0860) grad_norm 2.4977 (2.6890) [2022-01-25 17:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1080/1251] eta 0:06:17 lr 0.000074 time 2.5712 (2.2082) loss 3.0822 (3.0854) grad_norm 2.8141 (2.6892) [2022-01-25 17:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1090/1251] eta 0:05:55 lr 0.000074 time 1.5491 (2.2064) loss 3.3438 (3.0857) grad_norm 2.9608 (2.6879) [2022-01-25 17:33:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1100/1251] eta 0:05:32 lr 0.000074 time 1.8537 (2.2042) loss 3.3893 (3.0872) grad_norm 2.5084 (2.6881) [2022-01-25 17:33:38 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [250/300][1110/1251] eta 0:05:10 lr 0.000074 time 2.2283 (2.2039) loss 3.3605 (3.0881) grad_norm 2.6547 (2.6878) [2022-01-25 17:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1120/1251] eta 0:04:48 lr 0.000074 time 3.0309 (2.2045) loss 3.2020 (3.0887) grad_norm 2.7579 (2.6897) [2022-01-25 17:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1130/1251] eta 0:04:26 lr 0.000074 time 2.5408 (2.2052) loss 3.5582 (3.0903) grad_norm 2.6205 (2.6903) [2022-01-25 17:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1140/1251] eta 0:04:04 lr 0.000074 time 2.1492 (2.2060) loss 3.5428 (3.0905) grad_norm 2.7804 (2.6889) [2022-01-25 17:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1150/1251] eta 0:03:42 lr 0.000074 time 2.1681 (2.2063) loss 3.2760 (3.0898) grad_norm 2.4181 (2.6869) [2022-01-25 17:35:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1160/1251] eta 0:03:20 lr 0.000074 time 2.5188 (2.2071) loss 3.6778 (3.0895) grad_norm 2.4188 (2.6868) [2022-01-25 17:35:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1170/1251] eta 0:02:58 lr 0.000074 time 2.1690 (2.2068) loss 3.5574 (3.0894) grad_norm 2.4144 (2.6873) [2022-01-25 17:36:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1180/1251] eta 0:02:36 lr 0.000074 time 2.7375 (2.2073) loss 3.3793 (3.0890) grad_norm 3.1106 (2.6902) [2022-01-25 17:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1190/1251] eta 0:02:14 lr 0.000074 time 1.7261 (2.2060) loss 3.1221 (3.0887) grad_norm 2.5255 (2.6932) [2022-01-25 17:36:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1200/1251] eta 0:01:52 lr 0.000074 time 1.9514 (2.2051) loss 2.8182 (3.0850) grad_norm 2.5319 (2.6924) [2022-01-25 17:37:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1210/1251] eta 0:01:30 lr 0.000074 time 2.6915 (2.2036) loss 
2.6709 (3.0846) grad_norm 2.9633 (2.6944) [2022-01-25 17:37:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1220/1251] eta 0:01:08 lr 0.000074 time 2.2368 (2.2028) loss 3.7113 (3.0827) grad_norm 2.6085 (2.6940) [2022-01-25 17:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1230/1251] eta 0:00:46 lr 0.000074 time 2.2128 (2.2023) loss 3.0740 (3.0837) grad_norm 2.7997 (2.6944) [2022-01-25 17:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1240/1251] eta 0:00:24 lr 0.000074 time 1.7314 (2.2038) loss 3.2493 (3.0855) grad_norm 5.1413 (2.6971) [2022-01-25 17:38:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [250/300][1250/1251] eta 0:00:02 lr 0.000074 time 1.2945 (2.1990) loss 3.0503 (3.0839) grad_norm 2.3196 (2.6958) [2022-01-25 17:38:41 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 250 training takes 0:45:51 [2022-01-25 17:38:41 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_250 saving...... [2022-01-25 17:38:52 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_250 saved !!! 
[2022-01-25 17:39:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.041 (16.041) Loss 0.8562 (0.8562) Acc@1 79.883 (79.883) Acc@5 95.020 (95.020) [2022-01-25 17:39:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.576 (2.886) Loss 0.8290 (0.8548) Acc@1 80.859 (80.238) Acc@5 95.703 (95.153) [2022-01-25 17:39:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.351 (2.284) Loss 0.8806 (0.8587) Acc@1 81.152 (80.092) Acc@5 94.629 (95.075) [2022-01-25 17:39:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.266 (2.028) Loss 0.8730 (0.8505) Acc@1 77.539 (80.214) Acc@5 95.410 (95.174) [2022-01-25 17:40:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.503 (1.959) Loss 0.8248 (0.8463) Acc@1 81.934 (80.223) Acc@5 95.312 (95.248) [2022-01-25 17:40:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.338 Acc@5 95.218 [2022-01-25 17:40:20 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.3% [2022-01-25 17:40:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.38% [2022-01-25 17:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][0/1251] eta 7:22:04 lr 0.000074 time 21.2027 (21.2027) loss 3.3103 (3.3103) grad_norm 2.6894 (2.6894) [2022-01-25 17:41:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][10/1251] eta 1:21:01 lr 0.000074 time 1.8798 (3.9171) loss 3.0537 (2.9840) grad_norm 2.6449 (2.5564) [2022-01-25 17:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][20/1251] eta 1:01:56 lr 0.000074 time 2.2149 (3.0189) loss 3.1151 (3.0556) grad_norm 2.4238 (2.5077) [2022-01-25 17:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][30/1251] eta 0:55:12 lr 0.000074 time 1.7796 (2.7133) loss 3.6380 (3.0520) grad_norm 2.7692 (2.5303) [2022-01-25 17:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[251/300][40/1251] eta 0:53:42 lr 0.000074 time 3.6356 (2.6607) loss 3.4964 (3.0083) grad_norm 2.4988 (2.5540) [2022-01-25 17:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][50/1251] eta 0:51:53 lr 0.000074 time 2.6689 (2.5928) loss 3.5542 (3.0369) grad_norm 3.4561 (2.6220) [2022-01-25 17:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][60/1251] eta 0:50:00 lr 0.000074 time 2.1047 (2.5197) loss 2.7152 (3.0030) grad_norm 2.6057 (2.6520) [2022-01-25 17:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][70/1251] eta 0:48:47 lr 0.000074 time 2.1911 (2.4785) loss 3.5020 (3.0452) grad_norm 2.5526 (2.6479) [2022-01-25 17:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][80/1251] eta 0:48:26 lr 0.000074 time 4.2008 (2.4819) loss 2.5651 (3.0390) grad_norm 2.6121 (2.6357) [2022-01-25 17:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][90/1251] eta 0:47:31 lr 0.000074 time 1.7072 (2.4560) loss 3.3098 (3.0195) grad_norm 2.6197 (2.6300) [2022-01-25 17:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][100/1251] eta 0:46:25 lr 0.000074 time 1.9591 (2.4203) loss 2.8037 (3.0256) grad_norm 2.6183 (2.6338) [2022-01-25 17:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][110/1251] eta 0:45:24 lr 0.000074 time 1.9231 (2.3880) loss 3.1207 (3.0386) grad_norm 2.3170 (2.6282) [2022-01-25 17:45:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][120/1251] eta 0:44:38 lr 0.000074 time 2.5396 (2.3687) loss 3.5490 (3.0395) grad_norm 2.6802 (2.6199) [2022-01-25 17:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][130/1251] eta 0:44:04 lr 0.000073 time 3.0727 (2.3587) loss 2.2222 (3.0437) grad_norm 2.5731 (2.6216) [2022-01-25 17:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][140/1251] eta 0:43:29 lr 0.000073 time 2.1486 (2.3489) loss 3.4090 (3.0361) grad_norm 4.4999 
(2.6472) [2022-01-25 17:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][150/1251] eta 0:42:49 lr 0.000073 time 2.2731 (2.3335) loss 2.6741 (3.0448) grad_norm 2.5067 (2.6564) [2022-01-25 17:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][160/1251] eta 0:42:10 lr 0.000073 time 2.6867 (2.3197) loss 3.1495 (3.0356) grad_norm 2.6288 (2.6636) [2022-01-25 17:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][170/1251] eta 0:41:38 lr 0.000073 time 2.5365 (2.3111) loss 3.2549 (3.0532) grad_norm 2.5396 (2.6659) [2022-01-25 17:47:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][180/1251] eta 0:41:03 lr 0.000073 time 2.2108 (2.2998) loss 2.7472 (3.0596) grad_norm 3.0632 (2.6632) [2022-01-25 17:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][190/1251] eta 0:40:34 lr 0.000073 time 2.6104 (2.2948) loss 2.9825 (3.0613) grad_norm 2.3954 (2.6634) [2022-01-25 17:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][200/1251] eta 0:39:57 lr 0.000073 time 2.2736 (2.2810) loss 2.5825 (3.0674) grad_norm 2.7771 (2.6643) [2022-01-25 17:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][210/1251] eta 0:39:21 lr 0.000073 time 2.3116 (2.2689) loss 2.0045 (3.0579) grad_norm 3.6637 (2.6672) [2022-01-25 17:48:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][220/1251] eta 0:38:56 lr 0.000073 time 2.2606 (2.2666) loss 2.6428 (3.0558) grad_norm 2.6846 (2.6789) [2022-01-25 17:49:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][230/1251] eta 0:38:30 lr 0.000073 time 1.9355 (2.2629) loss 3.0827 (3.0653) grad_norm 3.1952 (2.6803) [2022-01-25 17:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][240/1251] eta 0:38:06 lr 0.000073 time 1.7201 (2.2615) loss 2.9195 (3.0508) grad_norm 2.3725 (2.6786) [2022-01-25 17:49:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[251/300][250/1251] eta 0:37:56 lr 0.000073 time 3.5102 (2.2741) loss 2.2954 (3.0432) grad_norm 2.2865 (2.6829) [2022-01-25 17:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][260/1251] eta 0:37:37 lr 0.000073 time 1.9118 (2.2785) loss 2.1753 (3.0498) grad_norm 2.4889 (2.6821) [2022-01-25 17:50:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][270/1251] eta 0:37:06 lr 0.000073 time 1.5700 (2.2696) loss 3.5015 (3.0572) grad_norm 2.1790 (2.6772) [2022-01-25 17:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][280/1251] eta 0:36:33 lr 0.000073 time 2.0879 (2.2593) loss 2.6775 (3.0588) grad_norm 2.1974 (2.6755) [2022-01-25 17:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][290/1251] eta 0:36:02 lr 0.000073 time 2.0173 (2.2503) loss 3.1846 (3.0557) grad_norm 2.5412 (2.6736) [2022-01-25 17:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][300/1251] eta 0:35:36 lr 0.000073 time 1.8902 (2.2461) loss 3.3002 (3.0584) grad_norm 2.7861 (2.6737) [2022-01-25 17:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][310/1251] eta 0:35:10 lr 0.000073 time 2.0969 (2.2427) loss 3.6193 (3.0543) grad_norm 2.6940 (2.6783) [2022-01-25 17:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][320/1251] eta 0:34:45 lr 0.000073 time 1.9604 (2.2404) loss 2.7357 (3.0580) grad_norm 2.7594 (2.6873) [2022-01-25 17:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][330/1251] eta 0:34:22 lr 0.000073 time 1.9271 (2.2393) loss 3.3783 (3.0604) grad_norm 2.6512 (2.6861) [2022-01-25 17:53:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][340/1251] eta 0:34:03 lr 0.000073 time 2.4557 (2.2429) loss 3.9619 (3.0541) grad_norm 2.5341 (2.6867) [2022-01-25 17:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][350/1251] eta 0:33:39 lr 0.000073 time 1.9268 (2.2417) loss 3.7318 (3.0583) grad_norm 
2.7046 (2.6890) [2022-01-25 17:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][360/1251] eta 0:33:17 lr 0.000073 time 2.1437 (2.2417) loss 3.5779 (3.0646) grad_norm 2.5742 (2.6900) [2022-01-25 17:54:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][370/1251] eta 0:32:50 lr 0.000073 time 1.5797 (2.2366) loss 3.5318 (3.0641) grad_norm 2.4510 (2.6904) [2022-01-25 17:54:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][380/1251] eta 0:32:28 lr 0.000073 time 2.9061 (2.2366) loss 2.5310 (3.0611) grad_norm 2.6391 (2.6954) [2022-01-25 17:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][390/1251] eta 0:32:01 lr 0.000073 time 1.6453 (2.2316) loss 3.0761 (3.0603) grad_norm 2.6630 (2.7016) [2022-01-25 17:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][400/1251] eta 0:31:36 lr 0.000073 time 1.9488 (2.2286) loss 3.2026 (3.0646) grad_norm 2.7037 (2.7010) [2022-01-25 17:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][410/1251] eta 0:31:12 lr 0.000073 time 1.8629 (2.2270) loss 3.0146 (3.0632) grad_norm 2.6316 (2.6982) [2022-01-25 17:56:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][420/1251] eta 0:30:54 lr 0.000073 time 2.8359 (2.2320) loss 3.4991 (3.0720) grad_norm 2.6307 (2.6992) [2022-01-25 17:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][430/1251] eta 0:30:34 lr 0.000073 time 2.4158 (2.2346) loss 2.8934 (3.0711) grad_norm 2.6846 (2.7019) [2022-01-25 17:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][440/1251] eta 0:30:12 lr 0.000073 time 1.8732 (2.2347) loss 3.4464 (3.0745) grad_norm 3.5453 (2.7027) [2022-01-25 17:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][450/1251] eta 0:29:48 lr 0.000073 time 1.9046 (2.2327) loss 2.3574 (3.0761) grad_norm 2.5221 (2.7027) [2022-01-25 17:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[251/300][460/1251] eta 0:29:21 lr 0.000073 time 1.8963 (2.2266) loss 2.6495 (3.0766) grad_norm 3.5593 (2.7092) [2022-01-25 17:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][470/1251] eta 0:28:53 lr 0.000073 time 1.8596 (2.2197) loss 3.4940 (3.0777) grad_norm 3.0901 (2.7120) [2022-01-25 17:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][480/1251] eta 0:28:30 lr 0.000073 time 2.0409 (2.2183) loss 2.7662 (3.0765) grad_norm 2.8298 (2.7132) [2022-01-25 17:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][490/1251] eta 0:28:07 lr 0.000073 time 1.9020 (2.2180) loss 3.3293 (3.0746) grad_norm 2.8698 (2.7130) [2022-01-25 17:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][500/1251] eta 0:27:45 lr 0.000073 time 2.3551 (2.2175) loss 3.2478 (3.0707) grad_norm 2.3733 (2.7132) [2022-01-25 17:59:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][510/1251] eta 0:27:24 lr 0.000073 time 2.5142 (2.2196) loss 3.3229 (3.0714) grad_norm 2.3530 (2.7110) [2022-01-25 17:59:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][520/1251] eta 0:27:03 lr 0.000073 time 2.4005 (2.2213) loss 3.7807 (3.0737) grad_norm 2.4955 (2.7128) [2022-01-25 17:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][530/1251] eta 0:26:40 lr 0.000073 time 1.8593 (2.2193) loss 3.5094 (3.0723) grad_norm 2.3797 (2.7097) [2022-01-25 18:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][540/1251] eta 0:26:18 lr 0.000073 time 2.8131 (2.2206) loss 2.6778 (3.0736) grad_norm 2.8729 (2.7137) [2022-01-25 18:00:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][550/1251] eta 0:25:56 lr 0.000073 time 1.6840 (2.2209) loss 3.4925 (3.0753) grad_norm 2.7957 (2.7180) [2022-01-25 18:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][560/1251] eta 0:25:33 lr 0.000073 time 1.6222 (2.2192) loss 3.2435 (3.0724) grad_norm 
2.5944 (2.7207) [2022-01-25 18:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][570/1251] eta 0:25:08 lr 0.000073 time 1.5741 (2.2152) loss 2.9614 (3.0688) grad_norm 2.7177 (2.7237) [2022-01-25 18:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][580/1251] eta 0:24:45 lr 0.000073 time 2.5176 (2.2136) loss 3.1380 (3.0670) grad_norm 2.6919 (2.7229) [2022-01-25 18:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][590/1251] eta 0:24:24 lr 0.000073 time 1.8813 (2.2155) loss 2.4789 (3.0673) grad_norm 2.6544 (2.7293) [2022-01-25 18:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][600/1251] eta 0:24:02 lr 0.000073 time 1.9736 (2.2161) loss 3.3713 (3.0691) grad_norm 2.3956 (2.7280) [2022-01-25 18:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][610/1251] eta 0:23:39 lr 0.000073 time 1.6165 (2.2149) loss 3.0049 (3.0714) grad_norm 2.1978 (2.7253) [2022-01-25 18:03:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][620/1251] eta 0:23:16 lr 0.000072 time 2.2163 (2.2137) loss 3.3325 (3.0667) grad_norm 2.8047 (2.7244) [2022-01-25 18:03:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][630/1251] eta 0:22:55 lr 0.000072 time 2.1702 (2.2149) loss 3.2711 (3.0669) grad_norm 3.0142 (2.7238) [2022-01-25 18:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][640/1251] eta 0:22:33 lr 0.000072 time 2.5914 (2.2147) loss 3.5937 (3.0635) grad_norm 2.7703 (2.7234) [2022-01-25 18:04:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][650/1251] eta 0:22:09 lr 0.000072 time 1.9192 (2.2126) loss 3.6402 (3.0655) grad_norm 2.5316 (2.7225) [2022-01-25 18:04:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][660/1251] eta 0:21:47 lr 0.000072 time 1.9837 (2.2116) loss 3.3489 (3.0653) grad_norm 2.9372 (2.7231) [2022-01-25 18:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[251/300][670/1251] eta 0:21:24 lr 0.000072 time 1.7588 (2.2114) loss 3.6779 (3.0659) grad_norm 2.7196 (2.7246) [2022-01-25 18:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][680/1251] eta 0:21:02 lr 0.000072 time 2.3381 (2.2108) loss 3.2911 (3.0659) grad_norm 2.3102 (2.7228) [2022-01-25 18:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][690/1251] eta 0:20:39 lr 0.000072 time 2.2492 (2.2094) loss 2.6024 (3.0660) grad_norm 2.8296 (2.7207) [2022-01-25 18:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][700/1251] eta 0:20:16 lr 0.000072 time 1.9577 (2.2081) loss 3.6278 (3.0660) grad_norm 3.0546 (2.7202) [2022-01-25 18:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][710/1251] eta 0:19:53 lr 0.000072 time 1.9309 (2.2064) loss 3.4228 (3.0678) grad_norm 2.9077 (2.7239) [2022-01-25 18:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][720/1251] eta 0:19:31 lr 0.000072 time 2.5201 (2.2062) loss 3.1903 (3.0669) grad_norm 2.9443 (2.7230) [2022-01-25 18:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][730/1251] eta 0:19:09 lr 0.000072 time 1.8207 (2.2056) loss 3.5656 (3.0675) grad_norm 2.4776 (2.7240) [2022-01-25 18:07:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][740/1251] eta 0:18:46 lr 0.000072 time 2.0168 (2.2047) loss 3.4719 (3.0709) grad_norm 2.7961 (2.7292) [2022-01-25 18:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][750/1251] eta 0:18:24 lr 0.000072 time 1.9247 (2.2054) loss 2.2037 (3.0669) grad_norm 2.3949 (2.7285) [2022-01-25 18:08:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][760/1251] eta 0:18:03 lr 0.000072 time 3.1233 (2.2068) loss 3.7677 (3.0700) grad_norm 2.7306 (2.7264) [2022-01-25 18:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][770/1251] eta 0:17:42 lr 0.000072 time 2.6210 (2.2091) loss 2.7973 (3.0705) grad_norm 
2.6646 (2.7240) [2022-01-25 18:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][780/1251] eta 0:17:19 lr 0.000072 time 1.5426 (2.2076) loss 3.2714 (3.0686) grad_norm 2.3818 (2.7255) [2022-01-25 18:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][790/1251] eta 0:16:58 lr 0.000072 time 1.8436 (2.2083) loss 2.7905 (3.0683) grad_norm 2.4385 (2.7238) [2022-01-25 18:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][800/1251] eta 0:16:35 lr 0.000072 time 1.5722 (2.2074) loss 2.3420 (3.0711) grad_norm 2.6958 (2.7227) [2022-01-25 18:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][810/1251] eta 0:16:13 lr 0.000072 time 1.9192 (2.2076) loss 2.8643 (3.0720) grad_norm 2.4756 (2.7256) [2022-01-25 18:10:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][820/1251] eta 0:15:50 lr 0.000072 time 1.7162 (2.2045) loss 2.6396 (3.0751) grad_norm 2.8892 (2.7297) [2022-01-25 18:10:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][830/1251] eta 0:15:27 lr 0.000072 time 1.8957 (2.2035) loss 3.5739 (3.0789) grad_norm 2.3605 (2.7300) [2022-01-25 18:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][840/1251] eta 0:15:05 lr 0.000072 time 2.0081 (2.2020) loss 3.4041 (3.0753) grad_norm 2.3671 (2.7289) [2022-01-25 18:11:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][850/1251] eta 0:14:43 lr 0.000072 time 2.3115 (2.2026) loss 3.4953 (3.0765) grad_norm 2.3316 (2.7292) [2022-01-25 18:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][860/1251] eta 0:14:21 lr 0.000072 time 2.1875 (2.2029) loss 3.2768 (3.0743) grad_norm 2.5347 (2.7290) [2022-01-25 18:12:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][870/1251] eta 0:13:58 lr 0.000072 time 1.8690 (2.2019) loss 3.3258 (3.0758) grad_norm 3.2261 (2.7298) [2022-01-25 18:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[251/300][880/1251] eta 0:13:37 lr 0.000072 time 2.2795 (2.2027) loss 3.3101 (3.0735) grad_norm 2.3964 (2.7306) [2022-01-25 18:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][890/1251] eta 0:13:15 lr 0.000072 time 2.4224 (2.2027) loss 3.4991 (3.0741) grad_norm 3.4517 (2.7317) [2022-01-25 18:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][900/1251] eta 0:12:53 lr 0.000072 time 2.8097 (2.2033) loss 2.1260 (3.0747) grad_norm 2.5616 (2.7317) [2022-01-25 18:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][910/1251] eta 0:12:31 lr 0.000072 time 1.9882 (2.2031) loss 2.2864 (3.0717) grad_norm 3.0065 (2.7326) [2022-01-25 18:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][920/1251] eta 0:12:09 lr 0.000072 time 1.9263 (2.2035) loss 3.0529 (3.0710) grad_norm 2.6687 (2.7304) [2022-01-25 18:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][930/1251] eta 0:11:47 lr 0.000072 time 2.5309 (2.2044) loss 3.0404 (3.0733) grad_norm 2.4991 (2.7286) [2022-01-25 18:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][940/1251] eta 0:11:25 lr 0.000072 time 2.1313 (2.2028) loss 3.9722 (3.0728) grad_norm 3.4880 (2.7284) [2022-01-25 18:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][950/1251] eta 0:11:02 lr 0.000072 time 2.4099 (2.2005) loss 3.5311 (3.0711) grad_norm 2.3716 (2.7287) [2022-01-25 18:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][960/1251] eta 0:10:39 lr 0.000072 time 1.7052 (2.1988) loss 3.0747 (3.0702) grad_norm 2.5468 (2.7289) [2022-01-25 18:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][970/1251] eta 0:10:18 lr 0.000072 time 2.4490 (2.2006) loss 3.6216 (3.0694) grad_norm 2.5811 (2.7288) [2022-01-25 18:16:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][980/1251] eta 0:09:56 lr 0.000072 time 2.2458 (2.2016) loss 2.4868 (3.0688) grad_norm 
3.2623 (2.7290) [2022-01-25 18:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][990/1251] eta 0:09:34 lr 0.000072 time 2.1740 (2.2029) loss 3.7067 (3.0714) grad_norm 2.8018 (2.7303) [2022-01-25 18:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1000/1251] eta 0:09:12 lr 0.000072 time 1.4735 (2.2012) loss 2.5148 (3.0702) grad_norm 2.7267 (2.7317) [2022-01-25 18:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1010/1251] eta 0:08:50 lr 0.000072 time 2.2704 (2.1995) loss 2.2736 (3.0702) grad_norm 2.4911 (2.7308) [2022-01-25 18:17:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1020/1251] eta 0:08:27 lr 0.000072 time 2.2597 (2.1986) loss 3.7817 (3.0690) grad_norm 2.8262 (2.7306) [2022-01-25 18:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1030/1251] eta 0:08:05 lr 0.000072 time 2.5917 (2.1981) loss 3.5772 (3.0679) grad_norm 2.6047 (2.7302) [2022-01-25 18:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1040/1251] eta 0:07:43 lr 0.000072 time 1.8816 (2.1970) loss 3.0258 (3.0693) grad_norm 2.4863 (2.7292) [2022-01-25 18:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1050/1251] eta 0:07:21 lr 0.000072 time 2.4350 (2.1976) loss 3.3150 (3.0691) grad_norm 2.3985 (2.7290) [2022-01-25 18:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1060/1251] eta 0:06:59 lr 0.000072 time 1.8677 (2.1969) loss 3.1049 (3.0725) grad_norm 2.6895 (2.7293) [2022-01-25 18:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1070/1251] eta 0:06:38 lr 0.000072 time 3.0765 (2.1994) loss 2.2998 (3.0723) grad_norm 2.4256 (2.7275) [2022-01-25 18:20:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1080/1251] eta 0:06:16 lr 0.000072 time 2.0952 (2.2019) loss 2.7975 (3.0719) grad_norm 2.3636 (2.7273) [2022-01-25 18:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [251/300][1090/1251] eta 0:05:54 lr 0.000072 time 2.4885 (2.2025) loss 3.2981 (3.0728) grad_norm 2.6779 (2.7272) [2022-01-25 18:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1100/1251] eta 0:05:32 lr 0.000072 time 2.1148 (2.2022) loss 2.6346 (3.0735) grad_norm 3.0824 (2.7276) [2022-01-25 18:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1110/1251] eta 0:05:10 lr 0.000072 time 3.0959 (2.2015) loss 3.2176 (3.0743) grad_norm 2.7841 (2.7276) [2022-01-25 18:21:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1120/1251] eta 0:04:48 lr 0.000071 time 1.8006 (2.1989) loss 3.6200 (3.0755) grad_norm 3.9472 (2.7320) [2022-01-25 18:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1130/1251] eta 0:04:26 lr 0.000071 time 2.1515 (2.1985) loss 2.9647 (3.0750) grad_norm 2.8938 (2.7321) [2022-01-25 18:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1140/1251] eta 0:04:04 lr 0.000071 time 2.5403 (2.1986) loss 2.7127 (3.0746) grad_norm 2.6242 (2.7337) [2022-01-25 18:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1150/1251] eta 0:03:42 lr 0.000071 time 2.4505 (2.1991) loss 3.2024 (3.0752) grad_norm 2.8101 (2.7328) [2022-01-25 18:22:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1160/1251] eta 0:03:20 lr 0.000071 time 1.5691 (2.1988) loss 1.9178 (3.0732) grad_norm 2.5503 (2.7316) [2022-01-25 18:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1170/1251] eta 0:02:58 lr 0.000071 time 2.4892 (2.2001) loss 3.4278 (3.0725) grad_norm 2.3280 (2.7307) [2022-01-25 18:23:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1180/1251] eta 0:02:36 lr 0.000071 time 2.6376 (2.1999) loss 3.4271 (3.0692) grad_norm 2.9007 (2.7312) [2022-01-25 18:23:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1190/1251] eta 0:02:14 lr 0.000071 time 2.5221 (2.1983) loss 3.1637 
(3.0698) grad_norm 2.7137 (2.7322) [2022-01-25 18:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1200/1251] eta 0:01:52 lr 0.000071 time 2.2124 (2.1964) loss 3.0059 (3.0699) grad_norm 2.9619 (2.7321) [2022-01-25 18:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1210/1251] eta 0:01:30 lr 0.000071 time 2.5716 (2.1956) loss 2.9131 (3.0671) grad_norm 2.8605 (2.7331) [2022-01-25 18:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1220/1251] eta 0:01:08 lr 0.000071 time 1.9055 (2.1947) loss 3.7702 (3.0684) grad_norm 2.4848 (2.7324) [2022-01-25 18:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1230/1251] eta 0:00:46 lr 0.000071 time 2.2038 (2.1949) loss 2.5650 (3.0679) grad_norm 2.4745 (2.7319) [2022-01-25 18:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1240/1251] eta 0:00:24 lr 0.000071 time 1.7488 (2.1936) loss 2.7295 (3.0667) grad_norm 2.3253 (2.7327) [2022-01-25 18:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [251/300][1250/1251] eta 0:00:02 lr 0.000071 time 1.1888 (2.1879) loss 3.7419 (3.0659) grad_norm 3.0234 (2.7329) [2022-01-25 18:25:58 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 251 training takes 0:45:37 [2022-01-25 18:26:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.919 (17.919) Loss 0.8003 (0.8003) Acc@1 79.492 (79.492) Acc@5 95.801 (95.801) [2022-01-25 18:26:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.915 (3.551) Loss 0.8680 (0.8414) Acc@1 79.492 (80.229) Acc@5 95.703 (95.250) [2022-01-25 18:26:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.619 (2.737) Loss 0.8449 (0.8420) Acc@1 80.664 (80.357) Acc@5 95.410 (95.192) [2022-01-25 18:27:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.306 (2.413) Loss 0.8645 (0.8458) Acc@1 80.469 (80.255) Acc@5 95.117 (95.171) [2022-01-25 18:27:29 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.634 (2.226) Loss 0.8950 (0.8426) Acc@1 79.688 (80.347) Acc@5 94.922 (95.177) [2022-01-25 18:27:36 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.410 Acc@5 95.180 [2022-01-25 18:27:36 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-01-25 18:27:36 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.41% [2022-01-25 18:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][0/1251] eta 7:36:02 lr 0.000071 time 21.8723 (21.8723) loss 3.3761 (3.3761) grad_norm 2.8983 (2.8983) [2022-01-25 18:28:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][10/1251] eta 1:26:42 lr 0.000071 time 3.0213 (4.1922) loss 3.3671 (2.9852) grad_norm 2.2079 (2.5704) [2022-01-25 18:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][20/1251] eta 1:04:28 lr 0.000071 time 1.2200 (3.1424) loss 2.2887 (3.0027) grad_norm 2.8244 (2.6060) [2022-01-25 18:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][30/1251] eta 0:59:13 lr 0.000071 time 1.8250 (2.9103) loss 2.6828 (3.0018) grad_norm 2.9863 (2.7003) [2022-01-25 18:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][40/1251] eta 0:55:54 lr 0.000071 time 3.9195 (2.7703) loss 3.1102 (3.0333) grad_norm 2.5701 (2.7119) [2022-01-25 18:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][50/1251] eta 0:53:31 lr 0.000071 time 2.9011 (2.6742) loss 2.5773 (2.9941) grad_norm 2.6168 (2.7017) [2022-01-25 18:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][60/1251] eta 0:51:41 lr 0.000071 time 1.7017 (2.6039) loss 2.7986 (3.0062) grad_norm 2.5339 (2.7051) [2022-01-25 18:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][70/1251] eta 0:49:48 lr 0.000071 time 1.5973 (2.5305) loss 3.7337 (2.9823) grad_norm 2.8245 (2.7149) [2022-01-25 18:30:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][80/1251] eta 0:48:20 lr 0.000071 time 2.1262 (2.4768) loss 3.4881 (3.0268) grad_norm 2.5800 (2.7101) [2022-01-25 18:31:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][90/1251] eta 0:46:52 lr 0.000071 time 2.1595 (2.4221) loss 3.3373 (3.0345) grad_norm 2.8882 (2.7143) [2022-01-25 18:31:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][100/1251] eta 0:45:58 lr 0.000071 time 1.9263 (2.3965) loss 3.4802 (3.0584) grad_norm 3.1407 (2.7207) [2022-01-25 18:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][110/1251] eta 0:44:59 lr 0.000071 time 1.8353 (2.3658) loss 2.1638 (3.0646) grad_norm 3.0411 (2.7323) [2022-01-25 18:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][120/1251] eta 0:44:27 lr 0.000071 time 2.5527 (2.3581) loss 3.4235 (3.0787) grad_norm 3.0441 (2.7355) [2022-01-25 18:32:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][130/1251] eta 0:43:49 lr 0.000071 time 1.7086 (2.3456) loss 2.6257 (3.0666) grad_norm 3.1290 (2.7506) [2022-01-25 18:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][140/1251] eta 0:43:23 lr 0.000071 time 1.8588 (2.3430) loss 3.4296 (3.0625) grad_norm 2.9957 (2.7428) [2022-01-25 18:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][150/1251] eta 0:42:55 lr 0.000071 time 1.6746 (2.3396) loss 3.5270 (3.0610) grad_norm 2.7700 (2.7507) [2022-01-25 18:33:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][160/1251] eta 0:42:18 lr 0.000071 time 2.3053 (2.3271) loss 3.2437 (3.0588) grad_norm 3.1953 (2.7581) [2022-01-25 18:34:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][170/1251] eta 0:41:40 lr 0.000071 time 1.9017 (2.3133) loss 3.1885 (3.0535) grad_norm 3.0911 (2.7505) [2022-01-25 18:34:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][180/1251] eta 0:41:11 lr 0.000071 
time 2.0952 (2.3075) loss 3.5273 (3.0695) grad_norm 2.2892 (2.7488) [2022-01-25 18:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][190/1251] eta 0:40:39 lr 0.000071 time 1.5876 (2.2992) loss 3.6936 (3.0752) grad_norm 2.7200 (2.7467) [2022-01-25 18:35:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][200/1251] eta 0:40:11 lr 0.000071 time 2.0888 (2.2948) loss 3.6002 (3.0781) grad_norm 2.9325 (2.7473) [2022-01-25 18:35:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][210/1251] eta 0:39:52 lr 0.000071 time 2.5320 (2.2984) loss 3.5099 (3.0763) grad_norm 2.5465 (2.7455) [2022-01-25 18:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][220/1251] eta 0:39:31 lr 0.000071 time 2.0707 (2.3001) loss 2.4305 (3.0771) grad_norm 3.2236 (2.7494) [2022-01-25 18:36:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][230/1251] eta 0:39:03 lr 0.000071 time 1.5165 (2.2953) loss 3.2937 (3.0679) grad_norm 2.7221 (2.7489) [2022-01-25 18:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][240/1251] eta 0:38:30 lr 0.000071 time 1.5755 (2.2855) loss 3.1181 (3.0637) grad_norm 4.1922 (2.7601) [2022-01-25 18:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][250/1251] eta 0:37:57 lr 0.000071 time 2.1505 (2.2749) loss 3.0319 (3.0571) grad_norm 2.4528 (2.7568) [2022-01-25 18:37:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][260/1251] eta 0:37:27 lr 0.000071 time 2.0915 (2.2681) loss 3.7104 (3.0607) grad_norm 2.9419 (2.7527) [2022-01-25 18:37:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][270/1251] eta 0:37:00 lr 0.000071 time 1.9318 (2.2640) loss 3.3652 (3.0650) grad_norm 2.3152 (2.7499) [2022-01-25 18:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][280/1251] eta 0:36:38 lr 0.000071 time 2.6214 (2.2637) loss 2.3123 (3.0663) grad_norm 2.4766 (2.7464) [2022-01-25 18:38:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][290/1251] eta 0:36:17 lr 0.000071 time 1.8040 (2.2663) loss 2.5241 (3.0676) grad_norm 2.9701 (2.7476) [2022-01-25 18:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][300/1251] eta 0:35:51 lr 0.000071 time 1.9307 (2.2627) loss 2.8970 (3.0722) grad_norm 2.7432 (2.7485) [2022-01-25 18:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][310/1251] eta 0:35:25 lr 0.000071 time 2.3146 (2.2591) loss 3.3665 (3.0800) grad_norm 2.6734 (2.7537) [2022-01-25 18:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][320/1251] eta 0:35:02 lr 0.000071 time 3.1866 (2.2582) loss 2.6225 (3.0854) grad_norm 2.5488 (2.7466) [2022-01-25 18:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][330/1251] eta 0:34:37 lr 0.000071 time 2.2440 (2.2559) loss 2.1447 (3.0801) grad_norm 2.5653 (2.7472) [2022-01-25 18:40:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][340/1251] eta 0:34:10 lr 0.000071 time 1.9174 (2.2511) loss 2.1896 (3.0826) grad_norm 2.3576 (2.7456) [2022-01-25 18:40:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][350/1251] eta 0:33:47 lr 0.000071 time 1.8480 (2.2505) loss 2.6603 (3.0831) grad_norm 3.4868 (2.7500) [2022-01-25 18:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][360/1251] eta 0:33:22 lr 0.000071 time 2.6973 (2.2471) loss 3.5817 (3.0812) grad_norm 2.5618 (2.7560) [2022-01-25 18:41:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][370/1251] eta 0:32:54 lr 0.000070 time 1.8378 (2.2409) loss 3.3690 (3.0739) grad_norm 2.4106 (2.7576) [2022-01-25 18:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][380/1251] eta 0:32:27 lr 0.000070 time 2.6074 (2.2357) loss 2.6234 (3.0735) grad_norm 2.7047 (2.7567) [2022-01-25 18:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][390/1251] eta 0:32:00 lr 
0.000070 time 1.8528 (2.2304) loss 3.7232 (3.0780) grad_norm 2.8259 (2.7561) [2022-01-25 18:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][400/1251] eta 0:31:35 lr 0.000070 time 2.1863 (2.2279) loss 3.0997 (3.0812) grad_norm 2.3540 (2.7539) [2022-01-25 18:42:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][410/1251] eta 0:31:14 lr 0.000070 time 2.5974 (2.2286) loss 2.2014 (3.0730) grad_norm 2.8534 (2.7576) [2022-01-25 18:43:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][420/1251] eta 0:30:51 lr 0.000070 time 2.9405 (2.2282) loss 3.4504 (3.0757) grad_norm 3.0190 (2.7544) [2022-01-25 18:43:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][430/1251] eta 0:30:27 lr 0.000070 time 2.1253 (2.2261) loss 2.0439 (3.0768) grad_norm 2.7100 (2.7522) [2022-01-25 18:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][440/1251] eta 0:30:03 lr 0.000070 time 2.1488 (2.2241) loss 3.4461 (3.0795) grad_norm 4.3755 (2.7537) [2022-01-25 18:44:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][450/1251] eta 0:29:42 lr 0.000070 time 2.3677 (2.2250) loss 3.1238 (3.0793) grad_norm 2.5478 (2.7544) [2022-01-25 18:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][460/1251] eta 0:29:22 lr 0.000070 time 3.3098 (2.2287) loss 3.7154 (3.0778) grad_norm 2.5977 (2.7560) [2022-01-25 18:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][470/1251] eta 0:28:59 lr 0.000070 time 2.3259 (2.2271) loss 3.3112 (3.0770) grad_norm 2.4641 (2.7505) [2022-01-25 18:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][480/1251] eta 0:28:36 lr 0.000070 time 1.9008 (2.2258) loss 3.3176 (3.0748) grad_norm 2.5371 (2.7485) [2022-01-25 18:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][490/1251] eta 0:28:17 lr 0.000070 time 3.0784 (2.2307) loss 3.5457 (3.0724) grad_norm 2.7703 (2.7487) [2022-01-25 18:46:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][500/1251] eta 0:27:54 lr 0.000070 time 2.4985 (2.2296) loss 2.8621 (3.0741) grad_norm 2.7233 (2.7554) [2022-01-25 18:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][510/1251] eta 0:27:30 lr 0.000070 time 2.1747 (2.2278) loss 3.5046 (3.0795) grad_norm 3.1573 (2.7555) [2022-01-25 18:46:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][520/1251] eta 0:27:05 lr 0.000070 time 1.9193 (2.2235) loss 3.5321 (3.0840) grad_norm 2.8384 (2.7537) [2022-01-25 18:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][530/1251] eta 0:26:43 lr 0.000070 time 3.4054 (2.2245) loss 2.8765 (3.0806) grad_norm 2.9894 (2.7533) [2022-01-25 18:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][540/1251] eta 0:26:21 lr 0.000070 time 2.1750 (2.2237) loss 3.2859 (3.0752) grad_norm 2.5778 (2.7513) [2022-01-25 18:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][550/1251] eta 0:25:58 lr 0.000070 time 2.2039 (2.2235) loss 3.5767 (3.0786) grad_norm 3.9791 (2.7539) [2022-01-25 18:48:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][560/1251] eta 0:25:34 lr 0.000070 time 1.7742 (2.2206) loss 3.6657 (3.0777) grad_norm 3.1571 (2.7547) [2022-01-25 18:48:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][570/1251] eta 0:25:13 lr 0.000070 time 2.6978 (2.2218) loss 2.4962 (3.0785) grad_norm 3.3410 (2.7571) [2022-01-25 18:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][580/1251] eta 0:24:49 lr 0.000070 time 2.1923 (2.2203) loss 3.3621 (3.0801) grad_norm 2.7350 (2.7562) [2022-01-25 18:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][590/1251] eta 0:24:25 lr 0.000070 time 2.1771 (2.2171) loss 2.7925 (3.0771) grad_norm 2.8155 (2.7553) [2022-01-25 18:49:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][600/1251] eta 0:24:02 lr 
0.000070 time 2.2312 (2.2157) loss 3.0451 (3.0780) grad_norm 2.7985 (2.7559) [2022-01-25 18:50:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][610/1251] eta 0:23:40 lr 0.000070 time 3.1286 (2.2155) loss 2.8229 (3.0800) grad_norm 3.6190 (2.7577) [2022-01-25 18:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][620/1251] eta 0:23:16 lr 0.000070 time 1.9977 (2.2133) loss 3.3287 (3.0775) grad_norm 3.2094 (2.7610) [2022-01-25 18:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][630/1251] eta 0:22:54 lr 0.000070 time 2.2236 (2.2130) loss 3.4424 (3.0735) grad_norm 2.5136 (2.7595) [2022-01-25 18:51:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][640/1251] eta 0:22:31 lr 0.000070 time 1.6354 (2.2122) loss 3.0963 (3.0768) grad_norm 2.8439 (2.7575) [2022-01-25 18:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][650/1251] eta 0:22:10 lr 0.000070 time 2.7814 (2.2133) loss 3.4967 (3.0780) grad_norm 2.6372 (2.7585) [2022-01-25 18:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][660/1251] eta 0:21:49 lr 0.000070 time 2.9147 (2.2166) loss 3.6778 (3.0759) grad_norm 3.2425 (2.7602) [2022-01-25 18:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][670/1251] eta 0:21:28 lr 0.000070 time 1.9485 (2.2180) loss 3.0848 (3.0771) grad_norm 2.7383 (2.7591) [2022-01-25 18:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][680/1251] eta 0:21:06 lr 0.000070 time 2.2440 (2.2178) loss 3.0856 (3.0759) grad_norm 2.5826 (2.7579) [2022-01-25 18:53:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][690/1251] eta 0:20:42 lr 0.000070 time 2.2854 (2.2148) loss 2.4622 (3.0753) grad_norm 2.3168 (2.7585) [2022-01-25 18:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][700/1251] eta 0:20:19 lr 0.000070 time 2.5075 (2.2127) loss 3.3809 (3.0734) grad_norm 2.6588 (2.7587) [2022-01-25 18:53:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][710/1251] eta 0:19:56 lr 0.000070 time 1.8842 (2.2117) loss 3.0273 (3.0763) grad_norm 2.9238 (2.7590) [2022-01-25 18:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][720/1251] eta 0:19:33 lr 0.000070 time 2.2717 (2.2097) loss 3.5105 (3.0780) grad_norm 2.5509 (2.7562) [2022-01-25 18:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][730/1251] eta 0:19:09 lr 0.000070 time 1.8605 (2.2069) loss 2.5879 (3.0743) grad_norm 2.6551 (2.7545) [2022-01-25 18:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][740/1251] eta 0:18:47 lr 0.000070 time 2.4392 (2.2062) loss 3.6825 (3.0756) grad_norm 2.7483 (2.7571) [2022-01-25 18:55:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][750/1251] eta 0:18:24 lr 0.000070 time 1.6087 (2.2045) loss 3.3613 (3.0793) grad_norm 2.7615 (2.7570) [2022-01-25 18:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][760/1251] eta 0:18:01 lr 0.000070 time 2.0931 (2.2031) loss 3.0023 (3.0763) grad_norm 3.2445 (2.7590) [2022-01-25 18:55:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][770/1251] eta 0:17:41 lr 0.000070 time 2.7804 (2.2065) loss 3.4734 (3.0793) grad_norm 2.3220 (2.7609) [2022-01-25 18:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][780/1251] eta 0:17:19 lr 0.000070 time 2.9696 (2.2063) loss 3.3451 (3.0829) grad_norm 2.4309 (2.7630) [2022-01-25 18:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][790/1251] eta 0:16:57 lr 0.000070 time 2.0973 (2.2061) loss 3.0332 (3.0824) grad_norm 2.7103 (2.7627) [2022-01-25 18:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][800/1251] eta 0:16:34 lr 0.000070 time 2.4531 (2.2061) loss 2.3921 (3.0832) grad_norm 2.3140 (2.7624) [2022-01-25 18:57:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][810/1251] eta 0:16:13 lr 
0.000070 time 1.8953 (2.2067) loss 3.5036 (3.0852) grad_norm 2.6692 (2.7621) [2022-01-25 18:57:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][820/1251] eta 0:15:51 lr 0.000070 time 3.0221 (2.2069) loss 2.8918 (3.0841) grad_norm 2.7252 (2.7615) [2022-01-25 18:58:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][830/1251] eta 0:15:28 lr 0.000070 time 1.6537 (2.2055) loss 2.8724 (3.0853) grad_norm 2.5595 (2.7598) [2022-01-25 18:58:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][840/1251] eta 0:15:06 lr 0.000070 time 2.5264 (2.2048) loss 3.6354 (3.0872) grad_norm 2.4012 (2.7593) [2022-01-25 18:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][850/1251] eta 0:14:43 lr 0.000070 time 1.4850 (2.2035) loss 3.7367 (3.0880) grad_norm 2.8782 (2.7593) [2022-01-25 18:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][860/1251] eta 0:14:21 lr 0.000070 time 2.5345 (2.2037) loss 2.8052 (3.0904) grad_norm 2.4655 (2.7580) [2022-01-25 18:59:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][870/1251] eta 0:13:59 lr 0.000070 time 2.4942 (2.2047) loss 3.2189 (3.0911) grad_norm 2.9272 (2.7566) [2022-01-25 18:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][880/1251] eta 0:13:38 lr 0.000069 time 2.5233 (2.2056) loss 3.5063 (3.0900) grad_norm 2.5201 (2.7540) [2022-01-25 19:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][890/1251] eta 0:13:15 lr 0.000069 time 1.8420 (2.2047) loss 2.3825 (3.0874) grad_norm 2.6830 (2.7539) [2022-01-25 19:00:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][900/1251] eta 0:12:53 lr 0.000069 time 2.7197 (2.2041) loss 3.2370 (3.0900) grad_norm 2.5544 (2.7529) [2022-01-25 19:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][910/1251] eta 0:12:30 lr 0.000069 time 1.9242 (2.2020) loss 3.0196 (3.0898) grad_norm 2.4643 (2.7513) [2022-01-25 19:01:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][920/1251] eta 0:12:08 lr 0.000069 time 1.8460 (2.2009) loss 3.2590 (3.0911) grad_norm 2.5471 (2.7497) [2022-01-25 19:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][930/1251] eta 0:11:46 lr 0.000069 time 2.0656 (2.2008) loss 3.6921 (3.0902) grad_norm 2.9970 (2.7492) [2022-01-25 19:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][940/1251] eta 0:11:24 lr 0.000069 time 2.2811 (2.2007) loss 3.1330 (3.0901) grad_norm 2.5653 (2.7486) [2022-01-25 19:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][950/1251] eta 0:11:02 lr 0.000069 time 2.2053 (2.2012) loss 2.9254 (3.0899) grad_norm 2.5686 (2.7475) [2022-01-25 19:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][960/1251] eta 0:10:40 lr 0.000069 time 2.3406 (2.2004) loss 3.0919 (3.0901) grad_norm 2.6154 (2.7463) [2022-01-25 19:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][970/1251] eta 0:10:18 lr 0.000069 time 2.2140 (2.1994) loss 2.8373 (3.0906) grad_norm 2.8058 (2.7455) [2022-01-25 19:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][980/1251] eta 0:09:55 lr 0.000069 time 2.2566 (2.1977) loss 3.7495 (3.0919) grad_norm 2.4637 (2.7441) [2022-01-25 19:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][990/1251] eta 0:09:33 lr 0.000069 time 2.7629 (2.1986) loss 3.5659 (3.0906) grad_norm 2.8230 (2.7426) [2022-01-25 19:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1000/1251] eta 0:09:11 lr 0.000069 time 2.2821 (2.1988) loss 2.6508 (3.0896) grad_norm 2.2753 (2.7410) [2022-01-25 19:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1010/1251] eta 0:08:50 lr 0.000069 time 3.0232 (2.2003) loss 3.3207 (3.0883) grad_norm 2.8921 (2.7399) [2022-01-25 19:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1020/1251] eta 0:08:28 lr 
0.000069 time 2.1891 (2.2009) loss 3.4235 (3.0898) grad_norm 2.7406 (2.7398) [2022-01-25 19:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1030/1251] eta 0:08:06 lr 0.000069 time 2.3675 (2.2005) loss 3.4152 (3.0908) grad_norm 2.4313 (2.7398) [2022-01-25 19:05:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1040/1251] eta 0:07:44 lr 0.000069 time 2.1863 (2.2001) loss 3.3070 (3.0902) grad_norm 2.9788 (2.7400) [2022-01-25 19:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1050/1251] eta 0:07:21 lr 0.000069 time 2.0990 (2.1983) loss 2.7943 (3.0910) grad_norm 3.0427 (2.7409) [2022-01-25 19:06:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1060/1251] eta 0:06:59 lr 0.000069 time 2.1974 (2.1960) loss 3.6256 (3.0894) grad_norm 3.0987 (2.7412) [2022-01-25 19:06:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1070/1251] eta 0:06:37 lr 0.000069 time 1.9518 (2.1946) loss 3.0321 (3.0920) grad_norm 2.4556 (2.7416) [2022-01-25 19:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1080/1251] eta 0:06:15 lr 0.000069 time 1.9830 (2.1954) loss 2.3478 (3.0919) grad_norm 2.8932 (2.7416) [2022-01-25 19:07:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1090/1251] eta 0:05:53 lr 0.000069 time 2.1099 (2.1971) loss 2.9664 (3.0941) grad_norm 2.5737 (2.7409) [2022-01-25 19:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1100/1251] eta 0:05:32 lr 0.000069 time 3.7382 (2.1997) loss 2.3361 (3.0936) grad_norm 2.3423 (2.7401) [2022-01-25 19:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1110/1251] eta 0:05:10 lr 0.000069 time 2.0947 (2.2011) loss 2.6878 (3.0915) grad_norm 2.9054 (2.7408) [2022-01-25 19:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1120/1251] eta 0:04:48 lr 0.000069 time 1.6187 (2.2005) loss 2.7664 (3.0911) grad_norm 2.8873 (2.7416) [2022-01-25 
19:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1130/1251] eta 0:04:26 lr 0.000069 time 2.0427 (2.1997) loss 3.4053 (3.0912) grad_norm 2.8378 (2.7414) [2022-01-25 19:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1140/1251] eta 0:04:04 lr 0.000069 time 3.3820 (2.1999) loss 3.4001 (3.0915) grad_norm 2.5639 (2.7416) [2022-01-25 19:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1150/1251] eta 0:03:42 lr 0.000069 time 2.0891 (2.1991) loss 3.4775 (3.0920) grad_norm 2.9164 (2.7415) [2022-01-25 19:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1160/1251] eta 0:03:20 lr 0.000069 time 1.7218 (2.1990) loss 3.1612 (3.0914) grad_norm 2.5314 (2.7404) [2022-01-25 19:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1170/1251] eta 0:02:58 lr 0.000069 time 2.1067 (2.1991) loss 2.8150 (3.0912) grad_norm 2.5851 (2.7415) [2022-01-25 19:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1180/1251] eta 0:02:36 lr 0.000069 time 3.6624 (2.2005) loss 2.7623 (3.0908) grad_norm 3.0995 (2.7465) [2022-01-25 19:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1190/1251] eta 0:02:14 lr 0.000069 time 2.2337 (2.1999) loss 2.7243 (3.0893) grad_norm 2.5431 (2.7500) [2022-01-25 19:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1200/1251] eta 0:01:52 lr 0.000069 time 1.8623 (2.1997) loss 2.8583 (3.0898) grad_norm 2.7974 (2.7510) [2022-01-25 19:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1210/1251] eta 0:01:30 lr 0.000069 time 2.2494 (2.1993) loss 2.2038 (3.0862) grad_norm 2.8635 (2.7510) [2022-01-25 19:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1220/1251] eta 0:01:08 lr 0.000069 time 3.0780 (2.1993) loss 2.1510 (3.0852) grad_norm 2.6476 (2.7498) [2022-01-25 19:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1230/1251] 
eta 0:00:46 lr 0.000069 time 2.5452 (2.1985) loss 3.2114 (3.0857) grad_norm 3.2390 (2.7503) [2022-01-25 19:13:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1240/1251] eta 0:00:24 lr 0.000069 time 1.3348 (2.1956) loss 3.0522 (3.0852) grad_norm 3.0220 (2.7501) [2022-01-25 19:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [252/300][1250/1251] eta 0:00:02 lr 0.000069 time 1.1747 (2.1903) loss 3.6735 (3.0869) grad_norm 2.7134 (2.7498) [2022-01-25 19:13:17 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 252 training takes 0:45:40 [2022-01-25 19:13:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.975 (18.975) Loss 0.7688 (0.7688) Acc@1 82.520 (82.520) Acc@5 96.484 (96.484) [2022-01-25 19:13:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.496 (3.226) Loss 0.8804 (0.8464) Acc@1 79.688 (80.060) Acc@5 94.727 (95.357) [2022-01-25 19:14:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.981 (2.630) Loss 0.9027 (0.8399) Acc@1 78.809 (80.199) Acc@5 93.555 (95.322) [2022-01-25 19:14:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.612 (2.280) Loss 0.7859 (0.8362) Acc@1 81.641 (80.491) Acc@5 95.898 (95.300) [2022-01-25 19:14:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.185 (2.217) Loss 0.8245 (0.8417) Acc@1 81.641 (80.331) Acc@5 95.605 (95.255) [2022-01-25 19:14:54 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.410 Acc@5 95.276 [2022-01-25 19:14:54 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-01-25 19:14:54 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.41% [2022-01-25 19:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][0/1251] eta 7:34:40 lr 0.000069 time 21.8071 (21.8071) loss 3.7644 (3.7644) grad_norm 3.0354 (3.0354) [2022-01-25 19:15:40 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [253/300][10/1251] eta 1:26:04 lr 0.000069 time 1.6353 (4.1616) loss 3.1139 (2.9195) grad_norm 2.6635 (2.6827) [2022-01-25 19:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][20/1251] eta 1:06:01 lr 0.000069 time 1.6049 (3.2177) loss 3.6098 (2.9919) grad_norm 2.6035 (2.7036) [2022-01-25 19:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][30/1251] eta 0:59:17 lr 0.000069 time 2.0826 (2.9135) loss 3.6321 (3.0997) grad_norm 3.8988 (2.6769) [2022-01-25 19:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][40/1251] eta 0:56:15 lr 0.000069 time 3.6146 (2.7871) loss 2.7783 (3.0482) grad_norm 2.4736 (2.7080) [2022-01-25 19:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][50/1251] eta 0:53:58 lr 0.000069 time 2.1734 (2.6965) loss 3.7836 (3.0307) grad_norm 2.9020 (2.7090) [2022-01-25 19:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][60/1251] eta 0:52:29 lr 0.000069 time 2.4096 (2.6442) loss 3.4872 (3.0657) grad_norm 2.8739 (2.7057) [2022-01-25 19:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][70/1251] eta 0:50:40 lr 0.000069 time 1.8470 (2.5749) loss 2.5268 (3.0585) grad_norm 2.6637 (2.7130) [2022-01-25 19:18:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][80/1251] eta 0:49:23 lr 0.000069 time 2.9353 (2.5304) loss 3.0554 (3.0373) grad_norm 2.5361 (2.7021) [2022-01-25 19:18:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][90/1251] eta 0:47:34 lr 0.000069 time 1.6100 (2.4589) loss 2.2266 (3.0443) grad_norm 2.5386 (2.6856) [2022-01-25 19:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][100/1251] eta 0:46:20 lr 0.000069 time 2.1877 (2.4157) loss 3.5131 (3.0280) grad_norm 2.2853 (2.6775) [2022-01-25 19:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][110/1251] eta 0:45:25 lr 0.000069 time 1.9490 (2.3883) loss 3.3580 (3.0287) grad_norm 
2.4144 (2.6698) [2022-01-25 19:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][120/1251] eta 0:44:40 lr 0.000069 time 2.4776 (2.3700) loss 2.9891 (3.0189) grad_norm 2.6635 (2.6743) [2022-01-25 19:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][130/1251] eta 0:44:06 lr 0.000069 time 2.5341 (2.3609) loss 3.6164 (3.0275) grad_norm 3.0934 (2.6839) [2022-01-25 19:20:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][140/1251] eta 0:43:34 lr 0.000068 time 1.6226 (2.3535) loss 3.2252 (3.0333) grad_norm 2.6506 (2.6983) [2022-01-25 19:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][150/1251] eta 0:43:02 lr 0.000068 time 2.5718 (2.3451) loss 2.1181 (3.0326) grad_norm 3.0123 (2.7020) [2022-01-25 19:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][160/1251] eta 0:42:34 lr 0.000068 time 2.4134 (2.3411) loss 3.3334 (3.0402) grad_norm 2.4684 (2.7027) [2022-01-25 19:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][170/1251] eta 0:42:08 lr 0.000068 time 1.8729 (2.3392) loss 3.6705 (3.0392) grad_norm 2.4558 (2.7008) [2022-01-25 19:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][180/1251] eta 0:41:38 lr 0.000068 time 1.8129 (2.3329) loss 3.1517 (3.0401) grad_norm 2.2583 (2.6861) [2022-01-25 19:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][190/1251] eta 0:41:05 lr 0.000068 time 2.5554 (2.3237) loss 3.1424 (3.0531) grad_norm 2.5067 (2.6863) [2022-01-25 19:22:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][200/1251] eta 0:40:32 lr 0.000068 time 2.7963 (2.3148) loss 2.1062 (3.0464) grad_norm 2.5748 (2.6841) [2022-01-25 19:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][210/1251] eta 0:40:05 lr 0.000068 time 1.8648 (2.3104) loss 3.2001 (3.0506) grad_norm 2.7104 (2.6853) [2022-01-25 19:23:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[253/300][220/1251] eta 0:39:29 lr 0.000068 time 1.8038 (2.2979) loss 2.2019 (3.0537) grad_norm 2.6826 (2.6794) [2022-01-25 19:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][230/1251] eta 0:38:59 lr 0.000068 time 2.5320 (2.2915) loss 3.5566 (3.0546) grad_norm 2.8983 (2.6779) [2022-01-25 19:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][240/1251] eta 0:38:36 lr 0.000068 time 3.1264 (2.2910) loss 3.5312 (3.0636) grad_norm 2.2730 (2.6729) [2022-01-25 19:24:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][250/1251] eta 0:38:07 lr 0.000068 time 1.7809 (2.2852) loss 3.1681 (3.0528) grad_norm 2.3699 (2.6774) [2022-01-25 19:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][260/1251] eta 0:37:42 lr 0.000068 time 2.6111 (2.2830) loss 3.6404 (3.0587) grad_norm 2.7059 (2.6819) [2022-01-25 19:25:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][270/1251] eta 0:37:12 lr 0.000068 time 2.1923 (2.2754) loss 3.4940 (3.0658) grad_norm 2.5115 (2.6805) [2022-01-25 19:25:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][280/1251] eta 0:36:44 lr 0.000068 time 2.5483 (2.2706) loss 3.1482 (3.0619) grad_norm 2.6989 (2.6769) [2022-01-25 19:25:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][290/1251] eta 0:36:18 lr 0.000068 time 2.5957 (2.2672) loss 3.1716 (3.0641) grad_norm 3.0232 (2.6843) [2022-01-25 19:26:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][300/1251] eta 0:35:57 lr 0.000068 time 2.9680 (2.2684) loss 3.4504 (3.0670) grad_norm 2.7697 (2.6877) [2022-01-25 19:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][310/1251] eta 0:35:31 lr 0.000068 time 1.5882 (2.2654) loss 3.3932 (3.0648) grad_norm 2.5858 (2.6909) [2022-01-25 19:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][320/1251] eta 0:35:05 lr 0.000068 time 2.7905 (2.2621) loss 3.3476 (3.0653) grad_norm 
2.7444 (2.6895) [2022-01-25 19:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][330/1251] eta 0:34:35 lr 0.000068 time 1.8320 (2.2535) loss 2.8324 (3.0577) grad_norm 4.0518 (2.6934) [2022-01-25 19:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][340/1251] eta 0:34:09 lr 0.000068 time 1.8641 (2.2497) loss 3.8103 (3.0620) grad_norm 2.7315 (2.6944) [2022-01-25 19:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][350/1251] eta 0:33:47 lr 0.000068 time 2.1754 (2.2507) loss 2.0044 (3.0593) grad_norm 3.0281 (2.6934) [2022-01-25 19:28:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][360/1251] eta 0:33:20 lr 0.000068 time 2.2706 (2.2450) loss 3.2044 (3.0686) grad_norm 2.8748 (2.6962) [2022-01-25 19:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][370/1251] eta 0:32:56 lr 0.000068 time 1.8602 (2.2437) loss 2.0885 (3.0734) grad_norm 3.9536 (2.7014) [2022-01-25 19:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][380/1251] eta 0:32:32 lr 0.000068 time 1.8818 (2.2412) loss 2.4296 (3.0701) grad_norm 2.5672 (2.6994) [2022-01-25 19:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][390/1251] eta 0:32:09 lr 0.000068 time 1.6681 (2.2414) loss 2.9724 (3.0676) grad_norm 2.4829 (2.7033) [2022-01-25 19:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][400/1251] eta 0:31:45 lr 0.000068 time 2.5487 (2.2392) loss 3.2596 (3.0660) grad_norm 2.9080 (2.7054) [2022-01-25 19:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][410/1251] eta 0:31:21 lr 0.000068 time 1.9222 (2.2376) loss 3.8561 (3.0710) grad_norm 3.3976 (2.7125) [2022-01-25 19:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][420/1251] eta 0:31:00 lr 0.000068 time 2.1110 (2.2388) loss 2.4580 (3.0690) grad_norm 2.6215 (2.7147) [2022-01-25 19:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[253/300][430/1251] eta 0:30:40 lr 0.000068 time 1.9491 (2.2414) loss 2.3687 (3.0677) grad_norm 2.8422 (2.7181) [2022-01-25 19:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][440/1251] eta 0:30:17 lr 0.000068 time 2.1292 (2.2405) loss 3.6610 (3.0696) grad_norm 2.3091 (2.7239) [2022-01-25 19:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][450/1251] eta 0:29:55 lr 0.000068 time 3.0407 (2.2413) loss 3.8980 (3.0722) grad_norm 3.1088 (2.7260) [2022-01-25 19:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][460/1251] eta 0:29:28 lr 0.000068 time 1.9546 (2.2357) loss 3.3357 (3.0756) grad_norm 2.6878 (2.7230) [2022-01-25 19:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][470/1251] eta 0:29:04 lr 0.000068 time 2.2581 (2.2331) loss 3.0207 (3.0792) grad_norm 2.6999 (2.7267) [2022-01-25 19:32:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][480/1251] eta 0:28:38 lr 0.000068 time 1.5677 (2.2286) loss 3.4069 (3.0769) grad_norm 2.3310 (2.7239) [2022-01-25 19:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][490/1251] eta 0:28:18 lr 0.000068 time 3.5614 (2.2314) loss 2.4202 (3.0807) grad_norm 2.4314 (2.7227) [2022-01-25 19:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][500/1251] eta 0:27:54 lr 0.000068 time 1.6644 (2.2300) loss 3.6746 (3.0802) grad_norm 2.4143 (2.7214) [2022-01-25 19:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][510/1251] eta 0:27:30 lr 0.000068 time 1.8243 (2.2274) loss 3.3530 (3.0819) grad_norm 2.7373 (2.7215) [2022-01-25 19:34:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][520/1251] eta 0:27:06 lr 0.000068 time 2.1088 (2.2253) loss 3.1586 (3.0851) grad_norm 2.6628 (2.7216) [2022-01-25 19:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][530/1251] eta 0:26:44 lr 0.000068 time 3.2486 (2.2254) loss 3.6690 (3.0809) grad_norm 
3.1390 (2.7215) [2022-01-25 19:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][540/1251] eta 0:26:22 lr 0.000068 time 2.5249 (2.2254) loss 3.0590 (3.0778) grad_norm 2.8240 (2.7255) [2022-01-25 19:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][550/1251] eta 0:25:59 lr 0.000068 time 2.1083 (2.2249) loss 3.1271 (3.0853) grad_norm 3.1655 (2.7306) [2022-01-25 19:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][560/1251] eta 0:25:36 lr 0.000068 time 2.2986 (2.2238) loss 3.4013 (3.0870) grad_norm 2.8965 (2.7312) [2022-01-25 19:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][570/1251] eta 0:25:14 lr 0.000068 time 2.7037 (2.2240) loss 3.4216 (3.0869) grad_norm 3.2262 (2.7315) [2022-01-25 19:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][580/1251] eta 0:24:50 lr 0.000068 time 1.5979 (2.2218) loss 3.6129 (3.0914) grad_norm 2.5985 (2.7330) [2022-01-25 19:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][590/1251] eta 0:24:27 lr 0.000068 time 2.2057 (2.2206) loss 3.3990 (3.0925) grad_norm 3.3433 (2.7383) [2022-01-25 19:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][600/1251] eta 0:24:05 lr 0.000068 time 2.6308 (2.2205) loss 3.3805 (3.0929) grad_norm 2.4676 (2.7373) [2022-01-25 19:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][610/1251] eta 0:23:43 lr 0.000068 time 2.3284 (2.2206) loss 3.6413 (3.0927) grad_norm 2.3935 (2.7372) [2022-01-25 19:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][620/1251] eta 0:23:20 lr 0.000068 time 1.8735 (2.2195) loss 3.5466 (3.0932) grad_norm 3.0997 (2.7380) [2022-01-25 19:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][630/1251] eta 0:22:56 lr 0.000068 time 1.9889 (2.2174) loss 3.3172 (3.0931) grad_norm 2.4917 (2.7410) [2022-01-25 19:38:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[253/300][640/1251] eta 0:22:33 lr 0.000068 time 2.3673 (2.2158) loss 3.3394 (3.0955) grad_norm 3.1597 (2.7416) [2022-01-25 19:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][650/1251] eta 0:22:10 lr 0.000067 time 1.9097 (2.2141) loss 3.2717 (3.0930) grad_norm 2.4754 (2.7484) [2022-01-25 19:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][660/1251] eta 0:21:47 lr 0.000067 time 1.8986 (2.2127) loss 3.3571 (3.0947) grad_norm 2.4477 (2.7521) [2022-01-25 19:39:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][670/1251] eta 0:21:25 lr 0.000067 time 1.7314 (2.2123) loss 3.0056 (3.0963) grad_norm 2.8630 (2.7533) [2022-01-25 19:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][680/1251] eta 0:21:05 lr 0.000067 time 3.6895 (2.2168) loss 3.1247 (3.0993) grad_norm 2.6924 (2.7523) [2022-01-25 19:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][690/1251] eta 0:20:43 lr 0.000067 time 1.9012 (2.2159) loss 3.4710 (3.1008) grad_norm 2.5163 (2.7517) [2022-01-25 19:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][700/1251] eta 0:20:21 lr 0.000067 time 2.1928 (2.2163) loss 2.9011 (3.0997) grad_norm 3.6662 (2.7514) [2022-01-25 19:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][710/1251] eta 0:19:57 lr 0.000067 time 1.5841 (2.2141) loss 2.2058 (3.0979) grad_norm 2.8157 (2.7522) [2022-01-25 19:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][720/1251] eta 0:19:35 lr 0.000067 time 2.6955 (2.2142) loss 3.3992 (3.0957) grad_norm 2.5679 (2.7521) [2022-01-25 19:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][730/1251] eta 0:19:12 lr 0.000067 time 1.8457 (2.2130) loss 2.6453 (3.0925) grad_norm 4.2693 (2.7541) [2022-01-25 19:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][740/1251] eta 0:18:50 lr 0.000067 time 2.8671 (2.2124) loss 3.2662 (3.0913) grad_norm 
2.8171 (2.7553) [2022-01-25 19:42:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][750/1251] eta 0:18:27 lr 0.000067 time 2.0338 (2.2112) loss 3.1954 (3.0941) grad_norm 4.0589 (2.7557) [2022-01-25 19:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][760/1251] eta 0:18:05 lr 0.000067 time 1.9521 (2.2099) loss 3.4071 (3.0874) grad_norm 2.3868 (2.7540) [2022-01-25 19:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][770/1251] eta 0:17:43 lr 0.000067 time 1.8921 (2.2105) loss 3.3594 (3.0879) grad_norm 2.3845 (2.7549) [2022-01-25 19:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][780/1251] eta 0:17:20 lr 0.000067 time 2.0197 (2.2102) loss 3.3375 (3.0865) grad_norm 2.9630 (2.7577) [2022-01-25 19:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][790/1251] eta 0:16:57 lr 0.000067 time 2.0049 (2.2081) loss 3.4380 (3.0894) grad_norm 2.6322 (2.7558) [2022-01-25 19:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][800/1251] eta 0:16:35 lr 0.000067 time 1.9743 (2.2066) loss 2.4109 (3.0880) grad_norm 2.8018 (2.7543) [2022-01-25 19:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][810/1251] eta 0:16:13 lr 0.000067 time 1.8713 (2.2081) loss 3.6625 (3.0890) grad_norm 2.8356 (2.7536) [2022-01-25 19:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][820/1251] eta 0:15:52 lr 0.000067 time 1.9718 (2.2092) loss 2.9925 (3.0900) grad_norm 2.4213 (2.7521) [2022-01-25 19:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][830/1251] eta 0:15:30 lr 0.000067 time 2.2264 (2.2091) loss 3.4440 (3.0920) grad_norm 2.7278 (2.7518) [2022-01-25 19:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][840/1251] eta 0:15:07 lr 0.000067 time 2.6163 (2.2081) loss 2.6807 (3.0924) grad_norm 2.4976 (2.7508) [2022-01-25 19:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[253/300][850/1251] eta 0:14:45 lr 0.000067 time 2.4851 (2.2093) loss 3.6163 (3.0959) grad_norm 2.8144 (2.7518) [2022-01-25 19:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][860/1251] eta 0:14:24 lr 0.000067 time 2.4983 (2.2107) loss 2.5568 (3.0936) grad_norm 2.4295 (2.7522) [2022-01-25 19:46:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][870/1251] eta 0:14:01 lr 0.000067 time 3.1220 (2.2099) loss 3.6532 (3.0926) grad_norm 2.6637 (2.7497) [2022-01-25 19:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][880/1251] eta 0:13:39 lr 0.000067 time 2.5715 (2.2091) loss 2.7693 (3.0929) grad_norm 2.6775 (2.7495) [2022-01-25 19:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][890/1251] eta 0:13:16 lr 0.000067 time 1.7678 (2.2077) loss 2.7578 (3.0909) grad_norm 2.5047 (2.7488) [2022-01-25 19:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][900/1251] eta 0:12:55 lr 0.000067 time 2.0637 (2.2080) loss 2.4064 (3.0897) grad_norm 2.4383 (2.7478) [2022-01-25 19:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][910/1251] eta 0:12:32 lr 0.000067 time 2.2035 (2.2059) loss 3.7602 (3.0875) grad_norm 2.9073 (2.7477) [2022-01-25 19:48:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][920/1251] eta 0:12:09 lr 0.000067 time 2.1999 (2.2050) loss 3.1232 (3.0846) grad_norm 2.2641 (2.7462) [2022-01-25 19:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][930/1251] eta 0:11:47 lr 0.000067 time 1.9609 (2.2029) loss 2.0947 (3.0822) grad_norm 3.1093 (2.7467) [2022-01-25 19:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][940/1251] eta 0:11:24 lr 0.000067 time 1.9232 (2.2012) loss 3.4945 (3.0813) grad_norm 2.6619 (2.7484) [2022-01-25 19:49:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][950/1251] eta 0:11:02 lr 0.000067 time 1.8250 (2.2003) loss 3.1342 (3.0822) grad_norm 
2.2473 (2.7492) [2022-01-25 19:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][960/1251] eta 0:10:39 lr 0.000067 time 1.8933 (2.1992) loss 2.3168 (3.0817) grad_norm 2.2854 (2.7481) [2022-01-25 19:50:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][970/1251] eta 0:10:17 lr 0.000067 time 2.1578 (2.1980) loss 3.7269 (3.0820) grad_norm 3.2047 (2.7479) [2022-01-25 19:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][980/1251] eta 0:09:56 lr 0.000067 time 1.7119 (2.1995) loss 3.6979 (3.0846) grad_norm 2.8530 (2.7472) [2022-01-25 19:51:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][990/1251] eta 0:09:34 lr 0.000067 time 1.8670 (2.1998) loss 3.6993 (3.0855) grad_norm 2.5885 (2.7474) [2022-01-25 19:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1000/1251] eta 0:09:12 lr 0.000067 time 2.2297 (2.1994) loss 2.0425 (3.0833) grad_norm 2.5546 (2.7464) [2022-01-25 19:51:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1010/1251] eta 0:08:50 lr 0.000067 time 2.6423 (2.2003) loss 3.6919 (3.0828) grad_norm 2.5392 (2.7450) [2022-01-25 19:52:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1020/1251] eta 0:08:28 lr 0.000067 time 2.1700 (2.2003) loss 3.1845 (3.0836) grad_norm 2.3223 (2.7443) [2022-01-25 19:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1030/1251] eta 0:08:06 lr 0.000067 time 2.5453 (2.2007) loss 3.4914 (3.0845) grad_norm 2.8682 (2.7446) [2022-01-25 19:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1040/1251] eta 0:07:44 lr 0.000067 time 2.8338 (2.2019) loss 3.1904 (3.0861) grad_norm 3.1615 (2.7427) [2022-01-25 19:53:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1050/1251] eta 0:07:22 lr 0.000067 time 2.2152 (2.2014) loss 2.0769 (3.0861) grad_norm 2.5937 (2.7420) [2022-01-25 19:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[253/300][1060/1251] eta 0:07:00 lr 0.000067 time 1.5861 (2.2001) loss 2.4309 (3.0844) grad_norm 3.3501 (2.7413) [2022-01-25 19:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1070/1251] eta 0:06:37 lr 0.000067 time 1.6678 (2.1987) loss 3.6449 (3.0836) grad_norm 2.8285 (2.7402) [2022-01-25 19:54:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1080/1251] eta 0:06:15 lr 0.000067 time 2.0279 (2.1980) loss 2.9245 (3.0828) grad_norm 2.4642 (2.7390) [2022-01-25 19:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1090/1251] eta 0:05:53 lr 0.000067 time 2.2444 (2.1981) loss 3.4736 (3.0823) grad_norm 2.2878 (2.7411) [2022-01-25 19:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1100/1251] eta 0:05:31 lr 0.000067 time 1.9083 (2.1976) loss 3.2312 (3.0834) grad_norm 2.4865 (2.7428) [2022-01-25 19:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1110/1251] eta 0:05:09 lr 0.000067 time 2.2045 (2.1964) loss 2.4316 (3.0814) grad_norm 3.1801 (2.7444) [2022-01-25 19:55:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1120/1251] eta 0:04:47 lr 0.000067 time 2.7519 (2.1973) loss 3.3085 (3.0808) grad_norm 2.7291 (2.7458) [2022-01-25 19:56:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1130/1251] eta 0:04:26 lr 0.000067 time 2.9245 (2.1989) loss 3.3364 (3.0805) grad_norm 2.9097 (2.7462) [2022-01-25 19:56:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1140/1251] eta 0:04:04 lr 0.000067 time 1.8025 (2.1991) loss 2.9729 (3.0780) grad_norm 2.5323 (2.7471) [2022-01-25 19:57:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1150/1251] eta 0:03:42 lr 0.000067 time 1.9003 (2.1989) loss 2.2589 (3.0773) grad_norm 2.7798 (2.7462) [2022-01-25 19:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1160/1251] eta 0:03:20 lr 0.000067 time 1.8671 (2.1989) loss 2.3910 (3.0766) 
grad_norm 2.5799 (2.7459) [2022-01-25 19:57:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1170/1251] eta 0:02:58 lr 0.000066 time 1.6486 (2.1984) loss 3.5455 (3.0758) grad_norm 3.4827 (2.7469) [2022-01-25 19:58:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1180/1251] eta 0:02:36 lr 0.000066 time 2.4398 (2.1982) loss 3.0076 (3.0770) grad_norm 2.5884 (2.7491) [2022-01-25 19:58:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1190/1251] eta 0:02:14 lr 0.000066 time 1.9399 (2.1982) loss 2.1049 (3.0743) grad_norm 2.6490 (2.7494) [2022-01-25 19:58:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1200/1251] eta 0:01:52 lr 0.000066 time 1.7727 (2.1988) loss 2.6713 (3.0744) grad_norm 2.6043 (2.7491) [2022-01-25 19:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1210/1251] eta 0:01:30 lr 0.000066 time 2.6891 (2.2002) loss 3.4396 (3.0735) grad_norm 2.3340 (2.7472) [2022-01-25 19:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1220/1251] eta 0:01:08 lr 0.000066 time 1.7911 (2.2003) loss 1.9606 (3.0718) grad_norm 3.0120 (2.7477) [2022-01-25 20:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1230/1251] eta 0:00:46 lr 0.000066 time 1.6869 (2.1989) loss 3.3801 (3.0732) grad_norm 2.4982 (2.7478) [2022-01-25 20:00:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1240/1251] eta 0:00:24 lr 0.000066 time 1.2607 (2.1963) loss 3.4766 (3.0718) grad_norm 3.4724 (2.7473) [2022-01-25 20:00:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [253/300][1250/1251] eta 0:00:02 lr 0.000066 time 1.2106 (2.1903) loss 3.3700 (3.0709) grad_norm 2.6495 (2.7463) [2022-01-25 20:00:35 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 253 training takes 0:45:40 [2022-01-25 20:00:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.740 (20.740) Loss 0.7796 (0.7796) Acc@1 80.762 (80.762) 
Acc@5 96.973 (96.973) [2022-01-25 20:01:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.606 (3.330) Loss 0.8375 (0.8146) Acc@1 79.883 (81.037) Acc@5 94.922 (95.277) [2022-01-25 20:01:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.602 (2.562) Loss 0.8559 (0.8194) Acc@1 79.883 (80.780) Acc@5 95.410 (95.387) [2022-01-25 20:01:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.600 (2.311) Loss 0.9164 (0.8360) Acc@1 79.492 (80.516) Acc@5 94.629 (95.253) [2022-01-25 20:02:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.877 (2.173) Loss 0.8450 (0.8389) Acc@1 79.590 (80.428) Acc@5 95.312 (95.179) [2022-01-25 20:02:12 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.422 Acc@5 95.270 [2022-01-25 20:02:12 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-01-25 20:02:12 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.42% [2022-01-25 20:02:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][0/1251] eta 7:30:20 lr 0.000066 time 21.5994 (21.5994) loss 3.4099 (3.4099) grad_norm 2.5842 (2.5842) [2022-01-25 20:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][10/1251] eta 1:25:32 lr 0.000066 time 1.6155 (4.1361) loss 3.4044 (3.2584) grad_norm 2.6107 (3.0834) [2022-01-25 20:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][20/1251] eta 1:06:13 lr 0.000066 time 2.3105 (3.2280) loss 3.5580 (3.1761) grad_norm 2.6790 (3.0308) [2022-01-25 20:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][30/1251] eta 0:57:54 lr 0.000066 time 1.5655 (2.8458) loss 3.4322 (3.1277) grad_norm 2.7297 (2.9679) [2022-01-25 20:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][40/1251] eta 0:55:53 lr 0.000066 time 4.0695 (2.7689) loss 3.1835 (3.1505) grad_norm 2.6208 (2.9628) [2022-01-25 20:04:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][50/1251] eta 0:53:09 lr 0.000066 time 1.6263 (2.6554) loss 3.3075 (3.1408) grad_norm 2.5868 (2.9245) [2022-01-25 20:04:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][60/1251] eta 0:50:53 lr 0.000066 time 2.1194 (2.5635) loss 3.1668 (3.1252) grad_norm 2.7498 (2.8837) [2022-01-25 20:05:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][70/1251] eta 0:48:57 lr 0.000066 time 1.7627 (2.4869) loss 3.3457 (3.1329) grad_norm 2.6267 (2.8978) [2022-01-25 20:05:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][80/1251] eta 0:48:07 lr 0.000066 time 3.8124 (2.4661) loss 3.4517 (3.1349) grad_norm 2.9130 (2.8724) [2022-01-25 20:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][90/1251] eta 0:47:06 lr 0.000066 time 1.5713 (2.4343) loss 3.3012 (3.1242) grad_norm 2.5700 (2.8638) [2022-01-25 20:06:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][100/1251] eta 0:46:09 lr 0.000066 time 1.8520 (2.4058) loss 3.4080 (3.1038) grad_norm 2.4196 (2.8555) [2022-01-25 20:06:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][110/1251] eta 0:45:12 lr 0.000066 time 1.8465 (2.3776) loss 2.3914 (3.0712) grad_norm 3.1212 (2.8416) [2022-01-25 20:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][120/1251] eta 0:44:35 lr 0.000066 time 3.5060 (2.3657) loss 3.3533 (3.0898) grad_norm 4.1298 (2.8426) [2022-01-25 20:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][130/1251] eta 0:43:58 lr 0.000066 time 2.3837 (2.3540) loss 2.0110 (3.0654) grad_norm 3.4504 (2.8376) [2022-01-25 20:07:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][140/1251] eta 0:43:29 lr 0.000066 time 2.0559 (2.3491) loss 3.2000 (3.0860) grad_norm 2.9921 (2.8413) [2022-01-25 20:08:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][150/1251] eta 0:42:55 lr 0.000066 
time 1.8123 (2.3393) loss 2.8434 (3.0883) grad_norm 3.0088 (2.8492) [2022-01-25 20:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][160/1251] eta 0:42:30 lr 0.000066 time 3.1653 (2.3374) loss 3.7540 (3.0975) grad_norm 2.8783 (2.8472) [2022-01-25 20:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][170/1251] eta 0:41:55 lr 0.000066 time 1.8576 (2.3274) loss 3.5630 (3.0847) grad_norm 2.9131 (2.8477) [2022-01-25 20:09:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][180/1251] eta 0:41:17 lr 0.000066 time 1.6415 (2.3129) loss 2.4169 (3.0699) grad_norm 2.7379 (2.8386) [2022-01-25 20:09:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][190/1251] eta 0:40:42 lr 0.000066 time 1.9181 (2.3019) loss 3.2448 (3.0656) grad_norm 2.8519 (2.8321) [2022-01-25 20:09:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][200/1251] eta 0:40:15 lr 0.000066 time 3.5939 (2.2983) loss 3.2997 (3.0521) grad_norm 2.9720 (2.8296) [2022-01-25 20:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][210/1251] eta 0:39:51 lr 0.000066 time 1.8699 (2.2972) loss 2.8319 (3.0597) grad_norm 2.9516 (2.8287) [2022-01-25 20:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][220/1251] eta 0:39:27 lr 0.000066 time 1.4679 (2.2963) loss 3.0527 (3.0520) grad_norm 2.3597 (2.8257) [2022-01-25 20:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][230/1251] eta 0:39:01 lr 0.000066 time 1.4638 (2.2930) loss 2.6445 (3.0438) grad_norm 2.8068 (2.8193) [2022-01-25 20:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][240/1251] eta 0:38:47 lr 0.000066 time 3.0145 (2.3017) loss 2.3511 (3.0317) grad_norm 2.5085 (2.8141) [2022-01-25 20:11:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][250/1251] eta 0:38:14 lr 0.000066 time 1.6563 (2.2918) loss 2.9124 (3.0283) grad_norm 2.8444 (2.8151) [2022-01-25 20:12:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][260/1251] eta 0:37:39 lr 0.000066 time 1.6678 (2.2805) loss 2.5956 (3.0328) grad_norm 2.7530 (2.8136) [2022-01-25 20:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][270/1251] eta 0:37:13 lr 0.000066 time 1.9437 (2.2769) loss 2.1532 (3.0300) grad_norm 2.6102 (2.8100) [2022-01-25 20:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][280/1251] eta 0:37:01 lr 0.000066 time 2.0541 (2.2878) loss 3.7835 (3.0319) grad_norm 2.6420 (2.8065) [2022-01-25 20:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][290/1251] eta 0:36:35 lr 0.000066 time 1.8680 (2.2849) loss 2.3412 (3.0385) grad_norm 2.5257 (2.7987) [2022-01-25 20:13:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][300/1251] eta 0:36:05 lr 0.000066 time 1.6321 (2.2772) loss 2.1286 (3.0371) grad_norm 2.6311 (2.7901) [2022-01-25 20:13:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][310/1251] eta 0:35:39 lr 0.000066 time 1.8386 (2.2733) loss 2.2037 (3.0347) grad_norm 2.8480 (2.7910) [2022-01-25 20:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][320/1251] eta 0:35:25 lr 0.000066 time 1.7733 (2.2832) loss 3.3244 (3.0345) grad_norm 2.9714 (2.7877) [2022-01-25 20:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][330/1251] eta 0:34:55 lr 0.000066 time 1.6541 (2.2748) loss 3.5254 (3.0352) grad_norm 2.5779 (2.7873) [2022-01-25 20:15:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][340/1251] eta 0:34:26 lr 0.000066 time 1.7365 (2.2688) loss 2.7437 (3.0305) grad_norm 2.6685 (2.7842) [2022-01-25 20:15:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][350/1251] eta 0:33:59 lr 0.000066 time 2.0648 (2.2635) loss 1.9954 (3.0348) grad_norm 3.0427 (2.7844) [2022-01-25 20:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][360/1251] eta 0:33:33 lr 
0.000066 time 1.8783 (2.2603) loss 2.8065 (3.0341) grad_norm 2.5154 (2.7830) [2022-01-25 20:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][370/1251] eta 0:33:06 lr 0.000066 time 2.2264 (2.2545) loss 3.6931 (3.0421) grad_norm 2.8182 (2.7812) [2022-01-25 20:16:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][380/1251] eta 0:32:39 lr 0.000066 time 2.3814 (2.2502) loss 3.0651 (3.0369) grad_norm 2.7466 (2.7796) [2022-01-25 20:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][390/1251] eta 0:32:18 lr 0.000066 time 3.0946 (2.2509) loss 2.6525 (3.0385) grad_norm 2.7165 (2.7838) [2022-01-25 20:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][400/1251] eta 0:31:57 lr 0.000066 time 2.1432 (2.2531) loss 2.4715 (3.0355) grad_norm 2.9698 (2.7838) [2022-01-25 20:17:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][410/1251] eta 0:31:36 lr 0.000066 time 2.4435 (2.2551) loss 3.5087 (3.0430) grad_norm 2.7837 (2.7839) [2022-01-25 20:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][420/1251] eta 0:31:15 lr 0.000066 time 3.1105 (2.2572) loss 1.8674 (3.0416) grad_norm 1.9526 (2.7800) [2022-01-25 20:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][430/1251] eta 0:30:55 lr 0.000066 time 2.8280 (2.2601) loss 2.7473 (3.0464) grad_norm 3.4871 (2.7782) [2022-01-25 20:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][440/1251] eta 0:30:27 lr 0.000065 time 1.9405 (2.2536) loss 3.3720 (3.0402) grad_norm 2.7341 (2.7764) [2022-01-25 20:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][450/1251] eta 0:30:02 lr 0.000065 time 1.8653 (2.2499) loss 2.7320 (3.0364) grad_norm 2.5755 (2.7770) [2022-01-25 20:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][460/1251] eta 0:29:38 lr 0.000065 time 2.5860 (2.2484) loss 3.1087 (3.0328) grad_norm 2.6435 (2.7779) [2022-01-25 20:19:50 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][470/1251] eta 0:29:14 lr 0.000065 time 2.2167 (2.2470) loss 2.3068 (3.0318) grad_norm 2.4467 (2.7747) [2022-01-25 20:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][480/1251] eta 0:28:50 lr 0.000065 time 1.9411 (2.2450) loss 3.1849 (3.0338) grad_norm 2.7915 (2.7745) [2022-01-25 20:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][490/1251] eta 0:28:26 lr 0.000065 time 1.8840 (2.2430) loss 3.2047 (3.0364) grad_norm 2.4636 (2.7710) [2022-01-25 20:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][500/1251] eta 0:28:04 lr 0.000065 time 1.9112 (2.2433) loss 3.0982 (3.0352) grad_norm 2.4486 (2.7692) [2022-01-25 20:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][510/1251] eta 0:27:40 lr 0.000065 time 2.3188 (2.2412) loss 3.5278 (3.0339) grad_norm 2.8488 (2.7707) [2022-01-25 20:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][520/1251] eta 0:27:17 lr 0.000065 time 2.5664 (2.2403) loss 2.3546 (3.0324) grad_norm 2.5321 (2.7693) [2022-01-25 20:22:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][530/1251] eta 0:26:54 lr 0.000065 time 1.5538 (2.2386) loss 2.9993 (3.0292) grad_norm 2.5567 (2.7651) [2022-01-25 20:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][540/1251] eta 0:26:31 lr 0.000065 time 2.3326 (2.2386) loss 3.5750 (3.0307) grad_norm 2.7381 (2.7688) [2022-01-25 20:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][550/1251] eta 0:26:08 lr 0.000065 time 2.8051 (2.2371) loss 1.9185 (3.0313) grad_norm 2.5974 (2.7659) [2022-01-25 20:23:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][560/1251] eta 0:25:44 lr 0.000065 time 2.2784 (2.2353) loss 2.3693 (3.0366) grad_norm 2.9395 (2.7662) [2022-01-25 20:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][570/1251] eta 0:25:21 lr 
0.000065 time 1.8774 (2.2337) loss 3.1140 (3.0366) grad_norm 2.8080 (2.7689) [2022-01-25 20:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][580/1251] eta 0:24:59 lr 0.000065 time 2.2930 (2.2341) loss 3.2074 (3.0350) grad_norm 2.4273 (2.7668) [2022-01-25 20:24:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][590/1251] eta 0:24:37 lr 0.000065 time 3.5961 (2.2352) loss 3.1057 (3.0333) grad_norm 2.8197 (2.7739) [2022-01-25 20:24:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][600/1251] eta 0:24:13 lr 0.000065 time 2.4750 (2.2330) loss 3.7701 (3.0347) grad_norm 2.4398 (2.7745) [2022-01-25 20:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][610/1251] eta 0:23:50 lr 0.000065 time 1.9276 (2.2310) loss 3.6180 (3.0314) grad_norm 2.8233 (2.7771) [2022-01-25 20:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][620/1251] eta 0:23:27 lr 0.000065 time 2.8412 (2.2304) loss 3.6484 (3.0298) grad_norm 3.0651 (2.7758) [2022-01-25 20:25:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][630/1251] eta 0:23:05 lr 0.000065 time 2.9086 (2.2317) loss 3.7201 (3.0348) grad_norm 3.1484 (2.7767) [2022-01-25 20:26:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][640/1251] eta 0:22:43 lr 0.000065 time 2.6021 (2.2314) loss 3.2596 (3.0359) grad_norm 2.6139 (2.7750) [2022-01-25 20:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][650/1251] eta 0:22:20 lr 0.000065 time 2.4885 (2.2301) loss 3.4752 (3.0384) grad_norm 3.0502 (2.7758) [2022-01-25 20:26:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][660/1251] eta 0:21:55 lr 0.000065 time 1.7524 (2.2262) loss 2.9682 (3.0359) grad_norm 2.4965 (2.7760) [2022-01-25 20:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][670/1251] eta 0:21:32 lr 0.000065 time 2.6308 (2.2244) loss 3.3388 (3.0380) grad_norm 2.3927 (2.7755) [2022-01-25 20:27:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][680/1251] eta 0:21:10 lr 0.000065 time 2.2695 (2.2251) loss 3.5963 (3.0350) grad_norm 3.4143 (2.7765) [2022-01-25 20:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][690/1251] eta 0:20:47 lr 0.000065 time 2.4880 (2.2237) loss 3.8564 (3.0373) grad_norm 2.8179 (2.7744) [2022-01-25 20:28:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][700/1251] eta 0:20:23 lr 0.000065 time 2.4044 (2.2214) loss 3.3082 (3.0385) grad_norm 2.3212 (2.7710) [2022-01-25 20:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][710/1251] eta 0:20:02 lr 0.000065 time 2.4593 (2.2227) loss 3.0444 (3.0372) grad_norm 2.6873 (2.7702) [2022-01-25 20:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][720/1251] eta 0:19:41 lr 0.000065 time 2.2498 (2.2250) loss 2.3578 (3.0345) grad_norm 2.7700 (2.7680) [2022-01-25 20:29:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][730/1251] eta 0:19:18 lr 0.000065 time 1.8269 (2.2232) loss 2.9759 (3.0333) grad_norm 2.8585 (2.7683) [2022-01-25 20:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][740/1251] eta 0:18:57 lr 0.000065 time 4.0115 (2.2258) loss 3.0698 (3.0342) grad_norm 2.4311 (2.7690) [2022-01-25 20:30:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][750/1251] eta 0:18:34 lr 0.000065 time 1.6081 (2.2247) loss 2.6650 (3.0331) grad_norm 2.5035 (2.7674) [2022-01-25 20:30:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][760/1251] eta 0:18:11 lr 0.000065 time 1.7755 (2.2238) loss 2.9272 (3.0352) grad_norm 2.8655 (2.7681) [2022-01-25 20:30:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][770/1251] eta 0:17:49 lr 0.000065 time 1.8979 (2.2235) loss 2.8189 (3.0298) grad_norm 2.5225 (2.7681) [2022-01-25 20:31:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][780/1251] eta 0:17:27 lr 
0.000065 time 3.7263 (2.2246) loss 2.9112 (3.0292) grad_norm 2.5167 (2.7679) [2022-01-25 20:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][790/1251] eta 0:17:04 lr 0.000065 time 1.8960 (2.2231) loss 2.6030 (3.0298) grad_norm 3.8306 (2.7690) [2022-01-25 20:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][800/1251] eta 0:16:41 lr 0.000065 time 2.1381 (2.2217) loss 3.1010 (3.0322) grad_norm 2.8429 (2.7692) [2022-01-25 20:32:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][810/1251] eta 0:16:18 lr 0.000065 time 1.7578 (2.2189) loss 2.8083 (3.0346) grad_norm 2.9796 (2.7720) [2022-01-25 20:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][820/1251] eta 0:15:55 lr 0.000065 time 1.9099 (2.2167) loss 3.1352 (3.0334) grad_norm 2.9802 (2.7726) [2022-01-25 20:32:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][830/1251] eta 0:15:32 lr 0.000065 time 1.9120 (2.2157) loss 3.0157 (3.0340) grad_norm 2.6130 (2.7766) [2022-01-25 20:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][840/1251] eta 0:15:10 lr 0.000065 time 1.8526 (2.2146) loss 2.3356 (3.0359) grad_norm 2.8549 (2.7756) [2022-01-25 20:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][850/1251] eta 0:14:48 lr 0.000065 time 2.4721 (2.2157) loss 2.2042 (3.0372) grad_norm 2.6511 (2.7752) [2022-01-25 20:34:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][860/1251] eta 0:14:26 lr 0.000065 time 2.7763 (2.2168) loss 2.8218 (3.0368) grad_norm 2.7145 (2.7763) [2022-01-25 20:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][870/1251] eta 0:14:04 lr 0.000065 time 1.9740 (2.2170) loss 3.0818 (3.0375) grad_norm 2.3933 (2.7744) [2022-01-25 20:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][880/1251] eta 0:13:41 lr 0.000065 time 1.5182 (2.2151) loss 2.7663 (3.0378) grad_norm 4.8598 (2.7765) [2022-01-25 20:35:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][890/1251] eta 0:13:19 lr 0.000065 time 1.8875 (2.2151) loss 3.2295 (3.0412) grad_norm 2.3115 (2.7749) [2022-01-25 20:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][900/1251] eta 0:12:57 lr 0.000065 time 2.0499 (2.2142) loss 3.2668 (3.0428) grad_norm 2.3073 (2.7737) [2022-01-25 20:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][910/1251] eta 0:12:34 lr 0.000065 time 1.9045 (2.2139) loss 3.8583 (3.0416) grad_norm 2.7250 (2.7736) [2022-01-25 20:36:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][920/1251] eta 0:12:12 lr 0.000065 time 2.1820 (2.2131) loss 3.5249 (3.0427) grad_norm 3.3505 (2.7735) [2022-01-25 20:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][930/1251] eta 0:11:50 lr 0.000065 time 1.9854 (2.2133) loss 2.7511 (3.0447) grad_norm 2.6595 (2.7735) [2022-01-25 20:36:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][940/1251] eta 0:11:28 lr 0.000065 time 1.6008 (2.2124) loss 3.5240 (3.0459) grad_norm 2.6472 (2.7728) [2022-01-25 20:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][950/1251] eta 0:11:06 lr 0.000065 time 1.5272 (2.2134) loss 3.4649 (3.0480) grad_norm 2.6758 (2.7711) [2022-01-25 20:37:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][960/1251] eta 0:10:43 lr 0.000065 time 1.9965 (2.2128) loss 3.2675 (3.0493) grad_norm 2.8239 (2.7704) [2022-01-25 20:38:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][970/1251] eta 0:10:21 lr 0.000064 time 1.8852 (2.2130) loss 3.4608 (3.0504) grad_norm 2.8220 (2.7695) [2022-01-25 20:38:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][980/1251] eta 0:09:59 lr 0.000064 time 1.8116 (2.2121) loss 3.6006 (3.0518) grad_norm 2.6578 (2.7690) [2022-01-25 20:38:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][990/1251] eta 0:09:37 lr 
0.000064 time 1.9069 (2.2114) loss 3.2374 (3.0529) grad_norm 2.3435 (2.7691) [2022-01-25 20:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1000/1251] eta 0:09:14 lr 0.000064 time 1.7044 (2.2089) loss 3.3679 (3.0546) grad_norm 3.6488 (2.7707) [2022-01-25 20:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1010/1251] eta 0:08:52 lr 0.000064 time 2.1190 (2.2094) loss 3.5404 (3.0552) grad_norm 2.5214 (2.7721) [2022-01-25 20:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1020/1251] eta 0:08:30 lr 0.000064 time 2.2236 (2.2085) loss 2.9660 (3.0572) grad_norm 2.3513 (2.7729) [2022-01-25 20:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1030/1251] eta 0:08:08 lr 0.000064 time 2.2538 (2.2088) loss 3.4233 (3.0536) grad_norm 3.0322 (2.7777) [2022-01-25 20:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1040/1251] eta 0:07:45 lr 0.000064 time 1.6073 (2.2082) loss 3.3453 (3.0547) grad_norm 2.3005 (2.7768) [2022-01-25 20:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1050/1251] eta 0:07:24 lr 0.000064 time 2.5219 (2.2102) loss 2.3568 (3.0557) grad_norm 2.6688 (2.7761) [2022-01-25 20:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1060/1251] eta 0:07:02 lr 0.000064 time 1.8305 (2.2098) loss 2.4000 (3.0551) grad_norm 3.2095 (2.7770) [2022-01-25 20:41:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1070/1251] eta 0:06:40 lr 0.000064 time 2.4883 (2.2114) loss 3.3162 (3.0554) grad_norm 2.9323 (2.7772) [2022-01-25 20:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1080/1251] eta 0:06:17 lr 0.000064 time 1.7655 (2.2103) loss 2.7822 (3.0554) grad_norm 2.4914 (2.7767) [2022-01-25 20:42:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1090/1251] eta 0:05:55 lr 0.000064 time 1.9112 (2.2091) loss 2.7312 (3.0557) grad_norm 2.8559 (2.7773) [2022-01-25 
20:42:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1100/1251] eta 0:05:33 lr 0.000064 time 1.8300 (2.2079) loss 2.1850 (3.0558) grad_norm 2.3238 (2.7782) [2022-01-25 20:43:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1110/1251] eta 0:05:11 lr 0.000064 time 1.7053 (2.2076) loss 2.5493 (3.0561) grad_norm 2.4812 (2.7769) [2022-01-25 20:43:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1120/1251] eta 0:04:49 lr 0.000064 time 2.2458 (2.2075) loss 2.9680 (3.0559) grad_norm 2.8386 (2.7777) [2022-01-25 20:43:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1130/1251] eta 0:04:27 lr 0.000064 time 2.0439 (2.2071) loss 2.8523 (3.0543) grad_norm 2.8895 (2.7788) [2022-01-25 20:44:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1140/1251] eta 0:04:04 lr 0.000064 time 1.7745 (2.2065) loss 1.9991 (3.0538) grad_norm 2.8913 (2.7799) [2022-01-25 20:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1150/1251] eta 0:03:42 lr 0.000064 time 2.4644 (2.2067) loss 3.0438 (3.0529) grad_norm 2.6242 (2.7789) [2022-01-25 20:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1160/1251] eta 0:03:20 lr 0.000064 time 2.7054 (2.2069) loss 3.4952 (3.0526) grad_norm 2.9598 (2.7786) [2022-01-25 20:45:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1170/1251] eta 0:02:58 lr 0.000064 time 2.1656 (2.2063) loss 3.0451 (3.0516) grad_norm 2.7880 (2.7776) [2022-01-25 20:45:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1180/1251] eta 0:02:36 lr 0.000064 time 1.8023 (2.2054) loss 3.4048 (3.0517) grad_norm 2.5733 (2.7776) [2022-01-25 20:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1190/1251] eta 0:02:14 lr 0.000064 time 2.1450 (2.2045) loss 3.2963 (3.0533) grad_norm 3.1052 (2.7774) [2022-01-25 20:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1200/1251] 
eta 0:01:52 lr 0.000064 time 3.0427 (2.2065) loss 2.6193 (3.0522) grad_norm 2.5691 (2.7759) [2022-01-25 20:46:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1210/1251] eta 0:01:30 lr 0.000064 time 2.3975 (2.2052) loss 3.1087 (3.0533) grad_norm 2.5473 (2.7746) [2022-01-25 20:47:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1220/1251] eta 0:01:08 lr 0.000064 time 1.6017 (2.2046) loss 3.3231 (3.0531) grad_norm 2.5532 (2.7731) [2022-01-25 20:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1230/1251] eta 0:00:46 lr 0.000064 time 1.8300 (2.2041) loss 3.1937 (3.0530) grad_norm 2.8224 (2.7727) [2022-01-25 20:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1240/1251] eta 0:00:24 lr 0.000064 time 2.4507 (2.2029) loss 3.2834 (3.0538) grad_norm 2.4563 (2.7719) [2022-01-25 20:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [254/300][1250/1251] eta 0:00:02 lr 0.000064 time 1.1772 (2.1971) loss 2.4371 (3.0529) grad_norm 2.7952 (2.7700) [2022-01-25 20:48:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 254 training takes 0:45:49 [2022-01-25 20:48:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.442 (18.442) Loss 0.9094 (0.9094) Acc@1 79.395 (79.395) Acc@5 94.238 (94.238) [2022-01-25 20:48:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.936 (3.272) Loss 0.8450 (0.8394) Acc@1 79.590 (79.998) Acc@5 95.020 (95.117) [2022-01-25 20:48:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.928 (2.548) Loss 0.8362 (0.8253) Acc@1 81.641 (80.697) Acc@5 95.605 (95.229) [2022-01-25 20:49:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.229 (2.249) Loss 0.8091 (0.8259) Acc@1 81.543 (80.598) Acc@5 94.824 (95.275) [2022-01-25 20:49:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.668 (2.172) Loss 0.8091 (0.8291) Acc@1 79.883 (80.476) Acc@5 96.191 (95.272) 
[2022-01-25 20:49:37 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.422 Acc@5 95.266 [2022-01-25 20:49:37 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.4% [2022-01-25 20:49:37 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.42% [2022-01-25 20:49:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][0/1251] eta 7:47:45 lr 0.000064 time 22.4344 (22.4344) loss 3.4811 (3.4811) grad_norm 2.7659 (2.7659) [2022-01-25 20:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][10/1251] eta 1:24:27 lr 0.000064 time 1.8323 (4.0831) loss 3.0519 (2.9894) grad_norm 2.8823 (2.6735) [2022-01-25 20:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][20/1251] eta 1:04:22 lr 0.000064 time 1.5338 (3.1380) loss 3.1882 (3.0526) grad_norm 2.7415 (2.7811) [2022-01-25 20:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][30/1251] eta 0:57:23 lr 0.000064 time 1.5941 (2.8201) loss 3.3854 (3.0224) grad_norm 2.6119 (2.7890) [2022-01-25 20:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][40/1251] eta 0:53:49 lr 0.000064 time 3.7766 (2.6664) loss 3.0528 (3.0342) grad_norm 2.9704 (2.7983) [2022-01-25 20:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][50/1251] eta 0:51:35 lr 0.000064 time 2.6371 (2.5771) loss 2.1473 (3.0720) grad_norm 2.4404 (2.7672) [2022-01-25 20:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][60/1251] eta 0:50:01 lr 0.000064 time 2.1331 (2.5198) loss 2.9689 (3.0800) grad_norm 2.7949 (2.7535) [2022-01-25 20:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][70/1251] eta 0:48:31 lr 0.000064 time 1.5634 (2.4650) loss 3.3130 (3.1108) grad_norm 2.8805 (2.7375) [2022-01-25 20:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][80/1251] eta 0:47:52 lr 0.000064 time 3.8031 (2.4534) loss 3.2165 (3.1018) 
grad_norm 3.0365 (2.7411) [2022-01-25 20:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][90/1251] eta 0:47:11 lr 0.000064 time 3.4441 (2.4384) loss 2.3603 (3.1034) grad_norm 2.5389 (2.7394) [2022-01-25 20:53:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][100/1251] eta 0:46:15 lr 0.000064 time 1.5445 (2.4114) loss 3.3885 (3.0995) grad_norm 2.6557 (2.7842) [2022-01-25 20:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][110/1251] eta 0:45:26 lr 0.000064 time 1.5872 (2.3898) loss 3.2620 (3.1047) grad_norm 2.7726 (2.8183) [2022-01-25 20:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][120/1251] eta 0:44:46 lr 0.000064 time 2.6537 (2.3749) loss 3.5153 (3.1058) grad_norm 4.3046 (2.8284) [2022-01-25 20:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][130/1251] eta 0:44:18 lr 0.000064 time 3.5218 (2.3711) loss 3.4616 (3.1119) grad_norm 2.5380 (2.8159) [2022-01-25 20:55:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][140/1251] eta 0:43:38 lr 0.000064 time 1.5905 (2.3573) loss 3.4391 (3.1174) grad_norm 2.5551 (2.8150) [2022-01-25 20:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][150/1251] eta 0:43:03 lr 0.000064 time 2.3522 (2.3464) loss 2.7264 (3.0984) grad_norm 3.4456 (2.8125) [2022-01-25 20:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][160/1251] eta 0:42:30 lr 0.000064 time 3.1562 (2.3380) loss 3.5142 (3.0933) grad_norm 2.9378 (2.8136) [2022-01-25 20:56:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][170/1251] eta 0:42:08 lr 0.000064 time 2.6832 (2.3394) loss 3.3854 (3.0780) grad_norm 3.0732 (2.8098) [2022-01-25 20:56:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][180/1251] eta 0:41:21 lr 0.000064 time 1.5733 (2.3168) loss 2.9281 (3.0788) grad_norm 3.1958 (2.8212) [2022-01-25 20:56:56 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [255/300][190/1251] eta 0:40:42 lr 0.000064 time 2.2255 (2.3023) loss 2.5773 (3.0750) grad_norm 2.9180 (2.8271) [2022-01-25 20:57:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][200/1251] eta 0:40:08 lr 0.000064 time 1.8581 (2.2915) loss 2.2315 (3.0700) grad_norm 2.9950 (2.8369) [2022-01-25 20:57:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][210/1251] eta 0:39:40 lr 0.000064 time 1.7913 (2.2868) loss 3.4157 (3.0769) grad_norm 2.4830 (2.8327) [2022-01-25 20:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][220/1251] eta 0:39:08 lr 0.000064 time 1.4671 (2.2780) loss 2.4915 (3.0752) grad_norm 2.5350 (2.8278) [2022-01-25 20:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][230/1251] eta 0:38:42 lr 0.000064 time 2.2634 (2.2748) loss 3.3482 (3.0710) grad_norm 2.9211 (2.8281) [2022-01-25 20:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][240/1251] eta 0:38:23 lr 0.000064 time 2.3228 (2.2789) loss 3.4988 (3.0720) grad_norm 2.6628 (2.8339) [2022-01-25 20:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][250/1251] eta 0:38:00 lr 0.000063 time 1.5848 (2.2780) loss 3.3402 (3.0662) grad_norm 2.9054 (2.8301) [2022-01-25 20:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][260/1251] eta 0:37:29 lr 0.000063 time 1.3466 (2.2704) loss 3.4831 (3.0620) grad_norm 2.4620 (2.8221) [2022-01-25 20:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][270/1251] eta 0:37:02 lr 0.000063 time 2.5448 (2.2659) loss 2.0931 (3.0625) grad_norm 2.9408 (2.8215) [2022-01-25 21:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][280/1251] eta 0:36:39 lr 0.000063 time 1.8816 (2.2652) loss 3.4692 (3.0595) grad_norm 2.9302 (2.8284) [2022-01-25 21:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][290/1251] eta 0:36:16 lr 0.000063 time 1.8677 (2.2643) loss 2.3965 (3.0539) 
grad_norm 2.7345 (2.8325) [2022-01-25 21:00:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][300/1251] eta 0:35:49 lr 0.000063 time 2.2676 (2.2606) loss 2.2937 (3.0533) grad_norm 2.3226 (2.8338) [2022-01-25 21:01:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][310/1251] eta 0:35:22 lr 0.000063 time 2.7839 (2.2560) loss 2.1993 (3.0585) grad_norm 2.5968 (2.8297) [2022-01-25 21:01:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][320/1251] eta 0:35:03 lr 0.000063 time 1.9962 (2.2590) loss 2.8544 (3.0585) grad_norm 3.0700 (2.8261) [2022-01-25 21:02:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][330/1251] eta 0:34:41 lr 0.000063 time 1.9481 (2.2595) loss 3.7592 (3.0603) grad_norm 2.8470 (2.8261) [2022-01-25 21:02:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][340/1251] eta 0:34:17 lr 0.000063 time 2.3046 (2.2586) loss 2.8424 (3.0647) grad_norm 2.3495 (2.8203) [2022-01-25 21:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][350/1251] eta 0:33:52 lr 0.000063 time 2.2246 (2.2562) loss 3.4730 (3.0632) grad_norm 2.5861 (2.8136) [2022-01-25 21:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][360/1251] eta 0:33:25 lr 0.000063 time 1.5512 (2.2508) loss 3.0337 (3.0663) grad_norm 2.3746 (2.8129) [2022-01-25 21:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][370/1251] eta 0:32:57 lr 0.000063 time 1.8414 (2.2450) loss 2.9179 (3.0642) grad_norm 2.9370 (2.8193) [2022-01-25 21:03:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][380/1251] eta 0:32:32 lr 0.000063 time 2.2844 (2.2414) loss 2.0617 (3.0638) grad_norm 2.8482 (2.8169) [2022-01-25 21:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][390/1251] eta 0:32:07 lr 0.000063 time 1.8254 (2.2382) loss 2.7733 (3.0688) grad_norm 3.1199 (2.8158) [2022-01-25 21:04:34 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [255/300][400/1251] eta 0:31:44 lr 0.000063 time 1.7714 (2.2374) loss 3.4063 (3.0714) grad_norm 2.7069 (2.8137) [2022-01-25 21:04:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][410/1251] eta 0:31:21 lr 0.000063 time 2.5140 (2.2367) loss 3.1566 (3.0700) grad_norm 2.8693 (2.8089) [2022-01-25 21:05:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][420/1251] eta 0:30:58 lr 0.000063 time 2.5084 (2.2365) loss 2.1463 (3.0677) grad_norm 2.2402 (2.8042) [2022-01-25 21:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][430/1251] eta 0:30:38 lr 0.000063 time 2.2900 (2.2389) loss 2.1597 (3.0611) grad_norm 2.4820 (2.8000) [2022-01-25 21:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][440/1251] eta 0:30:16 lr 0.000063 time 2.1113 (2.2397) loss 3.5164 (3.0546) grad_norm 3.5959 (2.8025) [2022-01-25 21:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][450/1251] eta 0:29:55 lr 0.000063 time 2.1557 (2.2417) loss 3.1041 (3.0518) grad_norm 2.6614 (2.7998) [2022-01-25 21:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][460/1251] eta 0:29:32 lr 0.000063 time 2.1920 (2.2406) loss 2.9428 (3.0508) grad_norm 3.1969 (2.8024) [2022-01-25 21:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][470/1251] eta 0:29:05 lr 0.000063 time 1.6679 (2.2355) loss 2.2207 (3.0482) grad_norm 2.9810 (2.7972) [2022-01-25 21:07:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][480/1251] eta 0:28:41 lr 0.000063 time 2.3616 (2.2326) loss 2.6829 (3.0454) grad_norm 2.7503 (2.7918) [2022-01-25 21:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][490/1251] eta 0:28:13 lr 0.000063 time 1.8664 (2.2257) loss 2.1440 (3.0399) grad_norm 2.4927 (2.7969) [2022-01-25 21:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][500/1251] eta 0:27:48 lr 0.000063 time 1.6299 (2.2223) loss 3.1725 (3.0446) 
grad_norm 2.4667 (2.7995) [2022-01-25 21:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][510/1251] eta 0:27:26 lr 0.000063 time 1.8735 (2.2218) loss 1.9817 (3.0456) grad_norm 3.1901 (2.8004) [2022-01-25 21:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][520/1251] eta 0:27:05 lr 0.000063 time 2.1373 (2.2234) loss 3.5575 (3.0469) grad_norm 2.8300 (2.7998) [2022-01-25 21:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][530/1251] eta 0:26:44 lr 0.000063 time 1.8356 (2.2252) loss 3.4316 (3.0500) grad_norm 2.8334 (2.7998) [2022-01-25 21:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][540/1251] eta 0:26:23 lr 0.000063 time 2.2677 (2.2271) loss 3.1927 (3.0494) grad_norm 2.7477 (2.8015) [2022-01-25 21:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][550/1251] eta 0:26:04 lr 0.000063 time 2.4664 (2.2321) loss 3.4968 (3.0559) grad_norm 2.7881 (2.7996) [2022-01-25 21:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][560/1251] eta 0:25:44 lr 0.000063 time 2.2008 (2.2352) loss 2.9233 (3.0568) grad_norm 2.7157 (2.7948) [2022-01-25 21:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][570/1251] eta 0:25:20 lr 0.000063 time 1.9448 (2.2333) loss 3.3898 (3.0542) grad_norm 2.7208 (2.7937) [2022-01-25 21:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][580/1251] eta 0:24:55 lr 0.000063 time 1.9535 (2.2287) loss 2.3109 (3.0539) grad_norm 3.1637 (2.7914) [2022-01-25 21:11:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][590/1251] eta 0:24:29 lr 0.000063 time 1.6881 (2.2236) loss 2.6841 (3.0564) grad_norm 2.4532 (2.7888) [2022-01-25 21:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][600/1251] eta 0:24:07 lr 0.000063 time 2.6097 (2.2239) loss 3.5102 (3.0577) grad_norm 2.5014 (2.7899) [2022-01-25 21:12:16 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [255/300][610/1251] eta 0:23:46 lr 0.000063 time 2.1438 (2.2253) loss 2.0409 (3.0503) grad_norm 2.5035 (2.7890) [2022-01-25 21:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][620/1251] eta 0:23:24 lr 0.000063 time 2.1760 (2.2264) loss 3.3110 (3.0506) grad_norm 2.9024 (2.7900) [2022-01-25 21:13:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][630/1251] eta 0:23:02 lr 0.000063 time 1.6818 (2.2257) loss 3.4852 (3.0515) grad_norm 3.7152 (2.7963) [2022-01-25 21:13:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][640/1251] eta 0:22:40 lr 0.000063 time 2.5467 (2.2264) loss 3.0260 (3.0512) grad_norm 2.4499 (2.7931) [2022-01-25 21:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][650/1251] eta 0:22:16 lr 0.000063 time 1.7449 (2.2246) loss 2.3357 (3.0521) grad_norm 3.0376 (2.7932) [2022-01-25 21:14:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][660/1251] eta 0:21:53 lr 0.000063 time 2.1818 (2.2227) loss 2.9990 (3.0490) grad_norm 2.6708 (2.7927) [2022-01-25 21:14:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][670/1251] eta 0:21:30 lr 0.000063 time 1.9106 (2.2208) loss 2.6189 (3.0501) grad_norm 2.9745 (2.7906) [2022-01-25 21:14:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][680/1251] eta 0:21:07 lr 0.000063 time 2.3151 (2.2204) loss 2.8163 (3.0536) grad_norm 2.9735 (2.7899) [2022-01-25 21:15:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][690/1251] eta 0:20:45 lr 0.000063 time 1.8961 (2.2204) loss 3.8372 (3.0521) grad_norm 2.7610 (2.7893) [2022-01-25 21:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][700/1251] eta 0:20:22 lr 0.000063 time 2.2590 (2.2194) loss 3.2956 (3.0517) grad_norm 2.6712 (2.7872) [2022-01-25 21:15:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][710/1251] eta 0:19:59 lr 0.000063 time 1.9286 (2.2174) loss 3.3341 (3.0523) 
grad_norm 2.3130 (2.7871) [2022-01-25 21:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][720/1251] eta 0:19:37 lr 0.000063 time 2.2915 (2.2173) loss 3.4703 (3.0541) grad_norm 3.1662 (2.7870) [2022-01-25 21:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][730/1251] eta 0:19:16 lr 0.000063 time 1.9569 (2.2191) loss 3.4356 (3.0562) grad_norm 2.7518 (2.7881) [2022-01-25 21:17:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][740/1251] eta 0:18:56 lr 0.000063 time 2.6887 (2.2242) loss 3.1773 (3.0558) grad_norm 2.3301 (2.7888) [2022-01-25 21:17:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][750/1251] eta 0:18:34 lr 0.000063 time 1.9362 (2.2252) loss 2.4242 (3.0548) grad_norm 2.8971 (2.7923) [2022-01-25 21:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][760/1251] eta 0:18:12 lr 0.000063 time 1.9296 (2.2249) loss 3.6287 (3.0550) grad_norm 2.8153 (2.7909) [2022-01-25 21:18:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][770/1251] eta 0:17:49 lr 0.000063 time 1.9461 (2.2235) loss 3.0562 (3.0556) grad_norm 2.9563 (2.7919) [2022-01-25 21:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][780/1251] eta 0:17:25 lr 0.000062 time 2.5372 (2.2202) loss 2.7564 (3.0574) grad_norm 2.7034 (2.7943) [2022-01-25 21:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][790/1251] eta 0:17:02 lr 0.000062 time 2.0520 (2.2173) loss 3.5807 (3.0551) grad_norm 2.9602 (2.7938) [2022-01-25 21:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][800/1251] eta 0:16:39 lr 0.000062 time 1.9013 (2.2166) loss 3.4850 (3.0540) grad_norm 2.5570 (2.7923) [2022-01-25 21:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][810/1251] eta 0:16:17 lr 0.000062 time 2.6627 (2.2174) loss 3.5224 (3.0547) grad_norm 3.0238 (2.7911) [2022-01-25 21:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [255/300][820/1251] eta 0:15:56 lr 0.000062 time 1.9944 (2.2183) loss 2.1818 (3.0550) grad_norm 3.2465 (2.7921) [2022-01-25 21:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][830/1251] eta 0:15:34 lr 0.000062 time 2.2746 (2.2192) loss 3.4629 (3.0554) grad_norm 2.9110 (2.7913) [2022-01-25 21:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][840/1251] eta 0:15:11 lr 0.000062 time 1.6862 (2.2176) loss 3.5158 (3.0550) grad_norm 2.4568 (2.7904) [2022-01-25 21:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][850/1251] eta 0:14:49 lr 0.000062 time 2.6134 (2.2175) loss 2.8020 (3.0553) grad_norm 2.6276 (2.7900) [2022-01-25 21:21:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][860/1251] eta 0:14:26 lr 0.000062 time 2.0880 (2.2172) loss 3.2885 (3.0554) grad_norm 2.6709 (2.7889) [2022-01-25 21:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][870/1251] eta 0:14:04 lr 0.000062 time 2.4714 (2.2171) loss 3.0140 (3.0532) grad_norm 2.6837 (2.7857) [2022-01-25 21:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][880/1251] eta 0:13:42 lr 0.000062 time 1.9061 (2.2163) loss 2.7166 (3.0532) grad_norm 2.5962 (2.7871) [2022-01-25 21:22:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][890/1251] eta 0:13:21 lr 0.000062 time 3.3708 (2.2194) loss 3.2239 (3.0537) grad_norm 3.1390 (2.7864) [2022-01-25 21:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][900/1251] eta 0:12:59 lr 0.000062 time 1.8329 (2.2207) loss 3.1149 (3.0541) grad_norm 2.9533 (2.7861) [2022-01-25 21:23:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][910/1251] eta 0:12:37 lr 0.000062 time 2.2412 (2.2206) loss 3.1962 (3.0556) grad_norm 3.2031 (2.7859) [2022-01-25 21:23:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][920/1251] eta 0:12:14 lr 0.000062 time 2.0292 (2.2185) loss 2.7490 (3.0564) 
grad_norm 2.7542 (2.7850) [2022-01-25 21:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][930/1251] eta 0:11:51 lr 0.000062 time 1.9195 (2.2164) loss 3.1669 (3.0570) grad_norm 2.7087 (2.7912) [2022-01-25 21:24:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][940/1251] eta 0:11:28 lr 0.000062 time 1.6208 (2.2137) loss 3.1672 (3.0583) grad_norm 2.6811 (2.7918) [2022-01-25 21:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][950/1251] eta 0:11:06 lr 0.000062 time 2.3686 (2.2127) loss 2.8674 (3.0597) grad_norm 2.6108 (2.7906) [2022-01-25 21:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][960/1251] eta 0:10:43 lr 0.000062 time 1.7459 (2.2117) loss 2.5540 (3.0599) grad_norm 2.4792 (2.7887) [2022-01-25 21:25:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][970/1251] eta 0:10:21 lr 0.000062 time 1.6817 (2.2131) loss 3.2239 (3.0622) grad_norm 2.6169 (2.7882) [2022-01-25 21:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][980/1251] eta 0:10:00 lr 0.000062 time 2.6873 (2.2163) loss 3.7101 (3.0637) grad_norm 2.2923 (2.7873) [2022-01-25 21:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][990/1251] eta 0:09:38 lr 0.000062 time 1.8239 (2.2180) loss 3.1964 (3.0618) grad_norm 3.0771 (2.7875) [2022-01-25 21:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1000/1251] eta 0:09:16 lr 0.000062 time 1.9233 (2.2182) loss 2.2018 (3.0609) grad_norm 2.7658 (2.7858) [2022-01-25 21:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1010/1251] eta 0:08:54 lr 0.000062 time 1.5695 (2.2159) loss 2.6999 (3.0610) grad_norm 2.5007 (2.7838) [2022-01-25 21:27:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1020/1251] eta 0:08:31 lr 0.000062 time 1.8379 (2.2127) loss 3.2864 (3.0611) grad_norm 2.7455 (2.7855) [2022-01-25 21:27:36 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [255/300][1030/1251] eta 0:08:08 lr 0.000062 time 1.9344 (2.2106) loss 3.5218 (3.0609) grad_norm 3.2266 (2.7855) [2022-01-25 21:27:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1040/1251] eta 0:07:45 lr 0.000062 time 1.7754 (2.2083) loss 3.5759 (3.0605) grad_norm 2.5348 (2.7850) [2022-01-25 21:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1050/1251] eta 0:07:23 lr 0.000062 time 1.8715 (2.2063) loss 3.3248 (3.0615) grad_norm 2.8975 (2.7842) [2022-01-25 21:28:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1060/1251] eta 0:07:01 lr 0.000062 time 2.2403 (2.2057) loss 2.9825 (3.0614) grad_norm 3.1780 (2.7874) [2022-01-25 21:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1070/1251] eta 0:06:39 lr 0.000062 time 2.6235 (2.2063) loss 3.0477 (3.0617) grad_norm 2.4019 (2.7852) [2022-01-25 21:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1080/1251] eta 0:06:18 lr 0.000062 time 3.5713 (2.2107) loss 2.1244 (3.0597) grad_norm 2.6961 (2.7843) [2022-01-25 21:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1090/1251] eta 0:05:56 lr 0.000062 time 2.7611 (2.2135) loss 3.0136 (3.0625) grad_norm 2.8322 (2.7834) [2022-01-25 21:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1100/1251] eta 0:05:34 lr 0.000062 time 2.0742 (2.2166) loss 3.5449 (3.0617) grad_norm 2.5132 (2.7840) [2022-01-25 21:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1110/1251] eta 0:05:12 lr 0.000062 time 1.8203 (2.2168) loss 2.5474 (3.0609) grad_norm 3.0116 (2.7836) [2022-01-25 21:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1120/1251] eta 0:04:50 lr 0.000062 time 2.8343 (2.2167) loss 3.3935 (3.0622) grad_norm 2.6313 (2.7821) [2022-01-25 21:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1130/1251] eta 0:04:27 lr 0.000062 time 2.0584 (2.2142) loss 3.4828 
(3.0642) grad_norm 2.6481 (2.7818) [2022-01-25 21:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1140/1251] eta 0:04:05 lr 0.000062 time 2.5276 (2.2127) loss 2.5755 (3.0654) grad_norm 2.7042 (2.7811) [2022-01-25 21:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1150/1251] eta 0:03:43 lr 0.000062 time 2.0373 (2.2124) loss 3.4866 (3.0646) grad_norm 2.5570 (2.7804) [2022-01-25 21:32:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1160/1251] eta 0:03:21 lr 0.000062 time 3.5768 (2.2137) loss 2.6688 (3.0647) grad_norm 2.4683 (2.7800) [2022-01-25 21:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1170/1251] eta 0:02:59 lr 0.000062 time 1.8954 (2.2139) loss 3.2028 (3.0660) grad_norm 2.7781 (2.7823) [2022-01-25 21:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1180/1251] eta 0:02:37 lr 0.000062 time 2.2679 (2.2137) loss 3.4649 (3.0676) grad_norm 3.1911 (2.7830) [2022-01-25 21:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1190/1251] eta 0:02:14 lr 0.000062 time 1.8914 (2.2120) loss 2.3704 (3.0690) grad_norm 2.5421 (2.7849) [2022-01-25 21:33:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1200/1251] eta 0:01:52 lr 0.000062 time 2.5447 (2.2100) loss 3.1054 (3.0683) grad_norm 2.5982 (2.7845) [2022-01-25 21:34:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1210/1251] eta 0:01:30 lr 0.000062 time 1.8860 (2.2095) loss 3.4453 (3.0680) grad_norm 2.8087 (2.7837) [2022-01-25 21:34:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1220/1251] eta 0:01:08 lr 0.000062 time 2.4252 (2.2089) loss 2.3057 (3.0674) grad_norm 2.6343 (2.7827) [2022-01-25 21:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1230/1251] eta 0:00:46 lr 0.000062 time 2.1061 (2.2080) loss 2.8402 (3.0677) grad_norm 2.4472 (2.7814) [2022-01-25 21:35:18 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [255/300][1240/1251] eta 0:00:24 lr 0.000062 time 2.8787 (2.2088) loss 3.2947 (3.0676) grad_norm 2.5243 (2.7810) [2022-01-25 21:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [255/300][1250/1251] eta 0:00:02 lr 0.000062 time 1.1686 (2.2038) loss 3.2375 (3.0693) grad_norm 2.4003 (2.7799) [2022-01-25 21:35:34 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 255 training takes 0:45:57 [2022-01-25 21:35:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.026 (19.026) Loss 0.8877 (0.8877) Acc@1 79.004 (79.004) Acc@5 94.922 (94.922) [2022-01-25 21:36:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.976 (3.485) Loss 0.9401 (0.8443) Acc@1 77.539 (80.256) Acc@5 95.117 (95.286) [2022-01-25 21:36:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.635 (2.650) Loss 0.8761 (0.8414) Acc@1 79.688 (80.339) Acc@5 94.531 (95.196) [2022-01-25 21:36:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.602 (2.277) Loss 0.7854 (0.8407) Acc@1 81.250 (80.349) Acc@5 95.703 (95.196) [2022-01-25 21:37:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.258 (2.193) Loss 0.7920 (0.8336) Acc@1 79.688 (80.454) Acc@5 95.898 (95.289) [2022-01-25 21:37:11 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.494 Acc@5 95.288 [2022-01-25 21:37:11 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.5% [2022-01-25 21:37:11 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.49% [2022-01-25 21:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][0/1251] eta 7:08:40 lr 0.000062 time 20.5602 (20.5602) loss 3.3498 (3.3498) grad_norm 3.4109 (3.4109) [2022-01-25 21:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][10/1251] eta 1:24:04 lr 0.000062 time 3.0941 (4.0646) loss 3.0650 (3.2037) grad_norm 2.6466 (2.6679) [2022-01-25 21:38:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][20/1251] eta 1:05:38 lr 0.000062 time 1.9986 (3.1997) loss 2.2073 (3.1554) grad_norm 2.4280 (2.6848) [2022-01-25 21:38:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][30/1251] eta 0:58:45 lr 0.000062 time 1.3216 (2.8877) loss 3.3629 (3.1928) grad_norm 2.9331 (2.7826) [2022-01-25 21:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][40/1251] eta 0:55:18 lr 0.000062 time 3.3276 (2.7406) loss 3.2309 (3.1602) grad_norm 2.5927 (2.8520) [2022-01-25 21:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][50/1251] eta 0:53:59 lr 0.000062 time 2.0991 (2.6973) loss 2.7641 (3.0668) grad_norm 2.7761 (2.8137) [2022-01-25 21:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][60/1251] eta 0:51:12 lr 0.000062 time 1.9690 (2.5798) loss 1.9896 (3.0340) grad_norm 3.2704 (2.8078) [2022-01-25 21:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][70/1251] eta 0:49:16 lr 0.000061 time 1.5529 (2.5033) loss 2.5930 (3.0342) grad_norm 2.8451 (2.8072) [2022-01-25 21:40:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][80/1251] eta 0:48:18 lr 0.000061 time 3.0280 (2.4754) loss 3.2437 (3.0488) grad_norm 2.9023 (2.7909) [2022-01-25 21:40:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][90/1251] eta 0:47:41 lr 0.000061 time 1.9168 (2.4643) loss 3.0942 (3.0345) grad_norm 2.6660 (2.7933) [2022-01-25 21:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][100/1251] eta 0:46:51 lr 0.000061 time 2.3369 (2.4430) loss 3.4901 (3.0140) grad_norm 2.7849 (2.8007) [2022-01-25 21:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][110/1251] eta 0:45:54 lr 0.000061 time 1.8930 (2.4138) loss 3.2207 (3.0194) grad_norm 2.8919 (2.7860) [2022-01-25 21:42:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][120/1251] eta 0:45:00 lr 0.000061 time 
2.7634 (2.3876) loss 3.7962 (3.0149) grad_norm 2.7123 (2.7908) [2022-01-25 21:42:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][130/1251] eta 0:44:03 lr 0.000061 time 2.1198 (2.3584) loss 3.6805 (3.0113) grad_norm 2.9045 (2.7864) [2022-01-25 21:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][140/1251] eta 0:43:19 lr 0.000061 time 1.5801 (2.3401) loss 3.1412 (3.0175) grad_norm 2.6378 (2.7797) [2022-01-25 21:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][150/1251] eta 0:42:39 lr 0.000061 time 1.9009 (2.3251) loss 2.9768 (3.0292) grad_norm 2.4961 (2.7850) [2022-01-25 21:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][160/1251] eta 0:42:10 lr 0.000061 time 2.9920 (2.3198) loss 3.7186 (3.0359) grad_norm 2.5152 (2.7728) [2022-01-25 21:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][170/1251] eta 0:41:53 lr 0.000061 time 2.1952 (2.3254) loss 2.4703 (3.0338) grad_norm 2.5869 (2.7677) [2022-01-25 21:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][180/1251] eta 0:41:32 lr 0.000061 time 1.7062 (2.3273) loss 3.0662 (3.0351) grad_norm 3.1735 (2.7771) [2022-01-25 21:44:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][190/1251] eta 0:41:05 lr 0.000061 time 1.8650 (2.3235) loss 3.4157 (3.0362) grad_norm 2.7071 (2.7747) [2022-01-25 21:44:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][200/1251] eta 0:40:31 lr 0.000061 time 1.9342 (2.3139) loss 2.3742 (3.0391) grad_norm 2.5933 (2.7746) [2022-01-25 21:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][210/1251] eta 0:40:03 lr 0.000061 time 2.6057 (2.3093) loss 2.1501 (3.0298) grad_norm 2.5906 (2.7663) [2022-01-25 21:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][220/1251] eta 0:39:35 lr 0.000061 time 1.6627 (2.3045) loss 3.2242 (3.0335) grad_norm 2.9647 (2.7655) [2022-01-25 21:46:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][230/1251] eta 0:39:02 lr 0.000061 time 1.7226 (2.2940) loss 1.9535 (3.0409) grad_norm 2.8808 (2.7624) [2022-01-25 21:46:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][240/1251] eta 0:38:32 lr 0.000061 time 2.3214 (2.2872) loss 3.4143 (3.0389) grad_norm 2.5285 (2.7663) [2022-01-25 21:46:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][250/1251] eta 0:38:05 lr 0.000061 time 2.5527 (2.2832) loss 2.6193 (3.0257) grad_norm 2.7132 (2.7643) [2022-01-25 21:47:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][260/1251] eta 0:37:37 lr 0.000061 time 2.0200 (2.2784) loss 2.9688 (3.0322) grad_norm 2.2212 (2.7593) [2022-01-25 21:47:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][270/1251] eta 0:37:15 lr 0.000061 time 2.4582 (2.2784) loss 2.3268 (3.0260) grad_norm 2.5906 (2.7627) [2022-01-25 21:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][280/1251] eta 0:36:48 lr 0.000061 time 2.3551 (2.2749) loss 3.3816 (3.0231) grad_norm 2.6605 (2.7571) [2022-01-25 21:48:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][290/1251] eta 0:36:22 lr 0.000061 time 2.9478 (2.2706) loss 2.1935 (3.0206) grad_norm 2.7265 (2.7638) [2022-01-25 21:48:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][300/1251] eta 0:35:52 lr 0.000061 time 1.6035 (2.2636) loss 3.0476 (3.0279) grad_norm 2.3393 (2.7697) [2022-01-25 21:48:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][310/1251] eta 0:35:29 lr 0.000061 time 2.4467 (2.2635) loss 3.6317 (3.0302) grad_norm 3.4816 (2.7685) [2022-01-25 21:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][320/1251] eta 0:35:03 lr 0.000061 time 2.6413 (2.2599) loss 2.3008 (3.0340) grad_norm 2.9010 (2.7674) [2022-01-25 21:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][330/1251] eta 0:34:40 lr 
0.000061 time 2.6004 (2.2590) loss 3.5350 (3.0416) grad_norm 2.8805 (2.7918) [2022-01-25 21:50:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][340/1251] eta 0:34:16 lr 0.000061 time 1.6012 (2.2573) loss 3.6096 (3.0461) grad_norm 2.7904 (2.7913) [2022-01-25 21:50:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][350/1251] eta 0:33:52 lr 0.000061 time 2.2685 (2.2554) loss 3.8346 (3.0530) grad_norm 2.4433 (2.7882) [2022-01-25 21:50:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][360/1251] eta 0:33:25 lr 0.000061 time 1.8981 (2.2508) loss 3.3786 (3.0550) grad_norm 2.8148 (2.7863) [2022-01-25 21:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][370/1251] eta 0:33:04 lr 0.000061 time 3.5375 (2.2522) loss 3.2812 (3.0552) grad_norm 2.6983 (2.7893) [2022-01-25 21:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][380/1251] eta 0:32:37 lr 0.000061 time 1.9009 (2.2470) loss 3.2508 (3.0644) grad_norm 2.8319 (2.7883) [2022-01-25 21:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][390/1251] eta 0:32:12 lr 0.000061 time 2.2493 (2.2449) loss 3.1450 (3.0612) grad_norm 2.5353 (2.7924) [2022-01-25 21:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][400/1251] eta 0:31:46 lr 0.000061 time 2.0458 (2.2404) loss 2.4279 (3.0552) grad_norm 2.4937 (2.7942) [2022-01-25 21:52:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][410/1251] eta 0:31:24 lr 0.000061 time 3.0474 (2.2408) loss 1.8877 (3.0579) grad_norm 3.1813 (2.7966) [2022-01-25 21:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][420/1251] eta 0:31:01 lr 0.000061 time 1.7887 (2.2403) loss 3.4914 (3.0559) grad_norm 2.5109 (2.7971) [2022-01-25 21:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][430/1251] eta 0:30:38 lr 0.000061 time 2.8133 (2.2394) loss 3.4307 (3.0580) grad_norm 3.6479 (2.7977) [2022-01-25 21:53:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][440/1251] eta 0:30:15 lr 0.000061 time 2.4234 (2.2387) loss 2.8629 (3.0581) grad_norm 3.2250 (2.7966) [2022-01-25 21:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][450/1251] eta 0:29:51 lr 0.000061 time 2.8999 (2.2362) loss 2.9199 (3.0606) grad_norm 2.6221 (2.7942) [2022-01-25 21:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][460/1251] eta 0:29:27 lr 0.000061 time 2.0352 (2.2340) loss 3.4454 (3.0560) grad_norm 2.7336 (2.7926) [2022-01-25 21:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][470/1251] eta 0:29:04 lr 0.000061 time 2.1964 (2.2331) loss 3.5030 (3.0628) grad_norm 3.0455 (2.7936) [2022-01-25 21:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][480/1251] eta 0:28:41 lr 0.000061 time 2.1271 (2.2333) loss 3.3222 (3.0632) grad_norm 3.4467 (2.7946) [2022-01-25 21:55:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][490/1251] eta 0:28:19 lr 0.000061 time 2.5749 (2.2337) loss 3.1166 (3.0619) grad_norm 2.5615 (2.7949) [2022-01-25 21:55:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][500/1251] eta 0:27:55 lr 0.000061 time 1.8203 (2.2306) loss 3.3869 (3.0654) grad_norm 3.2072 (2.8090) [2022-01-25 21:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][510/1251] eta 0:27:29 lr 0.000061 time 1.9551 (2.2261) loss 2.1670 (3.0643) grad_norm 2.6235 (2.8085) [2022-01-25 21:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][520/1251] eta 0:27:06 lr 0.000061 time 2.4629 (2.2251) loss 3.3544 (3.0628) grad_norm 2.7949 (2.8103) [2022-01-25 21:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][530/1251] eta 0:26:45 lr 0.000061 time 1.9590 (2.2266) loss 3.6209 (3.0657) grad_norm 2.9736 (2.8153) [2022-01-25 21:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][540/1251] eta 0:26:23 lr 
0.000061 time 2.1958 (2.2270) loss 3.4479 (3.0633) grad_norm 2.7134 (2.8249) [2022-01-25 21:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][550/1251] eta 0:25:59 lr 0.000061 time 2.2695 (2.2252) loss 3.2442 (3.0579) grad_norm 2.7439 (2.8255) [2022-01-25 21:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][560/1251] eta 0:25:37 lr 0.000061 time 2.4543 (2.2253) loss 2.1059 (3.0551) grad_norm 2.6466 (2.8240) [2022-01-25 21:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][570/1251] eta 0:25:16 lr 0.000061 time 2.0329 (2.2274) loss 3.7543 (3.0507) grad_norm 3.0453 (2.8238) [2022-01-25 21:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][580/1251] eta 0:24:53 lr 0.000061 time 2.6043 (2.2265) loss 2.8884 (3.0526) grad_norm 2.9874 (2.8241) [2022-01-25 21:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][590/1251] eta 0:24:29 lr 0.000061 time 1.8666 (2.2231) loss 3.3458 (3.0547) grad_norm 2.6696 (2.8216) [2022-01-25 21:59:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][600/1251] eta 0:24:05 lr 0.000061 time 2.9468 (2.2208) loss 3.7104 (3.0556) grad_norm 2.6605 (2.8217) [2022-01-25 21:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][610/1251] eta 0:23:42 lr 0.000061 time 2.1025 (2.2198) loss 3.3490 (3.0556) grad_norm 2.6647 (2.8208) [2022-01-25 22:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][620/1251] eta 0:23:20 lr 0.000060 time 2.4267 (2.2198) loss 2.7017 (3.0589) grad_norm 2.4624 (2.8190) [2022-01-25 22:00:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][630/1251] eta 0:22:58 lr 0.000060 time 2.0981 (2.2204) loss 3.3569 (3.0622) grad_norm 3.0214 (2.8167) [2022-01-25 22:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][640/1251] eta 0:22:35 lr 0.000060 time 1.6082 (2.2184) loss 3.4739 (3.0579) grad_norm 2.6525 (2.8126) [2022-01-25 22:01:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][650/1251] eta 0:22:12 lr 0.000060 time 1.8643 (2.2167) loss 3.6966 (3.0561) grad_norm 3.4676 (2.8100) [2022-01-25 22:01:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][660/1251] eta 0:21:50 lr 0.000060 time 2.6788 (2.2180) loss 2.8252 (3.0530) grad_norm 2.7178 (2.8094) [2022-01-25 22:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][670/1251] eta 0:21:28 lr 0.000060 time 2.3330 (2.2182) loss 2.1997 (3.0522) grad_norm 2.3534 (2.8068) [2022-01-25 22:02:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][680/1251] eta 0:21:06 lr 0.000060 time 1.7074 (2.2173) loss 3.5164 (3.0545) grad_norm 2.2057 (2.8065) [2022-01-25 22:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][690/1251] eta 0:20:43 lr 0.000060 time 1.8836 (2.2157) loss 2.1882 (3.0548) grad_norm 5.4129 (2.8078) [2022-01-25 22:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][700/1251] eta 0:20:21 lr 0.000060 time 2.4742 (2.2164) loss 3.9675 (3.0565) grad_norm 2.9551 (2.8076) [2022-01-25 22:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][710/1251] eta 0:19:58 lr 0.000060 time 1.8593 (2.2157) loss 3.1783 (3.0578) grad_norm 2.8573 (2.8118) [2022-01-25 22:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][720/1251] eta 0:19:36 lr 0.000060 time 2.0879 (2.2156) loss 3.1891 (3.0532) grad_norm 3.4257 (2.8095) [2022-01-25 22:04:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][730/1251] eta 0:19:13 lr 0.000060 time 2.4820 (2.2144) loss 3.4895 (3.0540) grad_norm 2.2903 (2.8129) [2022-01-25 22:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][740/1251] eta 0:18:50 lr 0.000060 time 2.1179 (2.2131) loss 3.4149 (3.0560) grad_norm 2.4875 (2.8124) [2022-01-25 22:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][750/1251] eta 0:18:29 lr 
0.000060 time 2.7746 (2.2139) loss 2.5125 (3.0562) grad_norm 2.6394 (2.8182) [2022-01-25 22:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][760/1251] eta 0:18:06 lr 0.000060 time 1.8452 (2.2132) loss 2.2390 (3.0549) grad_norm 2.6578 (2.8209) [2022-01-25 22:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][770/1251] eta 0:17:44 lr 0.000060 time 2.4805 (2.2123) loss 2.9333 (3.0564) grad_norm 3.4400 (2.8219) [2022-01-25 22:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][780/1251] eta 0:17:20 lr 0.000060 time 1.8140 (2.2101) loss 3.5389 (3.0596) grad_norm 2.9005 (2.8226) [2022-01-25 22:06:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][790/1251] eta 0:17:00 lr 0.000060 time 2.3188 (2.2126) loss 3.4483 (3.0585) grad_norm 2.5736 (2.8230) [2022-01-25 22:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][800/1251] eta 0:16:37 lr 0.000060 time 1.9738 (2.2122) loss 2.6795 (3.0591) grad_norm 2.8387 (2.8229) [2022-01-25 22:07:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][810/1251] eta 0:16:15 lr 0.000060 time 2.1675 (2.2113) loss 3.4956 (3.0572) grad_norm 2.6227 (2.8222) [2022-01-25 22:07:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][820/1251] eta 0:15:51 lr 0.000060 time 1.5992 (2.2085) loss 3.5218 (3.0594) grad_norm 2.7194 (2.8241) [2022-01-25 22:07:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][830/1251] eta 0:15:29 lr 0.000060 time 1.8939 (2.2089) loss 3.3211 (3.0587) grad_norm 2.5424 (2.8220) [2022-01-25 22:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][840/1251] eta 0:15:07 lr 0.000060 time 1.7015 (2.2071) loss 3.1476 (3.0595) grad_norm 2.6832 (2.8203) [2022-01-25 22:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][850/1251] eta 0:14:45 lr 0.000060 time 2.4348 (2.2086) loss 2.4286 (3.0574) grad_norm 3.0530 (2.8182) [2022-01-25 22:08:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][860/1251] eta 0:14:23 lr 0.000060 time 1.6369 (2.2082) loss 3.2996 (3.0600) grad_norm 2.9827 (2.8174) [2022-01-25 22:09:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][870/1251] eta 0:14:01 lr 0.000060 time 1.8558 (2.2084) loss 2.3520 (3.0595) grad_norm 2.5222 (2.8159) [2022-01-25 22:09:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][880/1251] eta 0:13:38 lr 0.000060 time 1.8487 (2.2065) loss 3.3659 (3.0605) grad_norm 3.0735 (2.8182) [2022-01-25 22:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][890/1251] eta 0:13:16 lr 0.000060 time 2.3912 (2.2074) loss 2.9937 (3.0606) grad_norm 2.9840 (2.8192) [2022-01-25 22:10:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][900/1251] eta 0:12:54 lr 0.000060 time 2.0122 (2.2076) loss 3.6103 (3.0604) grad_norm 2.7305 (2.8200) [2022-01-25 22:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][910/1251] eta 0:12:33 lr 0.000060 time 2.5396 (2.2095) loss 2.7335 (3.0595) grad_norm 2.6045 (2.8177) [2022-01-25 22:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][920/1251] eta 0:12:11 lr 0.000060 time 2.1951 (2.2102) loss 2.8309 (3.0548) grad_norm 3.3046 (2.8181) [2022-01-25 22:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][930/1251] eta 0:11:49 lr 0.000060 time 1.8054 (2.2100) loss 3.3502 (3.0548) grad_norm 3.0882 (2.8181) [2022-01-25 22:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][940/1251] eta 0:11:27 lr 0.000060 time 2.0631 (2.2100) loss 3.7128 (3.0556) grad_norm 2.7577 (2.8168) [2022-01-25 22:12:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][950/1251] eta 0:11:04 lr 0.000060 time 1.7153 (2.2075) loss 2.9232 (3.0563) grad_norm 2.6585 (2.8164) [2022-01-25 22:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][960/1251] eta 0:10:41 lr 
0.000060 time 2.0456 (2.2049) loss 3.2796 (3.0564) grad_norm 2.5041 (2.8155) [2022-01-25 22:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][970/1251] eta 0:10:19 lr 0.000060 time 2.5410 (2.2047) loss 3.4344 (3.0572) grad_norm 3.2659 (2.8134) [2022-01-25 22:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][980/1251] eta 0:09:57 lr 0.000060 time 2.8896 (2.2062) loss 3.3540 (3.0582) grad_norm 2.2391 (2.8114) [2022-01-25 22:13:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][990/1251] eta 0:09:35 lr 0.000060 time 2.2240 (2.2056) loss 3.7916 (3.0587) grad_norm 3.7220 (2.8120) [2022-01-25 22:13:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1000/1251] eta 0:09:13 lr 0.000060 time 2.6220 (2.2060) loss 3.2147 (3.0594) grad_norm 2.4623 (2.8123) [2022-01-25 22:14:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1010/1251] eta 0:08:51 lr 0.000060 time 3.4201 (2.2068) loss 2.6706 (3.0594) grad_norm 2.9505 (2.8120) [2022-01-25 22:14:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1020/1251] eta 0:08:29 lr 0.000060 time 1.5798 (2.2058) loss 3.0237 (3.0607) grad_norm 2.5359 (2.8129) [2022-01-25 22:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1030/1251] eta 0:08:07 lr 0.000060 time 2.2059 (2.2050) loss 3.8073 (3.0616) grad_norm 3.1108 (2.8127) [2022-01-25 22:15:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1040/1251] eta 0:07:45 lr 0.000060 time 2.5289 (2.2053) loss 3.7449 (3.0613) grad_norm 3.3264 (2.8128) [2022-01-25 22:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1050/1251] eta 0:07:23 lr 0.000060 time 2.7522 (2.2057) loss 3.3627 (3.0619) grad_norm 2.9134 (2.8115) [2022-01-25 22:16:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1060/1251] eta 0:07:01 lr 0.000060 time 1.7711 (2.2044) loss 2.4073 (3.0612) grad_norm 2.4812 (2.8120) [2022-01-25 
22:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1070/1251] eta 0:06:38 lr 0.000060 time 2.9563 (2.2038) loss 2.6319 (3.0606) grad_norm 2.7152 (2.8109) [2022-01-25 22:16:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1080/1251] eta 0:06:16 lr 0.000060 time 2.2020 (2.2036) loss 3.3148 (3.0624) grad_norm 2.4986 (2.8107) [2022-01-25 22:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1090/1251] eta 0:05:54 lr 0.000060 time 2.9060 (2.2030) loss 3.2195 (3.0635) grad_norm 2.7498 (2.8086) [2022-01-25 22:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1100/1251] eta 0:05:32 lr 0.000060 time 2.2755 (2.2023) loss 3.3838 (3.0651) grad_norm 2.5032 (2.8068) [2022-01-25 22:17:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1110/1251] eta 0:05:10 lr 0.000060 time 2.5083 (2.2021) loss 3.2253 (3.0667) grad_norm 3.1587 (2.8062) [2022-01-25 22:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1120/1251] eta 0:04:48 lr 0.000060 time 1.9851 (2.2021) loss 3.3079 (3.0655) grad_norm 2.8352 (2.8060) [2022-01-25 22:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1130/1251] eta 0:04:26 lr 0.000060 time 2.7801 (2.2019) loss 2.1648 (3.0644) grad_norm 2.7611 (2.8059) [2022-01-25 22:19:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1140/1251] eta 0:04:04 lr 0.000060 time 1.7321 (2.2028) loss 3.2908 (3.0631) grad_norm 2.7833 (2.8044) [2022-01-25 22:19:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1150/1251] eta 0:03:42 lr 0.000060 time 1.9865 (2.2024) loss 3.1552 (3.0619) grad_norm 2.9303 (2.8051) [2022-01-25 22:19:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1160/1251] eta 0:03:20 lr 0.000060 time 1.6000 (2.2023) loss 2.8619 (3.0590) grad_norm 3.5130 (2.8062) [2022-01-25 22:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1170/1251] 
eta 0:02:58 lr 0.000059 time 2.2585 (2.2017) loss 2.7950 (3.0600) grad_norm 2.4431 (2.8051) [2022-01-25 22:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1180/1251] eta 0:02:36 lr 0.000059 time 2.2210 (2.2027) loss 3.0411 (3.0614) grad_norm 2.5943 (2.8046) [2022-01-25 22:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1190/1251] eta 0:02:14 lr 0.000059 time 2.1071 (2.2023) loss 2.7791 (3.0623) grad_norm 2.4764 (2.8042) [2022-01-25 22:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1200/1251] eta 0:01:52 lr 0.000059 time 1.7409 (2.2024) loss 3.3948 (3.0638) grad_norm 2.8022 (2.8027) [2022-01-25 22:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1210/1251] eta 0:01:30 lr 0.000059 time 1.8925 (2.2019) loss 3.3051 (3.0643) grad_norm 2.3280 (2.8018) [2022-01-25 22:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1220/1251] eta 0:01:08 lr 0.000059 time 1.6193 (2.2018) loss 2.7911 (3.0647) grad_norm 4.1609 (2.8011) [2022-01-25 22:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1230/1251] eta 0:00:46 lr 0.000059 time 1.9532 (2.2010) loss 3.9477 (3.0658) grad_norm 2.7069 (2.8023) [2022-01-25 22:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1240/1251] eta 0:00:24 lr 0.000059 time 1.7032 (2.2014) loss 2.2105 (3.0665) grad_norm 2.6403 (2.8022) [2022-01-25 22:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [256/300][1250/1251] eta 0:00:02 lr 0.000059 time 1.1774 (2.1956) loss 2.4228 (3.0666) grad_norm 2.5497 (2.8015) [2022-01-25 22:22:58 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 256 training takes 0:45:47 [2022-01-25 22:23:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.601 (18.601) Loss 0.7585 (0.7585) Acc@1 82.324 (82.324) Acc@5 96.289 (96.289) [2022-01-25 22:23:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.873 (3.501) 
Loss 0.8824 (0.8289) Acc@1 79.395 (80.460) Acc@5 94.922 (95.277) [2022-01-25 22:23:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.530 (2.709) Loss 0.8237 (0.8314) Acc@1 81.738 (80.534) Acc@5 95.117 (95.257) [2022-01-25 22:24:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.017 (2.428) Loss 0.8319 (0.8334) Acc@1 80.664 (80.570) Acc@5 96.484 (95.316) [2022-01-25 22:24:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.733 (2.184) Loss 0.8619 (0.8339) Acc@1 80.176 (80.585) Acc@5 95.410 (95.312) [2022-01-25 22:24:36 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.578 Acc@5 95.310 [2022-01-25 22:24:36 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-01-25 22:24:36 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.58% [2022-01-25 22:24:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][0/1251] eta 7:40:16 lr 0.000059 time 22.0752 (22.0752) loss 3.3979 (3.3979) grad_norm 2.5734 (2.5734) [2022-01-25 22:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][10/1251] eta 1:26:11 lr 0.000059 time 2.2003 (4.1675) loss 3.4728 (3.1001) grad_norm 2.8384 (2.5894) [2022-01-25 22:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][20/1251] eta 1:07:14 lr 0.000059 time 2.0865 (3.2772) loss 2.5553 (2.9978) grad_norm 2.6069 (2.6192) [2022-01-25 22:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][30/1251] eta 0:59:11 lr 0.000059 time 1.5380 (2.9083) loss 3.5777 (3.0399) grad_norm 3.1029 (2.7639) [2022-01-25 22:26:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][40/1251] eta 0:55:06 lr 0.000059 time 3.6773 (2.7307) loss 3.0363 (3.0296) grad_norm 2.5033 (2.7635) [2022-01-25 22:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][50/1251] eta 0:52:15 lr 0.000059 time 1.4696 (2.6110) loss 3.3375 (3.0469) 
grad_norm 3.0263 (2.7762) [2022-01-25 22:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][60/1251] eta 0:50:04 lr 0.000059 time 2.3345 (2.5230) loss 3.4585 (3.0466) grad_norm 2.3988 (2.8578) [2022-01-25 22:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][70/1251] eta 0:48:35 lr 0.000059 time 1.6218 (2.4686) loss 3.5194 (3.0352) grad_norm 2.8450 (2.8603) [2022-01-25 22:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][80/1251] eta 0:47:43 lr 0.000059 time 2.8271 (2.4457) loss 3.2471 (3.0679) grad_norm 3.4193 (2.8726) [2022-01-25 22:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][90/1251] eta 0:47:06 lr 0.000059 time 2.5042 (2.4348) loss 3.3709 (3.0595) grad_norm 2.5489 (2.8347) [2022-01-25 22:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][100/1251] eta 0:46:07 lr 0.000059 time 2.2148 (2.4041) loss 3.5164 (3.0602) grad_norm 2.9019 (2.8356) [2022-01-25 22:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][110/1251] eta 0:45:15 lr 0.000059 time 1.5516 (2.3799) loss 3.1804 (3.0503) grad_norm 2.5673 (2.8154) [2022-01-25 22:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][120/1251] eta 0:44:38 lr 0.000059 time 2.8867 (2.3686) loss 3.5114 (3.0453) grad_norm 2.9176 (2.8068) [2022-01-25 22:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][130/1251] eta 0:44:01 lr 0.000059 time 1.8149 (2.3568) loss 3.2992 (3.0488) grad_norm 3.0793 (2.8039) [2022-01-25 22:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][140/1251] eta 0:43:25 lr 0.000059 time 2.1179 (2.3448) loss 3.2269 (3.0489) grad_norm 2.9927 (2.7983) [2022-01-25 22:30:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][150/1251] eta 0:42:51 lr 0.000059 time 1.5938 (2.3352) loss 2.8715 (3.0438) grad_norm 2.9113 (2.7937) [2022-01-25 22:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[257/300][160/1251] eta 0:42:26 lr 0.000059 time 3.4273 (2.3343) loss 1.9912 (3.0440) grad_norm 2.6782 (2.7888) [2022-01-25 22:31:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][170/1251] eta 0:41:54 lr 0.000059 time 1.8125 (2.3265) loss 3.5530 (3.0521) grad_norm 2.6658 (2.7834) [2022-01-25 22:31:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][180/1251] eta 0:41:19 lr 0.000059 time 1.5965 (2.3155) loss 3.1762 (3.0493) grad_norm 2.7480 (2.7783) [2022-01-25 22:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][190/1251] eta 0:40:45 lr 0.000059 time 1.4995 (2.3052) loss 2.5440 (3.0418) grad_norm 2.4767 (2.7637) [2022-01-25 22:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][200/1251] eta 0:40:23 lr 0.000059 time 3.1200 (2.3062) loss 2.4910 (3.0268) grad_norm 2.6587 (2.7669) [2022-01-25 22:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][210/1251] eta 0:39:51 lr 0.000059 time 1.7868 (2.2977) loss 3.1677 (3.0375) grad_norm 2.5812 (2.7764) [2022-01-25 22:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][220/1251] eta 0:39:15 lr 0.000059 time 1.7905 (2.2849) loss 3.3276 (3.0432) grad_norm 2.4983 (2.7727) [2022-01-25 22:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][230/1251] eta 0:38:43 lr 0.000059 time 1.9105 (2.2759) loss 3.1075 (3.0414) grad_norm 2.3474 (2.7756) [2022-01-25 22:33:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][240/1251] eta 0:38:10 lr 0.000059 time 1.7365 (2.2656) loss 2.7154 (3.0400) grad_norm 3.5892 (2.7786) [2022-01-25 22:34:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][250/1251] eta 0:37:41 lr 0.000059 time 1.5701 (2.2588) loss 3.1779 (3.0306) grad_norm 2.5258 (2.7741) [2022-01-25 22:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][260/1251] eta 0:37:18 lr 0.000059 time 2.4640 (2.2587) loss 2.5466 (3.0329) grad_norm 
2.4792 (2.7810) [2022-01-25 22:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][270/1251] eta 0:37:02 lr 0.000059 time 2.3668 (2.2660) loss 2.5315 (3.0351) grad_norm 2.5227 (2.7825) [2022-01-25 22:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][280/1251] eta 0:36:33 lr 0.000059 time 1.6716 (2.2592) loss 3.6377 (3.0407) grad_norm 2.6705 (2.7812) [2022-01-25 22:35:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][290/1251] eta 0:36:10 lr 0.000059 time 2.2170 (2.2582) loss 3.4435 (3.0397) grad_norm 4.0514 (2.7868) [2022-01-25 22:35:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][300/1251] eta 0:35:46 lr 0.000059 time 2.7923 (2.2573) loss 3.3844 (3.0409) grad_norm 2.8015 (2.7927) [2022-01-25 22:36:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][310/1251] eta 0:35:31 lr 0.000059 time 2.5937 (2.2652) loss 3.1870 (3.0443) grad_norm 3.2450 (2.7897) [2022-01-25 22:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][320/1251] eta 0:35:05 lr 0.000059 time 1.8715 (2.2618) loss 2.3417 (3.0432) grad_norm 2.5314 (2.7826) [2022-01-25 22:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][330/1251] eta 0:34:37 lr 0.000059 time 2.0575 (2.2556) loss 3.0834 (3.0435) grad_norm 2.5570 (2.7807) [2022-01-25 22:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][340/1251] eta 0:34:09 lr 0.000059 time 2.7008 (2.2493) loss 3.9905 (3.0419) grad_norm 2.5609 (2.7794) [2022-01-25 22:37:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][350/1251] eta 0:33:42 lr 0.000059 time 2.1849 (2.2449) loss 3.2609 (3.0366) grad_norm 2.5902 (2.7783) [2022-01-25 22:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][360/1251] eta 0:33:14 lr 0.000059 time 1.8937 (2.2384) loss 3.3481 (3.0393) grad_norm 2.3207 (2.7814) [2022-01-25 22:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[257/300][370/1251] eta 0:32:48 lr 0.000059 time 2.2116 (2.2343) loss 3.0324 (3.0347) grad_norm 2.1798 (2.7849) [2022-01-25 22:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][380/1251] eta 0:32:23 lr 0.000059 time 2.0578 (2.2310) loss 2.8361 (3.0354) grad_norm 2.6118 (2.7807) [2022-01-25 22:39:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][390/1251] eta 0:32:00 lr 0.000059 time 2.6735 (2.2301) loss 3.1488 (3.0344) grad_norm 2.7725 (2.7786) [2022-01-25 22:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][400/1251] eta 0:31:36 lr 0.000059 time 1.8475 (2.2287) loss 3.4459 (3.0290) grad_norm 2.7227 (2.7818) [2022-01-25 22:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][410/1251] eta 0:31:16 lr 0.000059 time 1.5014 (2.2315) loss 3.7762 (3.0284) grad_norm 3.2159 (2.7787) [2022-01-25 22:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][420/1251] eta 0:30:57 lr 0.000059 time 2.0985 (2.2354) loss 2.1373 (3.0285) grad_norm 2.7965 (2.7856) [2022-01-25 22:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][430/1251] eta 0:30:37 lr 0.000059 time 2.8273 (2.2376) loss 3.4643 (3.0272) grad_norm 3.2416 (2.7866) [2022-01-25 22:41:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][440/1251] eta 0:30:13 lr 0.000059 time 2.1919 (2.2364) loss 2.7726 (3.0295) grad_norm 3.0602 (2.7887) [2022-01-25 22:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][450/1251] eta 0:29:48 lr 0.000059 time 1.5873 (2.2327) loss 3.2574 (3.0310) grad_norm 3.5582 (2.7895) [2022-01-25 22:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][460/1251] eta 0:29:23 lr 0.000059 time 1.5885 (2.2297) loss 3.8126 (3.0314) grad_norm 2.9444 (2.7906) [2022-01-25 22:42:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][470/1251] eta 0:29:03 lr 0.000058 time 2.2165 (2.2326) loss 3.0176 (3.0381) grad_norm 
2.8386 (2.7914) [2022-01-25 22:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][480/1251] eta 0:28:40 lr 0.000058 time 1.8931 (2.2310) loss 3.1490 (3.0376) grad_norm 2.6245 (2.7909) [2022-01-25 22:42:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][490/1251] eta 0:28:16 lr 0.000058 time 1.5626 (2.2299) loss 3.2296 (3.0362) grad_norm 3.4370 (2.7897) [2022-01-25 22:43:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][500/1251] eta 0:27:54 lr 0.000058 time 1.9036 (2.2293) loss 2.8378 (3.0393) grad_norm 2.6055 (2.7909) [2022-01-25 22:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][510/1251] eta 0:27:35 lr 0.000058 time 2.1153 (2.2340) loss 3.3584 (3.0409) grad_norm 2.7702 (2.7916) [2022-01-25 22:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][520/1251] eta 0:27:10 lr 0.000058 time 1.9497 (2.2298) loss 3.1200 (3.0436) grad_norm 2.4850 (2.7900) [2022-01-25 22:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][530/1251] eta 0:26:45 lr 0.000058 time 1.9262 (2.2273) loss 3.1762 (3.0426) grad_norm 2.4632 (2.7927) [2022-01-25 22:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][540/1251] eta 0:26:21 lr 0.000058 time 2.0525 (2.2244) loss 3.1257 (3.0430) grad_norm 2.7110 (2.7949) [2022-01-25 22:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][550/1251] eta 0:25:59 lr 0.000058 time 2.2281 (2.2242) loss 3.1564 (3.0464) grad_norm 2.7001 (2.7939) [2022-01-25 22:45:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][560/1251] eta 0:25:35 lr 0.000058 time 1.7105 (2.2224) loss 2.6169 (3.0479) grad_norm 3.3780 (2.7949) [2022-01-25 22:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][570/1251] eta 0:25:10 lr 0.000058 time 1.8045 (2.2176) loss 3.2876 (3.0506) grad_norm 3.0824 (2.8003) [2022-01-25 22:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[257/300][580/1251] eta 0:24:47 lr 0.000058 time 1.9994 (2.2164) loss 2.4628 (3.0504) grad_norm 2.4423 (2.8011) [2022-01-25 22:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][590/1251] eta 0:24:26 lr 0.000058 time 2.1556 (2.2183) loss 1.9229 (3.0475) grad_norm 2.9988 (2.8024) [2022-01-25 22:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][600/1251] eta 0:24:06 lr 0.000058 time 2.5916 (2.2218) loss 3.3705 (3.0474) grad_norm 2.9873 (2.8009) [2022-01-25 22:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][610/1251] eta 0:23:45 lr 0.000058 time 2.4534 (2.2245) loss 2.4825 (3.0422) grad_norm 2.6146 (2.7998) [2022-01-25 22:47:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][620/1251] eta 0:23:26 lr 0.000058 time 2.7683 (2.2283) loss 3.0641 (3.0414) grad_norm 2.4790 (2.7980) [2022-01-25 22:48:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][630/1251] eta 0:23:03 lr 0.000058 time 2.5565 (2.2278) loss 2.0333 (3.0406) grad_norm 2.6706 (2.8012) [2022-01-25 22:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][640/1251] eta 0:22:38 lr 0.000058 time 1.9227 (2.2234) loss 3.5794 (3.0422) grad_norm 2.8580 (2.8012) [2022-01-25 22:48:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][650/1251] eta 0:22:12 lr 0.000058 time 1.9684 (2.2178) loss 3.7609 (3.0461) grad_norm 2.4841 (2.8083) [2022-01-25 22:49:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][660/1251] eta 0:21:48 lr 0.000058 time 2.1842 (2.2148) loss 2.8855 (3.0510) grad_norm 2.6038 (2.8088) [2022-01-25 22:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][670/1251] eta 0:21:24 lr 0.000058 time 2.1603 (2.2110) loss 3.3673 (3.0511) grad_norm 2.5377 (2.8076) [2022-01-25 22:49:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][680/1251] eta 0:21:02 lr 0.000058 time 2.2882 (2.2104) loss 2.6861 (3.0504) grad_norm 
2.6225 (2.8080) [2022-01-25 22:50:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][690/1251] eta 0:20:40 lr 0.000058 time 2.6731 (2.2107) loss 3.2148 (3.0544) grad_norm 2.3981 (2.8093) [2022-01-25 22:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][700/1251] eta 0:20:19 lr 0.000058 time 2.5023 (2.2139) loss 3.9070 (3.0567) grad_norm 3.0217 (2.8085) [2022-01-25 22:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][710/1251] eta 0:19:59 lr 0.000058 time 2.3949 (2.2166) loss 3.1663 (3.0538) grad_norm 2.5706 (2.8071) [2022-01-25 22:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][720/1251] eta 0:19:37 lr 0.000058 time 2.4944 (2.2184) loss 3.7280 (3.0527) grad_norm 2.7593 (2.8069) [2022-01-25 22:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][730/1251] eta 0:19:15 lr 0.000058 time 2.2242 (2.2182) loss 3.6041 (3.0562) grad_norm 2.6555 (2.8060) [2022-01-25 22:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][740/1251] eta 0:18:52 lr 0.000058 time 1.8941 (2.2160) loss 2.9204 (3.0596) grad_norm 2.7332 (2.8051) [2022-01-25 22:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][750/1251] eta 0:18:29 lr 0.000058 time 2.2455 (2.2139) loss 3.5002 (3.0580) grad_norm 2.8008 (2.8056) [2022-01-25 22:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][760/1251] eta 0:18:07 lr 0.000058 time 2.0089 (2.2142) loss 3.7943 (3.0592) grad_norm 2.5483 (2.8065) [2022-01-25 22:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][770/1251] eta 0:17:44 lr 0.000058 time 1.8762 (2.2129) loss 2.9321 (3.0593) grad_norm 2.3878 (2.8050) [2022-01-25 22:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][780/1251] eta 0:17:22 lr 0.000058 time 1.6172 (2.2137) loss 3.3948 (3.0615) grad_norm 2.3545 (2.8038) [2022-01-25 22:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[257/300][790/1251] eta 0:16:59 lr 0.000058 time 1.7983 (2.2122) loss 3.7156 (3.0624) grad_norm 2.7515 (2.8037) [2022-01-25 22:54:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][800/1251] eta 0:16:38 lr 0.000058 time 2.1885 (2.2131) loss 3.3573 (3.0633) grad_norm 2.3748 (2.8025) [2022-01-25 22:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][810/1251] eta 0:16:15 lr 0.000058 time 1.9426 (2.2111) loss 3.3465 (3.0633) grad_norm 2.7395 (2.8015) [2022-01-25 22:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][820/1251] eta 0:15:53 lr 0.000058 time 2.1024 (2.2113) loss 2.1524 (3.0610) grad_norm 2.8896 (2.8012) [2022-01-25 22:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][830/1251] eta 0:15:31 lr 0.000058 time 2.2615 (2.2120) loss 3.4943 (3.0602) grad_norm 3.0624 (2.7999) [2022-01-25 22:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][840/1251] eta 0:15:09 lr 0.000058 time 2.0612 (2.2117) loss 3.0510 (3.0625) grad_norm 2.6830 (2.7985) [2022-01-25 22:55:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][850/1251] eta 0:14:46 lr 0.000058 time 1.9037 (2.2105) loss 2.7203 (3.0624) grad_norm 3.1476 (2.7989) [2022-01-25 22:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][860/1251] eta 0:14:24 lr 0.000058 time 1.4347 (2.2120) loss 2.3765 (3.0616) grad_norm 2.7134 (2.7988) [2022-01-25 22:56:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][870/1251] eta 0:14:02 lr 0.000058 time 2.1561 (2.2121) loss 3.4338 (3.0638) grad_norm 3.4809 (2.7999) [2022-01-25 22:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][880/1251] eta 0:13:40 lr 0.000058 time 1.5709 (2.2113) loss 3.3863 (3.0652) grad_norm 2.6298 (2.8003) [2022-01-25 22:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][890/1251] eta 0:13:17 lr 0.000058 time 1.7266 (2.2100) loss 2.7627 (3.0670) grad_norm 
2.5065 (2.8005) [2022-01-25 22:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][900/1251] eta 0:12:55 lr 0.000058 time 2.2080 (2.2091) loss 3.0465 (3.0680) grad_norm 2.6560 (2.8014) [2022-01-25 22:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][910/1251] eta 0:12:32 lr 0.000058 time 1.9005 (2.2073) loss 3.3435 (3.0674) grad_norm 2.7181 (2.8000) [2022-01-25 22:58:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][920/1251] eta 0:12:09 lr 0.000058 time 1.8493 (2.2052) loss 3.2663 (3.0693) grad_norm 2.9581 (2.8002) [2022-01-25 22:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][930/1251] eta 0:11:47 lr 0.000058 time 1.9195 (2.2048) loss 3.0760 (3.0692) grad_norm 2.3971 (2.8016) [2022-01-25 22:59:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][940/1251] eta 0:11:26 lr 0.000058 time 2.7853 (2.2065) loss 2.7947 (3.0693) grad_norm 3.0990 (2.8084) [2022-01-25 22:59:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][950/1251] eta 0:11:04 lr 0.000058 time 1.8146 (2.2064) loss 2.5908 (3.0670) grad_norm 2.7299 (2.8086) [2022-01-25 22:59:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][960/1251] eta 0:10:41 lr 0.000058 time 2.2512 (2.2058) loss 1.9819 (3.0660) grad_norm 2.7394 (2.8078) [2022-01-25 23:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][970/1251] eta 0:10:19 lr 0.000058 time 1.5793 (2.2059) loss 3.5177 (3.0670) grad_norm 2.9711 (2.8069) [2022-01-25 23:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][980/1251] eta 0:09:58 lr 0.000058 time 1.7595 (2.2070) loss 1.9092 (3.0668) grad_norm 3.5194 (2.8077) [2022-01-25 23:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][990/1251] eta 0:09:35 lr 0.000058 time 2.1954 (2.2063) loss 2.4828 (3.0648) grad_norm 3.0141 (2.8120) [2022-01-25 23:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[257/300][1000/1251] eta 0:09:13 lr 0.000058 time 2.2117 (2.2060) loss 3.3722 (3.0659) grad_norm 2.8897 (2.8125) [2022-01-25 23:01:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1010/1251] eta 0:08:51 lr 0.000058 time 1.6034 (2.2052) loss 2.8361 (3.0648) grad_norm 3.0164 (2.8126) [2022-01-25 23:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1020/1251] eta 0:08:29 lr 0.000058 time 1.8657 (2.2073) loss 2.9549 (3.0648) grad_norm 3.0003 (2.8161) [2022-01-25 23:02:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1030/1251] eta 0:08:08 lr 0.000058 time 2.9400 (2.2091) loss 3.2839 (3.0653) grad_norm 2.9704 (2.8164) [2022-01-25 23:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1040/1251] eta 0:07:45 lr 0.000057 time 1.6056 (2.2072) loss 3.6574 (3.0656) grad_norm 3.6428 (2.8166) [2022-01-25 23:03:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1050/1251] eta 0:07:23 lr 0.000057 time 1.6942 (2.2065) loss 3.2207 (3.0640) grad_norm 2.7801 (2.8166) [2022-01-25 23:03:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1060/1251] eta 0:07:01 lr 0.000057 time 1.8751 (2.2067) loss 2.5251 (3.0636) grad_norm 2.6352 (2.8171) [2022-01-25 23:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1070/1251] eta 0:06:38 lr 0.000057 time 1.9228 (2.2043) loss 2.7611 (3.0633) grad_norm 2.5066 (2.8151) [2022-01-25 23:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1080/1251] eta 0:06:16 lr 0.000057 time 2.5677 (2.2028) loss 3.2671 (3.0619) grad_norm 3.7073 (2.8167) [2022-01-25 23:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1090/1251] eta 0:05:54 lr 0.000057 time 2.2376 (2.2033) loss 3.6057 (3.0617) grad_norm 2.7623 (2.8160) [2022-01-25 23:05:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1100/1251] eta 0:05:32 lr 0.000057 time 2.1487 (2.2038) loss 3.0877 (3.0627) 
grad_norm 2.8630 (2.8155) [2022-01-25 23:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1110/1251] eta 0:05:10 lr 0.000057 time 2.1464 (2.2044) loss 2.1273 (3.0606) grad_norm 2.7139 (2.8137) [2022-01-25 23:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1120/1251] eta 0:04:48 lr 0.000057 time 2.3649 (2.2048) loss 3.0954 (3.0626) grad_norm 2.5281 (2.8143) [2022-01-25 23:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1130/1251] eta 0:04:26 lr 0.000057 time 1.8437 (2.2038) loss 2.7450 (3.0626) grad_norm 3.0038 (2.8151) [2022-01-25 23:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1140/1251] eta 0:04:04 lr 0.000057 time 1.5667 (2.2031) loss 2.3112 (3.0623) grad_norm 2.5541 (2.8135) [2022-01-25 23:06:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1150/1251] eta 0:03:42 lr 0.000057 time 1.5647 (2.2012) loss 3.3525 (3.0638) grad_norm 2.6225 (2.8121) [2022-01-25 23:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1160/1251] eta 0:03:20 lr 0.000057 time 2.2141 (2.2006) loss 2.9316 (3.0638) grad_norm 2.9263 (2.8129) [2022-01-25 23:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1170/1251] eta 0:02:58 lr 0.000057 time 2.4161 (2.2058) loss 2.5191 (3.0640) grad_norm 2.3835 (2.8114) [2022-01-25 23:08:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1180/1251] eta 0:02:36 lr 0.000057 time 1.5400 (2.2053) loss 2.6382 (3.0663) grad_norm 2.8199 (2.8112) [2022-01-25 23:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1190/1251] eta 0:02:14 lr 0.000057 time 1.8625 (2.2056) loss 3.5931 (3.0676) grad_norm 2.9413 (2.8128) [2022-01-25 23:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1200/1251] eta 0:01:52 lr 0.000057 time 1.8139 (2.2044) loss 3.5443 (3.0678) grad_norm 2.7003 (2.8110) [2022-01-25 23:09:06 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [257/300][1210/1251] eta 0:01:30 lr 0.000057 time 1.8098 (2.2054) loss 3.2209 (3.0695) grad_norm 2.4984 (2.8099) [2022-01-25 23:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1220/1251] eta 0:01:08 lr 0.000057 time 1.5964 (2.2043) loss 2.7410 (3.0710) grad_norm 2.5984 (2.8098) [2022-01-25 23:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1230/1251] eta 0:00:46 lr 0.000057 time 1.7933 (2.2059) loss 3.3928 (3.0725) grad_norm 2.5739 (2.8088) [2022-01-25 23:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1240/1251] eta 0:00:24 lr 0.000057 time 1.4719 (2.2037) loss 2.6615 (3.0698) grad_norm 2.5739 (2.8082) [2022-01-25 23:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [257/300][1250/1251] eta 0:00:02 lr 0.000057 time 1.2096 (2.1977) loss 2.8785 (3.0708) grad_norm 2.3716 (2.8076) [2022-01-25 23:10:26 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 257 training takes 0:45:49 [2022-01-25 23:10:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.236 (18.236) Loss 0.8174 (0.8174) Acc@1 81.543 (81.543) Acc@5 95.410 (95.410) [2022-01-25 23:11:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.656 (3.545) Loss 0.9008 (0.8207) Acc@1 79.492 (80.850) Acc@5 94.434 (95.419) [2022-01-25 23:11:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.337 (2.555) Loss 0.9262 (0.8220) Acc@1 77.930 (80.915) Acc@5 94.336 (95.359) [2022-01-25 23:11:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.635 (2.217) Loss 0.8038 (0.8252) Acc@1 81.445 (80.891) Acc@5 95.801 (95.319) [2022-01-25 23:11:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.358 (2.163) Loss 0.8481 (0.8240) Acc@1 79.297 (80.895) Acc@5 94.824 (95.343) [2022-01-25 23:12:01 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.776 Acc@5 95.310 [2022-01-25 23:12:01 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-01-25 23:12:01 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.78% [2022-01-25 23:12:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][0/1251] eta 7:35:43 lr 0.000057 time 21.8577 (21.8577) loss 3.3874 (3.3874) grad_norm 2.5509 (2.5509) [2022-01-25 23:12:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][10/1251] eta 1:21:24 lr 0.000057 time 1.5733 (3.9357) loss 2.7470 (3.1054) grad_norm 2.7077 (2.6789) [2022-01-25 23:13:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][20/1251] eta 1:03:55 lr 0.000057 time 1.5728 (3.1161) loss 2.4842 (2.9822) grad_norm 3.0616 (2.7206) [2022-01-25 23:13:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][30/1251] eta 0:58:55 lr 0.000057 time 1.4219 (2.8960) loss 3.8003 (3.0724) grad_norm 2.5296 (2.7748) [2022-01-25 23:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][40/1251] eta 0:55:02 lr 0.000057 time 3.8790 (2.7269) loss 3.5139 (3.0372) grad_norm 3.3496 (2.7951) [2022-01-25 23:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][50/1251] eta 0:53:01 lr 0.000057 time 3.0218 (2.6487) loss 3.6568 (3.0456) grad_norm 3.0619 (2.8045) [2022-01-25 23:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][60/1251] eta 0:50:30 lr 0.000057 time 1.5816 (2.5447) loss 3.1431 (3.0484) grad_norm 2.8100 (2.7853) [2022-01-25 23:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][70/1251] eta 0:49:30 lr 0.000057 time 1.9089 (2.5156) loss 2.4148 (3.0296) grad_norm 3.0737 (2.8003) [2022-01-25 23:15:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][80/1251] eta 0:48:53 lr 0.000057 time 3.7347 (2.5053) loss 3.5082 (3.0289) grad_norm 2.8937 (2.7812) [2022-01-25 23:15:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][90/1251] eta 0:47:43 lr 0.000057 time 
2.5339 (2.4661) loss 3.1652 (3.0153) grad_norm 2.5257 (2.7944) [2022-01-25 23:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][100/1251] eta 0:46:38 lr 0.000057 time 1.6753 (2.4311) loss 3.2622 (3.0285) grad_norm 2.7547 (2.7953) [2022-01-25 23:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][110/1251] eta 0:45:42 lr 0.000057 time 1.7892 (2.4040) loss 2.8868 (3.0427) grad_norm 2.6010 (2.7758) [2022-01-25 23:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][120/1251] eta 0:45:04 lr 0.000057 time 3.4572 (2.3913) loss 2.8106 (3.0372) grad_norm 3.2757 (2.7940) [2022-01-25 23:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][130/1251] eta 0:44:19 lr 0.000057 time 2.8298 (2.3725) loss 3.0449 (3.0454) grad_norm 3.1313 (2.8018) [2022-01-25 23:17:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][140/1251] eta 0:43:27 lr 0.000057 time 1.9744 (2.3470) loss 1.7794 (3.0370) grad_norm 2.7192 (2.7947) [2022-01-25 23:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][150/1251] eta 0:42:51 lr 0.000057 time 1.8597 (2.3352) loss 3.4371 (3.0449) grad_norm 2.9381 (2.7925) [2022-01-25 23:18:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][160/1251] eta 0:42:18 lr 0.000057 time 2.9275 (2.3270) loss 3.2356 (3.0450) grad_norm 2.6428 (2.7940) [2022-01-25 23:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][170/1251] eta 0:41:38 lr 0.000057 time 1.8895 (2.3112) loss 2.2488 (3.0489) grad_norm 2.7590 (2.7786) [2022-01-25 23:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][180/1251] eta 0:41:07 lr 0.000057 time 1.9702 (2.3041) loss 2.9827 (3.0362) grad_norm 3.0083 (2.7755) [2022-01-25 23:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][190/1251] eta 0:40:44 lr 0.000057 time 1.3513 (2.3043) loss 3.2485 (3.0458) grad_norm 3.3408 (2.7647) [2022-01-25 23:19:45 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][200/1251] eta 0:40:24 lr 0.000057 time 3.5230 (2.3072) loss 3.4162 (3.0543) grad_norm 3.0218 (2.7658) [2022-01-25 23:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][210/1251] eta 0:39:47 lr 0.000057 time 1.8881 (2.2937) loss 3.3906 (3.0616) grad_norm 2.8034 (2.7654) [2022-01-25 23:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][220/1251] eta 0:39:17 lr 0.000057 time 1.6819 (2.2867) loss 3.3627 (3.0602) grad_norm 2.8647 (2.7683) [2022-01-25 23:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][230/1251] eta 0:38:52 lr 0.000057 time 2.1411 (2.2844) loss 2.7117 (3.0455) grad_norm 2.9934 (2.7725) [2022-01-25 23:21:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][240/1251] eta 0:38:31 lr 0.000057 time 3.4068 (2.2862) loss 3.5029 (3.0521) grad_norm 2.7147 (2.7726) [2022-01-25 23:21:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][250/1251] eta 0:37:58 lr 0.000057 time 1.8202 (2.2762) loss 2.7474 (3.0447) grad_norm 3.4616 (2.7805) [2022-01-25 23:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][260/1251] eta 0:37:36 lr 0.000057 time 2.2335 (2.2775) loss 3.3615 (3.0357) grad_norm 2.8189 (2.7837) [2022-01-25 23:22:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][270/1251] eta 0:37:10 lr 0.000057 time 1.8954 (2.2735) loss 2.0596 (3.0312) grad_norm 2.7875 (2.7915) [2022-01-25 23:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][280/1251] eta 0:36:50 lr 0.000057 time 3.3987 (2.2766) loss 3.0209 (3.0362) grad_norm 2.7811 (2.7966) [2022-01-25 23:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][290/1251] eta 0:36:24 lr 0.000057 time 2.2111 (2.2727) loss 3.1273 (3.0430) grad_norm 2.7597 (2.7959) [2022-01-25 23:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][300/1251] eta 0:35:53 lr 
0.000057 time 1.9998 (2.2648) loss 3.3650 (3.0399) grad_norm 2.9503 (2.7925) [2022-01-25 23:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][310/1251] eta 0:35:25 lr 0.000057 time 2.4837 (2.2584) loss 3.1150 (3.0439) grad_norm 2.4727 (2.7918) [2022-01-25 23:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][320/1251] eta 0:35:05 lr 0.000057 time 3.3542 (2.2613) loss 2.2414 (3.0459) grad_norm 2.6574 (2.7875) [2022-01-25 23:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][330/1251] eta 0:34:38 lr 0.000057 time 1.8865 (2.2572) loss 3.2551 (3.0542) grad_norm 2.8689 (2.7874) [2022-01-25 23:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][340/1251] eta 0:34:13 lr 0.000057 time 2.2384 (2.2544) loss 3.1383 (3.0554) grad_norm 2.5562 (2.7843) [2022-01-25 23:25:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][350/1251] eta 0:33:47 lr 0.000056 time 2.1586 (2.2500) loss 3.3696 (3.0550) grad_norm 3.0652 (2.7851) [2022-01-25 23:25:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][360/1251] eta 0:33:21 lr 0.000056 time 2.6193 (2.2468) loss 3.1615 (3.0484) grad_norm 2.1694 (2.7813) [2022-01-25 23:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][370/1251] eta 0:32:52 lr 0.000056 time 1.7864 (2.2386) loss 2.7640 (3.0441) grad_norm 3.0279 (2.7839) [2022-01-25 23:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][380/1251] eta 0:32:27 lr 0.000056 time 2.2922 (2.2358) loss 3.4556 (3.0382) grad_norm 2.7434 (2.7819) [2022-01-25 23:26:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][390/1251] eta 0:32:04 lr 0.000056 time 1.8684 (2.2357) loss 2.9382 (3.0405) grad_norm 2.5361 (2.7827) [2022-01-25 23:26:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][400/1251] eta 0:31:42 lr 0.000056 time 2.8715 (2.2360) loss 3.3867 (3.0285) grad_norm 2.9588 (2.7833) [2022-01-25 23:27:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][410/1251] eta 0:31:17 lr 0.000056 time 1.9140 (2.2325) loss 3.2435 (3.0332) grad_norm 2.6430 (2.7847) [2022-01-25 23:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][420/1251] eta 0:30:54 lr 0.000056 time 2.5290 (2.2311) loss 3.3799 (3.0380) grad_norm 2.6400 (2.7861) [2022-01-25 23:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][430/1251] eta 0:30:32 lr 0.000056 time 1.4511 (2.2316) loss 3.5774 (3.0271) grad_norm 2.5442 (2.7856) [2022-01-25 23:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][440/1251] eta 0:30:13 lr 0.000056 time 2.4098 (2.2361) loss 2.5784 (3.0288) grad_norm 3.6031 (2.7888) [2022-01-25 23:28:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][450/1251] eta 0:29:49 lr 0.000056 time 1.6231 (2.2340) loss 2.7701 (3.0302) grad_norm 2.6942 (2.7892) [2022-01-25 23:29:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][460/1251] eta 0:29:26 lr 0.000056 time 2.3242 (2.2327) loss 3.1756 (3.0285) grad_norm 2.7774 (2.7893) [2022-01-25 23:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][470/1251] eta 0:29:05 lr 0.000056 time 2.0324 (2.2344) loss 3.1897 (3.0266) grad_norm 2.4048 (2.7865) [2022-01-25 23:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][480/1251] eta 0:28:42 lr 0.000056 time 1.8695 (2.2346) loss 3.6163 (3.0302) grad_norm 3.0444 (2.7835) [2022-01-25 23:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][490/1251] eta 0:28:16 lr 0.000056 time 1.9471 (2.2288) loss 3.2046 (3.0341) grad_norm 2.6960 (2.7805) [2022-01-25 23:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][500/1251] eta 0:27:52 lr 0.000056 time 2.2485 (2.2270) loss 2.5139 (3.0324) grad_norm 2.4044 (2.7821) [2022-01-25 23:30:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][510/1251] eta 0:27:28 lr 
0.000056 time 2.4877 (2.2244) loss 3.6874 (3.0318) grad_norm 2.7808 (2.7834) [2022-01-25 23:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][520/1251] eta 0:27:06 lr 0.000056 time 2.2493 (2.2249) loss 3.5591 (3.0334) grad_norm 2.5267 (2.7841) [2022-01-25 23:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][530/1251] eta 0:26:45 lr 0.000056 time 2.2261 (2.2264) loss 3.6022 (3.0360) grad_norm 3.2403 (2.7858) [2022-01-25 23:32:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][540/1251] eta 0:26:23 lr 0.000056 time 2.5040 (2.2271) loss 3.4243 (3.0394) grad_norm 2.7238 (2.7872) [2022-01-25 23:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][550/1251] eta 0:26:02 lr 0.000056 time 2.3350 (2.2288) loss 2.2825 (3.0344) grad_norm 2.6214 (2.7892) [2022-01-25 23:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][560/1251] eta 0:25:39 lr 0.000056 time 1.9660 (2.2283) loss 3.4507 (3.0340) grad_norm 2.7755 (2.7898) [2022-01-25 23:33:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][570/1251] eta 0:25:14 lr 0.000056 time 2.2311 (2.2241) loss 2.7926 (3.0326) grad_norm 2.6114 (2.7895) [2022-01-25 23:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][580/1251] eta 0:24:50 lr 0.000056 time 1.9566 (2.2209) loss 2.1347 (3.0327) grad_norm 2.9300 (2.7892) [2022-01-25 23:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][590/1251] eta 0:24:26 lr 0.000056 time 2.3338 (2.2193) loss 3.4020 (3.0363) grad_norm 2.8191 (2.7912) [2022-01-25 23:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][600/1251] eta 0:24:04 lr 0.000056 time 2.3346 (2.2183) loss 3.8863 (3.0377) grad_norm 2.6302 (2.7951) [2022-01-25 23:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][610/1251] eta 0:23:41 lr 0.000056 time 1.8458 (2.2177) loss 2.2115 (3.0363) grad_norm 2.6235 (2.7935) [2022-01-25 23:35:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][620/1251] eta 0:23:20 lr 0.000056 time 2.2691 (2.2199) loss 3.2838 (3.0328) grad_norm 2.8242 (2.7931) [2022-01-25 23:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][630/1251] eta 0:22:59 lr 0.000056 time 2.5655 (2.2207) loss 3.1595 (3.0375) grad_norm 2.8366 (2.7959) [2022-01-25 23:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][640/1251] eta 0:22:36 lr 0.000056 time 2.8524 (2.2194) loss 2.8669 (3.0391) grad_norm 2.2793 (2.7939) [2022-01-25 23:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][650/1251] eta 0:22:12 lr 0.000056 time 2.0310 (2.2169) loss 2.9875 (3.0401) grad_norm 2.6117 (2.7921) [2022-01-25 23:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][660/1251] eta 0:21:48 lr 0.000056 time 1.5781 (2.2145) loss 3.1867 (3.0441) grad_norm 2.5730 (2.7915) [2022-01-25 23:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][670/1251] eta 0:21:26 lr 0.000056 time 2.1885 (2.2136) loss 3.2393 (3.0438) grad_norm 2.8230 (2.7918) [2022-01-25 23:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][680/1251] eta 0:21:04 lr 0.000056 time 2.7334 (2.2144) loss 2.5955 (3.0401) grad_norm 3.3236 (2.7916) [2022-01-25 23:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][690/1251] eta 0:20:41 lr 0.000056 time 2.0621 (2.2136) loss 2.0774 (3.0372) grad_norm 2.5287 (2.7895) [2022-01-25 23:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][700/1251] eta 0:20:18 lr 0.000056 time 2.5184 (2.2123) loss 3.6907 (3.0374) grad_norm 3.1221 (2.7866) [2022-01-25 23:38:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][710/1251] eta 0:19:56 lr 0.000056 time 2.4513 (2.2116) loss 3.1998 (3.0353) grad_norm 3.0409 (2.7863) [2022-01-25 23:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][720/1251] eta 0:19:34 lr 
0.000056 time 1.4998 (2.2115) loss 3.8468 (3.0354) grad_norm 2.5308 (2.7858) [2022-01-25 23:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][730/1251] eta 0:19:13 lr 0.000056 time 2.2455 (2.2137) loss 3.5241 (3.0353) grad_norm 2.5967 (2.7868) [2022-01-25 23:39:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][740/1251] eta 0:18:51 lr 0.000056 time 2.2505 (2.2137) loss 2.7127 (3.0353) grad_norm 2.5776 (2.7876) [2022-01-25 23:39:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][750/1251] eta 0:18:29 lr 0.000056 time 2.7353 (2.2140) loss 3.1200 (3.0378) grad_norm 2.8247 (2.7864) [2022-01-25 23:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][760/1251] eta 0:18:06 lr 0.000056 time 1.5186 (2.2132) loss 3.1779 (3.0410) grad_norm 2.8656 (2.7861) [2022-01-25 23:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][770/1251] eta 0:17:44 lr 0.000056 time 2.2240 (2.2141) loss 3.2421 (3.0426) grad_norm 3.3765 (2.7890) [2022-01-25 23:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][780/1251] eta 0:17:21 lr 0.000056 time 1.9403 (2.2118) loss 2.4283 (3.0378) grad_norm 3.3383 (2.7919) [2022-01-25 23:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][790/1251] eta 0:16:58 lr 0.000056 time 2.1066 (2.2093) loss 3.0806 (3.0378) grad_norm 2.8947 (2.7968) [2022-01-25 23:41:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][800/1251] eta 0:16:36 lr 0.000056 time 1.9344 (2.2093) loss 2.8122 (3.0384) grad_norm 2.2168 (2.7970) [2022-01-25 23:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][810/1251] eta 0:16:13 lr 0.000056 time 2.1989 (2.2076) loss 3.3882 (3.0404) grad_norm 2.8937 (2.7972) [2022-01-25 23:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][820/1251] eta 0:15:50 lr 0.000056 time 1.9485 (2.2060) loss 2.9620 (3.0402) grad_norm 2.6567 (2.7993) [2022-01-25 23:42:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][830/1251] eta 0:15:28 lr 0.000056 time 1.7576 (2.2061) loss 3.1926 (3.0392) grad_norm 2.5603 (2.7972) [2022-01-25 23:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][840/1251] eta 0:15:06 lr 0.000056 time 1.9155 (2.2056) loss 2.9268 (3.0424) grad_norm 2.4716 (2.7962) [2022-01-25 23:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][850/1251] eta 0:14:45 lr 0.000056 time 5.0562 (2.2084) loss 2.8458 (3.0428) grad_norm 2.9622 (2.7962) [2022-01-25 23:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][860/1251] eta 0:14:23 lr 0.000056 time 1.6353 (2.2087) loss 2.6940 (3.0443) grad_norm 2.8103 (2.7973) [2022-01-25 23:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][870/1251] eta 0:14:01 lr 0.000056 time 1.9226 (2.2077) loss 2.5133 (3.0451) grad_norm 2.5308 (2.7992) [2022-01-25 23:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][880/1251] eta 0:13:38 lr 0.000056 time 2.0949 (2.2068) loss 2.1431 (3.0436) grad_norm 2.7870 (2.8005) [2022-01-25 23:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][890/1251] eta 0:13:17 lr 0.000056 time 3.6445 (2.2080) loss 2.3323 (3.0438) grad_norm 2.4863 (2.7993) [2022-01-25 23:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][900/1251] eta 0:12:54 lr 0.000056 time 2.0856 (2.2078) loss 3.1684 (3.0444) grad_norm 2.4910 (2.7994) [2022-01-25 23:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][910/1251] eta 0:12:32 lr 0.000056 time 2.2540 (2.2069) loss 2.4083 (3.0460) grad_norm 2.7005 (2.7998) [2022-01-25 23:45:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][920/1251] eta 0:12:09 lr 0.000056 time 1.7294 (2.2047) loss 3.3384 (3.0444) grad_norm 2.5514 (2.7987) [2022-01-25 23:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][930/1251] eta 0:11:47 lr 
0.000055 time 2.6967 (2.2038) loss 3.2895 (3.0474) grad_norm 2.8415 (2.7990) [2022-01-25 23:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][940/1251] eta 0:11:24 lr 0.000055 time 1.5916 (2.2017) loss 3.4362 (3.0480) grad_norm 3.1565 (2.7989) [2022-01-25 23:46:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][950/1251] eta 0:11:02 lr 0.000055 time 2.1866 (2.2021) loss 2.0417 (3.0461) grad_norm 2.6258 (2.7985) [2022-01-25 23:47:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][960/1251] eta 0:10:41 lr 0.000055 time 2.0311 (2.2030) loss 2.9696 (3.0476) grad_norm 2.6514 (2.7964) [2022-01-25 23:47:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][970/1251] eta 0:10:18 lr 0.000055 time 2.1373 (2.2020) loss 3.1252 (3.0481) grad_norm 2.8015 (2.7959) [2022-01-25 23:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][980/1251] eta 0:09:56 lr 0.000055 time 2.3522 (2.2025) loss 2.9348 (3.0479) grad_norm 3.1196 (2.7959) [2022-01-25 23:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][990/1251] eta 0:09:35 lr 0.000055 time 3.0002 (2.2036) loss 3.3493 (3.0496) grad_norm 2.6198 (2.7955) [2022-01-25 23:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1000/1251] eta 0:09:13 lr 0.000055 time 1.8868 (2.2035) loss 3.2581 (3.0471) grad_norm 2.9320 (2.7953) [2022-01-25 23:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1010/1251] eta 0:08:51 lr 0.000055 time 2.1794 (2.2038) loss 2.1976 (3.0466) grad_norm 2.6527 (2.7953) [2022-01-25 23:49:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1020/1251] eta 0:08:28 lr 0.000055 time 2.1871 (2.2034) loss 3.4387 (3.0447) grad_norm 2.7172 (2.7964) [2022-01-25 23:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1030/1251] eta 0:08:07 lr 0.000055 time 2.5486 (2.2043) loss 3.0503 (3.0428) grad_norm 2.9201 (2.7958) [2022-01-25 
23:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1040/1251] eta 0:07:44 lr 0.000055 time 1.8407 (2.2030) loss 3.3707 (3.0416) grad_norm 3.2318 (2.7963) [2022-01-25 23:50:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1050/1251] eta 0:07:22 lr 0.000055 time 2.2829 (2.2007) loss 3.2702 (3.0449) grad_norm 2.6478 (2.7960) [2022-01-25 23:50:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1060/1251] eta 0:07:00 lr 0.000055 time 2.0854 (2.1993) loss 2.1509 (3.0455) grad_norm 2.7916 (2.7948) [2022-01-25 23:51:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1070/1251] eta 0:06:37 lr 0.000055 time 1.8563 (2.1982) loss 2.9451 (3.0441) grad_norm 2.4825 (2.7945) [2022-01-25 23:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1080/1251] eta 0:06:15 lr 0.000055 time 2.3227 (2.1979) loss 3.2244 (3.0442) grad_norm 3.0686 (2.7956) [2022-01-25 23:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1090/1251] eta 0:05:53 lr 0.000055 time 2.5696 (2.1985) loss 3.2522 (3.0459) grad_norm 2.8048 (2.7964) [2022-01-25 23:52:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1100/1251] eta 0:05:31 lr 0.000055 time 1.6458 (2.1984) loss 3.4962 (3.0465) grad_norm 2.6706 (2.7973) [2022-01-25 23:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1110/1251] eta 0:05:09 lr 0.000055 time 1.7051 (2.1981) loss 2.4566 (3.0444) grad_norm 3.1982 (2.7998) [2022-01-25 23:53:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1120/1251] eta 0:04:48 lr 0.000055 time 2.9611 (2.2001) loss 3.4783 (3.0455) grad_norm 2.7386 (2.7996) [2022-01-25 23:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1130/1251] eta 0:04:26 lr 0.000055 time 2.1918 (2.2009) loss 2.7565 (3.0457) grad_norm 3.2838 (2.7990) [2022-01-25 23:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1140/1251] 
eta 0:04:04 lr 0.000055 time 1.5948 (2.2009) loss 3.3913 (3.0452) grad_norm 2.7745 (2.7988) [2022-01-25 23:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1150/1251] eta 0:03:42 lr 0.000055 time 2.4246 (2.2007) loss 3.2779 (3.0433) grad_norm 2.9490 (2.8026) [2022-01-25 23:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1160/1251] eta 0:03:20 lr 0.000055 time 2.5358 (2.2005) loss 3.0912 (3.0427) grad_norm 2.5413 (2.8044) [2022-01-25 23:54:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1170/1251] eta 0:02:58 lr 0.000055 time 1.5440 (2.1984) loss 2.2485 (3.0434) grad_norm 2.4505 (2.8030) [2022-01-25 23:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1180/1251] eta 0:02:35 lr 0.000055 time 1.6838 (2.1968) loss 3.7461 (3.0424) grad_norm 2.4811 (2.8030) [2022-01-25 23:55:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1190/1251] eta 0:02:14 lr 0.000055 time 2.2205 (2.1973) loss 2.4743 (3.0413) grad_norm 2.9067 (2.8031) [2022-01-25 23:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1200/1251] eta 0:01:52 lr 0.000055 time 2.2595 (2.1976) loss 3.7193 (3.0418) grad_norm 2.6869 (2.8028) [2022-01-25 23:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1210/1251] eta 0:01:30 lr 0.000055 time 2.1883 (2.1982) loss 2.2487 (3.0415) grad_norm 2.4509 (2.8016) [2022-01-25 23:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1220/1251] eta 0:01:08 lr 0.000055 time 2.8113 (2.1987) loss 3.2506 (3.0405) grad_norm 2.4118 (2.8011) [2022-01-25 23:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1230/1251] eta 0:00:46 lr 0.000055 time 2.2237 (2.2003) loss 3.6859 (3.0406) grad_norm 2.7219 (2.7997) [2022-01-25 23:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1240/1251] eta 0:00:24 lr 0.000055 time 1.2055 (2.1982) loss 3.6645 (3.0418) grad_norm 2.8941 
(2.7990) [2022-01-25 23:57:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [258/300][1250/1251] eta 0:00:02 lr 0.000055 time 1.1926 (2.1927) loss 2.0123 (3.0420) grad_norm 3.2318 (2.7994) [2022-01-25 23:57:45 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 258 training takes 0:45:43 [2022-01-25 23:58:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 20.296 (20.296) Loss 0.7725 (0.7725) Acc@1 82.227 (82.227) Acc@5 96.582 (96.582) [2022-01-25 23:58:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.283 (3.334) Loss 0.8748 (0.8086) Acc@1 80.469 (80.859) Acc@5 95.020 (95.534) [2022-01-25 23:58:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.298 (2.510) Loss 0.8212 (0.8085) Acc@1 80.859 (80.850) Acc@5 95.801 (95.508) [2022-01-25 23:58:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.367 (2.259) Loss 0.8207 (0.8154) Acc@1 81.250 (80.696) Acc@5 95.020 (95.413) [2022-01-25 23:59:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.699 (2.183) Loss 0.8322 (0.8183) Acc@1 80.859 (80.535) Acc@5 95.410 (95.405) [2022-01-25 23:59:21 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.628 Acc@5 95.368 [2022-01-25 23:59:21 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-01-25 23:59:21 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.78% [2022-01-25 23:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][0/1251] eta 7:32:16 lr 0.000055 time 21.6917 (21.6917) loss 2.3914 (2.3914) grad_norm 2.7446 (2.7446) [2022-01-26 00:00:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][10/1251] eta 1:24:51 lr 0.000055 time 2.8412 (4.1026) loss 3.3862 (2.8507) grad_norm 2.5014 (2.6056) [2022-01-26 00:00:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][20/1251] eta 1:05:22 lr 0.000055 time 1.2524 (3.1864) loss 
2.3643 (2.8944) grad_norm 2.4360 (2.7368) [2022-01-26 00:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][30/1251] eta 0:58:20 lr 0.000055 time 1.9406 (2.8668) loss 2.4576 (2.9958) grad_norm 3.6069 (2.8420) [2022-01-26 00:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][40/1251] eta 0:56:20 lr 0.000055 time 5.8938 (2.7911) loss 2.6510 (3.0252) grad_norm 2.7001 (2.8534) [2022-01-26 00:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][50/1251] eta 0:53:51 lr 0.000055 time 1.8634 (2.6907) loss 3.1281 (3.0031) grad_norm 2.5694 (2.9031) [2022-01-26 00:02:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][60/1251] eta 0:51:53 lr 0.000055 time 1.6916 (2.6143) loss 1.8815 (2.9520) grad_norm 2.5953 (2.8912) [2022-01-26 00:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][70/1251] eta 0:49:59 lr 0.000055 time 1.8722 (2.5394) loss 2.6405 (2.9736) grad_norm 2.9015 (2.9291) [2022-01-26 00:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][80/1251] eta 0:49:33 lr 0.000055 time 6.2673 (2.5395) loss 3.4321 (3.0027) grad_norm 3.4317 (2.9097) [2022-01-26 00:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][90/1251] eta 0:48:18 lr 0.000055 time 1.5875 (2.4969) loss 3.2561 (3.0377) grad_norm 2.5172 (2.9115) [2022-01-26 00:03:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][100/1251] eta 0:47:03 lr 0.000055 time 1.8617 (2.4528) loss 3.2657 (3.0332) grad_norm 2.8752 (2.9061) [2022-01-26 00:03:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][110/1251] eta 0:45:47 lr 0.000055 time 1.5892 (2.4076) loss 3.7310 (3.0431) grad_norm 2.8616 (2.8886) [2022-01-26 00:04:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][120/1251] eta 0:45:24 lr 0.000055 time 3.6626 (2.4092) loss 3.3209 (3.0557) grad_norm 2.6981 (2.8761) [2022-01-26 00:04:34 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [259/300][130/1251] eta 0:44:35 lr 0.000055 time 1.6126 (2.3865) loss 3.3059 (3.0638) grad_norm 2.6434 (2.8639) [2022-01-26 00:04:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][140/1251] eta 0:44:04 lr 0.000055 time 1.6816 (2.3800) loss 3.6157 (3.0721) grad_norm 3.1517 (2.8782) [2022-01-26 00:05:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][150/1251] eta 0:43:22 lr 0.000055 time 2.3247 (2.3639) loss 2.7018 (3.0718) grad_norm 2.8001 (2.8718) [2022-01-26 00:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][160/1251] eta 0:43:16 lr 0.000055 time 6.8807 (2.3797) loss 3.5216 (3.0729) grad_norm 2.9939 (2.8706) [2022-01-26 00:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][170/1251] eta 0:42:37 lr 0.000055 time 1.8967 (2.3655) loss 3.5553 (3.0754) grad_norm 3.2361 (2.8748) [2022-01-26 00:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][180/1251] eta 0:42:05 lr 0.000055 time 1.6750 (2.3580) loss 3.2528 (3.0691) grad_norm 2.7193 (2.8770) [2022-01-26 00:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][190/1251] eta 0:41:22 lr 0.000055 time 1.8962 (2.3396) loss 3.4439 (3.0671) grad_norm 3.0407 (2.8756) [2022-01-26 00:07:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][200/1251] eta 0:40:50 lr 0.000055 time 3.6399 (2.3317) loss 2.0083 (3.0554) grad_norm 2.9473 (2.8707) [2022-01-26 00:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][210/1251] eta 0:40:18 lr 0.000055 time 2.0727 (2.3236) loss 3.2739 (3.0427) grad_norm 3.0832 (2.8661) [2022-01-26 00:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][220/1251] eta 0:39:39 lr 0.000055 time 1.8602 (2.3082) loss 1.8714 (3.0355) grad_norm 2.6538 (2.8647) [2022-01-26 00:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][230/1251] eta 0:39:04 lr 0.000055 time 1.7102 (2.2964) loss 3.7437 
(3.0379) grad_norm 3.2320 (2.8680) [2022-01-26 00:08:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][240/1251] eta 0:38:38 lr 0.000055 time 3.1514 (2.2929) loss 3.2937 (3.0392) grad_norm 7.7157 (2.8848) [2022-01-26 00:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][250/1251] eta 0:38:15 lr 0.000054 time 2.8403 (2.2930) loss 2.0669 (3.0388) grad_norm 3.0218 (2.8819) [2022-01-26 00:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][260/1251] eta 0:37:55 lr 0.000054 time 2.5744 (2.2959) loss 3.4408 (3.0501) grad_norm 2.5158 (2.8762) [2022-01-26 00:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][270/1251] eta 0:37:28 lr 0.000054 time 2.2771 (2.2916) loss 2.9081 (3.0416) grad_norm 2.4515 (2.8698) [2022-01-26 00:10:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][280/1251] eta 0:37:05 lr 0.000054 time 3.0958 (2.2924) loss 2.2169 (3.0321) grad_norm 2.6979 (2.8657) [2022-01-26 00:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][290/1251] eta 0:36:43 lr 0.000054 time 2.9413 (2.2929) loss 2.4761 (3.0319) grad_norm 3.2577 (2.8689) [2022-01-26 00:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][300/1251] eta 0:36:10 lr 0.000054 time 1.6097 (2.2829) loss 3.1745 (3.0242) grad_norm 2.6290 (2.8642) [2022-01-26 00:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][310/1251] eta 0:35:42 lr 0.000054 time 1.8500 (2.2766) loss 2.6167 (3.0169) grad_norm 2.6860 (2.8625) [2022-01-26 00:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][320/1251] eta 0:35:26 lr 0.000054 time 4.7783 (2.2840) loss 3.2317 (3.0199) grad_norm 2.6297 (2.8601) [2022-01-26 00:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][330/1251] eta 0:35:05 lr 0.000054 time 2.6444 (2.2861) loss 3.0039 (3.0218) grad_norm 3.1410 (2.8619) [2022-01-26 00:12:20 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [259/300][340/1251] eta 0:34:39 lr 0.000054 time 1.7051 (2.2824) loss 3.4847 (3.0290) grad_norm 2.8293 (2.8646) [2022-01-26 00:12:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][350/1251] eta 0:34:07 lr 0.000054 time 1.9143 (2.2724) loss 2.6405 (3.0351) grad_norm 2.4989 (2.8666) [2022-01-26 00:13:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][360/1251] eta 0:33:41 lr 0.000054 time 3.4315 (2.2685) loss 2.9037 (3.0357) grad_norm 2.6409 (2.8677) [2022-01-26 00:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][370/1251] eta 0:33:16 lr 0.000054 time 2.3799 (2.2657) loss 2.2449 (3.0339) grad_norm 3.3485 (2.8674) [2022-01-26 00:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][380/1251] eta 0:32:52 lr 0.000054 time 1.8459 (2.2645) loss 3.3617 (3.0304) grad_norm 2.7183 (2.8704) [2022-01-26 00:14:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][390/1251] eta 0:32:32 lr 0.000054 time 2.1288 (2.2677) loss 2.1038 (3.0304) grad_norm 2.9105 (2.8699) [2022-01-26 00:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][400/1251] eta 0:32:09 lr 0.000054 time 3.5836 (2.2673) loss 3.1870 (3.0320) grad_norm 2.8359 (2.8694) [2022-01-26 00:14:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][410/1251] eta 0:31:44 lr 0.000054 time 1.8594 (2.2646) loss 2.7994 (3.0290) grad_norm 2.6927 (2.8716) [2022-01-26 00:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][420/1251] eta 0:31:17 lr 0.000054 time 1.7729 (2.2596) loss 3.2468 (3.0291) grad_norm 2.8441 (2.8715) [2022-01-26 00:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][430/1251] eta 0:30:51 lr 0.000054 time 1.6212 (2.2546) loss 3.6056 (3.0395) grad_norm 2.5477 (2.8701) [2022-01-26 00:15:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][440/1251] eta 0:30:29 lr 0.000054 time 2.9583 (2.2553) loss 2.3537 
(3.0395) grad_norm 2.5534 (2.8676) [2022-01-26 00:16:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][450/1251] eta 0:30:05 lr 0.000054 time 1.9699 (2.2539) loss 3.4271 (3.0358) grad_norm 2.9697 (2.8660) [2022-01-26 00:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][460/1251] eta 0:29:43 lr 0.000054 time 2.8499 (2.2543) loss 2.8117 (3.0376) grad_norm 2.8313 (2.8634) [2022-01-26 00:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][470/1251] eta 0:29:17 lr 0.000054 time 1.9545 (2.2501) loss 2.0587 (3.0431) grad_norm 4.2130 (2.8651) [2022-01-26 00:17:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][480/1251] eta 0:28:54 lr 0.000054 time 2.7559 (2.2496) loss 3.2595 (3.0435) grad_norm 3.0938 (2.8662) [2022-01-26 00:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][490/1251] eta 0:28:29 lr 0.000054 time 2.0199 (2.2458) loss 3.4407 (3.0383) grad_norm 2.7824 (2.8653) [2022-01-26 00:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][500/1251] eta 0:28:05 lr 0.000054 time 2.6282 (2.2448) loss 3.3061 (3.0387) grad_norm 2.3701 (2.8614) [2022-01-26 00:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][510/1251] eta 0:27:41 lr 0.000054 time 2.1516 (2.2417) loss 3.0418 (3.0369) grad_norm 2.5294 (2.8614) [2022-01-26 00:18:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][520/1251] eta 0:27:17 lr 0.000054 time 2.5020 (2.2404) loss 3.2301 (3.0361) grad_norm 3.1305 (2.8639) [2022-01-26 00:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][530/1251] eta 0:26:54 lr 0.000054 time 2.5084 (2.2392) loss 3.3497 (3.0383) grad_norm 2.7055 (2.8643) [2022-01-26 00:19:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][540/1251] eta 0:26:30 lr 0.000054 time 2.2264 (2.2368) loss 3.3155 (3.0374) grad_norm 2.7114 (2.8617) [2022-01-26 00:19:52 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [259/300][550/1251] eta 0:26:05 lr 0.000054 time 1.8006 (2.2339) loss 2.9735 (3.0345) grad_norm 3.3036 (2.8588) [2022-01-26 00:20:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][560/1251] eta 0:25:42 lr 0.000054 time 2.8003 (2.2325) loss 3.2813 (3.0371) grad_norm 2.5486 (2.8601) [2022-01-26 00:20:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][570/1251] eta 0:25:20 lr 0.000054 time 2.7299 (2.2333) loss 3.2553 (3.0354) grad_norm 2.8851 (2.8627) [2022-01-26 00:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][580/1251] eta 0:24:57 lr 0.000054 time 2.2369 (2.2321) loss 1.8695 (3.0336) grad_norm 2.5999 (2.8617) [2022-01-26 00:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][590/1251] eta 0:24:34 lr 0.000054 time 1.5894 (2.2301) loss 2.9723 (3.0313) grad_norm 2.6250 (2.8607) [2022-01-26 00:21:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][600/1251] eta 0:24:16 lr 0.000054 time 4.0773 (2.2374) loss 3.3775 (3.0306) grad_norm 2.4253 (2.8587) [2022-01-26 00:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][610/1251] eta 0:23:53 lr 0.000054 time 2.1780 (2.2366) loss 3.0396 (3.0262) grad_norm 2.9436 (2.8585) [2022-01-26 00:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][620/1251] eta 0:23:29 lr 0.000054 time 1.8402 (2.2340) loss 2.8536 (3.0221) grad_norm 2.3544 (2.8549) [2022-01-26 00:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][630/1251] eta 0:23:06 lr 0.000054 time 1.9534 (2.2327) loss 2.7254 (3.0196) grad_norm 3.6681 (2.8521) [2022-01-26 00:23:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][640/1251] eta 0:22:44 lr 0.000054 time 3.7221 (2.2336) loss 3.3961 (3.0157) grad_norm 2.6573 (2.8511) [2022-01-26 00:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][650/1251] eta 0:22:22 lr 0.000054 time 3.2270 (2.2346) loss 2.6929 
(3.0138) grad_norm 2.5380 (2.8485) [2022-01-26 00:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][660/1251] eta 0:21:59 lr 0.000054 time 2.1907 (2.2322) loss 3.2544 (3.0152) grad_norm 2.3736 (2.8448) [2022-01-26 00:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][670/1251] eta 0:21:34 lr 0.000054 time 1.7771 (2.2286) loss 3.0741 (3.0156) grad_norm 2.5375 (2.8410) [2022-01-26 00:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][680/1251] eta 0:21:12 lr 0.000054 time 2.6964 (2.2294) loss 2.8973 (3.0187) grad_norm 2.5099 (2.8417) [2022-01-26 00:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][690/1251] eta 0:20:49 lr 0.000054 time 1.8408 (2.2277) loss 3.7864 (3.0214) grad_norm 3.0010 (2.8381) [2022-01-26 00:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][700/1251] eta 0:20:25 lr 0.000054 time 1.8673 (2.2250) loss 3.3259 (3.0186) grad_norm 2.6785 (2.8363) [2022-01-26 00:25:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][710/1251] eta 0:20:04 lr 0.000054 time 2.8748 (2.2256) loss 3.4950 (3.0186) grad_norm 2.8842 (2.8369) [2022-01-26 00:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][720/1251] eta 0:19:41 lr 0.000054 time 2.3290 (2.2257) loss 3.0070 (3.0207) grad_norm 3.8507 (2.8380) [2022-01-26 00:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][730/1251] eta 0:19:18 lr 0.000054 time 1.5932 (2.2237) loss 2.2311 (3.0175) grad_norm 2.5436 (2.8382) [2022-01-26 00:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][740/1251] eta 0:18:56 lr 0.000054 time 1.5457 (2.2234) loss 3.7285 (3.0177) grad_norm 2.7890 (2.8368) [2022-01-26 00:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][750/1251] eta 0:18:33 lr 0.000054 time 1.8783 (2.2226) loss 2.8733 (3.0208) grad_norm 2.7509 (2.8391) [2022-01-26 00:27:31 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [259/300][760/1251] eta 0:18:10 lr 0.000054 time 1.7958 (2.2209) loss 3.3473 (3.0225) grad_norm 2.8043 (2.8413) [2022-01-26 00:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][770/1251] eta 0:17:48 lr 0.000054 time 1.7267 (2.2219) loss 3.3831 (3.0211) grad_norm 2.6160 (2.8394) [2022-01-26 00:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][780/1251] eta 0:17:26 lr 0.000054 time 1.6985 (2.2218) loss 3.0674 (3.0244) grad_norm 2.6017 (2.8402) [2022-01-26 00:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][790/1251] eta 0:17:04 lr 0.000054 time 1.6380 (2.2219) loss 2.6450 (3.0236) grad_norm 2.6680 (2.8369) [2022-01-26 00:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][800/1251] eta 0:16:41 lr 0.000054 time 1.9720 (2.2217) loss 2.0400 (3.0196) grad_norm 2.5691 (2.8354) [2022-01-26 00:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][810/1251] eta 0:16:19 lr 0.000054 time 1.8437 (2.2210) loss 3.1132 (3.0213) grad_norm 3.5413 (2.8359) [2022-01-26 00:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][820/1251] eta 0:15:57 lr 0.000054 time 1.8447 (2.2218) loss 3.2481 (3.0212) grad_norm 2.9857 (2.8352) [2022-01-26 00:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][830/1251] eta 0:15:35 lr 0.000054 time 1.6321 (2.2218) loss 3.3538 (3.0204) grad_norm 2.8723 (2.8330) [2022-01-26 00:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][840/1251] eta 0:15:13 lr 0.000053 time 2.3771 (2.2228) loss 2.7469 (3.0199) grad_norm 2.5664 (2.8309) [2022-01-26 00:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][850/1251] eta 0:14:50 lr 0.000053 time 1.9381 (2.2216) loss 2.3220 (3.0205) grad_norm 2.7635 (2.8301) [2022-01-26 00:31:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][860/1251] eta 0:14:28 lr 0.000053 time 1.8604 (2.2202) loss 2.4260 
(3.0182) grad_norm 2.8187 (2.8276) [2022-01-26 00:31:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][870/1251] eta 0:14:04 lr 0.000053 time 1.8856 (2.2175) loss 2.3359 (3.0165) grad_norm 2.8014 (2.8256) [2022-01-26 00:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][880/1251] eta 0:13:42 lr 0.000053 time 1.9647 (2.2165) loss 3.3224 (3.0182) grad_norm 3.4265 (2.8241) [2022-01-26 00:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][890/1251] eta 0:13:19 lr 0.000053 time 2.3844 (2.2157) loss 3.4300 (3.0201) grad_norm 2.4289 (2.8225) [2022-01-26 00:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][900/1251] eta 0:12:57 lr 0.000053 time 1.9079 (2.2165) loss 1.8294 (3.0195) grad_norm 2.9116 (2.8237) [2022-01-26 00:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][910/1251] eta 0:12:35 lr 0.000053 time 1.8525 (2.2143) loss 3.2417 (3.0202) grad_norm 2.4997 (2.8217) [2022-01-26 00:33:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][920/1251] eta 0:12:12 lr 0.000053 time 2.0056 (2.2127) loss 3.5421 (3.0204) grad_norm 2.9266 (2.8214) [2022-01-26 00:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][930/1251] eta 0:11:49 lr 0.000053 time 1.8334 (2.2117) loss 3.4098 (3.0183) grad_norm 2.7079 (2.8197) [2022-01-26 00:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][940/1251] eta 0:11:27 lr 0.000053 time 1.9011 (2.2107) loss 3.7944 (3.0208) grad_norm 2.9318 (2.8192) [2022-01-26 00:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][950/1251] eta 0:11:05 lr 0.000053 time 1.9863 (2.2104) loss 2.6812 (3.0225) grad_norm 2.3373 (2.8203) [2022-01-26 00:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][960/1251] eta 0:10:43 lr 0.000053 time 2.6620 (2.2106) loss 3.3022 (3.0227) grad_norm 2.5617 (2.8186) [2022-01-26 00:35:08 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [259/300][970/1251] eta 0:10:21 lr 0.000053 time 1.4977 (2.2106) loss 3.1495 (3.0246) grad_norm 2.5265 (2.8189) [2022-01-26 00:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][980/1251] eta 0:09:59 lr 0.000053 time 2.1678 (2.2121) loss 2.9331 (3.0259) grad_norm 2.8075 (2.8186) [2022-01-26 00:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][990/1251] eta 0:09:37 lr 0.000053 time 2.8023 (2.2131) loss 2.6949 (3.0261) grad_norm 3.4226 (2.8187) [2022-01-26 00:36:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1000/1251] eta 0:09:15 lr 0.000053 time 1.9131 (2.2137) loss 2.5302 (3.0268) grad_norm 2.5549 (2.8166) [2022-01-26 00:36:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1010/1251] eta 0:08:53 lr 0.000053 time 1.9136 (2.2145) loss 3.4533 (3.0257) grad_norm 2.9086 (2.8154) [2022-01-26 00:37:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1020/1251] eta 0:08:31 lr 0.000053 time 1.9002 (2.2145) loss 2.8689 (3.0263) grad_norm 3.3240 (2.8176) [2022-01-26 00:37:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1030/1251] eta 0:08:09 lr 0.000053 time 1.7925 (2.2141) loss 2.5133 (3.0262) grad_norm 3.0286 (2.8174) [2022-01-26 00:37:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1040/1251] eta 0:07:46 lr 0.000053 time 2.2120 (2.2115) loss 2.3026 (3.0262) grad_norm 2.7887 (2.8175) [2022-01-26 00:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1050/1251] eta 0:07:24 lr 0.000053 time 1.8945 (2.2092) loss 3.2505 (3.0284) grad_norm 3.1993 (2.8166) [2022-01-26 00:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1060/1251] eta 0:07:02 lr 0.000053 time 2.2131 (2.2103) loss 2.2755 (3.0252) grad_norm 2.7594 (2.8162) [2022-01-26 00:38:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1070/1251] eta 0:06:40 lr 0.000053 time 2.4667 (2.2120) loss 
2.9974 (3.0254) grad_norm 2.8569 (2.8144) [2022-01-26 00:39:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1080/1251] eta 0:06:18 lr 0.000053 time 2.2064 (2.2111) loss 2.9524 (3.0258) grad_norm 2.4156 (2.8138) [2022-01-26 00:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1090/1251] eta 0:05:55 lr 0.000053 time 1.9448 (2.2091) loss 2.1904 (3.0263) grad_norm 2.6552 (2.8138) [2022-01-26 00:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1100/1251] eta 0:05:33 lr 0.000053 time 1.8584 (2.2085) loss 2.8464 (3.0276) grad_norm 2.8838 (2.8132) [2022-01-26 00:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1110/1251] eta 0:05:11 lr 0.000053 time 2.0200 (2.2075) loss 3.1404 (3.0280) grad_norm 2.7272 (2.8123) [2022-01-26 00:40:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1120/1251] eta 0:04:49 lr 0.000053 time 2.1763 (2.2071) loss 2.3528 (3.0284) grad_norm 2.9666 (2.8151) [2022-01-26 00:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1130/1251] eta 0:04:27 lr 0.000053 time 1.7189 (2.2078) loss 3.1817 (3.0290) grad_norm 2.5062 (2.8177) [2022-01-26 00:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1140/1251] eta 0:04:05 lr 0.000053 time 3.4947 (2.2106) loss 2.9668 (3.0288) grad_norm 3.0274 (2.8188) [2022-01-26 00:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1150/1251] eta 0:03:43 lr 0.000053 time 1.9095 (2.2115) loss 3.0227 (3.0288) grad_norm 4.1310 (2.8198) [2022-01-26 00:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1160/1251] eta 0:03:21 lr 0.000053 time 2.1540 (2.2115) loss 3.1016 (3.0296) grad_norm 2.4970 (2.8198) [2022-01-26 00:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1170/1251] eta 0:02:58 lr 0.000053 time 1.7631 (2.2095) loss 2.8491 (3.0296) grad_norm 2.7238 (2.8196) [2022-01-26 00:42:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1180/1251] eta 0:02:36 lr 0.000053 time 2.0077 (2.2071) loss 3.2188 (3.0315) grad_norm 2.4332 (2.8192) [2022-01-26 00:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1190/1251] eta 0:02:14 lr 0.000053 time 2.4814 (2.2058) loss 3.3070 (3.0325) grad_norm 2.5543 (2.8194) [2022-01-26 00:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1200/1251] eta 0:01:52 lr 0.000053 time 2.1978 (2.2063) loss 2.3212 (3.0336) grad_norm 2.7192 (2.8198) [2022-01-26 00:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1210/1251] eta 0:01:30 lr 0.000053 time 1.8977 (2.2061) loss 3.3119 (3.0323) grad_norm 2.5953 (2.8204) [2022-01-26 00:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1220/1251] eta 0:01:08 lr 0.000053 time 2.8137 (2.2082) loss 2.0977 (3.0315) grad_norm 2.8602 (2.8215) [2022-01-26 00:44:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1230/1251] eta 0:00:46 lr 0.000053 time 2.1643 (2.2075) loss 2.9880 (3.0333) grad_norm 2.9400 (2.8216) [2022-01-26 00:44:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1240/1251] eta 0:00:24 lr 0.000053 time 1.9788 (2.2061) loss 2.7724 (3.0305) grad_norm 2.8021 (2.8206) [2022-01-26 00:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [259/300][1250/1251] eta 0:00:02 lr 0.000053 time 1.1663 (2.2005) loss 2.9373 (3.0316) grad_norm 2.9665 (2.8217) [2022-01-26 00:45:15 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 259 training takes 0:45:53 [2022-01-26 00:45:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.808 (18.808) Loss 0.8146 (0.8146) Acc@1 81.641 (81.641) Acc@5 95.801 (95.801) [2022-01-26 00:45:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.923 (3.537) Loss 0.8613 (0.8423) Acc@1 79.004 (80.407) Acc@5 95.703 (95.162) [2022-01-26 00:46:13 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.595 (2.793) Loss 0.8281 (0.8334) Acc@1 79.883 (80.673) Acc@5 95.801 (95.252) [2022-01-26 00:46:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.622 (2.389) Loss 0.8546 (0.8305) Acc@1 81.641 (80.777) Acc@5 95.508 (95.319) [2022-01-26 00:46:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.236 (2.229) Loss 0.8596 (0.8302) Acc@1 81.250 (80.850) Acc@5 94.141 (95.322) [2022-01-26 00:46:54 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.658 Acc@5 95.320 [2022-01-26 00:46:54 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-01-26 00:46:54 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.78% [2022-01-26 00:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][0/1251] eta 7:15:08 lr 0.000053 time 20.8698 (20.8698) loss 3.3487 (3.3487) grad_norm 3.0546 (3.0546) [2022-01-26 00:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][10/1251] eta 1:23:53 lr 0.000053 time 2.9049 (4.0563) loss 3.4197 (3.1551) grad_norm 2.6545 (2.7308) [2022-01-26 00:48:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][20/1251] eta 1:05:10 lr 0.000053 time 2.1003 (3.1768) loss 3.2098 (3.1029) grad_norm 2.5827 (2.7097) [2022-01-26 00:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][30/1251] eta 0:57:06 lr 0.000053 time 1.4420 (2.8062) loss 3.5912 (3.2176) grad_norm 3.4367 (2.7451) [2022-01-26 00:48:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][40/1251] eta 0:54:26 lr 0.000053 time 4.1815 (2.6972) loss 2.0189 (3.1535) grad_norm 2.5802 (2.7762) [2022-01-26 00:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][50/1251] eta 0:52:36 lr 0.000053 time 2.1863 (2.6279) loss 3.5073 (3.1159) grad_norm 3.2313 (2.7613) [2022-01-26 00:49:29 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [260/300][60/1251] eta 0:50:29 lr 0.000053 time 2.6116 (2.5434) loss 3.4335 (3.1180) grad_norm 2.4499 (2.7652) [2022-01-26 00:49:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][70/1251] eta 0:48:56 lr 0.000053 time 1.8826 (2.4866) loss 3.3176 (3.1046) grad_norm 2.6438 (2.7666) [2022-01-26 00:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][80/1251] eta 0:48:10 lr 0.000053 time 2.9506 (2.4684) loss 2.8254 (3.0779) grad_norm 2.7223 (2.7880) [2022-01-26 00:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][90/1251] eta 0:47:24 lr 0.000053 time 1.7818 (2.4498) loss 3.4704 (3.0782) grad_norm 2.6977 (2.8026) [2022-01-26 00:50:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][100/1251] eta 0:46:12 lr 0.000053 time 1.8801 (2.4088) loss 3.3826 (3.0862) grad_norm 2.8210 (2.8074) [2022-01-26 00:51:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][110/1251] eta 0:45:19 lr 0.000053 time 1.8914 (2.3834) loss 2.7356 (3.0815) grad_norm 2.6764 (2.8010) [2022-01-26 00:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][120/1251] eta 0:44:45 lr 0.000053 time 3.3524 (2.3748) loss 3.4479 (3.0877) grad_norm 3.1969 (2.7932) [2022-01-26 00:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][130/1251] eta 0:44:21 lr 0.000053 time 2.0648 (2.3745) loss 3.3112 (3.0949) grad_norm 3.2543 (2.7970) [2022-01-26 00:52:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][140/1251] eta 0:43:37 lr 0.000053 time 2.0994 (2.3561) loss 3.1034 (3.0987) grad_norm 2.5592 (2.7924) [2022-01-26 00:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][150/1251] eta 0:42:54 lr 0.000053 time 1.9611 (2.3383) loss 3.0483 (3.0908) grad_norm 2.8729 (2.8084) [2022-01-26 00:53:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][160/1251] eta 0:42:08 lr 0.000053 time 2.5215 (2.3178) loss 3.4890 (3.0815) 
grad_norm 2.8179 (2.8191) [2022-01-26 00:53:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][170/1251] eta 0:41:29 lr 0.000053 time 1.9231 (2.3033) loss 2.7185 (3.0768) grad_norm 2.3935 (2.8221) [2022-01-26 00:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][180/1251] eta 0:40:59 lr 0.000052 time 2.2401 (2.2967) loss 3.2362 (3.0698) grad_norm 3.4059 (2.8193) [2022-01-26 00:54:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][190/1251] eta 0:40:38 lr 0.000052 time 2.4927 (2.2980) loss 3.1576 (3.0665) grad_norm 2.4945 (2.8216) [2022-01-26 00:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][200/1251] eta 0:40:15 lr 0.000052 time 2.7547 (2.2984) loss 3.4645 (3.0572) grad_norm 2.5970 (2.8230) [2022-01-26 00:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][210/1251] eta 0:39:45 lr 0.000052 time 1.6776 (2.2917) loss 2.9690 (3.0680) grad_norm 2.7879 (2.8272) [2022-01-26 00:55:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][220/1251] eta 0:39:17 lr 0.000052 time 2.1593 (2.2871) loss 3.1662 (3.0633) grad_norm 2.8480 (2.8344) [2022-01-26 00:55:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][230/1251] eta 0:38:48 lr 0.000052 time 1.9539 (2.2810) loss 3.3363 (3.0614) grad_norm 3.0226 (2.8376) [2022-01-26 00:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][240/1251] eta 0:38:22 lr 0.000052 time 3.0846 (2.2772) loss 3.3389 (3.0678) grad_norm 3.1590 (2.8483) [2022-01-26 00:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][250/1251] eta 0:37:49 lr 0.000052 time 1.9056 (2.2676) loss 3.4466 (3.0726) grad_norm 2.9188 (2.8479) [2022-01-26 00:56:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][260/1251] eta 0:37:23 lr 0.000052 time 2.1092 (2.2639) loss 2.7049 (3.0648) grad_norm 2.5956 (2.8468) [2022-01-26 00:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [260/300][270/1251] eta 0:36:58 lr 0.000052 time 1.8482 (2.2611) loss 3.4062 (3.0687) grad_norm 3.4468 (2.8449) [2022-01-26 00:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][280/1251] eta 0:36:36 lr 0.000052 time 2.7719 (2.2624) loss 3.2018 (3.0696) grad_norm 2.9661 (2.8517) [2022-01-26 00:57:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][290/1251] eta 0:36:07 lr 0.000052 time 1.9457 (2.2557) loss 2.9468 (3.0633) grad_norm 2.6510 (2.8506) [2022-01-26 00:58:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][300/1251] eta 0:35:40 lr 0.000052 time 1.6306 (2.2503) loss 2.3054 (3.0614) grad_norm 2.5011 (2.8511) [2022-01-26 00:58:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][310/1251] eta 0:35:12 lr 0.000052 time 1.9995 (2.2447) loss 3.5511 (3.0621) grad_norm 2.8089 (2.8520) [2022-01-26 00:58:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][320/1251] eta 0:34:47 lr 0.000052 time 2.2924 (2.2420) loss 3.5964 (3.0590) grad_norm 3.1109 (2.8543) [2022-01-26 00:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][330/1251] eta 0:34:28 lr 0.000052 time 1.9939 (2.2464) loss 2.2629 (3.0567) grad_norm 3.1935 (2.8605) [2022-01-26 00:59:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][340/1251] eta 0:34:04 lr 0.000052 time 1.9813 (2.2446) loss 3.7056 (3.0594) grad_norm 2.6619 (2.8566) [2022-01-26 01:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][350/1251] eta 0:33:42 lr 0.000052 time 2.9195 (2.2444) loss 2.6398 (3.0627) grad_norm 2.4132 (2.8568) [2022-01-26 01:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][360/1251] eta 0:33:18 lr 0.000052 time 2.7053 (2.2425) loss 3.5310 (3.0626) grad_norm 3.3074 (2.8583) [2022-01-26 01:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][370/1251] eta 0:32:53 lr 0.000052 time 1.8755 (2.2406) loss 2.9380 (3.0658) 
grad_norm 2.7183 (2.8590) [2022-01-26 01:01:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][380/1251] eta 0:32:28 lr 0.000052 time 2.1737 (2.2369) loss 3.4985 (3.0683) grad_norm 2.3529 (2.8536) [2022-01-26 01:01:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][390/1251] eta 0:32:02 lr 0.000052 time 1.7437 (2.2332) loss 3.2085 (3.0638) grad_norm 3.1218 (2.8555) [2022-01-26 01:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][400/1251] eta 0:31:36 lr 0.000052 time 1.8944 (2.2283) loss 3.4900 (3.0685) grad_norm 2.3804 (2.8547) [2022-01-26 01:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][410/1251] eta 0:31:11 lr 0.000052 time 1.7123 (2.2258) loss 3.0936 (3.0692) grad_norm 2.7037 (2.8548) [2022-01-26 01:02:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][420/1251] eta 0:30:53 lr 0.000052 time 2.5583 (2.2310) loss 2.9596 (3.0725) grad_norm 3.2224 (2.8581) [2022-01-26 01:02:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][430/1251] eta 0:30:34 lr 0.000052 time 2.7946 (2.2341) loss 2.7906 (3.0737) grad_norm 2.8527 (2.8531) [2022-01-26 01:03:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][440/1251] eta 0:30:09 lr 0.000052 time 1.7069 (2.2318) loss 3.4692 (3.0725) grad_norm 4.1502 (2.8563) [2022-01-26 01:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][450/1251] eta 0:29:45 lr 0.000052 time 1.9586 (2.2293) loss 3.3263 (3.0684) grad_norm 2.2658 (2.8524) [2022-01-26 01:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][460/1251] eta 0:29:22 lr 0.000052 time 1.8328 (2.2278) loss 3.0559 (3.0635) grad_norm 2.6976 (2.8517) [2022-01-26 01:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][470/1251] eta 0:29:01 lr 0.000052 time 3.2884 (2.2301) loss 2.5473 (3.0628) grad_norm 3.1278 (2.8503) [2022-01-26 01:04:45 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [260/300][480/1251] eta 0:28:36 lr 0.000052 time 1.6558 (2.2267) loss 2.3721 (3.0605) grad_norm 2.4788 (2.8496) [2022-01-26 01:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][490/1251] eta 0:28:14 lr 0.000052 time 1.5995 (2.2264) loss 2.9958 (3.0616) grad_norm 2.8803 (2.8522) [2022-01-26 01:05:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][500/1251] eta 0:27:52 lr 0.000052 time 1.8878 (2.2269) loss 2.4818 (3.0615) grad_norm 2.4052 (2.8492) [2022-01-26 01:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][510/1251] eta 0:27:29 lr 0.000052 time 3.3140 (2.2260) loss 2.8938 (3.0581) grad_norm 2.5503 (2.8488) [2022-01-26 01:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][520/1251] eta 0:27:05 lr 0.000052 time 1.8825 (2.2237) loss 3.3580 (3.0579) grad_norm 2.7075 (2.8486) [2022-01-26 01:06:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][530/1251] eta 0:26:42 lr 0.000052 time 1.6102 (2.2223) loss 3.1706 (3.0575) grad_norm 2.6695 (2.8501) [2022-01-26 01:06:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][540/1251] eta 0:26:20 lr 0.000052 time 2.3669 (2.2234) loss 3.2959 (3.0580) grad_norm 3.1594 (2.8594) [2022-01-26 01:07:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][550/1251] eta 0:25:56 lr 0.000052 time 2.2077 (2.2199) loss 3.2551 (3.0598) grad_norm 2.9754 (2.8582) [2022-01-26 01:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][560/1251] eta 0:25:31 lr 0.000052 time 1.8968 (2.2158) loss 3.4510 (3.0616) grad_norm 2.7211 (2.8563) [2022-01-26 01:07:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][570/1251] eta 0:25:07 lr 0.000052 time 2.6666 (2.2139) loss 2.7976 (3.0593) grad_norm 2.4531 (2.8581) [2022-01-26 01:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][580/1251] eta 0:24:43 lr 0.000052 time 2.6170 (2.2114) loss 3.2421 (3.0573) 
grad_norm 2.5659 (2.8598) [2022-01-26 01:08:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][590/1251] eta 0:24:22 lr 0.000052 time 3.1773 (2.2123) loss 3.6448 (3.0605) grad_norm 2.7303 (2.8600) [2022-01-26 01:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][600/1251] eta 0:24:00 lr 0.000052 time 2.3801 (2.2131) loss 3.2077 (3.0626) grad_norm 2.5009 (2.8586) [2022-01-26 01:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][610/1251] eta 0:23:38 lr 0.000052 time 2.5038 (2.2124) loss 3.0687 (3.0610) grad_norm 2.5443 (2.8553) [2022-01-26 01:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][620/1251] eta 0:23:17 lr 0.000052 time 2.2871 (2.2141) loss 3.3998 (3.0613) grad_norm 2.9937 (2.8534) [2022-01-26 01:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][630/1251] eta 0:22:55 lr 0.000052 time 2.5152 (2.2143) loss 2.8977 (3.0579) grad_norm 3.0065 (2.8533) [2022-01-26 01:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][640/1251] eta 0:22:33 lr 0.000052 time 2.4180 (2.2148) loss 2.3646 (3.0552) grad_norm 2.9975 (2.8529) [2022-01-26 01:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][650/1251] eta 0:22:10 lr 0.000052 time 1.8748 (2.2141) loss 2.1054 (3.0551) grad_norm 2.7190 (2.8528) [2022-01-26 01:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][660/1251] eta 0:21:48 lr 0.000052 time 1.9459 (2.2146) loss 2.4454 (3.0560) grad_norm 3.0049 (2.8542) [2022-01-26 01:11:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][670/1251] eta 0:21:26 lr 0.000052 time 2.0218 (2.2138) loss 3.4814 (3.0604) grad_norm 3.1064 (2.8558) [2022-01-26 01:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][680/1251] eta 0:21:02 lr 0.000052 time 1.6389 (2.2109) loss 3.6178 (3.0616) grad_norm 3.0436 (2.8557) [2022-01-26 01:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [260/300][690/1251] eta 0:20:39 lr 0.000052 time 1.5598 (2.2087) loss 3.0331 (3.0635) grad_norm 2.9497 (2.8552) [2022-01-26 01:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][700/1251] eta 0:20:17 lr 0.000052 time 2.9613 (2.2097) loss 3.4530 (3.0652) grad_norm 3.0975 (2.8552) [2022-01-26 01:13:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][710/1251] eta 0:19:55 lr 0.000052 time 1.8870 (2.2099) loss 3.0942 (3.0647) grad_norm 2.4888 (2.8556) [2022-01-26 01:13:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][720/1251] eta 0:19:34 lr 0.000052 time 1.9136 (2.2113) loss 3.2530 (3.0641) grad_norm 3.3396 (2.8560) [2022-01-26 01:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][730/1251] eta 0:19:14 lr 0.000052 time 1.6060 (2.2151) loss 2.5841 (3.0645) grad_norm 2.8406 (2.8568) [2022-01-26 01:14:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][740/1251] eta 0:18:52 lr 0.000052 time 2.8317 (2.2167) loss 2.7603 (3.0676) grad_norm 2.6856 (2.8557) [2022-01-26 01:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][750/1251] eta 0:18:29 lr 0.000052 time 2.1304 (2.2151) loss 2.6759 (3.0661) grad_norm 2.6978 (2.8569) [2022-01-26 01:14:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][760/1251] eta 0:18:06 lr 0.000052 time 1.8519 (2.2125) loss 3.4042 (3.0686) grad_norm 2.4593 (2.8574) [2022-01-26 01:15:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][770/1251] eta 0:17:44 lr 0.000052 time 1.9005 (2.2122) loss 3.2907 (3.0683) grad_norm 2.9576 (2.8600) [2022-01-26 01:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][780/1251] eta 0:17:21 lr 0.000051 time 2.7720 (2.2122) loss 3.2678 (3.0663) grad_norm 2.5167 (2.8606) [2022-01-26 01:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][790/1251] eta 0:17:00 lr 0.000051 time 2.1447 (2.2130) loss 3.2137 (3.0666) 
grad_norm 2.7567 (2.8602) [2022-01-26 01:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][800/1251] eta 0:16:37 lr 0.000051 time 1.9177 (2.2114) loss 2.9406 (3.0710) grad_norm 2.3176 (2.8650) [2022-01-26 01:16:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][810/1251] eta 0:16:14 lr 0.000051 time 1.9789 (2.2098) loss 3.5288 (3.0700) grad_norm 2.6780 (2.8655) [2022-01-26 01:17:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][820/1251] eta 0:15:51 lr 0.000051 time 2.1961 (2.2085) loss 2.2891 (3.0720) grad_norm 2.9628 (2.8657) [2022-01-26 01:17:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][830/1251] eta 0:15:29 lr 0.000051 time 3.0889 (2.2090) loss 2.2308 (3.0702) grad_norm 2.6144 (2.8634) [2022-01-26 01:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][840/1251] eta 0:15:08 lr 0.000051 time 2.5037 (2.2101) loss 2.5802 (3.0700) grad_norm 2.5809 (2.8634) [2022-01-26 01:18:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][850/1251] eta 0:14:45 lr 0.000051 time 1.9465 (2.2092) loss 3.6644 (3.0709) grad_norm 2.7686 (2.8632) [2022-01-26 01:18:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][860/1251] eta 0:14:23 lr 0.000051 time 2.3774 (2.2090) loss 3.3627 (3.0680) grad_norm 2.8604 (2.8615) [2022-01-26 01:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][870/1251] eta 0:14:01 lr 0.000051 time 2.2756 (2.2093) loss 2.0705 (3.0669) grad_norm 2.5647 (2.8603) [2022-01-26 01:19:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][880/1251] eta 0:13:39 lr 0.000051 time 1.7761 (2.2085) loss 3.2588 (3.0665) grad_norm 2.7274 (2.8595) [2022-01-26 01:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][890/1251] eta 0:13:16 lr 0.000051 time 2.2066 (2.2076) loss 3.4175 (3.0674) grad_norm 2.5797 (2.8582) [2022-01-26 01:20:02 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [260/300][900/1251] eta 0:12:54 lr 0.000051 time 2.5464 (2.2064) loss 2.4734 (3.0645) grad_norm 2.7066 (2.8569) [2022-01-26 01:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][910/1251] eta 0:12:32 lr 0.000051 time 1.9387 (2.2058) loss 2.4434 (3.0650) grad_norm 2.1318 (2.8582) [2022-01-26 01:20:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][920/1251] eta 0:12:09 lr 0.000051 time 2.1676 (2.2044) loss 2.8099 (3.0626) grad_norm 2.4789 (2.8563) [2022-01-26 01:21:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][930/1251] eta 0:11:47 lr 0.000051 time 1.9584 (2.2055) loss 2.3125 (3.0635) grad_norm 2.8822 (2.8571) [2022-01-26 01:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][940/1251] eta 0:11:26 lr 0.000051 time 2.7209 (2.2059) loss 3.3889 (3.0642) grad_norm 2.5078 (2.8575) [2022-01-26 01:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][950/1251] eta 0:11:04 lr 0.000051 time 2.8564 (2.2060) loss 3.1249 (3.0634) grad_norm 2.7135 (2.8572) [2022-01-26 01:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][960/1251] eta 0:10:41 lr 0.000051 time 1.6112 (2.2053) loss 2.6047 (3.0589) grad_norm 2.3522 (2.8562) [2022-01-26 01:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][970/1251] eta 0:10:19 lr 0.000051 time 1.5660 (2.2055) loss 3.5748 (3.0608) grad_norm 3.3105 (2.8558) [2022-01-26 01:22:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][980/1251] eta 0:09:57 lr 0.000051 time 3.1103 (2.2060) loss 2.8409 (3.0584) grad_norm 2.9205 (2.8563) [2022-01-26 01:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][990/1251] eta 0:09:35 lr 0.000051 time 2.7910 (2.2055) loss 3.2570 (3.0592) grad_norm 3.3809 (2.8574) [2022-01-26 01:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1000/1251] eta 0:09:13 lr 0.000051 time 1.8638 (2.2046) loss 2.7204 (3.0571) 
grad_norm 2.4480 (2.8561) [2022-01-26 01:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1010/1251] eta 0:08:50 lr 0.000051 time 1.9278 (2.2033) loss 3.2638 (3.0576) grad_norm 3.1551 (2.8552) [2022-01-26 01:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1020/1251] eta 0:08:28 lr 0.000051 time 2.2380 (2.2028) loss 3.7043 (3.0568) grad_norm 3.7100 (2.8571) [2022-01-26 01:24:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1030/1251] eta 0:08:06 lr 0.000051 time 3.3391 (2.2036) loss 3.2732 (3.0587) grad_norm 3.2106 (2.8565) [2022-01-26 01:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1040/1251] eta 0:07:45 lr 0.000051 time 2.1021 (2.2053) loss 3.7364 (3.0598) grad_norm 3.5807 (2.8571) [2022-01-26 01:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1050/1251] eta 0:07:23 lr 0.000051 time 2.5226 (2.2051) loss 3.5184 (3.0595) grad_norm 3.2138 (2.8579) [2022-01-26 01:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1060/1251] eta 0:07:00 lr 0.000051 time 1.9716 (2.2036) loss 3.5519 (3.0604) grad_norm 2.7322 (2.8579) [2022-01-26 01:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1070/1251] eta 0:06:38 lr 0.000051 time 3.3651 (2.2038) loss 3.5509 (3.0592) grad_norm 2.7254 (2.8581) [2022-01-26 01:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1080/1251] eta 0:06:16 lr 0.000051 time 2.1821 (2.2031) loss 2.8984 (3.0587) grad_norm 3.7107 (2.8572) [2022-01-26 01:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1090/1251] eta 0:05:54 lr 0.000051 time 2.3845 (2.2027) loss 2.8452 (3.0572) grad_norm 2.8919 (2.8579) [2022-01-26 01:27:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1100/1251] eta 0:05:32 lr 0.000051 time 2.0936 (2.2033) loss 2.1153 (3.0581) grad_norm 2.7685 (2.8565) [2022-01-26 01:27:42 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [260/300][1110/1251] eta 0:05:10 lr 0.000051 time 3.1567 (2.2039) loss 3.1648 (3.0552) grad_norm 4.2947 (2.8580) [2022-01-26 01:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1120/1251] eta 0:04:48 lr 0.000051 time 2.2225 (2.2040) loss 3.7020 (3.0547) grad_norm 2.6892 (2.8579) [2022-01-26 01:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1130/1251] eta 0:04:26 lr 0.000051 time 1.9285 (2.2021) loss 3.4624 (3.0561) grad_norm 2.7849 (2.8583) [2022-01-26 01:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1140/1251] eta 0:04:04 lr 0.000051 time 2.3714 (2.2010) loss 3.0413 (3.0569) grad_norm 2.4853 (2.8564) [2022-01-26 01:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1150/1251] eta 0:03:42 lr 0.000051 time 2.5111 (2.2007) loss 3.6496 (3.0567) grad_norm 2.8281 (2.8548) [2022-01-26 01:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1160/1251] eta 0:03:20 lr 0.000051 time 1.9502 (2.2000) loss 3.4454 (3.0568) grad_norm 2.9021 (2.8539) [2022-01-26 01:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1170/1251] eta 0:02:58 lr 0.000051 time 2.2003 (2.2000) loss 3.1110 (3.0583) grad_norm 2.3865 (2.8542) [2022-01-26 01:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1180/1251] eta 0:02:36 lr 0.000051 time 2.9922 (2.2006) loss 2.2605 (3.0557) grad_norm 2.9920 (2.8538) [2022-01-26 01:30:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1190/1251] eta 0:02:14 lr 0.000051 time 2.3020 (2.2011) loss 3.3360 (3.0574) grad_norm 2.7899 (2.8566) [2022-01-26 01:30:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1200/1251] eta 0:01:52 lr 0.000051 time 2.4724 (2.2019) loss 3.0763 (3.0575) grad_norm 2.5197 (2.8561) [2022-01-26 01:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1210/1251] eta 0:01:30 lr 0.000051 time 2.2232 (2.2023) loss 
2.3539 (3.0588) grad_norm 2.6590 (2.8561) [2022-01-26 01:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1220/1251] eta 0:01:08 lr 0.000051 time 3.1808 (2.2025) loss 2.8517 (3.0590) grad_norm 3.2597 (2.8561) [2022-01-26 01:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1230/1251] eta 0:00:46 lr 0.000051 time 1.9019 (2.2013) loss 3.1880 (3.0606) grad_norm 2.9226 (2.8560) [2022-01-26 01:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1240/1251] eta 0:00:24 lr 0.000051 time 1.2730 (2.1992) loss 2.6307 (3.0599) grad_norm 2.7704 (2.8558) [2022-01-26 01:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [260/300][1250/1251] eta 0:00:02 lr 0.000051 time 1.3305 (2.1936) loss 2.0655 (3.0573) grad_norm 2.8272 (2.8565) [2022-01-26 01:32:38 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 260 training takes 0:45:44 [2022-01-26 01:32:38 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_260 saving...... [2022-01-26 01:32:50 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_260 saved !!! 
[2022-01-26 01:33:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.497 (16.497) Loss 0.7885 (0.7885) Acc@1 81.836 (81.836) Acc@5 95.996 (95.996) [2022-01-26 01:33:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.263 (3.011) Loss 0.7805 (0.8286) Acc@1 81.543 (80.726) Acc@5 96.094 (95.543) [2022-01-26 01:33:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.935 (2.413) Loss 0.6847 (0.8131) Acc@1 83.789 (80.855) Acc@5 96.582 (95.522) [2022-01-26 01:33:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.662 (2.131) Loss 0.7754 (0.8167) Acc@1 81.445 (80.844) Acc@5 96.289 (95.505) [2022-01-26 01:34:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.138 (1.993) Loss 0.8651 (0.8229) Acc@1 79.590 (80.664) Acc@5 94.141 (95.401) [2022-01-26 01:34:19 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.640 Acc@5 95.364 [2022-01-26 01:34:19 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-01-26 01:34:19 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.78% [2022-01-26 01:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][0/1251] eta 7:14:08 lr 0.000051 time 20.8222 (20.8222) loss 3.4855 (3.4855) grad_norm 3.0142 (3.0142) [2022-01-26 01:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][10/1251] eta 1:23:29 lr 0.000051 time 2.7828 (4.0364) loss 2.8494 (2.9937) grad_norm 2.5835 (2.7920) [2022-01-26 01:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][20/1251] eta 1:03:58 lr 0.000051 time 1.4168 (3.1185) loss 3.6601 (3.1223) grad_norm 3.1124 (2.8771) [2022-01-26 01:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][30/1251] eta 0:57:24 lr 0.000051 time 1.9519 (2.8208) loss 3.5192 (3.0456) grad_norm 2.8293 (2.8883) [2022-01-26 01:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[261/300][40/1251] eta 0:54:55 lr 0.000051 time 3.1144 (2.7209) loss 2.5985 (3.0802) grad_norm 2.2309 (2.8873) [2022-01-26 01:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][50/1251] eta 0:53:09 lr 0.000051 time 2.2290 (2.6554) loss 3.0841 (3.0234) grad_norm 2.9522 (2.8536) [2022-01-26 01:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][60/1251] eta 0:51:01 lr 0.000051 time 2.4366 (2.5706) loss 3.3127 (2.9978) grad_norm 2.8841 (2.8450) [2022-01-26 01:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][70/1251] eta 0:49:11 lr 0.000051 time 1.9295 (2.4993) loss 2.6253 (2.9972) grad_norm 2.9804 (2.8675) [2022-01-26 01:37:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][80/1251] eta 0:48:16 lr 0.000051 time 2.4570 (2.4739) loss 1.9503 (2.9620) grad_norm 2.8142 (2.8611) [2022-01-26 01:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][90/1251] eta 0:47:15 lr 0.000051 time 2.1011 (2.4421) loss 3.2293 (2.9687) grad_norm 2.9042 (2.8538) [2022-01-26 01:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][100/1251] eta 0:46:38 lr 0.000051 time 3.3618 (2.4311) loss 3.5273 (2.9757) grad_norm 2.8136 (2.8436) [2022-01-26 01:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][110/1251] eta 0:45:50 lr 0.000051 time 1.9146 (2.4104) loss 3.3155 (2.9764) grad_norm 2.4592 (2.8457) [2022-01-26 01:39:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][120/1251] eta 0:44:55 lr 0.000051 time 2.1266 (2.3835) loss 3.2370 (2.9796) grad_norm 2.3817 (2.8377) [2022-01-26 01:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][130/1251] eta 0:44:25 lr 0.000050 time 2.3895 (2.3774) loss 2.9669 (2.9932) grad_norm 3.7700 (2.8442) [2022-01-26 01:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][140/1251] eta 0:43:59 lr 0.000050 time 3.3446 (2.3756) loss 1.9987 (2.9924) grad_norm 2.6936 
(2.8398) [2022-01-26 01:40:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][150/1251] eta 0:43:06 lr 0.000050 time 1.5980 (2.3494) loss 2.9597 (3.0151) grad_norm 2.7154 (2.8446) [2022-01-26 01:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][160/1251] eta 0:42:22 lr 0.000050 time 1.8527 (2.3301) loss 3.2968 (3.0265) grad_norm 2.4334 (2.8444) [2022-01-26 01:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][170/1251] eta 0:41:58 lr 0.000050 time 2.8120 (2.3299) loss 2.2765 (3.0271) grad_norm 3.4480 (2.9059) [2022-01-26 01:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][180/1251] eta 0:41:27 lr 0.000050 time 2.2180 (2.3230) loss 2.9673 (3.0212) grad_norm 3.1598 (2.9044) [2022-01-26 01:41:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][190/1251] eta 0:40:54 lr 0.000050 time 2.1209 (2.3132) loss 2.9312 (3.0225) grad_norm 3.0723 (2.9053) [2022-01-26 01:42:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][200/1251] eta 0:40:32 lr 0.000050 time 1.6005 (2.3143) loss 3.4181 (3.0254) grad_norm 2.5592 (2.9101) [2022-01-26 01:42:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][210/1251] eta 0:40:08 lr 0.000050 time 2.7156 (2.3133) loss 3.1880 (3.0259) grad_norm 2.6953 (2.9184) [2022-01-26 01:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][220/1251] eta 0:39:30 lr 0.000050 time 1.8237 (2.2992) loss 3.2241 (3.0390) grad_norm 2.7058 (2.9179) [2022-01-26 01:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][230/1251] eta 0:38:52 lr 0.000050 time 2.1578 (2.2845) loss 2.4674 (3.0349) grad_norm 3.2089 (2.9282) [2022-01-26 01:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][240/1251] eta 0:38:23 lr 0.000050 time 2.0474 (2.2783) loss 3.5783 (3.0352) grad_norm 2.4323 (2.9264) [2022-01-26 01:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[261/300][250/1251] eta 0:38:04 lr 0.000050 time 3.0583 (2.2822) loss 2.2326 (3.0254) grad_norm 3.1482 (2.9189) [2022-01-26 01:44:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][260/1251] eta 0:37:42 lr 0.000050 time 1.9898 (2.2835) loss 3.1024 (3.0188) grad_norm 2.8721 (2.9268) [2022-01-26 01:44:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][270/1251] eta 0:37:18 lr 0.000050 time 2.0460 (2.2819) loss 3.2439 (3.0240) grad_norm 2.7619 (2.9247) [2022-01-26 01:44:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][280/1251] eta 0:36:50 lr 0.000050 time 1.7439 (2.2765) loss 2.8170 (3.0235) grad_norm 2.6511 (2.9189) [2022-01-26 01:45:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][290/1251] eta 0:36:27 lr 0.000050 time 2.9578 (2.2758) loss 3.5762 (3.0285) grad_norm 2.5977 (2.9131) [2022-01-26 01:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][300/1251] eta 0:35:54 lr 0.000050 time 1.9297 (2.2660) loss 3.0725 (3.0277) grad_norm 2.8158 (2.9084) [2022-01-26 01:46:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][310/1251] eta 0:35:27 lr 0.000050 time 2.2138 (2.2609) loss 3.4361 (3.0287) grad_norm 2.5450 (2.9021) [2022-01-26 01:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][320/1251] eta 0:35:02 lr 0.000050 time 2.1718 (2.2580) loss 2.6314 (3.0306) grad_norm 2.5340 (2.8992) [2022-01-26 01:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][330/1251] eta 0:34:39 lr 0.000050 time 2.8365 (2.2583) loss 1.9267 (3.0320) grad_norm 2.8277 (2.9010) [2022-01-26 01:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][340/1251] eta 0:34:19 lr 0.000050 time 2.0652 (2.2604) loss 2.7404 (3.0245) grad_norm 2.6418 (2.9019) [2022-01-26 01:47:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][350/1251] eta 0:33:57 lr 0.000050 time 1.6480 (2.2614) loss 3.2812 (3.0268) grad_norm 
2.7175 (2.8995) [2022-01-26 01:47:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][360/1251] eta 0:33:35 lr 0.000050 time 2.5675 (2.2624) loss 3.1616 (3.0211) grad_norm 3.7608 (2.9011) [2022-01-26 01:48:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][370/1251] eta 0:33:12 lr 0.000050 time 3.4044 (2.2614) loss 2.8469 (3.0252) grad_norm 2.6088 (2.9023) [2022-01-26 01:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][380/1251] eta 0:32:43 lr 0.000050 time 1.6375 (2.2543) loss 3.5019 (3.0273) grad_norm 2.7862 (2.8964) [2022-01-26 01:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][390/1251] eta 0:32:17 lr 0.000050 time 2.5018 (2.2506) loss 2.2677 (3.0233) grad_norm 2.7786 (2.8927) [2022-01-26 01:49:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][400/1251] eta 0:31:49 lr 0.000050 time 1.8362 (2.2440) loss 2.8223 (3.0282) grad_norm 2.7597 (2.8872) [2022-01-26 01:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][410/1251] eta 0:31:27 lr 0.000050 time 3.4275 (2.2440) loss 3.1926 (3.0251) grad_norm 2.4696 (2.8901) [2022-01-26 01:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][420/1251] eta 0:31:04 lr 0.000050 time 2.3053 (2.2436) loss 1.8963 (3.0257) grad_norm 2.7623 (2.8880) [2022-01-26 01:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][430/1251] eta 0:30:44 lr 0.000050 time 2.7179 (2.2465) loss 3.2759 (3.0245) grad_norm 2.4387 (2.8871) [2022-01-26 01:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][440/1251] eta 0:30:19 lr 0.000050 time 2.1699 (2.2438) loss 2.4424 (3.0264) grad_norm 3.1672 (2.8829) [2022-01-26 01:51:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][450/1251] eta 0:29:58 lr 0.000050 time 2.7142 (2.2447) loss 3.3341 (3.0263) grad_norm 3.0078 (2.8826) [2022-01-26 01:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[261/300][460/1251] eta 0:29:33 lr 0.000050 time 2.3508 (2.2417) loss 2.5903 (3.0218) grad_norm 3.5432 (2.8845) [2022-01-26 01:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][470/1251] eta 0:29:10 lr 0.000050 time 1.7190 (2.2417) loss 3.3593 (3.0218) grad_norm 2.4858 (2.8839) [2022-01-26 01:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][480/1251] eta 0:28:47 lr 0.000050 time 1.5416 (2.2410) loss 2.1070 (3.0220) grad_norm 5.7927 (2.8912) [2022-01-26 01:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][490/1251] eta 0:28:24 lr 0.000050 time 2.4397 (2.2400) loss 2.5351 (3.0188) grad_norm 3.3374 (2.8938) [2022-01-26 01:53:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][500/1251] eta 0:27:59 lr 0.000050 time 2.1116 (2.2365) loss 3.1367 (3.0197) grad_norm 2.5423 (2.8971) [2022-01-26 01:53:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][510/1251] eta 0:27:36 lr 0.000050 time 1.4982 (2.2349) loss 2.1611 (3.0168) grad_norm 3.3035 (2.9023) [2022-01-26 01:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][520/1251] eta 0:27:16 lr 0.000050 time 2.3154 (2.2385) loss 3.7404 (3.0178) grad_norm 2.9127 (2.9025) [2022-01-26 01:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][530/1251] eta 0:26:53 lr 0.000050 time 2.3142 (2.2373) loss 2.9001 (3.0247) grad_norm 2.6121 (2.9017) [2022-01-26 01:54:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][540/1251] eta 0:26:27 lr 0.000050 time 1.7677 (2.2331) loss 3.1355 (3.0212) grad_norm 2.8078 (2.9003) [2022-01-26 01:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][550/1251] eta 0:26:03 lr 0.000050 time 1.5392 (2.2309) loss 2.0486 (3.0210) grad_norm 2.9648 (2.9003) [2022-01-26 01:55:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][560/1251] eta 0:25:39 lr 0.000050 time 1.6026 (2.2275) loss 3.6311 (3.0208) grad_norm 
2.8316 (2.9009) [2022-01-26 01:55:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][570/1251] eta 0:25:16 lr 0.000050 time 1.8392 (2.2270) loss 2.6336 (3.0224) grad_norm 2.9306 (2.9023) [2022-01-26 01:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][580/1251] eta 0:24:53 lr 0.000050 time 1.9183 (2.2257) loss 2.1905 (3.0203) grad_norm 2.8285 (2.9007) [2022-01-26 01:56:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][590/1251] eta 0:24:31 lr 0.000050 time 1.8578 (2.2268) loss 2.6307 (3.0215) grad_norm 3.0102 (2.9074) [2022-01-26 01:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][600/1251] eta 0:24:12 lr 0.000050 time 2.7687 (2.2309) loss 2.7142 (3.0246) grad_norm 2.8217 (2.9056) [2022-01-26 01:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][610/1251] eta 0:23:50 lr 0.000050 time 1.8312 (2.2313) loss 2.8995 (3.0255) grad_norm 2.7186 (2.9046) [2022-01-26 01:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][620/1251] eta 0:23:27 lr 0.000050 time 1.9680 (2.2302) loss 3.1851 (3.0272) grad_norm 2.8168 (2.9035) [2022-01-26 01:57:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][630/1251] eta 0:23:03 lr 0.000050 time 1.9138 (2.2284) loss 2.8926 (3.0303) grad_norm 3.3384 (2.9053) [2022-01-26 01:58:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][640/1251] eta 0:22:40 lr 0.000050 time 3.1196 (2.2275) loss 2.9413 (3.0308) grad_norm 2.4689 (2.9033) [2022-01-26 01:58:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][650/1251] eta 0:22:17 lr 0.000050 time 2.1767 (2.2254) loss 2.0226 (3.0303) grad_norm 2.6358 (2.9025) [2022-01-26 01:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][660/1251] eta 0:21:53 lr 0.000050 time 1.8564 (2.2228) loss 2.9265 (3.0274) grad_norm 3.0187 (2.9037) [2022-01-26 01:59:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[261/300][670/1251] eta 0:21:30 lr 0.000050 time 2.1668 (2.2203) loss 3.0397 (3.0274) grad_norm 2.3611 (2.9048) [2022-01-26 01:59:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][680/1251] eta 0:21:06 lr 0.000050 time 2.1749 (2.2188) loss 2.8746 (3.0289) grad_norm 2.6868 (2.9025) [2022-01-26 01:59:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][690/1251] eta 0:20:43 lr 0.000050 time 2.2633 (2.2173) loss 3.0303 (3.0291) grad_norm 2.5327 (2.9015) [2022-01-26 02:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][700/1251] eta 0:20:21 lr 0.000050 time 2.4989 (2.2177) loss 3.3230 (3.0314) grad_norm 3.0712 (2.8998) [2022-01-26 02:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][710/1251] eta 0:19:59 lr 0.000050 time 2.1800 (2.2174) loss 2.5814 (3.0319) grad_norm 2.6844 (2.8999) [2022-01-26 02:00:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][720/1251] eta 0:19:37 lr 0.000050 time 2.5284 (2.2182) loss 2.7512 (3.0343) grad_norm 2.9140 (2.9006) [2022-01-26 02:01:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][730/1251] eta 0:19:17 lr 0.000050 time 3.0806 (2.2208) loss 2.8797 (3.0382) grad_norm 2.8183 (2.9015) [2022-01-26 02:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][740/1251] eta 0:18:55 lr 0.000050 time 2.7718 (2.2225) loss 2.3255 (3.0372) grad_norm 2.5004 (2.8985) [2022-01-26 02:02:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][750/1251] eta 0:18:33 lr 0.000049 time 2.1569 (2.2228) loss 3.0410 (3.0347) grad_norm 2.4997 (2.8972) [2022-01-26 02:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][760/1251] eta 0:18:10 lr 0.000049 time 1.8690 (2.2208) loss 2.6270 (3.0319) grad_norm 2.9819 (2.8971) [2022-01-26 02:02:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][770/1251] eta 0:17:46 lr 0.000049 time 2.2405 (2.2169) loss 3.2774 (3.0300) grad_norm 
2.4226 (2.8976) [2022-01-26 02:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][780/1251] eta 0:17:23 lr 0.000049 time 1.9411 (2.2148) loss 2.4441 (3.0316) grad_norm 2.9961 (2.8985) [2022-01-26 02:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][790/1251] eta 0:17:00 lr 0.000049 time 2.2047 (2.2135) loss 2.1004 (3.0340) grad_norm 2.7014 (2.8986) [2022-01-26 02:03:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][800/1251] eta 0:16:38 lr 0.000049 time 1.9708 (2.2129) loss 3.6547 (3.0334) grad_norm 2.8610 (2.8977) [2022-01-26 02:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][810/1251] eta 0:16:16 lr 0.000049 time 1.9210 (2.2136) loss 3.2594 (3.0345) grad_norm 2.7921 (2.8989) [2022-01-26 02:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][820/1251] eta 0:15:53 lr 0.000049 time 1.6835 (2.2133) loss 2.8626 (3.0357) grad_norm 3.4832 (2.8995) [2022-01-26 02:04:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][830/1251] eta 0:15:31 lr 0.000049 time 2.1728 (2.2128) loss 3.2745 (3.0389) grad_norm 2.7967 (2.8997) [2022-01-26 02:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][840/1251] eta 0:15:09 lr 0.000049 time 1.6970 (2.2129) loss 3.0929 (3.0398) grad_norm 2.6335 (2.8976) [2022-01-26 02:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][850/1251] eta 0:14:47 lr 0.000049 time 2.6992 (2.2128) loss 3.7079 (3.0421) grad_norm 4.1636 (2.8983) [2022-01-26 02:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][860/1251] eta 0:14:25 lr 0.000049 time 1.9439 (2.2129) loss 1.9220 (3.0389) grad_norm 2.6178 (2.8994) [2022-01-26 02:06:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][870/1251] eta 0:14:03 lr 0.000049 time 2.3318 (2.2127) loss 3.1644 (3.0392) grad_norm 2.8795 (2.8980) [2022-01-26 02:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[261/300][880/1251] eta 0:13:41 lr 0.000049 time 1.5854 (2.2140) loss 2.4014 (3.0379) grad_norm 2.6604 (2.8991) [2022-01-26 02:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][890/1251] eta 0:13:19 lr 0.000049 time 1.9178 (2.2142) loss 2.5199 (3.0364) grad_norm 2.7452 (2.8987) [2022-01-26 02:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][900/1251] eta 0:12:56 lr 0.000049 time 1.6215 (2.2111) loss 3.1710 (3.0391) grad_norm 3.2765 (2.8973) [2022-01-26 02:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][910/1251] eta 0:12:33 lr 0.000049 time 2.4577 (2.2107) loss 3.1732 (3.0388) grad_norm 2.8059 (2.8970) [2022-01-26 02:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][920/1251] eta 0:12:11 lr 0.000049 time 1.8341 (2.2098) loss 3.3564 (3.0381) grad_norm 2.4254 (2.8951) [2022-01-26 02:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][930/1251] eta 0:11:48 lr 0.000049 time 1.5849 (2.2083) loss 3.3852 (3.0401) grad_norm 2.4905 (2.8938) [2022-01-26 02:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][940/1251] eta 0:11:26 lr 0.000049 time 1.9486 (2.2075) loss 2.8095 (3.0415) grad_norm 2.7415 (2.8933) [2022-01-26 02:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][950/1251] eta 0:11:04 lr 0.000049 time 2.4099 (2.2078) loss 2.4227 (3.0410) grad_norm 2.8684 (2.8931) [2022-01-26 02:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][960/1251] eta 0:10:42 lr 0.000049 time 2.5751 (2.2086) loss 3.8162 (3.0424) grad_norm 2.8242 (2.8925) [2022-01-26 02:10:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][970/1251] eta 0:10:20 lr 0.000049 time 1.9449 (2.2097) loss 3.0287 (3.0447) grad_norm 3.5293 (2.8937) [2022-01-26 02:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][980/1251] eta 0:09:59 lr 0.000049 time 2.5051 (2.2112) loss 2.5772 (3.0442) grad_norm 
2.4104 (2.8916) [2022-01-26 02:10:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][990/1251] eta 0:09:37 lr 0.000049 time 3.9172 (2.2145) loss 3.2981 (3.0451) grad_norm 3.2025 (2.8919) [2022-01-26 02:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1000/1251] eta 0:09:15 lr 0.000049 time 2.6745 (2.2145) loss 3.1637 (3.0462) grad_norm 2.5895 (2.8936) [2022-01-26 02:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1010/1251] eta 0:08:53 lr 0.000049 time 1.8250 (2.2134) loss 2.3035 (3.0450) grad_norm 2.5052 (2.8930) [2022-01-26 02:11:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1020/1251] eta 0:08:30 lr 0.000049 time 1.6201 (2.2103) loss 3.1757 (3.0444) grad_norm 2.6106 (2.8917) [2022-01-26 02:12:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1030/1251] eta 0:08:08 lr 0.000049 time 2.6096 (2.2086) loss 3.7806 (3.0447) grad_norm 2.9652 (2.8915) [2022-01-26 02:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1040/1251] eta 0:07:45 lr 0.000049 time 2.2113 (2.2074) loss 2.9398 (3.0441) grad_norm 3.7673 (2.8930) [2022-01-26 02:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1050/1251] eta 0:07:23 lr 0.000049 time 1.8124 (2.2076) loss 3.4351 (3.0435) grad_norm 2.9285 (2.8946) [2022-01-26 02:13:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1060/1251] eta 0:07:01 lr 0.000049 time 3.0648 (2.2090) loss 3.2542 (3.0448) grad_norm 2.8279 (2.8940) [2022-01-26 02:13:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1070/1251] eta 0:06:40 lr 0.000049 time 1.9720 (2.2102) loss 2.3591 (3.0456) grad_norm 3.0384 (2.8939) [2022-01-26 02:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1080/1251] eta 0:06:18 lr 0.000049 time 1.8828 (2.2119) loss 3.0205 (3.0461) grad_norm 4.0653 (2.8953) [2022-01-26 02:14:32 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [261/300][1090/1251] eta 0:05:56 lr 0.000049 time 1.5193 (2.2117) loss 2.1410 (3.0451) grad_norm 2.9244 (2.8944) [2022-01-26 02:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1100/1251] eta 0:05:33 lr 0.000049 time 1.9228 (2.2106) loss 2.8178 (3.0470) grad_norm 2.9517 (2.8931) [2022-01-26 02:15:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1110/1251] eta 0:05:11 lr 0.000049 time 1.8482 (2.2083) loss 2.4168 (3.0469) grad_norm 2.7542 (2.8921) [2022-01-26 02:15:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1120/1251] eta 0:04:49 lr 0.000049 time 2.2326 (2.2075) loss 3.3879 (3.0461) grad_norm 3.0298 (2.8918) [2022-01-26 02:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1130/1251] eta 0:04:27 lr 0.000049 time 1.6983 (2.2066) loss 3.4429 (3.0481) grad_norm 3.0261 (2.8917) [2022-01-26 02:16:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1140/1251] eta 0:04:04 lr 0.000049 time 2.4468 (2.2066) loss 2.5278 (3.0481) grad_norm 2.4206 (2.8881) [2022-01-26 02:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1150/1251] eta 0:03:42 lr 0.000049 time 2.5700 (2.2060) loss 2.2526 (3.0479) grad_norm 2.3109 (2.8867) [2022-01-26 02:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1160/1251] eta 0:03:20 lr 0.000049 time 1.8619 (2.2065) loss 3.1766 (3.0487) grad_norm 3.1821 (2.8877) [2022-01-26 02:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1170/1251] eta 0:02:58 lr 0.000049 time 1.5193 (2.2054) loss 3.2436 (3.0487) grad_norm 2.6420 (2.8878) [2022-01-26 02:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1180/1251] eta 0:02:36 lr 0.000049 time 2.9255 (2.2052) loss 3.5661 (3.0502) grad_norm 2.8372 (2.8871) [2022-01-26 02:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1190/1251] eta 0:02:14 lr 0.000049 time 2.5165 (2.2065) loss 3.3530 
(3.0515) grad_norm 2.5036 (2.8869) [2022-01-26 02:18:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1200/1251] eta 0:01:52 lr 0.000049 time 2.7017 (2.2072) loss 3.3854 (3.0511) grad_norm 2.9512 (2.8864) [2022-01-26 02:18:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1210/1251] eta 0:01:30 lr 0.000049 time 2.2350 (2.2072) loss 3.2751 (3.0494) grad_norm 2.8506 (2.8847) [2022-01-26 02:19:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1220/1251] eta 0:01:08 lr 0.000049 time 2.2061 (2.2072) loss 3.5558 (3.0509) grad_norm 2.8573 (2.8834) [2022-01-26 02:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1230/1251] eta 0:00:46 lr 0.000049 time 2.2793 (2.2064) loss 2.8225 (3.0508) grad_norm 4.0268 (2.8835) [2022-01-26 02:19:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1240/1251] eta 0:00:24 lr 0.000049 time 1.6246 (2.2042) loss 2.3441 (3.0526) grad_norm 2.8456 (2.8818) [2022-01-26 02:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [261/300][1250/1251] eta 0:00:02 lr 0.000049 time 1.1797 (2.1982) loss 3.3828 (3.0527) grad_norm 3.1654 (2.8817) [2022-01-26 02:20:10 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 261 training takes 0:45:50 [2022-01-26 02:20:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.792 (18.792) Loss 0.8787 (0.8787) Acc@1 80.762 (80.762) Acc@5 94.043 (94.043) [2022-01-26 02:20:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.946 (3.480) Loss 0.7680 (0.8403) Acc@1 82.422 (80.691) Acc@5 96.094 (95.091) [2022-01-26 02:21:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.193 (2.679) Loss 0.8194 (0.8384) Acc@1 80.566 (80.408) Acc@5 94.922 (95.103) [2022-01-26 02:21:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.628 (2.341) Loss 0.8658 (0.8371) Acc@1 80.078 (80.462) Acc@5 94.629 (95.190) [2022-01-26 02:21:41 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.368 (2.221) Loss 0.7909 (0.8299) Acc@1 81.445 (80.597) Acc@5 95.215 (95.279) [2022-01-26 02:21:48 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.644 Acc@5 95.346 [2022-01-26 02:21:48 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.6% [2022-01-26 02:21:48 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.78% [2022-01-26 02:22:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][0/1251] eta 7:39:35 lr 0.000049 time 22.0430 (22.0430) loss 3.1568 (3.1568) grad_norm 2.6349 (2.6349) [2022-01-26 02:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][10/1251] eta 1:25:49 lr 0.000049 time 3.0236 (4.1493) loss 3.3845 (3.1079) grad_norm 3.1798 (2.8634) [2022-01-26 02:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][20/1251] eta 1:05:31 lr 0.000049 time 1.8067 (3.1941) loss 3.1294 (3.0445) grad_norm 2.5836 (2.8602) [2022-01-26 02:23:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][30/1251] eta 0:57:53 lr 0.000049 time 1.7571 (2.8450) loss 3.5190 (3.0689) grad_norm 3.5143 (2.8157) [2022-01-26 02:23:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][40/1251] eta 0:53:51 lr 0.000049 time 2.9830 (2.6683) loss 2.8153 (3.0462) grad_norm 2.5073 (2.7802) [2022-01-26 02:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][50/1251] eta 0:52:16 lr 0.000049 time 2.3548 (2.6116) loss 3.3482 (3.0336) grad_norm 2.6765 (2.7882) [2022-01-26 02:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][60/1251] eta 0:50:35 lr 0.000049 time 1.9177 (2.5487) loss 3.7546 (3.0032) grad_norm 2.7621 (2.8218) [2022-01-26 02:24:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][70/1251] eta 0:49:03 lr 0.000049 time 1.5601 (2.4924) loss 1.9767 (3.0153) grad_norm 2.5520 (2.8473) [2022-01-26 02:25:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][80/1251] eta 0:48:07 lr 0.000049 time 2.9281 (2.4656) loss 3.0608 (2.9869) grad_norm 2.8946 (2.8540) [2022-01-26 02:25:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][90/1251] eta 0:47:21 lr 0.000049 time 2.0146 (2.4475) loss 3.4120 (2.9975) grad_norm 3.0593 (2.8708) [2022-01-26 02:25:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][100/1251] eta 0:46:27 lr 0.000049 time 1.9129 (2.4221) loss 3.2207 (2.9959) grad_norm 2.6258 (2.8700) [2022-01-26 02:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][110/1251] eta 0:45:28 lr 0.000049 time 1.7055 (2.3915) loss 2.4599 (2.9815) grad_norm 2.4071 (2.8663) [2022-01-26 02:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][120/1251] eta 0:44:42 lr 0.000048 time 3.0018 (2.3722) loss 3.5874 (2.9770) grad_norm 3.8319 (2.8767) [2022-01-26 02:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][130/1251] eta 0:44:01 lr 0.000048 time 1.6491 (2.3566) loss 3.2024 (2.9879) grad_norm 2.8112 (2.8704) [2022-01-26 02:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][140/1251] eta 0:43:23 lr 0.000048 time 1.9243 (2.3435) loss 3.4057 (2.9837) grad_norm 2.8641 (2.8623) [2022-01-26 02:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][150/1251] eta 0:42:59 lr 0.000048 time 2.4587 (2.3429) loss 3.5668 (2.9914) grad_norm 4.5459 (2.8758) [2022-01-26 02:28:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][160/1251] eta 0:42:29 lr 0.000048 time 2.8900 (2.3366) loss 2.1691 (2.9946) grad_norm 2.4586 (2.8812) [2022-01-26 02:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][170/1251] eta 0:42:04 lr 0.000048 time 1.5169 (2.3355) loss 2.7324 (2.9986) grad_norm 2.9498 (2.8782) [2022-01-26 02:28:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][180/1251] eta 0:41:18 lr 0.000048 
time 1.8722 (2.3143) loss 3.0501 (3.0100) grad_norm 2.8445 (2.8724) [2022-01-26 02:29:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][190/1251] eta 0:40:53 lr 0.000048 time 2.2227 (2.3120) loss 2.2717 (3.0167) grad_norm 2.4466 (2.8617) [2022-01-26 02:29:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][200/1251] eta 0:40:25 lr 0.000048 time 2.4366 (2.3075) loss 3.5302 (3.0189) grad_norm 2.8572 (2.8579) [2022-01-26 02:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][210/1251] eta 0:40:00 lr 0.000048 time 1.5944 (2.3056) loss 2.9179 (3.0081) grad_norm 2.8167 (2.8569) [2022-01-26 02:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][220/1251] eta 0:39:34 lr 0.000048 time 1.5040 (2.3029) loss 2.7273 (3.0089) grad_norm 3.0152 (2.8558) [2022-01-26 02:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][230/1251] eta 0:39:06 lr 0.000048 time 2.1313 (2.2979) loss 2.5670 (3.0001) grad_norm 3.3712 (2.8532) [2022-01-26 02:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][240/1251] eta 0:38:31 lr 0.000048 time 1.7477 (2.2866) loss 2.8667 (2.9970) grad_norm 3.1494 (2.8534) [2022-01-26 02:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][250/1251] eta 0:38:04 lr 0.000048 time 1.7323 (2.2826) loss 3.4720 (2.9911) grad_norm 2.9288 (2.8501) [2022-01-26 02:31:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][260/1251] eta 0:37:35 lr 0.000048 time 2.1161 (2.2759) loss 3.1580 (2.9947) grad_norm 2.8285 (2.8510) [2022-01-26 02:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][270/1251] eta 0:37:09 lr 0.000048 time 1.9482 (2.2725) loss 3.3574 (3.0058) grad_norm 2.8419 (2.8564) [2022-01-26 02:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][280/1251] eta 0:36:45 lr 0.000048 time 2.0012 (2.2712) loss 2.8954 (3.0011) grad_norm 2.5248 (2.8530) [2022-01-26 02:32:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][290/1251] eta 0:36:24 lr 0.000048 time 2.2109 (2.2733) loss 3.3797 (3.0034) grad_norm 2.6475 (2.8570) [2022-01-26 02:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][300/1251] eta 0:35:56 lr 0.000048 time 1.9389 (2.2677) loss 1.8784 (3.0011) grad_norm 2.6196 (2.8579) [2022-01-26 02:33:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][310/1251] eta 0:35:30 lr 0.000048 time 2.1831 (2.2644) loss 3.6448 (3.0038) grad_norm 2.9498 (2.8860) [2022-01-26 02:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][320/1251] eta 0:35:06 lr 0.000048 time 2.1163 (2.2631) loss 3.4107 (3.0017) grad_norm 2.8877 (2.8905) [2022-01-26 02:34:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][330/1251] eta 0:34:46 lr 0.000048 time 2.0717 (2.2657) loss 3.4873 (3.0012) grad_norm 2.6796 (2.8940) [2022-01-26 02:34:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][340/1251] eta 0:34:26 lr 0.000048 time 3.1867 (2.2688) loss 3.3269 (2.9964) grad_norm 2.5590 (2.8943) [2022-01-26 02:35:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][350/1251] eta 0:33:58 lr 0.000048 time 1.7864 (2.2623) loss 3.1088 (3.0027) grad_norm 2.6350 (2.8887) [2022-01-26 02:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][360/1251] eta 0:33:26 lr 0.000048 time 1.6898 (2.2525) loss 2.9718 (2.9927) grad_norm 2.8473 (2.8880) [2022-01-26 02:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][370/1251] eta 0:32:55 lr 0.000048 time 1.9098 (2.2428) loss 3.0025 (3.0034) grad_norm 3.1358 (2.8900) [2022-01-26 02:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][380/1251] eta 0:32:31 lr 0.000048 time 2.5379 (2.2409) loss 3.4633 (3.0051) grad_norm 2.6708 (2.8949) [2022-01-26 02:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][390/1251] eta 0:32:09 lr 
0.000048 time 2.4213 (2.2407) loss 2.5763 (3.0110) grad_norm 2.7462 (2.9074) [2022-01-26 02:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][400/1251] eta 0:31:49 lr 0.000048 time 2.0657 (2.2435) loss 1.9999 (3.0076) grad_norm 2.9414 (2.9084) [2022-01-26 02:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][410/1251] eta 0:31:30 lr 0.000048 time 2.8326 (2.2480) loss 3.3936 (3.0169) grad_norm 2.7906 (2.9098) [2022-01-26 02:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][420/1251] eta 0:31:06 lr 0.000048 time 2.3938 (2.2463) loss 3.5071 (3.0199) grad_norm 2.6528 (2.9104) [2022-01-26 02:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][430/1251] eta 0:30:43 lr 0.000048 time 1.7457 (2.2460) loss 3.0658 (3.0235) grad_norm 2.9653 (2.9142) [2022-01-26 02:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][440/1251] eta 0:30:23 lr 0.000048 time 1.8381 (2.2480) loss 3.3809 (3.0198) grad_norm 3.5228 (2.9160) [2022-01-26 02:38:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][450/1251] eta 0:30:01 lr 0.000048 time 3.3565 (2.2495) loss 1.8897 (3.0225) grad_norm 2.7504 (2.9146) [2022-01-26 02:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][460/1251] eta 0:29:38 lr 0.000048 time 2.4634 (2.2481) loss 3.4968 (3.0250) grad_norm 3.3242 (2.9126) [2022-01-26 02:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][470/1251] eta 0:29:12 lr 0.000048 time 1.8990 (2.2443) loss 3.2434 (3.0215) grad_norm 2.8968 (2.9128) [2022-01-26 02:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][480/1251] eta 0:28:49 lr 0.000048 time 1.9160 (2.2437) loss 3.5620 (3.0227) grad_norm 3.1131 (2.9121) [2022-01-26 02:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][490/1251] eta 0:28:28 lr 0.000048 time 3.2088 (2.2451) loss 3.5290 (3.0234) grad_norm 2.8886 (2.9083) [2022-01-26 02:40:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][500/1251] eta 0:28:05 lr 0.000048 time 2.7379 (2.2448) loss 3.1779 (3.0239) grad_norm 2.8911 (2.9061) [2022-01-26 02:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][510/1251] eta 0:27:42 lr 0.000048 time 1.9224 (2.2441) loss 3.2484 (3.0261) grad_norm 2.8117 (2.9110) [2022-01-26 02:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][520/1251] eta 0:27:17 lr 0.000048 time 1.5802 (2.2395) loss 2.1097 (3.0248) grad_norm 2.9022 (2.9094) [2022-01-26 02:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][530/1251] eta 0:26:54 lr 0.000048 time 3.4371 (2.2392) loss 3.4283 (3.0269) grad_norm 2.5636 (2.9087) [2022-01-26 02:41:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][540/1251] eta 0:26:29 lr 0.000048 time 1.8994 (2.2357) loss 3.3138 (3.0295) grad_norm 2.8563 (2.9057) [2022-01-26 02:42:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][550/1251] eta 0:26:06 lr 0.000048 time 2.4335 (2.2347) loss 3.2586 (3.0269) grad_norm 2.7204 (2.9023) [2022-01-26 02:42:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][560/1251] eta 0:25:43 lr 0.000048 time 1.9912 (2.2342) loss 2.1396 (3.0242) grad_norm 2.6082 (2.9022) [2022-01-26 02:43:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][570/1251] eta 0:25:21 lr 0.000048 time 2.9889 (2.2340) loss 3.4755 (3.0291) grad_norm 2.9213 (2.9033) [2022-01-26 02:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][580/1251] eta 0:24:58 lr 0.000048 time 2.1637 (2.2329) loss 1.9558 (3.0338) grad_norm 2.7538 (2.9018) [2022-01-26 02:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][590/1251] eta 0:24:35 lr 0.000048 time 2.2001 (2.2315) loss 3.1461 (3.0340) grad_norm 3.0554 (2.9043) [2022-01-26 02:44:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][600/1251] eta 0:24:12 lr 
0.000048 time 1.6330 (2.2314) loss 3.0888 (3.0369) grad_norm 3.2883 (2.9030) [2022-01-26 02:44:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][610/1251] eta 0:23:49 lr 0.000048 time 2.7568 (2.2303) loss 3.0785 (3.0409) grad_norm 2.4931 (2.9037) [2022-01-26 02:44:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][620/1251] eta 0:23:26 lr 0.000048 time 1.8616 (2.2296) loss 3.0935 (3.0426) grad_norm 3.2094 (2.9008) [2022-01-26 02:45:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][630/1251] eta 0:23:05 lr 0.000048 time 2.4938 (2.2306) loss 3.2767 (3.0445) grad_norm 2.7304 (2.8981) [2022-01-26 02:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][640/1251] eta 0:22:42 lr 0.000048 time 1.8612 (2.2302) loss 2.6259 (3.0432) grad_norm 2.6612 (2.8969) [2022-01-26 02:45:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][650/1251] eta 0:22:19 lr 0.000048 time 2.1518 (2.2289) loss 2.1902 (3.0370) grad_norm 3.3558 (2.8963) [2022-01-26 02:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][660/1251] eta 0:21:54 lr 0.000048 time 1.9132 (2.2238) loss 2.6614 (3.0368) grad_norm 2.6007 (2.8944) [2022-01-26 02:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][670/1251] eta 0:21:31 lr 0.000048 time 2.2245 (2.2227) loss 2.4684 (3.0380) grad_norm 3.0949 (2.8950) [2022-01-26 02:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][680/1251] eta 0:21:08 lr 0.000048 time 2.1711 (2.2222) loss 2.5243 (3.0396) grad_norm 2.7205 (2.8948) [2022-01-26 02:47:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][690/1251] eta 0:20:47 lr 0.000048 time 1.5545 (2.2229) loss 2.4066 (3.0399) grad_norm 3.0084 (2.8934) [2022-01-26 02:47:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][700/1251] eta 0:20:25 lr 0.000048 time 2.0834 (2.2245) loss 3.1135 (3.0353) grad_norm 3.2111 (2.8928) [2022-01-26 02:48:09 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][710/1251] eta 0:20:03 lr 0.000048 time 1.9575 (2.2241) loss 3.5967 (3.0336) grad_norm 3.1367 (2.8915) [2022-01-26 02:48:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][720/1251] eta 0:19:40 lr 0.000048 time 1.8697 (2.2233) loss 3.4764 (3.0320) grad_norm 2.5663 (2.8896) [2022-01-26 02:48:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][730/1251] eta 0:19:17 lr 0.000048 time 1.5700 (2.2225) loss 3.2321 (3.0331) grad_norm 2.7412 (2.8895) [2022-01-26 02:49:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][740/1251] eta 0:18:57 lr 0.000047 time 3.1147 (2.2266) loss 3.1575 (3.0303) grad_norm 2.8390 (2.8890) [2022-01-26 02:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][750/1251] eta 0:18:35 lr 0.000047 time 1.5471 (2.2258) loss 3.4026 (3.0298) grad_norm 2.8248 (2.8904) [2022-01-26 02:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][760/1251] eta 0:18:11 lr 0.000047 time 2.0563 (2.2235) loss 3.3076 (3.0320) grad_norm 2.8260 (2.8917) [2022-01-26 02:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][770/1251] eta 0:17:48 lr 0.000047 time 1.9891 (2.2216) loss 3.4911 (3.0331) grad_norm 2.9776 (2.8912) [2022-01-26 02:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][780/1251] eta 0:17:26 lr 0.000047 time 2.6979 (2.2209) loss 3.3890 (3.0336) grad_norm 2.7178 (2.8943) [2022-01-26 02:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][790/1251] eta 0:17:03 lr 0.000047 time 1.6199 (2.2202) loss 3.7381 (3.0344) grad_norm 2.6693 (2.8959) [2022-01-26 02:51:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][800/1251] eta 0:16:40 lr 0.000047 time 2.9455 (2.2194) loss 3.5668 (3.0368) grad_norm 2.9023 (2.8976) [2022-01-26 02:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][810/1251] eta 0:16:18 lr 
0.000047 time 2.2668 (2.2183) loss 1.8940 (3.0383) grad_norm 2.6370 (2.8965) [2022-01-26 02:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][820/1251] eta 0:15:55 lr 0.000047 time 2.1324 (2.2171) loss 3.3257 (3.0399) grad_norm 2.8673 (2.8969) [2022-01-26 02:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][830/1251] eta 0:15:33 lr 0.000047 time 2.5018 (2.2176) loss 3.3772 (3.0423) grad_norm 3.2074 (2.8974) [2022-01-26 02:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][840/1251] eta 0:15:12 lr 0.000047 time 2.6384 (2.2203) loss 3.3170 (3.0406) grad_norm 2.9095 (2.8981) [2022-01-26 02:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][850/1251] eta 0:14:50 lr 0.000047 time 2.6210 (2.2198) loss 3.0726 (3.0415) grad_norm 2.8176 (2.9008) [2022-01-26 02:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][860/1251] eta 0:14:27 lr 0.000047 time 1.6033 (2.2177) loss 3.3263 (3.0425) grad_norm 2.7829 (2.9017) [2022-01-26 02:53:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][870/1251] eta 0:14:04 lr 0.000047 time 1.9631 (2.2157) loss 3.1945 (3.0433) grad_norm 2.5531 (2.8991) [2022-01-26 02:54:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][880/1251] eta 0:13:41 lr 0.000047 time 1.9765 (2.2145) loss 1.9194 (3.0432) grad_norm 2.6349 (2.9021) [2022-01-26 02:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][890/1251] eta 0:13:18 lr 0.000047 time 1.7069 (2.2120) loss 2.3244 (3.0426) grad_norm 3.0298 (2.9083) [2022-01-26 02:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][900/1251] eta 0:12:56 lr 0.000047 time 2.2248 (2.2112) loss 3.5988 (3.0433) grad_norm 2.4754 (2.9085) [2022-01-26 02:55:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][910/1251] eta 0:12:33 lr 0.000047 time 2.2391 (2.2094) loss 3.1191 (3.0440) grad_norm 2.9110 (2.9075) [2022-01-26 02:55:42 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][920/1251] eta 0:12:11 lr 0.000047 time 1.6514 (2.2085) loss 3.3129 (3.0448) grad_norm 2.8439 (2.9068) [2022-01-26 02:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][930/1251] eta 0:11:48 lr 0.000047 time 2.1390 (2.2071) loss 3.3766 (3.0472) grad_norm 2.6670 (2.9061) [2022-01-26 02:56:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][940/1251] eta 0:11:27 lr 0.000047 time 1.8209 (2.2092) loss 3.1480 (3.0460) grad_norm 3.7754 (2.9084) [2022-01-26 02:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][950/1251] eta 0:11:05 lr 0.000047 time 2.1108 (2.2111) loss 3.5065 (3.0474) grad_norm 3.1160 (2.9079) [2022-01-26 02:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][960/1251] eta 0:10:44 lr 0.000047 time 2.3616 (2.2135) loss 3.1978 (3.0484) grad_norm 2.4455 (2.9071) [2022-01-26 02:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][970/1251] eta 0:10:22 lr 0.000047 time 2.1037 (2.2143) loss 3.4559 (3.0492) grad_norm 3.5378 (2.9066) [2022-01-26 02:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][980/1251] eta 0:09:59 lr 0.000047 time 1.5939 (2.2132) loss 3.1988 (3.0511) grad_norm 3.0306 (2.9069) [2022-01-26 02:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][990/1251] eta 0:09:37 lr 0.000047 time 1.6763 (2.2115) loss 3.2962 (3.0490) grad_norm 2.7969 (2.9176) [2022-01-26 02:58:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1000/1251] eta 0:09:15 lr 0.000047 time 1.9818 (2.2144) loss 3.0783 (3.0499) grad_norm 3.3529 (2.9192) [2022-01-26 02:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1010/1251] eta 0:08:53 lr 0.000047 time 1.7588 (2.2135) loss 3.5976 (3.0525) grad_norm 2.6044 (2.9185) [2022-01-26 02:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1020/1251] eta 0:08:30 lr 
0.000047 time 1.9174 (2.2108) loss 3.5685 (3.0556) grad_norm 3.1618 (2.9197) [2022-01-26 02:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1030/1251] eta 0:08:08 lr 0.000047 time 2.2543 (2.2111) loss 2.9515 (3.0552) grad_norm 2.8875 (2.9209) [2022-01-26 03:00:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1040/1251] eta 0:07:47 lr 0.000047 time 1.5887 (2.2133) loss 2.9945 (3.0566) grad_norm 2.9773 (2.9202) [2022-01-26 03:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1050/1251] eta 0:07:24 lr 0.000047 time 2.1441 (2.2127) loss 2.7712 (3.0553) grad_norm 2.7970 (2.9193) [2022-01-26 03:00:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1060/1251] eta 0:07:02 lr 0.000047 time 1.9448 (2.2112) loss 2.4362 (3.0548) grad_norm 2.9024 (2.9188) [2022-01-26 03:01:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1070/1251] eta 0:06:40 lr 0.000047 time 1.8389 (2.2102) loss 3.3513 (3.0531) grad_norm 3.0148 (2.9186) [2022-01-26 03:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1080/1251] eta 0:06:18 lr 0.000047 time 1.8685 (2.2109) loss 2.3850 (3.0493) grad_norm 2.4463 (2.9179) [2022-01-26 03:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1090/1251] eta 0:05:55 lr 0.000047 time 2.0704 (2.2106) loss 3.4750 (3.0486) grad_norm 3.0429 (2.9224) [2022-01-26 03:02:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1100/1251] eta 0:05:33 lr 0.000047 time 1.8954 (2.2101) loss 2.3270 (3.0486) grad_norm 2.6728 (2.9221) [2022-01-26 03:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1110/1251] eta 0:05:11 lr 0.000047 time 1.8923 (2.2088) loss 3.3429 (3.0491) grad_norm 2.7185 (2.9209) [2022-01-26 03:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1120/1251] eta 0:04:49 lr 0.000047 time 1.5879 (2.2081) loss 3.3848 (3.0505) grad_norm 3.2917 (2.9204) [2022-01-26 
03:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1130/1251] eta 0:04:27 lr 0.000047 time 1.5394 (2.2072) loss 3.5529 (3.0518) grad_norm 3.3437 (2.9225) [2022-01-26 03:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1140/1251] eta 0:04:05 lr 0.000047 time 2.8095 (2.2077) loss 2.8948 (3.0526) grad_norm 2.9310 (2.9228) [2022-01-26 03:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1150/1251] eta 0:03:43 lr 0.000047 time 2.6353 (2.2081) loss 2.7679 (3.0516) grad_norm 3.6695 (2.9260) [2022-01-26 03:04:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1160/1251] eta 0:03:20 lr 0.000047 time 1.8641 (2.2071) loss 3.2553 (3.0520) grad_norm 3.4809 (2.9272) [2022-01-26 03:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1170/1251] eta 0:02:58 lr 0.000047 time 2.1453 (2.2063) loss 3.7321 (3.0533) grad_norm 2.4361 (2.9260) [2022-01-26 03:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1180/1251] eta 0:02:36 lr 0.000047 time 3.3297 (2.2062) loss 3.1266 (3.0526) grad_norm 2.8094 (2.9275) [2022-01-26 03:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1190/1251] eta 0:02:14 lr 0.000047 time 2.7500 (2.2073) loss 3.2055 (3.0510) grad_norm 3.3200 (2.9282) [2022-01-26 03:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1200/1251] eta 0:01:52 lr 0.000047 time 1.4893 (2.2078) loss 2.5581 (3.0515) grad_norm 3.1259 (2.9282) [2022-01-26 03:06:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1210/1251] eta 0:01:30 lr 0.000047 time 2.0656 (2.2075) loss 2.7562 (3.0502) grad_norm 2.9988 (2.9288) [2022-01-26 03:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1220/1251] eta 0:01:08 lr 0.000047 time 3.3788 (2.2088) loss 3.3047 (3.0495) grad_norm 2.6661 (2.9278) [2022-01-26 03:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1230/1251] 
eta 0:00:46 lr 0.000047 time 2.1759 (2.2091) loss 3.1005 (3.0497) grad_norm 2.7371 (2.9263) [2022-01-26 03:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1240/1251] eta 0:00:24 lr 0.000047 time 1.2958 (2.2068) loss 2.5931 (3.0496) grad_norm 2.9054 (2.9269) [2022-01-26 03:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [262/300][1250/1251] eta 0:00:02 lr 0.000047 time 1.2264 (2.2006) loss 2.9327 (3.0492) grad_norm 2.6760 (2.9268) [2022-01-26 03:07:41 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 262 training takes 0:45:53 [2022-01-26 03:08:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.543 (18.543) Loss 0.7929 (0.7929) Acc@1 81.641 (81.641) Acc@5 96.191 (96.191) [2022-01-26 03:08:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.973 (3.324) Loss 0.8548 (0.8193) Acc@1 79.395 (80.939) Acc@5 95.215 (95.534) [2022-01-26 03:08:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.614 (2.495) Loss 0.7727 (0.8204) Acc@1 81.055 (80.915) Acc@5 96.484 (95.359) [2022-01-26 03:08:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.199 (2.237) Loss 0.8711 (0.8266) Acc@1 80.371 (80.727) Acc@5 95.312 (95.360) [2022-01-26 03:09:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.644 (2.179) Loss 0.9082 (0.8293) Acc@1 78.613 (80.674) Acc@5 93.555 (95.267) [2022-01-26 03:09:17 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.744 Acc@5 95.316 [2022-01-26 03:09:17 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-01-26 03:09:17 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.78% [2022-01-26 03:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][0/1251] eta 7:29:20 lr 0.000047 time 21.5514 (21.5514) loss 3.2931 (3.2931) grad_norm 3.5627 (3.5627) [2022-01-26 03:10:02 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [263/300][10/1251] eta 1:24:07 lr 0.000047 time 3.3395 (4.0676) loss 2.1487 (2.8991) grad_norm 2.7583 (2.9047) [2022-01-26 03:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][20/1251] eta 1:05:01 lr 0.000047 time 2.6251 (3.1690) loss 3.3177 (3.0197) grad_norm 2.7271 (2.8532) [2022-01-26 03:10:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][30/1251] eta 0:57:36 lr 0.000047 time 1.5688 (2.8307) loss 2.1178 (2.9853) grad_norm 2.8653 (2.8150) [2022-01-26 03:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][40/1251] eta 0:55:00 lr 0.000047 time 3.9819 (2.7251) loss 3.5445 (3.0398) grad_norm 2.7175 (2.8262) [2022-01-26 03:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][50/1251] eta 0:53:04 lr 0.000047 time 3.1678 (2.6516) loss 2.9684 (3.0161) grad_norm 2.8269 (2.8063) [2022-01-26 03:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][60/1251] eta 0:51:02 lr 0.000047 time 2.2935 (2.5714) loss 3.2051 (3.0249) grad_norm 2.5015 (2.8000) [2022-01-26 03:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][70/1251] eta 0:49:19 lr 0.000047 time 1.9190 (2.5055) loss 2.2413 (3.0089) grad_norm 2.8195 (2.8199) [2022-01-26 03:12:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][80/1251] eta 0:48:27 lr 0.000047 time 3.9062 (2.4826) loss 2.7972 (3.0080) grad_norm 2.4389 (2.8007) [2022-01-26 03:12:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][90/1251] eta 0:47:10 lr 0.000047 time 1.6273 (2.4378) loss 2.0667 (2.9997) grad_norm 2.8092 (2.8024) [2022-01-26 03:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][100/1251] eta 0:46:17 lr 0.000047 time 3.1512 (2.4127) loss 3.0531 (3.0148) grad_norm 3.0834 (2.8104) [2022-01-26 03:13:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][110/1251] eta 0:45:41 lr 0.000047 time 1.8342 (2.4025) loss 3.7219 (3.0122) grad_norm 
5.3260 (2.8321) [2022-01-26 03:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][120/1251] eta 0:45:11 lr 0.000047 time 4.0465 (2.3972) loss 2.0789 (2.9972) grad_norm 4.7017 (2.8358) [2022-01-26 03:14:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][130/1251] eta 0:44:20 lr 0.000046 time 1.8624 (2.3737) loss 3.5296 (2.9931) grad_norm 3.2945 (2.8421) [2022-01-26 03:14:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][140/1251] eta 0:43:33 lr 0.000046 time 2.1251 (2.3527) loss 2.7811 (2.9866) grad_norm 3.3224 (2.8399) [2022-01-26 03:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][150/1251] eta 0:42:52 lr 0.000046 time 1.9188 (2.3369) loss 3.4632 (2.9949) grad_norm 2.8055 (2.8324) [2022-01-26 03:15:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][160/1251] eta 0:42:36 lr 0.000046 time 3.6384 (2.3436) loss 2.9289 (3.0008) grad_norm 2.7073 (2.8296) [2022-01-26 03:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][170/1251] eta 0:42:05 lr 0.000046 time 2.8363 (2.3366) loss 3.3122 (3.0100) grad_norm 2.4353 (2.8166) [2022-01-26 03:16:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][180/1251] eta 0:41:31 lr 0.000046 time 1.8377 (2.3261) loss 3.6072 (3.0100) grad_norm 2.6783 (2.8115) [2022-01-26 03:16:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][190/1251] eta 0:40:49 lr 0.000046 time 1.9853 (2.3090) loss 2.3648 (3.0132) grad_norm 3.1525 (2.8145) [2022-01-26 03:17:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][200/1251] eta 0:40:28 lr 0.000046 time 3.9088 (2.3102) loss 3.3914 (3.0166) grad_norm 2.7127 (2.8118) [2022-01-26 03:17:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][210/1251] eta 0:39:54 lr 0.000046 time 2.3608 (2.3003) loss 3.4638 (3.0172) grad_norm 2.7354 (2.8128) [2022-01-26 03:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[263/300][220/1251] eta 0:39:25 lr 0.000046 time 2.0709 (2.2944) loss 3.3543 (3.0245) grad_norm 2.7835 (2.8176) [2022-01-26 03:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][230/1251] eta 0:38:52 lr 0.000046 time 1.6205 (2.2843) loss 3.4733 (3.0241) grad_norm 3.0156 (2.8178) [2022-01-26 03:18:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][240/1251] eta 0:38:39 lr 0.000046 time 3.5455 (2.2947) loss 3.3454 (3.0204) grad_norm 2.6281 (2.8181) [2022-01-26 03:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][250/1251] eta 0:38:14 lr 0.000046 time 1.5928 (2.2921) loss 3.0457 (3.0092) grad_norm 4.7138 (2.8295) [2022-01-26 03:19:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][260/1251] eta 0:37:40 lr 0.000046 time 1.5929 (2.2805) loss 3.0362 (3.0008) grad_norm 2.7331 (2.8296) [2022-01-26 03:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][270/1251] eta 0:37:08 lr 0.000046 time 2.0555 (2.2714) loss 2.4125 (2.9978) grad_norm 2.6820 (2.8328) [2022-01-26 03:19:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][280/1251] eta 0:36:43 lr 0.000046 time 3.0983 (2.2690) loss 3.3016 (3.0074) grad_norm 2.7519 (2.8302) [2022-01-26 03:20:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][290/1251] eta 0:36:22 lr 0.000046 time 3.1174 (2.2712) loss 3.2067 (3.0118) grad_norm 2.6507 (2.8331) [2022-01-26 03:20:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][300/1251] eta 0:35:53 lr 0.000046 time 2.4035 (2.2649) loss 3.0135 (3.0086) grad_norm 2.4647 (2.8315) [2022-01-26 03:21:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][310/1251] eta 0:35:25 lr 0.000046 time 2.2227 (2.2590) loss 3.6302 (3.0148) grad_norm 2.6283 (2.8368) [2022-01-26 03:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][320/1251] eta 0:35:01 lr 0.000046 time 3.2438 (2.2577) loss 2.2886 (3.0165) grad_norm 
3.0680 (2.8342) [2022-01-26 03:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][330/1251] eta 0:34:35 lr 0.000046 time 1.9317 (2.2533) loss 3.9164 (3.0207) grad_norm 2.9445 (2.8366) [2022-01-26 03:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][340/1251] eta 0:34:14 lr 0.000046 time 2.9074 (2.2553) loss 3.2620 (3.0211) grad_norm 3.0591 (2.8397) [2022-01-26 03:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][350/1251] eta 0:33:52 lr 0.000046 time 2.2651 (2.2554) loss 3.5801 (3.0265) grad_norm 3.0139 (2.8391) [2022-01-26 03:22:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][360/1251] eta 0:33:27 lr 0.000046 time 2.7957 (2.2526) loss 3.4063 (3.0241) grad_norm 2.9305 (2.8434) [2022-01-26 03:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][370/1251] eta 0:33:01 lr 0.000046 time 1.7933 (2.2490) loss 3.1504 (3.0239) grad_norm 2.7909 (2.8522) [2022-01-26 03:23:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][380/1251] eta 0:32:39 lr 0.000046 time 2.5966 (2.2492) loss 3.5013 (3.0225) grad_norm 3.0430 (2.8606) [2022-01-26 03:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][390/1251] eta 0:32:13 lr 0.000046 time 1.8903 (2.2451) loss 2.3204 (3.0221) grad_norm 2.6545 (2.8653) [2022-01-26 03:24:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][400/1251] eta 0:31:47 lr 0.000046 time 2.5262 (2.2413) loss 3.3234 (3.0260) grad_norm 2.8005 (2.8632) [2022-01-26 03:24:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][410/1251] eta 0:31:24 lr 0.000046 time 2.5079 (2.2410) loss 2.4103 (3.0225) grad_norm 2.6134 (2.8716) [2022-01-26 03:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][420/1251] eta 0:30:59 lr 0.000046 time 2.2181 (2.2380) loss 2.4341 (3.0207) grad_norm 2.8044 (2.8748) [2022-01-26 03:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[263/300][430/1251] eta 0:30:36 lr 0.000046 time 1.9405 (2.2366) loss 2.4302 (3.0250) grad_norm 2.9448 (2.8753) [2022-01-26 03:25:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][440/1251] eta 0:30:14 lr 0.000046 time 2.0339 (2.2370) loss 3.3878 (3.0272) grad_norm 3.1545 (2.8756) [2022-01-26 03:26:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][450/1251] eta 0:29:49 lr 0.000046 time 2.3172 (2.2343) loss 3.3987 (3.0258) grad_norm 2.5805 (2.8854) [2022-01-26 03:26:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][460/1251] eta 0:29:28 lr 0.000046 time 2.6337 (2.2358) loss 3.4126 (3.0217) grad_norm 3.4393 (2.8863) [2022-01-26 03:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][470/1251] eta 0:29:04 lr 0.000046 time 2.1514 (2.2336) loss 3.3085 (3.0238) grad_norm 2.8033 (2.8829) [2022-01-26 03:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][480/1251] eta 0:28:41 lr 0.000046 time 1.8982 (2.2323) loss 3.0527 (3.0248) grad_norm 2.6700 (2.8825) [2022-01-26 03:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][490/1251] eta 0:28:21 lr 0.000046 time 2.8709 (2.2358) loss 3.5508 (3.0268) grad_norm 2.5975 (2.8813) [2022-01-26 03:27:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][500/1251] eta 0:27:58 lr 0.000046 time 1.7309 (2.2350) loss 3.0098 (3.0248) grad_norm 3.0275 (2.8815) [2022-01-26 03:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][510/1251] eta 0:27:32 lr 0.000046 time 1.8136 (2.2305) loss 3.4951 (3.0278) grad_norm 3.9212 (2.8848) [2022-01-26 03:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][520/1251] eta 0:27:08 lr 0.000046 time 1.9351 (2.2283) loss 3.2146 (3.0273) grad_norm 3.3492 (2.8884) [2022-01-26 03:29:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][530/1251] eta 0:26:45 lr 0.000046 time 2.3271 (2.2266) loss 3.3309 (3.0246) grad_norm 
2.7451 (2.8943) [2022-01-26 03:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][540/1251] eta 0:26:21 lr 0.000046 time 2.5597 (2.2248) loss 3.5702 (3.0246) grad_norm 2.4506 (2.9004) [2022-01-26 03:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][550/1251] eta 0:25:59 lr 0.000046 time 1.9090 (2.2253) loss 3.1556 (3.0241) grad_norm 3.0196 (2.8992) [2022-01-26 03:30:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][560/1251] eta 0:25:37 lr 0.000046 time 2.5137 (2.2249) loss 2.8005 (3.0257) grad_norm 2.7223 (2.9007) [2022-01-26 03:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][570/1251] eta 0:25:14 lr 0.000046 time 2.2180 (2.2243) loss 3.5422 (3.0290) grad_norm 5.3694 (2.9069) [2022-01-26 03:30:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][580/1251] eta 0:24:52 lr 0.000046 time 3.0949 (2.2237) loss 1.8430 (3.0291) grad_norm 2.9437 (2.9074) [2022-01-26 03:31:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][590/1251] eta 0:24:28 lr 0.000046 time 1.6879 (2.2217) loss 3.3157 (3.0321) grad_norm 2.9334 (2.9078) [2022-01-26 03:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][600/1251] eta 0:24:05 lr 0.000046 time 1.6784 (2.2212) loss 2.3243 (3.0280) grad_norm 2.6583 (2.9084) [2022-01-26 03:31:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][610/1251] eta 0:23:42 lr 0.000046 time 2.2804 (2.2192) loss 3.3240 (3.0297) grad_norm 2.5177 (2.9069) [2022-01-26 03:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][620/1251] eta 0:23:21 lr 0.000046 time 3.1048 (2.2218) loss 2.9910 (3.0313) grad_norm 2.8409 (2.9055) [2022-01-26 03:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][630/1251] eta 0:22:56 lr 0.000046 time 1.6472 (2.2165) loss 2.5693 (3.0297) grad_norm 3.4848 (2.9037) [2022-01-26 03:32:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[263/300][640/1251] eta 0:22:33 lr 0.000046 time 1.4785 (2.2156) loss 1.8976 (3.0280) grad_norm 2.5190 (2.9035) [2022-01-26 03:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][650/1251] eta 0:22:10 lr 0.000046 time 2.3222 (2.2135) loss 3.2768 (3.0337) grad_norm 2.6535 (2.9020) [2022-01-26 03:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][660/1251] eta 0:21:47 lr 0.000046 time 2.6498 (2.2126) loss 3.0905 (3.0325) grad_norm 2.5327 (2.9009) [2022-01-26 03:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][670/1251] eta 0:21:25 lr 0.000046 time 1.6899 (2.2125) loss 3.5399 (3.0356) grad_norm 2.7973 (2.9025) [2022-01-26 03:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][680/1251] eta 0:21:02 lr 0.000046 time 1.5888 (2.2108) loss 3.6247 (3.0382) grad_norm 2.5556 (2.9031) [2022-01-26 03:34:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][690/1251] eta 0:20:39 lr 0.000046 time 1.9508 (2.2100) loss 2.7612 (3.0399) grad_norm 2.6196 (2.9000) [2022-01-26 03:35:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][700/1251] eta 0:20:19 lr 0.000046 time 3.3989 (2.2126) loss 2.2639 (3.0377) grad_norm 2.7344 (2.8984) [2022-01-26 03:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][710/1251] eta 0:19:57 lr 0.000046 time 2.6827 (2.2130) loss 2.4607 (3.0356) grad_norm 2.6360 (2.9007) [2022-01-26 03:35:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][720/1251] eta 0:19:35 lr 0.000046 time 1.4683 (2.2147) loss 2.7308 (3.0330) grad_norm 3.9175 (2.9005) [2022-01-26 03:36:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][730/1251] eta 0:19:14 lr 0.000046 time 2.1700 (2.2165) loss 2.7405 (3.0346) grad_norm 3.2534 (2.9033) [2022-01-26 03:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][740/1251] eta 0:18:51 lr 0.000046 time 1.5758 (2.2150) loss 3.0931 (3.0377) grad_norm 
2.8960 (2.9050) [2022-01-26 03:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][750/1251] eta 0:18:28 lr 0.000046 time 1.8838 (2.2126) loss 2.9829 (3.0388) grad_norm 2.5414 (2.9062) [2022-01-26 03:37:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][760/1251] eta 0:18:05 lr 0.000046 time 1.7055 (2.2102) loss 3.7167 (3.0397) grad_norm 2.5175 (2.9070) [2022-01-26 03:37:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][770/1251] eta 0:17:42 lr 0.000045 time 2.1335 (2.2094) loss 2.2043 (3.0397) grad_norm 2.6041 (2.9073) [2022-01-26 03:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][780/1251] eta 0:17:20 lr 0.000045 time 2.2028 (2.2096) loss 3.2886 (3.0408) grad_norm 3.0411 (2.9067) [2022-01-26 03:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][790/1251] eta 0:16:58 lr 0.000045 time 2.2045 (2.2086) loss 2.5674 (3.0387) grad_norm 2.6142 (2.9070) [2022-01-26 03:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][800/1251] eta 0:16:35 lr 0.000045 time 1.9121 (2.2066) loss 3.1147 (3.0391) grad_norm 2.4385 (2.9063) [2022-01-26 03:39:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][810/1251] eta 0:16:13 lr 0.000045 time 3.0366 (2.2077) loss 2.4342 (3.0392) grad_norm 2.7382 (2.9074) [2022-01-26 03:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][820/1251] eta 0:15:51 lr 0.000045 time 1.6740 (2.2087) loss 2.7591 (3.0353) grad_norm 3.2149 (2.9088) [2022-01-26 03:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][830/1251] eta 0:15:30 lr 0.000045 time 1.8106 (2.2100) loss 3.4909 (3.0325) grad_norm 2.6923 (2.9080) [2022-01-26 03:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][840/1251] eta 0:15:08 lr 0.000045 time 1.7856 (2.2099) loss 3.4546 (3.0335) grad_norm 3.7183 (2.9093) [2022-01-26 03:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[263/300][850/1251] eta 0:14:45 lr 0.000045 time 1.9844 (2.2085) loss 3.1961 (3.0322) grad_norm 2.6268 (2.9108) [2022-01-26 03:40:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][860/1251] eta 0:14:22 lr 0.000045 time 1.7173 (2.2060) loss 2.7325 (3.0335) grad_norm 2.6064 (2.9095) [2022-01-26 03:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][870/1251] eta 0:13:59 lr 0.000045 time 2.0746 (2.2033) loss 2.3979 (3.0319) grad_norm 2.9144 (2.9097) [2022-01-26 03:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][880/1251] eta 0:13:36 lr 0.000045 time 1.8983 (2.2019) loss 3.1336 (3.0331) grad_norm 2.9537 (2.9103) [2022-01-26 03:41:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][890/1251] eta 0:13:14 lr 0.000045 time 2.2060 (2.2012) loss 2.8011 (3.0308) grad_norm 2.6471 (2.9204) [2022-01-26 03:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][900/1251] eta 0:12:52 lr 0.000045 time 2.3824 (2.2020) loss 2.7674 (3.0315) grad_norm 2.3514 (2.9215) [2022-01-26 03:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][910/1251] eta 0:12:30 lr 0.000045 time 1.4516 (2.2022) loss 1.8311 (3.0289) grad_norm 2.3001 (2.9197) [2022-01-26 03:43:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][920/1251] eta 0:12:09 lr 0.000045 time 2.0818 (2.2029) loss 3.1372 (3.0289) grad_norm 2.7641 (2.9190) [2022-01-26 03:43:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][930/1251] eta 0:11:47 lr 0.000045 time 2.5945 (2.2043) loss 2.5330 (3.0308) grad_norm 2.8193 (2.9222) [2022-01-26 03:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][940/1251] eta 0:11:26 lr 0.000045 time 2.7152 (2.2089) loss 1.9800 (3.0311) grad_norm 2.9324 (2.9233) [2022-01-26 03:44:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][950/1251] eta 0:11:05 lr 0.000045 time 1.7465 (2.2101) loss 3.2885 (3.0329) grad_norm 
2.9335 (2.9231) [2022-01-26 03:44:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][960/1251] eta 0:10:42 lr 0.000045 time 1.8566 (2.2089) loss 3.7464 (3.0355) grad_norm 2.9157 (2.9225) [2022-01-26 03:45:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][970/1251] eta 0:10:20 lr 0.000045 time 1.9569 (2.2075) loss 3.1198 (3.0338) grad_norm 2.9833 (2.9213) [2022-01-26 03:45:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][980/1251] eta 0:09:57 lr 0.000045 time 2.1801 (2.2046) loss 3.3035 (3.0338) grad_norm 2.9500 (2.9204) [2022-01-26 03:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][990/1251] eta 0:09:34 lr 0.000045 time 1.8264 (2.2029) loss 3.2799 (3.0349) grad_norm 2.9511 (2.9210) [2022-01-26 03:46:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1000/1251] eta 0:09:12 lr 0.000045 time 2.2266 (2.2020) loss 3.3558 (3.0362) grad_norm 2.3549 (2.9198) [2022-01-26 03:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1010/1251] eta 0:08:50 lr 0.000045 time 2.0497 (2.2012) loss 3.4584 (3.0341) grad_norm 3.0578 (2.9184) [2022-01-26 03:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1020/1251] eta 0:08:28 lr 0.000045 time 2.9097 (2.2024) loss 2.8051 (3.0341) grad_norm 2.6366 (2.9172) [2022-01-26 03:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1030/1251] eta 0:08:06 lr 0.000045 time 1.7691 (2.2018) loss 3.2513 (3.0345) grad_norm 2.5825 (2.9154) [2022-01-26 03:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1040/1251] eta 0:07:44 lr 0.000045 time 2.4397 (2.2027) loss 3.2797 (3.0356) grad_norm 2.2802 (2.9154) [2022-01-26 03:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1050/1251] eta 0:07:22 lr 0.000045 time 1.8455 (2.2018) loss 2.0100 (3.0309) grad_norm 2.7652 (2.9135) [2022-01-26 03:48:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[263/300][1060/1251] eta 0:07:00 lr 0.000045 time 3.0653 (2.2036) loss 2.4533 (3.0309) grad_norm 3.0072 (2.9129) [2022-01-26 03:48:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1070/1251] eta 0:06:38 lr 0.000045 time 1.7032 (2.2026) loss 2.8522 (3.0310) grad_norm 3.5010 (2.9114) [2022-01-26 03:48:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1080/1251] eta 0:06:16 lr 0.000045 time 2.4769 (2.2033) loss 2.7454 (3.0326) grad_norm 2.6392 (2.9131) [2022-01-26 03:49:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1090/1251] eta 0:05:54 lr 0.000045 time 1.8588 (2.2029) loss 3.4638 (3.0320) grad_norm 2.6207 (2.9139) [2022-01-26 03:49:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1100/1251] eta 0:05:32 lr 0.000045 time 3.4428 (2.2020) loss 2.8951 (3.0324) grad_norm 3.0910 (2.9138) [2022-01-26 03:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1110/1251] eta 0:05:10 lr 0.000045 time 1.5920 (2.2022) loss 3.5263 (3.0315) grad_norm 3.1735 (2.9151) [2022-01-26 03:50:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1120/1251] eta 0:04:48 lr 0.000045 time 1.9507 (2.2023) loss 3.1460 (3.0317) grad_norm 3.2245 (2.9170) [2022-01-26 03:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1130/1251] eta 0:04:26 lr 0.000045 time 3.4952 (2.2030) loss 3.2483 (3.0306) grad_norm 3.2587 (2.9168) [2022-01-26 03:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1140/1251] eta 0:04:04 lr 0.000045 time 3.5953 (2.2047) loss 3.4085 (3.0310) grad_norm 3.0118 (2.9168) [2022-01-26 03:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1150/1251] eta 0:03:42 lr 0.000045 time 2.2008 (2.2054) loss 3.3765 (3.0288) grad_norm 3.6947 (2.9165) [2022-01-26 03:51:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1160/1251] eta 0:03:20 lr 0.000045 time 1.6348 (2.2038) loss 3.0854 (3.0275) 
grad_norm 3.0352 (2.9179) [2022-01-26 03:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1170/1251] eta 0:02:58 lr 0.000045 time 1.9044 (2.2029) loss 2.4248 (3.0273) grad_norm 2.8061 (2.9163) [2022-01-26 03:52:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1180/1251] eta 0:02:36 lr 0.000045 time 2.8713 (2.2020) loss 3.5659 (3.0284) grad_norm 2.6901 (2.9170) [2022-01-26 03:52:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1190/1251] eta 0:02:14 lr 0.000045 time 2.0291 (2.2016) loss 1.9983 (3.0295) grad_norm 2.4677 (2.9157) [2022-01-26 03:53:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1200/1251] eta 0:01:52 lr 0.000045 time 2.3900 (2.2018) loss 3.4989 (3.0315) grad_norm 3.7109 (2.9159) [2022-01-26 03:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1210/1251] eta 0:01:30 lr 0.000045 time 2.1278 (2.2016) loss 2.9094 (3.0311) grad_norm 2.6746 (2.9146) [2022-01-26 03:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1220/1251] eta 0:01:08 lr 0.000045 time 3.8803 (2.2027) loss 3.5026 (3.0305) grad_norm 2.7120 (2.9141) [2022-01-26 03:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1230/1251] eta 0:00:46 lr 0.000045 time 1.7743 (2.2030) loss 3.0905 (3.0316) grad_norm 2.5552 (2.9140) [2022-01-26 03:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1240/1251] eta 0:00:24 lr 0.000045 time 1.6172 (2.2016) loss 3.3165 (3.0343) grad_norm 2.9830 (2.9132) [2022-01-26 03:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [263/300][1250/1251] eta 0:00:02 lr 0.000045 time 1.2011 (2.1956) loss 2.4920 (3.0357) grad_norm 4.1379 (2.9140) [2022-01-26 03:55:04 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 263 training takes 0:45:47 [2022-01-26 03:55:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.642 (18.642) Loss 0.8785 (0.8785) Acc@1 80.176 (80.176) 
Acc@5 94.824 (94.824) [2022-01-26 03:55:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.569 (3.474) Loss 0.9242 (0.8486) Acc@1 77.637 (80.247) Acc@5 94.727 (95.002) [2022-01-26 03:56:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.993 (2.693) Loss 0.8262 (0.8449) Acc@1 80.859 (80.418) Acc@5 95.898 (95.136) [2022-01-26 03:56:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.269 (2.258) Loss 0.9505 (0.8456) Acc@1 76.953 (80.346) Acc@5 94.043 (95.086) [2022-01-26 03:56:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.141 (2.185) Loss 0.9149 (0.8341) Acc@1 79.590 (80.671) Acc@5 94.336 (95.174) [2022-01-26 03:56:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.750 Acc@5 95.242 [2022-01-26 03:56:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-01-26 03:56:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.78% [2022-01-26 03:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][0/1251] eta 7:38:24 lr 0.000045 time 21.9857 (21.9857) loss 3.3300 (3.3300) grad_norm 3.0825 (3.0825) [2022-01-26 03:57:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][10/1251] eta 1:23:48 lr 0.000045 time 2.3250 (4.0523) loss 3.3910 (3.0742) grad_norm 2.5938 (2.9030) [2022-01-26 03:57:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][20/1251] eta 1:04:25 lr 0.000045 time 1.8245 (3.1403) loss 2.3590 (3.0216) grad_norm 3.0992 (2.9155) [2022-01-26 03:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][30/1251] eta 0:56:53 lr 0.000045 time 1.5430 (2.7953) loss 3.4429 (2.9893) grad_norm 2.9965 (2.9144) [2022-01-26 03:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][40/1251] eta 0:54:19 lr 0.000045 time 3.8363 (2.6912) loss 3.5361 (2.9845) grad_norm 3.0156 (2.8918) [2022-01-26 03:58:55 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][50/1251] eta 0:52:27 lr 0.000045 time 3.0143 (2.6204) loss 3.1441 (2.9789) grad_norm 2.7133 (2.8852) [2022-01-26 03:59:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][60/1251] eta 0:50:46 lr 0.000045 time 2.1857 (2.5576) loss 3.2699 (2.9458) grad_norm 2.9942 (2.8794) [2022-01-26 03:59:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][70/1251] eta 0:49:09 lr 0.000045 time 1.9851 (2.4974) loss 2.2287 (2.9432) grad_norm 3.0336 (2.8943) [2022-01-26 04:00:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][80/1251] eta 0:48:14 lr 0.000045 time 3.7329 (2.4716) loss 3.5246 (2.9865) grad_norm 3.1702 (2.8964) [2022-01-26 04:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][90/1251] eta 0:47:18 lr 0.000045 time 1.5668 (2.4451) loss 3.2520 (3.0108) grad_norm 3.0025 (2.9260) [2022-01-26 04:00:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][100/1251] eta 0:46:16 lr 0.000045 time 1.9034 (2.4125) loss 2.6237 (3.0254) grad_norm 3.0035 (2.9285) [2022-01-26 04:01:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][110/1251] eta 0:45:36 lr 0.000045 time 2.5327 (2.3982) loss 3.1048 (3.0246) grad_norm 3.0235 (2.9354) [2022-01-26 04:01:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][120/1251] eta 0:44:52 lr 0.000045 time 3.1468 (2.3806) loss 2.1883 (3.0076) grad_norm 5.0347 (2.9639) [2022-01-26 04:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][130/1251] eta 0:44:08 lr 0.000045 time 1.7262 (2.3623) loss 2.9421 (3.0145) grad_norm 3.3681 (2.9557) [2022-01-26 04:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][140/1251] eta 0:43:23 lr 0.000045 time 1.9455 (2.3430) loss 3.6370 (3.0214) grad_norm 2.9240 (2.9520) [2022-01-26 04:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][150/1251] eta 0:42:41 lr 0.000045 
time 2.1208 (2.3266) loss 1.8623 (3.0287) grad_norm 3.0730 (2.9430) [2022-01-26 04:02:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][160/1251] eta 0:42:13 lr 0.000045 time 3.1982 (2.3226) loss 3.3622 (3.0187) grad_norm 2.5188 (2.9307) [2022-01-26 04:03:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][170/1251] eta 0:41:40 lr 0.000045 time 1.7995 (2.3129) loss 3.4548 (3.0295) grad_norm 2.9254 (2.9360) [2022-01-26 04:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][180/1251] eta 0:41:13 lr 0.000044 time 2.1730 (2.3096) loss 3.4232 (3.0405) grad_norm 2.7869 (2.9230) [2022-01-26 04:04:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][190/1251] eta 0:40:43 lr 0.000044 time 1.8982 (2.3028) loss 2.9269 (3.0342) grad_norm 2.8850 (2.9226) [2022-01-26 04:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][200/1251] eta 0:40:22 lr 0.000044 time 3.1356 (2.3045) loss 3.0910 (3.0305) grad_norm 3.1713 (2.9293) [2022-01-26 04:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][210/1251] eta 0:39:52 lr 0.000044 time 1.6274 (2.2982) loss 2.6726 (3.0241) grad_norm 2.4335 (2.9223) [2022-01-26 04:05:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][220/1251] eta 0:39:20 lr 0.000044 time 2.1352 (2.2896) loss 3.1422 (3.0305) grad_norm 2.8833 (2.9297) [2022-01-26 04:05:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][230/1251] eta 0:38:47 lr 0.000044 time 1.6454 (2.2796) loss 3.6681 (3.0344) grad_norm 3.1570 (2.9293) [2022-01-26 04:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][240/1251] eta 0:38:27 lr 0.000044 time 3.6163 (2.2829) loss 2.6497 (3.0285) grad_norm 2.8875 (2.9324) [2022-01-26 04:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][250/1251] eta 0:38:02 lr 0.000044 time 1.6394 (2.2806) loss 3.2589 (3.0295) grad_norm 2.6566 (2.9320) [2022-01-26 04:06:35 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][260/1251] eta 0:37:36 lr 0.000044 time 2.9115 (2.2768) loss 3.1746 (3.0299) grad_norm 2.7593 (2.9265) [2022-01-26 04:06:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][270/1251] eta 0:37:10 lr 0.000044 time 1.8311 (2.2733) loss 2.4644 (3.0321) grad_norm 2.6107 (2.9253) [2022-01-26 04:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][280/1251] eta 0:36:47 lr 0.000044 time 3.0133 (2.2733) loss 3.6502 (3.0287) grad_norm 4.1650 (2.9248) [2022-01-26 04:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][290/1251] eta 0:36:18 lr 0.000044 time 1.6111 (2.2670) loss 2.6058 (3.0275) grad_norm 2.9640 (2.9206) [2022-01-26 04:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][300/1251] eta 0:35:51 lr 0.000044 time 2.5749 (2.2624) loss 2.8882 (3.0246) grad_norm 2.6735 (2.9270) [2022-01-26 04:08:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][310/1251] eta 0:35:24 lr 0.000044 time 2.4416 (2.2581) loss 2.2124 (3.0224) grad_norm 2.9549 (2.9262) [2022-01-26 04:08:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][320/1251] eta 0:35:01 lr 0.000044 time 2.8682 (2.2570) loss 2.7107 (3.0248) grad_norm 2.9102 (2.9283) [2022-01-26 04:09:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][330/1251] eta 0:34:37 lr 0.000044 time 1.9604 (2.2560) loss 3.5464 (3.0246) grad_norm 2.5249 (2.9292) [2022-01-26 04:09:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][340/1251] eta 0:34:12 lr 0.000044 time 2.2167 (2.2534) loss 3.3536 (3.0234) grad_norm 2.7102 (2.9283) [2022-01-26 04:09:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][350/1251] eta 0:33:47 lr 0.000044 time 2.5665 (2.2499) loss 3.0356 (3.0216) grad_norm 2.5415 (2.9351) [2022-01-26 04:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][360/1251] eta 0:33:21 lr 
0.000044 time 2.0606 (2.2460) loss 3.2926 (3.0227) grad_norm 3.1221 (2.9380) [2022-01-26 04:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][370/1251] eta 0:32:57 lr 0.000044 time 1.8441 (2.2445) loss 2.3620 (3.0254) grad_norm 3.0664 (2.9382) [2022-01-26 04:10:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][380/1251] eta 0:32:33 lr 0.000044 time 1.5237 (2.2429) loss 3.1888 (3.0312) grad_norm 2.8696 (2.9376) [2022-01-26 04:11:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][390/1251] eta 0:32:09 lr 0.000044 time 2.6211 (2.2405) loss 3.3398 (3.0316) grad_norm 2.9384 (2.9533) [2022-01-26 04:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][400/1251] eta 0:31:44 lr 0.000044 time 1.8151 (2.2377) loss 3.0464 (3.0332) grad_norm 3.0801 (2.9582) [2022-01-26 04:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][410/1251] eta 0:31:18 lr 0.000044 time 1.6651 (2.2342) loss 3.5907 (3.0330) grad_norm 3.0465 (2.9575) [2022-01-26 04:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][420/1251] eta 0:30:56 lr 0.000044 time 2.1552 (2.2339) loss 3.2288 (3.0335) grad_norm 2.5388 (2.9516) [2022-01-26 04:12:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][430/1251] eta 0:30:33 lr 0.000044 time 2.7628 (2.2329) loss 3.2959 (3.0412) grad_norm 3.0660 (2.9489) [2022-01-26 04:13:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][440/1251] eta 0:30:12 lr 0.000044 time 2.5006 (2.2347) loss 3.0273 (3.0370) grad_norm 3.1855 (2.9535) [2022-01-26 04:13:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][450/1251] eta 0:29:48 lr 0.000044 time 2.7741 (2.2334) loss 2.8426 (3.0402) grad_norm 2.5421 (2.9495) [2022-01-26 04:13:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][460/1251] eta 0:29:25 lr 0.000044 time 2.1105 (2.2324) loss 2.6051 (3.0414) grad_norm 2.8070 (2.9496) [2022-01-26 04:14:12 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][470/1251] eta 0:29:02 lr 0.000044 time 2.9433 (2.2312) loss 2.6900 (3.0438) grad_norm 2.9623 (2.9517) [2022-01-26 04:14:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][480/1251] eta 0:28:39 lr 0.000044 time 1.9485 (2.2304) loss 2.2537 (3.0337) grad_norm 2.9156 (2.9469) [2022-01-26 04:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][490/1251] eta 0:28:16 lr 0.000044 time 1.8585 (2.2290) loss 3.1498 (3.0240) grad_norm 2.5397 (2.9430) [2022-01-26 04:15:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][500/1251] eta 0:27:53 lr 0.000044 time 1.8818 (2.2286) loss 2.9571 (3.0225) grad_norm 2.6900 (2.9397) [2022-01-26 04:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][510/1251] eta 0:27:32 lr 0.000044 time 2.2177 (2.2300) loss 3.4446 (3.0264) grad_norm 2.8550 (2.9372) [2022-01-26 04:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][520/1251] eta 0:27:10 lr 0.000044 time 2.2168 (2.2299) loss 2.3635 (3.0251) grad_norm 3.5011 (2.9381) [2022-01-26 04:16:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][530/1251] eta 0:26:45 lr 0.000044 time 1.8003 (2.2275) loss 3.2363 (3.0235) grad_norm 2.8491 (2.9435) [2022-01-26 04:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][540/1251] eta 0:26:24 lr 0.000044 time 1.8089 (2.2284) loss 3.3207 (3.0235) grad_norm 2.7973 (2.9471) [2022-01-26 04:17:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][550/1251] eta 0:26:02 lr 0.000044 time 1.9340 (2.2288) loss 2.5244 (3.0239) grad_norm 2.2173 (2.9454) [2022-01-26 04:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][560/1251] eta 0:25:38 lr 0.000044 time 2.4990 (2.2270) loss 3.7089 (3.0292) grad_norm 2.7560 (2.9492) [2022-01-26 04:17:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][570/1251] eta 0:25:13 lr 
0.000044 time 1.9882 (2.2229) loss 3.2444 (3.0276) grad_norm 3.0751 (2.9493) [2022-01-26 04:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][580/1251] eta 0:24:50 lr 0.000044 time 2.2429 (2.2211) loss 3.1358 (3.0320) grad_norm 2.4587 (2.9477) [2022-01-26 04:18:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][590/1251] eta 0:24:29 lr 0.000044 time 1.8764 (2.2233) loss 2.6098 (3.0313) grad_norm 3.8849 (2.9474) [2022-01-26 04:18:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][600/1251] eta 0:24:08 lr 0.000044 time 2.4030 (2.2246) loss 2.8500 (3.0294) grad_norm 2.5138 (2.9459) [2022-01-26 04:19:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][610/1251] eta 0:23:46 lr 0.000044 time 1.5393 (2.2258) loss 3.5030 (3.0307) grad_norm 2.9848 (2.9460) [2022-01-26 04:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][620/1251] eta 0:23:24 lr 0.000044 time 1.5997 (2.2252) loss 1.8711 (3.0288) grad_norm 2.7333 (2.9420) [2022-01-26 04:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][630/1251] eta 0:23:00 lr 0.000044 time 2.5079 (2.2229) loss 2.9704 (3.0285) grad_norm 2.9486 (2.9408) [2022-01-26 04:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][640/1251] eta 0:22:37 lr 0.000044 time 2.4547 (2.2211) loss 3.4946 (3.0288) grad_norm 3.3564 (2.9395) [2022-01-26 04:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][650/1251] eta 0:22:13 lr 0.000044 time 1.9218 (2.2187) loss 3.2950 (3.0307) grad_norm 2.7967 (2.9374) [2022-01-26 04:21:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][660/1251] eta 0:21:50 lr 0.000044 time 1.7861 (2.2174) loss 2.2322 (3.0234) grad_norm 2.9828 (2.9375) [2022-01-26 04:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][670/1251] eta 0:21:28 lr 0.000044 time 2.5738 (2.2183) loss 3.6425 (3.0273) grad_norm 3.4082 (2.9381) [2022-01-26 04:21:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][680/1251] eta 0:21:06 lr 0.000044 time 3.0934 (2.2180) loss 3.0612 (3.0266) grad_norm 3.0407 (2.9385) [2022-01-26 04:22:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][690/1251] eta 0:20:43 lr 0.000044 time 1.6371 (2.2161) loss 3.3405 (3.0261) grad_norm 2.7864 (2.9373) [2022-01-26 04:22:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][700/1251] eta 0:20:22 lr 0.000044 time 2.6912 (2.2180) loss 3.1901 (3.0274) grad_norm 3.3163 (2.9374) [2022-01-26 04:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][710/1251] eta 0:19:59 lr 0.000044 time 1.6639 (2.2165) loss 3.4439 (3.0265) grad_norm 2.8008 (2.9378) [2022-01-26 04:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][720/1251] eta 0:19:36 lr 0.000044 time 2.5122 (2.2157) loss 3.1271 (3.0278) grad_norm 2.7975 (2.9365) [2022-01-26 04:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][730/1251] eta 0:19:13 lr 0.000044 time 1.6273 (2.2135) loss 2.2043 (3.0274) grad_norm 2.6621 (2.9337) [2022-01-26 04:24:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][740/1251] eta 0:18:50 lr 0.000044 time 2.1776 (2.2118) loss 2.6808 (3.0260) grad_norm 2.6078 (2.9333) [2022-01-26 04:24:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][750/1251] eta 0:18:27 lr 0.000044 time 1.5129 (2.2105) loss 3.2026 (3.0247) grad_norm 3.0691 (2.9354) [2022-01-26 04:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][760/1251] eta 0:18:05 lr 0.000044 time 2.9291 (2.2114) loss 2.7729 (3.0232) grad_norm 2.5582 (2.9390) [2022-01-26 04:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][770/1251] eta 0:17:43 lr 0.000044 time 1.9654 (2.2108) loss 3.6199 (3.0240) grad_norm 3.0114 (2.9383) [2022-01-26 04:25:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][780/1251] eta 0:17:21 lr 
0.000044 time 2.2749 (2.2119) loss 2.5182 (3.0214) grad_norm 2.9868 (2.9359) [2022-01-26 04:25:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][790/1251] eta 0:16:59 lr 0.000044 time 1.5996 (2.2115) loss 2.5458 (3.0197) grad_norm 2.9126 (2.9370) [2022-01-26 04:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][800/1251] eta 0:16:37 lr 0.000044 time 2.5866 (2.2108) loss 2.6888 (3.0192) grad_norm 3.0156 (2.9359) [2022-01-26 04:26:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][810/1251] eta 0:16:14 lr 0.000044 time 1.8525 (2.2093) loss 3.3844 (3.0224) grad_norm 2.7344 (2.9358) [2022-01-26 04:26:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][820/1251] eta 0:15:52 lr 0.000044 time 2.8350 (2.2106) loss 2.5358 (3.0213) grad_norm 3.7168 (2.9366) [2022-01-26 04:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][830/1251] eta 0:15:30 lr 0.000044 time 1.8093 (2.2113) loss 2.9566 (3.0191) grad_norm 3.0881 (2.9369) [2022-01-26 04:27:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][840/1251] eta 0:15:09 lr 0.000043 time 2.4917 (2.2117) loss 3.1441 (3.0185) grad_norm 3.1627 (2.9380) [2022-01-26 04:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][850/1251] eta 0:14:46 lr 0.000043 time 1.7876 (2.2104) loss 3.2963 (3.0204) grad_norm 2.9987 (2.9370) [2022-01-26 04:28:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][860/1251] eta 0:14:23 lr 0.000043 time 1.8334 (2.2079) loss 2.3658 (3.0179) grad_norm 2.5781 (2.9360) [2022-01-26 04:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][870/1251] eta 0:14:00 lr 0.000043 time 2.0629 (2.2073) loss 3.3158 (3.0167) grad_norm 2.8834 (2.9351) [2022-01-26 04:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][880/1251] eta 0:13:39 lr 0.000043 time 2.8375 (2.2084) loss 3.9276 (3.0208) grad_norm 2.8193 (2.9351) [2022-01-26 04:29:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][890/1251] eta 0:13:17 lr 0.000043 time 1.7580 (2.2093) loss 1.8029 (3.0206) grad_norm 3.0805 (2.9348) [2022-01-26 04:29:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][900/1251] eta 0:12:55 lr 0.000043 time 1.8576 (2.2083) loss 3.5950 (3.0231) grad_norm 2.9373 (2.9344) [2022-01-26 04:30:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][910/1251] eta 0:12:32 lr 0.000043 time 2.2060 (2.2074) loss 3.5661 (3.0242) grad_norm 2.4425 (2.9324) [2022-01-26 04:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][920/1251] eta 0:12:10 lr 0.000043 time 2.3908 (2.2066) loss 3.6675 (3.0268) grad_norm 2.8422 (2.9311) [2022-01-26 04:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][930/1251] eta 0:11:48 lr 0.000043 time 2.0897 (2.2074) loss 3.2848 (3.0283) grad_norm 3.0706 (2.9307) [2022-01-26 04:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][940/1251] eta 0:11:26 lr 0.000043 time 1.8261 (2.2076) loss 3.0487 (3.0265) grad_norm 3.2191 (2.9319) [2022-01-26 04:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][950/1251] eta 0:11:04 lr 0.000043 time 2.5975 (2.2081) loss 3.4361 (3.0272) grad_norm 2.7064 (2.9320) [2022-01-26 04:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][960/1251] eta 0:10:42 lr 0.000043 time 2.3737 (2.2080) loss 3.4717 (3.0247) grad_norm 2.8017 (2.9303) [2022-01-26 04:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][970/1251] eta 0:10:20 lr 0.000043 time 3.1129 (2.2085) loss 2.6209 (3.0268) grad_norm 2.7826 (2.9297) [2022-01-26 04:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][980/1251] eta 0:09:58 lr 0.000043 time 1.9156 (2.2077) loss 3.3714 (3.0279) grad_norm 3.0727 (2.9296) [2022-01-26 04:33:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][990/1251] eta 0:09:35 lr 
0.000043 time 1.9356 (2.2057) loss 2.2377 (3.0255) grad_norm 2.9229 (2.9323) [2022-01-26 04:33:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1000/1251] eta 0:09:13 lr 0.000043 time 1.9164 (2.2040) loss 2.7888 (3.0250) grad_norm 3.1555 (2.9354) [2022-01-26 04:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1010/1251] eta 0:08:50 lr 0.000043 time 2.1582 (2.2029) loss 2.8133 (3.0233) grad_norm 2.3856 (2.9340) [2022-01-26 04:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1020/1251] eta 0:08:28 lr 0.000043 time 2.1711 (2.2027) loss 3.5036 (3.0228) grad_norm 2.8345 (2.9336) [2022-01-26 04:34:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1030/1251] eta 0:08:06 lr 0.000043 time 2.2218 (2.2035) loss 2.5386 (3.0221) grad_norm 3.2530 (2.9334) [2022-01-26 04:34:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1040/1251] eta 0:07:45 lr 0.000043 time 1.8481 (2.2057) loss 3.3352 (3.0246) grad_norm 2.9199 (2.9341) [2022-01-26 04:35:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1050/1251] eta 0:07:23 lr 0.000043 time 2.5072 (2.2058) loss 2.9906 (3.0261) grad_norm 3.1423 (2.9336) [2022-01-26 04:35:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1060/1251] eta 0:07:01 lr 0.000043 time 1.8949 (2.2055) loss 2.4704 (3.0267) grad_norm 2.8538 (2.9346) [2022-01-26 04:36:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1070/1251] eta 0:06:39 lr 0.000043 time 2.8936 (2.2057) loss 3.4318 (3.0269) grad_norm 2.8386 (2.9338) [2022-01-26 04:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1080/1251] eta 0:06:16 lr 0.000043 time 1.8785 (2.2041) loss 3.7664 (3.0269) grad_norm 2.8086 (2.9338) [2022-01-26 04:36:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1090/1251] eta 0:05:54 lr 0.000043 time 1.8965 (2.2027) loss 2.2420 (3.0278) grad_norm 3.7013 (2.9327) [2022-01-26 
04:37:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1100/1251] eta 0:05:32 lr 0.000043 time 1.9736 (2.2022) loss 3.7710 (3.0291) grad_norm 3.0274 (2.9328) [2022-01-26 04:37:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1110/1251] eta 0:05:10 lr 0.000043 time 2.6637 (2.2028) loss 3.5113 (3.0297) grad_norm 2.9131 (2.9323) [2022-01-26 04:37:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1120/1251] eta 0:04:48 lr 0.000043 time 1.8581 (2.2026) loss 3.0076 (3.0299) grad_norm 2.5384 (2.9313) [2022-01-26 04:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1130/1251] eta 0:04:26 lr 0.000043 time 1.8773 (2.2020) loss 3.2523 (3.0294) grad_norm 2.7646 (2.9320) [2022-01-26 04:38:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1140/1251] eta 0:04:04 lr 0.000043 time 2.2170 (2.2017) loss 2.9755 (3.0285) grad_norm 3.4031 (2.9318) [2022-01-26 04:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1150/1251] eta 0:03:42 lr 0.000043 time 3.4711 (2.2031) loss 3.0201 (3.0291) grad_norm 2.5082 (2.9303) [2022-01-26 04:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1160/1251] eta 0:03:20 lr 0.000043 time 2.1621 (2.2035) loss 3.1678 (3.0296) grad_norm 3.9406 (2.9319) [2022-01-26 04:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1170/1251] eta 0:02:58 lr 0.000043 time 1.7169 (2.2029) loss 2.0244 (3.0301) grad_norm 3.0145 (2.9342) [2022-01-26 04:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1180/1251] eta 0:02:36 lr 0.000043 time 2.5908 (2.2038) loss 3.4771 (3.0305) grad_norm 2.6113 (2.9347) [2022-01-26 04:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1190/1251] eta 0:02:14 lr 0.000043 time 2.7832 (2.2041) loss 3.1381 (3.0313) grad_norm 2.8540 (2.9342) [2022-01-26 04:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1200/1251] 
eta 0:01:52 lr 0.000043 time 1.6494 (2.2027) loss 3.5125 (3.0309) grad_norm 2.7474 (2.9332) [2022-01-26 04:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1210/1251] eta 0:01:30 lr 0.000043 time 1.8468 (2.2014) loss 1.7502 (3.0287) grad_norm 3.8595 (2.9332) [2022-01-26 04:41:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1220/1251] eta 0:01:08 lr 0.000043 time 2.4819 (2.2016) loss 3.6836 (3.0307) grad_norm 2.6048 (2.9315) [2022-01-26 04:41:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1230/1251] eta 0:00:46 lr 0.000043 time 1.9446 (2.2009) loss 3.5495 (3.0302) grad_norm 2.5966 (2.9295) [2022-01-26 04:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1240/1251] eta 0:00:24 lr 0.000043 time 1.4666 (2.2001) loss 3.2520 (3.0322) grad_norm 2.9643 (2.9303) [2022-01-26 04:42:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [264/300][1250/1251] eta 0:00:02 lr 0.000043 time 1.2081 (2.1947) loss 3.2808 (3.0333) grad_norm 2.3927 (2.9297) [2022-01-26 04:42:27 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 264 training takes 0:45:45 [2022-01-26 04:42:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.904 (18.904) Loss 0.8278 (0.8278) Acc@1 80.664 (80.664) Acc@5 95.117 (95.117) [2022-01-26 04:43:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.185 (3.408) Loss 0.8184 (0.8183) Acc@1 81.152 (81.188) Acc@5 95.215 (95.446) [2022-01-26 04:43:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.607 (2.607) Loss 0.8131 (0.8146) Acc@1 79.395 (80.878) Acc@5 95.215 (95.443) [2022-01-26 04:43:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.958 (2.333) Loss 0.7996 (0.8139) Acc@1 82.422 (80.888) Acc@5 95.703 (95.432) [2022-01-26 04:43:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.102 (2.193) Loss 0.8678 (0.8222) Acc@1 79.785 (80.707) Acc@5 94.922 (95.320) 
[2022-01-26 04:44:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.690 Acc@5 95.312 [2022-01-26 04:44:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.7% [2022-01-26 04:44:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.78% [2022-01-26 04:44:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][0/1251] eta 7:38:20 lr 0.000043 time 21.9826 (21.9826) loss 3.4197 (3.4197) grad_norm 3.5022 (3.5022) [2022-01-26 04:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][10/1251] eta 1:21:49 lr 0.000043 time 1.9970 (3.9558) loss 3.4082 (3.1308) grad_norm 2.5047 (2.8877) [2022-01-26 04:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][20/1251] eta 1:04:06 lr 0.000043 time 1.2691 (3.1243) loss 2.3869 (3.1269) grad_norm 4.6879 (2.9488) [2022-01-26 04:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][30/1251] eta 0:56:15 lr 0.000043 time 1.3723 (2.7649) loss 3.2132 (3.1601) grad_norm 2.4907 (2.9194) [2022-01-26 04:45:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][40/1251] eta 0:54:51 lr 0.000043 time 4.0340 (2.7177) loss 2.3783 (3.1688) grad_norm 2.8234 (2.9697) [2022-01-26 04:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][50/1251] eta 0:52:19 lr 0.000043 time 2.3997 (2.6142) loss 3.8204 (3.1532) grad_norm 2.4887 (2.9418) [2022-01-26 04:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][60/1251] eta 0:50:48 lr 0.000043 time 2.0993 (2.5593) loss 2.6439 (3.1528) grad_norm 2.7566 (2.9288) [2022-01-26 04:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][70/1251] eta 0:49:23 lr 0.000043 time 1.5697 (2.5091) loss 2.2569 (3.1390) grad_norm 2.5935 (2.9392) [2022-01-26 04:47:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][80/1251] eta 0:48:53 lr 0.000043 time 3.9448 (2.5049) loss 3.5488 (3.0877) 
grad_norm 3.2136 (2.9333) [2022-01-26 04:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][90/1251] eta 0:48:02 lr 0.000043 time 1.9193 (2.4831) loss 3.4142 (3.0992) grad_norm 2.7508 (2.9298) [2022-01-26 04:48:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][100/1251] eta 0:46:45 lr 0.000043 time 1.8579 (2.4374) loss 3.0481 (3.0864) grad_norm 2.5871 (2.9419) [2022-01-26 04:48:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][110/1251] eta 0:45:37 lr 0.000043 time 1.9328 (2.3991) loss 3.0946 (3.0924) grad_norm 2.5342 (2.9490) [2022-01-26 04:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][120/1251] eta 0:45:01 lr 0.000043 time 3.1739 (2.3885) loss 3.1776 (3.0955) grad_norm 2.9148 (2.9401) [2022-01-26 04:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][130/1251] eta 0:44:28 lr 0.000043 time 2.6805 (2.3800) loss 2.3077 (3.0903) grad_norm 2.7807 (2.9492) [2022-01-26 04:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][140/1251] eta 0:43:55 lr 0.000043 time 2.0901 (2.3719) loss 3.5095 (3.0842) grad_norm 3.0241 (2.9425) [2022-01-26 04:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][150/1251] eta 0:43:12 lr 0.000043 time 1.5591 (2.3549) loss 2.3131 (3.0696) grad_norm 3.9217 (2.9603) [2022-01-26 04:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][160/1251] eta 0:42:32 lr 0.000043 time 2.9168 (2.3393) loss 2.1436 (3.0665) grad_norm 2.5035 (2.9556) [2022-01-26 04:50:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][170/1251] eta 0:41:59 lr 0.000043 time 1.9037 (2.3311) loss 2.0948 (3.0496) grad_norm 2.4394 (2.9537) [2022-01-26 04:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][180/1251] eta 0:41:23 lr 0.000043 time 1.8569 (2.3188) loss 3.5729 (3.0437) grad_norm 3.7116 (2.9491) [2022-01-26 04:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [265/300][190/1251] eta 0:40:51 lr 0.000043 time 1.8676 (2.3106) loss 3.4940 (3.0473) grad_norm 2.9469 (2.9545) [2022-01-26 04:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][200/1251] eta 0:40:21 lr 0.000043 time 2.8233 (2.3043) loss 3.3395 (3.0471) grad_norm 2.8850 (2.9582) [2022-01-26 04:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][210/1251] eta 0:39:54 lr 0.000043 time 2.7423 (2.2998) loss 2.5755 (3.0385) grad_norm 2.9405 (2.9568) [2022-01-26 04:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][220/1251] eta 0:39:24 lr 0.000043 time 2.2410 (2.2933) loss 3.1345 (3.0355) grad_norm 3.0922 (2.9526) [2022-01-26 04:52:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][230/1251] eta 0:39:03 lr 0.000043 time 3.1442 (2.2951) loss 3.0134 (3.0286) grad_norm 3.4097 (2.9631) [2022-01-26 04:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][240/1251] eta 0:38:37 lr 0.000043 time 2.5393 (2.2919) loss 1.7709 (3.0193) grad_norm 2.8813 (2.9582) [2022-01-26 04:53:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][250/1251] eta 0:38:13 lr 0.000043 time 2.4756 (2.2917) loss 3.1324 (3.0209) grad_norm 2.6099 (2.9600) [2022-01-26 04:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][260/1251] eta 0:37:45 lr 0.000042 time 1.8782 (2.2857) loss 3.5759 (3.0216) grad_norm 2.7537 (2.9637) [2022-01-26 04:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][270/1251] eta 0:37:09 lr 0.000042 time 2.1244 (2.2728) loss 3.3929 (3.0338) grad_norm 3.5420 (2.9662) [2022-01-26 04:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][280/1251] eta 0:36:36 lr 0.000042 time 1.7549 (2.2624) loss 2.0555 (3.0298) grad_norm 2.7341 (2.9634) [2022-01-26 04:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][290/1251] eta 0:36:07 lr 0.000042 time 1.6811 (2.2551) loss 3.1971 (3.0297) 
grad_norm 3.1894 (2.9615) [2022-01-26 04:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][300/1251] eta 0:35:42 lr 0.000042 time 2.0947 (2.2532) loss 2.8158 (3.0366) grad_norm 2.5158 (2.9553) [2022-01-26 04:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][310/1251] eta 0:35:16 lr 0.000042 time 2.0077 (2.2488) loss 3.6124 (3.0338) grad_norm 3.0070 (2.9526) [2022-01-26 04:56:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][320/1251] eta 0:34:49 lr 0.000042 time 1.8816 (2.2441) loss 3.3385 (3.0373) grad_norm 3.1017 (2.9565) [2022-01-26 04:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][330/1251] eta 0:34:26 lr 0.000042 time 1.9926 (2.2437) loss 3.3252 (3.0387) grad_norm 2.6917 (2.9516) [2022-01-26 04:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][340/1251] eta 0:34:08 lr 0.000042 time 2.1643 (2.2487) loss 3.5686 (3.0437) grad_norm 2.5721 (2.9476) [2022-01-26 04:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][350/1251] eta 0:33:48 lr 0.000042 time 1.7285 (2.2508) loss 2.8058 (3.0430) grad_norm 2.7973 (2.9499) [2022-01-26 04:57:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][360/1251] eta 0:33:30 lr 0.000042 time 2.5829 (2.2560) loss 2.8275 (3.0431) grad_norm 2.7823 (2.9487) [2022-01-26 04:58:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][370/1251] eta 0:33:06 lr 0.000042 time 2.1236 (2.2544) loss 2.1754 (3.0439) grad_norm 3.0967 (2.9521) [2022-01-26 04:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][380/1251] eta 0:32:37 lr 0.000042 time 1.8006 (2.2477) loss 2.8252 (3.0447) grad_norm 3.1822 (2.9465) [2022-01-26 04:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][390/1251] eta 0:32:09 lr 0.000042 time 1.9791 (2.2414) loss 3.2221 (3.0520) grad_norm 2.7686 (2.9487) [2022-01-26 04:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [265/300][400/1251] eta 0:31:43 lr 0.000042 time 2.1944 (2.2368) loss 3.1899 (3.0542) grad_norm 2.6595 (2.9511) [2022-01-26 04:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][410/1251] eta 0:31:19 lr 0.000042 time 1.7884 (2.2343) loss 2.7789 (3.0495) grad_norm 2.7330 (2.9498) [2022-01-26 04:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][420/1251] eta 0:30:58 lr 0.000042 time 2.0783 (2.2361) loss 2.9765 (3.0425) grad_norm 2.4466 (2.9471) [2022-01-26 05:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][430/1251] eta 0:30:37 lr 0.000042 time 1.8753 (2.2380) loss 2.8969 (3.0404) grad_norm 2.9115 (2.9495) [2022-01-26 05:00:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][440/1251] eta 0:30:16 lr 0.000042 time 1.9521 (2.2396) loss 3.3659 (3.0417) grad_norm 3.0806 (2.9483) [2022-01-26 05:00:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][450/1251] eta 0:29:53 lr 0.000042 time 2.1734 (2.2397) loss 3.4699 (3.0415) grad_norm 2.5773 (2.9478) [2022-01-26 05:01:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][460/1251] eta 0:29:31 lr 0.000042 time 1.9510 (2.2397) loss 3.2702 (3.0420) grad_norm 6.0591 (2.9526) [2022-01-26 05:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][470/1251] eta 0:29:07 lr 0.000042 time 2.1665 (2.2373) loss 2.1231 (3.0401) grad_norm 2.4417 (2.9502) [2022-01-26 05:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][480/1251] eta 0:28:42 lr 0.000042 time 1.7211 (2.2336) loss 3.3381 (3.0346) grad_norm 2.4978 (2.9466) [2022-01-26 05:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][490/1251] eta 0:28:17 lr 0.000042 time 2.1922 (2.2306) loss 3.0044 (3.0361) grad_norm 2.9221 (2.9434) [2022-01-26 05:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][500/1251] eta 0:27:56 lr 0.000042 time 1.9857 (2.2320) loss 2.5708 (3.0362) 
grad_norm 2.6733 (2.9438) [2022-01-26 05:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][510/1251] eta 0:27:33 lr 0.000042 time 2.2964 (2.2319) loss 2.1929 (3.0315) grad_norm 2.6846 (2.9411) [2022-01-26 05:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][520/1251] eta 0:27:09 lr 0.000042 time 1.4910 (2.2295) loss 3.2791 (3.0299) grad_norm 2.7377 (2.9420) [2022-01-26 05:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][530/1251] eta 0:26:47 lr 0.000042 time 1.9063 (2.2296) loss 3.2990 (3.0308) grad_norm 2.8696 (2.9390) [2022-01-26 05:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][540/1251] eta 0:26:26 lr 0.000042 time 2.2896 (2.2316) loss 3.4099 (3.0311) grad_norm 3.0451 (2.9414) [2022-01-26 05:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][550/1251] eta 0:26:02 lr 0.000042 time 2.2494 (2.2294) loss 3.3324 (3.0294) grad_norm 3.1791 (2.9390) [2022-01-26 05:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][560/1251] eta 0:25:38 lr 0.000042 time 1.9171 (2.2258) loss 3.6367 (3.0254) grad_norm 2.6532 (2.9369) [2022-01-26 05:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][570/1251] eta 0:25:14 lr 0.000042 time 2.8243 (2.2246) loss 3.6083 (3.0258) grad_norm 2.9671 (2.9367) [2022-01-26 05:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][580/1251] eta 0:24:51 lr 0.000042 time 1.6167 (2.2231) loss 2.3073 (3.0234) grad_norm 2.9362 (2.9379) [2022-01-26 05:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][590/1251] eta 0:24:28 lr 0.000042 time 2.0702 (2.2220) loss 1.9921 (3.0197) grad_norm 3.2494 (2.9396) [2022-01-26 05:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][600/1251] eta 0:24:06 lr 0.000042 time 2.2319 (2.2219) loss 3.5601 (3.0237) grad_norm 3.3961 (2.9393) [2022-01-26 05:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [265/300][610/1251] eta 0:23:45 lr 0.000042 time 2.9233 (2.2232) loss 2.6553 (3.0238) grad_norm 3.2213 (2.9427) [2022-01-26 05:07:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][620/1251] eta 0:23:24 lr 0.000042 time 1.6769 (2.2252) loss 3.4279 (3.0213) grad_norm 2.9903 (2.9419) [2022-01-26 05:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][630/1251] eta 0:23:02 lr 0.000042 time 1.8811 (2.2257) loss 3.1099 (3.0236) grad_norm 3.0874 (2.9448) [2022-01-26 05:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][640/1251] eta 0:22:39 lr 0.000042 time 1.9192 (2.2243) loss 3.4644 (3.0221) grad_norm 3.0323 (2.9450) [2022-01-26 05:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][650/1251] eta 0:22:16 lr 0.000042 time 2.4435 (2.2234) loss 3.3292 (3.0228) grad_norm 2.5474 (2.9423) [2022-01-26 05:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][660/1251] eta 0:21:52 lr 0.000042 time 2.1720 (2.2216) loss 1.9723 (3.0226) grad_norm 2.8260 (2.9427) [2022-01-26 05:08:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][670/1251] eta 0:21:29 lr 0.000042 time 1.6077 (2.2202) loss 2.3524 (3.0214) grad_norm 2.7375 (2.9454) [2022-01-26 05:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][680/1251] eta 0:21:08 lr 0.000042 time 2.2173 (2.2222) loss 3.6118 (3.0240) grad_norm 3.1487 (2.9462) [2022-01-26 05:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][690/1251] eta 0:20:46 lr 0.000042 time 2.2140 (2.2213) loss 1.8350 (3.0208) grad_norm 2.9959 (2.9443) [2022-01-26 05:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][700/1251] eta 0:20:23 lr 0.000042 time 2.5114 (2.2212) loss 3.1985 (3.0198) grad_norm 2.5922 (2.9405) [2022-01-26 05:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][710/1251] eta 0:20:00 lr 0.000042 time 1.9738 (2.2184) loss 2.4028 (3.0184) 
grad_norm 2.9541 (2.9390) [2022-01-26 05:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][720/1251] eta 0:19:38 lr 0.000042 time 2.7873 (2.2188) loss 3.3161 (3.0186) grad_norm 2.5350 (2.9383) [2022-01-26 05:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][730/1251] eta 0:19:16 lr 0.000042 time 2.2331 (2.2198) loss 3.6390 (3.0165) grad_norm 3.0681 (2.9401) [2022-01-26 05:11:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][740/1251] eta 0:18:55 lr 0.000042 time 1.8595 (2.2225) loss 3.1923 (3.0160) grad_norm 3.2052 (2.9416) [2022-01-26 05:11:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][750/1251] eta 0:18:33 lr 0.000042 time 1.9477 (2.2224) loss 2.6952 (3.0167) grad_norm 2.7282 (2.9410) [2022-01-26 05:12:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][760/1251] eta 0:18:10 lr 0.000042 time 2.1369 (2.2211) loss 2.9641 (3.0153) grad_norm 3.8058 (2.9456) [2022-01-26 05:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][770/1251] eta 0:17:46 lr 0.000042 time 1.8278 (2.2178) loss 3.5547 (3.0161) grad_norm 2.5637 (2.9444) [2022-01-26 05:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][780/1251] eta 0:17:24 lr 0.000042 time 1.9509 (2.2170) loss 2.8414 (3.0139) grad_norm 2.7544 (2.9444) [2022-01-26 05:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][790/1251] eta 0:17:02 lr 0.000042 time 2.4558 (2.2172) loss 3.0420 (3.0148) grad_norm 2.6273 (2.9463) [2022-01-26 05:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][800/1251] eta 0:16:40 lr 0.000042 time 1.9164 (2.2179) loss 3.0942 (3.0132) grad_norm 2.5831 (2.9474) [2022-01-26 05:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][810/1251] eta 0:16:18 lr 0.000042 time 2.1917 (2.2188) loss 3.4739 (3.0149) grad_norm 3.3308 (2.9475) [2022-01-26 05:14:27 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [265/300][820/1251] eta 0:15:56 lr 0.000042 time 2.0095 (2.2204) loss 3.9433 (3.0166) grad_norm 2.6549 (2.9491) [2022-01-26 05:14:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][830/1251] eta 0:15:34 lr 0.000042 time 2.4462 (2.2197) loss 3.0484 (3.0163) grad_norm 3.1437 (2.9477) [2022-01-26 05:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][840/1251] eta 0:15:11 lr 0.000042 time 2.0176 (2.2184) loss 3.4075 (3.0170) grad_norm 2.4369 (2.9470) [2022-01-26 05:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][850/1251] eta 0:14:49 lr 0.000042 time 1.8808 (2.2177) loss 2.2285 (3.0157) grad_norm 2.9741 (2.9465) [2022-01-26 05:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][860/1251] eta 0:14:27 lr 0.000042 time 2.1899 (2.2189) loss 3.4680 (3.0174) grad_norm 3.2701 (2.9464) [2022-01-26 05:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][870/1251] eta 0:14:05 lr 0.000042 time 1.8967 (2.2182) loss 3.4568 (3.0159) grad_norm 2.6058 (2.9463) [2022-01-26 05:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][880/1251] eta 0:13:42 lr 0.000042 time 1.6404 (2.2167) loss 3.5340 (3.0175) grad_norm 3.3893 (2.9481) [2022-01-26 05:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][890/1251] eta 0:13:20 lr 0.000042 time 1.8771 (2.2168) loss 3.5616 (3.0176) grad_norm 2.8164 (2.9475) [2022-01-26 05:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][900/1251] eta 0:12:58 lr 0.000042 time 1.8999 (2.2178) loss 2.6973 (3.0150) grad_norm 2.4755 (2.9450) [2022-01-26 05:17:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][910/1251] eta 0:12:36 lr 0.000042 time 2.5588 (2.2183) loss 2.6615 (3.0159) grad_norm 3.3405 (2.9444) [2022-01-26 05:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][920/1251] eta 0:12:14 lr 0.000042 time 1.7852 (2.2177) loss 2.9951 (3.0144) 
grad_norm 2.4534 (2.9443) [2022-01-26 05:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][930/1251] eta 0:11:51 lr 0.000042 time 1.8689 (2.2158) loss 2.7047 (3.0110) grad_norm 3.2595 (2.9426) [2022-01-26 05:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][940/1251] eta 0:11:28 lr 0.000041 time 1.9517 (2.2141) loss 3.6931 (3.0128) grad_norm 3.4681 (2.9428) [2022-01-26 05:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][950/1251] eta 0:11:06 lr 0.000041 time 2.3426 (2.2153) loss 3.0659 (3.0130) grad_norm 2.5868 (2.9421) [2022-01-26 05:19:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][960/1251] eta 0:10:45 lr 0.000041 time 2.0096 (2.2171) loss 1.8283 (3.0140) grad_norm 3.6023 (2.9441) [2022-01-26 05:19:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][970/1251] eta 0:10:23 lr 0.000041 time 2.3325 (2.2174) loss 3.1832 (3.0148) grad_norm 2.6043 (2.9454) [2022-01-26 05:20:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][980/1251] eta 0:10:00 lr 0.000041 time 1.9530 (2.2156) loss 3.0670 (3.0155) grad_norm 2.6602 (2.9445) [2022-01-26 05:20:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][990/1251] eta 0:09:38 lr 0.000041 time 2.6878 (2.2151) loss 2.4373 (3.0167) grad_norm 3.0193 (2.9440) [2022-01-26 05:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1000/1251] eta 0:09:15 lr 0.000041 time 2.0867 (2.2149) loss 2.9780 (3.0167) grad_norm 3.1976 (2.9465) [2022-01-26 05:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1010/1251] eta 0:08:53 lr 0.000041 time 1.8450 (2.2143) loss 3.1566 (3.0158) grad_norm 2.6492 (2.9463) [2022-01-26 05:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1020/1251] eta 0:08:31 lr 0.000041 time 2.2470 (2.2138) loss 2.9761 (3.0178) grad_norm 3.1753 (2.9465) [2022-01-26 05:22:06 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [265/300][1030/1251] eta 0:08:09 lr 0.000041 time 2.2601 (2.2132) loss 2.2526 (3.0164) grad_norm 3.3938 (2.9461) [2022-01-26 05:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1040/1251] eta 0:07:46 lr 0.000041 time 1.7372 (2.2132) loss 2.9972 (3.0163) grad_norm 3.4005 (2.9465) [2022-01-26 05:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1050/1251] eta 0:07:24 lr 0.000041 time 1.8825 (2.2122) loss 3.6558 (3.0194) grad_norm 2.7674 (2.9459) [2022-01-26 05:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1060/1251] eta 0:07:02 lr 0.000041 time 2.5190 (2.2125) loss 2.9135 (3.0204) grad_norm 3.1123 (2.9454) [2022-01-26 05:23:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1070/1251] eta 0:06:40 lr 0.000041 time 2.8444 (2.2121) loss 2.5924 (3.0204) grad_norm 2.9405 (2.9441) [2022-01-26 05:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1080/1251] eta 0:06:18 lr 0.000041 time 1.8918 (2.2124) loss 2.5026 (3.0196) grad_norm 2.8443 (2.9438) [2022-01-26 05:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1090/1251] eta 0:05:56 lr 0.000041 time 1.6714 (2.2116) loss 2.5150 (3.0203) grad_norm 2.7215 (2.9449) [2022-01-26 05:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1100/1251] eta 0:05:33 lr 0.000041 time 1.9218 (2.2114) loss 3.3844 (3.0204) grad_norm 2.7292 (2.9449) [2022-01-26 05:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1110/1251] eta 0:05:11 lr 0.000041 time 2.8690 (2.2109) loss 3.1097 (3.0204) grad_norm 2.6091 (2.9441) [2022-01-26 05:25:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1120/1251] eta 0:04:49 lr 0.000041 time 2.1082 (2.2103) loss 2.8720 (3.0205) grad_norm 3.3052 (2.9446) [2022-01-26 05:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1130/1251] eta 0:04:27 lr 0.000041 time 2.4949 (2.2112) loss 3.3549 
(3.0218) grad_norm 2.7956 (2.9456) [2022-01-26 05:26:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1140/1251] eta 0:04:05 lr 0.000041 time 1.8823 (2.2099) loss 3.3389 (3.0220) grad_norm 2.9658 (2.9448) [2022-01-26 05:26:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1150/1251] eta 0:03:43 lr 0.000041 time 2.8218 (2.2105) loss 3.0052 (3.0201) grad_norm 2.7769 (2.9440) [2022-01-26 05:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1160/1251] eta 0:03:21 lr 0.000041 time 1.6014 (2.2090) loss 3.5830 (3.0208) grad_norm 2.7978 (2.9433) [2022-01-26 05:27:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1170/1251] eta 0:02:58 lr 0.000041 time 2.4212 (2.2088) loss 3.2719 (3.0201) grad_norm 2.9475 (2.9449) [2022-01-26 05:27:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1180/1251] eta 0:02:36 lr 0.000041 time 1.6666 (2.2080) loss 3.3033 (3.0205) grad_norm 3.4656 (2.9452) [2022-01-26 05:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1190/1251] eta 0:02:14 lr 0.000041 time 2.7927 (2.2080) loss 2.2263 (3.0210) grad_norm 2.5007 (2.9445) [2022-01-26 05:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1200/1251] eta 0:01:52 lr 0.000041 time 1.9608 (2.2074) loss 3.6933 (3.0231) grad_norm 2.6010 (2.9430) [2022-01-26 05:28:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1210/1251] eta 0:01:30 lr 0.000041 time 1.9493 (2.2075) loss 2.9458 (3.0223) grad_norm 2.5943 (2.9414) [2022-01-26 05:28:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1220/1251] eta 0:01:08 lr 0.000041 time 2.1307 (2.2069) loss 3.5483 (3.0228) grad_norm 2.8496 (2.9414) [2022-01-26 05:29:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1230/1251] eta 0:00:46 lr 0.000041 time 2.8975 (2.2074) loss 3.1877 (3.0230) grad_norm 2.7535 (2.9413) [2022-01-26 05:29:42 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [265/300][1240/1251] eta 0:00:24 lr 0.000041 time 1.4437 (2.2058) loss 3.1859 (3.0218) grad_norm 3.0121 (2.9407) [2022-01-26 05:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [265/300][1250/1251] eta 0:00:02 lr 0.000041 time 1.1648 (2.2008) loss 2.6836 (3.0209) grad_norm 3.1174 (2.9405) [2022-01-26 05:29:58 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 265 training takes 0:45:53 [2022-01-26 05:30:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.714 (18.714) Loss 0.7705 (0.7705) Acc@1 80.176 (80.176) Acc@5 96.289 (96.289) [2022-01-26 05:30:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.308 (3.366) Loss 0.8313 (0.8205) Acc@1 80.957 (80.744) Acc@5 94.922 (95.339) [2022-01-26 05:30:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.280 (2.530) Loss 0.7625 (0.8074) Acc@1 81.934 (81.031) Acc@5 96.289 (95.447) [2022-01-26 05:31:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.299 (2.246) Loss 0.8548 (0.8107) Acc@1 80.664 (80.995) Acc@5 95.020 (95.341) [2022-01-26 05:31:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.279 (2.191) Loss 0.8471 (0.8125) Acc@1 80.176 (80.952) Acc@5 94.336 (95.301) [2022-01-26 05:31:35 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.834 Acc@5 95.318 [2022-01-26 05:31:35 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-01-26 05:31:35 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.83% [2022-01-26 05:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][0/1251] eta 7:26:50 lr 0.000041 time 21.4315 (21.4315) loss 3.5499 (3.5499) grad_norm 3.3968 (3.3968) [2022-01-26 05:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][10/1251] eta 1:22:10 lr 0.000041 time 1.6081 (3.9732) loss 3.2425 (3.1291) grad_norm 3.1999 (2.8083) [2022-01-26 05:32:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][20/1251] eta 1:03:51 lr 0.000041 time 1.7008 (3.1127) loss 3.2880 (3.1753) grad_norm 3.0398 (2.8002) [2022-01-26 05:33:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][30/1251] eta 0:57:00 lr 0.000041 time 1.5069 (2.8013) loss 2.1852 (3.0559) grad_norm 2.9794 (2.8552) [2022-01-26 05:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][40/1251] eta 0:53:49 lr 0.000041 time 3.1983 (2.6665) loss 3.1587 (3.0186) grad_norm 2.6264 (2.8454) [2022-01-26 05:33:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][50/1251] eta 0:52:01 lr 0.000041 time 1.5271 (2.5992) loss 3.5181 (3.0316) grad_norm 2.8704 (2.8504) [2022-01-26 05:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][60/1251] eta 0:50:25 lr 0.000041 time 2.1205 (2.5404) loss 3.1744 (3.0623) grad_norm 2.9578 (2.8683) [2022-01-26 05:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][70/1251] eta 0:49:06 lr 0.000041 time 1.8651 (2.4949) loss 3.3558 (3.0174) grad_norm 3.0455 (2.8872) [2022-01-26 05:34:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][80/1251] eta 0:48:22 lr 0.000041 time 3.0888 (2.4789) loss 3.1941 (2.9932) grad_norm 3.1839 (2.9351) [2022-01-26 05:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][90/1251] eta 0:47:47 lr 0.000041 time 1.5376 (2.4696) loss 2.4117 (3.0000) grad_norm 3.0547 (2.9647) [2022-01-26 05:35:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][100/1251] eta 0:47:03 lr 0.000041 time 1.8497 (2.4532) loss 3.1186 (3.0056) grad_norm 2.7955 (2.9755) [2022-01-26 05:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][110/1251] eta 0:46:01 lr 0.000041 time 1.8433 (2.4200) loss 3.3455 (3.0000) grad_norm 2.9616 (2.9563) [2022-01-26 05:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][120/1251] eta 0:45:05 lr 0.000041 time 
2.0430 (2.3926) loss 3.2378 (3.0107) grad_norm 3.0983 (2.9545) [2022-01-26 05:36:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][130/1251] eta 0:44:08 lr 0.000041 time 2.1966 (2.3629) loss 3.3608 (2.9922) grad_norm 3.0169 (2.9566) [2022-01-26 05:37:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][140/1251] eta 0:43:16 lr 0.000041 time 1.9506 (2.3372) loss 2.2189 (2.9980) grad_norm 2.9770 (2.9503) [2022-01-26 05:37:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][150/1251] eta 0:42:32 lr 0.000041 time 1.7530 (2.3179) loss 3.0766 (3.0131) grad_norm 3.8322 (2.9881) [2022-01-26 05:37:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][160/1251] eta 0:41:57 lr 0.000041 time 2.3055 (2.3071) loss 3.0319 (3.0141) grad_norm 2.7783 (2.9823) [2022-01-26 05:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][170/1251] eta 0:41:31 lr 0.000041 time 1.7782 (2.3053) loss 3.4508 (3.0204) grad_norm 3.1353 (2.9802) [2022-01-26 05:38:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][180/1251] eta 0:41:06 lr 0.000041 time 2.1715 (2.3031) loss 2.8501 (3.0114) grad_norm 3.0406 (2.9799) [2022-01-26 05:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][190/1251] eta 0:40:37 lr 0.000041 time 1.8351 (2.2975) loss 3.2971 (3.0211) grad_norm 2.5373 (2.9711) [2022-01-26 05:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][200/1251] eta 0:40:11 lr 0.000041 time 3.1331 (2.2947) loss 2.8495 (3.0095) grad_norm 2.6450 (2.9684) [2022-01-26 05:39:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][210/1251] eta 0:39:48 lr 0.000041 time 3.2337 (2.2948) loss 3.0693 (3.0020) grad_norm 3.3915 (2.9669) [2022-01-26 05:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][220/1251] eta 0:39:28 lr 0.000041 time 1.8131 (2.2974) loss 2.8478 (3.0069) grad_norm 3.3472 (2.9671) [2022-01-26 05:40:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][230/1251] eta 0:39:08 lr 0.000041 time 1.8912 (2.3005) loss 2.9851 (3.0085) grad_norm 3.0184 (2.9653) [2022-01-26 05:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][240/1251] eta 0:38:47 lr 0.000041 time 2.7925 (2.3019) loss 3.2016 (3.0164) grad_norm 2.8008 (2.9625) [2022-01-26 05:41:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][250/1251] eta 0:38:11 lr 0.000041 time 1.6040 (2.2896) loss 3.6548 (3.0303) grad_norm 2.9144 (2.9685) [2022-01-26 05:41:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][260/1251] eta 0:37:36 lr 0.000041 time 1.9501 (2.2773) loss 3.3970 (3.0318) grad_norm 2.9756 (2.9682) [2022-01-26 05:41:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][270/1251] eta 0:37:08 lr 0.000041 time 1.9491 (2.2717) loss 3.5075 (3.0481) grad_norm 2.7240 (2.9693) [2022-01-26 05:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][280/1251] eta 0:36:43 lr 0.000041 time 2.3527 (2.2695) loss 2.4870 (3.0395) grad_norm 3.0372 (2.9720) [2022-01-26 05:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][290/1251] eta 0:36:21 lr 0.000041 time 2.0056 (2.2702) loss 1.8073 (3.0382) grad_norm 2.7475 (2.9830) [2022-01-26 05:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][300/1251] eta 0:35:55 lr 0.000041 time 2.3791 (2.2664) loss 2.9593 (3.0197) grad_norm 2.6849 (2.9855) [2022-01-26 05:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][310/1251] eta 0:35:28 lr 0.000041 time 2.2038 (2.2622) loss 2.2715 (3.0245) grad_norm 2.8242 (2.9843) [2022-01-26 05:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][320/1251] eta 0:35:03 lr 0.000041 time 2.8777 (2.2590) loss 3.2958 (3.0289) grad_norm 2.9482 (2.9773) [2022-01-26 05:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][330/1251] eta 0:34:37 lr 
0.000041 time 2.2799 (2.2553) loss 3.9489 (3.0277) grad_norm 3.1183 (2.9761) [2022-01-26 05:44:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][340/1251] eta 0:34:14 lr 0.000041 time 2.2533 (2.2554) loss 3.4301 (3.0290) grad_norm 2.6466 (2.9746) [2022-01-26 05:44:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][350/1251] eta 0:33:50 lr 0.000041 time 2.4875 (2.2537) loss 3.0621 (3.0253) grad_norm 3.3978 (2.9919) [2022-01-26 05:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][360/1251] eta 0:33:29 lr 0.000041 time 1.8354 (2.2554) loss 3.1826 (3.0246) grad_norm 3.1656 (2.9894) [2022-01-26 05:45:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][370/1251] eta 0:33:07 lr 0.000041 time 1.8993 (2.2555) loss 2.1343 (3.0138) grad_norm 4.0241 (2.9938) [2022-01-26 05:45:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][380/1251] eta 0:32:42 lr 0.000040 time 1.9834 (2.2528) loss 3.1572 (3.0189) grad_norm 2.9418 (2.9953) [2022-01-26 05:46:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][390/1251] eta 0:32:15 lr 0.000040 time 2.1835 (2.2478) loss 3.6271 (3.0219) grad_norm 4.7899 (3.0020) [2022-01-26 05:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][400/1251] eta 0:31:47 lr 0.000040 time 1.8718 (2.2418) loss 3.1520 (3.0239) grad_norm 3.2675 (2.9998) [2022-01-26 05:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][410/1251] eta 0:31:22 lr 0.000040 time 2.0659 (2.2384) loss 2.4012 (3.0250) grad_norm 2.9587 (2.9972) [2022-01-26 05:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][420/1251] eta 0:30:56 lr 0.000040 time 1.6861 (2.2342) loss 3.2084 (3.0266) grad_norm 2.4963 (2.9928) [2022-01-26 05:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][430/1251] eta 0:30:32 lr 0.000040 time 2.3107 (2.2318) loss 2.4410 (3.0268) grad_norm 3.2060 (2.9862) [2022-01-26 05:48:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][440/1251] eta 0:30:11 lr 0.000040 time 2.7878 (2.2332) loss 3.4783 (3.0328) grad_norm 2.7987 (2.9822) [2022-01-26 05:48:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][450/1251] eta 0:29:48 lr 0.000040 time 2.6051 (2.2323) loss 3.7318 (3.0350) grad_norm 2.5567 (2.9796) [2022-01-26 05:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][460/1251] eta 0:29:24 lr 0.000040 time 2.2119 (2.2312) loss 3.1031 (3.0320) grad_norm 2.5733 (2.9806) [2022-01-26 05:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][470/1251] eta 0:29:03 lr 0.000040 time 3.0664 (2.2319) loss 3.2323 (3.0285) grad_norm 2.8775 (2.9840) [2022-01-26 05:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][480/1251] eta 0:28:42 lr 0.000040 time 3.5025 (2.2342) loss 2.2807 (3.0262) grad_norm 2.7988 (2.9912) [2022-01-26 05:49:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][490/1251] eta 0:28:20 lr 0.000040 time 1.3838 (2.2342) loss 2.1777 (3.0151) grad_norm 2.7634 (2.9919) [2022-01-26 05:50:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][500/1251] eta 0:27:58 lr 0.000040 time 1.4391 (2.2351) loss 2.8857 (3.0138) grad_norm 2.4535 (2.9903) [2022-01-26 05:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][510/1251] eta 0:27:37 lr 0.000040 time 2.6227 (2.2362) loss 2.3221 (3.0082) grad_norm 2.9758 (2.9886) [2022-01-26 05:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][520/1251] eta 0:27:13 lr 0.000040 time 2.2600 (2.2346) loss 2.4699 (3.0112) grad_norm 3.4040 (2.9920) [2022-01-26 05:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][530/1251] eta 0:26:48 lr 0.000040 time 1.7157 (2.2315) loss 3.4005 (3.0087) grad_norm 2.7917 (2.9907) [2022-01-26 05:51:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][540/1251] eta 0:26:22 lr 
0.000040 time 1.8329 (2.2260) loss 2.9590 (3.0074) grad_norm 2.6878 (2.9896) [2022-01-26 05:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][550/1251] eta 0:25:58 lr 0.000040 time 2.0457 (2.2236) loss 2.4366 (3.0073) grad_norm 3.5890 (2.9870) [2022-01-26 05:52:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][560/1251] eta 0:25:37 lr 0.000040 time 3.0505 (2.2249) loss 3.4383 (3.0058) grad_norm 3.0214 (2.9849) [2022-01-26 05:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][570/1251] eta 0:25:15 lr 0.000040 time 2.6980 (2.2261) loss 2.3972 (3.0045) grad_norm 3.6992 (2.9853) [2022-01-26 05:53:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][580/1251] eta 0:24:54 lr 0.000040 time 1.6563 (2.2269) loss 2.6530 (3.0064) grad_norm 2.8627 (2.9798) [2022-01-26 05:53:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][590/1251] eta 0:24:32 lr 0.000040 time 2.3829 (2.2282) loss 2.5370 (3.0052) grad_norm 3.7029 (2.9784) [2022-01-26 05:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][600/1251] eta 0:24:09 lr 0.000040 time 2.5485 (2.2269) loss 3.2525 (3.0014) grad_norm 2.5198 (2.9777) [2022-01-26 05:54:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][610/1251] eta 0:23:45 lr 0.000040 time 2.2470 (2.2235) loss 3.3566 (3.0030) grad_norm 2.5755 (2.9776) [2022-01-26 05:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][620/1251] eta 0:23:21 lr 0.000040 time 2.4318 (2.2204) loss 3.0345 (2.9991) grad_norm 2.4854 (2.9752) [2022-01-26 05:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][630/1251] eta 0:22:57 lr 0.000040 time 2.1801 (2.2183) loss 3.2097 (3.0012) grad_norm 2.7835 (2.9760) [2022-01-26 05:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][640/1251] eta 0:22:34 lr 0.000040 time 1.5885 (2.2165) loss 3.3233 (2.9999) grad_norm 2.7646 (2.9746) [2022-01-26 05:55:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][650/1251] eta 0:22:10 lr 0.000040 time 1.8578 (2.2133) loss 2.8064 (2.9993) grad_norm 2.5987 (2.9723) [2022-01-26 05:55:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][660/1251] eta 0:21:46 lr 0.000040 time 1.9911 (2.2114) loss 2.5533 (2.9950) grad_norm 2.6859 (2.9704) [2022-01-26 05:56:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][670/1251] eta 0:21:24 lr 0.000040 time 2.8690 (2.2109) loss 3.3381 (3.0009) grad_norm 2.5979 (2.9696) [2022-01-26 05:56:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][680/1251] eta 0:21:01 lr 0.000040 time 1.8879 (2.2089) loss 2.6617 (3.0020) grad_norm 2.8559 (2.9686) [2022-01-26 05:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][690/1251] eta 0:20:39 lr 0.000040 time 2.5761 (2.2092) loss 3.4916 (2.9994) grad_norm 2.7599 (2.9661) [2022-01-26 05:57:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][700/1251] eta 0:20:18 lr 0.000040 time 2.1252 (2.2107) loss 3.3954 (2.9968) grad_norm 3.1024 (2.9656) [2022-01-26 05:57:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][710/1251] eta 0:19:57 lr 0.000040 time 2.8964 (2.2132) loss 3.2494 (3.0000) grad_norm 2.8781 (2.9644) [2022-01-26 05:58:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][720/1251] eta 0:19:35 lr 0.000040 time 1.8913 (2.2141) loss 2.9942 (2.9947) grad_norm 3.1003 (2.9624) [2022-01-26 05:58:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][730/1251] eta 0:19:14 lr 0.000040 time 2.7755 (2.2168) loss 3.1125 (2.9945) grad_norm 2.8137 (2.9609) [2022-01-26 05:58:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][740/1251] eta 0:18:52 lr 0.000040 time 2.2829 (2.2157) loss 3.4777 (2.9948) grad_norm 2.9155 (2.9595) [2022-01-26 05:59:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][750/1251] eta 0:18:29 lr 
0.000040 time 1.9597 (2.2145) loss 3.4311 (2.9977) grad_norm 2.4684 (2.9579) [2022-01-26 05:59:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][760/1251] eta 0:18:06 lr 0.000040 time 2.2326 (2.2120) loss 3.3818 (2.9979) grad_norm 2.7953 (2.9557) [2022-01-26 06:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][770/1251] eta 0:17:44 lr 0.000040 time 2.5982 (2.2135) loss 2.5997 (2.9965) grad_norm 2.8417 (2.9567) [2022-01-26 06:00:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][780/1251] eta 0:17:22 lr 0.000040 time 2.8602 (2.2139) loss 1.8696 (2.9928) grad_norm 2.9064 (2.9581) [2022-01-26 06:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][790/1251] eta 0:17:00 lr 0.000040 time 2.0153 (2.2133) loss 3.4681 (2.9920) grad_norm 2.4795 (2.9577) [2022-01-26 06:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][800/1251] eta 0:16:36 lr 0.000040 time 2.2394 (2.2100) loss 3.0844 (2.9905) grad_norm 3.0079 (2.9568) [2022-01-26 06:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][810/1251] eta 0:16:14 lr 0.000040 time 2.2755 (2.2089) loss 3.1141 (2.9915) grad_norm 3.1663 (2.9579) [2022-01-26 06:01:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][820/1251] eta 0:15:51 lr 0.000040 time 2.2644 (2.2088) loss 3.3323 (2.9908) grad_norm 2.9001 (2.9567) [2022-01-26 06:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][830/1251] eta 0:15:30 lr 0.000040 time 2.3280 (2.2091) loss 3.4096 (2.9925) grad_norm 3.0784 (2.9567) [2022-01-26 06:02:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][840/1251] eta 0:15:08 lr 0.000040 time 2.5203 (2.2096) loss 3.1282 (2.9954) grad_norm 2.4591 (2.9575) [2022-01-26 06:02:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][850/1251] eta 0:14:46 lr 0.000040 time 2.2187 (2.2096) loss 2.1813 (2.9968) grad_norm 2.7054 (2.9567) [2022-01-26 06:03:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][860/1251] eta 0:14:24 lr 0.000040 time 2.2767 (2.2111) loss 3.2405 (2.9978) grad_norm 2.6248 (2.9572) [2022-01-26 06:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][870/1251] eta 0:14:01 lr 0.000040 time 2.2330 (2.2099) loss 3.7164 (2.9986) grad_norm 2.7238 (2.9548) [2022-01-26 06:04:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][880/1251] eta 0:13:39 lr 0.000040 time 2.2528 (2.2078) loss 1.7780 (2.9974) grad_norm 3.0758 (2.9573) [2022-01-26 06:04:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][890/1251] eta 0:13:16 lr 0.000040 time 1.8297 (2.2068) loss 3.6275 (3.0006) grad_norm 2.6497 (2.9549) [2022-01-26 06:04:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][900/1251] eta 0:12:54 lr 0.000040 time 2.0037 (2.2056) loss 3.1948 (3.0034) grad_norm 2.9334 (2.9553) [2022-01-26 06:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][910/1251] eta 0:12:32 lr 0.000040 time 1.9341 (2.2073) loss 2.9044 (3.0055) grad_norm 2.5332 (2.9539) [2022-01-26 06:05:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][920/1251] eta 0:12:10 lr 0.000040 time 2.8733 (2.2079) loss 3.0648 (3.0048) grad_norm 2.6860 (2.9529) [2022-01-26 06:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][930/1251] eta 0:11:48 lr 0.000040 time 1.7105 (2.2082) loss 3.5919 (3.0059) grad_norm 2.8391 (2.9521) [2022-01-26 06:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][940/1251] eta 0:11:26 lr 0.000040 time 1.9382 (2.2090) loss 3.4110 (3.0070) grad_norm 2.6640 (2.9499) [2022-01-26 06:06:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][950/1251] eta 0:11:04 lr 0.000040 time 1.6486 (2.2069) loss 3.4909 (3.0048) grad_norm 2.8754 (2.9500) [2022-01-26 06:06:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][960/1251] eta 0:10:41 lr 
0.000040 time 2.5161 (2.2060) loss 3.3767 (3.0055) grad_norm 3.1845 (2.9522) [2022-01-26 06:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][970/1251] eta 0:10:19 lr 0.000040 time 1.5954 (2.2040) loss 3.0155 (3.0061) grad_norm 2.4414 (2.9524) [2022-01-26 06:07:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][980/1251] eta 0:09:57 lr 0.000040 time 2.5363 (2.2049) loss 2.9950 (3.0045) grad_norm 2.6165 (2.9510) [2022-01-26 06:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][990/1251] eta 0:09:35 lr 0.000040 time 2.2114 (2.2037) loss 2.7018 (3.0044) grad_norm 3.0224 (2.9505) [2022-01-26 06:08:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1000/1251] eta 0:09:12 lr 0.000040 time 2.6046 (2.2031) loss 2.1938 (3.0022) grad_norm 2.5578 (2.9476) [2022-01-26 06:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1010/1251] eta 0:08:51 lr 0.000040 time 2.3850 (2.2039) loss 2.9641 (3.0013) grad_norm 2.3969 (2.9482) [2022-01-26 06:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1020/1251] eta 0:08:29 lr 0.000040 time 2.2356 (2.2043) loss 2.0336 (3.0008) grad_norm 2.3230 (2.9467) [2022-01-26 06:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1030/1251] eta 0:08:07 lr 0.000040 time 2.2474 (2.2046) loss 2.9897 (3.0008) grad_norm 2.7670 (2.9493) [2022-01-26 06:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1040/1251] eta 0:07:44 lr 0.000040 time 2.1887 (2.2037) loss 3.5351 (3.0011) grad_norm 3.1385 (2.9487) [2022-01-26 06:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1050/1251] eta 0:07:22 lr 0.000040 time 1.9634 (2.2030) loss 2.9666 (3.0026) grad_norm 3.0054 (2.9501) [2022-01-26 06:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1060/1251] eta 0:07:00 lr 0.000040 time 1.8115 (2.2030) loss 3.0487 (3.0045) grad_norm 2.9477 (2.9488) [2022-01-26 
06:10:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1070/1251] eta 0:06:38 lr 0.000040 time 2.2192 (2.2042) loss 3.0024 (3.0056) grad_norm 2.7495 (2.9476) [2022-01-26 06:11:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1080/1251] eta 0:06:16 lr 0.000040 time 1.9170 (2.2043) loss 3.6934 (3.0079) grad_norm 3.3336 (2.9499) [2022-01-26 06:11:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1090/1251] eta 0:05:54 lr 0.000039 time 2.8503 (2.2049) loss 3.1943 (3.0086) grad_norm 3.2468 (2.9499) [2022-01-26 06:12:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1100/1251] eta 0:05:32 lr 0.000039 time 1.9820 (2.2039) loss 3.0407 (3.0097) grad_norm 3.0887 (2.9505) [2022-01-26 06:12:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1110/1251] eta 0:05:10 lr 0.000039 time 1.9001 (2.2028) loss 3.6601 (3.0113) grad_norm 3.5266 (2.9513) [2022-01-26 06:12:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1120/1251] eta 0:04:48 lr 0.000039 time 2.3641 (2.2016) loss 3.4749 (3.0129) grad_norm 3.2001 (2.9516) [2022-01-26 06:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1130/1251] eta 0:04:26 lr 0.000039 time 2.1520 (2.2002) loss 3.4808 (3.0127) grad_norm 3.3407 (2.9542) [2022-01-26 06:13:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1140/1251] eta 0:04:04 lr 0.000039 time 2.4362 (2.2003) loss 2.8613 (3.0121) grad_norm 3.5429 (2.9547) [2022-01-26 06:13:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1150/1251] eta 0:03:42 lr 0.000039 time 1.8178 (2.2009) loss 2.9139 (3.0126) grad_norm 2.9817 (2.9541) [2022-01-26 06:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1160/1251] eta 0:03:20 lr 0.000039 time 2.3419 (2.2018) loss 3.2087 (3.0129) grad_norm 2.6010 (2.9539) [2022-01-26 06:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1170/1251] 
eta 0:02:58 lr 0.000039 time 1.6115 (2.2013) loss 3.2294 (3.0132) grad_norm 2.5900 (2.9524) [2022-01-26 06:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1180/1251] eta 0:02:36 lr 0.000039 time 2.5906 (2.2003) loss 2.7620 (3.0121) grad_norm 3.0546 (2.9531) [2022-01-26 06:15:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1190/1251] eta 0:02:14 lr 0.000039 time 1.6897 (2.1989) loss 3.2268 (3.0147) grad_norm 2.6836 (2.9519) [2022-01-26 06:15:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1200/1251] eta 0:01:52 lr 0.000039 time 2.4647 (2.1978) loss 3.1404 (3.0164) grad_norm 2.9014 (2.9526) [2022-01-26 06:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1210/1251] eta 0:01:30 lr 0.000039 time 1.7626 (2.1968) loss 3.5960 (3.0190) grad_norm 2.8007 (2.9519) [2022-01-26 06:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1220/1251] eta 0:01:08 lr 0.000039 time 2.2198 (2.1958) loss 3.2009 (3.0174) grad_norm 3.2077 (2.9513) [2022-01-26 06:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1230/1251] eta 0:00:46 lr 0.000039 time 1.5453 (2.1952) loss 3.4614 (3.0180) grad_norm 3.0305 (2.9506) [2022-01-26 06:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1240/1251] eta 0:00:24 lr 0.000039 time 1.6905 (2.1982) loss 3.2566 (3.0178) grad_norm 2.9400 (2.9505) [2022-01-26 06:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [266/300][1250/1251] eta 0:00:02 lr 0.000039 time 1.1604 (2.1936) loss 3.1105 (3.0168) grad_norm 3.3963 (2.9517) [2022-01-26 06:17:19 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 266 training takes 0:45:44 [2022-01-26 06:17:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.419 (18.419) Loss 0.8860 (0.8860) Acc@1 78.906 (78.906) Acc@5 94.629 (94.629) [2022-01-26 06:17:55 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.002 (3.255) 
Loss 0.7922 (0.8301) Acc@1 81.445 (80.824) Acc@5 95.703 (95.446) [2022-01-26 06:18:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.350 (2.560) Loss 0.8693 (0.8390) Acc@1 79.199 (80.627) Acc@5 94.629 (95.089) [2022-01-26 06:18:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.498 (2.295) Loss 0.8020 (0.8256) Acc@1 79.492 (80.759) Acc@5 95.312 (95.272) [2022-01-26 06:18:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.494 (2.169) Loss 0.8216 (0.8222) Acc@1 81.543 (80.895) Acc@5 94.727 (95.293) [2022-01-26 06:18:55 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.822 Acc@5 95.296 [2022-01-26 06:18:55 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-01-26 06:18:55 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.83% [2022-01-26 06:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][0/1251] eta 7:31:18 lr 0.000039 time 21.6459 (21.6459) loss 3.0939 (3.0939) grad_norm 3.3763 (3.3763) [2022-01-26 06:19:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][10/1251] eta 1:25:07 lr 0.000039 time 2.1459 (4.1155) loss 3.4416 (3.0446) grad_norm 2.6248 (2.8866) [2022-01-26 06:20:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][20/1251] eta 1:05:19 lr 0.000039 time 1.8079 (3.1843) loss 2.8469 (3.0602) grad_norm 2.8430 (2.9124) [2022-01-26 06:20:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][30/1251] eta 0:57:45 lr 0.000039 time 1.8105 (2.8383) loss 3.6388 (3.0300) grad_norm 3.4466 (2.9505) [2022-01-26 06:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][40/1251] eta 0:54:58 lr 0.000039 time 3.0205 (2.7235) loss 1.9409 (2.9745) grad_norm 2.3548 (2.9685) [2022-01-26 06:21:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][50/1251] eta 0:53:52 lr 0.000039 time 2.9779 (2.6913) loss 3.2928 (2.9942) 
grad_norm 3.1943 (2.9568) [2022-01-26 06:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][60/1251] eta 0:51:47 lr 0.000039 time 1.9986 (2.6090) loss 3.8918 (3.0193) grad_norm 2.7056 (2.9497) [2022-01-26 06:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][70/1251] eta 0:49:38 lr 0.000039 time 2.3132 (2.5220) loss 2.7965 (3.0136) grad_norm 2.7927 (2.9570) [2022-01-26 06:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][80/1251] eta 0:48:02 lr 0.000039 time 2.2681 (2.4619) loss 2.5959 (3.0108) grad_norm 3.4326 (2.9598) [2022-01-26 06:22:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][90/1251] eta 0:46:55 lr 0.000039 time 1.7095 (2.4247) loss 3.5488 (3.0232) grad_norm 2.8313 (2.9592) [2022-01-26 06:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][100/1251] eta 0:45:59 lr 0.000039 time 2.1919 (2.3979) loss 3.3861 (3.0420) grad_norm 2.5473 (2.9487) [2022-01-26 06:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][110/1251] eta 0:45:08 lr 0.000039 time 1.5901 (2.3740) loss 2.5557 (3.0182) grad_norm 2.4864 (2.9445) [2022-01-26 06:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][120/1251] eta 0:44:40 lr 0.000039 time 2.2255 (2.3703) loss 3.1166 (3.0191) grad_norm 3.3626 (2.9416) [2022-01-26 06:24:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][130/1251] eta 0:44:09 lr 0.000039 time 2.2764 (2.3635) loss 3.5381 (3.0319) grad_norm 3.1007 (2.9391) [2022-01-26 06:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][140/1251] eta 0:43:27 lr 0.000039 time 2.2462 (2.3468) loss 2.8517 (3.0406) grad_norm 3.2651 (2.9470) [2022-01-26 06:24:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][150/1251] eta 0:42:50 lr 0.000039 time 1.8547 (2.3344) loss 2.7746 (3.0661) grad_norm 2.7941 (2.9531) [2022-01-26 06:25:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[267/300][160/1251] eta 0:42:11 lr 0.000039 time 1.8516 (2.3204) loss 3.1435 (3.0773) grad_norm 2.9449 (2.9591) [2022-01-26 06:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][170/1251] eta 0:41:43 lr 0.000039 time 2.2155 (2.3158) loss 3.3948 (3.0708) grad_norm 3.0738 (2.9675) [2022-01-26 06:25:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][180/1251] eta 0:41:16 lr 0.000039 time 1.8713 (2.3123) loss 3.0242 (3.0548) grad_norm 2.9762 (2.9794) [2022-01-26 06:26:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][190/1251] eta 0:40:45 lr 0.000039 time 1.8451 (2.3047) loss 2.9347 (3.0642) grad_norm 3.0265 (2.9968) [2022-01-26 06:26:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][200/1251] eta 0:40:14 lr 0.000039 time 1.9522 (2.2975) loss 3.5637 (3.0591) grad_norm 3.1507 (2.9938) [2022-01-26 06:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][210/1251] eta 0:39:50 lr 0.000039 time 2.0735 (2.2961) loss 3.4061 (3.0563) grad_norm 2.5633 (2.9894) [2022-01-26 06:27:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][220/1251] eta 0:39:18 lr 0.000039 time 1.6334 (2.2875) loss 3.2497 (3.0616) grad_norm 2.4704 (2.9847) [2022-01-26 06:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][230/1251] eta 0:38:41 lr 0.000039 time 2.2714 (2.2742) loss 3.9682 (3.0631) grad_norm 3.3155 (2.9840) [2022-01-26 06:28:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][240/1251] eta 0:38:13 lr 0.000039 time 1.6706 (2.2689) loss 2.8027 (3.0640) grad_norm 2.7157 (2.9871) [2022-01-26 06:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][250/1251] eta 0:37:49 lr 0.000039 time 2.3968 (2.2668) loss 3.3519 (3.0672) grad_norm 2.8447 (2.9865) [2022-01-26 06:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][260/1251] eta 0:37:22 lr 0.000039 time 1.9152 (2.2632) loss 2.5577 (3.0649) grad_norm 
2.7870 (2.9844) [2022-01-26 06:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][270/1251] eta 0:36:53 lr 0.000039 time 1.8714 (2.2560) loss 3.6941 (3.0602) grad_norm 2.8375 (2.9828) [2022-01-26 06:29:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][280/1251] eta 0:36:24 lr 0.000039 time 1.6775 (2.2493) loss 2.6130 (3.0602) grad_norm 2.6988 (2.9748) [2022-01-26 06:29:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][290/1251] eta 0:36:02 lr 0.000039 time 2.9554 (2.2507) loss 2.1055 (3.0501) grad_norm 3.2511 (2.9712) [2022-01-26 06:30:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][300/1251] eta 0:35:41 lr 0.000039 time 2.1507 (2.2516) loss 2.8382 (3.0492) grad_norm 2.8617 (2.9733) [2022-01-26 06:30:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][310/1251] eta 0:35:17 lr 0.000039 time 1.7110 (2.2502) loss 3.4169 (3.0497) grad_norm 2.7596 (2.9699) [2022-01-26 06:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][320/1251] eta 0:34:54 lr 0.000039 time 1.4786 (2.2497) loss 2.0350 (3.0453) grad_norm 2.5696 (2.9654) [2022-01-26 06:31:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][330/1251] eta 0:34:33 lr 0.000039 time 2.9011 (2.2509) loss 2.9657 (3.0413) grad_norm 2.6661 (2.9606) [2022-01-26 06:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][340/1251] eta 0:34:05 lr 0.000039 time 1.8978 (2.2448) loss 3.4113 (3.0477) grad_norm 3.8762 (2.9632) [2022-01-26 06:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][350/1251] eta 0:33:42 lr 0.000039 time 1.8551 (2.2447) loss 3.5736 (3.0498) grad_norm 2.7468 (2.9671) [2022-01-26 06:32:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][360/1251] eta 0:33:17 lr 0.000039 time 1.6357 (2.2413) loss 2.7019 (3.0433) grad_norm 2.6006 (2.9740) [2022-01-26 06:32:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[267/300][370/1251] eta 0:32:54 lr 0.000039 time 3.0339 (2.2407) loss 1.9964 (3.0435) grad_norm 3.4106 (2.9813) [2022-01-26 06:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][380/1251] eta 0:32:28 lr 0.000039 time 2.2155 (2.2375) loss 3.4593 (3.0376) grad_norm 3.0191 (2.9803) [2022-01-26 06:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][390/1251] eta 0:32:06 lr 0.000039 time 1.9125 (2.2379) loss 3.3538 (3.0304) grad_norm 3.6792 (2.9835) [2022-01-26 06:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][400/1251] eta 0:31:44 lr 0.000039 time 1.6444 (2.2384) loss 2.0265 (3.0255) grad_norm 2.8812 (2.9892) [2022-01-26 06:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][410/1251] eta 0:31:25 lr 0.000039 time 3.5552 (2.2422) loss 1.8935 (3.0218) grad_norm 3.2489 (2.9896) [2022-01-26 06:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][420/1251] eta 0:31:05 lr 0.000039 time 3.5960 (2.2447) loss 3.0924 (3.0233) grad_norm 2.5239 (2.9893) [2022-01-26 06:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][430/1251] eta 0:30:40 lr 0.000039 time 2.1375 (2.2414) loss 2.9467 (3.0242) grad_norm 3.7054 (2.9860) [2022-01-26 06:35:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][440/1251] eta 0:30:14 lr 0.000039 time 1.7034 (2.2369) loss 2.5172 (3.0258) grad_norm 5.0332 (2.9893) [2022-01-26 06:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][450/1251] eta 0:29:48 lr 0.000039 time 1.9701 (2.2332) loss 3.5594 (3.0261) grad_norm 2.5905 (2.9880) [2022-01-26 06:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][460/1251] eta 0:29:29 lr 0.000039 time 4.8470 (2.2376) loss 2.6863 (3.0277) grad_norm 2.9352 (2.9855) [2022-01-26 06:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][470/1251] eta 0:29:06 lr 0.000039 time 2.4642 (2.2362) loss 3.3472 (3.0229) grad_norm 
2.5040 (2.9836) [2022-01-26 06:36:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][480/1251] eta 0:28:42 lr 0.000039 time 1.9197 (2.2340) loss 3.0435 (3.0228) grad_norm 2.9712 (2.9818) [2022-01-26 06:37:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][490/1251] eta 0:28:17 lr 0.000039 time 2.0980 (2.2308) loss 2.9214 (3.0209) grad_norm 2.9785 (2.9835) [2022-01-26 06:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][500/1251] eta 0:27:53 lr 0.000039 time 3.4248 (2.2288) loss 2.6741 (3.0207) grad_norm 2.6111 (2.9856) [2022-01-26 06:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][510/1251] eta 0:27:29 lr 0.000039 time 1.8251 (2.2265) loss 3.4768 (3.0218) grad_norm 3.2691 (2.9849) [2022-01-26 06:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][520/1251] eta 0:27:09 lr 0.000039 time 2.7306 (2.2294) loss 3.3250 (3.0296) grad_norm 2.9538 (2.9860) [2022-01-26 06:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][530/1251] eta 0:26:47 lr 0.000039 time 2.2245 (2.2290) loss 3.6995 (3.0332) grad_norm 3.2304 (2.9893) [2022-01-26 06:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][540/1251] eta 0:26:24 lr 0.000039 time 2.5831 (2.2281) loss 2.5366 (3.0308) grad_norm 2.8822 (2.9889) [2022-01-26 06:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][550/1251] eta 0:25:58 lr 0.000038 time 2.2497 (2.2231) loss 3.5985 (3.0348) grad_norm 2.8027 (2.9873) [2022-01-26 06:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][560/1251] eta 0:25:34 lr 0.000038 time 2.7134 (2.2211) loss 3.4107 (3.0359) grad_norm 2.6150 (2.9862) [2022-01-26 06:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][570/1251] eta 0:25:11 lr 0.000038 time 2.1030 (2.2192) loss 3.4743 (3.0386) grad_norm 3.5101 (2.9921) [2022-01-26 06:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[267/300][580/1251] eta 0:24:49 lr 0.000038 time 2.1670 (2.2200) loss 2.3001 (3.0413) grad_norm 2.3741 (2.9893) [2022-01-26 06:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][590/1251] eta 0:24:28 lr 0.000038 time 2.8211 (2.2216) loss 3.0619 (3.0462) grad_norm 3.0449 (2.9912) [2022-01-26 06:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][600/1251] eta 0:24:07 lr 0.000038 time 2.5048 (2.2230) loss 3.4952 (3.0508) grad_norm 3.4136 (2.9916) [2022-01-26 06:41:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][610/1251] eta 0:23:44 lr 0.000038 time 2.2433 (2.2230) loss 3.5905 (3.0556) grad_norm 3.1809 (2.9940) [2022-01-26 06:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][620/1251] eta 0:23:22 lr 0.000038 time 2.1552 (2.2230) loss 3.2380 (3.0558) grad_norm 2.8337 (2.9916) [2022-01-26 06:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][630/1251] eta 0:22:59 lr 0.000038 time 2.2790 (2.2207) loss 1.8152 (3.0547) grad_norm 3.3397 (2.9914) [2022-01-26 06:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][640/1251] eta 0:22:34 lr 0.000038 time 2.5229 (2.2164) loss 3.2653 (3.0539) grad_norm 2.4402 (2.9880) [2022-01-26 06:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][650/1251] eta 0:22:10 lr 0.000038 time 2.0027 (2.2131) loss 3.1687 (3.0521) grad_norm 2.8472 (2.9882) [2022-01-26 06:43:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][660/1251] eta 0:21:46 lr 0.000038 time 1.6026 (2.2108) loss 3.6414 (3.0524) grad_norm 3.0040 (2.9904) [2022-01-26 06:43:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][670/1251] eta 0:21:23 lr 0.000038 time 2.1562 (2.2089) loss 3.6815 (3.0488) grad_norm 3.2340 (2.9924) [2022-01-26 06:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][680/1251] eta 0:21:00 lr 0.000038 time 1.8628 (2.2077) loss 3.3343 (3.0502) grad_norm 
3.0026 (2.9913) [2022-01-26 06:44:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][690/1251] eta 0:20:37 lr 0.000038 time 2.1978 (2.2057) loss 3.2141 (3.0547) grad_norm 2.5109 (2.9894) [2022-01-26 06:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][700/1251] eta 0:20:15 lr 0.000038 time 2.1160 (2.2060) loss 3.5418 (3.0582) grad_norm 2.9627 (2.9878) [2022-01-26 06:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][710/1251] eta 0:19:54 lr 0.000038 time 2.4014 (2.2083) loss 3.5897 (3.0562) grad_norm 2.7111 (2.9879) [2022-01-26 06:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][720/1251] eta 0:19:34 lr 0.000038 time 2.1014 (2.2110) loss 1.8298 (3.0525) grad_norm 2.9270 (2.9862) [2022-01-26 06:45:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][730/1251] eta 0:19:13 lr 0.000038 time 2.8332 (2.2131) loss 3.1434 (3.0508) grad_norm 2.8419 (2.9861) [2022-01-26 06:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][740/1251] eta 0:18:52 lr 0.000038 time 1.7560 (2.2153) loss 3.3844 (3.0507) grad_norm 2.8700 (2.9848) [2022-01-26 06:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][750/1251] eta 0:18:30 lr 0.000038 time 2.7292 (2.2164) loss 3.3949 (3.0517) grad_norm 2.9397 (2.9847) [2022-01-26 06:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][760/1251] eta 0:18:07 lr 0.000038 time 1.6080 (2.2139) loss 3.0637 (3.0531) grad_norm 3.0653 (2.9887) [2022-01-26 06:47:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][770/1251] eta 0:17:44 lr 0.000038 time 2.4881 (2.2121) loss 3.2141 (3.0509) grad_norm 2.7964 (2.9882) [2022-01-26 06:47:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][780/1251] eta 0:17:21 lr 0.000038 time 2.1887 (2.2110) loss 2.8620 (3.0524) grad_norm 3.3896 (2.9887) [2022-01-26 06:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[267/300][790/1251] eta 0:16:58 lr 0.000038 time 2.2250 (2.2097) loss 2.9129 (3.0528) grad_norm 3.4648 (2.9898) [2022-01-26 06:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][800/1251] eta 0:16:37 lr 0.000038 time 2.4734 (2.2107) loss 3.2873 (3.0505) grad_norm 2.5508 (2.9869) [2022-01-26 06:48:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][810/1251] eta 0:16:14 lr 0.000038 time 2.2344 (2.2104) loss 2.6635 (3.0507) grad_norm 2.8020 (2.9892) [2022-01-26 06:49:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][820/1251] eta 0:15:53 lr 0.000038 time 2.3327 (2.2116) loss 3.1279 (3.0504) grad_norm 2.7070 (2.9888) [2022-01-26 06:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][830/1251] eta 0:15:30 lr 0.000038 time 1.9175 (2.2106) loss 3.0586 (3.0492) grad_norm 2.6235 (2.9901) [2022-01-26 06:49:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][840/1251] eta 0:15:08 lr 0.000038 time 2.1410 (2.2100) loss 3.2242 (3.0483) grad_norm 2.4658 (2.9879) [2022-01-26 06:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][850/1251] eta 0:14:45 lr 0.000038 time 2.2314 (2.2081) loss 3.5335 (3.0487) grad_norm 2.6536 (2.9883) [2022-01-26 06:50:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][860/1251] eta 0:14:22 lr 0.000038 time 1.8925 (2.2066) loss 3.2734 (3.0472) grad_norm 3.6020 (2.9909) [2022-01-26 06:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][870/1251] eta 0:14:00 lr 0.000038 time 1.9168 (2.2073) loss 2.8766 (3.0456) grad_norm 2.9136 (2.9898) [2022-01-26 06:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][880/1251] eta 0:13:38 lr 0.000038 time 2.3351 (2.2063) loss 2.7309 (3.0443) grad_norm 3.7434 (2.9902) [2022-01-26 06:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][890/1251] eta 0:13:16 lr 0.000038 time 2.2256 (2.2069) loss 2.0438 (3.0436) grad_norm 
3.1769 (2.9896) [2022-01-26 06:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][900/1251] eta 0:12:54 lr 0.000038 time 1.8911 (2.2058) loss 3.1590 (3.0428) grad_norm 3.3696 (2.9898) [2022-01-26 06:52:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][910/1251] eta 0:12:31 lr 0.000038 time 2.1558 (2.2051) loss 2.9867 (3.0435) grad_norm 2.9120 (2.9908) [2022-01-26 06:52:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][920/1251] eta 0:12:10 lr 0.000038 time 2.1944 (2.2063) loss 2.1673 (3.0424) grad_norm 2.9313 (2.9905) [2022-01-26 06:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][930/1251] eta 0:11:48 lr 0.000038 time 1.9448 (2.2068) loss 3.1950 (3.0431) grad_norm 2.7368 (2.9888) [2022-01-26 06:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][940/1251] eta 0:11:26 lr 0.000038 time 1.9613 (2.2085) loss 2.9748 (3.0406) grad_norm 3.2754 (2.9901) [2022-01-26 06:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][950/1251] eta 0:11:04 lr 0.000038 time 1.9404 (2.2085) loss 3.2212 (3.0413) grad_norm 2.6345 (2.9895) [2022-01-26 06:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][960/1251] eta 0:10:42 lr 0.000038 time 1.8957 (2.2064) loss 2.2678 (3.0409) grad_norm 2.5114 (2.9861) [2022-01-26 06:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][970/1251] eta 0:10:19 lr 0.000038 time 1.9496 (2.2044) loss 2.5150 (3.0426) grad_norm 2.5796 (2.9845) [2022-01-26 06:54:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][980/1251] eta 0:09:57 lr 0.000038 time 2.7946 (2.2036) loss 3.7277 (3.0408) grad_norm 2.5242 (2.9827) [2022-01-26 06:55:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][990/1251] eta 0:09:34 lr 0.000038 time 1.8575 (2.2025) loss 3.2462 (3.0401) grad_norm 3.0414 (2.9817) [2022-01-26 06:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[267/300][1000/1251] eta 0:09:12 lr 0.000038 time 2.5402 (2.2029) loss 3.4012 (3.0410) grad_norm 2.7524 (2.9871) [2022-01-26 06:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1010/1251] eta 0:08:50 lr 0.000038 time 2.4849 (2.2021) loss 3.1750 (3.0423) grad_norm 2.8275 (2.9852) [2022-01-26 06:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1020/1251] eta 0:08:28 lr 0.000038 time 2.2654 (2.2025) loss 3.1901 (3.0447) grad_norm 2.6410 (2.9850) [2022-01-26 06:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1030/1251] eta 0:08:06 lr 0.000038 time 2.2504 (2.2029) loss 2.5534 (3.0463) grad_norm 2.8529 (2.9868) [2022-01-26 06:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1040/1251] eta 0:07:45 lr 0.000038 time 2.9484 (2.2046) loss 3.2283 (3.0470) grad_norm 2.8320 (2.9861) [2022-01-26 06:57:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1050/1251] eta 0:07:23 lr 0.000038 time 2.4928 (2.2058) loss 3.3546 (3.0481) grad_norm 2.9337 (2.9859) [2022-01-26 06:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1060/1251] eta 0:07:01 lr 0.000038 time 1.8200 (2.2054) loss 2.6464 (3.0484) grad_norm 2.8850 (2.9856) [2022-01-26 06:58:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1070/1251] eta 0:06:38 lr 0.000038 time 2.1302 (2.2042) loss 3.0477 (3.0488) grad_norm 3.0344 (2.9841) [2022-01-26 06:58:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1080/1251] eta 0:06:16 lr 0.000038 time 1.8847 (2.2030) loss 3.0466 (3.0480) grad_norm 3.5049 (2.9840) [2022-01-26 06:58:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1090/1251] eta 0:05:54 lr 0.000038 time 2.1460 (2.2035) loss 3.3117 (3.0481) grad_norm 3.1593 (2.9836) [2022-01-26 06:59:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1100/1251] eta 0:05:32 lr 0.000038 time 1.6082 (2.2027) loss 2.7759 (3.0476) 
grad_norm 2.9476 (2.9818) [2022-01-26 06:59:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1110/1251] eta 0:05:10 lr 0.000038 time 1.8827 (2.2038) loss 2.3958 (3.0471) grad_norm 3.8326 (2.9811) [2022-01-26 07:00:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1120/1251] eta 0:04:48 lr 0.000038 time 1.6106 (2.2033) loss 2.5151 (3.0467) grad_norm 3.2871 (2.9802) [2022-01-26 07:00:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1130/1251] eta 0:04:26 lr 0.000038 time 2.4978 (2.2034) loss 3.4160 (3.0454) grad_norm 3.0657 (2.9789) [2022-01-26 07:00:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1140/1251] eta 0:04:04 lr 0.000038 time 1.5137 (2.2034) loss 2.2584 (3.0446) grad_norm 2.8698 (2.9774) [2022-01-26 07:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1150/1251] eta 0:03:42 lr 0.000038 time 1.8807 (2.2043) loss 2.2720 (3.0447) grad_norm 2.9642 (2.9764) [2022-01-26 07:01:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1160/1251] eta 0:03:20 lr 0.000038 time 1.8147 (2.2026) loss 3.1245 (3.0443) grad_norm 2.2503 (2.9774) [2022-01-26 07:01:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1170/1251] eta 0:02:58 lr 0.000038 time 2.1505 (2.2023) loss 3.4434 (3.0452) grad_norm 2.6395 (2.9761) [2022-01-26 07:02:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1180/1251] eta 0:02:36 lr 0.000038 time 1.6276 (2.2012) loss 3.1611 (3.0456) grad_norm 2.8294 (2.9755) [2022-01-26 07:02:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1190/1251] eta 0:02:14 lr 0.000038 time 2.1292 (2.2006) loss 3.6030 (3.0462) grad_norm 2.8740 (2.9760) [2022-01-26 07:02:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1200/1251] eta 0:01:52 lr 0.000038 time 2.1740 (2.2008) loss 3.0200 (3.0471) grad_norm 3.1305 (2.9760) [2022-01-26 07:03:19 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [267/300][1210/1251] eta 0:01:30 lr 0.000038 time 1.8806 (2.1998) loss 3.1199 (3.0451) grad_norm 2.7742 (2.9750) [2022-01-26 07:03:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1220/1251] eta 0:01:08 lr 0.000038 time 1.9722 (2.1987) loss 2.7443 (3.0453) grad_norm 3.3230 (2.9748) [2022-01-26 07:04:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1230/1251] eta 0:00:46 lr 0.000038 time 2.1624 (2.1989) loss 3.2040 (3.0460) grad_norm 2.8073 (2.9741) [2022-01-26 07:04:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1240/1251] eta 0:00:24 lr 0.000038 time 2.0395 (2.1987) loss 2.8893 (3.0435) grad_norm 3.0264 (2.9722) [2022-01-26 07:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [267/300][1250/1251] eta 0:00:02 lr 0.000038 time 1.1581 (2.1937) loss 2.8696 (3.0428) grad_norm 2.5335 (2.9706) [2022-01-26 07:04:40 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 267 training takes 0:45:44 [2022-01-26 07:04:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.321 (18.321) Loss 0.7838 (0.7838) Acc@1 81.543 (81.543) Acc@5 95.410 (95.410) [2022-01-26 07:05:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.215 (3.415) Loss 0.8316 (0.7948) Acc@1 80.566 (81.081) Acc@5 95.020 (95.703) [2022-01-26 07:05:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.311 (2.740) Loss 0.8136 (0.8087) Acc@1 80.762 (80.897) Acc@5 95.410 (95.471) [2022-01-26 07:05:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.928 (2.337) Loss 0.8415 (0.8183) Acc@1 80.273 (80.825) Acc@5 94.824 (95.325) [2022-01-26 07:06:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.214 (2.215) Loss 0.8498 (0.8206) Acc@1 78.711 (80.886) Acc@5 95.508 (95.284) [2022-01-26 07:06:18 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.906 Acc@5 95.302 [2022-01-26 07:06:18 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-01-26 07:06:18 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.91% [2022-01-26 07:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][0/1251] eta 7:35:49 lr 0.000038 time 21.8619 (21.8619) loss 3.3108 (3.3108) grad_norm 2.6245 (2.6245) [2022-01-26 07:07:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][10/1251] eta 1:24:22 lr 0.000038 time 1.5336 (4.0790) loss 3.4109 (3.1472) grad_norm 2.6521 (2.8644) [2022-01-26 07:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][20/1251] eta 1:02:11 lr 0.000038 time 1.5473 (3.0313) loss 3.0022 (3.1901) grad_norm 3.0458 (2.9358) [2022-01-26 07:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][30/1251] eta 0:56:27 lr 0.000037 time 1.7192 (2.7741) loss 3.7898 (3.1271) grad_norm 2.9764 (2.9920) [2022-01-26 07:08:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][40/1251] eta 0:54:33 lr 0.000037 time 4.0779 (2.7035) loss 2.0308 (3.1099) grad_norm 2.7084 (3.0047) [2022-01-26 07:08:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][50/1251] eta 0:51:55 lr 0.000037 time 2.2389 (2.5939) loss 2.2443 (3.0729) grad_norm 2.7124 (2.9979) [2022-01-26 07:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][60/1251] eta 0:50:14 lr 0.000037 time 1.5713 (2.5311) loss 3.2769 (3.0854) grad_norm 2.7418 (2.9811) [2022-01-26 07:09:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][70/1251] eta 0:49:14 lr 0.000037 time 2.4587 (2.5016) loss 2.8882 (3.0698) grad_norm 2.6481 (2.9760) [2022-01-26 07:09:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][80/1251] eta 0:48:34 lr 0.000037 time 3.7567 (2.4886) loss 2.1299 (3.0684) grad_norm 3.1199 (2.9685) [2022-01-26 07:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][90/1251] eta 0:47:26 lr 0.000037 time 
1.6225 (2.4519) loss 2.9584 (3.0701) grad_norm 2.9615 (2.9616) [2022-01-26 07:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][100/1251] eta 0:46:22 lr 0.000037 time 1.6798 (2.4176) loss 2.5329 (3.0180) grad_norm 2.9043 (2.9761) [2022-01-26 07:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][110/1251] eta 0:45:37 lr 0.000037 time 2.0034 (2.3994) loss 3.2900 (3.0378) grad_norm 2.6275 (2.9838) [2022-01-26 07:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][120/1251] eta 0:44:56 lr 0.000037 time 3.6673 (2.3844) loss 3.0167 (3.0323) grad_norm 4.1268 (2.9842) [2022-01-26 07:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][130/1251] eta 0:44:09 lr 0.000037 time 1.8188 (2.3636) loss 3.3941 (3.0406) grad_norm 4.3678 (2.9929) [2022-01-26 07:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][140/1251] eta 0:43:21 lr 0.000037 time 1.8700 (2.3412) loss 3.4202 (3.0344) grad_norm 2.9016 (2.9804) [2022-01-26 07:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][150/1251] eta 0:42:42 lr 0.000037 time 2.0732 (2.3272) loss 2.6856 (3.0282) grad_norm 2.9720 (2.9821) [2022-01-26 07:12:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][160/1251] eta 0:42:18 lr 0.000037 time 3.5942 (2.3270) loss 3.0067 (3.0400) grad_norm 2.7571 (2.9721) [2022-01-26 07:12:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][170/1251] eta 0:41:44 lr 0.000037 time 1.8953 (2.3171) loss 2.7513 (3.0323) grad_norm 3.2961 (2.9783) [2022-01-26 07:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][180/1251] eta 0:41:13 lr 0.000037 time 2.1833 (2.3099) loss 3.1074 (3.0226) grad_norm 2.8906 (2.9746) [2022-01-26 07:13:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][190/1251] eta 0:40:46 lr 0.000037 time 2.1433 (2.3058) loss 3.1972 (3.0271) grad_norm 2.8129 (2.9599) [2022-01-26 07:14:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][200/1251] eta 0:40:19 lr 0.000037 time 3.5865 (2.3018) loss 2.8118 (3.0158) grad_norm 3.3547 (2.9612) [2022-01-26 07:14:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][210/1251] eta 0:39:48 lr 0.000037 time 2.2164 (2.2942) loss 3.3797 (3.0175) grad_norm 2.8335 (2.9636) [2022-01-26 07:14:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][220/1251] eta 0:39:17 lr 0.000037 time 1.8600 (2.2861) loss 3.4043 (3.0217) grad_norm 2.4594 (2.9622) [2022-01-26 07:15:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][230/1251] eta 0:38:46 lr 0.000037 time 1.8077 (2.2783) loss 1.9506 (3.0211) grad_norm 3.9823 (2.9650) [2022-01-26 07:15:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][240/1251] eta 0:38:16 lr 0.000037 time 3.0452 (2.2720) loss 2.9027 (3.0267) grad_norm 3.0113 (2.9578) [2022-01-26 07:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][250/1251] eta 0:37:41 lr 0.000037 time 1.9047 (2.2592) loss 3.0505 (3.0210) grad_norm 3.0231 (2.9548) [2022-01-26 07:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][260/1251] eta 0:37:19 lr 0.000037 time 2.7932 (2.2594) loss 3.3463 (3.0243) grad_norm 3.0802 (2.9493) [2022-01-26 07:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][270/1251] eta 0:36:54 lr 0.000037 time 2.3668 (2.2575) loss 3.2484 (3.0244) grad_norm 2.9966 (2.9468) [2022-01-26 07:16:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][280/1251] eta 0:36:35 lr 0.000037 time 2.7673 (2.2610) loss 3.3006 (3.0260) grad_norm 2.3842 (2.9434) [2022-01-26 07:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][290/1251] eta 0:36:14 lr 0.000037 time 1.5844 (2.2630) loss 3.1274 (3.0197) grad_norm 2.9551 (2.9438) [2022-01-26 07:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][300/1251] eta 0:35:53 lr 
0.000037 time 2.7163 (2.2641) loss 2.3133 (3.0241) grad_norm 2.7709 (2.9420) [2022-01-26 07:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][310/1251] eta 0:35:28 lr 0.000037 time 3.0889 (2.2617) loss 2.8554 (3.0143) grad_norm 2.9874 (2.9404) [2022-01-26 07:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][320/1251] eta 0:34:57 lr 0.000037 time 1.8430 (2.2524) loss 2.2267 (3.0171) grad_norm 3.0432 (2.9407) [2022-01-26 07:18:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][330/1251] eta 0:34:29 lr 0.000037 time 2.3348 (2.2472) loss 3.4329 (3.0236) grad_norm 2.8246 (2.9419) [2022-01-26 07:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][340/1251] eta 0:34:03 lr 0.000037 time 2.0731 (2.2430) loss 2.7180 (3.0299) grad_norm 2.8412 (2.9425) [2022-01-26 07:19:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][350/1251] eta 0:33:40 lr 0.000037 time 2.2921 (2.2423) loss 2.7764 (3.0309) grad_norm 3.1948 (2.9430) [2022-01-26 07:19:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][360/1251] eta 0:33:17 lr 0.000037 time 2.5224 (2.2423) loss 2.8038 (3.0231) grad_norm 2.9799 (2.9368) [2022-01-26 07:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][370/1251] eta 0:32:54 lr 0.000037 time 1.5639 (2.2413) loss 3.0853 (3.0287) grad_norm 3.6904 (2.9469) [2022-01-26 07:20:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][380/1251] eta 0:32:30 lr 0.000037 time 1.9508 (2.2396) loss 3.1363 (3.0302) grad_norm 2.5850 (2.9491) [2022-01-26 07:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][390/1251] eta 0:32:08 lr 0.000037 time 1.8036 (2.2396) loss 2.8393 (3.0347) grad_norm 3.8138 (2.9533) [2022-01-26 07:21:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][400/1251] eta 0:31:42 lr 0.000037 time 2.1605 (2.2357) loss 3.8525 (3.0402) grad_norm 2.9006 (2.9527) [2022-01-26 07:21:36 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][410/1251] eta 0:31:19 lr 0.000037 time 2.3344 (2.2347) loss 3.5827 (3.0378) grad_norm 3.1125 (2.9531) [2022-01-26 07:21:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][420/1251] eta 0:30:54 lr 0.000037 time 1.7927 (2.2315) loss 3.4020 (3.0382) grad_norm 2.3997 (2.9564) [2022-01-26 07:22:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][430/1251] eta 0:30:31 lr 0.000037 time 2.0030 (2.2307) loss 3.2187 (3.0427) grad_norm 3.2895 (2.9566) [2022-01-26 07:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][440/1251] eta 0:30:08 lr 0.000037 time 1.8371 (2.2301) loss 3.2945 (3.0402) grad_norm 3.1811 (2.9564) [2022-01-26 07:23:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][450/1251] eta 0:29:46 lr 0.000037 time 1.8473 (2.2300) loss 3.4240 (3.0320) grad_norm 2.5053 (2.9549) [2022-01-26 07:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][460/1251] eta 0:29:21 lr 0.000037 time 1.6230 (2.2273) loss 3.2171 (3.0347) grad_norm 2.9824 (2.9549) [2022-01-26 07:23:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][470/1251] eta 0:29:01 lr 0.000037 time 2.0997 (2.2295) loss 2.0625 (3.0303) grad_norm 3.1762 (2.9511) [2022-01-26 07:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][480/1251] eta 0:28:37 lr 0.000037 time 2.0597 (2.2282) loss 3.2126 (3.0254) grad_norm 3.4759 (2.9517) [2022-01-26 07:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][490/1251] eta 0:28:17 lr 0.000037 time 2.4879 (2.2309) loss 2.8115 (3.0257) grad_norm 2.9082 (2.9495) [2022-01-26 07:24:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][500/1251] eta 0:27:52 lr 0.000037 time 1.9154 (2.2270) loss 2.9462 (3.0271) grad_norm 2.9915 (2.9500) [2022-01-26 07:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][510/1251] eta 0:27:29 lr 
0.000037 time 2.3607 (2.2261) loss 3.3632 (3.0278) grad_norm 2.8202 (2.9493) [2022-01-26 07:25:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][520/1251] eta 0:27:04 lr 0.000037 time 2.0953 (2.2223) loss 2.8762 (3.0286) grad_norm 3.1253 (2.9531) [2022-01-26 07:25:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][530/1251] eta 0:26:40 lr 0.000037 time 2.2134 (2.2195) loss 3.3530 (3.0246) grad_norm 2.7694 (2.9542) [2022-01-26 07:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][540/1251] eta 0:26:18 lr 0.000037 time 2.1839 (2.2202) loss 3.0728 (3.0278) grad_norm 2.5889 (2.9536) [2022-01-26 07:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][550/1251] eta 0:25:59 lr 0.000037 time 2.7231 (2.2252) loss 2.6505 (3.0276) grad_norm 2.6611 (2.9530) [2022-01-26 07:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][560/1251] eta 0:25:38 lr 0.000037 time 2.1495 (2.2269) loss 3.3467 (3.0267) grad_norm 2.5670 (2.9559) [2022-01-26 07:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][570/1251] eta 0:25:15 lr 0.000037 time 1.8041 (2.2259) loss 3.8125 (3.0269) grad_norm 2.9956 (2.9565) [2022-01-26 07:27:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][580/1251] eta 0:24:51 lr 0.000037 time 1.7190 (2.2232) loss 1.9983 (3.0260) grad_norm 2.8693 (2.9561) [2022-01-26 07:28:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][590/1251] eta 0:24:26 lr 0.000037 time 1.6321 (2.2185) loss 3.2530 (3.0249) grad_norm 3.2706 (2.9575) [2022-01-26 07:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][600/1251] eta 0:24:01 lr 0.000037 time 1.8731 (2.2144) loss 3.5992 (3.0224) grad_norm 2.6246 (2.9553) [2022-01-26 07:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][610/1251] eta 0:23:39 lr 0.000037 time 2.5443 (2.2138) loss 3.7393 (3.0250) grad_norm 2.6223 (2.9504) [2022-01-26 07:29:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][620/1251] eta 0:23:18 lr 0.000037 time 2.1861 (2.2158) loss 2.5134 (3.0288) grad_norm 2.8742 (2.9484) [2022-01-26 07:29:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][630/1251] eta 0:22:57 lr 0.000037 time 2.1233 (2.2178) loss 3.2092 (3.0318) grad_norm 3.3832 (2.9489) [2022-01-26 07:30:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][640/1251] eta 0:22:35 lr 0.000037 time 2.1389 (2.2181) loss 3.4194 (3.0312) grad_norm 2.5854 (2.9452) [2022-01-26 07:30:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][650/1251] eta 0:22:13 lr 0.000037 time 2.4464 (2.2190) loss 3.4929 (3.0349) grad_norm 2.8230 (2.9434) [2022-01-26 07:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][660/1251] eta 0:21:50 lr 0.000037 time 1.8037 (2.2178) loss 3.5141 (3.0347) grad_norm 2.8473 (2.9461) [2022-01-26 07:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][670/1251] eta 0:21:28 lr 0.000037 time 2.5255 (2.2183) loss 2.7835 (3.0325) grad_norm 3.0079 (2.9454) [2022-01-26 07:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][680/1251] eta 0:21:05 lr 0.000037 time 1.5923 (2.2166) loss 3.3150 (3.0291) grad_norm 2.6684 (2.9476) [2022-01-26 07:31:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][690/1251] eta 0:20:42 lr 0.000037 time 1.8667 (2.2146) loss 2.8764 (3.0297) grad_norm 3.3182 (2.9537) [2022-01-26 07:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][700/1251] eta 0:20:19 lr 0.000037 time 2.2066 (2.2126) loss 3.3015 (3.0307) grad_norm 3.0495 (2.9573) [2022-01-26 07:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][710/1251] eta 0:19:58 lr 0.000037 time 1.9418 (2.2146) loss 3.3978 (3.0314) grad_norm 2.7467 (2.9604) [2022-01-26 07:32:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][720/1251] eta 0:19:34 lr 
0.000037 time 1.8816 (2.2128) loss 3.6558 (3.0335) grad_norm 2.7606 (2.9598) [2022-01-26 07:33:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][730/1251] eta 0:19:12 lr 0.000037 time 1.9745 (2.2125) loss 3.2626 (3.0370) grad_norm 3.0289 (2.9611) [2022-01-26 07:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][740/1251] eta 0:18:50 lr 0.000037 time 1.8677 (2.2121) loss 2.9831 (3.0370) grad_norm 2.7659 (2.9591) [2022-01-26 07:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][750/1251] eta 0:18:28 lr 0.000037 time 2.2231 (2.2118) loss 3.5642 (3.0372) grad_norm 3.1790 (2.9597) [2022-01-26 07:34:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][760/1251] eta 0:18:06 lr 0.000037 time 1.9555 (2.2120) loss 3.7645 (3.0360) grad_norm 2.5608 (2.9598) [2022-01-26 07:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][770/1251] eta 0:17:43 lr 0.000036 time 2.4403 (2.2115) loss 2.1048 (3.0341) grad_norm 2.9791 (2.9594) [2022-01-26 07:35:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][780/1251] eta 0:17:21 lr 0.000036 time 2.4721 (2.2107) loss 2.8937 (3.0351) grad_norm 2.8107 (2.9581) [2022-01-26 07:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][790/1251] eta 0:16:59 lr 0.000036 time 1.8199 (2.2114) loss 3.0534 (3.0320) grad_norm 2.9318 (2.9561) [2022-01-26 07:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][800/1251] eta 0:16:36 lr 0.000036 time 2.1224 (2.2100) loss 3.4243 (3.0346) grad_norm 2.7867 (2.9556) [2022-01-26 07:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][810/1251] eta 0:16:14 lr 0.000036 time 2.4701 (2.2105) loss 3.4615 (3.0353) grad_norm 2.6924 (2.9548) [2022-01-26 07:36:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][820/1251] eta 0:15:52 lr 0.000036 time 2.2611 (2.2092) loss 3.0521 (3.0348) grad_norm 2.4472 (2.9536) [2022-01-26 07:36:52 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][830/1251] eta 0:15:29 lr 0.000036 time 1.7541 (2.2074) loss 3.3403 (3.0341) grad_norm 2.8679 (2.9555) [2022-01-26 07:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][840/1251] eta 0:15:08 lr 0.000036 time 2.0339 (2.2095) loss 2.2770 (3.0301) grad_norm 2.6558 (2.9542) [2022-01-26 07:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][850/1251] eta 0:14:47 lr 0.000036 time 2.2658 (2.2137) loss 3.3428 (3.0298) grad_norm 3.0007 (2.9563) [2022-01-26 07:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][860/1251] eta 0:14:25 lr 0.000036 time 2.2184 (2.2125) loss 3.2855 (3.0281) grad_norm 2.4269 (2.9570) [2022-01-26 07:38:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][870/1251] eta 0:14:02 lr 0.000036 time 2.4619 (2.2110) loss 2.2728 (3.0264) grad_norm 2.5252 (2.9540) [2022-01-26 07:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][880/1251] eta 0:13:39 lr 0.000036 time 1.5969 (2.2085) loss 3.3615 (3.0280) grad_norm 2.9012 (2.9543) [2022-01-26 07:39:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][890/1251] eta 0:13:16 lr 0.000036 time 1.5545 (2.2056) loss 2.8622 (3.0286) grad_norm 2.9967 (2.9567) [2022-01-26 07:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][900/1251] eta 0:12:54 lr 0.000036 time 2.8596 (2.2052) loss 3.1626 (3.0301) grad_norm 2.4629 (2.9568) [2022-01-26 07:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][910/1251] eta 0:12:32 lr 0.000036 time 2.6532 (2.2057) loss 2.6826 (3.0294) grad_norm 2.6455 (2.9566) [2022-01-26 07:40:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][920/1251] eta 0:12:10 lr 0.000036 time 2.1952 (2.2061) loss 3.5143 (3.0269) grad_norm 2.8858 (2.9585) [2022-01-26 07:40:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][930/1251] eta 0:11:48 lr 
0.000036 time 2.6417 (2.2061) loss 3.2720 (3.0255) grad_norm 3.0904 (2.9578) [2022-01-26 07:40:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][940/1251] eta 0:11:25 lr 0.000036 time 2.8300 (2.2052) loss 3.4491 (3.0248) grad_norm 3.6838 (2.9624) [2022-01-26 07:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][950/1251] eta 0:11:04 lr 0.000036 time 2.5064 (2.2062) loss 2.5183 (3.0261) grad_norm 2.5057 (2.9622) [2022-01-26 07:41:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][960/1251] eta 0:10:42 lr 0.000036 time 2.6120 (2.2068) loss 3.2806 (3.0251) grad_norm 2.7467 (2.9620) [2022-01-26 07:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][970/1251] eta 0:10:20 lr 0.000036 time 2.2364 (2.2072) loss 3.0918 (3.0236) grad_norm 3.3516 (2.9615) [2022-01-26 07:42:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][980/1251] eta 0:09:58 lr 0.000036 time 2.1908 (2.2082) loss 3.0844 (3.0224) grad_norm 3.1912 (2.9619) [2022-01-26 07:42:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][990/1251] eta 0:09:36 lr 0.000036 time 2.8191 (2.2078) loss 2.5253 (3.0219) grad_norm 2.6889 (2.9617) [2022-01-26 07:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1000/1251] eta 0:09:13 lr 0.000036 time 1.7316 (2.2064) loss 2.3761 (3.0213) grad_norm 3.1689 (2.9618) [2022-01-26 07:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1010/1251] eta 0:08:51 lr 0.000036 time 2.7280 (2.2056) loss 3.1762 (3.0189) grad_norm 2.6704 (2.9604) [2022-01-26 07:43:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1020/1251] eta 0:08:29 lr 0.000036 time 1.7766 (2.2047) loss 2.8062 (3.0208) grad_norm 2.6261 (2.9595) [2022-01-26 07:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1030/1251] eta 0:08:07 lr 0.000036 time 2.8428 (2.2057) loss 3.2600 (3.0215) grad_norm 2.7400 (2.9585) [2022-01-26 
07:44:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1040/1251] eta 0:07:45 lr 0.000036 time 1.4656 (2.2044) loss 2.6471 (3.0200) grad_norm 3.1392 (2.9583) [2022-01-26 07:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1050/1251] eta 0:07:22 lr 0.000036 time 2.2149 (2.2039) loss 3.5239 (3.0215) grad_norm 2.7429 (2.9593) [2022-01-26 07:45:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1060/1251] eta 0:07:01 lr 0.000036 time 2.4523 (2.2046) loss 3.7507 (3.0236) grad_norm 3.0322 (2.9591) [2022-01-26 07:45:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1070/1251] eta 0:06:39 lr 0.000036 time 2.0146 (2.2049) loss 2.6828 (3.0238) grad_norm 2.8461 (2.9605) [2022-01-26 07:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1080/1251] eta 0:06:17 lr 0.000036 time 1.7786 (2.2048) loss 3.3530 (3.0256) grad_norm 4.3516 (2.9635) [2022-01-26 07:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1090/1251] eta 0:05:54 lr 0.000036 time 2.0428 (2.2042) loss 3.6980 (3.0255) grad_norm 2.8824 (2.9644) [2022-01-26 07:46:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1100/1251] eta 0:05:32 lr 0.000036 time 3.1334 (2.2044) loss 3.6412 (3.0235) grad_norm 3.2462 (2.9640) [2022-01-26 07:47:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1110/1251] eta 0:05:10 lr 0.000036 time 3.1291 (2.2048) loss 3.1962 (3.0238) grad_norm 3.1500 (2.9636) [2022-01-26 07:47:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1120/1251] eta 0:04:48 lr 0.000036 time 1.7342 (2.2038) loss 3.1468 (3.0248) grad_norm 3.0195 (2.9638) [2022-01-26 07:47:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1130/1251] eta 0:04:26 lr 0.000036 time 1.9153 (2.2021) loss 2.9874 (3.0259) grad_norm 2.6747 (2.9634) [2022-01-26 07:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1140/1251] 
eta 0:04:04 lr 0.000036 time 1.9716 (2.2005) loss 3.4333 (3.0263) grad_norm 3.0894 (2.9642) [2022-01-26 07:48:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1150/1251] eta 0:03:42 lr 0.000036 time 2.8185 (2.2005) loss 3.2823 (3.0255) grad_norm 3.6496 (2.9649) [2022-01-26 07:48:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1160/1251] eta 0:03:20 lr 0.000036 time 1.8255 (2.1993) loss 2.7945 (3.0233) grad_norm 2.5928 (2.9659) [2022-01-26 07:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1170/1251] eta 0:02:58 lr 0.000036 time 2.0800 (2.1985) loss 3.3343 (3.0229) grad_norm 2.9244 (2.9646) [2022-01-26 07:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1180/1251] eta 0:02:36 lr 0.000036 time 2.7766 (2.1983) loss 2.7869 (3.0226) grad_norm 3.1735 (2.9658) [2022-01-26 07:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1190/1251] eta 0:02:14 lr 0.000036 time 2.6704 (2.1977) loss 3.4904 (3.0229) grad_norm 2.7660 (2.9647) [2022-01-26 07:50:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1200/1251] eta 0:01:52 lr 0.000036 time 2.4550 (2.1999) loss 3.2406 (3.0211) grad_norm 2.6369 (2.9644) [2022-01-26 07:50:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1210/1251] eta 0:01:30 lr 0.000036 time 2.1891 (2.1995) loss 2.6560 (3.0214) grad_norm 2.8438 (2.9650) [2022-01-26 07:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1220/1251] eta 0:01:08 lr 0.000036 time 2.6697 (2.2000) loss 3.3969 (3.0197) grad_norm 2.8066 (2.9643) [2022-01-26 07:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1230/1251] eta 0:00:46 lr 0.000036 time 2.5229 (2.1999) loss 3.3659 (3.0198) grad_norm 3.2405 (2.9670) [2022-01-26 07:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1240/1251] eta 0:00:24 lr 0.000036 time 2.2493 (2.1988) loss 3.3776 (3.0201) grad_norm 2.7000 
(2.9670) [2022-01-26 07:52:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [268/300][1250/1251] eta 0:00:02 lr 0.000036 time 1.1824 (2.1930) loss 2.9785 (3.0215) grad_norm 3.1540 (2.9663) [2022-01-26 07:52:02 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 268 training takes 0:45:43 [2022-01-26 07:52:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.731 (18.731) Loss 0.7943 (0.7943) Acc@1 81.836 (81.836) Acc@5 95.508 (95.508) [2022-01-26 07:52:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.926 (3.217) Loss 0.8258 (0.8166) Acc@1 80.566 (81.143) Acc@5 95.117 (95.428) [2022-01-26 07:52:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.019 (2.478) Loss 0.8422 (0.8234) Acc@1 79.980 (80.850) Acc@5 94.727 (95.350) [2022-01-26 07:53:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.604 (2.121) Loss 0.8148 (0.8298) Acc@1 80.957 (80.661) Acc@5 95.117 (95.262) [2022-01-26 07:53:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.762 (2.089) Loss 0.8162 (0.8218) Acc@1 81.348 (80.788) Acc@5 95.508 (95.353) [2022-01-26 07:53:34 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.826 Acc@5 95.378 [2022-01-26 07:53:34 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.8% [2022-01-26 07:53:34 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.91% [2022-01-26 07:53:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][0/1251] eta 7:21:55 lr 0.000036 time 21.1953 (21.1953) loss 3.4336 (3.4336) grad_norm 3.0550 (3.0550) [2022-01-26 07:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][10/1251] eta 1:21:47 lr 0.000036 time 2.2669 (3.9548) loss 3.2940 (2.8159) grad_norm 2.6865 (2.8766) [2022-01-26 07:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][20/1251] eta 1:02:47 lr 0.000036 time 1.4248 (3.0609) loss 
3.3259 (3.0142) grad_norm 2.5651 (2.8255) [2022-01-26 07:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][30/1251] eta 0:56:53 lr 0.000036 time 2.1282 (2.7958) loss 3.3533 (3.0604) grad_norm 2.3638 (2.8854) [2022-01-26 07:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][40/1251] eta 0:53:46 lr 0.000036 time 4.0798 (2.6643) loss 3.0458 (3.0830) grad_norm 2.6446 (2.9331) [2022-01-26 07:55:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][50/1251] eta 0:51:43 lr 0.000036 time 1.8289 (2.5842) loss 3.1059 (3.0746) grad_norm 2.8596 (2.9238) [2022-01-26 07:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][60/1251] eta 0:49:33 lr 0.000036 time 1.3657 (2.4963) loss 2.3100 (3.0676) grad_norm 2.9409 (2.8812) [2022-01-26 07:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][70/1251] eta 0:48:00 lr 0.000036 time 1.7336 (2.4389) loss 3.2223 (3.0322) grad_norm 3.4480 (2.9055) [2022-01-26 07:56:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][80/1251] eta 0:46:56 lr 0.000036 time 3.2050 (2.4054) loss 3.5904 (3.0385) grad_norm 3.4771 (2.9161) [2022-01-26 07:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][90/1251] eta 0:46:17 lr 0.000036 time 1.5913 (2.3923) loss 3.3777 (3.0129) grad_norm 2.8169 (2.9428) [2022-01-26 07:57:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][100/1251] eta 0:45:21 lr 0.000036 time 2.1496 (2.3647) loss 3.3039 (3.0232) grad_norm 3.3632 (2.9892) [2022-01-26 07:57:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][110/1251] eta 0:44:40 lr 0.000036 time 1.4575 (2.3496) loss 2.3548 (3.0190) grad_norm 2.9197 (3.0029) [2022-01-26 07:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][120/1251] eta 0:44:17 lr 0.000036 time 3.1437 (2.3496) loss 3.3066 (3.0120) grad_norm 3.6240 (3.0262) [2022-01-26 07:58:42 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [269/300][130/1251] eta 0:43:52 lr 0.000036 time 1.6786 (2.3485) loss 2.9585 (3.0192) grad_norm 3.1803 (3.0228) [2022-01-26 07:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][140/1251] eta 0:43:15 lr 0.000036 time 2.2726 (2.3366) loss 3.0145 (3.0225) grad_norm 2.6980 (3.0238) [2022-01-26 07:59:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][150/1251] eta 0:42:49 lr 0.000036 time 1.5508 (2.3336) loss 2.4323 (3.0305) grad_norm 2.7995 (3.0260) [2022-01-26 07:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][160/1251] eta 0:42:14 lr 0.000036 time 2.2717 (2.3227) loss 2.3077 (3.0326) grad_norm 2.6828 (3.0172) [2022-01-26 08:00:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][170/1251] eta 0:41:30 lr 0.000036 time 1.9689 (2.3042) loss 3.4886 (3.0247) grad_norm 2.9859 (3.0150) [2022-01-26 08:00:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][180/1251] eta 0:40:48 lr 0.000036 time 2.4675 (2.2866) loss 3.2766 (3.0120) grad_norm 2.8790 (3.0141) [2022-01-26 08:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][190/1251] eta 0:40:21 lr 0.000036 time 1.9258 (2.2826) loss 3.5897 (3.0150) grad_norm 3.0828 (3.0109) [2022-01-26 08:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][200/1251] eta 0:40:02 lr 0.000036 time 2.5574 (2.2859) loss 2.8848 (3.0175) grad_norm 2.7770 (3.0033) [2022-01-26 08:01:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][210/1251] eta 0:39:42 lr 0.000036 time 2.7800 (2.2883) loss 1.9888 (3.0156) grad_norm 2.3302 (3.0004) [2022-01-26 08:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][220/1251] eta 0:39:19 lr 0.000036 time 2.6655 (2.2886) loss 2.9590 (3.0224) grad_norm 3.0986 (3.0000) [2022-01-26 08:02:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][230/1251] eta 0:38:47 lr 0.000036 time 1.6125 (2.2799) loss 3.4201 
(3.0187) grad_norm 3.0241 (3.0015) [2022-01-26 08:02:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][240/1251] eta 0:38:12 lr 0.000036 time 2.1264 (2.2672) loss 3.4493 (3.0206) grad_norm 2.5369 (2.9982) [2022-01-26 08:03:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][250/1251] eta 0:37:40 lr 0.000036 time 1.5862 (2.2580) loss 2.5450 (3.0199) grad_norm 2.7583 (2.9966) [2022-01-26 08:03:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][260/1251] eta 0:37:14 lr 0.000036 time 1.9497 (2.2545) loss 3.2700 (3.0266) grad_norm 3.0816 (3.0218) [2022-01-26 08:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][270/1251] eta 0:36:46 lr 0.000035 time 1.8690 (2.2491) loss 3.0403 (3.0282) grad_norm 2.7913 (3.0170) [2022-01-26 08:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][280/1251] eta 0:36:19 lr 0.000035 time 2.2491 (2.2445) loss 2.2722 (3.0101) grad_norm 2.8277 (3.0176) [2022-01-26 08:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][290/1251] eta 0:35:58 lr 0.000035 time 1.7755 (2.2456) loss 2.3558 (2.9964) grad_norm 2.7425 (3.0129) [2022-01-26 08:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][300/1251] eta 0:35:35 lr 0.000035 time 2.2212 (2.2460) loss 3.2840 (2.9937) grad_norm 2.5031 (3.0159) [2022-01-26 08:05:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][310/1251] eta 0:35:08 lr 0.000035 time 2.0972 (2.2409) loss 3.4595 (2.9929) grad_norm 3.8161 (3.0176) [2022-01-26 08:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][320/1251] eta 0:34:46 lr 0.000035 time 2.1953 (2.2409) loss 3.5992 (3.0000) grad_norm 2.4895 (3.0091) [2022-01-26 08:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][330/1251] eta 0:34:25 lr 0.000035 time 2.1878 (2.2429) loss 3.2102 (2.9945) grad_norm 2.5236 (3.0058) [2022-01-26 08:06:19 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [269/300][340/1251] eta 0:34:01 lr 0.000035 time 1.9375 (2.2413) loss 3.5394 (2.9957) grad_norm 2.8639 (3.0004) [2022-01-26 08:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][350/1251] eta 0:33:35 lr 0.000035 time 2.2254 (2.2371) loss 3.2720 (2.9853) grad_norm 3.0804 (2.9978) [2022-01-26 08:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][360/1251] eta 0:33:05 lr 0.000035 time 2.0017 (2.2284) loss 2.8688 (2.9860) grad_norm 2.4423 (2.9909) [2022-01-26 08:07:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][370/1251] eta 0:32:40 lr 0.000035 time 2.1836 (2.2254) loss 3.4994 (2.9919) grad_norm 2.7349 (2.9900) [2022-01-26 08:07:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][380/1251] eta 0:32:15 lr 0.000035 time 1.6164 (2.2219) loss 2.9403 (2.9900) grad_norm 2.8792 (2.9902) [2022-01-26 08:08:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][390/1251] eta 0:31:52 lr 0.000035 time 2.6284 (2.2211) loss 3.3470 (2.9877) grad_norm 3.7102 (2.9995) [2022-01-26 08:08:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][400/1251] eta 0:31:27 lr 0.000035 time 2.5891 (2.2184) loss 2.0631 (2.9810) grad_norm 2.3738 (2.9950) [2022-01-26 08:08:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][410/1251] eta 0:31:06 lr 0.000035 time 2.7034 (2.2189) loss 3.4699 (2.9812) grad_norm 3.3731 (2.9931) [2022-01-26 08:09:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][420/1251] eta 0:30:44 lr 0.000035 time 1.8990 (2.2202) loss 2.6005 (2.9816) grad_norm 2.9373 (2.9956) [2022-01-26 08:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][430/1251] eta 0:30:25 lr 0.000035 time 2.2687 (2.2234) loss 3.4662 (2.9874) grad_norm 2.9120 (2.9933) [2022-01-26 08:09:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][440/1251] eta 0:30:03 lr 0.000035 time 2.5879 (2.2239) loss 2.7981 
(2.9867) grad_norm 2.6115 (2.9924) [2022-01-26 08:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][450/1251] eta 0:29:42 lr 0.000035 time 2.3411 (2.2257) loss 2.2859 (2.9898) grad_norm 2.7371 (2.9940) [2022-01-26 08:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][460/1251] eta 0:29:18 lr 0.000035 time 1.5926 (2.2231) loss 3.4733 (2.9934) grad_norm 4.0872 (3.0021) [2022-01-26 08:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][470/1251] eta 0:28:55 lr 0.000035 time 1.8174 (2.2224) loss 2.3803 (2.9927) grad_norm 2.5691 (2.9996) [2022-01-26 08:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][480/1251] eta 0:28:32 lr 0.000035 time 2.5201 (2.2209) loss 3.6471 (2.9980) grad_norm 3.1904 (2.9956) [2022-01-26 08:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][490/1251] eta 0:28:10 lr 0.000035 time 2.5198 (2.2209) loss 3.4625 (2.9978) grad_norm 3.0658 (2.9975) [2022-01-26 08:12:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][500/1251] eta 0:27:45 lr 0.000035 time 2.4981 (2.2184) loss 2.7052 (2.9948) grad_norm 2.4293 (2.9960) [2022-01-26 08:12:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][510/1251] eta 0:27:23 lr 0.000035 time 2.1817 (2.2181) loss 3.4939 (2.9983) grad_norm 2.7969 (2.9910) [2022-01-26 08:12:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][520/1251] eta 0:27:01 lr 0.000035 time 2.5124 (2.2181) loss 3.0108 (2.9972) grad_norm 2.4077 (2.9900) [2022-01-26 08:13:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][530/1251] eta 0:26:38 lr 0.000035 time 2.4246 (2.2175) loss 2.4159 (2.9968) grad_norm 2.9468 (2.9876) [2022-01-26 08:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][540/1251] eta 0:26:14 lr 0.000035 time 2.2902 (2.2152) loss 2.7418 (2.9943) grad_norm 2.7232 (2.9884) [2022-01-26 08:13:54 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [269/300][550/1251] eta 0:25:51 lr 0.000035 time 1.6216 (2.2129) loss 3.8399 (2.9936) grad_norm 3.1634 (2.9873) [2022-01-26 08:14:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][560/1251] eta 0:25:28 lr 0.000035 time 1.9231 (2.2114) loss 3.4879 (2.9981) grad_norm 3.4819 (2.9884) [2022-01-26 08:14:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][570/1251] eta 0:25:05 lr 0.000035 time 1.9344 (2.2113) loss 3.4086 (3.0056) grad_norm 3.3399 (2.9875) [2022-01-26 08:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][580/1251] eta 0:24:44 lr 0.000035 time 2.6860 (2.2130) loss 2.3173 (3.0017) grad_norm 2.5929 (2.9846) [2022-01-26 08:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][590/1251] eta 0:24:21 lr 0.000035 time 1.8965 (2.2110) loss 2.8935 (3.0022) grad_norm 2.8153 (2.9855) [2022-01-26 08:15:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][600/1251] eta 0:23:59 lr 0.000035 time 2.3616 (2.2107) loss 3.2661 (3.0040) grad_norm 2.5412 (2.9873) [2022-01-26 08:16:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][610/1251] eta 0:23:36 lr 0.000035 time 1.9294 (2.2094) loss 2.5673 (3.0008) grad_norm 3.1657 (2.9889) [2022-01-26 08:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][620/1251] eta 0:23:15 lr 0.000035 time 3.4492 (2.2120) loss 3.3001 (3.0052) grad_norm 2.7835 (2.9866) [2022-01-26 08:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][630/1251] eta 0:22:52 lr 0.000035 time 1.5674 (2.2095) loss 2.2181 (3.0060) grad_norm 2.8887 (2.9873) [2022-01-26 08:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][640/1251] eta 0:22:29 lr 0.000035 time 1.5668 (2.2079) loss 3.6686 (3.0077) grad_norm 2.8591 (2.9865) [2022-01-26 08:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][650/1251] eta 0:22:07 lr 0.000035 time 2.2309 (2.2093) loss 2.7223 
(3.0111) grad_norm 2.9727 (2.9871) [2022-01-26 08:17:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][660/1251] eta 0:21:46 lr 0.000035 time 2.0855 (2.2108) loss 2.8824 (3.0107) grad_norm 2.7123 (2.9893) [2022-01-26 08:18:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][670/1251] eta 0:21:23 lr 0.000035 time 1.6461 (2.2096) loss 2.6011 (3.0122) grad_norm 2.8694 (2.9885) [2022-01-26 08:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][680/1251] eta 0:21:01 lr 0.000035 time 1.6312 (2.2088) loss 2.7628 (3.0137) grad_norm 3.0257 (2.9886) [2022-01-26 08:19:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][690/1251] eta 0:20:39 lr 0.000035 time 1.9328 (2.2103) loss 3.6635 (3.0156) grad_norm 3.5552 (2.9927) [2022-01-26 08:19:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][700/1251] eta 0:20:17 lr 0.000035 time 1.8870 (2.2096) loss 2.9822 (3.0172) grad_norm 2.7538 (2.9937) [2022-01-26 08:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][710/1251] eta 0:19:54 lr 0.000035 time 1.6361 (2.2081) loss 3.7125 (3.0199) grad_norm 2.4577 (2.9898) [2022-01-26 08:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][720/1251] eta 0:19:32 lr 0.000035 time 1.9020 (2.2085) loss 3.2300 (3.0229) grad_norm 3.3633 (2.9897) [2022-01-26 08:20:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][730/1251] eta 0:19:10 lr 0.000035 time 1.8754 (2.2088) loss 3.0635 (3.0231) grad_norm 2.7330 (2.9891) [2022-01-26 08:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][740/1251] eta 0:18:48 lr 0.000035 time 2.3385 (2.2093) loss 3.1882 (3.0243) grad_norm 2.7615 (2.9887) [2022-01-26 08:21:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][750/1251] eta 0:18:25 lr 0.000035 time 1.6830 (2.2073) loss 2.6067 (3.0254) grad_norm 2.7190 (2.9901) [2022-01-26 08:21:34 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [269/300][760/1251] eta 0:18:03 lr 0.000035 time 1.5955 (2.2067) loss 2.7587 (3.0241) grad_norm 3.1429 (2.9895) [2022-01-26 08:21:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][770/1251] eta 0:17:41 lr 0.000035 time 1.8869 (2.2074) loss 2.5680 (3.0237) grad_norm 3.0588 (2.9883) [2022-01-26 08:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][780/1251] eta 0:17:18 lr 0.000035 time 2.1904 (2.2055) loss 2.6591 (3.0225) grad_norm 2.7727 (2.9904) [2022-01-26 08:22:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][790/1251] eta 0:16:56 lr 0.000035 time 2.0769 (2.2053) loss 3.0963 (3.0233) grad_norm 2.7482 (2.9900) [2022-01-26 08:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][800/1251] eta 0:16:34 lr 0.000035 time 2.4480 (2.2049) loss 2.3971 (3.0199) grad_norm 2.9692 (2.9881) [2022-01-26 08:23:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][810/1251] eta 0:16:12 lr 0.000035 time 1.6760 (2.2050) loss 3.4502 (3.0238) grad_norm 2.7029 (2.9930) [2022-01-26 08:23:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][820/1251] eta 0:15:49 lr 0.000035 time 1.9970 (2.2040) loss 3.7747 (3.0282) grad_norm 4.4796 (2.9953) [2022-01-26 08:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][830/1251] eta 0:15:27 lr 0.000035 time 1.9084 (2.2036) loss 2.7424 (3.0291) grad_norm 2.7682 (2.9936) [2022-01-26 08:24:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][840/1251] eta 0:15:05 lr 0.000035 time 3.1050 (2.2040) loss 3.3074 (3.0270) grad_norm 2.4010 (2.9930) [2022-01-26 08:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][850/1251] eta 0:14:45 lr 0.000035 time 2.3258 (2.2074) loss 2.9606 (3.0237) grad_norm 2.5715 (2.9936) [2022-01-26 08:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][860/1251] eta 0:14:23 lr 0.000035 time 2.0262 (2.2076) loss 3.2125 
(3.0246) grad_norm 2.6713 (2.9925) [2022-01-26 08:25:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][870/1251] eta 0:14:00 lr 0.000035 time 1.8229 (2.2057) loss 3.8189 (3.0261) grad_norm 3.1197 (2.9911) [2022-01-26 08:25:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][880/1251] eta 0:13:37 lr 0.000035 time 1.8042 (2.2031) loss 2.7418 (3.0258) grad_norm 3.1585 (2.9927) [2022-01-26 08:26:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][890/1251] eta 0:13:14 lr 0.000035 time 1.8210 (2.2012) loss 2.8910 (3.0243) grad_norm 3.3171 (2.9926) [2022-01-26 08:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][900/1251] eta 0:12:52 lr 0.000035 time 2.5812 (2.2010) loss 2.1483 (3.0244) grad_norm 2.9750 (2.9919) [2022-01-26 08:27:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][910/1251] eta 0:12:30 lr 0.000035 time 1.7395 (2.2012) loss 3.3313 (3.0248) grad_norm 3.4228 (2.9906) [2022-01-26 08:27:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][920/1251] eta 0:12:08 lr 0.000035 time 1.7730 (2.2016) loss 3.1565 (3.0246) grad_norm 2.7926 (2.9900) [2022-01-26 08:27:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][930/1251] eta 0:11:46 lr 0.000035 time 2.4910 (2.2013) loss 3.3217 (3.0266) grad_norm 2.7015 (2.9884) [2022-01-26 08:28:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][940/1251] eta 0:11:24 lr 0.000035 time 3.0959 (2.2015) loss 3.3379 (3.0275) grad_norm 2.8547 (2.9881) [2022-01-26 08:28:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][950/1251] eta 0:11:03 lr 0.000035 time 2.2498 (2.2032) loss 3.0999 (3.0263) grad_norm 2.7703 (2.9884) [2022-01-26 08:28:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][960/1251] eta 0:10:41 lr 0.000035 time 1.6199 (2.2031) loss 3.1999 (3.0277) grad_norm 3.0287 (2.9907) [2022-01-26 08:29:14 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [269/300][970/1251] eta 0:10:19 lr 0.000035 time 1.6057 (2.2031) loss 3.3371 (3.0250) grad_norm 2.7482 (2.9908) [2022-01-26 08:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][980/1251] eta 0:09:57 lr 0.000035 time 3.4236 (2.2031) loss 2.2806 (3.0244) grad_norm 2.6663 (2.9915) [2022-01-26 08:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][990/1251] eta 0:09:34 lr 0.000035 time 1.7230 (2.2015) loss 3.3366 (3.0229) grad_norm 6.3214 (2.9947) [2022-01-26 08:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1000/1251] eta 0:09:12 lr 0.000035 time 1.9223 (2.2005) loss 2.5707 (3.0206) grad_norm 2.7330 (2.9950) [2022-01-26 08:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1010/1251] eta 0:08:50 lr 0.000035 time 2.1896 (2.2002) loss 2.8868 (3.0202) grad_norm 2.6433 (2.9948) [2022-01-26 08:31:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1020/1251] eta 0:08:28 lr 0.000035 time 2.3435 (2.1999) loss 2.9060 (3.0200) grad_norm 2.7671 (2.9944) [2022-01-26 08:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1030/1251] eta 0:08:05 lr 0.000035 time 1.8476 (2.1981) loss 2.7210 (3.0211) grad_norm 2.8177 (2.9947) [2022-01-26 08:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1040/1251] eta 0:07:43 lr 0.000034 time 2.5581 (2.1987) loss 3.4821 (3.0220) grad_norm 3.6928 (2.9994) [2022-01-26 08:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1050/1251] eta 0:07:22 lr 0.000034 time 2.6872 (2.2006) loss 2.3924 (3.0203) grad_norm 3.1685 (2.9984) [2022-01-26 08:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1060/1251] eta 0:07:00 lr 0.000034 time 2.3712 (2.2005) loss 3.2306 (3.0206) grad_norm 2.5444 (2.9984) [2022-01-26 08:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1070/1251] eta 0:06:38 lr 0.000034 time 1.5626 (2.1999) loss 
2.1693 (3.0205) grad_norm 2.5924 (2.9973) [2022-01-26 08:33:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1080/1251] eta 0:06:16 lr 0.000034 time 2.3977 (2.1999) loss 3.5021 (3.0214) grad_norm 3.3237 (2.9986) [2022-01-26 08:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1090/1251] eta 0:05:54 lr 0.000034 time 1.9193 (2.2003) loss 2.0560 (3.0233) grad_norm 3.3253 (2.9980) [2022-01-26 08:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1100/1251] eta 0:05:32 lr 0.000034 time 2.1350 (2.2003) loss 2.5882 (3.0242) grad_norm 2.5771 (2.9977) [2022-01-26 08:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1110/1251] eta 0:05:10 lr 0.000034 time 2.1471 (2.2002) loss 3.6214 (3.0231) grad_norm 3.1589 (2.9968) [2022-01-26 08:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1120/1251] eta 0:04:48 lr 0.000034 time 1.6280 (2.1989) loss 2.9517 (3.0207) grad_norm 3.3781 (3.0002) [2022-01-26 08:35:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1130/1251] eta 0:04:26 lr 0.000034 time 2.7939 (2.2004) loss 3.1051 (3.0204) grad_norm 4.0285 (3.0032) [2022-01-26 08:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1140/1251] eta 0:04:04 lr 0.000034 time 1.9781 (2.2009) loss 3.3781 (3.0214) grad_norm 3.3545 (3.0029) [2022-01-26 08:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1150/1251] eta 0:03:42 lr 0.000034 time 2.2244 (2.2006) loss 3.2302 (3.0218) grad_norm 3.1332 (3.0024) [2022-01-26 08:36:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1160/1251] eta 0:03:20 lr 0.000034 time 1.9702 (2.1991) loss 3.1769 (3.0206) grad_norm 3.0287 (3.0031) [2022-01-26 08:36:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1170/1251] eta 0:02:58 lr 0.000034 time 2.1647 (2.1995) loss 3.1888 (3.0224) grad_norm 3.0045 (3.0066) [2022-01-26 08:36:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1180/1251] eta 0:02:36 lr 0.000034 time 1.8375 (2.1984) loss 3.1954 (3.0212) grad_norm 3.1110 (3.0081) [2022-01-26 08:37:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1190/1251] eta 0:02:14 lr 0.000034 time 2.6232 (2.1975) loss 2.5472 (3.0209) grad_norm 3.0008 (3.0103) [2022-01-26 08:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1200/1251] eta 0:01:52 lr 0.000034 time 2.1157 (2.1972) loss 3.3764 (3.0208) grad_norm 3.3725 (3.0097) [2022-01-26 08:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1210/1251] eta 0:01:30 lr 0.000034 time 1.9484 (2.1967) loss 3.5399 (3.0224) grad_norm 2.6810 (3.0094) [2022-01-26 08:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1220/1251] eta 0:01:08 lr 0.000034 time 1.8435 (2.1957) loss 2.3936 (3.0220) grad_norm 2.8871 (3.0083) [2022-01-26 08:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1230/1251] eta 0:00:46 lr 0.000034 time 1.8879 (2.1952) loss 3.2477 (3.0223) grad_norm 2.7547 (3.0085) [2022-01-26 08:38:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1240/1251] eta 0:00:24 lr 0.000034 time 1.2450 (2.1949) loss 3.2957 (3.0216) grad_norm 3.1890 (3.0085) [2022-01-26 08:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [269/300][1250/1251] eta 0:00:02 lr 0.000034 time 1.2158 (2.1898) loss 3.7261 (3.0224) grad_norm 3.0940 (3.0081) [2022-01-26 08:39:14 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 269 training takes 0:45:39 [2022-01-26 08:39:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.533 (18.533) Loss 0.8116 (0.8116) Acc@1 80.957 (80.957) Acc@5 95.801 (95.801) [2022-01-26 08:39:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.817 (3.351) Loss 0.7853 (0.8256) Acc@1 81.836 (80.806) Acc@5 95.508 (95.339) [2022-01-26 08:40:07 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.355 (2.516) Loss 0.7720 (0.8157) Acc@1 82.324 (81.017) Acc@5 96.094 (95.429) [2022-01-26 08:40:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.929 (2.296) Loss 0.9064 (0.8232) Acc@1 79.883 (80.752) Acc@5 94.141 (95.363) [2022-01-26 08:40:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.003 (2.203) Loss 0.7805 (0.8236) Acc@1 82.422 (80.840) Acc@5 94.629 (95.308) [2022-01-26 08:40:52 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.920 Acc@5 95.366 [2022-01-26 08:40:52 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-01-26 08:40:52 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 80.92% [2022-01-26 08:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][0/1251] eta 7:19:36 lr 0.000034 time 21.0842 (21.0842) loss 3.5123 (3.5123) grad_norm 3.1271 (3.1271) [2022-01-26 08:41:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][10/1251] eta 1:26:02 lr 0.000034 time 2.2131 (4.1601) loss 2.5290 (3.0208) grad_norm 2.6744 (2.9324) [2022-01-26 08:42:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][20/1251] eta 1:07:08 lr 0.000034 time 2.2471 (3.2722) loss 3.1268 (3.1039) grad_norm 2.8554 (2.9389) [2022-01-26 08:42:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][30/1251] eta 0:59:23 lr 0.000034 time 1.9662 (2.9185) loss 2.2957 (3.1138) grad_norm 2.8846 (2.9547) [2022-01-26 08:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][40/1251] eta 0:55:19 lr 0.000034 time 2.3870 (2.7411) loss 3.2540 (3.0609) grad_norm 2.8474 (2.9464) [2022-01-26 08:43:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][50/1251] eta 0:53:16 lr 0.000034 time 2.2227 (2.6613) loss 3.2975 (3.0749) grad_norm 2.9140 (2.9190) [2022-01-26 08:43:31 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [270/300][60/1251] eta 0:51:38 lr 0.000034 time 2.6999 (2.6014) loss 2.8060 (3.1044) grad_norm 3.3065 (2.9503) [2022-01-26 08:43:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][70/1251] eta 0:49:48 lr 0.000034 time 1.9218 (2.5307) loss 2.9914 (3.0861) grad_norm 2.6656 (2.9754) [2022-01-26 08:44:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][80/1251] eta 0:48:10 lr 0.000034 time 1.9814 (2.4682) loss 2.3505 (3.0597) grad_norm 2.9597 (2.9591) [2022-01-26 08:44:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][90/1251] eta 0:46:46 lr 0.000034 time 1.9473 (2.4177) loss 2.4174 (3.0613) grad_norm 3.2323 (2.9806) [2022-01-26 08:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][100/1251] eta 0:45:55 lr 0.000034 time 2.4845 (2.3943) loss 3.1831 (3.0596) grad_norm 2.7647 (2.9856) [2022-01-26 08:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][110/1251] eta 0:44:51 lr 0.000034 time 1.3177 (2.3592) loss 2.9617 (3.0505) grad_norm 3.2354 (2.9664) [2022-01-26 08:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][120/1251] eta 0:44:15 lr 0.000034 time 2.2233 (2.3479) loss 3.5446 (3.0444) grad_norm 2.4804 (2.9679) [2022-01-26 08:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][130/1251] eta 0:43:47 lr 0.000034 time 2.1830 (2.3437) loss 2.4691 (3.0196) grad_norm 3.3166 (2.9804) [2022-01-26 08:46:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][140/1251] eta 0:43:25 lr 0.000034 time 2.6719 (2.3456) loss 3.6103 (3.0226) grad_norm 3.2816 (3.0379) [2022-01-26 08:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][150/1251] eta 0:42:59 lr 0.000034 time 1.7821 (2.3425) loss 2.8997 (3.0242) grad_norm 2.8700 (3.0520) [2022-01-26 08:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][160/1251] eta 0:42:36 lr 0.000034 time 2.4240 (2.3432) loss 2.3505 (3.0268) 
grad_norm 3.0745 (3.0492) [2022-01-26 08:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][170/1251] eta 0:41:57 lr 0.000034 time 1.6838 (2.3284) loss 2.7705 (3.0320) grad_norm 3.2059 (3.0462) [2022-01-26 08:47:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][180/1251] eta 0:41:14 lr 0.000034 time 2.2836 (2.3102) loss 3.7800 (3.0407) grad_norm 3.0991 (3.0397) [2022-01-26 08:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][190/1251] eta 0:40:49 lr 0.000034 time 2.2174 (2.3085) loss 3.1815 (3.0286) grad_norm 2.8410 (3.0354) [2022-01-26 08:48:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][200/1251] eta 0:40:27 lr 0.000034 time 3.0793 (2.3096) loss 3.0175 (3.0334) grad_norm 2.8127 (3.0368) [2022-01-26 08:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][210/1251] eta 0:39:48 lr 0.000034 time 1.5131 (2.2948) loss 3.2378 (3.0283) grad_norm 2.7639 (3.0360) [2022-01-26 08:49:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][220/1251] eta 0:39:17 lr 0.000034 time 1.9496 (2.2867) loss 1.9027 (3.0171) grad_norm 3.4449 (3.0425) [2022-01-26 08:49:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][230/1251] eta 0:38:47 lr 0.000034 time 2.2526 (2.2792) loss 2.4611 (3.0132) grad_norm 3.1802 (3.0485) [2022-01-26 08:50:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][240/1251] eta 0:38:23 lr 0.000034 time 2.6399 (2.2780) loss 3.6068 (3.0147) grad_norm 3.6946 (3.0715) [2022-01-26 08:50:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][250/1251] eta 0:37:59 lr 0.000034 time 1.9910 (2.2773) loss 3.1657 (3.0124) grad_norm 3.1798 (3.0671) [2022-01-26 08:50:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][260/1251] eta 0:37:33 lr 0.000034 time 1.9243 (2.2741) loss 3.3764 (3.0149) grad_norm 3.8068 (3.0700) [2022-01-26 08:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [270/300][270/1251] eta 0:37:06 lr 0.000034 time 1.8696 (2.2692) loss 3.1772 (3.0154) grad_norm 2.7606 (3.0622) [2022-01-26 08:51:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][280/1251] eta 0:36:35 lr 0.000034 time 1.8210 (2.2607) loss 3.5066 (3.0230) grad_norm 2.6231 (3.0545) [2022-01-26 08:51:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][290/1251] eta 0:35:59 lr 0.000034 time 2.2152 (2.2474) loss 3.0863 (3.0136) grad_norm 2.9248 (3.0528) [2022-01-26 08:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][300/1251] eta 0:35:33 lr 0.000034 time 2.1654 (2.2430) loss 2.2275 (3.0134) grad_norm 2.7187 (3.0508) [2022-01-26 08:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][310/1251] eta 0:35:11 lr 0.000034 time 2.2446 (2.2441) loss 3.4718 (3.0242) grad_norm 3.4511 (3.0520) [2022-01-26 08:52:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][320/1251] eta 0:34:48 lr 0.000034 time 1.9501 (2.2429) loss 3.3007 (3.0240) grad_norm 3.1497 (3.0677) [2022-01-26 08:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][330/1251] eta 0:34:32 lr 0.000034 time 2.1395 (2.2504) loss 3.4956 (3.0166) grad_norm 3.0179 (3.0742) [2022-01-26 08:53:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][340/1251] eta 0:34:12 lr 0.000034 time 2.3884 (2.2526) loss 3.1030 (3.0121) grad_norm 2.8406 (3.0784) [2022-01-26 08:54:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][350/1251] eta 0:33:51 lr 0.000034 time 2.1273 (2.2543) loss 2.9559 (3.0116) grad_norm 4.0936 (3.0769) [2022-01-26 08:54:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][360/1251] eta 0:33:27 lr 0.000034 time 1.8872 (2.2533) loss 3.1663 (3.0118) grad_norm 3.5002 (3.0788) [2022-01-26 08:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][370/1251] eta 0:33:01 lr 0.000034 time 1.9525 (2.2497) loss 3.3797 (3.0138) 
grad_norm 2.6968 (3.0749) [2022-01-26 08:55:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][380/1251] eta 0:32:30 lr 0.000034 time 1.7106 (2.2389) loss 2.9677 (3.0124) grad_norm 2.8904 (3.0782) [2022-01-26 08:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][390/1251] eta 0:32:01 lr 0.000034 time 1.8594 (2.2323) loss 3.1429 (3.0138) grad_norm 3.3484 (3.0847) [2022-01-26 08:55:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][400/1251] eta 0:31:35 lr 0.000034 time 2.2137 (2.2274) loss 3.0583 (3.0179) grad_norm 3.4052 (3.0868) [2022-01-26 08:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][410/1251] eta 0:31:11 lr 0.000034 time 1.8555 (2.2250) loss 2.9853 (3.0195) grad_norm 3.1062 (3.0846) [2022-01-26 08:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][420/1251] eta 0:30:49 lr 0.000034 time 1.5567 (2.2257) loss 2.0506 (3.0190) grad_norm 2.8001 (3.0844) [2022-01-26 08:56:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][430/1251] eta 0:30:27 lr 0.000034 time 2.2314 (2.2263) loss 3.2833 (3.0224) grad_norm 4.3086 (3.0862) [2022-01-26 08:57:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][440/1251] eta 0:30:05 lr 0.000034 time 2.1751 (2.2264) loss 2.7908 (3.0227) grad_norm 3.2375 (3.0847) [2022-01-26 08:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][450/1251] eta 0:29:43 lr 0.000034 time 1.8906 (2.2263) loss 3.0882 (3.0215) grad_norm 3.0337 (3.0802) [2022-01-26 08:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][460/1251] eta 0:29:22 lr 0.000034 time 1.6373 (2.2287) loss 2.6375 (3.0184) grad_norm 3.4666 (3.0778) [2022-01-26 08:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][470/1251] eta 0:29:01 lr 0.000034 time 2.5718 (2.2295) loss 3.2227 (3.0155) grad_norm 2.7196 (3.0725) [2022-01-26 08:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [270/300][480/1251] eta 0:28:38 lr 0.000034 time 1.6011 (2.2287) loss 2.6112 (3.0098) grad_norm 3.2307 (3.0699) [2022-01-26 08:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][490/1251] eta 0:28:17 lr 0.000034 time 2.4711 (2.2303) loss 2.4942 (3.0138) grad_norm 3.3854 (3.0695) [2022-01-26 08:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][500/1251] eta 0:27:53 lr 0.000034 time 1.6556 (2.2280) loss 2.9045 (3.0145) grad_norm 2.5599 (3.0666) [2022-01-26 08:59:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][510/1251] eta 0:27:29 lr 0.000034 time 1.8127 (2.2258) loss 3.3935 (3.0118) grad_norm 3.3356 (3.0647) [2022-01-26 09:00:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][520/1251] eta 0:27:05 lr 0.000034 time 2.2555 (2.2241) loss 2.0019 (3.0095) grad_norm 3.7768 (3.0660) [2022-01-26 09:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][530/1251] eta 0:26:42 lr 0.000034 time 1.8508 (2.2229) loss 2.9179 (3.0094) grad_norm 3.4544 (3.0633) [2022-01-26 09:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][540/1251] eta 0:26:21 lr 0.000034 time 2.1532 (2.2249) loss 2.7566 (3.0042) grad_norm 3.1388 (3.0621) [2022-01-26 09:01:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][550/1251] eta 0:25:57 lr 0.000034 time 1.8679 (2.2223) loss 3.0104 (3.0034) grad_norm 2.7609 (3.0586) [2022-01-26 09:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][560/1251] eta 0:25:33 lr 0.000034 time 1.8830 (2.2193) loss 3.4221 (3.0071) grad_norm 2.4190 (3.0551) [2022-01-26 09:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][570/1251] eta 0:25:10 lr 0.000034 time 2.0617 (2.2175) loss 3.1894 (3.0089) grad_norm 2.7453 (3.0552) [2022-01-26 09:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][580/1251] eta 0:24:49 lr 0.000033 time 2.4337 (2.2195) loss 2.4242 (3.0114) 
grad_norm 2.9714 (3.0595) [2022-01-26 09:02:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][590/1251] eta 0:24:28 lr 0.000033 time 2.8056 (2.2210) loss 3.3481 (3.0121) grad_norm 2.9374 (3.0598) [2022-01-26 09:03:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][600/1251] eta 0:24:05 lr 0.000033 time 1.9423 (2.2206) loss 3.2737 (3.0130) grad_norm 3.1812 (3.0604) [2022-01-26 09:03:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][610/1251] eta 0:23:43 lr 0.000033 time 1.8928 (2.2212) loss 2.7019 (3.0130) grad_norm 3.6347 (3.0605) [2022-01-26 09:03:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][620/1251] eta 0:23:21 lr 0.000033 time 2.1942 (2.2203) loss 3.4015 (3.0132) grad_norm 3.4842 (3.0639) [2022-01-26 09:04:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][630/1251] eta 0:22:57 lr 0.000033 time 3.0669 (2.2177) loss 3.1084 (3.0164) grad_norm 3.0656 (3.0659) [2022-01-26 09:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][640/1251] eta 0:22:32 lr 0.000033 time 1.9353 (2.2131) loss 3.0323 (3.0216) grad_norm 2.7282 (3.0663) [2022-01-26 09:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][650/1251] eta 0:22:09 lr 0.000033 time 2.2067 (2.2125) loss 3.3621 (3.0215) grad_norm 2.7770 (3.0647) [2022-01-26 09:05:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][660/1251] eta 0:21:48 lr 0.000033 time 1.6771 (2.2136) loss 3.2580 (3.0208) grad_norm 2.5521 (3.0676) [2022-01-26 09:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][670/1251] eta 0:21:27 lr 0.000033 time 4.4225 (2.2162) loss 3.3785 (3.0165) grad_norm 2.9458 (3.0657) [2022-01-26 09:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][680/1251] eta 0:21:04 lr 0.000033 time 2.0100 (2.2145) loss 2.2257 (3.0179) grad_norm 2.9140 (3.0677) [2022-01-26 09:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [270/300][690/1251] eta 0:20:41 lr 0.000033 time 2.1395 (2.2136) loss 3.4659 (3.0213) grad_norm 3.4887 (3.0680) [2022-01-26 09:06:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][700/1251] eta 0:20:19 lr 0.000033 time 1.6120 (2.2135) loss 2.6012 (3.0179) grad_norm 2.9020 (3.0692) [2022-01-26 09:07:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][710/1251] eta 0:19:58 lr 0.000033 time 3.3965 (2.2148) loss 3.2201 (3.0185) grad_norm 2.8847 (3.0694) [2022-01-26 09:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][720/1251] eta 0:19:34 lr 0.000033 time 2.2112 (2.2126) loss 3.3111 (3.0186) grad_norm 2.8450 (3.0687) [2022-01-26 09:07:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][730/1251] eta 0:19:12 lr 0.000033 time 1.9956 (2.2115) loss 2.2444 (3.0179) grad_norm 3.2441 (3.0672) [2022-01-26 09:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][740/1251] eta 0:18:48 lr 0.000033 time 1.9281 (2.2092) loss 3.6064 (3.0195) grad_norm 2.8922 (3.0685) [2022-01-26 09:08:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][750/1251] eta 0:18:26 lr 0.000033 time 2.7762 (2.2092) loss 3.3916 (3.0219) grad_norm 3.3939 (3.0678) [2022-01-26 09:08:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][760/1251] eta 0:18:04 lr 0.000033 time 1.8510 (2.2093) loss 2.8401 (3.0217) grad_norm 2.6983 (3.0703) [2022-01-26 09:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][770/1251] eta 0:17:42 lr 0.000033 time 1.9816 (2.2082) loss 2.2402 (3.0184) grad_norm 2.7391 (3.0710) [2022-01-26 09:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][780/1251] eta 0:17:19 lr 0.000033 time 1.8552 (2.2076) loss 3.2016 (3.0194) grad_norm 3.0221 (3.0696) [2022-01-26 09:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][790/1251] eta 0:16:58 lr 0.000033 time 3.4721 (2.2099) loss 2.9597 (3.0217) 
grad_norm 3.2628 (3.0696) [2022-01-26 09:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][800/1251] eta 0:16:36 lr 0.000033 time 2.2314 (2.2097) loss 3.0856 (3.0239) grad_norm 2.8324 (3.0686) [2022-01-26 09:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][810/1251] eta 0:16:13 lr 0.000033 time 1.5573 (2.2086) loss 3.6301 (3.0266) grad_norm 3.1913 (3.0668) [2022-01-26 09:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][820/1251] eta 0:15:51 lr 0.000033 time 1.8812 (2.2073) loss 3.3866 (3.0274) grad_norm 3.5599 (3.0679) [2022-01-26 09:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][830/1251] eta 0:15:28 lr 0.000033 time 2.5465 (2.2065) loss 3.5856 (3.0278) grad_norm 3.2375 (3.0682) [2022-01-26 09:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][840/1251] eta 0:15:06 lr 0.000033 time 1.6452 (2.2045) loss 3.5152 (3.0277) grad_norm 3.0121 (3.0658) [2022-01-26 09:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][850/1251] eta 0:14:44 lr 0.000033 time 1.9133 (2.2046) loss 2.5353 (3.0239) grad_norm 3.0937 (3.0667) [2022-01-26 09:12:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][860/1251] eta 0:14:21 lr 0.000033 time 1.5787 (2.2037) loss 3.3111 (3.0230) grad_norm 2.5006 (3.0657) [2022-01-26 09:12:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][870/1251] eta 0:13:59 lr 0.000033 time 1.8953 (2.2030) loss 3.1600 (3.0228) grad_norm 2.8369 (3.0627) [2022-01-26 09:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][880/1251] eta 0:13:37 lr 0.000033 time 3.1529 (2.2047) loss 3.3571 (3.0242) grad_norm 3.4902 (3.0672) [2022-01-26 09:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][890/1251] eta 0:13:17 lr 0.000033 time 3.3901 (2.2085) loss 3.5070 (3.0251) grad_norm 3.1867 (3.0648) [2022-01-26 09:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [270/300][900/1251] eta 0:12:55 lr 0.000033 time 1.5973 (2.2101) loss 2.9084 (3.0236) grad_norm 2.4429 (3.0634) [2022-01-26 09:14:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][910/1251] eta 0:12:33 lr 0.000033 time 2.4823 (2.2105) loss 2.1437 (3.0234) grad_norm 3.2221 (3.0631) [2022-01-26 09:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][920/1251] eta 0:12:11 lr 0.000033 time 2.4936 (2.2090) loss 2.5078 (3.0235) grad_norm 2.6626 (3.0621) [2022-01-26 09:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][930/1251] eta 0:11:48 lr 0.000033 time 2.6887 (2.2068) loss 3.2705 (3.0238) grad_norm 3.2948 (3.0633) [2022-01-26 09:15:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][940/1251] eta 0:11:25 lr 0.000033 time 1.9528 (2.2051) loss 2.8985 (3.0251) grad_norm 2.8303 (3.0623) [2022-01-26 09:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][950/1251] eta 0:11:03 lr 0.000033 time 2.2463 (2.2037) loss 2.8181 (3.0261) grad_norm 3.5162 (3.0628) [2022-01-26 09:16:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][960/1251] eta 0:10:41 lr 0.000033 time 2.3112 (2.2028) loss 2.4547 (3.0254) grad_norm 2.6548 (3.0611) [2022-01-26 09:16:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][970/1251] eta 0:10:18 lr 0.000033 time 2.4045 (2.2022) loss 2.5738 (3.0230) grad_norm 2.7216 (3.0597) [2022-01-26 09:16:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][980/1251] eta 0:09:56 lr 0.000033 time 2.4341 (2.2009) loss 2.1293 (3.0224) grad_norm 2.8017 (3.0587) [2022-01-26 09:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][990/1251] eta 0:09:34 lr 0.000033 time 2.4834 (2.2009) loss 3.1592 (3.0232) grad_norm 3.1592 (3.0592) [2022-01-26 09:17:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1000/1251] eta 0:09:12 lr 0.000033 time 1.8643 (2.2013) loss 2.1354 (3.0223) 
grad_norm 2.7718 (3.0571) [2022-01-26 09:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1010/1251] eta 0:08:51 lr 0.000033 time 3.0028 (2.2037) loss 2.8249 (3.0235) grad_norm 2.5889 (3.0565) [2022-01-26 09:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1020/1251] eta 0:08:29 lr 0.000033 time 1.9901 (2.2073) loss 2.3992 (3.0209) grad_norm 2.7410 (3.0564) [2022-01-26 09:18:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1030/1251] eta 0:08:08 lr 0.000033 time 2.1062 (2.2089) loss 3.1057 (3.0233) grad_norm 2.8707 (3.0553) [2022-01-26 09:19:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1040/1251] eta 0:07:45 lr 0.000033 time 2.1277 (2.2083) loss 1.9686 (3.0217) grad_norm 6.1662 (3.0573) [2022-01-26 09:19:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1050/1251] eta 0:07:23 lr 0.000033 time 1.9758 (2.2055) loss 3.2083 (3.0230) grad_norm 2.9482 (3.0569) [2022-01-26 09:19:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1060/1251] eta 0:07:00 lr 0.000033 time 1.8837 (2.2029) loss 3.4675 (3.0241) grad_norm 3.0792 (3.0578) [2022-01-26 09:20:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1070/1251] eta 0:06:38 lr 0.000033 time 2.4243 (2.2022) loss 3.3849 (3.0239) grad_norm 2.9148 (3.0596) [2022-01-26 09:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1080/1251] eta 0:06:16 lr 0.000033 time 2.3096 (2.2016) loss 3.4651 (3.0237) grad_norm 3.3982 (3.0577) [2022-01-26 09:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1090/1251] eta 0:05:54 lr 0.000033 time 1.4816 (2.2004) loss 3.4197 (3.0234) grad_norm 3.0296 (3.0574) [2022-01-26 09:21:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1100/1251] eta 0:05:32 lr 0.000033 time 1.5768 (2.2004) loss 2.9651 (3.0247) grad_norm 3.5666 (3.0578) [2022-01-26 09:21:39 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [270/300][1110/1251] eta 0:05:10 lr 0.000033 time 2.7318 (2.2019) loss 2.9358 (3.0274) grad_norm 2.5496 (3.0565) [2022-01-26 09:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1120/1251] eta 0:04:48 lr 0.000033 time 2.4260 (2.2032) loss 2.7427 (3.0283) grad_norm 2.9347 (3.0556) [2022-01-26 09:22:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1130/1251] eta 0:04:26 lr 0.000033 time 1.5306 (2.2039) loss 3.6280 (3.0275) grad_norm 3.0681 (3.0551) [2022-01-26 09:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1140/1251] eta 0:04:04 lr 0.000033 time 1.4971 (2.2029) loss 2.7673 (3.0273) grad_norm 2.7226 (3.0552) [2022-01-26 09:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1150/1251] eta 0:03:42 lr 0.000033 time 1.8602 (2.2022) loss 2.2157 (3.0260) grad_norm 3.2576 (3.0556) [2022-01-26 09:23:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1160/1251] eta 0:03:20 lr 0.000033 time 2.1137 (2.2021) loss 3.2916 (3.0276) grad_norm 2.9869 (3.0551) [2022-01-26 09:23:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1170/1251] eta 0:02:58 lr 0.000033 time 1.7946 (2.2027) loss 3.3075 (3.0268) grad_norm 3.1259 (3.0549) [2022-01-26 09:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1180/1251] eta 0:02:36 lr 0.000033 time 1.5832 (2.2027) loss 2.0347 (3.0248) grad_norm 2.9185 (3.0534) [2022-01-26 09:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1190/1251] eta 0:02:14 lr 0.000033 time 2.2840 (2.2039) loss 2.0151 (3.0250) grad_norm 2.6779 (3.0533) [2022-01-26 09:25:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1200/1251] eta 0:01:52 lr 0.000033 time 2.5956 (2.2042) loss 2.9663 (3.0263) grad_norm 2.9684 (3.0532) [2022-01-26 09:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1210/1251] eta 0:01:30 lr 0.000033 time 1.5621 (2.2020) loss 
3.0868 (3.0244) grad_norm 3.2811 (3.0524) [2022-01-26 09:25:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1220/1251] eta 0:01:08 lr 0.000033 time 2.0775 (2.1999) loss 3.0021 (3.0239) grad_norm 2.7305 (3.0507) [2022-01-26 09:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1230/1251] eta 0:00:46 lr 0.000033 time 2.2283 (2.1999) loss 2.8088 (3.0250) grad_norm 2.6795 (3.0505) [2022-01-26 09:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1240/1251] eta 0:00:24 lr 0.000033 time 2.2417 (2.2000) loss 3.1325 (3.0256) grad_norm 2.8258 (3.0492) [2022-01-26 09:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [270/300][1250/1251] eta 0:00:02 lr 0.000033 time 1.1533 (2.1946) loss 2.6731 (3.0264) grad_norm 2.7019 (3.0491) [2022-01-26 09:26:38 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 270 training takes 0:45:45 [2022-01-26 09:26:38 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_270 saving...... [2022-01-26 09:26:50 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_270 saved !!! 
[2022-01-26 09:27:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 11.660 (11.660) Loss 0.7942 (0.7942) Acc@1 82.031 (82.031) Acc@5 95.801 (95.801) [2022-01-26 09:27:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 4.334 (3.072) Loss 0.8123 (0.7977) Acc@1 81.152 (81.188) Acc@5 95.410 (95.748) [2022-01-26 09:27:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.633 (2.413) Loss 0.8552 (0.8024) Acc@1 79.004 (81.283) Acc@5 94.922 (95.578) [2022-01-26 09:27:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.661 (2.057) Loss 0.7452 (0.7962) Acc@1 82.422 (81.231) Acc@5 95.996 (95.618) [2022-01-26 09:28:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.478 (2.037) Loss 0.7790 (0.8052) Acc@1 81.934 (81.159) Acc@5 95.410 (95.470) [2022-01-26 09:28:21 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.016 Acc@5 95.420 [2022-01-26 09:28:21 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-01-26 09:28:21 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.02% [2022-01-26 09:28:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][0/1251] eta 7:26:57 lr 0.000033 time 21.4365 (21.4365) loss 3.4832 (3.4832) grad_norm 3.7759 (3.7759) [2022-01-26 09:29:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][10/1251] eta 1:23:53 lr 0.000033 time 1.8495 (4.0557) loss 3.3633 (3.0461) grad_norm 2.9059 (2.9550) [2022-01-26 09:29:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][20/1251] eta 1:03:55 lr 0.000033 time 1.2418 (3.1158) loss 2.7753 (3.0351) grad_norm 2.6026 (2.9204) [2022-01-26 09:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][30/1251] eta 0:56:21 lr 0.000033 time 1.6498 (2.7692) loss 2.4747 (2.9653) grad_norm 2.9180 (2.9527) [2022-01-26 09:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[271/300][40/1251] eta 0:54:17 lr 0.000033 time 5.3797 (2.6897) loss 2.9694 (2.9539) grad_norm 2.7409 (2.9701) [2022-01-26 09:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][50/1251] eta 0:53:08 lr 0.000033 time 2.1478 (2.6551) loss 3.6135 (2.9399) grad_norm 3.3003 (2.9704) [2022-01-26 09:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][60/1251] eta 0:51:34 lr 0.000033 time 1.8160 (2.5979) loss 3.4334 (2.9021) grad_norm 2.8702 (2.9807) [2022-01-26 09:31:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][70/1251] eta 0:49:38 lr 0.000033 time 1.5408 (2.5223) loss 2.9916 (2.9224) grad_norm 2.8303 (3.0147) [2022-01-26 09:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][80/1251] eta 0:48:35 lr 0.000033 time 3.4651 (2.4901) loss 3.3937 (2.9448) grad_norm 2.7312 (3.0142) [2022-01-26 09:32:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][90/1251] eta 0:46:55 lr 0.000033 time 1.5378 (2.4252) loss 3.4409 (2.9570) grad_norm 3.1569 (3.0138) [2022-01-26 09:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][100/1251] eta 0:45:54 lr 0.000033 time 2.2766 (2.3935) loss 3.4788 (2.9755) grad_norm 2.6296 (3.0267) [2022-01-26 09:32:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][110/1251] eta 0:45:03 lr 0.000033 time 1.9084 (2.3691) loss 3.6479 (2.9637) grad_norm 3.1513 (3.0198) [2022-01-26 09:33:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][120/1251] eta 0:44:30 lr 0.000033 time 3.2306 (2.3614) loss 2.6318 (2.9751) grad_norm 3.0345 (3.0140) [2022-01-26 09:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][130/1251] eta 0:43:57 lr 0.000032 time 2.1186 (2.3528) loss 3.1881 (2.9899) grad_norm 3.1339 (3.0085) [2022-01-26 09:33:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][140/1251] eta 0:43:20 lr 0.000032 time 1.5331 (2.3409) loss 3.3675 (3.0026) grad_norm 3.0527 
(3.0065) [2022-01-26 09:34:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][150/1251] eta 0:42:44 lr 0.000032 time 1.8490 (2.3290) loss 2.9808 (3.0076) grad_norm 3.0564 (3.0253) [2022-01-26 09:34:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][160/1251] eta 0:42:20 lr 0.000032 time 2.5748 (2.3287) loss 3.3977 (3.0128) grad_norm 2.9087 (3.0394) [2022-01-26 09:34:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][170/1251] eta 0:41:53 lr 0.000032 time 1.6358 (2.3253) loss 2.9594 (3.0038) grad_norm 3.3661 (3.0531) [2022-01-26 09:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][180/1251] eta 0:41:22 lr 0.000032 time 1.5622 (2.3179) loss 2.4502 (2.9990) grad_norm 2.8816 (3.0444) [2022-01-26 09:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][190/1251] eta 0:40:52 lr 0.000032 time 2.6358 (2.3116) loss 3.3604 (2.9948) grad_norm 3.0735 (3.0461) [2022-01-26 09:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][200/1251] eta 0:40:20 lr 0.000032 time 2.2983 (2.3031) loss 3.3955 (2.9982) grad_norm 2.6612 (3.0401) [2022-01-26 09:36:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][210/1251] eta 0:39:52 lr 0.000032 time 1.8161 (2.2982) loss 3.0297 (3.0015) grad_norm 3.0710 (3.0322) [2022-01-26 09:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][220/1251] eta 0:39:17 lr 0.000032 time 1.7885 (2.2865) loss 2.8282 (2.9960) grad_norm 3.0940 (3.0330) [2022-01-26 09:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][230/1251] eta 0:38:46 lr 0.000032 time 1.8698 (2.2783) loss 3.2803 (2.9993) grad_norm 2.8300 (3.0366) [2022-01-26 09:37:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][240/1251] eta 0:38:20 lr 0.000032 time 2.7979 (2.2751) loss 3.2028 (3.0027) grad_norm 3.0787 (3.0570) [2022-01-26 09:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[271/300][250/1251] eta 0:38:01 lr 0.000032 time 2.5894 (2.2796) loss 3.1330 (2.9986) grad_norm 3.3477 (3.0572) [2022-01-26 09:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][260/1251] eta 0:37:28 lr 0.000032 time 1.5708 (2.2690) loss 3.2285 (2.9914) grad_norm 2.7225 (3.0585) [2022-01-26 09:38:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][270/1251] eta 0:36:58 lr 0.000032 time 2.1391 (2.2619) loss 3.2640 (2.9867) grad_norm 2.8054 (3.0535) [2022-01-26 09:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][280/1251] eta 0:36:34 lr 0.000032 time 3.2350 (2.2600) loss 3.4425 (2.9818) grad_norm 2.7634 (3.0489) [2022-01-26 09:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][290/1251] eta 0:36:14 lr 0.000032 time 2.9884 (2.2632) loss 3.5141 (2.9819) grad_norm 2.6864 (3.0521) [2022-01-26 09:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][300/1251] eta 0:35:55 lr 0.000032 time 2.3418 (2.2662) loss 3.1789 (2.9818) grad_norm 3.8308 (3.0506) [2022-01-26 09:40:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][310/1251] eta 0:35:32 lr 0.000032 time 1.8395 (2.2662) loss 2.0140 (2.9778) grad_norm 3.0357 (3.0492) [2022-01-26 09:40:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][320/1251] eta 0:35:06 lr 0.000032 time 2.6464 (2.2629) loss 2.3763 (2.9820) grad_norm 3.6908 (3.0512) [2022-01-26 09:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][330/1251] eta 0:34:35 lr 0.000032 time 1.9443 (2.2533) loss 2.7107 (2.9859) grad_norm 3.3211 (3.0647) [2022-01-26 09:41:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][340/1251] eta 0:34:07 lr 0.000032 time 2.3788 (2.2470) loss 3.2333 (2.9895) grad_norm 2.8557 (3.0604) [2022-01-26 09:41:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][350/1251] eta 0:33:39 lr 0.000032 time 1.8521 (2.2418) loss 2.1664 (2.9893) grad_norm 
3.2125 (3.0634) [2022-01-26 09:41:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][360/1251] eta 0:33:15 lr 0.000032 time 1.8323 (2.2393) loss 3.0856 (2.9949) grad_norm 2.5454 (3.0608) [2022-01-26 09:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][370/1251] eta 0:32:54 lr 0.000032 time 2.1616 (2.2408) loss 1.9640 (2.9931) grad_norm 3.2991 (3.0592) [2022-01-26 09:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][380/1251] eta 0:32:36 lr 0.000032 time 2.5398 (2.2466) loss 1.9213 (2.9910) grad_norm 3.0377 (3.0559) [2022-01-26 09:42:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][390/1251] eta 0:32:13 lr 0.000032 time 1.4923 (2.2454) loss 3.4939 (2.9890) grad_norm 3.1412 (3.0592) [2022-01-26 09:43:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][400/1251] eta 0:31:48 lr 0.000032 time 2.1975 (2.2423) loss 2.2189 (2.9902) grad_norm 2.8335 (3.0581) [2022-01-26 09:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][410/1251] eta 0:31:23 lr 0.000032 time 1.9481 (2.2392) loss 3.0464 (2.9919) grad_norm 2.8683 (3.0537) [2022-01-26 09:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][420/1251] eta 0:30:56 lr 0.000032 time 2.1845 (2.2339) loss 3.3196 (2.9913) grad_norm 3.1722 (3.0535) [2022-01-26 09:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][430/1251] eta 0:30:29 lr 0.000032 time 1.5869 (2.2283) loss 2.9149 (2.9926) grad_norm 4.0305 (3.0569) [2022-01-26 09:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][440/1251] eta 0:30:04 lr 0.000032 time 2.0328 (2.2256) loss 3.1762 (2.9954) grad_norm 3.2106 (3.0589) [2022-01-26 09:45:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][450/1251] eta 0:29:45 lr 0.000032 time 2.9237 (2.2286) loss 3.0096 (2.9972) grad_norm 2.5627 (3.0591) [2022-01-26 09:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[271/300][460/1251] eta 0:29:23 lr 0.000032 time 2.7598 (2.2297) loss 3.0281 (2.9972) grad_norm 3.0189 (3.0612) [2022-01-26 09:45:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][470/1251] eta 0:29:00 lr 0.000032 time 1.8966 (2.2280) loss 2.5764 (2.9951) grad_norm 3.0611 (3.0596) [2022-01-26 09:46:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][480/1251] eta 0:28:36 lr 0.000032 time 2.1838 (2.2269) loss 3.0217 (2.9937) grad_norm 2.5905 (3.0564) [2022-01-26 09:46:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][490/1251] eta 0:28:14 lr 0.000032 time 2.1823 (2.2262) loss 2.9192 (2.9918) grad_norm 2.5905 (3.0559) [2022-01-26 09:46:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][500/1251] eta 0:27:49 lr 0.000032 time 1.6724 (2.2230) loss 3.3564 (2.9884) grad_norm 3.0394 (3.0563) [2022-01-26 09:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][510/1251] eta 0:27:25 lr 0.000032 time 1.6252 (2.2204) loss 3.5337 (2.9907) grad_norm 3.5461 (3.0553) [2022-01-26 09:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][520/1251] eta 0:27:02 lr 0.000032 time 1.8649 (2.2196) loss 3.6580 (2.9941) grad_norm 2.9255 (3.0540) [2022-01-26 09:47:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][530/1251] eta 0:26:38 lr 0.000032 time 2.2874 (2.2171) loss 3.6911 (2.9987) grad_norm 3.0935 (3.0514) [2022-01-26 09:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][540/1251] eta 0:26:16 lr 0.000032 time 2.2648 (2.2174) loss 3.5029 (2.9993) grad_norm 2.8604 (3.0504) [2022-01-26 09:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][550/1251] eta 0:25:54 lr 0.000032 time 2.2516 (2.2180) loss 2.7943 (2.9983) grad_norm 2.8255 (3.0481) [2022-01-26 09:49:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][560/1251] eta 0:25:33 lr 0.000032 time 2.7396 (2.2191) loss 3.3227 (3.0019) grad_norm 
3.0353 (3.0474) [2022-01-26 09:49:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][570/1251] eta 0:25:11 lr 0.000032 time 1.9310 (2.2188) loss 3.4148 (3.0046) grad_norm 2.9118 (3.0451) [2022-01-26 09:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][580/1251] eta 0:24:47 lr 0.000032 time 1.8920 (2.2169) loss 3.3965 (3.0050) grad_norm 2.7619 (3.0415) [2022-01-26 09:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][590/1251] eta 0:24:22 lr 0.000032 time 1.7769 (2.2128) loss 2.2571 (3.0087) grad_norm 2.9885 (3.0402) [2022-01-26 09:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][600/1251] eta 0:24:00 lr 0.000032 time 2.1665 (2.2124) loss 3.3564 (3.0085) grad_norm 3.1908 (3.0413) [2022-01-26 09:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][610/1251] eta 0:23:38 lr 0.000032 time 2.1280 (2.2126) loss 1.9677 (3.0061) grad_norm 2.8076 (3.0395) [2022-01-26 09:51:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][620/1251] eta 0:23:17 lr 0.000032 time 2.5196 (2.2147) loss 3.1480 (3.0047) grad_norm 2.9061 (3.0381) [2022-01-26 09:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][630/1251] eta 0:22:54 lr 0.000032 time 1.9754 (2.2136) loss 3.4871 (3.0039) grad_norm 3.0002 (3.0376) [2022-01-26 09:52:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][640/1251] eta 0:22:33 lr 0.000032 time 2.5384 (2.2146) loss 2.9236 (3.0064) grad_norm 2.6039 (3.0368) [2022-01-26 09:52:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][650/1251] eta 0:22:10 lr 0.000032 time 1.6140 (2.2133) loss 3.2807 (3.0087) grad_norm 2.7147 (3.0347) [2022-01-26 09:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][660/1251] eta 0:21:48 lr 0.000032 time 2.4233 (2.2138) loss 3.3230 (3.0090) grad_norm 2.6115 (3.0317) [2022-01-26 09:53:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[271/300][670/1251] eta 0:21:24 lr 0.000032 time 1.8553 (2.2113) loss 2.7526 (3.0119) grad_norm 3.0547 (3.0294) [2022-01-26 09:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][680/1251] eta 0:21:02 lr 0.000032 time 2.5599 (2.2116) loss 3.3360 (3.0141) grad_norm 2.7850 (3.0281) [2022-01-26 09:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][690/1251] eta 0:20:40 lr 0.000032 time 1.9026 (2.2109) loss 2.7859 (3.0163) grad_norm 2.8841 (3.0262) [2022-01-26 09:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][700/1251] eta 0:20:18 lr 0.000032 time 1.9376 (2.2122) loss 2.2116 (3.0177) grad_norm 2.7331 (3.0252) [2022-01-26 09:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][710/1251] eta 0:19:56 lr 0.000032 time 2.1287 (2.2119) loss 3.3238 (3.0208) grad_norm 2.9870 (3.0248) [2022-01-26 09:54:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][720/1251] eta 0:19:34 lr 0.000032 time 2.4955 (2.2114) loss 3.5313 (3.0186) grad_norm 3.0153 (3.0248) [2022-01-26 09:55:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][730/1251] eta 0:19:11 lr 0.000032 time 1.6841 (2.2104) loss 2.8181 (3.0198) grad_norm 2.8572 (3.0272) [2022-01-26 09:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][740/1251] eta 0:18:49 lr 0.000032 time 1.8022 (2.2104) loss 2.4847 (3.0171) grad_norm 2.9869 (3.0293) [2022-01-26 09:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][750/1251] eta 0:18:27 lr 0.000032 time 2.6829 (2.2104) loss 2.9010 (3.0143) grad_norm 2.7384 (3.0295) [2022-01-26 09:56:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][760/1251] eta 0:18:05 lr 0.000032 time 2.4788 (2.2102) loss 2.5985 (3.0133) grad_norm 2.6740 (3.0260) [2022-01-26 09:56:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][770/1251] eta 0:17:42 lr 0.000032 time 1.6881 (2.2095) loss 3.4923 (3.0141) grad_norm 
3.1296 (3.0244) [2022-01-26 09:57:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][780/1251] eta 0:17:20 lr 0.000032 time 1.9539 (2.2097) loss 3.0787 (3.0140) grad_norm 2.7438 (3.0286) [2022-01-26 09:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][790/1251] eta 0:16:59 lr 0.000032 time 2.6385 (2.2104) loss 2.9110 (3.0135) grad_norm 3.6587 (3.0286) [2022-01-26 09:57:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][800/1251] eta 0:16:36 lr 0.000032 time 2.8100 (2.2094) loss 2.6435 (3.0135) grad_norm 3.4802 (3.0276) [2022-01-26 09:58:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][810/1251] eta 0:16:13 lr 0.000032 time 2.0589 (2.2073) loss 2.6058 (3.0114) grad_norm 2.7032 (3.0273) [2022-01-26 09:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][820/1251] eta 0:15:51 lr 0.000032 time 2.6035 (2.2076) loss 3.2061 (3.0158) grad_norm 3.0366 (3.0280) [2022-01-26 09:58:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][830/1251] eta 0:15:29 lr 0.000032 time 2.0004 (2.2080) loss 2.1065 (3.0154) grad_norm 3.6373 (3.0274) [2022-01-26 09:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][840/1251] eta 0:15:08 lr 0.000032 time 3.2231 (2.2094) loss 3.1462 (3.0155) grad_norm 2.7876 (3.0265) [2022-01-26 09:59:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][850/1251] eta 0:14:45 lr 0.000032 time 2.5317 (2.2085) loss 2.5631 (3.0150) grad_norm 2.8740 (3.0253) [2022-01-26 10:00:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][860/1251] eta 0:14:23 lr 0.000032 time 1.9492 (2.2075) loss 2.2774 (3.0130) grad_norm 2.5659 (3.0250) [2022-01-26 10:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][870/1251] eta 0:14:00 lr 0.000032 time 1.8856 (2.2062) loss 2.8751 (3.0096) grad_norm 3.0133 (3.0259) [2022-01-26 10:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[271/300][880/1251] eta 0:13:38 lr 0.000032 time 2.2070 (2.2050) loss 2.7164 (3.0085) grad_norm 3.4124 (3.0255) [2022-01-26 10:01:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][890/1251] eta 0:13:16 lr 0.000032 time 2.2102 (2.2055) loss 3.0005 (3.0095) grad_norm 2.8643 (3.0266) [2022-01-26 10:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][900/1251] eta 0:12:54 lr 0.000032 time 2.9772 (2.2075) loss 3.3702 (3.0109) grad_norm 2.7557 (3.0277) [2022-01-26 10:01:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][910/1251] eta 0:12:32 lr 0.000032 time 2.0906 (2.2071) loss 1.8532 (3.0078) grad_norm 2.5663 (3.0287) [2022-01-26 10:02:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][920/1251] eta 0:12:10 lr 0.000032 time 2.7776 (2.2057) loss 3.3152 (3.0081) grad_norm 2.6584 (3.0292) [2022-01-26 10:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][930/1251] eta 0:11:47 lr 0.000032 time 1.6868 (2.2032) loss 2.9247 (3.0062) grad_norm 3.2430 (3.0309) [2022-01-26 10:02:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][940/1251] eta 0:11:24 lr 0.000032 time 1.9845 (2.2015) loss 3.0633 (3.0057) grad_norm 3.5563 (3.0327) [2022-01-26 10:03:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][950/1251] eta 0:11:02 lr 0.000031 time 1.9130 (2.2002) loss 3.2836 (3.0074) grad_norm 2.9253 (3.0322) [2022-01-26 10:03:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][960/1251] eta 0:10:40 lr 0.000031 time 2.9500 (2.2017) loss 2.3398 (3.0051) grad_norm 2.6382 (3.0303) [2022-01-26 10:03:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][970/1251] eta 0:10:18 lr 0.000031 time 1.7381 (2.2008) loss 3.4252 (3.0064) grad_norm 2.7489 (3.0318) [2022-01-26 10:04:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][980/1251] eta 0:09:56 lr 0.000031 time 2.5508 (2.2000) loss 2.8041 (3.0060) grad_norm 
3.1140 (3.0320) [2022-01-26 10:04:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][990/1251] eta 0:09:34 lr 0.000031 time 2.2645 (2.1997) loss 2.5465 (3.0043) grad_norm 3.0095 (3.0308) [2022-01-26 10:05:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1000/1251] eta 0:09:12 lr 0.000031 time 3.3412 (2.2027) loss 2.4221 (3.0016) grad_norm 2.4987 (3.0293) [2022-01-26 10:05:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1010/1251] eta 0:08:51 lr 0.000031 time 1.6242 (2.2044) loss 3.2396 (3.0016) grad_norm 2.5273 (3.0319) [2022-01-26 10:05:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1020/1251] eta 0:08:29 lr 0.000031 time 1.9048 (2.2041) loss 3.0812 (3.0015) grad_norm 2.7352 (3.0327) [2022-01-26 10:06:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1030/1251] eta 0:08:06 lr 0.000031 time 2.2042 (2.2028) loss 2.4546 (2.9992) grad_norm 3.2759 (3.0328) [2022-01-26 10:06:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1040/1251] eta 0:07:44 lr 0.000031 time 1.9036 (2.2012) loss 3.3571 (2.9990) grad_norm 4.2413 (3.0340) [2022-01-26 10:06:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1050/1251] eta 0:07:22 lr 0.000031 time 1.9173 (2.1995) loss 3.3208 (3.0012) grad_norm 3.0274 (3.0336) [2022-01-26 10:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1060/1251] eta 0:07:00 lr 0.000031 time 2.6243 (2.1992) loss 2.0648 (3.0005) grad_norm 2.8807 (3.0326) [2022-01-26 10:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1070/1251] eta 0:06:38 lr 0.000031 time 2.5143 (2.1995) loss 3.0209 (2.9995) grad_norm 3.5139 (3.0323) [2022-01-26 10:07:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1080/1251] eta 0:06:16 lr 0.000031 time 2.1999 (2.1996) loss 3.1986 (3.0010) grad_norm 3.7273 (3.0329) [2022-01-26 10:08:22 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [271/300][1090/1251] eta 0:05:54 lr 0.000031 time 2.5077 (2.2011) loss 3.9130 (3.0021) grad_norm 2.9324 (3.0329) [2022-01-26 10:08:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1100/1251] eta 0:05:32 lr 0.000031 time 2.8864 (2.2022) loss 3.9543 (3.0047) grad_norm 3.1208 (3.0324) [2022-01-26 10:09:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1110/1251] eta 0:05:10 lr 0.000031 time 2.0718 (2.2019) loss 3.0862 (3.0051) grad_norm 3.0944 (3.0325) [2022-01-26 10:09:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1120/1251] eta 0:04:48 lr 0.000031 time 1.8893 (2.2001) loss 3.1302 (3.0047) grad_norm 3.1402 (3.0329) [2022-01-26 10:09:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1130/1251] eta 0:04:26 lr 0.000031 time 2.1982 (2.2005) loss 3.6515 (3.0065) grad_norm 2.8901 (3.0327) [2022-01-26 10:10:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1140/1251] eta 0:04:04 lr 0.000031 time 2.5144 (2.2004) loss 3.4255 (3.0067) grad_norm 2.6573 (3.0320) [2022-01-26 10:10:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1150/1251] eta 0:03:42 lr 0.000031 time 1.9005 (2.1997) loss 2.8575 (3.0040) grad_norm 2.7298 (3.0316) [2022-01-26 10:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1160/1251] eta 0:03:20 lr 0.000031 time 2.6899 (2.1998) loss 2.4093 (3.0030) grad_norm 2.5639 (3.0308) [2022-01-26 10:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1170/1251] eta 0:02:58 lr 0.000031 time 1.8257 (2.1993) loss 3.2976 (3.0035) grad_norm 2.7322 (3.0298) [2022-01-26 10:11:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1180/1251] eta 0:02:36 lr 0.000031 time 1.9244 (2.1984) loss 3.3020 (3.0018) grad_norm 2.7419 (3.0303) [2022-01-26 10:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1190/1251] eta 0:02:14 lr 0.000031 time 1.9187 (2.1979) loss 3.4190 
(3.0023) grad_norm 2.6979 (3.0303) [2022-01-26 10:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1200/1251] eta 0:01:52 lr 0.000031 time 2.8858 (2.1981) loss 3.5080 (3.0030) grad_norm 2.9014 (3.0297) [2022-01-26 10:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1210/1251] eta 0:01:30 lr 0.000031 time 1.8265 (2.1978) loss 2.4305 (3.0015) grad_norm 2.5846 (3.0284) [2022-01-26 10:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1220/1251] eta 0:01:08 lr 0.000031 time 2.0777 (2.1975) loss 3.2106 (3.0030) grad_norm 3.0474 (3.0273) [2022-01-26 10:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1230/1251] eta 0:00:46 lr 0.000031 time 1.5845 (2.1975) loss 3.0047 (3.0036) grad_norm 3.0506 (3.0278) [2022-01-26 10:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1240/1251] eta 0:00:24 lr 0.000031 time 1.8027 (2.1965) loss 3.3464 (3.0008) grad_norm 4.4396 (3.0303) [2022-01-26 10:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [271/300][1250/1251] eta 0:00:02 lr 0.000031 time 1.1713 (2.1902) loss 3.3855 (3.0013) grad_norm 3.1891 (3.0312) [2022-01-26 10:14:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 271 training takes 0:45:40 [2022-01-26 10:14:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.038 (19.038) Loss 0.7767 (0.7767) Acc@1 81.348 (81.348) Acc@5 95.508 (95.508) [2022-01-26 10:14:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.289 (3.457) Loss 0.8405 (0.8004) Acc@1 80.664 (81.312) Acc@5 95.020 (95.579) [2022-01-26 10:14:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.616 (2.533) Loss 0.8035 (0.8059) Acc@1 81.445 (81.078) Acc@5 95.410 (95.536) [2022-01-26 10:15:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.621 (2.337) Loss 0.8222 (0.8044) Acc@1 80.273 (81.159) Acc@5 95.703 (95.442) [2022-01-26 10:15:31 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.532 (2.196) Loss 0.7320 (0.8064) Acc@1 82.324 (81.098) Acc@5 95.996 (95.424) [2022-01-26 10:15:38 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.074 Acc@5 95.382 [2022-01-26 10:15:38 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 10:15:38 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.07% [2022-01-26 10:16:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][0/1251] eta 7:26:27 lr 0.000031 time 21.4129 (21.4129) loss 3.1754 (3.1754) grad_norm 3.5541 (3.5541) [2022-01-26 10:16:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][10/1251] eta 1:23:52 lr 0.000031 time 2.5489 (4.0548) loss 3.3228 (3.1840) grad_norm 2.3596 (3.1280) [2022-01-26 10:16:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][20/1251] eta 1:03:32 lr 0.000031 time 1.3011 (3.0968) loss 2.9206 (3.1239) grad_norm 2.8136 (3.1250) [2022-01-26 10:17:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][30/1251] eta 0:56:36 lr 0.000031 time 1.7896 (2.7821) loss 3.6335 (3.0365) grad_norm 3.5069 (3.1180) [2022-01-26 10:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][40/1251] eta 0:54:59 lr 0.000031 time 4.0052 (2.7247) loss 3.3973 (3.0675) grad_norm 2.6571 (3.1483) [2022-01-26 10:17:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][50/1251] eta 0:52:03 lr 0.000031 time 1.7632 (2.6004) loss 2.4626 (3.0136) grad_norm 2.7489 (3.1049) [2022-01-26 10:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][60/1251] eta 0:50:04 lr 0.000031 time 2.0884 (2.5229) loss 2.0180 (3.0120) grad_norm 2.5430 (3.0834) [2022-01-26 10:18:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][70/1251] eta 0:48:26 lr 0.000031 time 1.5081 (2.4614) loss 3.3495 (3.0257) grad_norm 3.5858 (3.1098) [2022-01-26 10:18:57 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][80/1251] eta 0:47:52 lr 0.000031 time 3.3307 (2.4533) loss 3.3095 (3.0124) grad_norm 3.2500 (3.1151) [2022-01-26 10:19:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][90/1251] eta 0:47:10 lr 0.000031 time 1.9326 (2.4383) loss 3.1443 (2.9857) grad_norm 3.2550 (3.1510) [2022-01-26 10:19:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][100/1251] eta 0:46:26 lr 0.000031 time 2.4374 (2.4209) loss 3.6388 (3.0095) grad_norm 3.0660 (3.1511) [2022-01-26 10:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][110/1251] eta 0:45:36 lr 0.000031 time 1.6918 (2.3981) loss 2.5987 (3.0143) grad_norm 2.6252 (3.1272) [2022-01-26 10:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][120/1251] eta 0:44:39 lr 0.000031 time 2.3399 (2.3691) loss 3.1519 (3.0301) grad_norm 2.9600 (3.1132) [2022-01-26 10:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][130/1251] eta 0:43:44 lr 0.000031 time 1.8778 (2.3411) loss 3.3987 (3.0532) grad_norm 3.3589 (3.1204) [2022-01-26 10:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][140/1251] eta 0:42:59 lr 0.000031 time 2.7429 (2.3217) loss 3.4430 (3.0358) grad_norm 2.8431 (3.1147) [2022-01-26 10:21:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][150/1251] eta 0:42:19 lr 0.000031 time 2.1411 (2.3069) loss 3.4608 (3.0289) grad_norm 2.9713 (3.1145) [2022-01-26 10:21:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][160/1251] eta 0:41:47 lr 0.000031 time 2.8759 (2.2980) loss 3.1350 (3.0297) grad_norm 2.7860 (3.1057) [2022-01-26 10:22:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][170/1251] eta 0:41:11 lr 0.000031 time 1.7460 (2.2866) loss 2.6155 (3.0340) grad_norm 3.1938 (3.1076) [2022-01-26 10:22:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][180/1251] eta 0:40:43 lr 0.000031 
time 2.1968 (2.2817) loss 2.4527 (3.0317) grad_norm 3.4015 (3.0996) [2022-01-26 10:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][190/1251] eta 0:40:26 lr 0.000031 time 2.2959 (2.2867) loss 2.8820 (3.0222) grad_norm 2.5057 (3.0903) [2022-01-26 10:23:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][200/1251] eta 0:40:01 lr 0.000031 time 2.8011 (2.2852) loss 2.9231 (3.0251) grad_norm 3.1078 (3.0879) [2022-01-26 10:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][210/1251] eta 0:39:39 lr 0.000031 time 1.5578 (2.2862) loss 2.8039 (3.0366) grad_norm 2.7210 (3.0849) [2022-01-26 10:24:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][220/1251] eta 0:39:15 lr 0.000031 time 2.5542 (2.2845) loss 3.1898 (3.0394) grad_norm 3.1225 (3.0827) [2022-01-26 10:24:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][230/1251] eta 0:38:50 lr 0.000031 time 1.8630 (2.2824) loss 3.5683 (3.0459) grad_norm 4.0807 (3.0859) [2022-01-26 10:24:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][240/1251] eta 0:38:19 lr 0.000031 time 2.6879 (2.2747) loss 3.2058 (3.0506) grad_norm 3.1747 (3.0843) [2022-01-26 10:25:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][250/1251] eta 0:37:45 lr 0.000031 time 1.8502 (2.2634) loss 2.3226 (3.0436) grad_norm 2.5133 (3.0788) [2022-01-26 10:25:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][260/1251] eta 0:37:21 lr 0.000031 time 2.1907 (2.2616) loss 3.4729 (3.0471) grad_norm 2.6094 (3.0738) [2022-01-26 10:25:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][270/1251] eta 0:36:56 lr 0.000031 time 2.3229 (2.2599) loss 3.2340 (3.0415) grad_norm 3.6827 (3.0782) [2022-01-26 10:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][280/1251] eta 0:36:28 lr 0.000031 time 2.4662 (2.2534) loss 3.5510 (3.0421) grad_norm 2.6971 (3.0705) [2022-01-26 10:26:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][290/1251] eta 0:35:59 lr 0.000031 time 2.1380 (2.2467) loss 2.7696 (3.0354) grad_norm 2.9929 (3.0685) [2022-01-26 10:26:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][300/1251] eta 0:35:32 lr 0.000031 time 2.4860 (2.2420) loss 3.3150 (3.0301) grad_norm 3.5631 (3.0649) [2022-01-26 10:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][310/1251] eta 0:35:06 lr 0.000031 time 2.0682 (2.2381) loss 3.2252 (3.0273) grad_norm 3.0169 (3.0607) [2022-01-26 10:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][320/1251] eta 0:34:44 lr 0.000031 time 2.7999 (2.2389) loss 3.3369 (3.0322) grad_norm 3.1175 (3.0517) [2022-01-26 10:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][330/1251] eta 0:34:22 lr 0.000031 time 2.1181 (2.2396) loss 3.1134 (3.0328) grad_norm 3.5007 (3.0578) [2022-01-26 10:28:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][340/1251] eta 0:33:57 lr 0.000031 time 2.3601 (2.2369) loss 3.3433 (3.0338) grad_norm 2.9043 (3.0532) [2022-01-26 10:28:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][350/1251] eta 0:33:34 lr 0.000031 time 1.5980 (2.2358) loss 3.6408 (3.0413) grad_norm 3.3005 (3.0500) [2022-01-26 10:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][360/1251] eta 0:33:22 lr 0.000031 time 3.6016 (2.2475) loss 3.3251 (3.0456) grad_norm 2.8786 (3.0519) [2022-01-26 10:29:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][370/1251] eta 0:33:01 lr 0.000031 time 1.9145 (2.2487) loss 3.5577 (3.0494) grad_norm 3.2860 (3.0502) [2022-01-26 10:29:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][380/1251] eta 0:32:38 lr 0.000031 time 1.8038 (2.2485) loss 2.9893 (3.0500) grad_norm 3.4979 (3.0575) [2022-01-26 10:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][390/1251] eta 0:32:10 lr 
0.000031 time 1.7144 (2.2427) loss 3.4343 (3.0409) grad_norm 3.3550 (3.0641) [2022-01-26 10:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][400/1251] eta 0:31:43 lr 0.000031 time 2.6695 (2.2371) loss 2.9719 (3.0367) grad_norm 3.1268 (3.0622) [2022-01-26 10:30:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][410/1251] eta 0:31:20 lr 0.000031 time 2.7982 (2.2357) loss 2.2647 (3.0349) grad_norm 3.3391 (3.0637) [2022-01-26 10:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][420/1251] eta 0:30:55 lr 0.000031 time 2.2658 (2.2326) loss 2.6163 (3.0362) grad_norm 2.7374 (3.0669) [2022-01-26 10:31:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][430/1251] eta 0:30:31 lr 0.000031 time 1.8347 (2.2309) loss 2.7052 (3.0305) grad_norm 2.8172 (3.0642) [2022-01-26 10:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][440/1251] eta 0:30:07 lr 0.000031 time 1.8739 (2.2283) loss 2.0398 (3.0295) grad_norm 3.8646 (3.0618) [2022-01-26 10:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][450/1251] eta 0:29:43 lr 0.000031 time 2.2342 (2.2266) loss 3.2156 (3.0339) grad_norm 3.1152 (3.0576) [2022-01-26 10:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][460/1251] eta 0:29:21 lr 0.000031 time 2.1922 (2.2274) loss 3.5041 (3.0316) grad_norm 3.3280 (3.0596) [2022-01-26 10:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][470/1251] eta 0:28:59 lr 0.000031 time 1.6422 (2.2274) loss 2.3644 (3.0333) grad_norm 2.7502 (3.0551) [2022-01-26 10:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][480/1251] eta 0:28:36 lr 0.000031 time 1.8765 (2.2265) loss 2.9319 (3.0314) grad_norm 3.7078 (3.0558) [2022-01-26 10:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][490/1251] eta 0:28:14 lr 0.000031 time 2.3239 (2.2263) loss 3.1553 (3.0368) grad_norm 2.8513 (3.0527) [2022-01-26 10:34:14 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][500/1251] eta 0:27:52 lr 0.000031 time 2.0071 (2.2276) loss 2.7811 (3.0396) grad_norm 2.4292 (3.0520) [2022-01-26 10:34:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][510/1251] eta 0:27:30 lr 0.000031 time 1.7787 (2.2277) loss 2.8067 (3.0378) grad_norm 3.1323 (3.0540) [2022-01-26 10:34:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][520/1251] eta 0:27:05 lr 0.000031 time 2.1556 (2.2239) loss 3.1825 (3.0391) grad_norm 3.2809 (3.0538) [2022-01-26 10:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][530/1251] eta 0:26:41 lr 0.000030 time 2.0109 (2.2219) loss 2.4630 (3.0375) grad_norm 2.8438 (3.0525) [2022-01-26 10:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][540/1251] eta 0:26:19 lr 0.000030 time 2.1760 (2.2218) loss 3.4483 (3.0365) grad_norm 2.7773 (3.0503) [2022-01-26 10:36:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][550/1251] eta 0:25:55 lr 0.000030 time 1.9166 (2.2196) loss 2.8330 (3.0353) grad_norm 3.0264 (3.0490) [2022-01-26 10:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][560/1251] eta 0:25:31 lr 0.000030 time 1.9524 (2.2167) loss 3.0693 (3.0355) grad_norm 3.1217 (3.0503) [2022-01-26 10:36:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][570/1251] eta 0:25:08 lr 0.000030 time 2.0142 (2.2152) loss 3.2680 (3.0348) grad_norm 3.4154 (3.0511) [2022-01-26 10:37:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][580/1251] eta 0:24:46 lr 0.000030 time 2.5792 (2.2151) loss 3.1935 (3.0364) grad_norm 2.9969 (3.0503) [2022-01-26 10:37:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][590/1251] eta 0:24:24 lr 0.000030 time 1.9009 (2.2163) loss 3.0557 (3.0374) grad_norm 3.3634 (3.0506) [2022-01-26 10:37:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][600/1251] eta 0:24:04 lr 
0.000030 time 2.7754 (2.2183) loss 2.2013 (3.0344) grad_norm 3.4224 (3.0516) [2022-01-26 10:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][610/1251] eta 0:23:43 lr 0.000030 time 2.2274 (2.2201) loss 3.6091 (3.0344) grad_norm 2.7095 (3.0513) [2022-01-26 10:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][620/1251] eta 0:23:20 lr 0.000030 time 2.2082 (2.2201) loss 3.1649 (3.0333) grad_norm 2.9404 (3.0526) [2022-01-26 10:38:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][630/1251] eta 0:22:57 lr 0.000030 time 1.8799 (2.2188) loss 3.6089 (3.0329) grad_norm 3.7707 (3.0565) [2022-01-26 10:39:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][640/1251] eta 0:22:35 lr 0.000030 time 2.8701 (2.2177) loss 3.0419 (3.0348) grad_norm 2.7980 (3.0570) [2022-01-26 10:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][650/1251] eta 0:22:11 lr 0.000030 time 2.0577 (2.2156) loss 2.2012 (3.0323) grad_norm 2.8070 (3.0566) [2022-01-26 10:40:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][660/1251] eta 0:21:49 lr 0.000030 time 2.9137 (2.2151) loss 2.8441 (3.0299) grad_norm 2.7664 (3.0609) [2022-01-26 10:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][670/1251] eta 0:21:29 lr 0.000030 time 3.0208 (2.2194) loss 2.8017 (3.0298) grad_norm 2.9177 (3.0619) [2022-01-26 10:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][680/1251] eta 0:21:08 lr 0.000030 time 2.5357 (2.2213) loss 3.0690 (3.0280) grad_norm 4.4120 (3.0657) [2022-01-26 10:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][690/1251] eta 0:20:45 lr 0.000030 time 2.0063 (2.2205) loss 2.7784 (3.0281) grad_norm 3.1419 (3.0638) [2022-01-26 10:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][700/1251] eta 0:20:23 lr 0.000030 time 2.0043 (2.2197) loss 2.6712 (3.0261) grad_norm 3.6595 (3.0640) [2022-01-26 10:41:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][710/1251] eta 0:19:57 lr 0.000030 time 1.6196 (2.2143) loss 3.1648 (3.0231) grad_norm 3.4856 (3.0631) [2022-01-26 10:42:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][720/1251] eta 0:19:33 lr 0.000030 time 1.7929 (2.2106) loss 3.2034 (3.0223) grad_norm 2.5894 (3.0601) [2022-01-26 10:42:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][730/1251] eta 0:19:10 lr 0.000030 time 1.9007 (2.2084) loss 3.4104 (3.0202) grad_norm 2.9265 (3.0605) [2022-01-26 10:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][740/1251] eta 0:18:48 lr 0.000030 time 2.2586 (2.2075) loss 2.7429 (3.0224) grad_norm 3.2881 (3.0607) [2022-01-26 10:43:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][750/1251] eta 0:18:26 lr 0.000030 time 2.4965 (2.2081) loss 2.7612 (3.0205) grad_norm 3.5018 (3.0606) [2022-01-26 10:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][760/1251] eta 0:18:05 lr 0.000030 time 2.5231 (2.2101) loss 2.4764 (3.0223) grad_norm 2.6573 (3.0592) [2022-01-26 10:44:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][770/1251] eta 0:17:43 lr 0.000030 time 1.8977 (2.2101) loss 3.4059 (3.0201) grad_norm 2.8193 (3.0560) [2022-01-26 10:44:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][780/1251] eta 0:17:20 lr 0.000030 time 2.5616 (2.2101) loss 2.9582 (3.0190) grad_norm 2.8936 (3.0591) [2022-01-26 10:44:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][790/1251] eta 0:16:59 lr 0.000030 time 2.8632 (2.2117) loss 2.5842 (3.0174) grad_norm 3.8341 (3.0609) [2022-01-26 10:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][800/1251] eta 0:16:37 lr 0.000030 time 1.5959 (2.2117) loss 3.3989 (3.0166) grad_norm 2.5487 (3.0606) [2022-01-26 10:45:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][810/1251] eta 0:16:15 lr 
0.000030 time 2.2453 (2.2123) loss 2.2887 (3.0152) grad_norm 3.4019 (3.0600) [2022-01-26 10:45:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][820/1251] eta 0:15:53 lr 0.000030 time 2.4963 (2.2117) loss 2.9175 (3.0131) grad_norm 3.5734 (3.0634) [2022-01-26 10:46:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][830/1251] eta 0:15:30 lr 0.000030 time 1.9459 (2.2100) loss 2.5789 (3.0136) grad_norm 3.3287 (3.0635) [2022-01-26 10:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][840/1251] eta 0:15:08 lr 0.000030 time 1.8842 (2.2107) loss 3.3700 (3.0133) grad_norm 2.8022 (3.0625) [2022-01-26 10:46:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][850/1251] eta 0:14:46 lr 0.000030 time 1.8426 (2.2103) loss 3.4589 (3.0153) grad_norm 2.8256 (3.0611) [2022-01-26 10:47:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][860/1251] eta 0:14:24 lr 0.000030 time 3.0042 (2.2108) loss 3.3185 (3.0175) grad_norm 2.9731 (3.0591) [2022-01-26 10:47:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][870/1251] eta 0:14:01 lr 0.000030 time 2.0193 (2.2099) loss 3.5295 (3.0186) grad_norm 3.4152 (3.0624) [2022-01-26 10:48:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][880/1251] eta 0:13:39 lr 0.000030 time 1.9045 (2.2100) loss 3.2109 (3.0200) grad_norm 2.8802 (3.0669) [2022-01-26 10:48:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][890/1251] eta 0:13:17 lr 0.000030 time 1.9540 (2.2089) loss 2.9569 (3.0147) grad_norm 2.9796 (3.0661) [2022-01-26 10:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][900/1251] eta 0:12:55 lr 0.000030 time 3.2570 (2.2088) loss 3.2835 (3.0139) grad_norm 2.9988 (3.0660) [2022-01-26 10:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][910/1251] eta 0:12:32 lr 0.000030 time 1.8361 (2.2078) loss 3.1370 (3.0151) grad_norm 2.7372 (3.0642) [2022-01-26 10:49:31 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][920/1251] eta 0:12:10 lr 0.000030 time 1.8537 (2.2071) loss 3.0673 (3.0127) grad_norm 2.9768 (3.0639) [2022-01-26 10:49:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][930/1251] eta 0:11:48 lr 0.000030 time 1.5973 (2.2061) loss 3.2263 (3.0123) grad_norm 2.7266 (3.0632) [2022-01-26 10:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][940/1251] eta 0:11:26 lr 0.000030 time 2.2027 (2.2059) loss 3.4170 (3.0124) grad_norm 3.2065 (3.0649) [2022-01-26 10:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][950/1251] eta 0:11:03 lr 0.000030 time 1.9596 (2.2058) loss 2.5941 (3.0139) grad_norm 3.2121 (3.0645) [2022-01-26 10:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][960/1251] eta 0:10:41 lr 0.000030 time 2.1638 (2.2059) loss 3.6657 (3.0123) grad_norm 2.7404 (3.0618) [2022-01-26 10:51:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][970/1251] eta 0:10:19 lr 0.000030 time 1.8115 (2.2058) loss 3.4159 (3.0105) grad_norm 3.2395 (3.0607) [2022-01-26 10:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][980/1251] eta 0:09:58 lr 0.000030 time 2.4194 (2.2068) loss 3.5744 (3.0122) grad_norm 2.9284 (3.0593) [2022-01-26 10:52:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][990/1251] eta 0:09:35 lr 0.000030 time 2.1599 (2.2066) loss 3.0763 (3.0121) grad_norm 2.8744 (3.0622) [2022-01-26 10:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1000/1251] eta 0:09:14 lr 0.000030 time 1.9315 (2.2074) loss 1.7087 (3.0115) grad_norm 3.0043 (3.0611) [2022-01-26 10:52:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1010/1251] eta 0:08:51 lr 0.000030 time 2.2684 (2.2054) loss 2.1484 (3.0094) grad_norm 2.9494 (3.0607) [2022-01-26 10:53:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1020/1251] eta 0:08:29 lr 
0.000030 time 1.9878 (2.2053) loss 3.5070 (3.0105) grad_norm 5.4885 (3.0628) [2022-01-26 10:53:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1030/1251] eta 0:08:07 lr 0.000030 time 2.1916 (2.2039) loss 3.1337 (3.0098) grad_norm 3.1799 (3.0621) [2022-01-26 10:53:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1040/1251] eta 0:07:45 lr 0.000030 time 1.9005 (2.2044) loss 2.7344 (3.0114) grad_norm 3.1891 (3.0654) [2022-01-26 10:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1050/1251] eta 0:07:22 lr 0.000030 time 1.8671 (2.2034) loss 2.7803 (3.0122) grad_norm 3.0461 (3.0654) [2022-01-26 10:54:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1060/1251] eta 0:07:00 lr 0.000030 time 2.1723 (2.2032) loss 3.4621 (3.0111) grad_norm 3.0650 (3.0661) [2022-01-26 10:54:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1070/1251] eta 0:06:38 lr 0.000030 time 1.7750 (2.2030) loss 3.5973 (3.0099) grad_norm 2.6970 (3.0636) [2022-01-26 10:55:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1080/1251] eta 0:06:16 lr 0.000030 time 1.9488 (2.2040) loss 3.0205 (3.0131) grad_norm 3.3574 (3.0632) [2022-01-26 10:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1090/1251] eta 0:05:54 lr 0.000030 time 2.1055 (2.2036) loss 1.9274 (3.0126) grad_norm 3.0031 (3.0630) [2022-01-26 10:56:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1100/1251] eta 0:05:32 lr 0.000030 time 2.0199 (2.2031) loss 1.8363 (3.0116) grad_norm 2.9792 (3.0627) [2022-01-26 10:56:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1110/1251] eta 0:05:10 lr 0.000030 time 2.1546 (2.2037) loss 3.2933 (3.0130) grad_norm 3.0462 (3.0618) [2022-01-26 10:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1120/1251] eta 0:04:48 lr 0.000030 time 1.6089 (2.2034) loss 3.6511 (3.0153) grad_norm 3.4296 (3.0619) [2022-01-26 
10:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1130/1251] eta 0:04:26 lr 0.000030 time 3.0741 (2.2032) loss 2.8114 (3.0138) grad_norm 2.8067 (3.0613) [2022-01-26 10:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1140/1251] eta 0:04:04 lr 0.000030 time 2.1706 (2.2032) loss 2.6495 (3.0129) grad_norm 2.9372 (3.0607) [2022-01-26 10:57:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1150/1251] eta 0:03:42 lr 0.000030 time 1.8827 (2.2021) loss 2.3413 (3.0102) grad_norm 2.8722 (3.0609) [2022-01-26 10:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1160/1251] eta 0:03:20 lr 0.000030 time 2.7826 (2.2019) loss 2.9990 (3.0104) grad_norm 2.8972 (3.0601) [2022-01-26 10:58:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1170/1251] eta 0:02:58 lr 0.000030 time 1.8467 (2.2010) loss 2.5895 (3.0091) grad_norm 3.5119 (3.0609) [2022-01-26 10:58:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1180/1251] eta 0:02:36 lr 0.000030 time 2.4313 (2.2011) loss 2.3748 (3.0080) grad_norm 2.7782 (3.0611) [2022-01-26 10:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1190/1251] eta 0:02:14 lr 0.000030 time 2.2542 (2.2004) loss 2.2416 (3.0081) grad_norm 3.2273 (3.0601) [2022-01-26 10:59:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1200/1251] eta 0:01:52 lr 0.000030 time 2.4575 (2.2005) loss 3.0029 (3.0067) grad_norm 2.8900 (3.0590) [2022-01-26 11:00:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1210/1251] eta 0:01:30 lr 0.000030 time 2.0941 (2.2002) loss 2.3088 (3.0043) grad_norm 2.8115 (3.0588) [2022-01-26 11:00:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1220/1251] eta 0:01:08 lr 0.000030 time 1.8164 (2.1987) loss 2.4871 (3.0043) grad_norm 3.1465 (3.0591) [2022-01-26 11:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1230/1251] 
eta 0:00:46 lr 0.000030 time 2.3761 (2.1991) loss 2.1308 (3.0024) grad_norm 2.7056 (3.0576) [2022-01-26 11:01:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1240/1251] eta 0:00:24 lr 0.000030 time 1.7537 (2.1981) loss 3.0183 (3.0018) grad_norm 2.7926 (3.0574) [2022-01-26 11:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [272/300][1250/1251] eta 0:00:02 lr 0.000030 time 1.1987 (2.1929) loss 2.4911 (3.0012) grad_norm 3.7361 (3.0575) [2022-01-26 11:01:22 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 272 training takes 0:45:43 [2022-01-26 11:01:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.595 (18.595) Loss 0.8299 (0.8299) Acc@1 81.055 (81.055) Acc@5 94.922 (94.922) [2022-01-26 11:01:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.259 (3.138) Loss 0.7303 (0.7986) Acc@1 83.496 (81.587) Acc@5 95.801 (95.543) [2022-01-26 11:02:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.618 (2.441) Loss 0.8285 (0.8050) Acc@1 79.883 (81.231) Acc@5 95.996 (95.573) [2022-01-26 11:02:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 3.739 (2.296) Loss 0.8399 (0.8099) Acc@1 79.297 (81.061) Acc@5 95.312 (95.520) [2022-01-26 11:02:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.372 (2.148) Loss 0.8010 (0.8146) Acc@1 81.934 (80.967) Acc@5 95.996 (95.405) [2022-01-26 11:02:57 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.960 Acc@5 95.418 [2022-01-26 11:02:57 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-01-26 11:02:57 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.07% [2022-01-26 11:03:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][0/1251] eta 7:29:05 lr 0.000030 time 21.5388 (21.5388) loss 3.4755 (3.4755) grad_norm 2.5972 (2.5972) [2022-01-26 11:03:43 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [273/300][10/1251] eta 1:26:04 lr 0.000030 time 3.0901 (4.1620) loss 2.0348 (2.8759) grad_norm 3.1292 (2.9732) [2022-01-26 11:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][20/1251] eta 1:05:50 lr 0.000030 time 1.3031 (3.2089) loss 3.3221 (2.9650) grad_norm 2.8969 (2.9771) [2022-01-26 11:04:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][30/1251] eta 1:00:19 lr 0.000030 time 1.5287 (2.9643) loss 3.5092 (3.0379) grad_norm 3.2923 (2.9873) [2022-01-26 11:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][40/1251] eta 0:56:56 lr 0.000030 time 2.6940 (2.8210) loss 3.5259 (3.0190) grad_norm 2.8390 (2.9844) [2022-01-26 11:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][50/1251] eta 0:53:42 lr 0.000030 time 2.5032 (2.6832) loss 2.8753 (2.9823) grad_norm 2.7389 (2.9418) [2022-01-26 11:05:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][60/1251] eta 0:50:41 lr 0.000030 time 1.9641 (2.5539) loss 3.4198 (3.0075) grad_norm 2.5226 (2.9540) [2022-01-26 11:05:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][70/1251] eta 0:48:32 lr 0.000030 time 1.9529 (2.4662) loss 3.3260 (3.0080) grad_norm 2.9134 (2.9752) [2022-01-26 11:06:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][80/1251] eta 0:47:20 lr 0.000030 time 2.2550 (2.4253) loss 3.3950 (3.0122) grad_norm 3.3528 (2.9950) [2022-01-26 11:06:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][90/1251] eta 0:46:45 lr 0.000030 time 3.1576 (2.4166) loss 2.8077 (3.0169) grad_norm 2.8418 (2.9999) [2022-01-26 11:06:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][100/1251] eta 0:45:59 lr 0.000030 time 2.0017 (2.3977) loss 2.6185 (2.9936) grad_norm 3.4688 (2.9876) [2022-01-26 11:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][110/1251] eta 0:45:13 lr 0.000030 time 2.4465 (2.3780) loss 3.0034 (2.9938) grad_norm 
3.2385 (2.9954) [2022-01-26 11:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][120/1251] eta 0:44:40 lr 0.000030 time 2.2009 (2.3700) loss 3.3160 (3.0027) grad_norm 3.3191 (2.9816) [2022-01-26 11:08:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][130/1251] eta 0:43:55 lr 0.000030 time 2.3363 (2.3515) loss 2.6389 (2.9776) grad_norm 2.8018 (2.9708) [2022-01-26 11:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][140/1251] eta 0:43:23 lr 0.000029 time 2.2346 (2.3434) loss 2.1084 (2.9857) grad_norm 3.1546 (2.9675) [2022-01-26 11:08:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][150/1251] eta 0:42:52 lr 0.000029 time 2.4966 (2.3368) loss 2.7908 (3.0003) grad_norm 2.5820 (2.9827) [2022-01-26 11:09:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][160/1251] eta 0:42:12 lr 0.000029 time 1.5939 (2.3212) loss 2.3911 (2.9843) grad_norm 3.3396 (2.9912) [2022-01-26 11:09:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][170/1251] eta 0:41:34 lr 0.000029 time 2.1668 (2.3074) loss 3.5189 (2.9861) grad_norm 3.2281 (3.0050) [2022-01-26 11:09:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][180/1251] eta 0:41:03 lr 0.000029 time 2.2295 (2.3001) loss 2.4021 (2.9692) grad_norm 3.6973 (3.0038) [2022-01-26 11:10:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][190/1251] eta 0:40:40 lr 0.000029 time 2.9920 (2.3000) loss 3.3719 (2.9676) grad_norm 3.2110 (3.0142) [2022-01-26 11:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][200/1251] eta 0:40:11 lr 0.000029 time 1.9891 (2.2946) loss 3.6432 (2.9709) grad_norm 3.0022 (3.0179) [2022-01-26 11:10:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][210/1251] eta 0:39:39 lr 0.000029 time 2.2053 (2.2861) loss 3.3309 (2.9802) grad_norm 2.9400 (3.0218) [2022-01-26 11:11:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[273/300][220/1251] eta 0:39:16 lr 0.000029 time 2.2062 (2.2859) loss 3.4750 (2.9787) grad_norm 3.2121 (3.0289) [2022-01-26 11:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][230/1251] eta 0:38:51 lr 0.000029 time 2.8639 (2.2835) loss 3.2066 (2.9778) grad_norm 3.6174 (3.0484) [2022-01-26 11:12:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][240/1251] eta 0:38:30 lr 0.000029 time 3.3725 (2.2858) loss 2.4140 (2.9806) grad_norm 3.2157 (3.0537) [2022-01-26 11:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][250/1251] eta 0:38:01 lr 0.000029 time 1.8433 (2.2790) loss 1.8951 (2.9715) grad_norm 2.7745 (3.0529) [2022-01-26 11:12:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][260/1251] eta 0:37:26 lr 0.000029 time 1.7204 (2.2674) loss 3.6083 (2.9782) grad_norm 2.9312 (3.0482) [2022-01-26 11:13:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][270/1251] eta 0:36:57 lr 0.000029 time 1.6231 (2.2607) loss 3.1119 (2.9912) grad_norm 3.0264 (3.0463) [2022-01-26 11:13:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][280/1251] eta 0:36:34 lr 0.000029 time 2.7080 (2.2596) loss 2.8859 (2.9896) grad_norm 2.5746 (3.0437) [2022-01-26 11:13:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][290/1251] eta 0:36:06 lr 0.000029 time 2.1373 (2.2549) loss 2.1665 (2.9838) grad_norm 3.1871 (3.0444) [2022-01-26 11:14:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][300/1251] eta 0:35:40 lr 0.000029 time 2.0964 (2.2513) loss 3.4857 (2.9796) grad_norm 2.8763 (3.0414) [2022-01-26 11:14:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][310/1251] eta 0:35:14 lr 0.000029 time 2.4659 (2.2469) loss 3.0903 (2.9869) grad_norm 3.0365 (3.0390) [2022-01-26 11:14:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][320/1251] eta 0:34:50 lr 0.000029 time 2.5578 (2.2452) loss 2.7509 (2.9850) grad_norm 
2.9886 (3.0412) [2022-01-26 11:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][330/1251] eta 0:34:29 lr 0.000029 time 2.4572 (2.2466) loss 2.2501 (2.9793) grad_norm 3.1890 (3.0375) [2022-01-26 11:15:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][340/1251] eta 0:34:06 lr 0.000029 time 1.5086 (2.2469) loss 3.0943 (2.9832) grad_norm 3.7019 (3.0402) [2022-01-26 11:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][350/1251] eta 0:33:43 lr 0.000029 time 2.1083 (2.2459) loss 3.6447 (2.9850) grad_norm 2.7539 (3.0390) [2022-01-26 11:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][360/1251] eta 0:33:21 lr 0.000029 time 1.8450 (2.2463) loss 1.9253 (2.9789) grad_norm 2.8080 (3.0341) [2022-01-26 11:16:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][370/1251] eta 0:32:57 lr 0.000029 time 2.4845 (2.2451) loss 3.7039 (2.9780) grad_norm 3.0041 (3.0376) [2022-01-26 11:17:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][380/1251] eta 0:32:30 lr 0.000029 time 1.8401 (2.2390) loss 3.5126 (2.9783) grad_norm 3.4178 (3.0558) [2022-01-26 11:17:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][390/1251] eta 0:32:05 lr 0.000029 time 2.8303 (2.2361) loss 3.4204 (2.9836) grad_norm 3.5229 (3.0627) [2022-01-26 11:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][400/1251] eta 0:31:38 lr 0.000029 time 1.8738 (2.2311) loss 3.1249 (2.9857) grad_norm 3.1169 (3.0659) [2022-01-26 11:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][410/1251] eta 0:31:18 lr 0.000029 time 3.0313 (2.2336) loss 3.3349 (2.9872) grad_norm 2.9206 (3.0666) [2022-01-26 11:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][420/1251] eta 0:30:55 lr 0.000029 time 2.1165 (2.2330) loss 2.4003 (2.9817) grad_norm 3.2944 (3.0752) [2022-01-26 11:19:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[273/300][430/1251] eta 0:30:37 lr 0.000029 time 3.4490 (2.2384) loss 2.3156 (2.9777) grad_norm 3.1757 (3.0733) [2022-01-26 11:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][440/1251] eta 0:30:15 lr 0.000029 time 2.2284 (2.2382) loss 3.1169 (2.9820) grad_norm 3.0275 (3.0846) [2022-01-26 11:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][450/1251] eta 0:29:50 lr 0.000029 time 1.8788 (2.2351) loss 3.6543 (2.9804) grad_norm 3.2952 (3.0839) [2022-01-26 11:20:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][460/1251] eta 0:29:23 lr 0.000029 time 1.7114 (2.2295) loss 3.5515 (2.9800) grad_norm 3.0113 (3.0855) [2022-01-26 11:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][470/1251] eta 0:29:00 lr 0.000029 time 2.7670 (2.2286) loss 3.0201 (2.9802) grad_norm 3.4944 (3.0837) [2022-01-26 11:20:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][480/1251] eta 0:28:37 lr 0.000029 time 2.6308 (2.2276) loss 2.1431 (2.9830) grad_norm 3.3959 (3.0839) [2022-01-26 11:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][490/1251] eta 0:28:14 lr 0.000029 time 1.8590 (2.2266) loss 3.1593 (2.9824) grad_norm 3.1610 (3.0803) [2022-01-26 11:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][500/1251] eta 0:27:50 lr 0.000029 time 2.6464 (2.2246) loss 2.6541 (2.9841) grad_norm 2.7393 (3.0806) [2022-01-26 11:21:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][510/1251] eta 0:27:28 lr 0.000029 time 2.2214 (2.2244) loss 3.3709 (2.9887) grad_norm 3.2648 (3.0816) [2022-01-26 11:22:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][520/1251] eta 0:27:05 lr 0.000029 time 2.3146 (2.2231) loss 3.4555 (2.9911) grad_norm 3.1379 (3.0799) [2022-01-26 11:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][530/1251] eta 0:26:39 lr 0.000029 time 1.6810 (2.2190) loss 3.1817 (2.9874) grad_norm 
3.9445 (3.0837) [2022-01-26 11:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][540/1251] eta 0:26:16 lr 0.000029 time 1.5932 (2.2180) loss 3.3879 (2.9868) grad_norm 2.9086 (3.0839) [2022-01-26 11:23:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][550/1251] eta 0:25:52 lr 0.000029 time 2.0558 (2.2148) loss 2.3325 (2.9880) grad_norm 3.0681 (3.0897) [2022-01-26 11:23:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][560/1251] eta 0:25:32 lr 0.000029 time 2.1660 (2.2173) loss 3.6038 (2.9894) grad_norm 2.9098 (3.0862) [2022-01-26 11:24:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][570/1251] eta 0:25:12 lr 0.000029 time 2.4061 (2.2217) loss 2.2540 (2.9837) grad_norm 3.4458 (3.0856) [2022-01-26 11:24:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][580/1251] eta 0:24:49 lr 0.000029 time 1.9049 (2.2198) loss 2.9231 (2.9840) grad_norm 2.7113 (3.0851) [2022-01-26 11:24:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][590/1251] eta 0:24:26 lr 0.000029 time 1.9271 (2.2185) loss 2.9139 (2.9862) grad_norm 2.7794 (3.0837) [2022-01-26 11:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][600/1251] eta 0:24:04 lr 0.000029 time 2.1672 (2.2186) loss 3.4247 (2.9851) grad_norm 3.4755 (3.0830) [2022-01-26 11:25:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][610/1251] eta 0:23:40 lr 0.000029 time 1.5931 (2.2160) loss 3.5186 (2.9890) grad_norm 2.9819 (3.0824) [2022-01-26 11:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][620/1251] eta 0:23:18 lr 0.000029 time 2.2519 (2.2159) loss 2.5342 (2.9879) grad_norm 3.8823 (3.0813) [2022-01-26 11:26:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][630/1251] eta 0:22:54 lr 0.000029 time 1.9368 (2.2137) loss 2.2314 (2.9880) grad_norm 2.8391 (3.0821) [2022-01-26 11:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[273/300][640/1251] eta 0:22:31 lr 0.000029 time 1.8477 (2.2124) loss 3.2591 (2.9893) grad_norm 2.9107 (3.0791) [2022-01-26 11:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][650/1251] eta 0:22:09 lr 0.000029 time 1.9037 (2.2115) loss 2.9569 (2.9929) grad_norm 3.0243 (3.0777) [2022-01-26 11:27:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][660/1251] eta 0:21:46 lr 0.000029 time 1.8947 (2.2103) loss 2.8061 (2.9923) grad_norm 2.9566 (3.0774) [2022-01-26 11:27:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][670/1251] eta 0:21:22 lr 0.000029 time 1.5893 (2.2080) loss 3.4169 (2.9939) grad_norm 2.6433 (3.0767) [2022-01-26 11:27:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][680/1251] eta 0:20:59 lr 0.000029 time 2.0236 (2.2058) loss 3.2146 (2.9935) grad_norm 3.1509 (3.0767) [2022-01-26 11:28:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][690/1251] eta 0:20:38 lr 0.000029 time 2.1062 (2.2071) loss 3.0956 (2.9909) grad_norm 3.1676 (3.0735) [2022-01-26 11:28:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][700/1251] eta 0:20:16 lr 0.000029 time 2.5414 (2.2082) loss 2.7109 (2.9883) grad_norm 2.7957 (3.0731) [2022-01-26 11:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][710/1251] eta 0:19:55 lr 0.000029 time 2.9251 (2.2096) loss 2.8010 (2.9906) grad_norm 2.5946 (3.0741) [2022-01-26 11:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][720/1251] eta 0:19:33 lr 0.000029 time 2.1352 (2.2099) loss 3.1403 (2.9907) grad_norm 3.1518 (3.0732) [2022-01-26 11:29:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][730/1251] eta 0:19:11 lr 0.000029 time 1.5535 (2.2105) loss 2.0329 (2.9914) grad_norm 2.4184 (3.0724) [2022-01-26 11:30:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][740/1251] eta 0:18:49 lr 0.000029 time 3.0563 (2.2103) loss 2.6723 (2.9868) grad_norm 
2.7271 (3.0707) [2022-01-26 11:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][750/1251] eta 0:18:27 lr 0.000029 time 2.8974 (2.2099) loss 3.3564 (2.9877) grad_norm 2.7506 (3.0706) [2022-01-26 11:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][760/1251] eta 0:18:03 lr 0.000029 time 1.9030 (2.2066) loss 1.9862 (2.9861) grad_norm 3.2922 (3.0697) [2022-01-26 11:31:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][770/1251] eta 0:17:40 lr 0.000029 time 1.9129 (2.2057) loss 2.8226 (2.9833) grad_norm 3.2311 (3.0692) [2022-01-26 11:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][780/1251] eta 0:17:18 lr 0.000029 time 2.2511 (2.2044) loss 2.1190 (2.9831) grad_norm 2.7971 (3.0706) [2022-01-26 11:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][790/1251] eta 0:16:56 lr 0.000029 time 2.2594 (2.2048) loss 3.3907 (2.9864) grad_norm 2.6676 (3.0682) [2022-01-26 11:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][800/1251] eta 0:16:34 lr 0.000029 time 1.8896 (2.2042) loss 3.2048 (2.9847) grad_norm 2.8960 (3.0690) [2022-01-26 11:32:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][810/1251] eta 0:16:12 lr 0.000029 time 2.3935 (2.2048) loss 3.3302 (2.9831) grad_norm 2.4536 (3.0721) [2022-01-26 11:33:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][820/1251] eta 0:15:49 lr 0.000029 time 2.4590 (2.2036) loss 3.5981 (2.9857) grad_norm 3.3121 (3.0750) [2022-01-26 11:33:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][830/1251] eta 0:15:28 lr 0.000029 time 2.8271 (2.2053) loss 2.4351 (2.9844) grad_norm 3.2564 (3.0753) [2022-01-26 11:33:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][840/1251] eta 0:15:07 lr 0.000029 time 2.5332 (2.2074) loss 3.5819 (2.9853) grad_norm 2.5359 (3.0724) [2022-01-26 11:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[273/300][850/1251] eta 0:14:45 lr 0.000029 time 2.0857 (2.2089) loss 3.4482 (2.9865) grad_norm 2.8155 (3.0714) [2022-01-26 11:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][860/1251] eta 0:14:23 lr 0.000029 time 2.2074 (2.2080) loss 3.5870 (2.9877) grad_norm 3.0543 (3.0718) [2022-01-26 11:34:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][870/1251] eta 0:13:59 lr 0.000029 time 1.5994 (2.2044) loss 2.7685 (2.9893) grad_norm 2.9884 (3.0716) [2022-01-26 11:35:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][880/1251] eta 0:13:36 lr 0.000029 time 1.8928 (2.2014) loss 3.4625 (2.9917) grad_norm 2.9721 (3.0731) [2022-01-26 11:35:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][890/1251] eta 0:13:14 lr 0.000029 time 2.0528 (2.1999) loss 2.2071 (2.9932) grad_norm 3.1916 (3.0744) [2022-01-26 11:36:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][900/1251] eta 0:12:52 lr 0.000029 time 2.4257 (2.2008) loss 2.8752 (2.9942) grad_norm 2.5898 (3.0732) [2022-01-26 11:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][910/1251] eta 0:12:30 lr 0.000029 time 2.5174 (2.2002) loss 2.9015 (2.9943) grad_norm 2.5517 (3.0739) [2022-01-26 11:36:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][920/1251] eta 0:12:08 lr 0.000029 time 2.0280 (2.2007) loss 3.2950 (2.9944) grad_norm 3.5941 (3.0746) [2022-01-26 11:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][930/1251] eta 0:11:46 lr 0.000029 time 1.7584 (2.2013) loss 2.9590 (2.9939) grad_norm 3.6187 (3.0744) [2022-01-26 11:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][940/1251] eta 0:11:25 lr 0.000029 time 1.6369 (2.2043) loss 3.2589 (2.9944) grad_norm 3.1356 (3.0738) [2022-01-26 11:37:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][950/1251] eta 0:11:04 lr 0.000029 time 2.4800 (2.2071) loss 3.2142 (2.9966) grad_norm 
3.1724 (3.0731) [2022-01-26 11:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][960/1251] eta 0:10:41 lr 0.000029 time 1.9880 (2.2061) loss 3.0742 (2.9941) grad_norm 2.9872 (3.0735) [2022-01-26 11:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][970/1251] eta 0:10:19 lr 0.000029 time 1.9836 (2.2042) loss 2.8885 (2.9919) grad_norm 3.3416 (3.0729) [2022-01-26 11:38:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][980/1251] eta 0:09:56 lr 0.000029 time 1.8866 (2.2013) loss 3.6748 (2.9949) grad_norm 2.8089 (3.0715) [2022-01-26 11:39:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][990/1251] eta 0:09:33 lr 0.000029 time 1.9371 (2.1982) loss 3.2433 (2.9954) grad_norm 4.2060 (3.0743) [2022-01-26 11:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1000/1251] eta 0:09:11 lr 0.000029 time 2.3844 (2.1967) loss 2.9276 (2.9961) grad_norm 3.6993 (3.0742) [2022-01-26 11:39:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1010/1251] eta 0:08:49 lr 0.000029 time 5.0687 (2.1982) loss 2.9715 (2.9970) grad_norm 2.9949 (3.0745) [2022-01-26 11:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1020/1251] eta 0:08:28 lr 0.000028 time 2.6726 (2.2025) loss 3.2731 (2.9960) grad_norm 2.7242 (3.0790) [2022-01-26 11:40:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1030/1251] eta 0:08:06 lr 0.000028 time 2.2005 (2.2036) loss 3.0928 (2.9941) grad_norm 3.1315 (3.0795) [2022-01-26 11:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1040/1251] eta 0:07:44 lr 0.000028 time 2.8542 (2.2036) loss 2.5968 (2.9934) grad_norm 3.1572 (3.0783) [2022-01-26 11:41:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1050/1251] eta 0:07:22 lr 0.000028 time 2.5212 (2.2028) loss 3.0796 (2.9939) grad_norm 2.9225 (3.0777) [2022-01-26 11:41:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[273/300][1060/1251] eta 0:07:00 lr 0.000028 time 1.9165 (2.2009) loss 2.5422 (2.9950) grad_norm 3.6221 (3.0772) [2022-01-26 11:42:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1070/1251] eta 0:06:38 lr 0.000028 time 2.8142 (2.1999) loss 2.9011 (2.9933) grad_norm 3.5341 (3.0802) [2022-01-26 11:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1080/1251] eta 0:06:15 lr 0.000028 time 2.6050 (2.1986) loss 3.4514 (2.9954) grad_norm 4.1698 (3.0857) [2022-01-26 11:42:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1090/1251] eta 0:05:53 lr 0.000028 time 2.6191 (2.1975) loss 3.6783 (2.9957) grad_norm 3.2197 (3.0884) [2022-01-26 11:43:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1100/1251] eta 0:05:31 lr 0.000028 time 2.2410 (2.1959) loss 3.3725 (2.9967) grad_norm 3.1263 (3.0884) [2022-01-26 11:43:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1110/1251] eta 0:05:09 lr 0.000028 time 3.1640 (2.1967) loss 3.2634 (2.9952) grad_norm 2.7870 (3.0867) [2022-01-26 11:43:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1120/1251] eta 0:04:47 lr 0.000028 time 2.2918 (2.1965) loss 2.2846 (2.9927) grad_norm 3.0009 (3.0852) [2022-01-26 11:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1130/1251] eta 0:04:25 lr 0.000028 time 2.2927 (2.1972) loss 3.1944 (2.9932) grad_norm 2.8662 (3.0840) [2022-01-26 11:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1140/1251] eta 0:04:03 lr 0.000028 time 2.7317 (2.1982) loss 3.0767 (2.9956) grad_norm 3.2739 (3.0858) [2022-01-26 11:45:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1150/1251] eta 0:03:42 lr 0.000028 time 2.2719 (2.1986) loss 3.0539 (2.9938) grad_norm 3.4508 (3.0861) [2022-01-26 11:45:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1160/1251] eta 0:03:19 lr 0.000028 time 2.1006 (2.1970) loss 3.2920 (2.9952) 
grad_norm 2.9842 (3.0852) [2022-01-26 11:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1170/1251] eta 0:02:57 lr 0.000028 time 1.8484 (2.1956) loss 2.8351 (2.9950) grad_norm 2.6362 (3.0841) [2022-01-26 11:46:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1180/1251] eta 0:02:35 lr 0.000028 time 2.5232 (2.1961) loss 2.4542 (2.9960) grad_norm 3.2268 (3.0835) [2022-01-26 11:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1190/1251] eta 0:02:13 lr 0.000028 time 1.8186 (2.1962) loss 2.4811 (2.9948) grad_norm 2.8650 (3.0848) [2022-01-26 11:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1200/1251] eta 0:01:52 lr 0.000028 time 3.2807 (2.1981) loss 3.5080 (2.9962) grad_norm 3.4644 (3.0855) [2022-01-26 11:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1210/1251] eta 0:01:30 lr 0.000028 time 1.8784 (2.1982) loss 2.9351 (2.9968) grad_norm 3.9355 (3.0863) [2022-01-26 11:47:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1220/1251] eta 0:01:08 lr 0.000028 time 1.8854 (2.1986) loss 2.3555 (2.9968) grad_norm 2.7058 (3.0862) [2022-01-26 11:48:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1230/1251] eta 0:00:46 lr 0.000028 time 1.8610 (2.1976) loss 2.9887 (2.9971) grad_norm 3.1064 (3.0872) [2022-01-26 11:48:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1240/1251] eta 0:00:24 lr 0.000028 time 2.5881 (2.1974) loss 2.8959 (2.9978) grad_norm 2.8867 (3.0868) [2022-01-26 11:48:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [273/300][1250/1251] eta 0:00:02 lr 0.000028 time 1.2165 (2.1915) loss 3.4568 (3.0001) grad_norm 2.6816 (3.0857) [2022-01-26 11:48:39 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 273 training takes 0:45:42 [2022-01-26 11:48:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.533 (18.533) Loss 0.9255 (0.9255) Acc@1 78.223 (78.223) 
Acc@5 94.434 (94.434) [2022-01-26 11:49:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.911 (3.622) Loss 0.8438 (0.8289) Acc@1 80.273 (80.895) Acc@5 94.238 (95.046) [2022-01-26 11:49:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.266 (2.569) Loss 0.8231 (0.8204) Acc@1 81.055 (80.971) Acc@5 94.434 (95.168) [2022-01-26 11:49:51 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.311 (2.326) Loss 0.8427 (0.8171) Acc@1 79.883 (81.007) Acc@5 95.117 (95.319) [2022-01-26 11:50:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.038 (2.156) Loss 0.8627 (0.8168) Acc@1 79.883 (80.909) Acc@5 94.629 (95.312) [2022-01-26 11:50:15 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.986 Acc@5 95.366 [2022-01-26 11:50:15 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-01-26 11:50:15 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.07% [2022-01-26 11:50:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][0/1251] eta 7:27:13 lr 0.000028 time 21.4495 (21.4495) loss 3.5361 (3.5361) grad_norm 2.8465 (2.8465) [2022-01-26 11:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][10/1251] eta 1:25:18 lr 0.000028 time 2.1305 (4.1243) loss 3.0983 (2.8831) grad_norm 2.7064 (2.8594) [2022-01-26 11:51:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][20/1251] eta 1:05:15 lr 0.000028 time 1.5807 (3.1811) loss 3.7746 (2.8970) grad_norm 3.0950 (2.9886) [2022-01-26 11:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][30/1251] eta 0:57:40 lr 0.000028 time 1.7488 (2.8345) loss 2.5330 (2.9120) grad_norm 3.3668 (3.0631) [2022-01-26 11:52:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][40/1251] eta 0:54:38 lr 0.000028 time 3.9830 (2.7073) loss 2.6641 (2.8492) grad_norm 2.7224 (3.0465) [2022-01-26 11:52:28 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][50/1251] eta 0:52:10 lr 0.000028 time 2.1914 (2.6067) loss 3.1331 (2.9121) grad_norm 2.8636 (3.0320) [2022-01-26 11:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][60/1251] eta 0:50:07 lr 0.000028 time 2.1925 (2.5253) loss 3.1451 (2.9189) grad_norm 3.1013 (3.0526) [2022-01-26 11:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][70/1251] eta 0:48:46 lr 0.000028 time 2.1196 (2.4783) loss 2.9460 (2.9121) grad_norm 3.3679 (3.0914) [2022-01-26 11:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][80/1251] eta 0:47:45 lr 0.000028 time 3.4888 (2.4471) loss 3.0228 (2.8887) grad_norm 2.8088 (3.0690) [2022-01-26 11:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][90/1251] eta 0:46:55 lr 0.000028 time 1.5630 (2.4250) loss 3.0894 (2.9015) grad_norm 3.3134 (3.0656) [2022-01-26 11:54:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][100/1251] eta 0:45:53 lr 0.000028 time 2.3205 (2.3921) loss 3.2552 (2.9250) grad_norm 2.9114 (3.0659) [2022-01-26 11:54:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][110/1251] eta 0:45:16 lr 0.000028 time 2.5049 (2.3809) loss 2.6831 (2.9269) grad_norm 3.1814 (3.0419) [2022-01-26 11:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][120/1251] eta 0:44:39 lr 0.000028 time 2.7852 (2.3695) loss 2.2827 (2.9255) grad_norm 4.1371 (3.0494) [2022-01-26 11:55:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][130/1251] eta 0:44:05 lr 0.000028 time 2.2541 (2.3599) loss 3.3506 (2.8975) grad_norm 3.6006 (3.0552) [2022-01-26 11:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][140/1251] eta 0:43:21 lr 0.000028 time 2.5686 (2.3412) loss 2.9494 (2.9028) grad_norm 2.5520 (3.0403) [2022-01-26 11:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][150/1251] eta 0:42:41 lr 0.000028 
time 2.1445 (2.3263) loss 3.3915 (2.9080) grad_norm 2.7403 (3.0483) [2022-01-26 11:56:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][160/1251] eta 0:42:20 lr 0.000028 time 3.2876 (2.3289) loss 2.8000 (2.9145) grad_norm 2.6680 (3.0550) [2022-01-26 11:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][170/1251] eta 0:41:59 lr 0.000028 time 1.7588 (2.3306) loss 2.9452 (2.9048) grad_norm 3.5135 (3.0543) [2022-01-26 11:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][180/1251] eta 0:41:32 lr 0.000028 time 2.2881 (2.3277) loss 2.3837 (2.8920) grad_norm 3.2844 (3.0615) [2022-01-26 11:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][190/1251] eta 0:40:55 lr 0.000028 time 1.9582 (2.3144) loss 3.3631 (2.9046) grad_norm 3.1526 (3.0775) [2022-01-26 11:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][200/1251] eta 0:40:20 lr 0.000028 time 2.5645 (2.3027) loss 3.2228 (2.9110) grad_norm 2.6959 (3.0803) [2022-01-26 11:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][210/1251] eta 0:39:41 lr 0.000028 time 1.9244 (2.2874) loss 3.7309 (2.9155) grad_norm 3.0754 (3.0763) [2022-01-26 11:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][220/1251] eta 0:39:11 lr 0.000028 time 2.7236 (2.2812) loss 2.1826 (2.9232) grad_norm 2.4975 (3.0685) [2022-01-26 11:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][230/1251] eta 0:38:44 lr 0.000028 time 2.4895 (2.2764) loss 2.6120 (2.9229) grad_norm 3.5407 (3.0654) [2022-01-26 11:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][240/1251] eta 0:38:21 lr 0.000028 time 2.6666 (2.2766) loss 2.8536 (2.9280) grad_norm 3.2004 (3.1149) [2022-01-26 11:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][250/1251] eta 0:38:02 lr 0.000028 time 2.8845 (2.2805) loss 3.0842 (2.9346) grad_norm 2.7872 (3.1122) [2022-01-26 12:00:10 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][260/1251] eta 0:37:38 lr 0.000028 time 2.7866 (2.2795) loss 2.1677 (2.9257) grad_norm 2.9289 (3.1136) [2022-01-26 12:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][270/1251] eta 0:37:09 lr 0.000028 time 2.1922 (2.2723) loss 2.9853 (2.9278) grad_norm 3.1241 (3.1089) [2022-01-26 12:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][280/1251] eta 0:36:38 lr 0.000028 time 2.5186 (2.2644) loss 2.4682 (2.9251) grad_norm 2.6271 (3.1014) [2022-01-26 12:01:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][290/1251] eta 0:36:09 lr 0.000028 time 2.1322 (2.2578) loss 3.0651 (2.9278) grad_norm 3.0608 (3.0956) [2022-01-26 12:01:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][300/1251] eta 0:35:50 lr 0.000028 time 2.8823 (2.2612) loss 3.5624 (2.9331) grad_norm 2.7192 (3.0944) [2022-01-26 12:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][310/1251] eta 0:35:30 lr 0.000028 time 1.6138 (2.2644) loss 2.0470 (2.9292) grad_norm 2.9323 (3.0909) [2022-01-26 12:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][320/1251] eta 0:35:09 lr 0.000028 time 1.8781 (2.2657) loss 2.2428 (2.9415) grad_norm 2.7374 (3.0951) [2022-01-26 12:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][330/1251] eta 0:34:42 lr 0.000028 time 2.1932 (2.2609) loss 2.8453 (2.9450) grad_norm 3.2119 (3.0937) [2022-01-26 12:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][340/1251] eta 0:34:13 lr 0.000028 time 1.8255 (2.2538) loss 3.3920 (2.9469) grad_norm 2.7291 (3.0857) [2022-01-26 12:03:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][350/1251] eta 0:33:46 lr 0.000028 time 1.5815 (2.2491) loss 3.0594 (2.9546) grad_norm 2.8512 (3.0846) [2022-01-26 12:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][360/1251] eta 0:33:22 lr 
0.000028 time 2.2552 (2.2475) loss 3.4631 (2.9606) grad_norm 2.6356 (3.0846) [2022-01-26 12:04:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][370/1251] eta 0:33:00 lr 0.000028 time 1.8900 (2.2479) loss 1.8979 (2.9602) grad_norm 3.2188 (3.0824) [2022-01-26 12:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][380/1251] eta 0:32:37 lr 0.000028 time 2.3635 (2.2469) loss 3.3968 (2.9638) grad_norm 2.9465 (3.0911) [2022-01-26 12:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][390/1251] eta 0:32:12 lr 0.000028 time 1.8078 (2.2450) loss 2.7450 (2.9687) grad_norm 2.6791 (3.0937) [2022-01-26 12:05:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][400/1251] eta 0:31:47 lr 0.000028 time 1.8846 (2.2411) loss 2.9415 (2.9718) grad_norm 2.8637 (3.0940) [2022-01-26 12:05:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][410/1251] eta 0:31:23 lr 0.000028 time 1.9183 (2.2390) loss 2.4147 (2.9760) grad_norm 4.0715 (3.0991) [2022-01-26 12:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][420/1251] eta 0:30:57 lr 0.000028 time 2.6375 (2.2354) loss 3.2398 (2.9785) grad_norm 3.0775 (3.1025) [2022-01-26 12:06:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][430/1251] eta 0:30:33 lr 0.000028 time 2.4807 (2.2336) loss 3.3908 (2.9742) grad_norm 2.8383 (3.1047) [2022-01-26 12:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][440/1251] eta 0:30:10 lr 0.000028 time 2.1320 (2.2328) loss 3.6569 (2.9786) grad_norm 2.8282 (3.1052) [2022-01-26 12:07:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][450/1251] eta 0:29:51 lr 0.000028 time 2.6279 (2.2366) loss 1.9913 (2.9780) grad_norm 2.6215 (3.1029) [2022-01-26 12:07:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][460/1251] eta 0:29:28 lr 0.000028 time 2.3089 (2.2356) loss 3.3484 (2.9785) grad_norm 2.8277 (3.0991) [2022-01-26 12:07:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][470/1251] eta 0:29:10 lr 0.000028 time 1.4581 (2.2418) loss 3.2463 (2.9808) grad_norm 2.9738 (3.0974) [2022-01-26 12:08:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][480/1251] eta 0:28:44 lr 0.000028 time 1.9201 (2.2368) loss 3.2734 (2.9801) grad_norm 2.9633 (3.0963) [2022-01-26 12:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][490/1251] eta 0:28:21 lr 0.000028 time 1.9220 (2.2362) loss 3.5661 (2.9832) grad_norm 2.7344 (3.0934) [2022-01-26 12:08:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][500/1251] eta 0:27:57 lr 0.000028 time 1.7640 (2.2337) loss 2.9267 (2.9857) grad_norm 2.6003 (3.0935) [2022-01-26 12:09:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][510/1251] eta 0:27:35 lr 0.000028 time 1.8488 (2.2341) loss 2.2158 (2.9859) grad_norm 3.4326 (3.1004) [2022-01-26 12:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][520/1251] eta 0:27:14 lr 0.000028 time 2.1378 (2.2364) loss 2.5563 (2.9841) grad_norm 3.0503 (3.1011) [2022-01-26 12:10:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][530/1251] eta 0:26:51 lr 0.000028 time 1.8895 (2.2357) loss 3.1581 (2.9826) grad_norm 2.8404 (3.1016) [2022-01-26 12:10:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][540/1251] eta 0:26:26 lr 0.000028 time 2.1754 (2.2307) loss 2.4107 (2.9833) grad_norm 2.7446 (3.0990) [2022-01-26 12:10:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][550/1251] eta 0:26:02 lr 0.000028 time 2.2615 (2.2290) loss 2.3907 (2.9853) grad_norm 3.7193 (3.0981) [2022-01-26 12:11:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][560/1251] eta 0:25:39 lr 0.000028 time 2.1019 (2.2284) loss 3.0009 (2.9824) grad_norm 3.5016 (3.0994) [2022-01-26 12:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][570/1251] eta 0:25:19 lr 
0.000028 time 2.0318 (2.2312) loss 3.7697 (2.9833) grad_norm 3.4876 (3.1000) [2022-01-26 12:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][580/1251] eta 0:24:56 lr 0.000028 time 1.5583 (2.2306) loss 3.3429 (2.9851) grad_norm 3.4162 (3.0964) [2022-01-26 12:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][590/1251] eta 0:24:33 lr 0.000028 time 1.7339 (2.2286) loss 3.6060 (2.9874) grad_norm 2.9039 (3.0983) [2022-01-26 12:12:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][600/1251] eta 0:24:08 lr 0.000028 time 1.9521 (2.2256) loss 3.0138 (2.9842) grad_norm 3.3383 (3.1025) [2022-01-26 12:12:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][610/1251] eta 0:23:46 lr 0.000028 time 2.4565 (2.2258) loss 3.7161 (2.9852) grad_norm 3.0824 (3.1036) [2022-01-26 12:13:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][620/1251] eta 0:23:23 lr 0.000028 time 1.9252 (2.2239) loss 3.3369 (2.9861) grad_norm 3.1027 (3.1048) [2022-01-26 12:13:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][630/1251] eta 0:23:01 lr 0.000028 time 1.8707 (2.2247) loss 3.5204 (2.9901) grad_norm 3.0960 (3.1102) [2022-01-26 12:14:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][640/1251] eta 0:22:39 lr 0.000028 time 1.8145 (2.2246) loss 3.3803 (2.9883) grad_norm 2.8570 (3.1106) [2022-01-26 12:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][650/1251] eta 0:22:17 lr 0.000028 time 1.8746 (2.2261) loss 3.4289 (2.9867) grad_norm 3.1950 (3.1097) [2022-01-26 12:14:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][660/1251] eta 0:21:55 lr 0.000028 time 1.8644 (2.2253) loss 2.5070 (2.9886) grad_norm 3.0959 (3.1100) [2022-01-26 12:15:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][670/1251] eta 0:21:31 lr 0.000027 time 1.9256 (2.2230) loss 3.4706 (2.9885) grad_norm 2.9260 (3.1115) [2022-01-26 12:15:27 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][680/1251] eta 0:21:07 lr 0.000027 time 1.9058 (2.2203) loss 2.6389 (2.9903) grad_norm 3.2019 (3.1114) [2022-01-26 12:15:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][690/1251] eta 0:20:45 lr 0.000027 time 1.6626 (2.2201) loss 1.7729 (2.9892) grad_norm 2.5019 (3.1081) [2022-01-26 12:16:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][700/1251] eta 0:20:23 lr 0.000027 time 1.8977 (2.2199) loss 3.3481 (2.9887) grad_norm 2.9821 (3.1111) [2022-01-26 12:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][710/1251] eta 0:20:00 lr 0.000027 time 2.6856 (2.2192) loss 3.0246 (2.9875) grad_norm 2.5718 (3.1095) [2022-01-26 12:16:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][720/1251] eta 0:19:37 lr 0.000027 time 1.9511 (2.2183) loss 3.0058 (2.9911) grad_norm 3.3202 (3.1098) [2022-01-26 12:17:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][730/1251] eta 0:19:15 lr 0.000027 time 1.8920 (2.2183) loss 3.4077 (2.9949) grad_norm 2.9688 (3.1084) [2022-01-26 12:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][740/1251] eta 0:18:53 lr 0.000027 time 2.2366 (2.2187) loss 3.5593 (2.9944) grad_norm 3.1165 (3.1091) [2022-01-26 12:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][750/1251] eta 0:18:30 lr 0.000027 time 1.6158 (2.2169) loss 3.4493 (2.9970) grad_norm 2.7730 (3.1084) [2022-01-26 12:18:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][760/1251] eta 0:18:08 lr 0.000027 time 2.0703 (2.2162) loss 3.2671 (2.9987) grad_norm 2.8698 (3.1075) [2022-01-26 12:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][770/1251] eta 0:17:46 lr 0.000027 time 1.8535 (2.2178) loss 3.3064 (3.0009) grad_norm 2.7288 (3.1068) [2022-01-26 12:19:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][780/1251] eta 0:17:24 lr 
0.000027 time 2.1294 (2.2167) loss 3.0712 (2.9991) grad_norm 3.4214 (3.1075) [2022-01-26 12:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][790/1251] eta 0:17:01 lr 0.000027 time 1.5793 (2.2162) loss 3.5617 (3.0016) grad_norm 3.5018 (3.1072) [2022-01-26 12:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][800/1251] eta 0:16:39 lr 0.000027 time 3.1069 (2.2172) loss 3.2331 (3.0021) grad_norm 2.7249 (3.1041) [2022-01-26 12:20:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][810/1251] eta 0:16:17 lr 0.000027 time 2.0732 (2.2160) loss 3.4608 (3.0045) grad_norm 3.1015 (3.1040) [2022-01-26 12:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][820/1251] eta 0:15:54 lr 0.000027 time 2.4581 (2.2147) loss 3.5044 (3.0052) grad_norm 2.9436 (3.1040) [2022-01-26 12:20:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][830/1251] eta 0:15:32 lr 0.000027 time 1.9320 (2.2147) loss 3.4330 (3.0065) grad_norm 2.6461 (3.1021) [2022-01-26 12:21:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][840/1251] eta 0:15:09 lr 0.000027 time 2.6277 (2.2128) loss 3.5391 (3.0060) grad_norm 3.6391 (3.1014) [2022-01-26 12:21:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][850/1251] eta 0:14:47 lr 0.000027 time 1.9035 (2.2124) loss 2.0437 (3.0051) grad_norm 2.5531 (3.1006) [2022-01-26 12:21:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][860/1251] eta 0:14:24 lr 0.000027 time 1.9071 (2.2116) loss 3.4197 (3.0063) grad_norm 2.7134 (3.1003) [2022-01-26 12:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][870/1251] eta 0:14:02 lr 0.000027 time 1.8912 (2.2121) loss 2.1449 (3.0074) grad_norm 3.7773 (3.0990) [2022-01-26 12:22:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][880/1251] eta 0:13:40 lr 0.000027 time 2.3235 (2.2120) loss 3.5513 (3.0066) grad_norm 3.2657 (3.1001) [2022-01-26 12:23:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][890/1251] eta 0:13:18 lr 0.000027 time 1.8307 (2.2111) loss 3.2577 (3.0061) grad_norm 3.1378 (3.0993) [2022-01-26 12:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][900/1251] eta 0:12:56 lr 0.000027 time 1.9691 (2.2115) loss 2.6974 (3.0059) grad_norm 3.3276 (3.1000) [2022-01-26 12:23:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][910/1251] eta 0:12:33 lr 0.000027 time 1.8875 (2.2106) loss 3.2801 (3.0073) grad_norm 2.8280 (3.0980) [2022-01-26 12:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][920/1251] eta 0:12:11 lr 0.000027 time 1.5946 (2.2097) loss 3.5122 (3.0085) grad_norm 3.3266 (3.0975) [2022-01-26 12:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][930/1251] eta 0:11:49 lr 0.000027 time 2.8032 (2.2090) loss 3.2352 (3.0097) grad_norm 3.1940 (3.0993) [2022-01-26 12:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][940/1251] eta 0:11:26 lr 0.000027 time 1.6445 (2.2089) loss 2.9225 (3.0091) grad_norm 4.0308 (3.0991) [2022-01-26 12:25:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][950/1251] eta 0:11:04 lr 0.000027 time 1.5135 (2.2085) loss 3.1494 (3.0097) grad_norm 3.3465 (3.0990) [2022-01-26 12:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][960/1251] eta 0:10:42 lr 0.000027 time 1.5338 (2.2081) loss 3.3665 (3.0087) grad_norm 3.2981 (3.1009) [2022-01-26 12:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][970/1251] eta 0:10:20 lr 0.000027 time 2.5861 (2.2084) loss 2.5906 (3.0089) grad_norm 3.6332 (3.1024) [2022-01-26 12:26:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][980/1251] eta 0:09:58 lr 0.000027 time 1.8263 (2.2100) loss 3.5647 (3.0093) grad_norm 2.8244 (3.1015) [2022-01-26 12:26:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][990/1251] eta 0:09:37 lr 
0.000027 time 2.8390 (2.2117) loss 1.9333 (3.0078) grad_norm 2.6786 (3.1011) [2022-01-26 12:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1000/1251] eta 0:09:14 lr 0.000027 time 1.8603 (2.2108) loss 3.0837 (3.0077) grad_norm 2.5725 (3.0990) [2022-01-26 12:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1010/1251] eta 0:08:52 lr 0.000027 time 2.4075 (2.2111) loss 3.6344 (3.0071) grad_norm 4.0815 (3.0993) [2022-01-26 12:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1020/1251] eta 0:08:30 lr 0.000027 time 1.4887 (2.2097) loss 3.0626 (3.0061) grad_norm 3.0402 (3.0971) [2022-01-26 12:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1030/1251] eta 0:08:07 lr 0.000027 time 1.6416 (2.2077) loss 3.4737 (3.0067) grad_norm 2.8760 (3.0960) [2022-01-26 12:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1040/1251] eta 0:07:45 lr 0.000027 time 2.0703 (2.2065) loss 3.5547 (3.0090) grad_norm 2.9165 (3.0962) [2022-01-26 12:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1050/1251] eta 0:07:23 lr 0.000027 time 2.2813 (2.2058) loss 3.4449 (3.0093) grad_norm 2.7994 (3.0946) [2022-01-26 12:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1060/1251] eta 0:07:01 lr 0.000027 time 2.1879 (2.2050) loss 3.1041 (3.0087) grad_norm 2.9550 (3.0951) [2022-01-26 12:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1070/1251] eta 0:06:39 lr 0.000027 time 2.4777 (2.2051) loss 2.8355 (3.0100) grad_norm 2.5840 (3.0991) [2022-01-26 12:29:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1080/1251] eta 0:06:17 lr 0.000027 time 2.1178 (2.2058) loss 3.3340 (3.0073) grad_norm 2.8450 (3.0984) [2022-01-26 12:30:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1090/1251] eta 0:05:55 lr 0.000027 time 2.3425 (2.2055) loss 3.4654 (3.0077) grad_norm 3.0231 (3.0962) [2022-01-26 
12:30:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1100/1251] eta 0:05:33 lr 0.000027 time 1.8794 (2.2054) loss 2.1339 (3.0039) grad_norm 3.8511 (3.0958) [2022-01-26 12:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1110/1251] eta 0:05:11 lr 0.000027 time 2.3157 (2.2061) loss 2.1283 (3.0022) grad_norm 2.7708 (3.0952) [2022-01-26 12:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1120/1251] eta 0:04:49 lr 0.000027 time 2.4555 (2.2076) loss 2.2885 (3.0018) grad_norm 3.1163 (3.0949) [2022-01-26 12:31:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1130/1251] eta 0:04:27 lr 0.000027 time 2.2495 (2.2068) loss 3.3720 (3.0026) grad_norm 2.9564 (3.0943) [2022-01-26 12:32:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1140/1251] eta 0:04:04 lr 0.000027 time 1.8817 (2.2052) loss 2.3691 (3.0019) grad_norm 2.6450 (3.0953) [2022-01-26 12:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1150/1251] eta 0:03:42 lr 0.000027 time 2.2616 (2.2044) loss 1.9013 (3.0004) grad_norm 2.9070 (3.0972) [2022-01-26 12:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1160/1251] eta 0:03:20 lr 0.000027 time 3.4824 (2.2044) loss 3.2721 (3.0013) grad_norm 2.8587 (3.0962) [2022-01-26 12:33:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1170/1251] eta 0:02:58 lr 0.000027 time 2.9115 (2.2053) loss 3.4490 (3.0013) grad_norm 2.8537 (3.0953) [2022-01-26 12:33:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1180/1251] eta 0:02:36 lr 0.000027 time 2.5266 (2.2053) loss 2.0595 (3.0001) grad_norm 3.2088 (3.0958) [2022-01-26 12:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1190/1251] eta 0:02:14 lr 0.000027 time 2.2309 (2.2062) loss 3.7402 (3.0012) grad_norm 3.2045 (3.1011) [2022-01-26 12:34:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1200/1251] 
eta 0:01:52 lr 0.000027 time 2.0610 (2.2053) loss 3.0790 (3.0004) grad_norm 3.0407 (3.1011) [2022-01-26 12:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1210/1251] eta 0:01:30 lr 0.000027 time 3.1521 (2.2058) loss 3.3128 (3.0009) grad_norm 2.9059 (3.1016) [2022-01-26 12:35:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1220/1251] eta 0:01:08 lr 0.000027 time 1.5827 (2.2063) loss 3.7075 (3.0023) grad_norm 3.0070 (3.1016) [2022-01-26 12:35:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1230/1251] eta 0:00:46 lr 0.000027 time 1.9268 (2.2067) loss 2.8862 (3.0028) grad_norm 2.6082 (3.1007) [2022-01-26 12:35:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1240/1251] eta 0:00:24 lr 0.000027 time 1.2141 (2.2051) loss 3.4332 (3.0047) grad_norm 2.8874 (3.1008) [2022-01-26 12:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [274/300][1250/1251] eta 0:00:02 lr 0.000027 time 1.2101 (2.1990) loss 2.8984 (3.0040) grad_norm 2.8171 (3.1007) [2022-01-26 12:36:06 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 274 training takes 0:45:51 [2022-01-26 12:36:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.184 (18.184) Loss 0.8122 (0.8122) Acc@1 82.031 (82.031) Acc@5 94.727 (94.727) [2022-01-26 12:36:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.227 (3.379) Loss 0.8963 (0.8208) Acc@1 79.199 (81.081) Acc@5 94.629 (95.348) [2022-01-26 12:36:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.896 (2.534) Loss 0.8327 (0.8124) Acc@1 80.762 (80.948) Acc@5 94.238 (95.461) [2022-01-26 12:37:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.271 (2.270) Loss 0.8699 (0.8155) Acc@1 78.711 (80.869) Acc@5 95.117 (95.423) [2022-01-26 12:37:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.948 (2.147) Loss 0.8493 (0.8135) Acc@1 80.664 (80.919) Acc@5 94.824 (95.474) 
[2022-01-26 12:37:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 80.938 Acc@5 95.440 [2022-01-26 12:37:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 80.9% [2022-01-26 12:37:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.07% [2022-01-26 12:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][0/1251] eta 7:23:00 lr 0.000027 time 21.2472 (21.2472) loss 3.2269 (3.2269) grad_norm 3.1365 (3.1365) [2022-01-26 12:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][10/1251] eta 1:25:09 lr 0.000027 time 2.2800 (4.1171) loss 3.1076 (3.0414) grad_norm 3.3708 (3.2595) [2022-01-26 12:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][20/1251] eta 1:03:46 lr 0.000027 time 1.2549 (3.1086) loss 3.4293 (3.0385) grad_norm 2.9073 (3.0792) [2022-01-26 12:39:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][30/1251] eta 0:58:12 lr 0.000027 time 1.5136 (2.8602) loss 3.0988 (3.0557) grad_norm 3.1067 (3.0745) [2022-01-26 12:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][40/1251] eta 0:55:19 lr 0.000027 time 3.6442 (2.7412) loss 2.6915 (3.0565) grad_norm 3.4818 (3.1295) [2022-01-26 12:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][50/1251] eta 0:53:46 lr 0.000027 time 2.5933 (2.6865) loss 3.0342 (3.0744) grad_norm 3.2174 (3.0939) [2022-01-26 12:40:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][60/1251] eta 0:52:12 lr 0.000027 time 1.7794 (2.6300) loss 3.1006 (3.0694) grad_norm 4.0371 (3.1386) [2022-01-26 12:40:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][70/1251] eta 0:50:30 lr 0.000027 time 1.6784 (2.5659) loss 1.9572 (3.0611) grad_norm 3.0461 (3.1497) [2022-01-26 12:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][80/1251] eta 0:48:54 lr 0.000027 time 2.3010 (2.5062) loss 3.0305 (3.0314) 
grad_norm 3.2158 (3.1265) [2022-01-26 12:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][90/1251] eta 0:47:31 lr 0.000027 time 1.5718 (2.4562) loss 3.1091 (3.0236) grad_norm 3.3217 (3.1510) [2022-01-26 12:41:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][100/1251] eta 0:46:12 lr 0.000027 time 1.9002 (2.4091) loss 3.2735 (3.0402) grad_norm 3.1243 (3.1685) [2022-01-26 12:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][110/1251] eta 0:45:10 lr 0.000027 time 1.9341 (2.3751) loss 2.7890 (3.0396) grad_norm 2.6884 (3.1526) [2022-01-26 12:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][120/1251] eta 0:44:17 lr 0.000027 time 1.8749 (2.3501) loss 3.5563 (3.0378) grad_norm 3.7852 (3.1532) [2022-01-26 12:42:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][130/1251] eta 0:43:47 lr 0.000027 time 2.5421 (2.3435) loss 2.2563 (3.0294) grad_norm 2.8733 (3.1349) [2022-01-26 12:43:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][140/1251] eta 0:43:15 lr 0.000027 time 2.0813 (2.3361) loss 2.9539 (3.0071) grad_norm 3.0909 (3.1250) [2022-01-26 12:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][150/1251] eta 0:42:42 lr 0.000027 time 1.8034 (2.3272) loss 3.3447 (3.0048) grad_norm 2.8142 (3.1092) [2022-01-26 12:43:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][160/1251] eta 0:42:14 lr 0.000027 time 2.0786 (2.3229) loss 3.4643 (2.9977) grad_norm 3.3765 (3.1075) [2022-01-26 12:44:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][170/1251] eta 0:42:02 lr 0.000027 time 2.4160 (2.3337) loss 3.1860 (3.0013) grad_norm 2.9556 (3.1070) [2022-01-26 12:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][180/1251] eta 0:41:36 lr 0.000027 time 2.6341 (2.3309) loss 2.5302 (2.9995) grad_norm 2.7128 (3.1041) [2022-01-26 12:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [275/300][190/1251] eta 0:41:02 lr 0.000027 time 1.5721 (2.3206) loss 2.2646 (2.9927) grad_norm 3.1885 (3.0983) [2022-01-26 12:45:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][200/1251] eta 0:40:31 lr 0.000027 time 1.8214 (2.3133) loss 3.2721 (2.9978) grad_norm 3.3145 (3.1017) [2022-01-26 12:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][210/1251] eta 0:39:51 lr 0.000027 time 1.9165 (2.2977) loss 2.7243 (2.9972) grad_norm 4.2755 (3.1237) [2022-01-26 12:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][220/1251] eta 0:39:16 lr 0.000027 time 2.5183 (2.2858) loss 3.0917 (3.0032) grad_norm 3.0383 (3.1307) [2022-01-26 12:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][230/1251] eta 0:38:42 lr 0.000027 time 2.2601 (2.2748) loss 2.5483 (2.9946) grad_norm 3.0977 (3.1577) [2022-01-26 12:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][240/1251] eta 0:38:07 lr 0.000027 time 1.6713 (2.2622) loss 3.3271 (2.9935) grad_norm 3.3222 (3.1603) [2022-01-26 12:47:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][250/1251] eta 0:37:41 lr 0.000027 time 2.1434 (2.2595) loss 3.2847 (2.9973) grad_norm 2.9398 (3.1493) [2022-01-26 12:47:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][260/1251] eta 0:37:09 lr 0.000027 time 2.2365 (2.2502) loss 2.1600 (2.9912) grad_norm 2.5659 (3.1466) [2022-01-26 12:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][270/1251] eta 0:36:44 lr 0.000027 time 2.2878 (2.2471) loss 3.1778 (2.9956) grad_norm 3.2727 (3.1482) [2022-01-26 12:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][280/1251] eta 0:36:27 lr 0.000027 time 3.5324 (2.2531) loss 2.3801 (2.9948) grad_norm 3.1751 (3.1439) [2022-01-26 12:48:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][290/1251] eta 0:36:13 lr 0.000027 time 1.8005 (2.2613) loss 2.9569 (2.9937) 
grad_norm 2.9369 (3.1470) [2022-01-26 12:49:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][300/1251] eta 0:35:55 lr 0.000027 time 2.6705 (2.2660) loss 2.9993 (2.9909) grad_norm 3.1332 (3.1484) [2022-01-26 12:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][310/1251] eta 0:35:31 lr 0.000027 time 3.1170 (2.2652) loss 3.1149 (2.9865) grad_norm 2.7518 (3.1473) [2022-01-26 12:49:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][320/1251] eta 0:35:08 lr 0.000027 time 3.1365 (2.2646) loss 2.1092 (2.9764) grad_norm 2.9140 (3.1458) [2022-01-26 12:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][330/1251] eta 0:34:41 lr 0.000027 time 1.5914 (2.2602) loss 3.0341 (2.9773) grad_norm 2.8926 (3.1550) [2022-01-26 12:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][340/1251] eta 0:34:14 lr 0.000027 time 1.9025 (2.2555) loss 2.7777 (2.9770) grad_norm 2.8238 (3.1576) [2022-01-26 12:50:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][350/1251] eta 0:33:48 lr 0.000026 time 2.7579 (2.2510) loss 2.8813 (2.9773) grad_norm 2.9122 (3.1494) [2022-01-26 12:51:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][360/1251] eta 0:33:28 lr 0.000026 time 3.2314 (2.2543) loss 2.2011 (2.9752) grad_norm 2.8470 (3.1653) [2022-01-26 12:51:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][370/1251] eta 0:33:06 lr 0.000026 time 1.7524 (2.2550) loss 3.3457 (2.9809) grad_norm 3.2658 (3.1666) [2022-01-26 12:51:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][380/1251] eta 0:32:41 lr 0.000026 time 1.9570 (2.2525) loss 2.2308 (2.9797) grad_norm 3.0233 (3.1670) [2022-01-26 12:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][390/1251] eta 0:32:15 lr 0.000026 time 2.3890 (2.2485) loss 2.9663 (2.9833) grad_norm 3.4812 (3.1629) [2022-01-26 12:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [275/300][400/1251] eta 0:31:49 lr 0.000026 time 2.7314 (2.2440) loss 3.1227 (2.9850) grad_norm 2.8059 (3.1582) [2022-01-26 12:53:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][410/1251] eta 0:31:23 lr 0.000026 time 1.5615 (2.2392) loss 3.2755 (2.9863) grad_norm 2.7864 (3.1608) [2022-01-26 12:53:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][420/1251] eta 0:30:58 lr 0.000026 time 2.1087 (2.2365) loss 3.3191 (2.9914) grad_norm 2.9374 (3.1557) [2022-01-26 12:53:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][430/1251] eta 0:30:35 lr 0.000026 time 2.5322 (2.2361) loss 3.1898 (2.9878) grad_norm 3.0035 (3.1528) [2022-01-26 12:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][440/1251] eta 0:30:15 lr 0.000026 time 3.3500 (2.2389) loss 1.9619 (2.9878) grad_norm 3.1465 (3.1499) [2022-01-26 12:54:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][450/1251] eta 0:29:54 lr 0.000026 time 2.1957 (2.2398) loss 3.7611 (2.9909) grad_norm 2.7756 (3.1469) [2022-01-26 12:54:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][460/1251] eta 0:29:32 lr 0.000026 time 2.5832 (2.2404) loss 2.5031 (2.9919) grad_norm 2.9365 (3.1443) [2022-01-26 12:55:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][470/1251] eta 0:29:09 lr 0.000026 time 1.9946 (2.2396) loss 2.7509 (2.9908) grad_norm 2.7265 (3.1421) [2022-01-26 12:55:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][480/1251] eta 0:28:47 lr 0.000026 time 3.7176 (2.2406) loss 3.2470 (2.9932) grad_norm 3.4105 (3.1415) [2022-01-26 12:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][490/1251] eta 0:28:21 lr 0.000026 time 1.9887 (2.2363) loss 3.4111 (2.9901) grad_norm 2.6339 (3.1490) [2022-01-26 12:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][500/1251] eta 0:27:55 lr 0.000026 time 1.9043 (2.2311) loss 2.2120 (2.9813) 
grad_norm 3.0992 (3.1480) [2022-01-26 12:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][510/1251] eta 0:27:33 lr 0.000026 time 1.8371 (2.2309) loss 3.0075 (2.9844) grad_norm 3.1279 (3.1469) [2022-01-26 12:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][520/1251] eta 0:27:11 lr 0.000026 time 3.1709 (2.2316) loss 3.2866 (2.9856) grad_norm 2.8514 (3.1477) [2022-01-26 12:57:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][530/1251] eta 0:26:48 lr 0.000026 time 2.0164 (2.2315) loss 3.4156 (2.9844) grad_norm 2.9760 (3.1441) [2022-01-26 12:57:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][540/1251] eta 0:26:24 lr 0.000026 time 1.6877 (2.2288) loss 3.2884 (2.9833) grad_norm 2.8551 (3.1429) [2022-01-26 12:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][550/1251] eta 0:26:01 lr 0.000026 time 1.9284 (2.2273) loss 3.2563 (2.9844) grad_norm 2.7905 (3.1431) [2022-01-26 12:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][560/1251] eta 0:25:37 lr 0.000026 time 2.6818 (2.2243) loss 3.3534 (2.9863) grad_norm 2.9348 (3.1436) [2022-01-26 12:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][570/1251] eta 0:25:12 lr 0.000026 time 1.8429 (2.2206) loss 3.3958 (2.9901) grad_norm 2.8929 (3.1423) [2022-01-26 12:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][580/1251] eta 0:24:49 lr 0.000026 time 2.1640 (2.2194) loss 2.0528 (2.9892) grad_norm 2.7937 (3.1396) [2022-01-26 12:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][590/1251] eta 0:24:27 lr 0.000026 time 1.4963 (2.2195) loss 1.9306 (2.9898) grad_norm 3.1204 (3.1391) [2022-01-26 12:59:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][600/1251] eta 0:24:05 lr 0.000026 time 1.8457 (2.2206) loss 2.8279 (2.9863) grad_norm 3.1838 (3.1390) [2022-01-26 13:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [275/300][610/1251] eta 0:23:43 lr 0.000026 time 1.4980 (2.2201) loss 3.5497 (2.9845) grad_norm 2.8987 (3.1381) [2022-01-26 13:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][620/1251] eta 0:23:22 lr 0.000026 time 2.2226 (2.2220) loss 2.1429 (2.9831) grad_norm 2.5986 (3.1347) [2022-01-26 13:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][630/1251] eta 0:23:01 lr 0.000026 time 2.4455 (2.2245) loss 3.4227 (2.9826) grad_norm 3.0087 (3.1301) [2022-01-26 13:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][640/1251] eta 0:22:38 lr 0.000026 time 2.5569 (2.2236) loss 2.2778 (2.9799) grad_norm 2.8725 (3.1353) [2022-01-26 13:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][650/1251] eta 0:22:13 lr 0.000026 time 1.9246 (2.2189) loss 3.2817 (2.9817) grad_norm 3.4065 (3.1334) [2022-01-26 13:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][660/1251] eta 0:21:50 lr 0.000026 time 2.4370 (2.2175) loss 3.1366 (2.9778) grad_norm 2.6593 (3.1317) [2022-01-26 13:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][670/1251] eta 0:21:27 lr 0.000026 time 2.1002 (2.2158) loss 2.2455 (2.9755) grad_norm 3.2043 (3.1297) [2022-01-26 13:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][680/1251] eta 0:21:05 lr 0.000026 time 2.1602 (2.2158) loss 2.1216 (2.9757) grad_norm 3.0642 (3.1349) [2022-01-26 13:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][690/1251] eta 0:20:42 lr 0.000026 time 2.3057 (2.2142) loss 2.1263 (2.9746) grad_norm 3.4471 (3.1347) [2022-01-26 13:03:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][700/1251] eta 0:20:20 lr 0.000026 time 1.7045 (2.2152) loss 1.8716 (2.9760) grad_norm 2.8885 (3.1319) [2022-01-26 13:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][710/1251] eta 0:19:58 lr 0.000026 time 2.4221 (2.2162) loss 2.9756 (2.9782) 
grad_norm 2.7168 (3.1290) [2022-01-26 13:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][720/1251] eta 0:19:36 lr 0.000026 time 1.5001 (2.2151) loss 3.7567 (2.9823) grad_norm 3.0729 (3.1299) [2022-01-26 13:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][730/1251] eta 0:19:13 lr 0.000026 time 2.3194 (2.2139) loss 3.3146 (2.9830) grad_norm 3.1756 (3.1277) [2022-01-26 13:05:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][740/1251] eta 0:18:51 lr 0.000026 time 1.8243 (2.2138) loss 2.8277 (2.9860) grad_norm 3.0222 (3.1289) [2022-01-26 13:05:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][750/1251] eta 0:18:31 lr 0.000026 time 3.0383 (2.2176) loss 2.1621 (2.9851) grad_norm 3.0218 (3.1281) [2022-01-26 13:05:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][760/1251] eta 0:18:08 lr 0.000026 time 1.7659 (2.2159) loss 3.0468 (2.9828) grad_norm 2.9147 (3.1300) [2022-01-26 13:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][770/1251] eta 0:17:44 lr 0.000026 time 1.9612 (2.2132) loss 3.0471 (2.9865) grad_norm 3.2398 (3.1300) [2022-01-26 13:06:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][780/1251] eta 0:17:21 lr 0.000026 time 1.6361 (2.2104) loss 3.0849 (2.9869) grad_norm 2.7826 (3.1361) [2022-01-26 13:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][790/1251] eta 0:16:58 lr 0.000026 time 1.8692 (2.2083) loss 2.9920 (2.9860) grad_norm 2.8013 (3.1365) [2022-01-26 13:07:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][800/1251] eta 0:16:37 lr 0.000026 time 5.1481 (2.2108) loss 2.4016 (2.9877) grad_norm 2.7519 (3.1353) [2022-01-26 13:07:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][810/1251] eta 0:16:14 lr 0.000026 time 2.1628 (2.2106) loss 2.4069 (2.9905) grad_norm 3.0387 (3.1344) [2022-01-26 13:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [275/300][820/1251] eta 0:15:53 lr 0.000026 time 1.4902 (2.2119) loss 3.5298 (2.9897) grad_norm 3.5940 (3.1368) [2022-01-26 13:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][830/1251] eta 0:15:31 lr 0.000026 time 2.2605 (2.2119) loss 3.3409 (2.9885) grad_norm 3.0246 (3.1369) [2022-01-26 13:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][840/1251] eta 0:15:08 lr 0.000026 time 2.4875 (2.2109) loss 3.1195 (2.9893) grad_norm 2.8406 (3.1369) [2022-01-26 13:09:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][850/1251] eta 0:14:46 lr 0.000026 time 2.7756 (2.2104) loss 3.2091 (2.9879) grad_norm 3.5219 (3.1343) [2022-01-26 13:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][860/1251] eta 0:14:24 lr 0.000026 time 2.0466 (2.2104) loss 3.3661 (2.9846) grad_norm 2.9410 (3.1328) [2022-01-26 13:09:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][870/1251] eta 0:14:02 lr 0.000026 time 2.5821 (2.2125) loss 2.8047 (2.9840) grad_norm 2.9067 (3.1322) [2022-01-26 13:10:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][880/1251] eta 0:13:40 lr 0.000026 time 2.6923 (2.2112) loss 3.7034 (2.9830) grad_norm 3.3127 (3.1325) [2022-01-26 13:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][890/1251] eta 0:13:17 lr 0.000026 time 2.1881 (2.2089) loss 2.9770 (2.9842) grad_norm 2.6801 (3.1338) [2022-01-26 13:10:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][900/1251] eta 0:12:54 lr 0.000026 time 1.9332 (2.2062) loss 3.4274 (2.9856) grad_norm 2.8269 (3.1342) [2022-01-26 13:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][910/1251] eta 0:12:32 lr 0.000026 time 1.3536 (2.2054) loss 2.9995 (2.9852) grad_norm 3.1006 (3.1328) [2022-01-26 13:11:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][920/1251] eta 0:12:10 lr 0.000026 time 2.3032 (2.2062) loss 3.5481 (2.9857) 
grad_norm 2.8235 (3.1325) [2022-01-26 13:11:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][930/1251] eta 0:11:48 lr 0.000026 time 2.2257 (2.2080) loss 2.3229 (2.9861) grad_norm 3.1236 (3.1316) [2022-01-26 13:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][940/1251] eta 0:11:27 lr 0.000026 time 2.7350 (2.2100) loss 3.3611 (2.9886) grad_norm 3.0874 (3.1317) [2022-01-26 13:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][950/1251] eta 0:11:04 lr 0.000026 time 1.7417 (2.2090) loss 2.2747 (2.9889) grad_norm 2.6956 (3.1310) [2022-01-26 13:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][960/1251] eta 0:10:42 lr 0.000026 time 2.2973 (2.2087) loss 2.6737 (2.9887) grad_norm 3.1115 (3.1330) [2022-01-26 13:13:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][970/1251] eta 0:10:19 lr 0.000026 time 1.8308 (2.2057) loss 3.0230 (2.9877) grad_norm 8.9726 (3.1394) [2022-01-26 13:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][980/1251] eta 0:09:57 lr 0.000026 time 1.6181 (2.2037) loss 3.3114 (2.9874) grad_norm 2.5682 (3.1364) [2022-01-26 13:14:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][990/1251] eta 0:09:34 lr 0.000026 time 1.7323 (2.2020) loss 3.0323 (2.9874) grad_norm 2.8259 (3.1342) [2022-01-26 13:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1000/1251] eta 0:09:12 lr 0.000026 time 2.4273 (2.2014) loss 2.8871 (2.9872) grad_norm 2.9947 (3.1320) [2022-01-26 13:14:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1010/1251] eta 0:08:50 lr 0.000026 time 1.8991 (2.2018) loss 3.2237 (2.9868) grad_norm 2.8764 (3.1297) [2022-01-26 13:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1020/1251] eta 0:08:28 lr 0.000026 time 1.8267 (2.2016) loss 3.7616 (2.9884) grad_norm 3.0889 (3.1288) [2022-01-26 13:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [275/300][1030/1251] eta 0:08:06 lr 0.000026 time 2.4373 (2.2026) loss 3.3666 (2.9879) grad_norm 2.8712 (3.1271) [2022-01-26 13:15:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1040/1251] eta 0:07:45 lr 0.000026 time 1.9523 (2.2039) loss 2.7175 (2.9889) grad_norm 3.2536 (3.1262) [2022-01-26 13:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1050/1251] eta 0:07:23 lr 0.000026 time 2.2612 (2.2054) loss 2.8487 (2.9860) grad_norm 3.0731 (3.1252) [2022-01-26 13:16:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1060/1251] eta 0:07:01 lr 0.000026 time 1.5663 (2.2049) loss 2.6913 (2.9869) grad_norm 3.1683 (3.1241) [2022-01-26 13:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1070/1251] eta 0:06:38 lr 0.000026 time 2.3393 (2.2038) loss 3.1438 (2.9877) grad_norm 2.7581 (3.1225) [2022-01-26 13:17:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1080/1251] eta 0:06:16 lr 0.000026 time 1.6019 (2.2031) loss 2.3329 (2.9870) grad_norm 3.1746 (3.1226) [2022-01-26 13:17:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1090/1251] eta 0:05:54 lr 0.000026 time 2.4127 (2.2033) loss 3.2502 (2.9897) grad_norm 2.7012 (3.1207) [2022-01-26 13:18:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1100/1251] eta 0:05:32 lr 0.000026 time 1.6907 (2.2015) loss 2.9069 (2.9919) grad_norm 2.7841 (3.1201) [2022-01-26 13:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1110/1251] eta 0:05:10 lr 0.000026 time 2.6595 (2.2013) loss 3.0517 (2.9922) grad_norm 3.6928 (3.1220) [2022-01-26 13:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1120/1251] eta 0:04:48 lr 0.000026 time 1.5626 (2.2008) loss 2.8399 (2.9934) grad_norm 3.2723 (3.1246) [2022-01-26 13:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1130/1251] eta 0:04:26 lr 0.000026 time 2.1343 (2.2022) loss 2.7902 
(2.9930) grad_norm 2.9288 (3.1254) [2022-01-26 13:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1140/1251] eta 0:04:04 lr 0.000026 time 1.9107 (2.2015) loss 3.2740 (2.9917) grad_norm 2.6117 (3.1230) [2022-01-26 13:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1150/1251] eta 0:03:42 lr 0.000026 time 1.6380 (2.2003) loss 2.5567 (2.9942) grad_norm 3.1187 (3.1233) [2022-01-26 13:20:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1160/1251] eta 0:03:20 lr 0.000026 time 2.3172 (2.2003) loss 3.0854 (2.9953) grad_norm 2.7680 (3.1233) [2022-01-26 13:20:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1170/1251] eta 0:02:58 lr 0.000026 time 2.7331 (2.2016) loss 2.9727 (2.9957) grad_norm 3.1531 (3.1229) [2022-01-26 13:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1180/1251] eta 0:02:36 lr 0.000026 time 2.3948 (2.2012) loss 3.5883 (2.9981) grad_norm 2.6911 (3.1216) [2022-01-26 13:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1190/1251] eta 0:02:14 lr 0.000026 time 2.2344 (2.2002) loss 2.8039 (2.9977) grad_norm 2.6922 (3.1223) [2022-01-26 13:21:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1200/1251] eta 0:01:52 lr 0.000026 time 2.1812 (2.2002) loss 2.5957 (2.9975) grad_norm 3.0760 (3.1198) [2022-01-26 13:22:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1210/1251] eta 0:01:30 lr 0.000026 time 2.0190 (2.2015) loss 3.1459 (2.9959) grad_norm 3.3206 (3.1195) [2022-01-26 13:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1220/1251] eta 0:01:08 lr 0.000026 time 1.9171 (2.2000) loss 2.4612 (2.9931) grad_norm 3.5738 (3.1196) [2022-01-26 13:22:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1230/1251] eta 0:00:46 lr 0.000026 time 2.7379 (2.2000) loss 3.1623 (2.9928) grad_norm 3.3741 (3.1199) [2022-01-26 13:23:10 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [275/300][1240/1251] eta 0:00:24 lr 0.000026 time 2.1524 (2.1992) loss 2.9277 (2.9940) grad_norm 3.4495 (3.1195) [2022-01-26 13:23:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [275/300][1250/1251] eta 0:00:02 lr 0.000026 time 1.2887 (2.1940) loss 3.0344 (2.9930) grad_norm 2.8280 (3.1214) [2022-01-26 13:23:26 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 275 training takes 0:45:45 [2022-01-26 13:23:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 17.977 (17.977) Loss 0.8128 (0.8128) Acc@1 81.641 (81.641) Acc@5 94.727 (94.727) [2022-01-26 13:24:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.973 (3.291) Loss 0.8705 (0.8063) Acc@1 80.078 (81.090) Acc@5 94.531 (95.419) [2022-01-26 13:24:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.318 (2.538) Loss 0.8451 (0.8093) Acc@1 80.762 (81.129) Acc@5 95.508 (95.382) [2022-01-26 13:24:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.589 (2.326) Loss 0.8646 (0.8119) Acc@1 82.227 (81.105) Acc@5 94.531 (95.357) [2022-01-26 13:24:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.125 (2.178) Loss 0.7637 (0.8147) Acc@1 82.129 (80.926) Acc@5 95.801 (95.384) [2022-01-26 13:25:03 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.010 Acc@5 95.418 [2022-01-26 13:25:03 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-01-26 13:25:03 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.07% [2022-01-26 13:25:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][0/1251] eta 7:36:40 lr 0.000026 time 21.9032 (21.9032) loss 3.5102 (3.5102) grad_norm 3.3729 (3.3729) [2022-01-26 13:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][10/1251] eta 1:24:32 lr 0.000026 time 2.8538 (4.0874) loss 3.1797 (2.9037) grad_norm 2.6437 (3.1744) [2022-01-26 13:26:09 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][20/1251] eta 1:04:42 lr 0.000026 time 1.5220 (3.1543) loss 3.2677 (2.8688) grad_norm 2.8352 (3.1178) [2022-01-26 13:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][30/1251] eta 0:58:22 lr 0.000026 time 1.8688 (2.8686) loss 3.4888 (2.9882) grad_norm 3.7372 (3.1673) [2022-01-26 13:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][40/1251] eta 0:55:10 lr 0.000026 time 3.3526 (2.7337) loss 3.4322 (3.0281) grad_norm 2.5048 (3.1100) [2022-01-26 13:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][50/1251] eta 0:53:39 lr 0.000025 time 2.4841 (2.6805) loss 3.3373 (3.0248) grad_norm 2.7033 (3.0958) [2022-01-26 13:27:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][60/1251] eta 0:51:49 lr 0.000025 time 1.9184 (2.6105) loss 2.4638 (2.9642) grad_norm 4.1675 (3.1133) [2022-01-26 13:28:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][70/1251] eta 0:50:49 lr 0.000025 time 1.5034 (2.5820) loss 2.7592 (2.9826) grad_norm 2.7741 (3.1367) [2022-01-26 13:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][80/1251] eta 0:49:36 lr 0.000025 time 3.0131 (2.5417) loss 2.5838 (2.9920) grad_norm 2.6997 (3.1540) [2022-01-26 13:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][90/1251] eta 0:48:18 lr 0.000025 time 1.9671 (2.4961) loss 3.7111 (3.0032) grad_norm 3.5033 (3.1880) [2022-01-26 13:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][100/1251] eta 0:46:58 lr 0.000025 time 1.8280 (2.4486) loss 3.4437 (3.0130) grad_norm 2.4688 (3.1553) [2022-01-26 13:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][110/1251] eta 0:46:08 lr 0.000025 time 1.8962 (2.4265) loss 2.5639 (3.0070) grad_norm 2.8004 (3.1307) [2022-01-26 13:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][120/1251] eta 0:45:21 lr 0.000025 time 
2.7663 (2.4066) loss 3.2850 (2.9851) grad_norm 3.6949 (3.1189) [2022-01-26 13:30:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][130/1251] eta 0:44:39 lr 0.000025 time 1.8193 (2.3899) loss 3.4224 (2.9993) grad_norm 7.1061 (3.1370) [2022-01-26 13:30:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][140/1251] eta 0:43:48 lr 0.000025 time 2.2211 (2.3658) loss 3.2667 (3.0131) grad_norm 2.9787 (3.1425) [2022-01-26 13:30:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][150/1251] eta 0:43:12 lr 0.000025 time 2.2579 (2.3551) loss 2.2768 (2.9980) grad_norm 2.7171 (3.1533) [2022-01-26 13:31:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][160/1251] eta 0:42:40 lr 0.000025 time 2.5703 (2.3471) loss 3.2405 (3.0085) grad_norm 3.1749 (3.1448) [2022-01-26 13:31:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][170/1251] eta 0:42:09 lr 0.000025 time 2.2943 (2.3397) loss 3.3103 (3.0070) grad_norm 3.0331 (3.1499) [2022-01-26 13:32:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][180/1251] eta 0:41:30 lr 0.000025 time 2.2499 (2.3251) loss 2.8936 (2.9980) grad_norm 4.1222 (3.1444) [2022-01-26 13:32:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][190/1251] eta 0:40:57 lr 0.000025 time 1.8242 (2.3167) loss 3.2398 (2.9988) grad_norm 3.4105 (3.1449) [2022-01-26 13:32:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][200/1251] eta 0:40:29 lr 0.000025 time 3.3211 (2.3118) loss 3.4161 (3.0151) grad_norm 3.1773 (3.1581) [2022-01-26 13:33:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][210/1251] eta 0:39:53 lr 0.000025 time 1.7786 (2.2997) loss 2.7036 (3.0115) grad_norm 2.7377 (3.1575) [2022-01-26 13:33:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][220/1251] eta 0:39:21 lr 0.000025 time 1.7195 (2.2907) loss 3.2452 (3.0234) grad_norm 2.9548 (3.1528) [2022-01-26 13:33:51 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][230/1251] eta 0:38:56 lr 0.000025 time 2.0990 (2.2887) loss 3.5696 (3.0343) grad_norm 3.3132 (3.1578) [2022-01-26 13:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][240/1251] eta 0:38:41 lr 0.000025 time 3.4426 (2.2967) loss 2.6148 (3.0330) grad_norm 4.0261 (3.1646) [2022-01-26 13:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][250/1251] eta 0:38:16 lr 0.000025 time 1.4408 (2.2939) loss 2.3538 (3.0212) grad_norm 3.0561 (3.1574) [2022-01-26 13:35:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][260/1251] eta 0:37:49 lr 0.000025 time 2.2029 (2.2897) loss 2.8972 (3.0257) grad_norm 2.9378 (3.1553) [2022-01-26 13:35:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][270/1251] eta 0:37:18 lr 0.000025 time 1.8352 (2.2818) loss 2.8166 (3.0137) grad_norm 2.7538 (3.1562) [2022-01-26 13:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][280/1251] eta 0:36:54 lr 0.000025 time 2.8286 (2.2806) loss 2.1462 (3.0055) grad_norm 2.8226 (3.1536) [2022-01-26 13:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][290/1251] eta 0:36:23 lr 0.000025 time 1.7312 (2.2724) loss 2.3060 (2.9960) grad_norm 3.1136 (3.1495) [2022-01-26 13:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][300/1251] eta 0:35:56 lr 0.000025 time 1.7218 (2.2679) loss 3.4636 (2.9985) grad_norm 2.9374 (3.1518) [2022-01-26 13:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][310/1251] eta 0:35:32 lr 0.000025 time 2.1897 (2.2658) loss 2.0296 (3.0034) grad_norm 3.0797 (3.1497) [2022-01-26 13:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][320/1251] eta 0:35:12 lr 0.000025 time 3.1766 (2.2691) loss 3.1080 (3.0046) grad_norm 2.9519 (3.1562) [2022-01-26 13:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][330/1251] eta 0:34:47 lr 
0.000025 time 1.5558 (2.2671) loss 3.3336 (3.0094) grad_norm 3.8186 (3.1549) [2022-01-26 13:37:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][340/1251] eta 0:34:22 lr 0.000025 time 2.2296 (2.2638) loss 1.8976 (3.0083) grad_norm 3.1896 (3.1540) [2022-01-26 13:38:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][350/1251] eta 0:33:54 lr 0.000025 time 1.8160 (2.2584) loss 3.3436 (3.0203) grad_norm 3.4831 (3.1581) [2022-01-26 13:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][360/1251] eta 0:33:33 lr 0.000025 time 2.7034 (2.2596) loss 2.5438 (3.0186) grad_norm 3.1087 (3.1548) [2022-01-26 13:38:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][370/1251] eta 0:33:06 lr 0.000025 time 1.6612 (2.2550) loss 2.0714 (3.0194) grad_norm 2.8685 (3.1570) [2022-01-26 13:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][380/1251] eta 0:32:45 lr 0.000025 time 2.4592 (2.2568) loss 3.5801 (3.0260) grad_norm 2.7418 (3.1518) [2022-01-26 13:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][390/1251] eta 0:32:21 lr 0.000025 time 2.3376 (2.2555) loss 3.2340 (3.0222) grad_norm 2.8488 (3.1576) [2022-01-26 13:40:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][400/1251] eta 0:31:59 lr 0.000025 time 2.6069 (2.2552) loss 3.2813 (3.0219) grad_norm 3.1105 (3.1545) [2022-01-26 13:40:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][410/1251] eta 0:31:32 lr 0.000025 time 1.9616 (2.2508) loss 3.3701 (3.0257) grad_norm 3.3374 (3.1516) [2022-01-26 13:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][420/1251] eta 0:31:09 lr 0.000025 time 1.8584 (2.2497) loss 2.5164 (3.0218) grad_norm 3.0387 (3.1577) [2022-01-26 13:41:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][430/1251] eta 0:30:44 lr 0.000025 time 2.0384 (2.2469) loss 3.2624 (3.0222) grad_norm 3.1930 (3.1564) [2022-01-26 13:41:34 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][440/1251] eta 0:30:22 lr 0.000025 time 2.4964 (2.2478) loss 3.4368 (3.0255) grad_norm 2.9328 (3.1546) [2022-01-26 13:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][450/1251] eta 0:29:59 lr 0.000025 time 1.5403 (2.2470) loss 3.1400 (3.0258) grad_norm 2.7268 (3.1490) [2022-01-26 13:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][460/1251] eta 0:29:35 lr 0.000025 time 2.2311 (2.2452) loss 2.7233 (3.0233) grad_norm 3.4359 (3.1496) [2022-01-26 13:42:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][470/1251] eta 0:29:12 lr 0.000025 time 1.7596 (2.2433) loss 2.0148 (3.0191) grad_norm 3.6160 (3.1473) [2022-01-26 13:43:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][480/1251] eta 0:28:51 lr 0.000025 time 3.1284 (2.2453) loss 3.4843 (3.0202) grad_norm 3.4043 (3.1465) [2022-01-26 13:43:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][490/1251] eta 0:28:28 lr 0.000025 time 2.1638 (2.2453) loss 3.6496 (3.0216) grad_norm 3.2431 (3.1480) [2022-01-26 13:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][500/1251] eta 0:28:03 lr 0.000025 time 2.0195 (2.2423) loss 3.3613 (3.0198) grad_norm 2.5765 (3.1430) [2022-01-26 13:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][510/1251] eta 0:27:39 lr 0.000025 time 1.8952 (2.2397) loss 3.7358 (3.0177) grad_norm 3.0319 (3.1416) [2022-01-26 13:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][520/1251] eta 0:27:15 lr 0.000025 time 1.9076 (2.2372) loss 3.4036 (3.0195) grad_norm 3.0493 (3.1400) [2022-01-26 13:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][530/1251] eta 0:26:50 lr 0.000025 time 1.5201 (2.2341) loss 3.4441 (3.0203) grad_norm 3.0608 (3.1407) [2022-01-26 13:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][540/1251] eta 0:26:27 lr 
0.000025 time 1.8594 (2.2332) loss 3.4905 (3.0254) grad_norm 3.4104 (3.1412) [2022-01-26 13:45:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][550/1251] eta 0:26:04 lr 0.000025 time 2.2699 (2.2325) loss 3.1626 (3.0242) grad_norm 2.8418 (3.1391) [2022-01-26 13:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][560/1251] eta 0:25:42 lr 0.000025 time 2.1636 (2.2326) loss 2.5533 (3.0232) grad_norm 3.2384 (3.1417) [2022-01-26 13:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][570/1251] eta 0:25:20 lr 0.000025 time 1.8928 (2.2333) loss 3.3022 (3.0265) grad_norm 3.0399 (3.1462) [2022-01-26 13:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][580/1251] eta 0:24:57 lr 0.000025 time 1.9243 (2.2312) loss 3.1125 (3.0300) grad_norm 4.1273 (3.1494) [2022-01-26 13:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][590/1251] eta 0:24:34 lr 0.000025 time 2.2144 (2.2301) loss 2.9077 (3.0266) grad_norm 3.0838 (3.1481) [2022-01-26 13:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][600/1251] eta 0:24:11 lr 0.000025 time 1.6962 (2.2302) loss 3.0645 (3.0273) grad_norm 2.8336 (3.1476) [2022-01-26 13:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][610/1251] eta 0:23:49 lr 0.000025 time 1.9319 (2.2307) loss 2.0508 (3.0237) grad_norm 2.9468 (3.1464) [2022-01-26 13:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][620/1251] eta 0:23:26 lr 0.000025 time 1.5986 (2.2283) loss 2.8582 (3.0201) grad_norm 3.0828 (3.1448) [2022-01-26 13:48:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][630/1251] eta 0:23:03 lr 0.000025 time 2.1631 (2.2284) loss 1.8018 (3.0218) grad_norm 2.9724 (3.1411) [2022-01-26 13:48:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][640/1251] eta 0:22:42 lr 0.000025 time 1.8546 (2.2299) loss 2.7347 (3.0220) grad_norm 2.7852 (3.1389) [2022-01-26 13:49:15 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][650/1251] eta 0:22:20 lr 0.000025 time 3.0292 (2.2309) loss 3.3806 (3.0181) grad_norm 3.2828 (3.1367) [2022-01-26 13:49:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][660/1251] eta 0:21:56 lr 0.000025 time 1.9034 (2.2274) loss 3.3956 (3.0200) grad_norm 3.3108 (3.1356) [2022-01-26 13:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][670/1251] eta 0:21:32 lr 0.000025 time 2.2429 (2.2241) loss 2.8489 (3.0199) grad_norm 2.8406 (3.1363) [2022-01-26 13:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][680/1251] eta 0:21:08 lr 0.000025 time 2.1703 (2.2221) loss 2.2465 (3.0159) grad_norm 3.0055 (3.1348) [2022-01-26 13:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][690/1251] eta 0:20:47 lr 0.000025 time 2.9930 (2.2239) loss 2.1701 (3.0154) grad_norm 3.0695 (3.1342) [2022-01-26 13:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][700/1251] eta 0:20:26 lr 0.000025 time 1.8810 (2.2264) loss 2.5607 (3.0142) grad_norm 4.1841 (3.1383) [2022-01-26 13:51:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][710/1251] eta 0:20:07 lr 0.000025 time 1.8501 (2.2317) loss 3.5143 (3.0172) grad_norm 3.0944 (3.1400) [2022-01-26 13:51:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][720/1251] eta 0:19:46 lr 0.000025 time 3.1156 (2.2351) loss 3.4381 (3.0189) grad_norm 2.9564 (3.1372) [2022-01-26 13:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][730/1251] eta 0:19:23 lr 0.000025 time 2.1098 (2.2341) loss 2.2285 (3.0208) grad_norm 3.0426 (3.1383) [2022-01-26 13:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][740/1251] eta 0:19:00 lr 0.000025 time 2.0081 (2.2327) loss 3.1567 (3.0206) grad_norm 2.9517 (3.1367) [2022-01-26 13:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][750/1251] eta 0:18:36 lr 
0.000025 time 1.9845 (2.2283) loss 2.9664 (3.0197) grad_norm 3.4781 (3.1383) [2022-01-26 13:53:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][760/1251] eta 0:18:13 lr 0.000025 time 3.1175 (2.2267) loss 3.3077 (3.0208) grad_norm 4.0783 (3.1406) [2022-01-26 13:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][770/1251] eta 0:17:50 lr 0.000025 time 1.6516 (2.2254) loss 2.9747 (3.0211) grad_norm 3.0281 (3.1411) [2022-01-26 13:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][780/1251] eta 0:17:27 lr 0.000025 time 2.3220 (2.2236) loss 3.1784 (3.0194) grad_norm 3.0627 (3.1436) [2022-01-26 13:54:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][790/1251] eta 0:17:04 lr 0.000025 time 2.2310 (2.2216) loss 3.6746 (3.0224) grad_norm 3.0297 (3.1429) [2022-01-26 13:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][800/1251] eta 0:16:42 lr 0.000025 time 3.0038 (2.2220) loss 1.9124 (3.0211) grad_norm 2.7564 (3.1434) [2022-01-26 13:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][810/1251] eta 0:16:19 lr 0.000025 time 1.8115 (2.2200) loss 2.6291 (3.0211) grad_norm 3.0937 (3.1446) [2022-01-26 13:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][820/1251] eta 0:15:57 lr 0.000025 time 2.3078 (2.2208) loss 2.1411 (3.0185) grad_norm 3.6025 (3.1462) [2022-01-26 13:55:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][830/1251] eta 0:15:34 lr 0.000025 time 2.2361 (2.2202) loss 3.0902 (3.0166) grad_norm 3.1206 (3.1468) [2022-01-26 13:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][840/1251] eta 0:15:13 lr 0.000025 time 3.4587 (2.2219) loss 2.8318 (3.0155) grad_norm 2.6373 (3.1470) [2022-01-26 13:56:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][850/1251] eta 0:14:51 lr 0.000025 time 2.5406 (2.2230) loss 3.4631 (3.0171) grad_norm 3.3555 (3.1484) [2022-01-26 13:56:59 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][860/1251] eta 0:14:30 lr 0.000025 time 2.4529 (2.2252) loss 3.3053 (3.0184) grad_norm 3.1019 (3.1479) [2022-01-26 13:57:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][870/1251] eta 0:14:08 lr 0.000025 time 2.1785 (2.2269) loss 2.7352 (3.0173) grad_norm 3.1245 (3.1460) [2022-01-26 13:57:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][880/1251] eta 0:13:46 lr 0.000025 time 1.8639 (2.2271) loss 2.9451 (3.0193) grad_norm 3.3315 (3.1476) [2022-01-26 13:58:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][890/1251] eta 0:13:22 lr 0.000025 time 1.6976 (2.2231) loss 2.1195 (3.0202) grad_norm 2.8412 (3.1468) [2022-01-26 13:58:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][900/1251] eta 0:12:58 lr 0.000025 time 2.1986 (2.2191) loss 3.3673 (3.0210) grad_norm 2.7419 (3.1475) [2022-01-26 13:58:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][910/1251] eta 0:12:35 lr 0.000025 time 1.9634 (2.2161) loss 3.1781 (3.0193) grad_norm 2.7480 (3.1447) [2022-01-26 13:59:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][920/1251] eta 0:12:13 lr 0.000025 time 2.2763 (2.2150) loss 3.5612 (3.0181) grad_norm 3.2080 (3.1446) [2022-01-26 13:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][930/1251] eta 0:11:50 lr 0.000025 time 1.8547 (2.2142) loss 3.1793 (3.0171) grad_norm 3.6825 (3.1427) [2022-01-26 13:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][940/1251] eta 0:11:28 lr 0.000025 time 2.5869 (2.2136) loss 3.2158 (3.0155) grad_norm 3.2283 (3.1413) [2022-01-26 14:00:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][950/1251] eta 0:11:05 lr 0.000025 time 1.6646 (2.2123) loss 3.1266 (3.0174) grad_norm 2.7671 (3.1399) [2022-01-26 14:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][960/1251] eta 0:10:43 lr 
0.000025 time 2.4293 (2.2122) loss 3.3386 (3.0166) grad_norm 2.9570 (3.1396) [2022-01-26 14:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][970/1251] eta 0:10:21 lr 0.000025 time 1.8532 (2.2116) loss 3.1668 (3.0153) grad_norm 3.0372 (3.1418) [2022-01-26 14:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][980/1251] eta 0:09:59 lr 0.000025 time 2.6706 (2.2128) loss 2.9511 (3.0170) grad_norm 3.3770 (3.1418) [2022-01-26 14:01:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][990/1251] eta 0:09:37 lr 0.000025 time 1.6873 (2.2121) loss 3.5864 (3.0163) grad_norm 3.4954 (3.1405) [2022-01-26 14:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1000/1251] eta 0:09:15 lr 0.000025 time 2.1932 (2.2137) loss 1.9156 (3.0156) grad_norm 2.8218 (3.1386) [2022-01-26 14:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1010/1251] eta 0:08:53 lr 0.000025 time 1.8576 (2.2133) loss 2.9376 (3.0163) grad_norm 2.8237 (3.1394) [2022-01-26 14:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1020/1251] eta 0:08:31 lr 0.000025 time 3.2821 (2.2146) loss 2.3730 (3.0152) grad_norm 2.5828 (3.1389) [2022-01-26 14:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1030/1251] eta 0:08:09 lr 0.000025 time 1.8783 (2.2143) loss 2.0675 (3.0152) grad_norm 3.1926 (3.1376) [2022-01-26 14:03:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1040/1251] eta 0:07:47 lr 0.000024 time 1.2489 (2.2170) loss 3.4010 (3.0145) grad_norm 2.8705 (3.1370) [2022-01-26 14:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1050/1251] eta 0:07:25 lr 0.000024 time 1.9802 (2.2179) loss 3.3290 (3.0137) grad_norm 3.3786 (3.1382) [2022-01-26 14:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1060/1251] eta 0:07:03 lr 0.000024 time 3.2267 (2.2188) loss 3.6246 (3.0145) grad_norm 3.6481 (3.1386) [2022-01-26 
14:04:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1070/1251] eta 0:06:41 lr 0.000024 time 1.5754 (2.2174) loss 2.2843 (3.0128) grad_norm 2.7111 (3.1382) [2022-01-26 14:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1080/1251] eta 0:06:19 lr 0.000024 time 1.9142 (2.2168) loss 2.8097 (3.0132) grad_norm 3.6732 (3.1374) [2022-01-26 14:05:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1090/1251] eta 0:05:56 lr 0.000024 time 1.6108 (2.2158) loss 3.1155 (3.0136) grad_norm 3.2048 (3.1370) [2022-01-26 14:05:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1100/1251] eta 0:05:34 lr 0.000024 time 2.2648 (2.2158) loss 3.2113 (3.0134) grad_norm 4.1996 (3.1398) [2022-01-26 14:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1110/1251] eta 0:05:12 lr 0.000024 time 1.9529 (2.2151) loss 3.1424 (3.0142) grad_norm 2.7544 (3.1428) [2022-01-26 14:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1120/1251] eta 0:04:50 lr 0.000024 time 1.9849 (2.2138) loss 3.3964 (3.0161) grad_norm 3.3853 (3.1447) [2022-01-26 14:06:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1130/1251] eta 0:04:27 lr 0.000024 time 1.6135 (2.2119) loss 2.2323 (3.0150) grad_norm 3.0761 (3.1447) [2022-01-26 14:07:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1140/1251] eta 0:04:05 lr 0.000024 time 2.1711 (2.2112) loss 3.5799 (3.0164) grad_norm 2.9724 (3.1451) [2022-01-26 14:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1150/1251] eta 0:03:43 lr 0.000024 time 2.4963 (2.2115) loss 3.0547 (3.0148) grad_norm 3.1572 (3.1441) [2022-01-26 14:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1160/1251] eta 0:03:21 lr 0.000024 time 2.2015 (2.2114) loss 2.5087 (3.0138) grad_norm 2.6878 (3.1420) [2022-01-26 14:08:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1170/1251] 
eta 0:02:59 lr 0.000024 time 1.8257 (2.2108) loss 2.7389 (3.0137) grad_norm 2.8942 (3.1426) [2022-01-26 14:08:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1180/1251] eta 0:02:36 lr 0.000024 time 1.9815 (2.2108) loss 1.9163 (3.0130) grad_norm 3.7316 (3.1443) [2022-01-26 14:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1190/1251] eta 0:02:14 lr 0.000024 time 1.9569 (2.2118) loss 2.8298 (3.0136) grad_norm 2.9586 (3.1444) [2022-01-26 14:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1200/1251] eta 0:01:52 lr 0.000024 time 2.7025 (2.2132) loss 3.2175 (3.0122) grad_norm 3.3891 (3.1432) [2022-01-26 14:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1210/1251] eta 0:01:30 lr 0.000024 time 1.7937 (2.2136) loss 3.2669 (3.0116) grad_norm 5.3746 (3.1447) [2022-01-26 14:10:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1220/1251] eta 0:01:08 lr 0.000024 time 1.5899 (2.2126) loss 2.8369 (3.0130) grad_norm 2.6353 (3.1437) [2022-01-26 14:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1230/1251] eta 0:00:46 lr 0.000024 time 1.6328 (2.2110) loss 2.4070 (3.0127) grad_norm 2.9759 (3.1445) [2022-01-26 14:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1240/1251] eta 0:00:24 lr 0.000024 time 1.9095 (2.2112) loss 3.1504 (3.0110) grad_norm 3.0428 (3.1440) [2022-01-26 14:11:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [276/300][1250/1251] eta 0:00:02 lr 0.000024 time 1.2099 (2.2057) loss 2.1457 (3.0092) grad_norm 2.9362 (3.1425) [2022-01-26 14:11:03 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 276 training takes 0:45:59 [2022-01-26 14:11:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.001 (18.001) Loss 0.7028 (0.7028) Acc@1 84.375 (84.375) Acc@5 96.387 (96.387) [2022-01-26 14:11:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.812 (3.376) 
Loss 0.8725 (0.7891) Acc@1 80.273 (81.809) Acc@5 94.434 (95.739) [2022-01-26 14:11:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.595 (2.619) Loss 0.8424 (0.8062) Acc@1 80.762 (81.445) Acc@5 94.727 (95.503) [2022-01-26 14:12:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.928 (2.316) Loss 0.8009 (0.8091) Acc@1 81.738 (81.370) Acc@5 96.484 (95.508) [2022-01-26 14:12:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.925 (2.231) Loss 0.8641 (0.8189) Acc@1 79.980 (81.014) Acc@5 94.824 (95.410) [2022-01-26 14:12:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.048 Acc@5 95.400 [2022-01-26 14:12:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-01-26 14:12:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.07% [2022-01-26 14:13:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][0/1251] eta 7:47:34 lr 0.000024 time 22.4257 (22.4257) loss 2.3425 (2.3425) grad_norm 2.5916 (2.5916) [2022-01-26 14:13:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][10/1251] eta 1:25:55 lr 0.000024 time 2.0921 (4.1544) loss 3.2997 (2.7977) grad_norm 2.8175 (3.6182) [2022-01-26 14:13:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][20/1251] eta 1:06:52 lr 0.000024 time 1.3449 (3.2596) loss 3.4921 (2.9226) grad_norm 2.5747 (3.3823) [2022-01-26 14:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][30/1251] eta 0:58:40 lr 0.000024 time 1.9790 (2.8833) loss 3.4842 (2.9483) grad_norm 3.4301 (3.2882) [2022-01-26 14:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][40/1251] eta 0:54:57 lr 0.000024 time 3.6918 (2.7228) loss 2.4267 (2.9356) grad_norm 2.6797 (3.3073) [2022-01-26 14:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][50/1251] eta 0:52:53 lr 0.000024 time 3.1875 (2.6425) loss 3.0000 (2.9253) 
grad_norm 2.9403 (3.2514) [2022-01-26 14:15:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][60/1251] eta 0:51:06 lr 0.000024 time 1.9353 (2.5745) loss 3.1122 (2.9399) grad_norm 2.9837 (3.2324) [2022-01-26 14:15:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][70/1251] eta 0:49:35 lr 0.000024 time 1.7740 (2.5197) loss 2.8620 (2.9759) grad_norm 3.1008 (3.2445) [2022-01-26 14:16:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][80/1251] eta 0:48:40 lr 0.000024 time 3.7200 (2.4939) loss 3.1670 (2.9680) grad_norm 3.6180 (3.2165) [2022-01-26 14:16:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][90/1251] eta 0:47:39 lr 0.000024 time 2.7794 (2.4632) loss 3.7549 (2.9853) grad_norm 2.9806 (3.2119) [2022-01-26 14:16:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][100/1251] eta 0:46:39 lr 0.000024 time 1.9365 (2.4320) loss 3.1617 (2.9686) grad_norm 3.0352 (3.1894) [2022-01-26 14:17:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][110/1251] eta 0:45:38 lr 0.000024 time 1.8161 (2.3999) loss 2.7411 (2.9744) grad_norm 3.3020 (3.1982) [2022-01-26 14:17:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][120/1251] eta 0:45:01 lr 0.000024 time 3.1605 (2.3884) loss 2.3530 (2.9786) grad_norm 3.2879 (3.1939) [2022-01-26 14:17:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][130/1251] eta 0:44:23 lr 0.000024 time 2.1784 (2.3761) loss 3.5526 (2.9892) grad_norm 3.3594 (3.1875) [2022-01-26 14:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][140/1251] eta 0:43:26 lr 0.000024 time 1.8096 (2.3461) loss 3.4152 (2.9921) grad_norm 2.7056 (3.1739) [2022-01-26 14:18:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][150/1251] eta 0:42:40 lr 0.000024 time 2.4788 (2.3260) loss 1.9761 (2.9870) grad_norm 3.0646 (3.1887) [2022-01-26 14:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[277/300][160/1251] eta 0:42:15 lr 0.000024 time 3.0434 (2.3241) loss 3.5042 (2.9877) grad_norm 2.7198 (3.1772) [2022-01-26 14:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][170/1251] eta 0:41:44 lr 0.000024 time 1.8455 (2.3172) loss 3.0850 (2.9825) grad_norm 2.9580 (3.1796) [2022-01-26 14:19:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][180/1251] eta 0:41:17 lr 0.000024 time 1.8919 (2.3129) loss 2.7157 (2.9757) grad_norm 2.7737 (3.1701) [2022-01-26 14:20:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][190/1251] eta 0:40:50 lr 0.000024 time 2.0495 (2.3094) loss 2.0130 (2.9644) grad_norm 3.5766 (3.1688) [2022-01-26 14:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][200/1251] eta 0:40:28 lr 0.000024 time 3.1301 (2.3102) loss 2.3069 (2.9561) grad_norm 2.9585 (3.1640) [2022-01-26 14:20:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][210/1251] eta 0:39:56 lr 0.000024 time 1.8238 (2.3019) loss 3.5725 (2.9496) grad_norm 2.7430 (3.1641) [2022-01-26 14:21:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][220/1251] eta 0:39:25 lr 0.000024 time 1.6188 (2.2941) loss 2.7182 (2.9566) grad_norm 3.3870 (3.1720) [2022-01-26 14:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][230/1251] eta 0:38:55 lr 0.000024 time 2.1936 (2.2874) loss 3.0944 (2.9676) grad_norm 3.1307 (3.1838) [2022-01-26 14:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][240/1251] eta 0:38:33 lr 0.000024 time 3.0236 (2.2883) loss 3.0966 (2.9789) grad_norm 2.7179 (3.1764) [2022-01-26 14:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][250/1251] eta 0:38:01 lr 0.000024 time 1.9947 (2.2790) loss 3.2205 (2.9801) grad_norm 3.0536 (3.1751) [2022-01-26 14:22:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][260/1251] eta 0:37:30 lr 0.000024 time 1.8619 (2.2714) loss 3.0241 (2.9796) grad_norm 
2.9887 (3.1719) [2022-01-26 14:22:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][270/1251] eta 0:37:00 lr 0.000024 time 1.8774 (2.2635) loss 2.1036 (2.9715) grad_norm 2.9044 (3.1663) [2022-01-26 14:23:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][280/1251] eta 0:36:26 lr 0.000024 time 1.7883 (2.2520) loss 3.4477 (2.9804) grad_norm 3.0083 (3.1642) [2022-01-26 14:23:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][290/1251] eta 0:35:55 lr 0.000024 time 1.5769 (2.2428) loss 2.9500 (2.9818) grad_norm 3.3073 (3.1662) [2022-01-26 14:23:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][300/1251] eta 0:35:28 lr 0.000024 time 1.8714 (2.2377) loss 2.6884 (2.9771) grad_norm 3.4370 (3.1659) [2022-01-26 14:24:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][310/1251] eta 0:35:09 lr 0.000024 time 2.5613 (2.2413) loss 3.3869 (2.9778) grad_norm 3.4011 (3.1656) [2022-01-26 14:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][320/1251] eta 0:34:46 lr 0.000024 time 2.1385 (2.2406) loss 3.5694 (2.9733) grad_norm 3.3631 (3.1605) [2022-01-26 14:25:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][330/1251] eta 0:34:32 lr 0.000024 time 2.2262 (2.2504) loss 3.1761 (2.9695) grad_norm 3.5035 (3.1648) [2022-01-26 14:25:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][340/1251] eta 0:34:22 lr 0.000024 time 2.3161 (2.2636) loss 3.4555 (2.9765) grad_norm 3.2051 (3.1640) [2022-01-26 14:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][350/1251] eta 0:34:03 lr 0.000024 time 2.1398 (2.2681) loss 3.2533 (2.9811) grad_norm 3.1943 (3.1620) [2022-01-26 14:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][360/1251] eta 0:33:37 lr 0.000024 time 1.5395 (2.2647) loss 3.1981 (2.9877) grad_norm 3.5898 (3.1623) [2022-01-26 14:26:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[277/300][370/1251] eta 0:33:08 lr 0.000024 time 1.6605 (2.2574) loss 2.6724 (2.9893) grad_norm 3.4121 (3.1663) [2022-01-26 14:26:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][380/1251] eta 0:32:41 lr 0.000024 time 1.8438 (2.2526) loss 3.3620 (2.9900) grad_norm 3.1585 (3.1653) [2022-01-26 14:27:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][390/1251] eta 0:32:21 lr 0.000024 time 1.9326 (2.2550) loss 3.3263 (2.9942) grad_norm 3.9427 (3.1666) [2022-01-26 14:27:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][400/1251] eta 0:31:53 lr 0.000024 time 1.7742 (2.2491) loss 3.4408 (2.9932) grad_norm 5.4052 (3.1726) [2022-01-26 14:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][410/1251] eta 0:31:27 lr 0.000024 time 1.9041 (2.2441) loss 3.0999 (2.9883) grad_norm 3.6807 (3.1813) [2022-01-26 14:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][420/1251] eta 0:31:01 lr 0.000024 time 1.5563 (2.2400) loss 2.8513 (2.9933) grad_norm 2.6660 (3.1774) [2022-01-26 14:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][430/1251] eta 0:30:34 lr 0.000024 time 2.4053 (2.2349) loss 2.3732 (2.9879) grad_norm 3.0605 (3.1710) [2022-01-26 14:29:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][440/1251] eta 0:30:13 lr 0.000024 time 2.7629 (2.2362) loss 3.4735 (2.9887) grad_norm 3.8324 (3.1692) [2022-01-26 14:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][450/1251] eta 0:29:51 lr 0.000024 time 2.3748 (2.2369) loss 3.6594 (2.9923) grad_norm 3.4664 (3.1682) [2022-01-26 14:29:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][460/1251] eta 0:29:27 lr 0.000024 time 2.8888 (2.2346) loss 3.8966 (2.9988) grad_norm 3.1515 (3.1651) [2022-01-26 14:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][470/1251] eta 0:29:05 lr 0.000024 time 2.7633 (2.2355) loss 3.6056 (3.0012) grad_norm 
3.1438 (3.1626) [2022-01-26 14:30:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][480/1251] eta 0:28:43 lr 0.000024 time 1.6836 (2.2358) loss 3.4243 (3.0043) grad_norm 3.5443 (3.1642) [2022-01-26 14:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][490/1251] eta 0:28:22 lr 0.000024 time 2.5446 (2.2370) loss 3.1405 (3.0075) grad_norm 3.1965 (3.1782) [2022-01-26 14:31:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][500/1251] eta 0:27:57 lr 0.000024 time 2.6583 (2.2342) loss 3.0040 (3.0039) grad_norm 2.9515 (3.1783) [2022-01-26 14:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][510/1251] eta 0:27:32 lr 0.000024 time 2.1524 (2.2298) loss 2.8187 (3.0048) grad_norm 2.8028 (3.1799) [2022-01-26 14:32:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][520/1251] eta 0:27:07 lr 0.000024 time 2.4880 (2.2267) loss 3.6548 (3.0035) grad_norm 3.4452 (3.1803) [2022-01-26 14:32:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][530/1251] eta 0:26:45 lr 0.000024 time 2.3295 (2.2261) loss 2.1542 (3.0024) grad_norm 3.4709 (3.1775) [2022-01-26 14:32:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][540/1251] eta 0:26:23 lr 0.000024 time 2.4619 (2.2277) loss 3.3860 (3.0059) grad_norm 3.1973 (3.1781) [2022-01-26 14:33:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][550/1251] eta 0:26:00 lr 0.000024 time 1.6382 (2.2256) loss 3.2923 (3.0061) grad_norm 3.6253 (3.1772) [2022-01-26 14:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][560/1251] eta 0:25:38 lr 0.000024 time 2.6647 (2.2270) loss 2.3984 (3.0057) grad_norm 3.3226 (3.1771) [2022-01-26 14:33:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][570/1251] eta 0:25:15 lr 0.000024 time 2.5419 (2.2256) loss 3.5465 (3.0094) grad_norm 3.5095 (3.1813) [2022-01-26 14:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[277/300][580/1251] eta 0:24:54 lr 0.000024 time 1.9802 (2.2268) loss 3.1807 (3.0089) grad_norm 2.9920 (3.1826) [2022-01-26 14:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][590/1251] eta 0:24:32 lr 0.000024 time 1.8522 (2.2282) loss 3.1187 (3.0091) grad_norm 3.4129 (3.1801) [2022-01-26 14:34:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][600/1251] eta 0:24:09 lr 0.000024 time 2.2874 (2.2269) loss 2.7731 (3.0063) grad_norm 3.6674 (3.1823) [2022-01-26 14:35:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][610/1251] eta 0:23:45 lr 0.000024 time 2.8458 (2.2240) loss 1.9170 (3.0040) grad_norm 3.0488 (3.1807) [2022-01-26 14:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][620/1251] eta 0:23:20 lr 0.000024 time 1.5988 (2.2197) loss 2.9778 (3.0057) grad_norm 2.6804 (3.1772) [2022-01-26 14:36:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][630/1251] eta 0:22:58 lr 0.000024 time 1.9197 (2.2201) loss 2.6223 (3.0041) grad_norm 3.6702 (3.1768) [2022-01-26 14:36:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][640/1251] eta 0:22:35 lr 0.000024 time 2.2398 (2.2182) loss 2.6033 (3.0028) grad_norm 2.9535 (3.1769) [2022-01-26 14:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][650/1251] eta 0:22:13 lr 0.000024 time 2.8690 (2.2183) loss 2.7214 (3.0031) grad_norm 2.5405 (3.1720) [2022-01-26 14:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][660/1251] eta 0:21:50 lr 0.000024 time 2.3788 (2.2177) loss 3.5358 (3.0019) grad_norm 3.5702 (3.1697) [2022-01-26 14:37:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][670/1251] eta 0:21:28 lr 0.000024 time 2.6304 (2.2185) loss 3.4879 (3.0019) grad_norm 3.2886 (3.1684) [2022-01-26 14:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][680/1251] eta 0:21:07 lr 0.000024 time 2.2395 (2.2203) loss 3.5407 (3.0003) grad_norm 
3.6392 (3.1716) [2022-01-26 14:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][690/1251] eta 0:20:46 lr 0.000024 time 2.0647 (2.2224) loss 2.7681 (3.0035) grad_norm 2.8983 (3.1678) [2022-01-26 14:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][700/1251] eta 0:20:23 lr 0.000024 time 1.7814 (2.2205) loss 2.9048 (3.0040) grad_norm 3.7125 (3.1657) [2022-01-26 14:38:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][710/1251] eta 0:19:59 lr 0.000024 time 1.6037 (2.2180) loss 2.2150 (3.0017) grad_norm 3.9869 (3.1667) [2022-01-26 14:39:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][720/1251] eta 0:19:36 lr 0.000024 time 1.9533 (2.2162) loss 2.4622 (3.0022) grad_norm 3.3163 (3.1646) [2022-01-26 14:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][730/1251] eta 0:19:14 lr 0.000024 time 2.4976 (2.2155) loss 2.8148 (3.0020) grad_norm 3.3973 (3.1649) [2022-01-26 14:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][740/1251] eta 0:18:51 lr 0.000024 time 1.5812 (2.2140) loss 3.2823 (3.0037) grad_norm 3.1787 (3.1662) [2022-01-26 14:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][750/1251] eta 0:18:30 lr 0.000024 time 1.9776 (2.2160) loss 3.4059 (3.0061) grad_norm 3.0299 (3.1656) [2022-01-26 14:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][760/1251] eta 0:18:07 lr 0.000024 time 2.2391 (2.2152) loss 2.9943 (3.0029) grad_norm 3.1326 (3.1661) [2022-01-26 14:41:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][770/1251] eta 0:17:45 lr 0.000024 time 2.5126 (2.2145) loss 3.5545 (3.0061) grad_norm 3.0600 (3.1667) [2022-01-26 14:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][780/1251] eta 0:17:22 lr 0.000024 time 1.5360 (2.2137) loss 3.4317 (3.0066) grad_norm 3.1558 (3.1713) [2022-01-26 14:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[277/300][790/1251] eta 0:17:00 lr 0.000024 time 2.1507 (2.2142) loss 3.6177 (3.0053) grad_norm 3.0919 (3.1721) [2022-01-26 14:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][800/1251] eta 0:16:38 lr 0.000024 time 1.6817 (2.2134) loss 3.3774 (3.0057) grad_norm 2.7958 (3.1710) [2022-01-26 14:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][810/1251] eta 0:16:16 lr 0.000023 time 3.2409 (2.2150) loss 2.2828 (3.0042) grad_norm 2.7678 (3.1712) [2022-01-26 14:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][820/1251] eta 0:15:54 lr 0.000023 time 1.9438 (2.2149) loss 3.1921 (3.0048) grad_norm 2.9650 (3.1720) [2022-01-26 14:43:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][830/1251] eta 0:15:32 lr 0.000023 time 1.8348 (2.2140) loss 3.4492 (3.0082) grad_norm 2.9923 (3.1732) [2022-01-26 14:43:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][840/1251] eta 0:15:09 lr 0.000023 time 1.8291 (2.2129) loss 2.7322 (3.0094) grad_norm 2.7567 (3.1726) [2022-01-26 14:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][850/1251] eta 0:14:48 lr 0.000023 time 2.7129 (2.2146) loss 3.2681 (3.0092) grad_norm 3.2708 (3.1731) [2022-01-26 14:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][860/1251] eta 0:14:25 lr 0.000023 time 1.5844 (2.2143) loss 2.6090 (3.0083) grad_norm 3.1027 (3.1736) [2022-01-26 14:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][870/1251] eta 0:14:03 lr 0.000023 time 1.8866 (2.2138) loss 2.8908 (3.0081) grad_norm 3.1673 (3.1712) [2022-01-26 14:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][880/1251] eta 0:13:40 lr 0.000023 time 1.5728 (2.2119) loss 3.6606 (3.0091) grad_norm 3.0833 (3.1699) [2022-01-26 14:45:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][890/1251] eta 0:13:18 lr 0.000023 time 2.4937 (2.2132) loss 2.8477 (3.0100) grad_norm 
2.7779 (3.1679) [2022-01-26 14:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][900/1251] eta 0:12:57 lr 0.000023 time 1.9229 (2.2151) loss 2.5902 (3.0093) grad_norm 2.8644 (3.1660) [2022-01-26 14:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][910/1251] eta 0:12:35 lr 0.000023 time 2.1349 (2.2161) loss 3.3061 (3.0081) grad_norm 3.3187 (3.1643) [2022-01-26 14:46:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][920/1251] eta 0:12:12 lr 0.000023 time 1.7807 (2.2131) loss 2.4904 (3.0062) grad_norm 3.2475 (3.1635) [2022-01-26 14:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][930/1251] eta 0:11:49 lr 0.000023 time 1.8007 (2.2098) loss 3.4036 (3.0069) grad_norm 4.0474 (3.1651) [2022-01-26 14:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][940/1251] eta 0:11:26 lr 0.000023 time 1.9852 (2.2078) loss 3.4879 (3.0091) grad_norm 3.2698 (3.1646) [2022-01-26 14:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][950/1251] eta 0:11:04 lr 0.000023 time 2.2385 (2.2081) loss 3.5297 (3.0079) grad_norm 3.1428 (3.1636) [2022-01-26 14:48:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][960/1251] eta 0:10:42 lr 0.000023 time 1.7644 (2.2076) loss 2.4891 (3.0042) grad_norm 3.0684 (3.1630) [2022-01-26 14:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][970/1251] eta 0:10:20 lr 0.000023 time 2.5157 (2.2079) loss 3.4609 (3.0062) grad_norm 3.0530 (3.1647) [2022-01-26 14:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][980/1251] eta 0:09:58 lr 0.000023 time 2.1217 (2.2075) loss 3.5521 (3.0049) grad_norm 4.2449 (3.1664) [2022-01-26 14:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][990/1251] eta 0:09:36 lr 0.000023 time 1.5180 (2.2074) loss 3.3641 (3.0034) grad_norm 3.0821 (3.1649) [2022-01-26 14:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[277/300][1000/1251] eta 0:09:14 lr 0.000023 time 2.2318 (2.2099) loss 2.7696 (3.0027) grad_norm 2.7208 (3.1638) [2022-01-26 14:49:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1010/1251] eta 0:08:52 lr 0.000023 time 1.8074 (2.2112) loss 3.1805 (3.0008) grad_norm 2.8132 (3.1632) [2022-01-26 14:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1020/1251] eta 0:08:30 lr 0.000023 time 1.8651 (2.2105) loss 2.0895 (2.9993) grad_norm 3.1312 (3.1623) [2022-01-26 14:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1030/1251] eta 0:08:08 lr 0.000023 time 1.6516 (2.2082) loss 3.3491 (2.9986) grad_norm 2.7538 (3.1610) [2022-01-26 14:50:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1040/1251] eta 0:07:45 lr 0.000023 time 1.6917 (2.2061) loss 1.8731 (2.9967) grad_norm 3.6812 (3.1624) [2022-01-26 14:51:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1050/1251] eta 0:07:23 lr 0.000023 time 1.8970 (2.2052) loss 3.2845 (2.9953) grad_norm 3.1841 (3.1645) [2022-01-26 14:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1060/1251] eta 0:07:01 lr 0.000023 time 2.7774 (2.2055) loss 3.6237 (2.9984) grad_norm 3.4183 (3.1640) [2022-01-26 14:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1070/1251] eta 0:06:39 lr 0.000023 time 1.6793 (2.2059) loss 3.3054 (2.9980) grad_norm 2.9966 (3.1626) [2022-01-26 14:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1080/1251] eta 0:06:17 lr 0.000023 time 2.5271 (2.2076) loss 3.1842 (3.0002) grad_norm 2.8498 (3.1605) [2022-01-26 14:52:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1090/1251] eta 0:05:55 lr 0.000023 time 1.8496 (2.2069) loss 3.6880 (3.0007) grad_norm 3.0253 (3.1600) [2022-01-26 14:53:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1100/1251] eta 0:05:33 lr 0.000023 time 1.5432 (2.2068) loss 2.5698 (3.0000) 
grad_norm 2.7983 (3.1597) [2022-01-26 14:53:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1110/1251] eta 0:05:11 lr 0.000023 time 1.5786 (2.2066) loss 3.1606 (3.0005) grad_norm 5.8369 (3.1610) [2022-01-26 14:53:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1120/1251] eta 0:04:49 lr 0.000023 time 1.7765 (2.2067) loss 2.9090 (3.0004) grad_norm 2.8010 (3.1771) [2022-01-26 14:54:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1130/1251] eta 0:04:26 lr 0.000023 time 1.8535 (2.2052) loss 3.1582 (3.0004) grad_norm 2.7525 (3.1750) [2022-01-26 14:54:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1140/1251] eta 0:04:04 lr 0.000023 time 1.8193 (2.2055) loss 2.9845 (2.9985) grad_norm 4.6418 (3.1751) [2022-01-26 14:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1150/1251] eta 0:03:42 lr 0.000023 time 1.8847 (2.2066) loss 3.4159 (2.9984) grad_norm 2.9830 (3.1746) [2022-01-26 14:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1160/1251] eta 0:03:20 lr 0.000023 time 2.2021 (2.2065) loss 3.4071 (2.9971) grad_norm 2.6163 (3.1729) [2022-01-26 14:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1170/1251] eta 0:02:58 lr 0.000023 time 2.0069 (2.2052) loss 2.9878 (2.9964) grad_norm 2.9621 (3.1710) [2022-01-26 14:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1180/1251] eta 0:02:36 lr 0.000023 time 2.1331 (2.2062) loss 3.5758 (2.9981) grad_norm 3.0336 (3.1720) [2022-01-26 14:56:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1190/1251] eta 0:02:14 lr 0.000023 time 1.5594 (2.2068) loss 2.6985 (2.9984) grad_norm 2.9797 (3.1717) [2022-01-26 14:56:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1200/1251] eta 0:01:52 lr 0.000023 time 1.8543 (2.2066) loss 2.3458 (2.9977) grad_norm 3.1459 (3.1707) [2022-01-26 14:57:11 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [277/300][1210/1251] eta 0:01:30 lr 0.000023 time 1.8991 (2.2045) loss 3.3744 (2.9990) grad_norm 2.7382 (3.1701) [2022-01-26 14:57:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1220/1251] eta 0:01:08 lr 0.000023 time 2.1700 (2.2041) loss 2.8579 (2.9991) grad_norm 3.0496 (3.1719) [2022-01-26 14:57:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1230/1251] eta 0:00:46 lr 0.000023 time 2.4290 (2.2038) loss 2.1286 (3.0005) grad_norm 2.7509 (3.1726) [2022-01-26 14:58:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1240/1251] eta 0:00:24 lr 0.000023 time 2.0495 (2.2029) loss 3.2615 (3.0026) grad_norm 2.9948 (3.1744) [2022-01-26 14:58:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [277/300][1250/1251] eta 0:00:02 lr 0.000023 time 1.1284 (2.1982) loss 3.0327 (3.0006) grad_norm 2.7407 (3.1736) [2022-01-26 14:58:31 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 277 training takes 0:45:50 [2022-01-26 14:58:50 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.315 (18.315) Loss 0.9665 (0.9665) Acc@1 78.516 (78.516) Acc@5 94.531 (94.531) [2022-01-26 14:59:10 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.321 (3.500) Loss 0.7511 (0.8187) Acc@1 82.324 (80.753) Acc@5 95.410 (95.490) [2022-01-26 14:59:27 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.571 (2.650) Loss 0.8202 (0.8097) Acc@1 80.762 (80.897) Acc@5 95.410 (95.489) [2022-01-26 14:59:43 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.651 (2.297) Loss 0.7858 (0.8115) Acc@1 81.738 (80.834) Acc@5 95.801 (95.407) [2022-01-26 15:00:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.155 (2.211) Loss 0.8420 (0.8099) Acc@1 80.762 (80.928) Acc@5 95.410 (95.439) [2022-01-26 15:00:10 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.008 Acc@5 95.376 [2022-01-26 15:00:10 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-01-26 15:00:10 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.07% [2022-01-26 15:00:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][0/1251] eta 7:28:52 lr 0.000023 time 21.5289 (21.5289) loss 3.3051 (3.3051) grad_norm 3.2581 (3.2581) [2022-01-26 15:00:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][10/1251] eta 1:23:57 lr 0.000023 time 2.1002 (4.0596) loss 3.4471 (3.1360) grad_norm 2.9568 (3.0001) [2022-01-26 15:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][20/1251] eta 1:04:42 lr 0.000023 time 1.6778 (3.1541) loss 3.4094 (3.0231) grad_norm 3.0483 (3.1167) [2022-01-26 15:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][30/1251] eta 0:58:39 lr 0.000023 time 1.8736 (2.8825) loss 2.4791 (3.0019) grad_norm 3.7406 (3.1129) [2022-01-26 15:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][40/1251] eta 0:55:33 lr 0.000023 time 4.2041 (2.7527) loss 3.4982 (3.0263) grad_norm 3.0881 (3.1246) [2022-01-26 15:02:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][50/1251] eta 0:52:54 lr 0.000023 time 2.1285 (2.6433) loss 2.8424 (2.9806) grad_norm 2.7931 (3.1102) [2022-01-26 15:02:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][60/1251] eta 0:50:49 lr 0.000023 time 1.5470 (2.5602) loss 2.6107 (2.9575) grad_norm 3.5473 (3.0975) [2022-01-26 15:03:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][70/1251] eta 0:49:44 lr 0.000023 time 2.1329 (2.5269) loss 3.4746 (3.0011) grad_norm 2.9019 (3.1186) [2022-01-26 15:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][80/1251] eta 0:48:52 lr 0.000023 time 3.8673 (2.5046) loss 4.0039 (2.9902) grad_norm 3.3129 (3.1143) [2022-01-26 15:03:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][90/1251] eta 0:47:34 lr 0.000023 time 
2.2638 (2.4588) loss 2.1828 (2.9917) grad_norm 3.2341 (3.1537) [2022-01-26 15:04:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][100/1251] eta 0:46:41 lr 0.000023 time 1.7264 (2.4341) loss 3.1633 (3.0114) grad_norm 2.8371 (3.1795) [2022-01-26 15:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][110/1251] eta 0:46:08 lr 0.000023 time 2.9904 (2.4267) loss 3.1129 (3.0153) grad_norm 2.8880 (3.1583) [2022-01-26 15:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][120/1251] eta 0:45:23 lr 0.000023 time 2.6275 (2.4078) loss 3.1059 (3.0110) grad_norm 3.7171 (3.1624) [2022-01-26 15:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][130/1251] eta 0:44:30 lr 0.000023 time 1.5669 (2.3822) loss 3.5280 (3.0157) grad_norm 3.0359 (3.1450) [2022-01-26 15:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][140/1251] eta 0:43:55 lr 0.000023 time 1.9229 (2.3721) loss 2.9026 (3.0115) grad_norm 3.3129 (3.1433) [2022-01-26 15:06:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][150/1251] eta 0:43:30 lr 0.000023 time 3.0716 (2.3712) loss 2.8288 (3.0047) grad_norm 3.4225 (3.1407) [2022-01-26 15:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][160/1251] eta 0:42:53 lr 0.000023 time 2.4939 (2.3584) loss 3.3374 (3.0106) grad_norm 2.8227 (3.1288) [2022-01-26 15:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][170/1251] eta 0:42:09 lr 0.000023 time 1.9747 (2.3400) loss 2.7429 (3.0034) grad_norm 3.5043 (3.1236) [2022-01-26 15:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][180/1251] eta 0:41:32 lr 0.000023 time 2.2476 (2.3271) loss 2.9663 (2.9935) grad_norm 2.6342 (3.1113) [2022-01-26 15:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][190/1251] eta 0:40:55 lr 0.000023 time 2.2348 (2.3146) loss 2.7303 (2.9906) grad_norm 3.7243 (3.1190) [2022-01-26 15:07:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][200/1251] eta 0:40:22 lr 0.000023 time 2.2303 (2.3053) loss 3.6426 (2.9895) grad_norm 3.4666 (3.1120) [2022-01-26 15:08:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][210/1251] eta 0:39:51 lr 0.000023 time 2.2693 (2.2971) loss 3.2332 (2.9941) grad_norm 3.1669 (3.1130) [2022-01-26 15:08:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][220/1251] eta 0:39:23 lr 0.000023 time 2.3616 (2.2920) loss 3.2143 (2.9852) grad_norm 3.3409 (3.1083) [2022-01-26 15:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][230/1251] eta 0:38:49 lr 0.000023 time 1.7379 (2.2819) loss 3.1230 (2.9811) grad_norm 3.7611 (3.1120) [2022-01-26 15:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][240/1251] eta 0:38:26 lr 0.000023 time 2.4923 (2.2813) loss 3.3339 (2.9734) grad_norm 3.0228 (3.1264) [2022-01-26 15:09:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][250/1251] eta 0:38:02 lr 0.000023 time 2.6131 (2.2798) loss 2.9353 (2.9817) grad_norm 2.7266 (3.1287) [2022-01-26 15:10:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][260/1251] eta 0:37:35 lr 0.000023 time 2.8620 (2.2764) loss 1.9599 (2.9833) grad_norm 3.7247 (3.1269) [2022-01-26 15:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][270/1251] eta 0:37:04 lr 0.000023 time 1.8338 (2.2681) loss 3.2092 (2.9861) grad_norm 2.8548 (3.1215) [2022-01-26 15:10:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][280/1251] eta 0:36:40 lr 0.000023 time 2.7678 (2.2663) loss 2.0069 (2.9803) grad_norm 2.9814 (3.1234) [2022-01-26 15:11:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][290/1251] eta 0:36:12 lr 0.000023 time 2.1894 (2.2602) loss 3.4096 (2.9817) grad_norm 3.1855 (3.1194) [2022-01-26 15:11:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][300/1251] eta 0:35:46 lr 
0.000023 time 3.0997 (2.2574) loss 3.6442 (2.9809) grad_norm 5.0991 (3.1239) [2022-01-26 15:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][310/1251] eta 0:35:20 lr 0.000023 time 2.0117 (2.2537) loss 3.3089 (2.9878) grad_norm 3.0511 (3.1225) [2022-01-26 15:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][320/1251] eta 0:34:57 lr 0.000023 time 2.1594 (2.2528) loss 3.4959 (2.9914) grad_norm 3.0605 (3.1227) [2022-01-26 15:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][330/1251] eta 0:34:33 lr 0.000023 time 2.4278 (2.2510) loss 3.2690 (3.0006) grad_norm 3.0559 (3.1188) [2022-01-26 15:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][340/1251] eta 0:34:11 lr 0.000023 time 2.5778 (2.2514) loss 2.4696 (3.0011) grad_norm 3.4778 (3.1244) [2022-01-26 15:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][350/1251] eta 0:33:49 lr 0.000023 time 1.5505 (2.2529) loss 3.5020 (3.0031) grad_norm 2.6118 (3.1228) [2022-01-26 15:13:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][360/1251] eta 0:33:24 lr 0.000023 time 1.8979 (2.2495) loss 3.2461 (3.0038) grad_norm 3.2096 (3.1229) [2022-01-26 15:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][370/1251] eta 0:33:00 lr 0.000023 time 2.2130 (2.2485) loss 3.0214 (3.0095) grad_norm 3.2915 (3.1232) [2022-01-26 15:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][380/1251] eta 0:32:35 lr 0.000023 time 2.5429 (2.2452) loss 2.4285 (3.0000) grad_norm 3.3794 (3.1198) [2022-01-26 15:14:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][390/1251] eta 0:32:10 lr 0.000023 time 1.5662 (2.2426) loss 3.5800 (2.9989) grad_norm 4.1698 (3.1280) [2022-01-26 15:15:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][400/1251] eta 0:31:47 lr 0.000023 time 1.8888 (2.2417) loss 3.0917 (3.0000) grad_norm 3.7372 (3.1318) [2022-01-26 15:15:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][410/1251] eta 0:31:22 lr 0.000023 time 1.8736 (2.2389) loss 1.8664 (2.9909) grad_norm 3.6941 (3.1304) [2022-01-26 15:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][420/1251] eta 0:31:00 lr 0.000023 time 1.9853 (2.2386) loss 3.1003 (2.9893) grad_norm 2.8408 (3.1316) [2022-01-26 15:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][430/1251] eta 0:30:39 lr 0.000023 time 1.7969 (2.2408) loss 3.5207 (2.9928) grad_norm 3.0132 (3.1326) [2022-01-26 15:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][440/1251] eta 0:30:15 lr 0.000023 time 1.8422 (2.2389) loss 3.1192 (2.9922) grad_norm 4.5380 (3.1346) [2022-01-26 15:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][450/1251] eta 0:29:54 lr 0.000023 time 2.1725 (2.2404) loss 3.1822 (2.9942) grad_norm 2.7979 (3.1326) [2022-01-26 15:17:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][460/1251] eta 0:29:29 lr 0.000023 time 1.9333 (2.2366) loss 3.6204 (3.0010) grad_norm 3.7555 (3.1342) [2022-01-26 15:17:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][470/1251] eta 0:29:05 lr 0.000023 time 1.8723 (2.2351) loss 3.1056 (3.0054) grad_norm 3.2376 (3.1335) [2022-01-26 15:18:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][480/1251] eta 0:28:41 lr 0.000023 time 1.8917 (2.2331) loss 3.3502 (3.0059) grad_norm 3.2421 (3.1312) [2022-01-26 15:18:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][490/1251] eta 0:28:15 lr 0.000023 time 1.8914 (2.2285) loss 3.7480 (3.0056) grad_norm 3.0595 (3.1350) [2022-01-26 15:18:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][500/1251] eta 0:27:52 lr 0.000023 time 2.0226 (2.2267) loss 3.2708 (3.0050) grad_norm 2.5360 (3.1356) [2022-01-26 15:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][510/1251] eta 0:27:30 lr 
0.000023 time 2.2379 (2.2276) loss 3.5638 (3.0033) grad_norm 3.1362 (3.1374) [2022-01-26 15:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][520/1251] eta 0:27:11 lr 0.000023 time 3.0133 (2.2323) loss 3.6373 (2.9995) grad_norm 3.4554 (3.1538) [2022-01-26 15:19:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][530/1251] eta 0:26:49 lr 0.000023 time 2.0523 (2.2319) loss 2.7469 (2.9983) grad_norm 3.1259 (3.1589) [2022-01-26 15:20:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][540/1251] eta 0:26:25 lr 0.000023 time 1.9306 (2.2301) loss 3.4512 (2.9973) grad_norm 3.2176 (3.1644) [2022-01-26 15:20:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][550/1251] eta 0:26:00 lr 0.000023 time 1.7228 (2.2258) loss 2.5105 (2.9973) grad_norm 2.7741 (3.1695) [2022-01-26 15:20:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][560/1251] eta 0:25:37 lr 0.000023 time 3.0151 (2.2245) loss 3.3312 (3.0009) grad_norm 3.0954 (3.1705) [2022-01-26 15:21:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][570/1251] eta 0:25:13 lr 0.000023 time 2.0338 (2.2221) loss 3.3577 (3.0055) grad_norm 3.0821 (3.1690) [2022-01-26 15:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][580/1251] eta 0:24:51 lr 0.000023 time 2.3310 (2.2231) loss 2.5189 (3.0056) grad_norm 3.0072 (3.1673) [2022-01-26 15:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][590/1251] eta 0:24:29 lr 0.000023 time 1.5245 (2.2233) loss 2.4395 (3.0043) grad_norm 3.3649 (3.1682) [2022-01-26 15:22:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][600/1251] eta 0:24:05 lr 0.000023 time 1.5146 (2.2209) loss 3.4049 (3.0066) grad_norm 3.8913 (3.1723) [2022-01-26 15:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][610/1251] eta 0:23:41 lr 0.000023 time 2.0299 (2.2177) loss 3.3052 (3.0082) grad_norm 3.0218 (3.1717) [2022-01-26 15:23:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][620/1251] eta 0:23:17 lr 0.000022 time 2.2396 (2.2153) loss 3.1832 (3.0054) grad_norm 3.4162 (3.1693) [2022-01-26 15:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][630/1251] eta 0:22:56 lr 0.000022 time 2.4313 (2.2166) loss 2.9816 (3.0074) grad_norm 3.1635 (3.1652) [2022-01-26 15:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][640/1251] eta 0:22:34 lr 0.000022 time 2.3304 (2.2169) loss 2.0672 (3.0045) grad_norm 2.5997 (3.1615) [2022-01-26 15:24:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][650/1251] eta 0:22:13 lr 0.000022 time 1.9457 (2.2185) loss 2.8318 (3.0058) grad_norm 2.8280 (3.1596) [2022-01-26 15:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][660/1251] eta 0:21:51 lr 0.000022 time 2.1176 (2.2192) loss 2.6437 (3.0073) grad_norm 3.1128 (3.1605) [2022-01-26 15:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][670/1251] eta 0:21:31 lr 0.000022 time 3.3945 (2.2234) loss 2.3659 (3.0039) grad_norm 2.7370 (3.1577) [2022-01-26 15:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][680/1251] eta 0:21:09 lr 0.000022 time 1.9246 (2.2226) loss 3.3593 (3.0026) grad_norm 3.5026 (3.1595) [2022-01-26 15:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][690/1251] eta 0:20:46 lr 0.000022 time 2.0021 (2.2215) loss 2.3606 (3.0010) grad_norm 3.3340 (3.1578) [2022-01-26 15:26:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][700/1251] eta 0:20:22 lr 0.000022 time 1.7737 (2.2184) loss 3.0453 (3.0026) grad_norm 3.6946 (3.1582) [2022-01-26 15:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][710/1251] eta 0:19:59 lr 0.000022 time 2.8621 (2.2175) loss 3.7225 (3.0016) grad_norm 3.1352 (3.1596) [2022-01-26 15:26:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][720/1251] eta 0:19:36 lr 
0.000022 time 2.0613 (2.2164) loss 3.5722 (3.0009) grad_norm 3.0345 (3.1598) [2022-01-26 15:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][730/1251] eta 0:19:15 lr 0.000022 time 2.4272 (2.2171) loss 2.5556 (2.9962) grad_norm 2.9557 (3.1600) [2022-01-26 15:27:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][740/1251] eta 0:18:53 lr 0.000022 time 1.7112 (2.2174) loss 1.9962 (2.9971) grad_norm 3.4946 (3.1639) [2022-01-26 15:27:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][750/1251] eta 0:18:30 lr 0.000022 time 2.5947 (2.2172) loss 3.2354 (2.9977) grad_norm 3.2087 (3.1642) [2022-01-26 15:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][760/1251] eta 0:18:07 lr 0.000022 time 2.1871 (2.2152) loss 1.9167 (2.9969) grad_norm 2.8732 (3.1646) [2022-01-26 15:28:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][770/1251] eta 0:17:44 lr 0.000022 time 1.9171 (2.2140) loss 3.1302 (2.9997) grad_norm 3.1583 (3.1630) [2022-01-26 15:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][780/1251] eta 0:17:22 lr 0.000022 time 2.4800 (2.2133) loss 2.4412 (2.9986) grad_norm 4.0321 (3.1636) [2022-01-26 15:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][790/1251] eta 0:17:00 lr 0.000022 time 2.5985 (2.2141) loss 2.8785 (3.0005) grad_norm 2.9883 (3.1619) [2022-01-26 15:29:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][800/1251] eta 0:16:38 lr 0.000022 time 2.1333 (2.2139) loss 3.4693 (3.0030) grad_norm 3.1166 (3.1616) [2022-01-26 15:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][810/1251] eta 0:16:15 lr 0.000022 time 1.8293 (2.2128) loss 3.3563 (3.0034) grad_norm 2.9686 (3.1630) [2022-01-26 15:30:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][820/1251] eta 0:15:53 lr 0.000022 time 1.9015 (2.2115) loss 2.8704 (3.0070) grad_norm 2.6483 (3.1666) [2022-01-26 15:30:48 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][830/1251] eta 0:15:31 lr 0.000022 time 2.2172 (2.2127) loss 3.2809 (3.0090) grad_norm 3.2772 (3.1638) [2022-01-26 15:31:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][840/1251] eta 0:15:09 lr 0.000022 time 2.1678 (2.2125) loss 3.3051 (3.0106) grad_norm 2.5627 (3.1646) [2022-01-26 15:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][850/1251] eta 0:14:47 lr 0.000022 time 1.7760 (2.2121) loss 2.0367 (3.0081) grad_norm 2.8931 (3.1642) [2022-01-26 15:31:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][860/1251] eta 0:14:24 lr 0.000022 time 2.0558 (2.2121) loss 3.1030 (3.0081) grad_norm 3.2792 (3.1640) [2022-01-26 15:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][870/1251] eta 0:14:03 lr 0.000022 time 2.6639 (2.2132) loss 3.1541 (3.0080) grad_norm 3.6134 (3.1628) [2022-01-26 15:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][880/1251] eta 0:13:40 lr 0.000022 time 1.6254 (2.2120) loss 2.9195 (3.0097) grad_norm 2.8659 (3.1631) [2022-01-26 15:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][890/1251] eta 0:13:18 lr 0.000022 time 2.3335 (2.2130) loss 3.3031 (3.0113) grad_norm 3.5990 (3.1628) [2022-01-26 15:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][900/1251] eta 0:12:56 lr 0.000022 time 1.8568 (2.2134) loss 2.7403 (3.0132) grad_norm 2.6368 (3.1630) [2022-01-26 15:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][910/1251] eta 0:12:35 lr 0.000022 time 2.4042 (2.2152) loss 3.3224 (3.0136) grad_norm 2.9206 (3.1602) [2022-01-26 15:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][920/1251] eta 0:12:12 lr 0.000022 time 1.7813 (2.2143) loss 3.4447 (3.0152) grad_norm 3.2306 (3.1622) [2022-01-26 15:34:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][930/1251] eta 0:11:51 lr 
0.000022 time 2.4912 (2.2153) loss 2.4283 (3.0145) grad_norm 2.9162 (3.1619) [2022-01-26 15:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][940/1251] eta 0:11:28 lr 0.000022 time 1.9283 (2.2149) loss 2.7931 (3.0138) grad_norm 3.0326 (3.1652) [2022-01-26 15:35:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][950/1251] eta 0:11:06 lr 0.000022 time 1.9178 (2.2134) loss 2.7688 (3.0136) grad_norm 2.7650 (3.1664) [2022-01-26 15:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][960/1251] eta 0:10:43 lr 0.000022 time 1.7809 (2.2105) loss 3.2362 (3.0127) grad_norm 2.6759 (3.1653) [2022-01-26 15:35:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][970/1251] eta 0:10:21 lr 0.000022 time 2.4946 (2.2106) loss 1.9367 (3.0115) grad_norm 3.2199 (3.1646) [2022-01-26 15:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][980/1251] eta 0:09:59 lr 0.000022 time 2.2124 (2.2109) loss 1.8505 (3.0118) grad_norm 2.8283 (3.1626) [2022-01-26 15:36:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][990/1251] eta 0:09:37 lr 0.000022 time 3.3762 (2.2121) loss 3.2753 (3.0111) grad_norm 3.2062 (3.1615) [2022-01-26 15:37:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1000/1251] eta 0:09:15 lr 0.000022 time 2.0951 (2.2138) loss 3.0382 (3.0110) grad_norm 3.6274 (3.1612) [2022-01-26 15:37:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1010/1251] eta 0:08:53 lr 0.000022 time 2.5486 (2.2147) loss 3.5681 (3.0113) grad_norm 3.2145 (3.1615) [2022-01-26 15:37:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1020/1251] eta 0:08:31 lr 0.000022 time 1.9531 (2.2125) loss 2.2629 (3.0129) grad_norm 2.9032 (3.1615) [2022-01-26 15:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1030/1251] eta 0:08:08 lr 0.000022 time 2.6592 (2.2103) loss 3.4049 (3.0099) grad_norm 3.1819 (3.1600) [2022-01-26 
15:38:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1040/1251] eta 0:07:45 lr 0.000022 time 1.6107 (2.2083) loss 3.6004 (3.0083) grad_norm 2.9983 (3.1599) [2022-01-26 15:38:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1050/1251] eta 0:07:23 lr 0.000022 time 2.9552 (2.2083) loss 2.5746 (3.0073) grad_norm 2.8890 (3.1603) [2022-01-26 15:39:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1060/1251] eta 0:07:01 lr 0.000022 time 1.9244 (2.2078) loss 3.6104 (3.0074) grad_norm 3.0455 (3.1608) [2022-01-26 15:39:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1070/1251] eta 0:06:40 lr 0.000022 time 2.4966 (2.2102) loss 2.9137 (3.0085) grad_norm 3.3766 (3.1600) [2022-01-26 15:40:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1080/1251] eta 0:06:18 lr 0.000022 time 1.4376 (2.2126) loss 3.2604 (3.0078) grad_norm 2.9972 (3.1588) [2022-01-26 15:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1090/1251] eta 0:05:56 lr 0.000022 time 2.8531 (2.2137) loss 2.6092 (3.0098) grad_norm 2.4427 (3.1578) [2022-01-26 15:40:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1100/1251] eta 0:05:34 lr 0.000022 time 1.8341 (2.2122) loss 2.8551 (3.0091) grad_norm 3.8539 (3.1574) [2022-01-26 15:41:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1110/1251] eta 0:05:11 lr 0.000022 time 2.2375 (2.2103) loss 3.7206 (3.0093) grad_norm 3.8007 (3.1582) [2022-01-26 15:41:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1120/1251] eta 0:04:49 lr 0.000022 time 1.8263 (2.2080) loss 3.0368 (3.0092) grad_norm 4.1958 (3.1621) [2022-01-26 15:41:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1130/1251] eta 0:04:27 lr 0.000022 time 2.6018 (2.2075) loss 3.2843 (3.0095) grad_norm 2.8293 (3.1624) [2022-01-26 15:42:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1140/1251] 
eta 0:04:05 lr 0.000022 time 2.1613 (2.2079) loss 2.0181 (3.0090) grad_norm 2.6331 (3.1601) [2022-01-26 15:42:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1150/1251] eta 0:03:43 lr 0.000022 time 1.9518 (2.2083) loss 3.2517 (3.0099) grad_norm 3.2092 (3.1601) [2022-01-26 15:42:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1160/1251] eta 0:03:20 lr 0.000022 time 2.5325 (2.2085) loss 3.2522 (3.0094) grad_norm 2.8888 (3.1590) [2022-01-26 15:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1170/1251] eta 0:02:59 lr 0.000022 time 2.7024 (2.2109) loss 3.1146 (3.0088) grad_norm 3.2287 (3.1581) [2022-01-26 15:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1180/1251] eta 0:02:36 lr 0.000022 time 2.0413 (2.2102) loss 3.3670 (3.0070) grad_norm 3.4352 (3.1578) [2022-01-26 15:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1190/1251] eta 0:02:14 lr 0.000022 time 1.7290 (2.2097) loss 2.9543 (3.0058) grad_norm 2.8172 (3.1582) [2022-01-26 15:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1200/1251] eta 0:01:52 lr 0.000022 time 2.1694 (2.2087) loss 2.5541 (3.0060) grad_norm 2.7478 (3.1591) [2022-01-26 15:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1210/1251] eta 0:01:30 lr 0.000022 time 2.2684 (2.2075) loss 3.1997 (3.0076) grad_norm 2.8540 (3.1589) [2022-01-26 15:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1220/1251] eta 0:01:08 lr 0.000022 time 3.0513 (2.2077) loss 3.3051 (3.0086) grad_norm 3.1887 (3.1573) [2022-01-26 15:45:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1230/1251] eta 0:00:46 lr 0.000022 time 2.2372 (2.2082) loss 2.6670 (3.0078) grad_norm 2.8520 (3.1553) [2022-01-26 15:45:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1240/1251] eta 0:00:24 lr 0.000022 time 1.1580 (2.2069) loss 3.2663 (3.0090) grad_norm 3.0074 
(3.1548) [2022-01-26 15:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [278/300][1250/1251] eta 0:00:02 lr 0.000022 time 1.3580 (2.2020) loss 3.1462 (3.0074) grad_norm 3.4590 (3.1547) [2022-01-26 15:46:05 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 278 training takes 0:45:55 [2022-01-26 15:46:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.278 (18.278) Loss 0.8212 (0.8212) Acc@1 81.543 (81.543) Acc@5 95.508 (95.508) [2022-01-26 15:46:42 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.901 (3.406) Loss 0.8245 (0.8003) Acc@1 81.934 (81.401) Acc@5 94.824 (95.188) [2022-01-26 15:47:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.936 (2.666) Loss 0.8214 (0.8053) Acc@1 80.859 (81.348) Acc@5 95.020 (95.406) [2022-01-26 15:47:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.627 (2.346) Loss 0.7907 (0.8109) Acc@1 82.324 (81.269) Acc@5 95.117 (95.347) [2022-01-26 15:47:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.348 (2.195) Loss 0.7865 (0.8118) Acc@1 80.469 (81.071) Acc@5 96.191 (95.353) [2022-01-26 15:47:42 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.074 Acc@5 95.406 [2022-01-26 15:47:42 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 15:47:42 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.07% [2022-01-26 15:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][0/1251] eta 7:30:29 lr 0.000022 time 21.6064 (21.6064) loss 3.0292 (3.0292) grad_norm 2.9771 (2.9771) [2022-01-26 15:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][10/1251] eta 1:22:50 lr 0.000022 time 2.2121 (4.0056) loss 3.4291 (3.0290) grad_norm 2.8333 (3.1812) [2022-01-26 15:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][20/1251] eta 1:03:47 lr 0.000022 time 1.5801 (3.1093) loss 
2.1437 (3.0603) grad_norm 2.7640 (3.1186) [2022-01-26 15:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][30/1251] eta 0:57:34 lr 0.000022 time 1.6226 (2.8290) loss 3.7712 (2.9449) grad_norm 3.1828 (3.1184) [2022-01-26 15:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][40/1251] eta 0:55:12 lr 0.000022 time 3.7504 (2.7357) loss 2.7868 (2.9573) grad_norm 2.7734 (3.1200) [2022-01-26 15:49:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][50/1251] eta 0:53:04 lr 0.000022 time 2.0721 (2.6517) loss 1.9755 (2.9687) grad_norm 2.8774 (3.0957) [2022-01-26 15:50:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][60/1251] eta 0:51:01 lr 0.000022 time 1.4930 (2.5709) loss 3.4847 (2.9849) grad_norm 3.1563 (3.1083) [2022-01-26 15:50:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][70/1251] eta 0:49:44 lr 0.000022 time 1.7648 (2.5275) loss 3.0920 (2.9986) grad_norm 3.0365 (3.1082) [2022-01-26 15:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][80/1251] eta 0:48:55 lr 0.000022 time 3.0074 (2.5067) loss 3.3251 (2.9990) grad_norm 3.4247 (3.0957) [2022-01-26 15:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][90/1251] eta 0:47:48 lr 0.000022 time 1.5172 (2.4710) loss 2.9972 (3.0111) grad_norm 2.8652 (3.1163) [2022-01-26 15:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][100/1251] eta 0:46:36 lr 0.000022 time 1.8294 (2.4299) loss 1.8963 (3.0089) grad_norm 3.1271 (3.1311) [2022-01-26 15:52:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][110/1251] eta 0:45:19 lr 0.000022 time 1.9692 (2.3832) loss 3.0305 (3.0085) grad_norm 2.7513 (3.1310) [2022-01-26 15:52:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][120/1251] eta 0:44:21 lr 0.000022 time 2.4732 (2.3532) loss 3.6031 (3.0016) grad_norm 3.8176 (3.1330) [2022-01-26 15:52:49 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [279/300][130/1251] eta 0:43:46 lr 0.000022 time 2.2816 (2.3431) loss 3.2561 (2.9992) grad_norm 3.9277 (3.1337) [2022-01-26 15:53:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][140/1251] eta 0:43:23 lr 0.000022 time 2.3811 (2.3435) loss 2.8302 (2.9719) grad_norm 2.7710 (3.1246) [2022-01-26 15:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][150/1251] eta 0:43:00 lr 0.000022 time 2.2238 (2.3441) loss 3.3881 (2.9835) grad_norm 2.6336 (3.1392) [2022-01-26 15:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][160/1251] eta 0:42:48 lr 0.000022 time 2.7782 (2.3545) loss 3.1149 (2.9800) grad_norm 3.3324 (3.1449) [2022-01-26 15:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][170/1251] eta 0:42:10 lr 0.000022 time 1.8887 (2.3408) loss 3.0358 (2.9715) grad_norm 3.2439 (3.1473) [2022-01-26 15:54:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][180/1251] eta 0:41:32 lr 0.000022 time 1.9078 (2.3272) loss 2.9963 (2.9671) grad_norm 3.3773 (3.1415) [2022-01-26 15:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][190/1251] eta 0:40:49 lr 0.000022 time 1.9623 (2.3090) loss 3.5728 (2.9690) grad_norm 3.7738 (3.1444) [2022-01-26 15:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][200/1251] eta 0:40:10 lr 0.000022 time 1.9430 (2.2936) loss 3.2626 (2.9753) grad_norm 3.1019 (3.1469) [2022-01-26 15:55:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][210/1251] eta 0:39:31 lr 0.000022 time 1.8075 (2.2782) loss 2.6967 (2.9615) grad_norm 2.6527 (3.1419) [2022-01-26 15:56:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][220/1251] eta 0:38:58 lr 0.000022 time 2.0313 (2.2685) loss 2.6964 (2.9701) grad_norm 2.8472 (3.1430) [2022-01-26 15:56:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][230/1251] eta 0:38:27 lr 0.000022 time 2.1558 (2.2598) loss 3.3752 
(2.9647) grad_norm 4.2075 (3.1725) [2022-01-26 15:56:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][240/1251] eta 0:38:09 lr 0.000022 time 3.0292 (2.2641) loss 2.8708 (2.9737) grad_norm 4.0674 (3.1762) [2022-01-26 15:57:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][250/1251] eta 0:37:46 lr 0.000022 time 1.8228 (2.2643) loss 3.1695 (2.9671) grad_norm 3.2202 (3.1762) [2022-01-26 15:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][260/1251] eta 0:37:25 lr 0.000022 time 2.1258 (2.2664) loss 3.4348 (2.9660) grad_norm 2.9295 (3.1829) [2022-01-26 15:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][270/1251] eta 0:37:03 lr 0.000022 time 2.6051 (2.2668) loss 2.4035 (2.9666) grad_norm 2.9127 (3.1840) [2022-01-26 15:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][280/1251] eta 0:36:48 lr 0.000022 time 2.5427 (2.2746) loss 3.1775 (2.9703) grad_norm 2.9748 (3.1789) [2022-01-26 15:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][290/1251] eta 0:36:21 lr 0.000022 time 1.5546 (2.2704) loss 2.8559 (2.9681) grad_norm 2.9356 (3.1769) [2022-01-26 15:59:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][300/1251] eta 0:35:56 lr 0.000022 time 1.9077 (2.2673) loss 3.5934 (2.9754) grad_norm 2.8427 (3.1804) [2022-01-26 15:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][310/1251] eta 0:35:28 lr 0.000022 time 1.9500 (2.2616) loss 2.9967 (2.9706) grad_norm 3.0645 (3.1786) [2022-01-26 15:59:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][320/1251] eta 0:35:06 lr 0.000022 time 2.7055 (2.2625) loss 2.6376 (2.9730) grad_norm 3.4215 (3.1767) [2022-01-26 16:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][330/1251] eta 0:34:42 lr 0.000022 time 2.3634 (2.2610) loss 3.3075 (2.9750) grad_norm 2.9645 (3.1810) [2022-01-26 16:00:31 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [279/300][340/1251] eta 0:34:14 lr 0.000022 time 1.8840 (2.2557) loss 2.9151 (2.9798) grad_norm 3.1571 (3.1847) [2022-01-26 16:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][350/1251] eta 0:33:48 lr 0.000022 time 1.8966 (2.2510) loss 3.4090 (2.9774) grad_norm 3.3691 (3.1986) [2022-01-26 16:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][360/1251] eta 0:33:29 lr 0.000022 time 2.8922 (2.2548) loss 3.0886 (2.9734) grad_norm 3.2341 (3.2020) [2022-01-26 16:01:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][370/1251] eta 0:33:02 lr 0.000022 time 1.6459 (2.2505) loss 2.6794 (2.9733) grad_norm 2.7989 (3.2083) [2022-01-26 16:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][380/1251] eta 0:32:38 lr 0.000022 time 2.2070 (2.2486) loss 2.8371 (2.9778) grad_norm 3.0854 (3.2164) [2022-01-26 16:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][390/1251] eta 0:32:11 lr 0.000022 time 1.9148 (2.2433) loss 3.1648 (2.9749) grad_norm 3.2026 (3.2143) [2022-01-26 16:02:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][400/1251] eta 0:31:45 lr 0.000022 time 2.2261 (2.2388) loss 3.3740 (2.9716) grad_norm 2.9523 (3.2123) [2022-01-26 16:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][410/1251] eta 0:31:18 lr 0.000022 time 1.9236 (2.2339) loss 3.3125 (2.9674) grad_norm 2.8309 (3.2110) [2022-01-26 16:03:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][420/1251] eta 0:30:54 lr 0.000022 time 3.0138 (2.2315) loss 3.1666 (2.9727) grad_norm 2.8378 (3.2090) [2022-01-26 16:03:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][430/1251] eta 0:30:28 lr 0.000022 time 1.7968 (2.2272) loss 3.2240 (2.9745) grad_norm 2.9148 (3.2050) [2022-01-26 16:04:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][440/1251] eta 0:30:06 lr 0.000022 time 2.6684 (2.2272) loss 3.2872 
(2.9776) grad_norm 3.2874 (3.2010) [2022-01-26 16:04:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][450/1251] eta 0:29:42 lr 0.000022 time 2.3868 (2.2259) loss 3.2487 (2.9769) grad_norm 3.5844 (3.2031) [2022-01-26 16:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][460/1251] eta 0:29:22 lr 0.000022 time 3.0355 (2.2287) loss 3.1342 (2.9787) grad_norm 3.3484 (3.2020) [2022-01-26 16:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][470/1251] eta 0:28:58 lr 0.000022 time 1.9278 (2.2259) loss 3.2012 (2.9743) grad_norm 3.1535 (3.2023) [2022-01-26 16:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][480/1251] eta 0:28:38 lr 0.000021 time 3.6084 (2.2291) loss 2.8437 (2.9705) grad_norm 2.7910 (3.1992) [2022-01-26 16:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][490/1251] eta 0:28:17 lr 0.000021 time 2.7885 (2.2300) loss 3.2779 (2.9715) grad_norm 2.7814 (3.1964) [2022-01-26 16:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][500/1251] eta 0:27:53 lr 0.000021 time 1.8514 (2.2278) loss 3.1244 (2.9675) grad_norm 2.4381 (3.1940) [2022-01-26 16:06:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][510/1251] eta 0:27:28 lr 0.000021 time 1.6776 (2.2246) loss 3.6690 (2.9718) grad_norm 3.4959 (3.1922) [2022-01-26 16:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][520/1251] eta 0:27:05 lr 0.000021 time 2.3166 (2.2234) loss 2.9784 (2.9743) grad_norm 3.0713 (3.1901) [2022-01-26 16:07:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][530/1251] eta 0:26:42 lr 0.000021 time 2.4701 (2.2233) loss 3.4344 (2.9772) grad_norm 3.0320 (3.1875) [2022-01-26 16:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][540/1251] eta 0:26:19 lr 0.000021 time 1.7708 (2.2218) loss 2.9130 (2.9767) grad_norm 3.1070 (3.1880) [2022-01-26 16:08:05 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [279/300][550/1251] eta 0:25:56 lr 0.000021 time 1.4867 (2.2202) loss 3.0530 (2.9753) grad_norm 2.8011 (3.1888) [2022-01-26 16:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][560/1251] eta 0:25:34 lr 0.000021 time 2.0082 (2.2205) loss 3.2324 (2.9739) grad_norm 3.3418 (3.1896) [2022-01-26 16:08:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][570/1251] eta 0:25:14 lr 0.000021 time 2.8016 (2.2238) loss 3.4335 (2.9801) grad_norm 2.9679 (3.1905) [2022-01-26 16:09:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][580/1251] eta 0:24:54 lr 0.000021 time 2.6500 (2.2272) loss 3.2720 (2.9825) grad_norm 2.6594 (3.1919) [2022-01-26 16:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][590/1251] eta 0:24:30 lr 0.000021 time 2.4614 (2.2253) loss 2.6134 (2.9843) grad_norm 3.4987 (3.1935) [2022-01-26 16:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][600/1251] eta 0:24:07 lr 0.000021 time 2.3181 (2.2228) loss 2.6375 (2.9865) grad_norm 2.8601 (3.1986) [2022-01-26 16:10:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][610/1251] eta 0:23:43 lr 0.000021 time 1.8032 (2.2206) loss 2.5511 (2.9883) grad_norm 3.2878 (3.1998) [2022-01-26 16:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][620/1251] eta 0:23:20 lr 0.000021 time 1.6652 (2.2194) loss 2.9894 (2.9907) grad_norm 3.7927 (3.1997) [2022-01-26 16:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][630/1251] eta 0:22:57 lr 0.000021 time 1.5785 (2.2175) loss 3.5204 (2.9936) grad_norm 3.8432 (3.1999) [2022-01-26 16:11:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][640/1251] eta 0:22:34 lr 0.000021 time 2.5854 (2.2162) loss 3.3340 (2.9971) grad_norm 3.1779 (3.2024) [2022-01-26 16:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][650/1251] eta 0:22:12 lr 0.000021 time 1.5493 (2.2178) loss 3.5479 
(2.9950) grad_norm 3.6346 (3.2035) [2022-01-26 16:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][660/1251] eta 0:21:51 lr 0.000021 time 1.5184 (2.2187) loss 2.6654 (2.9971) grad_norm 2.5262 (3.1991) [2022-01-26 16:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][670/1251] eta 0:21:28 lr 0.000021 time 2.5493 (2.2183) loss 3.0104 (2.9971) grad_norm 3.0087 (3.1971) [2022-01-26 16:12:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][680/1251] eta 0:21:07 lr 0.000021 time 2.8188 (2.2190) loss 3.2050 (3.0009) grad_norm 2.9827 (3.1943) [2022-01-26 16:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][690/1251] eta 0:20:44 lr 0.000021 time 1.9455 (2.2180) loss 3.8781 (2.9994) grad_norm 3.1831 (3.1917) [2022-01-26 16:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][700/1251] eta 0:20:21 lr 0.000021 time 1.6091 (2.2164) loss 2.9818 (3.0003) grad_norm 3.0828 (3.1940) [2022-01-26 16:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][710/1251] eta 0:19:58 lr 0.000021 time 2.2164 (2.2154) loss 3.1795 (3.0006) grad_norm 4.0885 (3.1940) [2022-01-26 16:14:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][720/1251] eta 0:19:37 lr 0.000021 time 3.8039 (2.2173) loss 2.7377 (2.9978) grad_norm 3.4142 (3.1948) [2022-01-26 16:14:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][730/1251] eta 0:19:14 lr 0.000021 time 1.8964 (2.2159) loss 2.8163 (2.9941) grad_norm 2.8327 (3.1937) [2022-01-26 16:15:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][740/1251] eta 0:18:53 lr 0.000021 time 1.8503 (2.2180) loss 3.2903 (2.9938) grad_norm 3.0037 (3.1925) [2022-01-26 16:15:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][750/1251] eta 0:18:30 lr 0.000021 time 1.8173 (2.2169) loss 3.2684 (2.9948) grad_norm 2.8792 (3.1892) [2022-01-26 16:15:49 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [279/300][760/1251] eta 0:18:08 lr 0.000021 time 3.7372 (2.2162) loss 3.2213 (2.9938) grad_norm 3.7278 (3.1879) [2022-01-26 16:16:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][770/1251] eta 0:17:45 lr 0.000021 time 1.8149 (2.2156) loss 2.9324 (2.9932) grad_norm 3.5173 (3.1862) [2022-01-26 16:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][780/1251] eta 0:17:24 lr 0.000021 time 2.0065 (2.2167) loss 2.0630 (2.9922) grad_norm 2.7619 (3.1869) [2022-01-26 16:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][790/1251] eta 0:17:01 lr 0.000021 time 1.9299 (2.2168) loss 3.2966 (2.9922) grad_norm 3.2295 (3.1857) [2022-01-26 16:17:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][800/1251] eta 0:16:39 lr 0.000021 time 3.5361 (2.2158) loss 3.2904 (2.9955) grad_norm 2.5229 (3.1842) [2022-01-26 16:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][810/1251] eta 0:16:16 lr 0.000021 time 1.8349 (2.2138) loss 3.3698 (2.9967) grad_norm 3.3277 (3.1866) [2022-01-26 16:18:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][820/1251] eta 0:15:54 lr 0.000021 time 1.5743 (2.2141) loss 3.6723 (2.9999) grad_norm 3.5460 (3.1930) [2022-01-26 16:18:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][830/1251] eta 0:15:31 lr 0.000021 time 1.8309 (2.2136) loss 3.0617 (3.0013) grad_norm 3.7327 (3.1914) [2022-01-26 16:18:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][840/1251] eta 0:15:09 lr 0.000021 time 3.6043 (2.2137) loss 3.3102 (2.9996) grad_norm 2.8453 (3.1880) [2022-01-26 16:19:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][850/1251] eta 0:14:46 lr 0.000021 time 1.6366 (2.2115) loss 2.6939 (2.9966) grad_norm 2.6479 (3.1844) [2022-01-26 16:19:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][860/1251] eta 0:14:24 lr 0.000021 time 2.2621 (2.2116) loss 3.2035 
(2.9965) grad_norm 2.7210 (3.1842) [2022-01-26 16:19:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][870/1251] eta 0:14:02 lr 0.000021 time 2.5605 (2.2112) loss 2.8827 (2.9953) grad_norm 4.1031 (3.1839) [2022-01-26 16:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][880/1251] eta 0:13:40 lr 0.000021 time 2.8795 (2.2111) loss 3.7862 (2.9971) grad_norm 3.2646 (3.1834) [2022-01-26 16:20:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][890/1251] eta 0:13:17 lr 0.000021 time 1.7311 (2.2102) loss 3.7029 (2.9991) grad_norm 3.8265 (3.1820) [2022-01-26 16:20:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][900/1251] eta 0:12:55 lr 0.000021 time 1.5796 (2.2099) loss 2.8176 (3.0011) grad_norm 3.1707 (3.1824) [2022-01-26 16:21:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][910/1251] eta 0:12:34 lr 0.000021 time 2.8292 (2.2131) loss 3.1586 (3.0006) grad_norm 3.5171 (3.1822) [2022-01-26 16:21:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][920/1251] eta 0:12:12 lr 0.000021 time 1.6044 (2.2126) loss 3.2734 (3.0010) grad_norm 2.7297 (3.1814) [2022-01-26 16:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][930/1251] eta 0:11:49 lr 0.000021 time 1.7152 (2.2111) loss 2.3027 (3.0000) grad_norm 2.6719 (3.1797) [2022-01-26 16:22:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][940/1251] eta 0:11:26 lr 0.000021 time 1.7208 (2.2083) loss 2.6512 (3.0000) grad_norm 3.0380 (3.1807) [2022-01-26 16:22:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][950/1251] eta 0:11:04 lr 0.000021 time 2.7246 (2.2080) loss 2.7916 (3.0005) grad_norm 2.8195 (3.1815) [2022-01-26 16:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][960/1251] eta 0:10:42 lr 0.000021 time 1.5784 (2.2067) loss 3.0885 (2.9988) grad_norm 3.2493 (3.1818) [2022-01-26 16:23:25 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [279/300][970/1251] eta 0:10:20 lr 0.000021 time 2.3143 (2.2068) loss 3.2246 (3.0004) grad_norm 3.6031 (3.1815) [2022-01-26 16:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][980/1251] eta 0:09:58 lr 0.000021 time 1.8624 (2.2072) loss 2.9874 (2.9991) grad_norm 3.0288 (3.1817) [2022-01-26 16:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][990/1251] eta 0:09:36 lr 0.000021 time 2.7659 (2.2078) loss 3.2967 (2.9970) grad_norm 3.2285 (3.1828) [2022-01-26 16:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1000/1251] eta 0:09:14 lr 0.000021 time 1.8718 (2.2073) loss 2.8817 (2.9948) grad_norm 3.3336 (3.1829) [2022-01-26 16:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1010/1251] eta 0:08:51 lr 0.000021 time 2.0615 (2.2064) loss 2.5450 (2.9928) grad_norm 3.1573 (3.1820) [2022-01-26 16:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1020/1251] eta 0:08:29 lr 0.000021 time 1.8190 (2.2057) loss 3.4229 (2.9951) grad_norm 3.2200 (3.1818) [2022-01-26 16:25:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1030/1251] eta 0:08:07 lr 0.000021 time 1.8738 (2.2066) loss 2.9553 (2.9953) grad_norm 2.9107 (3.1807) [2022-01-26 16:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1040/1251] eta 0:07:45 lr 0.000021 time 1.5309 (2.2060) loss 3.0287 (2.9955) grad_norm 3.6445 (3.1829) [2022-01-26 16:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1050/1251] eta 0:07:23 lr 0.000021 time 2.4661 (2.2054) loss 3.1489 (2.9961) grad_norm 3.4410 (3.1832) [2022-01-26 16:26:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1060/1251] eta 0:07:00 lr 0.000021 time 1.5781 (2.2042) loss 2.7006 (2.9964) grad_norm 3.1837 (3.1843) [2022-01-26 16:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1070/1251] eta 0:06:38 lr 0.000021 time 2.1102 (2.2030) loss 
2.0518 (2.9962) grad_norm 3.1594 (3.1848) [2022-01-26 16:27:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1080/1251] eta 0:06:16 lr 0.000021 time 2.2751 (2.2028) loss 2.8435 (2.9955) grad_norm 3.3374 (3.1860) [2022-01-26 16:27:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1090/1251] eta 0:05:54 lr 0.000021 time 1.9285 (2.2023) loss 2.9871 (2.9951) grad_norm 3.0432 (3.1845) [2022-01-26 16:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1100/1251] eta 0:05:32 lr 0.000021 time 1.6494 (2.2022) loss 2.6534 (2.9956) grad_norm 3.6295 (3.1846) [2022-01-26 16:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1110/1251] eta 0:05:10 lr 0.000021 time 2.2915 (2.2023) loss 3.5903 (2.9950) grad_norm 2.9006 (3.1830) [2022-01-26 16:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1120/1251] eta 0:04:48 lr 0.000021 time 2.2274 (2.2024) loss 3.2613 (2.9926) grad_norm 2.7902 (3.1837) [2022-01-26 16:29:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1130/1251] eta 0:04:26 lr 0.000021 time 2.3853 (2.2017) loss 2.4295 (2.9929) grad_norm 3.0962 (3.1829) [2022-01-26 16:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1140/1251] eta 0:04:04 lr 0.000021 time 1.8463 (2.2015) loss 3.0501 (2.9915) grad_norm 2.7503 (3.1823) [2022-01-26 16:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1150/1251] eta 0:03:42 lr 0.000021 time 2.3933 (2.2023) loss 3.3204 (2.9916) grad_norm 3.6874 (3.1830) [2022-01-26 16:30:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1160/1251] eta 0:03:20 lr 0.000021 time 2.3521 (2.2031) loss 3.1213 (2.9914) grad_norm 2.8086 (3.1834) [2022-01-26 16:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1170/1251] eta 0:02:58 lr 0.000021 time 2.5621 (2.2033) loss 2.3024 (2.9917) grad_norm 2.9323 (3.1849) [2022-01-26 16:31:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1180/1251] eta 0:02:36 lr 0.000021 time 1.8514 (2.2033) loss 2.8194 (2.9925) grad_norm 2.4870 (3.1834) [2022-01-26 16:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1190/1251] eta 0:02:14 lr 0.000021 time 1.8278 (2.2040) loss 3.4736 (2.9937) grad_norm 3.2804 (3.1830) [2022-01-26 16:31:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1200/1251] eta 0:01:52 lr 0.000021 time 1.9900 (2.2051) loss 3.2293 (2.9952) grad_norm 3.0842 (3.1838) [2022-01-26 16:32:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1210/1251] eta 0:01:30 lr 0.000021 time 2.8250 (2.2045) loss 2.4879 (2.9946) grad_norm 3.2035 (3.1848) [2022-01-26 16:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1220/1251] eta 0:01:08 lr 0.000021 time 1.6308 (2.2025) loss 2.6577 (2.9953) grad_norm 2.9798 (3.1839) [2022-01-26 16:32:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1230/1251] eta 0:00:46 lr 0.000021 time 1.8905 (2.2008) loss 2.5247 (2.9956) grad_norm 3.3737 (3.1832) [2022-01-26 16:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1240/1251] eta 0:00:24 lr 0.000021 time 1.8664 (2.2000) loss 2.4105 (2.9959) grad_norm 2.9932 (3.1839) [2022-01-26 16:33:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [279/300][1250/1251] eta 0:00:02 lr 0.000021 time 1.1831 (2.1946) loss 3.3730 (2.9948) grad_norm 3.7785 (3.1829) [2022-01-26 16:33:28 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 279 training takes 0:45:45 [2022-01-26 16:33:46 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.218 (18.218) Loss 0.8211 (0.8211) Acc@1 81.055 (81.055) Acc@5 95.312 (95.312) [2022-01-26 16:34:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.949 (3.476) Loss 0.7841 (0.8094) Acc@1 81.543 (81.090) Acc@5 95.703 (95.543) [2022-01-26 16:34:22 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.577 (2.554) Loss 0.7853 (0.8063) Acc@1 80.078 (81.203) Acc@5 95.605 (95.517) [2022-01-26 16:34:37 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.582 (2.223) Loss 0.8770 (0.8105) Acc@1 79.395 (81.149) Acc@5 93.750 (95.461) [2022-01-26 16:34:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.229 (2.166) Loss 0.7039 (0.8097) Acc@1 83.398 (81.202) Acc@5 95.996 (95.441) [2022-01-26 16:35:04 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.058 Acc@5 95.438 [2022-01-26 16:35:04 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 16:35:04 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.07% [2022-01-26 16:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][0/1251] eta 7:41:15 lr 0.000021 time 22.1230 (22.1230) loss 2.4879 (2.4879) grad_norm 3.3937 (3.3937) [2022-01-26 16:35:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][10/1251] eta 1:25:19 lr 0.000021 time 2.2370 (4.1252) loss 3.1875 (2.8407) grad_norm 2.9299 (3.1232) [2022-01-26 16:36:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][20/1251] eta 1:04:47 lr 0.000021 time 1.4883 (3.1577) loss 3.3846 (2.9796) grad_norm 3.1192 (3.0782) [2022-01-26 16:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][30/1251] eta 0:58:22 lr 0.000021 time 1.7308 (2.8685) loss 2.7285 (2.9974) grad_norm 3.3409 (3.1364) [2022-01-26 16:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][40/1251] eta 0:55:21 lr 0.000021 time 3.2504 (2.7429) loss 3.3739 (2.9886) grad_norm 3.0980 (3.1440) [2022-01-26 16:37:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][50/1251] eta 0:53:25 lr 0.000021 time 2.4890 (2.6690) loss 3.4380 (2.9926) grad_norm 3.1331 (3.1298) [2022-01-26 16:37:41 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [280/300][60/1251] eta 0:51:10 lr 0.000021 time 2.0366 (2.5783) loss 3.2572 (3.0135) grad_norm 2.7867 (3.1307) [2022-01-26 16:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][70/1251] eta 0:49:18 lr 0.000021 time 1.6674 (2.5054) loss 3.2387 (3.0068) grad_norm 3.1655 (3.1717) [2022-01-26 16:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][80/1251] eta 0:48:22 lr 0.000021 time 3.4308 (2.4789) loss 2.1475 (3.0195) grad_norm 3.6777 (3.1558) [2022-01-26 16:38:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][90/1251] eta 0:47:23 lr 0.000021 time 2.2289 (2.4491) loss 3.8311 (3.0132) grad_norm 3.1723 (3.1682) [2022-01-26 16:39:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][100/1251] eta 0:46:18 lr 0.000021 time 1.5259 (2.4139) loss 3.3006 (2.9973) grad_norm 3.0907 (3.1901) [2022-01-26 16:39:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][110/1251] eta 0:45:43 lr 0.000021 time 2.0618 (2.4041) loss 3.2825 (2.9873) grad_norm 3.3035 (3.1776) [2022-01-26 16:39:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][120/1251] eta 0:45:04 lr 0.000021 time 3.0784 (2.3914) loss 2.3268 (2.9755) grad_norm 3.7647 (3.1652) [2022-01-26 16:40:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][130/1251] eta 0:44:08 lr 0.000021 time 2.3461 (2.3623) loss 3.6653 (2.9761) grad_norm 3.5721 (3.1607) [2022-01-26 16:40:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][140/1251] eta 0:43:13 lr 0.000021 time 1.6607 (2.3346) loss 2.9380 (2.9806) grad_norm 3.2641 (3.1479) [2022-01-26 16:40:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][150/1251] eta 0:42:31 lr 0.000021 time 2.2802 (2.3177) loss 3.5011 (2.9848) grad_norm 3.5043 (3.1673) [2022-01-26 16:41:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][160/1251] eta 0:41:56 lr 0.000021 time 1.9441 (2.3063) loss 3.6140 (2.9968) 
grad_norm 3.2807 (3.1721) [2022-01-26 16:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][170/1251] eta 0:41:22 lr 0.000021 time 2.8839 (2.2969) loss 3.5537 (3.0092) grad_norm 4.4418 (3.1774) [2022-01-26 16:41:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][180/1251] eta 0:40:48 lr 0.000021 time 1.9271 (2.2858) loss 2.2641 (2.9863) grad_norm 3.0137 (3.1712) [2022-01-26 16:42:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][190/1251] eta 0:40:27 lr 0.000021 time 2.2502 (2.2875) loss 2.7584 (2.9900) grad_norm 2.7976 (3.1717) [2022-01-26 16:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][200/1251] eta 0:40:05 lr 0.000021 time 2.8020 (2.2887) loss 3.2943 (2.9872) grad_norm 2.9002 (3.1672) [2022-01-26 16:43:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][210/1251] eta 0:39:41 lr 0.000021 time 2.5040 (2.2879) loss 2.8621 (2.9924) grad_norm 2.9218 (3.1648) [2022-01-26 16:43:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][220/1251] eta 0:39:18 lr 0.000021 time 1.9996 (2.2872) loss 3.7539 (2.9984) grad_norm 4.0425 (3.1733) [2022-01-26 16:43:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][230/1251] eta 0:38:50 lr 0.000021 time 2.2737 (2.2829) loss 3.2886 (3.0066) grad_norm 2.8877 (3.1820) [2022-01-26 16:44:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][240/1251] eta 0:38:21 lr 0.000021 time 1.9084 (2.2767) loss 3.6827 (3.0043) grad_norm 3.3527 (3.1803) [2022-01-26 16:44:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][250/1251] eta 0:37:52 lr 0.000021 time 2.4462 (2.2701) loss 3.2687 (3.0019) grad_norm 2.9545 (3.1840) [2022-01-26 16:44:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][260/1251] eta 0:37:27 lr 0.000021 time 1.6030 (2.2678) loss 3.0050 (2.9985) grad_norm 2.6404 (3.1857) [2022-01-26 16:45:19 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [280/300][270/1251] eta 0:37:07 lr 0.000021 time 2.3899 (2.2707) loss 3.1461 (2.9951) grad_norm 2.9995 (3.1858) [2022-01-26 16:45:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][280/1251] eta 0:36:40 lr 0.000021 time 1.6583 (2.2660) loss 3.0052 (2.9997) grad_norm 3.3780 (3.1856) [2022-01-26 16:46:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][290/1251] eta 0:36:16 lr 0.000021 time 3.1112 (2.2653) loss 3.1920 (3.0010) grad_norm 2.9771 (3.1878) [2022-01-26 16:46:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][300/1251] eta 0:35:52 lr 0.000021 time 1.8940 (2.2630) loss 2.5441 (2.9998) grad_norm 2.8232 (3.1811) [2022-01-26 16:46:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][310/1251] eta 0:35:29 lr 0.000021 time 2.6030 (2.2628) loss 3.4714 (2.9964) grad_norm 3.0090 (3.1739) [2022-01-26 16:47:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][320/1251] eta 0:35:04 lr 0.000021 time 2.2624 (2.2606) loss 3.4205 (3.0005) grad_norm 2.9176 (3.1725) [2022-01-26 16:47:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][330/1251] eta 0:34:36 lr 0.000021 time 2.5225 (2.2546) loss 3.1850 (3.0051) grad_norm 2.7687 (3.1703) [2022-01-26 16:47:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][340/1251] eta 0:34:06 lr 0.000021 time 2.2530 (2.2464) loss 3.2870 (3.0036) grad_norm 3.0363 (3.1643) [2022-01-26 16:48:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][350/1251] eta 0:33:39 lr 0.000021 time 2.3505 (2.2412) loss 3.6000 (3.0091) grad_norm 2.8356 (3.1596) [2022-01-26 16:48:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][360/1251] eta 0:33:13 lr 0.000021 time 2.1215 (2.2370) loss 3.1595 (3.0144) grad_norm 3.0477 (3.1656) [2022-01-26 16:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][370/1251] eta 0:32:49 lr 0.000021 time 2.6711 (2.2356) loss 2.5038 (3.0105) 
grad_norm 4.0527 (3.1661) [2022-01-26 16:49:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][380/1251] eta 0:32:26 lr 0.000020 time 3.0555 (2.2348) loss 3.2832 (3.0107) grad_norm 2.7730 (3.1714) [2022-01-26 16:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][390/1251] eta 0:32:02 lr 0.000020 time 2.3900 (2.2325) loss 2.2086 (3.0055) grad_norm 3.3033 (3.1736) [2022-01-26 16:50:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][400/1251] eta 0:31:40 lr 0.000020 time 2.5330 (2.2338) loss 3.4754 (3.0064) grad_norm 2.9641 (3.1723) [2022-01-26 16:50:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][410/1251] eta 0:31:18 lr 0.000020 time 2.3347 (2.2336) loss 2.5210 (3.0056) grad_norm 3.2645 (3.1686) [2022-01-26 16:50:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][420/1251] eta 0:30:59 lr 0.000020 time 2.8101 (2.2375) loss 3.0669 (2.9995) grad_norm 3.2439 (3.1673) [2022-01-26 16:51:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][430/1251] eta 0:30:38 lr 0.000020 time 3.3058 (2.2398) loss 3.1372 (3.0013) grad_norm 3.4318 (3.1682) [2022-01-26 16:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][440/1251] eta 0:30:16 lr 0.000020 time 2.1661 (2.2400) loss 2.8380 (3.0001) grad_norm 3.8379 (3.1724) [2022-01-26 16:51:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][450/1251] eta 0:29:51 lr 0.000020 time 2.5540 (2.2371) loss 3.5675 (3.0015) grad_norm 3.2804 (3.1703) [2022-01-26 16:52:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][460/1251] eta 0:29:26 lr 0.000020 time 2.1853 (2.2328) loss 2.8692 (2.9980) grad_norm 2.9904 (3.1762) [2022-01-26 16:52:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][470/1251] eta 0:29:00 lr 0.000020 time 1.9250 (2.2282) loss 2.1460 (2.9939) grad_norm 2.9846 (3.1745) [2022-01-26 16:52:55 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [280/300][480/1251] eta 0:28:37 lr 0.000020 time 2.0473 (2.2273) loss 3.0606 (2.9899) grad_norm 3.5532 (3.1747) [2022-01-26 16:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][490/1251] eta 0:28:13 lr 0.000020 time 2.0255 (2.2252) loss 3.1953 (2.9911) grad_norm 3.6564 (3.1779) [2022-01-26 16:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][500/1251] eta 0:27:52 lr 0.000020 time 2.1060 (2.2266) loss 2.8325 (2.9894) grad_norm 2.7847 (3.1773) [2022-01-26 16:54:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][510/1251] eta 0:27:29 lr 0.000020 time 1.4544 (2.2257) loss 3.4759 (2.9927) grad_norm 3.1106 (3.1774) [2022-01-26 16:54:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][520/1251] eta 0:27:05 lr 0.000020 time 2.1427 (2.2241) loss 2.2272 (2.9905) grad_norm 2.9446 (3.1763) [2022-01-26 16:54:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][530/1251] eta 0:26:43 lr 0.000020 time 2.4804 (2.2246) loss 2.2229 (2.9858) grad_norm 3.2312 (3.1746) [2022-01-26 16:55:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][540/1251] eta 0:26:22 lr 0.000020 time 1.8822 (2.2252) loss 3.1994 (2.9881) grad_norm 3.1481 (3.1764) [2022-01-26 16:55:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][550/1251] eta 0:25:59 lr 0.000020 time 2.0989 (2.2252) loss 2.9522 (2.9881) grad_norm 2.9067 (3.1743) [2022-01-26 16:55:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][560/1251] eta 0:25:37 lr 0.000020 time 2.0247 (2.2246) loss 3.0630 (2.9884) grad_norm 3.2143 (3.1717) [2022-01-26 16:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][570/1251] eta 0:25:14 lr 0.000020 time 2.4374 (2.2240) loss 3.2978 (2.9925) grad_norm 3.2538 (3.1715) [2022-01-26 16:56:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][580/1251] eta 0:24:49 lr 0.000020 time 1.8623 (2.2204) loss 2.2821 (2.9931) 
grad_norm 3.2258 (3.1689) [2022-01-26 16:56:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][590/1251] eta 0:24:26 lr 0.000020 time 2.2643 (2.2190) loss 3.0500 (2.9936) grad_norm 2.8436 (3.1666) [2022-01-26 16:57:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][600/1251] eta 0:24:02 lr 0.000020 time 1.6093 (2.2163) loss 3.2959 (2.9955) grad_norm 3.1586 (3.1674) [2022-01-26 16:57:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][610/1251] eta 0:23:40 lr 0.000020 time 2.3069 (2.2153) loss 2.0878 (2.9947) grad_norm 2.7597 (3.1690) [2022-01-26 16:57:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][620/1251] eta 0:23:16 lr 0.000020 time 1.9388 (2.2130) loss 2.8210 (2.9930) grad_norm 3.1049 (3.1669) [2022-01-26 16:58:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][630/1251] eta 0:22:53 lr 0.000020 time 2.4808 (2.2120) loss 3.1049 (2.9903) grad_norm 3.5614 (3.1733) [2022-01-26 16:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][640/1251] eta 0:22:30 lr 0.000020 time 2.4575 (2.2111) loss 3.6475 (2.9941) grad_norm 3.4188 (3.1702) [2022-01-26 16:59:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][650/1251] eta 0:22:07 lr 0.000020 time 2.6337 (2.2095) loss 1.8259 (2.9900) grad_norm 2.9422 (3.1770) [2022-01-26 16:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][660/1251] eta 0:21:45 lr 0.000020 time 1.8261 (2.2091) loss 3.1335 (2.9909) grad_norm 2.8933 (3.1740) [2022-01-26 16:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][670/1251] eta 0:21:24 lr 0.000020 time 4.4504 (2.2110) loss 2.6742 (2.9912) grad_norm 2.8142 (3.1745) [2022-01-26 17:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][680/1251] eta 0:21:02 lr 0.000020 time 2.1726 (2.2115) loss 3.0857 (2.9885) grad_norm 3.1696 (3.1768) [2022-01-26 17:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [280/300][690/1251] eta 0:20:41 lr 0.000020 time 2.4759 (2.2135) loss 3.0797 (2.9865) grad_norm 2.8498 (3.1730) [2022-01-26 17:00:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][700/1251] eta 0:20:20 lr 0.000020 time 2.3881 (2.2154) loss 3.1425 (2.9895) grad_norm 3.1720 (3.1706) [2022-01-26 17:01:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][710/1251] eta 0:19:57 lr 0.000020 time 2.6736 (2.2137) loss 3.2800 (2.9904) grad_norm 3.1683 (3.1719) [2022-01-26 17:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][720/1251] eta 0:19:33 lr 0.000020 time 1.8966 (2.2109) loss 3.0160 (2.9909) grad_norm 2.9466 (3.1750) [2022-01-26 17:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][730/1251] eta 0:19:11 lr 0.000020 time 2.5909 (2.2103) loss 2.3318 (2.9896) grad_norm 3.1337 (3.1734) [2022-01-26 17:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][740/1251] eta 0:18:48 lr 0.000020 time 1.8017 (2.2079) loss 1.7862 (2.9919) grad_norm 3.2868 (3.1747) [2022-01-26 17:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][750/1251] eta 0:18:26 lr 0.000020 time 2.8669 (2.2085) loss 2.2607 (2.9902) grad_norm 3.1237 (3.1802) [2022-01-26 17:03:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][760/1251] eta 0:18:04 lr 0.000020 time 2.3008 (2.2078) loss 3.3818 (2.9934) grad_norm 3.1678 (3.1805) [2022-01-26 17:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][770/1251] eta 0:17:41 lr 0.000020 time 2.0539 (2.2069) loss 3.5986 (2.9915) grad_norm 2.9314 (3.1808) [2022-01-26 17:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][780/1251] eta 0:17:18 lr 0.000020 time 1.8182 (2.2054) loss 2.9800 (2.9907) grad_norm 2.5895 (3.1783) [2022-01-26 17:04:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][790/1251] eta 0:16:57 lr 0.000020 time 2.8186 (2.2071) loss 2.8995 (2.9888) 
grad_norm 2.9614 (3.1789) [2022-01-26 17:04:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][800/1251] eta 0:16:36 lr 0.000020 time 2.8852 (2.2086) loss 2.9133 (2.9874) grad_norm 2.8796 (3.1826) [2022-01-26 17:04:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][810/1251] eta 0:16:14 lr 0.000020 time 2.1858 (2.2090) loss 3.5749 (2.9906) grad_norm 3.6179 (3.1849) [2022-01-26 17:05:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][820/1251] eta 0:15:51 lr 0.000020 time 1.8800 (2.2068) loss 3.0344 (2.9916) grad_norm 3.4618 (3.1857) [2022-01-26 17:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][830/1251] eta 0:15:28 lr 0.000020 time 2.2784 (2.2051) loss 2.8477 (2.9911) grad_norm 2.8193 (3.1839) [2022-01-26 17:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][840/1251] eta 0:15:05 lr 0.000020 time 1.7012 (2.2039) loss 2.7455 (2.9921) grad_norm 2.9304 (3.1828) [2022-01-26 17:06:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][850/1251] eta 0:14:43 lr 0.000020 time 2.5132 (2.2033) loss 3.3812 (2.9936) grad_norm 2.7206 (3.1841) [2022-01-26 17:06:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][860/1251] eta 0:14:21 lr 0.000020 time 2.2479 (2.2024) loss 2.9399 (2.9941) grad_norm 3.0638 (3.1866) [2022-01-26 17:07:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][870/1251] eta 0:13:59 lr 0.000020 time 2.4939 (2.2023) loss 3.5378 (2.9927) grad_norm 3.4364 (3.1873) [2022-01-26 17:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][880/1251] eta 0:13:36 lr 0.000020 time 1.8377 (2.2000) loss 3.5859 (2.9945) grad_norm 3.3849 (3.1873) [2022-01-26 17:07:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][890/1251] eta 0:13:14 lr 0.000020 time 2.5572 (2.1996) loss 3.2771 (2.9944) grad_norm 4.5039 (3.1883) [2022-01-26 17:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [280/300][900/1251] eta 0:12:52 lr 0.000020 time 1.9740 (2.1995) loss 2.1620 (2.9941) grad_norm 3.7631 (3.1886) [2022-01-26 17:08:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][910/1251] eta 0:12:30 lr 0.000020 time 2.3706 (2.2005) loss 3.4297 (2.9950) grad_norm 2.7331 (3.1870) [2022-01-26 17:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][920/1251] eta 0:12:08 lr 0.000020 time 1.8310 (2.2005) loss 3.3214 (2.9942) grad_norm 3.0923 (3.1869) [2022-01-26 17:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][930/1251] eta 0:11:46 lr 0.000020 time 2.0787 (2.2014) loss 3.2469 (2.9940) grad_norm 3.5511 (3.1876) [2022-01-26 17:09:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][940/1251] eta 0:11:24 lr 0.000020 time 2.7378 (2.2024) loss 2.5573 (2.9945) grad_norm 3.3514 (3.1887) [2022-01-26 17:09:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][950/1251] eta 0:11:02 lr 0.000020 time 2.2378 (2.2024) loss 2.5311 (2.9908) grad_norm 8.0712 (3.1923) [2022-01-26 17:10:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][960/1251] eta 0:10:39 lr 0.000020 time 1.6324 (2.1993) loss 3.0970 (2.9923) grad_norm 3.4665 (3.1929) [2022-01-26 17:10:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][970/1251] eta 0:10:17 lr 0.000020 time 1.8939 (2.1985) loss 2.3735 (2.9914) grad_norm 3.0274 (3.1923) [2022-01-26 17:10:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][980/1251] eta 0:09:55 lr 0.000020 time 1.9309 (2.1959) loss 3.3836 (2.9926) grad_norm 3.6955 (3.1897) [2022-01-26 17:11:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][990/1251] eta 0:09:33 lr 0.000020 time 2.5636 (2.1957) loss 3.2747 (2.9924) grad_norm 3.7155 (3.1884) [2022-01-26 17:11:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1000/1251] eta 0:09:11 lr 0.000020 time 2.2255 (2.1963) loss 2.9922 (2.9937) 
grad_norm 2.6448 (3.1906) [2022-01-26 17:12:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1010/1251] eta 0:08:49 lr 0.000020 time 2.1591 (2.1974) loss 3.2988 (2.9942) grad_norm 3.2816 (3.1938) [2022-01-26 17:12:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1020/1251] eta 0:08:27 lr 0.000020 time 2.0780 (2.1960) loss 3.4962 (2.9953) grad_norm 3.2717 (3.1951) [2022-01-26 17:12:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1030/1251] eta 0:08:05 lr 0.000020 time 2.3621 (2.1975) loss 2.6470 (2.9945) grad_norm 3.0423 (3.1960) [2022-01-26 17:13:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1040/1251] eta 0:07:43 lr 0.000020 time 2.4757 (2.1986) loss 3.3866 (2.9958) grad_norm 3.9855 (3.1956) [2022-01-26 17:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1050/1251] eta 0:07:22 lr 0.000020 time 2.6929 (2.1997) loss 3.5710 (2.9948) grad_norm 3.5942 (3.1966) [2022-01-26 17:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1060/1251] eta 0:06:59 lr 0.000020 time 1.8543 (2.1988) loss 2.8188 (2.9932) grad_norm 2.8250 (3.1962) [2022-01-26 17:14:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1070/1251] eta 0:06:38 lr 0.000020 time 2.8218 (2.2009) loss 2.9501 (2.9935) grad_norm 4.8571 (3.1959) [2022-01-26 17:14:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1080/1251] eta 0:06:16 lr 0.000020 time 1.8846 (2.2009) loss 3.0922 (2.9942) grad_norm 3.7805 (3.1974) [2022-01-26 17:15:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1090/1251] eta 0:05:54 lr 0.000020 time 1.9604 (2.1992) loss 3.3578 (2.9948) grad_norm 2.9516 (3.1982) [2022-01-26 17:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1100/1251] eta 0:05:31 lr 0.000020 time 1.8784 (2.1971) loss 3.0764 (2.9956) grad_norm 3.0517 (3.1977) [2022-01-26 17:15:43 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [280/300][1110/1251] eta 0:05:09 lr 0.000020 time 2.3575 (2.1955) loss 3.0905 (2.9966) grad_norm 2.8339 (3.1978) [2022-01-26 17:16:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1120/1251] eta 0:04:47 lr 0.000020 time 1.9166 (2.1956) loss 3.4078 (2.9961) grad_norm 4.0690 (3.1988) [2022-01-26 17:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1130/1251] eta 0:04:25 lr 0.000020 time 1.6442 (2.1963) loss 3.3157 (2.9959) grad_norm 3.3432 (3.1987) [2022-01-26 17:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1140/1251] eta 0:04:03 lr 0.000020 time 1.9424 (2.1970) loss 2.6736 (2.9945) grad_norm 2.5988 (3.1955) [2022-01-26 17:17:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1150/1251] eta 0:03:42 lr 0.000020 time 3.3453 (2.1985) loss 3.3666 (2.9966) grad_norm 3.5951 (3.1948) [2022-01-26 17:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1160/1251] eta 0:03:20 lr 0.000020 time 2.3065 (2.1987) loss 2.0180 (2.9957) grad_norm 2.5296 (3.1942) [2022-01-26 17:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1170/1251] eta 0:02:58 lr 0.000020 time 1.8930 (2.1983) loss 2.6535 (2.9950) grad_norm 3.0303 (3.1929) [2022-01-26 17:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1180/1251] eta 0:02:36 lr 0.000020 time 2.2490 (2.1983) loss 2.4269 (2.9942) grad_norm 3.4468 (3.1964) [2022-01-26 17:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1190/1251] eta 0:02:14 lr 0.000020 time 2.1636 (2.1977) loss 3.0061 (2.9950) grad_norm 2.7184 (3.1969) [2022-01-26 17:19:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1200/1251] eta 0:01:52 lr 0.000020 time 2.2843 (2.1967) loss 3.0724 (2.9947) grad_norm 3.3957 (3.1962) [2022-01-26 17:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1210/1251] eta 0:01:30 lr 0.000020 time 2.3316 (2.1967) loss 
2.0207 (2.9932) grad_norm 3.1014 (3.1954) [2022-01-26 17:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1220/1251] eta 0:01:08 lr 0.000020 time 2.2327 (2.1965) loss 3.3042 (2.9931) grad_norm 2.6544 (3.1944) [2022-01-26 17:20:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1230/1251] eta 0:00:46 lr 0.000020 time 2.4418 (2.1959) loss 2.5850 (2.9932) grad_norm 3.2519 (3.1948) [2022-01-26 17:20:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1240/1251] eta 0:00:24 lr 0.000020 time 1.5406 (2.1944) loss 3.3581 (2.9938) grad_norm 3.4996 (3.1944) [2022-01-26 17:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [280/300][1250/1251] eta 0:00:02 lr 0.000020 time 1.1995 (2.1889) loss 2.0809 (2.9942) grad_norm 3.2373 (3.1949) [2022-01-26 17:20:43 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 280 training takes 0:45:38 [2022-01-26 17:20:43 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_280 saving...... [2022-01-26 17:20:54 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_280 saved !!! 
[2022-01-26 17:21:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.662 (15.662) Loss 0.8764 (0.8764) Acc@1 79.004 (79.004) Acc@5 94.727 (94.727) [2022-01-26 17:21:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.959 (2.666) Loss 0.7884 (0.8428) Acc@1 80.566 (80.478) Acc@5 96.582 (95.206) [2022-01-26 17:21:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.195 (2.127) Loss 0.8241 (0.8180) Acc@1 80.957 (80.836) Acc@5 95.117 (95.424) [2022-01-26 17:21:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.032 (1.998) Loss 0.8150 (0.8110) Acc@1 80.566 (81.014) Acc@5 95.801 (95.473) [2022-01-26 17:22:14 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 0.860 (1.956) Loss 0.7938 (0.8123) Acc@1 81.836 (81.028) Acc@5 95.508 (95.429) [2022-01-26 17:22:22 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.086 Acc@5 95.428 [2022-01-26 17:22:22 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 17:22:22 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.09% [2022-01-26 17:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][0/1251] eta 7:38:01 lr 0.000020 time 21.9680 (21.9680) loss 3.1432 (3.1432) grad_norm 3.0744 (3.0744) [2022-01-26 17:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][10/1251] eta 1:25:42 lr 0.000020 time 2.2329 (4.1435) loss 2.0514 (2.8045) grad_norm 3.0624 (3.1704) [2022-01-26 17:23:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][20/1251] eta 1:07:37 lr 0.000020 time 2.1478 (3.2958) loss 3.3966 (2.8781) grad_norm 3.4343 (3.2006) [2022-01-26 17:23:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][30/1251] eta 1:00:29 lr 0.000020 time 1.8485 (2.9729) loss 1.8350 (2.8330) grad_norm 3.7961 (3.2197) [2022-01-26 17:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[281/300][40/1251] eta 0:56:52 lr 0.000020 time 3.5940 (2.8182) loss 3.1309 (2.8398) grad_norm 2.5266 (3.2115) [2022-01-26 17:24:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][50/1251] eta 0:54:30 lr 0.000020 time 1.5453 (2.7235) loss 3.1272 (2.9149) grad_norm 3.4569 (3.2130) [2022-01-26 17:25:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][60/1251] eta 0:52:11 lr 0.000020 time 1.8545 (2.6292) loss 3.6645 (2.9605) grad_norm 3.0060 (3.2398) [2022-01-26 17:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][70/1251] eta 0:50:11 lr 0.000020 time 1.9492 (2.5497) loss 3.4061 (2.9867) grad_norm 3.5558 (3.2451) [2022-01-26 17:25:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][80/1251] eta 0:48:43 lr 0.000020 time 2.5815 (2.4964) loss 3.6348 (2.9694) grad_norm 3.1023 (3.2129) [2022-01-26 17:26:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][90/1251] eta 0:47:24 lr 0.000020 time 2.0471 (2.4504) loss 3.2168 (2.9848) grad_norm 2.7221 (3.2120) [2022-01-26 17:26:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][100/1251] eta 0:46:16 lr 0.000020 time 1.9103 (2.4125) loss 2.8896 (3.0002) grad_norm 4.0088 (3.2345) [2022-01-26 17:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][110/1251] eta 0:45:26 lr 0.000020 time 2.3107 (2.3893) loss 3.3249 (3.0039) grad_norm 2.9279 (3.2172) [2022-01-26 17:27:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][120/1251] eta 0:44:34 lr 0.000020 time 2.7424 (2.3650) loss 3.0279 (3.0052) grad_norm 3.5718 (3.2156) [2022-01-26 17:27:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][130/1251] eta 0:44:00 lr 0.000020 time 1.8168 (2.3555) loss 3.1852 (3.0034) grad_norm 3.2289 (3.2150) [2022-01-26 17:27:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][140/1251] eta 0:43:26 lr 0.000020 time 2.1751 (2.3463) loss 1.9546 (2.9847) grad_norm 3.2027 
(3.1937) [2022-01-26 17:28:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][150/1251] eta 0:42:59 lr 0.000020 time 1.8764 (2.3429) loss 3.1220 (2.9959) grad_norm 2.7213 (3.2128) [2022-01-26 17:28:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][160/1251] eta 0:42:45 lr 0.000020 time 4.6181 (2.3517) loss 2.4714 (2.9882) grad_norm 3.8209 (3.2103) [2022-01-26 17:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][170/1251] eta 0:42:02 lr 0.000020 time 1.5561 (2.3334) loss 2.9926 (2.9962) grad_norm 2.8230 (3.2124) [2022-01-26 17:29:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][180/1251] eta 0:41:29 lr 0.000020 time 2.4936 (2.3247) loss 3.1124 (2.9959) grad_norm 3.3575 (3.2024) [2022-01-26 17:29:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][190/1251] eta 0:41:02 lr 0.000020 time 1.5248 (2.3209) loss 3.3692 (2.9826) grad_norm 3.3385 (3.1927) [2022-01-26 17:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][200/1251] eta 0:40:38 lr 0.000020 time 3.4987 (2.3198) loss 3.1374 (2.9899) grad_norm 2.6169 (3.1940) [2022-01-26 17:30:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][210/1251] eta 0:40:20 lr 0.000020 time 2.2692 (2.3247) loss 1.9861 (2.9840) grad_norm 2.8598 (3.1930) [2022-01-26 17:30:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][220/1251] eta 0:39:40 lr 0.000020 time 1.9165 (2.3085) loss 3.0750 (2.9937) grad_norm 2.9835 (3.1918) [2022-01-26 17:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][230/1251] eta 0:39:02 lr 0.000020 time 1.6228 (2.2939) loss 2.9237 (2.9949) grad_norm 3.5609 (3.1914) [2022-01-26 17:31:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][240/1251] eta 0:38:26 lr 0.000020 time 2.2080 (2.2810) loss 2.7944 (2.9904) grad_norm 3.5238 (3.1886) [2022-01-26 17:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[281/300][250/1251] eta 0:37:54 lr 0.000020 time 2.1456 (2.2718) loss 3.3024 (2.9839) grad_norm 3.2173 (3.1889) [2022-01-26 17:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][260/1251] eta 0:37:26 lr 0.000020 time 2.1945 (2.2671) loss 3.1214 (2.9800) grad_norm 2.8061 (3.1926) [2022-01-26 17:32:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][270/1251] eta 0:37:04 lr 0.000020 time 2.4931 (2.2680) loss 3.1228 (2.9747) grad_norm 2.8053 (3.1936) [2022-01-26 17:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][280/1251] eta 0:36:45 lr 0.000020 time 2.3706 (2.2711) loss 3.2757 (2.9791) grad_norm 3.3133 (3.1904) [2022-01-26 17:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][290/1251] eta 0:36:27 lr 0.000020 time 2.5179 (2.2759) loss 3.2584 (2.9820) grad_norm 3.3534 (3.1871) [2022-01-26 17:33:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][300/1251] eta 0:35:57 lr 0.000020 time 1.5307 (2.2692) loss 2.9284 (2.9806) grad_norm 2.8644 (3.1775) [2022-01-26 17:34:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][310/1251] eta 0:35:33 lr 0.000020 time 2.7147 (2.2671) loss 2.8996 (2.9814) grad_norm 3.2399 (3.1698) [2022-01-26 17:34:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][320/1251] eta 0:35:04 lr 0.000020 time 1.8831 (2.2604) loss 2.5059 (2.9786) grad_norm 3.0447 (3.1681) [2022-01-26 17:34:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][330/1251] eta 0:34:35 lr 0.000019 time 2.1303 (2.2539) loss 2.9186 (2.9808) grad_norm 2.9328 (3.1611) [2022-01-26 17:35:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][340/1251] eta 0:34:07 lr 0.000019 time 1.5896 (2.2471) loss 3.5324 (2.9831) grad_norm 2.9250 (3.1647) [2022-01-26 17:35:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][350/1251] eta 0:33:42 lr 0.000019 time 2.7731 (2.2444) loss 3.1483 (2.9859) grad_norm 
3.4205 (3.1613) [2022-01-26 17:35:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][360/1251] eta 0:33:19 lr 0.000019 time 1.8564 (2.2440) loss 3.3245 (2.9848) grad_norm 3.3938 (3.1571) [2022-01-26 17:36:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][370/1251] eta 0:32:59 lr 0.000019 time 2.7019 (2.2465) loss 2.9140 (2.9850) grad_norm 2.8774 (3.1683) [2022-01-26 17:36:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][380/1251] eta 0:32:37 lr 0.000019 time 2.2632 (2.2470) loss 3.3004 (2.9811) grad_norm 2.8612 (3.1828) [2022-01-26 17:37:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][390/1251] eta 0:32:13 lr 0.000019 time 3.1890 (2.2457) loss 2.6109 (2.9810) grad_norm 3.0934 (3.1835) [2022-01-26 17:37:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][400/1251] eta 0:31:49 lr 0.000019 time 2.1103 (2.2436) loss 3.2579 (2.9764) grad_norm 3.3252 (3.1816) [2022-01-26 17:37:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][410/1251] eta 0:31:24 lr 0.000019 time 2.2345 (2.2413) loss 2.5907 (2.9714) grad_norm 3.1929 (3.1795) [2022-01-26 17:38:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][420/1251] eta 0:30:59 lr 0.000019 time 1.8560 (2.2374) loss 2.5813 (2.9736) grad_norm 2.7695 (3.1803) [2022-01-26 17:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][430/1251] eta 0:30:34 lr 0.000019 time 2.6442 (2.2339) loss 3.0922 (2.9734) grad_norm 3.0659 (3.1774) [2022-01-26 17:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][440/1251] eta 0:30:09 lr 0.000019 time 2.1657 (2.2314) loss 3.6149 (2.9754) grad_norm 3.1487 (3.1805) [2022-01-26 17:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][450/1251] eta 0:29:48 lr 0.000019 time 2.7774 (2.2325) loss 2.1737 (2.9722) grad_norm 3.4566 (3.1808) [2022-01-26 17:39:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[281/300][460/1251] eta 0:29:28 lr 0.000019 time 2.5570 (2.2354) loss 3.2844 (2.9671) grad_norm 3.4402 (3.1800) [2022-01-26 17:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][470/1251] eta 0:29:05 lr 0.000019 time 1.9197 (2.2352) loss 3.4801 (2.9700) grad_norm 2.7858 (3.1816) [2022-01-26 17:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][480/1251] eta 0:28:43 lr 0.000019 time 1.9113 (2.2351) loss 2.9915 (2.9717) grad_norm 3.5371 (3.1834) [2022-01-26 17:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][490/1251] eta 0:28:17 lr 0.000019 time 1.8850 (2.2312) loss 3.4855 (2.9706) grad_norm 2.9973 (3.1831) [2022-01-26 17:41:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][500/1251] eta 0:27:55 lr 0.000019 time 2.5788 (2.2309) loss 3.0623 (2.9689) grad_norm 3.4676 (3.1856) [2022-01-26 17:41:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][510/1251] eta 0:27:33 lr 0.000019 time 2.7241 (2.2315) loss 2.9956 (2.9709) grad_norm 2.8416 (3.1840) [2022-01-26 17:41:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][520/1251] eta 0:27:11 lr 0.000019 time 2.5752 (2.2318) loss 3.4726 (2.9740) grad_norm 3.3378 (3.1831) [2022-01-26 17:42:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][530/1251] eta 0:26:47 lr 0.000019 time 1.6196 (2.2295) loss 2.8320 (2.9722) grad_norm 2.8217 (3.1813) [2022-01-26 17:42:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][540/1251] eta 0:26:22 lr 0.000019 time 1.6494 (2.2253) loss 2.2075 (2.9653) grad_norm 2.7687 (3.1828) [2022-01-26 17:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][550/1251] eta 0:25:58 lr 0.000019 time 1.8369 (2.2226) loss 3.6699 (2.9661) grad_norm 3.3288 (3.1861) [2022-01-26 17:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][560/1251] eta 0:25:35 lr 0.000019 time 1.8668 (2.2224) loss 3.1468 (2.9643) grad_norm 
3.2811 (3.1878) [2022-01-26 17:43:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][570/1251] eta 0:25:13 lr 0.000019 time 2.3481 (2.2218) loss 3.0542 (2.9637) grad_norm 3.3362 (3.1892) [2022-01-26 17:43:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][580/1251] eta 0:24:51 lr 0.000019 time 2.2045 (2.2228) loss 3.5762 (2.9615) grad_norm 3.1145 (3.1876) [2022-01-26 17:44:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][590/1251] eta 0:24:30 lr 0.000019 time 2.0747 (2.2245) loss 3.2313 (2.9612) grad_norm 3.2093 (3.1882) [2022-01-26 17:44:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][600/1251] eta 0:24:07 lr 0.000019 time 2.2189 (2.2237) loss 3.5212 (2.9647) grad_norm 2.4957 (3.1849) [2022-01-26 17:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][610/1251] eta 0:23:42 lr 0.000019 time 2.1941 (2.2196) loss 2.6004 (2.9653) grad_norm 3.6435 (3.1858) [2022-01-26 17:45:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][620/1251] eta 0:23:18 lr 0.000019 time 2.1715 (2.2165) loss 2.2030 (2.9612) grad_norm 3.4362 (3.1847) [2022-01-26 17:45:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][630/1251] eta 0:22:55 lr 0.000019 time 1.7955 (2.2152) loss 3.0814 (2.9624) grad_norm 2.9483 (3.1865) [2022-01-26 17:46:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][640/1251] eta 0:22:32 lr 0.000019 time 2.4091 (2.2144) loss 2.4608 (2.9649) grad_norm 2.6806 (3.1849) [2022-01-26 17:46:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][650/1251] eta 0:22:10 lr 0.000019 time 2.1255 (2.2145) loss 2.1328 (2.9658) grad_norm 3.1129 (3.1872) [2022-01-26 17:46:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][660/1251] eta 0:21:49 lr 0.000019 time 2.5572 (2.2156) loss 2.7517 (2.9670) grad_norm 2.8732 (3.1863) [2022-01-26 17:47:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[281/300][670/1251] eta 0:21:27 lr 0.000019 time 2.0410 (2.2165) loss 3.6110 (2.9670) grad_norm 3.2491 (3.1863) [2022-01-26 17:47:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][680/1251] eta 0:21:06 lr 0.000019 time 2.7911 (2.2173) loss 2.8463 (2.9688) grad_norm 3.5415 (3.1863) [2022-01-26 17:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][690/1251] eta 0:20:42 lr 0.000019 time 1.5523 (2.2145) loss 2.6894 (2.9653) grad_norm 6.9578 (3.1894) [2022-01-26 17:48:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][700/1251] eta 0:20:19 lr 0.000019 time 1.6869 (2.2132) loss 3.5324 (2.9666) grad_norm 2.8450 (3.1854) [2022-01-26 17:48:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][710/1251] eta 0:19:56 lr 0.000019 time 1.8016 (2.2123) loss 2.4859 (2.9668) grad_norm 3.2344 (3.1884) [2022-01-26 17:48:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][720/1251] eta 0:19:34 lr 0.000019 time 2.4141 (2.2128) loss 3.2908 (2.9675) grad_norm 3.1581 (3.1885) [2022-01-26 17:49:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][730/1251] eta 0:19:15 lr 0.000019 time 3.2041 (2.2173) loss 3.2051 (2.9672) grad_norm 3.5659 (3.1888) [2022-01-26 17:49:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][740/1251] eta 0:18:51 lr 0.000019 time 2.2082 (2.2148) loss 2.3046 (2.9662) grad_norm 3.0494 (3.1906) [2022-01-26 17:50:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][750/1251] eta 0:18:28 lr 0.000019 time 2.0243 (2.2127) loss 2.8517 (2.9686) grad_norm 3.3253 (3.1947) [2022-01-26 17:50:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][760/1251] eta 0:18:06 lr 0.000019 time 2.2777 (2.2122) loss 3.4728 (2.9688) grad_norm 3.1550 (3.1958) [2022-01-26 17:50:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][770/1251] eta 0:17:44 lr 0.000019 time 2.7041 (2.2123) loss 3.1592 (2.9709) grad_norm 
2.6786 (3.1942) [2022-01-26 17:51:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][780/1251] eta 0:17:22 lr 0.000019 time 2.2854 (2.2134) loss 3.4313 (2.9728) grad_norm 2.8520 (3.1930) [2022-01-26 17:51:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][790/1251] eta 0:16:59 lr 0.000019 time 2.1493 (2.2125) loss 3.2969 (2.9768) grad_norm 2.6766 (3.1920) [2022-01-26 17:51:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][800/1251] eta 0:16:36 lr 0.000019 time 1.8795 (2.2103) loss 3.0486 (2.9780) grad_norm 2.7066 (3.1907) [2022-01-26 17:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][810/1251] eta 0:16:15 lr 0.000019 time 2.7212 (2.2121) loss 2.2985 (2.9795) grad_norm 2.5439 (3.1885) [2022-01-26 17:52:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][820/1251] eta 0:15:52 lr 0.000019 time 1.9862 (2.2106) loss 3.0460 (2.9798) grad_norm 3.1625 (3.1878) [2022-01-26 17:52:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][830/1251] eta 0:15:30 lr 0.000019 time 2.2628 (2.2096) loss 3.7596 (2.9807) grad_norm 2.9355 (3.1888) [2022-01-26 17:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][840/1251] eta 0:15:07 lr 0.000019 time 2.0794 (2.2080) loss 2.1905 (2.9790) grad_norm 3.5581 (3.1900) [2022-01-26 17:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][850/1251] eta 0:14:44 lr 0.000019 time 1.8422 (2.2060) loss 3.0629 (2.9789) grad_norm 2.6927 (3.1902) [2022-01-26 17:54:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][860/1251] eta 0:14:21 lr 0.000019 time 1.9583 (2.2040) loss 3.4749 (2.9804) grad_norm 3.0704 (3.1922) [2022-01-26 17:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][870/1251] eta 0:14:00 lr 0.000019 time 1.8579 (2.2048) loss 3.2610 (2.9820) grad_norm 2.8485 (3.1924) [2022-01-26 17:54:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[281/300][880/1251] eta 0:13:38 lr 0.000019 time 2.8184 (2.2075) loss 1.9738 (2.9779) grad_norm 3.3099 (3.1928) [2022-01-26 17:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][890/1251] eta 0:13:17 lr 0.000019 time 2.1578 (2.2091) loss 1.9247 (2.9782) grad_norm 2.8635 (3.1915) [2022-01-26 17:55:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][900/1251] eta 0:12:55 lr 0.000019 time 2.3811 (2.2099) loss 2.1637 (2.9770) grad_norm 2.7941 (3.1896) [2022-01-26 17:55:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][910/1251] eta 0:12:33 lr 0.000019 time 1.8418 (2.2082) loss 3.4045 (2.9772) grad_norm 2.7027 (3.1870) [2022-01-26 17:56:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][920/1251] eta 0:12:10 lr 0.000019 time 2.0475 (2.2064) loss 2.3527 (2.9771) grad_norm 3.0724 (3.1845) [2022-01-26 17:56:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][930/1251] eta 0:11:47 lr 0.000019 time 1.9365 (2.2047) loss 3.1155 (2.9786) grad_norm 3.3391 (3.1832) [2022-01-26 17:56:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][940/1251] eta 0:11:25 lr 0.000019 time 2.2322 (2.2052) loss 2.4259 (2.9778) grad_norm 3.4521 (3.1826) [2022-01-26 17:57:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][950/1251] eta 0:11:03 lr 0.000019 time 1.6419 (2.2043) loss 3.0733 (2.9790) grad_norm 2.8244 (3.1838) [2022-01-26 17:57:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][960/1251] eta 0:10:41 lr 0.000019 time 2.5930 (2.2041) loss 2.6087 (2.9755) grad_norm 2.9724 (3.1822) [2022-01-26 17:58:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][970/1251] eta 0:10:19 lr 0.000019 time 3.2974 (2.2049) loss 3.2487 (2.9778) grad_norm 2.9766 (3.1809) [2022-01-26 17:58:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][980/1251] eta 0:09:57 lr 0.000019 time 2.2606 (2.2060) loss 3.1424 (2.9803) grad_norm 
3.3418 (3.1796) [2022-01-26 17:58:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][990/1251] eta 0:09:35 lr 0.000019 time 1.5786 (2.2056) loss 3.2104 (2.9803) grad_norm 3.0585 (3.1791) [2022-01-26 17:59:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1000/1251] eta 0:09:13 lr 0.000019 time 2.1130 (2.2058) loss 3.2400 (2.9804) grad_norm 2.6890 (3.1773) [2022-01-26 17:59:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1010/1251] eta 0:08:51 lr 0.000019 time 2.2828 (2.2049) loss 1.9259 (2.9790) grad_norm 3.2105 (3.1782) [2022-01-26 17:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1020/1251] eta 0:08:29 lr 0.000019 time 2.5573 (2.2038) loss 3.0323 (2.9809) grad_norm 3.0941 (3.1792) [2022-01-26 18:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1030/1251] eta 0:08:06 lr 0.000019 time 2.7616 (2.2030) loss 3.4632 (2.9806) grad_norm 3.0761 (3.1796) [2022-01-26 18:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1040/1251] eta 0:07:44 lr 0.000019 time 2.2838 (2.2035) loss 3.3867 (2.9801) grad_norm 3.1242 (3.1805) [2022-01-26 18:00:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1050/1251] eta 0:07:22 lr 0.000019 time 2.5162 (2.2037) loss 3.2420 (2.9802) grad_norm 3.2425 (3.1798) [2022-01-26 18:01:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1060/1251] eta 0:07:00 lr 0.000019 time 1.7857 (2.2032) loss 2.9797 (2.9794) grad_norm 3.9018 (3.1817) [2022-01-26 18:01:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1070/1251] eta 0:06:38 lr 0.000019 time 2.5145 (2.2035) loss 3.0811 (2.9803) grad_norm 2.7068 (3.1792) [2022-01-26 18:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1080/1251] eta 0:06:16 lr 0.000019 time 1.9055 (2.2013) loss 3.2705 (2.9796) grad_norm 3.9924 (3.1798) [2022-01-26 18:02:22 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [281/300][1090/1251] eta 0:05:54 lr 0.000019 time 2.8585 (2.2002) loss 3.0928 (2.9795) grad_norm 3.2422 (3.1812) [2022-01-26 18:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1100/1251] eta 0:05:31 lr 0.000019 time 1.7977 (2.1983) loss 2.2468 (2.9784) grad_norm 2.8766 (3.1804) [2022-01-26 18:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1110/1251] eta 0:05:09 lr 0.000019 time 2.3177 (2.1975) loss 3.0384 (2.9801) grad_norm 3.0666 (3.1784) [2022-01-26 18:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1120/1251] eta 0:04:47 lr 0.000019 time 2.2630 (2.1969) loss 2.7695 (2.9805) grad_norm 3.2621 (3.1781) [2022-01-26 18:03:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1130/1251] eta 0:04:25 lr 0.000019 time 2.5660 (2.1978) loss 3.4791 (2.9821) grad_norm 3.1643 (3.1771) [2022-01-26 18:04:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1140/1251] eta 0:04:03 lr 0.000019 time 1.7736 (2.1979) loss 2.3791 (2.9798) grad_norm 3.3905 (3.1781) [2022-01-26 18:04:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1150/1251] eta 0:03:42 lr 0.000019 time 4.5768 (2.2019) loss 3.2293 (2.9803) grad_norm 2.8538 (3.1774) [2022-01-26 18:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1160/1251] eta 0:03:20 lr 0.000019 time 2.0943 (2.2035) loss 2.2511 (2.9780) grad_norm 2.6808 (3.1771) [2022-01-26 18:05:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1170/1251] eta 0:02:58 lr 0.000019 time 3.2706 (2.2043) loss 3.1109 (2.9804) grad_norm 3.0648 (3.1761) [2022-01-26 18:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1180/1251] eta 0:02:36 lr 0.000019 time 1.5505 (2.2033) loss 2.9439 (2.9808) grad_norm 2.8517 (3.1758) [2022-01-26 18:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1190/1251] eta 0:02:14 lr 0.000019 time 2.5065 (2.2017) loss 2.8242 
(2.9813) grad_norm 2.5417 (3.1755) [2022-01-26 18:06:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1200/1251] eta 0:01:52 lr 0.000019 time 2.7454 (2.2009) loss 2.0401 (2.9795) grad_norm 2.6354 (3.1738) [2022-01-26 18:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1210/1251] eta 0:01:30 lr 0.000019 time 2.7693 (2.2014) loss 2.6481 (2.9789) grad_norm 3.1792 (3.1736) [2022-01-26 18:07:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1220/1251] eta 0:01:08 lr 0.000019 time 1.9325 (2.2004) loss 2.4942 (2.9766) grad_norm 3.0530 (3.1739) [2022-01-26 18:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1230/1251] eta 0:00:46 lr 0.000019 time 3.4931 (2.2016) loss 2.5388 (2.9764) grad_norm 2.7696 (3.1743) [2022-01-26 18:07:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1240/1251] eta 0:00:24 lr 0.000019 time 2.1911 (2.2009) loss 2.9427 (2.9778) grad_norm 3.7134 (3.1734) [2022-01-26 18:08:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [281/300][1250/1251] eta 0:00:02 lr 0.000019 time 1.3152 (2.1951) loss 3.7016 (2.9804) grad_norm 3.7285 (3.1737) [2022-01-26 18:08:08 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 281 training takes 0:45:46 [2022-01-26 18:08:26 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.109 (18.109) Loss 0.8724 (0.8724) Acc@1 78.613 (78.613) Acc@5 94.824 (94.824) [2022-01-26 18:08:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.656 (3.219) Loss 0.8498 (0.8207) Acc@1 81.836 (80.859) Acc@5 94.629 (95.286) [2022-01-26 18:09:02 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.588 (2.578) Loss 0.7636 (0.8133) Acc@1 82.031 (81.166) Acc@5 96.484 (95.499) [2022-01-26 18:09:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.584 (2.387) Loss 0.8121 (0.8146) Acc@1 80.273 (81.086) Acc@5 95.898 (95.486) [2022-01-26 18:09:39 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.198 (2.216) Loss 0.8045 (0.8140) Acc@1 81.738 (81.017) Acc@5 94.922 (95.482) [2022-01-26 18:09:46 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.006 Acc@5 95.478 [2022-01-26 18:09:46 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.0% [2022-01-26 18:09:46 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.09% [2022-01-26 18:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][0/1251] eta 8:24:09 lr 0.000019 time 24.1806 (24.1806) loss 3.1594 (3.1594) grad_norm 3.4827 (3.4827) [2022-01-26 18:10:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][10/1251] eta 1:35:05 lr 0.000019 time 1.9074 (4.5978) loss 3.4396 (3.0905) grad_norm 2.7473 (2.9886) [2022-01-26 18:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][20/1251] eta 1:13:06 lr 0.000019 time 1.1792 (3.5636) loss 3.1100 (2.9658) grad_norm 2.4931 (3.0637) [2022-01-26 18:11:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][30/1251] eta 1:02:15 lr 0.000019 time 1.5926 (3.0591) loss 2.6892 (3.0529) grad_norm 3.1135 (3.1206) [2022-01-26 18:11:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][40/1251] eta 0:57:08 lr 0.000019 time 3.4029 (2.8308) loss 1.6963 (2.9619) grad_norm 3.2026 (3.2144) [2022-01-26 18:12:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][50/1251] eta 0:53:56 lr 0.000019 time 1.5787 (2.6951) loss 2.7937 (2.9166) grad_norm 3.2246 (3.2204) [2022-01-26 18:12:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][60/1251] eta 0:51:47 lr 0.000019 time 2.2375 (2.6091) loss 2.0840 (2.9281) grad_norm 3.4396 (3.1990) [2022-01-26 18:12:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][70/1251] eta 0:50:23 lr 0.000019 time 1.5381 (2.5605) loss 2.5282 (2.9422) grad_norm 3.1466 (3.1816) [2022-01-26 18:13:11 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][80/1251] eta 0:49:20 lr 0.000019 time 3.1107 (2.5285) loss 3.3333 (2.9334) grad_norm 3.4609 (3.1885) [2022-01-26 18:13:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][90/1251] eta 0:48:15 lr 0.000019 time 1.8999 (2.4943) loss 2.8510 (2.9537) grad_norm 3.3345 (3.2111) [2022-01-26 18:13:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][100/1251] eta 0:47:31 lr 0.000019 time 2.5777 (2.4770) loss 3.1134 (2.9547) grad_norm 3.2247 (3.2156) [2022-01-26 18:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][110/1251] eta 0:46:37 lr 0.000019 time 1.5987 (2.4514) loss 3.2376 (2.9780) grad_norm 2.9933 (3.2118) [2022-01-26 18:14:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][120/1251] eta 0:45:32 lr 0.000019 time 2.0477 (2.4163) loss 3.3916 (2.9895) grad_norm 3.3434 (3.2299) [2022-01-26 18:15:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][130/1251] eta 0:44:46 lr 0.000019 time 1.7724 (2.3963) loss 3.4630 (2.9822) grad_norm 3.4578 (3.2272) [2022-01-26 18:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][140/1251] eta 0:43:55 lr 0.000019 time 2.8846 (2.3722) loss 2.2190 (2.9833) grad_norm 3.4930 (3.2179) [2022-01-26 18:15:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][150/1251] eta 0:43:09 lr 0.000019 time 1.9580 (2.3521) loss 2.8190 (2.9774) grad_norm 3.1145 (3.2152) [2022-01-26 18:16:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][160/1251] eta 0:43:21 lr 0.000019 time 1.9227 (2.3843) loss 3.6689 (2.9745) grad_norm 3.2642 (3.2040) [2022-01-26 18:16:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][170/1251] eta 0:42:50 lr 0.000019 time 2.9315 (2.3779) loss 2.7009 (2.9658) grad_norm 3.2336 (3.2062) [2022-01-26 18:16:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][180/1251] eta 0:42:12 lr 0.000019 
time 1.8542 (2.3650) loss 3.4220 (2.9614) grad_norm 2.6153 (3.1982) [2022-01-26 18:17:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][190/1251] eta 0:41:35 lr 0.000019 time 2.0341 (2.3517) loss 2.4108 (2.9573) grad_norm 3.0383 (3.2060) [2022-01-26 18:17:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][200/1251] eta 0:41:02 lr 0.000019 time 1.7945 (2.3430) loss 2.7701 (2.9530) grad_norm 2.8776 (3.2015) [2022-01-26 18:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][210/1251] eta 0:40:25 lr 0.000019 time 1.9520 (2.3301) loss 2.0945 (2.9548) grad_norm 3.2305 (3.1972) [2022-01-26 18:18:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][220/1251] eta 0:39:49 lr 0.000019 time 2.4676 (2.3172) loss 3.1775 (2.9696) grad_norm 4.7554 (3.1994) [2022-01-26 18:18:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][230/1251] eta 0:39:16 lr 0.000019 time 1.8772 (2.3077) loss 2.9424 (2.9631) grad_norm 2.9923 (3.1941) [2022-01-26 18:19:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][240/1251] eta 0:38:45 lr 0.000019 time 1.9221 (2.2999) loss 2.0027 (2.9519) grad_norm 2.9916 (3.1884) [2022-01-26 18:19:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][250/1251] eta 0:38:17 lr 0.000019 time 1.6246 (2.2954) loss 3.4081 (2.9553) grad_norm 2.9112 (3.1846) [2022-01-26 18:19:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][260/1251] eta 0:37:49 lr 0.000019 time 1.8297 (2.2904) loss 2.6371 (2.9567) grad_norm 3.1467 (3.1824) [2022-01-26 18:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][270/1251] eta 0:37:33 lr 0.000019 time 2.1850 (2.2967) loss 2.2752 (2.9576) grad_norm 3.4228 (3.1853) [2022-01-26 18:20:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][280/1251] eta 0:37:09 lr 0.000019 time 1.6028 (2.2956) loss 2.2767 (2.9504) grad_norm 3.2389 (3.1926) [2022-01-26 18:20:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][290/1251] eta 0:36:41 lr 0.000019 time 1.9587 (2.2909) loss 2.4654 (2.9523) grad_norm 4.1826 (3.1942) [2022-01-26 18:21:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][300/1251] eta 0:36:11 lr 0.000019 time 1.5922 (2.2833) loss 2.9730 (2.9599) grad_norm 2.9430 (3.2053) [2022-01-26 18:21:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][310/1251] eta 0:35:42 lr 0.000019 time 1.9177 (2.2766) loss 3.3210 (2.9614) grad_norm 3.6809 (3.2059) [2022-01-26 18:21:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][320/1251] eta 0:35:13 lr 0.000019 time 2.1280 (2.2702) loss 3.3012 (2.9689) grad_norm 2.9805 (3.2041) [2022-01-26 18:22:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][330/1251] eta 0:34:50 lr 0.000019 time 1.9217 (2.2697) loss 3.6553 (2.9799) grad_norm 3.3584 (3.2035) [2022-01-26 18:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][340/1251] eta 0:34:29 lr 0.000019 time 2.4572 (2.2713) loss 3.2436 (2.9786) grad_norm 3.1707 (3.1984) [2022-01-26 18:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][350/1251] eta 0:34:02 lr 0.000018 time 1.7496 (2.2668) loss 2.9037 (2.9841) grad_norm 3.1841 (3.1992) [2022-01-26 18:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][360/1251] eta 0:33:41 lr 0.000018 time 2.1017 (2.2690) loss 3.0576 (2.9856) grad_norm 2.7129 (3.1961) [2022-01-26 18:23:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][370/1251] eta 0:33:17 lr 0.000018 time 1.8717 (2.2673) loss 2.7954 (2.9802) grad_norm 2.7341 (3.2055) [2022-01-26 18:24:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][380/1251] eta 0:32:52 lr 0.000018 time 2.2610 (2.2649) loss 3.1113 (2.9808) grad_norm 3.5095 (3.2070) [2022-01-26 18:24:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][390/1251] eta 0:32:28 lr 
0.000018 time 2.0469 (2.2628) loss 3.1789 (2.9776) grad_norm 2.9904 (3.2222) [2022-01-26 18:24:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][400/1251] eta 0:32:03 lr 0.000018 time 1.9205 (2.2605) loss 3.6949 (2.9793) grad_norm 2.9601 (3.2183) [2022-01-26 18:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][410/1251] eta 0:31:39 lr 0.000018 time 1.8265 (2.2584) loss 2.6932 (2.9831) grad_norm 2.8423 (3.2144) [2022-01-26 18:25:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][420/1251] eta 0:31:14 lr 0.000018 time 2.2666 (2.2556) loss 3.5340 (2.9862) grad_norm 3.2331 (3.2115) [2022-01-26 18:25:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][430/1251] eta 0:30:50 lr 0.000018 time 2.1920 (2.2535) loss 2.4569 (2.9846) grad_norm 2.6738 (3.2087) [2022-01-26 18:26:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][440/1251] eta 0:30:26 lr 0.000018 time 1.9644 (2.2519) loss 2.0347 (2.9745) grad_norm 3.6032 (3.2067) [2022-01-26 18:26:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][450/1251] eta 0:30:04 lr 0.000018 time 2.4015 (2.2526) loss 3.5936 (2.9747) grad_norm 3.1592 (3.2071) [2022-01-26 18:27:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][460/1251] eta 0:29:41 lr 0.000018 time 1.9246 (2.2521) loss 3.6105 (2.9787) grad_norm 3.2666 (3.2039) [2022-01-26 18:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][470/1251] eta 0:29:17 lr 0.000018 time 2.1777 (2.2509) loss 2.7855 (2.9727) grad_norm 2.8038 (3.2038) [2022-01-26 18:27:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][480/1251] eta 0:28:52 lr 0.000018 time 2.1949 (2.2476) loss 3.2655 (2.9706) grad_norm 2.8070 (3.1979) [2022-01-26 18:28:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][490/1251] eta 0:28:28 lr 0.000018 time 1.9194 (2.2452) loss 3.2303 (2.9732) grad_norm 3.1463 (3.1927) [2022-01-26 18:28:30 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][500/1251] eta 0:28:05 lr 0.000018 time 2.2366 (2.2438) loss 2.8442 (2.9756) grad_norm 4.0192 (3.1938) [2022-01-26 18:28:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][510/1251] eta 0:27:39 lr 0.000018 time 2.2867 (2.2394) loss 2.3608 (2.9757) grad_norm 3.1466 (3.1906) [2022-01-26 18:29:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][520/1251] eta 0:27:14 lr 0.000018 time 2.2185 (2.2364) loss 2.8087 (2.9728) grad_norm 3.1428 (3.1985) [2022-01-26 18:29:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][530/1251] eta 0:26:50 lr 0.000018 time 1.8120 (2.2335) loss 2.8212 (2.9748) grad_norm 3.6639 (3.2005) [2022-01-26 18:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][540/1251] eta 0:26:30 lr 0.000018 time 3.0370 (2.2369) loss 2.9539 (2.9756) grad_norm 2.9703 (3.2045) [2022-01-26 18:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][550/1251] eta 0:26:08 lr 0.000018 time 2.2378 (2.2377) loss 2.1752 (2.9699) grad_norm 3.1899 (3.2027) [2022-01-26 18:30:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][560/1251] eta 0:25:46 lr 0.000018 time 2.5049 (2.2379) loss 3.5302 (2.9730) grad_norm 2.8343 (3.2052) [2022-01-26 18:31:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][570/1251] eta 0:25:24 lr 0.000018 time 2.1849 (2.2385) loss 3.3765 (2.9780) grad_norm 3.4146 (3.2053) [2022-01-26 18:31:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][580/1251] eta 0:25:01 lr 0.000018 time 2.5269 (2.2377) loss 2.6146 (2.9743) grad_norm 3.2595 (3.2033) [2022-01-26 18:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][590/1251] eta 0:24:37 lr 0.000018 time 1.8402 (2.2346) loss 3.0512 (2.9761) grad_norm 3.6128 (3.2022) [2022-01-26 18:32:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][600/1251] eta 0:24:13 lr 
0.000018 time 2.4443 (2.2321) loss 3.5202 (2.9761) grad_norm 2.9418 (3.2028) [2022-01-26 18:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][610/1251] eta 0:23:49 lr 0.000018 time 1.8419 (2.2304) loss 3.2739 (2.9762) grad_norm 3.0576 (3.2017) [2022-01-26 18:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][620/1251] eta 0:23:25 lr 0.000018 time 1.9543 (2.2279) loss 2.3350 (2.9697) grad_norm 3.5307 (3.1995) [2022-01-26 18:33:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][630/1251] eta 0:23:03 lr 0.000018 time 2.4573 (2.2281) loss 3.1144 (2.9674) grad_norm 3.1621 (3.1989) [2022-01-26 18:33:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][640/1251] eta 0:22:41 lr 0.000018 time 2.1671 (2.2280) loss 2.2767 (2.9700) grad_norm 3.1803 (3.1989) [2022-01-26 18:33:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][650/1251] eta 0:22:17 lr 0.000018 time 1.8430 (2.2251) loss 3.0053 (2.9709) grad_norm 2.4841 (3.1947) [2022-01-26 18:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][660/1251] eta 0:21:53 lr 0.000018 time 1.7231 (2.2228) loss 3.2800 (2.9700) grad_norm 2.9190 (3.1905) [2022-01-26 18:34:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][670/1251] eta 0:21:33 lr 0.000018 time 3.1288 (2.2255) loss 3.0203 (2.9709) grad_norm 2.8982 (3.1880) [2022-01-26 18:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][680/1251] eta 0:21:12 lr 0.000018 time 1.7161 (2.2288) loss 2.2502 (2.9692) grad_norm 3.3071 (3.1878) [2022-01-26 18:35:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][690/1251] eta 0:20:50 lr 0.000018 time 1.5171 (2.2286) loss 3.3991 (2.9714) grad_norm 2.8209 (3.1863) [2022-01-26 18:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][700/1251] eta 0:20:27 lr 0.000018 time 2.1751 (2.2273) loss 3.3490 (2.9724) grad_norm 3.1644 (3.1858) [2022-01-26 18:36:09 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][710/1251] eta 0:20:04 lr 0.000018 time 2.3522 (2.2258) loss 3.2505 (2.9729) grad_norm 3.3875 (3.1879) [2022-01-26 18:36:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][720/1251] eta 0:19:41 lr 0.000018 time 1.9703 (2.2259) loss 3.3189 (2.9753) grad_norm 2.8912 (3.1850) [2022-01-26 18:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][730/1251] eta 0:19:18 lr 0.000018 time 1.7794 (2.2239) loss 3.5431 (2.9759) grad_norm 2.7038 (3.1824) [2022-01-26 18:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][740/1251] eta 0:18:56 lr 0.000018 time 1.9810 (2.2236) loss 3.1093 (2.9748) grad_norm 2.9254 (3.1802) [2022-01-26 18:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][750/1251] eta 0:18:34 lr 0.000018 time 1.8614 (2.2239) loss 3.3548 (2.9774) grad_norm 3.0979 (3.1801) [2022-01-26 18:37:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][760/1251] eta 0:18:12 lr 0.000018 time 1.8390 (2.2248) loss 3.0823 (2.9766) grad_norm 2.9169 (3.1780) [2022-01-26 18:38:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][770/1251] eta 0:17:49 lr 0.000018 time 2.1001 (2.2243) loss 3.2515 (2.9777) grad_norm 2.7662 (3.1831) [2022-01-26 18:38:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][780/1251] eta 0:17:25 lr 0.000018 time 1.9444 (2.2207) loss 3.7515 (2.9791) grad_norm 3.1583 (3.1815) [2022-01-26 18:39:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][790/1251] eta 0:17:02 lr 0.000018 time 1.9362 (2.2190) loss 3.3498 (2.9812) grad_norm 2.9228 (3.1831) [2022-01-26 18:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][800/1251] eta 0:16:40 lr 0.000018 time 2.2743 (2.2178) loss 3.4197 (2.9810) grad_norm 2.7499 (3.1819) [2022-01-26 18:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][810/1251] eta 0:16:17 lr 
0.000018 time 1.8720 (2.2161) loss 3.4235 (2.9799) grad_norm 3.1318 (3.1825) [2022-01-26 18:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][820/1251] eta 0:15:54 lr 0.000018 time 2.4595 (2.2156) loss 3.3649 (2.9823) grad_norm 3.5443 (3.1850) [2022-01-26 18:40:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][830/1251] eta 0:15:33 lr 0.000018 time 2.7476 (2.2174) loss 3.7906 (2.9823) grad_norm 2.7891 (3.1884) [2022-01-26 18:40:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][840/1251] eta 0:15:12 lr 0.000018 time 3.0943 (2.2190) loss 3.1199 (2.9812) grad_norm 2.9655 (3.1877) [2022-01-26 18:41:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][850/1251] eta 0:14:50 lr 0.000018 time 1.6326 (2.2201) loss 2.8832 (2.9831) grad_norm 3.1116 (3.1861) [2022-01-26 18:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][860/1251] eta 0:14:27 lr 0.000018 time 1.9332 (2.2192) loss 2.1907 (2.9809) grad_norm 5.9905 (3.1911) [2022-01-26 18:41:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][870/1251] eta 0:14:04 lr 0.000018 time 1.8340 (2.2174) loss 2.1778 (2.9794) grad_norm 3.2027 (3.1922) [2022-01-26 18:42:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][880/1251] eta 0:13:41 lr 0.000018 time 2.3699 (2.2149) loss 3.2739 (2.9804) grad_norm 3.3536 (3.1934) [2022-01-26 18:42:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][890/1251] eta 0:13:19 lr 0.000018 time 1.7215 (2.2153) loss 3.7659 (2.9824) grad_norm 3.2915 (3.1941) [2022-01-26 18:43:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][900/1251] eta 0:12:57 lr 0.000018 time 2.3921 (2.2155) loss 3.4926 (2.9851) grad_norm 2.9023 (3.1970) [2022-01-26 18:43:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][910/1251] eta 0:12:35 lr 0.000018 time 1.8631 (2.2150) loss 3.4655 (2.9865) grad_norm 2.8338 (3.1966) [2022-01-26 18:43:47 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][920/1251] eta 0:12:13 lr 0.000018 time 1.6950 (2.2158) loss 3.3990 (2.9861) grad_norm 3.1688 (3.1965) [2022-01-26 18:44:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][930/1251] eta 0:11:51 lr 0.000018 time 2.1367 (2.2156) loss 3.2795 (2.9850) grad_norm 2.9425 (3.1959) [2022-01-26 18:44:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][940/1251] eta 0:11:29 lr 0.000018 time 2.5274 (2.2155) loss 2.8464 (2.9825) grad_norm 2.8154 (3.1955) [2022-01-26 18:44:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][950/1251] eta 0:11:06 lr 0.000018 time 2.6865 (2.2151) loss 3.0786 (2.9836) grad_norm 2.8999 (3.1939) [2022-01-26 18:45:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][960/1251] eta 0:10:44 lr 0.000018 time 2.1349 (2.2146) loss 2.9954 (2.9831) grad_norm 3.2853 (3.1934) [2022-01-26 18:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][970/1251] eta 0:10:22 lr 0.000018 time 1.8904 (2.2150) loss 3.3118 (2.9814) grad_norm 2.8884 (3.1913) [2022-01-26 18:45:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][980/1251] eta 0:10:00 lr 0.000018 time 2.8140 (2.2143) loss 3.2390 (2.9812) grad_norm 3.9963 (3.1913) [2022-01-26 18:46:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][990/1251] eta 0:09:37 lr 0.000018 time 2.4365 (2.2134) loss 2.8151 (2.9815) grad_norm 2.9165 (3.1905) [2022-01-26 18:46:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1000/1251] eta 0:09:15 lr 0.000018 time 1.8814 (2.2117) loss 2.5276 (2.9837) grad_norm 3.3768 (3.1903) [2022-01-26 18:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1010/1251] eta 0:08:52 lr 0.000018 time 3.1188 (2.2113) loss 3.2207 (2.9829) grad_norm 2.8104 (3.1888) [2022-01-26 18:47:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1020/1251] eta 0:08:30 lr 
0.000018 time 1.9020 (2.2096) loss 2.6592 (2.9828) grad_norm 2.6479 (3.1878) [2022-01-26 18:47:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1030/1251] eta 0:08:08 lr 0.000018 time 2.7552 (2.2102) loss 3.3517 (2.9834) grad_norm 3.3241 (3.1904) [2022-01-26 18:48:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1040/1251] eta 0:07:46 lr 0.000018 time 2.5835 (2.2101) loss 3.2661 (2.9842) grad_norm 3.4554 (3.1891) [2022-01-26 18:48:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1050/1251] eta 0:07:24 lr 0.000018 time 2.7903 (2.2117) loss 3.3633 (2.9845) grad_norm 3.2745 (3.1896) [2022-01-26 18:48:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1060/1251] eta 0:07:02 lr 0.000018 time 1.9287 (2.2124) loss 2.9077 (2.9860) grad_norm 2.8770 (3.1896) [2022-01-26 18:49:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1070/1251] eta 0:06:40 lr 0.000018 time 1.9240 (2.2106) loss 3.5482 (2.9855) grad_norm 2.9769 (3.1877) [2022-01-26 18:49:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1080/1251] eta 0:06:17 lr 0.000018 time 1.9296 (2.2089) loss 3.4278 (2.9870) grad_norm 3.1043 (3.1866) [2022-01-26 18:49:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1090/1251] eta 0:05:55 lr 0.000018 time 2.1250 (2.2076) loss 2.9580 (2.9863) grad_norm 3.2860 (3.1855) [2022-01-26 18:50:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1100/1251] eta 0:05:33 lr 0.000018 time 2.0406 (2.2111) loss 2.9930 (2.9859) grad_norm 3.4453 (3.1835) [2022-01-26 18:50:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1110/1251] eta 0:05:12 lr 0.000018 time 1.4992 (2.2137) loss 2.7424 (2.9853) grad_norm 3.0400 (3.1844) [2022-01-26 18:51:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1120/1251] eta 0:04:49 lr 0.000018 time 1.6864 (2.2129) loss 3.0884 (2.9853) grad_norm 2.8381 (3.1850) [2022-01-26 
18:51:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1130/1251] eta 0:04:27 lr 0.000018 time 1.7952 (2.2118) loss 2.9552 (2.9864) grad_norm 3.0876 (3.1846) [2022-01-26 18:51:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1140/1251] eta 0:04:05 lr 0.000018 time 1.9919 (2.2115) loss 2.0484 (2.9864) grad_norm 3.2902 (3.1832) [2022-01-26 18:52:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1150/1251] eta 0:03:43 lr 0.000018 time 1.8433 (2.2101) loss 3.6240 (2.9864) grad_norm 2.9096 (3.1818) [2022-01-26 18:52:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1160/1251] eta 0:03:21 lr 0.000018 time 1.8524 (2.2089) loss 2.3463 (2.9850) grad_norm 2.6608 (3.1795) [2022-01-26 18:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1170/1251] eta 0:02:58 lr 0.000018 time 2.0205 (2.2094) loss 3.2985 (2.9848) grad_norm 2.9167 (3.1792) [2022-01-26 18:53:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1180/1251] eta 0:02:36 lr 0.000018 time 2.1492 (2.2102) loss 3.0012 (2.9868) grad_norm 3.2926 (3.1790) [2022-01-26 18:53:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1190/1251] eta 0:02:14 lr 0.000018 time 2.5677 (2.2105) loss 3.0130 (2.9873) grad_norm 3.2830 (3.1814) [2022-01-26 18:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1200/1251] eta 0:01:52 lr 0.000018 time 1.6212 (2.2092) loss 2.7704 (2.9881) grad_norm 2.8411 (3.1803) [2022-01-26 18:54:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1210/1251] eta 0:01:30 lr 0.000018 time 2.8741 (2.2090) loss 1.7951 (2.9855) grad_norm 3.1645 (3.1804) [2022-01-26 18:54:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1220/1251] eta 0:01:08 lr 0.000018 time 1.8914 (2.2081) loss 2.3040 (2.9854) grad_norm 3.0495 (3.1801) [2022-01-26 18:55:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1230/1251] 
eta 0:00:46 lr 0.000018 time 2.2455 (2.2083) loss 3.2611 (2.9836) grad_norm 3.0269 (3.1785) [2022-01-26 18:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1240/1251] eta 0:00:24 lr 0.000018 time 1.4908 (2.2068) loss 2.4431 (2.9839) grad_norm 3.1828 (3.1792) [2022-01-26 18:55:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [282/300][1250/1251] eta 0:00:02 lr 0.000018 time 1.2194 (2.2014) loss 2.9074 (2.9848) grad_norm 2.8784 (3.1785) [2022-01-26 18:55:41 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 282 training takes 0:45:54 [2022-01-26 18:55:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.008 (18.008) Loss 0.8499 (0.8499) Acc@1 80.273 (80.273) Acc@5 95.410 (95.410) [2022-01-26 18:56:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.870 (3.191) Loss 0.7982 (0.8040) Acc@1 80.859 (81.463) Acc@5 95.898 (95.588) [2022-01-26 18:56:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.313 (2.515) Loss 0.7837 (0.8060) Acc@1 81.738 (81.180) Acc@5 95.898 (95.457) [2022-01-26 18:56:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.301 (2.294) Loss 0.8533 (0.8054) Acc@1 80.078 (81.168) Acc@5 94.922 (95.558) [2022-01-26 18:57:08 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.323 (2.138) Loss 0.7869 (0.8043) Acc@1 80.273 (81.212) Acc@5 95.996 (95.534) [2022-01-26 18:57:16 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.122 Acc@5 95.476 [2022-01-26 18:57:16 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 18:57:16 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.12% [2022-01-26 18:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][0/1251] eta 7:25:10 lr 0.000018 time 21.3513 (21.3513) loss 2.0182 (2.0182) grad_norm 3.7964 (3.7964) [2022-01-26 18:58:02 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [283/300][10/1251] eta 1:25:33 lr 0.000018 time 2.5411 (4.1370) loss 3.2635 (2.9736) grad_norm 2.6946 (3.2154) [2022-01-26 18:58:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][20/1251] eta 1:07:06 lr 0.000018 time 1.8523 (3.2707) loss 2.9956 (2.9179) grad_norm 3.3501 (3.2407) [2022-01-26 18:58:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][30/1251] eta 0:58:31 lr 0.000018 time 1.8639 (2.8759) loss 3.6474 (2.9539) grad_norm 2.8772 (3.3468) [2022-01-26 18:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][40/1251] eta 0:55:10 lr 0.000018 time 3.1101 (2.7337) loss 3.2599 (2.9195) grad_norm 3.0236 (3.3178) [2022-01-26 18:59:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][50/1251] eta 0:53:13 lr 0.000018 time 2.7843 (2.6589) loss 2.0741 (2.9333) grad_norm 3.2697 (3.2823) [2022-01-26 18:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][60/1251] eta 0:51:20 lr 0.000018 time 2.0778 (2.5864) loss 3.1788 (2.9577) grad_norm 2.8797 (3.2762) [2022-01-26 19:00:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][70/1251] eta 0:49:21 lr 0.000018 time 1.8493 (2.5079) loss 3.2909 (2.9617) grad_norm 3.2042 (3.2832) [2022-01-26 19:00:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][80/1251] eta 0:48:10 lr 0.000018 time 3.2473 (2.4688) loss 3.0864 (2.9611) grad_norm 3.0399 (3.2641) [2022-01-26 19:00:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][90/1251] eta 0:47:14 lr 0.000018 time 2.4774 (2.4411) loss 2.2281 (2.9630) grad_norm 3.0507 (3.2533) [2022-01-26 19:01:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][100/1251] eta 0:46:21 lr 0.000018 time 1.5271 (2.4165) loss 1.8306 (2.9419) grad_norm 2.7993 (3.2409) [2022-01-26 19:01:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][110/1251] eta 0:45:30 lr 0.000018 time 1.5280 (2.3931) loss 3.4587 (2.9475) grad_norm 
3.2219 (3.2307) [2022-01-26 19:02:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][120/1251] eta 0:44:41 lr 0.000018 time 2.2796 (2.3710) loss 3.0167 (2.9400) grad_norm 3.7508 (3.2401) [2022-01-26 19:02:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][130/1251] eta 0:44:14 lr 0.000018 time 3.2738 (2.3683) loss 2.6622 (2.9220) grad_norm 3.1190 (3.2343) [2022-01-26 19:02:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][140/1251] eta 0:43:29 lr 0.000018 time 1.9073 (2.3488) loss 1.9119 (2.9163) grad_norm 3.3838 (3.2215) [2022-01-26 19:03:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][150/1251] eta 0:42:43 lr 0.000018 time 1.9190 (2.3285) loss 2.1444 (2.9270) grad_norm 3.1366 (3.2167) [2022-01-26 19:03:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][160/1251] eta 0:41:55 lr 0.000018 time 1.9151 (2.3059) loss 2.1169 (2.9267) grad_norm 3.2733 (3.2183) [2022-01-26 19:03:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][170/1251] eta 0:41:26 lr 0.000018 time 2.8670 (2.2999) loss 2.6780 (2.9329) grad_norm 3.4918 (3.2206) [2022-01-26 19:04:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][180/1251] eta 0:40:56 lr 0.000018 time 1.7413 (2.2938) loss 3.1420 (2.9429) grad_norm 3.0613 (3.2237) [2022-01-26 19:04:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][190/1251] eta 0:40:34 lr 0.000018 time 2.3461 (2.2945) loss 3.3923 (2.9554) grad_norm 3.2026 (3.2204) [2022-01-26 19:04:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][200/1251] eta 0:40:17 lr 0.000018 time 3.4114 (2.3006) loss 3.6283 (2.9750) grad_norm 3.2681 (3.2168) [2022-01-26 19:05:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][210/1251] eta 0:40:02 lr 0.000018 time 3.0917 (2.3077) loss 2.4371 (2.9752) grad_norm 3.6281 (3.2128) [2022-01-26 19:05:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[283/300][220/1251] eta 0:39:24 lr 0.000018 time 2.0359 (2.2934) loss 2.5878 (2.9773) grad_norm 3.4168 (3.2232) [2022-01-26 19:06:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][230/1251] eta 0:38:47 lr 0.000018 time 1.8767 (2.2797) loss 3.0230 (2.9814) grad_norm 3.7232 (3.2286) [2022-01-26 19:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][240/1251] eta 0:38:16 lr 0.000018 time 2.5515 (2.2713) loss 3.5627 (2.9924) grad_norm 2.8325 (3.2273) [2022-01-26 19:06:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][250/1251] eta 0:37:47 lr 0.000018 time 2.5213 (2.2652) loss 2.1600 (2.9878) grad_norm 3.3984 (3.2295) [2022-01-26 19:07:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][260/1251] eta 0:37:26 lr 0.000018 time 2.6750 (2.2665) loss 3.4357 (2.9839) grad_norm 3.7051 (3.2278) [2022-01-26 19:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][270/1251] eta 0:36:59 lr 0.000018 time 1.8365 (2.2622) loss 3.4568 (2.9931) grad_norm 3.0776 (3.2317) [2022-01-26 19:07:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][280/1251] eta 0:36:46 lr 0.000018 time 3.0390 (2.2720) loss 2.8177 (2.9943) grad_norm 2.9491 (3.2306) [2022-01-26 19:08:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][290/1251] eta 0:36:22 lr 0.000018 time 2.2091 (2.2714) loss 3.1775 (2.9939) grad_norm 2.8787 (3.2290) [2022-01-26 19:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][300/1251] eta 0:35:55 lr 0.000018 time 2.4841 (2.2670) loss 2.9302 (2.9963) grad_norm 3.5052 (3.2345) [2022-01-26 19:08:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][310/1251] eta 0:35:25 lr 0.000018 time 1.8873 (2.2586) loss 2.2774 (2.9861) grad_norm 3.1648 (3.2330) [2022-01-26 19:09:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][320/1251] eta 0:34:57 lr 0.000018 time 2.8779 (2.2533) loss 3.4089 (2.9944) grad_norm 
2.9704 (3.2250) [2022-01-26 19:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][330/1251] eta 0:34:30 lr 0.000018 time 1.9550 (2.2485) loss 4.0974 (2.9977) grad_norm 3.6962 (3.2218) [2022-01-26 19:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][340/1251] eta 0:34:09 lr 0.000018 time 2.4870 (2.2501) loss 3.2468 (3.0038) grad_norm 3.7910 (3.2246) [2022-01-26 19:10:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][350/1251] eta 0:33:45 lr 0.000018 time 1.9437 (2.2479) loss 2.4839 (2.9987) grad_norm 3.1072 (3.2270) [2022-01-26 19:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][360/1251] eta 0:33:24 lr 0.000018 time 2.9691 (2.2500) loss 3.4650 (3.0002) grad_norm 3.1781 (3.2224) [2022-01-26 19:11:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][370/1251] eta 0:33:01 lr 0.000018 time 2.5637 (2.2487) loss 2.6274 (2.9995) grad_norm 3.1346 (3.2228) [2022-01-26 19:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][380/1251] eta 0:32:33 lr 0.000018 time 2.2714 (2.2423) loss 3.0306 (2.9905) grad_norm 3.9744 (3.2279) [2022-01-26 19:11:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][390/1251] eta 0:32:04 lr 0.000018 time 1.9230 (2.2351) loss 2.9091 (2.9892) grad_norm 3.1067 (3.2281) [2022-01-26 19:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][400/1251] eta 0:31:41 lr 0.000018 time 2.4392 (2.2348) loss 3.3032 (2.9946) grad_norm 2.9231 (3.2269) [2022-01-26 19:12:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][410/1251] eta 0:31:20 lr 0.000018 time 1.7275 (2.2361) loss 2.5168 (2.9917) grad_norm 3.4869 (3.2259) [2022-01-26 19:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][420/1251] eta 0:30:57 lr 0.000018 time 2.1532 (2.2358) loss 3.6418 (2.9927) grad_norm 3.5104 (3.2242) [2022-01-26 19:13:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[283/300][430/1251] eta 0:30:33 lr 0.000018 time 2.3100 (2.2329) loss 2.5804 (2.9892) grad_norm 2.9719 (3.2186) [2022-01-26 19:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][440/1251] eta 0:30:11 lr 0.000018 time 2.1196 (2.2334) loss 3.7132 (2.9935) grad_norm 4.0819 (3.2201) [2022-01-26 19:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][450/1251] eta 0:29:49 lr 0.000017 time 1.9983 (2.2336) loss 2.9575 (2.9940) grad_norm 3.1472 (3.2177) [2022-01-26 19:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][460/1251] eta 0:29:23 lr 0.000017 time 1.6624 (2.2292) loss 2.6073 (2.9907) grad_norm 3.5712 (3.2189) [2022-01-26 19:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][470/1251] eta 0:28:58 lr 0.000017 time 2.0499 (2.2261) loss 3.4697 (2.9883) grad_norm 3.4000 (3.2217) [2022-01-26 19:15:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][480/1251] eta 0:28:34 lr 0.000017 time 2.4280 (2.2234) loss 3.5403 (2.9822) grad_norm 3.7637 (3.2277) [2022-01-26 19:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][490/1251] eta 0:28:14 lr 0.000017 time 2.1486 (2.2272) loss 3.3973 (2.9850) grad_norm 3.1745 (3.2248) [2022-01-26 19:15:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][500/1251] eta 0:27:52 lr 0.000017 time 1.8761 (2.2277) loss 3.2407 (2.9844) grad_norm 3.3173 (3.2246) [2022-01-26 19:16:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][510/1251] eta 0:27:29 lr 0.000017 time 2.1197 (2.2258) loss 3.3332 (2.9856) grad_norm 2.8054 (3.2218) [2022-01-26 19:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][520/1251] eta 0:27:06 lr 0.000017 time 2.5581 (2.2253) loss 2.7072 (2.9882) grad_norm 3.5020 (3.2198) [2022-01-26 19:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][530/1251] eta 0:26:42 lr 0.000017 time 2.1853 (2.2226) loss 3.2969 (2.9896) grad_norm 
3.3701 (3.2406) [2022-01-26 19:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][540/1251] eta 0:26:20 lr 0.000017 time 2.6901 (2.2229) loss 3.1320 (2.9875) grad_norm 2.9230 (3.2417) [2022-01-26 19:17:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][550/1251] eta 0:25:58 lr 0.000017 time 2.5110 (2.2231) loss 3.5728 (2.9899) grad_norm 2.7028 (3.2386) [2022-01-26 19:18:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][560/1251] eta 0:25:34 lr 0.000017 time 2.0669 (2.2209) loss 3.4464 (2.9928) grad_norm 3.6755 (3.2536) [2022-01-26 19:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][570/1251] eta 0:25:13 lr 0.000017 time 1.8372 (2.2225) loss 2.1792 (2.9886) grad_norm 3.3964 (3.2581) [2022-01-26 19:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][580/1251] eta 0:24:51 lr 0.000017 time 2.2654 (2.2231) loss 2.7497 (2.9885) grad_norm 3.1785 (3.2572) [2022-01-26 19:19:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][590/1251] eta 0:24:28 lr 0.000017 time 1.8818 (2.2212) loss 2.4508 (2.9906) grad_norm 3.2958 (3.2567) [2022-01-26 19:19:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][600/1251] eta 0:24:03 lr 0.000017 time 1.8755 (2.2168) loss 3.2899 (2.9912) grad_norm 3.1231 (3.2585) [2022-01-26 19:19:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][610/1251] eta 0:23:38 lr 0.000017 time 2.1066 (2.2135) loss 3.5040 (2.9930) grad_norm 3.2102 (3.2591) [2022-01-26 19:20:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][620/1251] eta 0:23:15 lr 0.000017 time 2.1633 (2.2121) loss 3.2142 (2.9954) grad_norm 3.3594 (3.2609) [2022-01-26 19:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][630/1251] eta 0:22:54 lr 0.000017 time 1.9629 (2.2131) loss 3.2913 (2.9980) grad_norm 3.3635 (3.2606) [2022-01-26 19:20:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[283/300][640/1251] eta 0:22:34 lr 0.000017 time 1.6497 (2.2172) loss 3.5275 (2.9947) grad_norm 3.2090 (3.2573) [2022-01-26 19:21:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][650/1251] eta 0:22:13 lr 0.000017 time 2.4817 (2.2180) loss 3.1788 (2.9959) grad_norm 2.5666 (3.2536) [2022-01-26 19:21:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][660/1251] eta 0:21:49 lr 0.000017 time 1.7512 (2.2155) loss 2.7446 (2.9936) grad_norm 3.0772 (3.2517) [2022-01-26 19:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][670/1251] eta 0:21:26 lr 0.000017 time 1.8578 (2.2143) loss 3.4083 (2.9926) grad_norm 2.9200 (3.2538) [2022-01-26 19:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][680/1251] eta 0:21:03 lr 0.000017 time 1.9997 (2.2122) loss 3.2423 (2.9926) grad_norm 3.3216 (3.2605) [2022-01-26 19:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][690/1251] eta 0:20:42 lr 0.000017 time 3.0833 (2.2142) loss 3.4713 (2.9903) grad_norm 3.0714 (3.2575) [2022-01-26 19:23:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][700/1251] eta 0:20:19 lr 0.000017 time 2.1191 (2.2141) loss 2.5591 (2.9904) grad_norm 2.9657 (3.2569) [2022-01-26 19:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][710/1251] eta 0:19:57 lr 0.000017 time 2.1863 (2.2131) loss 3.4882 (2.9911) grad_norm 3.2004 (3.2548) [2022-01-26 19:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][720/1251] eta 0:19:33 lr 0.000017 time 2.2365 (2.2107) loss 3.7578 (2.9925) grad_norm 3.3625 (3.2527) [2022-01-26 19:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][730/1251] eta 0:19:10 lr 0.000017 time 2.1685 (2.2076) loss 3.3331 (2.9945) grad_norm 3.4159 (3.2544) [2022-01-26 19:24:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][740/1251] eta 0:18:46 lr 0.000017 time 2.0291 (2.2044) loss 3.0468 (2.9963) grad_norm 
3.4683 (3.2522) [2022-01-26 19:24:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][750/1251] eta 0:18:23 lr 0.000017 time 2.2411 (2.2034) loss 2.1696 (2.9946) grad_norm 4.6996 (3.2543) [2022-01-26 19:25:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][760/1251] eta 0:18:02 lr 0.000017 time 2.7771 (2.2046) loss 3.3672 (2.9968) grad_norm 3.0259 (3.2544) [2022-01-26 19:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][770/1251] eta 0:17:41 lr 0.000017 time 2.1575 (2.2075) loss 2.6668 (2.9959) grad_norm 3.2560 (3.2524) [2022-01-26 19:26:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][780/1251] eta 0:17:20 lr 0.000017 time 1.9466 (2.2088) loss 3.3001 (2.9952) grad_norm 2.9256 (3.2554) [2022-01-26 19:26:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][790/1251] eta 0:16:58 lr 0.000017 time 1.9014 (2.2097) loss 2.8447 (2.9961) grad_norm 2.8779 (3.2550) [2022-01-26 19:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][800/1251] eta 0:16:36 lr 0.000017 time 1.5973 (2.2102) loss 3.1614 (2.9957) grad_norm 2.9658 (3.2534) [2022-01-26 19:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][810/1251] eta 0:16:13 lr 0.000017 time 1.6041 (2.2079) loss 2.0527 (2.9924) grad_norm 2.9304 (3.2512) [2022-01-26 19:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][820/1251] eta 0:15:51 lr 0.000017 time 1.9512 (2.2076) loss 3.6677 (2.9933) grad_norm 4.4456 (3.2531) [2022-01-26 19:27:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][830/1251] eta 0:15:29 lr 0.000017 time 1.9551 (2.2075) loss 3.7435 (2.9928) grad_norm 3.2632 (3.2526) [2022-01-26 19:28:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][840/1251] eta 0:15:06 lr 0.000017 time 1.8022 (2.2061) loss 1.9469 (2.9912) grad_norm 3.1246 (3.2519) [2022-01-26 19:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[283/300][850/1251] eta 0:14:44 lr 0.000017 time 2.5078 (2.2060) loss 2.3475 (2.9908) grad_norm 3.3619 (3.2506) [2022-01-26 19:28:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][860/1251] eta 0:14:22 lr 0.000017 time 1.7019 (2.2051) loss 3.5198 (2.9921) grad_norm 3.9180 (3.2523) [2022-01-26 19:29:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][870/1251] eta 0:14:00 lr 0.000017 time 2.5979 (2.2073) loss 2.8809 (2.9935) grad_norm 3.2018 (3.2504) [2022-01-26 19:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][880/1251] eta 0:13:39 lr 0.000017 time 2.4352 (2.2089) loss 2.4408 (2.9929) grad_norm 3.0572 (3.2509) [2022-01-26 19:30:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][890/1251] eta 0:13:16 lr 0.000017 time 2.5175 (2.2076) loss 2.7205 (2.9928) grad_norm 3.2279 (3.2478) [2022-01-26 19:30:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][900/1251] eta 0:12:53 lr 0.000017 time 1.6255 (2.2050) loss 3.3681 (2.9908) grad_norm 2.7226 (3.2463) [2022-01-26 19:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][910/1251] eta 0:12:31 lr 0.000017 time 2.3817 (2.2042) loss 3.0970 (2.9898) grad_norm 3.5062 (3.2449) [2022-01-26 19:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][920/1251] eta 0:12:09 lr 0.000017 time 1.9593 (2.2037) loss 3.2253 (2.9898) grad_norm 3.7380 (3.2446) [2022-01-26 19:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][930/1251] eta 0:11:47 lr 0.000017 time 2.8178 (2.2048) loss 3.5639 (2.9913) grad_norm 3.5642 (3.2437) [2022-01-26 19:31:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][940/1251] eta 0:11:25 lr 0.000017 time 1.8885 (2.2028) loss 3.0529 (2.9888) grad_norm 3.1950 (3.2463) [2022-01-26 19:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][950/1251] eta 0:11:02 lr 0.000017 time 1.9219 (2.2022) loss 3.1435 (2.9869) grad_norm 
3.9516 (3.2481) [2022-01-26 19:32:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][960/1251] eta 0:10:41 lr 0.000017 time 2.5849 (2.2035) loss 3.2102 (2.9879) grad_norm 3.2681 (3.2468) [2022-01-26 19:32:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][970/1251] eta 0:10:19 lr 0.000017 time 1.7199 (2.2034) loss 1.8814 (2.9871) grad_norm 3.4726 (3.2459) [2022-01-26 19:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][980/1251] eta 0:09:57 lr 0.000017 time 2.7454 (2.2041) loss 3.4023 (2.9866) grad_norm 3.2653 (3.2448) [2022-01-26 19:33:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][990/1251] eta 0:09:34 lr 0.000017 time 1.6332 (2.2029) loss 2.8070 (2.9881) grad_norm 2.8916 (3.2450) [2022-01-26 19:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1000/1251] eta 0:09:13 lr 0.000017 time 2.7948 (2.2033) loss 2.8451 (2.9866) grad_norm 3.1252 (3.2477) [2022-01-26 19:34:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1010/1251] eta 0:08:50 lr 0.000017 time 1.9836 (2.2021) loss 3.3575 (2.9893) grad_norm 3.0155 (3.2480) [2022-01-26 19:34:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1020/1251] eta 0:08:28 lr 0.000017 time 1.9243 (2.2005) loss 2.9484 (2.9883) grad_norm 2.7920 (3.2453) [2022-01-26 19:35:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1030/1251] eta 0:08:05 lr 0.000017 time 2.5506 (2.1985) loss 3.3954 (2.9895) grad_norm 2.7574 (3.2449) [2022-01-26 19:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1040/1251] eta 0:07:43 lr 0.000017 time 1.8889 (2.1971) loss 3.2001 (2.9871) grad_norm 3.0973 (3.2450) [2022-01-26 19:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1050/1251] eta 0:07:21 lr 0.000017 time 1.9227 (2.1962) loss 3.5536 (2.9892) grad_norm 3.3925 (3.2451) [2022-01-26 19:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[283/300][1060/1251] eta 0:06:59 lr 0.000017 time 2.5962 (2.1966) loss 3.3138 (2.9910) grad_norm 2.8427 (3.2476) [2022-01-26 19:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1070/1251] eta 0:06:37 lr 0.000017 time 2.5483 (2.1972) loss 3.0315 (2.9893) grad_norm 3.4261 (3.2468) [2022-01-26 19:36:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1080/1251] eta 0:06:15 lr 0.000017 time 2.2042 (2.1978) loss 3.4017 (2.9892) grad_norm 3.3330 (3.2456) [2022-01-26 19:37:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1090/1251] eta 0:05:54 lr 0.000017 time 2.1605 (2.2001) loss 2.8308 (2.9891) grad_norm 3.8562 (3.2458) [2022-01-26 19:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1100/1251] eta 0:05:32 lr 0.000017 time 3.0316 (2.2034) loss 1.9911 (2.9870) grad_norm 2.9210 (3.2439) [2022-01-26 19:38:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1110/1251] eta 0:05:10 lr 0.000017 time 2.4706 (2.2047) loss 2.1763 (2.9884) grad_norm 3.5234 (3.2431) [2022-01-26 19:38:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1120/1251] eta 0:04:48 lr 0.000017 time 1.5646 (2.2024) loss 2.4431 (2.9892) grad_norm 3.3421 (3.2426) [2022-01-26 19:38:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1130/1251] eta 0:04:26 lr 0.000017 time 2.2540 (2.1995) loss 3.0741 (2.9904) grad_norm 3.1319 (3.2422) [2022-01-26 19:39:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1140/1251] eta 0:04:03 lr 0.000017 time 1.6761 (2.1975) loss 2.6413 (2.9893) grad_norm 2.9745 (3.2408) [2022-01-26 19:39:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1150/1251] eta 0:03:41 lr 0.000017 time 1.6121 (2.1973) loss 3.1526 (2.9892) grad_norm 3.4850 (3.2394) [2022-01-26 19:39:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1160/1251] eta 0:03:19 lr 0.000017 time 2.1386 (2.1975) loss 2.5033 (2.9888) 
grad_norm 3.1438 (3.2384) [2022-01-26 19:40:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1170/1251] eta 0:02:58 lr 0.000017 time 4.5503 (2.1989) loss 2.7690 (2.9882) grad_norm 2.7970 (3.2382) [2022-01-26 19:40:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1180/1251] eta 0:02:36 lr 0.000017 time 1.5279 (2.2001) loss 1.9119 (2.9875) grad_norm 2.6376 (3.2390) [2022-01-26 19:40:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1190/1251] eta 0:02:14 lr 0.000017 time 2.1322 (2.2011) loss 2.4570 (2.9872) grad_norm 3.6722 (3.2401) [2022-01-26 19:41:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1200/1251] eta 0:01:52 lr 0.000017 time 2.1852 (2.2010) loss 3.4861 (2.9875) grad_norm 3.4393 (3.2398) [2022-01-26 19:41:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1210/1251] eta 0:01:30 lr 0.000017 time 6.7036 (2.2039) loss 2.9437 (2.9886) grad_norm 3.4767 (3.2398) [2022-01-26 19:42:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1220/1251] eta 0:01:08 lr 0.000017 time 1.9404 (2.2020) loss 2.0660 (2.9874) grad_norm 2.9701 (3.2399) [2022-01-26 19:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1230/1251] eta 0:00:46 lr 0.000017 time 1.8864 (2.2006) loss 3.2942 (2.9875) grad_norm 2.9772 (3.2410) [2022-01-26 19:42:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1240/1251] eta 0:00:24 lr 0.000017 time 1.5085 (2.1985) loss 3.3230 (2.9879) grad_norm 2.9686 (3.2407) [2022-01-26 19:43:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [283/300][1250/1251] eta 0:00:02 lr 0.000017 time 1.2073 (2.1934) loss 2.9152 (2.9876) grad_norm 2.7937 (3.2395) [2022-01-26 19:43:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 283 training takes 0:45:44 [2022-01-26 19:43:20 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.177 (19.177) Loss 0.8055 (0.8055) Acc@1 81.738 (81.738) 
Acc@5 95.020 (95.020) [2022-01-26 19:43:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.165 (3.440) Loss 0.7974 (0.8268) Acc@1 81.934 (80.700) Acc@5 95.996 (95.153) [2022-01-26 19:43:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.592 (2.628) Loss 0.8746 (0.8132) Acc@1 79.590 (81.017) Acc@5 94.824 (95.382) [2022-01-26 19:44:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.672 (2.283) Loss 0.8128 (0.8041) Acc@1 81.836 (81.206) Acc@5 94.824 (95.489) [2022-01-26 19:44:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.114 (2.163) Loss 0.7759 (0.8054) Acc@1 82.031 (81.129) Acc@5 95.312 (95.467) [2022-01-26 19:44:36 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.094 Acc@5 95.420 [2022-01-26 19:44:36 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 19:44:36 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.12% [2022-01-26 19:44:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][0/1251] eta 7:38:21 lr 0.000017 time 21.9835 (21.9835) loss 3.8739 (3.8739) grad_norm 3.2706 (3.2706) [2022-01-26 19:45:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][10/1251] eta 1:26:21 lr 0.000017 time 1.5419 (4.1749) loss 2.6135 (3.0912) grad_norm 2.9470 (3.2750) [2022-01-26 19:45:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][20/1251] eta 1:04:28 lr 0.000017 time 1.6419 (3.1423) loss 2.2333 (2.9027) grad_norm 3.1347 (3.3082) [2022-01-26 19:46:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][30/1251] eta 0:57:32 lr 0.000017 time 1.8898 (2.8272) loss 2.7754 (2.9048) grad_norm 3.3233 (3.3357) [2022-01-26 19:46:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][40/1251] eta 0:54:32 lr 0.000017 time 3.5912 (2.7021) loss 3.4845 (2.8963) grad_norm 3.4373 (3.2886) [2022-01-26 19:46:53 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][50/1251] eta 0:53:34 lr 0.000017 time 2.7811 (2.6761) loss 2.5111 (2.8972) grad_norm 3.0443 (3.3328) [2022-01-26 19:47:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][60/1251] eta 0:51:49 lr 0.000017 time 2.4939 (2.6108) loss 3.2338 (2.8844) grad_norm 2.9726 (3.3105) [2022-01-26 19:47:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][70/1251] eta 0:50:18 lr 0.000017 time 1.9103 (2.5559) loss 3.4202 (2.9144) grad_norm 2.7206 (3.3179) [2022-01-26 19:47:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][80/1251] eta 0:48:49 lr 0.000017 time 1.9714 (2.5018) loss 3.3835 (2.9245) grad_norm 3.1381 (3.2804) [2022-01-26 19:48:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][90/1251] eta 0:47:49 lr 0.000017 time 2.8569 (2.4717) loss 2.9794 (2.9606) grad_norm 2.8697 (3.2753) [2022-01-26 19:48:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][100/1251] eta 0:46:34 lr 0.000017 time 1.8433 (2.4281) loss 2.8044 (2.9448) grad_norm 3.6095 (3.2684) [2022-01-26 19:49:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][110/1251] eta 0:45:47 lr 0.000017 time 2.4170 (2.4080) loss 3.5727 (2.9605) grad_norm 3.1680 (3.2508) [2022-01-26 19:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][120/1251] eta 0:45:05 lr 0.000017 time 1.7950 (2.3924) loss 2.2372 (2.9553) grad_norm 3.4356 (3.2555) [2022-01-26 19:49:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][130/1251] eta 0:44:43 lr 0.000017 time 3.2448 (2.3942) loss 3.3028 (2.9536) grad_norm 3.7937 (3.2618) [2022-01-26 19:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][140/1251] eta 0:43:54 lr 0.000017 time 1.6583 (2.3711) loss 3.5177 (2.9721) grad_norm 4.1672 (3.2427) [2022-01-26 19:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][150/1251] eta 0:43:22 lr 0.000017 
time 2.4909 (2.3637) loss 3.3508 (2.9685) grad_norm 2.9171 (3.2350) [2022-01-26 19:50:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][160/1251] eta 0:42:41 lr 0.000017 time 1.8869 (2.3482) loss 2.9798 (2.9625) grad_norm 3.0781 (3.2478) [2022-01-26 19:51:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][170/1251] eta 0:42:06 lr 0.000017 time 2.7638 (2.3367) loss 3.1530 (2.9653) grad_norm 3.7188 (3.2489) [2022-01-26 19:51:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][180/1251] eta 0:41:19 lr 0.000017 time 1.5301 (2.3153) loss 3.5224 (2.9834) grad_norm 3.2059 (3.2423) [2022-01-26 19:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][190/1251] eta 0:40:36 lr 0.000017 time 1.7772 (2.2968) loss 3.1870 (2.9880) grad_norm 3.3712 (3.2484) [2022-01-26 19:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][200/1251] eta 0:40:08 lr 0.000017 time 2.3222 (2.2917) loss 2.2799 (2.9836) grad_norm 3.6495 (3.2507) [2022-01-26 19:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][210/1251] eta 0:39:49 lr 0.000017 time 2.8941 (2.2953) loss 3.0556 (2.9850) grad_norm 3.1262 (3.2450) [2022-01-26 19:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][220/1251] eta 0:39:28 lr 0.000017 time 2.1277 (2.2972) loss 2.5546 (2.9820) grad_norm 2.9982 (3.2477) [2022-01-26 19:53:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][230/1251] eta 0:39:03 lr 0.000017 time 3.1588 (2.2952) loss 3.4866 (2.9904) grad_norm 3.8269 (3.2508) [2022-01-26 19:53:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][240/1251] eta 0:38:40 lr 0.000017 time 2.6721 (2.2949) loss 3.2193 (2.9838) grad_norm 3.3655 (3.2573) [2022-01-26 19:54:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][250/1251] eta 0:38:13 lr 0.000017 time 2.4839 (2.2911) loss 2.9629 (2.9660) grad_norm 3.0288 (3.2532) [2022-01-26 19:54:32 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][260/1251] eta 0:37:43 lr 0.000017 time 1.7824 (2.2844) loss 3.1769 (2.9680) grad_norm 3.2442 (3.2476) [2022-01-26 19:54:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][270/1251] eta 0:37:18 lr 0.000017 time 1.8942 (2.2815) loss 3.4085 (2.9677) grad_norm 3.5025 (3.2495) [2022-01-26 19:55:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][280/1251] eta 0:36:53 lr 0.000017 time 1.9186 (2.2792) loss 2.0093 (2.9655) grad_norm 2.7057 (3.2472) [2022-01-26 19:55:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][290/1251] eta 0:36:21 lr 0.000017 time 1.6183 (2.2701) loss 2.2168 (2.9638) grad_norm 3.4060 (3.2444) [2022-01-26 19:55:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][300/1251] eta 0:35:54 lr 0.000017 time 2.5236 (2.2660) loss 2.7650 (2.9704) grad_norm 3.1000 (3.2702) [2022-01-26 19:56:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][310/1251] eta 0:35:28 lr 0.000017 time 1.8282 (2.2618) loss 3.3278 (2.9638) grad_norm 3.4870 (3.2680) [2022-01-26 19:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][320/1251] eta 0:35:03 lr 0.000017 time 2.2876 (2.2591) loss 2.9475 (2.9616) grad_norm 3.5700 (3.2675) [2022-01-26 19:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][330/1251] eta 0:34:40 lr 0.000017 time 2.1781 (2.2595) loss 3.3616 (2.9660) grad_norm 3.1434 (3.2684) [2022-01-26 19:57:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][340/1251] eta 0:34:18 lr 0.000017 time 2.1523 (2.2597) loss 3.5394 (2.9662) grad_norm 3.2313 (3.2640) [2022-01-26 19:57:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][350/1251] eta 0:33:50 lr 0.000017 time 2.1865 (2.2534) loss 3.1279 (2.9672) grad_norm 3.0226 (3.2623) [2022-01-26 19:58:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][360/1251] eta 0:33:23 lr 
0.000017 time 2.0496 (2.2486) loss 3.2832 (2.9716) grad_norm 3.2215 (3.2599) [2022-01-26 19:58:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][370/1251] eta 0:32:58 lr 0.000017 time 2.1759 (2.2454) loss 3.2129 (2.9720) grad_norm 3.2884 (3.2639) [2022-01-26 19:58:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][380/1251] eta 0:32:34 lr 0.000017 time 1.8969 (2.2445) loss 2.0220 (2.9709) grad_norm 3.7892 (3.2639) [2022-01-26 19:59:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][390/1251] eta 0:32:10 lr 0.000017 time 2.7394 (2.2425) loss 2.8230 (2.9747) grad_norm 3.2510 (3.2679) [2022-01-26 19:59:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][400/1251] eta 0:31:45 lr 0.000017 time 1.9580 (2.2390) loss 2.1183 (2.9684) grad_norm 3.3083 (3.2683) [2022-01-26 19:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][410/1251] eta 0:31:25 lr 0.000017 time 2.7868 (2.2420) loss 2.4654 (2.9685) grad_norm 3.4070 (3.2706) [2022-01-26 20:00:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][420/1251] eta 0:31:05 lr 0.000017 time 2.7684 (2.2448) loss 3.0650 (2.9681) grad_norm 3.0321 (3.2665) [2022-01-26 20:00:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][430/1251] eta 0:30:41 lr 0.000017 time 1.7586 (2.2430) loss 3.6100 (2.9674) grad_norm 3.3346 (3.2643) [2022-01-26 20:01:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][440/1251] eta 0:30:17 lr 0.000017 time 1.6143 (2.2405) loss 3.2522 (2.9726) grad_norm 4.0566 (3.2634) [2022-01-26 20:01:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][450/1251] eta 0:29:51 lr 0.000017 time 2.5013 (2.2372) loss 3.2302 (2.9716) grad_norm 3.2362 (3.2581) [2022-01-26 20:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][460/1251] eta 0:29:27 lr 0.000017 time 2.5407 (2.2349) loss 1.8949 (2.9706) grad_norm 3.6513 (3.2567) [2022-01-26 20:02:07 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][470/1251] eta 0:29:02 lr 0.000017 time 2.0055 (2.2308) loss 2.4448 (2.9690) grad_norm 3.4575 (3.2589) [2022-01-26 20:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][480/1251] eta 0:28:39 lr 0.000017 time 2.2301 (2.2303) loss 3.3054 (2.9729) grad_norm 2.8973 (3.2564) [2022-01-26 20:02:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][490/1251] eta 0:28:17 lr 0.000017 time 1.9388 (2.2305) loss 3.3452 (2.9774) grad_norm 2.3502 (3.2529) [2022-01-26 20:03:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][500/1251] eta 0:27:56 lr 0.000017 time 3.1332 (2.2321) loss 2.1482 (2.9756) grad_norm 3.1843 (3.2532) [2022-01-26 20:03:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][510/1251] eta 0:27:31 lr 0.000017 time 1.9822 (2.2291) loss 2.1822 (2.9784) grad_norm 3.1101 (3.2518) [2022-01-26 20:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][520/1251] eta 0:27:05 lr 0.000017 time 2.6116 (2.2242) loss 3.4836 (2.9767) grad_norm 3.2168 (3.2527) [2022-01-26 20:04:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][530/1251] eta 0:26:43 lr 0.000017 time 2.5343 (2.2238) loss 3.3862 (2.9796) grad_norm 3.2546 (3.2521) [2022-01-26 20:04:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][540/1251] eta 0:26:19 lr 0.000017 time 1.8369 (2.2217) loss 3.2964 (2.9817) grad_norm 2.9837 (3.2529) [2022-01-26 20:05:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][550/1251] eta 0:25:57 lr 0.000017 time 1.7004 (2.2214) loss 2.4299 (2.9854) grad_norm 3.0404 (3.2536) [2022-01-26 20:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][560/1251] eta 0:25:35 lr 0.000017 time 2.3392 (2.2215) loss 1.9374 (2.9838) grad_norm 3.3237 (3.2540) [2022-01-26 20:05:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][570/1251] eta 0:25:11 lr 
0.000017 time 2.1557 (2.2199) loss 3.3101 (2.9824) grad_norm 3.7338 (3.2622) [2022-01-26 20:06:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][580/1251] eta 0:24:49 lr 0.000017 time 1.5770 (2.2200) loss 2.8311 (2.9820) grad_norm 2.9228 (3.2625) [2022-01-26 20:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][590/1251] eta 0:24:27 lr 0.000017 time 1.9079 (2.2207) loss 3.1704 (2.9841) grad_norm 3.3934 (3.2614) [2022-01-26 20:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][600/1251] eta 0:24:05 lr 0.000017 time 2.1916 (2.2210) loss 3.4737 (2.9861) grad_norm 2.5937 (3.2597) [2022-01-26 20:07:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][610/1251] eta 0:23:43 lr 0.000017 time 1.8033 (2.2212) loss 3.1340 (2.9843) grad_norm 3.1982 (3.2716) [2022-01-26 20:07:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][620/1251] eta 0:23:20 lr 0.000017 time 1.8619 (2.2187) loss 2.7907 (2.9819) grad_norm 3.1105 (3.2689) [2022-01-26 20:07:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][630/1251] eta 0:22:57 lr 0.000017 time 1.7357 (2.2186) loss 2.1374 (2.9819) grad_norm 2.9871 (3.2700) [2022-01-26 20:08:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][640/1251] eta 0:22:36 lr 0.000016 time 2.1036 (2.2194) loss 3.6371 (2.9811) grad_norm 3.0415 (3.2701) [2022-01-26 20:08:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][650/1251] eta 0:22:12 lr 0.000016 time 1.9416 (2.2178) loss 2.3096 (2.9807) grad_norm 4.7448 (3.2695) [2022-01-26 20:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][660/1251] eta 0:21:50 lr 0.000016 time 2.4682 (2.2167) loss 2.8746 (2.9803) grad_norm 2.8990 (3.2653) [2022-01-26 20:09:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][670/1251] eta 0:21:27 lr 0.000016 time 1.8681 (2.2166) loss 3.3398 (2.9822) grad_norm 3.8323 (3.2639) [2022-01-26 20:09:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][680/1251] eta 0:21:04 lr 0.000016 time 1.9169 (2.2148) loss 3.0741 (2.9816) grad_norm 3.2894 (3.2615) [2022-01-26 20:10:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][690/1251] eta 0:20:41 lr 0.000016 time 1.9413 (2.2139) loss 2.7585 (2.9840) grad_norm 2.9017 (3.2591) [2022-01-26 20:10:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][700/1251] eta 0:20:20 lr 0.000016 time 1.8913 (2.2145) loss 2.9717 (2.9857) grad_norm 3.0009 (3.2580) [2022-01-26 20:10:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][710/1251] eta 0:19:57 lr 0.000016 time 1.8471 (2.2131) loss 3.4805 (2.9865) grad_norm 2.6184 (3.2584) [2022-01-26 20:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][720/1251] eta 0:19:34 lr 0.000016 time 2.2842 (2.2127) loss 3.9120 (2.9916) grad_norm 3.6765 (3.2579) [2022-01-26 20:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][730/1251] eta 0:19:11 lr 0.000016 time 1.5575 (2.2106) loss 2.3245 (2.9896) grad_norm 3.1561 (3.2575) [2022-01-26 20:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][740/1251] eta 0:18:49 lr 0.000016 time 1.8319 (2.2096) loss 3.4375 (2.9904) grad_norm 3.2135 (3.2579) [2022-01-26 20:12:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][750/1251] eta 0:18:27 lr 0.000016 time 2.1213 (2.2098) loss 3.1473 (2.9918) grad_norm 2.9924 (3.2572) [2022-01-26 20:12:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][760/1251] eta 0:18:05 lr 0.000016 time 2.7125 (2.2105) loss 3.2027 (2.9891) grad_norm 2.8662 (3.2547) [2022-01-26 20:13:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][770/1251] eta 0:17:43 lr 0.000016 time 1.8537 (2.2114) loss 2.8931 (2.9868) grad_norm 3.2142 (3.2560) [2022-01-26 20:13:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][780/1251] eta 0:17:21 lr 
0.000016 time 1.5687 (2.2119) loss 3.1846 (2.9856) grad_norm 3.0242 (3.2533) [2022-01-26 20:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][790/1251] eta 0:16:58 lr 0.000016 time 1.5560 (2.2104) loss 2.3104 (2.9832) grad_norm 2.9805 (3.2516) [2022-01-26 20:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][800/1251] eta 0:16:36 lr 0.000016 time 3.0569 (2.2106) loss 2.2275 (2.9823) grad_norm 3.2306 (3.2541) [2022-01-26 20:14:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][810/1251] eta 0:16:14 lr 0.000016 time 1.5161 (2.2090) loss 3.3807 (2.9842) grad_norm 2.8382 (3.2549) [2022-01-26 20:14:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][820/1251] eta 0:15:52 lr 0.000016 time 2.8304 (2.2098) loss 2.0349 (2.9844) grad_norm 3.1503 (3.2545) [2022-01-26 20:15:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][830/1251] eta 0:15:30 lr 0.000016 time 2.0193 (2.2092) loss 3.6453 (2.9862) grad_norm 2.9686 (3.2523) [2022-01-26 20:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][840/1251] eta 0:15:08 lr 0.000016 time 3.0613 (2.2108) loss 2.6269 (2.9868) grad_norm 3.2068 (3.2499) [2022-01-26 20:15:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][850/1251] eta 0:14:46 lr 0.000016 time 1.8996 (2.2096) loss 3.1025 (2.9871) grad_norm 3.1279 (3.2492) [2022-01-26 20:16:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][860/1251] eta 0:14:23 lr 0.000016 time 2.2563 (2.2085) loss 3.3986 (2.9907) grad_norm 3.1944 (3.2483) [2022-01-26 20:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][870/1251] eta 0:14:01 lr 0.000016 time 2.1843 (2.2074) loss 2.9478 (2.9904) grad_norm 3.4377 (3.2476) [2022-01-26 20:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][880/1251] eta 0:13:38 lr 0.000016 time 2.2559 (2.2057) loss 2.4210 (2.9902) grad_norm 2.8424 (3.2460) [2022-01-26 20:17:19 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][890/1251] eta 0:13:15 lr 0.000016 time 1.6024 (2.2033) loss 3.1826 (2.9902) grad_norm 3.3105 (3.2463) [2022-01-26 20:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][900/1251] eta 0:12:52 lr 0.000016 time 1.8978 (2.2012) loss 3.1268 (2.9892) grad_norm 2.4934 (3.2443) [2022-01-26 20:18:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][910/1251] eta 0:12:30 lr 0.000016 time 2.2060 (2.2003) loss 3.2622 (2.9896) grad_norm 3.5424 (3.2443) [2022-01-26 20:18:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][920/1251] eta 0:12:08 lr 0.000016 time 1.7462 (2.2002) loss 3.0261 (2.9911) grad_norm 3.1753 (3.2423) [2022-01-26 20:18:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][930/1251] eta 0:11:46 lr 0.000016 time 2.3293 (2.2017) loss 2.0449 (2.9922) grad_norm 3.1779 (3.2526) [2022-01-26 20:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][940/1251] eta 0:11:24 lr 0.000016 time 1.8496 (2.2018) loss 3.3296 (2.9927) grad_norm 4.0548 (3.2549) [2022-01-26 20:19:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][950/1251] eta 0:11:02 lr 0.000016 time 1.8413 (2.2016) loss 3.5798 (2.9936) grad_norm 3.1066 (3.2535) [2022-01-26 20:19:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][960/1251] eta 0:10:40 lr 0.000016 time 1.9487 (2.2016) loss 2.4424 (2.9928) grad_norm 2.7552 (3.2550) [2022-01-26 20:20:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][970/1251] eta 0:10:18 lr 0.000016 time 2.1798 (2.2028) loss 3.1427 (2.9931) grad_norm 3.0308 (3.2557) [2022-01-26 20:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][980/1251] eta 0:09:57 lr 0.000016 time 2.7209 (2.2040) loss 2.8761 (2.9915) grad_norm 2.8391 (3.2557) [2022-01-26 20:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][990/1251] eta 0:09:35 lr 
0.000016 time 2.5937 (2.2051) loss 2.9759 (2.9918) grad_norm 3.4375 (3.2562) [2022-01-26 20:21:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1000/1251] eta 0:09:13 lr 0.000016 time 1.8924 (2.2041) loss 3.4195 (2.9913) grad_norm 3.0915 (3.2555) [2022-01-26 20:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1010/1251] eta 0:08:50 lr 0.000016 time 1.5962 (2.2029) loss 3.2574 (2.9908) grad_norm 2.6617 (3.2533) [2022-01-26 20:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1020/1251] eta 0:08:28 lr 0.000016 time 2.2818 (2.2025) loss 3.4259 (2.9920) grad_norm 3.0843 (3.2516) [2022-01-26 20:22:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1030/1251] eta 0:08:07 lr 0.000016 time 3.7205 (2.2036) loss 3.5585 (2.9895) grad_norm 3.2198 (3.2532) [2022-01-26 20:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1040/1251] eta 0:07:44 lr 0.000016 time 2.1416 (2.2035) loss 2.5454 (2.9879) grad_norm 2.9127 (3.2517) [2022-01-26 20:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1050/1251] eta 0:07:22 lr 0.000016 time 1.6647 (2.2036) loss 2.5747 (2.9884) grad_norm 3.7874 (3.2517) [2022-01-26 20:23:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1060/1251] eta 0:07:00 lr 0.000016 time 2.2809 (2.2036) loss 2.3033 (2.9863) grad_norm 2.9461 (3.2513) [2022-01-26 20:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1070/1251] eta 0:06:38 lr 0.000016 time 2.4455 (2.2040) loss 2.4624 (2.9861) grad_norm 2.9997 (3.2487) [2022-01-26 20:24:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1080/1251] eta 0:06:16 lr 0.000016 time 1.8818 (2.2023) loss 3.4530 (2.9866) grad_norm 3.7339 (3.2496) [2022-01-26 20:24:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1090/1251] eta 0:05:54 lr 0.000016 time 1.8369 (2.2026) loss 3.0327 (2.9871) grad_norm 3.4861 (3.2492) [2022-01-26 
20:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1100/1251] eta 0:05:32 lr 0.000016 time 1.9336 (2.2025) loss 3.3310 (2.9894) grad_norm 2.7146 (3.2493) [2022-01-26 20:25:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1110/1251] eta 0:05:10 lr 0.000016 time 2.6953 (2.2024) loss 3.0730 (2.9884) grad_norm 2.8927 (3.2478) [2022-01-26 20:25:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1120/1251] eta 0:04:48 lr 0.000016 time 2.9787 (2.2023) loss 2.3968 (2.9890) grad_norm 3.1345 (3.2480) [2022-01-26 20:26:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1130/1251] eta 0:04:26 lr 0.000016 time 1.5846 (2.2022) loss 2.6674 (2.9882) grad_norm 3.2288 (3.2471) [2022-01-26 20:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1140/1251] eta 0:04:04 lr 0.000016 time 1.9466 (2.2003) loss 1.9321 (2.9856) grad_norm 2.8216 (3.2465) [2022-01-26 20:26:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1150/1251] eta 0:03:42 lr 0.000016 time 2.2472 (2.2005) loss 3.2515 (2.9844) grad_norm 3.1433 (3.2461) [2022-01-26 20:27:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1160/1251] eta 0:03:20 lr 0.000016 time 1.6944 (2.1995) loss 2.8034 (2.9835) grad_norm 3.0157 (3.2460) [2022-01-26 20:27:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1170/1251] eta 0:02:58 lr 0.000016 time 2.5511 (2.2000) loss 2.9031 (2.9831) grad_norm 3.1694 (3.2448) [2022-01-26 20:27:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1180/1251] eta 0:02:36 lr 0.000016 time 2.2495 (2.2001) loss 2.8393 (2.9828) grad_norm 3.1046 (3.2448) [2022-01-26 20:28:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1190/1251] eta 0:02:14 lr 0.000016 time 1.6567 (2.2004) loss 2.7731 (2.9844) grad_norm 3.7067 (3.2449) [2022-01-26 20:28:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1200/1251] 
eta 0:01:52 lr 0.000016 time 1.9119 (2.2007) loss 3.2139 (2.9838) grad_norm 2.8328 (3.2429) [2022-01-26 20:29:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1210/1251] eta 0:01:30 lr 0.000016 time 2.4462 (2.2003) loss 3.2629 (2.9828) grad_norm 2.7386 (3.2409) [2022-01-26 20:29:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1220/1251] eta 0:01:08 lr 0.000016 time 2.4670 (2.1991) loss 2.8490 (2.9835) grad_norm 3.9804 (3.2418) [2022-01-26 20:29:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1230/1251] eta 0:00:46 lr 0.000016 time 2.0143 (2.1995) loss 2.6757 (2.9823) grad_norm 3.3518 (3.2398) [2022-01-26 20:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1240/1251] eta 0:00:24 lr 0.000016 time 1.7512 (2.1988) loss 3.1944 (2.9809) grad_norm 2.9472 (3.2412) [2022-01-26 20:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [284/300][1250/1251] eta 0:00:02 lr 0.000016 time 1.1766 (2.1928) loss 3.3686 (2.9799) grad_norm 3.4767 (3.2410) [2022-01-26 20:30:20 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 284 training takes 0:45:43 [2022-01-26 20:30:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.392 (18.392) Loss 0.8142 (0.8142) Acc@1 80.273 (80.273) Acc@5 95.801 (95.801) [2022-01-26 20:30:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.265 (3.280) Loss 0.8877 (0.8193) Acc@1 79.980 (80.975) Acc@5 93.848 (95.348) [2022-01-26 20:31:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.614 (2.506) Loss 0.8383 (0.8168) Acc@1 79.004 (80.957) Acc@5 95.801 (95.415) [2022-01-26 20:31:28 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.889 (2.186) Loss 0.7528 (0.8112) Acc@1 82.715 (81.039) Acc@5 95.996 (95.407) [2022-01-26 20:31:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.763 (2.120) Loss 0.8228 (0.8114) Acc@1 80.176 (81.033) Acc@5 95.312 (95.398) 
[2022-01-26 20:31:54 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.106 Acc@5 95.444 [2022-01-26 20:31:54 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 20:31:54 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.12% [2022-01-26 20:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][0/1251] eta 7:28:08 lr 0.000016 time 21.4934 (21.4934) loss 3.1062 (3.1062) grad_norm 3.3720 (3.3720) [2022-01-26 20:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][10/1251] eta 1:22:44 lr 0.000016 time 2.5470 (4.0000) loss 3.3531 (2.9951) grad_norm 2.9546 (3.0227) [2022-01-26 20:33:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][20/1251] eta 1:04:07 lr 0.000016 time 2.1773 (3.1253) loss 3.0714 (2.9377) grad_norm 3.3835 (3.0853) [2022-01-26 20:33:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][30/1251] eta 0:57:14 lr 0.000016 time 1.6105 (2.8131) loss 3.4863 (3.0110) grad_norm 3.2951 (3.1262) [2022-01-26 20:33:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][40/1251] eta 0:55:06 lr 0.000016 time 3.6420 (2.7308) loss 2.9621 (3.0444) grad_norm 2.8505 (3.2098) [2022-01-26 20:34:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][50/1251] eta 0:53:05 lr 0.000016 time 3.3731 (2.6523) loss 2.8600 (3.0282) grad_norm 2.5225 (3.1792) [2022-01-26 20:34:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][60/1251] eta 0:50:42 lr 0.000016 time 2.4637 (2.5547) loss 3.0395 (3.0156) grad_norm 2.6656 (3.1726) [2022-01-26 20:34:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][70/1251] eta 0:48:54 lr 0.000016 time 1.6759 (2.4845) loss 2.2561 (3.0027) grad_norm 2.8219 (3.1527) [2022-01-26 20:35:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][80/1251] eta 0:47:32 lr 0.000016 time 2.6799 (2.4363) loss 2.7231 (2.9913) 
grad_norm 3.0848 (3.1618) [2022-01-26 20:35:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][90/1251] eta 0:46:44 lr 0.000016 time 2.3739 (2.4160) loss 3.4875 (2.9767) grad_norm 2.8698 (3.1797) [2022-01-26 20:35:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][100/1251] eta 0:46:05 lr 0.000016 time 2.7551 (2.4024) loss 3.2759 (2.9863) grad_norm 3.0075 (3.1671) [2022-01-26 20:36:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][110/1251] eta 0:45:10 lr 0.000016 time 2.2075 (2.3752) loss 3.3665 (3.0015) grad_norm 2.7274 (3.1870) [2022-01-26 20:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][120/1251] eta 0:44:29 lr 0.000016 time 2.5757 (2.3606) loss 3.5563 (3.0015) grad_norm 4.5432 (3.1893) [2022-01-26 20:37:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][130/1251] eta 0:43:49 lr 0.000016 time 2.8416 (2.3453) loss 3.2284 (3.0096) grad_norm 3.4118 (3.1976) [2022-01-26 20:37:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][140/1251] eta 0:43:15 lr 0.000016 time 2.6081 (2.3358) loss 3.6312 (3.0040) grad_norm 3.6049 (3.1999) [2022-01-26 20:37:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][150/1251] eta 0:42:44 lr 0.000016 time 1.8948 (2.3293) loss 3.3246 (2.9869) grad_norm 3.3079 (3.2174) [2022-01-26 20:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][160/1251] eta 0:42:24 lr 0.000016 time 2.5059 (2.3319) loss 2.1944 (2.9910) grad_norm 3.0746 (3.2138) [2022-01-26 20:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][170/1251] eta 0:41:49 lr 0.000016 time 1.8655 (2.3217) loss 3.1101 (3.0004) grad_norm 2.9318 (3.2168) [2022-01-26 20:38:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][180/1251] eta 0:41:15 lr 0.000016 time 2.5430 (2.3118) loss 2.6495 (2.9748) grad_norm 3.1048 (3.2141) [2022-01-26 20:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [285/300][190/1251] eta 0:40:38 lr 0.000016 time 1.9516 (2.2980) loss 2.2380 (2.9712) grad_norm 3.2192 (3.2097) [2022-01-26 20:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][200/1251] eta 0:40:06 lr 0.000016 time 2.1818 (2.2900) loss 2.6725 (2.9578) grad_norm 3.3221 (3.2059) [2022-01-26 20:39:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][210/1251] eta 0:39:35 lr 0.000016 time 2.1702 (2.2824) loss 2.8561 (2.9600) grad_norm 5.6259 (3.2154) [2022-01-26 20:40:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][220/1251] eta 0:39:05 lr 0.000016 time 2.4882 (2.2749) loss 2.5025 (2.9523) grad_norm 2.8381 (3.2050) [2022-01-26 20:40:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][230/1251] eta 0:38:47 lr 0.000016 time 2.4701 (2.2800) loss 2.7971 (2.9635) grad_norm 3.7990 (3.1969) [2022-01-26 20:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][240/1251] eta 0:38:28 lr 0.000016 time 2.8044 (2.2838) loss 2.9959 (2.9691) grad_norm 2.9157 (3.2085) [2022-01-26 20:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][250/1251] eta 0:38:00 lr 0.000016 time 1.8823 (2.2784) loss 2.4254 (2.9567) grad_norm 3.4685 (3.2198) [2022-01-26 20:41:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][260/1251] eta 0:37:34 lr 0.000016 time 2.7470 (2.2747) loss 2.8268 (2.9594) grad_norm 3.5993 (3.2291) [2022-01-26 20:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][270/1251] eta 0:37:03 lr 0.000016 time 1.8988 (2.2669) loss 3.3007 (2.9591) grad_norm 3.0460 (3.2328) [2022-01-26 20:42:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][280/1251] eta 0:36:35 lr 0.000016 time 1.8532 (2.2609) loss 2.3451 (2.9635) grad_norm 3.1192 (3.2354) [2022-01-26 20:42:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][290/1251] eta 0:36:10 lr 0.000016 time 2.2812 (2.2588) loss 2.4798 (2.9526) 
grad_norm 4.1200 (3.2439) [2022-01-26 20:43:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][300/1251] eta 0:35:43 lr 0.000016 time 1.7746 (2.2541) loss 2.2458 (2.9528) grad_norm 2.9534 (3.2425) [2022-01-26 20:43:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][310/1251] eta 0:35:21 lr 0.000016 time 1.9412 (2.2541) loss 3.2144 (2.9511) grad_norm 3.6111 (3.2436) [2022-01-26 20:43:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][320/1251] eta 0:34:56 lr 0.000016 time 1.8499 (2.2523) loss 1.9503 (2.9565) grad_norm 2.8596 (3.2416) [2022-01-26 20:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][330/1251] eta 0:34:30 lr 0.000016 time 2.5569 (2.2480) loss 3.7722 (2.9624) grad_norm 3.4503 (3.2447) [2022-01-26 20:44:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][340/1251] eta 0:34:05 lr 0.000016 time 2.1506 (2.2458) loss 3.5563 (2.9693) grad_norm 3.1966 (3.2395) [2022-01-26 20:45:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][350/1251] eta 0:33:43 lr 0.000016 time 2.8211 (2.2460) loss 3.0681 (2.9779) grad_norm 3.2237 (3.2363) [2022-01-26 20:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][360/1251] eta 0:33:20 lr 0.000016 time 2.1775 (2.2447) loss 3.4714 (2.9798) grad_norm 3.3250 (3.2382) [2022-01-26 20:45:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][370/1251] eta 0:32:57 lr 0.000016 time 1.8930 (2.2449) loss 3.5135 (2.9819) grad_norm 3.5792 (3.2408) [2022-01-26 20:46:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][380/1251] eta 0:32:32 lr 0.000016 time 2.0928 (2.2412) loss 3.1051 (2.9882) grad_norm 3.3760 (3.2354) [2022-01-26 20:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][390/1251] eta 0:32:05 lr 0.000016 time 2.3008 (2.2363) loss 2.2907 (2.9888) grad_norm 3.3567 (3.2431) [2022-01-26 20:46:49 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [285/300][400/1251] eta 0:31:40 lr 0.000016 time 1.8900 (2.2327) loss 2.8890 (2.9836) grad_norm 3.2346 (3.2367) [2022-01-26 20:47:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][410/1251] eta 0:31:16 lr 0.000016 time 2.4743 (2.2316) loss 2.6128 (2.9752) grad_norm 2.8485 (3.2343) [2022-01-26 20:47:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][420/1251] eta 0:30:49 lr 0.000016 time 1.8981 (2.2259) loss 2.9945 (2.9786) grad_norm 3.8419 (3.2367) [2022-01-26 20:47:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][430/1251] eta 0:30:25 lr 0.000016 time 1.8907 (2.2239) loss 2.0629 (2.9751) grad_norm 3.4898 (3.2409) [2022-01-26 20:48:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][440/1251] eta 0:30:02 lr 0.000016 time 2.8298 (2.2226) loss 1.9646 (2.9705) grad_norm 3.6146 (3.2418) [2022-01-26 20:48:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][450/1251] eta 0:29:42 lr 0.000016 time 2.8099 (2.2255) loss 3.2996 (2.9754) grad_norm 3.1467 (3.2426) [2022-01-26 20:49:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][460/1251] eta 0:29:20 lr 0.000016 time 2.3287 (2.2251) loss 2.9369 (2.9719) grad_norm 2.9622 (3.2429) [2022-01-26 20:49:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][470/1251] eta 0:28:58 lr 0.000016 time 1.5057 (2.2259) loss 3.6645 (2.9709) grad_norm 3.2493 (3.2419) [2022-01-26 20:49:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][480/1251] eta 0:28:36 lr 0.000016 time 2.2937 (2.2258) loss 2.7873 (2.9699) grad_norm 3.5574 (3.2402) [2022-01-26 20:50:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][490/1251] eta 0:28:17 lr 0.000016 time 3.3519 (2.2306) loss 3.2041 (2.9703) grad_norm 3.5645 (3.2380) [2022-01-26 20:50:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][500/1251] eta 0:27:55 lr 0.000016 time 2.8085 (2.2305) loss 2.4928 (2.9719) 
grad_norm 2.5339 (3.2360) [2022-01-26 20:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][510/1251] eta 0:27:31 lr 0.000016 time 1.8894 (2.2282) loss 2.9495 (2.9743) grad_norm 3.4506 (3.2361) [2022-01-26 20:51:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][520/1251] eta 0:27:08 lr 0.000016 time 2.3812 (2.2275) loss 2.9238 (2.9717) grad_norm 3.4782 (3.2394) [2022-01-26 20:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][530/1251] eta 0:26:44 lr 0.000016 time 2.5034 (2.2257) loss 3.5096 (2.9765) grad_norm 3.3059 (3.2389) [2022-01-26 20:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][540/1251] eta 0:26:21 lr 0.000016 time 2.4872 (2.2240) loss 3.2280 (2.9792) grad_norm 3.5536 (3.2407) [2022-01-26 20:52:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][550/1251] eta 0:25:57 lr 0.000016 time 1.9316 (2.2224) loss 3.3337 (2.9824) grad_norm 3.2037 (3.2415) [2022-01-26 20:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][560/1251] eta 0:25:35 lr 0.000016 time 1.9395 (2.2222) loss 3.2581 (2.9860) grad_norm 2.8525 (3.2383) [2022-01-26 20:53:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][570/1251] eta 0:25:12 lr 0.000016 time 2.8278 (2.2208) loss 2.3458 (2.9866) grad_norm 3.6197 (3.2362) [2022-01-26 20:53:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][580/1251] eta 0:24:50 lr 0.000016 time 2.0520 (2.2209) loss 2.2368 (2.9838) grad_norm 2.9227 (3.2343) [2022-01-26 20:53:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][590/1251] eta 0:24:28 lr 0.000016 time 2.3806 (2.2219) loss 2.6349 (2.9784) grad_norm 4.8665 (3.2399) [2022-01-26 20:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][600/1251] eta 0:24:06 lr 0.000016 time 1.7790 (2.2222) loss 3.3068 (2.9794) grad_norm 3.6840 (3.2409) [2022-01-26 20:54:33 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [285/300][610/1251] eta 0:23:45 lr 0.000016 time 3.1118 (2.2244) loss 2.6769 (2.9771) grad_norm 3.0192 (3.2376) [2022-01-26 20:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][620/1251] eta 0:23:21 lr 0.000016 time 1.9987 (2.2208) loss 3.5150 (2.9782) grad_norm 2.7792 (3.2325) [2022-01-26 20:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][630/1251] eta 0:22:56 lr 0.000016 time 2.1668 (2.2173) loss 3.5339 (2.9777) grad_norm 2.9377 (3.2315) [2022-01-26 20:55:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][640/1251] eta 0:22:33 lr 0.000016 time 1.8736 (2.2158) loss 3.4121 (2.9771) grad_norm 3.0840 (3.2295) [2022-01-26 20:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][650/1251] eta 0:22:11 lr 0.000016 time 2.2604 (2.2155) loss 3.2293 (2.9747) grad_norm 3.4249 (3.2284) [2022-01-26 20:56:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][660/1251] eta 0:21:49 lr 0.000016 time 2.2634 (2.2162) loss 3.0510 (2.9725) grad_norm 3.3441 (3.2258) [2022-01-26 20:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][670/1251] eta 0:21:27 lr 0.000016 time 2.5005 (2.2164) loss 2.5212 (2.9739) grad_norm 3.3254 (3.2240) [2022-01-26 20:57:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][680/1251] eta 0:21:04 lr 0.000016 time 1.5768 (2.2138) loss 2.4860 (2.9742) grad_norm 3.7231 (3.2262) [2022-01-26 20:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][690/1251] eta 0:20:41 lr 0.000016 time 2.2366 (2.2132) loss 2.9879 (2.9726) grad_norm 3.0050 (3.2226) [2022-01-26 20:57:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][700/1251] eta 0:20:19 lr 0.000016 time 2.4708 (2.2131) loss 3.5031 (2.9742) grad_norm 3.1451 (3.2228) [2022-01-26 20:58:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][710/1251] eta 0:19:59 lr 0.000016 time 3.8230 (2.2173) loss 3.6110 (2.9772) 
grad_norm 4.5428 (3.2246) [2022-01-26 20:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][720/1251] eta 0:19:37 lr 0.000016 time 1.6267 (2.2173) loss 2.3325 (2.9765) grad_norm 3.1239 (3.2247) [2022-01-26 20:58:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][730/1251] eta 0:19:14 lr 0.000016 time 2.2564 (2.2168) loss 3.3205 (2.9775) grad_norm 3.2246 (3.2257) [2022-01-26 20:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][740/1251] eta 0:18:51 lr 0.000016 time 1.7694 (2.2144) loss 3.4074 (2.9779) grad_norm 3.2641 (3.2307) [2022-01-26 20:59:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][750/1251] eta 0:18:27 lr 0.000016 time 1.8628 (2.2112) loss 3.1768 (2.9809) grad_norm 3.2429 (3.2294) [2022-01-26 20:59:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][760/1251] eta 0:18:05 lr 0.000016 time 2.1448 (2.2104) loss 2.8950 (2.9804) grad_norm 2.8724 (3.2334) [2022-01-26 21:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][770/1251] eta 0:17:43 lr 0.000016 time 2.5984 (2.2100) loss 3.7645 (2.9831) grad_norm 3.0313 (3.2321) [2022-01-26 21:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][780/1251] eta 0:17:20 lr 0.000016 time 1.7692 (2.2095) loss 3.2061 (2.9859) grad_norm 3.1590 (3.2371) [2022-01-26 21:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][790/1251] eta 0:16:58 lr 0.000016 time 3.0974 (2.2097) loss 2.3607 (2.9855) grad_norm 3.1389 (3.2352) [2022-01-26 21:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][800/1251] eta 0:16:38 lr 0.000016 time 2.4402 (2.2129) loss 3.4405 (2.9877) grad_norm 2.9734 (3.2362) [2022-01-26 21:01:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][810/1251] eta 0:16:16 lr 0.000016 time 2.5500 (2.2145) loss 2.6738 (2.9862) grad_norm 2.7226 (3.2335) [2022-01-26 21:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [285/300][820/1251] eta 0:15:53 lr 0.000016 time 1.8311 (2.2133) loss 2.7504 (2.9891) grad_norm 3.5546 (3.2346) [2022-01-26 21:02:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][830/1251] eta 0:15:30 lr 0.000016 time 1.9274 (2.2102) loss 2.1678 (2.9889) grad_norm 3.0999 (3.2342) [2022-01-26 21:02:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][840/1251] eta 0:15:08 lr 0.000016 time 2.2098 (2.2106) loss 2.4795 (2.9849) grad_norm 2.9444 (3.2330) [2022-01-26 21:03:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][850/1251] eta 0:14:46 lr 0.000016 time 2.2012 (2.2102) loss 3.4694 (2.9853) grad_norm 3.1501 (3.2363) [2022-01-26 21:03:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][860/1251] eta 0:14:23 lr 0.000016 time 1.7958 (2.2088) loss 3.4945 (2.9853) grad_norm 3.1117 (3.2355) [2022-01-26 21:03:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][870/1251] eta 0:14:01 lr 0.000016 time 1.9684 (2.2082) loss 2.2871 (2.9848) grad_norm 3.6141 (3.2346) [2022-01-26 21:04:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][880/1251] eta 0:13:39 lr 0.000016 time 1.7581 (2.2094) loss 2.6901 (2.9851) grad_norm 4.0443 (3.2341) [2022-01-26 21:04:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][890/1251] eta 0:13:17 lr 0.000016 time 2.1090 (2.2089) loss 2.3659 (2.9830) grad_norm 2.9597 (3.2314) [2022-01-26 21:05:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][900/1251] eta 0:12:55 lr 0.000016 time 2.4355 (2.2082) loss 3.2581 (2.9828) grad_norm 2.9232 (3.2293) [2022-01-26 21:05:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][910/1251] eta 0:12:32 lr 0.000016 time 1.9452 (2.2075) loss 3.3780 (2.9844) grad_norm 3.0832 (3.2280) [2022-01-26 21:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][920/1251] eta 0:12:09 lr 0.000016 time 1.9819 (2.2051) loss 2.9871 (2.9860) 
grad_norm 3.0138 (3.2277) [2022-01-26 21:06:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][930/1251] eta 0:11:47 lr 0.000016 time 1.9117 (2.2031) loss 1.9976 (2.9865) grad_norm 3.7275 (3.2273) [2022-01-26 21:06:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][940/1251] eta 0:11:24 lr 0.000016 time 1.8923 (2.2025) loss 2.2445 (2.9877) grad_norm 3.4202 (3.2263) [2022-01-26 21:06:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][950/1251] eta 0:11:03 lr 0.000015 time 3.1953 (2.2044) loss 2.0613 (2.9857) grad_norm 2.9874 (3.2259) [2022-01-26 21:07:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][960/1251] eta 0:10:41 lr 0.000015 time 1.7320 (2.2059) loss 2.4177 (2.9845) grad_norm 3.4519 (3.2273) [2022-01-26 21:07:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][970/1251] eta 0:10:20 lr 0.000015 time 2.7799 (2.2087) loss 3.0215 (2.9852) grad_norm 3.1485 (3.2256) [2022-01-26 21:08:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][980/1251] eta 0:09:58 lr 0.000015 time 2.5305 (2.2099) loss 2.9271 (2.9861) grad_norm 3.0704 (3.2244) [2022-01-26 21:08:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][990/1251] eta 0:09:37 lr 0.000015 time 2.1221 (2.2111) loss 3.6492 (2.9898) grad_norm 4.0152 (3.2261) [2022-01-26 21:08:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1000/1251] eta 0:09:14 lr 0.000015 time 1.9974 (2.2107) loss 2.5575 (2.9893) grad_norm 3.3955 (3.2267) [2022-01-26 21:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1010/1251] eta 0:08:52 lr 0.000015 time 1.6671 (2.2078) loss 2.2579 (2.9878) grad_norm 3.1195 (3.2282) [2022-01-26 21:09:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1020/1251] eta 0:08:29 lr 0.000015 time 1.6278 (2.2048) loss 2.8828 (2.9877) grad_norm 2.8322 (3.2278) [2022-01-26 21:09:47 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [285/300][1030/1251] eta 0:08:07 lr 0.000015 time 2.2877 (2.2045) loss 3.6042 (2.9880) grad_norm 3.2190 (3.2256) [2022-01-26 21:10:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1040/1251] eta 0:07:44 lr 0.000015 time 1.5618 (2.2037) loss 3.0809 (2.9880) grad_norm 4.4176 (3.2253) [2022-01-26 21:10:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1050/1251] eta 0:07:23 lr 0.000015 time 2.4187 (2.2056) loss 2.7002 (2.9887) grad_norm 3.3629 (3.2234) [2022-01-26 21:10:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1060/1251] eta 0:07:01 lr 0.000015 time 2.0662 (2.2069) loss 3.6547 (2.9887) grad_norm 3.4369 (3.2236) [2022-01-26 21:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1070/1251] eta 0:06:39 lr 0.000015 time 2.0484 (2.2056) loss 2.9012 (2.9906) grad_norm 3.4863 (3.2232) [2022-01-26 21:11:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1080/1251] eta 0:06:17 lr 0.000015 time 2.8481 (2.2052) loss 2.0607 (2.9907) grad_norm 5.6787 (3.2257) [2022-01-26 21:11:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1090/1251] eta 0:05:54 lr 0.000015 time 1.9457 (2.2044) loss 3.1328 (2.9908) grad_norm 3.0540 (3.2244) [2022-01-26 21:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1100/1251] eta 0:05:32 lr 0.000015 time 2.0902 (2.2044) loss 2.1667 (2.9906) grad_norm 3.0857 (3.2230) [2022-01-26 21:12:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1110/1251] eta 0:05:11 lr 0.000015 time 2.0519 (2.2058) loss 3.0598 (2.9902) grad_norm 3.0144 (3.2221) [2022-01-26 21:13:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1120/1251] eta 0:04:49 lr 0.000015 time 2.8113 (2.2076) loss 2.4855 (2.9873) grad_norm 2.9141 (3.2203) [2022-01-26 21:13:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1130/1251] eta 0:04:27 lr 0.000015 time 1.8283 (2.2070) loss 3.3207 
(2.9866) grad_norm 3.5005 (3.2207) [2022-01-26 21:13:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1140/1251] eta 0:04:04 lr 0.000015 time 1.6771 (2.2062) loss 2.9315 (2.9858) grad_norm 3.2191 (3.2190) [2022-01-26 21:14:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1150/1251] eta 0:03:42 lr 0.000015 time 1.9476 (2.2048) loss 2.9516 (2.9844) grad_norm 3.1377 (3.2190) [2022-01-26 21:14:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1160/1251] eta 0:03:20 lr 0.000015 time 2.7444 (2.2042) loss 3.2611 (2.9856) grad_norm 4.0544 (3.2224) [2022-01-26 21:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1170/1251] eta 0:02:58 lr 0.000015 time 2.4560 (2.2048) loss 3.5232 (2.9868) grad_norm 3.0336 (3.2237) [2022-01-26 21:15:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1180/1251] eta 0:02:36 lr 0.000015 time 1.9246 (2.2052) loss 3.5511 (2.9883) grad_norm 3.3260 (3.2249) [2022-01-26 21:15:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1190/1251] eta 0:02:14 lr 0.000015 time 1.9288 (2.2039) loss 2.7530 (2.9881) grad_norm 3.8364 (3.2269) [2022-01-26 21:16:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1200/1251] eta 0:01:52 lr 0.000015 time 2.4596 (2.2033) loss 3.1589 (2.9864) grad_norm 2.9803 (3.2259) [2022-01-26 21:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1210/1251] eta 0:01:30 lr 0.000015 time 2.5681 (2.2026) loss 2.8909 (2.9895) grad_norm 3.7777 (3.2263) [2022-01-26 21:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1220/1251] eta 0:01:08 lr 0.000015 time 1.8402 (2.2017) loss 1.9670 (2.9887) grad_norm 2.9491 (3.2256) [2022-01-26 21:17:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1230/1251] eta 0:00:46 lr 0.000015 time 2.5287 (2.2012) loss 2.7284 (2.9877) grad_norm 2.9421 (3.2265) [2022-01-26 21:17:25 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [285/300][1240/1251] eta 0:00:24 lr 0.000015 time 1.2613 (2.2005) loss 2.3668 (2.9859) grad_norm 2.8227 (3.2253) [2022-01-26 21:17:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [285/300][1250/1251] eta 0:00:02 lr 0.000015 time 1.2141 (2.1948) loss 2.7006 (2.9877) grad_norm 2.9449 (3.2244) [2022-01-26 21:17:40 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 285 training takes 0:45:46 [2022-01-26 21:17:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.729 (18.729) Loss 0.8272 (0.8272) Acc@1 79.980 (79.980) Acc@5 95.508 (95.508) [2022-01-26 21:18:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.826 (3.371) Loss 0.8258 (0.8092) Acc@1 80.664 (81.072) Acc@5 96.680 (95.552) [2022-01-26 21:18:35 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.599 (2.622) Loss 0.6996 (0.8004) Acc@1 83.496 (81.097) Acc@5 96.191 (95.433) [2022-01-26 21:18:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.240 (2.394) Loss 0.8144 (0.8029) Acc@1 81.055 (81.165) Acc@5 95.801 (95.426) [2022-01-26 21:19:12 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.313 (2.234) Loss 0.8610 (0.8062) Acc@1 79.980 (81.083) Acc@5 95.605 (95.413) [2022-01-26 21:19:19 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.060 Acc@5 95.458 [2022-01-26 21:19:19 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 21:19:19 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.12% [2022-01-26 21:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][0/1251] eta 7:34:32 lr 0.000015 time 21.8004 (21.8004) loss 3.4645 (3.4645) grad_norm 3.4415 (3.4415) [2022-01-26 21:20:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][10/1251] eta 1:24:05 lr 0.000015 time 3.0235 (4.0660) loss 1.9425 (2.9017) grad_norm 3.0221 (3.0876) [2022-01-26 21:20:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][20/1251] eta 1:03:11 lr 0.000015 time 1.7819 (3.0803) loss 3.3008 (3.0435) grad_norm 3.1654 (3.1630) [2022-01-26 21:20:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][30/1251] eta 0:56:48 lr 0.000015 time 1.3951 (2.7916) loss 2.5429 (3.0222) grad_norm 3.5198 (3.1752) [2022-01-26 21:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][40/1251] eta 0:54:28 lr 0.000015 time 4.0905 (2.6993) loss 2.1162 (2.9781) grad_norm 2.7946 (3.1936) [2022-01-26 21:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][50/1251] eta 0:52:24 lr 0.000015 time 2.1075 (2.6178) loss 2.8661 (3.0022) grad_norm 2.8661 (3.1853) [2022-01-26 21:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][60/1251] eta 0:50:12 lr 0.000015 time 1.3847 (2.5293) loss 2.6120 (2.9947) grad_norm 3.3797 (3.1981) [2022-01-26 21:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][70/1251] eta 0:48:41 lr 0.000015 time 1.4032 (2.4737) loss 1.9841 (2.9956) grad_norm 2.9631 (3.2153) [2022-01-26 21:22:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][80/1251] eta 0:48:03 lr 0.000015 time 4.2113 (2.4627) loss 3.3175 (2.9931) grad_norm 3.0635 (3.2173) [2022-01-26 21:23:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][90/1251] eta 0:47:21 lr 0.000015 time 3.2798 (2.4477) loss 3.5284 (3.0255) grad_norm 3.0103 (3.2244) [2022-01-26 21:23:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][100/1251] eta 0:45:58 lr 0.000015 time 1.8835 (2.3965) loss 2.1374 (3.0136) grad_norm 2.6410 (3.1972) [2022-01-26 21:23:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][110/1251] eta 0:45:08 lr 0.000015 time 1.7436 (2.3737) loss 3.1020 (2.9947) grad_norm 2.6314 (3.1972) [2022-01-26 21:24:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][120/1251] eta 0:44:53 lr 0.000015 time 
3.9544 (2.3816) loss 1.9569 (2.9936) grad_norm 6.0314 (3.2183) [2022-01-26 21:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][130/1251] eta 0:44:16 lr 0.000015 time 1.4916 (2.3697) loss 3.6165 (3.0038) grad_norm 3.1899 (3.2091) [2022-01-26 21:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][140/1251] eta 0:43:28 lr 0.000015 time 1.9396 (2.3477) loss 3.0567 (3.0111) grad_norm 3.2154 (3.2110) [2022-01-26 21:25:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][150/1251] eta 0:42:42 lr 0.000015 time 2.0797 (2.3276) loss 3.1823 (3.0128) grad_norm 2.9401 (3.2092) [2022-01-26 21:25:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][160/1251] eta 0:42:08 lr 0.000015 time 3.1998 (2.3174) loss 2.6087 (2.9987) grad_norm 2.9712 (3.2050) [2022-01-26 21:25:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][170/1251] eta 0:41:29 lr 0.000015 time 1.5174 (2.3029) loss 3.4021 (2.9972) grad_norm 3.0554 (3.1996) [2022-01-26 21:26:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][180/1251] eta 0:41:06 lr 0.000015 time 1.7579 (2.3034) loss 2.2358 (2.9962) grad_norm 2.8854 (3.2067) [2022-01-26 21:26:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][190/1251] eta 0:40:39 lr 0.000015 time 2.2264 (2.2993) loss 3.3732 (3.0084) grad_norm 3.6114 (3.2073) [2022-01-26 21:27:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][200/1251] eta 0:40:16 lr 0.000015 time 2.5568 (2.2994) loss 3.6031 (3.0110) grad_norm 3.5270 (3.2072) [2022-01-26 21:27:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][210/1251] eta 0:39:47 lr 0.000015 time 1.8853 (2.2939) loss 3.2710 (3.0064) grad_norm 2.5727 (3.2025) [2022-01-26 21:27:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][220/1251] eta 0:39:25 lr 0.000015 time 1.5805 (2.2940) loss 3.2782 (3.0114) grad_norm 3.4922 (3.2082) [2022-01-26 21:28:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][230/1251] eta 0:38:57 lr 0.000015 time 2.6145 (2.2894) loss 3.1702 (3.0127) grad_norm 3.6137 (3.2128) [2022-01-26 21:28:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][240/1251] eta 0:38:27 lr 0.000015 time 1.8559 (2.2826) loss 2.9951 (3.0122) grad_norm 3.0060 (3.2081) [2022-01-26 21:28:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][250/1251] eta 0:37:59 lr 0.000015 time 1.9555 (2.2775) loss 3.4961 (3.0144) grad_norm 2.9342 (3.2107) [2022-01-26 21:29:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][260/1251] eta 0:37:35 lr 0.000015 time 1.8715 (2.2759) loss 2.4933 (3.0128) grad_norm 3.0102 (3.2105) [2022-01-26 21:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][270/1251] eta 0:37:08 lr 0.000015 time 2.1308 (2.2713) loss 2.8990 (3.0115) grad_norm 2.6160 (3.2096) [2022-01-26 21:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][280/1251] eta 0:36:42 lr 0.000015 time 1.8663 (2.2680) loss 2.6359 (3.0086) grad_norm 3.2134 (3.2019) [2022-01-26 21:30:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][290/1251] eta 0:36:18 lr 0.000015 time 2.1409 (2.2667) loss 2.3906 (3.0064) grad_norm 2.6728 (3.1982) [2022-01-26 21:30:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][300/1251] eta 0:35:47 lr 0.000015 time 1.6680 (2.2583) loss 3.4816 (3.0113) grad_norm 3.1137 (3.1966) [2022-01-26 21:30:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][310/1251] eta 0:35:20 lr 0.000015 time 1.8289 (2.2531) loss 3.1044 (3.0118) grad_norm 2.9771 (3.1932) [2022-01-26 21:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][320/1251] eta 0:35:01 lr 0.000015 time 2.6748 (2.2572) loss 2.0596 (3.0067) grad_norm 2.7164 (3.1869) [2022-01-26 21:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][330/1251] eta 0:34:42 lr 
0.000015 time 3.3682 (2.2617) loss 2.9873 (3.0076) grad_norm 2.8800 (3.1859) [2022-01-26 21:32:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][340/1251] eta 0:34:19 lr 0.000015 time 1.9616 (2.2608) loss 2.3211 (3.0024) grad_norm 2.7417 (3.1912) [2022-01-26 21:32:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][350/1251] eta 0:33:55 lr 0.000015 time 1.8740 (2.2587) loss 2.3567 (2.9969) grad_norm 2.9134 (3.1925) [2022-01-26 21:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][360/1251] eta 0:33:30 lr 0.000015 time 2.8039 (2.2570) loss 3.1123 (2.9968) grad_norm 2.9846 (3.1928) [2022-01-26 21:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][370/1251] eta 0:33:04 lr 0.000015 time 3.1624 (2.2526) loss 2.1999 (2.9976) grad_norm 3.0428 (3.1944) [2022-01-26 21:33:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][380/1251] eta 0:32:37 lr 0.000015 time 1.7394 (2.2470) loss 2.3908 (2.9940) grad_norm 2.7832 (3.1946) [2022-01-26 21:33:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][390/1251] eta 0:32:10 lr 0.000015 time 2.1826 (2.2426) loss 3.5221 (2.9972) grad_norm 3.4894 (3.1974) [2022-01-26 21:34:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][400/1251] eta 0:31:45 lr 0.000015 time 1.8050 (2.2396) loss 3.2180 (2.9908) grad_norm 3.1905 (3.1979) [2022-01-26 21:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][410/1251] eta 0:31:24 lr 0.000015 time 3.1294 (2.2413) loss 2.4597 (2.9959) grad_norm 2.9561 (3.1951) [2022-01-26 21:35:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][420/1251] eta 0:31:05 lr 0.000015 time 2.5192 (2.2454) loss 3.2120 (2.9995) grad_norm 3.5158 (3.1967) [2022-01-26 21:35:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][430/1251] eta 0:30:45 lr 0.000015 time 2.1140 (2.2480) loss 2.9382 (2.9945) grad_norm 3.1055 (3.1957) [2022-01-26 21:35:49 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][440/1251] eta 0:30:21 lr 0.000015 time 1.9685 (2.2456) loss 3.2641 (2.9991) grad_norm 3.3553 (3.1982) [2022-01-26 21:36:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][450/1251] eta 0:29:54 lr 0.000015 time 2.2199 (2.2408) loss 3.3108 (2.9922) grad_norm 3.2928 (3.2004) [2022-01-26 21:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][460/1251] eta 0:29:27 lr 0.000015 time 1.8628 (2.2342) loss 3.7892 (2.9959) grad_norm 4.0813 (3.2016) [2022-01-26 21:36:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][470/1251] eta 0:29:00 lr 0.000015 time 2.0193 (2.2287) loss 3.1324 (2.9959) grad_norm 3.0412 (3.2020) [2022-01-26 21:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][480/1251] eta 0:28:35 lr 0.000015 time 1.9349 (2.2249) loss 2.7115 (2.9958) grad_norm 3.4321 (3.1985) [2022-01-26 21:37:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][490/1251] eta 0:28:14 lr 0.000015 time 3.0985 (2.2261) loss 3.2574 (2.9947) grad_norm 3.0899 (3.1974) [2022-01-26 21:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][500/1251] eta 0:27:52 lr 0.000015 time 2.2210 (2.2266) loss 3.3441 (2.9965) grad_norm 2.8471 (3.2089) [2022-01-26 21:38:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][510/1251] eta 0:27:30 lr 0.000015 time 1.8751 (2.2270) loss 3.3495 (2.9982) grad_norm 3.7820 (3.2077) [2022-01-26 21:38:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][520/1251] eta 0:27:08 lr 0.000015 time 2.0014 (2.2276) loss 3.0087 (2.9994) grad_norm 3.4867 (3.2172) [2022-01-26 21:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][530/1251] eta 0:26:46 lr 0.000015 time 2.2283 (2.2280) loss 3.4509 (3.0019) grad_norm 3.6256 (3.2256) [2022-01-26 21:39:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][540/1251] eta 0:26:24 lr 
0.000015 time 2.4758 (2.2284) loss 2.8453 (2.9981) grad_norm 2.7307 (3.2275) [2022-01-26 21:39:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][550/1251] eta 0:26:01 lr 0.000015 time 1.9293 (2.2281) loss 3.3994 (2.9956) grad_norm 2.9071 (3.2307) [2022-01-26 21:40:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][560/1251] eta 0:25:39 lr 0.000015 time 2.0595 (2.2285) loss 3.0369 (2.9987) grad_norm 3.3583 (3.2313) [2022-01-26 21:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][570/1251] eta 0:25:15 lr 0.000015 time 2.2043 (2.2256) loss 2.0981 (2.9987) grad_norm 5.3522 (3.2384) [2022-01-26 21:40:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][580/1251] eta 0:24:51 lr 0.000015 time 1.9645 (2.2224) loss 3.4761 (2.9978) grad_norm 3.3546 (3.2391) [2022-01-26 21:41:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][590/1251] eta 0:24:30 lr 0.000015 time 1.8972 (2.2249) loss 2.8895 (2.9936) grad_norm 3.6051 (3.2432) [2022-01-26 21:41:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][600/1251] eta 0:24:09 lr 0.000015 time 2.7683 (2.2260) loss 3.1607 (2.9927) grad_norm 3.5202 (3.2444) [2022-01-26 21:41:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][610/1251] eta 0:23:47 lr 0.000015 time 2.0590 (2.2267) loss 3.3346 (2.9923) grad_norm 5.2536 (3.2481) [2022-01-26 21:42:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][620/1251] eta 0:23:23 lr 0.000015 time 1.8071 (2.2248) loss 2.8935 (2.9945) grad_norm 3.0396 (3.2459) [2022-01-26 21:42:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][630/1251] eta 0:22:59 lr 0.000015 time 1.9734 (2.2208) loss 2.9723 (2.9958) grad_norm 3.1547 (3.2452) [2022-01-26 21:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][640/1251] eta 0:22:35 lr 0.000015 time 2.5140 (2.2182) loss 2.8425 (2.9942) grad_norm 2.9247 (3.2415) [2022-01-26 21:43:22 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][650/1251] eta 0:22:12 lr 0.000015 time 1.9651 (2.2167) loss 3.3832 (2.9946) grad_norm 3.3771 (3.2416) [2022-01-26 21:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][660/1251] eta 0:21:49 lr 0.000015 time 2.2784 (2.2156) loss 2.1521 (2.9960) grad_norm 2.6915 (3.2418) [2022-01-26 21:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][670/1251] eta 0:21:28 lr 0.000015 time 3.1236 (2.2177) loss 2.8111 (2.9970) grad_norm 2.7295 (3.2420) [2022-01-26 21:44:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][680/1251] eta 0:21:06 lr 0.000015 time 2.8648 (2.2187) loss 3.0661 (2.9949) grad_norm 4.5214 (3.2460) [2022-01-26 21:44:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][690/1251] eta 0:20:44 lr 0.000015 time 1.7192 (2.2179) loss 1.9855 (2.9944) grad_norm 2.9260 (3.2461) [2022-01-26 21:45:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][700/1251] eta 0:20:21 lr 0.000015 time 2.3153 (2.2170) loss 3.2458 (2.9905) grad_norm 3.2011 (3.2452) [2022-01-26 21:45:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][710/1251] eta 0:19:59 lr 0.000015 time 2.1573 (2.2166) loss 2.9217 (2.9926) grad_norm 3.0933 (3.2469) [2022-01-26 21:45:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][720/1251] eta 0:19:37 lr 0.000015 time 3.0715 (2.2174) loss 3.0328 (2.9920) grad_norm 3.1457 (3.2467) [2022-01-26 21:46:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][730/1251] eta 0:19:14 lr 0.000015 time 1.5712 (2.2157) loss 2.9034 (2.9928) grad_norm 3.2107 (3.2454) [2022-01-26 21:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][740/1251] eta 0:18:52 lr 0.000015 time 2.4786 (2.2156) loss 3.0858 (2.9949) grad_norm 3.2248 (3.2454) [2022-01-26 21:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][750/1251] eta 0:18:29 lr 
0.000015 time 1.7968 (2.2146) loss 3.0712 (2.9944) grad_norm 3.1959 (3.2463) [2022-01-26 21:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][760/1251] eta 0:18:06 lr 0.000015 time 1.6898 (2.2129) loss 3.5231 (2.9938) grad_norm 3.0867 (3.2482) [2022-01-26 21:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][770/1251] eta 0:17:44 lr 0.000015 time 2.1600 (2.2137) loss 3.3913 (2.9955) grad_norm 2.9072 (3.2458) [2022-01-26 21:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][780/1251] eta 0:17:23 lr 0.000015 time 3.5136 (2.2159) loss 2.9408 (2.9962) grad_norm 3.1436 (3.2453) [2022-01-26 21:48:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][790/1251] eta 0:17:02 lr 0.000015 time 1.6956 (2.2186) loss 3.3227 (2.9966) grad_norm 3.1233 (3.2484) [2022-01-26 21:48:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][800/1251] eta 0:16:40 lr 0.000015 time 1.6720 (2.2175) loss 2.4251 (2.9950) grad_norm 3.1066 (3.2469) [2022-01-26 21:49:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][810/1251] eta 0:16:17 lr 0.000015 time 1.8544 (2.2166) loss 3.0695 (2.9957) grad_norm 3.0225 (3.2499) [2022-01-26 21:49:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][820/1251] eta 0:15:54 lr 0.000015 time 1.9525 (2.2151) loss 3.3028 (2.9973) grad_norm 3.5936 (3.2521) [2022-01-26 21:49:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][830/1251] eta 0:15:30 lr 0.000015 time 1.9044 (2.2114) loss 2.5851 (2.9976) grad_norm 3.2169 (3.2554) [2022-01-26 21:50:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][840/1251] eta 0:15:08 lr 0.000015 time 1.8601 (2.2093) loss 2.7083 (2.9963) grad_norm 3.2606 (3.2551) [2022-01-26 21:50:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][850/1251] eta 0:14:45 lr 0.000015 time 2.5533 (2.2094) loss 2.8898 (2.9970) grad_norm 3.0062 (3.2549) [2022-01-26 21:51:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][860/1251] eta 0:14:23 lr 0.000015 time 2.0025 (2.2087) loss 3.5547 (2.9981) grad_norm 3.3434 (3.2562) [2022-01-26 21:51:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][870/1251] eta 0:14:01 lr 0.000015 time 2.1715 (2.2087) loss 3.3638 (2.9995) grad_norm 2.9097 (3.2549) [2022-01-26 21:51:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][880/1251] eta 0:13:39 lr 0.000015 time 2.1882 (2.2090) loss 2.4662 (2.9997) grad_norm 3.4790 (3.2563) [2022-01-26 21:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][890/1251] eta 0:13:18 lr 0.000015 time 2.8682 (2.2112) loss 2.7837 (3.0008) grad_norm 3.2232 (3.2561) [2022-01-26 21:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][900/1251] eta 0:12:55 lr 0.000015 time 2.1924 (2.2105) loss 3.1258 (3.0035) grad_norm 3.1100 (3.2547) [2022-01-26 21:52:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][910/1251] eta 0:12:33 lr 0.000015 time 2.4865 (2.2111) loss 1.8851 (3.0026) grad_norm 2.8646 (3.2534) [2022-01-26 21:53:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][920/1251] eta 0:12:11 lr 0.000015 time 1.8363 (2.2111) loss 3.4864 (3.0044) grad_norm 2.9423 (3.2518) [2022-01-26 21:53:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][930/1251] eta 0:11:49 lr 0.000015 time 2.8574 (2.2107) loss 3.0380 (3.0043) grad_norm 3.1458 (3.2526) [2022-01-26 21:53:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][940/1251] eta 0:11:26 lr 0.000015 time 1.9400 (2.2089) loss 3.2546 (3.0041) grad_norm 3.1184 (3.2514) [2022-01-26 21:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][950/1251] eta 0:11:04 lr 0.000015 time 2.2194 (2.2078) loss 3.5670 (3.0035) grad_norm 3.2067 (3.2512) [2022-01-26 21:54:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][960/1251] eta 0:10:42 lr 
0.000015 time 1.8849 (2.2072) loss 3.1315 (3.0039) grad_norm 2.8109 (3.2505) [2022-01-26 21:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][970/1251] eta 0:10:19 lr 0.000015 time 1.6917 (2.2056) loss 2.9996 (3.0037) grad_norm 2.6637 (3.2499) [2022-01-26 21:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][980/1251] eta 0:09:57 lr 0.000015 time 2.0383 (2.2053) loss 2.6813 (3.0040) grad_norm 3.0952 (3.2501) [2022-01-26 21:55:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][990/1251] eta 0:09:36 lr 0.000015 time 3.4122 (2.2079) loss 2.7519 (3.0043) grad_norm 3.3512 (3.2499) [2022-01-26 21:56:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1000/1251] eta 0:09:14 lr 0.000015 time 2.0275 (2.2084) loss 2.4872 (3.0024) grad_norm 3.2879 (3.2497) [2022-01-26 21:56:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1010/1251] eta 0:08:52 lr 0.000015 time 2.2270 (2.2079) loss 3.1630 (3.0019) grad_norm 2.9798 (3.2494) [2022-01-26 21:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1020/1251] eta 0:08:30 lr 0.000015 time 2.0840 (2.2078) loss 2.6141 (3.0014) grad_norm 3.1886 (3.2489) [2022-01-26 21:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1030/1251] eta 0:08:07 lr 0.000015 time 1.9073 (2.2075) loss 3.3407 (3.0019) grad_norm 3.3440 (3.2498) [2022-01-26 21:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1040/1251] eta 0:07:45 lr 0.000015 time 1.9186 (2.2066) loss 3.3344 (3.0022) grad_norm 3.3620 (3.2480) [2022-01-26 21:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1050/1251] eta 0:07:23 lr 0.000015 time 1.7758 (2.2077) loss 2.1857 (3.0031) grad_norm 3.4137 (3.2459) [2022-01-26 21:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1060/1251] eta 0:07:01 lr 0.000015 time 1.8118 (2.2079) loss 3.3179 (3.0012) grad_norm 3.0243 (3.2467) [2022-01-26 
21:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1070/1251] eta 0:06:39 lr 0.000015 time 1.7971 (2.2079) loss 2.2380 (3.0004) grad_norm 2.8541 (3.2469) [2022-01-26 21:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1080/1251] eta 0:06:17 lr 0.000015 time 2.0392 (2.2067) loss 1.9574 (2.9999) grad_norm 3.4672 (3.2468) [2022-01-26 21:59:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1090/1251] eta 0:05:55 lr 0.000015 time 1.6583 (2.2054) loss 3.0219 (3.0010) grad_norm 3.0848 (3.2457) [2022-01-26 21:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1100/1251] eta 0:05:32 lr 0.000015 time 2.1010 (2.2048) loss 2.2468 (3.0008) grad_norm 3.7157 (3.2476) [2022-01-26 22:00:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1110/1251] eta 0:05:10 lr 0.000015 time 2.2301 (2.2042) loss 3.4716 (3.0018) grad_norm 3.0454 (3.2480) [2022-01-26 22:00:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1120/1251] eta 0:04:48 lr 0.000015 time 1.8819 (2.2029) loss 2.7849 (2.9997) grad_norm 3.2030 (3.2463) [2022-01-26 22:00:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1130/1251] eta 0:04:26 lr 0.000015 time 1.8254 (2.2046) loss 3.0605 (2.9979) grad_norm 2.9221 (3.2455) [2022-01-26 22:01:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1140/1251] eta 0:04:04 lr 0.000015 time 2.1438 (2.2067) loss 3.5868 (3.0000) grad_norm 3.2697 (3.2445) [2022-01-26 22:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1150/1251] eta 0:03:42 lr 0.000015 time 1.8177 (2.2067) loss 2.4907 (2.9990) grad_norm 3.1141 (3.2422) [2022-01-26 22:01:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1160/1251] eta 0:03:20 lr 0.000015 time 1.5808 (2.2052) loss 2.1150 (2.9963) grad_norm 3.9956 (3.2433) [2022-01-26 22:02:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1170/1251] 
eta 0:02:58 lr 0.000015 time 1.8733 (2.2048) loss 2.7424 (2.9975) grad_norm 2.6195 (3.2429) [2022-01-26 22:02:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1180/1251] eta 0:02:36 lr 0.000015 time 3.1238 (2.2051) loss 3.1263 (2.9976) grad_norm 2.7128 (3.2422) [2022-01-26 22:03:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1190/1251] eta 0:02:14 lr 0.000015 time 2.3462 (2.2048) loss 1.8917 (2.9980) grad_norm 3.2465 (3.2410) [2022-01-26 22:03:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1200/1251] eta 0:01:52 lr 0.000015 time 1.8427 (2.2035) loss 3.1287 (2.9971) grad_norm 2.8488 (3.2418) [2022-01-26 22:03:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1210/1251] eta 0:01:30 lr 0.000015 time 1.8547 (2.2031) loss 3.1987 (2.9979) grad_norm 3.0334 (3.2412) [2022-01-26 22:04:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1220/1251] eta 0:01:08 lr 0.000015 time 2.3964 (2.2025) loss 2.2582 (2.9965) grad_norm 3.3777 (3.2402) [2022-01-26 22:04:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1230/1251] eta 0:00:46 lr 0.000015 time 3.1078 (2.2021) loss 3.4174 (2.9979) grad_norm 2.9896 (3.2396) [2022-01-26 22:04:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1240/1251] eta 0:00:24 lr 0.000015 time 1.5225 (2.2010) loss 2.2833 (2.9962) grad_norm 3.4350 (3.2448) [2022-01-26 22:05:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [286/300][1250/1251] eta 0:00:02 lr 0.000015 time 1.1815 (2.1951) loss 3.1702 (2.9979) grad_norm 3.3473 (3.2442) [2022-01-26 22:05:05 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 286 training takes 0:45:46 [2022-01-26 22:05:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.658 (18.658) Loss 0.8301 (0.8301) Acc@1 81.055 (81.055) Acc@5 95.410 (95.410) [2022-01-26 22:05:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.681 (3.564) 
Loss 0.7716 (0.8185) Acc@1 81.543 (80.788) Acc@5 95.996 (95.188) [2022-01-26 22:06:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.274 (2.595) Loss 0.8488 (0.8086) Acc@1 80.371 (81.101) Acc@5 95.117 (95.382) [2022-01-26 22:06:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.631 (2.316) Loss 0.7627 (0.8071) Acc@1 83.594 (81.178) Acc@5 96.094 (95.410) [2022-01-26 22:06:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.282 (2.213) Loss 0.8230 (0.8072) Acc@1 80.762 (81.055) Acc@5 94.434 (95.386) [2022-01-26 22:06:43 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.092 Acc@5 95.464 [2022-01-26 22:06:43 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 22:06:43 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.12% [2022-01-26 22:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][0/1251] eta 7:30:23 lr 0.000015 time 21.6019 (21.6019) loss 2.5287 (2.5287) grad_norm 3.8747 (3.8747) [2022-01-26 22:07:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][10/1251] eta 1:25:33 lr 0.000015 time 2.0018 (4.1370) loss 3.2843 (2.9951) grad_norm 3.2551 (3.3061) [2022-01-26 22:07:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][20/1251] eta 1:06:10 lr 0.000015 time 2.1532 (3.2255) loss 3.3137 (3.0501) grad_norm 3.1576 (3.2526) [2022-01-26 22:08:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][30/1251] eta 0:58:46 lr 0.000015 time 1.4767 (2.8880) loss 2.9728 (3.0594) grad_norm 2.9237 (3.2415) [2022-01-26 22:08:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][40/1251] eta 0:58:41 lr 0.000015 time 8.5103 (2.9077) loss 2.9129 (3.1057) grad_norm 3.1955 (3.2574) [2022-01-26 22:09:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][50/1251] eta 0:56:07 lr 0.000015 time 1.7088 (2.8039) loss 3.4239 (3.0929) 
grad_norm 3.1342 (3.2463) [2022-01-26 22:09:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][60/1251] eta 0:53:41 lr 0.000015 time 2.0883 (2.7046) loss 3.2529 (3.0316) grad_norm 3.0364 (3.2274) [2022-01-26 22:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][70/1251] eta 0:51:21 lr 0.000015 time 1.8947 (2.6089) loss 3.1625 (3.0461) grad_norm 3.1161 (3.2742) [2022-01-26 22:10:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][80/1251] eta 0:49:52 lr 0.000015 time 3.3975 (2.5554) loss 2.9445 (3.0061) grad_norm 2.8452 (3.2670) [2022-01-26 22:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][90/1251] eta 0:48:27 lr 0.000015 time 1.8368 (2.5045) loss 3.5230 (2.9837) grad_norm 3.5017 (3.2691) [2022-01-26 22:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][100/1251] eta 0:47:13 lr 0.000015 time 1.5509 (2.4616) loss 3.2263 (2.9948) grad_norm 3.0323 (3.2592) [2022-01-26 22:11:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][110/1251] eta 0:46:05 lr 0.000015 time 2.2864 (2.4238) loss 3.3447 (2.9800) grad_norm 3.4412 (3.2528) [2022-01-26 22:11:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][120/1251] eta 0:45:18 lr 0.000015 time 1.9408 (2.4034) loss 2.7798 (2.9952) grad_norm 3.2151 (3.3123) [2022-01-26 22:11:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][130/1251] eta 0:44:51 lr 0.000015 time 2.7017 (2.4005) loss 2.6641 (2.9748) grad_norm 3.5296 (3.3088) [2022-01-26 22:12:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][140/1251] eta 0:44:15 lr 0.000015 time 2.2509 (2.3906) loss 3.3607 (2.9870) grad_norm 2.9889 (3.2830) [2022-01-26 22:12:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][150/1251] eta 0:43:28 lr 0.000014 time 1.9348 (2.3696) loss 2.8533 (2.9958) grad_norm 2.7440 (3.2808) [2022-01-26 22:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[287/300][160/1251] eta 0:42:50 lr 0.000014 time 1.9548 (2.3559) loss 3.2242 (2.9974) grad_norm 2.8171 (3.2680) [2022-01-26 22:13:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][170/1251] eta 0:42:07 lr 0.000014 time 1.6240 (2.3377) loss 2.7400 (2.9987) grad_norm 3.1476 (3.2473) [2022-01-26 22:13:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][180/1251] eta 0:41:44 lr 0.000014 time 2.8590 (2.3383) loss 2.7094 (3.0053) grad_norm 3.1179 (3.2406) [2022-01-26 22:14:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][190/1251] eta 0:41:22 lr 0.000014 time 2.1549 (2.3397) loss 2.2946 (2.9992) grad_norm 3.2421 (3.2344) [2022-01-26 22:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][200/1251] eta 0:41:05 lr 0.000014 time 3.2861 (2.3459) loss 3.4929 (3.0035) grad_norm 3.3614 (3.2353) [2022-01-26 22:14:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][210/1251] eta 0:40:46 lr 0.000014 time 1.9224 (2.3498) loss 2.6295 (2.9909) grad_norm 2.8760 (3.2230) [2022-01-26 22:15:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][220/1251] eta 0:40:15 lr 0.000014 time 1.9114 (2.3432) loss 2.5233 (2.9923) grad_norm 3.5048 (3.2262) [2022-01-26 22:15:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][230/1251] eta 0:39:40 lr 0.000014 time 1.9389 (2.3313) loss 3.3496 (2.9836) grad_norm 3.3256 (3.2384) [2022-01-26 22:16:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][240/1251] eta 0:39:02 lr 0.000014 time 2.3855 (2.3169) loss 3.0324 (2.9956) grad_norm 3.2593 (3.2472) [2022-01-26 22:16:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][250/1251] eta 0:38:22 lr 0.000014 time 1.7853 (2.3004) loss 3.3762 (3.0033) grad_norm 3.1450 (3.2474) [2022-01-26 22:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][260/1251] eta 0:37:53 lr 0.000014 time 2.2219 (2.2938) loss 3.2984 (2.9999) grad_norm 
3.5293 (3.2564) [2022-01-26 22:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][270/1251] eta 0:37:24 lr 0.000014 time 2.0850 (2.2876) loss 3.2540 (3.0033) grad_norm 2.7029 (3.2532) [2022-01-26 22:17:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][280/1251] eta 0:37:00 lr 0.000014 time 1.7386 (2.2870) loss 2.1463 (3.0063) grad_norm 2.9771 (3.2528) [2022-01-26 22:17:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][290/1251] eta 0:36:34 lr 0.000014 time 1.8676 (2.2834) loss 3.2861 (3.0017) grad_norm 3.3819 (3.2565) [2022-01-26 22:18:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][300/1251] eta 0:36:16 lr 0.000014 time 3.1840 (2.2887) loss 3.1176 (2.9980) grad_norm 2.4785 (3.2529) [2022-01-26 22:18:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][310/1251] eta 0:35:53 lr 0.000014 time 2.5164 (2.2888) loss 3.0825 (2.9925) grad_norm 2.9367 (3.2463) [2022-01-26 22:18:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][320/1251] eta 0:35:27 lr 0.000014 time 1.7818 (2.2857) loss 3.8338 (2.9934) grad_norm 3.7318 (3.2523) [2022-01-26 22:19:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][330/1251] eta 0:35:03 lr 0.000014 time 1.8870 (2.2843) loss 3.1026 (2.9974) grad_norm 2.8998 (3.2498) [2022-01-26 22:19:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][340/1251] eta 0:34:36 lr 0.000014 time 2.1761 (2.2792) loss 2.6464 (2.9989) grad_norm 2.7250 (3.2416) [2022-01-26 22:20:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][350/1251] eta 0:34:11 lr 0.000014 time 1.8672 (2.2771) loss 3.4126 (2.9994) grad_norm 3.2280 (3.2379) [2022-01-26 22:20:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][360/1251] eta 0:33:48 lr 0.000014 time 1.7890 (2.2761) loss 3.3201 (2.9973) grad_norm 3.1857 (3.2384) [2022-01-26 22:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[287/300][370/1251] eta 0:33:26 lr 0.000014 time 1.9176 (2.2772) loss 2.5022 (3.0005) grad_norm 3.8772 (3.2369) [2022-01-26 22:21:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][380/1251] eta 0:33:00 lr 0.000014 time 2.4660 (2.2741) loss 3.2301 (3.0067) grad_norm 2.6508 (3.2383) [2022-01-26 22:21:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][390/1251] eta 0:32:34 lr 0.000014 time 1.9018 (2.2702) loss 2.5522 (3.0100) grad_norm 2.8743 (3.2375) [2022-01-26 22:21:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][400/1251] eta 0:32:08 lr 0.000014 time 2.0285 (2.2665) loss 3.6582 (3.0085) grad_norm 3.2168 (3.2336) [2022-01-26 22:22:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][410/1251] eta 0:31:42 lr 0.000014 time 1.5335 (2.2622) loss 3.1905 (3.0115) grad_norm 3.4317 (3.2333) [2022-01-26 22:22:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][420/1251] eta 0:31:18 lr 0.000014 time 2.4147 (2.2606) loss 3.7917 (3.0135) grad_norm 2.7519 (3.2315) [2022-01-26 22:22:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][430/1251] eta 0:30:54 lr 0.000014 time 2.2311 (2.2584) loss 3.3266 (3.0105) grad_norm 3.0214 (3.2279) [2022-01-26 22:23:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][440/1251] eta 0:30:30 lr 0.000014 time 2.2158 (2.2571) loss 3.3504 (3.0084) grad_norm 4.6994 (3.2372) [2022-01-26 22:23:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][450/1251] eta 0:30:04 lr 0.000014 time 2.3109 (2.2532) loss 3.1468 (3.0118) grad_norm 2.9955 (3.2369) [2022-01-26 22:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][460/1251] eta 0:29:40 lr 0.000014 time 1.9746 (2.2512) loss 3.1882 (3.0139) grad_norm 3.0932 (3.2425) [2022-01-26 22:24:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][470/1251] eta 0:29:16 lr 0.000014 time 2.0512 (2.2487) loss 3.0943 (3.0148) grad_norm 
2.7480 (3.2402) [2022-01-26 22:24:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][480/1251] eta 0:28:51 lr 0.000014 time 2.2662 (2.2451) loss 2.6521 (3.0154) grad_norm 3.2180 (3.2418) [2022-01-26 22:25:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][490/1251] eta 0:28:30 lr 0.000014 time 2.4794 (2.2472) loss 2.7587 (3.0147) grad_norm 3.0091 (3.2434) [2022-01-26 22:25:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][500/1251] eta 0:28:06 lr 0.000014 time 1.9570 (2.2451) loss 3.3977 (3.0161) grad_norm 2.8827 (3.2511) [2022-01-26 22:25:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][510/1251] eta 0:27:43 lr 0.000014 time 2.2765 (2.2444) loss 2.6556 (3.0141) grad_norm 3.7458 (3.2559) [2022-01-26 22:26:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][520/1251] eta 0:27:19 lr 0.000014 time 2.5790 (2.2429) loss 2.6488 (3.0079) grad_norm 3.2293 (3.2578) [2022-01-26 22:26:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][530/1251] eta 0:26:54 lr 0.000014 time 2.3236 (2.2391) loss 3.4791 (3.0090) grad_norm 2.7770 (3.2573) [2022-01-26 22:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][540/1251] eta 0:26:28 lr 0.000014 time 2.3275 (2.2348) loss 3.2708 (3.0131) grad_norm 3.1570 (3.2633) [2022-01-26 22:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][550/1251] eta 0:26:05 lr 0.000014 time 2.1829 (2.2327) loss 3.0416 (3.0143) grad_norm 3.4462 (3.2620) [2022-01-26 22:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][560/1251] eta 0:25:44 lr 0.000014 time 3.2841 (2.2345) loss 3.2202 (3.0147) grad_norm 3.3266 (3.2623) [2022-01-26 22:28:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][570/1251] eta 0:25:22 lr 0.000014 time 2.3617 (2.2353) loss 3.0567 (3.0130) grad_norm 3.2723 (3.2616) [2022-01-26 22:28:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[287/300][580/1251] eta 0:24:59 lr 0.000014 time 2.5307 (2.2352) loss 2.3133 (3.0117) grad_norm 3.1615 (3.2611) [2022-01-26 22:28:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][590/1251] eta 0:24:36 lr 0.000014 time 1.9286 (2.2338) loss 3.2877 (3.0122) grad_norm 3.6843 (3.2638) [2022-01-26 22:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][600/1251] eta 0:24:14 lr 0.000014 time 2.2733 (2.2343) loss 3.2856 (3.0125) grad_norm 3.1807 (3.2637) [2022-01-26 22:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][610/1251] eta 0:23:51 lr 0.000014 time 2.4143 (2.2329) loss 2.2861 (3.0129) grad_norm 3.2820 (3.2611) [2022-01-26 22:29:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][620/1251] eta 0:23:27 lr 0.000014 time 2.3058 (2.2301) loss 3.4413 (3.0103) grad_norm 4.6274 (3.2596) [2022-01-26 22:30:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][630/1251] eta 0:23:04 lr 0.000014 time 3.3902 (2.2303) loss 3.0862 (3.0109) grad_norm 3.8346 (3.2592) [2022-01-26 22:30:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][640/1251] eta 0:22:42 lr 0.000014 time 2.1593 (2.2300) loss 2.4593 (3.0101) grad_norm 2.7238 (3.2593) [2022-01-26 22:30:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][650/1251] eta 0:22:20 lr 0.000014 time 2.9766 (2.2308) loss 2.9732 (3.0096) grad_norm 3.6016 (3.2571) [2022-01-26 22:31:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][660/1251] eta 0:21:57 lr 0.000014 time 1.9719 (2.2286) loss 2.8607 (3.0121) grad_norm 3.1939 (3.2537) [2022-01-26 22:31:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][670/1251] eta 0:21:35 lr 0.000014 time 3.0094 (2.2291) loss 3.3397 (3.0132) grad_norm 3.0211 (3.2524) [2022-01-26 22:32:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][680/1251] eta 0:21:13 lr 0.000014 time 1.8986 (2.2309) loss 2.5677 (3.0119) grad_norm 
4.0037 (3.2522) [2022-01-26 22:32:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][690/1251] eta 0:20:52 lr 0.000014 time 2.2345 (2.2327) loss 2.6633 (3.0119) grad_norm 2.7610 (3.2499) [2022-01-26 22:32:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][700/1251] eta 0:20:29 lr 0.000014 time 2.2690 (2.2321) loss 3.2253 (3.0156) grad_norm 3.4264 (3.2517) [2022-01-26 22:33:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][710/1251] eta 0:20:06 lr 0.000014 time 2.1657 (2.2301) loss 3.1117 (3.0141) grad_norm 2.8163 (3.2504) [2022-01-26 22:33:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][720/1251] eta 0:19:42 lr 0.000014 time 1.7981 (2.2261) loss 3.1699 (3.0165) grad_norm 3.3491 (3.2502) [2022-01-26 22:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][730/1251] eta 0:19:17 lr 0.000014 time 2.2783 (2.2225) loss 3.2812 (3.0162) grad_norm 2.9961 (3.2507) [2022-01-26 22:34:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][740/1251] eta 0:18:54 lr 0.000014 time 1.9629 (2.2201) loss 3.2390 (3.0170) grad_norm 2.9002 (3.2524) [2022-01-26 22:34:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][750/1251] eta 0:18:31 lr 0.000014 time 1.8832 (2.2191) loss 3.5483 (3.0178) grad_norm 2.9986 (3.2549) [2022-01-26 22:34:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][760/1251] eta 0:18:10 lr 0.000014 time 1.6273 (2.2201) loss 3.2908 (3.0195) grad_norm 3.1622 (3.2558) [2022-01-26 22:35:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][770/1251] eta 0:17:49 lr 0.000014 time 2.8055 (2.2236) loss 2.4394 (3.0213) grad_norm 3.7662 (3.2594) [2022-01-26 22:35:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][780/1251] eta 0:17:28 lr 0.000014 time 2.1825 (2.2260) loss 3.0795 (3.0224) grad_norm 4.5675 (3.2636) [2022-01-26 22:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[287/300][790/1251] eta 0:17:05 lr 0.000014 time 2.4596 (2.2255) loss 2.8539 (3.0221) grad_norm 3.1581 (3.2619) [2022-01-26 22:36:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][800/1251] eta 0:16:42 lr 0.000014 time 1.9462 (2.2237) loss 3.0845 (3.0229) grad_norm 2.9047 (3.2606) [2022-01-26 22:36:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][810/1251] eta 0:16:19 lr 0.000014 time 2.2377 (2.2221) loss 3.1932 (3.0240) grad_norm 3.3744 (3.2633) [2022-01-26 22:37:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][820/1251] eta 0:15:57 lr 0.000014 time 1.4804 (2.2209) loss 1.8259 (3.0205) grad_norm 3.2865 (3.2644) [2022-01-26 22:37:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][830/1251] eta 0:15:34 lr 0.000014 time 2.2167 (2.2192) loss 2.7037 (3.0174) grad_norm 3.4176 (3.2639) [2022-01-26 22:37:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][840/1251] eta 0:15:11 lr 0.000014 time 1.7955 (2.2189) loss 3.3197 (3.0147) grad_norm 3.4739 (3.2625) [2022-01-26 22:38:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][850/1251] eta 0:14:49 lr 0.000014 time 2.1593 (2.2191) loss 3.1774 (3.0122) grad_norm 2.9144 (3.2611) [2022-01-26 22:38:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][860/1251] eta 0:14:28 lr 0.000014 time 1.8812 (2.2209) loss 3.5352 (3.0111) grad_norm 2.9086 (3.2578) [2022-01-26 22:38:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][870/1251] eta 0:14:05 lr 0.000014 time 1.7964 (2.2185) loss 2.8447 (3.0127) grad_norm 2.7715 (3.2545) [2022-01-26 22:39:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][880/1251] eta 0:13:42 lr 0.000014 time 2.1214 (2.2180) loss 2.6328 (3.0120) grad_norm 3.0005 (3.2595) [2022-01-26 22:39:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][890/1251] eta 0:13:21 lr 0.000014 time 1.9448 (2.2200) loss 2.5738 (3.0129) grad_norm 
3.4368 (3.2601) [2022-01-26 22:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][900/1251] eta 0:12:59 lr 0.000014 time 1.9692 (2.2200) loss 2.8620 (3.0135) grad_norm 2.4334 (3.2589) [2022-01-26 22:40:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][910/1251] eta 0:12:36 lr 0.000014 time 1.6794 (2.2179) loss 3.3509 (3.0138) grad_norm 2.9505 (3.2576) [2022-01-26 22:40:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][920/1251] eta 0:12:13 lr 0.000014 time 1.8795 (2.2159) loss 3.0719 (3.0131) grad_norm 3.0827 (3.2564) [2022-01-26 22:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][930/1251] eta 0:11:51 lr 0.000014 time 2.1190 (2.2173) loss 2.8466 (3.0139) grad_norm 2.8387 (3.2568) [2022-01-26 22:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][940/1251] eta 0:11:30 lr 0.000014 time 4.2670 (2.2214) loss 3.7089 (3.0152) grad_norm 3.5678 (3.2588) [2022-01-26 22:41:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][950/1251] eta 0:11:08 lr 0.000014 time 1.5896 (2.2194) loss 2.4900 (3.0117) grad_norm 2.7847 (3.2582) [2022-01-26 22:42:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][960/1251] eta 0:10:45 lr 0.000014 time 1.9634 (2.2171) loss 2.7803 (3.0091) grad_norm 4.3715 (3.2602) [2022-01-26 22:42:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][970/1251] eta 0:10:22 lr 0.000014 time 2.1228 (2.2151) loss 3.5552 (3.0073) grad_norm 3.6212 (3.2614) [2022-01-26 22:42:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][980/1251] eta 0:10:00 lr 0.000014 time 2.8559 (2.2147) loss 2.9051 (3.0083) grad_norm 4.2474 (3.2651) [2022-01-26 22:43:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][990/1251] eta 0:09:37 lr 0.000014 time 2.1199 (2.2145) loss 1.9479 (3.0074) grad_norm 3.6792 (3.2644) [2022-01-26 22:43:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[287/300][1000/1251] eta 0:09:15 lr 0.000014 time 2.3287 (2.2145) loss 3.3128 (3.0055) grad_norm 2.6988 (3.2654) [2022-01-26 22:44:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1010/1251] eta 0:08:53 lr 0.000014 time 1.9347 (2.2132) loss 3.4727 (3.0024) grad_norm 3.0692 (3.2664) [2022-01-26 22:44:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1020/1251] eta 0:08:31 lr 0.000014 time 1.7228 (2.2125) loss 3.0102 (3.0021) grad_norm 3.3400 (3.2659) [2022-01-26 22:44:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1030/1251] eta 0:08:09 lr 0.000014 time 1.5379 (2.2128) loss 3.2774 (3.0024) grad_norm 3.0947 (3.2656) [2022-01-26 22:45:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1040/1251] eta 0:07:46 lr 0.000014 time 1.8457 (2.2122) loss 3.3276 (3.0008) grad_norm 3.0865 (3.2674) [2022-01-26 22:45:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1050/1251] eta 0:07:24 lr 0.000014 time 2.0513 (2.2138) loss 2.9495 (3.0011) grad_norm 2.9009 (3.2654) [2022-01-26 22:45:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1060/1251] eta 0:07:02 lr 0.000014 time 1.6678 (2.2124) loss 2.1735 (3.0001) grad_norm 3.2097 (3.2639) [2022-01-26 22:46:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1070/1251] eta 0:06:40 lr 0.000014 time 2.0393 (2.2122) loss 1.9612 (2.9983) grad_norm 3.0496 (3.2642) [2022-01-26 22:46:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1080/1251] eta 0:06:18 lr 0.000014 time 2.0176 (2.2106) loss 1.9943 (2.9987) grad_norm 3.1051 (3.2636) [2022-01-26 22:46:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1090/1251] eta 0:05:56 lr 0.000014 time 2.4673 (2.2112) loss 3.1783 (2.9981) grad_norm 3.7536 (3.2635) [2022-01-26 22:47:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1100/1251] eta 0:05:34 lr 0.000014 time 2.0514 (2.2123) loss 3.1929 (2.9979) 
grad_norm 3.7709 (3.2638) [2022-01-26 22:47:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1110/1251] eta 0:05:11 lr 0.000014 time 1.7013 (2.2121) loss 3.3491 (2.9976) grad_norm 3.2375 (3.2633) [2022-01-26 22:48:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1120/1251] eta 0:04:49 lr 0.000014 time 2.1110 (2.2126) loss 2.8472 (3.0001) grad_norm 3.2197 (3.2626) [2022-01-26 22:48:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1130/1251] eta 0:04:27 lr 0.000014 time 1.8928 (2.2129) loss 2.2276 (2.9999) grad_norm 3.2999 (3.2637) [2022-01-26 22:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1140/1251] eta 0:04:05 lr 0.000014 time 1.8306 (2.2114) loss 2.0608 (2.9993) grad_norm 3.1988 (3.2619) [2022-01-26 22:49:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1150/1251] eta 0:03:43 lr 0.000014 time 1.8004 (2.2138) loss 3.2229 (3.0003) grad_norm 2.9931 (3.2607) [2022-01-26 22:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1160/1251] eta 0:03:21 lr 0.000014 time 1.8879 (2.2124) loss 3.2236 (3.0010) grad_norm 3.4408 (3.2597) [2022-01-26 22:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1170/1251] eta 0:02:59 lr 0.000014 time 1.7047 (2.2117) loss 3.3353 (3.0033) grad_norm 3.4023 (3.2607) [2022-01-26 22:50:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1180/1251] eta 0:02:36 lr 0.000014 time 1.9810 (2.2105) loss 3.2649 (3.0029) grad_norm 3.1291 (3.2613) [2022-01-26 22:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1190/1251] eta 0:02:14 lr 0.000014 time 1.8665 (2.2113) loss 3.3068 (3.0040) grad_norm 3.2373 (3.2618) [2022-01-26 22:50:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1200/1251] eta 0:01:52 lr 0.000014 time 1.9440 (2.2116) loss 2.7608 (3.0040) grad_norm 4.2766 (3.2614) [2022-01-26 22:51:22 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [287/300][1210/1251] eta 0:01:30 lr 0.000014 time 1.8860 (2.2118) loss 3.3050 (3.0039) grad_norm 2.7411 (3.2606) [2022-01-26 22:51:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1220/1251] eta 0:01:08 lr 0.000014 time 1.6855 (2.2113) loss 3.0539 (3.0031) grad_norm 2.9068 (3.2597) [2022-01-26 22:52:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1230/1251] eta 0:00:46 lr 0.000014 time 2.0491 (2.2133) loss 2.9568 (3.0034) grad_norm 2.9403 (3.2629) [2022-01-26 22:52:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1240/1251] eta 0:00:24 lr 0.000014 time 1.3482 (2.2114) loss 2.8228 (3.0047) grad_norm 3.2333 (3.2641) [2022-01-26 22:52:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [287/300][1250/1251] eta 0:00:02 lr 0.000014 time 1.1839 (2.2058) loss 2.8199 (3.0048) grad_norm 4.0195 (3.2640) [2022-01-26 22:52:43 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 287 training takes 0:45:59 [2022-01-26 22:53:01 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.106 (18.106) Loss 0.7713 (0.7713) Acc@1 81.445 (81.445) Acc@5 95.996 (95.996) [2022-01-26 22:53:17 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.306 (3.089) Loss 0.7379 (0.8014) Acc@1 83.789 (81.410) Acc@5 95.898 (95.685) [2022-01-26 22:53:36 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.923 (2.491) Loss 0.7880 (0.8149) Acc@1 80.762 (81.138) Acc@5 95.117 (95.406) [2022-01-26 22:53:52 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.964 (2.230) Loss 0.8210 (0.8152) Acc@1 80.566 (81.168) Acc@5 95.508 (95.410) [2022-01-26 22:54:13 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.339 (2.184) Loss 0.8074 (0.8174) Acc@1 80.957 (81.107) Acc@5 95.410 (95.417) [2022-01-26 22:54:20 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.134 Acc@5 95.422 [2022-01-26 22:54:20 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 22:54:20 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.13% [2022-01-26 22:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][0/1251] eta 7:22:09 lr 0.000014 time 21.2070 (21.2070) loss 2.9392 (2.9392) grad_norm 3.0540 (3.0540) [2022-01-26 22:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][10/1251] eta 1:17:54 lr 0.000014 time 1.6800 (3.7669) loss 3.1183 (3.0019) grad_norm 2.8918 (3.1362) [2022-01-26 22:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][20/1251] eta 1:04:04 lr 0.000014 time 2.1905 (3.1229) loss 1.9133 (2.9317) grad_norm 3.0741 (3.2209) [2022-01-26 22:55:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][30/1251] eta 0:57:37 lr 0.000014 time 2.6764 (2.8317) loss 2.6143 (2.9594) grad_norm 2.8418 (3.1482) [2022-01-26 22:56:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][40/1251] eta 0:55:00 lr 0.000014 time 3.7084 (2.7256) loss 3.0105 (2.9552) grad_norm 3.2430 (3.1817) [2022-01-26 22:56:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][50/1251] eta 0:52:52 lr 0.000014 time 2.4084 (2.6419) loss 2.7274 (2.9271) grad_norm 3.3894 (3.1668) [2022-01-26 22:56:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][60/1251] eta 0:51:40 lr 0.000014 time 2.2866 (2.6033) loss 3.2218 (2.9004) grad_norm 3.0624 (3.1500) [2022-01-26 22:57:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][70/1251] eta 0:50:07 lr 0.000014 time 1.5710 (2.5465) loss 3.3050 (2.9348) grad_norm 2.5106 (3.1457) [2022-01-26 22:57:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][80/1251] eta 0:48:41 lr 0.000014 time 3.3214 (2.4945) loss 3.2512 (2.9217) grad_norm 3.3915 (3.1469) [2022-01-26 22:58:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][90/1251] eta 0:47:02 lr 0.000014 time 
1.7580 (2.4314) loss 3.5964 (2.9507) grad_norm 3.5264 (3.1947) [2022-01-26 22:58:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][100/1251] eta 0:46:06 lr 0.000014 time 2.2213 (2.4036) loss 2.5313 (2.9680) grad_norm 3.6844 (3.2108) [2022-01-26 22:58:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][110/1251] eta 0:45:33 lr 0.000014 time 3.1973 (2.3957) loss 3.0463 (2.9789) grad_norm 3.1890 (3.2109) [2022-01-26 22:59:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][120/1251] eta 0:44:51 lr 0.000014 time 2.2979 (2.3801) loss 3.4658 (2.9946) grad_norm 3.6555 (3.2182) [2022-01-26 22:59:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][130/1251] eta 0:44:08 lr 0.000014 time 1.7442 (2.3629) loss 2.7048 (2.9875) grad_norm 3.9825 (3.2183) [2022-01-26 22:59:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][140/1251] eta 0:43:39 lr 0.000014 time 2.3499 (2.3574) loss 3.0588 (2.9910) grad_norm 3.4899 (3.2178) [2022-01-26 23:00:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][150/1251] eta 0:42:57 lr 0.000014 time 2.2243 (2.3411) loss 3.1592 (3.0020) grad_norm 2.9540 (3.2117) [2022-01-26 23:00:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][160/1251] eta 0:42:19 lr 0.000014 time 2.1415 (2.3274) loss 3.0340 (2.9992) grad_norm 3.0704 (3.2137) [2022-01-26 23:00:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][170/1251] eta 0:41:43 lr 0.000014 time 2.5993 (2.3157) loss 2.3121 (2.9948) grad_norm 3.6204 (3.2080) [2022-01-26 23:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][180/1251] eta 0:41:01 lr 0.000014 time 1.8400 (2.2982) loss 3.3998 (3.0030) grad_norm 3.1810 (3.2080) [2022-01-26 23:01:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][190/1251] eta 0:40:30 lr 0.000014 time 1.9812 (2.2909) loss 3.0644 (2.9899) grad_norm 3.1586 (3.2196) [2022-01-26 23:02:00 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][200/1251] eta 0:40:06 lr 0.000014 time 2.8325 (2.2896) loss 3.1970 (2.9920) grad_norm 3.3959 (3.2228) [2022-01-26 23:02:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][210/1251] eta 0:39:35 lr 0.000014 time 2.2412 (2.2824) loss 2.3357 (2.9979) grad_norm 3.0070 (3.2220) [2022-01-26 23:02:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][220/1251] eta 0:39:01 lr 0.000014 time 1.6962 (2.2711) loss 2.8106 (2.9996) grad_norm 3.4321 (3.2244) [2022-01-26 23:03:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][230/1251] eta 0:38:32 lr 0.000014 time 2.1678 (2.2654) loss 2.9607 (2.9931) grad_norm 2.8143 (3.2220) [2022-01-26 23:03:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][240/1251] eta 0:38:18 lr 0.000014 time 2.9237 (2.2734) loss 3.1571 (2.9994) grad_norm 3.0961 (3.2222) [2022-01-26 23:03:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][250/1251] eta 0:38:01 lr 0.000014 time 3.0111 (2.2790) loss 3.3798 (3.0117) grad_norm 3.2415 (3.2208) [2022-01-26 23:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][260/1251] eta 0:37:36 lr 0.000014 time 1.6931 (2.2767) loss 3.2101 (3.0061) grad_norm 3.1720 (3.2309) [2022-01-26 23:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][270/1251] eta 0:37:05 lr 0.000014 time 1.9435 (2.2688) loss 2.3765 (3.0012) grad_norm 3.0591 (3.2330) [2022-01-26 23:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][280/1251] eta 0:36:32 lr 0.000014 time 1.8436 (2.2585) loss 1.8417 (3.0005) grad_norm 2.9372 (3.2355) [2022-01-26 23:05:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][290/1251] eta 0:36:11 lr 0.000014 time 2.5265 (2.2592) loss 3.2059 (2.9977) grad_norm 3.4084 (3.2410) [2022-01-26 23:05:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][300/1251] eta 0:35:47 lr 
0.000014 time 2.0426 (2.2584) loss 2.7580 (2.9930) grad_norm 3.0744 (3.2394) [2022-01-26 23:06:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][310/1251] eta 0:35:21 lr 0.000014 time 2.2524 (2.2550) loss 2.3574 (2.9870) grad_norm 3.4442 (3.2375) [2022-01-26 23:06:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][320/1251] eta 0:34:55 lr 0.000014 time 2.6089 (2.2507) loss 3.1401 (2.9919) grad_norm 2.8327 (3.2351) [2022-01-26 23:06:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][330/1251] eta 0:34:28 lr 0.000014 time 1.6178 (2.2455) loss 1.8880 (2.9879) grad_norm 2.9240 (3.2340) [2022-01-26 23:07:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][340/1251] eta 0:34:02 lr 0.000014 time 2.1344 (2.2424) loss 3.6720 (2.9842) grad_norm 3.2848 (3.2329) [2022-01-26 23:07:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][350/1251] eta 0:33:42 lr 0.000014 time 2.3965 (2.2451) loss 3.2950 (2.9888) grad_norm 3.8223 (3.2339) [2022-01-26 23:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][360/1251] eta 0:33:15 lr 0.000014 time 1.7501 (2.2391) loss 2.9693 (2.9895) grad_norm 3.0682 (3.2344) [2022-01-26 23:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][370/1251] eta 0:32:50 lr 0.000014 time 2.2016 (2.2364) loss 2.7732 (2.9887) grad_norm 3.1806 (3.2362) [2022-01-26 23:08:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][380/1251] eta 0:32:27 lr 0.000014 time 1.9817 (2.2354) loss 3.0383 (2.9907) grad_norm 3.4343 (3.2391) [2022-01-26 23:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][390/1251] eta 0:32:07 lr 0.000014 time 2.5864 (2.2385) loss 2.5863 (2.9885) grad_norm 4.7172 (3.2462) [2022-01-26 23:09:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][400/1251] eta 0:31:43 lr 0.000014 time 1.6615 (2.2369) loss 3.6413 (2.9887) grad_norm 3.3016 (3.2519) [2022-01-26 23:09:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][410/1251] eta 0:31:22 lr 0.000014 time 2.6180 (2.2388) loss 3.3337 (2.9903) grad_norm 3.7766 (3.2552) [2022-01-26 23:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][420/1251] eta 0:30:57 lr 0.000014 time 1.7035 (2.2352) loss 3.2244 (2.9922) grad_norm 3.1206 (3.2594) [2022-01-26 23:10:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][430/1251] eta 0:30:34 lr 0.000014 time 2.4847 (2.2340) loss 3.1685 (2.9956) grad_norm 2.9228 (3.2588) [2022-01-26 23:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][440/1251] eta 0:30:09 lr 0.000014 time 1.8425 (2.2309) loss 3.0977 (2.9980) grad_norm 9.0441 (3.2700) [2022-01-26 23:11:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][450/1251] eta 0:29:46 lr 0.000014 time 2.3804 (2.2309) loss 3.4291 (2.9958) grad_norm 3.3089 (3.2723) [2022-01-26 23:11:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][460/1251] eta 0:29:23 lr 0.000014 time 1.6850 (2.2295) loss 2.4741 (2.9931) grad_norm 3.4681 (3.2765) [2022-01-26 23:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][470/1251] eta 0:29:03 lr 0.000014 time 2.6647 (2.2324) loss 3.4892 (2.9953) grad_norm 3.4630 (3.2750) [2022-01-26 23:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][480/1251] eta 0:28:39 lr 0.000014 time 1.5818 (2.2305) loss 1.8862 (2.9936) grad_norm 3.4257 (3.2719) [2022-01-26 23:12:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][490/1251] eta 0:28:15 lr 0.000014 time 2.4957 (2.2280) loss 3.4200 (2.9930) grad_norm 3.2347 (3.2687) [2022-01-26 23:12:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][500/1251] eta 0:27:52 lr 0.000014 time 1.8990 (2.2274) loss 2.0120 (2.9927) grad_norm 2.9383 (3.2755) [2022-01-26 23:13:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][510/1251] eta 0:27:30 lr 
0.000014 time 1.7993 (2.2271) loss 3.0695 (2.9907) grad_norm 3.0899 (3.2740) [2022-01-26 23:13:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][520/1251] eta 0:27:07 lr 0.000014 time 1.9953 (2.2261) loss 2.5341 (2.9879) grad_norm 2.9080 (3.2752) [2022-01-26 23:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][530/1251] eta 0:26:44 lr 0.000014 time 2.6274 (2.2260) loss 3.5641 (2.9872) grad_norm 3.4124 (3.2744) [2022-01-26 23:14:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][540/1251] eta 0:26:21 lr 0.000014 time 1.9545 (2.2247) loss 2.9428 (2.9877) grad_norm 2.7947 (3.2771) [2022-01-26 23:14:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][550/1251] eta 0:25:59 lr 0.000014 time 1.8544 (2.2250) loss 3.3276 (2.9870) grad_norm 2.8511 (3.2756) [2022-01-26 23:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][560/1251] eta 0:25:36 lr 0.000014 time 1.8938 (2.2239) loss 2.9903 (2.9860) grad_norm 3.0910 (3.2788) [2022-01-26 23:15:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][570/1251] eta 0:25:13 lr 0.000014 time 2.0794 (2.2229) loss 2.9271 (2.9891) grad_norm 4.1515 (3.2799) [2022-01-26 23:15:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][580/1251] eta 0:24:51 lr 0.000014 time 2.1197 (2.2222) loss 3.1408 (2.9939) grad_norm 3.3944 (3.2777) [2022-01-26 23:16:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][590/1251] eta 0:24:28 lr 0.000014 time 1.8997 (2.2210) loss 2.5063 (2.9957) grad_norm 2.8351 (3.2821) [2022-01-26 23:16:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][600/1251] eta 0:24:04 lr 0.000014 time 2.2205 (2.2194) loss 3.1232 (2.9973) grad_norm 2.9730 (3.2859) [2022-01-26 23:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][610/1251] eta 0:23:42 lr 0.000014 time 2.4959 (2.2190) loss 3.3737 (2.9979) grad_norm 3.6052 (3.2873) [2022-01-26 23:17:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][620/1251] eta 0:23:20 lr 0.000014 time 2.1095 (2.2190) loss 3.3775 (2.9985) grad_norm 4.4275 (3.2843) [2022-01-26 23:17:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][630/1251] eta 0:22:55 lr 0.000014 time 1.9876 (2.2156) loss 3.4712 (3.0004) grad_norm 4.1694 (3.2863) [2022-01-26 23:17:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][640/1251] eta 0:22:31 lr 0.000014 time 1.9306 (2.2119) loss 3.3415 (2.9993) grad_norm 2.6142 (3.2823) [2022-01-26 23:18:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][650/1251] eta 0:22:09 lr 0.000014 time 2.2140 (2.2123) loss 2.6433 (3.0016) grad_norm 2.9035 (3.2816) [2022-01-26 23:18:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][660/1251] eta 0:21:46 lr 0.000014 time 1.9488 (2.2106) loss 2.4207 (2.9995) grad_norm 3.8856 (3.2828) [2022-01-26 23:19:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][670/1251] eta 0:21:24 lr 0.000014 time 2.1893 (2.2101) loss 3.3905 (2.9980) grad_norm 3.1551 (3.2844) [2022-01-26 23:19:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][680/1251] eta 0:21:02 lr 0.000014 time 2.1508 (2.2114) loss 3.3869 (2.9954) grad_norm 3.3852 (3.2843) [2022-01-26 23:19:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][690/1251] eta 0:20:39 lr 0.000014 time 1.7720 (2.2093) loss 3.6296 (2.9959) grad_norm 3.1224 (3.2808) [2022-01-26 23:20:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][700/1251] eta 0:20:17 lr 0.000014 time 1.8144 (2.2104) loss 3.6014 (2.9917) grad_norm 2.9007 (3.2784) [2022-01-26 23:20:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][710/1251] eta 0:19:56 lr 0.000014 time 2.2913 (2.2115) loss 3.2568 (2.9935) grad_norm 3.2793 (3.2774) [2022-01-26 23:20:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][720/1251] eta 0:19:34 lr 
0.000014 time 2.2747 (2.2118) loss 3.2732 (2.9898) grad_norm 3.4181 (3.2783) [2022-01-26 23:21:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][730/1251] eta 0:19:12 lr 0.000014 time 1.9061 (2.2126) loss 3.5605 (2.9926) grad_norm 2.6021 (3.2756) [2022-01-26 23:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][740/1251] eta 0:18:50 lr 0.000014 time 1.6518 (2.2123) loss 3.0436 (2.9911) grad_norm 3.2255 (3.2761) [2022-01-26 23:22:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][750/1251] eta 0:18:28 lr 0.000014 time 2.2353 (2.2130) loss 3.4092 (2.9933) grad_norm 3.3775 (3.2760) [2022-01-26 23:22:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][760/1251] eta 0:18:04 lr 0.000014 time 1.8311 (2.2093) loss 3.0521 (2.9947) grad_norm 3.2882 (3.2776) [2022-01-26 23:22:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][770/1251] eta 0:17:41 lr 0.000014 time 1.7761 (2.2070) loss 2.6882 (2.9924) grad_norm 3.5578 (3.2799) [2022-01-26 23:23:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][780/1251] eta 0:17:18 lr 0.000014 time 1.8600 (2.2058) loss 3.3335 (2.9939) grad_norm 3.3396 (3.2819) [2022-01-26 23:23:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][790/1251] eta 0:16:56 lr 0.000014 time 1.7919 (2.2045) loss 2.7011 (2.9954) grad_norm 3.2898 (3.2818) [2022-01-26 23:23:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][800/1251] eta 0:16:34 lr 0.000013 time 1.5204 (2.2052) loss 2.2750 (2.9925) grad_norm 2.9556 (3.2804) [2022-01-26 23:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][810/1251] eta 0:16:13 lr 0.000013 time 3.1732 (2.2068) loss 2.3800 (2.9919) grad_norm 3.4111 (3.2783) [2022-01-26 23:24:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][820/1251] eta 0:15:51 lr 0.000013 time 2.7574 (2.2084) loss 2.0171 (2.9907) grad_norm 3.3664 (3.2801) [2022-01-26 23:24:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][830/1251] eta 0:15:30 lr 0.000013 time 1.9315 (2.2099) loss 3.4307 (2.9917) grad_norm 2.8915 (3.2794) [2022-01-26 23:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][840/1251] eta 0:15:07 lr 0.000013 time 1.6325 (2.2083) loss 2.6973 (2.9926) grad_norm 3.2343 (3.2775) [2022-01-26 23:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][850/1251] eta 0:14:44 lr 0.000013 time 2.2278 (2.2066) loss 2.4846 (2.9896) grad_norm 2.8346 (3.2771) [2022-01-26 23:25:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][860/1251] eta 0:14:22 lr 0.000013 time 2.1770 (2.2053) loss 3.6255 (2.9920) grad_norm 3.0525 (3.2772) [2022-01-26 23:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][870/1251] eta 0:13:59 lr 0.000013 time 1.9366 (2.2043) loss 3.2745 (2.9936) grad_norm 3.1760 (3.2748) [2022-01-26 23:26:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][880/1251] eta 0:13:37 lr 0.000013 time 2.4058 (2.2039) loss 2.8580 (2.9928) grad_norm 3.5289 (3.2764) [2022-01-26 23:27:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][890/1251] eta 0:13:15 lr 0.000013 time 1.9785 (2.2028) loss 3.3133 (2.9964) grad_norm 3.2055 (3.2788) [2022-01-26 23:27:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][900/1251] eta 0:12:53 lr 0.000013 time 3.2662 (2.2045) loss 3.3622 (2.9953) grad_norm 3.0913 (3.2768) [2022-01-26 23:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][910/1251] eta 0:12:31 lr 0.000013 time 1.7690 (2.2045) loss 2.2471 (2.9969) grad_norm 3.2047 (3.2760) [2022-01-26 23:28:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][920/1251] eta 0:12:09 lr 0.000013 time 2.1142 (2.2042) loss 3.2452 (2.9978) grad_norm 3.2106 (3.2793) [2022-01-26 23:28:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][930/1251] eta 0:11:47 lr 
0.000013 time 1.9070 (2.2039) loss 3.0233 (2.9987) grad_norm 2.9499 (3.2782) [2022-01-26 23:28:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][940/1251] eta 0:11:25 lr 0.000013 time 2.0162 (2.2038) loss 2.7904 (2.9977) grad_norm 3.3206 (3.2773) [2022-01-26 23:29:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][950/1251] eta 0:11:03 lr 0.000013 time 2.1911 (2.2035) loss 3.3156 (2.9978) grad_norm 3.0789 (3.2767) [2022-01-26 23:29:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][960/1251] eta 0:10:41 lr 0.000013 time 2.8524 (2.2044) loss 3.3387 (2.9962) grad_norm 3.0232 (3.2772) [2022-01-26 23:29:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][970/1251] eta 0:10:18 lr 0.000013 time 1.6694 (2.2025) loss 1.8607 (2.9937) grad_norm 3.3142 (3.2891) [2022-01-26 23:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][980/1251] eta 0:09:56 lr 0.000013 time 1.9021 (2.2009) loss 3.2663 (2.9929) grad_norm 2.7577 (3.2912) [2022-01-26 23:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][990/1251] eta 0:09:34 lr 0.000013 time 2.1285 (2.1998) loss 2.2477 (2.9914) grad_norm 3.1635 (3.2884) [2022-01-26 23:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1000/1251] eta 0:09:11 lr 0.000013 time 1.8663 (2.1981) loss 2.5266 (2.9875) grad_norm 3.0656 (3.2884) [2022-01-26 23:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1010/1251] eta 0:08:49 lr 0.000013 time 2.1627 (2.1979) loss 2.9583 (2.9869) grad_norm 3.0110 (3.2876) [2022-01-26 23:31:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1020/1251] eta 0:08:27 lr 0.000013 time 2.0226 (2.1983) loss 3.5279 (2.9844) grad_norm 3.3208 (3.2881) [2022-01-26 23:32:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1030/1251] eta 0:08:05 lr 0.000013 time 2.1087 (2.1979) loss 3.1017 (2.9868) grad_norm 3.8446 (3.2900) [2022-01-26 
23:32:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1040/1251] eta 0:07:43 lr 0.000013 time 2.1602 (2.1990) loss 3.3635 (2.9868) grad_norm 3.1041 (3.2887) [2022-01-26 23:32:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1050/1251] eta 0:07:22 lr 0.000013 time 2.2137 (2.1999) loss 2.2148 (2.9842) grad_norm 3.0369 (3.2860) [2022-01-26 23:33:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1060/1251] eta 0:07:00 lr 0.000013 time 2.0522 (2.2004) loss 3.2419 (2.9831) grad_norm 2.9634 (3.2856) [2022-01-26 23:33:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1070/1251] eta 0:06:38 lr 0.000013 time 2.2328 (2.2013) loss 3.0304 (2.9847) grad_norm 3.4167 (3.2867) [2022-01-26 23:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1080/1251] eta 0:06:16 lr 0.000013 time 2.8696 (2.2033) loss 2.9094 (2.9853) grad_norm 3.4032 (3.2887) [2022-01-26 23:34:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1090/1251] eta 0:05:54 lr 0.000013 time 3.1572 (2.2045) loss 3.3070 (2.9863) grad_norm 3.8378 (3.2892) [2022-01-26 23:34:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1100/1251] eta 0:05:32 lr 0.000013 time 1.9929 (2.2020) loss 3.6609 (2.9875) grad_norm 3.3397 (3.2885) [2022-01-26 23:35:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1110/1251] eta 0:05:10 lr 0.000013 time 1.9131 (2.1988) loss 2.4785 (2.9858) grad_norm 3.4825 (3.2880) [2022-01-26 23:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1120/1251] eta 0:04:47 lr 0.000013 time 2.5940 (2.1973) loss 2.1038 (2.9849) grad_norm 3.7163 (3.2869) [2022-01-26 23:35:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1130/1251] eta 0:04:25 lr 0.000013 time 2.7487 (2.1977) loss 3.2974 (2.9856) grad_norm 3.2171 (3.2855) [2022-01-26 23:36:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1140/1251] 
eta 0:04:03 lr 0.000013 time 1.5674 (2.1967) loss 2.6994 (2.9827) grad_norm 3.8031 (3.2856) [2022-01-26 23:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1150/1251] eta 0:03:41 lr 0.000013 time 2.5517 (2.1970) loss 2.2736 (2.9805) grad_norm 2.9943 (3.2850) [2022-01-26 23:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1160/1251] eta 0:03:19 lr 0.000013 time 2.3757 (2.1971) loss 3.2061 (2.9793) grad_norm 2.9125 (3.2840) [2022-01-26 23:37:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1170/1251] eta 0:02:58 lr 0.000013 time 2.5628 (2.1976) loss 2.5332 (2.9799) grad_norm 3.1367 (3.2838) [2022-01-26 23:37:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1180/1251] eta 0:02:36 lr 0.000013 time 1.9823 (2.1981) loss 2.7735 (2.9805) grad_norm 2.9327 (3.2812) [2022-01-26 23:37:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1190/1251] eta 0:02:14 lr 0.000013 time 2.4960 (2.1987) loss 3.3160 (2.9812) grad_norm 3.0454 (3.2806) [2022-01-26 23:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1200/1251] eta 0:01:52 lr 0.000013 time 3.3814 (2.2011) loss 2.7519 (2.9833) grad_norm 3.0083 (3.2787) [2022-01-26 23:38:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1210/1251] eta 0:01:30 lr 0.000013 time 2.1928 (2.2017) loss 3.1305 (2.9825) grad_norm 2.8633 (3.2768) [2022-01-26 23:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1220/1251] eta 0:01:08 lr 0.000013 time 1.8472 (2.2023) loss 2.8093 (2.9821) grad_norm 3.1547 (3.2753) [2022-01-26 23:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1230/1251] eta 0:00:46 lr 0.000013 time 1.8982 (2.2006) loss 2.1165 (2.9820) grad_norm 2.5967 (3.2742) [2022-01-26 23:39:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1240/1251] eta 0:00:24 lr 0.000013 time 2.0326 (2.1981) loss 3.5239 (2.9832) grad_norm 3.3634 
(3.2741) [2022-01-26 23:40:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [288/300][1250/1251] eta 0:00:02 lr 0.000013 time 1.2095 (2.1918) loss 2.9077 (2.9826) grad_norm 3.1282 (3.2736) [2022-01-26 23:40:02 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 288 training takes 0:45:42 [2022-01-26 23:40:21 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.368 (18.368) Loss 0.8143 (0.8143) Acc@1 81.152 (81.152) Acc@5 95.508 (95.508) [2022-01-26 23:40:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.388 (3.307) Loss 0.8247 (0.8287) Acc@1 80.957 (80.513) Acc@5 95.312 (95.321) [2022-01-26 23:40:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.921 (2.633) Loss 0.7405 (0.8149) Acc@1 83.203 (80.771) Acc@5 96.289 (95.526) [2022-01-26 23:41:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.981 (2.384) Loss 0.8253 (0.8142) Acc@1 80.566 (80.922) Acc@5 95.410 (95.423) [2022-01-26 23:41:32 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.554 (2.195) Loss 0.7951 (0.8149) Acc@1 81.055 (80.955) Acc@5 95.898 (95.408) [2022-01-26 23:41:41 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.144 Acc@5 95.460 [2022-01-26 23:41:41 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-26 23:41:41 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.14% [2022-01-26 23:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][0/1251] eta 7:33:40 lr 0.000013 time 21.7592 (21.7592) loss 2.8313 (2.8313) grad_norm 3.9046 (3.9046) [2022-01-26 23:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][10/1251] eta 1:22:32 lr 0.000013 time 2.2460 (3.9909) loss 2.8443 (2.8880) grad_norm 3.2474 (4.0118) [2022-01-26 23:42:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][20/1251] eta 1:04:27 lr 0.000013 time 1.5192 (3.1414) loss 
3.3149 (2.9019) grad_norm 3.1799 (3.5471) [2022-01-26 23:43:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][30/1251] eta 0:57:32 lr 0.000013 time 1.5841 (2.8280) loss 2.7297 (2.8860) grad_norm 3.3661 (3.4762) [2022-01-26 23:43:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][40/1251] eta 0:55:24 lr 0.000013 time 3.7773 (2.7452) loss 3.2264 (2.8762) grad_norm 2.6097 (3.4599) [2022-01-26 23:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][50/1251] eta 0:53:48 lr 0.000013 time 2.7439 (2.6878) loss 2.3114 (2.8551) grad_norm 2.9918 (3.3799) [2022-01-26 23:44:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][60/1251] eta 0:51:54 lr 0.000013 time 1.9285 (2.6146) loss 3.1804 (2.8661) grad_norm 2.8682 (3.3555) [2022-01-26 23:44:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][70/1251] eta 0:50:13 lr 0.000013 time 1.8501 (2.5512) loss 2.0029 (2.8827) grad_norm 3.1434 (3.3113) [2022-01-26 23:45:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][80/1251] eta 0:48:44 lr 0.000013 time 2.1918 (2.4975) loss 3.4541 (2.8902) grad_norm 3.4055 (3.3116) [2022-01-26 23:45:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][90/1251] eta 0:47:39 lr 0.000013 time 2.6787 (2.4630) loss 3.1760 (2.8869) grad_norm 3.0774 (3.3135) [2022-01-26 23:45:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][100/1251] eta 0:46:44 lr 0.000013 time 2.2213 (2.4365) loss 2.5383 (2.8920) grad_norm 2.8249 (3.3194) [2022-01-26 23:46:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][110/1251] eta 0:45:35 lr 0.000013 time 1.7834 (2.3975) loss 3.0715 (2.9001) grad_norm 3.0446 (3.3162) [2022-01-26 23:46:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][120/1251] eta 0:44:46 lr 0.000013 time 2.5221 (2.3751) loss 3.0857 (2.8945) grad_norm 3.1816 (3.3213) [2022-01-26 23:46:49 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [289/300][130/1251] eta 0:43:54 lr 0.000013 time 2.2374 (2.3498) loss 2.0834 (2.8926) grad_norm 3.3728 (3.3204) [2022-01-26 23:47:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][140/1251] eta 0:43:18 lr 0.000013 time 2.7462 (2.3385) loss 3.2446 (2.9097) grad_norm 2.9157 (3.3180) [2022-01-26 23:47:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][150/1251] eta 0:42:51 lr 0.000013 time 2.1446 (2.3357) loss 3.3561 (2.9116) grad_norm 2.8093 (3.3183) [2022-01-26 23:47:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][160/1251] eta 0:42:29 lr 0.000013 time 3.2678 (2.3370) loss 2.9563 (2.9209) grad_norm 3.4766 (3.3153) [2022-01-26 23:48:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][170/1251] eta 0:41:57 lr 0.000013 time 1.8813 (2.3291) loss 2.5495 (2.9250) grad_norm 3.6801 (3.3132) [2022-01-26 23:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][180/1251] eta 0:41:37 lr 0.000013 time 3.3992 (2.3321) loss 2.4780 (2.9207) grad_norm 3.1909 (3.3100) [2022-01-26 23:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][190/1251] eta 0:41:01 lr 0.000013 time 2.4505 (2.3196) loss 3.0884 (2.9215) grad_norm 3.3811 (3.3135) [2022-01-26 23:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][200/1251] eta 0:40:22 lr 0.000013 time 2.1606 (2.3049) loss 2.1656 (2.9115) grad_norm 3.2131 (3.3010) [2022-01-26 23:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][210/1251] eta 0:39:47 lr 0.000013 time 1.9479 (2.2937) loss 2.2547 (2.9178) grad_norm 3.0305 (3.2863) [2022-01-26 23:50:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][220/1251] eta 0:39:21 lr 0.000013 time 1.7635 (2.2907) loss 2.6245 (2.9318) grad_norm 2.9516 (3.2835) [2022-01-26 23:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][230/1251] eta 0:39:00 lr 0.000013 time 2.1343 (2.2922) loss 3.1193 
(2.9414) grad_norm 3.1425 (3.2793) [2022-01-26 23:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][240/1251] eta 0:38:33 lr 0.000013 time 2.4059 (2.2887) loss 2.8371 (2.9438) grad_norm 3.1385 (3.2821) [2022-01-26 23:51:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][250/1251] eta 0:38:03 lr 0.000013 time 1.8989 (2.2814) loss 2.4324 (2.9396) grad_norm 3.2473 (3.2759) [2022-01-26 23:51:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][260/1251] eta 0:37:32 lr 0.000013 time 1.7965 (2.2733) loss 2.3463 (2.9395) grad_norm 3.2189 (3.2701) [2022-01-26 23:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][270/1251] eta 0:37:03 lr 0.000013 time 1.8541 (2.2663) loss 2.9572 (2.9337) grad_norm 3.0672 (3.2769) [2022-01-26 23:52:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][280/1251] eta 0:36:35 lr 0.000013 time 1.7882 (2.2612) loss 3.3526 (2.9350) grad_norm 3.1017 (3.2784) [2022-01-26 23:52:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][290/1251] eta 0:36:11 lr 0.000013 time 2.5978 (2.2601) loss 3.4101 (2.9419) grad_norm 3.1553 (3.2756) [2022-01-26 23:53:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][300/1251] eta 0:35:46 lr 0.000013 time 1.5904 (2.2573) loss 3.4157 (2.9440) grad_norm 2.6826 (3.2732) [2022-01-26 23:53:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][310/1251] eta 0:35:22 lr 0.000013 time 2.6089 (2.2558) loss 3.2654 (2.9400) grad_norm 3.7587 (3.2658) [2022-01-26 23:53:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][320/1251] eta 0:34:57 lr 0.000013 time 1.8354 (2.2528) loss 3.4477 (2.9430) grad_norm 3.8919 (3.2706) [2022-01-26 23:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][330/1251] eta 0:34:34 lr 0.000013 time 1.9065 (2.2523) loss 3.1339 (2.9435) grad_norm 3.0635 (3.2691) [2022-01-26 23:54:27 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [289/300][340/1251] eta 0:34:07 lr 0.000013 time 1.9417 (2.2477) loss 3.0221 (2.9441) grad_norm 3.4044 (3.2715) [2022-01-26 23:54:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][350/1251] eta 0:33:42 lr 0.000013 time 2.7838 (2.2444) loss 3.0239 (2.9457) grad_norm 2.8330 (3.2655) [2022-01-26 23:55:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][360/1251] eta 0:33:15 lr 0.000013 time 1.9184 (2.2400) loss 2.2543 (2.9486) grad_norm 3.4190 (3.2676) [2022-01-26 23:55:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][370/1251] eta 0:32:52 lr 0.000013 time 1.6026 (2.2391) loss 3.5225 (2.9511) grad_norm 4.7315 (3.2768) [2022-01-26 23:55:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][380/1251] eta 0:32:28 lr 0.000013 time 2.0578 (2.2368) loss 2.9984 (2.9442) grad_norm 3.7329 (3.2793) [2022-01-26 23:56:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][390/1251] eta 0:32:06 lr 0.000013 time 2.2730 (2.2379) loss 3.9036 (2.9504) grad_norm 3.5333 (3.2812) [2022-01-26 23:56:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][400/1251] eta 0:31:47 lr 0.000013 time 1.7295 (2.2416) loss 3.2330 (2.9492) grad_norm 3.0227 (3.2815) [2022-01-26 23:57:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][410/1251] eta 0:31:28 lr 0.000013 time 1.7577 (2.2450) loss 3.2826 (2.9473) grad_norm 3.3081 (3.2832) [2022-01-26 23:57:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][420/1251] eta 0:31:02 lr 0.000013 time 1.6162 (2.2408) loss 2.0560 (2.9416) grad_norm 3.0812 (3.2801) [2022-01-26 23:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][430/1251] eta 0:30:38 lr 0.000013 time 2.1934 (2.2393) loss 2.8269 (2.9452) grad_norm 3.2115 (3.2817) [2022-01-26 23:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][440/1251] eta 0:30:11 lr 0.000013 time 1.6976 (2.2338) loss 2.8896 
(2.9445) grad_norm 3.2532 (3.2823) [2022-01-26 23:58:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][450/1251] eta 0:29:46 lr 0.000013 time 1.8097 (2.2304) loss 2.2997 (2.9444) grad_norm 3.4452 (3.2818) [2022-01-26 23:58:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][460/1251] eta 0:29:25 lr 0.000013 time 2.1947 (2.2315) loss 3.6025 (2.9442) grad_norm 3.8731 (3.2789) [2022-01-26 23:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][470/1251] eta 0:29:01 lr 0.000013 time 2.4205 (2.2301) loss 2.8552 (2.9475) grad_norm 3.1179 (3.2798) [2022-01-26 23:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][480/1251] eta 0:28:37 lr 0.000013 time 1.8921 (2.2277) loss 3.5833 (2.9495) grad_norm 3.4786 (3.2781) [2022-01-26 23:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][490/1251] eta 0:28:14 lr 0.000013 time 2.0803 (2.2264) loss 2.1004 (2.9434) grad_norm 3.4352 (3.2770) [2022-01-27 00:00:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][500/1251] eta 0:27:53 lr 0.000013 time 2.4564 (2.2286) loss 3.1737 (2.9482) grad_norm 3.6717 (3.2768) [2022-01-27 00:00:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][510/1251] eta 0:27:33 lr 0.000013 time 3.4023 (2.2316) loss 2.0964 (2.9474) grad_norm 2.9545 (3.2738) [2022-01-27 00:01:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][520/1251] eta 0:27:12 lr 0.000013 time 1.5060 (2.2334) loss 3.0595 (2.9514) grad_norm 2.9523 (3.2745) [2022-01-27 00:01:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][530/1251] eta 0:26:49 lr 0.000013 time 1.9220 (2.2324) loss 3.1394 (2.9526) grad_norm 3.2882 (3.2781) [2022-01-27 00:01:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][540/1251] eta 0:26:24 lr 0.000013 time 1.7392 (2.2289) loss 3.3540 (2.9557) grad_norm 3.5920 (3.2785) [2022-01-27 00:02:08 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [289/300][550/1251] eta 0:26:00 lr 0.000013 time 1.8894 (2.2264) loss 3.2622 (2.9581) grad_norm 3.4919 (3.2788) [2022-01-27 00:02:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][560/1251] eta 0:25:37 lr 0.000013 time 1.9872 (2.2250) loss 3.2510 (2.9581) grad_norm 4.0691 (3.2789) [2022-01-27 00:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][570/1251] eta 0:25:13 lr 0.000013 time 1.9224 (2.2218) loss 2.7926 (2.9557) grad_norm 4.2299 (3.2812) [2022-01-27 00:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][580/1251] eta 0:24:50 lr 0.000013 time 1.8170 (2.2206) loss 2.4947 (2.9591) grad_norm 3.2033 (3.2820) [2022-01-27 00:03:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][590/1251] eta 0:24:26 lr 0.000013 time 1.8960 (2.2186) loss 3.5375 (2.9604) grad_norm 3.6499 (3.2820) [2022-01-27 00:03:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][600/1251] eta 0:24:05 lr 0.000013 time 2.0406 (2.2197) loss 3.2126 (2.9594) grad_norm 2.8086 (3.2798) [2022-01-27 00:04:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][610/1251] eta 0:23:43 lr 0.000013 time 2.7263 (2.2209) loss 3.5559 (2.9622) grad_norm 3.6673 (3.2806) [2022-01-27 00:04:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][620/1251] eta 0:23:20 lr 0.000013 time 2.2106 (2.2192) loss 3.2546 (2.9623) grad_norm 3.6535 (3.2812) [2022-01-27 00:05:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][630/1251] eta 0:22:58 lr 0.000013 time 2.2327 (2.2192) loss 3.5591 (2.9588) grad_norm 3.3688 (3.2800) [2022-01-27 00:05:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][640/1251] eta 0:22:34 lr 0.000013 time 1.5681 (2.2165) loss 2.8184 (2.9576) grad_norm 3.6991 (3.2825) [2022-01-27 00:05:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][650/1251] eta 0:22:13 lr 0.000013 time 2.7486 (2.2182) loss 3.1517 
(2.9563) grad_norm 3.4293 (3.2815) [2022-01-27 00:06:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][660/1251] eta 0:21:50 lr 0.000013 time 2.1636 (2.2174) loss 3.2140 (2.9578) grad_norm 3.1645 (3.2837) [2022-01-27 00:06:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][670/1251] eta 0:21:29 lr 0.000013 time 2.0714 (2.2188) loss 3.0169 (2.9579) grad_norm 2.8888 (3.2835) [2022-01-27 00:06:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][680/1251] eta 0:21:07 lr 0.000013 time 1.8975 (2.2198) loss 3.2255 (2.9562) grad_norm 3.7114 (3.2885) [2022-01-27 00:07:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][690/1251] eta 0:20:45 lr 0.000013 time 2.5084 (2.2202) loss 2.5652 (2.9551) grad_norm 3.1332 (3.2886) [2022-01-27 00:07:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][700/1251] eta 0:20:22 lr 0.000013 time 1.5826 (2.2191) loss 3.7356 (2.9567) grad_norm 3.1988 (3.2866) [2022-01-27 00:07:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][710/1251] eta 0:19:59 lr 0.000013 time 1.8154 (2.2163) loss 2.6546 (2.9547) grad_norm 2.8283 (3.2852) [2022-01-27 00:08:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][720/1251] eta 0:19:34 lr 0.000013 time 1.9264 (2.2120) loss 3.1352 (2.9545) grad_norm 3.1333 (3.2833) [2022-01-27 00:08:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][730/1251] eta 0:19:10 lr 0.000013 time 1.7221 (2.2083) loss 3.0000 (2.9575) grad_norm 2.9829 (3.2829) [2022-01-27 00:08:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][740/1251] eta 0:18:47 lr 0.000013 time 2.0093 (2.2066) loss 3.2018 (2.9555) grad_norm 2.8173 (3.2813) [2022-01-27 00:09:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][750/1251] eta 0:18:25 lr 0.000013 time 2.1702 (2.2064) loss 3.4390 (2.9578) grad_norm 3.2325 (3.2814) [2022-01-27 00:09:39 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [289/300][760/1251] eta 0:18:02 lr 0.000013 time 1.4595 (2.2055) loss 2.2417 (2.9602) grad_norm 3.4255 (3.2820) [2022-01-27 00:10:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][770/1251] eta 0:17:40 lr 0.000013 time 1.9955 (2.2047) loss 3.0836 (2.9573) grad_norm 3.4943 (3.2823) [2022-01-27 00:10:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][780/1251] eta 0:17:18 lr 0.000013 time 2.7762 (2.2054) loss 3.1839 (2.9602) grad_norm 2.8293 (3.2827) [2022-01-27 00:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][790/1251] eta 0:16:57 lr 0.000013 time 2.0731 (2.2067) loss 1.8622 (2.9586) grad_norm 2.8703 (3.2825) [2022-01-27 00:11:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][800/1251] eta 0:16:35 lr 0.000013 time 1.9089 (2.2074) loss 3.3829 (2.9601) grad_norm 2.9359 (3.2820) [2022-01-27 00:11:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][810/1251] eta 0:16:13 lr 0.000013 time 2.5386 (2.2086) loss 3.5228 (2.9627) grad_norm 3.4563 (3.2802) [2022-01-27 00:11:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][820/1251] eta 0:15:52 lr 0.000013 time 2.5561 (2.2096) loss 3.1703 (2.9651) grad_norm 3.8098 (3.2801) [2022-01-27 00:12:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][830/1251] eta 0:15:30 lr 0.000013 time 2.1588 (2.2095) loss 2.7400 (2.9638) grad_norm 3.0133 (3.2794) [2022-01-27 00:12:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][840/1251] eta 0:15:08 lr 0.000013 time 2.3878 (2.2099) loss 1.9477 (2.9618) grad_norm 2.5566 (3.2782) [2022-01-27 00:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][850/1251] eta 0:14:46 lr 0.000013 time 1.9335 (2.2118) loss 2.9078 (2.9636) grad_norm 4.7209 (3.2810) [2022-01-27 00:13:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][860/1251] eta 0:14:25 lr 0.000013 time 3.0392 (2.2136) loss 3.1279 
(2.9668) grad_norm 3.2657 (3.2802) [2022-01-27 00:13:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][870/1251] eta 0:14:02 lr 0.000013 time 2.0977 (2.2124) loss 2.7659 (2.9657) grad_norm 3.4213 (3.2784) [2022-01-27 00:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][880/1251] eta 0:13:39 lr 0.000013 time 1.8131 (2.2086) loss 3.2248 (2.9677) grad_norm 2.8497 (3.2784) [2022-01-27 00:14:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][890/1251] eta 0:13:16 lr 0.000013 time 1.9061 (2.2066) loss 3.6139 (2.9694) grad_norm 3.1774 (3.2820) [2022-01-27 00:14:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][900/1251] eta 0:12:54 lr 0.000013 time 2.3703 (2.2051) loss 2.3319 (2.9669) grad_norm 2.6517 (3.2793) [2022-01-27 00:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][910/1251] eta 0:12:32 lr 0.000013 time 3.0604 (2.2053) loss 2.5609 (2.9640) grad_norm 3.0570 (3.2775) [2022-01-27 00:15:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][920/1251] eta 0:12:10 lr 0.000013 time 2.1932 (2.2055) loss 3.3070 (2.9648) grad_norm 3.5555 (3.2771) [2022-01-27 00:15:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][930/1251] eta 0:11:48 lr 0.000013 time 2.4702 (2.2076) loss 3.5984 (2.9685) grad_norm 3.2975 (3.2766) [2022-01-27 00:16:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][940/1251] eta 0:11:26 lr 0.000013 time 1.8413 (2.2075) loss 3.2539 (2.9690) grad_norm 3.4933 (3.2782) [2022-01-27 00:16:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][950/1251] eta 0:11:05 lr 0.000013 time 3.7248 (2.2096) loss 2.9389 (2.9699) grad_norm 2.6451 (3.2773) [2022-01-27 00:17:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][960/1251] eta 0:10:42 lr 0.000013 time 1.7348 (2.2083) loss 2.7984 (2.9686) grad_norm 3.7093 (3.2752) [2022-01-27 00:17:24 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [289/300][970/1251] eta 0:10:20 lr 0.000013 time 1.8901 (2.2074) loss 3.1576 (2.9682) grad_norm 3.5230 (3.2757) [2022-01-27 00:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][980/1251] eta 0:09:58 lr 0.000013 time 1.9942 (2.2067) loss 3.0274 (2.9684) grad_norm 3.0463 (3.2739) [2022-01-27 00:18:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][990/1251] eta 0:09:36 lr 0.000013 time 2.4346 (2.2069) loss 2.3236 (2.9691) grad_norm 3.6693 (3.2727) [2022-01-27 00:18:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1000/1251] eta 0:09:13 lr 0.000013 time 2.0113 (2.2047) loss 2.9220 (2.9696) grad_norm 3.0831 (3.2760) [2022-01-27 00:18:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1010/1251] eta 0:08:50 lr 0.000013 time 2.1663 (2.2031) loss 2.9853 (2.9706) grad_norm 2.6913 (3.2748) [2022-01-27 00:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1020/1251] eta 0:08:28 lr 0.000013 time 1.9676 (2.2032) loss 3.0641 (2.9708) grad_norm 3.4319 (3.2769) [2022-01-27 00:19:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1030/1251] eta 0:08:07 lr 0.000013 time 2.3701 (2.2037) loss 3.1476 (2.9703) grad_norm 3.4008 (3.2753) [2022-01-27 00:19:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1040/1251] eta 0:07:45 lr 0.000013 time 2.3860 (2.2046) loss 3.5545 (2.9704) grad_norm 3.4613 (3.2749) [2022-01-27 00:20:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1050/1251] eta 0:07:23 lr 0.000013 time 2.8015 (2.2055) loss 2.8206 (2.9700) grad_norm 3.1635 (3.2746) [2022-01-27 00:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1060/1251] eta 0:07:01 lr 0.000013 time 1.5467 (2.2065) loss 3.1726 (2.9698) grad_norm 3.1088 (3.2760) [2022-01-27 00:21:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1070/1251] eta 0:06:39 lr 0.000013 time 2.4362 (2.2081) loss 
3.3465 (2.9670) grad_norm 3.0004 (3.2762) [2022-01-27 00:21:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1080/1251] eta 0:06:17 lr 0.000013 time 1.6782 (2.2065) loss 3.6873 (2.9685) grad_norm 4.8611 (3.2787) [2022-01-27 00:21:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1090/1251] eta 0:05:54 lr 0.000013 time 1.9247 (2.2039) loss 3.2325 (2.9696) grad_norm 3.2523 (3.2790) [2022-01-27 00:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1100/1251] eta 0:05:32 lr 0.000013 time 1.8613 (2.2013) loss 2.8059 (2.9697) grad_norm 3.8744 (3.2782) [2022-01-27 00:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1110/1251] eta 0:05:10 lr 0.000013 time 2.5885 (2.2017) loss 3.3069 (2.9697) grad_norm 4.2096 (3.2819) [2022-01-27 00:22:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1120/1251] eta 0:04:48 lr 0.000013 time 1.9341 (2.2006) loss 3.4095 (2.9694) grad_norm 3.7759 (3.2840) [2022-01-27 00:23:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1130/1251] eta 0:04:26 lr 0.000013 time 1.9344 (2.2004) loss 3.5068 (2.9703) grad_norm 3.6388 (3.2836) [2022-01-27 00:23:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1140/1251] eta 0:04:04 lr 0.000013 time 1.7642 (2.2006) loss 3.7875 (2.9700) grad_norm 3.2351 (3.2825) [2022-01-27 00:23:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1150/1251] eta 0:03:42 lr 0.000013 time 3.0524 (2.2027) loss 2.0141 (2.9686) grad_norm 3.1639 (3.2829) [2022-01-27 00:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1160/1251] eta 0:03:20 lr 0.000013 time 3.2729 (2.2054) loss 3.4635 (2.9699) grad_norm 3.4382 (3.2830) [2022-01-27 00:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1170/1251] eta 0:02:58 lr 0.000013 time 2.0760 (2.2059) loss 3.0069 (2.9694) grad_norm 3.0787 (3.2830) [2022-01-27 00:25:05 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1180/1251] eta 0:02:36 lr 0.000013 time 1.5699 (2.2047) loss 2.8213 (2.9693) grad_norm 2.6017 (3.2813) [2022-01-27 00:25:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1190/1251] eta 0:02:14 lr 0.000013 time 3.7565 (2.2044) loss 3.3561 (2.9706) grad_norm 2.8593 (3.2818) [2022-01-27 00:25:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1200/1251] eta 0:01:52 lr 0.000013 time 2.7643 (2.2050) loss 3.5890 (2.9716) grad_norm 3.3966 (3.2813) [2022-01-27 00:26:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1210/1251] eta 0:01:30 lr 0.000013 time 1.7925 (2.2035) loss 3.2927 (2.9706) grad_norm 3.4657 (3.2822) [2022-01-27 00:26:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1220/1251] eta 0:01:08 lr 0.000013 time 1.9873 (2.2018) loss 2.8017 (2.9705) grad_norm 3.1469 (3.2822) [2022-01-27 00:26:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1230/1251] eta 0:00:46 lr 0.000013 time 2.9786 (2.2025) loss 2.9716 (2.9707) grad_norm 2.6594 (3.2802) [2022-01-27 00:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1240/1251] eta 0:00:24 lr 0.000013 time 1.9418 (2.2019) loss 3.3323 (2.9700) grad_norm 3.0583 (3.2803) [2022-01-27 00:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [289/300][1250/1251] eta 0:00:02 lr 0.000013 time 1.1896 (2.1962) loss 3.5848 (2.9704) grad_norm 3.4274 (3.2801) [2022-01-27 00:27:29 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 289 training takes 0:45:48 [2022-01-27 00:27:47 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.446 (18.446) Loss 0.8076 (0.8076) Acc@1 79.980 (79.980) Acc@5 95.703 (95.703) [2022-01-27 00:28:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.617 (3.481) Loss 0.7912 (0.7929) Acc@1 82.422 (81.490) Acc@5 94.922 (95.801) [2022-01-27 00:28:24 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.609 (2.636) Loss 0.8267 (0.8027) Acc@1 81.250 (81.310) Acc@5 95.410 (95.573) [2022-01-27 00:28:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.556 (2.244) Loss 0.7450 (0.8062) Acc@1 82.520 (81.272) Acc@5 95.508 (95.483) [2022-01-27 00:29:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 4.151 (2.210) Loss 0.7639 (0.8144) Acc@1 82.715 (81.031) Acc@5 95.996 (95.436) [2022-01-27 00:29:07 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.120 Acc@5 95.438 [2022-01-27 00:29:07 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-27 00:29:07 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.14% [2022-01-27 00:29:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][0/1251] eta 7:29:58 lr 0.000013 time 21.5819 (21.5819) loss 2.8772 (2.8772) grad_norm 3.5191 (3.5191) [2022-01-27 00:29:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][10/1251] eta 1:24:24 lr 0.000013 time 1.7598 (4.0808) loss 3.1223 (3.0184) grad_norm 3.1954 (3.5617) [2022-01-27 00:30:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][20/1251] eta 1:05:56 lr 0.000013 time 1.7850 (3.2138) loss 2.4755 (2.9764) grad_norm 3.0638 (3.4252) [2022-01-27 00:30:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][30/1251] eta 0:59:44 lr 0.000013 time 1.8864 (2.9360) loss 2.3443 (2.9749) grad_norm 3.1273 (3.3706) [2022-01-27 00:31:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][40/1251] eta 0:55:53 lr 0.000013 time 3.4143 (2.7690) loss 3.2203 (2.9718) grad_norm 2.8046 (3.3606) [2022-01-27 00:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][50/1251] eta 0:53:10 lr 0.000013 time 1.5638 (2.6564) loss 2.4615 (2.9480) grad_norm 2.9846 (3.3093) [2022-01-27 00:31:44 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [290/300][60/1251] eta 0:51:14 lr 0.000013 time 1.5053 (2.5813) loss 2.6444 (2.9140) grad_norm 4.3990 (3.3033) [2022-01-27 00:32:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][70/1251] eta 0:49:30 lr 0.000013 time 1.7788 (2.5151) loss 3.0180 (2.8853) grad_norm 2.4781 (3.2692) [2022-01-27 00:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][80/1251] eta 0:48:34 lr 0.000013 time 3.5968 (2.4888) loss 3.4362 (2.9051) grad_norm 3.1447 (3.2786) [2022-01-27 00:32:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][90/1251] eta 0:47:16 lr 0.000013 time 1.5403 (2.4431) loss 2.1557 (2.9231) grad_norm 3.0199 (3.2774) [2022-01-27 00:33:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][100/1251] eta 0:46:13 lr 0.000013 time 2.3187 (2.4094) loss 2.3253 (2.9266) grad_norm 3.1371 (3.2786) [2022-01-27 00:33:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][110/1251] eta 0:45:14 lr 0.000013 time 2.2481 (2.3791) loss 2.1417 (2.9337) grad_norm 3.4450 (3.2874) [2022-01-27 00:33:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][120/1251] eta 0:44:42 lr 0.000013 time 3.1191 (2.3722) loss 3.3475 (2.9523) grad_norm 2.7359 (3.2758) [2022-01-27 00:34:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][130/1251] eta 0:44:08 lr 0.000013 time 2.1602 (2.3630) loss 3.5256 (2.9457) grad_norm 2.9733 (3.2796) [2022-01-27 00:34:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][140/1251] eta 0:43:32 lr 0.000013 time 2.0084 (2.3511) loss 3.5011 (2.9562) grad_norm 3.2465 (3.2755) [2022-01-27 00:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][150/1251] eta 0:43:05 lr 0.000013 time 2.4367 (2.3481) loss 3.5620 (2.9591) grad_norm 2.8318 (3.2803) [2022-01-27 00:35:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][160/1251] eta 0:42:46 lr 0.000013 time 3.1228 (2.3528) loss 2.7937 (2.9622) 
grad_norm 3.0558 (3.2832) [2022-01-27 00:35:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][170/1251] eta 0:42:04 lr 0.000013 time 1.8546 (2.3356) loss 3.0436 (2.9765) grad_norm 4.2479 (3.2928) [2022-01-27 00:36:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][180/1251] eta 0:41:28 lr 0.000013 time 1.8533 (2.3232) loss 3.3024 (2.9808) grad_norm 2.8141 (3.2822) [2022-01-27 00:36:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][190/1251] eta 0:40:51 lr 0.000013 time 1.9868 (2.3107) loss 3.1300 (2.9780) grad_norm 2.9523 (3.2717) [2022-01-27 00:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][200/1251] eta 0:40:26 lr 0.000013 time 2.2305 (2.3087) loss 2.0250 (2.9750) grad_norm 3.3126 (3.2712) [2022-01-27 00:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][210/1251] eta 0:39:51 lr 0.000013 time 1.9160 (2.2976) loss 3.1488 (2.9829) grad_norm 3.1758 (3.2659) [2022-01-27 00:37:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][220/1251] eta 0:39:20 lr 0.000013 time 1.8737 (2.2895) loss 3.2565 (2.9706) grad_norm 3.2808 (3.2728) [2022-01-27 00:37:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][230/1251] eta 0:38:53 lr 0.000013 time 2.1101 (2.2851) loss 3.1953 (2.9669) grad_norm 3.7361 (3.2776) [2022-01-27 00:38:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][240/1251] eta 0:38:36 lr 0.000013 time 1.9188 (2.2914) loss 2.0283 (2.9641) grad_norm 3.4675 (3.2714) [2022-01-27 00:38:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][250/1251] eta 0:38:13 lr 0.000013 time 2.1539 (2.2913) loss 3.3063 (2.9673) grad_norm 3.1955 (3.2681) [2022-01-27 00:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][260/1251] eta 0:37:41 lr 0.000013 time 1.7645 (2.2821) loss 3.2938 (2.9776) grad_norm 3.3253 (3.2747) [2022-01-27 00:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [290/300][270/1251] eta 0:37:05 lr 0.000013 time 1.6631 (2.2687) loss 2.9192 (2.9764) grad_norm 3.1954 (3.2729) [2022-01-27 00:39:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][280/1251] eta 0:36:40 lr 0.000013 time 2.1748 (2.2657) loss 3.5630 (2.9830) grad_norm 3.3373 (3.2692) [2022-01-27 00:40:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][290/1251] eta 0:36:11 lr 0.000013 time 2.2346 (2.2598) loss 2.6576 (2.9684) grad_norm 3.6698 (3.2700) [2022-01-27 00:40:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][300/1251] eta 0:35:46 lr 0.000013 time 2.1960 (2.2572) loss 2.6347 (2.9634) grad_norm 3.4328 (3.2676) [2022-01-27 00:40:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][310/1251] eta 0:35:17 lr 0.000013 time 1.9124 (2.2506) loss 3.3398 (2.9653) grad_norm 3.6715 (3.2643) [2022-01-27 00:41:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][320/1251] eta 0:34:51 lr 0.000013 time 2.2379 (2.2465) loss 3.6236 (2.9666) grad_norm 4.4460 (3.2633) [2022-01-27 00:41:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][330/1251] eta 0:34:27 lr 0.000013 time 1.8530 (2.2452) loss 3.3122 (2.9712) grad_norm 2.8319 (3.2654) [2022-01-27 00:41:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][340/1251] eta 0:34:06 lr 0.000013 time 2.8289 (2.2465) loss 2.7428 (2.9735) grad_norm 3.2241 (3.2701) [2022-01-27 00:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][350/1251] eta 0:33:45 lr 0.000013 time 2.1665 (2.2477) loss 2.9526 (2.9762) grad_norm 3.4874 (3.2692) [2022-01-27 00:42:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][360/1251] eta 0:33:22 lr 0.000013 time 1.8276 (2.2472) loss 2.8924 (2.9744) grad_norm 4.3950 (3.2680) [2022-01-27 00:43:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][370/1251] eta 0:33:00 lr 0.000013 time 1.9606 (2.2479) loss 3.3347 (2.9708) 
grad_norm 2.8278 (3.2624) [2022-01-27 00:43:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][380/1251] eta 0:32:38 lr 0.000013 time 3.0755 (2.2490) loss 3.4046 (2.9762) grad_norm 3.2993 (3.2653) [2022-01-27 00:43:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][390/1251] eta 0:32:16 lr 0.000013 time 2.4752 (2.2493) loss 3.2582 (2.9790) grad_norm 3.7793 (3.2691) [2022-01-27 00:44:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][400/1251] eta 0:31:51 lr 0.000013 time 1.8947 (2.2465) loss 2.2233 (2.9752) grad_norm 2.7704 (3.2707) [2022-01-27 00:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][410/1251] eta 0:31:24 lr 0.000013 time 1.9994 (2.2412) loss 2.8475 (2.9775) grad_norm 4.0743 (3.2727) [2022-01-27 00:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][420/1251] eta 0:31:00 lr 0.000013 time 2.6101 (2.2391) loss 2.9287 (2.9777) grad_norm 2.8529 (3.2753) [2022-01-27 00:45:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][430/1251] eta 0:30:36 lr 0.000013 time 1.9194 (2.2366) loss 2.9994 (2.9840) grad_norm 2.8651 (3.2786) [2022-01-27 00:45:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][440/1251] eta 0:30:14 lr 0.000013 time 1.5826 (2.2379) loss 3.3027 (2.9861) grad_norm 3.6403 (3.2819) [2022-01-27 00:45:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][450/1251] eta 0:29:51 lr 0.000013 time 1.8974 (2.2365) loss 3.2738 (2.9829) grad_norm 3.9342 (3.2845) [2022-01-27 00:46:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][460/1251] eta 0:29:28 lr 0.000013 time 1.8224 (2.2360) loss 2.0670 (2.9838) grad_norm 3.3959 (3.2844) [2022-01-27 00:46:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][470/1251] eta 0:29:04 lr 0.000013 time 1.5787 (2.2332) loss 2.0000 (2.9851) grad_norm 3.1561 (3.2844) [2022-01-27 00:47:00 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [290/300][480/1251] eta 0:28:39 lr 0.000013 time 1.5961 (2.2306) loss 3.0369 (2.9874) grad_norm 3.0634 (3.2808) [2022-01-27 00:47:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][490/1251] eta 0:28:17 lr 0.000013 time 1.5325 (2.2301) loss 3.4315 (2.9868) grad_norm 3.3916 (3.2772) [2022-01-27 00:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][500/1251] eta 0:27:55 lr 0.000012 time 1.8736 (2.2305) loss 2.9210 (2.9825) grad_norm 3.3133 (3.2759) [2022-01-27 00:48:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][510/1251] eta 0:27:32 lr 0.000012 time 1.5734 (2.2300) loss 1.9970 (2.9811) grad_norm 2.8610 (3.2708) [2022-01-27 00:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][520/1251] eta 0:27:10 lr 0.000012 time 2.0108 (2.2299) loss 1.9949 (2.9786) grad_norm 2.9770 (3.2755) [2022-01-27 00:48:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][530/1251] eta 0:26:45 lr 0.000012 time 1.6314 (2.2271) loss 2.2255 (2.9799) grad_norm 3.0946 (3.2770) [2022-01-27 00:49:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][540/1251] eta 0:26:21 lr 0.000012 time 1.9026 (2.2245) loss 3.3192 (2.9737) grad_norm 2.9705 (3.2846) [2022-01-27 00:49:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][550/1251] eta 0:26:00 lr 0.000012 time 2.0455 (2.2256) loss 2.7689 (2.9757) grad_norm 2.8496 (3.2812) [2022-01-27 00:49:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][560/1251] eta 0:25:38 lr 0.000012 time 1.8934 (2.2269) loss 2.8948 (2.9780) grad_norm 3.3339 (3.2873) [2022-01-27 00:50:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][570/1251] eta 0:25:14 lr 0.000012 time 1.9256 (2.2241) loss 3.2404 (2.9794) grad_norm 4.1986 (3.2915) [2022-01-27 00:50:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][580/1251] eta 0:24:49 lr 0.000012 time 1.7666 (2.2201) loss 2.3860 (2.9788) 
grad_norm 2.9504 (3.2918) [2022-01-27 00:51:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][590/1251] eta 0:24:32 lr 0.000012 time 2.5037 (2.2281) loss 2.3170 (2.9781) grad_norm 3.3715 (3.2901) [2022-01-27 00:51:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][600/1251] eta 0:24:11 lr 0.000012 time 1.8952 (2.2298) loss 3.4599 (2.9801) grad_norm 3.1826 (3.2905) [2022-01-27 00:51:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][610/1251] eta 0:23:50 lr 0.000012 time 2.2126 (2.2313) loss 2.2573 (2.9749) grad_norm 3.4875 (3.2885) [2022-01-27 00:52:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][620/1251] eta 0:23:27 lr 0.000012 time 2.2624 (2.2300) loss 2.7975 (2.9771) grad_norm 3.3202 (3.2919) [2022-01-27 00:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][630/1251] eta 0:23:01 lr 0.000012 time 1.8261 (2.2246) loss 2.3392 (2.9772) grad_norm 2.9067 (3.2894) [2022-01-27 00:52:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][640/1251] eta 0:22:37 lr 0.000012 time 2.6366 (2.2214) loss 3.2552 (2.9783) grad_norm 2.7951 (3.2883) [2022-01-27 00:53:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][650/1251] eta 0:22:14 lr 0.000012 time 1.9899 (2.2212) loss 2.9906 (2.9792) grad_norm 3.1831 (3.2877) [2022-01-27 00:53:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][660/1251] eta 0:21:52 lr 0.000012 time 1.8897 (2.2215) loss 3.2999 (2.9754) grad_norm 3.1435 (3.2854) [2022-01-27 00:53:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][670/1251] eta 0:21:31 lr 0.000012 time 1.8756 (2.2235) loss 2.3897 (2.9743) grad_norm 3.0756 (3.2845) [2022-01-27 00:54:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][680/1251] eta 0:21:10 lr 0.000012 time 2.6959 (2.2248) loss 2.1920 (2.9701) grad_norm 3.6734 (3.2829) [2022-01-27 00:54:44 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [290/300][690/1251] eta 0:20:48 lr 0.000012 time 1.9843 (2.2249) loss 3.3178 (2.9720) grad_norm 2.7863 (3.2802) [2022-01-27 00:55:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][700/1251] eta 0:20:24 lr 0.000012 time 1.9239 (2.2219) loss 2.4291 (2.9753) grad_norm 2.7544 (3.2777) [2022-01-27 00:55:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][710/1251] eta 0:20:00 lr 0.000012 time 1.9212 (2.2196) loss 2.5459 (2.9753) grad_norm 2.9263 (3.2766) [2022-01-27 00:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][720/1251] eta 0:19:36 lr 0.000012 time 1.9198 (2.2160) loss 2.7556 (2.9756) grad_norm 3.6773 (3.2805) [2022-01-27 00:56:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][730/1251] eta 0:19:14 lr 0.000012 time 2.2960 (2.2158) loss 3.2819 (2.9703) grad_norm 2.7193 (3.2788) [2022-01-27 00:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][740/1251] eta 0:18:52 lr 0.000012 time 2.1469 (2.2153) loss 2.6756 (2.9702) grad_norm 3.3452 (3.2808) [2022-01-27 00:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][750/1251] eta 0:18:29 lr 0.000012 time 1.9376 (2.2154) loss 2.0106 (2.9696) grad_norm 3.3286 (3.2843) [2022-01-27 00:57:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][760/1251] eta 0:18:07 lr 0.000012 time 1.8632 (2.2155) loss 3.2219 (2.9700) grad_norm 3.5656 (3.2841) [2022-01-27 00:57:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][770/1251] eta 0:17:47 lr 0.000012 time 2.3932 (2.2191) loss 2.2925 (2.9678) grad_norm 4.2579 (3.2836) [2022-01-27 00:58:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][780/1251] eta 0:17:25 lr 0.000012 time 2.1060 (2.2190) loss 3.2234 (2.9686) grad_norm 3.1361 (3.2851) [2022-01-27 00:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][790/1251] eta 0:17:02 lr 0.000012 time 2.1994 (2.2176) loss 2.9033 (2.9669) 
grad_norm 2.7400 (3.2847) [2022-01-27 00:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][800/1251] eta 0:16:39 lr 0.000012 time 1.9776 (2.2152) loss 3.1693 (2.9693) grad_norm 2.8980 (3.2827) [2022-01-27 00:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][810/1251] eta 0:16:15 lr 0.000012 time 1.9696 (2.2122) loss 3.4755 (2.9717) grad_norm 3.1519 (3.2810) [2022-01-27 00:59:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][820/1251] eta 0:15:52 lr 0.000012 time 1.9269 (2.2097) loss 2.7787 (2.9710) grad_norm 3.7016 (3.2878) [2022-01-27 00:59:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][830/1251] eta 0:15:29 lr 0.000012 time 1.9254 (2.2090) loss 3.2807 (2.9726) grad_norm 2.9538 (3.2867) [2022-01-27 01:00:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][840/1251] eta 0:15:07 lr 0.000012 time 1.9435 (2.2081) loss 2.8886 (2.9728) grad_norm 3.0710 (3.2847) [2022-01-27 01:00:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][850/1251] eta 0:14:45 lr 0.000012 time 1.7679 (2.2075) loss 3.1469 (2.9735) grad_norm 3.5392 (3.2845) [2022-01-27 01:00:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][860/1251] eta 0:14:24 lr 0.000012 time 1.9391 (2.2108) loss 2.6307 (2.9739) grad_norm 3.2848 (3.2851) [2022-01-27 01:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][870/1251] eta 0:14:03 lr 0.000012 time 3.0864 (2.2131) loss 3.3904 (2.9739) grad_norm 3.3330 (3.2833) [2022-01-27 01:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][880/1251] eta 0:13:41 lr 0.000012 time 1.8256 (2.2154) loss 2.9587 (2.9719) grad_norm 3.1854 (3.2834) [2022-01-27 01:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][890/1251] eta 0:13:19 lr 0.000012 time 2.2476 (2.2149) loss 1.8656 (2.9699) grad_norm 2.9788 (3.2831) [2022-01-27 01:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [290/300][900/1251] eta 0:12:56 lr 0.000012 time 1.8516 (2.2121) loss 3.5303 (2.9707) grad_norm 5.0108 (3.2836) [2022-01-27 01:02:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][910/1251] eta 0:12:33 lr 0.000012 time 2.1495 (2.2108) loss 2.3793 (2.9712) grad_norm 2.8808 (3.2822) [2022-01-27 01:03:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][920/1251] eta 0:12:11 lr 0.000012 time 1.9359 (2.2095) loss 3.5837 (2.9727) grad_norm 3.1937 (3.2815) [2022-01-27 01:03:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][930/1251] eta 0:11:49 lr 0.000012 time 1.9033 (2.2089) loss 1.9208 (2.9736) grad_norm 3.7465 (3.2818) [2022-01-27 01:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][940/1251] eta 0:11:27 lr 0.000012 time 1.8889 (2.2100) loss 2.9621 (2.9734) grad_norm 3.3211 (3.2821) [2022-01-27 01:04:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][950/1251] eta 0:11:05 lr 0.000012 time 1.9065 (2.2096) loss 2.0001 (2.9726) grad_norm 3.1867 (3.2827) [2022-01-27 01:04:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][960/1251] eta 0:10:43 lr 0.000012 time 1.7721 (2.2119) loss 2.4660 (2.9725) grad_norm 3.1049 (3.2807) [2022-01-27 01:04:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][970/1251] eta 0:10:21 lr 0.000012 time 2.2183 (2.2115) loss 3.1732 (2.9735) grad_norm 3.3358 (3.2794) [2022-01-27 01:05:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][980/1251] eta 0:09:59 lr 0.000012 time 2.2776 (2.2108) loss 3.2208 (2.9746) grad_norm 3.0699 (3.2797) [2022-01-27 01:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][990/1251] eta 0:09:36 lr 0.000012 time 1.6001 (2.2099) loss 2.9889 (2.9723) grad_norm 3.4490 (3.2799) [2022-01-27 01:05:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1000/1251] eta 0:09:14 lr 0.000012 time 1.6502 (2.2090) loss 3.3723 (2.9701) 
grad_norm 3.0917 (3.2798) [2022-01-27 01:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1010/1251] eta 0:08:52 lr 0.000012 time 2.2659 (2.2089) loss 3.3685 (2.9703) grad_norm 5.0911 (3.2820) [2022-01-27 01:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1020/1251] eta 0:08:30 lr 0.000012 time 1.9546 (2.2082) loss 1.9527 (2.9684) grad_norm 3.0746 (3.2820) [2022-01-27 01:07:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1030/1251] eta 0:08:07 lr 0.000012 time 1.9088 (2.2063) loss 2.0936 (2.9684) grad_norm 3.1174 (3.2816) [2022-01-27 01:07:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1040/1251] eta 0:07:45 lr 0.000012 time 2.1101 (2.2051) loss 1.9883 (2.9685) grad_norm 5.1182 (3.2829) [2022-01-27 01:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1050/1251] eta 0:07:23 lr 0.000012 time 2.2721 (2.2058) loss 3.3164 (2.9693) grad_norm 3.7702 (3.2813) [2022-01-27 01:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1060/1251] eta 0:07:01 lr 0.000012 time 1.7967 (2.2054) loss 3.2188 (2.9686) grad_norm 2.8566 (3.2816) [2022-01-27 01:08:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1070/1251] eta 0:06:38 lr 0.000012 time 2.2574 (2.2044) loss 2.9215 (2.9673) grad_norm 3.0162 (3.2792) [2022-01-27 01:08:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1080/1251] eta 0:06:17 lr 0.000012 time 1.5821 (2.2077) loss 1.9404 (2.9678) grad_norm 3.1437 (3.2799) [2022-01-27 01:09:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1090/1251] eta 0:05:55 lr 0.000012 time 1.8189 (2.2087) loss 3.4613 (2.9677) grad_norm 4.0154 (3.2806) [2022-01-27 01:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1100/1251] eta 0:05:33 lr 0.000012 time 1.8902 (2.2078) loss 2.4093 (2.9660) grad_norm 2.6563 (3.2786) [2022-01-27 01:10:00 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [290/300][1110/1251] eta 0:05:11 lr 0.000012 time 1.9469 (2.2079) loss 3.1161 (2.9657) grad_norm 3.1398 (3.2771) [2022-01-27 01:10:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1120/1251] eta 0:04:49 lr 0.000012 time 1.6501 (2.2100) loss 3.2311 (2.9637) grad_norm 3.0526 (3.2753) [2022-01-27 01:10:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1130/1251] eta 0:04:27 lr 0.000012 time 1.7115 (2.2079) loss 3.1315 (2.9649) grad_norm 3.2768 (3.2751) [2022-01-27 01:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1140/1251] eta 0:04:04 lr 0.000012 time 1.7428 (2.2062) loss 3.0491 (2.9641) grad_norm 3.0116 (3.2742) [2022-01-27 01:11:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1150/1251] eta 0:03:42 lr 0.000012 time 1.8931 (2.2051) loss 2.6164 (2.9653) grad_norm 3.6289 (3.2731) [2022-01-27 01:11:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1160/1251] eta 0:03:20 lr 0.000012 time 2.2554 (2.2058) loss 3.6536 (2.9628) grad_norm 3.7712 (3.2735) [2022-01-27 01:12:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1170/1251] eta 0:02:58 lr 0.000012 time 2.1768 (2.2058) loss 3.4397 (2.9639) grad_norm 2.9938 (3.2740) [2022-01-27 01:12:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1180/1251] eta 0:02:36 lr 0.000012 time 2.4721 (2.2060) loss 3.0637 (2.9655) grad_norm 2.7686 (3.2751) [2022-01-27 01:12:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1190/1251] eta 0:02:14 lr 0.000012 time 1.8766 (2.2049) loss 2.1689 (2.9641) grad_norm 2.7844 (3.2757) [2022-01-27 01:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1200/1251] eta 0:01:52 lr 0.000012 time 1.6794 (2.2053) loss 2.4336 (2.9641) grad_norm 3.0774 (3.2757) [2022-01-27 01:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1210/1251] eta 0:01:30 lr 0.000012 time 1.8762 (2.2046) loss 
2.4541 (2.9655) grad_norm 2.5553 (3.2733) [2022-01-27 01:13:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1220/1251] eta 0:01:08 lr 0.000012 time 1.9505 (2.2041) loss 3.3279 (2.9668) grad_norm 3.0109 (3.2708) [2022-01-27 01:14:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1230/1251] eta 0:00:46 lr 0.000012 time 2.2453 (2.2041) loss 3.2445 (2.9668) grad_norm 3.1655 (3.2699) [2022-01-27 01:14:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1240/1251] eta 0:00:24 lr 0.000012 time 1.2463 (2.2034) loss 2.7795 (2.9676) grad_norm 2.7776 (3.2691) [2022-01-27 01:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [290/300][1250/1251] eta 0:00:02 lr 0.000012 time 1.2887 (2.1978) loss 3.2564 (2.9657) grad_norm 3.6787 (3.2688) [2022-01-27 01:14:56 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 290 training takes 0:45:49 [2022-01-27 01:14:56 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_290 saving...... [2022-01-27 01:15:08 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_290 saved !!! 
[2022-01-27 01:15:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.690 (16.690) Loss 0.8451 (0.8451) Acc@1 79.785 (79.785) Acc@5 95.312 (95.312) [2022-01-27 01:15:41 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.522 (3.078) Loss 0.7920 (0.8105) Acc@1 80.078 (81.161) Acc@5 95.898 (95.419) [2022-01-27 01:15:54 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 1.510 (2.227) Loss 0.8559 (0.8116) Acc@1 79.883 (81.283) Acc@5 94.727 (95.410) [2022-01-27 01:16:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.009 (2.048) Loss 0.8161 (0.8083) Acc@1 81.250 (81.297) Acc@5 95.605 (95.492) [2022-01-27 01:16:29 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.719 (1.995) Loss 0.7978 (0.8111) Acc@1 81.934 (81.271) Acc@5 96.289 (95.460) [2022-01-27 01:16:37 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.238 Acc@5 95.436 [2022-01-27 01:16:37 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-01-27 01:16:37 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.24% [2022-01-27 01:16:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][0/1251] eta 7:18:10 lr 0.000012 time 21.0153 (21.0153) loss 2.8622 (2.8622) grad_norm 3.7472 (3.7472) [2022-01-27 01:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][10/1251] eta 1:24:23 lr 0.000012 time 1.9319 (4.0800) loss 3.3978 (3.0968) grad_norm 3.4704 (3.3263) [2022-01-27 01:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][20/1251] eta 1:05:37 lr 0.000012 time 1.4363 (3.1989) loss 3.0916 (3.0236) grad_norm 4.0449 (3.3326) [2022-01-27 01:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][30/1251] eta 0:58:44 lr 0.000012 time 1.5320 (2.8863) loss 3.2156 (2.9998) grad_norm 3.0731 (3.3884) [2022-01-27 01:18:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[291/300][40/1251] eta 0:55:56 lr 0.000012 time 3.8630 (2.7719) loss 2.2528 (2.9467) grad_norm 2.8265 (3.3532) [2022-01-27 01:18:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][50/1251] eta 0:54:08 lr 0.000012 time 2.5737 (2.7050) loss 2.2410 (2.9052) grad_norm 3.2186 (3.3374) [2022-01-27 01:19:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][60/1251] eta 0:51:57 lr 0.000012 time 2.0172 (2.6172) loss 3.4174 (2.9072) grad_norm 2.8910 (3.2946) [2022-01-27 01:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][70/1251] eta 0:49:39 lr 0.000012 time 1.7615 (2.5226) loss 2.3918 (2.9449) grad_norm 3.1403 (3.2996) [2022-01-27 01:19:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][80/1251] eta 0:48:20 lr 0.000012 time 3.0262 (2.4766) loss 3.2577 (2.9732) grad_norm 3.2991 (3.3319) [2022-01-27 01:20:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][90/1251] eta 0:47:02 lr 0.000012 time 2.3798 (2.4307) loss 2.7312 (2.9496) grad_norm 3.3439 (3.3327) [2022-01-27 01:20:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][100/1251] eta 0:46:23 lr 0.000012 time 3.5244 (2.4181) loss 2.5419 (2.9710) grad_norm 3.0818 (3.3284) [2022-01-27 01:21:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][110/1251] eta 0:45:37 lr 0.000012 time 1.8983 (2.3994) loss 2.7212 (2.9746) grad_norm 2.7761 (3.3068) [2022-01-27 01:21:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][120/1251] eta 0:45:02 lr 0.000012 time 2.7901 (2.3899) loss 3.0991 (2.9640) grad_norm 3.6177 (3.3117) [2022-01-27 01:21:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][130/1251] eta 0:44:17 lr 0.000012 time 3.2319 (2.3706) loss 3.2910 (2.9796) grad_norm 3.5752 (3.3002) [2022-01-27 01:22:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][140/1251] eta 0:43:28 lr 0.000012 time 2.1646 (2.3482) loss 2.2937 (2.9794) grad_norm 3.8549 
(3.2967) [2022-01-27 01:22:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][150/1251] eta 0:42:48 lr 0.000012 time 2.0156 (2.3332) loss 3.4523 (2.9843) grad_norm 4.0058 (3.3261) [2022-01-27 01:22:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][160/1251] eta 0:42:22 lr 0.000012 time 2.9275 (2.3304) loss 3.6178 (2.9725) grad_norm 3.1112 (3.3283) [2022-01-27 01:23:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][170/1251] eta 0:41:57 lr 0.000012 time 3.8042 (2.3285) loss 3.0845 (2.9737) grad_norm 3.7780 (3.3295) [2022-01-27 01:23:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][180/1251] eta 0:41:22 lr 0.000012 time 1.9693 (2.3180) loss 3.2235 (2.9788) grad_norm 3.2860 (3.3210) [2022-01-27 01:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][190/1251] eta 0:40:44 lr 0.000012 time 1.8734 (2.3040) loss 2.9314 (2.9873) grad_norm 3.3892 (3.3179) [2022-01-27 01:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][200/1251] eta 0:40:29 lr 0.000012 time 2.2049 (2.3121) loss 3.1020 (2.9814) grad_norm 3.1898 (3.3128) [2022-01-27 01:24:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][210/1251] eta 0:40:05 lr 0.000012 time 3.3218 (2.3107) loss 3.1825 (2.9937) grad_norm 3.5025 (3.3087) [2022-01-27 01:25:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][220/1251] eta 0:39:27 lr 0.000012 time 1.6176 (2.2959) loss 3.3969 (2.9935) grad_norm 2.9051 (3.2932) [2022-01-27 01:25:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][230/1251] eta 0:38:53 lr 0.000012 time 1.6918 (2.2851) loss 3.0618 (2.9938) grad_norm 3.6156 (3.2997) [2022-01-27 01:25:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][240/1251] eta 0:38:30 lr 0.000012 time 1.9698 (2.2852) loss 3.3518 (2.9994) grad_norm 3.8720 (3.3168) [2022-01-27 01:26:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[291/300][250/1251] eta 0:38:16 lr 0.000012 time 3.8378 (2.2945) loss 2.6569 (3.0067) grad_norm 2.9111 (3.3176) [2022-01-27 01:26:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][260/1251] eta 0:37:52 lr 0.000012 time 1.6376 (2.2932) loss 3.5172 (3.0136) grad_norm 3.4934 (3.3195) [2022-01-27 01:26:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][270/1251] eta 0:37:17 lr 0.000012 time 1.9214 (2.2810) loss 3.4543 (3.0158) grad_norm 2.8736 (3.3131) [2022-01-27 01:27:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][280/1251] eta 0:36:44 lr 0.000012 time 1.9913 (2.2708) loss 3.6688 (3.0120) grad_norm 3.0490 (3.3236) [2022-01-27 01:27:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][290/1251] eta 0:36:19 lr 0.000012 time 2.8133 (2.2680) loss 2.7568 (3.0124) grad_norm 3.1407 (3.3209) [2022-01-27 01:27:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][300/1251] eta 0:35:51 lr 0.000012 time 1.9950 (2.2619) loss 2.2785 (3.0089) grad_norm 3.3681 (3.3212) [2022-01-27 01:28:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][310/1251] eta 0:35:27 lr 0.000012 time 1.9350 (2.2612) loss 2.1905 (3.0063) grad_norm 4.1411 (3.3205) [2022-01-27 01:28:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][320/1251] eta 0:35:07 lr 0.000012 time 2.3848 (2.2633) loss 2.2992 (3.0092) grad_norm 3.1997 (3.3207) [2022-01-27 01:29:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][330/1251] eta 0:34:43 lr 0.000012 time 2.0880 (2.2623) loss 3.1224 (3.0014) grad_norm 2.8306 (3.3272) [2022-01-27 01:29:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][340/1251] eta 0:34:16 lr 0.000012 time 1.9051 (2.2574) loss 2.7196 (2.9943) grad_norm 2.7606 (3.3202) [2022-01-27 01:29:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][350/1251] eta 0:33:49 lr 0.000012 time 1.9041 (2.2521) loss 2.6813 (2.9865) grad_norm 
3.2759 (3.3164) [2022-01-27 01:30:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][360/1251] eta 0:33:22 lr 0.000012 time 2.4803 (2.2477) loss 3.6589 (2.9917) grad_norm 3.2332 (3.3153) [2022-01-27 01:30:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][370/1251] eta 0:33:00 lr 0.000012 time 1.9413 (2.2475) loss 2.9991 (2.9895) grad_norm 3.5061 (3.3149) [2022-01-27 01:30:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][380/1251] eta 0:32:32 lr 0.000012 time 2.1472 (2.2420) loss 2.7472 (2.9910) grad_norm 3.5894 (3.3175) [2022-01-27 01:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][390/1251] eta 0:32:07 lr 0.000012 time 1.8892 (2.2391) loss 3.0337 (2.9902) grad_norm 2.9264 (3.3804) [2022-01-27 01:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][400/1251] eta 0:31:48 lr 0.000012 time 2.7050 (2.2425) loss 3.1155 (2.9892) grad_norm 2.8930 (3.3749) [2022-01-27 01:31:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][410/1251] eta 0:31:20 lr 0.000012 time 1.5077 (2.2357) loss 3.0564 (2.9898) grad_norm 3.3772 (3.3734) [2022-01-27 01:32:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][420/1251] eta 0:30:56 lr 0.000012 time 1.7969 (2.2335) loss 3.1506 (2.9946) grad_norm 2.7143 (3.3706) [2022-01-27 01:32:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][430/1251] eta 0:30:32 lr 0.000012 time 2.2166 (2.2323) loss 3.4145 (2.9906) grad_norm 3.5803 (3.3654) [2022-01-27 01:33:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][440/1251] eta 0:30:11 lr 0.000012 time 4.0528 (2.2335) loss 3.2383 (2.9946) grad_norm 3.4467 (3.3649) [2022-01-27 01:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][450/1251] eta 0:29:48 lr 0.000012 time 1.2486 (2.2330) loss 3.1630 (2.9929) grad_norm 3.2907 (3.3584) [2022-01-27 01:33:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[291/300][460/1251] eta 0:29:27 lr 0.000012 time 1.5427 (2.2348) loss 2.5912 (2.9900) grad_norm 3.8214 (3.3563) [2022-01-27 01:34:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][470/1251] eta 0:29:07 lr 0.000012 time 2.2361 (2.2373) loss 2.8088 (2.9891) grad_norm 2.8868 (3.3522) [2022-01-27 01:34:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][480/1251] eta 0:28:45 lr 0.000012 time 4.0353 (2.2381) loss 2.6048 (2.9890) grad_norm 3.2775 (3.3493) [2022-01-27 01:34:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][490/1251] eta 0:28:19 lr 0.000012 time 1.8113 (2.2337) loss 3.0970 (2.9881) grad_norm 3.1038 (3.3494) [2022-01-27 01:35:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][500/1251] eta 0:27:54 lr 0.000012 time 1.5903 (2.2292) loss 3.3659 (2.9852) grad_norm 3.6355 (3.3516) [2022-01-27 01:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][510/1251] eta 0:27:30 lr 0.000012 time 1.6313 (2.2278) loss 3.1643 (2.9850) grad_norm 2.8462 (3.3508) [2022-01-27 01:35:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][520/1251] eta 0:27:10 lr 0.000012 time 3.4621 (2.2312) loss 2.3447 (2.9789) grad_norm 3.2567 (3.3492) [2022-01-27 01:36:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][530/1251] eta 0:26:48 lr 0.000012 time 1.8653 (2.2314) loss 2.3131 (2.9745) grad_norm 2.9058 (3.3444) [2022-01-27 01:36:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][540/1251] eta 0:26:27 lr 0.000012 time 1.6075 (2.2322) loss 3.0687 (2.9729) grad_norm 2.9047 (3.3435) [2022-01-27 01:37:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][550/1251] eta 0:26:03 lr 0.000012 time 1.4737 (2.2307) loss 3.4570 (2.9759) grad_norm 3.5353 (3.3414) [2022-01-27 01:37:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][560/1251] eta 0:25:40 lr 0.000012 time 2.4847 (2.2295) loss 3.4390 (2.9759) grad_norm 
3.6407 (3.3546) [2022-01-27 01:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][570/1251] eta 0:25:15 lr 0.000012 time 1.6024 (2.2258) loss 2.3555 (2.9715) grad_norm 3.2178 (3.3574) [2022-01-27 01:38:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][580/1251] eta 0:24:51 lr 0.000012 time 2.2009 (2.2228) loss 2.1751 (2.9704) grad_norm 3.8386 (3.3595) [2022-01-27 01:38:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][590/1251] eta 0:24:29 lr 0.000012 time 2.0195 (2.2235) loss 2.1081 (2.9711) grad_norm 4.2141 (3.3607) [2022-01-27 01:38:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][600/1251] eta 0:24:08 lr 0.000012 time 2.4002 (2.2250) loss 2.0199 (2.9710) grad_norm 2.8825 (3.3610) [2022-01-27 01:39:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][610/1251] eta 0:23:45 lr 0.000012 time 1.8398 (2.2237) loss 3.4642 (2.9701) grad_norm 3.5229 (3.3638) [2022-01-27 01:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][620/1251] eta 0:23:21 lr 0.000012 time 1.9341 (2.2212) loss 2.2242 (2.9677) grad_norm 3.0166 (3.3592) [2022-01-27 01:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][630/1251] eta 0:22:59 lr 0.000012 time 1.8869 (2.2207) loss 2.8929 (2.9681) grad_norm 4.9723 (3.3597) [2022-01-27 01:40:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][640/1251] eta 0:22:36 lr 0.000012 time 2.6660 (2.2209) loss 1.9806 (2.9689) grad_norm 3.1961 (3.3574) [2022-01-27 01:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][650/1251] eta 0:22:14 lr 0.000012 time 1.8433 (2.2204) loss 2.6028 (2.9691) grad_norm 3.3302 (3.3559) [2022-01-27 01:41:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][660/1251] eta 0:21:51 lr 0.000012 time 2.5805 (2.2197) loss 3.4334 (2.9719) grad_norm 2.8253 (3.3575) [2022-01-27 01:41:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[291/300][670/1251] eta 0:21:29 lr 0.000012 time 2.2366 (2.2191) loss 3.4986 (2.9711) grad_norm 3.0230 (3.3549) [2022-01-27 01:41:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][680/1251] eta 0:21:05 lr 0.000012 time 2.1902 (2.2170) loss 2.4375 (2.9700) grad_norm 3.0096 (3.3560) [2022-01-27 01:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][690/1251] eta 0:20:43 lr 0.000012 time 2.5123 (2.2157) loss 3.2354 (2.9722) grad_norm 3.1705 (3.3531) [2022-01-27 01:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][700/1251] eta 0:20:22 lr 0.000012 time 2.1186 (2.2180) loss 3.5922 (2.9751) grad_norm 3.3551 (3.3520) [2022-01-27 01:42:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][710/1251] eta 0:19:59 lr 0.000012 time 2.3580 (2.2169) loss 3.4396 (2.9780) grad_norm 3.3721 (3.3507) [2022-01-27 01:43:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][720/1251] eta 0:19:37 lr 0.000012 time 2.3578 (2.2167) loss 2.8016 (2.9764) grad_norm 3.1615 (3.3521) [2022-01-27 01:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][730/1251] eta 0:19:14 lr 0.000012 time 2.7047 (2.2154) loss 3.1494 (2.9782) grad_norm 3.7659 (3.3534) [2022-01-27 01:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][740/1251] eta 0:18:51 lr 0.000012 time 2.0197 (2.2147) loss 2.7870 (2.9771) grad_norm 4.0522 (3.3537) [2022-01-27 01:44:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][750/1251] eta 0:18:29 lr 0.000012 time 2.5338 (2.2148) loss 3.2086 (2.9772) grad_norm 3.3443 (3.3548) [2022-01-27 01:44:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][760/1251] eta 0:18:08 lr 0.000012 time 1.8877 (2.2159) loss 3.4379 (2.9797) grad_norm 3.3976 (3.3546) [2022-01-27 01:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][770/1251] eta 0:17:45 lr 0.000012 time 2.1201 (2.2151) loss 2.2991 (2.9772) grad_norm 
3.0862 (3.3541) [2022-01-27 01:45:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][780/1251] eta 0:17:23 lr 0.000012 time 2.3006 (2.2152) loss 2.9924 (2.9788) grad_norm 2.9262 (3.3523) [2022-01-27 01:45:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][790/1251] eta 0:17:00 lr 0.000012 time 2.2591 (2.2131) loss 3.5693 (2.9782) grad_norm 2.8311 (3.3514) [2022-01-27 01:46:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][800/1251] eta 0:16:37 lr 0.000012 time 1.9242 (2.2111) loss 2.6698 (2.9789) grad_norm 3.0235 (3.3506) [2022-01-27 01:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][810/1251] eta 0:16:14 lr 0.000012 time 2.2300 (2.2099) loss 2.9392 (2.9799) grad_norm 3.2530 (3.3488) [2022-01-27 01:46:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][820/1251] eta 0:15:52 lr 0.000012 time 2.0339 (2.2097) loss 3.2631 (2.9810) grad_norm 2.9259 (3.3487) [2022-01-27 01:47:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][830/1251] eta 0:15:30 lr 0.000012 time 2.6779 (2.2102) loss 2.9858 (2.9818) grad_norm 2.9982 (3.3476) [2022-01-27 01:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][840/1251] eta 0:15:08 lr 0.000012 time 2.1300 (2.2099) loss 2.5103 (2.9811) grad_norm 2.8525 (3.3464) [2022-01-27 01:47:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][850/1251] eta 0:14:45 lr 0.000012 time 2.1552 (2.2091) loss 2.5563 (2.9791) grad_norm 2.8327 (3.3428) [2022-01-27 01:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][860/1251] eta 0:14:24 lr 0.000012 time 2.9506 (2.2105) loss 3.4992 (2.9819) grad_norm 3.4762 (3.3410) [2022-01-27 01:48:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][870/1251] eta 0:14:02 lr 0.000012 time 1.9298 (2.2112) loss 3.2089 (2.9822) grad_norm 3.3170 (3.3381) [2022-01-27 01:49:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[291/300][880/1251] eta 0:13:40 lr 0.000012 time 2.7875 (2.2112) loss 3.3719 (2.9831) grad_norm 3.3619 (3.3379) [2022-01-27 01:49:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][890/1251] eta 0:13:17 lr 0.000012 time 1.9832 (2.2090) loss 3.3695 (2.9825) grad_norm 2.9379 (3.3382) [2022-01-27 01:49:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][900/1251] eta 0:12:54 lr 0.000012 time 1.5525 (2.2069) loss 2.9642 (2.9827) grad_norm 2.9036 (3.3363) [2022-01-27 01:50:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][910/1251] eta 0:12:32 lr 0.000012 time 2.2402 (2.2072) loss 2.9133 (2.9837) grad_norm 2.8696 (3.3343) [2022-01-27 01:50:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][920/1251] eta 0:12:09 lr 0.000012 time 1.9009 (2.2051) loss 2.0598 (2.9837) grad_norm 3.4905 (3.3320) [2022-01-27 01:50:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][930/1251] eta 0:11:47 lr 0.000012 time 2.0926 (2.2045) loss 2.5113 (2.9833) grad_norm 3.3613 (3.3335) [2022-01-27 01:51:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][940/1251] eta 0:11:25 lr 0.000012 time 2.7013 (2.2051) loss 2.1307 (2.9851) grad_norm 3.5115 (3.3370) [2022-01-27 01:51:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][950/1251] eta 0:11:04 lr 0.000012 time 2.7223 (2.2071) loss 3.3231 (2.9868) grad_norm 2.9668 (3.3376) [2022-01-27 01:51:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][960/1251] eta 0:10:42 lr 0.000012 time 2.4873 (2.2069) loss 2.3033 (2.9828) grad_norm 3.1483 (3.3351) [2022-01-27 01:52:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][970/1251] eta 0:10:20 lr 0.000012 time 1.6241 (2.2072) loss 1.8934 (2.9830) grad_norm 3.8283 (3.3343) [2022-01-27 01:52:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][980/1251] eta 0:09:58 lr 0.000012 time 1.8662 (2.2069) loss 3.4449 (2.9852) grad_norm 
3.8387 (3.3326) [2022-01-27 01:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][990/1251] eta 0:09:36 lr 0.000012 time 2.4925 (2.2072) loss 2.7154 (2.9868) grad_norm 2.9816 (3.3325) [2022-01-27 01:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1000/1251] eta 0:09:13 lr 0.000012 time 2.3939 (2.2062) loss 3.1573 (2.9870) grad_norm 3.4433 (3.3299) [2022-01-27 01:53:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1010/1251] eta 0:08:51 lr 0.000012 time 2.0023 (2.2051) loss 3.0409 (2.9879) grad_norm 3.8040 (3.3288) [2022-01-27 01:54:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1020/1251] eta 0:08:29 lr 0.000012 time 1.8884 (2.2039) loss 3.0971 (2.9894) grad_norm 3.1532 (3.3290) [2022-01-27 01:54:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1030/1251] eta 0:08:07 lr 0.000012 time 2.9326 (2.2039) loss 2.2675 (2.9895) grad_norm 2.8120 (3.3262) [2022-01-27 01:54:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1040/1251] eta 0:07:45 lr 0.000012 time 2.4768 (2.2038) loss 2.0992 (2.9908) grad_norm 3.4079 (3.3259) [2022-01-27 01:55:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1050/1251] eta 0:07:23 lr 0.000012 time 2.7954 (2.2057) loss 3.3465 (2.9927) grad_norm 3.3813 (3.3249) [2022-01-27 01:55:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1060/1251] eta 0:07:01 lr 0.000012 time 2.3716 (2.2069) loss 3.9002 (2.9956) grad_norm 3.1434 (3.3231) [2022-01-27 01:56:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1070/1251] eta 0:06:39 lr 0.000012 time 3.3546 (2.2075) loss 2.6310 (2.9952) grad_norm 3.2681 (3.3221) [2022-01-27 01:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1080/1251] eta 0:06:17 lr 0.000012 time 1.8964 (2.2065) loss 3.4030 (2.9954) grad_norm 3.6201 (3.3222) [2022-01-27 01:56:43 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [291/300][1090/1251] eta 0:05:55 lr 0.000012 time 2.4257 (2.2050) loss 3.2673 (2.9939) grad_norm 3.9382 (3.3210) [2022-01-27 01:57:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1100/1251] eta 0:05:32 lr 0.000012 time 2.0994 (2.2031) loss 2.2793 (2.9928) grad_norm 3.2857 (3.3212) [2022-01-27 01:57:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1110/1251] eta 0:05:10 lr 0.000012 time 2.7989 (2.2043) loss 3.2189 (2.9906) grad_norm 3.1252 (3.3192) [2022-01-27 01:57:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1120/1251] eta 0:04:48 lr 0.000012 time 1.8497 (2.2055) loss 2.7158 (2.9917) grad_norm 2.8649 (3.3161) [2022-01-27 01:58:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1130/1251] eta 0:04:26 lr 0.000012 time 2.8276 (2.2057) loss 2.3141 (2.9910) grad_norm 3.4496 (3.3161) [2022-01-27 01:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1140/1251] eta 0:04:04 lr 0.000012 time 1.6535 (2.2054) loss 3.0581 (2.9908) grad_norm 2.9212 (3.3152) [2022-01-27 01:58:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1150/1251] eta 0:03:42 lr 0.000012 time 2.4608 (2.2052) loss 2.9119 (2.9915) grad_norm 3.6234 (3.3144) [2022-01-27 01:59:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1160/1251] eta 0:03:20 lr 0.000012 time 1.8576 (2.2040) loss 1.8640 (2.9909) grad_norm 2.6821 (3.3264) [2022-01-27 01:59:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1170/1251] eta 0:02:58 lr 0.000012 time 2.2725 (2.2033) loss 2.3860 (2.9907) grad_norm 3.1606 (3.3248) [2022-01-27 01:59:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1180/1251] eta 0:02:36 lr 0.000012 time 1.9125 (2.2025) loss 2.9871 (2.9912) grad_norm 2.7447 (3.3242) [2022-01-27 02:00:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1190/1251] eta 0:02:14 lr 0.000012 time 2.8933 (2.2016) loss 1.8596 
(2.9892) grad_norm 3.3739 (3.3249) [2022-01-27 02:00:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1200/1251] eta 0:01:52 lr 0.000012 time 1.7850 (2.2007) loss 3.4628 (2.9908) grad_norm 2.7822 (3.3231) [2022-01-27 02:01:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1210/1251] eta 0:01:30 lr 0.000012 time 2.5346 (2.2010) loss 3.1165 (2.9903) grad_norm 2.9087 (3.3207) [2022-01-27 02:01:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1220/1251] eta 0:01:08 lr 0.000012 time 1.9114 (2.2004) loss 3.2209 (2.9894) grad_norm 3.2744 (3.3191) [2022-01-27 02:01:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1230/1251] eta 0:00:46 lr 0.000012 time 2.8633 (2.2010) loss 2.7783 (2.9888) grad_norm 2.7216 (3.3180) [2022-01-27 02:02:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1240/1251] eta 0:00:24 lr 0.000012 time 1.7483 (2.2000) loss 2.3744 (2.9906) grad_norm 2.6183 (3.3171) [2022-01-27 02:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [291/300][1250/1251] eta 0:00:02 lr 0.000012 time 1.1835 (2.1951) loss 3.2547 (2.9912) grad_norm 2.6405 (3.3167) [2022-01-27 02:02:23 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 291 training takes 0:45:46 [2022-01-27 02:02:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 15.969 (15.969) Loss 0.8317 (0.8317) Acc@1 80.176 (80.176) Acc@5 95.703 (95.703) [2022-01-27 02:03:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.869 (3.309) Loss 0.7940 (0.8092) Acc@1 81.348 (81.170) Acc@5 95.801 (95.570) [2022-01-27 02:03:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.888 (2.586) Loss 0.7736 (0.8088) Acc@1 82.324 (81.231) Acc@5 95.605 (95.415) [2022-01-27 02:03:33 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.658 (2.259) Loss 0.7926 (0.8053) Acc@1 79.980 (81.209) Acc@5 96.191 (95.476) [2022-01-27 02:03:53 
swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.820 (2.173) Loss 0.8107 (0.8067) Acc@1 81.543 (81.133) Acc@5 95.312 (95.491) [2022-01-27 02:04:01 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.216 Acc@5 95.458 [2022-01-27 02:04:01 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-01-27 02:04:01 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.24% [2022-01-27 02:04:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][0/1251] eta 7:30:23 lr 0.000012 time 21.6017 (21.6017) loss 3.4057 (3.4057) grad_norm 3.6475 (3.6475) [2022-01-27 02:04:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][10/1251] eta 1:23:56 lr 0.000012 time 1.8462 (4.0582) loss 3.4375 (3.1014) grad_norm 3.6427 (3.3972) [2022-01-27 02:05:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][20/1251] eta 1:05:24 lr 0.000012 time 1.3429 (3.1883) loss 2.6855 (3.1262) grad_norm 3.1132 (3.4020) [2022-01-27 02:05:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][30/1251] eta 0:57:53 lr 0.000012 time 1.3984 (2.8448) loss 3.2390 (3.0843) grad_norm 3.2004 (3.3670) [2022-01-27 02:05:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][40/1251] eta 0:54:59 lr 0.000012 time 3.7446 (2.7244) loss 2.0072 (3.0239) grad_norm 4.0019 (3.3282) [2022-01-27 02:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][50/1251] eta 0:53:24 lr 0.000012 time 2.3823 (2.6679) loss 3.6760 (3.0006) grad_norm 3.7611 (3.2944) [2022-01-27 02:06:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][60/1251] eta 0:51:49 lr 0.000012 time 2.7688 (2.6107) loss 2.1433 (3.0158) grad_norm 3.1116 (3.2811) [2022-01-27 02:07:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][70/1251] eta 0:50:20 lr 0.000012 time 1.8071 (2.5578) loss 3.1427 (3.0127) grad_norm 3.5337 (3.3058) [2022-01-27 02:07:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][80/1251] eta 0:49:13 lr 0.000012 time 3.4552 (2.5225) loss 3.1053 (3.0090) grad_norm 3.8443 (3.3354) [2022-01-27 02:07:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][90/1251] eta 0:47:42 lr 0.000012 time 1.6397 (2.4658) loss 2.9796 (3.0289) grad_norm 3.2494 (3.3676) [2022-01-27 02:08:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][100/1251] eta 0:46:33 lr 0.000012 time 1.9865 (2.4273) loss 3.3667 (3.0259) grad_norm 3.5031 (3.3519) [2022-01-27 02:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][110/1251] eta 0:45:33 lr 0.000012 time 1.8639 (2.3959) loss 3.2219 (3.0276) grad_norm 3.1479 (3.3511) [2022-01-27 02:08:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][120/1251] eta 0:45:06 lr 0.000012 time 3.4474 (2.3932) loss 3.5197 (3.0365) grad_norm 3.2421 (3.3451) [2022-01-27 02:09:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][130/1251] eta 0:44:27 lr 0.000012 time 2.1818 (2.3798) loss 3.1919 (3.0442) grad_norm 3.2245 (3.3288) [2022-01-27 02:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][140/1251] eta 0:44:01 lr 0.000012 time 2.4608 (2.3778) loss 3.2727 (3.0549) grad_norm 3.1775 (3.3227) [2022-01-27 02:09:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][150/1251] eta 0:43:11 lr 0.000012 time 1.6700 (2.3537) loss 3.1588 (3.0614) grad_norm 2.9492 (3.3145) [2022-01-27 02:10:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][160/1251] eta 0:42:40 lr 0.000012 time 2.8453 (2.3466) loss 2.8128 (3.0571) grad_norm 3.8538 (3.3061) [2022-01-27 02:10:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][170/1251] eta 0:41:56 lr 0.000012 time 1.9487 (2.3280) loss 2.8760 (3.0584) grad_norm 3.4997 (3.3098) [2022-01-27 02:11:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][180/1251] eta 0:41:31 lr 0.000012 
time 2.7036 (2.3259) loss 3.4932 (3.0644) grad_norm 3.3810 (3.3144) [2022-01-27 02:11:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][190/1251] eta 0:40:55 lr 0.000012 time 2.2891 (2.3146) loss 2.0357 (3.0570) grad_norm 3.2939 (3.3055) [2022-01-27 02:11:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][200/1251] eta 0:40:22 lr 0.000012 time 2.5590 (2.3052) loss 3.3273 (3.0565) grad_norm 3.3051 (3.3050) [2022-01-27 02:12:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][210/1251] eta 0:39:52 lr 0.000012 time 2.1890 (2.2983) loss 3.1812 (3.0509) grad_norm 2.8846 (3.3013) [2022-01-27 02:12:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][220/1251] eta 0:39:26 lr 0.000012 time 1.8507 (2.2956) loss 2.7824 (3.0510) grad_norm 3.5136 (3.3031) [2022-01-27 02:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][230/1251] eta 0:39:06 lr 0.000012 time 2.6743 (2.2986) loss 3.2763 (3.0466) grad_norm 3.0258 (3.3028) [2022-01-27 02:13:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][240/1251] eta 0:38:41 lr 0.000012 time 2.8534 (2.2963) loss 3.1557 (3.0514) grad_norm 3.3338 (3.3058) [2022-01-27 02:13:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][250/1251] eta 0:38:12 lr 0.000012 time 1.8643 (2.2901) loss 3.1140 (3.0496) grad_norm 3.1570 (3.3047) [2022-01-27 02:13:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][260/1251] eta 0:37:40 lr 0.000012 time 2.2304 (2.2813) loss 3.4683 (3.0499) grad_norm 3.1457 (3.2991) [2022-01-27 02:14:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][270/1251] eta 0:37:13 lr 0.000012 time 2.1604 (2.2764) loss 2.4053 (3.0452) grad_norm 3.9301 (3.2977) [2022-01-27 02:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][280/1251] eta 0:36:45 lr 0.000012 time 1.8087 (2.2716) loss 3.4635 (3.0438) grad_norm 3.1454 (3.2950) [2022-01-27 02:15:02 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][290/1251] eta 0:36:19 lr 0.000012 time 1.9551 (2.2684) loss 3.1218 (3.0461) grad_norm 3.3180 (3.3147) [2022-01-27 02:15:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][300/1251] eta 0:35:54 lr 0.000012 time 2.5237 (2.2650) loss 3.0433 (3.0464) grad_norm 3.0215 (3.3171) [2022-01-27 02:15:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][310/1251] eta 0:35:29 lr 0.000012 time 1.8474 (2.2625) loss 3.0198 (3.0414) grad_norm 4.1170 (3.3175) [2022-01-27 02:16:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][320/1251] eta 0:35:05 lr 0.000012 time 2.2781 (2.2613) loss 3.1935 (3.0345) grad_norm 3.3872 (3.3198) [2022-01-27 02:16:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][330/1251] eta 0:34:36 lr 0.000012 time 1.5469 (2.2543) loss 3.5743 (3.0298) grad_norm 3.1837 (3.3200) [2022-01-27 02:16:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][340/1251] eta 0:34:11 lr 0.000012 time 1.7232 (2.2515) loss 2.0692 (3.0267) grad_norm 3.0637 (3.3203) [2022-01-27 02:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][350/1251] eta 0:33:49 lr 0.000012 time 2.5939 (2.2523) loss 3.6271 (3.0233) grad_norm 3.4819 (3.3204) [2022-01-27 02:17:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][360/1251] eta 0:33:23 lr 0.000012 time 2.0350 (2.2482) loss 3.4959 (3.0291) grad_norm 3.4309 (3.3170) [2022-01-27 02:17:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][370/1251] eta 0:33:00 lr 0.000012 time 2.4021 (2.2483) loss 3.3536 (3.0327) grad_norm 3.6577 (3.3199) [2022-01-27 02:18:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][380/1251] eta 0:32:38 lr 0.000012 time 1.5044 (2.2487) loss 3.2144 (3.0306) grad_norm 3.3117 (3.3226) [2022-01-27 02:18:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][390/1251] eta 0:32:15 lr 
0.000012 time 2.7069 (2.2478) loss 3.0661 (3.0298) grad_norm 4.2110 (3.3319) [2022-01-27 02:19:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][400/1251] eta 0:31:51 lr 0.000012 time 2.1936 (2.2460) loss 3.5411 (3.0273) grad_norm 3.4078 (3.3283) [2022-01-27 02:19:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][410/1251] eta 0:31:26 lr 0.000012 time 2.5849 (2.2435) loss 2.8010 (3.0276) grad_norm 3.2740 (3.3290) [2022-01-27 02:19:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][420/1251] eta 0:31:01 lr 0.000012 time 2.1396 (2.2399) loss 2.8014 (3.0290) grad_norm 3.0811 (3.3361) [2022-01-27 02:20:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][430/1251] eta 0:30:36 lr 0.000012 time 2.2622 (2.2367) loss 3.2660 (3.0309) grad_norm 3.4722 (3.3352) [2022-01-27 02:20:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][440/1251] eta 0:30:13 lr 0.000012 time 2.2370 (2.2362) loss 3.4549 (3.0356) grad_norm 3.2336 (3.3373) [2022-01-27 02:20:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][450/1251] eta 0:29:48 lr 0.000012 time 1.5935 (2.2325) loss 3.3710 (3.0355) grad_norm 3.6997 (3.3364) [2022-01-27 02:21:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][460/1251] eta 0:29:23 lr 0.000012 time 1.8170 (2.2298) loss 3.0813 (3.0332) grad_norm 3.7551 (3.3380) [2022-01-27 02:21:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][470/1251] eta 0:28:58 lr 0.000012 time 1.8494 (2.2266) loss 3.1881 (3.0380) grad_norm 3.2488 (3.3366) [2022-01-27 02:21:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][480/1251] eta 0:28:37 lr 0.000012 time 2.5922 (2.2282) loss 2.3459 (3.0350) grad_norm 3.9799 (3.3363) [2022-01-27 02:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][490/1251] eta 0:28:13 lr 0.000012 time 1.5221 (2.2257) loss 2.8637 (3.0353) grad_norm 3.1695 (3.3350) [2022-01-27 02:22:37 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][500/1251] eta 0:27:51 lr 0.000012 time 2.6722 (2.2259) loss 3.1105 (3.0355) grad_norm 2.9363 (3.3325) [2022-01-27 02:23:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][510/1251] eta 0:27:33 lr 0.000012 time 2.1842 (2.2308) loss 3.2156 (3.0348) grad_norm 3.1901 (3.3269) [2022-01-27 02:23:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][520/1251] eta 0:27:13 lr 0.000012 time 2.9802 (2.2340) loss 3.0524 (3.0376) grad_norm 3.7751 (3.3288) [2022-01-27 02:23:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][530/1251] eta 0:26:51 lr 0.000012 time 2.1979 (2.2346) loss 3.3344 (3.0360) grad_norm 3.3636 (3.3252) [2022-01-27 02:24:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][540/1251] eta 0:26:27 lr 0.000012 time 1.9568 (2.2325) loss 3.1165 (3.0367) grad_norm 3.1983 (3.3264) [2022-01-27 02:24:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][550/1251] eta 0:26:01 lr 0.000012 time 1.7114 (2.2280) loss 2.9369 (3.0404) grad_norm 2.7602 (3.3244) [2022-01-27 02:24:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][560/1251] eta 0:25:37 lr 0.000012 time 2.2498 (2.2254) loss 3.1973 (3.0365) grad_norm 3.2426 (3.3217) [2022-01-27 02:25:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][570/1251] eta 0:25:15 lr 0.000012 time 2.1831 (2.2247) loss 2.6904 (3.0388) grad_norm 3.7698 (3.3244) [2022-01-27 02:25:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][580/1251] eta 0:24:52 lr 0.000012 time 1.8938 (2.2241) loss 3.2057 (3.0349) grad_norm 3.1756 (3.3262) [2022-01-27 02:25:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][590/1251] eta 0:24:32 lr 0.000012 time 2.5893 (2.2282) loss 2.0520 (3.0335) grad_norm 3.3049 (3.3216) [2022-01-27 02:26:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][600/1251] eta 0:24:09 lr 
0.000012 time 1.7369 (2.2271) loss 3.1371 (3.0334) grad_norm 3.1023 (3.3266) [2022-01-27 02:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][610/1251] eta 0:23:48 lr 0.000012 time 2.5090 (2.2291) loss 2.1116 (3.0290) grad_norm 2.9429 (3.3290) [2022-01-27 02:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][620/1251] eta 0:23:27 lr 0.000012 time 2.5346 (2.2311) loss 2.0688 (3.0250) grad_norm 3.1491 (3.3302) [2022-01-27 02:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][630/1251] eta 0:23:05 lr 0.000012 time 1.8282 (2.2304) loss 3.2566 (3.0256) grad_norm 3.4875 (3.3296) [2022-01-27 02:27:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][640/1251] eta 0:22:39 lr 0.000012 time 1.8796 (2.2248) loss 2.1119 (3.0235) grad_norm 3.8464 (3.3293) [2022-01-27 02:28:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][650/1251] eta 0:22:14 lr 0.000012 time 2.2379 (2.2203) loss 2.5515 (3.0219) grad_norm 2.6209 (3.3287) [2022-01-27 02:28:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][660/1251] eta 0:21:50 lr 0.000012 time 2.2376 (2.2175) loss 2.6380 (3.0210) grad_norm 3.2114 (3.3275) [2022-01-27 02:28:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][670/1251] eta 0:21:26 lr 0.000012 time 2.3759 (2.2149) loss 2.8789 (3.0165) grad_norm 2.9146 (3.3261) [2022-01-27 02:29:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][680/1251] eta 0:21:05 lr 0.000012 time 2.3391 (2.2157) loss 3.3467 (3.0154) grad_norm 3.2991 (3.3278) [2022-01-27 02:29:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][690/1251] eta 0:20:44 lr 0.000012 time 2.1659 (2.2177) loss 2.0391 (3.0144) grad_norm 2.7894 (3.3289) [2022-01-27 02:29:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][700/1251] eta 0:20:21 lr 0.000012 time 2.0618 (2.2177) loss 3.0470 (3.0142) grad_norm 3.1986 (3.3383) [2022-01-27 02:30:18 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][710/1251] eta 0:19:59 lr 0.000011 time 1.8823 (2.2173) loss 3.5431 (3.0163) grad_norm 3.7651 (3.3401) [2022-01-27 02:30:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][720/1251] eta 0:19:37 lr 0.000011 time 2.2250 (2.2168) loss 2.3929 (3.0128) grad_norm 4.6873 (3.3390) [2022-01-27 02:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][730/1251] eta 0:19:14 lr 0.000011 time 1.5535 (2.2163) loss 3.3658 (3.0127) grad_norm 3.6535 (3.3389) [2022-01-27 02:31:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][740/1251] eta 0:18:52 lr 0.000011 time 2.1897 (2.2158) loss 2.4127 (3.0100) grad_norm 3.3970 (3.3374) [2022-01-27 02:31:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][750/1251] eta 0:18:29 lr 0.000011 time 1.5544 (2.2151) loss 2.1388 (3.0081) grad_norm 2.9773 (3.3332) [2022-01-27 02:32:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][760/1251] eta 0:18:07 lr 0.000011 time 2.3409 (2.2145) loss 2.8501 (3.0052) grad_norm 3.0527 (3.3331) [2022-01-27 02:32:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][770/1251] eta 0:17:44 lr 0.000011 time 2.1346 (2.2138) loss 2.7993 (3.0005) grad_norm 3.0828 (3.3327) [2022-01-27 02:32:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][780/1251] eta 0:17:22 lr 0.000011 time 2.2460 (2.2136) loss 2.7588 (2.9979) grad_norm 2.8679 (3.3309) [2022-01-27 02:33:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][790/1251] eta 0:17:00 lr 0.000011 time 2.1561 (2.2141) loss 3.1492 (2.9992) grad_norm 2.7269 (3.3304) [2022-01-27 02:33:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][800/1251] eta 0:16:39 lr 0.000011 time 2.9837 (2.2157) loss 2.7295 (2.9990) grad_norm 2.6846 (3.3271) [2022-01-27 02:33:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][810/1251] eta 0:16:16 lr 
0.000011 time 1.8741 (2.2142) loss 3.3425 (2.9962) grad_norm 3.0209 (3.3330) [2022-01-27 02:34:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][820/1251] eta 0:15:54 lr 0.000011 time 2.1972 (2.2136) loss 3.4724 (2.9978) grad_norm 2.9565 (3.3335) [2022-01-27 02:34:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][830/1251] eta 0:15:31 lr 0.000011 time 2.2294 (2.2123) loss 3.1125 (2.9973) grad_norm 3.0849 (3.3339) [2022-01-27 02:35:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][840/1251] eta 0:15:08 lr 0.000011 time 2.4164 (2.2106) loss 2.5677 (2.9987) grad_norm 3.0169 (3.3325) [2022-01-27 02:35:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][850/1251] eta 0:14:46 lr 0.000011 time 2.0896 (2.2106) loss 3.3430 (2.9962) grad_norm 3.1635 (3.3335) [2022-01-27 02:35:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][860/1251] eta 0:14:24 lr 0.000011 time 2.1855 (2.2097) loss 3.1761 (2.9965) grad_norm 3.2584 (3.3339) [2022-01-27 02:36:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][870/1251] eta 0:14:01 lr 0.000011 time 2.3160 (2.2088) loss 2.7554 (2.9949) grad_norm 3.2721 (3.3310) [2022-01-27 02:36:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][880/1251] eta 0:13:40 lr 0.000011 time 3.3448 (2.2104) loss 3.7246 (2.9967) grad_norm 3.2865 (3.3311) [2022-01-27 02:36:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][890/1251] eta 0:13:18 lr 0.000011 time 2.1467 (2.2110) loss 3.0066 (2.9969) grad_norm 2.9224 (3.3292) [2022-01-27 02:37:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][900/1251] eta 0:12:56 lr 0.000011 time 2.0360 (2.2112) loss 3.1543 (2.9948) grad_norm 2.8402 (3.3285) [2022-01-27 02:37:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][910/1251] eta 0:12:33 lr 0.000011 time 2.2934 (2.2105) loss 3.0748 (2.9928) grad_norm 3.1858 (3.3298) [2022-01-27 02:37:56 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][920/1251] eta 0:12:11 lr 0.000011 time 2.2146 (2.2086) loss 3.2159 (2.9907) grad_norm 2.8984 (3.3280) [2022-01-27 02:38:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][930/1251] eta 0:11:48 lr 0.000011 time 2.8412 (2.2084) loss 2.9157 (2.9883) grad_norm 4.2219 (3.3298) [2022-01-27 02:38:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][940/1251] eta 0:11:26 lr 0.000011 time 2.2849 (2.2067) loss 3.0757 (2.9853) grad_norm 3.0686 (3.3279) [2022-01-27 02:39:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][950/1251] eta 0:11:04 lr 0.000011 time 2.2636 (2.2064) loss 2.7932 (2.9850) grad_norm 3.0859 (3.3291) [2022-01-27 02:39:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][960/1251] eta 0:10:42 lr 0.000011 time 2.9089 (2.2076) loss 3.0186 (2.9842) grad_norm 3.5127 (3.3294) [2022-01-27 02:39:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][970/1251] eta 0:10:20 lr 0.000011 time 2.3339 (2.2080) loss 3.0627 (2.9843) grad_norm 3.6421 (3.3295) [2022-01-27 02:40:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][980/1251] eta 0:09:58 lr 0.000011 time 2.1813 (2.2089) loss 1.7694 (2.9827) grad_norm 2.8185 (3.3273) [2022-01-27 02:40:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][990/1251] eta 0:09:36 lr 0.000011 time 2.2310 (2.2080) loss 3.3785 (2.9832) grad_norm 3.6280 (3.3276) [2022-01-27 02:40:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1000/1251] eta 0:09:13 lr 0.000011 time 1.8684 (2.2068) loss 3.6390 (2.9824) grad_norm 2.7183 (3.3255) [2022-01-27 02:41:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1010/1251] eta 0:08:51 lr 0.000011 time 2.2573 (2.2071) loss 2.8087 (2.9826) grad_norm 3.4600 (3.3247) [2022-01-27 02:41:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1020/1251] eta 0:08:29 lr 
0.000011 time 1.6264 (2.2067) loss 2.2917 (2.9825) grad_norm 3.1006 (3.3268) [2022-01-27 02:41:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1030/1251] eta 0:08:07 lr 0.000011 time 2.2300 (2.2062) loss 2.3478 (2.9824) grad_norm 3.2481 (3.3251) [2022-01-27 02:42:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1040/1251] eta 0:07:45 lr 0.000011 time 1.7106 (2.2038) loss 3.2332 (2.9816) grad_norm 3.2322 (3.3229) [2022-01-27 02:42:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1050/1251] eta 0:07:22 lr 0.000011 time 2.1958 (2.2028) loss 2.7653 (2.9813) grad_norm 2.9485 (3.3205) [2022-01-27 02:42:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1060/1251] eta 0:07:00 lr 0.000011 time 1.6980 (2.2012) loss 2.0996 (2.9806) grad_norm 2.8854 (3.3181) [2022-01-27 02:43:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1070/1251] eta 0:06:38 lr 0.000011 time 2.5260 (2.2014) loss 2.4739 (2.9808) grad_norm 3.7984 (3.3165) [2022-01-27 02:43:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1080/1251] eta 0:06:16 lr 0.000011 time 2.2035 (2.2009) loss 3.2859 (2.9804) grad_norm 3.3740 (3.3182) [2022-01-27 02:44:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1090/1251] eta 0:05:54 lr 0.000011 time 2.8691 (2.2020) loss 3.3050 (2.9762) grad_norm 3.4208 (3.3172) [2022-01-27 02:44:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1100/1251] eta 0:05:32 lr 0.000011 time 1.9219 (2.2033) loss 3.3696 (2.9749) grad_norm 2.9011 (3.3175) [2022-01-27 02:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1110/1251] eta 0:05:10 lr 0.000011 time 1.9254 (2.2030) loss 2.6510 (2.9735) grad_norm 3.1266 (3.3167) [2022-01-27 02:45:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1120/1251] eta 0:04:48 lr 0.000011 time 1.8620 (2.2017) loss 3.5404 (2.9740) grad_norm 4.4734 (3.3185) [2022-01-27 
02:45:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1130/1251] eta 0:04:26 lr 0.000011 time 2.2108 (2.2015) loss 3.3433 (2.9758) grad_norm 3.3295 (3.3195) [2022-01-27 02:45:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1140/1251] eta 0:04:04 lr 0.000011 time 2.1612 (2.2008) loss 2.9237 (2.9768) grad_norm 2.7443 (3.3193) [2022-01-27 02:46:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1150/1251] eta 0:03:42 lr 0.000011 time 2.5061 (2.2012) loss 3.3523 (2.9767) grad_norm 2.9450 (3.3189) [2022-01-27 02:46:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1160/1251] eta 0:03:20 lr 0.000011 time 1.9827 (2.2003) loss 2.0778 (2.9772) grad_norm 3.2161 (3.3205) [2022-01-27 02:46:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1170/1251] eta 0:02:58 lr 0.000011 time 2.1533 (2.2005) loss 3.2280 (2.9756) grad_norm 2.8638 (3.3202) [2022-01-27 02:47:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1180/1251] eta 0:02:36 lr 0.000011 time 1.8788 (2.1998) loss 2.3152 (2.9737) grad_norm 2.7724 (3.3190) [2022-01-27 02:47:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1190/1251] eta 0:02:14 lr 0.000011 time 3.2073 (2.2006) loss 3.1189 (2.9720) grad_norm 2.8714 (3.3198) [2022-01-27 02:48:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1200/1251] eta 0:01:52 lr 0.000011 time 2.3607 (2.2011) loss 2.6357 (2.9726) grad_norm 2.8144 (3.3198) [2022-01-27 02:48:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1210/1251] eta 0:01:30 lr 0.000011 time 1.5643 (2.2020) loss 3.1776 (2.9748) grad_norm 3.4864 (3.3194) [2022-01-27 02:48:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1220/1251] eta 0:01:08 lr 0.000011 time 1.9759 (2.2024) loss 3.6868 (2.9757) grad_norm 3.3605 (3.3192) [2022-01-27 02:49:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1230/1251] 
eta 0:00:46 lr 0.000011 time 2.5445 (2.2030) loss 2.6476 (2.9743) grad_norm 3.4206 (3.3193) [2022-01-27 02:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1240/1251] eta 0:00:24 lr 0.000011 time 1.2534 (2.2000) loss 3.4282 (2.9758) grad_norm 2.9775 (3.3191) [2022-01-27 02:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [292/300][1250/1251] eta 0:00:02 lr 0.000011 time 1.1954 (2.1935) loss 2.1785 (2.9740) grad_norm 3.4329 (3.3184) [2022-01-27 02:49:46 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 292 training takes 0:45:44 [2022-01-27 02:50:05 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 19.187 (19.187) Loss 0.8135 (0.8135) Acc@1 80.762 (80.762) Acc@5 94.629 (94.629) [2022-01-27 02:50:24 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.237 (3.415) Loss 0.7571 (0.7901) Acc@1 82.129 (81.365) Acc@5 95.996 (95.517) [2022-01-27 02:50:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.300 (2.580) Loss 0.7654 (0.7906) Acc@1 82.715 (81.427) Acc@5 95.801 (95.587) [2022-01-27 02:51:00 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.685 (2.387) Loss 0.8732 (0.8024) Acc@1 78.906 (81.244) Acc@5 94.922 (95.464) [2022-01-27 02:51:18 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.307 (2.233) Loss 0.7417 (0.7987) Acc@1 82.812 (81.221) Acc@5 95.703 (95.570) [2022-01-27 02:51:26 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.184 Acc@5 95.476 [2022-01-27 02:51:26 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-01-27 02:51:26 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.24% [2022-01-27 02:51:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][0/1251] eta 8:12:52 lr 0.000011 time 23.6393 (23.6393) loss 3.0192 (3.0192) grad_norm 3.1149 (3.1149) [2022-01-27 02:52:13 swin_tiny_patch4_window7_224] (main.py 193): 
INFO Train: [293/300][10/1251] eta 1:28:53 lr 0.000011 time 2.0564 (4.2977) loss 2.4209 (2.9805) grad_norm 2.9445 (3.0119) [2022-01-27 02:52:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][20/1251] eta 1:05:51 lr 0.000011 time 1.5143 (3.2097) loss 2.7065 (3.0553) grad_norm 2.9736 (3.1855) [2022-01-27 02:52:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][30/1251] eta 0:59:09 lr 0.000011 time 1.6485 (2.9068) loss 3.1428 (2.9826) grad_norm 3.9960 (3.2874) [2022-01-27 02:53:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][40/1251] eta 0:55:31 lr 0.000011 time 3.8478 (2.7510) loss 2.6789 (2.9330) grad_norm 2.7372 (3.2609) [2022-01-27 02:53:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][50/1251] eta 0:52:48 lr 0.000011 time 2.0815 (2.6379) loss 3.1345 (2.9642) grad_norm 3.9450 (3.2870) [2022-01-27 02:54:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][60/1251] eta 0:50:50 lr 0.000011 time 1.6011 (2.5617) loss 3.0680 (2.9610) grad_norm 3.8433 (3.2782) [2022-01-27 02:54:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][70/1251] eta 0:49:20 lr 0.000011 time 1.8135 (2.5064) loss 2.6315 (2.9901) grad_norm 2.8228 (3.3011) [2022-01-27 02:54:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][80/1251] eta 0:48:17 lr 0.000011 time 3.5443 (2.4746) loss 2.8132 (3.0016) grad_norm 3.1616 (3.2947) [2022-01-27 02:55:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][90/1251] eta 0:47:12 lr 0.000011 time 1.8905 (2.4400) loss 3.4156 (3.0040) grad_norm 3.6367 (3.3048) [2022-01-27 02:55:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][100/1251] eta 0:46:19 lr 0.000011 time 2.1977 (2.4148) loss 3.6240 (2.9739) grad_norm 4.5367 (3.3305) [2022-01-27 02:55:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][110/1251] eta 0:45:05 lr 0.000011 time 1.8190 (2.3712) loss 3.4099 (2.9585) grad_norm 
3.3392 (3.3174) [2022-01-27 02:56:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][120/1251] eta 0:44:25 lr 0.000011 time 3.0113 (2.3563) loss 3.1507 (2.9419) grad_norm 3.5864 (3.3161) [2022-01-27 02:56:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][130/1251] eta 0:43:40 lr 0.000011 time 1.7266 (2.3373) loss 1.9254 (2.9410) grad_norm 3.3607 (3.3344) [2022-01-27 02:56:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][140/1251] eta 0:43:04 lr 0.000011 time 2.4831 (2.3260) loss 3.3248 (2.9590) grad_norm 2.8816 (3.3415) [2022-01-27 02:57:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][150/1251] eta 0:42:27 lr 0.000011 time 2.2880 (2.3140) loss 2.8833 (2.9528) grad_norm 4.1634 (3.3435) [2022-01-27 02:57:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][160/1251] eta 0:41:50 lr 0.000011 time 2.4782 (2.3010) loss 3.3552 (2.9477) grad_norm 3.8082 (3.3435) [2022-01-27 02:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][170/1251] eta 0:41:26 lr 0.000011 time 2.2076 (2.3002) loss 1.9221 (2.9407) grad_norm 3.1351 (3.3267) [2022-01-27 02:58:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][180/1251] eta 0:40:54 lr 0.000011 time 2.1694 (2.2918) loss 2.7143 (2.9411) grad_norm 3.0705 (3.3248) [2022-01-27 02:58:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][190/1251] eta 0:40:28 lr 0.000011 time 2.1959 (2.2884) loss 2.8182 (2.9327) grad_norm 2.7977 (3.3241) [2022-01-27 02:59:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][200/1251] eta 0:40:04 lr 0.000011 time 2.9304 (2.2883) loss 3.3732 (2.9466) grad_norm 3.3561 (3.3307) [2022-01-27 02:59:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][210/1251] eta 0:39:31 lr 0.000011 time 1.8431 (2.2783) loss 3.3335 (2.9448) grad_norm 2.9648 (3.3296) [2022-01-27 02:59:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[293/300][220/1251] eta 0:38:59 lr 0.000011 time 2.2629 (2.2692) loss 2.6335 (2.9473) grad_norm 2.9243 (3.3257) [2022-01-27 03:00:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][230/1251] eta 0:38:36 lr 0.000011 time 2.0304 (2.2687) loss 2.2040 (2.9436) grad_norm 3.2230 (3.3277) [2022-01-27 03:00:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][240/1251] eta 0:38:14 lr 0.000011 time 2.9103 (2.2691) loss 2.4540 (2.9441) grad_norm 2.9508 (3.3278) [2022-01-27 03:00:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][250/1251] eta 0:37:50 lr 0.000011 time 1.7485 (2.2679) loss 3.6428 (2.9539) grad_norm 3.0177 (3.3244) [2022-01-27 03:01:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][260/1251] eta 0:37:20 lr 0.000011 time 1.8850 (2.2613) loss 3.3838 (2.9657) grad_norm 3.0497 (3.3232) [2022-01-27 03:01:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][270/1251] eta 0:36:57 lr 0.000011 time 2.1744 (2.2600) loss 3.2045 (2.9615) grad_norm 3.2319 (3.3335) [2022-01-27 03:02:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][280/1251] eta 0:36:31 lr 0.000011 time 2.8051 (2.2565) loss 3.3895 (2.9591) grad_norm 2.7285 (3.3263) [2022-01-27 03:02:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][290/1251] eta 0:36:08 lr 0.000011 time 2.2264 (2.2566) loss 2.9029 (2.9615) grad_norm 3.6238 (3.3211) [2022-01-27 03:02:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][300/1251] eta 0:35:41 lr 0.000011 time 2.1930 (2.2522) loss 3.6252 (2.9578) grad_norm 3.4917 (3.3164) [2022-01-27 03:03:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][310/1251] eta 0:35:16 lr 0.000011 time 1.9365 (2.2495) loss 3.1126 (2.9601) grad_norm 3.1410 (3.3146) [2022-01-27 03:03:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][320/1251] eta 0:34:48 lr 0.000011 time 1.6561 (2.2433) loss 3.5191 (2.9616) grad_norm 
3.4204 (3.3148) [2022-01-27 03:03:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][330/1251] eta 0:34:20 lr 0.000011 time 1.9324 (2.2370) loss 3.2552 (2.9702) grad_norm 3.2809 (3.3149) [2022-01-27 03:04:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][340/1251] eta 0:33:57 lr 0.000011 time 1.9610 (2.2361) loss 3.6681 (2.9751) grad_norm 2.9387 (3.3149) [2022-01-27 03:04:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][350/1251] eta 0:33:34 lr 0.000011 time 1.9368 (2.2357) loss 3.4014 (2.9767) grad_norm 2.8701 (3.3083) [2022-01-27 03:04:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][360/1251] eta 0:33:11 lr 0.000011 time 2.5706 (2.2348) loss 1.9846 (2.9802) grad_norm 3.0493 (3.3101) [2022-01-27 03:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][370/1251] eta 0:32:47 lr 0.000011 time 1.9824 (2.2331) loss 3.4417 (2.9811) grad_norm 4.6269 (3.3127) [2022-01-27 03:05:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][380/1251] eta 0:32:23 lr 0.000011 time 1.9330 (2.2314) loss 2.4620 (2.9766) grad_norm 3.2972 (3.3150) [2022-01-27 03:05:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][390/1251] eta 0:32:01 lr 0.000011 time 1.6728 (2.2320) loss 1.9582 (2.9744) grad_norm 3.0156 (3.3252) [2022-01-27 03:06:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][400/1251] eta 0:31:45 lr 0.000011 time 2.0723 (2.2390) loss 3.1326 (2.9779) grad_norm 3.2611 (3.3207) [2022-01-27 03:06:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][410/1251] eta 0:31:21 lr 0.000011 time 1.8456 (2.2375) loss 3.5369 (2.9764) grad_norm 3.3831 (3.3180) [2022-01-27 03:07:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][420/1251] eta 0:30:53 lr 0.000011 time 1.9492 (2.2308) loss 3.4885 (2.9763) grad_norm 2.9693 (3.3184) [2022-01-27 03:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[293/300][430/1251] eta 0:30:29 lr 0.000011 time 1.8842 (2.2285) loss 3.1204 (2.9771) grad_norm 3.3483 (3.3299) [2022-01-27 03:07:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][440/1251] eta 0:30:06 lr 0.000011 time 1.8867 (2.2274) loss 3.4403 (2.9813) grad_norm 3.0141 (3.3304) [2022-01-27 03:08:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][450/1251] eta 0:29:43 lr 0.000011 time 1.8018 (2.2268) loss 3.5386 (2.9831) grad_norm 3.0342 (3.3309) [2022-01-27 03:08:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][460/1251] eta 0:29:22 lr 0.000011 time 2.1987 (2.2278) loss 3.3990 (2.9832) grad_norm 3.0501 (3.3350) [2022-01-27 03:08:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][470/1251] eta 0:28:59 lr 0.000011 time 1.5701 (2.2267) loss 2.3965 (2.9832) grad_norm 3.6468 (3.3387) [2022-01-27 03:09:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][480/1251] eta 0:28:36 lr 0.000011 time 1.8775 (2.2257) loss 3.3477 (2.9826) grad_norm 2.8178 (3.3332) [2022-01-27 03:09:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][490/1251] eta 0:28:14 lr 0.000011 time 1.8521 (2.2273) loss 3.0521 (2.9840) grad_norm 4.3102 (3.3379) [2022-01-27 03:10:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][500/1251] eta 0:27:54 lr 0.000011 time 3.0829 (2.2300) loss 3.2344 (2.9820) grad_norm 2.7763 (3.3354) [2022-01-27 03:10:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][510/1251] eta 0:27:33 lr 0.000011 time 1.5951 (2.2310) loss 3.3348 (2.9825) grad_norm 3.1453 (3.3338) [2022-01-27 03:10:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][520/1251] eta 0:27:08 lr 0.000011 time 1.5709 (2.2281) loss 1.9544 (2.9791) grad_norm 3.1141 (3.3307) [2022-01-27 03:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][530/1251] eta 0:26:45 lr 0.000011 time 1.6896 (2.2266) loss 2.7665 (2.9787) grad_norm 
3.5578 (3.3368) [2022-01-27 03:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][540/1251] eta 0:26:22 lr 0.000011 time 1.7988 (2.2262) loss 2.1277 (2.9763) grad_norm 3.7642 (3.3462) [2022-01-27 03:11:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][550/1251] eta 0:26:02 lr 0.000011 time 1.7381 (2.2292) loss 3.3186 (2.9776) grad_norm 3.0206 (3.3439) [2022-01-27 03:12:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][560/1251] eta 0:25:37 lr 0.000011 time 1.6912 (2.2254) loss 2.8445 (2.9760) grad_norm 2.9920 (3.3467) [2022-01-27 03:12:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][570/1251] eta 0:25:14 lr 0.000011 time 1.9794 (2.2245) loss 3.2483 (2.9775) grad_norm 3.4463 (3.3490) [2022-01-27 03:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][580/1251] eta 0:24:51 lr 0.000011 time 1.5241 (2.2226) loss 3.0019 (2.9808) grad_norm 3.3302 (3.3657) [2022-01-27 03:13:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][590/1251] eta 0:24:32 lr 0.000011 time 1.9127 (2.2272) loss 3.0035 (2.9791) grad_norm 3.2934 (3.3689) [2022-01-27 03:13:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][600/1251] eta 0:24:08 lr 0.000011 time 1.9089 (2.2247) loss 3.1721 (2.9776) grad_norm 3.4424 (3.3700) [2022-01-27 03:14:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][610/1251] eta 0:23:44 lr 0.000011 time 1.8989 (2.2222) loss 3.4909 (2.9707) grad_norm 3.4174 (3.3708) [2022-01-27 03:14:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][620/1251] eta 0:23:20 lr 0.000011 time 1.7478 (2.2199) loss 3.1100 (2.9686) grad_norm 3.2000 (3.3778) [2022-01-27 03:14:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][630/1251] eta 0:22:57 lr 0.000011 time 1.8705 (2.2176) loss 3.0216 (2.9710) grad_norm 2.9560 (3.3791) [2022-01-27 03:15:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[293/300][640/1251] eta 0:22:33 lr 0.000011 time 2.5275 (2.2153) loss 3.4276 (2.9700) grad_norm 3.0116 (3.3792) [2022-01-27 03:15:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][650/1251] eta 0:22:09 lr 0.000011 time 2.1121 (2.2129) loss 2.3730 (2.9677) grad_norm 3.6054 (3.3786) [2022-01-27 03:15:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][660/1251] eta 0:21:47 lr 0.000011 time 1.9992 (2.2119) loss 1.9207 (2.9659) grad_norm 2.9423 (3.3771) [2022-01-27 03:16:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][670/1251] eta 0:21:24 lr 0.000011 time 2.6373 (2.2117) loss 2.4559 (2.9680) grad_norm 2.6997 (3.3737) [2022-01-27 03:16:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][680/1251] eta 0:21:02 lr 0.000011 time 1.9818 (2.2116) loss 2.3345 (2.9711) grad_norm 3.4456 (3.3714) [2022-01-27 03:16:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][690/1251] eta 0:20:41 lr 0.000011 time 2.4845 (2.2127) loss 3.2855 (2.9695) grad_norm 3.2637 (3.3739) [2022-01-27 03:17:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][700/1251] eta 0:20:20 lr 0.000011 time 1.9319 (2.2158) loss 1.8644 (2.9681) grad_norm 3.4753 (3.3717) [2022-01-27 03:17:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][710/1251] eta 0:19:59 lr 0.000011 time 2.6551 (2.2167) loss 1.9554 (2.9672) grad_norm 3.3299 (3.3703) [2022-01-27 03:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][720/1251] eta 0:19:36 lr 0.000011 time 1.9448 (2.2147) loss 3.0862 (2.9640) grad_norm 2.9619 (3.3701) [2022-01-27 03:18:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][730/1251] eta 0:19:13 lr 0.000011 time 2.1503 (2.2148) loss 2.7283 (2.9621) grad_norm 3.2324 (3.3692) [2022-01-27 03:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][740/1251] eta 0:18:51 lr 0.000011 time 1.7462 (2.2151) loss 2.8354 (2.9623) grad_norm 
3.1213 (3.3701) [2022-01-27 03:19:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][750/1251] eta 0:18:28 lr 0.000011 time 2.1212 (2.2127) loss 2.6254 (2.9632) grad_norm 3.0924 (3.3693) [2022-01-27 03:19:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][760/1251] eta 0:18:05 lr 0.000011 time 1.7061 (2.2113) loss 2.5167 (2.9609) grad_norm 4.8815 (3.3706) [2022-01-27 03:19:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][770/1251] eta 0:17:44 lr 0.000011 time 2.8984 (2.2134) loss 3.2708 (2.9634) grad_norm 3.5743 (3.3752) [2022-01-27 03:20:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][780/1251] eta 0:17:23 lr 0.000011 time 2.5376 (2.2152) loss 3.5442 (2.9647) grad_norm 2.9995 (3.3746) [2022-01-27 03:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][790/1251] eta 0:17:01 lr 0.000011 time 1.7684 (2.2150) loss 2.6981 (2.9617) grad_norm 3.0720 (3.3740) [2022-01-27 03:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][800/1251] eta 0:16:38 lr 0.000011 time 1.7903 (2.2134) loss 1.9479 (2.9630) grad_norm 3.2749 (3.3747) [2022-01-27 03:21:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][810/1251] eta 0:16:14 lr 0.000011 time 1.7137 (2.2103) loss 2.7380 (2.9638) grad_norm 3.3656 (3.3738) [2022-01-27 03:21:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][820/1251] eta 0:15:52 lr 0.000011 time 2.2168 (2.2089) loss 3.4449 (2.9649) grad_norm 3.7243 (3.3735) [2022-01-27 03:22:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][830/1251] eta 0:15:29 lr 0.000011 time 1.7891 (2.2084) loss 2.7181 (2.9645) grad_norm 4.6952 (3.3758) [2022-01-27 03:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][840/1251] eta 0:15:07 lr 0.000011 time 2.2365 (2.2078) loss 3.3525 (2.9634) grad_norm 3.1123 (3.3740) [2022-01-27 03:22:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[293/300][850/1251] eta 0:14:44 lr 0.000011 time 1.8841 (2.2068) loss 2.2638 (2.9622) grad_norm 3.3681 (3.3722) [2022-01-27 03:23:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][860/1251] eta 0:14:22 lr 0.000011 time 2.5481 (2.2059) loss 3.4697 (2.9637) grad_norm 3.1348 (3.3719) [2022-01-27 03:23:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][870/1251] eta 0:14:00 lr 0.000011 time 1.6822 (2.2056) loss 3.1545 (2.9632) grad_norm 2.9311 (3.3686) [2022-01-27 03:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][880/1251] eta 0:13:38 lr 0.000011 time 2.6637 (2.2065) loss 3.3246 (2.9636) grad_norm 2.9101 (3.3689) [2022-01-27 03:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][890/1251] eta 0:13:16 lr 0.000011 time 1.6053 (2.2055) loss 3.0554 (2.9639) grad_norm 3.6835 (3.3693) [2022-01-27 03:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][900/1251] eta 0:12:53 lr 0.000011 time 2.1492 (2.2044) loss 3.4043 (2.9626) grad_norm 3.2867 (3.3670) [2022-01-27 03:24:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][910/1251] eta 0:12:32 lr 0.000011 time 1.8975 (2.2058) loss 2.8866 (2.9616) grad_norm 3.2775 (3.3654) [2022-01-27 03:25:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][920/1251] eta 0:12:10 lr 0.000011 time 2.5012 (2.2078) loss 3.2641 (2.9634) grad_norm 2.7746 (3.3648) [2022-01-27 03:25:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][930/1251] eta 0:11:48 lr 0.000011 time 2.2293 (2.2069) loss 3.4488 (2.9640) grad_norm 3.2987 (3.3672) [2022-01-27 03:26:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][940/1251] eta 0:11:26 lr 0.000011 time 1.9308 (2.2068) loss 3.2970 (2.9657) grad_norm 3.8949 (3.3681) [2022-01-27 03:26:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][950/1251] eta 0:11:03 lr 0.000011 time 1.9030 (2.2045) loss 2.9136 (2.9656) grad_norm 
2.8420 (3.3668) [2022-01-27 03:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][960/1251] eta 0:10:41 lr 0.000011 time 2.5026 (2.2037) loss 3.1678 (2.9648) grad_norm 3.2323 (3.3650) [2022-01-27 03:27:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][970/1251] eta 0:10:19 lr 0.000011 time 1.8885 (2.2046) loss 3.1881 (2.9649) grad_norm 3.2255 (3.3630) [2022-01-27 03:27:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][980/1251] eta 0:09:57 lr 0.000011 time 2.2119 (2.2049) loss 3.3589 (2.9663) grad_norm 3.2262 (3.3615) [2022-01-27 03:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][990/1251] eta 0:09:35 lr 0.000011 time 2.4106 (2.2063) loss 2.5143 (2.9663) grad_norm 4.0411 (3.3620) [2022-01-27 03:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1000/1251] eta 0:09:13 lr 0.000011 time 3.3206 (2.2066) loss 1.8266 (2.9642) grad_norm 2.9485 (3.3606) [2022-01-27 03:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1010/1251] eta 0:08:51 lr 0.000011 time 2.0889 (2.2054) loss 3.1462 (2.9633) grad_norm 3.6286 (3.3602) [2022-01-27 03:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1020/1251] eta 0:08:29 lr 0.000011 time 1.9868 (2.2037) loss 2.3872 (2.9613) grad_norm 3.1890 (3.3582) [2022-01-27 03:29:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1030/1251] eta 0:08:06 lr 0.000011 time 2.2101 (2.2033) loss 2.9295 (2.9595) grad_norm 3.0618 (3.3597) [2022-01-27 03:29:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1040/1251] eta 0:07:44 lr 0.000011 time 2.4772 (2.2038) loss 2.5454 (2.9597) grad_norm 3.3125 (3.3613) [2022-01-27 03:30:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1050/1251] eta 0:07:23 lr 0.000011 time 3.3999 (2.2052) loss 2.4099 (2.9577) grad_norm 3.4000 (3.3640) [2022-01-27 03:30:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[293/300][1060/1251] eta 0:07:01 lr 0.000011 time 2.2397 (2.2057) loss 3.1990 (2.9573) grad_norm 3.7954 (3.3636) [2022-01-27 03:30:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1070/1251] eta 0:06:38 lr 0.000011 time 1.5652 (2.2038) loss 1.9553 (2.9560) grad_norm 4.2520 (3.3635) [2022-01-27 03:31:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1080/1251] eta 0:06:16 lr 0.000011 time 1.9624 (2.2012) loss 3.2125 (2.9566) grad_norm 3.4446 (3.3638) [2022-01-27 03:31:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1090/1251] eta 0:05:54 lr 0.000011 time 3.0673 (2.2008) loss 3.6006 (2.9570) grad_norm 3.0262 (3.3715) [2022-01-27 03:31:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1100/1251] eta 0:05:32 lr 0.000011 time 1.9003 (2.1994) loss 3.2348 (2.9581) grad_norm 3.1169 (3.3748) [2022-01-27 03:32:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1110/1251] eta 0:05:10 lr 0.000011 time 2.2157 (2.1988) loss 2.0112 (2.9568) grad_norm 3.2797 (3.3749) [2022-01-27 03:32:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1120/1251] eta 0:04:48 lr 0.000011 time 2.3650 (2.1987) loss 3.4382 (2.9578) grad_norm 3.1957 (3.3743) [2022-01-27 03:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1130/1251] eta 0:04:26 lr 0.000011 time 2.6568 (2.1998) loss 3.6484 (2.9583) grad_norm 3.4526 (3.3739) [2022-01-27 03:33:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1140/1251] eta 0:04:04 lr 0.000011 time 2.2095 (2.2001) loss 2.2251 (2.9567) grad_norm 3.4256 (3.3737) [2022-01-27 03:33:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1150/1251] eta 0:03:42 lr 0.000011 time 2.4995 (2.2022) loss 3.3212 (2.9573) grad_norm 3.1807 (3.3719) [2022-01-27 03:34:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1160/1251] eta 0:03:20 lr 0.000011 time 2.6427 (2.2042) loss 3.3140 (2.9556) 
grad_norm 3.2087 (3.3704) [2022-01-27 03:34:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1170/1251] eta 0:02:58 lr 0.000011 time 1.8390 (2.2041) loss 2.5877 (2.9547) grad_norm 3.8200 (3.3700) [2022-01-27 03:34:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1180/1251] eta 0:02:36 lr 0.000011 time 1.5653 (2.2023) loss 2.2088 (2.9522) grad_norm 3.1593 (3.3694) [2022-01-27 03:35:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1190/1251] eta 0:02:14 lr 0.000011 time 1.6136 (2.1995) loss 3.8053 (2.9535) grad_norm 3.1414 (3.3679) [2022-01-27 03:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1200/1251] eta 0:01:52 lr 0.000011 time 2.7912 (2.1986) loss 3.1576 (2.9530) grad_norm 3.2795 (3.3724) [2022-01-27 03:35:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1210/1251] eta 0:01:30 lr 0.000011 time 2.0761 (2.1982) loss 2.1006 (2.9507) grad_norm 3.7951 (3.3726) [2022-01-27 03:36:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1220/1251] eta 0:01:08 lr 0.000011 time 2.1299 (2.1982) loss 2.7772 (2.9516) grad_norm 3.7241 (3.3745) [2022-01-27 03:36:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1230/1251] eta 0:00:46 lr 0.000011 time 1.7363 (2.1991) loss 2.8620 (2.9506) grad_norm 3.5422 (3.3742) [2022-01-27 03:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1240/1251] eta 0:00:24 lr 0.000011 time 1.6963 (2.1995) loss 2.0234 (2.9513) grad_norm 2.7401 (3.3742) [2022-01-27 03:37:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [293/300][1250/1251] eta 0:00:02 lr 0.000011 time 1.2192 (2.1946) loss 1.8022 (2.9504) grad_norm 3.3364 (3.3747) [2022-01-27 03:37:12 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 293 training takes 0:45:45 [2022-01-27 03:37:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.499 (18.499) Loss 0.7750 (0.7750) Acc@1 82.129 (82.129) 
Acc@5 95.605 (95.605) [2022-01-27 03:37:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.267 (3.295) Loss 0.7886 (0.8096) Acc@1 82.031 (81.241) Acc@5 96.094 (95.490) [2022-01-27 03:38:04 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.297 (2.505) Loss 0.7798 (0.8105) Acc@1 82.227 (81.027) Acc@5 95.117 (95.480) [2022-01-27 03:38:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.267 (2.163) Loss 0.7799 (0.8000) Acc@1 81.055 (81.250) Acc@5 95.801 (95.621) [2022-01-27 03:38:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.312 (2.162) Loss 0.7775 (0.8073) Acc@1 81.641 (81.074) Acc@5 96.094 (95.544) [2022-01-27 03:38:48 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.146 Acc@5 95.518 [2022-01-27 03:38:48 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.1% [2022-01-27 03:38:48 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.24% [2022-01-27 03:39:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][0/1251] eta 7:35:34 lr 0.000011 time 21.8500 (21.8500) loss 3.2696 (3.2696) grad_norm 2.8930 (2.8930) [2022-01-27 03:39:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][10/1251] eta 1:24:55 lr 0.000011 time 2.2713 (4.1058) loss 3.5291 (3.1076) grad_norm 3.5148 (3.3367) [2022-01-27 03:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][20/1251] eta 1:04:28 lr 0.000011 time 1.4443 (3.1426) loss 3.3036 (3.0814) grad_norm 3.1402 (3.3631) [2022-01-27 03:40:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][30/1251] eta 0:57:39 lr 0.000011 time 1.7879 (2.8334) loss 3.1555 (3.0777) grad_norm 3.0539 (3.4311) [2022-01-27 03:40:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][40/1251] eta 0:55:30 lr 0.000011 time 4.2523 (2.7499) loss 3.1246 (3.0710) grad_norm 3.0102 (3.4066) [2022-01-27 03:41:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][50/1251] eta 0:52:08 lr 0.000011 time 1.4625 (2.6051) loss 2.1957 (3.0669) grad_norm 8.8201 (3.4781) [2022-01-27 03:41:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][60/1251] eta 0:50:38 lr 0.000011 time 2.3913 (2.5514) loss 3.1455 (3.0063) grad_norm 3.8300 (3.4359) [2022-01-27 03:41:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][70/1251] eta 0:49:23 lr 0.000011 time 1.5314 (2.5092) loss 2.4575 (3.0129) grad_norm 3.0981 (3.4431) [2022-01-27 03:42:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][80/1251] eta 0:48:43 lr 0.000011 time 3.8866 (2.4969) loss 3.1216 (3.0034) grad_norm 2.9884 (3.4308) [2022-01-27 03:42:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][90/1251] eta 0:47:39 lr 0.000011 time 1.5430 (2.4628) loss 2.0605 (2.9832) grad_norm 2.8678 (3.4204) [2022-01-27 03:42:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][100/1251] eta 0:46:22 lr 0.000011 time 1.6795 (2.4175) loss 2.8034 (2.9858) grad_norm 3.4476 (3.4150) [2022-01-27 03:43:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][110/1251] eta 0:45:17 lr 0.000011 time 1.9323 (2.3813) loss 2.7092 (2.9720) grad_norm 3.1346 (3.3989) [2022-01-27 03:43:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][120/1251] eta 0:44:28 lr 0.000011 time 3.4061 (2.3594) loss 3.2477 (2.9919) grad_norm 3.1338 (3.3894) [2022-01-27 03:43:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][130/1251] eta 0:43:57 lr 0.000011 time 1.4975 (2.3532) loss 2.4409 (2.9749) grad_norm 3.6638 (3.3862) [2022-01-27 03:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][140/1251] eta 0:43:23 lr 0.000011 time 2.5828 (2.3437) loss 3.3722 (2.9938) grad_norm 3.2894 (3.3720) [2022-01-27 03:44:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][150/1251] eta 0:42:55 lr 0.000011 
time 2.4919 (2.3395) loss 2.8116 (2.9849) grad_norm 3.3954 (3.3631) [2022-01-27 03:45:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][160/1251] eta 0:42:28 lr 0.000011 time 3.5076 (2.3363) loss 3.2058 (2.9908) grad_norm 2.7123 (3.3520) [2022-01-27 03:45:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][170/1251] eta 0:41:48 lr 0.000011 time 1.5287 (2.3207) loss 3.4592 (2.9960) grad_norm 4.0189 (3.3420) [2022-01-27 03:45:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][180/1251] eta 0:41:18 lr 0.000011 time 1.5876 (2.3141) loss 3.2382 (2.9990) grad_norm 3.1391 (3.3310) [2022-01-27 03:46:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][190/1251] eta 0:40:48 lr 0.000011 time 1.5933 (2.3075) loss 3.2960 (3.0025) grad_norm 3.3959 (3.3410) [2022-01-27 03:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][200/1251] eta 0:40:29 lr 0.000011 time 4.1942 (2.3111) loss 2.6070 (2.9970) grad_norm 2.8701 (3.3445) [2022-01-27 03:46:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][210/1251] eta 0:40:14 lr 0.000011 time 1.9627 (2.3198) loss 2.7924 (2.9765) grad_norm 2.8275 (3.3450) [2022-01-27 03:47:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][220/1251] eta 0:39:35 lr 0.000011 time 1.6441 (2.3040) loss 2.6324 (2.9792) grad_norm 3.3055 (3.3348) [2022-01-27 03:47:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][230/1251] eta 0:38:58 lr 0.000011 time 1.6429 (2.2906) loss 2.4307 (2.9706) grad_norm 2.8564 (3.3389) [2022-01-27 03:47:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][240/1251] eta 0:38:28 lr 0.000011 time 2.7526 (2.2839) loss 3.0877 (2.9732) grad_norm 2.8369 (3.3363) [2022-01-27 03:48:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][250/1251] eta 0:38:01 lr 0.000011 time 2.1031 (2.2795) loss 3.2385 (2.9668) grad_norm 3.2337 (3.3309) [2022-01-27 03:48:41 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][260/1251] eta 0:37:34 lr 0.000011 time 1.8991 (2.2753) loss 2.4289 (2.9643) grad_norm 3.1113 (3.3290) [2022-01-27 03:49:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][270/1251] eta 0:37:10 lr 0.000011 time 3.0169 (2.2739) loss 3.2393 (2.9585) grad_norm 3.3984 (3.3421) [2022-01-27 03:49:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][280/1251] eta 0:36:47 lr 0.000011 time 2.9139 (2.2730) loss 3.4110 (2.9672) grad_norm 3.0685 (3.3358) [2022-01-27 03:49:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][290/1251] eta 0:36:23 lr 0.000011 time 1.5798 (2.2717) loss 2.7265 (2.9653) grad_norm 2.9634 (3.3340) [2022-01-27 03:50:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][300/1251] eta 0:36:00 lr 0.000011 time 1.6895 (2.2715) loss 1.9768 (2.9614) grad_norm 3.3623 (3.3291) [2022-01-27 03:50:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][310/1251] eta 0:35:35 lr 0.000011 time 2.3912 (2.2692) loss 3.1108 (2.9647) grad_norm 3.3249 (3.3322) [2022-01-27 03:50:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][320/1251] eta 0:35:04 lr 0.000011 time 1.8673 (2.2609) loss 3.0323 (2.9669) grad_norm 3.5908 (3.3272) [2022-01-27 03:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][330/1251] eta 0:34:32 lr 0.000011 time 1.8754 (2.2504) loss 3.2452 (2.9698) grad_norm 3.1640 (3.3282) [2022-01-27 03:51:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][340/1251] eta 0:34:03 lr 0.000011 time 1.8256 (2.2434) loss 3.6822 (2.9700) grad_norm 3.4756 (3.3295) [2022-01-27 03:51:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][350/1251] eta 0:33:41 lr 0.000011 time 2.6192 (2.2432) loss 3.4451 (2.9752) grad_norm 3.2097 (3.3238) [2022-01-27 03:52:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][360/1251] eta 0:33:17 lr 
0.000011 time 2.5610 (2.2417) loss 3.4937 (2.9730) grad_norm 3.4022 (3.3253) [2022-01-27 03:52:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][370/1251] eta 0:32:56 lr 0.000011 time 1.5615 (2.2431) loss 3.6825 (2.9678) grad_norm 3.0474 (3.3200) [2022-01-27 03:53:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][380/1251] eta 0:32:36 lr 0.000011 time 2.9623 (2.2457) loss 2.2769 (2.9679) grad_norm 3.7677 (3.3272) [2022-01-27 03:53:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][390/1251] eta 0:32:15 lr 0.000011 time 2.9313 (2.2484) loss 2.7337 (2.9718) grad_norm 3.1044 (3.3268) [2022-01-27 03:53:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][400/1251] eta 0:31:55 lr 0.000011 time 2.2291 (2.2506) loss 3.3453 (2.9739) grad_norm 3.0993 (3.3282) [2022-01-27 03:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][410/1251] eta 0:31:31 lr 0.000011 time 1.9026 (2.2490) loss 2.6282 (2.9741) grad_norm 3.4276 (3.3238) [2022-01-27 03:54:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][420/1251] eta 0:31:08 lr 0.000011 time 3.3800 (2.2485) loss 3.0410 (2.9742) grad_norm 2.8666 (3.3194) [2022-01-27 03:54:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][430/1251] eta 0:30:39 lr 0.000011 time 1.5915 (2.2403) loss 2.5545 (2.9711) grad_norm 3.0629 (3.3167) [2022-01-27 03:55:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][440/1251] eta 0:30:14 lr 0.000011 time 2.5916 (2.2377) loss 3.0711 (2.9690) grad_norm 3.7493 (3.3243) [2022-01-27 03:55:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][450/1251] eta 0:29:50 lr 0.000011 time 2.1977 (2.2351) loss 3.2388 (2.9658) grad_norm 3.0821 (3.3268) [2022-01-27 03:55:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][460/1251] eta 0:29:29 lr 0.000011 time 1.9659 (2.2367) loss 1.9762 (2.9588) grad_norm 3.0924 (3.3242) [2022-01-27 03:56:21 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][470/1251] eta 0:29:07 lr 0.000011 time 2.2991 (2.2373) loss 2.7039 (2.9578) grad_norm 3.1182 (3.3212) [2022-01-27 03:56:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][480/1251] eta 0:28:47 lr 0.000011 time 3.4362 (2.2409) loss 3.1209 (2.9564) grad_norm 3.6214 (3.3194) [2022-01-27 03:57:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][490/1251] eta 0:28:24 lr 0.000011 time 1.6176 (2.2402) loss 3.3719 (2.9611) grad_norm 3.1662 (3.3165) [2022-01-27 03:57:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][500/1251] eta 0:28:02 lr 0.000011 time 1.9061 (2.2404) loss 2.5404 (2.9606) grad_norm 3.0029 (3.3206) [2022-01-27 03:57:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][510/1251] eta 0:27:40 lr 0.000011 time 1.8807 (2.2408) loss 3.5474 (2.9582) grad_norm 4.1182 (3.3237) [2022-01-27 03:58:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][520/1251] eta 0:27:15 lr 0.000011 time 2.4996 (2.2375) loss 1.8617 (2.9544) grad_norm 2.8959 (3.3229) [2022-01-27 03:58:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][530/1251] eta 0:26:49 lr 0.000011 time 2.3234 (2.2317) loss 2.5795 (2.9555) grad_norm 3.2191 (3.3242) [2022-01-27 03:58:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][540/1251] eta 0:26:25 lr 0.000011 time 2.1217 (2.2294) loss 3.3483 (2.9600) grad_norm 3.1890 (3.3293) [2022-01-27 03:59:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][550/1251] eta 0:26:01 lr 0.000011 time 2.1588 (2.2280) loss 3.2023 (2.9635) grad_norm 3.0955 (3.3262) [2022-01-27 03:59:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][560/1251] eta 0:25:39 lr 0.000011 time 1.8764 (2.2277) loss 2.9287 (2.9642) grad_norm 3.0842 (3.3296) [2022-01-27 03:59:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][570/1251] eta 0:25:16 lr 
0.000011 time 2.8090 (2.2266) loss 2.9184 (2.9669) grad_norm 2.9578 (3.3378) [2022-01-27 04:00:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][580/1251] eta 0:24:55 lr 0.000011 time 2.5135 (2.2281) loss 3.0767 (2.9619) grad_norm 3.5660 (3.3361) [2022-01-27 04:00:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][590/1251] eta 0:24:34 lr 0.000011 time 2.0209 (2.2302) loss 3.2913 (2.9636) grad_norm 3.3436 (3.3351) [2022-01-27 04:01:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][600/1251] eta 0:24:12 lr 0.000011 time 2.2877 (2.2306) loss 3.5092 (2.9616) grad_norm 3.0001 (3.3368) [2022-01-27 04:01:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][610/1251] eta 0:23:48 lr 0.000011 time 2.2063 (2.2289) loss 3.6147 (2.9624) grad_norm 3.9276 (3.3401) [2022-01-27 04:01:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][620/1251] eta 0:23:25 lr 0.000011 time 2.1981 (2.2269) loss 3.2373 (2.9618) grad_norm 3.4063 (3.3357) [2022-01-27 04:02:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][630/1251] eta 0:23:01 lr 0.000011 time 2.1831 (2.2245) loss 2.6407 (2.9595) grad_norm 3.8947 (3.3372) [2022-01-27 04:02:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][640/1251] eta 0:22:37 lr 0.000011 time 1.6299 (2.2222) loss 3.4276 (2.9645) grad_norm 2.9513 (3.3342) [2022-01-27 04:02:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][650/1251] eta 0:22:15 lr 0.000011 time 2.7949 (2.2216) loss 2.7042 (2.9617) grad_norm 3.4389 (3.3308) [2022-01-27 04:03:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][660/1251] eta 0:21:53 lr 0.000011 time 2.3845 (2.2219) loss 3.2256 (2.9646) grad_norm 2.9326 (3.3268) [2022-01-27 04:03:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][670/1251] eta 0:21:31 lr 0.000011 time 2.1504 (2.2226) loss 3.1277 (2.9675) grad_norm 2.9923 (3.3269) [2022-01-27 04:04:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][680/1251] eta 0:21:09 lr 0.000011 time 2.4118 (2.2227) loss 3.3721 (2.9701) grad_norm 3.3952 (3.3288) [2022-01-27 04:04:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][690/1251] eta 0:20:46 lr 0.000011 time 1.6920 (2.2222) loss 3.2269 (2.9680) grad_norm 4.1433 (3.3308) [2022-01-27 04:04:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][700/1251] eta 0:20:22 lr 0.000011 time 1.6347 (2.2190) loss 3.6544 (2.9657) grad_norm 3.4489 (3.3288) [2022-01-27 04:05:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][710/1251] eta 0:19:59 lr 0.000011 time 1.8571 (2.2172) loss 3.1979 (2.9652) grad_norm 2.9882 (3.3268) [2022-01-27 04:05:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][720/1251] eta 0:19:36 lr 0.000011 time 1.9746 (2.2163) loss 2.2323 (2.9667) grad_norm 3.2164 (3.3265) [2022-01-27 04:05:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][730/1251] eta 0:19:14 lr 0.000011 time 2.0936 (2.2163) loss 2.8571 (2.9658) grad_norm 2.9502 (3.3217) [2022-01-27 04:06:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][740/1251] eta 0:18:52 lr 0.000011 time 1.9756 (2.2156) loss 3.2146 (2.9681) grad_norm 3.1337 (3.3240) [2022-01-27 04:06:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][750/1251] eta 0:18:30 lr 0.000011 time 1.8888 (2.2165) loss 3.1283 (2.9669) grad_norm 3.0678 (3.3258) [2022-01-27 04:06:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][760/1251] eta 0:18:09 lr 0.000011 time 1.9019 (2.2183) loss 2.1966 (2.9667) grad_norm 3.3126 (3.3265) [2022-01-27 04:07:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][770/1251] eta 0:17:47 lr 0.000011 time 2.3132 (2.2190) loss 2.9954 (2.9684) grad_norm 3.2975 (3.3263) [2022-01-27 04:07:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][780/1251] eta 0:17:24 lr 
0.000011 time 1.9633 (2.2176) loss 2.1368 (2.9662) grad_norm 3.3936 (3.3285) [2022-01-27 04:08:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][790/1251] eta 0:17:01 lr 0.000011 time 1.8056 (2.2158) loss 3.0054 (2.9652) grad_norm 3.0592 (3.3274) [2022-01-27 04:08:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][800/1251] eta 0:16:38 lr 0.000011 time 1.8668 (2.2137) loss 2.4160 (2.9628) grad_norm 2.9287 (3.3269) [2022-01-27 04:08:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][810/1251] eta 0:16:15 lr 0.000011 time 3.4013 (2.2129) loss 3.0344 (2.9631) grad_norm 3.0821 (3.3273) [2022-01-27 04:09:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][820/1251] eta 0:15:53 lr 0.000011 time 2.2583 (2.2126) loss 3.0262 (2.9644) grad_norm 3.1829 (3.3278) [2022-01-27 04:09:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][830/1251] eta 0:15:31 lr 0.000011 time 2.1264 (2.2123) loss 3.1570 (2.9661) grad_norm 2.7262 (3.3273) [2022-01-27 04:09:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][840/1251] eta 0:15:09 lr 0.000011 time 2.2072 (2.2130) loss 3.5698 (2.9674) grad_norm 3.5201 (3.3254) [2022-01-27 04:10:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][850/1251] eta 0:14:47 lr 0.000011 time 2.8157 (2.2143) loss 3.4004 (2.9692) grad_norm 3.5965 (3.3229) [2022-01-27 04:10:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][860/1251] eta 0:14:25 lr 0.000011 time 2.2533 (2.2136) loss 2.4765 (2.9674) grad_norm 3.1191 (3.3238) [2022-01-27 04:10:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][870/1251] eta 0:14:03 lr 0.000011 time 3.0951 (2.2142) loss 3.4723 (2.9673) grad_norm 3.2674 (3.3245) [2022-01-27 04:11:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][880/1251] eta 0:13:40 lr 0.000011 time 1.7236 (2.2117) loss 2.2833 (2.9658) grad_norm 3.6335 (3.3277) [2022-01-27 04:11:38 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][890/1251] eta 0:13:18 lr 0.000011 time 2.4513 (2.2116) loss 3.0290 (2.9693) grad_norm 2.8275 (3.3267) [2022-01-27 04:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][900/1251] eta 0:12:56 lr 0.000011 time 2.4437 (2.2111) loss 2.4965 (2.9682) grad_norm 2.6134 (3.3254) [2022-01-27 04:12:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][910/1251] eta 0:12:33 lr 0.000011 time 2.3434 (2.2104) loss 3.0361 (2.9700) grad_norm 3.0023 (3.3274) [2022-01-27 04:12:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][920/1251] eta 0:12:11 lr 0.000011 time 1.8770 (2.2093) loss 2.1106 (2.9654) grad_norm 2.9288 (3.3267) [2022-01-27 04:13:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][930/1251] eta 0:11:48 lr 0.000011 time 1.8361 (2.2080) loss 3.3764 (2.9668) grad_norm 3.4224 (3.3274) [2022-01-27 04:13:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][940/1251] eta 0:11:26 lr 0.000011 time 2.1889 (2.2086) loss 3.3809 (2.9676) grad_norm 4.1375 (3.3290) [2022-01-27 04:13:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][950/1251] eta 0:11:04 lr 0.000011 time 2.2465 (2.2091) loss 3.0443 (2.9687) grad_norm 2.8027 (3.3289) [2022-01-27 04:14:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][960/1251] eta 0:10:42 lr 0.000011 time 2.4172 (2.2093) loss 3.7288 (2.9685) grad_norm 3.1264 (3.3284) [2022-01-27 04:14:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][970/1251] eta 0:10:21 lr 0.000011 time 1.8795 (2.2112) loss 2.0110 (2.9680) grad_norm 3.2630 (3.3262) [2022-01-27 04:14:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][980/1251] eta 0:09:58 lr 0.000011 time 1.9404 (2.2101) loss 2.8100 (2.9671) grad_norm 2.8200 (3.3244) [2022-01-27 04:15:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][990/1251] eta 0:09:36 lr 
0.000011 time 1.9385 (2.2079) loss 2.6804 (2.9689) grad_norm 3.5056 (3.3261) [2022-01-27 04:15:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1000/1251] eta 0:09:13 lr 0.000011 time 1.9064 (2.2061) loss 1.8569 (2.9684) grad_norm 2.9616 (3.3252) [2022-01-27 04:15:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1010/1251] eta 0:08:51 lr 0.000011 time 1.6077 (2.2060) loss 3.3037 (2.9672) grad_norm 2.9675 (3.3238) [2022-01-27 04:16:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1020/1251] eta 0:08:29 lr 0.000011 time 1.8373 (2.2048) loss 2.9951 (2.9665) grad_norm 2.7940 (3.3289) [2022-01-27 04:16:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1030/1251] eta 0:08:06 lr 0.000011 time 2.1563 (2.2034) loss 3.4003 (2.9653) grad_norm 3.1310 (3.3277) [2022-01-27 04:17:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1040/1251] eta 0:07:44 lr 0.000011 time 2.0683 (2.2029) loss 3.2230 (2.9640) grad_norm 3.1381 (3.3274) [2022-01-27 04:17:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1050/1251] eta 0:07:22 lr 0.000011 time 2.0260 (2.2036) loss 2.9443 (2.9666) grad_norm 4.6031 (3.3283) [2022-01-27 04:17:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1060/1251] eta 0:07:00 lr 0.000011 time 2.1807 (2.2038) loss 1.9578 (2.9663) grad_norm 3.9561 (3.3280) [2022-01-27 04:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1070/1251] eta 0:06:38 lr 0.000011 time 1.7962 (2.2029) loss 2.7425 (2.9644) grad_norm 3.7610 (3.3284) [2022-01-27 04:18:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1080/1251] eta 0:06:16 lr 0.000011 time 3.0329 (2.2038) loss 3.0269 (2.9654) grad_norm 3.5621 (3.3276) [2022-01-27 04:18:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1090/1251] eta 0:05:54 lr 0.000011 time 2.1843 (2.2043) loss 2.9939 (2.9667) grad_norm 2.9016 (3.3272) [2022-01-27 
04:19:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1100/1251] eta 0:05:32 lr 0.000011 time 1.7696 (2.2045) loss 3.3125 (2.9654) grad_norm 4.1972 (3.3274) [2022-01-27 04:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1110/1251] eta 0:05:10 lr 0.000011 time 1.5600 (2.2035) loss 2.7307 (2.9652) grad_norm 3.5091 (3.3280) [2022-01-27 04:19:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1120/1251] eta 0:04:48 lr 0.000011 time 2.8302 (2.2045) loss 3.4645 (2.9652) grad_norm 3.4524 (3.3269) [2022-01-27 04:20:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1130/1251] eta 0:04:26 lr 0.000011 time 2.1640 (2.2049) loss 3.2251 (2.9632) grad_norm 2.8691 (3.3267) [2022-01-27 04:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1140/1251] eta 0:04:04 lr 0.000011 time 1.9406 (2.2036) loss 2.8194 (2.9622) grad_norm 3.0933 (3.3264) [2022-01-27 04:21:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1150/1251] eta 0:03:42 lr 0.000011 time 1.7074 (2.2033) loss 2.2492 (2.9618) grad_norm 2.9326 (3.3251) [2022-01-27 04:21:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1160/1251] eta 0:03:20 lr 0.000011 time 3.0357 (2.2059) loss 3.0818 (2.9642) grad_norm 3.4093 (3.3264) [2022-01-27 04:21:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1170/1251] eta 0:02:58 lr 0.000011 time 1.5954 (2.2065) loss 3.5782 (2.9662) grad_norm 3.2490 (3.3264) [2022-01-27 04:22:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1180/1251] eta 0:02:36 lr 0.000011 time 1.6478 (2.2065) loss 2.4690 (2.9650) grad_norm 3.0184 (3.3254) [2022-01-27 04:22:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1190/1251] eta 0:02:14 lr 0.000011 time 1.8636 (2.2046) loss 2.4254 (2.9661) grad_norm 2.9258 (3.3263) [2022-01-27 04:22:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1200/1251] 
eta 0:01:52 lr 0.000011 time 2.1586 (2.2038) loss 3.2194 (2.9677) grad_norm 3.4460 (3.3282) [2022-01-27 04:23:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1210/1251] eta 0:01:30 lr 0.000011 time 2.0538 (2.2036) loss 2.7840 (2.9675) grad_norm 2.9385 (3.3272) [2022-01-27 04:23:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1220/1251] eta 0:01:08 lr 0.000011 time 2.2783 (2.2038) loss 3.2443 (2.9677) grad_norm 2.9480 (3.3262) [2022-01-27 04:24:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1230/1251] eta 0:00:46 lr 0.000011 time 2.1118 (2.2042) loss 3.4302 (2.9675) grad_norm 3.1383 (3.3277) [2022-01-27 04:24:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1240/1251] eta 0:00:24 lr 0.000011 time 1.5654 (2.2035) loss 3.1377 (2.9669) grad_norm 3.0381 (3.3269) [2022-01-27 04:24:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [294/300][1250/1251] eta 0:00:02 lr 0.000011 time 1.1750 (2.1978) loss 3.2254 (2.9673) grad_norm 3.4772 (3.3252) [2022-01-27 04:24:38 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 294 training takes 0:45:49 [2022-01-27 04:24:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.204 (18.204) Loss 0.8677 (0.8677) Acc@1 80.762 (80.762) Acc@5 94.922 (94.922) [2022-01-27 04:25:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.934 (3.411) Loss 0.8606 (0.8247) Acc@1 79.102 (80.868) Acc@5 95.117 (95.197) [2022-01-27 04:25:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.912 (2.505) Loss 0.8625 (0.8240) Acc@1 81.250 (80.897) Acc@5 94.824 (95.187) [2022-01-27 04:25:45 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.564 (2.175) Loss 0.8414 (0.8138) Acc@1 80.273 (81.074) Acc@5 95.410 (95.420) [2022-01-27 04:26:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.132 (2.147) Loss 0.8547 (0.8078) Acc@1 81.055 (81.179) Acc@5 95.703 (95.494) 
[2022-01-27 04:26:13 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.152 Acc@5 95.518 [2022-01-27 04:26:13 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-01-27 04:26:13 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.24% [2022-01-27 04:26:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][0/1251] eta 7:27:37 lr 0.000011 time 21.4692 (21.4692) loss 2.0425 (2.0425) grad_norm 3.6455 (3.6455) [2022-01-27 04:26:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][10/1251] eta 1:23:02 lr 0.000011 time 2.1750 (4.0147) loss 3.5575 (3.1439) grad_norm 2.8256 (3.2823) [2022-01-27 04:27:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][20/1251] eta 1:04:16 lr 0.000011 time 2.0121 (3.1332) loss 2.0388 (3.1226) grad_norm 3.1543 (3.2064) [2022-01-27 04:27:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][30/1251] eta 0:57:07 lr 0.000011 time 1.8458 (2.8074) loss 3.4711 (3.1005) grad_norm 3.3974 (3.2849) [2022-01-27 04:28:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][40/1251] eta 0:54:06 lr 0.000011 time 3.5785 (2.6809) loss 3.0643 (2.9859) grad_norm 2.9488 (3.3039) [2022-01-27 04:28:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][50/1251] eta 0:51:29 lr 0.000011 time 2.1192 (2.5721) loss 2.9802 (2.9893) grad_norm 3.2478 (3.3028) [2022-01-27 04:28:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][60/1251] eta 0:49:56 lr 0.000011 time 1.3332 (2.5158) loss 3.2584 (2.9620) grad_norm 3.4382 (3.3079) [2022-01-27 04:29:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][70/1251] eta 0:48:31 lr 0.000011 time 1.4277 (2.4650) loss 2.2902 (2.9609) grad_norm 3.2752 (3.3018) [2022-01-27 04:29:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][80/1251] eta 0:47:30 lr 0.000011 time 2.4474 (2.4343) loss 2.1688 (2.9603) 
grad_norm 3.4197 (3.2856) [2022-01-27 04:29:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][90/1251] eta 0:47:00 lr 0.000011 time 1.8937 (2.4290) loss 3.0851 (2.9509) grad_norm 2.9694 (3.2914) [2022-01-27 04:30:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][100/1251] eta 0:46:24 lr 0.000011 time 2.1981 (2.4192) loss 3.0594 (2.9580) grad_norm 3.3749 (3.2771) [2022-01-27 04:30:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][110/1251] eta 0:45:55 lr 0.000011 time 2.5435 (2.4147) loss 3.2900 (2.9619) grad_norm 3.4801 (3.2622) [2022-01-27 04:31:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][120/1251] eta 0:45:08 lr 0.000011 time 2.7189 (2.3947) loss 2.8134 (2.9678) grad_norm 4.7829 (3.2793) [2022-01-27 04:31:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][130/1251] eta 0:44:08 lr 0.000011 time 2.1685 (2.3624) loss 2.4267 (2.9643) grad_norm 3.1427 (3.2715) [2022-01-27 04:31:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][140/1251] eta 0:43:06 lr 0.000011 time 1.8695 (2.3285) loss 3.3937 (2.9719) grad_norm 2.6012 (3.2600) [2022-01-27 04:32:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][150/1251] eta 0:42:14 lr 0.000011 time 1.6497 (2.3020) loss 3.4518 (2.9767) grad_norm 3.4425 (3.2700) [2022-01-27 04:32:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][160/1251] eta 0:41:35 lr 0.000011 time 2.1807 (2.2870) loss 2.6600 (2.9893) grad_norm 2.8866 (3.2833) [2022-01-27 04:32:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][170/1251] eta 0:40:55 lr 0.000011 time 2.0655 (2.2715) loss 3.5473 (2.9854) grad_norm 3.4023 (3.2933) [2022-01-27 04:33:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][180/1251] eta 0:40:32 lr 0.000011 time 2.8457 (2.2713) loss 3.2226 (2.9893) grad_norm 3.5658 (3.2916) [2022-01-27 04:33:25 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [295/300][190/1251] eta 0:39:59 lr 0.000011 time 1.5709 (2.2619) loss 3.3445 (2.9946) grad_norm 3.3469 (3.2999) [2022-01-27 04:33:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][200/1251] eta 0:39:38 lr 0.000011 time 2.5021 (2.2634) loss 3.5137 (2.9921) grad_norm 3.1041 (3.3018) [2022-01-27 04:34:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][210/1251] eta 0:39:14 lr 0.000011 time 2.2803 (2.2615) loss 2.3895 (2.9826) grad_norm 3.2710 (3.3102) [2022-01-27 04:34:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][220/1251] eta 0:38:54 lr 0.000011 time 2.6771 (2.2640) loss 2.9192 (2.9785) grad_norm 4.0641 (3.3130) [2022-01-27 04:34:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][230/1251] eta 0:38:29 lr 0.000011 time 1.7687 (2.2617) loss 3.2364 (2.9669) grad_norm 2.7815 (3.3020) [2022-01-27 04:35:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][240/1251] eta 0:38:09 lr 0.000011 time 2.8675 (2.2646) loss 3.2829 (2.9627) grad_norm 3.7690 (3.3258) [2022-01-27 04:35:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][250/1251] eta 0:37:43 lr 0.000011 time 2.2828 (2.2610) loss 2.1210 (2.9608) grad_norm 3.5097 (3.3327) [2022-01-27 04:36:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][260/1251] eta 0:37:24 lr 0.000011 time 2.4545 (2.2646) loss 3.3505 (2.9686) grad_norm 2.8971 (3.3300) [2022-01-27 04:36:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][270/1251] eta 0:36:54 lr 0.000011 time 1.8248 (2.2574) loss 3.6618 (2.9787) grad_norm 3.4157 (3.3292) [2022-01-27 04:36:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][280/1251] eta 0:36:30 lr 0.000011 time 2.4741 (2.2559) loss 2.5390 (2.9815) grad_norm 3.0170 (3.3321) [2022-01-27 04:37:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][290/1251] eta 0:36:06 lr 0.000011 time 2.7360 (2.2541) loss 3.1320 (2.9808) 
grad_norm 2.8918 (3.3355) [2022-01-27 04:37:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][300/1251] eta 0:35:42 lr 0.000011 time 2.2592 (2.2529) loss 2.8618 (2.9744) grad_norm 3.0958 (3.3376) [2022-01-27 04:37:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][310/1251] eta 0:35:17 lr 0.000011 time 1.8634 (2.2504) loss 3.1084 (2.9786) grad_norm 2.9533 (3.3364) [2022-01-27 04:38:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][320/1251] eta 0:34:48 lr 0.000011 time 1.6692 (2.2430) loss 2.9448 (2.9803) grad_norm 3.1326 (3.3329) [2022-01-27 04:38:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][330/1251] eta 0:34:31 lr 0.000011 time 3.6087 (2.2494) loss 3.2947 (2.9837) grad_norm 3.0728 (3.3259) [2022-01-27 04:39:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][340/1251] eta 0:34:15 lr 0.000011 time 2.9847 (2.2558) loss 3.2077 (2.9877) grad_norm 2.8135 (3.3199) [2022-01-27 04:39:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][350/1251] eta 0:33:44 lr 0.000011 time 1.6438 (2.2466) loss 1.8841 (2.9873) grad_norm 3.3751 (3.3167) [2022-01-27 04:39:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][360/1251] eta 0:33:17 lr 0.000011 time 1.9547 (2.2423) loss 3.4371 (2.9824) grad_norm 3.9945 (3.3200) [2022-01-27 04:40:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][370/1251] eta 0:32:55 lr 0.000011 time 3.1334 (2.2421) loss 3.4208 (2.9791) grad_norm 3.1588 (3.3208) [2022-01-27 04:40:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][380/1251] eta 0:32:29 lr 0.000011 time 1.9250 (2.2380) loss 2.6720 (2.9767) grad_norm 3.4753 (3.3245) [2022-01-27 04:40:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][390/1251] eta 0:32:08 lr 0.000011 time 1.7600 (2.2396) loss 2.1999 (2.9704) grad_norm 3.6661 (3.3256) [2022-01-27 04:41:12 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [295/300][400/1251] eta 0:31:47 lr 0.000011 time 1.5045 (2.2416) loss 1.8114 (2.9642) grad_norm 2.7590 (3.3204) [2022-01-27 04:41:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][410/1251] eta 0:31:24 lr 0.000011 time 3.0854 (2.2413) loss 3.0889 (2.9684) grad_norm 3.2414 (3.3163) [2022-01-27 04:41:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][420/1251] eta 0:30:59 lr 0.000011 time 1.9198 (2.2375) loss 2.1844 (2.9659) grad_norm 3.7742 (3.3168) [2022-01-27 04:42:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][430/1251] eta 0:30:32 lr 0.000011 time 1.6160 (2.2321) loss 2.4164 (2.9657) grad_norm 3.1342 (3.3181) [2022-01-27 04:42:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][440/1251] eta 0:30:08 lr 0.000011 time 2.2557 (2.2294) loss 2.4671 (2.9626) grad_norm 4.0267 (3.3216) [2022-01-27 04:42:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][450/1251] eta 0:29:46 lr 0.000011 time 2.2190 (2.2298) loss 3.3976 (2.9643) grad_norm 3.9655 (3.3237) [2022-01-27 04:43:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][460/1251] eta 0:29:25 lr 0.000011 time 2.8893 (2.2324) loss 3.0605 (2.9660) grad_norm 3.4446 (3.3257) [2022-01-27 04:43:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][470/1251] eta 0:29:01 lr 0.000011 time 1.5935 (2.2302) loss 2.8963 (2.9677) grad_norm 3.4492 (3.3226) [2022-01-27 04:44:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][480/1251] eta 0:28:40 lr 0.000011 time 2.4392 (2.2310) loss 3.1901 (2.9647) grad_norm 3.4744 (3.3204) [2022-01-27 04:44:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][490/1251] eta 0:28:17 lr 0.000011 time 2.8555 (2.2313) loss 2.1563 (2.9638) grad_norm 3.5997 (3.3216) [2022-01-27 04:44:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][500/1251] eta 0:27:52 lr 0.000011 time 1.9366 (2.2276) loss 2.9447 (2.9588) 
grad_norm 2.9416 (3.3219) [2022-01-27 04:45:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][510/1251] eta 0:27:27 lr 0.000011 time 1.8992 (2.2234) loss 2.1907 (2.9595) grad_norm 3.2307 (3.3195) [2022-01-27 04:45:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][520/1251] eta 0:27:02 lr 0.000011 time 2.1576 (2.2201) loss 3.2577 (2.9648) grad_norm 3.5599 (3.3179) [2022-01-27 04:45:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][530/1251] eta 0:26:40 lr 0.000011 time 2.7370 (2.2204) loss 3.4053 (2.9685) grad_norm 3.3853 (3.3166) [2022-01-27 04:46:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][540/1251] eta 0:26:19 lr 0.000011 time 1.4903 (2.2221) loss 2.9119 (2.9677) grad_norm 3.2725 (3.3156) [2022-01-27 04:46:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][550/1251] eta 0:25:58 lr 0.000011 time 2.4952 (2.2227) loss 2.6356 (2.9694) grad_norm 2.7231 (3.3119) [2022-01-27 04:47:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][560/1251] eta 0:25:36 lr 0.000011 time 3.1225 (2.2242) loss 2.9074 (2.9707) grad_norm 3.3517 (3.3137) [2022-01-27 04:47:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][570/1251] eta 0:25:14 lr 0.000011 time 2.7506 (2.2247) loss 3.4751 (2.9700) grad_norm 3.9626 (3.3137) [2022-01-27 04:47:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][580/1251] eta 0:24:51 lr 0.000011 time 1.5415 (2.2225) loss 1.9423 (2.9629) grad_norm 3.2464 (3.3098) [2022-01-27 04:48:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][590/1251] eta 0:24:27 lr 0.000011 time 2.1347 (2.2206) loss 2.5986 (2.9641) grad_norm 3.9302 (3.3129) [2022-01-27 04:48:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][600/1251] eta 0:24:03 lr 0.000011 time 2.1542 (2.2175) loss 2.4286 (2.9651) grad_norm 3.0851 (3.3141) [2022-01-27 04:48:47 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [295/300][610/1251] eta 0:23:40 lr 0.000011 time 2.4857 (2.2164) loss 3.5860 (2.9649) grad_norm 3.0798 (3.3138) [2022-01-27 04:49:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][620/1251] eta 0:23:17 lr 0.000011 time 1.8808 (2.2145) loss 2.8661 (2.9657) grad_norm 3.6385 (3.3143) [2022-01-27 04:49:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][630/1251] eta 0:22:55 lr 0.000011 time 2.8771 (2.2144) loss 3.2708 (2.9673) grad_norm 3.5614 (3.3181) [2022-01-27 04:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][640/1251] eta 0:22:33 lr 0.000011 time 2.4132 (2.2159) loss 3.6212 (2.9656) grad_norm 3.1019 (3.3150) [2022-01-27 04:50:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][650/1251] eta 0:22:12 lr 0.000011 time 2.1499 (2.2167) loss 3.2036 (2.9666) grad_norm 3.0137 (3.3179) [2022-01-27 04:50:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][660/1251] eta 0:21:49 lr 0.000011 time 2.0814 (2.2160) loss 3.4117 (2.9663) grad_norm 2.9390 (3.3153) [2022-01-27 04:51:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][670/1251] eta 0:21:28 lr 0.000011 time 3.4457 (2.2169) loss 2.3204 (2.9624) grad_norm 3.2579 (3.3207) [2022-01-27 04:51:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][680/1251] eta 0:21:04 lr 0.000011 time 1.6425 (2.2143) loss 3.3083 (2.9629) grad_norm 2.9790 (3.3226) [2022-01-27 04:51:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][690/1251] eta 0:20:40 lr 0.000011 time 2.3018 (2.2119) loss 1.9550 (2.9611) grad_norm 2.9949 (3.3237) [2022-01-27 04:52:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][700/1251] eta 0:20:18 lr 0.000011 time 2.2678 (2.2117) loss 3.5551 (2.9638) grad_norm 3.5666 (3.3233) [2022-01-27 04:52:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][710/1251] eta 0:19:55 lr 0.000011 time 2.4058 (2.2089) loss 2.1827 (2.9640) 
grad_norm 3.4951 (3.3215) [2022-01-27 04:52:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][720/1251] eta 0:19:31 lr 0.000011 time 2.0204 (2.2066) loss 2.3347 (2.9590) grad_norm 3.1951 (3.3221) [2022-01-27 04:53:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][730/1251] eta 0:19:09 lr 0.000011 time 2.7660 (2.2063) loss 2.4042 (2.9604) grad_norm 3.1858 (3.3249) [2022-01-27 04:53:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][740/1251] eta 0:18:47 lr 0.000011 time 2.5359 (2.2074) loss 3.5839 (2.9620) grad_norm 3.2435 (3.3303) [2022-01-27 04:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][750/1251] eta 0:18:26 lr 0.000011 time 1.8446 (2.2080) loss 3.3629 (2.9630) grad_norm 3.0223 (3.3299) [2022-01-27 04:54:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][760/1251] eta 0:18:03 lr 0.000011 time 1.4683 (2.2070) loss 2.4747 (2.9631) grad_norm 3.1205 (3.3304) [2022-01-27 04:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][770/1251] eta 0:17:43 lr 0.000011 time 2.4839 (2.2102) loss 3.1469 (2.9636) grad_norm 3.2662 (3.3294) [2022-01-27 04:55:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][780/1251] eta 0:17:22 lr 0.000011 time 1.8299 (2.2124) loss 2.5371 (2.9606) grad_norm 3.0139 (3.3286) [2022-01-27 04:55:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][790/1251] eta 0:16:59 lr 0.000011 time 2.0103 (2.2123) loss 3.8076 (2.9613) grad_norm 3.3843 (3.3296) [2022-01-27 04:55:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][800/1251] eta 0:16:36 lr 0.000011 time 1.6903 (2.2091) loss 3.3141 (2.9621) grad_norm 2.9117 (3.3317) [2022-01-27 04:56:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][810/1251] eta 0:16:12 lr 0.000011 time 1.9546 (2.2058) loss 2.0838 (2.9628) grad_norm 2.7680 (3.3299) [2022-01-27 04:56:22 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [295/300][820/1251] eta 0:15:50 lr 0.000011 time 1.9499 (2.2042) loss 1.9761 (2.9612) grad_norm 3.6647 (3.3311) [2022-01-27 04:56:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][830/1251] eta 0:15:27 lr 0.000011 time 2.0428 (2.2029) loss 3.4774 (2.9630) grad_norm 3.3906 (3.3309) [2022-01-27 04:57:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][840/1251] eta 0:15:05 lr 0.000011 time 1.8951 (2.2036) loss 1.8858 (2.9593) grad_norm 2.7439 (3.3278) [2022-01-27 04:57:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][850/1251] eta 0:14:44 lr 0.000011 time 1.9304 (2.2047) loss 2.7987 (2.9590) grad_norm 2.9662 (3.3310) [2022-01-27 04:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][860/1251] eta 0:14:24 lr 0.000011 time 1.9005 (2.2105) loss 2.7262 (2.9588) grad_norm 3.4596 (3.3302) [2022-01-27 04:58:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][870/1251] eta 0:14:01 lr 0.000011 time 1.6405 (2.2095) loss 3.3390 (2.9610) grad_norm 3.1316 (3.3291) [2022-01-27 04:58:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][880/1251] eta 0:13:39 lr 0.000011 time 2.0346 (2.2079) loss 3.6235 (2.9614) grad_norm 3.0036 (3.3287) [2022-01-27 04:59:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][890/1251] eta 0:13:17 lr 0.000010 time 1.8219 (2.2079) loss 3.2731 (2.9647) grad_norm 3.7143 (3.3292) [2022-01-27 04:59:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][900/1251] eta 0:12:55 lr 0.000010 time 1.7681 (2.2103) loss 3.2583 (2.9635) grad_norm 2.8274 (3.3289) [2022-01-27 04:59:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][910/1251] eta 0:12:33 lr 0.000010 time 1.5684 (2.2093) loss 3.5074 (2.9632) grad_norm 3.0978 (3.3294) [2022-01-27 05:00:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][920/1251] eta 0:12:10 lr 0.000010 time 2.2134 (2.2082) loss 3.4200 (2.9651) 
grad_norm 3.5376 (3.3301) [2022-01-27 05:00:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][930/1251] eta 0:11:48 lr 0.000010 time 1.5347 (2.2080) loss 3.1087 (2.9658) grad_norm 3.5162 (3.3296) [2022-01-27 05:00:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][940/1251] eta 0:11:27 lr 0.000010 time 2.8083 (2.2111) loss 3.1384 (2.9667) grad_norm 3.7084 (3.3296) [2022-01-27 05:01:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][950/1251] eta 0:11:05 lr 0.000010 time 2.2956 (2.2094) loss 3.1585 (2.9647) grad_norm 3.2760 (3.3345) [2022-01-27 05:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][960/1251] eta 0:10:42 lr 0.000010 time 1.7948 (2.2077) loss 2.5965 (2.9648) grad_norm 3.0378 (3.3355) [2022-01-27 05:01:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][970/1251] eta 0:10:20 lr 0.000010 time 1.8221 (2.2068) loss 3.2935 (2.9670) grad_norm 2.8275 (3.3381) [2022-01-27 05:02:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][980/1251] eta 0:09:58 lr 0.000010 time 2.6817 (2.2078) loss 3.4010 (2.9670) grad_norm 2.7896 (3.3380) [2022-01-27 05:02:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][990/1251] eta 0:09:35 lr 0.000010 time 2.0857 (2.2062) loss 3.0759 (2.9675) grad_norm 3.7382 (3.3360) [2022-01-27 05:03:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1000/1251] eta 0:09:13 lr 0.000010 time 2.0725 (2.2062) loss 3.4829 (2.9686) grad_norm 3.9741 (3.3352) [2022-01-27 05:03:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1010/1251] eta 0:08:51 lr 0.000010 time 1.8184 (2.2057) loss 2.2672 (2.9675) grad_norm 3.9830 (3.3383) [2022-01-27 05:03:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1020/1251] eta 0:08:29 lr 0.000010 time 2.0966 (2.2050) loss 3.3123 (2.9687) grad_norm 3.1023 (3.3382) [2022-01-27 05:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO 
Train: [295/300][1030/1251] eta 0:08:06 lr 0.000010 time 1.9843 (2.2036) loss 3.7551 (2.9712) grad_norm 3.2472 (3.3399) [2022-01-27 05:04:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1040/1251] eta 0:07:44 lr 0.000010 time 1.5787 (2.2018) loss 2.7186 (2.9732) grad_norm 3.9653 (3.3424) [2022-01-27 05:04:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1050/1251] eta 0:07:22 lr 0.000010 time 2.5722 (2.2023) loss 3.2935 (2.9721) grad_norm 4.0664 (3.3430) [2022-01-27 05:05:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1060/1251] eta 0:07:00 lr 0.000010 time 3.0860 (2.2028) loss 2.5729 (2.9715) grad_norm 4.1195 (3.3427) [2022-01-27 05:05:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1070/1251] eta 0:06:39 lr 0.000010 time 2.1113 (2.2047) loss 3.4531 (2.9712) grad_norm 3.2555 (3.3428) [2022-01-27 05:05:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1080/1251] eta 0:06:17 lr 0.000010 time 1.8898 (2.2047) loss 2.5603 (2.9699) grad_norm 3.3025 (3.3444) [2022-01-27 05:06:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1090/1251] eta 0:05:55 lr 0.000010 time 2.4916 (2.2068) loss 3.0153 (2.9710) grad_norm 3.2420 (3.3457) [2022-01-27 05:06:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1100/1251] eta 0:05:33 lr 0.000010 time 3.2731 (2.2085) loss 3.4216 (2.9720) grad_norm 3.4426 (3.3457) [2022-01-27 05:07:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1110/1251] eta 0:05:11 lr 0.000010 time 1.8870 (2.2083) loss 2.1791 (2.9709) grad_norm 3.6504 (3.3461) [2022-01-27 05:07:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1120/1251] eta 0:04:49 lr 0.000010 time 1.8830 (2.2066) loss 3.3202 (2.9712) grad_norm 2.9497 (3.3453) [2022-01-27 05:07:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1130/1251] eta 0:04:26 lr 0.000010 time 1.9650 (2.2039) loss 3.3642 
(2.9734) grad_norm 3.4734 (3.3440) [2022-01-27 05:08:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1140/1251] eta 0:04:04 lr 0.000010 time 2.7267 (2.2023) loss 3.0069 (2.9720) grad_norm 3.2741 (3.3468) [2022-01-27 05:08:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1150/1251] eta 0:03:42 lr 0.000010 time 2.2029 (2.2016) loss 3.5727 (2.9731) grad_norm 3.7271 (3.3472) [2022-01-27 05:08:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1160/1251] eta 0:03:20 lr 0.000010 time 1.9354 (2.2013) loss 1.9002 (2.9712) grad_norm 3.5181 (3.3506) [2022-01-27 05:09:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1170/1251] eta 0:02:58 lr 0.000010 time 2.4009 (2.2022) loss 3.3539 (2.9685) grad_norm 3.2782 (3.3497) [2022-01-27 05:09:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1180/1251] eta 0:02:36 lr 0.000010 time 2.8489 (2.2017) loss 2.7511 (2.9707) grad_norm 3.9986 (3.3505) [2022-01-27 05:09:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1190/1251] eta 0:02:14 lr 0.000010 time 2.4985 (2.2024) loss 2.4437 (2.9698) grad_norm 3.0403 (3.3514) [2022-01-27 05:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1200/1251] eta 0:01:52 lr 0.000010 time 2.1893 (2.2023) loss 2.5223 (2.9701) grad_norm 3.1628 (3.3528) [2022-01-27 05:10:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1210/1251] eta 0:01:30 lr 0.000010 time 2.8473 (2.2039) loss 3.1171 (2.9695) grad_norm 3.0903 (3.3543) [2022-01-27 05:11:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1220/1251] eta 0:01:08 lr 0.000010 time 1.8857 (2.2038) loss 2.9954 (2.9691) grad_norm 3.5574 (3.3562) [2022-01-27 05:11:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1230/1251] eta 0:00:46 lr 0.000010 time 2.2295 (2.2038) loss 3.2101 (2.9686) grad_norm 2.9920 (3.3557) [2022-01-27 05:11:46 swin_tiny_patch4_window7_224] 
(main.py 193): INFO Train: [295/300][1240/1251] eta 0:00:24 lr 0.000010 time 1.4001 (2.2021) loss 3.3953 (2.9696) grad_norm 3.0838 (3.3548) [2022-01-27 05:12:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [295/300][1250/1251] eta 0:00:02 lr 0.000010 time 1.2017 (2.1963) loss 2.2394 (2.9680) grad_norm 2.8070 (3.3539) [2022-01-27 05:12:01 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 295 training takes 0:45:48 [2022-01-27 05:12:19 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.498 (18.498) Loss 0.7881 (0.7881) Acc@1 82.129 (82.129) Acc@5 95.410 (95.410) [2022-01-27 05:12:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.977 (3.497) Loss 0.9269 (0.8238) Acc@1 78.125 (80.877) Acc@5 94.043 (95.224) [2022-01-27 05:12:56 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.612 (2.632) Loss 0.8531 (0.8209) Acc@1 78.906 (80.808) Acc@5 95.020 (95.280) [2022-01-27 05:13:11 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.274 (2.251) Loss 0.8854 (0.8184) Acc@1 79.785 (80.963) Acc@5 94.336 (95.325) [2022-01-27 05:13:30 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.547 (2.174) Loss 0.7183 (0.8071) Acc@1 83.984 (81.243) Acc@5 96.094 (95.479) [2022-01-27 05:13:37 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.192 Acc@5 95.492 [2022-01-27 05:13:37 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-01-27 05:13:37 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.24% [2022-01-27 05:14:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][0/1251] eta 7:43:13 lr 0.000010 time 22.2172 (22.2172) loss 3.1717 (3.1717) grad_norm 3.4666 (3.4666) [2022-01-27 05:14:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][10/1251] eta 1:23:58 lr 0.000010 time 2.5653 (4.0603) loss 3.9326 (3.0058) grad_norm 2.9834 (3.3391) [2022-01-27 05:14:44 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][20/1251] eta 1:04:37 lr 0.000010 time 1.1979 (3.1500) loss 2.5584 (2.9966) grad_norm 3.6232 (3.3300) [2022-01-27 05:15:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][30/1251] eta 0:58:53 lr 0.000010 time 1.7208 (2.8939) loss 2.6048 (2.9816) grad_norm 3.7650 (3.3451) [2022-01-27 05:15:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][40/1251] eta 0:55:24 lr 0.000010 time 2.8876 (2.7450) loss 3.2046 (3.0407) grad_norm 2.7628 (3.3488) [2022-01-27 05:15:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][50/1251] eta 0:53:36 lr 0.000010 time 2.5394 (2.6785) loss 3.4389 (3.0728) grad_norm 3.1478 (3.3216) [2022-01-27 05:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][60/1251] eta 0:51:22 lr 0.000010 time 1.5042 (2.5879) loss 3.5213 (3.0629) grad_norm 2.9336 (3.3106) [2022-01-27 05:16:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][70/1251] eta 0:49:51 lr 0.000010 time 2.1096 (2.5334) loss 2.0591 (3.0602) grad_norm 3.5772 (3.3317) [2022-01-27 05:16:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][80/1251] eta 0:48:41 lr 0.000010 time 2.6101 (2.4950) loss 3.4339 (3.0453) grad_norm 3.3776 (3.3314) [2022-01-27 05:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][90/1251] eta 0:47:42 lr 0.000010 time 2.4682 (2.4658) loss 2.9117 (3.0032) grad_norm 3.3266 (3.3438) [2022-01-27 05:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][100/1251] eta 0:46:50 lr 0.000010 time 2.2759 (2.4415) loss 2.2484 (2.9741) grad_norm 3.0550 (3.3283) [2022-01-27 05:18:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][110/1251] eta 0:45:58 lr 0.000010 time 1.5980 (2.4172) loss 2.4396 (2.9479) grad_norm 3.3512 (3.3246) [2022-01-27 05:18:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][120/1251] eta 0:45:08 lr 0.000010 time 
1.7166 (2.3944) loss 3.0195 (2.9486) grad_norm 3.2257 (3.3059) [2022-01-27 05:18:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][130/1251] eta 0:44:30 lr 0.000010 time 1.9436 (2.3818) loss 3.4940 (2.9373) grad_norm 3.6542 (3.3297) [2022-01-27 05:19:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][140/1251] eta 0:43:55 lr 0.000010 time 2.0076 (2.3720) loss 3.3332 (2.9449) grad_norm 3.6304 (3.3261) [2022-01-27 05:19:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][150/1251] eta 0:43:07 lr 0.000010 time 1.8991 (2.3504) loss 3.0207 (2.9505) grad_norm 2.9660 (3.3210) [2022-01-27 05:19:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][160/1251] eta 0:42:29 lr 0.000010 time 1.8819 (2.3368) loss 3.6999 (2.9515) grad_norm 3.1634 (3.3189) [2022-01-27 05:20:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][170/1251] eta 0:41:59 lr 0.000010 time 2.3955 (2.3304) loss 3.2524 (2.9560) grad_norm 3.6079 (3.3115) [2022-01-27 05:20:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][180/1251] eta 0:41:28 lr 0.000010 time 1.9396 (2.3232) loss 3.1807 (2.9552) grad_norm 3.4069 (3.3111) [2022-01-27 05:20:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][190/1251] eta 0:40:54 lr 0.000010 time 2.2490 (2.3135) loss 3.3833 (2.9578) grad_norm 3.0793 (3.3064) [2022-01-27 05:21:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][200/1251] eta 0:40:21 lr 0.000010 time 1.9159 (2.3044) loss 2.3153 (2.9508) grad_norm 3.3717 (3.3038) [2022-01-27 05:21:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][210/1251] eta 0:39:56 lr 0.000010 time 3.0245 (2.3025) loss 3.2774 (2.9577) grad_norm 3.2590 (3.3187) [2022-01-27 05:22:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][220/1251] eta 0:39:27 lr 0.000010 time 1.8668 (2.2963) loss 3.1156 (2.9639) grad_norm 3.2130 (3.3301) [2022-01-27 05:22:26 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][230/1251] eta 0:38:57 lr 0.000010 time 1.9311 (2.2890) loss 3.2119 (2.9658) grad_norm 3.2649 (3.3339) [2022-01-27 05:22:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][240/1251] eta 0:38:22 lr 0.000010 time 1.8717 (2.2778) loss 3.1924 (2.9710) grad_norm 2.9424 (3.3329) [2022-01-27 05:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][250/1251] eta 0:37:51 lr 0.000010 time 2.8295 (2.2697) loss 3.8035 (2.9763) grad_norm 3.2990 (3.3300) [2022-01-27 05:23:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][260/1251] eta 0:37:22 lr 0.000010 time 2.2295 (2.2629) loss 3.2849 (2.9800) grad_norm 3.2852 (3.3323) [2022-01-27 05:23:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][270/1251] eta 0:36:55 lr 0.000010 time 1.6785 (2.2589) loss 3.4109 (2.9843) grad_norm 3.1519 (3.3353) [2022-01-27 05:24:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][280/1251] eta 0:36:30 lr 0.000010 time 1.7917 (2.2562) loss 3.4639 (2.9881) grad_norm 3.2768 (3.3326) [2022-01-27 05:24:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][290/1251] eta 0:36:12 lr 0.000010 time 3.2936 (2.2606) loss 2.8961 (2.9863) grad_norm 2.8444 (3.3308) [2022-01-27 05:24:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][300/1251] eta 0:35:53 lr 0.000010 time 2.2035 (2.2645) loss 3.2153 (2.9827) grad_norm 3.7046 (3.3346) [2022-01-27 05:25:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][310/1251] eta 0:35:26 lr 0.000010 time 1.9735 (2.2600) loss 2.2611 (2.9755) grad_norm 3.2755 (3.3324) [2022-01-27 05:25:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][320/1251] eta 0:35:01 lr 0.000010 time 1.8743 (2.2572) loss 3.2891 (2.9751) grad_norm 3.7051 (3.3329) [2022-01-27 05:26:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][330/1251] eta 0:34:37 lr 
0.000010 time 1.8126 (2.2552) loss 3.4373 (2.9731) grad_norm 3.3368 (3.3347) [2022-01-27 05:26:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][340/1251] eta 0:34:09 lr 0.000010 time 1.5686 (2.2498) loss 2.8316 (2.9707) grad_norm 2.9052 (3.3313) [2022-01-27 05:26:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][350/1251] eta 0:33:47 lr 0.000010 time 3.2503 (2.2500) loss 2.2612 (2.9669) grad_norm 3.0085 (3.3342) [2022-01-27 05:27:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][360/1251] eta 0:33:22 lr 0.000010 time 1.7644 (2.2470) loss 3.0855 (2.9687) grad_norm 3.6048 (3.3364) [2022-01-27 05:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][370/1251] eta 0:32:58 lr 0.000010 time 2.5005 (2.2457) loss 2.9514 (2.9698) grad_norm 3.6008 (3.3379) [2022-01-27 05:27:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][380/1251] eta 0:32:34 lr 0.000010 time 1.8701 (2.2440) loss 2.4991 (2.9698) grad_norm 3.0734 (3.3353) [2022-01-27 05:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][390/1251] eta 0:32:10 lr 0.000010 time 2.4496 (2.2418) loss 3.2242 (2.9646) grad_norm 3.0639 (3.3376) [2022-01-27 05:28:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][400/1251] eta 0:31:44 lr 0.000010 time 2.2145 (2.2375) loss 3.1042 (2.9656) grad_norm 2.7963 (3.3481) [2022-01-27 05:28:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][410/1251] eta 0:31:20 lr 0.000010 time 2.8498 (2.2362) loss 2.4076 (2.9663) grad_norm 2.8882 (3.3482) [2022-01-27 05:29:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][420/1251] eta 0:30:58 lr 0.000010 time 1.6266 (2.2366) loss 2.9431 (2.9617) grad_norm 3.1456 (3.3446) [2022-01-27 05:29:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][430/1251] eta 0:30:37 lr 0.000010 time 1.7195 (2.2375) loss 3.3477 (2.9675) grad_norm 3.4618 (3.3405) [2022-01-27 05:30:04 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][440/1251] eta 0:30:14 lr 0.000010 time 2.4992 (2.2372) loss 3.1184 (2.9683) grad_norm 3.8491 (3.3391) [2022-01-27 05:30:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][450/1251] eta 0:29:50 lr 0.000010 time 3.1409 (2.2356) loss 2.7889 (2.9692) grad_norm 3.2333 (3.3447) [2022-01-27 05:30:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][460/1251] eta 0:29:26 lr 0.000010 time 2.0755 (2.2328) loss 2.9629 (2.9679) grad_norm 3.6501 (3.3446) [2022-01-27 05:31:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][470/1251] eta 0:29:01 lr 0.000010 time 1.5123 (2.2293) loss 3.3098 (2.9676) grad_norm 3.7168 (3.3439) [2022-01-27 05:31:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][480/1251] eta 0:28:37 lr 0.000010 time 1.9958 (2.2273) loss 2.6722 (2.9683) grad_norm 3.2015 (3.3420) [2022-01-27 05:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][490/1251] eta 0:28:15 lr 0.000010 time 2.8323 (2.2284) loss 3.1606 (2.9699) grad_norm 3.6078 (3.3424) [2022-01-27 05:32:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][500/1251] eta 0:27:53 lr 0.000010 time 2.0876 (2.2282) loss 2.4856 (2.9693) grad_norm 3.2131 (3.3478) [2022-01-27 05:32:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][510/1251] eta 0:27:29 lr 0.000010 time 1.5385 (2.2264) loss 2.3250 (2.9681) grad_norm 4.2180 (3.3456) [2022-01-27 05:32:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][520/1251] eta 0:27:06 lr 0.000010 time 2.5217 (2.2251) loss 3.5732 (2.9706) grad_norm 3.6841 (3.3482) [2022-01-27 05:33:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][530/1251] eta 0:26:43 lr 0.000010 time 2.4487 (2.2238) loss 2.5860 (2.9706) grad_norm 3.2264 (3.3499) [2022-01-27 05:33:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][540/1251] eta 0:26:20 lr 
0.000010 time 2.2197 (2.2225) loss 3.0490 (2.9679) grad_norm 2.9722 (3.3779) [2022-01-27 05:34:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][550/1251] eta 0:25:58 lr 0.000010 time 1.8765 (2.2226) loss 3.6623 (2.9618) grad_norm 3.0538 (3.3764) [2022-01-27 05:34:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][560/1251] eta 0:25:37 lr 0.000010 time 2.7972 (2.2251) loss 2.1851 (2.9629) grad_norm 3.2633 (3.3812) [2022-01-27 05:34:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][570/1251] eta 0:25:13 lr 0.000010 time 1.6207 (2.2226) loss 3.0198 (2.9621) grad_norm 3.6475 (3.3806) [2022-01-27 05:35:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][580/1251] eta 0:24:49 lr 0.000010 time 2.2643 (2.2196) loss 3.2258 (2.9613) grad_norm 2.9912 (3.3779) [2022-01-27 05:35:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][590/1251] eta 0:24:24 lr 0.000010 time 1.8191 (2.2154) loss 2.6989 (2.9611) grad_norm 3.2591 (3.3825) [2022-01-27 05:35:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][600/1251] eta 0:24:00 lr 0.000010 time 1.6121 (2.2131) loss 2.5537 (2.9629) grad_norm 2.9909 (3.3795) [2022-01-27 05:36:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][610/1251] eta 0:23:39 lr 0.000010 time 2.8855 (2.2152) loss 2.9828 (2.9665) grad_norm 3.5960 (3.3797) [2022-01-27 05:36:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][620/1251] eta 0:23:19 lr 0.000010 time 2.8861 (2.2181) loss 3.2955 (2.9691) grad_norm 2.8712 (3.3770) [2022-01-27 05:36:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][630/1251] eta 0:22:56 lr 0.000010 time 2.1630 (2.2165) loss 2.9369 (2.9710) grad_norm 3.1445 (3.3743) [2022-01-27 05:37:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][640/1251] eta 0:22:33 lr 0.000010 time 1.8957 (2.2152) loss 3.1883 (2.9744) grad_norm 3.1408 (3.3705) [2022-01-27 05:37:40 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][650/1251] eta 0:22:11 lr 0.000010 time 2.2484 (2.2154) loss 3.1484 (2.9780) grad_norm 2.6889 (3.3679) [2022-01-27 05:38:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][660/1251] eta 0:21:49 lr 0.000010 time 2.8986 (2.2162) loss 2.8140 (2.9747) grad_norm 3.2806 (3.3692) [2022-01-27 05:38:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][670/1251] eta 0:21:26 lr 0.000010 time 1.5917 (2.2146) loss 3.2418 (2.9732) grad_norm 3.2813 (3.3650) [2022-01-27 05:38:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][680/1251] eta 0:21:04 lr 0.000010 time 2.2365 (2.2142) loss 2.2700 (2.9744) grad_norm 3.3679 (3.3641) [2022-01-27 05:39:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][690/1251] eta 0:20:41 lr 0.000010 time 2.2941 (2.2137) loss 3.2359 (2.9710) grad_norm 2.6887 (3.3621) [2022-01-27 05:39:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][700/1251] eta 0:20:19 lr 0.000010 time 3.4994 (2.2140) loss 2.4869 (2.9642) grad_norm 3.8492 (3.3789) [2022-01-27 05:39:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][710/1251] eta 0:19:57 lr 0.000010 time 1.9893 (2.2139) loss 2.6931 (2.9632) grad_norm 3.3322 (3.3756) [2022-01-27 05:40:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][720/1251] eta 0:19:35 lr 0.000010 time 1.9409 (2.2136) loss 3.1733 (2.9604) grad_norm 3.3540 (3.3747) [2022-01-27 05:40:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][730/1251] eta 0:19:13 lr 0.000010 time 2.5700 (2.2136) loss 2.9236 (2.9593) grad_norm 3.3953 (3.3756) [2022-01-27 05:40:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][740/1251] eta 0:18:50 lr 0.000010 time 2.3062 (2.2121) loss 3.3866 (2.9608) grad_norm 3.3337 (3.3756) [2022-01-27 05:41:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][750/1251] eta 0:18:27 lr 
0.000010 time 2.1786 (2.2108) loss 2.4297 (2.9614) grad_norm 3.4795 (3.3772) [2022-01-27 05:41:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][760/1251] eta 0:18:05 lr 0.000010 time 3.0097 (2.2107) loss 2.4631 (2.9615) grad_norm 3.5539 (3.3741) [2022-01-27 05:42:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][770/1251] eta 0:17:44 lr 0.000010 time 2.4276 (2.2128) loss 3.6047 (2.9637) grad_norm 3.5020 (3.3749) [2022-01-27 05:42:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][780/1251] eta 0:17:21 lr 0.000010 time 2.1750 (2.2121) loss 3.0683 (2.9642) grad_norm 2.9770 (3.3770) [2022-01-27 05:42:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][790/1251] eta 0:16:58 lr 0.000010 time 1.5606 (2.2095) loss 2.8699 (2.9623) grad_norm 3.2443 (3.3769) [2022-01-27 05:43:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][800/1251] eta 0:16:35 lr 0.000010 time 2.0413 (2.2084) loss 3.2048 (2.9628) grad_norm 2.8857 (3.3768) [2022-01-27 05:43:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][810/1251] eta 0:16:13 lr 0.000010 time 1.9926 (2.2082) loss 3.3054 (2.9666) grad_norm 3.0952 (3.3768) [2022-01-27 05:43:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][820/1251] eta 0:15:51 lr 0.000010 time 2.2559 (2.2077) loss 3.2134 (2.9654) grad_norm 3.2421 (3.3771) [2022-01-27 05:44:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][830/1251] eta 0:15:29 lr 0.000010 time 1.9882 (2.2067) loss 1.9978 (2.9636) grad_norm 3.0430 (3.3756) [2022-01-27 05:44:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][840/1251] eta 0:15:06 lr 0.000010 time 1.8737 (2.2049) loss 3.0220 (2.9591) grad_norm 3.2762 (3.3758) [2022-01-27 05:44:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][850/1251] eta 0:14:44 lr 0.000010 time 2.2076 (2.2052) loss 3.1732 (2.9567) grad_norm 3.1819 (3.3754) [2022-01-27 05:45:16 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][860/1251] eta 0:14:22 lr 0.000010 time 1.5620 (2.2048) loss 3.3461 (2.9530) grad_norm 2.8366 (3.3730) [2022-01-27 05:45:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][870/1251] eta 0:13:59 lr 0.000010 time 2.0451 (2.2044) loss 3.0665 (2.9537) grad_norm 5.0964 (3.3765) [2022-01-27 05:46:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][880/1251] eta 0:13:38 lr 0.000010 time 2.6995 (2.2052) loss 3.3467 (2.9546) grad_norm 3.4047 (3.3765) [2022-01-27 05:46:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][890/1251] eta 0:13:15 lr 0.000010 time 1.9554 (2.2034) loss 3.3406 (2.9542) grad_norm 3.6537 (3.3774) [2022-01-27 05:46:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][900/1251] eta 0:12:52 lr 0.000010 time 1.9951 (2.2019) loss 3.1437 (2.9540) grad_norm 3.0477 (3.3772) [2022-01-27 05:47:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][910/1251] eta 0:12:30 lr 0.000010 time 2.5602 (2.2008) loss 2.9128 (2.9568) grad_norm 3.5895 (3.3775) [2022-01-27 05:47:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][920/1251] eta 0:12:08 lr 0.000010 time 2.2410 (2.2011) loss 2.5133 (2.9565) grad_norm 2.4946 (3.3767) [2022-01-27 05:47:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][930/1251] eta 0:11:46 lr 0.000010 time 2.2585 (2.2010) loss 2.4934 (2.9565) grad_norm 2.8506 (3.3734) [2022-01-27 05:48:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][940/1251] eta 0:11:24 lr 0.000010 time 2.1595 (2.2020) loss 1.8170 (2.9572) grad_norm 3.2157 (3.3731) [2022-01-27 05:48:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][950/1251] eta 0:11:02 lr 0.000010 time 2.2937 (2.2019) loss 3.2044 (2.9578) grad_norm 2.8206 (3.3742) [2022-01-27 05:48:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][960/1251] eta 0:10:40 lr 
0.000010 time 1.8757 (2.2020) loss 2.8364 (2.9568) grad_norm 3.3384 (3.3756) [2022-01-27 05:49:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][970/1251] eta 0:10:18 lr 0.000010 time 2.1973 (2.2008) loss 3.4998 (2.9588) grad_norm 3.7084 (3.3740) [2022-01-27 05:49:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][980/1251] eta 0:09:56 lr 0.000010 time 2.2625 (2.2007) loss 1.9703 (2.9567) grad_norm 3.2798 (3.3722) [2022-01-27 05:49:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][990/1251] eta 0:09:34 lr 0.000010 time 1.3911 (2.1996) loss 2.5762 (2.9564) grad_norm 4.1518 (3.3726) [2022-01-27 05:50:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1000/1251] eta 0:09:11 lr 0.000010 time 2.2885 (2.1990) loss 3.2446 (2.9589) grad_norm 2.6848 (3.3733) [2022-01-27 05:50:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1010/1251] eta 0:08:49 lr 0.000010 time 2.0769 (2.1986) loss 3.0103 (2.9582) grad_norm 2.8070 (3.3726) [2022-01-27 05:51:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1020/1251] eta 0:08:28 lr 0.000010 time 2.7682 (2.2003) loss 2.8180 (2.9599) grad_norm 4.1998 (3.3731) [2022-01-27 05:51:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1030/1251] eta 0:08:06 lr 0.000010 time 1.7612 (2.1999) loss 3.4118 (2.9598) grad_norm 3.5934 (3.3724) [2022-01-27 05:51:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1040/1251] eta 0:07:44 lr 0.000010 time 1.9244 (2.2001) loss 1.8481 (2.9600) grad_norm 3.5503 (3.3700) [2022-01-27 05:52:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1050/1251] eta 0:07:22 lr 0.000010 time 1.9034 (2.1993) loss 3.3185 (2.9598) grad_norm 3.1110 (3.3691) [2022-01-27 05:52:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1060/1251] eta 0:06:59 lr 0.000010 time 2.1778 (2.1986) loss 2.0498 (2.9600) grad_norm 3.5570 (3.3700) [2022-01-27 
05:52:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1070/1251] eta 0:06:37 lr 0.000010 time 2.2124 (2.1983) loss 2.9576 (2.9602) grad_norm 3.3659 (3.3694) [2022-01-27 05:53:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1080/1251] eta 0:06:15 lr 0.000010 time 1.9087 (2.1974) loss 2.9784 (2.9614) grad_norm 4.4913 (3.3701) [2022-01-27 05:53:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1090/1251] eta 0:05:53 lr 0.000010 time 2.2533 (2.1982) loss 3.1637 (2.9619) grad_norm 3.0447 (3.3669) [2022-01-27 05:53:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1100/1251] eta 0:05:31 lr 0.000010 time 1.7904 (2.1973) loss 1.8122 (2.9612) grad_norm 3.0345 (3.3637) [2022-01-27 05:54:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1110/1251] eta 0:05:09 lr 0.000010 time 1.9009 (2.1970) loss 2.4089 (2.9575) grad_norm 3.3845 (3.3645) [2022-01-27 05:54:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1120/1251] eta 0:04:47 lr 0.000010 time 2.1462 (2.1973) loss 3.3837 (2.9564) grad_norm 3.0899 (3.3635) [2022-01-27 05:55:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1130/1251] eta 0:04:25 lr 0.000010 time 2.1742 (2.1977) loss 3.3878 (2.9572) grad_norm 3.4034 (3.3632) [2022-01-27 05:55:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1140/1251] eta 0:04:04 lr 0.000010 time 2.6947 (2.1988) loss 3.4131 (2.9574) grad_norm 3.6287 (3.3624) [2022-01-27 05:55:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1150/1251] eta 0:03:41 lr 0.000010 time 1.9603 (2.1961) loss 3.1240 (2.9553) grad_norm 4.0362 (3.3621) [2022-01-27 05:56:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1160/1251] eta 0:03:19 lr 0.000010 time 2.5382 (2.1959) loss 3.1021 (2.9546) grad_norm 3.0499 (3.3622) [2022-01-27 05:56:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1170/1251] 
eta 0:02:57 lr 0.000010 time 1.6005 (2.1956) loss 2.5877 (2.9556) grad_norm 3.1239 (3.3616) [2022-01-27 05:56:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1180/1251] eta 0:02:35 lr 0.000010 time 1.5148 (2.1953) loss 3.3039 (2.9570) grad_norm 3.4948 (3.3618) [2022-01-27 05:57:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1190/1251] eta 0:02:13 lr 0.000010 time 2.4871 (2.1957) loss 3.2512 (2.9576) grad_norm 3.1541 (3.3621) [2022-01-27 05:57:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1200/1251] eta 0:01:51 lr 0.000010 time 2.3840 (2.1956) loss 2.3237 (2.9565) grad_norm 3.6866 (3.3622) [2022-01-27 05:57:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1210/1251] eta 0:01:30 lr 0.000010 time 1.8149 (2.1957) loss 2.9623 (2.9556) grad_norm 3.4123 (3.3616) [2022-01-27 05:58:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1220/1251] eta 0:01:08 lr 0.000010 time 1.9892 (2.1954) loss 2.5254 (2.9556) grad_norm 2.9278 (3.3611) [2022-01-27 05:58:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1230/1251] eta 0:00:46 lr 0.000010 time 2.4024 (2.1967) loss 3.2416 (2.9564) grad_norm 3.8933 (3.3617) [2022-01-27 05:59:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1240/1251] eta 0:00:24 lr 0.000010 time 2.3678 (2.1969) loss 3.3728 (2.9584) grad_norm 3.4352 (3.3599) [2022-01-27 05:59:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [296/300][1250/1251] eta 0:00:02 lr 0.000010 time 1.1478 (2.1914) loss 3.3430 (2.9590) grad_norm 3.5942 (3.3606) [2022-01-27 05:59:19 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 296 training takes 0:45:41 [2022-01-27 05:59:38 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.683 (18.683) Loss 0.8069 (0.8069) Acc@1 80.371 (80.371) Acc@5 96.191 (96.191) [2022-01-27 05:59:57 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.844 (3.458) 
Loss 0.8155 (0.8052) Acc@1 80.762 (81.072) Acc@5 95.605 (95.605) [2022-01-27 06:00:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.598 (2.666) Loss 0.7516 (0.8003) Acc@1 82.031 (81.343) Acc@5 96.484 (95.554) [2022-01-27 06:00:31 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.646 (2.301) Loss 0.8118 (0.8029) Acc@1 80.469 (81.439) Acc@5 95.508 (95.558) [2022-01-27 06:00:48 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.169 (2.165) Loss 0.8649 (0.8080) Acc@1 79.980 (81.288) Acc@5 94.629 (95.484) [2022-01-27 06:00:55 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.230 Acc@5 95.524 [2022-01-27 06:00:55 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-01-27 06:00:55 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.24% [2022-01-27 06:01:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][0/1251] eta 7:19:20 lr 0.000010 time 21.0714 (21.0714) loss 3.1432 (3.1432) grad_norm 3.4505 (3.4505) [2022-01-27 06:01:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][10/1251] eta 1:22:23 lr 0.000010 time 2.5958 (3.9833) loss 2.1009 (2.9997) grad_norm 3.0797 (3.2087) [2022-01-27 06:02:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][20/1251] eta 1:04:34 lr 0.000010 time 2.0971 (3.1477) loss 3.5762 (3.0404) grad_norm 3.2183 (3.2009) [2022-01-27 06:02:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][30/1251] eta 0:58:32 lr 0.000010 time 1.9526 (2.8768) loss 3.5309 (3.1068) grad_norm 3.4438 (3.2127) [2022-01-27 06:02:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][40/1251] eta 0:55:37 lr 0.000010 time 3.7883 (2.7559) loss 3.1536 (3.0787) grad_norm 3.7205 (3.2669) [2022-01-27 06:03:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][50/1251] eta 0:53:31 lr 0.000010 time 2.9890 (2.6737) loss 3.5463 (3.0821) 
grad_norm 3.4924 (3.2845) [2022-01-27 06:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][60/1251] eta 0:51:23 lr 0.000010 time 2.2413 (2.5893) loss 3.3551 (3.0803) grad_norm 3.6774 (3.2990) [2022-01-27 06:03:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][70/1251] eta 0:49:24 lr 0.000010 time 1.9525 (2.5102) loss 3.2819 (3.0841) grad_norm 3.6230 (3.3638) [2022-01-27 06:04:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][80/1251] eta 0:47:56 lr 0.000010 time 2.8795 (2.4562) loss 3.6577 (3.0782) grad_norm 3.4897 (3.3692) [2022-01-27 06:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][90/1251] eta 0:46:44 lr 0.000010 time 1.8980 (2.4159) loss 3.6680 (3.0665) grad_norm 3.6258 (3.4501) [2022-01-27 06:04:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][100/1251] eta 0:45:44 lr 0.000010 time 1.9382 (2.3844) loss 2.8878 (3.0612) grad_norm 3.3160 (3.4376) [2022-01-27 06:05:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][110/1251] eta 0:45:02 lr 0.000010 time 2.1623 (2.3689) loss 3.0504 (3.0466) grad_norm 4.1071 (3.4234) [2022-01-27 06:05:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][120/1251] eta 0:44:32 lr 0.000010 time 3.6202 (2.3632) loss 3.7610 (3.0502) grad_norm 3.1847 (3.4047) [2022-01-27 06:06:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][130/1251] eta 0:43:55 lr 0.000010 time 2.4792 (2.3513) loss 3.4662 (3.0429) grad_norm 4.9555 (3.4162) [2022-01-27 06:06:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][140/1251] eta 0:43:19 lr 0.000010 time 2.2352 (2.3396) loss 1.9927 (3.0374) grad_norm 3.1957 (3.4159) [2022-01-27 06:06:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][150/1251] eta 0:42:50 lr 0.000010 time 2.9547 (2.3347) loss 3.3592 (3.0319) grad_norm 3.2825 (3.3983) [2022-01-27 06:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[297/300][160/1251] eta 0:42:26 lr 0.000010 time 3.4797 (2.3339) loss 3.2905 (3.0279) grad_norm 2.6587 (3.3898) [2022-01-27 06:07:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][170/1251] eta 0:41:45 lr 0.000010 time 2.2307 (2.3179) loss 2.3533 (3.0156) grad_norm 2.9125 (3.3808) [2022-01-27 06:07:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][180/1251] eta 0:41:04 lr 0.000010 time 1.6866 (2.3011) loss 2.8215 (3.0104) grad_norm 3.2826 (3.3848) [2022-01-27 06:08:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][190/1251] eta 0:40:40 lr 0.000010 time 2.1330 (2.3002) loss 2.8527 (2.9991) grad_norm 3.6755 (3.3895) [2022-01-27 06:08:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][200/1251] eta 0:40:23 lr 0.000010 time 3.7403 (2.3062) loss 3.0687 (3.0039) grad_norm 3.7519 (3.3954) [2022-01-27 06:09:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][210/1251] eta 0:39:54 lr 0.000010 time 1.4843 (2.3001) loss 3.1814 (3.0048) grad_norm 3.0706 (3.3950) [2022-01-27 06:09:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][220/1251] eta 0:39:22 lr 0.000010 time 1.5609 (2.2918) loss 3.2219 (2.9997) grad_norm 2.8928 (3.3853) [2022-01-27 06:09:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][230/1251] eta 0:38:49 lr 0.000010 time 1.7137 (2.2819) loss 3.1631 (2.9952) grad_norm 2.8928 (3.3802) [2022-01-27 06:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][240/1251] eta 0:38:31 lr 0.000010 time 3.3844 (2.2865) loss 2.8833 (2.9923) grad_norm 3.0707 (3.3928) [2022-01-27 06:10:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][250/1251] eta 0:38:03 lr 0.000010 time 2.2222 (2.2808) loss 3.5902 (2.9756) grad_norm 3.0808 (3.3905) [2022-01-27 06:10:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][260/1251] eta 0:37:28 lr 0.000010 time 1.7172 (2.2689) loss 3.3334 (2.9730) grad_norm 
2.9980 (3.4091) [2022-01-27 06:11:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][270/1251] eta 0:36:57 lr 0.000010 time 1.9402 (2.2607) loss 3.6487 (2.9748) grad_norm 3.2610 (3.4081) [2022-01-27 06:11:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][280/1251] eta 0:36:32 lr 0.000010 time 2.2808 (2.2577) loss 3.2722 (2.9819) grad_norm 3.7162 (3.3977) [2022-01-27 06:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][290/1251] eta 0:36:06 lr 0.000010 time 2.0165 (2.2541) loss 1.8958 (2.9778) grad_norm 3.3791 (3.3966) [2022-01-27 06:12:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][300/1251] eta 0:35:38 lr 0.000010 time 2.1766 (2.2488) loss 3.1278 (2.9824) grad_norm 3.6457 (3.3932) [2022-01-27 06:12:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][310/1251] eta 0:35:18 lr 0.000010 time 2.3712 (2.2514) loss 3.0271 (2.9840) grad_norm 3.1261 (3.3922) [2022-01-27 06:12:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][320/1251] eta 0:34:56 lr 0.000010 time 2.4789 (2.2518) loss 3.6356 (2.9808) grad_norm 3.4638 (3.3890) [2022-01-27 06:13:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][330/1251] eta 0:34:35 lr 0.000010 time 2.8357 (2.2539) loss 2.7790 (2.9792) grad_norm 2.8650 (3.3855) [2022-01-27 06:13:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][340/1251] eta 0:34:05 lr 0.000010 time 2.1924 (2.2459) loss 3.4685 (2.9824) grad_norm 3.3613 (3.3846) [2022-01-27 06:14:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][350/1251] eta 0:33:38 lr 0.000010 time 1.8491 (2.2405) loss 2.4734 (2.9846) grad_norm 3.4485 (3.3855) [2022-01-27 06:14:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][360/1251] eta 0:33:15 lr 0.000010 time 2.8306 (2.2397) loss 3.3792 (2.9842) grad_norm 2.9988 (3.3784) [2022-01-27 06:14:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[297/300][370/1251] eta 0:32:53 lr 0.000010 time 2.4498 (2.2397) loss 3.0283 (2.9786) grad_norm 3.3389 (3.3783) [2022-01-27 06:15:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][380/1251] eta 0:32:33 lr 0.000010 time 2.5677 (2.2424) loss 3.0951 (2.9798) grad_norm 3.1996 (3.3789) [2022-01-27 06:15:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][390/1251] eta 0:32:11 lr 0.000010 time 1.7507 (2.2432) loss 2.8271 (2.9764) grad_norm 3.1755 (3.3814) [2022-01-27 06:15:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][400/1251] eta 0:31:50 lr 0.000010 time 2.2198 (2.2454) loss 2.5838 (2.9766) grad_norm 2.9067 (3.3832) [2022-01-27 06:16:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][410/1251] eta 0:31:22 lr 0.000010 time 1.9515 (2.2384) loss 3.4870 (2.9756) grad_norm 3.6294 (3.3857) [2022-01-27 06:16:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][420/1251] eta 0:30:54 lr 0.000010 time 1.8755 (2.2316) loss 2.6268 (2.9764) grad_norm 3.2763 (3.3853) [2022-01-27 06:16:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][430/1251] eta 0:30:29 lr 0.000010 time 2.5388 (2.2283) loss 3.2088 (2.9767) grad_norm 3.9978 (3.3853) [2022-01-27 06:17:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][440/1251] eta 0:30:06 lr 0.000010 time 2.2200 (2.2269) loss 3.3273 (2.9809) grad_norm 3.2521 (3.3989) [2022-01-27 06:17:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][450/1251] eta 0:29:43 lr 0.000010 time 2.3799 (2.2260) loss 3.5315 (2.9868) grad_norm 3.0784 (3.3950) [2022-01-27 06:18:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][460/1251] eta 0:29:22 lr 0.000010 time 2.7529 (2.2281) loss 3.1720 (2.9901) grad_norm 3.0154 (3.4069) [2022-01-27 06:18:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][470/1251] eta 0:29:01 lr 0.000010 time 2.5821 (2.2295) loss 3.2625 (2.9968) grad_norm 
2.8915 (3.4017) [2022-01-27 06:18:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][480/1251] eta 0:28:37 lr 0.000010 time 2.1934 (2.2277) loss 3.4880 (2.9909) grad_norm 3.9124 (3.4053) [2022-01-27 06:19:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][490/1251] eta 0:28:16 lr 0.000010 time 2.8689 (2.2291) loss 3.3241 (2.9867) grad_norm 3.1251 (3.3996) [2022-01-27 06:19:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][500/1251] eta 0:27:56 lr 0.000010 time 2.7161 (2.2319) loss 2.9648 (2.9820) grad_norm 2.4445 (3.3997) [2022-01-27 06:19:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][510/1251] eta 0:27:38 lr 0.000010 time 2.7397 (2.2381) loss 3.5116 (2.9829) grad_norm 2.8368 (3.3963) [2022-01-27 06:20:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][520/1251] eta 0:27:17 lr 0.000010 time 2.3081 (2.2399) loss 3.0893 (2.9862) grad_norm 3.4456 (3.3966) [2022-01-27 06:20:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][530/1251] eta 0:26:51 lr 0.000010 time 1.9537 (2.2350) loss 3.2776 (2.9896) grad_norm 3.4181 (3.3968) [2022-01-27 06:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][540/1251] eta 0:26:23 lr 0.000010 time 1.8589 (2.2277) loss 3.3310 (2.9854) grad_norm 3.2902 (3.3992) [2022-01-27 06:21:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][550/1251] eta 0:25:58 lr 0.000010 time 2.1618 (2.2238) loss 3.3800 (2.9916) grad_norm 2.9634 (3.3992) [2022-01-27 06:21:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][560/1251] eta 0:25:35 lr 0.000010 time 2.4811 (2.2221) loss 3.3829 (2.9937) grad_norm 3.1560 (3.3969) [2022-01-27 06:22:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][570/1251] eta 0:25:12 lr 0.000010 time 2.6166 (2.2213) loss 3.6952 (2.9933) grad_norm 4.0148 (3.4003) [2022-01-27 06:22:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[297/300][580/1251] eta 0:24:51 lr 0.000010 time 2.2573 (2.2231) loss 3.1048 (2.9926) grad_norm 2.4835 (3.3962) [2022-01-27 06:22:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][590/1251] eta 0:24:30 lr 0.000010 time 2.4087 (2.2246) loss 2.3343 (2.9912) grad_norm 2.9358 (3.3979) [2022-01-27 06:23:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][600/1251] eta 0:24:07 lr 0.000010 time 1.8325 (2.2235) loss 3.3336 (2.9935) grad_norm 3.3052 (3.3951) [2022-01-27 06:23:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][610/1251] eta 0:23:45 lr 0.000010 time 1.9505 (2.2238) loss 3.0760 (2.9919) grad_norm 3.4801 (3.3985) [2022-01-27 06:23:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][620/1251] eta 0:23:24 lr 0.000010 time 2.0519 (2.2252) loss 2.5682 (2.9919) grad_norm 6.0883 (3.3982) [2022-01-27 06:24:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][630/1251] eta 0:23:01 lr 0.000010 time 1.7363 (2.2241) loss 2.7191 (2.9912) grad_norm 4.1346 (3.3944) [2022-01-27 06:24:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][640/1251] eta 0:22:38 lr 0.000010 time 1.9435 (2.2227) loss 3.0836 (2.9876) grad_norm 2.9509 (3.3904) [2022-01-27 06:25:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][650/1251] eta 0:22:14 lr 0.000010 time 2.2360 (2.2203) loss 2.2297 (2.9820) grad_norm 2.6044 (3.3876) [2022-01-27 06:25:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][660/1251] eta 0:21:50 lr 0.000010 time 1.8838 (2.2173) loss 3.3511 (2.9829) grad_norm 2.9451 (3.3877) [2022-01-27 06:25:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][670/1251] eta 0:21:28 lr 0.000010 time 2.1285 (2.2169) loss 2.4969 (2.9835) grad_norm 3.2002 (3.3854) [2022-01-27 06:26:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][680/1251] eta 0:21:05 lr 0.000010 time 1.6262 (2.2167) loss 2.1330 (2.9817) grad_norm 
3.8362 (3.3843) [2022-01-27 06:26:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][690/1251] eta 0:20:43 lr 0.000010 time 2.3461 (2.2169) loss 3.1336 (2.9810) grad_norm 3.3492 (3.3853) [2022-01-27 06:26:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][700/1251] eta 0:20:21 lr 0.000010 time 1.5551 (2.2177) loss 2.2560 (2.9812) grad_norm 3.6541 (3.3853) [2022-01-27 06:27:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][710/1251] eta 0:20:01 lr 0.000010 time 2.7268 (2.2200) loss 2.7520 (2.9820) grad_norm 3.2478 (3.3836) [2022-01-27 06:27:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][720/1251] eta 0:19:37 lr 0.000010 time 2.2378 (2.2182) loss 3.1170 (2.9811) grad_norm 4.0299 (3.3820) [2022-01-27 06:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][730/1251] eta 0:19:13 lr 0.000010 time 1.9257 (2.2142) loss 2.8710 (2.9795) grad_norm 3.0828 (3.3824) [2022-01-27 06:28:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][740/1251] eta 0:18:50 lr 0.000010 time 2.2991 (2.2121) loss 3.1669 (2.9827) grad_norm 3.5954 (3.3813) [2022-01-27 06:28:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][750/1251] eta 0:18:27 lr 0.000010 time 1.8563 (2.2114) loss 2.7112 (2.9828) grad_norm 2.8012 (3.3798) [2022-01-27 06:28:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][760/1251] eta 0:18:05 lr 0.000010 time 2.4396 (2.2105) loss 2.9357 (2.9855) grad_norm 3.3402 (3.3783) [2022-01-27 06:29:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][770/1251] eta 0:17:42 lr 0.000010 time 2.5669 (2.2086) loss 2.3694 (2.9828) grad_norm 3.4438 (3.3806) [2022-01-27 06:29:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][780/1251] eta 0:17:20 lr 0.000010 time 2.0372 (2.2091) loss 3.1144 (2.9847) grad_norm 2.9565 (3.3810) [2022-01-27 06:30:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[297/300][790/1251] eta 0:16:59 lr 0.000010 time 1.9042 (2.2115) loss 2.6285 (2.9838) grad_norm 3.7011 (3.3817) [2022-01-27 06:30:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][800/1251] eta 0:16:37 lr 0.000010 time 2.5545 (2.2122) loss 2.4050 (2.9830) grad_norm 3.0190 (3.3789) [2022-01-27 06:30:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][810/1251] eta 0:16:15 lr 0.000010 time 2.4384 (2.2131) loss 3.3288 (2.9788) grad_norm 2.9814 (3.3768) [2022-01-27 06:31:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][820/1251] eta 0:15:53 lr 0.000010 time 2.4960 (2.2127) loss 3.2891 (2.9811) grad_norm 3.1495 (3.3749) [2022-01-27 06:31:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][830/1251] eta 0:15:32 lr 0.000010 time 2.3718 (2.2149) loss 3.4010 (2.9790) grad_norm 3.3842 (3.3729) [2022-01-27 06:31:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][840/1251] eta 0:15:10 lr 0.000010 time 3.0897 (2.2154) loss 3.2176 (2.9765) grad_norm 3.2954 (3.3712) [2022-01-27 06:32:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][850/1251] eta 0:14:47 lr 0.000010 time 2.0763 (2.2131) loss 2.0572 (2.9709) grad_norm 3.0943 (3.3681) [2022-01-27 06:32:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][860/1251] eta 0:14:24 lr 0.000010 time 1.8211 (2.2098) loss 3.2485 (2.9704) grad_norm 4.2295 (3.3682) [2022-01-27 06:33:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][870/1251] eta 0:14:02 lr 0.000010 time 2.0009 (2.2102) loss 2.6674 (2.9702) grad_norm 3.2229 (3.3669) [2022-01-27 06:33:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][880/1251] eta 0:13:40 lr 0.000010 time 2.9892 (2.2114) loss 2.7535 (2.9687) grad_norm 3.1566 (3.3681) [2022-01-27 06:33:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][890/1251] eta 0:13:17 lr 0.000010 time 2.4843 (2.2104) loss 3.2305 (2.9682) grad_norm 
2.8692 (3.3675) [2022-01-27 06:34:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][900/1251] eta 0:12:55 lr 0.000010 time 1.7543 (2.2092) loss 2.4809 (2.9669) grad_norm 3.1422 (3.3657) [2022-01-27 06:34:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][910/1251] eta 0:12:33 lr 0.000010 time 2.2816 (2.2097) loss 3.0002 (2.9665) grad_norm 3.2546 (3.3655) [2022-01-27 06:34:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][920/1251] eta 0:12:11 lr 0.000010 time 2.3595 (2.2105) loss 3.1598 (2.9667) grad_norm 3.4971 (3.3664) [2022-01-27 06:35:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][930/1251] eta 0:11:49 lr 0.000010 time 3.3683 (2.2114) loss 2.7014 (2.9668) grad_norm 3.2725 (3.3653) [2022-01-27 06:35:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][940/1251] eta 0:11:27 lr 0.000010 time 1.9914 (2.2103) loss 3.1775 (2.9680) grad_norm 3.6160 (3.3663) [2022-01-27 06:35:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][950/1251] eta 0:11:05 lr 0.000010 time 2.8217 (2.2112) loss 3.4448 (2.9675) grad_norm 2.9783 (3.3671) [2022-01-27 06:36:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][960/1251] eta 0:10:43 lr 0.000010 time 1.7565 (2.2099) loss 2.2041 (2.9672) grad_norm 2.9509 (3.3646) [2022-01-27 06:36:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][970/1251] eta 0:10:20 lr 0.000010 time 2.3392 (2.2077) loss 3.2865 (2.9680) grad_norm 3.2624 (3.3646) [2022-01-27 06:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][980/1251] eta 0:09:57 lr 0.000010 time 1.7734 (2.2053) loss 3.4875 (2.9682) grad_norm 3.1776 (3.3629) [2022-01-27 06:37:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][990/1251] eta 0:09:35 lr 0.000010 time 2.5178 (2.2054) loss 3.1821 (2.9695) grad_norm 3.4366 (3.3634) [2022-01-27 06:37:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: 
[297/300][1000/1251] eta 0:09:13 lr 0.000010 time 1.6432 (2.2042) loss 2.3845 (2.9697) grad_norm 3.0997 (3.3644) [2022-01-27 06:38:03 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1010/1251] eta 0:08:51 lr 0.000010 time 2.0917 (2.2034) loss 2.9632 (2.9694) grad_norm 3.5007 (3.3661) [2022-01-27 06:38:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1020/1251] eta 0:08:29 lr 0.000010 time 2.2420 (2.2046) loss 3.1483 (2.9696) grad_norm 3.6095 (3.3686) [2022-01-27 06:38:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1030/1251] eta 0:08:07 lr 0.000010 time 2.9757 (2.2062) loss 3.4621 (2.9696) grad_norm 3.3297 (3.3696) [2022-01-27 06:39:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1040/1251] eta 0:07:45 lr 0.000010 time 1.5210 (2.2075) loss 3.4466 (2.9700) grad_norm 3.4727 (3.3709) [2022-01-27 06:39:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1050/1251] eta 0:07:23 lr 0.000010 time 2.1869 (2.2077) loss 2.0477 (2.9703) grad_norm 3.3111 (3.3693) [2022-01-27 06:39:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1060/1251] eta 0:07:01 lr 0.000010 time 2.2243 (2.2083) loss 3.0439 (2.9716) grad_norm 3.3194 (3.3709) [2022-01-27 06:40:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1070/1251] eta 0:06:39 lr 0.000010 time 2.8560 (2.2083) loss 2.9005 (2.9709) grad_norm 3.3037 (3.3699) [2022-01-27 06:40:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1080/1251] eta 0:06:17 lr 0.000010 time 1.9199 (2.2076) loss 3.3845 (2.9686) grad_norm 3.7324 (3.3693) [2022-01-27 06:41:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1090/1251] eta 0:05:55 lr 0.000010 time 1.6193 (2.2062) loss 2.9628 (2.9690) grad_norm 3.5133 (3.3684) [2022-01-27 06:41:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1100/1251] eta 0:05:33 lr 0.000010 time 1.9584 (2.2057) loss 3.3214 (2.9720) 
grad_norm 5.5862 (3.3705) [2022-01-27 06:41:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1110/1251] eta 0:05:10 lr 0.000010 time 2.6803 (2.2051) loss 3.1057 (2.9701) grad_norm 3.4720 (3.3701) [2022-01-27 06:42:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1120/1251] eta 0:04:48 lr 0.000010 time 1.8217 (2.2052) loss 2.0349 (2.9676) grad_norm 3.0173 (3.3733) [2022-01-27 06:42:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1130/1251] eta 0:04:26 lr 0.000010 time 1.6541 (2.2039) loss 3.2421 (2.9692) grad_norm 3.8237 (3.3750) [2022-01-27 06:42:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1140/1251] eta 0:04:04 lr 0.000010 time 1.9995 (2.2039) loss 3.6807 (2.9697) grad_norm 3.2359 (3.3756) [2022-01-27 06:43:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1150/1251] eta 0:03:42 lr 0.000010 time 2.1951 (2.2048) loss 3.1065 (2.9685) grad_norm 3.6712 (3.3753) [2022-01-27 06:43:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1160/1251] eta 0:03:20 lr 0.000010 time 2.4530 (2.2052) loss 3.0486 (2.9693) grad_norm 2.5285 (3.3746) [2022-01-27 06:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1170/1251] eta 0:02:58 lr 0.000010 time 2.3269 (2.2054) loss 3.0723 (2.9695) grad_norm 4.2218 (3.3765) [2022-01-27 06:44:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1180/1251] eta 0:02:36 lr 0.000010 time 1.6084 (2.2061) loss 3.3312 (2.9685) grad_norm 3.4072 (3.3797) [2022-01-27 06:44:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1190/1251] eta 0:02:14 lr 0.000010 time 2.6243 (2.2069) loss 2.1745 (2.9653) grad_norm 3.1186 (3.3788) [2022-01-27 06:45:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1200/1251] eta 0:01:52 lr 0.000010 time 1.7535 (2.2063) loss 2.8648 (2.9659) grad_norm 3.4690 (3.3780) [2022-01-27 06:45:25 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [297/300][1210/1251] eta 0:01:30 lr 0.000010 time 1.6348 (2.2048) loss 2.9487 (2.9650) grad_norm 2.8826 (3.3777) [2022-01-27 06:45:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1220/1251] eta 0:01:08 lr 0.000010 time 1.9441 (2.2027) loss 3.3013 (2.9670) grad_norm 2.8833 (3.3772) [2022-01-27 06:46:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1230/1251] eta 0:00:46 lr 0.000010 time 1.7174 (2.2026) loss 2.6741 (2.9658) grad_norm 3.2552 (3.3769) [2022-01-27 06:46:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1240/1251] eta 0:00:24 lr 0.000010 time 1.9454 (2.2053) loss 3.1173 (2.9650) grad_norm 3.1041 (3.3828) [2022-01-27 06:46:47 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [297/300][1250/1251] eta 0:00:02 lr 0.000010 time 1.1479 (2.1991) loss 3.3281 (2.9637) grad_norm 3.4161 (3.3820) [2022-01-27 06:46:47 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 297 training takes 0:45:51 [2022-01-27 06:47:06 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.571 (18.571) Loss 0.8297 (0.8297) Acc@1 80.859 (80.859) Acc@5 94.922 (94.922) [2022-01-27 06:47:23 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 0.576 (3.273) Loss 0.7897 (0.8246) Acc@1 81.250 (80.602) Acc@5 95.312 (95.295) [2022-01-27 06:47:40 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.298 (2.502) Loss 0.7350 (0.8023) Acc@1 82.520 (81.320) Acc@5 96.387 (95.564) [2022-01-27 06:47:58 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.625 (2.305) Loss 0.7509 (0.8052) Acc@1 82.812 (81.272) Acc@5 96.973 (95.634) [2022-01-27 06:48:16 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 3.682 (2.177) Loss 0.8174 (0.8107) Acc@1 81.250 (81.100) Acc@5 95.605 (95.558) [2022-01-27 06:48:25 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.176 Acc@5 95.496 [2022-01-27 06:48:25 swin_tiny_patch4_window7_224] 
(main.py 132): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-01-27 06:48:25 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.24% [2022-01-27 06:48:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][0/1251] eta 7:23:11 lr 0.000010 time 21.2562 (21.2562) loss 2.8303 (2.8303) grad_norm 3.8410 (3.8410) [2022-01-27 06:49:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][10/1251] eta 1:22:38 lr 0.000010 time 1.7080 (3.9956) loss 2.4360 (2.9421) grad_norm 3.0941 (3.7663) [2022-01-27 06:49:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][20/1251] eta 1:06:15 lr 0.000010 time 2.0073 (3.2296) loss 3.1003 (2.8824) grad_norm 2.9146 (3.4782) [2022-01-27 06:49:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][30/1251] eta 0:58:18 lr 0.000010 time 1.9480 (2.8653) loss 3.5203 (2.9812) grad_norm 4.6463 (3.4882) [2022-01-27 06:50:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][40/1251] eta 0:55:58 lr 0.000010 time 3.8136 (2.7731) loss 2.1764 (3.0204) grad_norm 2.8641 (3.4550) [2022-01-27 06:50:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][50/1251] eta 0:54:35 lr 0.000010 time 2.7123 (2.7274) loss 3.4151 (3.0014) grad_norm 2.7710 (3.4316) [2022-01-27 06:51:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][60/1251] eta 0:52:15 lr 0.000010 time 2.0906 (2.6327) loss 2.3945 (2.9499) grad_norm 3.3960 (3.4255) [2022-01-27 06:51:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][70/1251] eta 0:50:02 lr 0.000010 time 2.0599 (2.5420) loss 3.3383 (2.9426) grad_norm 4.2711 (3.4078) [2022-01-27 06:51:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][80/1251] eta 0:48:07 lr 0.000010 time 1.9422 (2.4658) loss 3.1780 (2.9370) grad_norm 3.7093 (3.3944) [2022-01-27 06:52:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][90/1251] eta 0:46:44 lr 0.000010 time 
1.5249 (2.4156) loss 3.1729 (2.9443) grad_norm 3.2084 (3.4042) [2022-01-27 06:52:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][100/1251] eta 0:45:38 lr 0.000010 time 1.8464 (2.3791) loss 3.2166 (2.9530) grad_norm 3.5560 (3.4112) [2022-01-27 06:52:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][110/1251] eta 0:44:47 lr 0.000010 time 1.6104 (2.3554) loss 3.2099 (2.9589) grad_norm 3.0932 (3.3867) [2022-01-27 06:53:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][120/1251] eta 0:44:01 lr 0.000010 time 2.3373 (2.3358) loss 2.5171 (2.9373) grad_norm 3.2511 (3.3789) [2022-01-27 06:53:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][130/1251] eta 0:43:33 lr 0.000010 time 1.9943 (2.3310) loss 3.5693 (2.9420) grad_norm 4.0674 (3.3866) [2022-01-27 06:53:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][140/1251] eta 0:42:54 lr 0.000010 time 1.4811 (2.3174) loss 2.3542 (2.9429) grad_norm 3.4663 (3.3690) [2022-01-27 06:54:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][150/1251] eta 0:42:29 lr 0.000010 time 1.8902 (2.3157) loss 3.4567 (2.9553) grad_norm 3.3521 (3.3810) [2022-01-27 06:54:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][160/1251] eta 0:42:03 lr 0.000010 time 2.3510 (2.3132) loss 2.1815 (2.9696) grad_norm 3.3962 (3.3865) [2022-01-27 06:55:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][170/1251] eta 0:41:41 lr 0.000010 time 2.2003 (2.3136) loss 3.5520 (2.9727) grad_norm 4.0585 (3.4056) [2022-01-27 06:55:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][180/1251] eta 0:41:11 lr 0.000010 time 1.5794 (2.3080) loss 3.0441 (2.9800) grad_norm 3.6594 (3.4001) [2022-01-27 06:55:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][190/1251] eta 0:40:40 lr 0.000010 time 2.1541 (2.3001) loss 2.9510 (2.9869) grad_norm 3.0265 (3.4051) [2022-01-27 06:56:08 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][200/1251] eta 0:40:20 lr 0.000010 time 2.8207 (2.3032) loss 3.3977 (2.9833) grad_norm 3.0813 (3.4021) [2022-01-27 06:56:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][210/1251] eta 0:39:58 lr 0.000010 time 1.9534 (2.3037) loss 3.1433 (2.9812) grad_norm 3.2977 (3.4015) [2022-01-27 06:56:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][220/1251] eta 0:39:29 lr 0.000010 time 1.9585 (2.2985) loss 2.9292 (2.9781) grad_norm 3.4402 (3.3941) [2022-01-27 06:57:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][230/1251] eta 0:38:56 lr 0.000010 time 2.1804 (2.2885) loss 2.5886 (2.9727) grad_norm 3.1033 (3.3821) [2022-01-27 06:57:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][240/1251] eta 0:38:29 lr 0.000010 time 2.0344 (2.2846) loss 2.0449 (2.9688) grad_norm 3.4186 (3.3768) [2022-01-27 06:57:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][250/1251] eta 0:38:10 lr 0.000010 time 1.6680 (2.2879) loss 2.8346 (2.9680) grad_norm 3.2102 (3.4170) [2022-01-27 06:58:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][260/1251] eta 0:37:35 lr 0.000010 time 1.8545 (2.2759) loss 1.8881 (2.9689) grad_norm 2.5000 (3.4157) [2022-01-27 06:58:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][270/1251] eta 0:37:03 lr 0.000010 time 1.5945 (2.2664) loss 3.3692 (2.9756) grad_norm 3.2826 (3.4179) [2022-01-27 06:59:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][280/1251] eta 0:36:40 lr 0.000010 time 1.9083 (2.2664) loss 2.9539 (2.9800) grad_norm 3.4266 (3.4154) [2022-01-27 06:59:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][290/1251] eta 0:36:15 lr 0.000010 time 1.9268 (2.2640) loss 3.5770 (2.9807) grad_norm 3.0443 (3.4084) [2022-01-27 06:59:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][300/1251] eta 0:35:54 lr 
0.000010 time 2.7463 (2.2652) loss 3.1756 (2.9801) grad_norm 3.6743 (3.4082) [2022-01-27 07:00:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][310/1251] eta 0:35:24 lr 0.000010 time 1.6184 (2.2577) loss 3.3082 (2.9823) grad_norm 3.5755 (3.4138) [2022-01-27 07:00:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][320/1251] eta 0:34:59 lr 0.000010 time 2.2971 (2.2556) loss 2.4058 (2.9837) grad_norm 3.5429 (3.4112) [2022-01-27 07:00:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][330/1251] eta 0:34:36 lr 0.000010 time 3.3994 (2.2544) loss 3.2425 (2.9858) grad_norm 3.2914 (3.4064) [2022-01-27 07:01:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][340/1251] eta 0:34:11 lr 0.000010 time 2.1321 (2.2519) loss 3.6587 (2.9881) grad_norm 2.9657 (3.4046) [2022-01-27 07:01:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][350/1251] eta 0:33:46 lr 0.000010 time 1.8652 (2.2496) loss 2.8181 (2.9913) grad_norm 2.8488 (3.4031) [2022-01-27 07:01:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][360/1251] eta 0:33:28 lr 0.000010 time 2.1599 (2.2543) loss 3.0982 (2.9928) grad_norm 2.9822 (3.3958) [2022-01-27 07:02:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][370/1251] eta 0:33:03 lr 0.000010 time 2.3361 (2.2519) loss 3.1862 (2.9927) grad_norm 3.2815 (3.4047) [2022-01-27 07:02:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][380/1251] eta 0:32:34 lr 0.000010 time 2.2249 (2.2445) loss 3.0016 (2.9935) grad_norm 3.2090 (3.4063) [2022-01-27 07:03:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][390/1251] eta 0:32:07 lr 0.000010 time 2.2590 (2.2388) loss 3.2509 (2.9941) grad_norm 3.3382 (3.4178) [2022-01-27 07:03:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][400/1251] eta 0:31:42 lr 0.000010 time 2.2006 (2.2353) loss 3.3394 (2.9976) grad_norm 3.2746 (3.4132) [2022-01-27 07:03:42 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][410/1251] eta 0:31:17 lr 0.000010 time 1.8853 (2.2328) loss 3.3685 (2.9955) grad_norm 2.9452 (3.4142) [2022-01-27 07:04:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][420/1251] eta 0:30:55 lr 0.000010 time 1.5816 (2.2328) loss 3.4510 (2.9913) grad_norm 2.9656 (3.4128) [2022-01-27 07:04:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][430/1251] eta 0:30:34 lr 0.000010 time 1.8477 (2.2342) loss 3.3258 (2.9940) grad_norm 2.6399 (3.4072) [2022-01-27 07:04:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][440/1251] eta 0:30:13 lr 0.000010 time 2.2037 (2.2356) loss 1.9830 (2.9935) grad_norm 3.9111 (3.4031) [2022-01-27 07:05:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][450/1251] eta 0:29:53 lr 0.000010 time 1.9380 (2.2389) loss 2.7650 (2.9935) grad_norm 3.2762 (3.4030) [2022-01-27 07:05:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][460/1251] eta 0:29:30 lr 0.000010 time 1.9714 (2.2389) loss 3.1396 (2.9921) grad_norm 3.1826 (3.3976) [2022-01-27 07:05:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][470/1251] eta 0:29:05 lr 0.000010 time 2.1549 (2.2350) loss 2.4006 (2.9854) grad_norm 3.5684 (3.3991) [2022-01-27 07:06:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][480/1251] eta 0:28:40 lr 0.000010 time 1.9087 (2.2314) loss 2.9208 (2.9799) grad_norm 2.9937 (3.3954) [2022-01-27 07:06:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][490/1251] eta 0:28:14 lr 0.000010 time 1.9459 (2.2265) loss 2.7555 (2.9786) grad_norm 2.9505 (3.3965) [2022-01-27 07:06:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][500/1251] eta 0:27:49 lr 0.000010 time 1.7935 (2.2232) loss 2.9545 (2.9751) grad_norm 2.6137 (3.4068) [2022-01-27 07:07:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][510/1251] eta 0:27:27 lr 
0.000010 time 2.5457 (2.2235) loss 3.3229 (2.9733) grad_norm 3.2144 (3.4055) [2022-01-27 07:07:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][520/1251] eta 0:27:03 lr 0.000010 time 1.4681 (2.2211) loss 2.1846 (2.9720) grad_norm 3.1139 (3.4062) [2022-01-27 07:08:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][530/1251] eta 0:26:42 lr 0.000010 time 1.8335 (2.2222) loss 3.6766 (2.9765) grad_norm 3.3514 (3.4089) [2022-01-27 07:08:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][540/1251] eta 0:26:22 lr 0.000010 time 2.4726 (2.2255) loss 3.1538 (2.9704) grad_norm 3.2814 (3.4125) [2022-01-27 07:08:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][550/1251] eta 0:26:03 lr 0.000010 time 2.4319 (2.2304) loss 3.2304 (2.9713) grad_norm 3.0685 (3.4096) [2022-01-27 07:09:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][560/1251] eta 0:25:40 lr 0.000010 time 1.4816 (2.2293) loss 2.2205 (2.9699) grad_norm 4.0511 (3.4118) [2022-01-27 07:09:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][570/1251] eta 0:25:17 lr 0.000010 time 1.8382 (2.2291) loss 2.6194 (2.9698) grad_norm 3.6311 (3.4103) [2022-01-27 07:09:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][580/1251] eta 0:24:54 lr 0.000010 time 2.4490 (2.2270) loss 3.0721 (2.9686) grad_norm 3.3238 (3.4067) [2022-01-27 07:10:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][590/1251] eta 0:24:29 lr 0.000010 time 1.8326 (2.2224) loss 3.1757 (2.9683) grad_norm 3.3215 (3.4041) [2022-01-27 07:10:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][600/1251] eta 0:24:04 lr 0.000010 time 1.8823 (2.2192) loss 2.5811 (2.9685) grad_norm 3.0548 (3.4044) [2022-01-27 07:11:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][610/1251] eta 0:23:42 lr 0.000010 time 1.9318 (2.2192) loss 3.2505 (2.9698) grad_norm 3.5126 (3.4050) [2022-01-27 07:11:23 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][620/1251] eta 0:23:20 lr 0.000010 time 2.8732 (2.2192) loss 2.4202 (2.9689) grad_norm 3.0844 (3.3993) [2022-01-27 07:11:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][630/1251] eta 0:22:59 lr 0.000010 time 2.7650 (2.2214) loss 3.3628 (2.9659) grad_norm 3.3371 (3.3979) [2022-01-27 07:12:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][640/1251] eta 0:22:37 lr 0.000010 time 1.7997 (2.2219) loss 3.1689 (2.9642) grad_norm 3.4809 (3.3983) [2022-01-27 07:12:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][650/1251] eta 0:22:15 lr 0.000010 time 1.5971 (2.2215) loss 1.8917 (2.9619) grad_norm 2.9167 (3.3968) [2022-01-27 07:12:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][660/1251] eta 0:21:51 lr 0.000010 time 1.9269 (2.2196) loss 2.4205 (2.9588) grad_norm 3.7994 (3.4001) [2022-01-27 07:13:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][670/1251] eta 0:21:29 lr 0.000010 time 1.8836 (2.2191) loss 3.2244 (2.9598) grad_norm 3.5499 (3.4010) [2022-01-27 07:13:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][680/1251] eta 0:21:06 lr 0.000010 time 1.9843 (2.2181) loss 3.0401 (2.9579) grad_norm 3.4881 (3.4010) [2022-01-27 07:13:55 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][690/1251] eta 0:20:42 lr 0.000010 time 1.6349 (2.2142) loss 2.5368 (2.9552) grad_norm 3.0688 (3.3979) [2022-01-27 07:14:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][700/1251] eta 0:20:20 lr 0.000010 time 1.9751 (2.2143) loss 2.5918 (2.9517) grad_norm 3.4297 (3.3967) [2022-01-27 07:14:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][710/1251] eta 0:19:58 lr 0.000010 time 2.6086 (2.2159) loss 3.2081 (2.9524) grad_norm 3.1545 (3.3950) [2022-01-27 07:15:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][720/1251] eta 0:19:35 lr 
0.000010 time 1.8277 (2.2137) loss 2.7976 (2.9541) grad_norm 4.1525 (3.3952) [2022-01-27 07:15:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][730/1251] eta 0:19:12 lr 0.000010 time 2.2253 (2.2130) loss 3.0265 (2.9560) grad_norm 3.5159 (3.3939) [2022-01-27 07:15:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][740/1251] eta 0:18:50 lr 0.000010 time 1.8324 (2.2130) loss 3.4137 (2.9579) grad_norm 3.1850 (3.3950) [2022-01-27 07:16:08 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][750/1251] eta 0:18:29 lr 0.000010 time 1.9548 (2.2142) loss 2.3336 (2.9570) grad_norm 3.4907 (3.3941) [2022-01-27 07:16:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][760/1251] eta 0:18:07 lr 0.000010 time 2.1949 (2.2147) loss 3.0086 (2.9566) grad_norm 3.5925 (3.3935) [2022-01-27 07:16:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][770/1251] eta 0:17:44 lr 0.000010 time 1.5464 (2.2138) loss 3.4027 (2.9562) grad_norm 3.4387 (3.3940) [2022-01-27 07:17:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][780/1251] eta 0:17:21 lr 0.000010 time 1.7117 (2.2123) loss 3.2463 (2.9597) grad_norm 3.4305 (3.3957) [2022-01-27 07:17:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][790/1251] eta 0:16:59 lr 0.000010 time 2.0599 (2.2111) loss 1.9562 (2.9584) grad_norm 3.6910 (3.3979) [2022-01-27 07:17:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][800/1251] eta 0:16:36 lr 0.000010 time 2.1488 (2.2093) loss 3.2786 (2.9601) grad_norm 3.0377 (3.3963) [2022-01-27 07:18:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][810/1251] eta 0:16:13 lr 0.000010 time 2.1762 (2.2077) loss 2.2801 (2.9598) grad_norm 3.3370 (3.3969) [2022-01-27 07:18:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][820/1251] eta 0:15:51 lr 0.000010 time 1.9208 (2.2071) loss 2.7466 (2.9573) grad_norm 2.9338 (3.3953) [2022-01-27 07:19:01 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][830/1251] eta 0:15:30 lr 0.000010 time 2.2106 (2.2093) loss 3.2836 (2.9566) grad_norm 3.4590 (3.3958) [2022-01-27 07:19:26 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][840/1251] eta 0:15:09 lr 0.000010 time 3.4022 (2.2132) loss 3.3237 (2.9547) grad_norm 2.8757 (3.3926) [2022-01-27 07:19:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][850/1251] eta 0:14:48 lr 0.000010 time 1.8925 (2.2167) loss 2.8261 (2.9532) grad_norm 3.0736 (3.3884) [2022-01-27 07:20:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][860/1251] eta 0:14:26 lr 0.000010 time 1.9578 (2.2166) loss 2.7429 (2.9501) grad_norm 3.1592 (3.3854) [2022-01-27 07:20:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][870/1251] eta 0:14:03 lr 0.000010 time 1.8604 (2.2144) loss 2.5875 (2.9489) grad_norm 3.5687 (3.3851) [2022-01-27 07:20:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][880/1251] eta 0:13:40 lr 0.000010 time 1.8912 (2.2103) loss 2.4720 (2.9506) grad_norm 3.2556 (3.3866) [2022-01-27 07:21:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][890/1251] eta 0:13:16 lr 0.000010 time 1.9159 (2.2069) loss 3.4622 (2.9517) grad_norm 3.2640 (3.3859) [2022-01-27 07:21:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][900/1251] eta 0:12:54 lr 0.000010 time 2.2489 (2.2052) loss 3.4647 (2.9506) grad_norm 3.2107 (3.3887) [2022-01-27 07:21:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][910/1251] eta 0:12:33 lr 0.000010 time 3.6396 (2.2097) loss 1.8487 (2.9526) grad_norm 3.3735 (3.3873) [2022-01-27 07:22:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][920/1251] eta 0:12:12 lr 0.000010 time 1.5325 (2.2136) loss 2.7234 (2.9487) grad_norm 2.7466 (3.3874) [2022-01-27 07:22:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][930/1251] eta 0:11:50 lr 
0.000010 time 2.1571 (2.2134) loss 3.0026 (2.9499) grad_norm 3.5906 (3.3874) [2022-01-27 07:23:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][940/1251] eta 0:11:28 lr 0.000010 time 1.9337 (2.2131) loss 2.6585 (2.9488) grad_norm 3.6057 (3.3895) [2022-01-27 07:23:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][950/1251] eta 0:11:06 lr 0.000010 time 4.3189 (2.2141) loss 3.4858 (2.9477) grad_norm 3.5939 (3.3913) [2022-01-27 07:23:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][960/1251] eta 0:10:43 lr 0.000010 time 2.2797 (2.2121) loss 3.6054 (2.9490) grad_norm 2.8811 (3.3896) [2022-01-27 07:24:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][970/1251] eta 0:10:20 lr 0.000010 time 1.7717 (2.2095) loss 2.5380 (2.9488) grad_norm 3.0992 (3.3889) [2022-01-27 07:24:32 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][980/1251] eta 0:09:58 lr 0.000010 time 1.8683 (2.2093) loss 3.0502 (2.9492) grad_norm 2.8578 (3.3878) [2022-01-27 07:24:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][990/1251] eta 0:09:37 lr 0.000010 time 2.8824 (2.2108) loss 3.2125 (2.9503) grad_norm 3.0933 (3.3869) [2022-01-27 07:25:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1000/1251] eta 0:09:14 lr 0.000010 time 1.9139 (2.2099) loss 2.6128 (2.9488) grad_norm 3.2417 (3.3867) [2022-01-27 07:25:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1010/1251] eta 0:08:52 lr 0.000010 time 2.2772 (2.2093) loss 3.0485 (2.9490) grad_norm 3.0242 (3.3857) [2022-01-27 07:26:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1020/1251] eta 0:08:30 lr 0.000010 time 1.6836 (2.2089) loss 1.8703 (2.9505) grad_norm 3.5087 (3.3864) [2022-01-27 07:26:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1030/1251] eta 0:08:08 lr 0.000010 time 2.4678 (2.2085) loss 3.1484 (2.9508) grad_norm 3.1637 (3.3834) [2022-01-27 
07:26:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1040/1251] eta 0:07:46 lr 0.000010 time 2.6441 (2.2088) loss 3.4932 (2.9495) grad_norm 3.9783 (3.3834) [2022-01-27 07:27:05 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1050/1251] eta 0:07:23 lr 0.000010 time 2.2776 (2.2078) loss 2.5503 (2.9475) grad_norm 3.6423 (3.3819) [2022-01-27 07:27:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1060/1251] eta 0:07:02 lr 0.000010 time 1.7517 (2.2111) loss 3.1776 (2.9489) grad_norm 3.8060 (3.3816) [2022-01-27 07:27:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1070/1251] eta 0:06:40 lr 0.000010 time 2.7804 (2.2120) loss 3.2271 (2.9498) grad_norm 3.5826 (3.3835) [2022-01-27 07:28:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1080/1251] eta 0:06:17 lr 0.000010 time 2.0966 (2.2102) loss 2.9938 (2.9512) grad_norm 3.0103 (3.3826) [2022-01-27 07:28:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1090/1251] eta 0:05:55 lr 0.000010 time 1.8579 (2.2072) loss 3.3910 (2.9516) grad_norm 4.0925 (3.3831) [2022-01-27 07:28:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1100/1251] eta 0:05:33 lr 0.000010 time 1.6496 (2.2054) loss 3.3356 (2.9533) grad_norm 3.8068 (3.3864) [2022-01-27 07:29:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1110/1251] eta 0:05:10 lr 0.000010 time 2.2412 (2.2045) loss 3.3930 (2.9521) grad_norm 3.2132 (3.3857) [2022-01-27 07:29:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1120/1251] eta 0:04:48 lr 0.000010 time 1.9416 (2.2043) loss 3.2509 (2.9527) grad_norm 3.6708 (3.3850) [2022-01-27 07:29:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1130/1251] eta 0:04:26 lr 0.000010 time 2.1130 (2.2040) loss 2.5953 (2.9526) grad_norm 3.2822 (3.3838) [2022-01-27 07:30:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1140/1251] 
eta 0:04:04 lr 0.000010 time 2.2153 (2.2040) loss 2.1867 (2.9520) grad_norm 3.5745 (3.3829) [2022-01-27 07:30:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1150/1251] eta 0:03:42 lr 0.000010 time 4.5028 (2.2059) loss 2.2821 (2.9508) grad_norm 4.4205 (3.3826) [2022-01-27 07:31:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1160/1251] eta 0:03:20 lr 0.000010 time 1.9017 (2.2060) loss 2.6335 (2.9520) grad_norm 2.9714 (3.3840) [2022-01-27 07:31:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1170/1251] eta 0:02:58 lr 0.000010 time 2.1443 (2.2078) loss 3.6347 (2.9523) grad_norm 3.6600 (3.3851) [2022-01-27 07:31:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1180/1251] eta 0:02:36 lr 0.000010 time 1.8976 (2.2079) loss 3.4157 (2.9552) grad_norm 3.0704 (3.3860) [2022-01-27 07:32:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1190/1251] eta 0:02:14 lr 0.000010 time 2.2489 (2.2085) loss 2.1985 (2.9550) grad_norm 3.1242 (3.3905) [2022-01-27 07:32:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1200/1251] eta 0:01:52 lr 0.000010 time 1.6956 (2.2059) loss 3.4503 (2.9563) grad_norm 2.9409 (3.3898) [2022-01-27 07:32:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1210/1251] eta 0:01:30 lr 0.000010 time 2.2729 (2.2040) loss 1.7713 (2.9561) grad_norm 2.8209 (3.3894) [2022-01-27 07:33:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1220/1251] eta 0:01:08 lr 0.000010 time 2.2452 (2.2039) loss 2.6080 (2.9556) grad_norm 3.4671 (3.3879) [2022-01-27 07:33:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1230/1251] eta 0:00:46 lr 0.000010 time 2.1287 (2.2039) loss 2.9454 (2.9560) grad_norm 3.3301 (3.3953) [2022-01-27 07:33:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1240/1251] eta 0:00:24 lr 0.000010 time 1.9252 (2.2034) loss 2.6918 (2.9566) grad_norm 3.2138 
(3.3991) [2022-01-27 07:34:15 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [298/300][1250/1251] eta 0:00:02 lr 0.000010 time 1.1649 (2.1982) loss 1.9931 (2.9555) grad_norm 3.5387 (3.3983) [2022-01-27 07:34:15 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 298 training takes 0:45:50 [2022-01-27 07:34:34 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 18.684 (18.684) Loss 0.8177 (0.8177) Acc@1 81.738 (81.738) Acc@5 94.824 (94.824) [2022-01-27 07:34:53 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.256 (3.449) Loss 0.8198 (0.8185) Acc@1 80.371 (80.540) Acc@5 94.727 (95.179) [2022-01-27 07:35:09 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.590 (2.563) Loss 0.8261 (0.8180) Acc@1 81.641 (80.920) Acc@5 94.434 (95.187) [2022-01-27 07:35:25 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 1.118 (2.258) Loss 0.8260 (0.8133) Acc@1 80.371 (81.143) Acc@5 95.312 (95.341) [2022-01-27 07:35:44 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 1.901 (2.167) Loss 0.7761 (0.8057) Acc@1 82.031 (81.171) Acc@5 95.703 (95.489) [2022-01-27 07:35:51 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.176 Acc@5 95.476 [2022-01-27 07:35:51 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.2% [2022-01-27 07:35:51 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.24% [2022-01-27 07:36:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][0/1251] eta 7:47:54 lr 0.000010 time 22.4417 (22.4417) loss 3.5195 (3.5195) grad_norm 3.7918 (3.7918) [2022-01-27 07:36:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][10/1251] eta 1:25:50 lr 0.000010 time 2.5548 (4.1502) loss 3.2169 (3.3764) grad_norm 2.8597 (3.4911) [2022-01-27 07:36:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][20/1251] eta 1:06:22 lr 0.000010 time 1.8527 (3.2352) loss 
2.3575 (3.2690) grad_norm 3.3942 (3.4172) [2022-01-27 07:37:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][30/1251] eta 1:00:54 lr 0.000010 time 1.4704 (2.9929) loss 3.7297 (3.1329) grad_norm 6.0746 (3.5417) [2022-01-27 07:37:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][40/1251] eta 0:57:25 lr 0.000010 time 3.8326 (2.8455) loss 3.3526 (3.1054) grad_norm 3.6715 (3.6121) [2022-01-27 07:38:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][50/1251] eta 0:54:00 lr 0.000010 time 2.1012 (2.6984) loss 2.6052 (3.0666) grad_norm 3.2938 (3.5842) [2022-01-27 07:38:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][60/1251] eta 0:51:19 lr 0.000010 time 1.6300 (2.5859) loss 2.5761 (2.9991) grad_norm 4.1657 (3.5453) [2022-01-27 07:38:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][70/1251] eta 0:49:30 lr 0.000010 time 1.9104 (2.5156) loss 3.2956 (3.0125) grad_norm 3.4679 (3.5345) [2022-01-27 07:39:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][80/1251] eta 0:48:56 lr 0.000010 time 3.6929 (2.5076) loss 3.0570 (3.0128) grad_norm 4.5577 (3.5320) [2022-01-27 07:39:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][90/1251] eta 0:47:29 lr 0.000010 time 2.2106 (2.4541) loss 2.6403 (3.0198) grad_norm 3.4271 (3.5317) [2022-01-27 07:39:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][100/1251] eta 0:46:12 lr 0.000010 time 1.5935 (2.4086) loss 2.7861 (3.0183) grad_norm 3.2701 (3.5207) [2022-01-27 07:40:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][110/1251] eta 0:45:17 lr 0.000010 time 2.0074 (2.3821) loss 3.1718 (3.0279) grad_norm 3.1614 (3.5370) [2022-01-27 07:40:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][120/1251] eta 0:44:29 lr 0.000010 time 3.1756 (2.3604) loss 3.3493 (3.0432) grad_norm 3.7269 (3.5437) [2022-01-27 07:40:58 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [299/300][130/1251] eta 0:43:47 lr 0.000010 time 1.9650 (2.3436) loss 2.2846 (3.0302) grad_norm 4.5017 (3.5220) [2022-01-27 07:41:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][140/1251] eta 0:43:16 lr 0.000010 time 2.2463 (2.3369) loss 2.8691 (3.0356) grad_norm 3.2695 (3.5064) [2022-01-27 07:41:42 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][150/1251] eta 0:42:36 lr 0.000010 time 2.0203 (2.3220) loss 2.9326 (3.0389) grad_norm 3.8116 (3.5003) [2022-01-27 07:42:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][160/1251] eta 0:42:10 lr 0.000010 time 2.8733 (2.3190) loss 1.6164 (3.0221) grad_norm 3.1383 (3.4850) [2022-01-27 07:42:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][170/1251] eta 0:41:41 lr 0.000010 time 1.8938 (2.3140) loss 2.8105 (3.0141) grad_norm 2.8015 (3.4682) [2022-01-27 07:42:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][180/1251] eta 0:41:25 lr 0.000010 time 2.6090 (2.3206) loss 3.0606 (2.9971) grad_norm 3.5509 (3.4488) [2022-01-27 07:43:12 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][190/1251] eta 0:40:51 lr 0.000010 time 1.6747 (2.3106) loss 2.8116 (2.9968) grad_norm 3.1857 (3.4543) [2022-01-27 07:43:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][200/1251] eta 0:40:27 lr 0.000010 time 2.7666 (2.3099) loss 1.8664 (2.9934) grad_norm 3.2511 (3.4451) [2022-01-27 07:43:58 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][210/1251] eta 0:40:01 lr 0.000010 time 1.7169 (2.3073) loss 3.0872 (2.9895) grad_norm 3.3143 (3.4388) [2022-01-27 07:44:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][220/1251] eta 0:39:26 lr 0.000010 time 1.9674 (2.2952) loss 3.3397 (2.9783) grad_norm 2.8598 (3.4300) [2022-01-27 07:44:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][230/1251] eta 0:38:47 lr 0.000010 time 1.6213 (2.2795) loss 2.2677 
(2.9789) grad_norm 3.8295 (3.4466) [2022-01-27 07:45:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][240/1251] eta 0:38:22 lr 0.000010 time 3.0485 (2.2775) loss 3.1771 (2.9743) grad_norm 3.2985 (3.4437) [2022-01-27 07:45:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][250/1251] eta 0:37:54 lr 0.000010 time 1.4852 (2.2725) loss 3.3364 (2.9842) grad_norm 3.2452 (3.4345) [2022-01-27 07:45:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][260/1251] eta 0:37:30 lr 0.000010 time 2.3333 (2.2710) loss 3.0655 (2.9856) grad_norm 3.0203 (3.4288) [2022-01-27 07:46:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][270/1251] eta 0:37:04 lr 0.000010 time 1.8592 (2.2677) loss 2.0621 (2.9772) grad_norm 3.3754 (3.4297) [2022-01-27 07:46:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][280/1251] eta 0:36:45 lr 0.000010 time 2.8782 (2.2711) loss 3.2113 (2.9713) grad_norm 3.0817 (3.4349) [2022-01-27 07:46:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][290/1251] eta 0:36:21 lr 0.000010 time 1.5875 (2.2699) loss 2.9365 (2.9646) grad_norm 3.4250 (3.4347) [2022-01-27 07:47:14 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][300/1251] eta 0:35:58 lr 0.000010 time 2.4497 (2.2696) loss 2.9664 (2.9590) grad_norm 3.7740 (3.4412) [2022-01-27 07:47:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][310/1251] eta 0:35:29 lr 0.000010 time 1.9276 (2.2635) loss 2.8489 (2.9654) grad_norm 3.6120 (3.4424) [2022-01-27 07:47:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][320/1251] eta 0:35:05 lr 0.000010 time 2.2250 (2.2611) loss 3.4189 (2.9624) grad_norm 2.7345 (3.4409) [2022-01-27 07:48:18 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][330/1251] eta 0:34:37 lr 0.000010 time 1.9720 (2.2554) loss 3.4022 (2.9623) grad_norm 3.0439 (3.4365) [2022-01-27 07:48:40 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [299/300][340/1251] eta 0:34:13 lr 0.000010 time 2.9675 (2.2543) loss 1.9470 (2.9628) grad_norm 2.7013 (3.4414) [2022-01-27 07:49:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][350/1251] eta 0:33:47 lr 0.000010 time 1.9903 (2.2505) loss 3.4233 (2.9632) grad_norm 3.1839 (3.4400) [2022-01-27 07:49:24 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][360/1251] eta 0:33:25 lr 0.000010 time 2.1104 (2.2513) loss 3.0109 (2.9665) grad_norm 3.2719 (3.4391) [2022-01-27 07:49:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][370/1251] eta 0:33:03 lr 0.000010 time 2.1751 (2.2509) loss 3.2006 (2.9683) grad_norm 3.6978 (3.4422) [2022-01-27 07:50:10 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][380/1251] eta 0:32:43 lr 0.000010 time 3.0537 (2.2543) loss 2.1803 (2.9652) grad_norm 3.1651 (3.4401) [2022-01-27 07:50:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][390/1251] eta 0:32:16 lr 0.000010 time 1.5136 (2.2488) loss 3.2260 (2.9719) grad_norm 3.0537 (3.4428) [2022-01-27 07:50:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][400/1251] eta 0:31:51 lr 0.000010 time 1.7830 (2.2458) loss 3.3269 (2.9730) grad_norm 2.8458 (3.4408) [2022-01-27 07:51:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][410/1251] eta 0:31:26 lr 0.000010 time 1.9253 (2.2428) loss 3.4558 (2.9741) grad_norm 3.9243 (3.4411) [2022-01-27 07:51:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][420/1251] eta 0:31:06 lr 0.000010 time 3.3206 (2.2458) loss 3.0354 (2.9750) grad_norm 3.0345 (3.4446) [2022-01-27 07:51:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][430/1251] eta 0:30:39 lr 0.000010 time 1.5981 (2.2409) loss 3.5448 (2.9776) grad_norm 3.4706 (3.4420) [2022-01-27 07:52:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][440/1251] eta 0:30:17 lr 0.000010 time 2.1887 (2.2406) loss 3.3011 
(2.9746) grad_norm 3.5684 (3.4392) [2022-01-27 07:52:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][450/1251] eta 0:29:53 lr 0.000010 time 2.1544 (2.2392) loss 3.5535 (2.9697) grad_norm 3.1868 (3.4415) [2022-01-27 07:53:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][460/1251] eta 0:29:31 lr 0.000010 time 2.8662 (2.2400) loss 3.4732 (2.9728) grad_norm 3.8299 (3.4439) [2022-01-27 07:53:25 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][470/1251] eta 0:29:08 lr 0.000010 time 2.2184 (2.2385) loss 3.0764 (2.9734) grad_norm 3.2240 (3.4424) [2022-01-27 07:53:48 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][480/1251] eta 0:28:46 lr 0.000010 time 2.4344 (2.2389) loss 2.3764 (2.9714) grad_norm 3.1389 (3.4426) [2022-01-27 07:54:09 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][490/1251] eta 0:28:22 lr 0.000010 time 1.9010 (2.2369) loss 3.3209 (2.9696) grad_norm 3.1403 (3.4442) [2022-01-27 07:54:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][500/1251] eta 0:27:58 lr 0.000010 time 2.2177 (2.2355) loss 2.5358 (2.9646) grad_norm 2.8474 (3.4510) [2022-01-27 07:54:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][510/1251] eta 0:27:34 lr 0.000010 time 2.1900 (2.2329) loss 3.4524 (2.9667) grad_norm 3.7055 (3.4471) [2022-01-27 07:55:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][520/1251] eta 0:27:10 lr 0.000010 time 2.4795 (2.2301) loss 2.1275 (2.9669) grad_norm 3.3183 (3.4495) [2022-01-27 07:55:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][530/1251] eta 0:26:46 lr 0.000010 time 2.1299 (2.2287) loss 3.2149 (2.9634) grad_norm 3.4870 (3.4462) [2022-01-27 07:55:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][540/1251] eta 0:26:23 lr 0.000010 time 1.8862 (2.2272) loss 2.9480 (2.9644) grad_norm 3.3381 (3.4488) [2022-01-27 07:56:19 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [299/300][550/1251] eta 0:26:02 lr 0.000010 time 3.0933 (2.2284) loss 3.1217 (2.9672) grad_norm 3.5524 (3.4486) [2022-01-27 07:56:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][560/1251] eta 0:25:39 lr 0.000010 time 2.7420 (2.2276) loss 3.0968 (2.9705) grad_norm 3.5097 (3.4447) [2022-01-27 07:57:02 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][570/1251] eta 0:25:15 lr 0.000010 time 1.6197 (2.2259) loss 2.6601 (2.9708) grad_norm 3.8071 (3.4420) [2022-01-27 07:57:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][580/1251] eta 0:24:52 lr 0.000010 time 1.7547 (2.2242) loss 3.2957 (2.9758) grad_norm 3.3294 (3.4432) [2022-01-27 07:57:46 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][590/1251] eta 0:24:30 lr 0.000010 time 2.5880 (2.2245) loss 2.3900 (2.9727) grad_norm 3.2054 (3.4423) [2022-01-27 07:58:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][600/1251] eta 0:24:06 lr 0.000010 time 1.9885 (2.2216) loss 3.0535 (2.9733) grad_norm 3.6377 (3.4442) [2022-01-27 07:58:27 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][610/1251] eta 0:23:42 lr 0.000010 time 2.2070 (2.2197) loss 3.1934 (2.9738) grad_norm 3.2842 (3.4418) [2022-01-27 07:58:49 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][620/1251] eta 0:23:20 lr 0.000010 time 1.8652 (2.2193) loss 3.4784 (2.9742) grad_norm 3.4720 (3.4384) [2022-01-27 07:59:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][630/1251] eta 0:22:58 lr 0.000010 time 3.5264 (2.2191) loss 3.1180 (2.9741) grad_norm 3.5124 (3.4359) [2022-01-27 07:59:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][640/1251] eta 0:22:35 lr 0.000010 time 2.7855 (2.2188) loss 3.6135 (2.9763) grad_norm 3.3208 (3.4349) [2022-01-27 07:59:54 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][650/1251] eta 0:22:11 lr 0.000010 time 1.8932 (2.2163) loss 3.1816 
(2.9780) grad_norm 3.3433 (3.4310) [2022-01-27 08:00:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][660/1251] eta 0:21:50 lr 0.000010 time 2.2437 (2.2169) loss 2.0290 (2.9783) grad_norm 3.7733 (3.4322) [2022-01-27 08:00:39 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][670/1251] eta 0:21:28 lr 0.000010 time 3.6714 (2.2174) loss 3.4246 (2.9736) grad_norm 2.9997 (3.4305) [2022-01-27 08:01:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][680/1251] eta 0:21:06 lr 0.000010 time 2.0309 (2.2173) loss 2.5811 (2.9741) grad_norm 3.0202 (3.4277) [2022-01-27 08:01:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][690/1251] eta 0:20:42 lr 0.000010 time 1.7654 (2.2153) loss 2.8185 (2.9736) grad_norm 3.2841 (3.4247) [2022-01-27 08:01:43 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][700/1251] eta 0:20:20 lr 0.000010 time 2.1452 (2.2143) loss 3.2202 (2.9722) grad_norm 3.0495 (3.4219) [2022-01-27 08:02:06 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][710/1251] eta 0:19:58 lr 0.000010 time 2.8731 (2.2150) loss 3.1949 (2.9717) grad_norm 3.3306 (3.4195) [2022-01-27 08:02:28 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][720/1251] eta 0:19:36 lr 0.000010 time 2.5243 (2.2152) loss 3.1213 (2.9716) grad_norm 4.6640 (3.4194) [2022-01-27 08:02:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][730/1251] eta 0:19:13 lr 0.000010 time 1.6952 (2.2149) loss 2.2144 (2.9693) grad_norm 3.7432 (3.4183) [2022-01-27 08:03:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][740/1251] eta 0:18:51 lr 0.000010 time 2.1675 (2.2136) loss 2.1021 (2.9705) grad_norm 3.1596 (3.4178) [2022-01-27 08:03:33 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][750/1251] eta 0:18:28 lr 0.000010 time 1.8336 (2.2125) loss 2.1344 (2.9677) grad_norm 3.2677 (3.4169) [2022-01-27 08:03:55 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [299/300][760/1251] eta 0:18:06 lr 0.000010 time 2.7926 (2.2128) loss 2.8694 (2.9678) grad_norm 3.1293 (3.4162) [2022-01-27 08:04:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][770/1251] eta 0:17:43 lr 0.000010 time 1.9648 (2.2109) loss 3.0981 (2.9680) grad_norm 3.2557 (3.4160) [2022-01-27 08:04:35 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][780/1251] eta 0:17:19 lr 0.000010 time 2.2935 (2.2077) loss 3.3661 (2.9681) grad_norm 3.4341 (3.4154) [2022-01-27 08:04:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][790/1251] eta 0:16:56 lr 0.000010 time 1.9620 (2.2061) loss 3.3937 (2.9679) grad_norm 2.9336 (3.4131) [2022-01-27 08:05:19 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][800/1251] eta 0:16:35 lr 0.000010 time 2.2949 (2.2070) loss 3.4825 (2.9666) grad_norm 3.3463 (3.4122) [2022-01-27 08:05:41 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][810/1251] eta 0:16:13 lr 0.000010 time 2.0027 (2.2070) loss 3.3803 (2.9690) grad_norm 2.9592 (3.4105) [2022-01-27 08:06:04 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][820/1251] eta 0:15:51 lr 0.000010 time 1.8368 (2.2084) loss 2.0299 (2.9670) grad_norm 3.3722 (3.4121) [2022-01-27 08:06:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][830/1251] eta 0:15:31 lr 0.000010 time 2.8460 (2.2116) loss 3.6560 (2.9693) grad_norm 2.8844 (3.4134) [2022-01-27 08:06:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][840/1251] eta 0:15:08 lr 0.000010 time 1.8881 (2.2116) loss 2.7636 (2.9693) grad_norm 3.0815 (3.4113) [2022-01-27 08:07:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][850/1251] eta 0:14:45 lr 0.000010 time 2.0241 (2.2093) loss 2.9020 (2.9679) grad_norm 4.1018 (3.4124) [2022-01-27 08:07:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][860/1251] eta 0:14:22 lr 0.000010 time 2.0150 (2.2062) loss 3.3282 
(2.9683) grad_norm 3.5498 (3.4133) [2022-01-27 08:07:50 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][870/1251] eta 0:13:59 lr 0.000010 time 1.8004 (2.2028) loss 2.9859 (2.9667) grad_norm 3.4502 (3.4123) [2022-01-27 08:08:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][880/1251] eta 0:13:36 lr 0.000010 time 1.9630 (2.2019) loss 2.3206 (2.9674) grad_norm 4.2322 (3.4131) [2022-01-27 08:08:34 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][890/1251] eta 0:13:15 lr 0.000010 time 2.9759 (2.2029) loss 3.0891 (2.9667) grad_norm 3.2405 (3.4133) [2022-01-27 08:08:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][900/1251] eta 0:12:53 lr 0.000010 time 2.5596 (2.2046) loss 3.2481 (2.9682) grad_norm 3.0996 (3.4142) [2022-01-27 08:09:21 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][910/1251] eta 0:12:32 lr 0.000010 time 1.8413 (2.2059) loss 3.5660 (2.9680) grad_norm 3.0068 (3.4135) [2022-01-27 08:09:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][920/1251] eta 0:12:10 lr 0.000010 time 1.8136 (2.2070) loss 2.9678 (2.9677) grad_norm 2.7395 (3.4118) [2022-01-27 08:10:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][930/1251] eta 0:11:49 lr 0.000010 time 2.3095 (2.2087) loss 3.4071 (2.9675) grad_norm 4.0630 (3.4140) [2022-01-27 08:10:31 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][940/1251] eta 0:11:27 lr 0.000010 time 2.2893 (2.2098) loss 1.6990 (2.9652) grad_norm 3.5997 (3.4142) [2022-01-27 08:10:52 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][950/1251] eta 0:11:04 lr 0.000010 time 2.1725 (2.2087) loss 2.7314 (2.9635) grad_norm 3.1919 (3.4129) [2022-01-27 08:11:11 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][960/1251] eta 0:10:41 lr 0.000010 time 1.9392 (2.2057) loss 2.8224 (2.9628) grad_norm 3.2771 (3.4119) [2022-01-27 08:11:30 swin_tiny_patch4_window7_224] (main.py 
193): INFO Train: [299/300][970/1251] eta 0:10:19 lr 0.000010 time 1.6065 (2.2030) loss 2.4601 (2.9625) grad_norm 3.0988 (3.4149) [2022-01-27 08:11:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][980/1251] eta 0:09:56 lr 0.000010 time 1.9247 (2.2012) loss 2.4273 (2.9626) grad_norm 2.9868 (3.4141) [2022-01-27 08:12:13 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][990/1251] eta 0:09:34 lr 0.000010 time 2.6570 (2.2013) loss 3.2693 (2.9618) grad_norm 3.5909 (3.4159) [2022-01-27 08:12:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1000/1251] eta 0:09:13 lr 0.000010 time 2.3070 (2.2033) loss 2.6000 (2.9605) grad_norm 2.8040 (3.4180) [2022-01-27 08:12:57 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1010/1251] eta 0:08:50 lr 0.000010 time 1.6304 (2.2019) loss 3.1536 (2.9624) grad_norm 3.1053 (3.4159) [2022-01-27 08:13:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1020/1251] eta 0:08:28 lr 0.000010 time 2.1690 (2.2025) loss 3.3261 (2.9642) grad_norm 2.9467 (3.4156) [2022-01-27 08:13:45 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1030/1251] eta 0:08:07 lr 0.000010 time 3.9991 (2.2057) loss 3.2963 (2.9650) grad_norm 3.3647 (3.4160) [2022-01-27 08:14:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1040/1251] eta 0:07:45 lr 0.000010 time 2.5691 (2.2058) loss 2.7281 (2.9634) grad_norm 6.5141 (3.4181) [2022-01-27 08:14:30 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1050/1251] eta 0:07:23 lr 0.000010 time 2.1230 (2.2065) loss 3.4403 (2.9630) grad_norm 3.7571 (3.4194) [2022-01-27 08:14:53 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1060/1251] eta 0:07:01 lr 0.000010 time 1.8744 (2.2075) loss 3.4284 (2.9638) grad_norm 3.3393 (3.4198) [2022-01-27 08:15:17 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1070/1251] eta 0:06:39 lr 0.000010 time 2.7529 (2.2092) loss 
2.8039 (2.9629) grad_norm 3.2418 (3.4174) [2022-01-27 08:15:37 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1080/1251] eta 0:06:17 lr 0.000010 time 1.9610 (2.2070) loss 3.2414 (2.9613) grad_norm 4.2373 (3.4174) [2022-01-27 08:15:56 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1090/1251] eta 0:05:54 lr 0.000010 time 1.8770 (2.2045) loss 2.3038 (2.9622) grad_norm 4.0839 (3.4191) [2022-01-27 08:16:16 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1100/1251] eta 0:05:32 lr 0.000010 time 1.8443 (2.2024) loss 3.4875 (2.9632) grad_norm 3.4094 (3.4195) [2022-01-27 08:16:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1110/1251] eta 0:05:10 lr 0.000010 time 2.1531 (2.2010) loss 2.5739 (2.9634) grad_norm 3.0927 (3.4198) [2022-01-27 08:17:00 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1120/1251] eta 0:04:48 lr 0.000010 time 2.5740 (2.2019) loss 3.7719 (2.9621) grad_norm 3.8752 (3.4199) [2022-01-27 08:17:22 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1130/1251] eta 0:04:26 lr 0.000010 time 1.5825 (2.2020) loss 3.6971 (2.9623) grad_norm 3.4300 (3.4192) [2022-01-27 08:17:44 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1140/1251] eta 0:04:04 lr 0.000010 time 2.1037 (2.2023) loss 2.8016 (2.9618) grad_norm 2.9688 (3.4190) [2022-01-27 08:18:07 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1150/1251] eta 0:03:42 lr 0.000010 time 2.2160 (2.2031) loss 3.1167 (2.9633) grad_norm 3.0135 (3.4191) [2022-01-27 08:18:29 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1160/1251] eta 0:03:20 lr 0.000010 time 2.2217 (2.2033) loss 3.3234 (2.9628) grad_norm 3.0519 (3.4202) [2022-01-27 08:18:51 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1170/1251] eta 0:02:58 lr 0.000010 time 1.6111 (2.2029) loss 3.0628 (2.9638) grad_norm 2.7592 (3.4186) [2022-01-27 08:19:13 
swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1180/1251] eta 0:02:36 lr 0.000010 time 2.4667 (2.2032) loss 2.5581 (2.9638) grad_norm 3.4046 (3.4184) [2022-01-27 08:19:36 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1190/1251] eta 0:02:14 lr 0.000010 time 2.9441 (2.2039) loss 2.1806 (2.9613) grad_norm 3.0894 (3.4180) [2022-01-27 08:19:59 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1200/1251] eta 0:01:52 lr 0.000010 time 1.6057 (2.2046) loss 1.9344 (2.9593) grad_norm 3.5985 (3.4181) [2022-01-27 08:20:20 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1210/1251] eta 0:01:30 lr 0.000010 time 1.5986 (2.2042) loss 2.5292 (2.9589) grad_norm 4.3211 (3.4185) [2022-01-27 08:20:40 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1220/1251] eta 0:01:08 lr 0.000010 time 2.0687 (2.2025) loss 3.6108 (2.9593) grad_norm 3.4971 (3.4184) [2022-01-27 08:21:01 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1230/1251] eta 0:00:46 lr 0.000010 time 2.1768 (2.2016) loss 3.5261 (2.9612) grad_norm 2.7539 (3.4170) [2022-01-27 08:21:23 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1240/1251] eta 0:00:24 lr 0.000010 time 2.2998 (2.2010) loss 3.0398 (2.9602) grad_norm 3.2236 (3.4184) [2022-01-27 08:21:38 swin_tiny_patch4_window7_224] (main.py 193): INFO Train: [299/300][1250/1251] eta 0:00:02 lr 0.000010 time 1.1305 (2.1960) loss 3.3680 (2.9617) grad_norm 3.3097 (3.4199) [2022-01-27 08:21:39 swin_tiny_patch4_window7_224] (main.py 200): INFO EPOCH 299 training takes 0:45:47 [2022-01-27 08:21:39 swin_tiny_patch4_window7_224] (utils.py 47): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_299 saving...... [2022-01-27 08:21:50 swin_tiny_patch4_window7_224] (utils.py 49): INFO output/swin_tiny_patch4_window7_224/fix_ddp/model_299 saved !!! 
[2022-01-27 08:22:07 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [0/49] Time 16.666 (16.666) Loss 0.8963 (0.8963) Acc@1 80.664 (80.664) Acc@5 93.652 (93.652)
[2022-01-27 08:22:22 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [10/49] Time 1.322 (2.870) Loss 0.8306 (0.8232) Acc@1 82.422 (81.472) Acc@5 94.727 (95.206)
[2022-01-27 08:22:39 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [20/49] Time 0.996 (2.312) Loss 0.7791 (0.8186) Acc@1 79.590 (81.222) Acc@5 95.703 (95.261)
[2022-01-27 08:22:59 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [30/49] Time 0.641 (2.204) Loss 0.8344 (0.8144) Acc@1 78.418 (81.080) Acc@5 95.898 (95.385)
[2022-01-27 08:23:15 swin_tiny_patch4_window7_224] (main.py 239): INFO Test: [40/49] Time 2.972 (2.071) Loss 0.7969 (0.8095) Acc@1 82.031 (81.117) Acc@5 95.703 (95.474)
[2022-01-27 08:23:22 swin_tiny_patch4_window7_224] (main.py 245): INFO * Acc@1 81.242 Acc@5 95.484
[2022-01-27 08:23:22 swin_tiny_patch4_window7_224] (main.py 132): INFO Accuracy of the network on the 50000 test images: 81.2%
[2022-01-27 08:23:22 swin_tiny_patch4_window7_224] (main.py 134): INFO Max accuracy: 81.24%
[2022-01-27 08:23:22 swin_tiny_patch4_window7_224] (main.py 138): INFO Training time 9 days, 20:54:17